Friday 16th September 2016


Data linkage is the process of identifying and linking records about the same entity across one or more databases. It brings both benefits and challenges. Enabling linkage between two or more sources of information can help maximise the value of research data, add value to existing surveys and effectively create new data resources. Policy decisions and much scientific research hinge on accurate and comprehensive data, and those who are linking data want to do more. However, it is difficult to define the boundaries around what is technically possible, what is legally permitted and what should be done ethically. Linking data can also compromise the privacy of individuals' personal information, making them identifiable even though their information has been anonymised.

In recent years there has been increased interest in data linkage from businesses and governments. Reasons include the exponential growth in data collected, the need to integrate data from different sources, the need for effective data mining to analyse large data collections, and the growth in web applications. However, various challenges need to be overcome, such as ethical issues, complex technical requirements and the need to balance the drive for automation against the need for human judgement. This workshop, part of the Isaac Newton Institute research programme on Data Linkage and Anonymisation, sought to bring together experts and stakeholders to investigate the development of techniques for the safer linkage and merging of data.

Aims and Objectives

The overall aim of this workshop on data linkage was to encourage interaction between participants from different disciplines, to facilitate cross-disciplinary learning and to set an agenda of the big challenges in the area of data linkage. Topics ranged from computational and statistical aspects of data linkage, through privacy and confidentiality, to application case studies and examples.

The programme of talks highlighted developments in state-of-the-art techniques such as:

  • Privacy preserving probabilistic record linkage
  • Deterministic linkage
  • Computer science approaches
  • Synthetic data

The programme investigated challenges in the context of various applications, with perspectives from both researchers and end-users, who highlighted the methodologies and techniques, the software and systems used, and what does and does not work.

This event was of interest to individuals from a number of areas including:

  • Biomedical and health research
  • Social and economic research
  • Government, policy makers, regulatory authorities and statisticians
  • Commercial organisations in the financial and retail sectors, and others

There was an opportunity to present posters during the refreshment and lunch breaks.


Registration and Venue

Registration was free of charge.

The workshop took place at the Isaac Newton Institute, Cambridge.