Commit
add sections for data linking tables
RossKen committed Jan 24, 2024
1 parent fc19e84 commit c8b3893
Showing 3 changed files with 21 additions and 1 deletion.
10 changes: 10 additions & 0 deletions source/data/data-linking/index.html.md.erb
@@ -0,0 +1,10 @@
---
title: Data Linking
weight: 100
last_reviewed_on: 2024-01-24
review_in: 1 year
show_expiry: true
owner_slack: '#ask-data-linking'
---

<%= partial 'documentation/data-docs/curated-databases-docs/data-linking' %>
@@ -1,3 +1,12 @@
# Data Linking

As an analyst it is important to be able to link together different datasets across domains.
As a department, we struggle with a lack of consistent, reliable, unique identifiers within and across our systems. Unique IDs are critical to getting a true picture of the justice system, and as an analyst, it is important to be able to link together different datasets across domains.

The Internal Data Linking team have created Data Linking tables (using [Splink](https://moj-analytical-services.github.io/splink/index.html) under the hood) for use across Data & Analysis to allow analysts to:

1. Deduplicate Individual Datasets
2. Link between Datasets (i.e. across domains)

The Data Linking tables contain estimated unique IDs attached to each ID within the linked datasets. They function as a lookup table that associates a raw system ID with the unique linked ID we have generated. This linked ID can then be used to deduplicate and/or link datasets.
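
The lookup idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset names, raw IDs, linked IDs and the `deduplicate` and `link` helpers are all hypothetical and do not reflect the real table schema or any Splink API.

```python
from collections import defaultdict

# Hypothetical lookup rows: (source_dataset, raw_system_id, linked_id).
# Names and values are invented for illustration only.
LOOKUP = [
    ("prisons", "P001", "L1"),
    ("prisons", "P002", "L1"),  # same estimated person recorded twice
    ("prisons", "P003", "L2"),
    ("courts", "C900", "L1"),
    ("courts", "C901", "L3"),
]


def deduplicate(lookup, dataset):
    """Group one dataset's raw IDs by linked ID (one group per estimated person)."""
    groups = defaultdict(list)
    for source, raw_id, linked_id in lookup:
        if source == dataset:
            groups[linked_id].append(raw_id)
    return dict(groups)


def link(lookup, left, right):
    """Pair raw IDs from two datasets that share the same linked ID."""
    left_groups = deduplicate(lookup, left)
    right_groups = deduplicate(lookup, right)
    return [
        (left_id, right_id)
        for linked_id, left_ids in left_groups.items()
        for right_id in right_groups.get(linked_id, [])
        for left_id in left_ids
    ]


if __name__ == "__main__":
    print(deduplicate(LOOKUP, "prisons"))
    print(link(LOOKUP, "prisons", "courts"))
```

Grouping by linked ID within one dataset corresponds to deduplication, while matching linked IDs across two datasets corresponds to linking. In practice the same logic would more likely be expressed as a join against the lookup table in a query.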

For more on the Data Linking tables, how they are made and how to use them, check out the [data discovery tool](https://data-discovery-tool.analytical-platform.service.justice.gov.uk/data_linking_anonymised/index.html).
@@ -6,4 +6,5 @@ This is guidance contains information on using curated databases on the Analytic
* [Amazon Athena](amazon-athena/)
* [Querying Athena from the AP](dbtools/)
* [Databases](databases/)
* [Data Linking](data_linking/)
* [Data Discovery Tool](data-documentation/)
