add sections for data linking tables
Showing 3 changed files with 21 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
--- | ||
title: Data Linking | ||
weight: 100 | ||
last_reviewed_on: 2024-01-24 | ||
review_in: 1 year | ||
show_expiry: true | ||
owner_slack: '#ask-data-linking' | ||
--- | ||
|
||
<%= partial 'documentation/data-docs/curated-databases-docs/data-linking' %> |
11 changes: 10 additions & 1 deletion
source/documentation/data-docs/curated-databases-docs/data-linking.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,12 @@ | ||
# Data Linking | ||
|
||
As an analyst it is important to be able to link together different datasets across domains. | ||
As a department, we struggle with a lack of consistent, reliable, unique identifiers within and across our systems. Unique IDs are critical to getting a true picture of the justice system, and analysts need to be able to link datasets across domains. | ||
|
||
The Internal Data Linking team have created Data Linking tables (using [Splink](https://moj-analytical-services.github.io/splink/index.html) under the hood) for use across Data & Analysis to allow analysts to: | ||
|
||
1. Deduplicate Individual Datasets | ||
2. Link between Datasets (i.e. across domains) | ||
|
||
The Data Linking tables contain estimated unique IDs attached to each ID within the linked datasets. They function as a lookup table that associates a raw system ID with the unique linked ID we have generated. This linked ID can then be used to deduplicate and/or link datasets. | ||
|
||
For more on the Data Linking tables, how they are made and how to use them, check out the [data discovery tool](https://data-discovery-tool.analytical-platform.service.justice.gov.uk/data_linking_anonymised/index.html). |
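
The lookup-table idea described in the added text can be sketched in a few lines of plain Python. This is only an illustration of the mechanics, not the team's implementation: every ID, field name (`source_id`, `linked_id`) and helper function below is hypothetical, and the real tables and their schemas are documented in the data discovery tool.

```python
# Hypothetical lookup tables: raw system ID -> estimated linked ID.
# All IDs and field names are invented for illustration.
lookup_a = {"A1": "P1", "A2": "P1", "A3": "P2"}
lookup_b = {"B1": "P2", "B2": "P3"}

records_a = [
    {"source_id": "A1", "name": "J Smith"},
    {"source_id": "A2", "name": "John Smith"},
    {"source_id": "A3", "name": "A Jones"},
]
records_b = [
    {"source_id": "B1", "name": "Alice Jones"},
    {"source_id": "B2", "name": "R Patel"},
]


def attach_linked_id(records, lookup):
    """Return copies of the records with a linked_id field (None if not in the lookup)."""
    return [{**r, "linked_id": lookup.get(r["source_id"])} for r in records]


def deduplicate(records):
    """Keep the first record seen for each linked_id."""
    seen = {}
    for r in records:
        seen.setdefault(r["linked_id"], r)
    return list(seen.values())


def link(left, right):
    """Pair records from two datasets that share a non-missing linked_id."""
    by_id = {}
    for r in right:
        if r["linked_id"] is not None:
            by_id.setdefault(r["linked_id"], []).append(r)
    return [(l, r) for l in left for r in by_id.get(l["linked_id"], [])]


a = attach_linked_id(records_a, lookup_a)
b = attach_linked_id(records_b, lookup_b)

deduped_a = deduplicate(a)     # use 1: deduplicate within one dataset
linked = link(deduped_a, b)    # use 2: link across datasets
```

Here `A1` and `A2` share a linked ID, so deduplication collapses them into one record, and `A3` and `B1` are paired because both map to the same linked ID. In practice the same joins would more likely be written in SQL or a dataframe library against the published tables.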