Curate spreadsheet/template linking biomarkers to disease #1087

Open
dosumis opened this issue May 7, 2021 · 5 comments

@dosumis
Collaborator

dosumis commented May 7, 2021

Terms to target:

  • All subclasses of existing biomarker terms
  • All terms with biomarker in definition/comment text
  • Via DL query: `'is about' some (has_role some biomarker)`

| column | content description |
| --- | --- |
| measurement_ID | ID of a measurement term. ID should take the form EFO:nnnnnnn |
| measurement_label | Label of a measurement term |
| biomarker_for_label | What the measurement is a biomarker for (label). Typically a disease (MONDO) |
| biomarker_for_ID | What the measurement is a biomarker for (ID). ID should take the form MONDO:nnnnnnn |
| evidence_comment | Evidence/description of use as biomarker (free text) |
| supporting publications | Supporting publications: PMID:nnnnn or DOI:nnn... Delimit multiple using a \| |

These can be curated in a Google spreadsheet or Excel, but should then be copied into a TSV file in this repository.
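
A quick check of the TSV before it is committed could look something like the sketch below. It only encodes the column rules from the table above. The regular expressions are assumptions (for example, that "nnnnnnn" means exactly seven digits and that a DOI is any space-free string), and the function names are made up for illustration.

```python
import csv
import io
import re

# Column names are taken from the table in this ticket.
COLUMNS = [
    "measurement_ID",
    "measurement_label",
    "biomarker_for_label",
    "biomarker_for_ID",
    "evidence_comment",
    "supporting publications",
]

# Assumption: "nnnnnnn" means exactly seven digits.
EFO_ID = re.compile(r"EFO:\d{7}")
MONDO_ID = re.compile(r"MONDO:\d{7}")
# Assumption: a PMID is digits only; a DOI is any non-empty, space-free string.
PUBLICATION = re.compile(r"PMID:\d+|DOI:\S+")


def validate_row(row):
    """Return a list of human-readable problems found in one row (a dict)."""
    problems = []
    if not EFO_ID.fullmatch(row.get("measurement_ID") or ""):
        problems.append("measurement_ID is not of the form EFO:nnnnnnn")
    if not MONDO_ID.fullmatch(row.get("biomarker_for_ID") or ""):
        problems.append("biomarker_for_ID is not of the form MONDO:nnnnnnn")
    pubs = row.get("supporting publications") or ""
    # Multiple publications are delimited with "|"; empty entries are ignored.
    for pub in filter(None, (p.strip() for p in pubs.split("|"))):
        if not PUBLICATION.fullmatch(pub):
            problems.append(f"unrecognised publication reference: {pub}")
    return problems


def validate_tsv(text):
    """Validate TSV text; return {line_number: [problems]} for bad rows only."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    report = {}
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        problems = validate_row(row)
        if problems:
            report[line_number] = problems
    return report
```

Calling `validate_tsv(open(path, encoding="utf-8").read())` on the committed file would return an empty dict when every row follows the ID and publication conventions.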

The aim of this spreadsheet is to generate axioms linking measurements to the diseases (etc.) for which they are biomarkers, and to use these axioms to automate classification under biomarker grouping classes. Schema TBD.
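
Since the schema is still TBD, the following is only a rough sketch of how a row of the table might be turned into something axiom-like. The relation name `biomarker_for` is a placeholder invented for this example, not an agreed EFO or RO property, and the rendered string is a readable pseudo-axiom rather than valid OWL.

```python
from dataclasses import dataclass

# HYPOTHETICAL: the schema is TBD, so this relation name is a placeholder,
# not an agreed EFO/RO property.
PLACEHOLDER_RELATION = "biomarker_for"


@dataclass(frozen=True)
class BiomarkerLink:
    measurement_id: str
    disease_id: str
    evidence: str
    publications: tuple


def row_to_link(row):
    """Turn one template row (a dict keyed by column name) into a link."""
    pubs = tuple(
        p.strip()
        for p in row["supporting publications"].split("|")
        if p.strip()
    )
    return BiomarkerLink(
        row["measurement_ID"],
        row["biomarker_for_ID"],
        row["evidence_comment"],
        pubs,
    )


def link_to_axiom(link):
    """Render the link as a readable pseudo-axiom string (not valid OWL)."""
    return (
        f"{link.measurement_id} SubClassOf "
        f"{PLACEHOLDER_RELATION} some {link.disease_id}"
    )
```

Keeping the evidence and publications on the intermediate object means they could later be attached as axiom annotations, whatever relation ends up being chosen.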

@dosumis
Collaborator Author

dosumis commented May 7, 2021

Related ticket: #787 + tickets linked to it via ZenHub.

@paolaroncaglia @zoependlington comments/context on prior work on this would be most welcome. We have taken this table-based curation approach for now in order to be as neutral as possible about schema. It will be straightforward to use this as a template for axiom generation -> EFO.

@paolaroncaglia
Collaborator

Hi @dosumis and @kallia-p,
I wasn't involved in GWAS EFO work, and only created a few measurement terms requested by non-GWAS users. Zoë and I tagged and linked relevant tickets, and tried to collate everything in an Epic, as you noted above, but I'm afraid I can't offer much context on prior work, as that was carried out by Dani and then Trish afaik. I looked through my emails, as I vaguely remember that Sandra Machlitt-Northen (who used to be at Open Targets on campus, on secondment from GSK) might have provided some thoughts in the past, but I couldn't find any records. As far as I remember, there weren't resources to follow up.
Have a good weekend,
Paola

@dosumis
Collaborator Author

dosumis commented May 20, 2021

These could potentially be mined using simple SPARQL queries:

  • All subclasses of existing biomarker terms: use a regex on the label.
    Made a start here, but it is currently failing: https://api.triplydb.com/s/oXAf5wor7
  • All terms with biomarker in definition/comment text: use a regex filter.

This may give incomplete results with SPARQL, even with all the pre-reasoning in the Ubergraph database.
Via DL query: `'is about' some (has_role some biomarker)`
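
For the regex-based mining, a minimal sketch of what such a query might look like when built and submitted from Python is given below. This is not the query behind the links above. It assumes that definitions are stored under IAO:0000115 (as is usual for OBO-style ontologies), and the endpoint URL is left as a parameter because none is given in this thread.

```python
import json
import urllib.parse
import urllib.request

# Assumption: definitions are annotated with IAO:0000115.
DEFINITION = "http://purl.obolibrary.org/obo/IAO_0000115"


def build_text_query(keyword="biomarker"):
    """Build a SPARQL query matching the keyword in labels or definitions."""
    if '"' in keyword or "\\" in keyword:
        raise ValueError("keyword must not contain quotes or backslashes")
    return f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?term ?label ?definition WHERE {{
  ?term rdfs:label ?label .
  OPTIONAL {{ ?term <{DEFINITION}> ?definition . }}
  FILTER(
    regex(str(?label), "{keyword}", "i") ||
    regex(str(?definition), "{keyword}", "i")
  )
}}
""".strip()


def run_query(endpoint, query):
    """POST the query to a SPARQL endpoint and return the JSON bindings."""
    data = urllib.parse.urlencode({"query": query}).encode()
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

A term with no definition still passes the filter when its label matches, because the label test alone is enough to make the `||` expression true.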

@dosumis
Collaborator Author

dosumis commented May 25, 2021

Working query to find all subclasses of existing biomarker terms:

https://api.triplydb.com/s/SEHYRH18_

Finds quite a lot - 527 lines returned.

Might be useful to add a clause that returns definition text too.
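
Something along these lines might work for adding the definition text. It is a sketch, not the query behind the link above: it assumes biomarker root terms are found by a case-insensitive regex on the label and that definitions are stored under IAO:0000115. The helper names are made up for illustration.

```python
# Assumption: definitions are annotated with IAO:0000115.
DEFINITION = "http://purl.obolibrary.org/obo/IAO_0000115"


def build_subclass_query(include_definition=True):
    """Build a query for transitive subclasses of terms labelled 'biomarker'."""
    select = "?root ?rootLabel ?sub ?subLabel"
    optional = ""
    if include_definition:
        select += " ?definition"
        # OPTIONAL keeps subclasses that have no definition annotation.
        optional = f"  OPTIONAL {{ ?sub <{DEFINITION}> ?definition . }}\n"
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT DISTINCT {select} WHERE {{\n"
        "  ?root rdfs:label ?rootLabel .\n"
        '  FILTER(regex(str(?rootLabel), "biomarker", "i"))\n'
        "  ?sub rdfs:subClassOf+ ?root .\n"
        "  ?sub rdfs:label ?subLabel .\n"
        f"{optional}"
        "}"
    )


def bindings_to_rows(bindings, variables):
    """Flatten SPARQL JSON bindings into rows; unbound variables become ''."""
    return [
        [binding.get(var, {}).get("value", "") for var in variables]
        for binding in bindings
    ]
```

`bindings_to_rows` expects the `results.bindings` list from a SPARQL JSON response, so its output can be written straight to a TSV with the `csv` module.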

@kallia-p
Collaborator

@dosumis Working SPARQL queries:
Working query that gets biomarker class labels, subclasses of biomarker classes (transitive), labels for subclassOf, definitions for biomarker subclasses, definition citations and dbxrefs (PMIDs):
https://api.triplydb.com/s/V0qlPxoBo
Resulting table
https://docs.google.com/spreadsheets/d/1hHWHai_IeKKTrPyaCpck3Jv_4yXDvnBl8YIyDKIFdeU/edit?usp=sharing
(should be accessible to all EBI users - let me know if not!)

Working query as above without definition citations and dbxrefs (PMIDs)
https://api.triplydb.com/s/lkDBw-qV8

Practice queries
https://api.triplydb.com/s/5muIyh22W
https://yasgui.triply.cc/#query=prefix%20owl%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2[…]2Fsparql-results%2Bjson%2C*%2F*%3Bq%3D0.9&outputFormat=table
