modena

Modena is a nanopore-based computational method for detecting a wide spectrum of epigenetic and epitranscriptomic modifications.

It takes an unsupervised approach: nanopore signals are resampled and then compared using the Kuiper test. Unlike other unsupervised tools, Modena classifies positions by 1D clustering of the resulting scores into two groups.
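To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not Modena's implementation: the resampling step is omitted, and the function names `kuiper_statistic` and `split_two_groups` are hypothetical. The first function computes the standard two-sample Kuiper statistic V = D+ + D- from two empirical CDFs. The second is one simple way to cluster 1D scores into two groups (an exhaustive two-way split that minimises within-group variance), chosen here only for illustration.

```python
from bisect import bisect_right


def kuiper_statistic(xs, ys):
    """Two-sample Kuiper statistic V = D+ + D- (illustrative, not Modena's code)."""
    xs, ys = sorted(xs), sorted(ys)
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    d_plus = d_minus = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for v in set(xs) | set(ys):
        fx = bisect_right(xs, v) / len(xs)  # ECDF of xs at v
        fy = bisect_right(ys, v) / len(ys)  # ECDF of ys at v
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus + d_minus


def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_two_groups(scores):
    """Label each score 'pos' or 'neg' using the best two-way split of the sorted scores.

    Assumption: higher scores form the 'pos' group. This O(n^2) search is a
    stand-in for whatever 1D clustering Modena actually uses.
    """
    order = sorted(scores)
    if len(order) < 2:
        raise ValueError("need at least two scores")
    best_cost, best_cut = float("inf"), None
    for k in range(1, len(order)):
        cost = _sse(order[:k]) + _sse(order[k:])
        if cost < best_cost:
            best_cost = cost
            best_cut = (order[k - 1] + order[k]) / 2  # midpoint between the groups
    return ["pos" if s > best_cut else "neg" for s in scores]


print(kuiper_statistic([1, 2], [3, 4]))  # fully separated samples -> 1.0
print(split_two_groups([0.1, 0.12, 0.9, 0.11, 0.95]))
```

Identical samples give a statistic of 0, and larger values indicate a bigger difference between the two signal distributions.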

Important

This version of Modena is v2 beta. For the stable v1 release, see the v1.0.0 git tag.

setup

Modena requires Python 3.10 or newer and the Poetry package manager. With both installed, run the following commands:

$ git clone https://github.com/sbidin/modena.git
$ cd modena
$ poetry install
$ poetry run python -m modena --help # See options.
$ poetry --directory path/to/modena/dir/ run python -m modena # Run outside modena dir.

inputs

Both datasets must be supplied in blow5 or slow5 format, each alongside its f5c resquiggle output tsv file. If your dataset is in single- or multi-read fast5 format, or in pod5 format, convert it to blow5 or slow5 first.

To resquiggle your data with f5c, install f5c and run the resquiggle command:

$ f5c resquiggle data.fastq data.blow5 > resquiggled.tsv

example modena usage

Both datasets (in this case a and b) need a blow5 or slow5 file and a corresponding f5c-resquiggled tsv file.

$ poetry run python -m modena -a a.blow5 -ax a.tsv -b b.blow5 -bx b.tsv -o out.tsv
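If you drive Modena from a Python script, for instance to process several dataset pairs, the same invocation can be assembled as an argument list. The sketch below uses only the flags shown above (`-a`, `-ax`, `-b`, `-bx`, `-o`). The helper name `modena_command` is hypothetical, and the snippet only prints the command rather than running it.

```python
import shlex


def modena_command(a_signal, a_resquiggle, b_signal, b_resquiggle, out_path):
    """Build the argument list for one Modena run (hypothetical helper)."""
    return [
        "poetry", "run", "python", "-m", "modena",
        "-a", a_signal, "-ax", a_resquiggle,  # dataset a: signal file + f5c resquiggle tsv
        "-b", b_signal, "-bx", b_resquiggle,  # dataset b: signal file + f5c resquiggle tsv
        "-o", out_path,
    ]


cmd = modena_command("a.blow5", "a.tsv", "b.blow5", "b.tsv", "out.tsv")
print(shlex.join(cmd))
# To execute it from inside the modena directory:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single shell string avoids quoting problems when file paths contain spaces.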
$ poetry run python -m modena --help # See here for more options.

output format

Modena outputs a simple tsv file with four columns:

  • position, int, 1-based
  • coverage, int, a count of all reads that contributed to the signal
  • distance, float, a two-sample Kuiper-test-based measure (a distance sum)
  • label, str, "pos" or "neg", separating positions into two clusters
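As a sketch of how the output might be consumed, the following reads the four columns into typed records and collects the positions labelled "pos". It makes several assumptions: the columns appear in the order listed above, a header line may or may not be present (any row whose first field is not an integer is skipped), and the sample rows are made-up values, not real Modena output. The names `Row` and `read_modena_tsv` are hypothetical.

```python
import csv
import io
from typing import NamedTuple


class Row(NamedTuple):
    position: int    # 1-based
    coverage: int    # number of contributing reads
    distance: float  # Kuiper-test-based distance sum
    label: str       # "pos" or "neg"


def read_modena_tsv(handle):
    """Parse a Modena-style tsv from an open text handle (illustrative)."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Assumption: skip blank lines and any header line, if one exists.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        rows.append(Row(int(fields[0]), int(fields[1]), float(fields[2]), fields[3]))
    return rows


# Made-up sample data, for illustration only.
sample = io.StringIO("1\t35\t0.12\tneg\n2\t40\t1.87\tpos\n")
rows = read_modena_tsv(sample)
positives = [r.position for r in rows if r.label == "pos"]
print(positives)
```

For a real file, replace the `io.StringIO` sample with `open("out.tsv", newline="")`.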
