Investigating patterns with localise
One a bedMethyl table has been created, modkit localise
will use the pileup and calculate per-base modification aggregate information around genomic features of interest.
For example, we can investigate base modification patterns around CTCF binding sites.
The input requirements to modkit localise
are simple:
- BedMethyl table that has been bgzf-compressed and tabix-indexed
- Regions file in BED format (plaintext).
- Genome sizes tab-separated file:
<chrom>\t<size_in_bp>
an example command:
modkit localise ${bedmethyl} --regions ${ctcf} --genome-sizes ${sizes}
The output table has the following schema:
column | Name | Description | type |
---|---|---|---|
1 | mod code | modification code as present in the bedmethyl | str |
2 | offset | distance in base pairs from the center of the genome features, negative values reflect towards the 5' of the genome | int |
3 | n_valid | number of valid calls at this offset for this modification code | int |
4 | n_mod | number of calls for this modification code at this offset | int |
5 | percent_modified | n_mod / n_valid * 100 | float |
Optionally the --chart
argument can be used to create HTML charts of the modification patterns.