Common Arguments
Required Argument
fast5s_dir
Path to directory containing raw FAST5-format nanopore reads.
Both single and multi FAST5 formats are supported.
By default, FAST5 read files are searched for recursively. To search only the top level of the directory, specify --not-recursive.
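The difference between the recursive default and a one-level search can be sketched as follows. This is an illustration only, not Megalodon's own file discovery code: the helper name find_fast5s, the .fast5 suffix filter, and the demo file names are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def find_fast5s(fast5s_dir, recursive=True):
    """Return sorted FAST5 paths under fast5s_dir (hypothetical helper)."""
    root = Path(fast5s_dir)
    # rglob descends into subdirectories; glob only inspects the top level.
    matcher = root.rglob if recursive else root.glob
    return sorted(p for p in matcher("*.fast5") if p.is_file())


# Demo on a throwaway directory: one read at the top level, one nested.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "top.fast5").touch()
    (Path(tmp) / "nested").mkdir()
    (Path(tmp) / "nested" / "deep.fast5").touch()
    all_reads = [p.name for p in find_fast5s(tmp)]
    top_level = [p.name for p in find_fast5s(tmp, recursive=False)]

print(all_reads)
print(top_level)
```

With recursive=False, the nested file is skipped, which mirrors the effect described for --not-recursive.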
Guppy Backend Arguments
--guppy-config
Name of the Guppy config file.
Default:
dna_r9.4.1_450bps_modbases_5mc_hac.cfg
--guppy-server-path
Path to the Guppy server executable.
Default:
./ont-guppy/bin/guppy_basecall_server
Output Arguments
--live-processing
As of version 2.2, Megalodon supports live run processing.
Activate live processing mode by adding the --live-processing argument and specifying the MinKNOW output directory as the input FAST5 directory.
Megalodon will continue to search for FAST5s until MinKNOW creates the final_summary* file, indicating that data production has completed.
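The polling behaviour described for live processing can be sketched roughly as below. This is a simplified illustration under stated assumptions, not Megalodon's implementation: the function name iter_live_fast5s, the poll interval, the demo directory layout, and the assumption that the final_summary* file sits directly inside the run directory are all hypothetical.

```python
import tempfile
import time
from pathlib import Path


def iter_live_fast5s(run_dir, poll_seconds=5.0):
    """Yield each new FAST5 path once until a final_summary* file appears."""
    root = Path(run_dir)
    seen = set()
    finished = False
    while not finished:
        # Check for the end marker first so one last scan runs after it appears.
        # Assumption: the marker is written directly inside run_dir.
        finished = any(root.glob("final_summary*"))
        for path in sorted(root.rglob("*.fast5")):
            if path not in seen:
                seen.add(path)
                yield path
        if not finished:
            time.sleep(poll_seconds)


# Demo on a throwaway directory that already holds a finished run.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "fast5_pass").mkdir()
    (Path(tmp) / "fast5_pass" / "batch_0.fast5").touch()
    (Path(tmp) / "final_summary_demo.txt").touch()
    found = [p.name for p in iter_live_fast5s(tmp, poll_seconds=0.1)]

print(found)
```

A production version would also need to cope with files that are still being written when they are first seen; that is left out here to keep the sketch short.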
--outputs
Specify desired outputs.
Options are basecalls, mod_basecalls, mappings, variant_mappings, mod_mappings, per_read_variants, per_read_mods, variants, and mods.
mod_basecalls are output in a BAM file via the Mm and Ml tags described in hts-specs.
variant_mappings are intended for obtaining highly accurate phased variant genotypes, but also provide a useful genome browser visualization of per-read variant calls.
These mappings contain the reference sequence at all positions except for per-read called variants. The base quality scores encode the likelihood for that reference-anchored variant for use in the whatshap phasing algorithm.
mod_mappings provide reference-anchored per-read modified base calls.
As of version 2.2, the default output uses the Mm and Ml hts-specs tags (see above) with all modified bases in one output file.
Specify the --mod-map-emulate-bisulfite option to output one BAM per modified base, with modified bases converted using --mod-map-base-conv. This file is useful for visualizing per-read modified base calls (e.g. IGV bisulfite mode for CpG calls).
This file may also allow a port to standard bisulfite pipelines that are capable of processing long reads.
Default output is basecalls only.
--output-directory
Specify the directory to output results. Default: megalodon_results
--overwrite
Overwrite the --output-directory if it exists.
Note that this is a recursive file deletion and should be used with caution.
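To show how the input, Guppy, and output options above fit together, here is a minimal sketch that assembles an argument list without running anything. The helper build_command is hypothetical, the raw_fast5s directory name is a placeholder, and passing several outputs as space-separated values is an assumption rather than something stated above.

```python
VALID_OUTPUTS = (
    "basecalls", "mod_basecalls", "mappings", "variant_mappings",
    "mod_mappings", "per_read_variants", "per_read_mods", "variants", "mods",
)


def build_command(fast5s_dir, outputs=("basecalls",),
                  output_directory="megalodon_results", overwrite=False,
                  guppy_server_path="./ont-guppy/bin/guppy_basecall_server"):
    """Assemble a megalodon argument list (hypothetical helper, nothing is run)."""
    unknown = [name for name in outputs if name not in VALID_OUTPUTS]
    if unknown:
        raise ValueError("unknown outputs: " + ", ".join(unknown))
    cmd = [
        "megalodon", fast5s_dir,
        "--guppy-server-path", guppy_server_path,
        # Assumption: several outputs are passed as space-separated values.
        "--outputs", *outputs,
        "--output-directory", output_directory,
    ]
    if overwrite:
        # Recursively deletes an existing output directory, so opt in explicitly.
        cmd.append("--overwrite")
    return cmd


cmd = build_command("raw_fast5s", outputs=("basecalls", "mod_mappings"))
print(" ".join(cmd))
```

Keeping --overwrite behind an explicit flag in the helper reflects the caution above about recursive deletion.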
Mapping Arguments
--mappings-format
Format for mappings output.
Options include bam (default), cram, and sam.
As of version 2.2, mappings are no longer sorted by default.
Set --sort-mappings to sort mappings. If samtools is not in $PATH, provide the path to the executable via the --samtools-executable argument.
--reference
Reference genome or transcriptome in FASTA or minimap2 index format.
If --reference is a minimap2 index and --mappings-format is cram, provide the FASTA reference via --cram-reference.
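The rule about CRAM output with a minimap2 index can be expressed as a small pre-flight check. This is a sketch under assumptions: the helper names are hypothetical, the file names are placeholders, and recognising a minimap2 index by the conventional .mmi suffix is a heuristic chosen for the example, not how Megalodon itself detects the format.

```python
def needs_cram_reference(reference, mappings_format="bam"):
    """True when a separate FASTA must be supplied for CRAM output."""
    # Assumption: a minimap2 index is recognised by the conventional .mmi suffix.
    is_minimap2_index = reference.lower().endswith(".mmi")
    return mappings_format == "cram" and is_minimap2_index


def mapping_args(reference, mappings_format="bam", cram_reference=None, sort=False):
    """Build the mapping-related arguments (hypothetical helper)."""
    args = ["--reference", reference, "--mappings-format", mappings_format]
    if needs_cram_reference(reference, mappings_format):
        if cram_reference is None:
            raise ValueError("CRAM output with a minimap2 index needs --cram-reference")
        args += ["--cram-reference", cram_reference]
    if sort:
        # Mappings are unsorted by default as of version 2.2.
        args.append("--sort-mappings")
    return args


print(mapping_args("ref.mmi", "cram", cram_reference="ref.fa", sort=True))
```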
Sequence Variant Arguments
--haploid
Compute sequence variants assuming a haploid reference. Default: diploid
--variant-filename
File containing putative variants in VCF/BCF format.
The variants file must be sorted.
If the variants file is not compressed and indexed, this will be performed before further processing.
Variants must be matched to the --reference provided.
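Since the variants file must be sorted, a quick check on an uncompressed VCF before launching a run may save time. The sketch below uses one common reading of "sorted" (records grouped by contig, with non-decreasing positions within each contig); this interpretation and the helper name vcf_is_sorted are assumptions, and Megalodon's own requirements may be stricter.

```python
def vcf_is_sorted(lines):
    """True if records are grouped by contig and position-sorted within each."""
    finished = set()  # contigs whose block of records has already ended
    current, last_pos = None, 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        if chrom != current:
            if chrom in finished:
                return False  # contig reappears after another contig
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, pos
        elif pos < last_pos:
            return False  # position went backwards within a contig
        else:
            last_pos = pos
    return True


good = ["##fileformat=VCFv4.2", "chr1\t100\t.\tA\tG", "chr1\t250\t.\tC\tT", "chr2\t50\t.\tG\tA"]
bad = ["chr1\t250\t.\tC\tT", "chr1\t100\t.\tA\tG"]
print(vcf_is_sorted(good), vcf_is_sorted(bad))
```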
Modified Base Arguments
--mod-motif
Restrict modified base results to the specified motifs.
This argument takes 3 values representing:
1. Modified base single letter codes (see the megalodon_extras modified_bases describe_alphabet command)
2. Canonical sequence motif (may contain ambiguity codes)
3. Relative position (0-based) of the modified base within the canonical sequence motif
Multiple --mod-motif arguments can be provided to a single megalodon command.
If not provided (and per_read_mods or mods outputs are requested), all relevant sites are tested (e.g. all C bases for 5mC).
Note that restricting to motifs of interest can skip computationally expensive steps, so it is more than a simple post-processing filter.
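To make the motif and relative position values concrete, the following sketch lists the sites a motif would select in a short sequence. It is illustrative only: the helper motif_sites is hypothetical, the ambiguity codes follow the standard IUPAC nucleotide table, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_sites(sequence, motif, rel_pos):
    """Return 0-based positions of the modified base for every motif hit."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead keeps overlapping motif occurrences from being skipped.
    regex = re.compile("(?=(" + pattern + "))")
    return [m.start() + rel_pos for m in regex.finditer(sequence.upper())]


print(motif_sites("ACGTCGCCG", "CG", 0))
print(motif_sites("ACCAGGTCCTGG", "CCWGG", 1))
```

In the first call, the motif CG with relative position 0 selects the C of each CpG; in the second, the ambiguity code W matches either A or T and relative position 1 selects the second C of the motif.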
Compute Resource Arguments
--processes
Number of CPU read-processing workers to spawn.
--devices
GPU devices to use for basecalling acceleration.
If not provided, CPU basecalling will be performed.
Device names can be provided in the following formats: 0, cuda0 or cuda:0.
Multiple devices can be specified separated by a space.
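The three accepted device spellings can be handled uniformly in a wrapper script. The sketch below is an assumption-laden illustration: normalize_device and device_args are hypothetical helpers, and the choice of cuda:N as the canonical form is arbitrary, not something Megalodon requires.

```python
import re


def normalize_device(name):
    """Map '0', 'cuda0' or 'cuda:0' to a single 'cuda:N' spelling."""
    match = re.fullmatch(r"(?:cuda:?)?(\d+)", name.strip())
    if match is None:
        raise ValueError("unrecognised device name: " + repr(name))
    return "cuda:" + match.group(1)


def device_args(devices):
    """Build the --devices arguments; an empty list means CPU basecalling."""
    if not devices:
        return []
    # Multiple devices are passed as separate, space-separated values.
    return ["--devices"] + [normalize_device(d) for d in devices]


print(device_args(["0", "cuda1", "cuda:2"]))
```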