Command manual
Here we provide a full description of the commands and their options. On the command line, you can always type motus <command> to obtain a short description of that command's flags and usage.
To execute the mOTUs profiler, call motus <command> [options]. The possible values for command are:
- profile, perform taxonomic profiling on a sample (map_tax + calc_mgc + calc_motu);
- merge, append different profiles to create a table.
The profile command can be split into:
- map_tax, map reads to the marker gene database and output a SAM/BAM file;
- calc_mgc, aggregate reads from the same marker gene cluster (mgc) and output the mgc abundance table. It uses the SAM/BAM file produced by map_tax;
- calc_motu, from an mgc abundance table (created by calc_mgc), produce the mOTUs abundance table.
We also have a command to handle long reads. You can find a more detailed tutorial here.
- prep_long, which converts long read data into short read data, which can then be used by mOTUs profile.
There are also commands to perform SNV profiling using the metaSNV package (https://metasnv.embl.de/). A more detailed tutorial using these commands is available here.
- map_snv, map reads to the mOTUs marker gene database and produce a BAM file suitable for metaSNV;
- snv_call, SNV calling using metaSNV.
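
As a minimal sketch of how the commands above relate to each other, a wrapper script could keep the command names in a small lookup table and build the `motus <command>` invocation that prints each command's short description. The names `COMMANDS`, `PROFILE_STEPS` and `help_argv` are hypothetical helpers for illustration and are not part of mOTUs; the summaries are paraphrased from this manual.

```python
# Command names and one-line summaries as listed in this manual.
COMMANDS = {
    "profile": "taxonomic profiling (map_tax + calc_mgc + calc_motu)",
    "merge": "append different profiles to create a table",
    "map_tax": "map reads to the marker gene database",
    "calc_mgc": "aggregate reads per marker gene cluster",
    "calc_motu": "produce the mOTUs abundance table",
    "prep_long": "convert long reads into short reads",
    "map_snv": "map reads and produce a BAM file for metaSNV",
    "snv_call": "SNV calling using metaSNV",
}

# Order in which the profile command chains its sub-steps.
PROFILE_STEPS = ["map_tax", "calc_mgc", "calc_motu"]


def help_argv(command):
    """Return the argv that asks motus for a command's short description."""
    if command not in COMMANDS:
        raise ValueError(f"unknown motus command: {command!r}")
    return ["motus", command]


for step in PROFILE_STEPS:
    print(" ".join(help_argv(step)), "->", COMMANDS[step])
```

The returned list can be handed to `subprocess.run` on a machine where motus is installed; the sketch itself never calls the tool.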
profile
Performs taxonomic profiling from reads in fastq format and outputs the relative abundances of each profiled species.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | Input fastq file(s) in the forward orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) in the reverse orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) for unpaired reads. The file(s) can also be compressed (.gz or .bz2). You can also analyze single read files alone or together with forward and reverse reads. | |
| | STR | Name of the sample. Supplying a unique name is essential for merging profiles later on. | |
| | FILE | From the intermediate alignment result of | |
| | DIR | Provide a different database directory DIR | If the database is in directory |
| | FILE | From the intermediate MGC read count table (as produced by the | |
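
The FILE[,FILE] input type is a comma-separated list of paths. As a minimal sketch, a wrapper script might split and sanity-check such lists before passing them to motus profile. The function names `split_file_list` and `check_paired` are hypothetical, and the rule that forward and reverse lists must line up one-to-one is an assumption made for this illustration, not documented mOTUs behaviour.

```python
def split_file_list(value):
    """Split a FILE[,FILE] style argument into a list of paths."""
    paths = [p.strip() for p in value.split(",")]
    if any(p == "" for p in paths):
        raise ValueError(f"empty entry in file list: {value!r}")
    return paths


def check_paired(forward, reverse):
    """Return (forward, reverse) pairs.

    Assumption: the two lists align one-to-one, in order.
    """
    fwd = split_file_list(forward)
    rev = split_file_list(reverse)
    if len(fwd) != len(rev):
        raise ValueError("forward and reverse lists differ in length")
    return list(zip(fwd, rev))


# Hypothetical file names, used only to exercise the helpers.
pairs = check_paired("a_1.fq.gz,b_1.fq.gz", "a_2.fq.gz,b_2.fq.gz")
print(pairs)
```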
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output file name. If you don’t provide this option then it will print to stdout. | |
| | FILE | Save the intermediate alignment result of | |
| | FILE | Save the intermediate marker gene cluster (MGC) read count table result from | |
| | | Print the abundances of only the ref-mOTUs in the output. All other mOTU types (meta and ext) will be part of | |
| | | Output the taxonomic profile with counts rather than relative abundances | |
| | | Print the NCBI TaxID of the mOTU in the output | |
| | | Output the taxonomy profile in BIOM format | |
| | STR | Print result in CAMI format (BioBoxes format 0.9.1). Possible values: [precision, recall, parenthesis]. Note that the mOTUs species definition and the NCBI species definition are not always congruent. As a result, you can choose among three methods to save the result in CAMI format: “precision”, where the discrepancies are deleted; “recall”, where the relative abundances of the discrepancies are split; and “parenthesis”, where all the discrepancies are kept. | |
| | | Report the full rank taxonomy in the taxonomic profile output | |
| | STR | Report abundances at a specific taxonomic level. You can choose between [kingdom, phylum, class, order, family, genus, mOTU]. | |
| | | Print all taxonomic levels together (kingdom to mOTUs; overrides | |
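
To make the difference between counts and relative abundances, and the idea of reporting at a higher taxonomic level, more concrete, here is a small sketch. It is not the mOTUs implementation: the function names, the toy mOTU labels and the `lineage` mapping are all hypothetical, and including the unassigned fraction in the normalisation is an assumption.

```python
def to_relative(counts):
    """Convert raw counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


def aggregate(abundances, lineage, level):
    """Sum abundances of all entries that share the same taxon at `level`.

    Entries without a lineage record keep their own name.
    """
    totals = {}
    for name, value in abundances.items():
        group = lineage.get(name, {}).get(level, name)
        totals[group] = totals.get(group, 0) + value
    return totals


# Toy data: labels and lineage are made up for the example.
counts = {"mOTU_A": 30, "mOTU_B": 10, "unassigned": 60}
lineage = {
    "mOTU_A": {"genus": "Bacteroides"},
    "mOTU_B": {"genus": "Bacteroides"},
}

rel = to_relative(counts)
by_genus = aggregate(rel, lineage, "genus")
print(rel)
print(by_genus)
```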
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 3] | Number of marker genes required to calculate a mOTU’s abundance. Given a mOTU, we calculate its abundance if at least | |
| | INT [Default 75] | Minimum alignment length for reads. This has to be lower than the average read length; a warning is written to stderr if | |
| | INT [Default 1] | Number of threads to use when running | |
| | STR [Default insert.scaled_counts] | Type of read counts that we use. Possible values: [base.coverage, insert.raw_counts, insert.scaled_counts] | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
merge
Merges taxonomic profiles from multiple samples into one (tab-separated) table. Requires that each profile is named (using -n in motus profile).
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | List of profiled samples to be merged. It is important that each profile has a unique name (given by | |
| | DIR | Merge all profiles within directory DIR. Note that the command will fail if any file other than a mOTUs profile is present in the directory. | If all profiles to be merged are in DIR |
| | STR[,STR] | Append profiles pre-computed using publicly available metagenomic and metatranscriptomic samples from various environments (available from | Appending profiles from a single environment: |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output file name. If you don’t provide this option then it will print to stdout. | |
| | | Print result in BIOM format | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
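
The reason unique sample names matter becomes clear once profiles are combined into one table, with one column per sample. The following is a minimal sketch of that idea only; it does not reproduce the mOTUs file format, and `merge_profiles`, the header label `taxon` and the toy data are assumptions made for illustration.

```python
def merge_profiles(profiles):
    """Merge (sample_name, {taxon: abundance}) pairs into a tab-separated table.

    Taxa missing from a sample are filled with 0.
    """
    names = [name for name, _ in profiles]
    if len(set(names)) != len(names):
        raise ValueError("sample names must be unique")
    taxa = sorted({taxon for _, profile in profiles for taxon in profile})
    lines = ["\t".join(["taxon"] + names)]
    for taxon in taxa:
        row = [str(profile.get(taxon, 0)) for _, profile in profiles]
        lines.append("\t".join([taxon] + row))
    return "\n".join(lines)


# Toy profiles with hypothetical labels.
table = merge_profiles([
    ("s1", {"mOTU_A": 0.7, "mOTU_B": 0.3}),
    ("s2", {"mOTU_A": 1.0}),
])
print(table)
```

With duplicate sample names the columns could no longer be told apart, so the sketch refuses to merge them.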
map_tax
Maps reads from fastq files to the marker gene database and outputs a SAM/BAM file.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | Input fastq file(s) in the forward orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) in the reverse orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) for unpaired reads. The file(s) can also be compressed (.gz or .bz2). You can also analyze single read files alone or together with forward and reverse reads. | |
| | DIR | Provide a different database directory DIR | If the database is in directory |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output file name. If you don’t provide this option then it will print to stdout. | |
| | | Save the result of | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT | Number of threads to use when running | |
| | INT [Default 75] | Minimum alignment length for reads. This has to be lower than the average read length; a warning is written to stderr if | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
calc_mgc
Aggregates reads from the same marker gene cluster (mgc) and outputs the mgc abundance table. It uses the SAM/BAM file produced by map_tax.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | Input SAM or BAM file (or list of files) result of | |
| | STR | Name of the sample. Supplying a unique name is essential for merging profiles later on. | |
| | DIR | Provide a different database directory DIR | |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | Input SAM or BAM file (or list of files) result of | |
| | STR | Name of the sample. Supplying a unique name is essential for merging profiles later on. | |
| | DIR | Provide a different database directory DIR | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 75] | Minimum alignment length for reads. This has to be lower than the average read length; a warning is written to stderr if | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
| | STR [Default insert.scaled_counts] | Type of read counts that we use. Possible values: [base.coverage, insert.raw_counts, insert.scaled_counts] | |
calc_motu
Produces the mOTUs abundance table (final output of motus profile) from an mgc abundance table (created by calc_mgc).
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Input MGC read count table (produced by | |
| | STR | Name of the sample. Supplying a unique name is essential for merging profiles later on. | |
| | DIR | Provide a different database directory DIR | If the database is in directory |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output file name. If you don’t provide this option then it will print to stdout. | |
| | | Print the abundances of only the ref-mOTUs in the output. All other mOTU types (meta and ext) will be part of | |
| | | Output the taxonomic profile with counts rather than relative abundances | |
| | | Print the NCBI TaxID of the mOTU in the output | |
| | | Output the taxonomy profile in BIOM format | |
| | STR | Print result in CAMI format (BioBoxes format 0.9.1). Possible values: [precision, recall, parenthesis]. Note that the mOTUs species definition and the NCBI species definition are not always congruent. As a result, you can choose among three methods to save the result in CAMI format: “precision”, where the discrepancies are deleted; “recall”, where the relative abundances of the discrepancies are split; and “parenthesis”, where all the discrepancies are kept. | |
| | | Report the full rank taxonomy in the taxonomic profile output | |
| | STR | Report abundances at a specific taxonomic level. You can choose between [kingdom, phylum, class, order, family, genus, mOTU]. | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 3] | Number of marker genes required to calculate a mOTU’s abundance. Given a mOTU, we calculate its abundance if at least | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
| | STR [Default insert.scaled_counts] | Type of read counts that we use. Possible values: [base.coverage, insert.raw_counts, insert.scaled_counts] | |
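
The marker gene cutoff can be pictured as a simple gate in front of the abundance calculation. The sketch below is an illustration under stated assumptions, not the mOTUs code: the sentence describing the rule is cut off in this manual, so the sketch assumes it means "at least that many marker genes are detected", and it treats a marker gene as detected when its count is above zero. The function names and marker gene labels are hypothetical.

```python
def detected_marker_genes(mgc_counts):
    """Count marker genes with a non-zero value (assumed meaning of 'detected')."""
    return sum(1 for count in mgc_counts.values() if count > 0)


def passes_cutoff(mgc_counts, cutoff=3):
    """True if enough marker genes are detected to quantify the mOTU."""
    return detected_marker_genes(mgc_counts) >= cutoff


# Hypothetical marker gene labels and counts for two mOTUs.
well_covered = {"MG_1": 4, "MG_2": 0, "MG_3": 2, "MG_4": 1}
sparse = {"MG_1": 5, "MG_2": 0, "MG_3": 0, "MG_4": 1}

print(passes_cutoff(well_covered))
print(passes_cutoff(sparse))
print(passes_cutoff(sparse, cutoff=2))
```

Raising the cutoff makes the gate stricter and lowering it makes it more permissive, which is the trade-off this option exposes.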
prep_long
Prepares long reads to be profiled by mOTUs.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Input long read file to convert into shorter reads. The file can be fasta(.gz) or fastq(.gz). | |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output file name. If you don’t provide this option then it will print to stdout. | |
| | | Do not compress the output file. | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 300] | Splitting length for the long reads. | |
| | INT [Default 50] | Minimum read length. Reads shorter than | |
| | INT | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
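
The two numeric options above suggest how a long read is turned into short fragments. The following sketch shows one plausible reading and should not be taken as the actual prep_long algorithm: it assumes consecutive, non-overlapping fragments of the splitting length, and it assumes that fragments shorter than the minimum read length are discarded (the manual's sentence on this is cut off). The function name `split_read` is hypothetical.

```python
def split_read(seq, split_len=300, min_len=50):
    """Cut a sequence into consecutive fragments of at most split_len bases.

    Assumption: fragments shorter than min_len are dropped.
    """
    fragments = [seq[i:i + split_len] for i in range(0, len(seq), split_len)]
    return [frag for frag in fragments if len(frag) >= min_len]


# A 640-base toy read: two full fragments plus a 40-base tail that is dropped.
read = "ACGT" * 160
lengths = [len(frag) for frag in split_read(read)]
print(lengths)
```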
map_snv
Maps reads to the marker gene database and produces a BAM file suitable for metaSNV (https://metasnv.embl.de/). You can find a more detailed explanation on this page.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE[,FILE] | Input fastq file(s) in the forward orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) in the reverse orientation. When present, it requires also | |
| | FILE[,FILE] | Input fastq file(s) for unpaired reads. The file(s) can also be compressed (.gz or .bz2). You can also analyze single read files alone or together with forward and reverse reads. | |
| | DIR | Provide a different database directory DIR | |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FILE | Output BAM file name. If you don’t provide this option then it will print to stdout. | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | INT [Default 75] | Minimum alignment length for reads. | |
| | INT [Default 1] | Number of threads to use | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
snv_call
Performs single nucleotide variant calling using the metaSNV package https://metasnv.embl.de/.
Input options
| Option | Input type | Description | Example |
|---|---|---|---|
| | DIR | Call metaSNV on all BAM files in the directory DIR [Mandatory] | |
Output options
| Option | Input type | Description | Example |
|---|---|---|---|
| | DIR | Output directory. The command will fail if the output directory already exists. | |
| | | Save in the output directory all the files and directories produced by metaSNV. By default cov, distances, filtered, snpCaller are deleted. | |
Algorithm options
| Option | Input type | Description | Example |
|---|---|---|---|
| | FLOAT [Default 80.0] | Coverage breadth: minimal horizontal genome coverage percentage per sample per species. Sample filter. | |
| | FLOAT [Default 5.0] | Coverage depth: minimal average vertical genome coverage per sample per species. Sample filter. | |
| | INT [Default 2] | Minimum number of samples per species. mOTU filter. | |
| | FLOAT [Default 0.9] | Required proportion of informative samples (coverage should be non-zero) per position. Position filter. | |
| | FLOAT [Default 5.0] | Minimum coverage per position per sample per species. Position filter. | |
| | INT [Default 1] | Number of threads | |
| | INT [Default 3] | Change verbosity level: 1=error, 2=warning, 3=message, 4+=debugging | |
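
The sample filter and the mOTU filter above work in sequence: samples are first screened per species on coverage breadth and depth, and a species is kept only if enough samples survive. The sketch below illustrates that logic with the default thresholds from the table. It is an interpretation for illustration, not metaSNV's implementation; the function names, the sample labels and the use of inclusive comparisons (`>=`) are assumptions.

```python
def passing_samples(coverage, min_breadth=80.0, min_depth=5.0):
    """Return samples whose (breadth %, average depth) meet both thresholds."""
    return [
        sample
        for sample, (breadth, depth) in coverage.items()
        if breadth >= min_breadth and depth >= min_depth
    ]


def keep_species(coverage, min_samples=2, min_breadth=80.0, min_depth=5.0):
    """True if enough samples pass the sample filter for this species."""
    kept = passing_samples(coverage, min_breadth, min_depth)
    return len(kept) >= min_samples


# Toy per-sample coverage for one species: (breadth in %, average depth).
coverage = {
    "sample_1": (92.0, 11.3),
    "sample_2": (85.5, 5.0),
    "sample_3": (60.0, 20.0),  # fails breadth
    "sample_4": (95.0, 2.1),   # fails depth
}

print(passing_samples(coverage))
print(keep_species(coverage))
print(keep_species(coverage, min_samples=3))
```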