Single nucleotide variant (SNV) profiling using mOTUs

Calling variants using marker genes is divided into two subroutines, namely alignment and variant calling (map_snv and snv_call). map_snv aligns sequencing reads against the mOTUs database. snv_call utilizes the metaSNV package to call variants on these marker genes.

map_snv takes one or multiple sequencing files and aligns reads against the mOTUs database:

#sample with just a single file of unpaired reads
motus map_snv -s sample.fq.gz > sample.bam
#sample with paired end reads
motus map_snv -f sample_R1.fq.gz -r sample_R2.fq.gz > sample.bam 

The minimum alignment length can be adjusted with -l. The -t flag sets the number of threads, which speeds up the alignment step through multithreading:

motus map_snv -f sample_R1.fq.gz -r sample_R2.fq.gz -l 100 -t 8 > sample.bam

snv_call takes the BAM files created in the map_snv step as input and calls variants using the metaSNV package. This information is then used to create a distance matrix between samples. The input for snv_call is a directory of BAM files, and each BAM file is treated as an individual sample:

motus snv_call -d DIRECTORY -o OUTPUT_DIRECTORY 
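
Because snv_call expects one BAM file per sample in a single directory, the two steps can be chained in a small shell function. The following is a minimal sketch rather than part of mOTUs: the function name run_snv_pipeline, its directory arguments, and the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming scheme are assumptions made for illustration. It only uses the flags shown above.

```shell
# Sketch: align all paired-end samples in a directory, then call variants.
# Assumptions (hypothetical, not part of mOTUs): the function name, the
# directory arguments, and the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming.
run_snv_pipeline() {
    local reads_dir="$1" bam_dir="$2" out_dir="$3" threads="${4:-1}"
    local r1 r2 sample

    mkdir -p "$bam_dir" || return 1

    for r1 in "$reads_dir"/*_R1.fq.gz; do
        # Skip the unexpanded pattern when no forward-read files exist.
        [ -e "$r1" ] || continue
        sample="$(basename "$r1" _R1.fq.gz)"
        r2="$reads_dir/${sample}_R2.fq.gz"
        if [ ! -e "$r2" ]; then
            echo "missing reverse reads for $sample, skipping" >&2
            continue
        fi
        # One BAM file per sample, written into the shared directory.
        motus map_snv -f "$r1" -r "$r2" -t "$threads" > "$bam_dir/$sample.bam" || return 1
    done

    # Each BAM file in bam_dir is treated as an individual sample.
    motus snv_call -d "$bam_dir" -o "$out_dir"
}
```

With reads stored in a directory named reads, the function could be invoked as run_snv_pipeline reads bams snv_output 8, where the last argument is the thread count passed to -t.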

An example distance matrix for the comparison of 3 samples is shown below.

          sample_1  sample_2  sample_3
sample_1  0.0000    0.0012    0.1430
sample_2  0.0012    0.0000    0.1392
sample_3  0.1430    0.1392    0.0000
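
Smaller values in such a matrix indicate more similar samples. As a rough sketch of how it might be inspected, the function below reports the closest pair of distinct samples. It assumes the matrix is stored as a whitespace-delimited text file with a header row of sample names and one labelled row per sample. The function name closest_pair and this file layout are assumptions, not a documented snv_call output format.

```shell
# Sketch: report the most similar pair of samples in a distance matrix.
# Assumptions (hypothetical): the function name, and that the matrix is a
# whitespace-delimited text file with a header row of sample names and one
# labelled row per sample.
closest_pair() {
    awk '
        NR == 1 { hn = split($0, hdr, " "); next }   # header: sample names
        NF > 1 {
            for (j = 2; j <= NF; j++) {
                col = hdr[hn - (NF - j)]             # works with or without a corner cell
                if (col == $1) continue              # skip the diagonal
                if (best == "" || $j + 0 < best) {
                    best = $j + 0; a = $1; b = col
                }
            }
        }
        END { if (best != "") printf "%s\t%s\t%.4f\n", a, b, best }
    ' "$1"
}
```

Called as closest_pair distances.txt (a hypothetical file name), it prints the two sample names and their distance, separated by tabs.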

Several filtering parameters influence whether variants are called, such as coverage depth (-fd), coverage breadth (-fb), and the minimum number of samples that report a variant (-fm). Running motus snv_call without arguments prints a list of all parameters:

motus snv_call