# Profiling long-read samples

Long reads can be profiled with `mOTUs 3.0.3` and later. The workflow first splits the long reads into shorter reads (using the command `motus prep_long`), which can then be profiled with the default `motus profile` command.

## Installing `mOTUs 3.0.3`

If you have difficulty installing the latest version of `mOTUs` with conda, [installing via `pip`](install.md) may work better.

## Download example data

Let us profile long reads from a mock community using mOTUs. First, download the long reads:

```bash
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.250k.fastq.gz
```

This dataset is a subsample of the larger dataset `SRR9328980` (subsampled to 10% of the original number of reads).

## Preparing the long reads

We first prepare the sample by splitting the long reads into shorter reads.

```bash
motus prep_long -i HiFi-ATCC-MSA-1003.250k.fastq.gz -o HiFi-ATCC-MSA-1003.250k.short.fastq -no_gz
gzip HiFi-ATCC-MSA-1003.250k.short.fastq # or "pigz -p 32 HiFi-ATCC-MSA-1003.250k.short.fastq" if pigz is installed

# Or download the prepared result produced by the command:
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.250k.short.fastq.gz
```

Note: We compress the file manually due to performance issues with the Python gzip module.

## Running mOTUs

We can now use the usual `motus profile` command on the prepared long reads.

```bash
# We use -A to be consistent with the report shown below.
# -A prints out all the taxonomic levels
# -t defines the number of threads
motus profile -A -s HiFi-ATCC-MSA-1003.250k.short.fastq.gz -o HiFi-ATCC-MSA-1003.motus -t 32

# Or download the prepared result produced by the command:
wget https://sunagawalab.ethz.ch/share/MOTUS_DATA/motus_3.0.3/motus_long_reads/HiFi-ATCC-MSA-1003.motus
```

Exploring the result, we get:

```bash
# Get abundances at the genus level
grep "g__" HiFi-ATCC-MSA-1003.motus | grep -v "s__"
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas 0.0256625687
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Moraxellaceae|g__Acinetobacter 0.0041047739
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia 0.1690642823
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Rhodobacterales|f__Rhodobacteraceae|g__Rhodobacter 0.2692651120
k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Neisseriales|f__Neisseriaceae|g__Neisseria 0.0016740789
k__Bacteria|p__Proteobacteria|c__Epsilonproteobacteria|o__Campylobacterales|f__Helicobacteraceae|g__Helicobacter 0.0019215690
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Clostridiaceae|g__Clostridium 0.0065334779
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Bacillaceae|g__Bacillus 0.0208843590
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus 0.0725363474
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus 0.1647341426
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Lactobacillaceae|g__Lactobacillus 0.0009095220
k__Bacteria|p__Deinococcus-Thermus|c__Deinococci|o__Deinococcales|f__Deinococcaceae|g__Deinococcus 0.0013104470
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Propionibacteriales|f__Propionibacteriaceae|g__Cutibacterium 0.0035726154
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Porphyromonas 0.1891746967

# Get abundances at the mOTU/species level
grep "s__" HiFi-ATCC-MSA-1003.motus
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacterales|f__Enterobacteriaceae|g__Escherichia|s__Escherichia coli [ref_mOTU_v3_00095] 0.1690642823
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas aeruginosa [ref_mOTU_v3_00201] 0.0256625687
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Moraxellaceae|g__Acinetobacter|s__Acinetobacter baumannii [ref_mOTU_v3_00259] 0.0041047739
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Bacillaceae|g__Bacillus|s__Bacillus sp. [ref_mOTU_v3_00329] 0.0208843590
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus aureus [ref_mOTU_v3_00340] 0.0053912499
k__Bacteria|p__Firmicutes|c__Bacilli|o__Bacillales|f__Staphylococcaceae|g__Staphylococcus|s__Staphylococcus epidermidis [ref_mOTU_v3_00346] 0.0671450975
k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Propionibacteriales|f__Propionibacteriaceae|g__Cutibacterium|s__Cutibacterium acnes [ref_mOTU_v3_00800] 0.0035726154
k__Bacteria|p__Proteobacteria|c__Epsilonproteobacteria|o__Campylobacterales|f__Helicobacteraceae|g__Helicobacter|s__Helicobacter pylori [ref_mOTU_v3_00897] 0.0019215690
k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Porphyromonadaceae|g__Porphyromonas|s__Porphyromonas gingivalis [ref_mOTU_v3_00985] 0.1891746967
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Lactobacillaceae|g__Lactobacillus|s__Lactobacillus gasseri [ref_mOTU_v3_01039] 0.0009095220
k__Bacteria|p__Proteobacteria|c__Alphaproteobacteria|o__Rhodobacterales|f__Rhodobacteraceae|g__Rhodobacter|s__Rhodobacter sphaeroides/johrii [ref_mOTU_v3_01513] 0.2692651120
k__Bacteria|p__Proteobacteria|c__Betaproteobacteria|o__Neisseriales|f__Neisseriaceae|g__Neisseria|s__Neisseria meningitidis [ref_mOTU_v3_01539] 0.0016740789
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus mutans [ref_mOTU_v3_01605] 0.1567639053
k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus agalactiae [ref_mOTU_v3_01860] 0.0079702373
k__Bacteria|p__Deinococcus-Thermus|c__Deinococci|o__Deinococcales|f__Deinococcaceae|g__Deinococcus|s__Deinococcus radiodurans [ref_mOTU_v3_02207] 0.0013104470
k__Bacteria|p__Firmicutes|c__Clostridia|o__Clostridiales|f__Clostridiaceae|g__Clostridium|s__Clostridium beijerinckii [ref_mOTU_v3_03007] 0.0065334779
```
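
The `grep` filters above return rows in file order with the full lineage attached. As an optional convenience, the sketch below ranks the entries of one taxonomic level by abundance. `rank_table` is a hypothetical helper written for this tutorial, not part of mOTUs. It assumes the abundance is the last whitespace-separated column of each row and that header lines, if any, start with `#`.

```bash
# rank_table FILE PREFIX
# Hypothetical helper (not part of mOTUs): prints "abundance<TAB>taxon" for
# every row whose deepest taxonomic level starts with PREFIX (e.g. g__ or s__),
# sorted from most to least abundant.
rank_table() {
  local file=$1 prefix=$2
  awk -v p="$prefix" '
    /^#/ { next }                          # skip header/comment lines
    {
      abundance = $NF                      # assumed: abundance is the last column
      lineage = $0
      sub(/[ \t]+[^ \t]+$/, "", lineage)   # drop the abundance column
      n = split(lineage, levels, "[|]")    # lineage levels are "|"-separated
      if (index(levels[n], p) == 1)        # keep rows ending at the requested level
        print abundance "\t" levels[n]
    }' "$file" | LC_ALL=C sort -k1,1nr
}
```

For example, `rank_table HiFi-ATCC-MSA-1003.motus g__` would list the genera and `rank_table HiFi-ATCC-MSA-1003.motus s__` the mOTUs/species, each ordered by relative abundance.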