Search within files
A quick way to search for a specific pattern within an open file (similar to Ctrl+F/Command+F in a text document) is /<pattern>. The disadvantage of searching inside an open file is that there is no simple way to display all occurrences at once: the view jumps so that one hit sits on the top line, and you have to type n (next hit) or N (previous hit) to browse through the rest.
# Searching for a pattern within an open file
less E.coli.fna
/AAAAAAAAA + Enter # searches for pattern within an open file
q # closes the open file
The command grep allows you to search within files without opening them first with another program. It has a number of useful options to help give you the right output.
# A simple grep
grep "AAAAAAAAA" E.coli.fna # prints every line containing "AAAAAAAAA" (matches are highlighted if grep's --color option is on)
# Useful options
grep -o # show only the matches
grep -c # show only a count of the matching lines
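The difference between these two options is easiest to see on a tiny file. The sketch below builds a hypothetical three-line file (toy.txt in a temporary directory, not part of the course data) and compares the two results. Note that -c counts matching lines, so a line with two hits still counts once; piping -o into wc -l is one way to count individual matches instead.

```shell
# Build a small throwaway file (hypothetical example data)
tmpdir=$(mktemp -d)
printf 'AAAxAAA\nCCC\nAAA\n' > "$tmpdir/toy.txt"

# -c counts the LINES that contain the pattern
line_count=$(grep -c "AAA" "$tmpdir/toy.txt")

# -o prints each match on its own line, so wc -l counts the MATCHES
# (tr strips the padding that some wc versions add)
match_count=$(grep -o "AAA" "$tmpdir/toy.txt" | wc -l | tr -d ' ')

echo "lines with a match: $line_count"
echo "total matches: $match_count"

# Clean up the temporary directory
rm -r "$tmpdir"
```

On this toy file the first line contains the pattern twice, so the two numbers differ: 2 matching lines, 3 matches.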
Exercise 1.7
Navigate to the directory you copied the E. coli files to earlier.
# Navigation
cd ~/ecoli
Use less to look at the file GCF_000005845.2_ASM584v2_cds_from_genomic.fna, which contains nucleotide gene sequences.
# Look at the file
less GCF_000005845.2_ASM584v2_cds_from_genomic.fna
# Press q to quit
Search within less to find the sequence for dnaA (searching within an open file).
less GCF_000005845.2_ASM584v2_cds_from_genomic.fna
/dnaA
# Then type n (next hit) or N (previous hit) to see if there are more search hits
# Press q to quit
Use grep to find the sequence for dnaA within the file (without opening it).
grep "dnaA" GCF_000005845.2_ASM584v2_cds_from_genomic.fna
Use grep -o to only show the matches for the pattern dnaA within the file. How does the output change?
grep -o "dnaA" GCF_000005845.2_ASM584v2_cds_from_genomic.fna
Use grep -c to count how many dnaA sequences there are within the file. Note that grep -c counts matching lines, which here are the header lines that mention dnaA.
grep -c "dnaA" GCF_000005845.2_ASM584v2_cds_from_genomic.fna
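To check what each grep call in this exercise returns before running it on the full file, the same steps can be tried on a toy FASTA file. This is a minimal sketch: the file name toy_cds.fna, its headers, and its sequences are made up for illustration. It also uses the -A option (print lines after a match), which is assumed to be available, as it is in GNU and BSD grep, although it is not part of POSIX.

```shell
# Create a tiny FASTA file with three made-up records (hypothetical data)
tmpdir=$(mktemp -d)
toy="$tmpdir/toy_cds.fna"
printf '>seq1 [gene=thrL]\nATGAAACGC\n>seq2 [gene=dnaA]\nGTGTCACTT\n>seq3 [gene=dnaN]\nATGAAATTT\n' > "$toy"

# Header line(s) that mention dnaA
headers=$(grep "dnaA" "$toy")

# Number of lines that contain dnaA
n_lines=$(grep -c "dnaA" "$toy")

# Matching header plus the one line after it (the sequence in this toy file)
record=$(grep -A 1 "dnaA" "$toy")

echo "$headers"
echo "$n_lines"
echo "$record"

# Clean up the temporary directory
rm -r "$tmpdir"
```

A plain grep returns only the header line, because the pattern dnaA does not occur in the sequence itself. In a real FASTA file a sequence may span several lines, so -A 1 would show only the first of them; searching within less remains the simpler way to read the whole record.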