In-class problems
In-class Problems Week 2
Remember that for the in-class problems, you can use different resources (command line manual/help pages, browser-based searches, AI-based problem solving) to find an answer. There is no right or wrong approach to finding an answer to a problem.
Now we’re going to return to the file /nfs/teaching/551-0132-00L/2_Good_practices/metadata.tsv. This is a real file of metadata from the Human Microbiome Project. Although we could load it into, say, R and perform various tasks there, we’re going to work only with command line tools to find the following information:
How many samples are there in total?
How many different body sites are there? Subsites? How do the subsites correspond to the sites?
What is the distribution of data for each column?
Are there any clear biases in the distribution of metadata?
How can we select a specific subset of data to look at?
Revisiting what you learned this week, how can you count the number of occurrences of a pattern in a text file?
How can you avoid misinterpreting the answers you receive from ChatGPT?
How could you use ChatGPT to support your own learning about a topic, a command, or a problem-solving approach, and to test your own knowledge?
Write a script containing at least two lines of code and at least one pipe of multiple commands. There is no limit to your creativity. What script did you come up with?
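As a starting point rather than a solution, here is a minimal sketch of two command line patterns that the questions above touch on: tallying the values in one column of a tab-separated file, and counting matches of a pattern. It runs on a small made-up table written to a temporary file. The column names (sample_id, body_site, subsite) and all values are assumptions for illustration and are not taken from metadata.tsv, so check the real header before adapting anything.

```shell
#!/usr/bin/env bash
# Toy stand-in for a metadata table; column names and values are made up.
tmpfile=$(mktemp)
printf 'sample_id\tbody_site\tsubsite\n' >  "$tmpfile"
printf 's1\toral\ttongue\n'              >> "$tmpfile"
printf 's2\toral\tsaliva\n'              >> "$tmpfile"
printf 's3\tgut\tstool\n'                >> "$tmpfile"
printf 's4\toral\ttongue\n'              >> "$tmpfile"

# Number of data rows: skip the header line, then count the remaining lines.
n_rows=$(tail -n +2 "$tmpfile" | wc -l)

# Tally of values in column 2: extract the column, sort it, collapse
# duplicates with counts, then order by count (largest first).
site_counts=$(tail -n +2 "$tmpfile" | cut -f 2 | sort | uniq -c | sort -k1,1nr)

# grep -c counts matching LINES; grep -o prints every match on its own
# line, so piping it to wc -l counts every OCCURRENCE.
n_lines=$(grep -c 'o' "$tmpfile")
n_hits=$(grep -o 'o' "$tmpfile" | wc -l)

echo "data rows: $n_rows"
echo "values in column 2:"
echo "$site_counts"
echo "lines containing 'o': $n_lines"
echo "occurrences of 'o': $n_hits"

rm -f "$tmpfile"
```

The two grep counts differ whenever a line contains the pattern more than once, so it is worth deciding which of the two a question is actually asking for.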
Note: There will be no example solutions for in-class problems. It is expected that students take notes and engage in the discussions during the lecture. If questions come up, students can use the Slack channels to receive help.