Hello,

In this folder, you will find the data required for a gene-content exploration of ocean microbial populations.

The folder includes some summary files:
- SPECIES-SUMMARY: info about the 415 species that were examined for population structure. Note that for some downstream analyses, only the 271 species with 'High' or 'Good' confidence in their population structure were kept.
- GENOMES-SUMMARY: info about all the genomes in the database (including those belonging to the 415 species, plus others).
- POPULATIONS-SUMMARY: info about which population was found in which samples for the 415 species.

The good stuff (focus on the genes):
- GENE-CATALOG-MEMBERSHIP: lists the genes from all the genomes of the 271 species, along with their gene catalog cluster ('representative'; clustered at 95% nucleotide identity).
- PANGENOME-GENE-MEMBERSHIP: same as above but more functional (includes a column with the species and the pangenome category according to Cap's thesis).
- PANGENOME-GC-FREQUENCY: detailed info used for the pangenomic classification.
- GENE-CATALOG-PROFILE-SUBSET: the abundance of the representatives across a buuuunch of metagenomes.

Some additional fun:
- PNPS: contains the PNPS values of the genes in the representative genome across the metagenomic samples (to be matched with POPULATIONS-SUMMARY).
- GENOMES-SCAFFOLDS-MEMBERSHIP: allows you to link genomes and scaffolds and, a fortiori, genomes and genes.
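
To give an idea of how the gene files fit together, here is a minimal Python sketch that goes from a gene to its gene catalog representative (GENE-CATALOG-MEMBERSHIP) and then to that representative's abundance across metagenomes (GENE-CATALOG-PROFILE-SUBSET). Everything about the file layout is an assumption here: the sketch supposes tab-separated files with a header row, and the column names `gene` and `representative` are hypothetical placeholders, so check the real headers first. The toy tables at the bottom are made up purely for illustration.

```python
import csv
import io

# ASSUMPTION: tab-separated files with a header row. These column names are
# hypothetical placeholders, not taken from the actual files.
GENE_COL = "gene"
REP_COL = "representative"


def load_gene_to_representative(handle):
    """Map each gene ID to its gene catalog representative."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[GENE_COL]: row[REP_COL] for row in reader}


def load_profile(handle):
    """Map each representative to {sample: abundance}.

    Assumes one row per representative and one column per metagenome.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    profile = {}
    for row in reader:
        rep = row.pop(REP_COL)
        profile[rep] = {sample: float(value) for sample, value in row.items()}
    return profile


def abundance_of_gene(gene, gene_to_rep, profile):
    """Return the abundance profile of the gene's representative.

    Returns None when the gene is unknown or its representative is not
    part of the profile subset.
    """
    rep = gene_to_rep.get(gene)
    return profile.get(rep)


# Made-up toy tables standing in for the real files.
membership = io.StringIO("gene\trepresentative\ng1\tr1\ng2\tr1\ng3\tr2\n")
profile_tsv = io.StringIO("representative\tsampleA\tsampleB\nr1\t0.5\t0\n")

gene_to_rep = load_gene_to_representative(membership)
profile = load_profile(profile_tsv)

print(abundance_of_gene("g2", gene_to_rep, profile))  # profile of r1
print(abundance_of_gene("g3", gene_to_rep, profile))  # r2 is not in the subset
```

With the real files, the `io.StringIO` objects would be replaced by opened file handles. The same lookup pattern should apply to going from genomes to scaffolds to genes with GENOMES-SCAFFOLDS-MEMBERSHIP, again assuming a comparable tabular layout.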