



The human gut microbiome has been implicated in important phenotypes related to human health and disease 1, 2. However, incomplete reference data that lack sufficient microbial diversity 3 hamper understanding of the roles of individual microbiome species and their functions and interactions. Hence, establishing a comprehensive collection of microbial reference genomes and genes is an important step for accurate characterization of the taxonomic and functional repertoire of the intestinal microbial ecosystem. The Human Microbiome Project (HMP) 4 was a pioneering initiative to enrich knowledge of human-associated microbiota diversity. Hundreds of genomes from bacterial species with no sequenced representatives were obtained as part of this project, allowing their use for the first time in reference-based metagenomic studies. The Integrated Gene Catalog (IGC) 5 was subsequently created, combining the sequence data available from the HMP and the Metagenomics of the Human Intestinal Tract (MetaHIT) 6 consortium. This gene catalog has been applied successfully to the study of microbiome associations in different clinical contexts 7, revealing microbial composition signatures linked to type 2 diabetes 8, obesity 9 and other diseases 10. But because the IGC comprises genes with no direct link to their genome of origin, it lacks the contextual data needed to perform high-resolution taxonomic classification, establish genetic linkage and deduce complete functional pathways on a genomic basis.

Culturing studies have continued to unveil new insights into the biology of human gut communities 11, 12 and are essential for applications in research and biotechnology. However, the advent of high-throughput sequencing and new metagenomic analysis methods (namely, those involving genome assembly and binning) has transformed understanding of microbiome composition in both humans and other environments 13, 14, 15.
