by Michael Manhart
I recently attended a conference on the gut microbiome and health (mostly basic scientists, a few clinicians), and it left me thinking about the general state of the field right now in terms of what most people are doing, as well as what most people are not doing.
The vast majority of studies discussed in the talks were remarkably similar in form: sample mice or humans under two or more conditions (time points, diets, disease states, etc.), perform 16S sequencing on their fecal samples, and possibly collect physiological data from the host (e.g., weight, glycemic response). The analysis of the 16S data almost always consisted of comparing the taxonomic composition (especially its diversity) between conditions. This is of course a reasonable thing to do, but I was struck by how dominant the paradigm still is after all these years, especially in light of some obvious shortcomings and the availability of alternative approaches. Here I will highlight the shortcomings that I think are most important.
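To make the dominant analysis concrete: the diversity comparison usually boils down to computing a metric like the Shannon index over taxon abundances in each sample and comparing it across conditions. A minimal sketch (the read counts below are made up for illustration):

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    ps = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in ps)

# Hypothetical 16S read counts for five taxa in two fecal samples
sample_a = [500, 300, 150, 40, 10]   # relatively even community
sample_b = [950, 20, 15, 10, 5]      # dominated by a single taxon

print(shannon_diversity(sample_a))   # ~1.17
print(shannon_diversity(sample_b))   # ~0.26
```

The entire comparison between two conditions often amounts to little more than a statistical test on numbers like these, which is what makes the paradigm feel so limited.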
Why do we still focus so much on taxonomic composition and diversity? I see three main problems with this approach:
The emphasis on taxonomic composition mostly excluded other omics approaches. There were a few metabolomic studies, but little to no metagenomics, metaproteomics (except for one speaker who did focus on it), or metatranscriptomics. Given the hype around multi-omics, why aren't more people applying it to the gut microbiome? Is it due to technical limitations of these methods (for example, a lack of annotated reference genomes), or a lack of imagination for how to interpret the results?
The role of genetic evolution as a force shaping gut microbiomes over human timescales is still mostly ignored. There was no mention of evolution, or of mutations in general, from what I recall. This omission is presumably coupled to the lack of metagenomic data, but I was surprised to see no consideration of it at all, especially given valiant efforts by a few in the field to show that it can be significant (Garud and Good et al. 2019, Zhao et al. 2019).
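The basic signal of within-host evolution is not exotic: with metagenomic reads mapped to a reference, one can track allele frequencies at variant sites across time points and look for large shifts. The sketch below is a crude illustration of that idea, not the published methods (real pipelines account for read depth, strain replacement, and sampling noise); the function name and numbers are invented:

```python
def sweeping_snvs(freqs_t0, freqs_t1, min_shift=0.5):
    """Flag variant sites whose within-species allele frequency shifts by
    more than min_shift between two sampled time points, a crude signature
    of a selective sweep in a resident strain."""
    return [i for i, (f0, f1) in enumerate(zip(freqs_t0, freqs_t1))
            if abs(f1 - f0) >= min_shift]

# Hypothetical allele frequencies at four sites in one host, months apart
baseline = [0.05, 0.50, 0.90, 0.02]
followup = [0.95, 0.55, 0.20, 0.04]
print(sweeping_snvs(baseline, followup))  # [0, 2]
```

Even this toy version shows how different the question is from diversity comparisons: it asks what is changing genetically within a species, not just which species are present.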
No quantitative mechanistic models. I use both qualifiers deliberately: there certainly were statistical models (e.g., comparing distributions to null models, principal-component analysis) as well as non-quantitative mechanistic models (e.g., this protein interacts with that protein). But there were no ODEs, agent-based models, stochastic models, or the like that would quantitatively connect data to underlying mechanisms. I would argue this is linked to many of the aforementioned issues, since a wider appreciation and use of such models could help incorporate a broader range of data and would enable more complex analyses than simply comparing the diversity of two samples.
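A quantitative mechanistic model here need not be elaborate. The generalized Lotka-Volterra equations, dx_i/dt = x_i (r_i + sum_j A_ij x_j), are a standard starting point that connects abundance trajectories to growth rates and pairwise interactions. A minimal sketch with made-up parameters (two species, forward-Euler integration):

```python
def glv_step(x, r, A, dt):
    """One forward-Euler step of generalized Lotka-Volterra dynamics:
    dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    return [max(0.0, x[i] + dt * x[i] * (r[i] + sum(A[i][j] * x[j] for j in range(n))))
            for i in range(n)]

# Made-up two-species community: both self-limiting, mutually competitive
r = [1.0, 0.8]                      # intrinsic growth rates
A = [[-1.0, -0.1],                  # negative diagonal = self-limitation
     [-0.5, -1.0]]                  # negative off-diagonal = competition
x = [0.1, 0.1]                      # initial abundances
for _ in range(5000):               # integrate to t = 50
    x = glv_step(x, r, A, dt=0.01)
print(x)  # approaches the coexistence equilibrium (~0.968, ~0.316)
```

Fitting even a model this simple to time-series data forces one to estimate interaction parameters rather than stop at a diversity index, which is exactly the kind of data-to-mechanism link I found missing.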