Gut microbiota wellbeing index predicts overall health in a cohort of 1000 infants
Impact of exposures and priority effect on infant gut microbiota
Utilising the 16S rRNA gene amplicon sequences of 984 infants (50.7% male; 49.3% female), 768 mothers, and 515 fathers, comprising 7211 faecal samples, in unsupervised principal coordinates analysis using log-Pearson distance (Fig. 1a), we observed a clear age gradient in the gut microbiota composition, with the infant gut microbiota gradually approaching, but not reaching, the adult-like composition over the first two years of life (age p = 0.001, R2 = 0.18; 104 weeks vs adult p = 0.001, R2 = 0.21). The infant gut microbiota composition varied greatly between individuals in the first 6 months (26 weeks) but converged thereafter (Fig. 1a–c). Birth mode was associated with this development (Fig. 1b, c). In the vaginally born infants not exposed to intrapartum antibiotics (VD), we observed an increase in principal component (PC) 2 scores in the first 26 weeks, and an increase in PC1 scores from 26 to 104 weeks (Fig. 1b, c). C-section birth (CS), and to a lesser extent, intrapartum antibiotic exposure during vaginal birth (V-ABX), altered this pattern: these infants had consistently high PC2 scores already at 3 weeks (CS compared to VD, p < 0.001 at 3–78 weeks; p = 0.003 at 104 weeks) and showed a constant increase in PC1 scores already during the early weeks (PC1 scores CS compared to VD, p < 0.01).
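The ordination above rests on a log-Pearson distance between samples, which the text does not define. The sketch below assumes it means one minus the Pearson correlation of log-transformed relative abundances; the pseudocount, the function names, and the three toy profiles are illustrative assumptions, not values from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def log_pearson_distance(sample_a, sample_b, pseudocount=1e-5):
    # Assumed definition: 1 - Pearson r on log10(relative abundance + pseudocount)
    log_a = [math.log10(v + pseudocount) for v in sample_a]
    log_b = [math.log10(v + pseudocount) for v in sample_b]
    return 1.0 - pearson(log_a, log_b)

# Hypothetical relative abundances of four genera in three samples
infant_early = [0.60, 0.25, 0.10, 0.05]
infant_late = [0.10, 0.15, 0.35, 0.40]
adult = [0.02, 0.18, 0.35, 0.45]

d_early_adult = log_pearson_distance(infant_early, adult)
d_late_adult = log_pearson_distance(infant_late, adult)
```

In this toy setting the later infant profile lies closer to the adult profile than the early one does, which is the kind of age gradient the ordination summarises.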
To identify factors (Supplementary Data 1) influencing the gut microbiota, we analysed the associations of background variables with the ordination (envfit) and performed permutational multivariate analysis of variance (adonis2) (Fig. 1d). The major determinant of PC1 scores was age (R2 = 0.178 in adonis2). In addition, solid food consumption (R2 = 0.02), use of antibiotics (R2 = 0.0008), time spent with non-parental carer (R2 = 0.005), and use of gut-targeting medications (laxative/antiflatulence/antidiarrheal/constipation; R2 = 0.004 each) were significantly associated with PC1, after adjusting for age. Breastfeeding (R2 = 0.004) and probiotic consumption had the opposite association, but the latter was not significant in a multivariate model (p = 0.99). The major drivers of PC2 scores were birth mode (R2 = 0.02) and having siblings (R2 = 0.007).
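The R2 values above come from adonis2 in the R package vegan. As a rough illustration of what such an R2 measures, the sketch below computes a one-factor PERMANOVA R2 from a distance matrix and a permutation p-value. It is a simplified stand-in, not the authors' analysis: it handles one grouping factor without covariates, uses R2 rather than the pseudo-F as the test statistic (the two rank permutations identically for a single factor), and runs on made-up data.

```python
import random
from collections import defaultdict

def permanova_r2(dist, groups):
    """R2 = SS_between / SS_total, computed from a square distance matrix."""
    n = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    ss_within = 0.0
    for idxs in members.values():
        pair_sum = sum(dist[i][j] ** 2 for a, i in enumerate(idxs) for j in idxs[a + 1:])
        ss_within += pair_sum / len(idxs)
    return (ss_total - ss_within) / ss_total

def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """Share of label permutations whose R2 is at least the observed R2."""
    rng = random.Random(seed)
    observed = permanova_r2(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if permanova_r2(dist, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical one-dimensional "compositions" for four samples
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in positions] for a in positions]
groups = ["VD", "VD", "CS", "CS"]

r2 = permanova_r2(dist, groups)
p = permutation_p_value(dist, groups)
```

With only four samples the permutation p-value cannot be small, however large the R2; the example is meant to show the calculation, not a realistic test.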
To assess the predictability of microbiota development, we built multivariate regression models for the PC scores by age and quantified the variance explained by each factor at each time point. The most important determinants of gut microbiota composition were birth mode in the first 26 weeks, defecation rate throughout the first year, and diet and family composition from 1 to 2 years (52–104 weeks) (Fig. 2a). Maternal characteristics had a modest and consistent impact at all time points, while maternal microbiota composition (mother’s PC scores) became influential at 26 weeks (1.22% of variation explained, p = 0.017), increasing with time (5.12% at 104 weeks, p < 0.0001) (Fig. 2a).
We added the infants’ previous time points’ microbiota composition (PC scores) into the models, discovering that it was by far the most influential factor determining the current composition, explaining 59% of the variation in the 6 weeks’ samples (Fig. 2b, Supplementary Fig. 1a, p < 0.001). With the previous composition included, the impact of birth mode decreased or disappeared, demonstrating that the effect of birth mode on gut microbiota is due to its effects on the initial colonization. The impact of the initial composition was tested by replacing the previous time points’ PC scores with the 3 weeks’ PC scores (Fig. 2c, Supplementary Fig. 1b). The effect of the initial composition was strong for the first 26 weeks (over 10%), and still significant (1.7%) at 52 weeks (p < 0.001), highlighting the long-term importance of initial inoculation (Fig. 2b, c, Supplementary Fig. 1a, b).
The predictability of gut microbiota development was tested using model-based simulation in infants with full time series and complete background information (N = 98). Based on the background information (Fig. 2b) and the microbiota composition at 3 weeks, we used parameter estimates from the model that included the previous time points’ microbiota composition to simulate the PC scores of each infant over time. The simulated microbiota development closely followed the observed patterns, achieving a correlation of 0.84, showing that early life microbiota development is highly predictable with moderate stochasticity.
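The simulation idea can be sketched as a chain of regressions in which each predicted PC score, rather than the observed one, feeds the next time point. The sketch assumes a single covariate and one coefficient triple per step; the structure and all numbers are hypothetical and do not reproduce the fitted models.

```python
def simulate_trajectory(initial_score, covariate, steps):
    """Propagate a PC score forward, feeding each prediction into the next step."""
    scores = [initial_score]
    for intercept, beta_prev, beta_cov in steps:
        next_score = intercept + beta_prev * scores[-1] + beta_cov * covariate
        scores.append(next_score)
    return scores

# Hypothetical estimates per time point:
# (intercept, effect of previous score, effect of a background covariate)
steps = [(0.1, 0.8, -0.2), (0.2, 0.7, -0.1), (0.3, 0.6, 0.0)]

# Start from a hypothetical 3-week score for an infant with covariate = 1
simulated = simulate_trajectory(initial_score=-1.0, covariate=1.0, steps=steps)
```

Because errors compound when predictions are chained, agreement between simulated and observed scores is a stricter check than one-step-ahead fit.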
The impact of exposures was further assessed at each time point by projecting them onto the age-specific ordination using envfit. High maternal BMI, long duration of ruptured membranes, high gestational age, and formula feeding in parallel to high defecation rate had an impact on the gut microbiota similar to, but weaker than, that of CS and V-ABX in the first 26 weeks (Supplementary Fig. 2). Maternal and infant probiotic use prior to sample collection had an impact like that of breastfeeding, while high appetite, siblings, and pacifier use were associated with formula-fed-like microbiota composition at 9 months (36 weeks) and beyond (Supplementary Fig. 2). Some associations may have been driven partly by confounders. For example, defecation rate, indicative of transit time, was strongly associated with breastfeeding: breastfeeding correlated with high defecation rate at 6 (p = 0.012) and 12 (p = 0.017) weeks but had the opposite association at 36 (p = 0.035), 78 (p = 0.007), and 104 (p = 0.006) weeks. Infants with a high appetite were more likely to be fed formula (p < 0.05) at 78 weeks (18 months). The PC scores were correlated with the individual microbial taxa to identify indicator organisms of overall microbial composition. These were Bifidobacterium at 3–26 weeks, Bacillus at 39 weeks, Collinsella at 12 months (52 weeks), Enterococcus at 78 weeks, and Christensenella at 104 weeks (Supplementary Fig. 2).
We then looked further into the taxonomic associations of the most important exposures. The abundant microbial families could be divided into those that naturally decline with age (Bifidobacteriaceae, Bacteroidaceae, Enterobacteriaceae), peak at 26–52 weeks (Veillonellaceae), or increase with age (Lachnospiraceae, Ruminococcaceae) (Fig. 2d). These patterns were affected by the infant’s exposures. CS birth was strongly associated (p < 0.001) with delayed colonization by the genera Bacteroides, Parabacteroides, Bifidobacterium, and Collinsella (Fig. 2d, Supplementary Data 2). V-ABX had a CS-like impact mainly on the Gram-positive organisms (Fig. 2d). Breastfeeding was associated with increased relative abundance of Lactobacillaceae at 3 and 26–52 weeks (p < 0.001), Bacteroidaceae at 6 weeks (p < 0.001), and Bifidobacteriaceae at 36–52 weeks (p < 0.01), and a consistent decrease in Lachnospiraceae in the first two years of life (p < 0.001). Exclusive formula feeding was associated with advanced microbiota maturation indicated by faster Enterobacteriaceae decline, earlier Veillonellaceae peak and earlier increase in Lachnospiraceae (Fig. 2d, Supplementary Data 3). In the first weeks of life, having siblings was associated with increased relative abundance of Bifidobacteriaceae (p < 0.0001) and Lactobacillaceae (p = 0.007, Fig. 2d). The sibling effect on Bifidobacteriaceae was evident already at 3 weeks in the vaginally born (p < 0.0001), but not in the CS born (p = 0.736, Supplementary Data 4), suggesting that the effect is mediated by maternal gut microbiota. Indeed, we found that multiparous mothers had significantly higher relative faecal abundance of Bifidobacterium compared to primiparous mothers (p < 0.05). After 26 weeks, siblings had a more widespread impact, being associated with increased relative abundance of Ruminococcaceae and decreased Lachnospiraceae (Fig. 2d).
In addition, green stool colour was associated with reduced relative abundance of Bifidobacterium at 6 and 13 weeks (p = 0.007).
The most important microbial taxa affecting gut microbiota development at each time point were identified using multivariate analysis of variance with the previous time points’ taxonomic composition as the explanatory variables. Bifidobacterium and Bacteroides were the most influential taxa throughout the first 2 years, having an especially strong influence in the first months, while Veillonella and Collinsella became dominant influencers at 52–78 weeks (Supplementary Fig. 3). In the first 26 weeks, the relative abundances of Bifidobacteriaceae and Bacteroidaceae at a given time point were negatively correlated with the next time point’s relative abundances of Clostridiaceae, Enterobacteriaceae, and Ruminococcaceae, and positively with members of Bacilli, Actinobacteria and Bacteroidia (Supplementary Fig. 4).
Microbiota community types in infants
To identify the main microbiota community types in the infants, we clustered the infant samples at the genus level using K-means clustering and log-Pearson distance, identifying four community types (Fig. 3a). We tested additional distance metrics alongside the log-Pearson derived community types: Bray-Curtis, Jaccard and Aitchison (Supplementary Fig. 5a–c). The same community types were replicated irrespective of the distance metric, with overall similarity between the metrics. Community types 1 and 2 (C1 and C2) characterized the first 26 weeks’ microbiota, C3 was common at 39–52 weeks and C4 thereafter. C1 was dominated by Bifidobacterium (39.2% relative abundance) and Bacteroides (12.8%), together with other members of Actinobacteria and Bacteroidia covering over 50% of the relative abundance on average (Fig. 1b). Community type 2 (C2) was nearly devoid of bifidobacteria (4.8%), having a high relative abundance of Clostridiaceae (13.4%) and Enterobacteriaceae (25.7%). In community type 3 (C3), Bifidobacteriaceae (27.3%), Lachnospiraceae (18.5%), and Veillonellaceae (20.1%) were the dominant families, and community type 4 (C4) was dominated by Lachnospiraceae (30.0%) and Ruminococcaceae (30.0%) (Supplementary Data 5).
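K-means is defined for squared Euclidean distance, so one way to combine it with a correlation-based distance is to standardise each log-transformed profile to zero mean and unit length; the squared Euclidean distance between two such vectors then equals 2 × (1 − Pearson r). The sketch below takes this route. Whether the authors did so is not stated, and the pseudocount, the deterministic initialisation, and the toy profiles are assumptions made for illustration.

```python
import math

def standardise(profile, pseudocount=1e-5):
    """Log-transform, centre, and scale to unit length.

    For such vectors, squared Euclidean distance equals 2 * (1 - Pearson r).
    Assumes the profile is not constant (otherwise the norm is zero).
    """
    logs = [math.log10(v + pseudocount) for v in profile]
    mean = sum(logs) / len(logs)
    centred = [v - mean for v in logs]
    norm = math.sqrt(sum(v * v for v in centred))
    return [v / norm for v in centred]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50):
    # Deterministic farthest-point initialisation, then Lloyd iterations
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical genus-level relative abundances for four samples
samples = [
    [0.60, 0.20, 0.15, 0.05],
    [0.55, 0.25, 0.12, 0.08],
    [0.05, 0.10, 0.45, 0.40],
    [0.04, 0.16, 0.40, 0.40],
]
labels = kmeans([standardise(s) for s in samples], k=2)
```

The toy data separate into two clusters; the study itself reports four community types.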
Microbial richness varied significantly between community types, being highest in C4 and lowest in C2 (Fig. 3c), and generally showing an increasing association with infant age. The relative abundance of potential pathobionts was highest in C2 (Fig. 3d). Stool colour was significantly different between the different community types (p < 0.0001, χ2 test, Supplementary Fig. 6), with C1 and C2 being more likely to have yellow and green colour stool while C3 and C4 were observed mostly in brown stool, representing the change in stool composition when the amount of solid foods increases in infants’ diet after 6 months. In the early months (C1 and C2) green stool was more likely to occur in C2 (p < 0.0001).
To explain the infants’ community type, we used recursive partitioning. A model with 4 variables explained 56% of the variation in community types (Fig. 4a). Age was the most important explanatory variable. The early communities, C1 and C2, were dependent on birth mode, siblings, and bifidobacteria-containing probiotics (ever taken prior to the sample), C1 being typical in the first 6 months of life in the vaginally born infants that had not been exposed to intrapartum antibiotics. Before the age of 6 months, CS-born infants were typically in C2, but by 6 months those that had siblings had often transitioned to C1. The V-ABX infants’ samples were also classified into C2 before the age of 6 months, unless they had received bifidobacteria-containing probiotics or had siblings, which facilitated their transition to C1. At the age of 9 months, most infants were in C3. At 12 months, having siblings promoted early transition to C4. After 12 months, most infants were in C4 (Fig. 4a).
We tested the associations between community types at each time point and health outcomes at 2 and 5 years and discovered that C2 was associated with increased risk of undesirable health outcomes, especially allergic diseases (Fig. 4b, Supplementary Fig. 6, Supplementary Data 6). Children in C3 before the age of 6 months had an increased risk of allergic diseases and height-for-age Z-score < −1 SD at 5 years. Early transition to C4 (12 months) was associated with height-for-age Z-score < −1 SD at 2 years, but at 2 years, C4 was negatively associated with concurrent asthma diagnosis. At 12 months, C1 was associated with having had gastrointestinal infections.
Developmental trajectories
As the microbiota development was found to follow a consistent and predictable pattern in the individual infants, we utilized group-based trajectory modelling of the microbiota cluster scores to identify different patterns of microbiota development. Five distinct developmental trajectories were identified, with differences that mostly manifested over the first 6 months. Trajectory 1 (T1) was the most common one (N = 388, 47%), characterized by stable C1 membership in the first 6 months, transition to C3 by 9 months, and to C4 by 12–18 months (Fig. 5a). These infants had a high initial relative abundance of Bifidobacterium, which declined gradually, being replaced initially by Veillonella and then by Faecalibacterium and members of Lachnospiraceae (Supplementary Fig. 8a). Infants in trajectory 2 (T2, N = 95, 11%) were initially in C1, but moved to C2 before transitioning to C3 (Fig. 5b), showing a rapid decline in Bifidobacterium and a transient increase in Clostridium and Klebsiella (Fig. 5b, Supplementary Fig. 8b). Infants in T3 (N = 78, 9%) began in C1 but oscillated repeatedly between C1 and C2 in the first 6 months (Fig. 5c, Supplementary Fig. 8c). The reverse pattern of T3 was represented by T4 (N = 151, 18%), where infants that started in C2 oscillated between C1 and C2 in the first 6 months, showing a peak of Bifidobacterium at 6–9 months (Fig. 5d, Supplementary Fig. 8d). Infants in T5 (N = 116, 14%) were consistently in C2 throughout the first 6 months with a high relative abundance of Clostridium and Klebsiella (Fig. 5e, Supplementary Fig. 8e). The intraindividual similarity over time for the 5 trajectories (Supplementary Fig. 9) shows that microbiota development in general was the most rapid at 6–9 months, with infants in T3 showing the highest volatility at 3–12 weeks, and those in T2 at 9 months. Infants in T1 generally had the most stable microbiota compositions.
We compared the trajectories to our earlier data on average infant gut microbiota compositions around the world at class/phylum level, collected from 30 studies and 5732 infants in the first two years of life (ref. 8). T1 resembled most closely the global normal development pattern, while T2 and T3 resembled the average compromised pattern (Fig. 5f). These results indicate that the trajectories that we identified here in a Finnish cohort can be recapitulated, at least partly, in other cohorts.
Associations between background factors and trajectory membership were assessed using the χ2 test (Fig. 5g). Trajectories 1–3 were associated with vaginal delivery, while T4 and T5 were associated with CS birth and antibiotic prophylaxis during vaginal birth. In contrast to the other trajectories, T1 was associated with having siblings, living in a single-family house, and no formula feeding in the first 12 months. The transition to C2 exhibited by infants in T2 may have been promoted by formula feeding or lack of siblings, and potentially reflected in symptoms, as these infants were more likely than others to have received probiotics. The only identifiable reasons for the fluctuations between C1 and C2 in T3 were the lack of siblings and the possibly lower socioeconomic status indicated by housing type. Perhaps counterintuitively, infants in T3 were less likely than others to have received antibiotics in the first 3 months. The spontaneous microbiota correction in T4 may have been driven by breastfeeding, or other factors related to higher socioeconomic status.
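A χ2 test of this kind compares observed counts in a contingency table with the counts expected under independence. The sketch below works through a 2×2 table of trajectory membership against a binary background factor. The counts are invented, no continuity correction is applied, and the closed-form p-value holds only for 1 degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail is erfc(sqrt(statistic / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows = in trajectory / not in trajectory,
# columns = has siblings / no siblings
table = [[10, 20], [20, 10]]
statistic, p_value = chi_square_2x2(table)
```

For larger tables, such as five trajectories against a multi-level factor, the statistic is computed the same way but the p-value needs the χ2 distribution with (rows − 1) × (columns − 1) degrees of freedom.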
The trajectories were tested for associations with infant health and wellbeing (Supplementary Data 7; FDR-adjusted p-values in Supplementary Data 8) over the follow-up from birth to 2 (N = 984) and 5 years (N = 496) of age (Fig. 5h), and with fever and several infection types, which were recorded at 0–3, 0–6, 6–12 and 12–24 months. After adjusting for parental allergies and education level, maternal BMI, paternal smoking, maternal smoking prior to pregnancy, gestational diabetes, pregnancy weight gain, infant sex, pets, siblings, birth mode and the sequencing run ID used at each time point, T1 was negatively associated with the following: reported allergy symptoms in the first 2 years, upper respiratory infections in the first 2 years, fever reported in 0–6 months, doctor-diagnosed allergic rhinitis at 5 years, ISO-BMI Z-scores (ref. 32) greater than 1 standard deviation (>+1 SD) at 5 years, and height-for-age Z-scores less than −1 standard deviation (<−1 SD) at 2 and 5 years. T2 was associated with an increased risk of atopy at 2 years, and parent-reported allergy symptoms during the first 2 years (p < 0.05, p < 0.01, and p < 0.05 respectively). T3 was associated with a decreased risk of parent-reported allergy symptoms at 2 years (p < 0.001). T4 and T5 were both associated with height-for-age < −1 SD at 2 years, and upper respiratory infections and fever between 0–6 months (p < 0.01; p < 0.05; p < 0.001). However, possibly due to the microbiota correction in T4, these infants did not have the increased risk for altered growth or diagnosed allergic rhinitis at 5 years that was observed in T5 (p < 0.05). T5 was additionally associated with increased risk of being diagnosed with atopy at 5 years (p < 0.05). Trajectory membership was more strongly associated with health outcomes than community type membership at any given time point (Supplementary Data 9), indicating that longitudinal analysis of development is more informative than single time points.
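FDR-adjusted p-values are reported for these associations, but the text does not name the adjustment procedure. The sketch below assumes the widely used Benjamini-Hochberg method, applied to made-up p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four trajectory-outcome tests
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values ordered consistently with the raw ones.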
As an alternative way to represent microbiota development, we constructed a microbiota maturity index based on age-associated microbes. We found only minimal associations between the maturity index and health outcomes at different ages, mostly regarding growth (Supplementary Fig. 10a). We then compared the index between the trajectories and found that it did not greatly differ (Supplementary Fig. 10b), indicating that the maturity index was an insufficient representation of microbiota development. In the total data, we identified a set of bacterial genera associated with age (Supplementary Fig. 10c–g). Overall, the taxa displayed similar patterns across the trajectories, being broadly divided into early (members of Actinobacteria, Bacteroidia, Enterobacteria, Negativicutes, Bacilli) and late infancy (mainly Clostridia) groups. However, certain key groups such as Bifidobacterium and Bacteroides showed different temporal patterns in the different trajectories (Supplementary Fig. 10c–g). Species-level community types and developmental trajectories were tested in addition to the genus level. The PC space and community types were highly similar to those at the genus level (Supplementary Fig. 11a). Using the same criteria for trajectory creation, we show that the background factor-trajectory associations are similar to those at the genus level, with minor differences (Supplementary Fig. 11b, c).
Microbiota wellbeing index
Due to the inability of the maturity index to differentiate between different developmental trajectories and to capture various health associations, an alternative method to characterise microbiota wellbeing was devised. Because T1 was the most common pattern, associated with vaginal birth and positive health outcomes, and most representative of the normal gut microbiota development globally, we took this pattern to represent the natural undisturbed gut microbiota development (“eubiosis”). To identify microbes associated with natural gut microbiota development, we formed a reference group of infants in T1 that did not have diagnosed allergic diseases, allergic symptoms or atypical growth (absolute WHO Z-scores > 2 SD) during the first 5 years of life (N = 198). We used logistic regression to identify microbes predictive of membership in the reference group at different ages. The estimates from the model for the predictive bacteria were used as microbiota wellbeing influence scores, positive scores indicating that the microbe was associated with the reference. The strongest overall positive microbiota wellbeing association was seen in Bifidobacterium and Bacteroides, which were consistently indicative of the reference gut microbiota (Supplementary Fig. 6a). Most taxa showed an age-dependent association, either becoming increasingly positive with age (Eisenbergiella, Oscillibacter, Parabacteroides, Anaerostipes, Streptococcus), increasingly negative with age (Lachnospira, Faecalicatena, Lacrimispora, Klebsiella, Sutterella), showing a transient negative association (Roseburia, Faecalibacterium), or a transient positive association (Citrobacter, Blautia, Gemmiger, Hungatella). The amount of variance of the index explained by each microbe by time point further elucidates the age-dependency (Supplementary Fig. 12). The indicator microbes were individually assessed against the health outcomes at each time point (Supplementary Fig. 13), verifying that these microbes were associated with health and wellbeing in a consistent manner.
The relative abundances of the indicator microbes were used to create a microbiota wellbeing index (MWI), representing the microbiota-based estimated likelihood of belonging to the reference group. The MWI was significantly lower in infants with an allergic disease or growth differences (Fig. 6b) and in more detailed analysis the MWI was associated with several different types of health outcomes from allergic diseases to growth at both 2 and 5 years of age (Supplementary Data 7), and the incidence of infections (Fig. 6c).
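Read as a logistic model, the MWI is the predicted probability of reference-group membership given the abundances of the indicator microbes. The sketch below shows that calculation in its simplest form. The influence scores, the intercept, the log transformation with a pseudocount, and the two profiles are all hypothetical; the study's scores are age-specific and are not reproduced here.

```python
import math

def wellbeing_index(abundances, influence_scores, intercept=0.0, pseudocount=1e-5):
    """Logistic-model probability of belonging to the reference group."""
    linear = intercept
    for taxon, score in influence_scores.items():
        # Assumed predictor: log10 relative abundance; absent taxa count as zero
        linear += score * math.log10(abundances.get(taxon, 0.0) + pseudocount)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical influence scores; only the signs follow the text
scores = {"Bifidobacterium": 0.8, "Bacteroides": 0.5, "Klebsiella": -0.6}

# Hypothetical relative abundances for two infants at the same age
profile_a = {"Bifidobacterium": 0.40, "Bacteroides": 0.15, "Klebsiella": 0.01}
profile_b = {"Bifidobacterium": 0.02, "Bacteroides": 0.01, "Klebsiella": 0.30}

mwi_a = wellbeing_index(profile_a, scores)
mwi_b = wellbeing_index(profile_b, scores)
```

Under these assumed scores, the profile rich in Bifidobacterium and Bacteroides receives the higher index, and the profile rich in Klebsiella the lower one.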
MWI was reduced in CS (p < 0.0001) and V-ABX exposed (p < 0.0001 at 3–6 weeks, p < 0.05 at 13–104 weeks) infants throughout the first 2 years (Fig. 6d). Having siblings increased MWI in the vaginally born not antibiotic exposed (p < 0.0001 at 3 weeks, p = 0.003 at 26 weeks, p = 0.008 at 104 weeks), but in the V-ABX infants the effect was not observed until 26 weeks (p = 0.003), and in CS infants not until 78 weeks (p = 0.02). When analysed together, the impact of siblings on the CS/V-ABX infants was significant but weak at 3 weeks, strengthened at 26 weeks and remained significant until 104 weeks (p = 0.01, p = 0.0003, p = 0.007, respectively). Exclusive breastfeeding created a modest increase in the MWI in the vaginally born non-antibiotic exposed infants at 13 weeks (p = 0.034), but the main impact of breastfeeding was observed at the time of solid foods’ introduction (26 weeks), when those that were no longer breastfed experienced a significant drop in MWI (p = 0.001 at 26 weeks, p = 0.002 at 39 weeks, p = 0.038 at 52 weeks), while breastfed infants retained a high MWI. Breastfeeding was not associated with MWI in the C-section born infants, but modestly increased MWI in the V-ABX infants in the first 13 weeks (p < 0.05). When analysed together, breastfeeding increased MWI in the CS/V-ABX infants in the first 13 weeks (p < 0.01).