1 Introduction

“Yond bap” is a traditional fermented milk product from Yunnan Province, China, that has been produced and consumed for over 600 years in Lunan Yi, an autonomous county to the southwest of Kunming City (i.e., Shilin Yizu autonomous county), and in the northern region of Dali prefecture. Most yond baps are made from goat’s milk and occasionally from cow’s milk. Yond bap is white and slightly hard, with a solids content of 52% (Zhou et al. 2008). Nutritionally, it is rich in fatty acids and contains crude protein, vitamins, and diverse trace elements that are essential for the human body (Chen et al. 2009). Its fat and protein contents are 25 and 20.4%, respectively (Zhou et al. 2008). Yond bap is a nutritious food suitable for people of all ages and is popular with Yunnan residents and with tourists from China and abroad.

Until now, yond bap in Yunnan Province has been produced in traditional family workshops, so it has not reached large-scale production. The production of this natural milk product is distinctive (Fig. 1). It is generally made by a traditional method during the summer. First, fresh goat milk is boiled, and a coagulating agent is added. The addition of the coagulating agent is the distinctive step. When local residents make yond bap for the first time, they add a coagulating agent (pH = 4) taken from Marsdenia tenacissima. Some residents let the natural raw milk ferment directly. After the initial production, later batches require only the whey from the previous batch as the coagulating agent; this whey is the by-product of the yond bap production process (Chen et al. 2009). The milk is then mixed well and incubated for 30 min to allow the whey to separate. The whey is removed by filtration through a cheesecloth, and the curd obtained is transferred to cake molds and pressed to obtain yond bap. Sometimes, the whey is transferred to a separate container for spontaneous fermentation, preserved, and used as the source of coagulating agent for the next yond bap production (Bao et al. 2011). As a result, an abundant microbiota, including lactic acid bacteria (LAB), is found in the acid whey and in yond baps after several generations. LAB are known to play an important role in the fermentation process, and several studies (Kinová et al. 2008; Lee et al. 2008) indicate that most LAB have a strong effect on healthy human immune function. However, spontaneous fermentation without starter cultures or sterilization allows various microorganisms to grow during yond bap preparation, so the product is likely to contain pathogenic and environmental bacteria in addition to potentially probiotic bacteria. For this reason, studying the bacterial diversity of yond bap is of critical importance.

Fig. 1 The manufacturing process of yond bap. Bold-type letters are used to differentiate the product from the process

In this study, samples were collected from different regions in Yunnan Province (Fig. 2). These regions differ in geographical and meteorological conditions that influence the spontaneous fermentation of yond baps. Shilin County is located in the subtropical zone and has a continental monsoon climate suitable for raising goats. Shilin’s yond bap is the most famous, and therefore most samples in this study were collected from Shilin County. Dali is located on a low-latitude plateau and has a low-latitude plateau monsoon climate, in which temperature differences across the four seasons are minimal. Lijiang is located in an area connecting the Yunnan–Guizhou and Qinghai–Tibet plateaus and has a plateau-type southwest monsoon climate. The temperatures in Lijiang are low, and temperature differences between day and night are large. Shangri-la is located in a high-altitude, low-latitude area, and its climate varies with altitude.

Fig. 2 The different Chinese regions from which the products were obtained

To develop an in-depth and comprehensive understanding of the bacterial communities in yond baps, we carried out barcoded pyrosequencing of 16S rRNA genes. The results help to clarify the community structure and composition of microorganisms in yond baps from different regions in China. By studying the microbial community structure of yond bap, we can identify the predominant species that play an important role in its fermentation. Further study of these species could provide a reliable theoretical foundation for selecting appropriate starter cultures.

2 Material and methods

2.1 Collection of samples

In August 2012, 11 yond baps made with goat’s milk were collected in Yunnan Province: seven from Shilin (SL1, SL2, SL3, SL4, SL5, SL6, and SL7), two from Shangri-La (SR1, SR2), one from Dali (DL), and one from Lijiang (LJ). The yond baps were sampled randomly from the farmer’s market. All samples were stored at −20 °C and crushed before DNA extraction.

2.2 DNA extraction

Genomic DNA was extracted from 0.5 g of each sample in a 1.5-mL centrifuge tube using a CTAB-based method as previously described (Schmidt et al. 1991), with slight modifications. Briefly, samples were resuspended in 450 μL lysing buffer (0.1 mol·L⁻¹ Tris/HCl, 0.1 mol·L⁻¹ EDTA, 0.75 mol·L⁻¹ sucrose), treated with lysozyme (1 mg·mL⁻¹), and incubated in a water bath at 37 °C for 30 min. Then, samples were treated with SDS (1%) and CTAB (1%). Subsequently, samples were treated with phenol/chloroform/isoamyl alcohol (25:24:1, v/v) and chloroform/isoamyl alcohol (24:1, v/v) to remove impurities. Finally, after ethanol treatment, samples were resuspended in 50 μL TE buffer (0.01 mol·L⁻¹ Tris-HCl, pH 8.0, 0.001 mol·L⁻¹ EDTA, pH 8.0). DNA samples were stored at −20 °C until needed.

2.3 Tag-PCR amplification of bacterial 16S rRNA genes and pyrosequencing

The V3–V6 hypervariable region of the 16S rRNA gene was PCR-amplified from extracted DNA samples using two primers, 338F (5′-ACH [A/C/T] Y [C/T] CT ACG GGA GGC H [A/C/T] GC-3′) and 907R (5′-CCG TCA ATT CM [A/C] T TTG AGT TT-3′), which contain the modified universal sequences (Mao et al. 2012). A unique 10-bp barcode was added to the 5′-end of the forward primer during primer synthesis to tag each sample (Hamady et al. 2008). The PCR reactions were carried out in a total volume of 50 μL, and approximately 50 ng DNA was used as the template. The PCR conditions were 95 °C for 5 min, followed by 30 cycles of denaturation at 95 °C for 30 s, annealing at 51 °C for 1 min, and extension at 72 °C for 2.5 min, and a final elongation step of 10 min at 72 °C. To eliminate heteroduplexes, the amplified reaction was diluted tenfold into a fresh PCR reaction mixture and PCR-amplified as above using only five cycles (Thompson et al. 2002). The PCR products were analyzed on a 1% (w/v) agarose gel and purified using a Gel Extraction Kit (BioTeKe Corporation, China) according to the manufacturer’s manual. To obtain similar numbers of sequences from each sample, 50 ng of each purified PCR product was combined. Pyrosequencing was performed using the Genome Sequencer FLX System (Roche, Switzerland) by the Chinese National Human Genome Center at Shanghai.
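The degenerate positions in the two primers and the role of the 10-bp barcode can be illustrated with a short Python sketch. It is not the pipeline used in the study: the function names, the example barcode, and the example read are hypothetical, and the sketch assumes that each read starts with the barcode immediately followed by the forward primer, ignoring any linker.

```python
import re

# IUPAC degenerate nucleotide codes that occur in the two primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "[ACT]",  # A, C, or T
    "Y": "[CT]",   # C or T
    "M": "[AC]",   # A or C
}

# Primer sequences from the text, written with IUPAC codes and no spaces.
PRIMER_338F = "ACHYCTACGGGAGGCHGC"
PRIMER_907R = "CCGTCAATTCMTTTGAGTTT"


def primer_to_regex(primer):
    """Expand a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer)


def degeneracy(primer):
    """Count the distinct sequences that a degenerate primer stands for."""
    sizes = {"H": 3, "Y": 2, "M": 2}
    total = 1
    for base in primer:
        total *= sizes.get(base, 1)
    return total


def assign_sample(read, barcodes, barcode_len=10):
    """Return (sample, insert) when the read starts with a known barcode
    followed by the forward primer; otherwise return None."""
    sample = barcodes.get(read[:barcode_len])
    if sample is None:
        return None
    match = re.match(primer_to_regex(PRIMER_338F), read[barcode_len:])
    if match is None:
        return None
    # Strip the barcode and the primer, keeping only the amplified insert.
    return sample, read[barcode_len + match.end():]


# Hypothetical barcode-to-sample table and read, for illustration only.
barcodes = {"ACGTACGTAC": "SL1"}
read = "ACGTACGTAC" + "ACTCCTACGGGAGGCAGC" + "TTGACC"
print(degeneracy(PRIMER_338F), degeneracy(PRIMER_907R))
print(assign_sample(read, barcodes))
```

Exact barcode matching is the simplest choice; a production demultiplexer would typically also tolerate sequencing errors in the barcode and primer.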

2.4 Classification and phylogenetic analysis

Original sequences shorter than 350 bp were omitted from further analysis. The barcode sequence, linker, and both forward and reverse primers were removed from the original sequences using the BioEdit software v7.0.9.0. The sequences were aligned using the Mothur program v.1.25.1 with the Silva v108 database. Distance matrices were also computed with the Mothur program v.1.25.1 (Schloss et al. 2009) to define operational taxonomic units (OTUs) on the basis of a similarity distance cutoff of 0.03. Phylogenetic trees were created with Clustal v1.81 and MEGA v4 using the neighbor joining (NJ) method.
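To make the 350-bp length filter and the 0.03 distance cutoff concrete, the following Python sketch filters reads by length and groups aligned sequences into OTUs. It is an illustration only: it uses a simple greedy seed-based grouping and an uncorrected mismatch fraction as the distance, which is an assumption made for brevity and does not reproduce the clustering algorithm or the distance calculation implemented in Mothur. All function names are hypothetical.

```python
def length_filter(reads, min_len=350):
    """Keep only reads that are at least min_len bases long."""
    return [read for read in reads if len(read) >= min_len]


def p_distance(seq_a, seq_b):
    """Fraction of mismatching positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def greedy_otus(aligned_seqs, cutoff=0.03):
    """Assign each sequence to the first OTU whose seed sequence lies
    within the distance cutoff; otherwise start a new OTU with it."""
    seeds = []
    otus = []
    for seq in aligned_seqs:
        for index, seed in enumerate(seeds):
            if p_distance(seq, seed) <= cutoff:
                otus[index].append(seq)
                break
        else:
            seeds.append(seq)
            otus.append([seq])
    return otus


# Toy aligned sequences of length 100 (not real data).
seed = "A" * 100
close = "A" * 98 + "CC"       # distance 0.02 from the seed
far = "A" * 90 + "C" * 10     # distance 0.10 from the seed
otus = greedy_otus([seed, close, far])
print([len(otu) for otu in otus])
```

With the toy input, the first two sequences fall into one OTU and the third starts its own, because only the first pair is within the 0.03 cutoff.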

2.5 Diversity index analysis

Richness and diversity statistics were calculated using the Mothur program v.1.29.1, including the abundance-based coverage estimator (ACE) (Chao and Lee 1992) and the bias-corrected Chao1 (Chao 1987). The estimated coverage of the 16S rRNA gene sequences was calculated as $C = [1 - (n_1/N)] \times 100$, where $n_1$ is the number of singleton sequences and $N$ is the total number of sequences. Analysis of similarities (ANOSIM) was performed to evaluate the similarities between the 11 samples using Mothur program v.1.29.1.
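As a worked illustration of the coverage formula and the bias-corrected Chao1 estimator, the Python sketch below computes both from a list of per-OTU sequence counts. The counts are invented for the example, the function names are hypothetical, and the Chao1 expression used is the standard bias-corrected form, S_obs + n1(n1 − 1) / (2(n2 + 1)), where n2 is the number of OTUs containing exactly two sequences.

```python
def goods_coverage(otu_counts):
    """Estimated coverage C = [1 - (n1 / N)] * 100, in percent."""
    total = sum(otu_counts)
    singletons = sum(1 for count in otu_counts if count == 1)
    return (1 - singletons / total) * 100


def chao1_bias_corrected(otu_counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = sum(1 for count in otu_counts if count > 0)
    singletons = sum(1 for count in otu_counts if count == 1)
    doubletons = sum(1 for count in otu_counts if count == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU table for one sample: 7 OTUs, 22 sequences in total,
# of which 3 OTUs are singletons and 2 are doubletons.
counts = [10, 5, 2, 2, 1, 1, 1]
print(round(goods_coverage(counts), 1))
print(chao1_bias_corrected(counts))
```

For these invented counts, the coverage is about 86.4% and the Chao1 estimate is 8.0, that is, one OTU more than the seven observed.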

2.6 Nucleotide sequence accession numbers

The 16S rRNA gene sequences of the bacteria in this study are available in the Sequence Read Archive (SRA) under the accession number SRP042062.

3 Results

3.1 Sequence information and diversity index

Pyrosequencing of the 11 yond bap samples (SL1, SL2, SL3, SL4, SL5, SL6, SL7, SR1, SR2, DL, LJ) yielded a total of 14,799 reads. A total of 12,583 high-quality sequences with an average length of 546 bp were obtained after removing the replicates, duplicates, and sequences with low quality or short read lengths (<350 bp). The reads were distributed among the samples as reported in Table 1. Using a 0.03 cutoff, the estimated OTUs ranged from 76 to 261 according to the rarefaction method. The number of OTUs was highest in the DL sample. Distance matrices were used to define OTUs for determining the ACE and Chao1 richness estimators. These estimators indicate that bacterial genotype diversity was higher in the DL sample. The highest estimated sample coverage was found in the SL2 sample (Table 1).

Table 1 Analysis of OTUs in the 11 samples

To compare the OTU richness of samples and to assess whether our sampling effort provided sufficient OTU coverage to accurately describe the bacterial composition of each sample, rarefaction analysis was carried out by plotting the number of OTUs observed (defined at a 97% identity cutoff) against the sequencing effort (Fig. 3). None of the curves reached a plateau, indicating relatively high species diversity in these samples, especially in samples SR1 and DL.

Fig. 3 Rarefaction analysis at the 97% sequence similarity level
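The idea behind a rarefaction curve can be sketched in Python as follows. Instead of repeated random subsampling, this sketch uses the analytical expectation for sampling without replacement, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], where N_i is the number of sequences in OTU i and N is the total; this is a swapped-in formula chosen for compactness, not necessarily the procedure used to draw Fig. 3. The counts and function names are hypothetical.

```python
from math import comb


def expected_otus(otu_counts, n):
    """Expected number of OTUs seen in a random subsample of n reads."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of reads")
    all_draws = comb(total, n)
    # For each OTU, 1 minus the probability that none of its reads is drawn.
    return sum(1 - comb(total - count, n) / all_draws
               for count in otu_counts if count > 0)


def rarefaction_curve(otu_counts, step=1):
    """Return (depth, expected OTUs) pairs from 0 up to the full sample."""
    total = sum(otu_counts)
    depths = list(range(0, total, step)) + [total]
    return [(n, expected_otus(otu_counts, n)) for n in depths]


# Hypothetical OTU counts for one sample (not data from the study).
counts = [10, 5, 2, 2, 1, 1, 1]
for depth, otus in rarefaction_curve(counts, step=5):
    print(depth, round(otus, 2))
```

A curve that is still rising at the full sequencing depth, as reported for the samples here, means that additional reads would be expected to reveal further OTUs.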

3.2 Diversity of bacterial communities

Overall, the vast majority of sequences belonged to four major phyla: Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes. Among them, Firmicutes and Proteobacteria were the dominant phyla in all 11 samples, but their ratio and composition varied considerably among the samples. In all samples, most sequences (more than 50%, except for sample SL8) came from Proteobacteria. Sample SL4 had the highest level of Proteobacteria sequences (94.0%). Thus, among the phyla represented in the samples, Proteobacteria was the most abundant, followed by Firmicutes. All sequences could be grouped into more than ten families; the nine main families are shown in Fig. 4. In the Proteobacteria phylum, the majority of the sequences were restricted to Pseudomonadaceae and Enterobacteriaceae. Pseudomonadaceae was the most abundant family in the vast majority of samples, except for SR1, DL, and LJ, followed by Streptococcaceae of Firmicutes and Enterobacteriaceae. The Acetobacteraceae family of Proteobacteria accounted for a high proportion in SR1 and SR2, with the percentage in SR2 reaching up to 39% (Fig. 4).

Fig. 4 Comparison of the bacterial flora in the 11 samples as revealed by 16S rRNA gene sequences

At the genus level, the number of genera in each sample ranged from 9 to 18. The shared genera were mainly Pseudomonas, Lactococcus, and Leuconostoc. Leuconostoc was detected in all of the samples except for sample SL3, and less than 7.0% of sequences could be assigned to Leuconostoc in every sample. Lactobacillus was also prevalent among the samples; however, it accounted for a very small fraction of the sequences, with the exception of sample DL, which contained as many as 46.6% Lactobacillus sequences. Among all the genera, the sequences belonging to Pseudomonas were the most abundant, and the samples from Shilin had the largest quantity of Pseudomonas sequences. For sample SL4, sequences belonging to Pseudomonas constituted approximately 72.6% of all the detected sequences. Sample SL5 had the lowest percentage at 29.7%. The sequences belonging to Lactococcus constituted approximately 33.8% of the sequences from sample SL2 and only 7.5% in sample SL1. In sample SL5, the percentage of sequences belonging to Lactococcus reached 31.2%, which was greater than the percentage of sequences belonging to Pseudomonas. SL5 was the only sample with more Lactococcus sequences than Pseudomonas sequences. In sample LJ, Vagococcus of Firmicutes constituted approximately 25.8%, which was different from other samples.

3.3 Phylogenetic analysis

An NJ tree was constructed using the OTU sequences based on a 3% cutoff and related sequences obtained from the NCBI GenBank database. The 12 OTUs that each accounted for more than 1% of the sequences were selected for constructing the phylogenetic tree (Fig. 5a). Across all of the samples, the most abundant OTU was MK9, followed by MK5. Approximately 17.4 and 10.3% of the sequences observed came from MK9 and MK12, respectively, whose sequences belong to Pseudomonas psychrophila (KC904093) and Pseudomonas fragi (AB685610). The OTU MK11 belonged to Pseudomonas brenneri (KF040473), another Pseudomonas species. The OTU MK5 represented approximately 10.6% of the 16S rRNA gene sequences and came from Lactococcus lactis subsp. cremoris (KF149853). Approximately 4.2% of the sequences belonged to the OTU MK2, which also showed a high similarity to sequences from L. lactis subsp. cremoris (KF149853). One third of the OTUs were members of the Lactobacillales order. Among them, 4.8% of the sequences belonged to the OTU MK3, and sequences from this OTU were similar to Lactobacillus kefiranofaciens (KF149811). This OTU was found in a higher proportion in the DL sample (Fig. 5b). The OTU MK2 was mainly distributed in the SR1, SR2, and DL samples, while the OTU MK5 was detected at higher levels in the SL2 and SL5 samples, but not in the DL sample (Fig. 5b). Additionally, approximately 1.7% of all the sequences belonged to the OTU MK6, which could be assigned to Vagococcus salmoninarum (AM490375) and was more frequently detected in the LJ sample (Fig. 5b). The OTU MK7, with sequences similar to Moellerella wisconsensis (GQ451444), was detected only in the LJ sample (Fig. 5b).

Fig. 5 a NJ tree constructed using pyrosequencing sequences and the related sequences obtained from the National Center for Biotechnology Information GenBank database. b The abundance of OTUs in the 11 samples. The OTUs obtained in this study are shown in boldface. MK, name of the OTUs

4 Discussion

In this study, pyrosequencing-based 16S rRNA profiling provided detailed insights into the complex microbiota of yond bap. Because Shilin was the major place of production, it contributed the most samples in this study. A previous study (Ercolini et al. 2012) characterizing the microbial community diversity of buffalo mozzarella cheese indicated that more than 2,000 reads per sample would be sufficient to obtain good coverage, with coverage above 93% in all cases. In the present study, the estimated coverage of the 16S rRNA gene sequences was over 80% for all samples (Table 1). Although the rare biosphere might have a certain influence on the fermented food, the dominant microorganisms that had the largest effect on the fermentation process were detected and analyzed. No rarefaction curve of any of the samples reached an asymptote (Fig. 3), implying that not all phylotypes present in the bacterial communities were detected and that the microbial community diversity of the fermented food is higher than observed.

The results indicated that all the bacteria belonged to Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes. Generally, Firmicutes and Proteobacteria were the dominant bacterial communities, associated with LAB and environmental strains, respectively, which is in line with the bacterial community diversity found in most fermented foods (Laureys and De Vuyst 2014; Nie et al. 2013). The bacterial community showed various profiles in different sample types (Figs. 4 and 5b), although there were no significant differences in the microbial community structures among the samples from Shilin. Because the fermentation is spontaneous, different factors may affect community clustering, such as the sampling site (Kim et al. 2011), the freshness of the samples, water quality, and the local air. Among the LAB, the most abundant OTU, MK5, was assigned to L. lactis subsp. cremoris (JF297374), which was derived from starter-free cheeses made of raw milk (Fernandez et al. 2011). The sequences of OTU MK3 belonged to L. kefiranofaciens (AM419051), which was previously detected in kefir grain (Ninane et al. 2007). Starter-free cheeses and kefir produced with kefir grains are both fermented products, indicating that the two OTUs are associated with fermentation processes using raw materials. Therefore, L. lactis subsp. cremoris and L. kefiranofaciens may play a leading role in the fermentation process of yond bap. We were also able to isolate LAB from the samples and identify beneficial microorganisms, such as the Lactobacillus plantarum strain AY01 (Li et al. 2013). Thus, investigating the microbial community structure can contribute to the improvement and quality control of fermented food. It is worth mentioning that V. salmoninarum was identified only in the LJ sample. V. salmoninarum is a gram-positive, chain-forming, coccoid- to ovoid-shaped bacterium that is regularly isolated from diseased fish (Michel et al. 1997). The sequences of OTU MK6 were similar to those of V. salmoninarum (KF012888), which was detected in rainbow trout (Quintela-Baluja et al. 2013). Vagococcus spp. are very common in fermented foods, such as fermented soybean, fermented fish products, and fermented kimchi (Singh et al. 2014; Thokchom and Joshi 2012). The annual temperature of Lijiang is 4–26 °C, which is in accordance with the growth temperature of V. salmoninarum. In a recent study, V. salmoninarum was also detected in cheese (Delcenserie et al. 2014). Most studies on V. salmoninarum have focused on diseased fish, so the specific reasons for its presence in yond bap should be studied further.

Pseudomonas was clearly the most abundant genus in most samples. The most abundant OTUs, MK9 and MK12, were similar to P. psychrophila (FN554506) (Mulet et al. 2010) and P. fragi (EU255303) (Selvakumar et al. 2009), respectively, which were detected in a cold room for food storage and in high-altitude rhizospheric soil. Pseudomonas is well known to occur in soil, water, dirt, and air. Its presence in yond bap is most likely because the product is still made in traditional family workshops and is not stored in a strictly sealed environment. When yond baps are transported for sale at the farmer’s market, many environmental strains may adhere to their surface. Environmental strains may also come into contact with the product during production, for example in the coagulating agent addition step. A previous study (Delgado et al. 2013) confirmed the presence of Pseudomonas spp. in raw milk, and thus we speculate that Pseudomonas spp. could be derived from raw goat’s milk. Although boiling should kill the microorganisms in milk, some living Pseudomonas spp. may persist if boiling is insufficient. The OTUs MK7 and MK4 were closest to M. wisconsensis (DQ217617) and Hafnia alvei (DQ217641), respectively, both of which were isolated from the intestinal tract of Salmo trutta fario L. (Skrodenyte-Arbaciauskiene et al. 2006). Among them, M. wisconsensis was detected only in the LJ sample (Fig. 5b), as was V. salmoninarum. We speculate that this sample was collected close to a fish market and was contaminated there. In addition, yond bap is not salted under normal circumstances. However, we cannot rule out the possibility that added salt introduced M. wisconsensis and V. salmoninarum as marine bacteria. It is also possible that the yond bap was contaminated by these species from other salted foods during processing. The pH of the coagulating agent is approximately 4 (Fig. 1), which may be the cause of the high numbers of Acetobacteraceae in SR1 and SR2. In addition, there were high numbers of Lactobacillaceae in both samples, which could produce organic acids (e.g., acetic acid); this could also explain the presence of Acetobacteraceae in SR1 and SR2.

5 Conclusion

In conclusion, this report describes the bacterial community diversity found in yond baps, a traditional fermented dairy product in China. Previous studies usually applied culture-dependent methods or culture-independent fingerprinting methods, such as T-RFLP, DGGE, and others. This study takes advantage of pyrosequencing, which comprehensively revealed the complex bacterial flora. All of the bacteria detected were affiliated with Proteobacteria, Firmicutes, Bacteroidetes, and Actinobacteria. The results revealed different microbial community structures in the samples from different regions. To determine the role of the microorganisms found in yond baps, further study of the microbial community involved in the fermentation process is necessary.