So my first go at identifying wild yeast using ITS sequencing did not work as well as I had hoped. Of course, after setting up the system I learned that the method I was using was best for non-yeast fungi, rather than for yeast. Luckily one of my few readers, Sam – author of eurekabrewing – came to the rescue with a few papers showing better ways to do it.
So I’ve revamped the plan and will use a slight variation of my old method. Previously I was trying to sequence the regions of the genome between pieces of the ribosome (full explanation of the method and what a ribosome is can be found here). Instead, I’m going to sequence part of the 26S portion of the ribosome (26S rRNA). The specific part I’m amplifying has been used successfully to identify yeast in the past, including the very species of yeast we’re likely to find in a wild brew [1-3].
As always, the meat is below the fold…
So the method is easy – it’s just a minor adjustment to the old method. The only thing that is different is the primers used to amplify the DNA – instead of the ITS1 & ITS4 primers, we use the NL1 and NL4 primers:
- NL1: 5′ – GCATATCAATAAGCGGAGGAAAAG – 3′
- NL4: 5′ – GGTCCGTGTTTCAAGACGG – 3′
The procedure:
- Set up a 30 µl PCR using 2 µl of the DNA prepared last time, Pfu polymerase, and the NL1 & NL4 primers. PCR conditions are 40 cycles of:
  - 96C/30sec melt
  - 50C/30sec annealing
  - 72C/45sec extension
- Run PCR products on a 2% agarose gel & purify the DNA
- Sequence using the NL1 and NL4 primers*
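As a quick sanity check on the reaction above, here is a minimal sketch that tallies the length and GC content of the two primers and adds up the hold time of the cycling program. It is only arithmetic on the numbers given in this post; the function names are made up for illustration, and ramp times, any initial denaturation and any final extension are assumed away.

```python
# Primer sequences as listed above (5' to 3').
FORWARD = "GCATATCAATAAGCGGAGGAAAAG"
REVERSE = "GGTCCGTGTTTCAAGACGG"


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cycling_minutes(cycles, step_seconds):
    """Total hold time in minutes, ignoring ramping between steps."""
    return cycles * sum(step_seconds) / 60


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(primer)} nt, GC = {gc_fraction(primer):.0%}")
    # 40 cycles of 30 s melt, 30 s annealing, 45 s extension
    print(f"hold time: {cycling_minutes(40, [30, 30, 45]):.0f} min")
```

The two primers differ a fair bit in length and GC content, which is one reason a fairly forgiving annealing temperature is a reasonable starting point.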
The Main Results
Figure: Once again, Brett PCR’d nicely; the other yeasts, not-so-much.
As last time, only the Brett sequences amplified. But they amplified very well, producing products ~600bp in size. This can be seen on the gel to the left – the B. bruxellensis & lambicus produced strong bands at the expected size, while the Irish ale (1084) and the mystery yeast from my club president’s ruined batch (Pres) did not amplify.
This lack of amplification could be due to a number of problems; I can think of two. Firstly, the 1084 & Pres strains are/may be commercial yeast strains. As these strains were selected for by brewers, they have lost a lot of the characteristics which would allow them to survive in the wild. As such, they may not be as tough as the more-wild Bretts, so their DNA may have been destroyed by the fairly rough handling used to purify it. Secondly, there was a lot more yeast used in the Pres & 1084 isolations – perhaps I saturated the system and the DNA ended up stuck in the protein/chloroform layers of the purification.
Either way, I’m going to try an isolation “free” method next time, which is described later in this post – in the section bearing the astoundingly astute title ‘Next Time’…
As for the two Brett sequences which worked, they were packed up and sent for sequencing. I messed up my preparation, so only the lambicus got sequenced from both ends. The resulting sequences were BLASTed against NCBI’s full nucleotide collection, as well as aligned relative to the canonical Saccharomyces cerevisiae [4-5] and Brettanomyces bruxellensis [6] 26S rRNA sequences.
| Sample | Sequence | ID via BLAST | % Align |
|---|---|---|---|
| | | Hundreds of equal score | 99% or more across >95% of the sequence |
| | | Hundreds of equal score | 99% or more across >95% of the sequence |
Note: ‘Sample’ indicates the yeast strain sequenced, ‘ID via BLAST’ is the closest match in the NCBI nucleotide database, ‘% Align’ is the % positive alignment with the closest matches
Again, the extreme density of data in the NCBI database bites me in the ass. Over 1200 strains matched my sequences with >99% identity. The good news is that my two sequences are not 100% identical – not counting the 4 places where a sequence had an unknown (‘n’), 388 out of 519 bases match (75% match). That was far more variability than I had expected!
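The comparison above boils down to counting matching positions while skipping the unknowns. A minimal sketch of that count is below; it assumes the two sequences have already been aligned to the same length, and the function name and toy sequences are made up for illustration (they are not my real reads).

```python
def percent_identity(seq_a, seq_b, unknown="n"):
    """Compare two pre-aligned sequences position by position.

    Positions where either sequence has the unknown base are skipped.
    Returns (matches, positions compared, percent identity).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.lower(), seq_b.lower()):
        if a == unknown or b == unknown:
            continue  # an unknown base tells us nothing either way
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches, compared, 100 * matches / compared


if __name__ == "__main__":
    # Toy example: one mismatch, one unknown
    print(percent_identity("ACGTN", "ACGAA"))
    # The figure quoted above: 388 matches out of 519 compared bases
    print(round(100 * 388 / 519, 1))
```

Note that this treats a gap character like any other symbol, so a gap against a base counts as a mismatch.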
So what happens if we compare these only to the matching genera – i.e. Brettanomyces or Dekkera? Our brux strain matches, 100% on ~90% of sequenced bases, with multiple Brettanomyces naardenensis strains. If we further limit our search to just bruxellensis strains we also get 100% matches, on short gene segments. Our lambicus strain aligns most closely with Dekkera anomala. As with the bruxellensis-restricted search, all of the anomala sequences in NCBI are partials, so we only get a partial match with our sequences.
I think that latter issue is going to limit the usefulness of this method. Many of the sequences in NCBI are partial sequences – i.e. fragments of genes instead of entire genes/genomes. As a result, matches end up being biased towards sequences which match modestly well across the whole of the sequence I enter, instead of perfect matches between my larger sequences and the shorter fragments typical of the bruxellensis and anomala sequences in the database. In other words, the system is biased towards longer segments with a few mismatches over shorter sequences with perfect matches.
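To see why length wins, here is a deliberately simplified sketch of an alignment score: a fixed reward for every match and a fixed penalty for every mismatch, with no gaps. The reward and penalty values and both example hits are illustrative assumptions, not the settings or results of my actual search, and real BLAST ranks hits by statistics derived from a score like this rather than by the raw number itself.

```python
def raw_score(matches, mismatches, reward=1, penalty=2):
    """Ungapped alignment score: +reward per match, -penalty per mismatch."""
    return matches * reward - mismatches * penalty


if __name__ == "__main__":
    # Hypothetical hit 1: a long database entry covering the whole read,
    # with a handful of mismatches.
    long_imperfect = raw_score(matches=510, mismatches=9)
    # Hypothetical hit 2: a short partial entry that matches perfectly.
    short_perfect = raw_score(matches=200, mismatches=0)
    print(long_imperfect, short_perfect)
```

With these made-up numbers the long, imperfect hit scores well over twice as high as the short, perfect one, so it sorts to the top of the results even though the short hit is the better species match.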
What does this mean for this method? Unfortunately, it means that it’s not as useful as I had hoped it would be. Not because the method is flawed, but rather because the coverage quality of genome sequences in the NCBI database is variable. That doesn’t mean that this method is useless – it does, however, mean that I need to first acquire some more conventional information on the yeasts I’m trying to identify – i.e. reduce the list of possibles to a smaller number of species, and then BLAST the 26S sequences against that list. The ‘possibles’ list can simply be built from a mixture of morphology and some basic biochemical observations (e.g. bromocresol green metabolism). Likewise, as I acquire my own library of high-quality sequences, I can BLAST new samples against those, providing a beer-orientated “short list” of sequences to compare to.
If you’ve made it this far in this already-long write-up, congratulations! While I’ve exhausted the basic discussion of the results at this point, I did have a bit of a nerd-gasm and did some deeper analysis. This is discussed below – warning, it’s discussed without my usual attempts at explaining a lot of the science lingo, and includes a bit of a rant on an irrelevant topic…
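The “short list” idea can be sketched as a simple filter over a hit table: keep only the hits whose organism name starts with one of the candidate genera or species, then sort what is left by identity. The record layout (`organism`, `identity`), the function name, and every value in the example are placeholders invented for illustration, not real search output.

```python
def filter_hits(hits, shortlist):
    """Keep hits whose organism name starts with a shortlisted name.

    hits: list of dicts with 'organism' and 'identity' keys (assumed layout).
    shortlist: genus or species names from morphology/biochemistry.
    """
    wanted = tuple(name.lower() for name in shortlist)
    kept = [h for h in hits if h["organism"].lower().startswith(wanted)]
    return sorted(kept, key=lambda h: h["identity"], reverse=True)


if __name__ == "__main__":
    # Placeholder hits; the identity values are made up.
    hits = [
        {"organism": "Saccharomyces cerevisiae", "identity": 99.1},
        {"organism": "Dekkera anomala", "identity": 99.4},
        {"organism": "Brettanomyces naardenensis", "identity": 100.0},
    ]
    for hit in filter_hits(hits, ["Brettanomyces", "Dekkera"]):
        print(hit["organism"], hit["identity"])
```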
In Which I Totally Nerd Out
The discussion I have been having with Sam (author of Eureka Brewing) in my last yeast sequencing attempt has motivated me to geek-out a little and write a bit more in-depth analysis of the results of this trial. But before I begin I should comment on one ‘issue’: the classification of species and how this relates to our friend Brettanomyces.
Prior to the age of modern genetics, species were classified based on biochemical, physical and other properties. This form of classification is termed “systematics”. With the modern genetic era a new (circa 1963) method of classifying organisms developed where the patterns of genetic inheritance are used to classify organisms. This ‘new’ method is termed “cladistics”. As odd as it may sound, the systematists maintain a stranglehold on the naming of new species, and while 99% of the time the systematists & cladists agree, their occasional disagreements tend to lead to vigorous – and often less-than-civil – debate. There are legitimate scientific issues on both sides of the fence – systematists often have issues when closely related species undergo convergent evolution, thus creating the appearance of one species where two exist, while cladists face issues of determining if the genes being used to compare species are truly homologous. As our knowledge of genetics has improved, this limitation of cladistics is fading, but as you will soon see, it remains an issue. Arguments on both sides are further complicated by the fact that the term ‘species’ is not clearly defined, while the terms “strain” and “subspecies” remain ambiguous terms without accepted definitions.
So that aside, I’m now going to take a cladistic approach to my sequences. Simply because cladistics is the right way to do things – and now you know which side of the phylogeny war I am on…
Figure: The genetic relationship between various Brettanomyces species/strains. Note that B. lambicus appears in two locations in the figure, indicating that two sub-strains of B. lambicus exist. One of these clusters with B. bruxellensis, while the other falls between two recognized Brettanomyces species – bruxellensis and anomalus. From: The Yeasts – A Taxonomic Study
Brettanomyces is (or at least, has the potential to be) one area of systematist/cladist disagreement. Systematists have long held B. lambicus to be a synonym for B. bruxellensis – i.e. two names for the exact same thing. As you can see in the above figure, which uses a simple method to measure relative genetic differences, this is absolutely true for some lambicus strains. At the top of the diagram, represented by the black area, is a clade (group of closely related organisms) including all of the prototypical B. bruxellensis, as well as a strain of B. lambicus. Clearly, this one strain of lambicus is identical to bruxellensis – i.e. they are the same thing.
But, mid-way down the diagram is another strain of B. lambicus, one that is more closely related to B. anomalus (which is widely agreed to be a separate species) than it is to B. bruxellensis. And here we see an example of the failure of systematics – as measured by a (crude) genetic profile, some strains of B. lambicus either form a unique clade (i.e. a group of strains distinct from other groups of related organisms) or are an evolutionary intermediary between B. bruxellensis & anomalus. Based on the above diagram, we cannot determine which of the two possibilities is true, but with today’s modern genetics we can do such a comparison.
Below I’ve included a genetic tree comparing the 26S rRNAs of the two strains I sequenced here (WLP 653 [lambicus] & 650 [bruxellensis]) along with ‘default’ B. anomalus, bruxellensis and Saccharomyces cerevisiae strains for comparison. The tree itself was built using phylogeny.fr‘s one-click tree generator, and was restricted to the variable (D1, D2 and D3) regions of the 26S rRNA. I did this restriction as this region was the only one available for anomala. By restricting to this segment we reduce any bias induced by different sequence lengths:
Figure: Phylogenetic tree comparing my B. lambicus (WLP653) and B. bruxellensis (WLP650) sequences to “standard” sequences of B. bruxellensis, anomala, and conventional brewer’s yeast (Saccharomyces cerevisiae).
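The restriction step for the tree amounts to trimming every aligned sequence down to the stretch that all of them cover. Below is a minimal sketch of that trimming, assuming the sequences have already been aligned to equal length with ‘-’ marking positions a sequence does not cover. The function name and the toy alignment are invented for illustration; this is not what phylogeny.fr does internally.

```python
def trim_to_shared_region(aligned):
    """Trim aligned sequences to the columns covered by every sequence.

    aligned: dict mapping a name to a gapped sequence; all the same length.
    Only leading and trailing gaps are trimmed; internal gaps are kept.
    """
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    # The latest start and the earliest end across all sequences
    start = max(len(seq) - len(seq.lstrip("-")) for seq in aligned.values())
    end = min(len(seq.rstrip("-")) for seq in aligned.values())
    if start >= end:
        raise ValueError("the sequences share no common region")
    return {name: seq[start:end] for name, seq in aligned.items()}


if __name__ == "__main__":
    toy = {
        "long_read": "ACGTACGT",
        "short_partial": "--GTAC--",
        "medium_partial": "-CGTACG-",
    }
    print(trim_to_shared_region(toy))
```

After trimming, every sequence contributes the same columns to the tree, so a long read cannot look more diverged simply because it has more bases to disagree on.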
Unexpectedly, our two brewing strains fall far off of the “default” sequences for their supposed species (B. bruxellensis), suggesting that significant evolution has occurred during their pseudo-domestication in a lambic brewery. Amazingly, the B. lambicus strain (WLP653) is in a completely separate clade from the other Bretts, while the B. bruxellensis strain (WLP650) appears to be an outlier when compared to both “default” B. bruxellensis and an accepted separate species, B. anomala.
On the surface this may seem odd – how could strains evolve further away from their parental species (bruxellensis) in a brewery over a few centuries than a completely separate species (anomala) that diverged tens of millions of years ago? There are several possible answers to this, none of which I can confirm or eliminate at this juncture:
- Conventional brewing yeasts have undergone huge changes over the past few centuries, including inter-species crosses (which appear to have led to lager yeast), partial duplications of large segments of their genomes, etc. It’s possible that these semi-domesticated Brettanomyces underwent a similar case of rapid evolution in the brewery, possibly including hybridization with other Brettanomyces species.
- The rRNAs of the brewing strains may have evolved faster than the remainder of the genome, meaning that if we compared other regions we might see less change. This is a potential issue with using only one gene to establish the evolutionary relationship between species. Indeed, the sequenced region is known to be highly variable, which is why we sequenced it in the first place.
- I may have contamination. I doubt this – mixed sequences usually create unsequenceable results – but it’s a possibility to keep in mind. The fact I get 100% matches against segments of the species these strains are known to be derived from suggests that contamination is unlikely.
1. Curtin CD, et al. (2007) Genetic diversity of Dekkera bruxellensis yeasts isolated from Australian wineries. FEMS Yeast Research. Vol 7 #3.
2. Boekhout T et al. (1994) Phylogeny of the Yeast Genera Hanseniaspora (Anamorph Kloeckera), Dekkera (Anamorph Brettanomyces), and Eeniella as Inferred from Partial 26S Ribosomal DNA Nucleotide Sequences. Int. Jour. System. and Evol. Microbiol. Vol 44 #4.
3. Guillamon JM et al. (1998) Rapid identification of wine yeast species based on RFLP analysis of the ribosomal internal transcribed spacer (ITS) region. Arch Microbiol. Vol 169.
4. Saccharomyces cerevisiae strain YJM789 complete ribosomal sequence
5. Saccharomyces cerevisiae 25S ribosomal RNA gene, complete sequence
6. Dekkera bruxellensis 26S ribosomal RNA gene, partial sequence