Traditional directed evolution relies on the experimenter laboriously creating new generations of candidate molecules, performing a selection by hand, extracting the DNA of the best performers, and repeating the cycle. Because directed evolution is most effective over many generations, the time and labor costs are significant and typically limit experiments to only a handful of cycles.
During phage-assisted continuous evolution (PACE), we harness the fastest-evolving replicator in nature, the bacteriophage, to continuously evolve proteins or nucleic acids toward desired functions in vivo. We link the phage's ability to reproduce to the desired molecular activity by moving a gene required for phage infection (gene III) from the phage genome into the host cell and replacing it with the gene(s) we want to evolve. We then engineer the host cell to produce the missing phage protein (pIII) in proportion to the desired activity.
When grown in a continuous culture apparatus, the phages compete to optimize the function of the gene(s) to be evolved in order to maximize the amount of phage protein produced, and thus their own fitness. Due to the speed of the phage life cycle, automated evolution occurs dozens of times faster than alternative methods and is readily performed in parallel.
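As a rough illustration of this competition, the sketch below tracks how a fitter phage variant takes over a continuously diluted culture. It assumes simple exponential replication with every variant washed out at the same rate. The function name `fitter_fraction` and all rates are hypothetical values chosen for illustration, not measurements.

```python
import math


def fitter_fraction(f0, r_fit, r_weak, hours):
    """Fraction of the population made up of the fitter variant after `hours`.

    Assumes each variant grows as N(t) = N0 * exp((r - D) * t), where r is its
    replication rate (per hour) and D is the shared dilution rate. D cancels
    in the ratio of the two variants, so only the rate difference matters.
    """
    if not 0 < f0 <= 1:
        raise ValueError("f0 must be in (0, 1]")
    ratio0 = (1 - f0) / f0  # initial weak:fit ratio
    ratio = ratio0 * math.exp((r_weak - r_fit) * hours)
    return 1 / (1 + ratio)


# Hypothetical example: a variant starting at 1% of the population with a
# modest replication advantage (3.0 vs 2.0 per hour), followed for one day.
print(fitter_fraction(0.01, 3.0, 2.0, 24.0))
```

Under these assumptions the dilution rate does not change which variant wins; it only determines whether a variant replicates fast enough to persist at all.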
Importantly, cells in PACE are merely transient replication factories for the evolving phage population. All host cells flow through the evolution vessel, termed a "lagoon", faster than they can divide. This prevents the cellular genome from evolving or otherwise interfering with the experiment, and also allows us to diversify the phage population by raising the mutation rate beyond what the bacteria could otherwise tolerate.
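The washout condition can be made concrete with a short calculation. This is a minimal sketch assuming a well-mixed vessel and exponential growth, in which a replicator is lost whenever its growth rate falls below the dilution rate. The helper names and the doubling times and flow rate in the example are hypothetical, illustrative values.

```python
import math


def growth_rate(doubling_time_h):
    """Exponential growth rate (per hour) for a given doubling time in hours."""
    return math.log(2) / doubling_time_h


def washes_out(doubling_time_h, dilution_per_h):
    """True if a replicator cannot keep pace with dilution in a well-mixed vessel.

    `dilution_per_h` is the flow rate in vessel volumes per hour.
    """
    return growth_rate(doubling_time_h) < dilution_per_h


# Hypothetical numbers: host cells doubling every 30 minutes, a phage
# population doubling every 6 minutes, and a flow of 2 volumes per hour.
print(washes_out(0.5, 2.0))  # host cells
print(washes_out(0.1, 2.0))  # phage
```

With these illustrative numbers the host cells are diluted faster than they can divide, while the phage replicate fast enough to remain in the vessel.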
In David Liu's laboratory, I initially used PACE to generate T7 RNA polymerase variants capable of recognizing distinct promoters and initiating transcription with alternative bases. Each evolution required less than one week of PACE. More recently, we developed and applied a method to experimentally measure the extent of protein evolutionary convergence and reproducibility at the genetic and phenotypic levels over hundreds of generations.
A second key challenge is to understand the factors influencing evolutionary outcomes. To investigate the importance of path-dependence and stochasticity, we programmed PACE to evolve initially identical populations of phage encoding T7 RNA polymerase towards different intermediate activities, then converge towards a shared final activity. Multiple populations were evolved along each path. We found that protein evolution is highly path-dependent and infrequently reproducible. Independent populations are less likely to become trapped at local fitness peaks, experimentally validating Sewall Wright's hypothesis of 80 years ago.