Organisms: the schistosome blood flukes S. mansoni and S. haematobium
Laboratories: Brindley (George Washington University), Esvelt (MIT).
Safeguards: ecological confinement (not endemic in the D.C. area), life-cycle complexity (requires intermediate snail host), barrier confinement
Schistosomes are parasitic flatworms that live in freshwater. On encountering a person, they burrow through the skin, migrate through the circulatory system to the intestines and liver (S. mansoni) or urinary tract (S. haematobium), pair up with a parasite of the opposite sex, and begin releasing thousands of eggs. In some cases infection is asymptomatic, but in others the egg release causes chronic damage to internal organs, usually due to an immune reaction. In children, this causes poor growth and permanent learning difficulty. Death can result from renal or liver failure, bladder cancer, or other organ damage. An estimated 20 million people suffer from severe consequences. The disease is also called bilharzia. GiveWell has ranked the Schistosomiasis Control Initiative as a top-rated charity since November 2011.
Paul Brindley's lab at George Washington University is leading the project. As pioneers in schistosome transgenesis and genetics, they are among the very few researchers skilled enough in working with schistosomes to undertake such an effort. We are also profoundly grateful to the many other researchers who have tirelessly studied this neglected disease and the funders who have supported their efforts, and can only hope to build upon their work. Particular thanks go to Thomas Mather and MaxMind Inc. who have generously offered to fund the initial stage of the project.
The project will begin by characterizing components necessary to create efficient CRISPR-based global drive systems in schistosomes. Using this knowledge, we will then design and build population suppression drives intended to locally or globally eradicate the schistosomes that cause most human disease. The complete genome sequences of S. mansoni, which causes intestinal schistosomiasis, and S. haematobium, the cause of urogenital schistosomiasis, are available at genedb.net and schistodb.org. Because the genetics and chromosomal architecture of S. mansoni are better understood, we will initially focus on that organism and follow with S. haematobium once comparable information is available.
Suppression gene drive options
Austin Burt outlined two types of population suppression gene drives. Genetic load drives disable recessive genes required for fertility or viability and are copied exclusively in the germline during pre-meiosis. This ensures that the gene drive spreads rapidly when it is rare, as most offspring will be fully fertile and viable heterozygotes, but will crash the population once abundant because matings between two drive-carrying organisms will produce infertile or nonviable offspring. Unfortunately, schistosome genetics are not sufficiently developed to reliably pinpoint suitable target genes.
Sex-biasing gene drives ensure that nearly all offspring are a particular sex – usually male – in order to progressively reduce the reproductive potential of the population. They are typically encoded on a sex chromosome and gain a fitness advantage by shredding the “opposing” sex chromosome during meiosis, such that the driving chromosome is preferentially inherited by offspring. In XY animals, a male-biasing drive is a Y-encoded “X-shredder” that acts exclusively in males to shred the X chromosome, thereby ensuring its own inheritance. Austin Burt's group has encoded the homing endonuclease I-PpoI, which targets a repeated ribosomal RNA gene, on an autosome and shown that the offspring are over 95% male, but they have not yet succeeded in encoding the shredder on the Y chromosome to create a true sex-biasing drive.
Schistosomes are a ZW rather than XY species, so females are ZW and males ZZ. Unlike the X and Y of humans and mice, the Z and W are largely homologous, but there are unique regions of both. It is therefore possible to make a female-biasing Z-shredder and a male-biasing W-shredder. From a fitness perspective the Z-encoded W-shredder is a superior option because it can also act as a conventional gene drive in ZZ males. That is, in ZW females it will shred the W, thereby ensuring that offspring inherit the driving Z chromosome and are consequently male, but will also copy itself from the driving Z to the wild-type Z in heterozygous males. It will consequently be even more fit than a comparable X- or Z-shredder. Both activities are advantageous and should consequently be evolutionarily stable and even optimized over time, albeit opposed by any unlikely suppressors that might evolve on the W or the autosomes. The greater fitness should result in an increased likelihood of driving the target schistosome to extinction.
Evolutionary stability of a W-shredder suppression drive in S. mansoni
Schistosomes are also uniquely suited for the W-shredder approach because the W chromosome is distinguished by large numbers of repeated sequences that are clustered around the centromere. Because these sequences are not shared by the Z and there are many copies, targeting multiple sites within multiple repeats can ensure that there are dozens if not hundreds of target sites available for cutting such that the W chromosome should be reliably shredded. Unlike the proposed I-PpoI X-shredder in Anopheles gambiae, CRISPR/Cas can target multiple sites within different W-encoded repeats to ensure that resistance is unlikely to arise. Cutting at several different sites can prevent the cell from simply converting all repeats to a resistant sequence, which is what occurs when I-PpoI is expressed in yeast (Muscarella et al. 1993). In contrast, the region unique to the Z does not appear to contain unique repeated sequences. Rather, it is quite gene-rich, offering several good targets for a conventional gene drive that cuts and recodes the 3' region of an essential gene.
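The benefit of having many target sites can be illustrated with a small calculation. The sketch below assumes that each site is cut independently with the same probability, which is a simplification not made in the text; the per-site probability of 0.2 and the site counts are arbitrary illustrative values, not measurements.

```python
def prob_w_cut(per_site_cut_prob: float, n_sites: int) -> float:
    """Probability that at least one of n independent target sites is cut.

    Assumption: sites are cut independently with equal probability.
    """
    return 1.0 - (1.0 - per_site_cut_prob) ** n_sites


# Illustrative values only: a weak per-site cutting rate, varying site counts.
for n_sites in (1, 4, 24, 100):
    print(n_sites, prob_w_cut(0.2, n_sites))
```

Under these assumptions, even an inefficient guide leaves the W very unlikely to escape cutting once dozens of repeat copies are targeted.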
Expression timing and expected outcomes
The W-shredder needs to express Cas9 in both males and females to ensure that it can be copied from Z to Z in males and shred the W in females. Ideally, this should occur exclusively in the germline to minimize the risk that accidental cuts elsewhere in the genome will reduce the fitness of the organism. But when the copying happens is also important, as it will determine the overall impact of the gene drive on the schistosome population. There are 3 possible outcomes of a W-shredder drive depending on when it is expressed and its effects on egg production.
1) The drive is active just before sperm and egg production in the germline. W-shredding in females causes (nearly) all eggs to inherit a Z chromosome carrying the drive. Total egg production is >50% that of a female with a wild-type Z. In males that inherit only 1 copy, the drive element is copied onto the wild-type Z before sperm production so that nearly all sperm carry the driving Z.
In this scenario, a W-shredding Z has higher fitness than a wild-type Z because it is inherited by nearly all offspring rather than only half. That it also copies itself from Z to Z in males only increases its fitness. It's evolutionarily stable because both types of drive are actively beneficial regardless of their frequency. The population becomes increasingly dominated by males and consequently crashes. This is severe enough that affected schistosomes will likely go locally extinct. Depending on how far schistosomes travel, this may spread from subpopulation to subpopulation until the entire species is extinct, or the crash might be fast enough that drive-carrying schistosomes don't manage to spread to relatively isolated subpopulations. In this case directly releasing drive-carrying organisms in unaffected subpopulations will also drive them extinct. This will enable us to totally eradicate schistosomiasis caused by the target species.
2) The drive is active just before sperm and egg production in the germline. As above, W-shredding in females causes (nearly) all eggs to inherit a Z chromosome carrying the drive. But intact Z chromosomes aren't preferentially distributed into eggs, so it simply kills (nearly) all eggs that don't inherit a Z. Total egg production is 50% of a female with wild-type Z, and (nearly) all of them carry a Z with the drive. In males that inherit only 1 copy, the drive element is copied onto the wild-type Z before sperm production so that nearly all sperm carry the driving Z.
W-shredding doesn't hurt the prospects of the driving Z chromosome because those eggs that would have carried it anyway all survive. But it doesn't get any more eggs to inherit it than a wild-type Z. It might still help if there's sibling competition between males and females such that eliminating the females makes the males more likely to survive. We don't know if this sort of density-dependent selection exists in schistosomes. Assuming that it doesn't, W-shredding is neutral, which means that it won't be actively selected for and consequently may be evolutionarily unstable – but only slightly, because it shouldn't be particularly harmful either.
The advantage of the driving Z is that W-shredding in females is only one of its activities. In males, nearly all offspring will inherit the driving Z even if the male is also born with a wild-type Z. Note that if the male partners with a wild-type female, they will have both male and female offspring that carry the driving Z. This gives the driving Z a powerful fitness advantage over the wild-type Z. It should be more than enough to counterbalance any costs imposed by the drive. It should cause local schistosome populations to go extinct, but more slowly than outcome 1.
3) The drive is active when the egg is fertilized by a sperm. If the egg carries a W, it is immediately shredded by the driving Z, causing (nearly) all such females to die. If the egg carries a wild-type Z, the drive will cut it and subsequently be copied. With 2 copies of the driving Z, the resulting male is guaranteed to transmit it to all offspring.
In this scenario, the driving Z kills (nearly) all females that carry it, which is costly. But it ensures that males (nearly) always pass it on to offspring. This should mostly counterbalance the loss of females, but is unlikely to completely balance it unless there is fairly strong density-dependent selection that favors schistosomes with fewer siblings. If the net fitness of the driving Z relative to a wild-type Z ends up slightly negative, as seems likely, it should gradually disappear from the population rather than spreading indefinitely, but will effectively bias the local population towards males and cause a crash wherever the drive is abundant. Such a drive would be useful for local population eradication without risking the possibility of spread into other regions that haven't agreed to use it. If the net fitness is slightly positive, it will reliably cause a crash that might extend to all connected subpopulations, but even more slowly than in outcome 2. If, as is most likely, the fitness varies by density, it should spread when rare but slowly die out after causing a population crash, thereby drastically reducing schistosome numbers without driving them extinct.
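Outcomes 1 and 2 can be compared with a deliberately simple deterministic model. The sketch below is an illustration under stated assumptions, not the project's model: generations are discrete, mating is random, the drive has no fitness cost, resistance does not arise, and only genotype proportions are tracked, not absolute population size. The parameters `homing`, `shredding`, and `egg_rescue` are hypothetical names; `egg_rescue=1.0` corresponds to outcome 1 (every shredded-W egg is replaced by a drive-carrying Z egg) and `egg_rescue=0.0` to outcome 2 (those eggs are simply lost). Outcome 3 and density-dependent selection are not modeled.

```python
def simulate_w_shredder(generations=40, homing=0.95, shredding=0.95,
                        egg_rescue=1.0, initial_carrier_males=0.01):
    """Return a list of (generation, female_fraction, drive_freq_in_males)."""
    # Genotype frequencies within each sex; "Zd" is the driving Z.
    males = {"ZZ": 1.0 - initial_carrier_males,
             "ZdZ": initial_carrier_males,
             "ZdZd": 0.0}
    females = {"ZW": 1.0, "ZdW": 0.0}
    female_fraction = 0.5
    history = []
    for gen in range(generations + 1):
        drive_freq = males["ZdZ"] / 2 + males["ZdZd"]
        history.append((gen, female_fraction, drive_freq))

        # Sperm: heterozygous males convert the wild-type Z with probability `homing`.
        sperm_zd = males["ZdZ"] * (1 + homing) / 2 + males["ZdZd"]
        sperm_z = 1.0 - sperm_zd

        # Eggs: drive females lose a fraction `shredding` of their W eggs, and a
        # fraction `egg_rescue` of the lost eggs is replaced by driving-Z eggs.
        egg_z = 0.5 * females["ZW"]
        egg_zd = females["ZdW"] * (0.5 + 0.5 * shredding * egg_rescue)
        egg_w = 0.5 * females["ZW"] + 0.5 * (1 - shredding) * females["ZdW"]
        total = egg_z + egg_zd + egg_w
        egg_z, egg_zd, egg_w = egg_z / total, egg_zd / total, egg_w / total

        # Random union of gametes: Z eggs give sons, W eggs give daughters.
        male_fraction = egg_z + egg_zd
        males = {"ZZ": sperm_z * egg_z / male_fraction,
                 "ZdZ": (sperm_zd * egg_z + sperm_z * egg_zd) / male_fraction,
                 "ZdZd": sperm_zd * egg_zd / male_fraction}
        females = {"ZW": sperm_z, "ZdW": sperm_zd}
        female_fraction = egg_w
    return history


for rescue in (1.0, 0.0):
    gen, female_frac, drive = simulate_w_shredder(egg_rescue=rescue)[-1]
    print(f"egg_rescue={rescue}: generation {gen}, "
          f"female fraction {female_frac:.3f}, drive frequency {drive:.3f}")
```

In this toy model the drive spreads in both settings because of Z-to-Z copying in males, and the sex ratio becomes strongly male-biased. Comparing the intermediate generations of the two runs shows how much the egg-rescue assumption changes the speed of spread.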
Before designing any gene drive, it is essential to take proper precautions. In the case of schistosomes, the risk of accidental release is very low. The organisms are not particularly mobile, to put it mildly, and are not endemic anywhere within several thousand miles of the Brindley lab. The only plausible routes to accidental release and the spread of a gene drive system through wild schistosomes involve the unintended and undiagnosed infection of someone in the laboratory who then travels to an endemic region, or deliberate unauthorized release by a human. Consequently, laboratory security will be a top priority. Treatment of potentially exposed laboratory members with praziquantel prior to travel to endemic regions should be considered. Any additional physical barriers beyond those normal for a schistosome laboratory are probably not necessary due to the low mobility of the species.
All laboratories engaged in building gene drive systems intended as candidates for eventual release should simultaneously build immunizing reversal drives capable of undoing their effects in case something goes wrong. In this case, an immunizing reversal drive is a gene drive that similarly spreads from Z to Z by targeting and recoding the same gene used by the W-shredder drive. It should cut both the wild-type gene, thereby allowing it to spread through the wild population, and the recoded version used by the W-shredder drive. Its own recoded version must not be susceptible to cutting by the W-shredder. This allows it to spread through the wild population and immunize it against the W-shredder and also overwrite the W-shredder when both are present in the same organism.
The advantage of building an immunizing reversal drive is that it can serve as a gene drive control. It should be capable of copying itself onto the wild-type Z chromosome without any of the complications of shredding the W. If the immunizing reversal drive works and the W-shredder does not, we will know it is due to the W-shredder activity and will be able to go about fixing it.
Choosing a target gene exclusive to the Z
To be evolutionarily stable against drive-resistant alleles that block cutting, gene drives must cut sequences important for the fitness of the organism at multiple sites (Esvelt 2014 eLife). Genes encoding ribosomal proteins are often good candidates as they are often either haploinsufficient (i.e. one broken copy is lethal) or at least substantially reduce the chances of reproduction when even one copy is broken. In addition, exchanging the downstream sequence, or 3' untranslated region (UTR), of different ribosomal genes is unlikely to affect fitness as they are all highly expressed.
Lepesant and coworkers in the Grunau laboratory (Genome Biology 2012) identified sequences unique to the W and Z via high-throughput sequencing and qPCR. The gene encoding ribosomal protein S4 (Smp_011570) is present in one of the scaffolds (Smp_scaff000019) they identified as male-specific in Additional file 1 of their Supplementary Information. It has a single intron in the middle. Expression data suggest that it is produced as two transcripts with different start sites. The rpS4 gene is consequently a good target for a gene drive that will target and recode the tail end of the gene, replace the 3'UTR with that of a different ribosomal protein, and insert Cas9 and guide RNAs next to it.
The sgRNAcas9 program (Xie et al. PLoS ONE 2014), used to find target sequences with the fewest possible off-targets, identified five good target sequences within rpS4 that can be targeted with guide RNAs encoding the corresponding spacers. Of these, sites 1 and 2 have GG in spacer positions 1 and 2 (the end closest to the PAM), which corresponds to the highest activity in nematode worms (Farboud and Meyer, Genetics 2015). Unfortunately, target 1 is located in the first half of the gene, before the intron. Because introns can't be recoded like protein-coding sequences, any introns between the gene drive components and the most distant cut site must be removed – which can affect expression. Since we'd rather avoid that complication, we'll go with sites 2-5.
S4 Spacer 2: AACATGTGTATGGTTACGGG
S4 Spacer 3: GGACGTAACTTGGGACGAGT
S4 Spacer 4: GTTCACGTGCGCGACAGCAC
S4 Spacer 5: AAGAGCGCGATAAGCGAATA
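A quick sanity check of the listed spacers can be scripted. The sketch below assumes the spacers are written 5' to 3', so that the PAM-proximal end is the last two bases; it reports length, GC fraction, and whether each spacer ends in GG. The helper names are illustrative only.

```python
S4_SPACERS = {
    "S4 spacer 2": "AACATGTGTATGGTTACGGG",
    "S4 spacer 3": "GGACGTAACTTGGGACGAGT",
    "S4 spacer 4": "GTTCACGTGCGCGACAGCAC",
    "S4 spacer 5": "AAGAGCGCGATAAGCGAATA",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def ends_in_gg(seq: str) -> bool:
    """True if the two PAM-proximal bases are GG (assumes 5'->3' orientation)."""
    return seq.endswith("GG")


for name, seq in S4_SPACERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", f"GG-end={ends_in_gg(seq)}")
```

Of the four spacers retained, only spacer 2 carries the PAM-proximal GG.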
Next, we must recode that portion of the gene to remove the cut sites from the version carried by the gene drive, eliminate potential homology to ensure that homology-directed repair of the Cas9-created double-strand break copies the entire gene drive, and swap the 3'UTR for that of another ribosomal protein gene for the same reason. Two candidate 3'UTRs from other ribosomal protein genes:
rpS6 3'UTR: AATTGTGTAACTAAAAATAAATATGTGGAT (polyA)
rpL4 3'UTR: TGTCCTGAAATACAGTTGTGATGTTGG (polyA)
The original coding region and two different recoded versions, with amino acid sequences and the 3'UTRs above:
The first recoded sequence was chosen specifically to create good target sites for the immunizing reversal drive. These are:
R spacer 1: AATATGTGCATGGTCACAGG
R spacer 2: GGTCGTAATCTTGGCCGTGT
R spacer 3: GTCCACGTACGTGATTCAAC
R spacer 4: ACAAGAGAATTCGTGCTCGG
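For each drive to be immune to the other's guides, the recoded sites must differ from the wild-type sites at several positions. The sketch below counts mismatches between wild-type spacers and reversal-drive spacers. The pairing is an assumption inferred from sequence similarity (the text does not state which R spacer corresponds to which S4 spacer); S4 spacer 5 and R spacer 4 do not line up at the same offset and are left out.

```python
# Assumed pairing, inferred from sequence similarity rather than stated in the text.
PAIRS = [
    ("S4 spacer 2", "AACATGTGTATGGTTACGGG", "R spacer 1", "AATATGTGCATGGTCACAGG"),
    ("S4 spacer 3", "GGACGTAACTTGGGACGAGT", "R spacer 2", "GGTCGTAATCTTGGCCGTGT"),
    ("S4 spacer 4", "GTTCACGTGCGCGACAGCAC", "R spacer 3", "GTCCACGTACGTGATTCAAC"),
]


def mismatch_positions(a: str, b: str) -> list:
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


for wt_name, wt_seq, r_name, r_seq in PAIRS:
    positions = mismatch_positions(wt_seq, r_seq)
    print(f"{wt_name} vs {r_name}: {len(positions)} mismatches at {positions}")
```

Mismatches near the PAM-proximal end are generally the most disruptive to Cas9 cutting, so both the count and the positions are worth inspecting.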
Choosing target repeated sequences exclusive to the W
The next step is to find target sites that will shred the W. Lepesant and coworkers identified numerous apparently repeated sequences unique to the W and verified that they were female-specific by PCR (Genome Biology 2012). Whereas most of the repeats identified via high-throughput sequencing read counts gave some signal from males in qPCR experiments, the W7 and W13 repeats did not. Both of these are clustered in the pericentromeric region, which suggests that cutting sequences within these repeats should disrupt proper chromosome sorting of the W during the completion of female meiosis, and could possibly cause the driving Z chromosome to preferentially end up in the egg (scenario 1, above).
Evaluating the W7 and W13 repeats with the sgRNAcas9 program identifies four target sequences with few potential off-target sites:
W7 spacer 1: ACCAACCACTGTTGTACACA
W7 spacer 2: CTGTAGAAGCACTCGTACCA
W7 spacer 3: TTAGTACTGGAATGAAGAGG
W13 spacer 4: GGAGTGAATTGGATGTGATG
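The first step of any such search is listing candidate sites next to a PAM. The sketch below scans both strands of a sequence for 20-nt spacers followed by an NGG PAM (the S. pyogenes Cas9 PAM). It is not a substitute for sgRNAcas9, which also scores off-targets. The toy sequence embeds W7 spacer 1 between made-up flanking bases and a made-up TGG PAM, since the repeat sequences themselves are not given in the text.

```python
import re


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_spacers(seq: str, spacer_len: int = 20) -> list:
    """Return (strand, start, spacer) for every spacer followed by an NGG PAM.

    `start` is the 0-based offset within the scanned strand.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        # The lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# Hypothetical context: flanks and PAM are invented for illustration.
toy_repeat = "TTATC" + "ACCAACCACTGTTGTACACA" + "TGG" + "ATTCA"
for hit in find_spacers(toy_repeat):
    print(hit)
```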
Expressing guide RNAs
The W-shredder drive must express four guide RNAs targeting the rpS4 gene and four that target the W repeats. The immunizing reversal drive must express the same four guide RNAs that target wild-type rpS4 (enabling it to immunize wild-type populations by spreading through them) as well as four that target sites in the W-shredder drive.
Guide RNAs are most effective when driven by U6 promoters, which drive RNA Polymerase III-based transcription of the U6 snRNA spliceosome subunit. Running a BLAST search using the related nematode U6 snRNA identifies 8 U6 snRNAs in S. mansoni, one of which is duplicated 11 times.
This is somewhat unusual; most organisms have fewer copies. The duplicated version and a copy on chromosome 2 appear to be among the closest to the conserved snRNA sequence across species. Because we don't know which promoter is most effective without trying them, we will use both in the initial designs.
Preventing internal recombination
There is one major problem remaining: the gene drive cassettes must not contain repetitive elements. Studies of gene drives based on zinc finger nucleases and TALENs demonstrated that internal recombination was extremely frequent between sequences that were exact repeats or had substantial regions of internal homology (Simoni et al Nucl. Acids Res. 2014).
Using a different U6 promoter for each guide is probably not an option. The resulting construct would be very large, and it is unclear how many of the promoters in S. mansoni are functional. Some organisms have only a couple of U6 promoters, so a different strategy is preferred.
Placing two groups of guide RNAs in opposite orientations with substantial distance in between can help. This arrangement ensures that any recombination event between the two groups will invert the region in between rather than excising it. Since an inverted version of the guide cassette is equally functional, there is no cost. This effectively cuts the requirements in half, although it can make reversal drive design slightly more complicated.
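The difference between direct and inverted repeats can be made concrete with a toy model. The sketch below represents a cassette as an ordered list of (element, orientation) pairs and applies the textbook outcome of recombination between two homologous elements: direct repeats delete the intervening segment along with one repeat copy, while inverted repeats flip it. The element names and layout are hypothetical, not the actual construct design.

```python
def recombine(cassette, i, j):
    """Outcome of homologous recombination between elements i and j (i < j)."""
    (_, ori_i), (_, ori_j) = cassette[i], cassette[j]
    if ori_i == ori_j:
        # Direct repeats: the intervening segment and one repeat copy are lost.
        return cassette[:i + 1] + cassette[j + 1:]
    # Inverted repeats: the intervening segment is reversed and flipped in place.
    flipped = [(name, "-" if ori == "+" else "+")
               for name, ori in reversed(cassette[i + 1:j])]
    return cassette[:i + 1] + flipped + cassette[j:]


# Hypothetical layouts: two guide groups flanking the recoded gene and Cas9.
same_orientation = [("guides_A", "+"), ("rpS4_recoded", "+"),
                    ("Cas9", "+"), ("guides_B", "+")]
opposite_orientation = same_orientation[:3] + [("guides_B", "-")]

print(recombine(same_orientation, 0, 3))      # Cas9 and the recoded gene are excised
print(recombine(opposite_orientation, 0, 3))  # every element survives, middle inverted
```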
Even with inversion, enough guide RNAs are required for evolutionary stability (at least 3 and probably 4 for large populations) that more than one must be encoded in the same orientation. How can this be done using as few U6 promoters as possible?
The first method is to simply fuse the guide RNAs together using a 15 base pair linker in between. This method hasn't been published, but both guides retain ~80%+ of the activity of an equivalent single guide. It only works for two guides, but that may be enough. The downside is that it is impossible to truncate the second guide for improved specificity.
The second method is to process the U6-transcribed RNA into separate guides using some form of RNA-cleaving enzyme. One option is to include such an enzyme in the gene drive cassette, most likely the Type I CRISPR processing enzyme Csy4. But this adds an extra protein that could increase toxicity or otherwise reduce fitness. Instead, we should use the native tRNA-processing machinery. That is, including tRNAs in between guide RNAs will cause the native machinery to cut out the tRNAs, thereby freeing the individual guides. This method can produce truncated guide RNAs of any desired length. Using several different tRNAs, which can differ considerably in sequence, prevents repeated sequences.
The final problem is the inherent repetitiveness of guide RNAs. Single/synthetic guide RNAs (sgRNAs) are typically ~80 base pairs in length. Stacking several in a row in the same orientation is highly likely to cause internal recombination upon copying of the drive cassette. To address this problem, we have identified numerous sequence variants of the standard sgRNA sequence for S. pyogenes Cas9 that retain high activity. These are described in our daisy drive preprint.
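Candidate cassettes can be screened for internal repeats before they are built. The sketch below lists every k-mer that occurs more than once in a sequence. The two scaffold strings are arbitrary placeholders that stand in for an sgRNA scaffold and a sequence variant; they are not validated sequences and are not taken from the daisy drive preprint. The window size of 20 is likewise an arbitrary choice.

```python
from collections import defaultdict


def repeated_kmers(seq: str, k: int = 20) -> dict:
    """Return {kmer: [start positions]} for every k-mer occurring more than once."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Placeholder scaffolds (assumptions for illustration only).
SCAFFOLD = "GTTTTAGAGCTAGAAATAGCAAGTTAAAAT"
SCAFFOLD_VARIANT = "GCTTCAGAGCAAGAGATTGCAGGTTCAAAC"

SPACER_2 = "AACATGTGTATGGTTACGGG"  # S4 spacer 2
SPACER_3 = "GGACGTAACTTGGGACGAGT"  # S4 spacer 3

same_scaffold = SPACER_2 + SCAFFOLD + SPACER_3 + SCAFFOLD
varied_scaffold = SPACER_2 + SCAFFOLD + SPACER_3 + SCAFFOLD_VARIANT

print(len(repeated_kmers(same_scaffold)), "repeated 20-mers with identical scaffolds")
print(len(repeated_kmers(varied_scaffold)), "repeated 20-mers with a variant scaffold")
```

A fuller check would also look for shorter or imperfect stretches of homology, which can still support recombination.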
The S. mansoni generation time is ~3 months, which is consequently the minimum design-build-test cycle length. Because this is a long time and working with schistosomes is difficult, we want to maximize the chance that one of our initial designs will function as intended and require only minimal fine-tuning to generate an efficient drive. Given that there are several possible drive outcomes depending on the timing of Cas9 expression and the outcome of shredding, several of which would be independently useful, we will build several gene drives in parallel using different Cas9 promoters. We will also build candidates using both the paired-guide and the tRNA-processing architectures. Immunizing reversal drives will serve as controls. This adds up to four different versions per Cas9 promoter to be tested, which is quite a few. We will consequently test the two most likely promoters for germline-specific expression in both males and females: the smvlg2 (nos-2) promoter and the daz-1 promoter.
At the same time, we recognize that future designs will likely need to use better-characterized components. Building both paired and tRNA-based architectures will enable us to directly compare the two and use only the better version for future designs. We additionally hope to compare different U6 promoters in cultured cell assays and use only the most effective in future. Lastly, we will build immunizing reversal drives using two additional promoters to be determined for comparative purposes, bringing the total number of constructs to ten.
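The construct count can be enumerated explicitly. The sketch below reflects one reading of the plan: for each of the two Cas9 promoters, a W-shredder and an immunizing reversal drive are each built in both guide architectures (four versions per promoter), plus two further reversal drives on promoters that the text leaves to be determined. The guide architecture of those two extra constructs is not specified, so it is marked as such.

```python
from itertools import product

CAS9_PROMOTERS = ["smvlg2 (nos-2)", "daz-1"]
DRIVE_TYPES = ["W-shredder", "immunizing reversal"]
GUIDE_ARCHITECTURES = ["paired-guide", "tRNA-processing"]

constructs = [
    {"promoter": promoter, "drive": drive, "guides": architecture}
    for promoter, drive, architecture in product(
        CAS9_PROMOTERS, DRIVE_TYPES, GUIDE_ARCHITECTURES)
]

# Two additional reversal drives; promoters and architecture are not yet chosen.
constructs += [
    {"promoter": f"to be determined #{i}", "drive": "immunizing reversal",
     "guides": "unspecified"}
    for i in (1, 2)
]

for construct in constructs:
    print(construct)
print("total constructs:", len(constructs))
```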