ABSTRACT
The clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) system is a prokaryotic acquired immune system that defends against viral and plasmid invasion. CRISPR/Cas systems are widely conserved across bacteria and archaea. Recently, CRISPR/Cas has been used to edit endogenous genomes in eukaryotic species, and it has proven invaluable for in vitro and in vivo modeling. Currently, CRISPR genome editing offers efficiency, specificity, and low cost unmatched by other genome editing tools, including transcription activator-like effector nucleases (TALENs) and zinc finger nucleases (ZFNs). This review discusses the background theory of CRISPR and reports novel approaches to genome editing with the CRISPR system.
INTRODUCTION
CRISPR as a prokaryotic adaptive immune system
CRISPR was originally discovered in bacteria1 and is now known to be present in many other prokaryotic species.2,3 CRISPR systems in bacteria have been categorized into three types, with Type II being the most widely studied. The essential components of a Type II CRISPR system located within a bacterial genome are the CRISPR array and a Cas9 nuclease. A third component of the Type II system is the protospacer adjacent motif on the target (foreign) DNA. The CRISPR array is composed of clusters of short DNA repeats interspaced with DNA spacer sequences.4 These spacer sequences are remnants of foreign genetic material from previous invaders and are used to identify future invaders. Upon foreign invasion, the spacer sequences are transcribed into pre-CRISPR RNAs (pre-crRNAs), which are further processed into mature crRNAs. These crRNAs, whose targeting portion is usually 20 nucleotides long, play a crucial role in the specificity of CRISPR/Cas. Upstream of the CRISPR array in the bacterial genome is the gene coding for the trans-activating CRISPR RNA (tracrRNA). The tracrRNA provides two essential functions: binding to mature crRNA and providing structural stability as a scaffold within the Cas9 enzyme.5
Post-transcriptional processing allows the tracrRNA and crRNA to base-pair into a duplex that is loaded into the Cas9 enzyme. Cas9 is a nuclease with two active sites, each of which cleaves the phosphodiester backbone of one DNA strand. The bound crRNA allows Cas9 to recognize and bind specific protospacer target sequences in foreign DNA introduced by viral infection or horizontal gene transfer. The crRNA and the complement of the protospacer are brought together through Watson-Crick base pairing. Before the Cas9 nuclease cleaves the foreign double-stranded DNA (dsDNA), it must recognize a protospacer adjacent motif (PAM), a trinucleotide sequence. The PAM is usually of the form 5’-NGG-3’ (where N is any nucleotide) and lies immediately downstream (3’) of the protospacer but not within it. Once the PAM is recognized, Cas9 creates a double-stranded break three nucleotides upstream of the PAM in the foreign DNA. The cleaved foreign DNA cannot be transcribed properly and is eventually degraded.5 By evolving to target and degrade a range of foreign DNA and RNA with CRISPR/Cas, bacteria have acquired a remarkably broad immune defense.6
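The recognition rule described above can be made concrete with a short Python sketch. It assumes the commonly described Cas9 convention in which the 5’-NGG-3’ PAM sits immediately 3’ of a 20-nucleotide protospacer on the strand being read, and the blunt cut falls three nucleotides 5’ of the PAM. The function name and the tuple layout are illustrative choices, not part of any published tool, and only the given strand is scanned.

```python
import re

PROTOSPACER_LEN = 20  # protospacer length used in the text


def find_cas9_sites(dna: str):
    """List candidate Cas9 sites on the given strand (read 5'->3').

    Each entry is (protospacer_start, protospacer, pam, cut_index), where
    the cut falls between positions cut_index - 1 and cut_index.
    Assumption: NGG PAM directly 3' of the protospacer, cut 3 nt 5' of the PAM.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        start = pam_start - PROTOSPACER_LEN
        if start < 0:
            continue  # not enough sequence 5' of the PAM for a full protospacer
        protospacer = dna[start:pam_start]
        cut_index = pam_start - 3
        sites.append((start, protospacer, match.group(1), cut_index))
    return sites


# Hypothetical example sequence: 2 nt of flank, a 20-nt protospacer, then AGG.
example = "TT" + "GATTACAGATTACAGATTAC" + "AGG" + "TTT"
sites = find_cas9_sites(example)
```

A full target search would also scan the reverse complement, since a PAM on either strand can be used.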
CRISPR/Cas9 as an RNA-guided genome editing tool
The prokaryotic CRISPR/Cas9 system has been reconstituted in eukaryotic systems to create new possibilities for editing endogenous genomes. To achieve this seminal transition, the virally derived spacer sequences in bacterial CRISPR arrays are replaced with 20-nucleotide sequences identical to target sequences in eukaryotic genomes. These spacer sequences are transcribed into guide RNA (gRNA), which functions analogously to crRNA by targeting specific eukaryotic DNA sequences of interest. The DNA coding for the tracrRNA is still found upstream of the CRISPR array. The gRNA and tracrRNA are fused to form a single guide RNA (sgRNA) by adding a hairpin loop at their duplexing site. The sgRNA is then loaded into the Cas9 nuclease. Within Cas9, the tracrRNA portion (3’ end of the sgRNA) serves as a scaffold, while the gRNA portion (5’ end of the sgRNA) targets the eukaryotic DNA sequence by Watson-Crick base pairing with the complement of the protospacer (Fig. 1). As in bacterial CRISPR/Cas systems, a PAM sequence located immediately downstream (3’) of the protospacer must be recognized by the CRISPR/Cas9 complex before double-stranded cleavage occurs.5,7 Once the PAM is recognized, the Cas9 nuclease creates a double-stranded break three nucleotides upstream of the PAM in the eukaryotic DNA of interest (Fig. 1). The PAM is the main restriction on the targeting space of Cas9. Because the only fixed requirement is a PAM immediately adjacent to the protospacer, any 20-nucleotide sequence next to a PAM can in principle be targeted simply by replacing the 20-nucleotide guide sequence.5,7
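The retargeting step can be illustrated with a minimal Python sketch. It assumes the guide has the same sequence as the 20-nucleotide protospacer (with U in place of T), so that it pairs with the complementary strand, and that the sgRNA is the guide followed by a scaffold at its 3’ end, as described above. The function names are hypothetical, and the scaffold is left as a caller-supplied parameter because no scaffold sequence is given in this review.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
RNA_TO_DNA_PAIR = {"A": "T", "C": "G", "G": "C", "U": "A"}


def guide_rna_for(protospacer: str) -> str:
    """Return the 20-nt guide RNA (5'->3') matching a DNA protospacer."""
    protospacer = protospacer.upper()
    if len(protospacer) != 20 or set(protospacer) - set("ACGT"):
        raise ValueError("expected a 20-nt DNA protospacer")
    return protospacer.replace("T", "U")


def pairs_with_target_strand(guide_rna: str, protospacer: str) -> bool:
    """Check Watson-Crick pairing between the guide and the protospacer's complement."""
    protospacer = protospacer.upper()
    if len(guide_rna) != len(protospacer):
        return False
    # Complement of the protospacer, listed in the order it meets the guide.
    target_strand = [DNA_COMPLEMENT[base] for base in protospacer]
    return all(RNA_TO_DNA_PAIR[r] == d for r, d in zip(guide_rna, target_strand))


def assemble_sgrna(guide_rna: str, scaffold_rna: str) -> str:
    """Guide at the 5' end, scaffold at the 3' end (scaffold supplied by the caller)."""
    return guide_rna + scaffold_rna


guide = guide_rna_for("GATTACAGATTACAGATTAC")  # hypothetical target
```

Changing the target only changes the argument to `guide_rna_for`; the scaffold and the nuclease stay the same, which is the practical meaning of an RNA-guided system.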
Once the DNA is cut, the cell's repair mechanisms are leveraged to knock out a gene or to insert a new oligonucleotide into the newly formed gap. The two main pathways of double-stranded DNA lesion repair associated with CRISPR genome editing are non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ is mainly used for gene silencing. It is error-prone and frequently introduces insertion/deletion (indel) mutations; indels that shift the reading frame typically create premature stop codons that effectively silence the gene of interest. HDR is mainly used for precise gene editing. When a DNA template is provided in the form of a plasmid or a single-stranded oligonucleotide (ssODN), HDR can introduce desired mutations into the cleaved DNA.5
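How an NHEJ indel silences a gene can be shown with a toy Python example. The coding sequence below is invented for illustration and the helper names are hypothetical. The sketch only checks whether an indel length breaks the reading frame and where the first in-frame stop codon then appears.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def apply_deletion(cds: str, cut_index: int, length: int) -> str:
    """Delete `length` bases starting at `cut_index` (a simplified NHEJ outcome)."""
    return cds[:cut_index] + cds[cut_index + length:]


def first_stop_codon(cds: str):
    """Index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


# Invented toy coding sequence: ATG GCC TTG ACC AGG TAA
cds = "ATGGCCTTGACCAGGTAA"
one_bp_deletion = apply_deletion(cds, 3, 1)    # ATG CCT TGA ... (frame shifted)
three_bp_deletion = apply_deletion(cds, 3, 3)  # ATG TTG ACC AGG TAA (frame kept)
```

In this toy case the 1-bp deletion moves the first stop codon from position 15 to position 6, whereas the in-frame 3-bp deletion removes one codon and leaves the rest of the frame intact.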
The beauty of the CRISPR system is its simplicity. It consists of a single effector nuclease and an RNA duplex. Endogenous eukaryotic DNA can be targeted as long as the target lies next to a PAM. The goal of this system is to induce a mutation, and the CRISPR/Cas9 complex will cut the site repeatedly until a mutation occurs. Once a mutation does occur, the site is no longer recognized by the complex and cleavage ceases.
Optimization and specificity of CRISPR/Cas systems
If CRISPR systems are to be widely adopted in research or clinical applications, concerns regarding off-target effects must be addressed. On average, a potential target site (a sequence adjacent to an NGG PAM) occurs about every eight bases in the human genome. Thus, virtually every gRNA has the potential for unwanted off-target activity. Current research emphasizes techniques to improve specificity, including gRNA modification, transfection optimization, and a Cas9 nickase mutation.
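One way to arrive at a figure of this size is a back-of-the-envelope calculation, sketched below in Python. It assumes, as an idealization, that the four bases are equally frequent and independent, which real genomes only approximate. The function name is illustrative.

```python
from fractions import Fraction


def expected_pam_spacing(pam: str = "NGG", both_strands: bool = True) -> Fraction:
    """Expected distance in bases between PAM occurrences in random sequence.

    Assumption: uniform, independent base composition (each base p = 1/4).
    """
    per_position = Fraction(1)
    for base in pam:
        if base != "N":  # N matches any base, so it does not reduce the probability
            per_position *= Fraction(1, 4)
    if both_strands:
        per_position *= 2  # a PAM on either strand defines a usable site
    return 1 / per_position


spacing = expected_pam_spacing()  # Fraction(8, 1) under these assumptions
```

The two fixed G positions give a probability of 1/16 per position on one strand, so counting both strands yields about one PAM every 8 bases.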
The gRNA can be modified to minimize its off-target effects while preserving its ability to target sequences of interest. A nonspecific gRNA can be optimized by introducing single-base substitutions that enhance its ability to bind target sequences in a position- and base-dependent manner. Libraries of gRNAs containing all possible base substitutions along the guide have been generated to examine the specificity of the gRNA and the enzymatic activity of Cas9. Importantly, mismatches near the PAM generally prevent Cas9 from initiating cleavage, whereas targeting specificity and enzymatic activity are affected much less by base substitutions at the 5’ end of the gRNA. This leads to the conclusion that the main contribution to specificity comes from the roughly ten PAM-proximal bases at the 3’ end of the gRNA.5
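The position dependence described above can be sketched in Python as a simple classifier. It only separates mismatches in the ten PAM-proximal bases from those further away and is not the scoring method of any particular tool. The names and the ten-base threshold (taken from the text) are illustrative, and both sequences are written 5’ to 3’ with the PAM assumed to follow the 3’ end.

```python
SEED_LEN = 10  # PAM-proximal bases, per the text


def mismatch_profile(guide: str, site: str) -> dict:
    """Classify guide/site mismatches by distance from the PAM.

    Distance 1 is the base adjacent to the PAM (3' end of the guide);
    distance 20 is the 5' end of a 20-nt guide.
    """
    guide, site = guide.upper(), site.upper()
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    n = len(guide)
    distances = [n - i for i, (g, s) in enumerate(zip(guide, site)) if g != s]
    return {
        "seed": [d for d in distances if d <= SEED_LEN],
        "distal": [d for d in distances if d > SEED_LEN],
    }


guide = "GATTACAGATTACAGATTAC"              # hypothetical guide
distal_site = "C" + guide[1:]                # mismatch at the 5' end
seed_site = guide[:-1] + "G"                 # mismatch next to the PAM
```

Under the rule of thumb in the text, `seed_site` would be expected to escape cleavage far more readily than `distal_site`, which makes candidate off-target sites with only distal mismatches the ones to watch.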
The apparent differential specificity of the Cas9 gRNA guide sequence can be quantified by an open source online tool (http://crispr.mit.edu/). This tool identifies all possible gRNA segments that target a particular DNA sequence. Using a data-driven algorithm, the program scores each viable gRNA segment depending on its predicted specificity in relation to the genome of interest.
Depending on the redundancy of the DNA target sequence, scoring and mutating the gRNA might not reduce off-target activity sufficiently. Increasing the concentration of CRISPR plasmids at transfection can provide a modest five- to seven-fold increase in on-target activity, but a much more specific system is desirable for most research and clinical applications. Converting Cas9 from a nuclease into a nickase yields the desired specificity.5 Cas9 has two catalytic domains, each of which nicks one DNA strand. Inactivating one of those domains via a D10A mutation changes Cas9 from a nuclease into a nickase.
Two Cas9 nickases (and their respective gRNAs) are required to nick complementary DNA strands simultaneously. This technique, called multiplex nicking, mimics a double-stranded break by inducing single-stranded breaks in close proximity to one another. Since single-stranded breaks are repaired with higher fidelity than double-stranded breaks, off-target effects caused by improper cleavage are mitigated, leaving the majority of breaks at the sequence of interest. The two nickases should be offset by 10 to 30 base pairs from each other.5 Multiplex nicking offers on-target modification comparable to wild-type Cas9 while dramatically reducing off-target modification (1000- to 1500-fold).5
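The pairing constraint can be expressed as a small Python check. The text does not say how the offset is measured, so this sketch assumes it is the distance between the two nick positions in reference coordinates. The `NickSite` type and the function name are hypothetical, and the 10 to 30 base pair window is taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NickSite:
    strand: str      # "+" or "-"
    nick_index: int  # reference coordinate of the nick


def is_valid_nickase_pair(a: NickSite, b: NickSite,
                          min_offset: int = 10, max_offset: int = 30) -> bool:
    """True if the nicks are on opposite strands and suitably spaced.

    Assumption: offset = absolute distance between the two nick positions.
    """
    if a.strand == b.strand:
        return False  # both nicks on one strand cannot mimic a double-stranded break
    return min_offset <= abs(a.nick_index - b.nick_index) <= max_offset


pair_ok = is_valid_nickase_pair(NickSite("+", 100), NickSite("-", 120))
```

A design tool would apply such a filter to every combination of candidate guides on opposite strands before ranking the surviving pairs by predicted specificity.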
DISCUSSION
CRISPR/Cas9 systems have emerged as the newest genome engineering tool and have quickly been applied in in vitro and in vivo research. However, before these systems can be used in clinical applications, off-target effects must be controlled. In spite of its current shortcomings, CRISPR has proven invaluable to researchers conducting high-throughput studies of the biological function and relevance of specific genes. CRISPR/Cas9 genome editing provides a rapid procedure for the functional study of mutations of interest in vitro and in vivo. Tumor suppressor genes can be knocked out via NHEJ, and oncogenes with specific mutations can be created via HDR. The novel cell lines and mouse models created with CRISPR technologies have galvanized translational research by opening new avenues for studying the genetic foundations of disease.
References
- Ishino, Y. et al. J Bacteriol. 1987, 169, 5429–5433.
- Mojica, F.J. et al. Mol Microbiol. 1995, 17, 85–93.
- Masepohl, B. et al. Biochim Biophys Acta. 1996, 1307, 26–30.
- Mojica, F.J. et al. Mol Microbiol. 2000, 36, 244–246.
- Cong, L. et al. Science. 2013, 339, 819–823.
- Horvath, P. et al. Science. 2010, 327, 167.
- Ran, F.A. et al. Nat. Protoc. 2013, 8, 2281–2308.