CRISPR: Hit or Miss?
Twishaa Kartik
Westmount Mid/High School
Grade 7
Presentation
Hypothesis
Hypothesis: Allowing more mismatches between the guide RNA (gRNA) and DNA will increase detection of true targets but also increase the number of off-target binds, while stricter mismatch rules will reduce off-target errors but can miss some real targets.
Research
Research Question: How does mismatch tolerance influence CRISPR’s ability to identify target DNA?
I studied genes and the structure of DNA, gene-editing methods that came before CRISPR, and CRISPR technology itself. This included researching how CRISPR-Cas9 was discovered, its role in the bacterial (prokaryotic) adaptive immune system, and how scientists adapted it into a gene-editing tool for eukaryotic cells. My research also covered the DNA repair mechanisms that follow CRISPR’s double-stranded breaks, applications of CRISPR in therapeutic and diagnostic use cases, and the risks and ethical concerns associated with this technology. If you wish to view my research in more detail, please refer to “Section 1- Background Research” of my logbook. Thank you!
Variables
Manipulated Variables (what I change):
- Mismatch tolerance level (rule strictness): the maximum number of differences allowed between the guide RNA and the DNA sequence before the result is “no bind”
- Example: if the best mismatch score ≤ the rule number, the strip counts as a match
Responding Variables (what I measure):
- CRISPR accuracy, measured as:
- Number of true positives → said match and target was present
- Number of false positives → said match and target was not present
- Number of true negatives → said no match and target was not present
- Number of false negatives → said no match and target was present
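The four outcomes listed above can be written as a small function. This is only a minimal sketch of the definitions; the function name `label_outcome` and the short labels "TP", "FP", "TN", and "FN" are my own choices for illustration and were not part of the paper model.

```python
def label_outcome(predicted_match: bool, target_present: bool) -> str:
    """Label one DNA strip by comparing the prediction to the truth key."""
    if predicted_match and target_present:
        return "TP"  # said match and target was present
    if predicted_match and not target_present:
        return "FP"  # said match and target was not present
    if not predicted_match and not target_present:
        return "TN"  # said no match and target was not present
    return "FN"      # said no match and target was present


# Example: the rule said "match" but the truth key says there is no target.
print(label_outcome(predicted_match=True, target_present=False))  # FP
```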
Control Variables (what stays the same):
- Same sequence length
- Same gRNA strip
- Same scoring methods
- Same dataset of paper strips
Procedure
- I selected a 12-letter guide RNA (gRNA) sequence to represent the CRISPR targeting sequence.
- I used 20 DNA strips, each 30 letters long, as the dataset. Each strip either contained or did not contain a target sequence similar to the guide RNA, as indicated by a truth key.
- I created five mismatch tolerance rules: Rule 0: 0 mismatches allowed, Rule 1: up to 1 mismatch allowed, Rule 2: up to 2 mismatches allowed, Rule 3: up to 3 mismatches allowed, and Rule 4: up to 4 mismatches allowed.
- For each DNA strip, I slid the guide RNA across the sequence one letter at a time to create 19 possible 12-letter comparison windows (30 − 12 + 1 = 19).
- I compared each window to the guide RNA, and counted the number of mismatched letters.
- I recorded the lowest number of mismatches found among the 19 windows as the “best mismatch score” for that DNA strip.
- For each mismatch tolerance rule, I classified the DNA strip as a “match” if the best mismatch score was less than or equal to the rule number, and as a “no match” if it was greater.
- I repeated this process for all 20 DNA strips under each of the five rules.
- Then, I compared my predictions to the truth key to classify each outcome as a true positive, false positive, true negative, or false negative.
- Lastly, I recorded the results to see how mismatch tolerance affected CRISPR accuracy.
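The sliding-window steps above were done by hand on paper, but they can also be sketched in code. The example below is an illustration only: the guide and strip sequences are made up (they are not the sequences from my dataset), the function names are hypothetical, and the guide is written with DNA letters for simplicity.

```python
def count_mismatches(guide: str, window: str) -> int:
    """Count the positions where the guide and the window have different letters."""
    return sum(g != w for g, w in zip(guide, window))


def best_mismatch_score(guide: str, strip: str) -> int:
    """Slide the guide along the strip and return the lowest mismatch count."""
    n_windows = len(strip) - len(guide) + 1  # 30 - 12 + 1 = 19 windows
    return min(
        count_mismatches(guide, strip[i:i + len(guide)])
        for i in range(n_windows)
    )


def is_match(best_score: int, rule: int) -> bool:
    """A strip is a match if its best score is within the rule's tolerance."""
    return best_score <= rule


# Made-up example data: a 12-letter guide and a 30-letter strip that
# contains the guide with one letter changed (T -> A at the fifth letter).
GUIDE = "ACGTTGCAAGCT"
STRIP = "TTTTTTTTT" + "ACGTAGCAAGCT" + "TTTTTTTTT"

score = best_mismatch_score(GUIDE, STRIP)
print(score)               # 1
print(is_match(score, 0))  # False: Rule 0 allows no mismatches
print(is_match(score, 1))  # True: Rule 1 allows up to 1 mismatch
```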
Observations
After completing the matching process for all 20 DNA strips, I collected data for each mismatch tolerance rule from Rule 0 to Rule 4. For each rule, every DNA strip was classified as either a match or no match based on its best mismatch score. These predictions were then compared to the truth key to determine whether each outcome was a true positive, false positive, true negative, or false negative. Under the stricter rules (Rule 0 and Rule 1), only a small number of DNA strips were identified as matches. These rules produced very few true positives and no false positives, but a large number of false negatives. As the rules became looser, the number of true positives increased and the number of false negatives decreased. However, the number of false positives also increased under the higher mismatch tolerance rules. By Rule 4, all true target sequences were detected, but several DNA strips without the target were also incorrectly identified as matches.
Analysis
The data I collected shows that changing mismatch tolerance directly affects CRISPR accuracy. Under stricter mismatch rules, CRISPR was extremely selective, which kept false positives low. However, this selectivity caused many real target sequences to be missed, leading to a high number of false negatives. As mismatch tolerance increased, CRISPR became more likely to identify target sequences. This led to a rise in true positives and a decrease in false negatives. At the same time, the looser rules created more false positives, meaning that DNA sequences without the target were sometimes incorrectly identified as matches. These results reveal a trade-off between avoiding false positives and detecting all true targets. Stricter rules reduce off-target binding but miss real targets, while looser rules improve detection but allow more incorrect matches. This trend remained consistent across all five mismatch tolerance rules tested in my experiment.
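The trade-off described above can be illustrated by tallying the four outcomes for each rule. This is a sketch under stated assumptions: the list `STRIPS` below is invented example data (a best mismatch score and a truth-key value for ten imaginary strips), not the results from my 20 paper strips, and the function name `tally` is hypothetical.

```python
# Invented example data, NOT my experimental results:
# each pair is (best mismatch score, target present according to the truth key).
STRIPS = [
    (0, True), (1, True), (2, True), (3, True), (4, True),
    (3, False), (4, False), (5, False), (6, False), (7, False),
]


def tally(strips, rule):
    """Count true/false positives and negatives for one mismatch tolerance rule."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for best_score, target_present in strips:
        predicted_match = best_score <= rule
        if predicted_match:
            counts["TP" if target_present else "FP"] += 1
        else:
            counts["FN" if target_present else "TN"] += 1
    return counts


for rule in range(5):  # Rule 0 through Rule 4
    print(f"Rule {rule}: {tally(STRIPS, rule)}")
```

With this invented data, raising the rule number turns false negatives into true positives, but it also starts turning true negatives into false positives, which is the same pattern I saw in my experiment.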
Conclusion
The results of my experiment support my original hypothesis that increasing mismatch tolerance improves CRISPR’s ability to detect target DNA sequences but reduces its specificity. Strict mismatch rules produced very few false positives but resulted in many false negatives. In contrast, looser mismatch rules detected most or all true targets but created many more false positives. This shows that CRISPR accuracy depends heavily on how many mismatches between the guide RNA (gRNA) and DNA are allowed. Additionally, my project illustrates that CRISPR does not operate as a perfectly precise system, but rather relies on a careful balance between sensitivity and specificity that changes depending on mismatch tolerance.
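Sensitivity and specificity, as mentioned above, have standard definitions that can be calculated from the four outcome counts. The sketch below uses those standard formulas; the example counts are hypothetical and are not taken from my results.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of real targets that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-targets that were correctly rejected: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts for a strict rule and a loose rule (not my data).
strict = {"TP": 1, "FP": 0, "TN": 5, "FN": 4}
loose = {"TP": 5, "FP": 2, "TN": 3, "FN": 0}

for name, c in (("strict", strict), ("loose", loose)):
    print(
        name,
        "sensitivity:", sensitivity(c["TP"], c["FN"]),
        "specificity:", specificity(c["TN"], c["FP"]),
    )
```

In this made-up example, the strict rule has low sensitivity and perfect specificity, while the loose rule has perfect sensitivity and lower specificity.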
Application
The results of my experiment show why mismatch tolerance is a crucial factor in real-world applications of CRISPR. In gene-editing and therapeutic use cases, accuracy and specificity are the top priorities. Off-target effects caused by a higher mismatch tolerance can lead to unintended mutations, the disruption of essential genes, or even the activation of harmful genes. For these applications, stricter mismatch rules would be more beneficial, even at the cost of missing some true target sequences, because they reduce the risk of off-target edits that could have dire consequences. With CRISPR-based diagnostics, however, higher sensitivity would be more appropriate. In diagnostic situations, missing a target sequence could be dangerous, especially when detecting infections or genetic mutations. Allowing a higher mismatch tolerance increases the chances of identifying true targets, even though it may also increase false positives. In these circumstances, detecting a possible target is typically more important than avoiding every single wrong match. Overall, my project demonstrates that by adjusting mismatch rules, scientists can look for a balance between sensitivity and specificity that matches the goal of each CRISPR application, such as minimizing off-target effects in gene therapy or increasing detection capability in diagnostics.
Sources Of Error
One source of error in this project is that my paper-based model oversimplifies actual CRISPR biology. In real biological systems, CRISPR binding and cutting are affected by factors beyond the number of mismatches, such as the position of each mismatch and the conditions inside the cell, which were not represented in this simulation. Another source of error is the limited dataset size. Only 20 DNA strips were used, which may not fully represent the vast range of DNA sequences that CRISPR encounters in real genomes. Lastly, all mismatches were treated equally in the model, even though mismatches closer to important regions such as PAM sites could have a greater effect on CRISPR’s accuracy.
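One way a future version of the model could handle mismatch position is to give each position its own weight instead of counting every mismatch as 1. The sketch below is purely illustrative: the weights are invented and are not measured biological values, the function name is hypothetical, and it assumes the PAM sits next to the last letter of the sequence.

```python
def weighted_mismatch_score(guide: str, window: str, weights: list[float]) -> float:
    """Add up the weight of every position where the guide and window differ."""
    if not (len(guide) == len(window) == len(weights)):
        raise ValueError("guide, window, and weights must be the same length")
    return sum(w for g, x, w in zip(guide, window, weights) if g != x)


# Invented weights (an assumption, not real data): mismatches in the six
# letters nearest the assumed PAM end count double.
WEIGHTS = [1.0] * 6 + [2.0] * 6

GUIDE = "ACGTTGCAAGCT"        # made-up 12-letter guide
FAR_FROM_PAM = "TCGTTGCAAGCT"  # first letter changed
NEAR_PAM = "ACGTTGCAAGCA"      # last letter changed

print(weighted_mismatch_score(GUIDE, FAR_FROM_PAM, WEIGHTS))  # 1.0
print(weighted_mismatch_score(GUIDE, NEAR_PAM, WEIGHTS))      # 2.0
```

Both windows have exactly one mismatch, so my paper model scored them the same, but a weighted score would treat the mismatch near the assumed PAM end as more serious.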
Citations
References and Sources: APA 7th Edition Format
Anderson, E. M., Haupt, A., Schiel, J. A., Chou, E., Machado, H. B., Strezoska, Ž., Lenger, S., McClelland, S., Birmingham, A., Vermeulen, A., & Smith, A. V. (2015). Systematic analysis of CRISPR-Cas9 mismatch tolerance reveals low levels of off-target activity. Journal of Biotechnology, 211, 56–65. https://doi.org/10.1016/j.jbiotec.2015.06.427
Ansori, A. N., Antonius, Y., Susilo, R. J., Hayaza, S., Kharisma, V. D., Parikesit, A. A., Zainul, R., Jakhmola, V., Saklani, T., Rebezov, M., Ullah, M. E., Maksimiuk, N., Derkho, M., & Burkov, P. (2023). Application of CRISPR-Cas9 genome editing technology in various fields: A review. Narra J, 3(2), e184. https://doi.org/10.52225/narra.v3i2.184
Azizoglu, R. O. (2022). CRISPR-Cas systems in diagnostics: A comprehensive assessment of Cas effectors and biosensors. Gene and Genome Editing, 3–4, 100019. https://doi.org/10.1016/j.ggedit.2022.100019
Davis, D. J., & Yeddula, S. G. R. (2024). CRISPR Advancements for Human Health. Missouri Medicine, 121(2), 170–176. https://pmc.ncbi.nlm.nih.gov/articles/PMC11057861/
Gaj, T., Gersbach, C. A., & Barbas, C. F., III. (2013). ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends in Biotechnology, 31(7), 397–405. https://doi.org/10.1016/j.tibtech.2013.04.004
Gostimskaya, I. (2022). CRISPR-Cas9: A History of Its Discovery and Ethical Considerations of Its Use in Genome Editing. Biochemistry. Biokhimiia, 87(8), 777–788. https://doi.org/10.1134/S0006297922080090
Iorio, F., Behan, F. M., Gonçalves, E., et al. (2018). Unsupervised correction of gene-independent cell responses to CRISPR-Cas9 targeting. BMC Genomics, 19, 604. https://doi.org/10.1186/s12864-018-4989-y
Jolany Vangah, S., Katalani, C., Boone, H. A., et al. (2020). CRISPR-Based Diagnosis of Infectious and Noninfectious Diseases. Biological Procedures Online, 22, 22. https://doi.org/10.1186/s12575-020-00135-3
Rojahn, S. Y. (2014, February 11). Genome surgery. MIT Technology Review. https://web.archive.org/web/20160112051853/http://www.technologyreview.com/review/524451/genome-surgery/
What is CRISPR: Your Ultimate Guide | Synthego. (2025, December 1). Synthego. https://www.synthego.com/learn/crispr/
Acknowledgement
First, I would like to thank my teacher, Ms. Lai, for her guidance, patience, and support throughout this project. I would also like to acknowledge the volunteer judges for taking time out of their busy lives to examine my project and provide valuable critiques and feedback. Lastly, I would like to thank my family for always being encouraging and supportive. Artificial intelligence was used in a limited and responsible way to help simplify complex concepts during some parts of my background research and to generate random letter sequences for the paper DNA strips used in the model.
