Science Reviews - Biology, 2024, 3(1), 9-15 Martina Elena Tarozzi
Artificial intelligence for Next generation sequencing data analysis
Martina Elena Tarozzi, independent researcher. Florence, Tuscany, Italy
https://doi.org/10.57098/SciRevs.Biology.3.1.2
Received March 22, 2024. Revised April 09, 2024. Accepted April 10, 2024.