Science Reviews - Biology, 2024, 3(1), 9-15 Martina Elena Tarozzi
Artificial intelligence for next-generation sequencing data analysis
Martina Elena Tarozzi, independent researcher. Florence, Tuscany, Italy
Received March 22, 2024. Revised April 09, 2024. Accepted April 10, 2024.