2019-146

Detection of DNA Cytosine Methylation from Nanopore Sequencing using Machine Learning

ELIJAH A. JORDAN, TRANG A. VU

Methylation of DNA is a promising biomarker for early-stage cancer screening, risk assessment, and personalized medicine. Nanopore sequencing offers a new, portable method for direct detection of methylation sites in DNA, in which biological molecules are passed through an electrically charged nanopore (nano-scale pore) and changes in current are used to identify molecular properties. Through our collaborators, data were collected from a biological nanopore sensor to distinguish among unmethylated DNA, methylated DNA, and methylated DNA bound with MBD2 protein. We employ machine learning, including supervised and deep learning techniques, to detect cytosine methylation in double-stranded DNA based on current blockage data. We present results from using various classification algorithms to distinguish the three aforementioned types of DNA and their mixtures based on training data. These algorithms were then evaluated on the mean and standard deviation of their accuracy scores. The algorithm with the highest accuracy was then used to make predictions on an unknown set consisting of data points from all three DNA types.
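The evaluation procedure described above (score several classifiers by the mean and standard deviation of their accuracy, then apply the best one to an unknown set) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the two features (blockage depth and dwell time), the synthetic class centers, the two toy classifiers (nearest centroid and 1-nearest neighbour), and the 5-fold cross-validation scheme.

```python
import random
import statistics

# ASSUMPTION: synthetic stand-in data, not the study's measurements.
# Each event is a hypothetical (blockage_depth, dwell_time) pair.
def make_synthetic_events(n_per_class, seed=0):
    rng = random.Random(seed)
    centers = {
        "unmethylated": (0.30, 1.0),
        "methylated": (0.45, 1.5),
        "methylated+MBD2": (0.70, 3.0),
    }
    X, y = [], []
    for label, (depth, dwell) in centers.items():
        for _ in range(n_per_class):
            X.append((rng.gauss(depth, 0.05), rng.gauss(dwell, 0.3)))
            y.append(label)
    return X, y

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

# Toy classifier 1: nearest centroid (mean feature vector per class).
def fit_centroids(X, y):
    sums = {}
    for (a, b), label in zip(X, y):
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += a
        s[1] += b
        s[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}

def predict_centroid(model, point):
    return min(model, key=lambda label: sq_dist(point, model[label]))

# Toy classifier 2: 1-nearest neighbour (memorize the training set).
def fit_1nn(X, y):
    return list(zip(X, y))

def predict_1nn(model, point):
    return min(model, key=lambda pair: sq_dist(point, pair[0]))[1]

# k-fold cross-validation returning (mean accuracy, standard deviation).
def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return statistics.mean(scores), statistics.stdev(scores)

X, y = make_synthetic_events(60)
candidates = {
    "nearest_centroid": (fit_centroids, predict_centroid),
    "1-NN": (fit_1nn, predict_1nn),
}
results = {name: cross_val_accuracy(f, p, X, y) for name, (f, p) in candidates.items()}

# Select the classifier with the highest mean accuracy and refit on all data.
best = max(results, key=lambda name: results[name][0])
fit, predict = candidates[best]
model = fit(X, y)

# Hypothetical unlabeled events standing in for the unknown set.
unknown = [(0.31, 1.1), (0.69, 2.8)]
predictions = [predict(model, p) for p in unknown]

for name, (mean_acc, std_acc) in results.items():
    print(f"{name}: accuracy {mean_acc:.3f} +/- {std_acc:.3f}")
print("selected:", best, "predictions:", predictions)
```

Reporting the standard deviation alongside the mean shows how stable each classifier's accuracy is across data splits, which matters when two candidates have similar mean scores.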