College

College of Sciences

Department

Computer Science

Program

Computer Science

Publication Date

4-2021

Abstract

The telomeres are the “caps” of the chromosomes, and their vital role is to protect them. Telomere dysfunction caused by telomere rearrangements can be fatal for the cell and can result in age-related diseases, including cancer. The telomeres and subtelomeres are hard to investigate: current sequencing technology cannot provide their complete sequence and instead yields the DNA in multiple pieces. Current methods for assembling these pieces are not accurate enough because of the regions’ high variability and complex repeat patterns. We propose a hybrid assembly method, NPGREAT, which uses two recently available data types: Linked-Reads and ultralong Nanopore reads. It consists of five main steps. (i) Input selection of the data. (ii) Orientation, Order and Enhanced Correction of the short contigs, using the long reads as scaffolds onto which the short contigs are mapped. In particular, the Enhanced Correction step corrects potential misassemblies within the short contigs caused by deletions in tandem repeat regions: the Nanopore sequence is used to fill in the missing portion, so that the tandem repeat region, which is highly variable from one human to another, is represented accurately. (iii) Region Extraction, in which the segments of the long reads that can connect the short contigs are extracted. (iv) Gap Filling, in which all candidate segments are considered and one is selected to fill each gap. (v) Combination, in which the corrected short contigs are combined with the connector segments. The output is the subtelomere region of the chromosome. NPGREAT is evaluated with the QUAST tool, and the resulting assemblies are of high quality.
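
As an illustration only, the Combination step can be sketched as follows. This is not the NPGREAT implementation: it assumes a single long read serves as the scaffold, that each short contig has already been oriented, corrected, and assigned scaffold coordinates, and that overlaps can be trimmed by coordinate alone. The names `Placement` and `combine` are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Placement:
    """A short contig mapped onto the scaffold (hypothetical structure)."""
    sequence: str  # contig sequence, assumed already oriented and corrected
    start: int     # 0-based start coordinate on the scaffold
    end: int       # exclusive end coordinate on the scaffold


def combine(scaffold: str, placements: list[Placement]) -> str:
    """Join contigs in scaffold order, filling gaps with scaffold segments.

    Simplifying assumption: a contig's length equals its scaffold span,
    so overlapping contigs can be trimmed by coordinate arithmetic.
    """
    parts: list[str] = []
    cursor: int | None = None  # scaffold position covered so far
    for p in sorted(placements, key=lambda item: item.start):
        if cursor is None:
            # First contig: take it whole.
            parts.append(p.sequence)
            cursor = p.end
            continue
        if p.start >= cursor:
            # Gap: the scaffold segment acts as the connector.
            parts.append(scaffold[cursor:p.start])
            parts.append(p.sequence)
        else:
            # Overlap: drop the part of the contig already covered.
            parts.append(p.sequence[cursor - p.start:])
        cursor = max(cursor, p.end)
    return "".join(parts)


scaffold = "AAAACCCCGGGGTTTT"
contigs = [Placement("GGGG", 8, 12), Placement("AAAA", 0, 4)]
assembly = combine(scaffold, contigs)
```

Here the two contigs are sorted by position and the scaffold bases between coordinates 4 and 8 fill the gap between them. A real pipeline would choose among connector segments from multiple long reads and would work from alignments, not exact coordinates.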

Keywords

Genome assembly, Subtelomeres

Disciplines

Computer Sciences | Genomics | Molecular Genetics | Nanoscience and Nanotechnology

Nanopore Guided Regional Assembly

