Document Type
Article
Publication Date
2021
DOI
10.1109/TCBB.2019.2913845
Publication Title
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume
18
Issue
1
Pages
365-372
Abstract
Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variation, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads datasets partitioned and barcoded using the Gel Bead in Emulsion (GEM) microfluidic method. Our results describe the accuracy and relative performance of these two approaches using the reference-based assessment module of QUAST. We show that REXTAL dramatically outperforms the Supernova whole-genome assembler in subtelomeric segmental duplication regions and results in highly accurate assemblies. Nearly all of the REXTAL misassemblies identified using default QUAST parameters simply pinpoint locations of tandem repeat arrays in the reference sequence where the repeat array length differs from that in the cognate REXTAL assembly by more than 1000 bp.
Rights
© 2021 IEEE.
"The revised policy reaffirms the principle that authors are free to post the accepted version of their articles on their personal websites or those of their employers."
Included in accordance with publisher policy.
Original Publication Citation
Islam, T., Ranjan, D., Zubair, M., Young, E., Ming, X., & Riethman, H. (2021). Analysis of subtelomeric REXTAL assemblies using QUAST. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 18(1), 365-372. https://doi.org/10.1109/TCBB.2019.2913845
ORCID
0000-0002-5449-1779 (Zubair)