Date of Award
Spring 2015
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Mathematics & Statistics
Program/Concentration
Computational and Applied Mathematics
Committee Director
Norou Diawara
Committee Director
Nak-Kyeong Kim
Committee Member
N. Rao Chaganty
Committee Member
Michael Doviak
Abstract
Determining protein-DNA binding sites is essential to understanding many biological processes. A transcription factor is a particular type of protein that binds to DNA and controls gene regulation in living organisms. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is considered the gold standard for locating these binding sites, and programs used to identify DNA-transcription factor binding sites are known as peak-callers. ChIP-seq data are known to exhibit considerable background noise and other biases. In this study, we propose a negative binomial (NB) model, a zero-inflated Poisson (ZIP) model, and a zero-inflated negative binomial (ZINB) model for peak-calling. Using real ChIP-seq datasets, we show that the ZINB model is the best of the three for ChIP-seq data. We then incorporate control data, GC count information, and mappability information into the ZINB regression model as covariates using two link functions. We implemented this approach in C++, and our peak-caller chooses the optimal parameter combination for a given dataset. The performance of our approach is compared with that of two frequently used peak-callers: QuEST and MACS.
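As a rough illustration of the zero-inflated negative binomial distribution named in the abstract, the following is a minimal C++ sketch of its probability mass function in the standard textbook parameterization. It is not taken from the dissertation's peak-caller. The function names `nb_log_pmf` and `zinb_pmf` and the parameter names `pi` (probability of a structural zero), `mu` (NB mean), and `r` (NB dispersion) are assumptions chosen for this example, and the covariates and link functions mentioned in the abstract are not modeled here.

```cpp
#include <cmath>

// Illustrative sketch only: names and parameterization are assumptions,
// not the dissertation's implementation.

// Log-probability of count k under a negative binomial with mean mu > 0
// and dispersion (size) r > 0, so that variance = mu + mu^2 / r.
double nb_log_pmf(int k, double mu, double r) {
    return std::lgamma(k + r) - std::lgamma(r) - std::lgamma(k + 1.0)
         + r * std::log(r / (r + mu))
         + k * std::log(mu / (r + mu));
}

// Zero-inflated NB: with probability pi the count is a structural zero;
// otherwise it is drawn from the NB distribution above.
double zinb_pmf(int k, double pi, double mu, double r) {
    const double nb = std::exp(nb_log_pmf(k, mu, r));
    return (k == 0) ? pi + (1.0 - pi) * nb
                    : (1.0 - pi) * nb;
}
```

Setting `pi` to zero recovers the plain NB model, which is one way to see why the zero-inflated form can fit data with many empty genomic windows that a plain count model underpredicts.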
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/1s51-cs87
ISBN
9781321843439
Recommended Citation
Viswakula, Sameera D.
"Zero-Inflated Models to Identify Transcription Factor Binding Sites in ChIP-seq Experiments"
(2015). Doctor of Philosophy (PhD), Dissertation, Mathematics & Statistics, Old Dominion University, DOI: 10.25777/1s51-cs87
https://digitalcommons.odu.edu/mathstat_etds/70