Date of Award

Summer 2018

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematics and Statistics

Committee Director

N. Rao Chaganty

Committee Member

Norou Diawara

Committee Member

Kayoung Park

Committee Member

Tina Cunningham

Abstract

Count data often exhibit inflated counts for zero. Numerous papers in the literature show how to fit Poisson regression models that account for zero inflation. However, in many situations the frequencies of zero and of some other value k tend to be higher than the Poisson model can fit appropriately. Recently, Sheth-Chandra (2011) and Lin and Tsai (2012) introduced a mixture model to account for the inflated frequencies of zero and k. In this dissertation, we study basic properties of this mixture model and parameter estimation for grouped and ungrouped data. Using a stochastic representation, we show how the EM algorithm can be adapted to obtain maximum likelihood estimates of the parameters. We derive the observed information matrix, which yields standard errors of the EM estimates, using ideas from Louis (1982). We also derive asymptotic distributions to test the significance of the inflation points. We use real-life examples to illustrate the procedure of fitting our model via the EM algorithm.
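As a rough illustration of the EM idea described above, the following minimal sketch fits a zero and k inflated Poisson mixture without covariates. It assumes the parameterization P(Y = y) = p0·1{y = 0} + pk·1{y = k} + (1 − p0 − pk)·Poisson(y; λ), with k ≥ 1; the function names (`poisson_pmf`, `fit_zkip_em`), the starting values, and the stopping rule are hypothetical choices made for this example and are not taken from the dissertation.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at y, computed in log-space for stability."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def fit_zkip_em(data, k, max_iter=5000, tol=1e-12):
    """EM sketch for a no-covariate zero and k inflated Poisson mixture.

    Assumed model (illustrative parameterization):
        P(Y=y) = p0*[y==0] + pk*[y==k] + (1 - p0 - pk) * Poisson(y; lam)
    Returns the tuple (p0, pk, lam).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = len(data)
    n0 = sum(1 for y in data if y == 0)   # number of observed zeros
    nk = sum(1 for y in data if y == k)   # number of observed k's
    total = sum(data)

    # Crude starting values (an arbitrary choice for this sketch).
    p0, pk, lam = 0.1, 0.1, max(total / n, 1e-8)

    for _ in range(max_iter):
        p_pois = 1.0 - p0 - pk
        # E-step: posterior probability that an observed 0 (or k)
        # came from the point mass rather than the Poisson component.
        w0 = p0 / (p0 + p_pois * poisson_pmf(0, lam))
        wk = pk / (pk + p_pois * poisson_pmf(k, lam))
        z0, zk = n0 * w0, nk * wk         # expected point-mass counts

        # M-step: closed-form updates of the mixing weights and the mean.
        new_p0, new_pk = z0 / n, zk / n
        new_lam = max((total - k * zk) / (n - z0 - zk), 1e-8)

        change = max(abs(new_p0 - p0), abs(new_pk - pk), abs(new_lam - lam))
        p0, pk, lam = new_p0, new_pk, new_lam
        if change < tol:
            break
    return p0, pk, lam
```

Calling `fit_zkip_em(counts, k=3)` on a list of non-negative integers returns the estimated point masses at 0 and at 3 together with the Poisson mean. The sketch covers only point estimation for ungrouped data; standard errors from the observed information matrix and regression on covariates are not included.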

The second part of this dissertation deals with a generalization of this mixture model in which the one-parameter Poisson distribution is replaced by the two-parameter Conway-Maxwell-Poisson (CMP) distribution, which, unlike the Poisson distribution, accounts for both over- and underdispersion in count data. The CMP distribution has recently gained popularity, and a CMP model for zero-inflated count data was introduced by Sellers and Raim (2016). We discuss properties of the CMP distribution and propose a new mixture distribution, namely the zero and k inflated Conway-Maxwell-Poisson (ZkICMP), to address inflated counts with over- or underdispersion. We develop regression models based on ZkICMP and discuss parameter estimation using analytical and numerical methods. Finally, we compare the goodness of fit of inflated and standard models on simulated and real-life data examples.
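To make the CMP generalization concrete, the sketch below evaluates the standard CMP probability mass function, P(Y = y) = λ^y / ((y!)^ν Z(λ, ν)) with Z(λ, ν) = Σ_j λ^j / (j!)^ν, and a zero and k inflated mixture built on it. The mixture parameterization (point masses p0 at 0 and pk at k, remaining weight on the CMP component), the function names, and the truncation of the infinite series at `max_terms` are assumptions made for this example; truncation is only adequate when ν is not too close to 0.

```python
import math


def cmp_pmf(y, lam, nu, max_terms=500):
    """Conway-Maxwell-Poisson mass at y, with a truncated normalizing series.

    nu = 1 recovers the Poisson distribution; nu > 1 gives underdispersion
    and 0 < nu < 1 gives overdispersion.
    """
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1)
                 for j in range(max_terms)]
    # Log-sum-exp for a stable approximation of log Z(lam, nu).
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return math.exp(y * math.log(lam) - nu * math.lgamma(y + 1) - log_z)


def zkicmp_pmf(y, k, p0, pk, lam, nu):
    """Zero and k inflated CMP mass at y (assumed parameterization)."""
    mass = (1.0 - p0 - pk) * cmp_pmf(y, lam, nu)
    if y == 0:
        mass += p0
    if y == k:
        mass += pk
    return mass
```

For instance, `zkicmp_pmf(0, 3, 0.2, 0.15, 2.0, 1.5)` gives the probability of a zero under a mixture with inflation at 0 and 3 and an underdispersed CMP component.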

DOI

10.25777/nz1e-d763
