Date of Award
Summer 1989
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Mathematics & Statistics
Program/Concentration
Computational and Applied Mathematics
Committee Director
Dayanand N. Naik
Committee Member
Ram C. Dahiya
Committee Member
Michael J. Doviak
Committee Member
Edward P. Markowski
Abstract
Observations arising from a linear regression model sometimes lead one to believe that a particular observation, or a set of observations, is aberrant from the rest of the data. Such observations may arise in several ways: for example, from incorrect or faulty measurements, or from gross errors in either the response or the explanatory variables. Sometimes the model may inadequately describe the systematic structure of the data, or the data may be better analyzed on another scale. When diagnostics indicate the presence of anomalous data, either these data are indeed unusual and hence helpful, or they are contaminated and therefore in need of modification or deletion.
Therefore, it is desirable to develop a technique that can identify unusual observations and determine how they influence the response variate. A large number of statistics are used in the literature to detect outliers and influential observations in linear regression models. Two kinds of comparison studies are carried out in this dissertation to determine an optimal statistic: (i) one using several data sets studied by different authors, and (ii) a detailed simulation study. Various choices of the design matrix of the regression model are considered in order to study the performance of these statistics under multicollinearity and in other situations. Calibration points based on the exact distributions and Bonferroni's inequality are given for each statistic. The results show that, in general, a set of two or three statistics is needed to detect outliers, and a different set of statistics is needed to detect influential observations.
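As a rough illustration of the kind of statistics discussed above, the following sketch computes three widely used diagnostics for a simple linear regression: leverage, the externally studentized residual, and Cook's distance. The abstract does not say which statistics the dissertation compares, so this particular choice, the function name `regression_diagnostics`, and the small data set are assumptions made for illustration only. Under normal errors the externally studentized residual follows a t distribution with n - p - 1 degrees of freedom, so a Bonferroni calibration point would be the upper alpha/(2n) quantile of that distribution; the quantile is not computed here because the Python standard library does not provide one.

```python
from math import sqrt


def regression_diagnostics(x, y):
    """Diagnostics for the least-squares fit y = a + b*x (illustrative sketch).

    Returns one dict per observation with its leverage, externally
    studentized residual ("t") and Cook's distance ("cook").
    """
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual mean square

    rows = []
    for xi, e in zip(x, residuals):
        # Leverage: diagonal element of the hat matrix for this observation.
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx
        # Internally studentized residual.
        r = e / sqrt(s2 * (1.0 - h))
        # Externally studentized residual, obtained from r without refitting.
        t = r * sqrt((n - p - 1) / (n - p - r * r))
        # Cook's distance: combines residual size and leverage.
        cook = r * r * h / (p * (1.0 - h))
        rows.append({"leverage": h, "t": t, "cook": cook})
    return rows


# Hypothetical data: roughly y = 2x + 1, with the fifth response grossly in error.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 9.0, 20.0, 13.1, 14.8, 17.1, 19.0, 20.9]

rows = regression_diagnostics(x, y)
suspect = max(range(len(rows)), key=lambda i: abs(rows[i]["t"]))
print("largest |t| at index", suspect, rows[suspect])
```

In this toy example the residual-based statistic and Cook's distance happen to point at the same observation. In general they measure different things (outlyingness in the response versus effect on the fit), which is consistent with the abstract's remark that different sets of statistics are needed for outliers and for influential observations.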
Various measures have been proposed that emphasize different aspects of influence on the linear regression model. Many of the existing measures for detecting influential observations in linear regression models have natural extensions to multivariate regression. These measures of influence are generalized to the multivariate regression model and to multivariate analysis of variance models. Several data sets are considered to illustrate the methods. Regression models with autocorrelated errors are also studied, with the aim of developing diagnostic statistics based on intervention analysis.
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
DOI
10.25777/gte7-c039
Recommended Citation
Hossain, Anwar M.
"Detection of Outliers and Influential Observations in Regression Models"
(1989). Doctor of Philosophy (PhD), Dissertation, Mathematics & Statistics, Old Dominion University, DOI: 10.25777/gte7-c039
https://digitalcommons.odu.edu/mathstat_etds/80