Inference for Informative Intra-Cluster Group Sizes in Clustered Data
College
College of Sciences
Department
Mathematics and Statistics
Graduate Level
Doctoral
Graduate Program/Concentration
Computational and Applied Mathematics (Statistics Concentration)
Presentation Type
Oral Presentation
Abstract
Clustered data is a special type of correlated data in which units within a cluster are correlated while units in different clusters are independent. For example, in dental studies, individuals are treated as clusters and the teeth of an individual are the units within a cluster. When the aim is to compare outcomes between two groups of units in clustered data, the number of units belonging to a group in a typical cluster, i.e., the intra-cluster group size, can be associated with the outcome from that group in that cluster. Although the analysis of such clustered data has recently gained importance, no formal statistical method exists for testing a hypothesis of informative intra-cluster group sizes (IICGS). Ignoring the possible existence of IICGS during group-based outcome comparisons in clustered data can, however, result in biased inference. In this research, we develop a statistical hypothesis-testing procedure for testing a claim of IICGS in a clustered data setting. Our method combines a Kolmogorov-Smirnov-type nonparametric test statistic with a bootstrap hypothesis-testing procedure. Using a variety of simulated data sets, we demonstrate that the proposed test maintains the nominal type-I error rate and has substantial power to identify IICGS in clustered data. We also apply the proposed test to real-life data sets.
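To make the general idea concrete, the following is a minimal sketch of a Kolmogorov-Smirnov-type statistic combined with a cluster-level bootstrap test. It is an illustration under stated assumptions, not the procedure developed in this research, since the abstract does not specify the statistic or the resampling scheme. The assumptions are: clusters are split into "small" and "large" at the median group size, each cluster contributes its own within-cluster empirical CDF with equal weight, and the null distribution is approximated by resampling cluster ECDFs from the pooled set while keeping the size labels fixed. All function names are hypothetical.

```python
import numpy as np

def cluster_ecdf(values, grid):
    """Within-cluster empirical CDF evaluated on a common grid."""
    values = np.asarray(values, dtype=float)
    return (values[:, None] <= grid[None, :]).mean(axis=0)

def ks_type_statistic(ecdfs, is_large):
    """Sup-distance between cluster-averaged ECDFs of the two size classes."""
    diff = ecdfs[is_large].mean(axis=0) - ecdfs[~is_large].mean(axis=0)
    return float(np.abs(diff).max())

def bootstrap_iicgs_test(clusters, n_boot=500, seed=0):
    """Illustrative test of association between group size and outcome.

    clusters: list of 1-D sequences, each holding the outcomes of the
    units of one group within one cluster (assumed data layout).
    Returns (observed statistic, bootstrap p-value).
    """
    rng = np.random.default_rng(seed)
    sizes = np.array([len(c) for c in clusters])
    # Assumption: a median split defines "small" versus "large" group sizes.
    is_large = sizes > np.median(sizes)
    if not is_large.any():
        raise ValueError("group sizes do not vary; cannot form two size classes")

    grid = np.unique(np.concatenate(clusters))
    ecdfs = np.vstack([cluster_ecdf(c, grid) for c in clusters])
    observed = ks_type_statistic(ecdfs, is_large)

    m = len(clusters)
    exceed = 0
    for _ in range(n_boot):
        # Draw clusters with replacement from the pooled set and attach them
        # to the fixed size labels, which breaks any size-outcome association.
        idx = rng.integers(0, m, size=m)
        if ks_type_statistic(ecdfs[idx], is_large) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_boot)
    return observed, p_value
```

Calling `bootstrap_iicgs_test(clusters)` on a list of per-cluster outcome arrays returns the observed statistic and a p-value; a small p-value would suggest that outcomes differ between clusters with small and large group sizes. Averaging within-cluster ECDFs, rather than pooling all units, is one simple way to keep large clusters from dominating the comparison.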
Keywords
Bootstrapping, clustered data, correlated data, hypothesis testing, informative intra-cluster group size, resampling