Document Type

Article

Publication Date

2024

DOI

10.1002/sim.10194

Publication Title

Statistics in Medicine

Volume

Article in Press

Pages

1-16

Abstract

Increasingly, large, nationally representative health and behavioral surveys conducted under a multistage stratified sampling scheme collect high-dimensional data with correlation structured along some domain (eg, wearable sensor data measured continuously and correlated over time, imaging data with spatiotemporal correlation) with the goal of associating these data with health outcomes. Analysis of this sort requires novel methodologic work at the intersection of survey statistics and functional data analysis. Here, we address this crucial gap in the literature by proposing an estimation and inferential framework for generalizable scalar-on-function regression models for data collected under a complex survey design. We propose to: (1) estimate functional regression coefficients using weighted score equations; and (2) perform inference using novel functional balanced repeated replication and survey-weighted bootstrap for multistage survey designs. This is the first frequentist study to discuss the estimation of scalar-on-function regression models in the context of complex survey studies and to assess the validity of various inferential techniques based on re-sampling methods via a comprehensive simulation study. We apply our methods to predict mortality from diurnal activity profiles measured via wearable accelerometers in the National Health and Nutrition Examination Survey 2003-2006 data. The proposed computationally efficient methods are implemented in the R software package surveySoFR.
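
As an illustration of step (1) only, the following is a minimal sketch of a survey-weighted scalar-on-function fit for a Gaussian outcome. It is not the authors' implementation and does not reflect the surveySoFR interface. The function name `fit_weighted_sofr`, the cosine basis, the equally spaced grid on [0, 1], and the Riemann-sum approximation of the integral are all assumptions made for the example. The coefficient function is expanded in a small basis, and the weighted score equations D'W(y - Dθ) = 0 are solved directly.

```python
import numpy as np


def fit_weighted_sofr(X, y, w, n_basis=5):
    """Sketch of a survey-weighted linear scalar-on-function regression.

    X : (n, T) array of curves observed on an equally spaced grid over [0, 1]
    y : (n,) array of scalar outcomes
    w : (n,) array of survey weights
    Returns the intercept and the estimated coefficient function on the grid.
    """
    n, T = X.shape
    t = np.linspace(0.0, 1.0, T)
    # Cosine basis 1, cos(pi t), cos(2 pi t), ... (an assumed, illustrative choice)
    Phi = np.column_stack([np.cos(np.pi * k * t) for k in range(n_basis)])
    # Riemann-sum approximation of the integral of X_i(t) * phi_k(t) over [0, 1]
    Z = X @ Phi / T
    # Design matrix: intercept column plus basis scores
    D = np.column_stack([np.ones(n), Z])
    # Weighted score equations for a Gaussian outcome: D' W (y - D theta) = 0
    WD = D * w[:, None]
    theta = np.linalg.solve(D.T @ WD, WD.T @ y)
    beta_hat = Phi @ theta[1:]
    return theta[0], beta_hat
```

In this sketch the weights enter only the point estimate. Design-based standard errors would come from refitting under replicate weights, such as the balanced repeated replication or survey-weighted bootstrap described in the abstract, which the sketch does not attempt.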

Rights

© 2024 The Authors.

This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Data Availability

Article states: "The data that support the findings of this study are openly available in CDC NHANES at https://www.cdc.gov/nchs/nhanes/index.htm."

Original Publication Citation

Smirnova, E., Cui, E., Tabacu, L., & Leroux, A. (2024). Scalar-on-function regression: Estimation and inference under complex survey designs. Statistics in Medicine. Advance online publication. https://doi.org/10.1002/sim.10194
