Document Type

Article

Publication Date

2024

DOI

10.1049/cit2.12327

Publication Title

CAAI Transactions on Intelligence Technology

Volume

Article in Press

Pages

1-16

Abstract

Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high-dimensional data. In real big data applications, the patterns and distributions of streaming features change constantly over time because the data are generated in dynamic environments. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance when the features are dynamic. To address this shortcoming, the authors propose a novel OSFS algorithm based on vague sets, named OSFS-Vague. Its main idea is to combine uncertainty theory and three-way decision theory so that feature selection moves from the traditional dichotomous method to a trichotomous one. OSFS-Vague also improves the method for calculating the correlation between features and labels. Moreover, OSFS-Vague uses the distance correlation coefficient to classify streaming features into relevant features, weakly redundant features, and redundant features. Finally, the relevant features and weakly redundant features are filtered to obtain an optimal feature set. To evaluate the proposed OSFS-Vague, extensive empirical experiments were conducted on 11 datasets. The results demonstrate that OSFS-Vague outperforms six state-of-the-art OSFS algorithms in terms of selection accuracy and computational efficiency.
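
As an illustration of the trichotomous idea mentioned in the abstract, the following is a minimal sketch of the standard sample distance correlation coefficient followed by a three-region split. It is not the authors' OSFS-Vague implementation. The function names, the thresholds `low` and `high`, and the mapping from score ranges to the three categories are all assumptions made for this example; the abstract does not specify them.

```python
import math


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # a constant sequence carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


def three_way_label(score, low=0.3, high=0.7):
    """Hypothetical three-region split; thresholds are illustrative only."""
    if score >= high:
        return "relevant"
    if score >= low:
        return "weakly redundant"
    return "redundant"


if __name__ == "__main__":
    feature = [1.0, 2.0, 3.0, 4.0, 5.0]
    label = [2.0, 4.0, 6.0, 8.0, 10.0]
    score = distance_correlation(feature, label)
    print(round(score, 3), three_way_label(score))
```

Unlike the Pearson coefficient, distance correlation also responds to nonlinear dependence, which is one plausible reason to prefer it for features whose relationship to the label is not linear. The fixed thresholds here are only placeholders for whatever decision regions the method actually derives.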

Rights

© 2024 The Authors.

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International (CC BY-NC-ND 4.0) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial, and no modifications or adaptations are made.

Data Availability

Article states: "The data used to support the findings of this study are included within the article."

Original Publication Citation

Yang, J., Wang, Z. J., Wang, G. Y., Liu, Y. M., He, Y., & Wu, D. (2024). OSFS-Vague: Online streaming feature selection algorithm based on vague set. CAAI Transactions on Intelligence Technology. Advance online publication. https://doi.org/10.1049/cit2.12327

ORCID

0000-0002-5357-6623 (He)
