Document Type
Article
Publication Date
2024
DOI
10.1049/cit2.12327
Publication Title
CAAI Transactions on Intelligence Technology
Volume
Article in Press
Pages
1-16
Abstract
Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical in addressing high-dimensional data. In real big data applications, the patterns and distributions of streaming features change constantly over time because of dynamic data generation environments. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance when the features are dynamic. To address these shortcomings, the authors propose a novel OSFS algorithm based on vague sets, named OSFS-Vague. Its main idea is to combine uncertainty and three-way decision theories to move feature selection from the traditional dichotomous method to a trichotomous one. OSFS-Vague also improves the method for calculating the correlation between features and labels. Moreover, OSFS-Vague uses the distance correlation coefficient to classify streaming features into relevant features, weakly redundant features, and redundant features. Finally, the relevant features and weakly redundant features are filtered to obtain an optimal feature set. To evaluate OSFS-Vague, extensive empirical experiments were conducted on 11 datasets. The results demonstrate that OSFS-Vague outperforms six state-of-the-art OSFS algorithms in terms of selection accuracy and computational efficiency.
Rights
© 2024 The Authors.
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International (CC BY-NC-ND 4.0) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Data Availability
Article states: "The data used to support the findings of this study are included within the article."
Original Publication Citation
Yang, J., Wang, Z. J., Wang, G. Y., Liu, Y. M., He, Y., & Wu, D. (2024). OSFS-Vague: Online streaming feature selection algorithm based on vague set. CAAI Transactions on Intelligence Technology. Advance online publication. https://doi.org/10.1049/cit2.12327
Repository Citation
Yang, J., Wang, Z. J., Wang, G. Y., Liu, Y. M., He, Y., & Wu, D. (2024). OSFS-Vague: Online streaming feature selection algorithm based on vague set. CAAI Transactions on Intelligence Technology. Advance online publication. https://doi.org/10.1049/cit2.12327
ORCID
0000-0002-5357-6623 (He)