Document Type
Conference Paper
Publication Date
2020
DOI
10.1145/3383583.3398589
Publication Title
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, August 1-5, 2020, Virtual Event, China
Pages
513-514
Conference Name
ACM/IEEE Joint Conference on Digital Libraries in 2020, August 1-5, 2020, Virtual Event, China
Abstract
Researchers reuse data from past studies to avoid costly re-collection of experimental data. However, large-scale data reuse is challenging due to a lack of consensus on metadata representations among research groups and disciplines. Dataset File System (DFS) is a semi-structured data description format that promotes such consensus by standardizing the semantics of data description, storage, and retrieval. In this paper, we present analytic-streams – a specification for streaming data analytics with DFS, and streaming-hub – a visual programming toolkit built on DFS to simplify data analysis workflows. Analytic-streams facilitate higher-order data analysis with less computational overhead, while streaming-hub enables storage, retrieval, manipulation, and visualization of data and analytics. We discuss how they simplify data pre-processing, aggregation, and visualization, and their implications for data analysis workflows.
Original Publication Citation
Jayawardana, Y., & Jayarathna, S. (2020). Streaming analytics and workflow automation for DFS. ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. https://doi.org/10.1145/3383583.3398589
Repository Citation
Jayawardana, Y., & Jayarathna, S. (2020). Streaming analytics and workflow automation for DFS. ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. https://doi.org/10.1145/3383583.3398589
ORCID
0000-0001-5992-6818 (Jayawardana), 0000-0002-4879-7309 (Jayarathna)
Comments
© 2020 the Authors
Included with written permission.