Document Type

Conference Paper

Publication Date

2020

DOI

10.1145/3383583.3398589

Publication Title

Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, August 1-5, 2020, Virtual Event, China

Pages

513-514

Conference Name

ACM/IEEE Joint Conference on Digital Libraries in 2020, August 1-5, 2020, Virtual Event, China

Abstract

Researchers reuse data from past studies to avoid costly re-collection of experimental data. However, large-scale data reuse is challenging due to a lack of consensus on metadata representations among research groups and disciplines. Dataset File System (DFS) is a semi-structured data description format that promotes such consensus by standardizing the semantics of data description, storage, and retrieval. In this paper, we present analytic-streams, a specification for streaming data analytics with DFS, and streaming-hub, a visual programming toolkit built on DFS to simplify data analysis workflows. Analytic-streams facilitate higher-order data analysis with less computational overhead, while streaming-hub enables storage, retrieval, manipulation, and visualization of data and analytics. We discuss how they simplify data pre-processing, aggregation, and visualization, and their implications for data analysis workflows.

Comments

© 2020 the Authors

Included with written permission.

Original Publication Citation

Jayawardana, Y., & Jayarathna, S. (2020). Streaming analytics and workflow automation for DFS. ACM/IEEE Joint Conference on Digital Libraries in 2020, Virtual Event, China, August 1-5, 2020. https://doi.org/10.1145/3383583.3398589

ORCID

0000-0001-5992-6818 (Jayawardana), 0000-0002-4879-7309 (Jayarathna)
