Document Type
Article
Publication Date
2026
DOI
10.4018/IJMDEM.406715
Publication Title
International Journal of Multimedia Data Engineering and Management (IJMDEM)
Volume
16
Issue
1
Pages
1-19
Abstract
Joint visual attention (JVA) provides important insight into how individuals coordinate attention during social interaction. Egocentric eye tracking enables the study of JVA in natural, multi-user settings. This work presents a multi-stage framework to identify and analyze JVA using egocentric video and gaze data. The approach consists of three steps: spatiotemporal tube-based visual similarity, gaze-guided object detection, and attention pattern analysis using the ambient–focal coefficient K. Results show that object-focused collaborative activities exhibit high JVA, with object detection capturing higher joint attention than visual similarity, whereas conversation-based or independent activities show lower and more fragmented joint attention. Analysis of K reveals convergence during shared object interaction and divergence during independent tasks. Overall, the study demonstrates the value of combining object-level semantics and attention dynamics to understand JVA in real-world settings, with implications for psychology, human–computer interaction, and social robotics.
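The abstract does not define the ambient–focal coefficient K or say how it was computed. As a hedged illustration only, the sketch below uses the commonly cited formulation of K: for each fixation, the z-scored fixation duration minus the z-scored amplitude of the saccade that follows it. Positive values suggest focal viewing and negative values suggest ambient viewing. The function names, the use of the sample standard deviation, and the input values are assumptions made for this example; they are not taken from the article.

```python
from statistics import mean, stdev


def per_fixation_k(durations, next_amplitudes):
    """Return K_i for each fixation (illustrative sketch, not the authors' code).

    durations[i]       -- duration of fixation i (e.g. in ms)
    next_amplitudes[i] -- amplitude of the saccade following fixation i (e.g. in degrees)
    """
    if len(durations) != len(next_amplitudes):
        raise ValueError("inputs must have the same length")
    if len(durations) < 2:
        raise ValueError("need at least two fixations to standardize")

    mu_d, sd_d = mean(durations), stdev(durations)
    mu_a, sd_a = mean(next_amplitudes), stdev(next_amplitudes)
    if sd_d == 0 or sd_a == 0:
        raise ValueError("durations and amplitudes must each vary")

    # K_i = z(duration_i) - z(amplitude of the following saccade)
    return [
        (d - mu_d) / sd_d - (a - mu_a) / sd_a
        for d, a in zip(durations, next_amplitudes)
    ]


def window_k(k_values, start, end):
    """Mean K over fixations start..end-1; > 0 leans focal, < 0 leans ambient."""
    return mean(k_values[start:end])


# Hypothetical data: fixations lengthen while the following saccades shorten.
durations = [100, 200, 300, 400]
next_amplitudes = [8, 6, 4, 2]
k = per_fixation_k(durations, next_amplitudes)
early, late = window_k(k, 0, 2), window_k(k, 2, 4)
```

Because both terms are standardized over the same set of fixations, the mean of K over the whole recording is zero by construction, so K is informative only when averaged over a window or segment, as in `window_k`. Comparing such windowed values between two participants is one way the convergence and divergence described in the abstract might be examined, although the article's own procedure may differ.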
Rights
© 2026 The Authors.
This article is published as an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) License, which permits unrestricted use, distribution, and reproduction in any medium, provided the author of the original work and original publication source are properly credited.
Original Publication Citation
Thennakoon, K., Abeysinghe, Y., Mahanama, B., Ashok, V., & Jayarathna, S. (2026). Modeling joint visual attention in naturalistic dyadic interactions. International Journal of Multimedia Data Engineering and Management (IJMDEM), 16(1), 1-19. https://doi.org/10.4018/IJMDEM.406715
Repository Citation
Thennakoon, K., Abeysinghe, Y., Mahanama, B., Ashok, V., & Jayarathna, S. (2026). Modeling joint visual attention in naturalistic dyadic interactions. International Journal of Multimedia Data Engineering and Management (IJMDEM), 16(1), 1-19. https://doi.org/10.4018/IJMDEM.406715
ORCID
0009-0009-1697-1614 (Thennakoon), 0000-0002-5114-9732 (Abeysinghe), 0000-0001-7773-7471 (Mahanama), 0000-0002-4772-1265 (Ashok), 0000-0002-4879-7309 (Jayarathna)
Included in
Artificial Intelligence and Robotics Commons, Cognition and Perception Commons, Communication Technology and New Media Commons, Graphics and Human Computer Interfaces Commons