dc.contributor.author | Wang, Zeyu | |
dc.contributor.author | Shi, Yuanchun | |
dc.contributor.author | Wang, Yuntao | |
dc.contributor.author | Yao, Yuchen | |
dc.contributor.author | Yan, Kun | |
dc.contributor.author | Wang, Yuhan | |
dc.contributor.author | Ji, Lei | |
dc.contributor.author | Xu, Xuhai | |
dc.contributor.author | Yu, Chun | |
dc.date.accessioned | 2024-06-06T16:47:57Z | |
dc.date.available | 2024-06-06T16:47:57Z | |
dc.date.issued | 2024-05-13 | |
dc.identifier.issn | 2474-9567 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/155208 | |
dc.description.abstract | Modern information querying systems are progressively incorporating multimodal inputs like vision and audio. However, the integration of gaze, a modality deeply linked to user intent and increasingly accessible via gaze-tracking wearables, remains underexplored. This paper introduces a novel gaze-facilitated information querying paradigm, named G-VOILA, which synergizes users' gaze, visual field, and voice-based natural language queries to facilitate a more intuitive querying process. In a user-enactment study involving 21 participants across 3 daily scenarios (p = 21, scene = 3), we revealed ambiguity in users' query language and a gaze-voice coordination pattern in users' natural query behaviors with G-VOILA. Based on the quantitative and qualitative findings, we developed a design framework for the G-VOILA paradigm, which effectively integrates gaze data with the in-situ querying context. We then implemented a G-VOILA proof-of-concept using cutting-edge deep learning techniques. A follow-up user study (p = 16, scene = 2) demonstrates its effectiveness, achieving higher objective and subjective scores than a baseline without gaze data. We further conducted interviews and provide insights for future gaze-facilitated information querying systems. | en_US
dc.publisher | Association for Computing Machinery | en_US |
dc.relation.isversionof | 10.1145/3659623 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by-nc/4.0/ | en_US |
dc.source | Association for Computing Machinery | en_US |
dc.title | G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Wang, Zeyu, Shi, Yuanchun, Wang, Yuntao, Yao, Yuchen, Yan, Kun et al. 2024. "G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 8 (2). | |
dc.relation.journal | Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies | en_US |
dc.identifier.mitlicense | PUBLISHER_CC | |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
dc.date.updated | 2024-06-01T07:58:52Z | |
dc.language.rfc3066 | en | |
dc.rights.holder | The author(s) | |
dspace.date.submission | 2024-06-01T07:58:53Z | |
mit.journal.volume | 8 | en_US |
mit.journal.issue | 2 | en_US |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |