
dc.contributor.author: Zlatintsi, Athanasia
dc.contributor.author: Koutras, Petros
dc.contributor.author: Malandrakis, Nikolaos
dc.contributor.author: Efthymiou, Niki
dc.contributor.author: Pastra, Katerina
dc.contributor.author: Potamianos, Alexandros
dc.contributor.author: Maragos, Petros
dc.contributor.author: Evangelopoulos, Georgios
dc.date.accessioned: 2018-02-22T19:24:10Z
dc.date.available: 2018-02-22T19:24:10Z
dc.date.issued: 2017-08
dc.identifier.issn: 1687-5281
dc.identifier.issn: 1687-5176
dc.identifier.uri: http://hdl.handle.net/1721.1/113872
dc.description.abstract: Research related to computational modeling for machine-based understanding requires ground truth data for training, content analysis, and evaluation. In this paper, we present a multimodal video database, namely COGNIMUSE, annotated with sensory and semantic saliency, events, cross-media semantics, and emotion. The purpose of this database is manifold; it can be used for training and evaluation of event detection and summarization algorithms, for classification and recognition of audio-visual and cross-media events, as well as for emotion tracking. In order to enable comparisons with other computational models, we propose state-of-the-art algorithms, specifically a unified energy-based audio-visual framework and a method for text saliency computation, for the detection of perceptually salient events from videos. Additionally, a movie summarization system for the automatic production of summaries is presented. Two kinds of evaluation were performed: an objective evaluation based on the saliency annotation of the database, and an extensive qualitative human evaluation of the automatically produced summaries, in which we investigated what composes high-quality movie summaries; both evaluations verified the appropriateness of the proposed methods. The annotation of the database and the code for the summarization system can be found at http://cognimuse.cs.ntua.gr/database. Keywords: Video database, Saliency, Cross-media relations, Emotion annotation, Audio-visual events, Video summarization. (en_US)
dc.publisher: Springer International Publishing (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1186/s13640-017-0194-1 (en_US)
dc.rights: Creative Commons Attribution (en_US)
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ (en_US)
dc.source: Springer International Publishing (en_US)
dc.title: COGNIMUSE: a multimodal video database annotated with saliency, events, semantics and emotion with application to summarization (en_US)
dc.type: Article (en_US)
dc.identifier.citation: Zlatintsi, Athanasia, et al. “COGNIMUSE: A Multimodal Video Database Annotated with Saliency, Events, Semantics and Emotion with Application to Summarization.” EURASIP Journal on Image and Video Processing, vol. 2017, no. 1, Dec. 2017. (en_US)
dc.contributor.department: McGovern Institute for Brain Research at MIT (en_US)
dc.contributor.mitauthor: Evangelopoulos, Georgios
dc.relation.journal: EURASIP Journal on Image and Video Processing (en_US)
dc.eprint.version: Final published version (en_US)
dc.type.uri: http://purl.org/eprint/type/JournalArticle (en_US)
eprint.status: http://purl.org/eprint/status/PeerReviewed (en_US)
dc.date.updated: 2017-08-08T04:02:46Z
dc.language.rfc3066: en
dc.rights.holder: The Author(s)
dspace.orderedauthors: Zlatintsi, Athanasia; Koutras, Petros; Evangelopoulos, Georgios; Malandrakis, Nikolaos; Efthymiou, Niki; Pastra, Katerina; Potamianos, Alexandros; Maragos, Petros (en_US)
dspace.embargo.terms: N (en_US)
dc.identifier.orcid: https://orcid.org/0000-0003-2240-1801
mit.license: PUBLISHER_CC (en_US)
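
Note: The abstract above outlines a saliency-based summarization pipeline (multimodal saliency detection followed by selection of salient segments). As a purely illustrative sketch, and not the authors' released code, the following Python snippet shows one simple way such a pipeline could be wired: per-frame audio, visual, and text saliency scores are fused by a weighted average, and contiguous high-saliency runs are kept as summary segments. The function names, weights, and threshold here are assumptions made for illustration; the actual COGNIMUSE annotations and code are linked in the abstract.

    # Illustrative sketch only: fuse per-frame modality saliency scores and pick
    # summary segments. Not the COGNIMUSE code; weights/threshold are placeholders.

    def fuse_saliency(audio, visual, text, weights=(0.4, 0.4, 0.2)):
        """Weighted average of per-frame saliency scores (lists of equal length)."""
        wa, wv, wt = weights
        return [wa * a + wv * v + wt * t for a, v, t in zip(audio, visual, text)]

    def summary_segments(saliency, threshold=0.5, min_len=3):
        """Return (start, end) frame ranges where fused saliency stays above threshold."""
        segments, start = [], None
        for i, s in enumerate(saliency):
            if s >= threshold and start is None:
                start = i
            elif s < threshold and start is not None:
                if i - start >= min_len:
                    segments.append((start, i))
                start = None
        if start is not None and len(saliency) - start >= min_len:
            segments.append((start, len(saliency)))
        return segments

    if __name__ == "__main__":
        # Toy per-frame scores standing in for real audio/visual/text saliency curves.
        audio  = [0.1, 0.8, 0.9, 0.7, 0.2, 0.1, 0.6, 0.9, 0.8, 0.1]
        visual = [0.2, 0.7, 0.8, 0.6, 0.3, 0.2, 0.7, 0.8, 0.9, 0.2]
        text   = [0.0, 0.5, 0.6, 0.5, 0.1, 0.0, 0.4, 0.6, 0.5, 0.0]
        fused = fuse_saliency(audio, visual, text)
        print(summary_segments(fused))  # [(1, 4), (6, 9)] for these toy scores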

