Categories |
MULTIMEDIA
COMPUTER VISION
KNOWLEDGE GRAPH
|
About |
Deep video understanding is a difficult task that requires systems to analyze and understand the relationships between different entities in a video, to use known information to reason about other, more hidden information, and to populate a knowledge graph (KG) with all acquired information. To work on this task, a system should take into consideration all available modalities (speech, image/video, and in some cases text). The aim of this new challenge is to push the limits of multimodal extraction, fusion, and analysis techniques to address the problem of analyzing long-duration videos holistically and extracting useful knowledge that can be used to answer different types of queries. The target knowledge includes both visual and non-visual elements. As videos and multimedia data become more popular and more widely used across different domains, the research, approaches, and techniques applied in this Grand Challenge will be highly relevant in the coming years. |
Call for Papers |
Interested participants are invited to apply their approaches and methods to a novel High-Level Video Understanding (HLVU) dataset being made available by the challenge organizers. The dataset includes 10 movies with a Creative Commons license. It will be annotated by human assessors, and ground truth (ontology of relations, entities, actions and events; names and images of all main characters; and a knowledge graph for 50% of the movies) will be provided to participating researchers for training and development of their systems. The organizers will support evaluation and scoring of three main query types distributed with the dataset (please refer to the dataset webpage for more details).
|
Credits and Sources |
[1] DVU-Challenge 2020: Deep Video Understanding - ACM Multimedia Grand Challenge |