Recently, a growing number of conference talks, particularly scientific
ones, are being recorded for later retrieval. As the amount of recorded
data grows, it becomes rather difficult to find the appropriate video,
or video sequence, of a talk. For instance, current search engines are
not able to answer complex queries such as “Find a sequence of a
recorded talk, given in an academic training lecture, in 2007, in
Italy, where a colleague of professor X talked about image indexing
after the coffee break”.
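To make the example concrete, the sketch below shows what answering such a query amounts to: checking several content and context facets of a talk at once. It is purely illustrative; the record fields, the "colleague of" stand-in, and the matches() helper are hypothetical and not part of any existing system.

    # Purely illustrative: fields and values are hypothetical, assuming talks
    # are indexed as structured records combining content and context.
    talks = [
        {"year": 2007, "country": "Italy", "event": "academic training lecture",
         "speaker_colleagues": {"Professor X"},   # crude stand-in for "colleague of"
         "topic": "image indexing", "slot": "after coffee break",
         "sequence": "talk42.mp4#t=1200,1500"},
    ]

    def matches(talk):
        # Check every facet of the example query against one record.
        return (talk["year"] == 2007
                and talk["country"] == "Italy"
                and talk["event"] == "academic training lecture"
                and "Professor X" in talk["speaker_colleagues"]
                and talk["topic"] == "image indexing"
                and talk["slot"] == "after coffee break")

    hits = [t["sequence"] for t in talks if matches(t)]
    print(hits)  # -> ['talk42.mp4#t=1200,1500']

Note that the answer is a video sequence (a temporal fragment), not a whole video, which is precisely what current keyword-based search engines cannot return.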
This example illustrates the so-called semantic gap, which, as defined
in [1], “is an important issue in many computer vision systems, but
particularly for indexing. It refers to the lack of coincidence between
machine low-level digital representations of visual data and the human
high-level cognitive understanding of the same data”. This is
particularly relevant for information retrieval activities. Low-level
features (color, slide transitions, slide animations, etc.) can be
automatically extracted during a video indexing stage, while
higher-level features are based on rich human semantics (concepts,
topics, people, etc.) and therefore involve human intervention [2, 3, 4].
In order to bridge this semantic gap, information should be indexed
according to users' expectations, allowing search engines to find
suitable data matching users' requirements. To allow users to submit
complex queries such as the one cited above, we argue that the entire
conference information, which we refer to as OHLA (cOnference
High-Level informAtion), should be used to enrich the indexing of video
recordings. In our work, OHLA refers to all the information related to
a conference: the video recording of the talk with all the information
that can be extracted from its content (video segmentation, keywords,
topics, etc.), the presentation of the talk (ppt, pdf, etc.), speaker
and attendee information (name, the organization he/she belongs to,
publications), administrative information (conference planning,
logistics, etc.), related demos, related events, and so on. OHLA
information is generated throughout the conference life cycle. Such
information can be extracted automatically or manually and used to
provide higher-level indexing of video content from the user's semantic
point of view. Thus, OHLA can help bridge the semantic gap.
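As a minimal sketch of how such conference-wide information could be grouped, consider the structures below; the class and field names are our own illustration, not a normative definition of OHLA.

    from dataclasses import dataclass, field

    # Illustrative grouping of OHLA-style information; all names are hypothetical.
    @dataclass
    class Talk:
        video_uri: str                                    # the recorded talk
        segments: list = field(default_factory=list)      # video segmentation
        keywords: list = field(default_factory=list)      # extracted from content
        slides_uri: str = ""                              # presentation (ppt, pdf, ...)

    @dataclass
    class Person:
        name: str
        organization: str
        publications: list = field(default_factory=list)

    @dataclass
    class ConferenceRecord:                               # OHLA: content plus context
        talk: Talk
        speaker: Person
        attendees: list = field(default_factory=list)
        planning: dict = field(default_factory=dict)      # administrative information
        related_events: list = field(default_factory=list)

The point of the grouping is that content information (segments, keywords) and context information (speaker, planning, related events) live in a single record, so a query can constrain both at once.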
Manual annotation is accurate because the description is based on human
perception of the semantic content of the video. Unfortunately, manual
annotation is a labor-intensive, time-consuming and tedious process,
especially given the exponential growth of video collections. Automatic
annotation, as stated above, often lacks the semantic dimension users
need when searching for information. A semiautomatic approach reduces
the burden of manual annotation by combining the manual approach with
the automatic one.
To address the issue of video content description, several standards
and annotation formats have been defined (MPEG-7, RDF, etc.). As a
consequence, data is annotated using heterogeneous formats,
complicating even further the process of information retrieval.
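As a simplified, hypothetical illustration of this heterogeneity (the structures below only mimic MPEG-7-like and RDF-like annotations; they are not actual parsers for either standard), even a single fact such as the speaker's name must be reconciled across formats before it can be queried uniformly:

    # Hypothetical simplification: the same speaker name as it might appear
    # in an MPEG-7-style XML tree and as an RDF-style triple.
    mpeg7_like = {"CreationInformation": {"Creator": {"GivenName": "X"}}}
    rdf_like = [("talk42", "dc:creator", "X")]

    def speaker_of(annotation):
        # Map both heterogeneous forms onto one query-time view.
        if isinstance(annotation, dict):                  # MPEG-7-like tree
            return annotation["CreationInformation"]["Creator"]["GivenName"]
        return next(o for s, p, o in annotation if p == "dc:creator")  # RDF-like

    assert speaker_of(mpeg7_like) == speaker_of(rdf_like) == "X"

Without such a common view, a retrieval system must issue one query per format and merge the results by hand.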
Several works have been carried out to improve multimedia information
retrieval by focusing on either the issue of video content description
or that of heterogeneous metadata formats. What is currently missing is
an integrated system that jointly considers these two issues in order
to provide efficient information retrieval. Based on this observation,
this thesis presents CALIMERA (Conference Advanced Level Information
ManagEment & RetrievAl), an integrated framework for content- and
context-based video retrieval. CALIMERA is based on HELO (High-level
modEL Ontology), an integrated conference ontology model covering the
entire conference information.
Figure: CALIMERA framework global view.
[1] Andrew P. G., Ph.D. thesis, Queen Mary, University of London, 2006.
[2] Ying Liu et al., “A survey of content-based image retrieval with high-level semantics,” Pattern Recognition, 2007, pp. 262–282.
[3] I.L. Coman and I.K. Sethi, “Mining association rules between low-level image features and high-level concepts,” SPIE Data Mining and Knowledge Discovery, vol. 3, 2001.
[4] Jonathon S. et al., “Bridging the semantic gap in multimedia information retrieval,” 2006.