Recent advances in computing, electronics, and communications have made an
increasing amount of audiovisual data available in a variety of application
areas, including television news, education, art, and entertainment. Massive
amounts of digital audiovisual data are provided through digital cameras,
digital TV, and the World Wide Web to a variety of users, from ordinary
consumers to sophisticated professionals.
Many applications, such as Web search engines, Video-On-Demand (VOD), and digital
libraries, require strong support from audiovisual retrieval techniques. Moreover,
professional applications, such as TV news production and archiving systems,
require new, adapted techniques to enable the effective management of a digital
Without adequate techniques and tools that provide easy and effective access
to audiovisual content, this huge and valuable quantity of data is hardly
usable. Users require powerful mechanisms for accessing audiovisual data
based on the rich information contained in the content itself, rather than
traditional retrieval methods based on a set of bibliographical descriptions.
Such "content-based access" to audiovisual data is, however, a complex issue,
mainly because of the richness of the content information and the diversity
of the users' semantic requirements, which depend on their various goals in
the retrieval process.
Many research and commercial activities currently focus on the development
of audiovisual retrieval techniques. One set of proposed techniques is
based on low-level visual features such as colour, texture, shape, and motion.
These techniques, which mostly rely on query-by-example and query-by-sketching,
can take advantage of automatic algorithms for extracting low-level visual
features. However, apart from specialised applications using restricted types
of data, such retrieval techniques are of limited use for retrieving
higher-level semantic concepts in general-purpose audiovisual retrieval
Other work has focused on automatic indexing using the text and speech
components of audiovisual media. These techniques take advantage of advances
in speech recognition, text recognition, and natural language analysis.
Although fully automatic indexing of audiovisual data is not realistic,
these techniques can help improve the description of audiovisual data in