Background

Recent advances in computing, electronics, and communications have made an increasing amount of audiovisual data available in a variety of application areas, including television news, education, art, and entertainment. Massive amounts of digital audiovisual data are delivered through digital cameras, digital TV, and the World Wide Web to a wide range of users, from ordinary consumers to sophisticated professionals.

Many applications, such as Web search engines, Video-On-Demand (VOD), and digital libraries, require strong support from audiovisual retrieval techniques. Moreover, professional applications, such as TV news production and archiving systems, require new, adapted techniques to enable the effective management of a digital audiovisual corpus.

Without adequate techniques and tools that provide easy and effective access to audiovisual content, this huge and valuable quantity of data is hardly usable. Users require powerful mechanisms for accessing audiovisual data based on the rich information contained in the content itself, rather than relying on traditional retrieval methods based on a set of bibliographic descriptions. Such "content-based access" to audiovisual data is, however, a complex issue, mainly because of the richness of the content information and the diversity of users' semantic requirements, which vary with their goals in the retrieval process.

Many research and commercial activities currently focus on the development of audiovisual retrieval techniques. One set of techniques is based on low-level visual features such as colour, texture, shape, and motion. These techniques, which mostly rely on query-by-example and query-by-sketching, can take advantage of automatic algorithms for extracting low-level visual features. However, apart from specialised applications using restricted types of data, such techniques are of limited use for retrieving higher-level semantic concepts in general-purpose audiovisual retrieval applications.
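
To make the idea of query-by-example on low-level features concrete, the following is a minimal sketch, not a description of any particular system. It assumes images are given as lists of RGB pixel tuples, uses a coarse colour histogram as the only feature, and ranks items by histogram intersection. The function names, the bin count, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Quantise RGB pixels (0-255 per channel) into a normalised histogram.

    The bin count is an illustrative assumption, not a recommended value.
    """
    if not pixels:
        return {}
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of per-bin minima of two normalised histograms."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def query_by_example(example_pixels, collection):
    """Rank collection items (name -> pixel list) by similarity to the example."""
    query = colour_histogram(example_pixels)
    scores = {
        name: histogram_intersection(query, colour_histogram(pixels))
        for name, pixels in collection.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy collection: each "image" is a handful of hand-picked pixels.
collection = {
    "sunset": [(250, 120, 30)] * 8 + [(40, 20, 60)] * 2,
    "forest": [(20, 130, 40)] * 9 + [(90, 60, 30)] * 1,
    "sea": [(10, 60, 200)] * 10,
}
example = [(240, 110, 20)] * 7 + [(30, 30, 70)] * 3

ranking = query_by_example(example, collection)
print(ranking[0][0])  # the item whose colour distribution best matches the example
```

The sketch also illustrates the limitation noted above: the ranking reflects only colour distribution, so two images with similar colours but unrelated subjects would score as close matches.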

Other work has focused on automatic indexing using the text and speech components of audiovisual media. These techniques take advantage of advances in speech recognition, text recognition, and natural language analysis. Although fully automatic indexing of audiovisual data is not realistic, these techniques can help improve the description of audiovisual data in semi-automatic systems.
