
Journal of Information Technology & Software Engineering
Open Access

ISSN: 2165-7866


Audio content analysis in the presence of overlapped classes: A non-exclusive segmentation approach to mitigate information losses


Global Summit and Expo on Multimedia & Applications

August 10-11, 2015, Birmingham, UK

Duraid Y Mohammed

Scientific Tracks Abstracts: J Inform Tech Soft Engg

Abstract:

Soundtracks of multimedia files are information-rich, and much content-related metadata can be extracted from them. There is a pressing demand for automated classification, identification and information mining of audio content. A segment of an audio soundtrack can be speech, music, event sounds or a combination of them. Many individual algorithms exist for the recognition and analysis of speech, music or event sounds, allowing embedded information to be retrieved in a semantic fashion. A systematic review shows that no universal system exists that is optimized to extract the maximum amount of information for further text mining and inference. Mainstream algorithms typically work with a single class of sound, e.g., speech, music or event sounds, and classification methods are predominantly exclusive (detecting one class at a time), losing much information when two or three classes overlap. A universal open architecture for audio content and scene analysis has been proposed by the authors. To mitigate information losses in overlapped content, non-exclusive segmentation approaches were adopted. This paper presents one possible implementation that deploys the universal open architecture as a paradigm, showing how it can integrate existing methods and workflows while maximising the extractable semantic information. In the current work, overlapped content is identified and segmented from carefully tailored feature spaces, and a family of decision trees is used to generate a content score. Results show that, compared with well-established audio content analysers, the developed system can identify, and thus extract information from, many more speech and music segments. The full paper will discuss the methods, detail the results and illustrate how the system works.
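
To illustrate the non-exclusive idea described above, the sketch below is a minimal, assumption-laden example rather than the authors' implementation: the feature set, the scikit-learn toolchain, the placeholder training data and the scoring function are all illustrative. It trains one binary decision tree per class so that an overlapped segment can receive several labels at once, and derives a per-class content score.

```python
# Minimal sketch of non-exclusive (multi-label) audio segment classification.
# Features, labels and model settings are illustrative assumptions only.
import numpy as np
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

CLASSES = ["speech", "music", "event"]

# X: one feature vector per segment (e.g. MFCC statistics, zero-crossing rate,
# spectral flux). Y: one binary indicator per class, so a segment overlapping
# speech and music is labelled [1, 1, 0] instead of being forced into one class.
X = np.random.rand(200, 12)                     # placeholder training features
Y = (np.random.rand(200, 3) > 0.5).astype(int)  # placeholder multi-labels

# A family of decision trees, one per class: unlike an exclusive classifier
# that picks a single winner, each tree can fire independently.
model = MultiOutputClassifier(DecisionTreeClassifier(max_depth=6)).fit(X, Y)

def content_scores(segment_features):
    """Return a per-class score in [0, 1] for one segment's feature vector."""
    probs = model.predict_proba(segment_features.reshape(1, -1))
    # predict_proba yields one (1, 2) array per class; keep P(class present).
    return {c: p[0, 1] for c, p in zip(CLASSES, probs)}

print(content_scores(np.random.rand(12)))  # e.g. {'speech': 0.8, 'music': 0.6, ...}
```

Because each class is scored independently, overlapped segments retain evidence for every class present rather than being collapsed to a single label, which is the information loss the non-exclusive approach is meant to avoid.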

Biography:

Duraid Y Mohammed is a second-year PhD student at the University of Salford, School of Computing, Science and Engineering, Acoustics Research Centre. He has published one conference paper at the 136th Audio Engineering Society (AES) Convention, and his second paper has been accepted and published in an IEEE journal.
