Key Audio Event Space Based on Multi-scale Block Features for Acoustic Scene Classification and Bionutrition Analysis
This paper investigates the use of multi-scale block features based on the labeled key acoustic event (MSBF-KAE) for real-environment acoustic scene classification, or for bionutrition analysis through bioacoustic scene recognition. We argue that a key acoustic event helps characterize an acoustic scene, such as an "exhaust fan" event in a "cooking" scene. We aim to extract more discriminative information from overlapping acoustic events in order to detect real-environment acoustic scenes. We achieve this by fusing the shortest-distance visual word with each spectrum block component and re-aggregating these learned components into a new descriptor, which is extracted from the dominant texture patterns in the multi-scale space. We evaluate the model on six datasets; the proposed model outperforms current state-of-the-art models in the audio scene field.
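The block-to-visual-word step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes non-overlapping square blocks, one codebook per scale, Euclidean distance for the "shortest distance" assignment, and a normalized histogram as the aggregation. The function names (`extract_blocks`, `nearest_word`, `multiscale_descriptor`) and the random codebooks are hypothetical; in practice the codebooks would be learned from labeled key acoustic events.

```python
import numpy as np

def extract_blocks(spec, size):
    """Cut a (freq, time) spectrogram into non-overlapping size x size blocks.

    Returns an array of shape (n_blocks, size * size); edge rows and columns
    that do not fill a whole block are dropped.
    """
    n_f, n_t = spec.shape[0] // size, spec.shape[1] // size
    trimmed = spec[:n_f * size, :n_t * size]
    blocks = trimmed.reshape(n_f, size, n_t, size).swapaxes(1, 2)
    return blocks.reshape(n_f * n_t, size * size)

def nearest_word(blocks, codebook):
    """Index of the closest codeword (squared Euclidean distance) per block."""
    dists = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def multiscale_descriptor(spec, codebooks):
    """Concatenate per-scale visual-word histograms into one descriptor.

    codebooks maps a block size to an array of shape (n_words, size * size).
    This dict layout is an assumption made for the sketch.
    """
    parts = []
    for size, codebook in codebooks.items():
        words = nearest_word(extract_blocks(spec, size), codebook)
        hist = np.bincount(words, minlength=len(codebook)).astype(float)
        parts.append(hist / max(hist.sum(), 1.0))  # normalize each scale
    return np.concatenate(parts)

# Usage with synthetic data (random values stand in for a real spectrogram
# and for learned codebooks).
rng = np.random.default_rng(0)
spec = rng.random((16, 32))
codebooks = {4: rng.random((5, 16)), 8: rng.random((3, 64))}
descriptor = multiscale_descriptor(spec, codebooks)
```

Normalizing each scale separately keeps coarse scales, which yield fewer blocks, from being dominated by fine scales in the concatenated descriptor; this is a design choice of the sketch rather than something stated in the abstract.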