Sound is a fundamental carrier of information about both physical events and human activities. Beyond speech, the auditory domain comprises rich mixtures of environmental, musical, and spatial cues that allow humans and machines to perceive and interpret their surroundings and to act within them. Audio therefore plays an essential role in perceptual intelligence, where the goal is not only to process signals but also to learn and infer internal representations that support reasoning and interaction.
Recent advances in computational intelligence and machine/deep learning have significantly improved the ability of computational methods to extract and manipulate meaningful information from audio signals. Methods for denoising, dereverberation, and source separation are increasingly coupled with representation learning, enabling models to capture both the semantic and the spatial aspects of sound. At the same time, data-driven approaches to spatial hearing, cross-modal learning and correspondence, and generative modeling are reshaping how auditory scenes are represented and synthesized. These developments bridge the traditional boundary between low-level signal enhancement and high-level understanding, pointing towards a unified perspective on computational audio perception.
This special session aims to bring together researchers working across these complementary domains. The goal is to explore how computational intelligence can support robust and adaptive computational audio processing systems that generalize across tasks and modalities.
The topics of the special session include (but are not limited to):
You can contact us with any questions about the special session by