Minisymposium: Sound and Music in the Brain (March 3, 2026)

2026-02-20 | Event

Date: Tuesday, March 3, 2026
Time: 2:30–5:30 pm (JST)
Venue: Frontier Science Building (先端科学研究棟) Room 103, Kyoto University
Google Maps | Campus Map

Overview

How does the brain transform acoustic signals into rich perceptual experiences—recognizing everyday sounds, reconstructing auditory scenes, and deriving pleasure from music? Recent advances in deep neural networks and brain decoding are beginning to answer these questions by revealing the computational principles shared between artificial and biological auditory systems.

This minisymposium brings together three complementary perspectives at the intersection of auditory neuroscience and AI: the neural and perceptual representation of natural sounds explored through CNN-based modeling and ultra-high-field fMRI; the generative reconstruction of diverse auditory experiences—including subjective attentional states—from brain activity; and computational approaches to understanding how musical expectations give rise to pleasure. Together, these talks illustrate how integrating neural measurements with modern computational models can illuminate the mechanisms underlying our everyday auditory experience.

✍ Registration


Program

Speaker | Title
Bruno L. Giordano (CNRS & Aix Marseille Université, INT) | The Perceptual and Cerebral Representation of Natural Sounds
Jong-Yun Park (Science Tokyo) | Reconstructing Auditory Experience from Human Brain Activity
Vincent Cheung 張家銘 (Sony CSL) | Computational approaches towards understanding why people enjoy music

Speakers & Abstracts

Bruno L. Giordano (CNRS & Aix Marseille Université, INT)

Title: The Perceptual and Cerebral Representation of Natural Sounds

Abstract: Every day, we encounter diverse natural sounds (chirps, impulses) and effortlessly recognize the objects and events producing them (a bird call, a nail being hammered). Studying how we perceive these sounds provides a unique window into how the auditory system operates under real-world conditions, moving beyond the traditional focus on speech and music. Investigating these signals promises a broader understanding of auditory processing, encompassing both acoustic complexity and semantic structure. I will start this talk by describing my early behavioral and fMRI work attempting to tease apart the acoustic and semantic components of natural sound representation. I will then focus on more recent collaborative efforts that combined behavioral measures, ultra-high-field (7T) fMRI, and convolutional neural networks (CNNs) to reveal that CNN-based models closely mirror perceptual and fMRI representations. Here, mid-layer CNN representations best predicted brain and behavioral data, likely emphasizing the acoustic structure critical for identifying objects and actions (hyperacoustics). I will then describe ongoing work that aims to track the temporal unfolding of natural sound representations through magnetoencephalography. I will conclude the talk with an overview of a new collaborative effort that aims to extend our AI + computational modelling framework from isolated sounds to the complexity of real-life auditory scenes. By embracing the full richness of everyday listening, we seek a unified perspective on how the auditory system decodes our ever-changing acoustic environment, ultimately integrating neural, behavioral, and computational insights to illuminate the complexity of real-world hearing.
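
As a toy illustration of the layer-wise encoding analysis described in the abstract, the sketch below ridge-regresses each CNN layer's activations onto fMRI voxel responses and compares held-out prediction accuracy across layers. All names, shapes, and data here are hypothetical stand-ins, not the speaker's actual pipeline.

import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_sounds, n_voxels = 200, 500

# Hypothetical stand-ins for CNN layer activations and voxel responses;
# in a real analysis these would come from the network and the 7T scans.
layer_feats = {f"layer{i}": rng.standard_normal((n_sounds, 64 * (i + 1)))
               for i in range(5)}
voxels = rng.standard_normal((n_sounds, n_voxels))

scores = {}
for name, X in layer_feats.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, voxels, random_state=0)
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
    y_hat = model.predict(X_te)
    # Encoding accuracy: mean voxel-wise correlation on held-out sounds.
    r = [np.corrcoef(y_te[:, v], y_hat[:, v])[0, 1] for v in range(n_voxels)]
    scores[name] = float(np.mean(r))

print(scores)  # with real data, a mid layer would typically score highest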

Jong-Yun Park (Science Tokyo)

Title: Reconstructing Auditory Experience from Human Brain Activity

Abstract: Recent advances in brain decoding have demonstrated that complex perceptual experiences can be externally reconstructed from human brain activity. In this talk, I will present a series of studies aimed at reconstructing diverse auditory experiences from fMRI by decoding hierarchical auditory representations that closely parallel those learned by deep neural networks. By integrating these decoded features with an audio-generative model, we show that arbitrary real-world sounds can be reconstructed while preserving key spectral structures and short-timescale temporal information. Behavioral and objective evaluations further indicate that the reconstructed sounds not only resemble the original stimuli but also reflect subjective perceptual states such as selective attention. In addition, I will introduce cross-modal experiments demonstrating that visual stimuli can evoke high-level auditory representations, suggesting top-down modulation from visual to auditory cortex. I will conclude by discussing future directions, including applications to auditory imagery and hallucination, as well as the implications of cross-modal interactions for understanding multisensory auditory experiences.
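
To make the two-stage decode-then-generate idea concrete, here is a minimal sketch: a linear decoder maps fMRI patterns to DNN audio features, and the decoded features are then handed to an audio generator. The generator below is a deliberately trivial placeholder, not the actual generative model used in these studies.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_trials, n_voxels, n_feats = 300, 1000, 128

fmri = rng.standard_normal((n_trials, n_voxels))      # measured brain activity
dnn_feats = rng.standard_normal((n_trials, n_feats))  # target DNN audio features

# Stage 1: learn a linear map from voxel patterns to hierarchical DNN features.
decoder = Ridge(alpha=10.0).fit(fmri[:250], dnn_feats[:250])
decoded = decoder.predict(fmri[250:])  # features decoded from held-out trials

# Stage 2: condition an audio generator on the decoded features.
def generate_audio(features, sr=16000):
    # Placeholder: a real system would drive a neural audio-generative model;
    # here the first feature dimension just modulates a sine tone's pitch.
    t = np.linspace(0.0, 1.0, sr, endpoint=False)
    return np.sin(2 * np.pi * (220.0 + 50.0 * features[0]) * t)

waveform = generate_audio(decoded[0])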

Vincent Cheung 張家銘 (Sony CSL)

Title: Computational approaches towards understanding why people enjoy music

Abstract: Music is one of life’s greatest pleasures. One way music is thought to elicit pleasure is by confirming or deviating from listeners’ expectations. In the first part of my talk, I will show how we can apply computational models to quantify musical expectations and investigate the neural and cognitive mechanisms underlying our enjoyment of music. In the second part, I will introduce more general computational methods we have developed for multimodal alignment and multivariate decoding to bridge the epistemic gap between cognition and neural activity.
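
As a toy example of quantifying musical expectation in this spirit, the sketch below estimates chord-transition probabilities from a tiny invented corpus and scores progressions by their surprisal (-log2 probability); expectation-based accounts of musical pleasure relate such surprisal values to listeners' ratings and neural responses.

from collections import Counter, defaultdict
import math

# A tiny invented corpus of chord progressions (Roman-numeral notation).
corpus = [["I", "IV", "V", "I"], ["I", "vi", "IV", "V"], ["I", "IV", "I", "V"]]
vocab = {chord for prog in corpus for chord in prog}

counts = defaultdict(Counter)
for prog in corpus:
    for prev, nxt in zip(prog, prog[1:]):
        counts[prev][nxt] += 1

def surprisal(prev, nxt):
    # -log2 P(next chord | previous chord), with add-one smoothing.
    total = sum(counts[prev].values()) + len(vocab)
    return -math.log2((counts[prev][nxt] + 1) / total)

# An expected cadence (V -> I) is less surprising than a deviation (V -> vi).
print(surprisal("V", "I"), surprisal("V", "vi"))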