SLP-L11.2
RECOVERING PERFORMANCE IN SPEECH EMOTION RECOGNITION FROM DISCRETE TOKENS VIA MULTI-LAYER FUSION AND PARALINGUISTIC FEATURE INTEGRATION
Esther Sun, Abinay Reddy Naini, Carlos Busso, Carnegie Mellon University, United States of America
Session: SLP-L11: Speech Emotion Recognition and Language Models (Oral)
Track: Speech and Language Processing [SL]
Location: Room 114
Presentation Time: Thu, 7 May, 09:20 - 09:40
Session Co-Chairs: Emily Mower Provost (University of Michigan) and Kyu J. Han (Oracle)
Session SLP-L11
SLP-L11.1: EMO-TTA: IMPROVING TEST-TIME ADAPTATION OF AUDIO-LANGUAGE MODELS FOR SPEECH EMOTION RECOGNITION
Jiacheng Shi, Hongfei Du, College of William & Mary, United States of America; Y. Alicia Hong, George Mason University, United States of America; Ye Gao, College of William & Mary, United States of America
SLP-L11.2: RECOVERING PERFORMANCE IN SPEECH EMOTION RECOGNITION FROM DISCRETE TOKENS VIA MULTI-LAYER FUSION AND PARALINGUISTIC FEATURE INTEGRATION
Esther Sun, Abinay Reddy Naini, Carlos Busso, Carnegie Mellon University, United States of America
SLP-L11.3: B-GRPO: UNSUPERVISED SPEECH EMOTION RECOGNITION BASED ON BATCHED-GROUP RELATIVE POLICY OPTIMIZATION
Yingying Gao, Shilei Zhang, Runyan Yang, Zihao Cui, Junlan Feng, Jiutian Artificial Intelligence Research Institute, China
SLP-L11.4: CONTRASTIVE DISTILLATION OF EMOTION KNOWLEDGE FROM LLMS FOR ZERO-SHOT EMOTION RECOGNITION
Minxue Niu, Emily Mower Provost, University of Michigan, United States of America
SLP-L11.5: MI-Fuse: Label Fusion for Unsupervised Domain Adaptation with Closed-Source Large Audio-Language Model
Hsiao-Ying Huang, Yi-Cheng Lin, Hung-yi Lee, National Taiwan University, Taiwan
SLP-L11.6: LEVERAGING LARGE SPEECH LANGUAGE MODELS AS EVALUATORS FOR EXPRESSIVE SPEECH
Bismarck Odoom, Philipp Koehn, Johns Hopkins University, United States of America