AASP-P16.1
WHEN NOISE LOWERS THE LOSS: RETHINKING LIKELIHOOD-BASED EVALUATION IN MUSIC LARGE LANGUAGE MODELS
Xiaosha Li, Georgia Institute of Technology, United States of America; Chun Liu, ByteDance Inc., United States of America; Ziyu Wang, Courant Institute of Mathematical Sciences, New York University, United States of America, and Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE
Session: AASP-P16: Audio and Speech Quality and Intelligibility Measures III Poster
Track: Audio and Acoustic Signal Processing [AA]
Location: Poster Area 25
Presentation Time: Thu, 7 May, 09:00 - 11:00
Session AASP-P16
AASP-P16.1: WHEN NOISE LOWERS THE LOSS: RETHINKING LIKELIHOOD-BASED EVALUATION IN MUSIC LARGE LANGUAGE MODELS
Xiaosha Li, Georgia Institute of Technology, United States of America; Chun Liu, ByteDance Inc., United States of America; Ziyu Wang, Courant Institute of Mathematical Sciences, New York University, United States of America, and Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE
AASP-P16.2: MUSHRA–1S: A SCALABLE AND SENSITIVE TEST APPROACH FOR EVALUATING TOP-TIER SPEECH PROCESSING SYSTEMS
Laura Lechler, Ivana Balić, Cisco Systems, United Kingdom of Great Britain and Northern Ireland
AASP-P16.3: A GENERALIZATION STRATEGY FOR SPEECH QUALITY PREDICTION: FROM DOMAIN-SPECIFIC TO UNIFIED DATASETS
Imran Kibria, Ada Lamba, Donald Williamson, The Ohio State University, United States of America
AASP-P16.4: CAN HIERARCHICAL CROSS-MODAL FUSION PREDICT HUMAN PERCEPTION OF AI DUBBED CONTENT?
Ashwini Dasare, Nirmesh Shah, Ashishkumar Gudmalwar, Pankaj Wasnik, Sony Research India, India
AASP-P16.5: PADAM: Perceptual Audio Defect Assessment Model
Alex Mackin, Pratha Khandelwal, Veneta Haralampieva, Michael Lau, Benoit Vallade, David Higham, Josh Anderson, Amazon Prime Video, United Kingdom of Great Britain and Northern Ireland
AASP-P16.6: UNSEEN BUT NOT UNKNOWN: USING DATASET CONCEALMENT TO ROBUSTLY EVALUATE SPEECH QUALITY ESTIMATION MODELS
Jaden Pieper, Stephen Voran, Institute for Telecommunication Sciences, United States of America
AASP-P16.7: RHO-PERFECT: CORRELATION CEILING FOR SUBJECTIVE EVALUATION DATASETS
Fredrik Cumlin, KTH Royal Institute of Technology, Sweden
AASP-P16.8: ENHANCED GENERATIVE MACHINE LISTENER
Vishnu Raj, Gouthaman KV, Shiv Gehlot, Dolby Laboratories India Pvt Ltd, India; Lars Villemoes, Dolby Sweden AB, Sweden; Arijit Biswas, Dolby Germany GmbH, Germany
AASP-P16.9: MULTI-TASK LEARNING FOR SPEECH QUALITY ASSESSMENT USING ASR-DERIVED ENTROPY FEATURES
Tri Dung Do, Bao Thang Ta, Viettel AI, Viettel Group, Viet Nam; Van Hai Do, Thuyloi University, Viet Nam
AASP-P16.10: LEVERAGING MULTIPLE SPEECH ENHANCERS FOR NON-INTRUSIVE INTELLIGIBILITY PREDICTION FOR HEARING-IMPAIRED LISTENERS
Boxuan Cao, Linkai Li, Orka Labs Inc., China; Hanlin Yu, The University of British Columbia, Canada; Changgeng Mo, Haoshuai Zhou, Orka Labs Inc., China; Shan Xiang Wang, Stanford University, United States of America