PicoAudio2: Temporal Controllable Text-to-Audio Generation with Natural Language Description Paper • 2509.00683 • Published Aug 31, 2025
UniFlow-Audio: Unified Flow Matching for Audio Generation from Omni-Modalities Paper • 2509.24391 • Published Sep 29, 2025
Bayesian Speech synthesizers Can Learn from Multiple Teachers Paper • 2510.24372 • Published Oct 28, 2025
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations Paper • 2510.25955 • Published Oct 29, 2025