Dongchao Yang
Verified email at se.cuhk.edu.hk - Homepage
Title
Cited by
Year
Diffsound: Discrete diffusion model for text-to-sound generation
D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu
IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2023
Cited by: 165; Year: 2023
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models
R Huang*, J Huang*, D Yang*, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, ...
ICML 2023, 2023
Cited by: 134; Year: 2023
AudioGPT: Understanding and generating speech, music, sound, and talking head
R Huang*, M Li*, D Yang*, J Shi*, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ...
AAAI, 2024, 2023
Cited by: 96; Year: 2023
InstructTTS: Modelling expressive TTS in discrete latent space with natural language style prompt
D Yang*, S Liu*, R Huang, C Weng, H Meng
IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2024
Cited by: 39; Year: 2024
Towards data distillation for end-to-end spoken conversational question answering
C You, N Chen, F Liu, D Yang, Y Zou
arXiv preprint arXiv:2010.08923, 2021
Cited by: 34; Year: 2021
A Mutual learning framework for Few-shot Sound Event Detection
D Yang, H Wang, Y Zou, Z Ye, W Wang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 30*; Year: 2022
Hifi-codec: Group-residual vector quantization for high fidelity audio codec
D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou
arXiv preprint arXiv:2305.02765, 2023
Cited by: 29; Year: 2023
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Z Ye, H Wang, D Yang, Y Zou
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Cited by: 28; Year: 2021
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
D Yang*, J Tian*, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ...
ICML 2024, 2023
Cited by: 22; Year: 2023
Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss
Y Xin, D Yang, Y Zou
ICASSP2023, 2023
Cited by: 17; Year: 2023
Make-a-voice: Unified voice synthesis with discrete representation
R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ...
ACL 2024, 2023
Cited by: 14; Year: 2023
Norespeech: Knowledge distillation based conditional diffusion model for noise-robust expressive tts
D Yang, S Liu, J Yu, H Wang, C Weng, Y Zou
Interspeech2023, 2022
Cited by: 14; Year: 2022
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification
Y Xin, D Yang, Y Zou
Proc. Interspeech 2022, 1546-1550, 2022
Cited by: 13; Year: 2022
Make-an-audio 2: Temporal-enhanced text-to-audio generation
J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ...
arXiv preprint arXiv:2305.18474, 2023
Cited by: 11; Year: 2023
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Z Zhao, D Yang, R Gu, H Zhang, Y Zou
Interspeech2022, 2022
Cited by: 10; Year: 2022
Detect what you want: Target sound detection
D Yang*, H Wang*, Y Zou, F Cui, Y Wang
Workshop on Detection and Classification of Acoustic Scenes and Events …, 2022
Cited by: 8; Year: 2022
Prompttts 2: Describing and generating voices with text prompt
Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ...
ICLR 2024, 2023
Cited by: 7; Year: 2023
Unsupervised multi-target domain adaptation for acoustic scene classification
D Yang, H Wang, Y Zou
Interspeech2021, 2021
Cited by: 6; Year: 2021
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models
Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ...
ICML 2024, 2024
Cited by: 5; Year: 2024
Improving Weakly Supervised Sound Event Detection with Causal Intervention
Y Xin, D Yang, F Cui, Y Wang, Y Zou
ICASSP2023, 2023
Cited by: 5; Year: 2023