Dongchao Yang

Cited by

	All	Since 2019
Citations	951	950
h-index	14	14
i10-index	17	17

Citations per year: 2021: 16; 2022: 51; 2023: 320; 2024: 551

Public access

7 articles available, 0 articles not available (based on funding mandates)

Co-authors

Yuexian Zou, Peking University Shenzhen Graduate School. Verified email at pku.edu.cn
Rongjie Huang, Facebook AI Research (FAIR), Zhejiang University. Verified email at meta.com
Helin Wang, Johns Hopkins University. Verified email at jh.edu
Xu Tan, Principal Researcher and Research Manager, Microsoft. Verified email at microsoft.com
Jinchuan Tian, Language Technologies Institute, Carnegie Mellon University. Verified email at andrew.cmu.edu
Nuo Chen, Hong Kong University of Science and Technology. Verified email at connect.ust.hk
Songxiang Liu, PhD from CUHK

Dongchao Yang

The Chinese University of Hong Kong

Verified email at se.cuhk.edu.hk

Research interests: TTS, Multi-modal, Audio Foundation Models


Title	Cited by	Year
Diffsound: Discrete diffusion model for text-to-sound generation. D Yang, J Yu, H Wang, W Wang, C Weng, Y Zou, D Yu. IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2023	208	2023
Make-an-audio: Text-to-audio generation with prompt-enhanced diffusion models. R Huang, J Huang, D Yang*, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, ... ICML 2023, 2023	169	2023
AudioGPT: Understanding and generating speech, music, sound, and talking head. R Huang, M Li, D Yang, J Shi, X Chang, Z Ye, Y Wu, Z Hong, J Huang, ... AAAI 2024, 2023	118	2023
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ... ICML 2024, 2023	52	2023
InstructTTS: Modelling expressive TTS in discrete latent space with natural language style prompt. D Yang, S Liu, R Huang, C Weng, H Meng. IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2024	51	2024
Hifi-codec: Group-residual vector quantization for high fidelity audio codec. D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou. arXiv preprint arXiv:2305.02765, 2023	49	2023
Towards data distillation for end-to-end spoken conversational question answering. C You, N Chen, F Liu, D Yang, Y Zou. arXiv preprint arXiv:2010.08923, 2021	37	2021
A Mutual learning framework for Few-shot Sound Event Detection. D Yang, H Wang, Y Zou, Z Ye, W Wang. ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	34*	2022
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information. Z Ye, H Wang, D Yang, Y Zou. Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021	31	2021
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models. Z Ju, Y Wang, K Shen, X Tan, D Xin, D Yang, Y Liu, Y Leng, K Song, ... ICML 2024, 2024	27	2024
Make-an-audio 2: Temporal-enhanced text-to-audio generation. J Huang, Y Ren, R Huang, D Yang, Z Ye, C Zhang, J Liu, X Yin, Z Ma, ... arXiv preprint arXiv:2305.18474, 2023	22	2023
Make-a-voice: Unified voice synthesis with discrete representation. R Huang, C Zhang, Y Wang, D Yang, L Liu, Z Ye, Z Jiang, C Weng, ... ACL 2024, 2023	21	2023
Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss. Y Xin, D Yang, Y Zou. ICASSP 2023, 2023	21	2023
Prompttts 2: Describing and generating voices with text prompt. Y Leng, Z Guo, K Shen, X Tan, Z Ju, Y Liu, Y Liu, D Yang, L Zhang, ... ICLR 2024, 2023	16	2023
Norespeech: Knowledge distillation based conditional diffusion model for noise-robust expressive tts. D Yang, S Liu, J Yu, H Wang, C Weng, Y Zou. Interspeech 2023, 2022	14	2022
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification. Y Xin, D Yang, Y Zou. Proc. Interspeech 2022, 1546-1550, 2022	13	2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches. Z Zhao, D Yang, R Gu, H Zhang, Y Zou. Interspeech 2022, 2022	12	2022
Detect what you want: Target sound detection. D Yang, H Wang, Y Zou, F Cui, Y Wang. Workshop on Detection and Classification of Acoustic Scenes and Events …, 2022	8	2022
Unsupervised multi-target domain adaptation for acoustic scene classification. D Yang, H Wang, Y Zou. Interspeech 2021, 2021	7	2021
Improving Weakly Supervised Sound Event Detection with Causal Intervention. Y Xin, D Yang, F Cui, Y Wang, Y Zou. ICASSP 2023, 2023	6	2023
