Haiyang Xu
Alibaba Group, DIDI AI LABS, SEU
Verified email at seu.edu.cn
Title
Cited by
Year
mPLUG-Owl: Modularization empowers large language models with multimodality
Q Ye, H Xu, G Xu, J Ye, M Yan, Y Zhou, J Wang, A Hu, P Shi, Y Shi, C Li, ...
arXiv preprint arXiv:2304.14178, 2023
Cited by: 147; Year: 2023
Learning alignment for multimodal emotion recognition from speech
H Xu, H Zhang, K Han, Y Wang, Y Peng, X Li
InterSpeech 2019, 2019
Cited by: 136; Year: 2019
E2E-VLP: End-to-End Vision-Language Pre-training Enhanced by Visual Learning
H Xu, M Yan, C Li, B Bi, S Huang, W Xiao, F Huang
ACL 2021, Oral, 2021
Cited by: 86; Year: 2021
Neural Topic Modeling with Bidirectional Adversarial Training
R Wang, X Hu, D Zhou, Y He, Y Xiong, C Ye, H Xu
ACL 2020, 2020
Cited by: 67; Year: 2020
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
C Li, H Xu, J Tian, W Wang, M Yan, ...
EMNLP 2022, 2022
Cited by: 58*; Year: 2022
mPLUG-2: A modularized multi-modal foundation model across text, image and video
H Xu, Q Ye, M Yan, Y Shi, J Ye, Y Xu, C Li
ICML2023 3, 2023
Cited by: 29*; Year: 2023
An unsupervised Bayesian modelling approach for storyline detection on news articles
D Zhou, H Xu, Y He
EMNLP 2015, 1943-1948, 2015
Cited by: 28; Year: 2015
Unsupervised Storyline Extraction from News Articles.
D Zhou, H Xu, XY Dai, Y He
IJCAI 2016, 3014-3021, 2016
Cited by: 24; Year: 2016
HiTeA: Hierarchical temporal-aware video-language pre-training
Q Ye, G Xu, M Yan, H Xu, Q Qian, J Zhang, F Huang
ICCV 2023, 2022
Cited by: 23; Year: 2022
SemVLP: Vision-language pre-training by aligning semantics at multiple levels
C Li, M Yan, H Xu, F Luo, W Wang
arXiv preprint arXiv:2103.07829 3, 2021
Cited by: 22; Year: 2021
mPLUG-2: A modularized multi-modal foundation model across text, image and video
H Xu, Q Ye, M Yan, Y Shi, J Ye, Y Xu, C Li
International Conference on Machine Learning, ICML, 23-29, 2023
Cited by: 16; Year: 2023
mPLUG: Effective and efficient vision-language learning by cross-modal skip-connections
C Li, H Xu, J Tian, W Wang, M Yan
arXiv preprint arXiv:2205.12005, 2022
Cited by: 12; Year: 2022
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching
Y Shi, X Yang, H Xu, C Yuan, B Li, W Hu, ZJ Zha
CVPR 2022, 2021
Cited by: 12; Year: 2021
DELTA: A deep learning based language technology platform
K Han, J Chen, H Zhang, H Xu, Y Peng, Y Wang, N Ding, H Deng, Y Gao, ...
arXiv preprint arXiv:1908.01853, 2019
Cited by: 11; Year: 2019
mPLUG-2: A modularized multi-modal foundation model across text, image and video
H Xu, Q Ye, M Yan, Y Shi, J Ye, Y Xu, C Li
International Conference on Machine Learning, ICML, 23-29, 2023
Cited by: 9; Year: 2023
mPLUG: Effective and efficient vision-language learning by cross-modal skip-connections
C Li, H Xu, J Tian, W Wang, M Yan, B Bi, ...
Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022
Cited by: 9*; Year: 2022
Evaluation and analysis of hallucination in large vision-language models
J Wang, Y Zhou, G Xu, P Shi, C Zhao, H Xu, Q Ye, M Yan, J Zhang, J Zhu, ...
arXiv preprint arXiv:2308.15126, 2023
Cited by: 8; Year: 2023
Selective Attention Encoders by Syntactic Graph Convolutional Networks for Document Summarization
H Xu, Y Wang, K Han, B Ma, J Chen, X Li
ICASSP 2020, 2020
Cited by: 8; Year: 2020
mPLUG-DocOwl: Modularized multimodal large language model for document understanding
J Ye, A Hu, H Xu, Q Ye, M Yan, Y Dan, C Zhao, G Xu, C Li, J Tian, Q Qi, ...
arXiv preprint arXiv:2307.02499, 2023
Cited by: 7; Year: 2023
Learning Video-Text Aligned Representations for Video Captioning
Y Shi, H Xu, C Yuan, B Li, W Hu, ZJ Zha
ACM Transactions on Multimedia Computing, Communications and Applications 19 …, 2023
Cited by: 6; Year: 2023