Biography

Zhou Yu is currently with the School of Computer Science and Technology at Hangzhou Dianzi University (HDU). He received his Bachelor's degree in Digital Media and his Ph.D. degree in Computer Science from Zhejiang University (Hangzhou, China) in 2010 and 2015, respectively. His Ph.D. advisors were Prof. Yueting Zhuang and Prof. Fei Wu. Before joining HDU, he was a senior algorithm engineer at Alibaba Inc.

He mainly applies machine learning and deep learning techniques to bridge vision and language. His research interests include multimodal learning, visual question answering, visual grounding, visual captioning, cross-media retrieval, and high-dimensional hashing and indexing. His research results have been published in 30+ papers at prestigious conferences and journals (e.g., CVPR, ICCV, SIGIR, ACM Multimedia, IEEE TMM, TNNLS) and have received 1800+ citations. He has also served as a reviewer for a number of journals and conferences, including IEEE Trans. on Image Processing (TIP), IEEE Trans. on Multimedia (TMM), IEEE Trans. on Circuits and Systems for Video Technology (TCSVT), Information Sciences, Signal Processing, Neurocomputing, CVPR, AAAI, IJCAI, and ACM MM.

Selected Publications

Yuhao Cui, Zhou Yu, Chunqi Wang, Zhongzhou Zhao, Ji Zhang, Meng Wang, Jun Yu, "ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge", ACM International Conference on Multimedia (ACM MM), Chengdu, China, 2021.
The first VLP approach that incorporates cross- and intra-modal knowledge simultaneously.
Paper Project

Zhou Yu, Yuhao Cui, Jun Yu, Dacheng Tao, Qi Tian, "Deep Multimodal Neural Architecture Search", ACM International Conference on Multimedia (ACM MM), Virtual, 2020.
The first deep NAS approach for universal multimodal learning tasks.
Paper Project

Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian, "Deep Modular Co-Attention Networks for Visual Question Answering", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, 2019.
The winning solution (1st place) in the VQA Challenge 2019.
Paper Project Slides

Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, Dacheng Tao, "Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering", IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 29(12): 5947-5959, 2018.
The runner-up solution (2nd place) in both the VQA Challenge 2017 and the VQA Challenge 2018.
Paper Project Slides 2017 Slides 2018

Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, Dacheng Tao, "Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding", International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden, 2018.
A simple yet strong baseline for visual grounding.
Paper Project

Awards

ACM Hangzhou Rising Star Award

VQA Challenge 2019 Winner Award

VQA Challenge 2018 Runner-up Award

VQA Challenge 2017 Runner-up Award