Hi! My name is Zhenyu Huang (黄振宇). I’m a Ph.D. student at the College of Computer Science, Sichuan University, China, advised by Professor Xi Peng [website]. In 2020, I interned at Baidu NLP, mentored by Xinyan Xiao [website].
My research interests span Multimodal Learning, Noisy Label Learning, and Unsupervised Learning. Currently, I focus on designing robust multimodal learning models with strong empirical performance and real-world deployability.
Cross-modal matching, which aims to establish the correspondence between two different modalities, is fundamental to a variety of tasks such as cross-modal retrieval and vision-and-language understanding. Although a large number of cross-modal matching methods have been proposed and have achieved remarkable progress in recent years, almost all of these methods implicitly assume that the multimodal training data are correctly aligned. In practice, however, such an assumption is extremely expensive or even impossible to satisfy. Based on this observation, we reveal and study a latent and challenging direction in cross-modal matching, named noisy correspondence, which could be regarded as a new paradigm of noisy labels. Different from traditional noisy labels, which mainly refer to errors in category labels, our noisy correspondence refers to mismatched paired samples. To solve this new problem, we propose a novel method for learning with noisy correspondence, named Noisy Correspondence Rectifier (NCR). In brief, NCR divides the data into clean and noisy partitions based on the memorization effect of neural networks and then rectifies the correspondence via an adaptive prediction model in a co-teaching manner. To verify the effectiveness of our method, we conduct extensive experiments on Flickr30K, MS-COCO, and Conceptual Captions, using image-text matching as a showcase. The code is available at www.pengxi.me.
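The division step described above can be illustrated with the common small-loss criterion: fit a two-component Gaussian mixture over the per-sample losses and treat the low-mean component as the clean partition. Below is a minimal numpy sketch under that assumption; the helper name and the toy losses are illustrative, not NCR’s actual implementation.

```python
import numpy as np

def split_clean_noisy(losses, n_iters=50):
    """Fit a two-component 1-D Gaussian mixture to per-sample losses via EM
    and flag samples belonging to the low-mean (likely clean) component."""
    x = np.asarray(losses, dtype=float)
    # Initialization: means at the extremes, shared variance, equal weights.
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var() + 1e-6] * 2)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each sample.
        dens = pi / np.sqrt(2 * np.pi * var) * np.exp(
            -(x[:, None] - mu) ** 2 / (2 * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(x)
    clean_comp = int(np.argmin(mu))  # low-loss component = likely clean
    return resp[:, clean_comp] > 0.5  # boolean mask: True = clean

# Toy example: 80 well-matched pairs (low loss), 20 mismatched pairs (high loss).
losses = np.concatenate([np.full(80, 0.1), np.full(20, 2.0)])
mask = split_clean_noisy(losses)
```

In the actual method, the posterior probability of cleanliness would also feed the subsequent rectification step rather than only a hard split.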
NeurIPS Oral
Partially View-aligned Clustering
Zhenyu Huang, Peng Hu, Joey Tianyi Zhou, and
2 more authors
In Proceedings of the 34th Conference on Neural Information Processing Systems, NeurIPS’2020, Dec 2020
In this paper, we study one challenging issue in multi-view data clustering. To be specific, for two data matrices \(\mathbf{X}^{(1)}\) and \(\mathbf{X}^{(2)}\) corresponding to two views, we do not assume that \(\mathbf{X}^{(1)}\) and \(\mathbf{X}^{(2)}\) are fully aligned row-wise. Instead, we assume that only a small portion of the matrices has established correspondence in advance. Such a partially view-aligned problem (PVP) arises because capturing or establishing fully aligned multi-view data requires intensive labor, and it has rarely been touched so far to the best of our knowledge. To solve this practical and challenging problem, we propose a novel multi-view clustering method termed partially view-aligned clustering (PVC). To be specific, PVC proposes to use a differentiable surrogate of the non-differentiable Hungarian algorithm and recasts it as a pluggable module. As a result, the category-level correspondence of the unaligned data could be established in a latent space learned by a neural network, while a common space across different views is learned using the “aligned” data. Extensive experimental results show promising results of our method in clustering partially view-aligned data.
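A differentiable surrogate of the Hungarian algorithm can be illustrated with Sinkhorn-style iterative normalization, a standard relaxation that turns a similarity matrix into a near doubly-stochastic soft permutation. The sketch below is a minimal numpy illustration of the idea; the similarity matrix and temperature are toy values, not the paper’s exact module.

```python
import numpy as np

def sinkhorn(scores, n_iters=100, tau=0.1):
    """Soft, differentiable assignment: exponentiate the score matrix at
    temperature tau, then alternately normalize rows and columns so the
    result approaches a doubly-stochastic (soft permutation) matrix."""
    P = np.exp(scores / tau)
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)  # row normalization
        P /= P.sum(axis=0, keepdims=True)  # column normalization
    return P

# Toy cross-view similarity between 3 samples of view 1 and view 2;
# the underlying correspondence is the permutation 0->2, 1->0, 2->1.
S = np.array([[0.1, 0.2, 0.9],
              [0.8, 0.1, 0.3],
              [0.2, 0.9, 0.1]])
P = sinkhorn(S)
alignment = P.argmax(axis=1)  # recovered correspondence per view-1 sample
```

Because every operation in the loop is differentiable, such a module can be plugged into a network and trained end-to-end, unlike the discrete Hungarian algorithm.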
IJCAI
Multi-view Spectral Clustering Network
Zhenyu Huang, Joey Tianyi Zhou, Xi Peng, and
3 more authors
In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI’2019, Aug 2019
Multi-view clustering aims to cluster data from diverse sources or domains, and has drawn considerable attention in recent years. In this paper, we propose a novel multi-view clustering method named multi-view spectral clustering network (MvSCN), which, to the best of our knowledge, could be the first deep version of multi-view spectral clustering. To deeply cluster multi-view data, MvSCN incorporates the local invariance within every single view and the consistency across different views into a novel objective function, where the local invariance is defined by a deep metric learning network rather than the Euclidean distance adopted by traditional approaches. In addition, we enforce and reformulate an orthogonal constraint as a novel layer stacked on an embedding network, which brings two advantages: jointly optimizing the neural network while performing matrix decomposition, and avoiding trivial solutions. Extensive experiments on four challenging datasets demonstrate the effectiveness of our method compared with 10 state-of-the-art approaches in terms of three evaluation metrics.
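An orthogonal constraint recast as a layer can be illustrated with Cholesky-based orthogonalization: given embeddings H with Gram matrix H^T H = L L^T, the output Y = H L^{-T} satisfies Y^T Y = I, ruling out the trivial solution where all embeddings collapse to a point. This is a minimal numpy sketch of that construction under assumed details, not the exact MvSCN layer.

```python
import numpy as np

def orthogonal_layer(H):
    """Map embeddings H (n x k) to Y with Y^T Y = I via Cholesky-based
    orthogonalization: factor H^T H = L L^T, then set Y = H L^{-T}.
    Every step is differentiable, so it can sit on top of a network."""
    gram = H.T @ H                       # k x k Gram matrix
    L = np.linalg.cholesky(gram)         # lower-triangular factor
    Y = H @ np.linalg.inv(L).T           # whitened, orthonormal output
    return Y

# Toy embeddings from some upstream network.
H = np.random.default_rng(0).normal(size=(100, 5))
Y = orthogonal_layer(H)
```

The identity follows directly: Y^T Y = L^{-1} (H^T H) L^{-T} = L^{-1} L L^T L^{-T} = I, so the constraint holds by construction rather than by a penalty term.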