The Visual Task Adaptation Benchmark

Title: The Visual Task Adaptation Benchmark

Speaker: Xiaohua Zhai (翟晓华; Ph.D. 2014, Wangxuan Institute of Computer Technology; advisors: Jianguo Xiao and Yuxin Peng), Google Research, Brain Team, Zurich

Time: Tuesday, November 5, 2019, 17:30-18:30

Venue: Room 106, Wangxuan Institute of Computer Technology Building, Peking University


Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified yardstick to evaluate general visual representations hinders progress. Many sub-fields promise representations, but each has different evaluation protocols that are either too constrained (linear classification), limited in scope (ImageNet, CIFAR, Pascal-VOC), or only loosely related to representation quality (generation). We present the Visual Task Adaptation Benchmark (VTAB): a diverse, realistic, and challenging benchmark to evaluate representations. VTAB embodies one principle: good representations adapt to unseen tasks with few examples. We run a large VTAB study of popular algorithms, answering questions like: How effective are ImageNet representations on non-standard datasets? Are generative models competitive? Is self-supervision useful if one already has labels?


Xiaohua Zhai received the Ph.D. degree in Computer Science from Peking University, Beijing, China, in July 2014. He is currently a senior research engineer at Google Research, Brain Team, Zurich.

He is the co-founder of "The Visual Task Adaptation Benchmark" (VTAB) project, a diverse, realistic, and challenging benchmark for evaluating representations learned by, e.g., generative models, self-supervised learning, semi-supervised learning, and supervised learning. He leads the self-supervised learning research. The project's insights led to self-supervised semi-supervised learning (S4L) on ImageNet (SOTA on the paperswithcode leaderboard) and semi-supervised GANs on ImageNet (which outperformed the current SOTA BigGAN while using 5x less labeled data). He is a core contributor to the "compare GANs" project, a framework for training and evaluating GANs, which has reached 1.4K stars and 269 forks on GitHub. He also built a machine learning system for reconciling a large-scale knowledge graph (billions of entities).

He has authored papers in refereed international journals and conference proceedings, including TCSVT, ICML, ICCV, CVPR, AAAI, and ACM-MM. As the second-ranked contributor of the PKU-ICST team, led by Prof. Yuxin Peng, he participated in the NIST TRECVID international video retrieval evaluation in 2012 and won first place. He is a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE Transactions on Image Processing (TIP), CVPR, AAAI, and ACM-MM.

