I am currently a research scientist at Google working on computer vision. I have recently finished my Ph.D. in Computer Science at UC Berkeley, advised by Prof. Trevor Darrell.
During my graduate studies I worked or interned at the National University of Singapore, Microsoft Research Asia, NEC Labs America, and Google Research. I obtained my bachelor's and master's degrees at Tsinghua University, China.
My current research topics include:
(Most recent publications to be added)
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
J Donahue, Y Jia, O Vinyals, J Hoffman, N Zhang, E Tzeng, T Darrell. arXiv preprint.
[ArXiv Link] [Live Demo] [Software] [Pretrained ImageNet Model]
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. We also released the software and pre-trained network to do large-scale image classification.
Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
Y Jia, J Abbott, J Austerweil, T Griffiths, T Darrell. NIPS 2013. [PDF coming soon]
It is marvelous that humans can learn a concept from a small number of examples, a challenge that many existing machine vision systems fail to meet. We present a system combining computer vision and cognitive science to model such human behavior, as well as a new dataset for future experimentation on human concept learning.
Latent Task Adaptation with Large-scale Hierarchies
Y Jia, T Darrell. ICCV 2013. [PDF coming soon]
How do we adapt our ImageNet classifiers to accurately classify just giraffes and bears on a zoo trip? We proposed a novel framework that benefits from big training data and adaptively adjusts itself for subcategory test scenarios.
Category Independent Object-level Saliency Detection
Y Jia, M Han. ICCV 2013. [PDF coming soon]
We proposed a simple yet efficient approach to combine high-level object models and low-level appearance information to perform saliency detection that identifies foreground objects.
We showed the suboptimality of spatial pyramids in feature pooling, and proposed an efficient way to learn task-dependent receptive fields for better pooled features.
Factorized Multi-modal Topic Model
S Virtanen, Y Jia, A Klami, T Darrell. UAI 2012. [PDF]
We factorized the information contained in corresponding image and text with a novel HDP-based topic model that automatically learns both shared and private topics.
We presented a dataset of color and depth image pairs collected from the Kinect sensor, gathered in real domestic and office environments, for research on object-level recognition with multimodal sensor input.
Decaf is a general Python framework for deep convolutional neural networks. It relies on a set of scientific computation modules (such as NumPy/SciPy) to run CNN models efficiently without the need for a GPU. Decaf is still under development, but an ImageNet classification demo can be checked out here.
CS188 Artificial Intelligence, spring 2012.
Undergraduate AI course: search, CSP, games, MDP, Reinforcement Learning, Bayes' Nets, HMM, DBN, probabilistic inference, and a fun PacMan challenge.
Won the campus Outstanding GSI Award.
CS281a/Stat241a Statistical Learning Theory, fall 2011.
Graduate level course: graphical models, probabilistic inference, parameter estimation, regression, exponential family, EM and HMM, factor analysis, Junction Tree Algorithm, Monte Carlo, Variational Inference, etc.