Yoshua Bengio, University of Montreal
Deep learning arose around 2006 as a renewal of neural network research that allowed such models to have more layers. Theoretical investigations have shown that functions obtained as deep compositions of simpler functions (which include both deep and recurrent nets) can express highly varying functions (with many ups and downs and many distinguishable input regions) much more efficiently (with fewer parameters) than would otherwise be possible. Empirical work in a variety of applications has demonstrated that, when well trained, such deep architectures can be highly successful, surpassing the previous state of the art in many areas, including speech recognition, object recognition, language modeling, and transfer learning. This talk will summarize the advances that have made these breakthroughs possible, and end with questions about some of the major challenges still ahead of researchers as we continue our climb towards AI-level competence. Deep learning is bringing neural networks out of their traditional realm of pattern recognition and into higher-level cognitive functions, including reasoning, attention, understanding and generating natural language, planning, and reinforcement learning, with the ultimate goal of building models that understand the world by discovering its underlying explanatory factors.