The Great Conundrum of Hyperparameter Optimization

Regularization means tuning or selecting the preferred level of model complexity so that deep learning models generalize better and make more accurate predictions on unseen data.

Regularization techniques are applied to machine learning models to make the decision boundary or fitted model smoother, which helps to prevent overfitting. Examples include L1 and L2 penalties, dropout, and weight decay in neural networks.
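To make the idea of a penalty term concrete, here is a minimal sketch in plain Python. It is an illustration only, not code from the talk: the function names (l1_penalty, l2_penalty, fit_line), the toy data, and the choice of a one-parameter linear model are all assumptions made for this example. It fits y ~ w * x by gradient descent, once without and once with an L2 penalty, to show that the penalty pulls the weight toward zero.

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def fit_line(xs, ys, lam=0.0, lr=0.01, steps=2000):
    """Fit y ~ w * x by gradient descent on mean squared error + lam * w^2."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Gradient of the L2 penalty term lam * w^2.
        grad += 2 * lam * w
        w -= lr * grad
    return w


# Hypothetical toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit_line(xs, ys, lam=0.0)  # no regularization
w_reg = fit_line(xs, ys, lam=1.0)    # L2-regularized, shrunk toward zero
```

With the penalty switched on, the fitted weight is smaller in magnitude than the unregularized one. How strong to make lam is itself a hyperparameter, which is where the tuning problem discussed in this post begins.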

With every deep learning algorithm comes a set of hyperparameters, and optimizing them is crucial in achieving faster convergence and lower error rates. The majority of people working in deep learning use common heuristic methods to tune hyperparameters, such as learning rates, decay rates and L2 regularization. 
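For readers who have not tuned hyperparameters before, the sketch below shows one common baseline, random search over a learning rate and an L2 coefficient. It is not the method presented at the summit. The objective toy_validation_loss is a hypothetical stand-in for "train a model and return its validation loss", and the search ranges and function names are assumptions chosen for illustration.

```python
import math
import random


def toy_validation_loss(learning_rate, l2):
    # Hypothetical stand-in for training a model and measuring validation loss.
    # By construction its minimum is at learning_rate = 1e-2 and l2 = 1e-4.
    return (math.log10(learning_rate) + 2.0) ** 2 + (math.log10(l2) + 4.0) ** 2


def random_search(objective, n_trials=50, seed=0):
    """Sample hyperparameters log-uniformly and keep the best trial."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    trials = []
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-5, 0),  # 1e-5 to 1
            "l2": 10 ** rng.uniform(-6, -1),            # 1e-6 to 1e-1
        }
        trials.append((objective(**params), params))
    best_loss, best_params = min(trials, key=lambda t: t[0])
    return best_loss, best_params, trials


best_loss, best_params, trials = random_search(toy_validation_loss)
```

Each trial here is cheap, but with a real network every call to the objective is a full training run, which is why hand-tuning and simple search become costly as models and the number of hyperparameters grow.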

Recently, researchers have tried to cast hyperparameter optimization as a deep learning problem, but these approaches are often limited by a lack of scalability. At the Deep Learning Summit in Singapore, speaker Ritchie Ng, Deep Learning Research Assistant at the National University of Singapore (NUS), will show how scalable hyperparameter optimization is now possible, how it accelerates convergence, and how it can be trained on one problem while still enjoying the benefits of transfer learning. This has an impact at the industrial level, where deep learning algorithms can be accelerated to convergence without manual hand-tuning, even for large models.

I caught up with Ritchie ahead of the summit on 27-28 April, to learn more about advancing hyperparameter optimization, natural language processing (NLP), and challenges in the deep learning space.

What started your work in deep learning?
I started off with standard machine learning algorithms like Support Vector Machines (SVMs) for computer vision tasks. Subsequently, I observed remarkable performance using Deep Neural Networks (DNNs). Later, I discovered Deep Convolutional Neural Networks (CNNs), which gave superhuman performance on some computer vision tasks. This marked the start of my deep learning journey. I am also a fan of the Turing machine, and coincidentally there were advances in a Turing-complete class of deep learning models called Recurrent Neural Networks (RNNs). This got me excited and I forayed into RNNs, which are now a main focus of my research. As I ventured even deeper, I was plagued by the curse of a growing number of hyperparameters to tune, and I was mainly tuning them by hand with established heuristics. This led me to my current main research focus, “learning to learn”, where I am able to cast hyperparameter optimization as a learning problem.

What are the key factors that have enabled recent advancements in deep learning?
There are two factors, and both are underpinned by the concept of openness. The main factor, in my opinion, is that research in this field is relatively open: many people publish their work openly, from research labs in universities to private labs in corporations. This gives researchers the ability to learn from and build on one another’s research, continually pushing the boundaries of deep learning at a rapid pace. The other factor is the openness of code published today, where researchers can build on existing code and share it openly with others. Moreover, openly published code leads to some level of homogeneity in the programming languages used, which helps facilitate collaboration.

What are the main types of problems now being addressed in the deep learning space?
One example is unsupervised learning, which still requires a lot of work. Learning to learn is also gaining prominence, as is reinforcement learning, possibly combined with game theory, for complex multi-agent interactions. On the application side, for example in healthcare, an area I am interested in, there is growing attention paid to using deep learning for medical imaging diagnostics such as detecting breast cancers, lung nodules, pneumothorax, intracranial bleeding and more.

What developments can we expect to see in NLP in the next 5 years?
There will be better understanding of natural language, where deep learning algorithms can better understand connections amongst sentences and paragraphs. Another potentially interesting development is that NLP will advance further towards more realistic dialogue with humans, moving us one step closer to an AI agent being able to pass the Turing test.

What is the impact of your work on advancing hyperparameter optimization?
For many years, the vast majority of people in the deep learning community have used common heuristics to tune hyperparameters such as learning rates, decay rates and L2 regularization. In recent work, researchers have tried to cast hyperparameter optimization as a deep learning problem, but these approaches are limited by their lack of scalability. My work is a first step in showing that scalable hyperparameter optimization is now possible: it accelerates convergence and can be trained on one problem while enjoying the benefits of transfer learning. This has an impact at the industrial level, where deep learning algorithms can be accelerated to convergence without manual hand-tuning, even for large models.

There's just 1 month to go until the Deep Learning Summit, taking place alongside the Deep Learning in Finance Summit in Singapore on 27-28 April. Explore how deep learning will impact communications, manufacturing, healthcare, transportation and more. View further information here.

Confirmed speakers include Jeffrey de Fauw, Research Engineer at DeepMind; Vikramank Singh, Software Engineer at Facebook; Nicolas Papernot, Google PhD Fellow at Penn State University; Brian Cheung, Researcher at Google Brain; Somnath Mukherjee, Senior Computer Vision Engineer at Continental; and Ilija Ilievski, PhD Student at NUS.

Tickets are limited for this event. Register your place now.