A few days ago, Amazon announced the availability of a new set of automatic model tuning capabilities in the AWS SageMaker platform. Specifically, the new release focuses on tuning and optimizing the hyperparameters associated with SageMaker models. The release is a powerful addition that should streamline the adoption of SageMaker within the data science community. With the new hyperparameter tuning and model optimization capabilities, SageMaker joins a new group of platforms entering the market to solve this notorious challenge in deep learning applications.
Building deep learning solutions in the real world is a process of constant experimentation and optimization. Unlike most other types of software applications, deep learning applications don’t have a linear lifecycle, because models need to be constantly refined, optimized and tested.
What are Hyperparameters?
Among the possible optimizations in a deep learning model, none plays a more prominent role than the tuning of hyperparameters. Conceptually, hyperparameters are variables that, while external to the model, influence its behavior and knowledge: they are set before training rather than learned from the data. A mid-size deep learning model can be influenced by many hyperparameters, including some of the following:
· Learning Rate: The mother of all hyperparameters, the learning rate controls how large a step the optimizer takes on each parameter update, and therefore how quickly and how stably the model learns.
· Number of Hidden Units: A classic hyperparameter in deep learning algorithms, the number of hidden units is key to regulating the representational capacity of a model.
· Convolution Kernel Width: In convolutional neural networks (CNNs), the kernel width influences the number of parameters in a model which, in turn, influences its capacity.
· Momentum: This hyperparameter determines how much the direction of the previous steps carries over into the next step, which helps to prevent oscillations.
· Batch Size: The mini-batch size is the number of training samples fed to the network before each parameter update.
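To make three of these knobs concrete, here is a minimal sketch of mini-batch gradient descent with momentum on a toy linear model. The function name `train_linear_model`, the synthetic data and the default values are assumptions made purely for illustration; they do not come from any of the platforms discussed in this article.

```python
import random


def train_linear_model(learning_rate, momentum, batch_size, epochs=50, seed=0):
    """Fit y = w*x + b with mini-batch SGD plus momentum.

    Returns the learned (w, b) and the final mean squared error.
    """
    rng = random.Random(seed)
    # Hypothetical synthetic data drawn from y = 3x + 2 with a little noise.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    data = [(x, 3.0 * x + 2.0 + rng.gauss(0.0, 0.1)) for x in xs]

    w, b = 0.0, 0.0          # model parameters (learned from the data)
    vel_w, vel_b = 0.0, 0.0  # velocity terms used by momentum

    for _ in range(epochs):
        rng.shuffle(data)
        # batch_size: how many samples are seen before each update.
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
            # momentum: fraction of the previous step carried into this one.
            # learning_rate: size of the step taken along the gradient.
            vel_w = momentum * vel_w - learning_rate * grad_w
            vel_b = momentum * vel_b - learning_rate * grad_b
            w += vel_w
            b += vel_b

    mse = sum((w * x + b - y) ** 2 for x, y in data) / len(data)
    return w, b, mse
```

Calling `train_linear_model(0.05, 0.9, 20)` should recover values close to the true slope and intercept, while a very small learning rate such as `1e-5` leaves the model essentially untrained after the same number of epochs. None of these three values is learned by the model itself, which is exactly what makes them hyperparameters.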
Tuning hyperparameters in a deep learning application can be an incredibly challenging process, given that it requires not only the execution of an experiment but also the evaluation of the results against other versions of the model. Recently, a new group of platforms has entered the market with a surgical focus on solving this challenge across different deep learning frameworks. Let’s look at a few of the most prominent solutions in this new space:
With its latest release, SageMaker incorporates the creation of hyperparameter tuning jobs directly in the SageMaker console. The platform evaluates and compares the models produced by different hyperparameter configurations and integrates with the training jobs and other aspects of the model lifecycle.
Comet.ml provides a super simple interface for the tuning and optimization of hyperparameters across different deep learning frameworks such as TensorFlow, Keras, PyTorch, Scikit-Learn and many others. Developers can seamlessly integrate Comet.ml into their models using the several SDKs available with the platform as well as the REST API. The platform provides a very visual way to tune and evaluate hyperparameters.
One of my favorite stacks when it comes to hyperparameter tuning is the newly released Weights & Biases (W&B). Used by deep learning powerhouses such as OpenAI, W&B provides an advanced toolset and programming model for recording and tuning hyperparameters across different experiments. The solution records and visualizes the different steps in the execution of the model and correlates them with the configuration of its hyperparameters.
DeepCognition is one of the new platforms making inroads in the world of self-service deep learning. Functionally, DeepCognition enables the implementation, training and optimization of deep learning models with minimal coding. As part of the optimization process, DeepCognition includes a very powerful and visual hyperparameter tuning engine that records and compares the execution of experiments based on specific hyperparameter configurations.
Azure ML provides native hyperparameter tuning capabilities as part of its ML Studio. While the existing functionality seems limited compared to some of the other options covered in this article, the platform is able to record and provide basic visualizations relevant to the model hyperparameters. The hyperparameter tuning capabilities of Azure ML can be combined with other services such as Azure ML Experimentation to streamline the creation and testing of new experiments.
Just like AWS SageMaker and Azure ML, Google Cloud ML provides some basic hyperparameter tuning capabilities as part of its platform. The current Cloud ML feature set is a bit limited compared to some of its competitors, but Google has proven its ability to iterate quickly in this space.
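Stripped of dashboards and infrastructure, the workflow that all of these platforms help automate is a loop: pick a hyperparameter configuration, run an experiment, record the result and compare it against the other runs. The sketch below illustrates that loop with plain random search. It is not the algorithm or API of any platform mentioned above; `run_experiment` is a hypothetical stand-in for a real training job, and the search ranges are assumptions chosen so that the toy problem stays numerically stable.

```python
import math
import random


def run_experiment(learning_rate, momentum, steps=100):
    """Toy stand-in for a training job (hypothetical).

    Minimises f(w) = 0.5 * (w1^2 + 10 * w2^2) with gradient descent plus
    momentum and returns the final loss.
    """
    w = [1.0, 1.0]
    v = [0.0, 0.0]
    curvature = [1.0, 10.0]
    for _ in range(steps):
        for i in range(2):
            grad = curvature[i] * w[i]
            v[i] = momentum * v[i] - learning_rate * grad
            w[i] += v[i]
    loss = 0.5 * sum(c * wi * wi for c, wi in zip(curvature, w))
    # Treat a diverged run as the worst possible result.
    return loss if math.isfinite(loss) else math.inf


def random_search(num_trials=20, seed=0):
    """Sample configurations at random, run each, and rank them by loss."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        config = {
            # Learning rates are sampled on a log scale between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "momentum": rng.uniform(0.0, 0.95),
        }
        trials.append((run_experiment(**config), config))
    # Best (lowest-loss) configuration first.
    trials.sort(key=lambda trial: trial[0])
    return trials
```

After `trials = random_search()`, the first entry `trials[0]` holds the lowest loss and the configuration that produced it. Keeping the full ranked list of trials, rather than only the winner, is what makes side-by-side comparison of experiments possible, which is the part the platforms above wrap in visual tooling.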