From Experiments to Production with Azure Machine Learning
January 14, 2021 | by Denys Fedorchuk
The machine learning industry is slowly becoming a maze of technologies and practices that gets harder to traverse. It sometimes feels as if, before you can become a data scientist, you must learn all there is to know about designing production-grade architecture just to create a modeling flow. Perhaps this is the reason behind the growing demand for the newly popular job title of Machine Learning (ML) Engineer. The role moves the complexities of production away from the data scientist, so that they can concentrate on building better models without worrying about putting those models into production.
However, even with split responsibilities, communication is still a major bottleneck. The data scientists and ML engineers must agree and communicate on certain aspects of the model before they can successfully publish it. All of this collaboration could become a chokepoint of production and delay the benefits that machine learning brings to the table.
Enter the world of Azure Machine Learning! This tool is Azure’s way of helping ML engineers and data scientists collaborate and roll out models faster and with greater ease. Azure ML brings the components of the ML development flow together in one convenient space. If you are seeking a low-code or no-code approach, Azure Machine Learning studio is a web portal that lets you work with those components through an intuitive UI. If you are looking to reduce the time it takes to deploy a model to production, Azure ML is well suited to the job.
When trying to understand the benefits of Azure ML, an image of a restaurant comes to mind. It is no secret that restaurants are a form of managed chaos that, if managed properly, have faster food production time, innovative food ideas, and a reliable food delivery service to the customer tables. They also have a defined flow so that each component of the restaurant is autonomous and only the required interactions are happening. Let’s go through the components of Azure ML and see how they compare to a successful restaurant.
The kitchen is a hub for all food, tools, and personnel. This is where the core of the restaurant is, and it must seamlessly integrate with all other parts of the restaurant. Much like the kitchen, Azure ML is an entity that holds and manages all your resources. You can upload datasets, train and save models, create and share experiments, all within the Azure ML ecosystem. This is where the magic of machine learning takes place and where one-of-a-kind models can be created by the data science teams.
In addition, because it is an Azure product, you get top-of-the-line security and governance capabilities. Don’t want any rodents in your fridge? Azure role-based access control (Azure RBAC) restricts access to the users you choose. Don’t want your secret recipes leaking out to the internet? Then build your models in isolation, using virtual networks and Azure Private Link to keep them out of public view. All this functionality comes prebuilt and ready to use out of the box. Azure ML lets you start cooking as soon as the order arrives, rather than first building a custom kitchen and only then preparing a meal.
The Fridge and the Stockpile
Every kitchen must organize its produce. Improper organization can and will lead to delays and to expired products being used for cooking. After all, the faster you can find the right potato, the quicker you can make the fries. Similarly, proper organization prevents you from grabbing a stale potato, which would lead to a below-average product. When striving for the best meal, organization is key. Azure ML lets its users take advantage of Git to track work and create workflows, which in turn accelerates time to production. It also lets you track datasets, models, and experiment runs.
There is nothing worse than training a model for an entire day, only to realize that stale data was used and the model needs to be retrained. Azure ML offers a way to handle data uploads and versioning. Simply register a dataset by uploading a file either from your local computer or from Azure Storage. When you register a dataset, you can add a description so that every user knows what the dataset is for. Dataset descriptions are essential to ensuring that nobody on your team mistakes an expensive truffle for a spoiled potato and throws it away.
Once you register your first dataset, it is classified as version 1. Each time you register an updated dataset under the same name, the version number increases, and the new version can be accompanied by a description of what has changed. Azure ML does not simply replace the dataset; it keeps the older versions so that they can be consumed on demand. Also, because the data now lives in one centralized space, your team members will find it much simpler to locate the resources they need for modeling, which in turn improves productivity.
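The versioning behavior described above can be pictured with a small in-memory sketch. This is not the Azure ML SDK: `DatasetRegistry`, `register`, and `get` are hypothetical names chosen for illustration, and the potato data is made up. The sketch assumes only what the text states: the first registration is version 1, each later registration under the same name increments the version, and older versions stay retrievable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DatasetVersion:
    name: str
    version: int
    description: str
    rows: tuple  # immutable snapshot of the data at registration time


@dataclass
class DatasetRegistry:
    """Toy stand-in for a dataset registry (hypothetical, not an Azure API)."""
    _store: Dict[str, List[DatasetVersion]] = field(default_factory=dict)

    def register(self, name: str, rows, description: str = "") -> DatasetVersion:
        versions = self._store.setdefault(name, [])
        # The first registration becomes version 1; later ones increment.
        entry = DatasetVersion(name, len(versions) + 1, description, tuple(rows))
        versions.append(entry)  # older versions are kept, never overwritten
        return entry

    def get(self, name: str, version: Optional[int] = None) -> DatasetVersion:
        versions = self._store[name]
        if version is None:
            return versions[-1]  # default to the latest version
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


registry = DatasetRegistry()
registry.register("potatoes", [("russet", 3)], "Initial delivery")
registry.register("potatoes", [("russet", 3), ("yukon", 5)], "Added Yukon Gold")

latest = registry.get("potatoes")
first = registry.get("potatoes", version=1)
print(latest.version, latest.description)
print(first.version, first.description)
```

Storing each version as an immutable snapshot is the design choice that makes "consume an older version on demand" safe: nothing registered later can alter what version 1 contained.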
In addition to versioned data, trained models get a similarly convenient registration system, in which each newly registered version of a model is stacked on top of the previous ones. One level up, Azure ML uses the concept of experiments, which group and log every model training run. This means that the models produced by an experiment run are kept, alongside all the artifacts used for that run, ready to be registered. You can even compare the runs within an experiment to each other, which would be very time-consuming to do without Azure ML.
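To make the idea of an experiment as a container of logged runs more concrete, here is a minimal sketch in plain Python. It does not use or mirror the Azure ML SDK: `Experiment`, `Run`, `start_run`, and `best_run` are hypothetical names, and the learning rates and accuracy values are invented placeholders, not real results.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    run_id: int
    params: Dict[str, float]
    metrics: Dict[str, float] = field(default_factory=dict)

    def log(self, name: str, value: float) -> None:
        """Record a metric for this run."""
        self.metrics[name] = value


@dataclass
class Experiment:
    name: str
    runs: List[Run] = field(default_factory=list)

    def start_run(self, **params: float) -> Run:
        run = Run(run_id=len(self.runs) + 1, params=params)
        self.runs.append(run)  # every run is kept for later comparison
        return run

    def best_run(self, metric: str, higher_is_better: bool = True) -> Run:
        scored = [r for r in self.runs if metric in r.metrics]
        pick = max if higher_is_better else min
        return pick(scored, key=lambda r: r.metrics[metric])


experiment = Experiment("fries-quality")
# Placeholder hyperparameters and scores, for illustration only.
for learning_rate, accuracy in [(0.1, 0.81), (0.01, 0.87), (0.001, 0.84)]:
    run = experiment.start_run(learning_rate=learning_rate)
    run.log("accuracy", accuracy)

best = experiment.best_run("accuracy")
print(best.run_id, best.params, best.metrics)
```

Because every run carries its own parameters and metrics, comparing runs reduces to a single lookup instead of digging through notebooks and spreadsheets.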
To sum up, Azure ML takes care of your organization needs. When you run experiments, every run is saved with its artifacts so that it can easily be reused if needed. Every trained model is also saved, which makes it easy to compare results and deploy different versions of the model on demand. Lastly, data versioning helps improve collaboration with coworkers and makes sure that everyone is on the same page. If you want to organize your ML products, look no further than Azure ML.
The Knife, the Spoon, the Spork
Reliable utensils and cutlery sets are the building blocks of any kitchen. The kitchens of Michelin-starred restaurants spare no expense on their tools, since they know that tools designed for the job allow for greater productivity and a cutting-edge product. You could cut bread with your hands or a filet knife, but a bread knife is a far better solution. Azure ML follows the same principle when integrating tools and libraries into the product. You will find up-to-date, appropriate tools that let you do your job with ease.
To start, Azure ML lets data scientists work in their favorite development environment: notebooks. Azure’s managed machine learning environment comes pre-installed on machine learning compute instances, sparing the user a tedious environment setup. Just create a notebook, start and attach a virtual machine (VM), and you are ready to code. The pre-installed environment includes libraries such as NumPy, pandas, Spark, TensorFlow, and others. However, you are not limited to those libraries. Simply add your own libraries to the machine learning environment and you are good to go! You can always rely on your own custom-made spork to get the job done.
Speaking of compute instances, Azure ML allows you to create VMs within its platform. This means that you do not need to go through the process of connecting instances to your notebooks. In addition, you can create compute clusters rather than single instances, which can be an added benefit when you are running parameter tuning or if you need to train multiple models at the same time. Whatever your computing needs are, Azure ML will be able to meet your demand.
Lastly, Azure ML gives you a tool for creating image labeling projects. This feature lets your team collaborate when labeling incoming data. Currently, you can create a labeling project for the following image tasks: multi-class classification, multi-label classification, and object identification with bounding boxes. This tool is a great resource if you are designing a computer vision model and need to gather labeled data.
Azure ML contains most of the modern tools and libraries that you might need while developing your models. Not only does this make your life easier, since you can largely avoid setup, but it also lets you get up and running in no time.
The Sous-Chef and the Culinary Staff
The chef of the kitchen is usually responsible for creating new recipes, making menu decisions, and sometimes even cooking the meals that require skills only the chef possesses. All other tasks can be handed over to the sous-chef and the kitchen staff. The majority of meals can be prepared without the chef because the practice of preparing them has been standardized and the staff can do it with ease. In Azure ML, you are the chef, and you are presented with an ensemble of established machine learning algorithms that are sufficient for solving the majority of common tasks.
To start, you can create pipelines, which are series of modular steps that together make up a larger, more complex training or transformation algorithm. Pipelines strongly resemble a meal recipe with step-by-step instructions on how to prepare it. The modularity of a pipeline allows multiple team members to work on the algorithm at the same time. You could compare it to multiple kitchen staff members cooking a single meal by dividing the work between them. The best part of pipeline creation is that several predefined modular steps are available to accelerate the work.
These predefined modules were created from established techniques and can be thought of as kitchen staff members who already know how to do certain tasks. For example, there is a set of data transformation modules, which range from removing a column from a dataset to converting words to vectors. In addition, there are modeling algorithm modules that let you run model training, ranging from regression to image classification, without any coding required. You could build a working model simply by dragging and connecting boxes in the Azure Machine Learning studio portal.
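The recipe idea behind pipelines can be sketched in a few lines of plain Python. This is not how Azure ML implements pipelines or its predefined modules: `drop_column`, `fill_missing`, and `run_pipeline` are hypothetical names, and the sample rows are made up. The sketch only shows the principle that each step is a small, reusable unit and the pipeline is the ordered list of those units.

```python
from functools import reduce
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_column(column: str) -> Step:
    """Build a step that removes one column from every row."""
    def step(rows: List[Row]) -> List[Row]:
        return [{k: v for k, v in row.items() if k != column} for row in rows]
    return step


def fill_missing(column: str, default: object) -> Step:
    """Build a step that replaces None (or absent) values in a column."""
    def step(rows: List[Row]) -> List[Row]:
        return [
            {**row, column: default if row.get(column) is None else row[column]}
            for row in rows
        ]
    return step


def run_pipeline(rows: List[Row], steps: List[Step]) -> List[Row]:
    """Apply the steps in order, feeding each step's output to the next."""
    return reduce(lambda data, step: step(data), steps, rows)


rows = [
    {"id": 1, "variety": "russet", "weight": None},
    {"id": 2, "variety": "yukon", "weight": 0.3},
]
result = run_pipeline(rows, [drop_column("id"), fill_missing("weight", 0.0)])
print(result)
```

Since every step takes rows in and returns new rows out, two team members can write and test `drop_column` and `fill_missing` independently, then combine them by adding both to the list.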
Once you have used the preset modeling algorithms, an abundance of predefined model scoring and evaluation techniques is available. Techniques such as Permutation Feature Importance, Area Under Curve, Recall, Confusion Matrix, and more are ready out of the box and can be used without writing any code. Of course, there are situations where the predefined modules do not offer all the functionality that you require. In those cases, you could use a custom code module, which lets you write your own logic and make it part of the pipeline.
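For readers who want to see what two of these metrics actually measure, here is a hand-rolled sketch of a binary confusion matrix and recall. In Azure ML these come from the evaluation modules rather than from code like this; the function names and the labels below are invented for illustration.

```python
from collections import Counter
from typing import Dict, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count true/false positives and negatives for a binary problem."""
    counts: Counter = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def recall(cm: Dict[str, int]) -> float:
    """Share of actual positives that the model found: tp / (tp + fn)."""
    actual_positives = cm["tp"] + cm["fn"]
    return cm["tp"] / actual_positives if actual_positives else 0.0


# Made-up labels: 1 = "spoiled potato", 0 = "fresh potato".
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)          # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 1}
print(recall(cm))  # 0.75
```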
In addition to building pipelines from predefined modules, you could take an even easier route and let Azure ML do the work for you. Automated ML (AutoML) is a capability of Azure ML that tries out different modeling algorithms and automatically identifies the best one for your particular use case. All you do is provide a dataset and specify whether the problem you are trying to solve is regression, classification, or time-series forecasting. Automated ML will perform feature engineering and try out different modeling algorithms to find the one that works best. This could be compared to a sous-chef cooking multiple meals while you, the chef, relax in the office. If you do not trust automated ML to engineer features thoroughly or to choose the best algorithm, you could at the very least use it as a sanity check, to confirm that the automated approach agrees with the results of your manual labor.
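At its core, the automated approach described above is a search: fit several candidate models, score each on held-out data, and keep the winner. The sketch below shows that loop in a drastically simplified form, with two hand-written regressors and a toy dataset. It is an assumption-laden illustration, not Azure's implementation: the real service also performs feature engineering and explores far more algorithms, and `fit_mean`, `fit_linear`, and `select_best` are hypothetical names.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]
Data = Tuple[List[float], List[float]]


def fit_mean(xs: List[float], ys: List[float]) -> Model:
    """Baseline candidate: always predict the training mean."""
    average = mean(ys)
    return lambda x: average


def fit_linear(xs: List[float], ys: List[float]) -> Model:
    """Second candidate: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model: Model, xs: List[float], ys: List[float]) -> float:
    """Mean squared error of a model on the given points."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def select_best(candidates: Dict[str, Callable[..., Model]],
                train: Data, valid: Data) -> Tuple[str, Dict[str, float]]:
    """Fit every candidate on train, score on valid, return the lowest error."""
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Toy data, invented for illustration.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
valid = ([5.0, 6.0], [10.1, 11.9])

best, scores = select_best({"mean": fit_mean, "linear": fit_linear}, train, valid)
print(best, scores)
```

Used as a sanity check, the same loop works in reverse: add your hand-built model to `candidates` and see whether it still comes out ahead of the automatic choices.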
To recap, every kitchen needs a set of highly trained personnel who will help you prepare the meals for your customers. In the same way, Azure ML provides you with a set of up-to-date predefined algorithms and data engineering procedures, all of which will help you create your models quickly. And if you have doubts about your choice of algorithms, you could lean on the automated ML feature, which will do the work for you. A lone chef, no matter how good he or she is, will always be slower than a team of qualified professionals who are ready to assist.
The last essential part of any great restaurant is its servers. No matter the number of customers, it is the job of the servers to take the prepared product from the kitchen and gracefully deliver it. Without someone serving the meals, there is no point in the restaurant existing. After all, a meal that is not consumed is a waste of time and money. The same holds in the ML world, since a model that is created and never used is as useless as a meal that was never eaten. Azure ML aims to standardize the process of bringing your models to production and making sure they can handle the number of requests they receive.
By using Azure ML, a model can be deployed to a wide range of targets, from a local server to IoT edge devices. All you do is register a model, define an environment configuration along with a model entry point script, and your model is ready to be deployed. The deployment target depends on what you are trying to achieve. For example, if you are testing your model, you can deploy it to a local web service or to an Azure VM. You could also use Azure Container Instances for testing and development purposes. If you are after real-time inferencing at production scale, then deploying your model to Azure Kubernetes Service is the usual choice and can be done in just a few clicks or lines of code. Azure ML makes deployments of any size a simple process that can be achieved without deep coding and architecture design skills.
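The entry point script mentioned above conventionally defines two functions in the Azure ML Python SDK: `init()`, called once when the service starts, and `run()`, called for each request. The sketch below follows that shape but runs without any Azure dependency. Everything else in it is an assumption made for illustration: the stand-in model with hard-coded weights replaces loading a registered model file, and the `{"data": [...]}` request format is just one plausible choice.

```python
import json

model = None  # populated once by init()


def init():
    """Called once at service start. A real script would load the
    registered model file here; this stand-in uses fixed weights."""
    global model
    weights, bias = [0.5, -0.25], 1.0  # made-up coefficients
    model = lambda features: sum(w * x for w, x in zip(weights, features)) + bias


def run(raw_data):
    """Called per request: parse the JSON payload, score each row,
    and return a JSON-serializable result."""
    try:
        rows = json.loads(raw_data)["data"]
        return {"predictions": [model(row) for row in rows]}
    except Exception as exc:
        # Report the failure to the caller instead of crashing the service.
        return {"error": str(exc)}


# Local smoke test, the equivalent of tasting the dish before it leaves the kitchen.
init()
response = run(json.dumps({"data": [[2.0, 4.0], [0.0, 0.0]]}))
print(response)
```

Keeping model loading in `init()` and scoring in `run()` means the expensive step happens once, while each incoming request only pays for the prediction itself.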
The ML industry needs to have all the right tools to improve its time to production. Just as you wouldn’t build a restaurant from scratch to cook and serve an omelet, you shouldn’t take months designing and developing a production server to deploy a simple regression model. With Azure ML, a data scientist can complete the entire modeling flow in weeks and not months. It is a place which could hold all your datasets, models, and workflows in a single organized and secure space.
Here at EastBanc Technologies, we are striving to ensure that every model we create gets used in production and that the time of our data scientists and ML engineers is not spent designing a modeling flow which may already be well defined and proven to work. Azure ML is one of the products which follows our slogan “complexity made simple” and that is exactly why we love it.