How Packaged AI Services are Democratizing AI

Apr 27, 2020 | by Denys Fedorchuk

With the increasing popularity of machine learning (ML), it’s becoming more difficult for data scientists to find the appropriate tools for a specific task and decide on a robust approach. Should they stick to the basics and code everything from scratch or use one of the many pre-built tools that keep popping up on the market? If budget is a concern, then pre-built tools may be your best option. But this isn’t merely a resource question. If there is a tool out there that solves your specific problem, why reinvent the wheel?

In this article we’ll explore the range of packaged artificial intelligence (AI) services that exist, and when data scientists should leverage them.

Build versus Buy

Before diving in, let’s review what’s involved in coding everything from scratch. In the modern machine learning world, this means loading one of the popular ML libraries, such as TensorFlow or Keras, and building a model using the algorithms that the data scientist expects to perform best.

This is a common practice when doing a proof of concept for a project or when running a unique experiment. Because this is the most customizable approach, it gives users the freedom to try out literally any algorithm and create a completely custom solution. On the flipside, this approach requires time and is prone to errors.
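To make the "from scratch" path concrete, here is a minimal sketch of about the simplest model you could hand-code: a one-feature logistic regression trained with gradient descent in plain Python. The function names, learning rate, and toy data are assumptions for illustration; a real project would lean on TensorFlow or Keras rather than hand-written loops.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, lr=0.5, epochs=3000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # gradient of log-loss w.r.t. the logit
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return the predicted class (0 or 1) for a single input."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Toy data (made up): small values belong to class 0, large values to class 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_regression(xs, ys)
```

Even this toy version forces you to choose a learning rate, an epoch count, and a loss function by hand, which is where both the flexibility and the room for error come from.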

On the other side of the spectrum, you’ve got prepackaged AI services. These are state-of-the-art, pre-trained ML models that excel at a specific task. They are offered by all major cloud computing providers, such as AWS, Azure, and Google. Use cases for these pre-trained models include AI vision, language translation, and text-to-speech.

The biggest and most obvious advantage of these solutions is that they allow data scientists to prepare, build, and deploy their projects in a matter of days. These pre-trained models are very sophisticated, and it’s unlikely that your team could create more advanced models as quickly. The downside is that they aren’t as customizable as in-house solutions. While it’s possible to retrain pre-built models, doing so could cost them their “state of the art” status. After all, the quality of your data will ultimately affect the result. Garbage in, garbage out: it’s as simple as that. So, beware of retraining these services, especially if you’re not confident your data is good enough. If no customization is required, then pre-built packages are probably your best option.

The Pre-Built AI Provider Landscape

This article will only cover Azure, AWS, and Google as pre-built AI service providers, since they are the largest and most popular players.

It’s important to note that packaged AI models are typically designed to do one thing well. That doesn’t mean they lack flexibility. In fact, there are tools that allow you to re-train these services to do something more specific to your organization’s needs. But again, remember, the quality of the result will depend on the data you use. If you have clean data, the process is fairly simple. Regardless of which vendor you select, pre-packaged AI services can be broken down into five categories: (1) Vision, (2) Speech, (3) Text, (4) Decision, and (5) Auto AI. We will cover these categories one by one and explore what models fit within these categories.

1. AI Vision Services

AI vision services enable users to extract data from images and videos. The data extracted can be customized to your individual business needs.

All vendors in this category allow you to re-train the model and customize the vision algorithm to only detect things that are important to your organization and ignore those that aren’t. Re-training an algorithm may sound daunting, but you really don’t need to be a data science expert to do so. Fortunately, it is a fairly simple process.

In fact, all vendors promise a “no data science knowledge required” approach. By providing simple API calls and integrated labeling programs, they allow teams to highlight the object they are interested in and label it. Alternatively, you can hire someone to label the data for you. After all, to create a good model, you’ll need to label thousands of images, and that is pretty time-consuming. But keep in mind, a model is only as good as its data. To properly train the model, you must provide a lot of high-quality labeled images.

In terms of capabilities, Vision services can help data scientists do the following:

Cloud services in this field include Amazon Rekognition, Azure Computer Vision, and Google Video AI.
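To give a feel for how little code a pre-built vision call takes, the sketch below wraps the detect_labels operation that Amazon Rekognition exposes through the boto3 SDK. The helper name top_labels, the thresholds, and the FakeRekognition stand-in (with its canned response) are assumptions for illustration, so the example runs without AWS credentials.

```python
def top_labels(client, image_bytes, max_labels=5, min_confidence=80.0):
    """Return (name, confidence) pairs for the objects detected in an image."""
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]

class FakeRekognition:
    """Hypothetical offline stand-in that mimics the shape of the real response."""

    def detect_labels(self, Image, MaxLabels, MinConfidence):
        canned = [
            {"Name": "Car", "Confidence": 98.1},
            {"Name": "Wheel", "Confidence": 91.4},
            {"Name": "Tree", "Confidence": 62.0},
        ]
        kept = [c for c in canned if c["Confidence"] >= MinConfidence]
        return {"Labels": kept[:MaxLabels]}

labels = top_labels(FakeRekognition(), b"not-a-real-image")

# With real credentials you would pass a real client instead, for example:
#   import boto3
#   labels = top_labels(boto3.client("rekognition"), open("photo.jpg", "rb").read())
```

Passing the client in as an argument keeps the helper easy to test and makes it simple to swap providers later.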

2. AI Speech Services

Speech services provide very powerful tools you can use for translation, text-to-speech, or speech-to-text use cases. Google, Amazon, and Microsoft have refined this technology through their own virtual assistants, which not only understand what you are saying (well, most of the time) but also respond in a very human-like fashion. Use cases for these services include improving customer care, cataloguing audio files by understanding their contents, live captioning, smart device development, robotics, live translation, transcript generation, and many others. All these technologies can be easily deployed in production or on low-power edge devices.

Speech services include the following capabilities:

Some of the services in this field include Amazon Transcribe, Azure Speech to Text, Google Speech to Text, Amazon Lex, Azure Conversation Learner, and Google Contact Center AI.

3. AI Text Services

Text services enable you to analyze any form of written text and extract data from it. From simple Natural Language Processing (NLP) to complicated language feature extraction, these services can efficiently process large amounts of text and surface a wide range of insights.

There is always value in collecting unstructured text data. While that value may be invisible to the human eye, a machine given the data in a proper format will readily find useful information in it. As it turns out, ML is very adept at identifying valuable insights within massive amounts of data.

As an example, Amazon uses this ML ability in its Amazon Comprehend Medical service to decipher medical documents, such as badly written doctor’s notes. After the note has been processed, the model can pull valuable information from it, such as prescriptions or diagnoses.

Microsoft Azure, on the other hand, is trying to help people of any age read text with its Immersive Reader technology, which can be embedded into any software. This technology reads and comprehends text, highlights the most important points, and reads them aloud if required. It can even show pictures that help the user understand the meaning of a word in a sentence.

Google, meanwhile, takes a leading role in text translation, using complex text analysis both to translate individual words and to keep the translated sentence coherent.

Examples of and use cases for text ML services include:

Some of the services in this field include Amazon Comprehend, Azure Text Analytics, Google Natural Language, Amazon Translate, Azure Translator Text, and Google Translation.

4. AI Decision Services

Decision services enable machines to evaluate multiple factors from the past and reach a decision that may impact users in some way. For example, Azure Cognitive Services provides an Anomaly Detector technology, which ingests data and alerts users if it detects any sort of anomaly in the system. This technology looks at time series data and creates a custom algorithm to make sure users have the best possible model for detection.
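As a rough illustration of the underlying idea (not Azure's actual algorithm, which is not described here), the sketch below flags a point in a time series when it sits far from the recent average. The function name, the window size, the threshold, and the sample readings are all assumptions.

```python
from statistics import mean, stdev

def find_anomalies(series, window=5, threshold=3.0):
    """Return indices of points that lie more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies

# Made-up sensor readings with one obvious spike at index 6.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 25.0, 10.0, 9.9, 10.1]
spikes = find_anomalies(readings)
```

A managed service earns its keep on everything this sketch ignores, such as seasonality, trends, and choosing the thresholds for you.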

Amazon also offers a forecasting service, Amazon Forecast, which allows users to submit previous time series observations and forecast future outcomes. Google doesn’t currently have an out-of-the-box service to forecast data or detect anomalies. However, it does offer a recommendation service, which analyzes past user patterns to give recommendations. You may have seen the ads, since Google is currently promoting it aggressively.

Decision services include the following:

Some of the services in this field include Amazon Forecast, Azure Anomaly Detector, Amazon Fraud Detection, Azure Content Moderation, Google Video Analysis, Amazon Personalize, Azure Personalizer, and Google Recommendations AI.

5. Auto AI Services

Perhaps the most powerful packaged AI services are the automatic AI services. These services are built to search through the available ML algorithms and identify the best-fit model for your data.

Typically, the first step when using these services is to create a dataset and upload it to the Auto AI service. Then you’ll indicate what you’re trying to achieve. Examples include trying to classify cars or attempting to predict the value of those cars. In the former case, you indicate that it’s a classification problem; in the latter, a regression problem.

There are numerous knobs and buttons to tune, but at the end the Auto AI service will explore every possible algorithm and feature set of the provided dataset. It will then come up with the best possible fit for your needs. In short, auto AI trains your model for you – isn’t that great! This would generally take a lot of time, but since Auto AI parallelizes this process and trains multiple models at the same time, it really speeds up the fit and finish of your model.
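The core loop behind this kind of service can be sketched in a few lines: train several candidate models in parallel, score each one on held-out data, and keep the winner. Everything below is an assumption for illustration (the three toy regressors, the function names, the made-up car data); real Auto AI services search a far larger space and tune features and hyperparameters as well.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def fit_mean(xs, ys):
    """Baseline: always predict the average target."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least squares for a single feature."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: my + slope * (x - mx)

def fit_nearest(xs, ys):
    """1-nearest-neighbor: copy the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}

def mse(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def auto_select(train, valid):
    """Train every candidate concurrently; return (name, model) with lowest validation error."""
    def evaluate(item):
        name, fit = item
        model = fit(*train)
        return mse(model, *valid), name, model

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(evaluate, CANDIDATES.items()))
    _, best_name, best_model = min(results, key=lambda r: r[0])
    return best_name, best_model

# Made-up regression data: car age in years -> price in thousands.
train = ([1, 2, 3, 4, 5, 6], [27.4, 25.1, 22.6, 19.9, 17.5, 15.0])
valid = ([1.5, 3.5, 5.5], [26.2, 21.3, 16.2])
best_name, best_model = auto_select(train, valid)
```

On this toy data the linear model has the lowest validation error, which is what you would expect when price falls steadily with age.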

Auto AI services include Amazon SageMaker Autopilot, Azure Automated ML, and Google Cloud AutoML.

Dataset Creation

While they aren’t AI solutions, dataset creation tools play a major role in training or re-training proprietary models when using any of the above-mentioned AI services. You can prepare your data on your own using Python and frameworks such as Spark. Alternatively, you can use integrated tools, such as Google Dataprep or Databricks on AWS or Azure. Google Dataprep provides an intuitive UI to prepare, explore, and restructure data. Databricks is slightly more low-level, and provides an isolated environment and a notebook UI to edit datasets using Spark. Both Dataprep and Databricks allow users to build robust pipelines to prepare their data for model ingestion. There are also services such as AWS SageMaker Ground Truth, which allow users to hire independent contractors or create a private workforce to assemble their dataset. Most of the time, services like these are used for labeling images for vision AIs, where the amount of data required to train a model is enormous.
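If you go the do-it-yourself route, the preparation step usually boils down to a few operations: drop incomplete rows, remove duplicates, and hold out a test set. The sketch below shows that flow in plain Python as a small-scale stand-in for what you would do in Spark; the prepare function, the column names, and the sample rows are assumptions.

```python
import random

def prepare(rows, required, test_fraction=0.2, seed=42):
    """Drop rows with missing required fields, remove duplicates
    (judged on the required columns), then split into train and test sets."""
    seen, clean = set(), []
    for row in rows:
        if any(row.get(col) in (None, "") for col in required):
            continue  # incomplete row
        key = tuple(row[col] for col in required)
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        clean.append(row)
    random.Random(seed).shuffle(clean)  # fixed seed keeps the split reproducible
    n_test = round(len(clean) * test_fraction)
    return clean[n_test:], clean[:n_test]

rows = [
    {"image": "a.jpg", "label": "car"},
    {"image": "b.jpg", "label": "truck"},
    {"image": "b.jpg", "label": "truck"},  # duplicate
    {"image": "c.jpg", "label": None},     # missing label
    {"image": "d.jpg", "label": "car"},
    {"image": "e.jpg", "label": "bus"},
    {"image": "f.jpg", "label": "car"},
]
train_rows, test_rows = prepare(rows, required=["image", "label"])
```

The same three steps map directly onto a managed pipeline in Dataprep or a Spark job in Databricks once the data no longer fits on one machine.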

Conclusion

There are numerous packaged AI services offered by cloud providers that are cost-effective and could speed up your AI project, all while providing a robust starting point for ML experiments. With easily manageable pipelines, you can go from just playing around with a service to full-on production with a multi-user inference API.

Even data science experts with experience building robust models should at the very least explore these services. There is a tool for almost every problem. But even if there isn’t, there is most likely a service that will help speed up the process, which your end clients will appreciate.

At EastBanc Technologies we always leverage existing AI packages first. It's efficient and cost-effective and there is no reason to reinvent the wheel.

If a data scientist has a unique problem and there is no service that meets their needs, then they’ll have to start coding models from scratch. But even then, they can leverage production AI pipeline tools to minimize headaches. Furthermore, they can benefit from the fact that different cloud providers offer different services, which makes it possible to mix and match. For example, a user could run their vision models on Azure and their translation models on Google. Since every provider offers its own way to create a production-ready pipeline, data scientists can deploy any kind of service in no time.