ML Zero to Hero: Part 1 - Hello, World

The first steps are always the hardest. Where do you even begin with machine learning (ML) and artificial intelligence (AI)?

This series of blogs will take you from complete novice to ML Guru in almost no time at all. Although you might feel like you’re too late to start, now is actually a great time to begin your ML learning journey.

Together we will navigate the range of complicated or impenetrable terminology and work through meaningful examples you can follow along with. One goal is that everything we do will be FREE and legal to do.

First things first: if you’ve not played with it before, go and have a discussion with ChatGPT at https://chat.openai.com (FREE). This will show you really quickly how exciting (and a bit scary) this technology is, and how accessible it is even for non-technical folks.

You have not missed the AI boat!

In 2001, when I left school, I observed the incredible growth of innovative internet companies such as Google and Amazon, Expedia and Hampsterdance, but I noticed one peculiar thing about the dot-com boom: the rules of gold-rushes applied.

In a gold rush the people making the most money are those selling picks and shovels. Infrastructure and tools are the places to invest when speculative value is being sought (for example, see nVidia’s insane growth due to crypto then AI).

I saw Cisco making more money than anyone as they were building the plumbing, routing switches and connections that the whole internet depended on. While internet businesses rose and fell, Cisco kept going up and to the right. So I decided to train as a Network Engineer.

It was good fun for a while. I hand-crimped ALL the RJ45 connectors. All of them. I wired up patch panels, installed industrial-strength switches, spent weeks in Cisco management tools, and ran miles of Cat5E cable (which was mutt’s nuts at the time).

But things change.

I got tired of roaming around with the spiders, running cables under floorboards or in suspended ceilings. I got tired of the dust and the sticky hands from self-adhesive trunking. Before long I leapt into system administration then on into software engineering, where I’ve pretty much been ever since.

And now, all the stuff I learned about designing and building computer networks, which used to occupy multiple pages of my resume, is simply a set of tick-box options in the admin portals of the major cloud providers. Two pages of my CV are now a toggle switch on a form.

But what’s frustrating for me is great for the wider adoption of these capabilities. Now you don’t need training to build a large, secure virtual network; you just need to be able to click a few buttons.

And we’re at that stage now with AI and Machine Learning, where a lot of the underlying hard stuff, the foundations, has been done. The blood, sweat and tears of those early pioneers have now been boiled down to a very accessible ecosystem of tools and off-the-shelf models.

What is Machine Learning?

“Machine learning” simply means teaching computers to become good at understanding patterns. We call teaching a computer to pattern-match “training”.

In school you were probably given questions like, “What comes next in this sequence: 2, 4, 8, 16, ….?” You had to figure out the pattern and then guess the next number (no hints!).
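To make the sequence puzzle concrete, here is a minimal sketch of solving it with hand-written rules. The function name next_in_sequence is made up for this illustration, and it only knows two patterns (a constant difference or a constant ratio). It is not machine learning; it is the kind of rule a person has to write by hand, whereas a trained model works the pattern out from examples.

```python
def next_in_sequence(numbers):
    """Guess the next number by checking for a constant difference or ratio."""
    pairs = list(zip(numbers, numbers[1:]))

    # Pattern 1: the same amount is added each time (e.g. 1, 3, 5).
    diffs = [b - a for a, b in pairs]
    if len(set(diffs)) == 1:
        return numbers[-1] + diffs[0]

    # Pattern 2: each number is multiplied by the same amount (e.g. 2, 4, 8).
    # Skipped if any number is zero, to avoid dividing by zero.
    if all(a != 0 for a, _ in pairs):
        ratios = [b / a for a, b in pairs]
        if len(set(ratios)) == 1:
            return numbers[-1] * ratios[0]

    # No pattern that this simple function knows about.
    return None


print(next_in_sequence([2, 4, 8, 16]))  # 32.0
```

Any sequence that follows a different rule makes this function give up, which is exactly the limitation that learning patterns from data is meant to get around.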

While computers have been able to do simple pattern matching like this for decades, several recent innovations in the field of machine learning mean we’re now able to train and pattern-match at extremely large scales (ChatGPT was trained on a huge amount of text, much of it from the public internet).

A very common application of machine learning is auto-completion. If you start an email in Gmail with “I’m busy on Saturday so I’m afraid I can’t attend”, you might be offered an auto-complete suggestion of “your birthday party”.

This is text-based machine learning, i.e. finding patterns in text and making text suggestions based on those patterns.
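As a rough illustration of finding patterns in text, the toy sketch below counts which word tends to follow which in a few example sentences, then suggests the most common follower. The sentences and the function names train and suggest are invented for this example. Real auto-complete systems are far more sophisticated, so treat this as the idea in miniature rather than how Gmail does it.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count, for every word, which words followed it in the examples."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up training data for the illustration.
examples = [
    "I can't attend your birthday party",
    "I can't attend your wedding",
    "sorry I missed your birthday party",
]

model = train(examples)
print(suggest(model, "your"))      # birthday
print(suggest(model, "birthday"))  # party
```

With only three sentences the suggestions are crude, which hints at why more training data generally gives better results.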

Additional data can be used to refine and improve suggestions – for example, Gmail may check your calendar to see what events you had on Saturday!

The more data made available to the “model” for training and matching patterns, the better the model matches new patterns. Google knows so much about us that I feel it has a distinct advantage over its competitors when it comes to AI.

Machine Learning Models

Models come in different sizes and for different purposes. Auto-complete is not the only text-based machine learning capability. Besides generating text for things like auto-complete, models exist which can:

Generate pictures/images (see DALL-E, ChatGPT’s sister app for doing just this)
Generate audio and sound
Categorise images (e.g. “this is a cat”)
Categorise, summarise or analyse text (e.g. sentiment analysis for a Tweet)
Do several of the above in a single model (multi-modal)

I’ll show you how to train your own model from scratch in another article (it will take you less than 15 minutes and requires no coding ability at all – all the code will be provided) but for now let’s try out an existing model.

Set Up

First things first – you’ll need to install a few things onto your computer. This is quick, free and painless.

You’ll also need to understand how to open a command line interface on your computer. On Windows go to Start then type “command prompt” and click on the application with the black box. On Linux press Ctrl+Alt+T to open your default terminal. On Mac, open your applications folder and click on Terminal.

Use the links below to guide you through the installation of these pieces of software:

Python – this is the language we will use to interact with our model – don’t worry, I’ll provide the code you need, and you’ll very likely find it easy to understand as Python reads a lot like simple English.

You can then verify it’s correctly installed by typing the following command into your command line interface:
python3 --version

It should show you a version number, something like Python 3.x.y (the exact numbers depend on the version you installed).

Python Package Manager (PIP) – we’ll use this to install “packages” for Python to use. This is included in modern Python installations (since 3.4). You can verify that you have “pip” by running the following in the command line:

pip

It should show you a big list of options:

NOTE: if pip doesn’t work, try pip3

Next we’ll install PyTorch, which is one of three popular machine-learning platforms (for information, the other two are TensorFlow and JAX). You don’t need to know why, how or what this is at this stage. It will be invisible to us during our exercise, but you still need to install it :).

pip install torch

Finally, we’re going to install a package with PIP that will make it very easy for us to interact with models. Type:

pip install transformers

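Transformers aren’t the big robot things that turn into VWs or trucks, at least not in this context. In machine learning, a transformer is a type of neural network design that many modern models (including the one behind ChatGPT) are built on. The transformers package, made by Hugging Face, takes its name from that design. It converts your text or images into a format that the models understand, and then translates the model’s response into something you understand.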
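If you’d like to double-check that both installs worked before moving on, here is a small optional sketch. It assumes Python 3.8 or newer (where the standard importlib.metadata module is available), and installed_version is just a helper name chosen for this example. You can paste it into a Python session, or save it to a file and run it.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The two packages installed in the steps above.
for name in ("torch", "transformers"):
    version = installed_version(name)
    if version:
        print(name, "is installed, version", version)
    else:
        print(name, "is not installed, try: pip install " + name)
```

The version numbers you see will depend on when you ran the install commands.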

And that’s all we need! Let’s do some sentiment analysis in ONLY 4 LINES OF CODE!

Exercise 1: Sentiment Analysis

Before we begin, I invite you to look at your scroll bar and see how close you are to the bottom of the page. Prepare to be amazed at how simple it is to use AI today.

Still in your command line interface we’re going to write a very small Python application that will use a machine learning model and tell us if a piece of text is positive or negative.

Start by typing python3 and pressing Enter. You should see something similar to the following:

Those three arrows (>>>) are where we start writing our application. Type the following as the first line, then press Enter:

from transformers import pipeline

This adds knowledge of something called a “pipeline” into our application, and this massively simplifies how we interact with a machine learning model.

We now need to tell the pipeline what kind of activity we want to do and give the pipeline a name. In this case we’ll call it pipe, but you can name it anything you like.

pipe = pipeline("sentiment-analysis")

At this point, the pipeline will download a model for you. Don’t worry, it will be small and quick. Later on I’ll show you how to find models that best suit your use case and tell the pipeline to use them, but for now we’ll just let the pipeline decide which model to use. Your command prompt should look like this:

Now to give the model some text to analyse for sentiment. You can choose any text you like, but I’ll use “I don’t like Mondays” as an example. Type:

result = pipe("I don't like Mondays")

The model will very quickly analyse the text and store its analysis in the “result” variable – this is just a piece of memory in your computer that you can refer to by name – “result” in this case. The name can be anything you want.

Next we’ll print the contents of the result variable to the screen. Type:

print(result)

And you’ll be shown something like this:

That bottom line shows that the model thinks “I don’t like Mondays” is negative, and it is 98.8% sure it is negative.
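The result is a list containing a small dictionary with a “label” and a “score”. As a hedged sketch of how you might turn that into a friendlier sentence, the snippet below uses a hand-typed stand-in for the result, with the score rounded to the 98.8% mentioned above, so the exact number you see will differ slightly. The describe function is a name made up for this example.

```python
# Stand-in for what print(result) shows: a list holding one dictionary.
# The score is an illustrative value rounded to the 98.8% quoted in the text.
result = [{"label": "NEGATIVE", "score": 0.988}]


def describe(result):
    """Turn the pipeline's output into a short readable sentence."""
    top = result[0]  # the first (and here only) analysis in the list
    return f"{top['label']} ({top['score']:.1%} confident)"


print(describe(result))  # NEGATIVE (98.8% confident)
```

In your own session you can skip the stand-in line and pass your real result variable to describe instead.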

Congratulations! You’ve just written your first application that uses machine learning to provide a meaningful output! You can test the sentiment of other pieces of text by repeating the last two lines you typed but with different text, for example:
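If you want to try several pieces of text in one go, the sketch below is one way to do it. It assumes you are still in the same Python session, so the pipe variable from earlier already exists. The helper name classify_all and the sample sentences are invented for this illustration.

```python
def classify_all(pipe, texts):
    """Run each text through the sentiment pipeline and collect the answers."""
    rows = []
    for text in texts:
        top = pipe(text)[0]  # the pipeline returns a list; take its first item
        rows.append((text, top["label"], top["score"]))
    return rows


# Usage, assuming `pipe` was created earlier with pipeline("sentiment-analysis"):
#
# for text, label, score in classify_all(pipe, ["I love Fridays", "I don't like Mondays"]):
#     print(text, "->", label, round(score, 3))
```

Remove the leading # characters from the last two lines to run them. The labels and scores you get depend on the model the pipeline downloaded.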

In the next ML Zero to Hero article, I’ll introduce you to the public model repository that you just downloaded your model from, and show you how to use different models for different purposes.