[April 2021] Machine Learning Monthly 💻🤖

16th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.

Daniel here, I'm 50% of the instructors behind the Complete Machine Learning and Data Science: Zero to Mastery course and our new TensorFlow for Deep Learning course! I also write regularly about machine learning on my own blog and make videos on the topic on YouTube.

Welcome to the 16th edition of Machine Learning Monthly. A 500ish (+/-1000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.

Since there's a lot going on, the utmost care has been taken to keep things to the point.

What you missed in April as a Machine Learning Engineer…

My work 👇

Video version of this article is live. Check it out here!

The Zero To Mastery TensorFlow for Deep Learning course has launched!

Eager to learn the fundamentals of deep learning?

Like to code?

Well, this course is for you.

It's entirely code-first, which means I use code to explain different concepts and link to external non-code-first resources for those who'd like to learn more.

I’m so pumped for this release, it’s the biggest thing I’ve ever worked on! I’d love if you checked it out:

Sign up on the Zero To Mastery Academy for the full course
View all of the course materials on GitHub (more to come)
Ask a question on the course GitHub Discussions tab if you’d like to know more
See and try the first 14 hours of the course on YouTube

From the community 👩‍💻

Louis-François' YouTube channel (What’s AI)

Looking for an overview of how Transformers (a neural network architecture) can be used for vision (in place of CNNs)? Or how OpenAI's DALL·E model can generate images from text?

Louis' channel What's AI has videos covering many of the latest advancements in AI with explanations to go along with them. Go and show him some love by subscribing!

From the internet 🕸

Welcome to the special design edition of Machine Learning Monthly!

I'm often asked, "what kind of machine learning project should I work on?"

Of course, there are an unlimited number of answers I could give to that question. But I usually answer with "follow your curiosity".

Why?

Because of how experimental machine learning is, it's in your best interest to figure things out through tinkering. By trying things which might not work.

However, machine learning projects are no longer works of magic. The device you're reading this on probably uses machine learning in several different ways you're not aware of (see Apple's implicit machine learning below).

That being said, this issue of ML Monthly collects different design best practices from companies using machine learning at world-scale proportions.

And after reading through them, you'll start to notice there are many overlaps in how things are done. This is a good thing. Because the overlaps are what you can use for your own projects.

As models and machine learning code become more and more reproducible, you'll notice an overarching theme here: machine learning is an infrastructure problem.

Which is something you've known all along: "how do I get data from one place to another in the best way possible?"

If you're considering working on your own machine learning projects, read through each of the guidelines below and try the materials in the bonuses section. But remember, none of these will replace the knowledge you gain from experimenting yourself.

Note: I have used the terms machine learning and AI interchangeably throughout this article. You can read "machine learning system" as "AI system" and vice versa.

Apple’s Human Interface Guidelines for Machine Learning

I'm writing these lines on an Apple MacBook in a library where I can see at least 6 other Apple logos. This morning I watched two people in front of me pay for their coffee using their iPhones.

Apple devices are everywhere.

And they all use machine learning in many different ways: to enhance photos, to preserve battery life, to enable voice searches with Siri, to suggest words for QuickType.

Apple's Human Interface Guidelines for Machine Learning share how they think about and how they encourage developers to think about using machine learning in their applications.

They start with two high level questions and break it down from there:

What is the role of machine learning in your app?
What are the inputs and outputs?

For the role of machine learning in your app, they go on to ask, is it critical (need to have) or complementary (nice to have)? Is it private or public? Is it visible or invisible? Dynamic or static?

For the inputs and outputs (I'm a big fan of this analogy because it's similar to an ML model's inputs and outputs) they discuss what a person will put into your system and what your system will show them.

Does a person give a model explicit feedback? As in, do they tell your model if it's right or wrong? Or does your system gather implicit feedback (feedback which doesn't require a person to do any extra work other than use the app)?

Questions to think about when asking what role machine learning plays in your app/feature.

Source: https://developer.apple.com/design/human-interface-guidelines/machine-learning/overview/roles/

Google's People and AI Research (PAIR)

Google's design principles for AI can be found in their People and AI Research (PAIR) guidebook.

The PAIR guidebook also comes along with a great glossary of many different machine learning terms you'll come across in the field (there's a lot). It breaks down designing an AI project into six sections.

User Needs + Defining Success

Where's the intersection of what AI is capable of and what the people using your service require?
Should you automate (remove a painful task) or augment (improve) with AI?
What's the ideal outcome?

Data Collection + Evaluation

Turn a person's requirements into data requirements (it all starts with the data)
Where does your data come from? (is it responsibly sourced?)
Build, fit and tune your model (good models start with good data)

Mental Models (setting expectations)

What does a person believe your ML system can achieve?

Explainability + Trust

AI systems are probability-based (and may give strange results). How can this be explained?
What information should a person know about how an ML model made a decision? (confidence levels, "we're showing you this because you liked that...")

Feedback + Control

How can a person give feedback to help your system improve?

Errors + Graceful Failure

What is an "error" and what is a "failure"? (a self-driving car stopping at a green light could be an error but running a red light could be a failure)
ML systems aren't perfect and your system will eventually fail. What do you do when it does?
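To make the confidence and graceful failure questions above a little more concrete, here's a minimal sketch of one way to handle them. Everything in it is an assumption for illustration: the `Prediction` class, the `present` function, the 0.8 threshold and the wording of the messages are hypothetical and don't come from the PAIR guidebook.

```python
from dataclasses import dataclass

# Hypothetical cut-off; a real value would come from evaluating your own model.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    label: str
    confidence: float  # model's estimated probability for `label`, between 0 and 1


def present(prediction: Prediction, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Turn a raw prediction into a message for the person using the app."""
    if prediction.confidence >= threshold:
        # Confident enough: show the result along with how sure the model is.
        return f"This looks like {prediction.label} ({prediction.confidence:.0%} confident)."
    # Graceful failure: admit uncertainty and hand control back to the person.
    return "We're not sure what this is. Could you tell us?"


print(present(Prediction("a golden retriever", 0.93)))
# This looks like a golden retriever (93% confident).
print(present(Prediction("a muffin", 0.41)))
# We're not sure what this is. Could you tell us?
```

The low-confidence branch doubles as a feedback opportunity: asking the person for the right answer is one way to collect explicit feedback when the model is unsure.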

Each section comes with a worksheet to practice what you've learned.

A trend you'll notice after going through the guidelines (especially PAIR) is setting expectations. Being upfront with what your system is capable of. If a person expects your system to be magic (as ML is often portrayed) but isn't aware of its limitations, they may be let down.

Microsoft's design guidelines for Human-AI interaction

Microsoft's design guidelines for Human-AI interaction tackle the problem in four stages:

Initially (what should a person know when they first use your system?)
During interaction (what should happen whilst a person is using your service?)
When wrong (what happens when your system is wrong?)
Over time (how does your system improve over time?)

I noticed Microsoft's guidelines take you on a walk in the shoes of a person using your ML system.

Again we see a trend.

Problem → Create solution (ML or not) → Set expectations → Allow feedback → Have a mechanism for when it's wrong → Improve over time (go back to the start).

Microsoft's guidelines for Human-AI interaction cards, starting with initial stages through to what to do as a person interacts with your machine learning system over time.

Source: https://www.microsoft.com/en-us/research/project/guidelines-for-human-ai-interaction/

Facebook’s Field Guide to Machine Learning

While the previous resources have taken the approach of an overall ML system, Facebook's Field Guide to Machine Learning focuses more on the modelling side of things.

Their video series breaks a machine learning modelling project into six parts:

Problem definition — what problem are you trying to solve?
Data — what data do you have?
Evaluation — what defines success?
Features — what features of the data best align with your measure of success?
Model — what model best suits the problem and data you have?
Experimentation — how can you iterate and improve upon the previous steps?

But as the modelling side of things in machine learning gets more accessible (thanks to pretrained models, existing codebases, etc), it's important to keep in mind all of the other parts of machine learning.

Figure: 6 step field guide to machine learning projects flowchart.

I used Facebook's Field Guide to Machine Learning as the outline of the Zero To Mastery Data Science and Machine Learning Course. You can also read an expanded version of these steps on my blog.

Spotify’s 3 Principles for Designing ML-Powered Products

How do you build a service which provides music to 250 million users across the world?

You start by going manual before you go magic (principle 3) and you continually ask the right questions (principle 2) to identify where the people using your service are facing friction (principle 1).

The sentence above is a play on words of Spotify's three principles for designing machine learning-powered products.

Principle 1: Identify friction and automate it away

Anywhere a person struggles in pursuit of their goals whilst using your service can be considered friction.

Imagine a person searching for new music on Spotify but unable to find anything which suits their tastes. That kind of friction could hurt their experience.

Spotify realized this and used machine learning-based recommendation systems to create Discover Weekly (what I'm currently listening to), a playlist which refreshes with new music every week.

And in my case, it looks like they must've adhered to their other two principles whilst building it because these tracks I'm listening to are bangers.

Principle 2: Ask the right questions

Ask. Ask. Ask. If you don't know, you could end up designing a product in the wrong direction.

Many of the other guideline steps above challenge you to think from the point of view of the person using your service. Asking the right questions has the same goal: find out what issues your customers are having and see if you can solve them using machine learning.

Principle 3: Go manual before you go magical

Found a source of friction?

Can you solve it without machine learning?

How about starting with a heuristic (an idea of how things should work)?

Like if you were Spotify and trying to build a playlist of new music someone was interested in, how do you classify something as new?

Your starting heuristic could be: anything older than 30 days isn't classified as new.
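As a minimal sketch, the 30-day heuristic could be written as a few lines of plain Python, with no machine learning in sight. The function names, the `(title, release_date)` track format and the example tracks are all made up for illustration; this isn't how Spotify does it.

```python
from datetime import date, timedelta

# Threshold taken from the "30 days" example in the text (an assumption, tune as needed).
MAX_AGE_DAYS = 30


def is_new(release_date: date, today: date, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """Return True if the track was released within the last `max_age_days` days."""
    age = today - release_date
    # Anything older than the threshold (or dated in the future) isn't "new".
    return timedelta(0) <= age <= timedelta(days=max_age_days)


def new_tracks(tracks, today):
    """Filter (title, release_date) pairs down to the ones the heuristic calls new."""
    return [title for title, released in tracks if is_new(released, today)]


today = date(2021, 4, 30)
tracks = [
    ("Track A", date(2021, 4, 20)),  # 10 days old
    ("Track B", date(2021, 3, 1)),   # 60 days old
    ("Track C", date(2021, 3, 31)),  # exactly 30 days old
]
print(new_tracks(tracks, today))
# ['Track A', 'Track C']
```

Because the threshold is a single parameter, trying a different heuristic (say 14 or 60 days) is a one-line change, which makes the manual experimentation stage cheap.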

After testing multiple heuristics and hypotheses (a manual process) you could then again review whether or not machine learning could help. And because of your experiments, you'd be doing so from a very well-informed point of view.

From Big Data to Good Data by Andrew Ng

Andrew Ng presented a talk at Scale's recent conference on the movement of ML systems from big data to good data. And Roboflow did a great summary of the main points, all of which speak to the things we've discussed above.

Some of my favourites include:

Getting to deployment is a starting point rather than the finish line (closing the proof of concept and production gap)
From big data to good data (MLOps' most important task is ensuring high-quality data in all phases of the ML project lifecycle)
Freeze your codebase and iterate on your data (for many problems the model is a solved problem, the data is what's needed)

Andrew Ng on the importance of thinking about good data as well as big data.

Source: https://scale.com/events/transform/videos/big-data-to-good-data

Bonuses

The above are all guidelines on how to think about building ML-powered systems. But they don't necessarily show you the tools or how to go about doing so.

The following are extra resources I'd recommend for filling the gaps left by the above.

Choose one and read through/work through all the materials/labs whilst building your own ML-powered project.

Engineering best practices for machine learning (Software Engineering 4 Machine Learning) — a thorough guide on developing software systems with machine learning components.
Machine Learning Engineering Book by Andriy Burkov — a one stop shop for many of the guidelines and steps discussed above, I have this book on my desk and use it as a reference.
CS329s: Machine Learning System Design — an entire Stanford course covering all of the steps that go into designing a machine learning-powered system. Led by Chip Huyen with guest lectures (including one from yours truly) by engineers from many different machine learning companies.
Full Stack Deep Learning — machine learning doesn't stop once a model is built (and after reading the above, you know the model is a small part of the entire system). Full Stack Deep Learning introduces many of the steps around model building such as data storage, data manipulation, data versioning (notice the emphasis on data), model deployment as well as different tools for implementing them.
Made with ML MLOps curriculum — MLOps = machine learning operations. Made with ML MLOps is made by Goku Mohandas in an apprenticeship style, "here's how I would build an ML-powered service and how you can too".

See you next month!

What a massive month for the ML world in April!

As always, let me know if there's anything you think should be included in a future post. Liked something here? Tell a friend!

In the meantime, keep learning, keep creating, keep dancing.

See you next month,

Daniel

www.mrdbourke.com | YouTube

PS. You can also see video versions of these articles on my YouTube channel (usually a few days after the article goes live).

By the way, I'm a full time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a couple of our courses below or see all Zero To Mastery courses by visiting the courses page.