Machine Learning Monthly 💻🤖

Daniel Bourke

14th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.

Hey everyone, Daniel here, I'm 50% of the instructors behind the Complete Machine Learning and Data Science: Zero to Mastery course. I also write regularly about machine learning on my own blog and make videos on the topic on YouTube.

Welcome to the 14th edition of Machine Learning Monthly. A 500ish (+/-1000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.

Since there's a lot going on, the utmost care has been taken to keep things to the point.

What you missed in February as a Machine Learning Engineer…

My work 👇

The Zero To Mastery TensorFlow for Deep Learning course has launched!

During the COVID lockdowns of 2020, I decided to brush up my TensorFlow skills by getting TensorFlow Developer Certified. I made a video about how I did it (in a month or so) and afterwards had a bunch of questions from others asking how they could too.

The video talked about different resources I used but I wanted to go a step further.

So to answer the question of “I want to learn TensorFlow, where do I start?” I created the Zero to Mastery TensorFlow for Deep Learning course.

It teaches 3 things:

  1. The fundamentals of deep learning (a machine learning paradigm taking the world by storm)
  2. The fundamentals of TensorFlow (a framework used to write deep learning algorithms, e.g. Neural Networks)
  3. How to pass the TensorFlow Developer Certification (this is optional but requires 1 & 2)

The entire course is code-first, which means I use code to explain different concepts and link external non-code-first resources for those who want to learn more.

I’m so pumped for this release, it’s the biggest thing I’ve ever worked on! I’d love if you checked it out.

A Machine Learning Deployment Tutorial for 20% Software Engineers for Stanford’s CS329s

I started learning machine learning in 2017 using Stanford lectures and tutorials. So when Chip Huyen (featured below) asked if I’d like to create and deliver one myself, I said yes.

My goal was to break down the barrier for machine learning model deployment by highlighting:

  • Model deployment is like model building (except instead of stacking together layers, you stack together cloud services)
  • The holy grail of ML-powered applications: the data flywheel (introduced in January 2021’s ML monthly)
  • What do you do when (not if) your model fails?

To cover this, during the tutorial we deploy a TensorFlow image classification model (spoiler: one of the models we build in the TensorFlow for Deep Learning course!) to Google Cloud and use it to power Food Vision (a Streamlit app which classifies different photos of food). Watch below and head here to see the code & slides.

From the ML community 🙋‍♀️

Using Deep Learning to Read Brainwaves

Imagine being able to put on a helmet and, instead of typing on a keyboard, all you do is think about what you’d like to type. The sensors in the helmet pick up what you’re thinking and translate it to the screen. Well, that’s what Muhtasham and team are working towards. Their latest paper takes electroencephalogram (EEG) signals from a brain-computer interface (BCI) and passes them through a deep neural network to try and predict what key you’re thinking of pressing on a keyboard.

Why is this important?

I take being able to type for granted now. I learned when I was young playing online video games. But what if you couldn’t type? What if you didn’t have the ability to use your limbs or speak? How would you communicate in the digital world? Here’s where BCIs come in. Instead of you physically performing an action, a BCI reads the electric signals in your brain and translates them to the physical world. This is one of the avenues Neuralink is heading down.

Sensational work Muhtasham and team. For those interested in how these systems work, read A Neural Network for SSVEP-based Brain-Computer Interfaces.

Álvaro's guide to TensorFlow Serving

Last time it was how to turn your PyTorch models into an API, this time it’s how to turn your TensorFlow models into an API using TensorFlow Serving.

In the machine learning model deployment tutorial above I used Google’s AI Platform to deploy a trained TensorFlow model. AI Platform runs TensorFlow Serving under the hood. So if you want to see how you could set up your own TensorFlow model API (so you can send data to it and receive predictions back), check out Álvaro's TensorFlow Serving tutorial on GitHub.
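To give a feel for what "send data to it and receive predictions back" looks like, here's a minimal sketch of a client for TensorFlow Serving's REST predict endpoint. It assumes a server is already running on the default REST port (8501); the model name `food_vision` and the helper names `build_predict_request` and `predict` are made up for illustration and aren't taken from Álvaro's tutorial.

```python
import json
from urllib import request


def build_predict_request(instances, model_name="food_vision", host="localhost", port=8501):
    """Build (url, body) for TensorFlow Serving's REST predict endpoint.

    The model name, host and port are assumptions; use the values your server was started with.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return url, body


def predict(instances, **kwargs):
    """Send the request and return the "predictions" list from the JSON response."""
    url, body = build_predict_request(instances, **kwargs)
    req = request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with request.urlopen(req) as response:  # needs a running TensorFlow Serving server
        return json.loads(response.read())["predictions"]


# Building the request works without a server; calling predict() needs one running.
url, body = build_predict_request([[0.1, 0.2, 0.3]])
print(url)
```

Each item in `instances` is one input to the model, so the shape has to match whatever the served model expects (for an image model, that would be a nested list of pixel values rather than three numbers).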

An aside, I really love this format of sharing your work. Right within GitHub. A great way to demonstrate skill on a certain topic, code and word explanations interlaced together.

From the interwebs 🕸

Chip Huyen’s outstanding machine learning blog (2 recent posts)

I’m a big fan of Chip’s work. Last year we saw her post on the MLOps (all the parts of a machine learning system around the model) tooling landscape. Now there’s a version 2 and an essay on machine learning going real-time.

Post 1: MLOps Tooling Landscape v2

The takeaway from this post is that the ML tooling landscape is growing (and fast).

Why?

Well it seems many of the development roadblocks for ML systems have been worked out and now solutions for those common roadblocks are being built.

Such as Weights & Biases for tracking your machine learning experiments or full-blown enterprise AI solutions like DataRobot.

Post 2: ML is going real-time

In this post Chip breaks down ML going real-time into online predictions and online learning and emphasises the importance of speed in machine learning.

The simple analogy goes like this: if your app is slow, people won’t use it.

For online predictions (predictions made while someone is using your ML system), if they’re slow, people will get tired of them. For some systems, such as self-driving cars, fast online predictions are required. The classic trade off here is how much performance do you sacrifice for speed (e.g. smaller models are faster but don’t score as well).

For online learning (how quickly does your ML system update itself?), if you’re getting new data every hour but only updating your models every week, is your service performing as well as it should? This is a tough problem because traditionally ML is taught with static data (train, validation, test sets), but when you get to large scale, you’ve got data (called streaming data) coming in all the time. So what exactly should you train and evaluate on?

Take the example of a recommendation system such as Netflix’s. A new series comes out and everyone is watching it, so it gets recommended to you. You try it out and don’t like it. But if the system is only updated once a week, the new series stays on your homepage for a whole week (it doesn’t take into account that you didn’t like it). Is that a good experience?

How should one do online learning?

This is still being worked out. But Chip’s post talks more about the pain points and possible solutions.
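One common answer to "what should you train and evaluate on?" with streaming data is test-then-train (also called prequential) evaluation: predict on each new sample before the model has seen it, record the error, then learn from that same sample. The sketch below is an illustration of that idea, not something taken from Chip's post. It uses a tiny one-feature linear model and a made-up stream so it runs with nothing but the standard library.

```python
def prequential_evaluation(stream, learning_rate=0.1):
    """Test-then-train: predict on each new sample first, then learn from it."""
    weight, bias = 0.0, 0.0
    absolute_errors = []
    for x, y in stream:
        prediction = weight * x + bias        # 1. evaluate on data the model hasn't seen yet
        error = prediction - y
        absolute_errors.append(abs(error))
        weight -= learning_rate * error * x   # 2. then update the model with the same sample
        bias -= learning_rate * error
    return weight, bias, absolute_errors


# Made-up stream that follows y = 2x + 1
stream = [(x / 10, 2 * (x / 10) + 1) for x in range(10)] * 200
weight, bias, errors = prequential_evaluation(stream)
early = sum(errors[:10]) / 10
late = sum(errors[-10:]) / 10
print(f"mean abs error, first 10 samples: {early:.3f}, last 10 samples: {late:.3f}")
```

Because every sample is scored before it's used for training, the running error doubles as a live monitor of how the model is doing on fresh data.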

Use GitHub for your Machine Learning Operations (MLOps)

If your code is on GitHub, why not use it to do all of your MLOps (model evaluation, model retraining, continuous integration, continuous deployment) as well?

The GitHub for MLOps page is a collection of blog posts, examples, talks and tools (e.g. GitHub Actions) showcasing how you can facilitate MLOps with GitHub.

Why is this important?

Find yourself running a series of similar experiments over and over?

Automate, automate, automate.

Use GitHub to watch your codebase for updates and then every time you commit a change, run a series of experiments behind the scenes (such as testing your models and then posting the results in a GitHub issue for you).
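As a rough sketch of the kind of script such an automated workflow could run on every commit: compare a freshly evaluated model against a baseline and produce a Markdown table that could be posted to an issue. The function name `build_report`, the metric names and the numbers are all made up for illustration; the workflow configuration itself isn't shown here.

```python
def build_report(baseline, candidate, tolerance=0.01):
    """Compare candidate metrics to a baseline; return (passed, markdown_report)."""
    lines = ["| metric | baseline | candidate | status |", "|---|---|---|---|"]
    passed = True
    for name, base_value in baseline.items():
        new_value = candidate[name]
        ok = new_value >= base_value - tolerance  # assumes higher is better for every metric
        passed = passed and ok
        status = "ok" if ok else "regressed"
        lines.append(f"| {name} | {base_value:.3f} | {new_value:.3f} | {status} |")
    return passed, "\n".join(lines)


# Made-up numbers; a real pipeline would load these from an evaluation run.
baseline = {"accuracy": 0.91, "f1": 0.88}
candidate = {"accuracy": 0.92, "f1": 0.85}
passed, report = build_report(baseline, candidate)
print(report)  # the text a workflow could post to an issue
# In CI, finishing with sys.exit(0 if passed else 1) would fail the job on a regression.
```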

Upgrade your software engineering skills as a machine learning practitioner

If you’re like me and learned to write Python code in Jupyter Notebooks, you probably thought “wtf???” the first time you heard someone say things like: “just put it into a Python script”, “run this bash command”, “put an API on top of your model and deploy it” or “wrap your app in a Docker container”.

LJ Miranda’s post on How to improve software engineering skills as a (data science) researcher uses the narrative of creating a machine learning service (an app which uses machine learning) and then explains each of the above and how they fit together.

Why is this important?

If you want to get your machine learning models into the hands of others, software engineering skills like the ones LJ discusses are crucial.

freeCodeCamp's (upcoming) Data Science & Machine Learning curriculum

Coming soon from one of the best resources to learn to code online: a full-blown data science & machine learning curriculum.

freeCodeCamp started out teaching web development for free in the browser.

Now they’re gearing up to do the same with data science and machine learning.

From foundational mathematics (linear algebra, matrix algebra, calculus) to EDA (exploratory data analysis), to supervised and unsupervised algorithms, to tools like SQL, NumPy, SciPy, pandas, TensorFlow and Scikit-Learn, when it’s done, this’ll be an outstanding resource.

All for free!

Why is this important?

Educational resources are already abundant. There’s a shortage of willingness to learn.

But what helps grow that willingness to learn is world-class educators who inspire others to create their own path. And freeCodeCamp is full of world-class educators.

Read more about freeCodeCamp’s data science and machine learning curriculum creation and support their efforts in the blog post.

Learn how to build the machine learning model taking NLP (and now computer vision) by storm

Whenever I want to learn something really deeply, I read multiple resources, I read the white paper, I try to build it myself, I fail a dozen times, then I try to explain it to someone else and finally, if I haven’t given up, maybe by the end I’m able to recreate it myself.

If I can’t recreate something, I don’t understand it.

Peter Bloem’s Transformers from scratch post explains all of the most important concepts (self-attention, multi-headed attention) behind the deep learning architecture popping up everywhere with incredible results.

The post goes through the example of building each component of a Transformer model from scratch with PyTorch and then putting them together for a text generation and text classification problem.
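To give a flavour of the core idea, here's a minimal sketch of self-attention in its simplest, parameter-free form: each output vector is a weighted average of all the input vectors, with the weights coming from a softmax over dot products. This is a plain-Python simplification for illustration, not the PyTorch code from the post; a full Transformer adds learned query/key/value projections, scaling and multiple heads on top. The three input vectors are made up.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Basic self-attention: each output is a weighted average of all inputs."""
    outputs = []
    for query in vectors:
        # raw scores: dot product of this vector with every vector in the sequence
        scores = [sum(q * k for q, k in zip(query, key)) for key in vectors]
        weights = softmax(scores)  # positive weights that sum to 1
        dim = len(query)
        outputs.append([sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)])
    return outputs


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three made-up 2-d "token" vectors
for row in self_attention(sequence):
    print([round(value, 3) for value in row])
```

Note that nothing here depends on the order of the vectors, which is why Transformer models need some extra way of encoding position.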

I read it through end-to-end on the weekend. Next, I’ll go back through and rewrite all the code myself.

Estimating training data influence using TracIn

A rule of thumb for machine learning is: more data = better model.

But what if you didn’t have more data? Or getting more was hard so you’d like to figure out if you should be spending your time on a certain kind of data versus something else?

TracIn (Tracing Gradient Descent) may be able to help.

TracIn helps to figure out how much each of your training samples influences your model.

For example, if you’re training a classifier to classify photos of a red car, do photos of red trucks help (proponent) or hinder (opponent) your model?


Example of using TracIn to see which images influenced a model deciding what a chameleon looks like. Proponent (helpful) images on the left, opponent (not so helpful) images on the right. Source: Google AI Blog.
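The checkpoint-based form of the idea can be sketched in a few lines: for each saved checkpoint, take the dot product of the loss gradient at a training example with the loss gradient at a test example, scale it by the learning rate, and add it all up. A positive total suggests a proponent, a negative total an opponent. The sketch below assumes a tiny linear model with squared loss so the gradient can be written by hand; the checkpoints, learning rates and examples are all made up, and a real setup would take gradients from your actual network.

```python
def loss_gradient(weights, features, target):
    """Gradient of 0.5 * (w.x - y)^2 with respect to the weights."""
    error = sum(w * x for w, x in zip(weights, features)) - target
    return [error * x for x in features]


def tracin_influence(checkpoints, learning_rates, train_example, test_example):
    """Sum over checkpoints of lr * (train gradient . test gradient)."""
    influence = 0.0
    for weights, lr in zip(checkpoints, learning_rates):
        train_grad = loss_gradient(weights, *train_example)
        test_grad = loss_gradient(weights, *test_example)
        influence += lr * sum(a * b for a, b in zip(train_grad, test_grad))
    return influence


# Made-up checkpoints (weights saved during training); examples are (features, target)
checkpoints = [[0.0, 0.0], [0.5, 0.2], [0.8, 0.4]]
learning_rates = [0.1, 0.1, 0.1]
test_example = ([1.0, 1.0], 2.0)
similar_train = ([1.0, 0.9], 2.0)       # agrees with the test example
conflicting_train = ([1.0, 1.0], -2.0)  # same features, opposite target

similar_score = tracin_influence(checkpoints, learning_rates, similar_train, test_example)
conflicting_score = tracin_influence(checkpoints, learning_rates, conflicting_train, test_example)
print(f"similar example: {similar_score:.3f}, conflicting example: {conflicting_score:.3f}")
```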

Why is this important?

Knowing what kind of samples help and hinder your model can guide your future data collection and modelling efforts.

The beauty of TracIn is in its generality. It can be applied to any Stochastic Gradient Descent-like optimisation algorithm (yes, it’ll work with Adam). That means you could likely incorporate it into your current models.

Codecademy’s Statistics with Python YouTube series

One of the most common questions I get is: “What statistics should I know when starting out data science?”

I struggle for an answer because it depends on the problem. And since I’ve been writing code for a while now, many of the things I do have become automatic.

But thankfully, Codecademy have provided an answer.

From visualisations to associations between variables to setting up binomial tests (comparing two possible outcomes), the series is taught completely in Python code and a great introduction to many of the most fundamental statistical concepts.
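As a taste of one of those concepts, here's a minimal sketch of a two-sided exact binomial test written with only the standard library. It isn't code from the series, and the numbers (16 successes out of 20 trials, tested against a 50% expected rate) are made up. In practice you'd likely reach for SciPy's `scipy.stats.binomtest` rather than rolling your own.

```python
from math import comb


def binomial_test(successes, trials, expected_rate=0.5):
    """Two-sided exact binomial test: chance of a result at least this unlikely."""
    def probability(k):
        return comb(trials, k) * expected_rate**k * (1 - expected_rate) ** (trials - k)

    observed = probability(successes)
    # add up every outcome that is no more likely than the observed one
    # (the small tolerance guards against floating point ties)
    return sum(
        probability(k)
        for k in range(trials + 1)
        if probability(k) <= observed * (1 + 1e-7)
    )


p_value = binomial_test(successes=16, trials=20)
print(f"p-value: {p_value:.4f}")
```

A small p-value says the observed split would be surprising if the expected rate were true, which is the usual cue to question that assumption.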

The most important statistical ideas of the last 50 years

Speaking of statistics, if the above resource is practical and hands-on with Python code, this paper talks about how the world of statistics has changed in the past few decades.

Specifically:

  • Counterfactual causal inference (what would happen if...?)
  • Bootstrapping and simulation-based inference (using lots of random simulations in an attempt to recreate reality)
  • Overparameterised models and regularisation (build a very large model to overfit on data, then regularise it to prevent further overfitting)
  • Multilevel models (models which adapt to a range of different data sources)
  • Generic computation algorithms (reusing existing compute algorithms and separating them from model creation)
  • Adaptive decision analysis (designing experiments such as A/B testing and using online learning for better decision making)
  • Robust inference (models which can still be used even if some of their inputs aren’t true)
  • Exploratory data analysis (what do the patterns in your data look like?)

The paper explores each of these in further detail and discusses how they influence each other, as well as how they’re influenced by things like advances in compute power and the ever-increasing amount of data available.
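Bootstrapping, from the list above, is easy to try for yourself. Here's a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the standard library. The sample values, the number of resamples and the helper name `bootstrap_mean_interval` are made up for illustration; it's one simple variant, not the paper's method.

```python
import random
import statistics


def bootstrap_mean_interval(data, n_resamples=2000, confidence=0.95, seed=42):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


sample = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2, 2.7, 2.6]  # made-up measurements
low, high = bootstrap_mean_interval(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}, 95% interval: ({low:.2f}, {high:.2f})")
```

The appeal is that the same resampling loop works for almost any statistic (median, correlation, model accuracy) by swapping out the function applied to each resample.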


See you next month!

What a massive month for the ML world in February!

The trend of MLOps exploding and transformers taking over continues.

As always, let me know if there's anything you think should be included in a future post. Liked something here? Tell a friend!

In the meantime, keep learning, keep creating, keep dancing.

See you next month,

Daniel www.mrdbourke.com | YouTube

PS. You can see video versions of these articles on my YouTube channel (usually a few days after the article goes live). Watch previous month's here.

By the way, I'm a full time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see all Zero To Mastery courses by visiting the courses page.
