Machine Learning Monthly Newsletter đŸ’»đŸ€–

Daniel Bourke

35th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.

Hey everyone!

Daniel here, I’m a machine learning engineer who teaches beginner-friendly machine learning courses with Zero To Mastery.

I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.

Since there's a lot going on, I've done my best to keep things short and to the point.

Enough about me!

You're here for this month's Machine Learning Monthly Newsletter. Typically a 500ish (+/-1,000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.

What you missed in November as a Machine Learning Engineer


My work 👇

  • ZTM PyTorch Cheatsheet — To go along with the new Zero to Mastery PyTorch Course is a quick browsable cheatsheet with many of the most used PyTorch methods! There's also a downloadable PDF version.
  • Charlie Walks paperback book release — My first novel (written by hand without AI 😛) is now available in paperback form worldwide! You can find all the details, including a fun launch video, at charliewalks.com.

From the Internet đŸ’»

1. ChatGPT is OpenAI’s latest large language model (and the results are wild)

OpenAI just released a new version of their GPT-3 model called ChatGPT with a focus on dialogue.

There’s no point trying to describe what’s going on with it, better to try it out for yourself.

All I can say is that it’s by far the best chatbot I’ve ever used.

Goes to show that amazing AI applications are on the horizon.

It’s a design and product problem now.

The models are there and they're good enough; now it's about what you build with them.

Check out this example response when I asked ChatGPT to create an upper body workout for me in the form of a folklore story:

ChatGPT's response: an upper body workout written as a folklore story.

2. OpenAI’s GPT-3 gets better with text-davinci-003

Two OpenAI releases within a couple of days of each other!

As if ChatGPT wasn’t enough, there’s now a new model powering the original GPT-3 API, text-davinci-003.

The new model was trained using reinforcement learning to improve its outputs, and OpenAI claims several improvements over the previous text-davinci-002 model.

For more, see Scale's blog post comparing the two models across several different problems.
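
If you're already calling GPT-3 through the official openai Python package, trying the new model is just a case of changing the model name. Here's a minimal sketch (the prompt and parameters are illustrative):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # replace with your own key

response = openai.Completion.create(
    model="text-davinci-003",  # the new model; previously "text-davinci-002"
    prompt="Explain reinforcement learning in one sentence.",
    max_tokens=100,
    temperature=0.7,
)

# The completion text lives in the first choice of the response
print(response["choices"][0]["text"].strip())
```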

3. Modal is an epic new way to run your code in the cloud with a couple of lines of code

If you’ve ever tried to learn how to use cloud resources, you know there’s a pretty steep learning curve.

Modal changes this.

Built by Erik Bernhardsson (whose work has featured in plenty of previous ML Monthly issues and who designed the recommendation engine at Spotify) and his team, Modal is setting the standard for future cloud development.

Check out their docs for a series of fantastic examples of how to train, deploy, and use machine learning models on the Modal cloud in minutes.
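
As a taste of how little code is involved, here's a minimal hello-world style sketch based on Modal's examples at the time of writing. Treat the exact names (modal.Stub, @stub.function, .call()) as assumptions to verify against their docs, since the API is evolving quickly:

```python
import modal

stub = modal.Stub("example-square")  # a Stub groups the functions that make up your app

@stub.function()
def square(x: int) -> int:
    # This body runs in a container in Modal's cloud, not on your machine
    return x ** 2

if __name__ == "__main__":
    with stub.run():
        print(square.call(42))  # computed remotely, result returned locally
```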

4. Better language models with less compute by Google AI

Google AI introduces two new papers on how to train language models with significantly less compute.

  • UL2R training combines several different pretraining tasks into one to get the benefits of each, enabling training of stronger models with fewer parameters (fewer parameters often mean less compute)
  • Instruction fine-tuning fine-tunes a large language model on a collection of NLP tasks phrased as instructions. This instruction-style fine-tuning uses much less compute than the original pretraining (around 0.2% of it)


Example of the instruction fine-tuning task. Taking a series of existing NLP tasks and using them as instructions to prepare a model for future use on similar (but unseen) tasks. Source: Google AI Blog.

It looks like the weights of some of Google's (smaller) versions of these models are available to use on GitHub as well as on the Hugging Face model hub under the name “FLAN” (Fine-tuned LAnguage Net).
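
If you want to try one of the released checkpoints, they load like any other Hugging Face model via the transformers library. A minimal sketch using the flan-t5-base checkpoint (the prompt is illustrative):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "google/flan-t5-base"  # one of the publicly released FLAN-T5 checkpoints
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Because the model was instruction fine-tuned, you can phrase the task as an instruction
inputs = tokenizer("Answer the question: what colour is the sky on a clear day?",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```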

However, it seems Google may have missed the boat a little with making their models available to people to use.

Depending on what they’re doing internally with them (Google’s language models are no doubt powering several of their search services), it seems OpenAI has the leg up in providing usable language models (even if they perform worse than Google’s on paper).

It goes to show having a good product that’s available to use is better than having a great product that isn’t available.

5. How Airbnb builds machine learning-powered features

Airbnb is one of my favourite apps.

I used it all throughout a recent Europe trip.

So I love seeing how they use one of my other loves, machine learning, to power their services.

Two new articles this month from their tech blog:

  1. Building Airbnb Categories with ML and Human-in-the-Loop
  2. How AI Text Generation Models are Reshaping Customer Support at Airbnb

6. Getting started with PyTorch Image Models (timm): A Practitioner’s Guide

Ever since the release of the Zero to Mastery PyTorch course, I’ve been learning more and more about different libraries in the PyTorch ecosystem.

The PyTorch Image Models library (timm for short) is one of the most popular and useful libraries in the ecosystem, and it's frequently cited in research papers.

I’ve been using timm to build new models for my AI project called Nutrify (take a photo of food and learn about it).

And the Practitioner’s Guide by Chris Hughes offers a fantastic walkthrough of many of the features.

I’ve even printed out the timm train.py script to study it and improve my own scripts.
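
If you haven't used timm before, the core workflow is only a few lines: list the available pretrained architectures, then create one with a classifier head sized for your problem. A minimal sketch (the architecture name and class count are illustrative):

```python
import timm
import torch

# Browse available pretrained architectures matching a pattern
print(timm.list_models("convnext*", pretrained=True)[:5])

# Create a pretrained model and replace the classifier head for a 3-class problem
model = timm.create_model("convnext_tiny", pretrained=True, num_classes=3)

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed food photo
logits = model(image)
print(logits.shape)  # torch.Size([1, 3])
```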

7. Eugene Yan’s guide on text-to-image models

Text-to-image models have exploded over the last several months.

On the surface, we put in a line of text, such as “a flamingo performing a bench press in a weightlifting gym”, and we get an image back.


Using OpenAI’s DALL·E 2 with the prompt “a flamingo performing a bench press in a weightlifting gym”.

But what happens behind the scenes?

Eugene writes that the modern image generation models use a combination of four techniques:

  • Diffusion: Gradually add noise to data and then learn to generate data from noise (see the sketch after this list)
  • Text conditioning: Generating images given (i.e., conditioned on) a text prompt
  • Classifier guidance: Using classifier gradients to increase text-image alignment
  • Latent space: Applying diffusion on image embeddings instead of image pixels
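
To make the diffusion step a little more concrete, here's a minimal PyTorch sketch of the forward (noise-adding) process; the schedule values are illustrative, and in a real system a model is then trained to predict the added noise so the process can be reversed at generation time:

```python
import torch

def forward_diffusion(x0, t, betas):
    """Noise a clean image x0 up to timestep t (DDPM-style forward process)."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)[t]  # how much of the original signal remains at step t
    noise = torch.randn_like(x0)                 # Gaussian noise
    noisy = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise
    return noisy, noise                          # a model learns to predict `noise` from `noisy`

betas = torch.linspace(1e-4, 0.02, 1000)  # linear noise schedule over 1,000 timesteps
x0 = torch.rand(1, 3, 64, 64)             # stand-in for a training image (values in [0, 1])
xt, eps = forward_diffusion(x0, t=500, betas=betas)
print(xt.shape)  # torch.Size([1, 3, 64, 64]) — part signal, part noise at t=500
```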

8. Data Stack for Machine Learning by Made with ML

The incredible Goku Mohandas has just updated the amazing Made with ML website with the heart of machine learning: the data stack.

Alongside several other new lessons, such as machine learning orchestration (putting together several pieces of a machine learning pipeline) and machine learning testing, the data stack lesson explains what kinds of data storage types go into a machine learning problem:

  • Data lake — stores large amounts of raw objects (e.g. images, videos, tables, almost any kind of data)
  • Database — stores relational (rows and columns, like Excel or Google Sheets or SQL) or non-relational (key/value, graph, NoSQL) data
  • Data warehouse — stores data in a format ready to be analyzed later (this system is optimized for performing operations across columns rather than on specific rows)

Knowing the different parts of the data stack and how they interact with each other is a fantastic way to level up your knowledge as a data engineer and, in turn, a machine learning engineer.

9. The Near Future of AI is Action-Driven by John McDonnell

John McDonnell writes about where the future of AI is headed with all of the latest releases of language models.

In 2022 language models got good, really good.

Now in 2023 and onwards the trick will be combining them in some way to take actions in the real world and then using the results from those actions to update their steps.


Using large language models (LLMs) to start providing a service and then updating (fine-tuning) the model to provide an even better service. Source: John McDonnell Substack.

10. PyTorch Releases a Multimodal Domain Library

TorchMultimodal Beta is out!

With all the talk about vision and language models, you can now use them directly through a dedicated PyTorch domain library.

There are already a bunch of pretrained models built in, such as ALBEF for visual question answering (asking a question about an image and getting a text answer) and MDETR (detecting an object class described in natural language, e.g. “pelican”, even though you don't have any labelled examples of pelicans).

Extras

See you next month!

What a massive month for the ML world in November!

As always, let me know if there's anything you think should be included in a future post.

In the meantime, keep learning, keep creating, keep dancing.

See you next month,
Daniel

www.mrdbourke.com | YouTube

By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.

More from Zero To Mastery

ZTM Career Paths: Your Roadmap to a Successful Career in Tech

Whether you’re a beginner or an experienced professional, figuring out the right next step in your career or changing careers altogether can be overwhelming. We created ZTM Career Paths to give you a clear step-by-step roadmap to a successful career.

Top 7 Soft Skills For Developers & How To Learn Them

Your technical skills will get you the interview. But soft skills will get you the job and advance your career. These are the top 7 soft skills all developers and technical people should learn and continue to work on.

Python Monthly Newsletter đŸ’»đŸ

36th issue of Andrei Neagoie's must-read monthly Python Newsletter: Python 3.11 is crazy fast, PyScript Chrome extensions, Google automations, and more. Read the full newsletter to get up-to-date with everything you need to know from last month.