38th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Hey there, Daniel here.
I'm a Machine Learning Engineer who also teaches the following beginner-friendly machine learning courses:
I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.
Since there's a lot going on, the utmost care has been taken to keep things to the point.
Enough about me! You're here for this month's Machine Learning Monthly Newsletter.
Typically a 500ish (+/-1,000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.
[Blog post] The Top 4 Reasons to Learn PyTorch
PyTorch is by far the most used deep learning framework in machine learning research and is used by companies such as OpenAI to build models like ChatGPT.
These are just two of the best reasons to learn the most fire framework out there! Read on to see how you can get into AI by learning PyTorch, especially through the ZTM PyTorch course ;).
A handful of large technology companies who use PyTorch to power their AI and machine learning workflows: OpenAI, Meta, Tesla and Amazon.
Saket Munda turned the FoodVision project from the ZTM TensorFlow course into a real working app!
Saket took the EfficientNetB0 computer vision model and deployed it to a web application called Wikifoodia.
The web application, deployed with Vercel, allows you to upload an image and the model will classify it into one of 101 different types of food.
It even worked on my lunch!
The Wikifoodia app built by Saket Munda using the FoodVision model from the ZTM TensorFlow course. Source: Wikifoodia app with my own image of a (delicious) omelette.
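If you'd like to try something similar, here's a minimal sketch (not Saket's exact code, just an illustration of the same idea) of setting up EfficientNetB0 as a feature extractor with a 101-class food output in TensorFlow/Keras:

```python
import tensorflow as tf

# Minimal sketch: EfficientNetB0 backbone + a 101-class head (Food101 has 101 classes).
# Illustrative only, not the exact code behind Wikifoodia.
base_model = tf.keras.applications.EfficientNetB0(include_top=False, weights="imagenet")
base_model.trainable = False  # freeze the pretrained backbone for feature extraction

inputs = tf.keras.layers.Input(shape=(224, 224, 3), name="input_layer")
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(101, activation="softmax", name="food_classes")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])
```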
The tutorial reads like a cooking recipe.
You take your ingredients: a large corpus of data (text) and a large language model.
And then you turn your data into embeddings and use the large language model to answer questions about your data, with the most relevant embeddings retrieved as context.
Check out the tutorial by the LangChain team to see how to structure the responses.
Overview of how to create a ChatGPT over your own data (all possible with LangChain!, which is also where this image comes from).
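If you'd like a rough feel for what that looks like in code, here's a minimal sketch using LangChain (the exact classes and imports depend on your LangChain version, and you'll need an OpenAI API key plus FAISS installed, so treat this as illustrative rather than a copy of the tutorial):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

# Illustrative sketch: embed your own documents, store them in a vector store,
# then let an LLM answer questions using the retrieved documents as context.
docs = ["Your first document...", "Your second document..."]  # your own text data

vectorstore = FAISS.from_texts(docs, embedding=OpenAIEmbeddings())

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=vectorstore.as_retriever(),
)

print(qa_chain.run("What does my data say about X?"))
```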
Don't want to pay a company such as OpenAI or Cohere for a large language model API?
Well, Flan-T5 XXL is an open-source language model that's freely available on Hugging Face and achieves fantastic results.
Flan-T5 models from small to XXL are available as open-source models on Hugging Face.
And Philipp Schmid, technical lead at Hugging Face, has a tutorial on how you can fine-tune it on your own data to make it perform even better.
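For a quick taste before any fine-tuning, here's a minimal sketch of running a Flan-T5 checkpoint with the Hugging Face transformers library (it uses the smaller flan-t5-base checkpoint so it fits on modest hardware, swap in flan-t5-xxl if you've got the memory):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Minimal sketch: run a Flan-T5 checkpoint from the Hugging Face Hub.
model_name = "google/flan-t5-base"  # swap for "google/flan-t5-xxl" if you have the hardware
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Translate to German: I love machine learning.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```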
You know when you find someone's work and you immediately start to read it all?
That's what I've done with Jay's blog posts.
Not only is his tutorial on how to replicate GPT in 60 lines of NumPy code an incredible walkthrough of one of the most powerful architectures available today, but his other blog posts also contain terrific insights on other popular and useful machine learning topics.
Such as:
The modern AI stack involves a deep learning framework (such as TensorFlow or PyTorch) and one or more GPU(s).
Whenever I want to learn more about GPUs, I reread Tim Dettmers' guide on the best GPUs for deep learning.
And he's recently updated it to include the new 40-series NVIDIA RTX cards such as the RTX 4080 and RTX 4090.
Turns out they're quite the leap over the previous generation.
The new hardware inside is what powers a lot of the speedups available in PyTorch 2.0.
Performance numbers across a wide range of GPUs, notice the newer H100 and RTX 40- series right up the top. Source: Tim Dettmers blog.
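And if you want to cash in on the PyTorch 2.0 speedups mentioned above, it's mostly a one-line change. A minimal sketch (illustrative model, not a benchmark, and it assumes a CUDA-capable GPU):

```python
import torch
import torchvision

# Minimal sketch: torch.compile (new in PyTorch 2.0) can speed up training and
# inference, especially on newer GPUs such as the RTX 40-series and H100.
model = torchvision.models.resnet50().cuda()
compiled_model = torch.compile(model)  # the one-line change

x = torch.randn(8, 3, 224, 224, device="cuda")
with torch.no_grad():
    out = compiled_model(x)
print(out.shape)  # torch.Size([8, 1000])
```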
OpenAI defines artificial general intelligence (AGI) as "AI systems that are generally smarter than humans".
In the blog post, they lay out their ground rules for what getting there might look like:
Some of the above points have been cut off but you can read them in the full post.
It's hard to believe that this is becoming more and more a part of the conversation.
I mean, if the rate of improvement of AI systems keeps going the way it's been going for the past 10 years, it's hard to imagine AGI being too far off.
Or maybe we're just easy to trick.
Replicate helps you run machine learning models in the cloud.
You upload your model to Replicate, choose your compute service (e.g. CPU or GPU) and then you get an API you can send data to.
In other words, Replicate helps you with machine learning model deployment.
Replicate wants to make two things easier:
Their `replicate` package helps you run open-source models in a few lines of code:
import replicate

# Get the Stable Diffusion model hosted on Replicate
model = replicate.models.get("stability-ai/stable-diffusion")
version = model.versions.get("db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf")

# Send a prompt to the hosted model and get a prediction back
version.predict(prompt="an astronaut riding on a horse")
And their [cog package](https://github.com/replicate/cog) effectively helps turn your machine learning models into Docker containers.
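For a rough idea of what that looks like, here's a minimal sketch of cog's Predictor pattern (the load_my_model helper is hypothetical, you'd load your own model there, and you'd pair this predict.py with a cog.yaml describing your environment):

```python
# predict.py -- a minimal sketch of cog's Predictor pattern
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # Load your model into memory once when the container starts.
        self.model = load_my_model()  # hypothetical helper for your own model

    def predict(self, prompt: str = Input(description="Text to run the model on")) -> str:
        # Run a single prediction and return the result.
        return self.model(prompt)
```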
The LAION AI team just seem to keep releasing amazing tools.
One of their latest is the updated [clip-retrieval library](https://github.com/rom1504/clip-retrieval).
It now allows you to create a client service to query the indexed images/text of the LAION-5B dataset (5 billion pairs of images and text) using your own image/text requests.
What does this mean?
It means you could search across 5 billion image and text pairs to find images that suit your use case.
For example, say you wanted to build a machine learning model capable of classifying different cars, you could programmatically search the LAION-5B dataset for images of specific cars and only download those:
Searching the CLIP embeddings for a photo of a BMW M3 (you could repeat this for almost any kind of photo you'd like to search for). Source: CLIP front webpage.
See the `clip-retrieval` package on GitHub for doing searches programmatically.
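As a rough sketch of what a programmatic query might look like with the clip-retrieval client (the backend URL and index name below are placeholders based on the publicly hosted LAION service, check the repo README for the current values):

```python
from clip_retrieval.clip_client import ClipClient

# Illustrative sketch: query a hosted LAION-5B index with text.
# The backend URL and index name are examples -- see the clip-retrieval
# README for the currently hosted ones.
client = ClipClient(url="https://knn.laion.ai/knn-service", indice_name="laion5B-L-14")

results = client.query(text="a photo of a BMW M3")
for result in results[:5]:
    print(result["url"], result["caption"])
```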
The wonderful Hugging Face 🤗 team (leaders in the open-source ML space) have been on an absolute roll lately. Here are some cool things I found relating to them:
Meta AI open-sources (kind of) LLaMA (Large Language Model Meta AI)
There are four sizes of LLaMA: 7B, 13B, 33B and 65B, each with increasing performance compared to the previous.
But even though the models have close to 10x fewer parameters than some other models (Google's PaLM is 540B parameters), they still perform on par with or better than them (the 65B LLaMA model is better than GPT-3 175B on almost every task).
The models are available to researchers via application and cannot be used commercially.
Google brings self-supervised learning to anomaly detection
Anomaly detection is the important practice of finding data points which "might not belong" or are "of interest". For example, one fraud transaction out of 100,000 non-fraud ones. Or a damaged part in a factory line (using computer vision).
Their results show self-supervised anomaly detection and semi-supervised anomaly detection can even beat fully supervised methods.
Andrej Karpathy (former head of Tesla AI, now working at OpenAI) had an excellent Tweet last month about the hottest new programming language being English.
With the rise of large language models, you can now start writing code with English via prompt engineering. For example, ask GPT-3 to write some Python functions and you'll get pretty good results. Source: Andrej Karpathy Twitter.
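For example, here's a minimal sketch of "programming in English" via the OpenAI API (the model name and prompt are just examples, and it assumes the openai Python package with an OPENAI_API_KEY set):

```python
import openai  # pip install openai, requires OPENAI_API_KEY to be set

# Minimal sketch: ask a large language model to write the code for you.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a Python function that returns the n-th Fibonacci number.",
    max_tokens=200,
)

print(response["choices"][0]["text"])
```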
The Tweet itself is short and sweet, but the thread on Twitter contains a bunch of helpful resources for learning more about how to work with large language models (LLMs).
Tips such as:
What a massive month for the ML world in February 2023!
As always, let me know if there's anything you think should be included in a future post.
In the meantime, keep learning, keep creating, keep dancing.
See you next month,
Daniel
By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.