33rd issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Hey everyone!
Daniel here, I'm a machine learning engineer who teaches the following beginner-friendly machine learning courses:
I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.
Since there's a lot going on, I've done my best to keep things to the point.
Enough about me!
You're here for this month's Machine Learning Monthly Newsletter. Typically a 500ish (+/-1,000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.
The Zero to Mastery PyTorch for Deep Learning course is 100% live!
Two huge sections with 100+ total videos were added in the last month:
While the free tier of Google Colab stays unchanged, you can now upgrade your paid subscription to add even more compute (new and faster GPUs) on a pay-per-use basis.
This offers even more access to NVIDIA GPUs with a few clicks, and plenty of opportunities for experimenting, experimenting, experimenting!
The recent Zero to Mastery PyTorch course helps beginners learn the fundamentals of PyTorch.
But once you've learned the fundamentals, how do you improve your models even further?
A recent article on the PyTorch blog shares how Meta (Facebook) finds bottlenecks in their PyTorch models and then offers tips for how to improve each one.
Four optimization steps for PyTorch models found by the Meta team. Source: PyTorch blog.
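The article walks through Meta's own profiling workflow, but if you want to start hunting for bottlenecks in your own models, PyTorch's built-in profiler is a good first step. Here's a minimal sketch (the model and batch are placeholders, and it assumes a CUDA GPU is available):

```python
import torch
import torchvision
from torch.profiler import profile, record_function, ProfilerActivity

# Placeholder model and batch, swap in your own (assumes a CUDA GPU).
model = torchvision.models.resnet18().cuda()
inputs = torch.randn(32, 3, 224, 224).cuda()

# Record CPU and GPU activity during a forward pass.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    with record_function("model_inference"):
        model(inputs)

# Print the most expensive operations first, these are your bottleneck candidates.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```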
Laion AI has released the best-performing open-source CLIP model with a blog post detailing how they made it happen.
CLIP stands for "contrastive language-image pretraining", which means the model is capable of many vision and language tasks such as matching text with images or images with images.
It also makes it possible to perform zero-shot image classification based on how similar a piece of text is to an image.
For example, you could create a "cat", "dog" and "chicken" image classification model, despite not having any labelled images of "cat", "dog" or "chicken".
Check out the blog post for training details and tips as well as the OpenCLIP GitHub for code and model weights.
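To give you a feel for zero-shot classification in code, here's a minimal sketch using OpenCLIP (the model and pretrained tags plus the image path are example values, see the OpenCLIP GitHub for the current options):

```python
import torch
import open_clip
from PIL import Image

# Example model and pretrained tags, see the OpenCLIP GitHub for the full list.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)

image = preprocess(Image.open("my_image.jpg")).unsqueeze(0)  # placeholder image path
text = open_clip.tokenize(["a photo of a cat", "a photo of a dog", "a photo of a chicken"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Normalize so the dot product becomes cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Highest probability = predicted class, no labelled training images required.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)
```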
Many of today's modern computer vision models are pretrained on large open-source datasets such as ImageNet-1K (1+ million images, 1,000 classes) and ImageNet-21K (13+ million images, 21,000 classes).
But large datasets like this are often created in crowd-sourcing fashion and often contain mistakes (mismatched labels) and plenty of duplicates (according to new research, ImageNet-21K has 1+ million duplicate images).
However, there's now an open-source tool called fastdup that computes image statistics (such as brightness, darkness, sharpness, blur, size, unique colours and more), finds duplicates (including near-duplicates taken from slightly different points of view), finds corrupted and broken images, detects outliers (images that don't match the distribution of the rest), finds wrong labels and much more.
See more on the fastdup GitHub page as well as the video tutorial.
Discovering different image statistics on the Food101 dataset (used in the Zero to Mastery PyTorch course) with fastdup. Source: fastdup GitHub.
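The basic workflow is only a few lines. Here's a minimal sketch based on the fastdup README (the paths are placeholders, check the GitHub page for the current API):

```python
import fastdup

# Run the full analysis on a folder of images: computes statistics,
# embeds each image and finds duplicates, outliers and broken files.
fastdup.run(input_dir="images/", work_dir="fastdup_output/")

# Build an HTML gallery of the duplicate pairs found during the run.
fastdup.create_duplicates_gallery(
    similarity_file="fastdup_output/similarity.csv",
    save_path="fastdup_output/",
)
```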
Salesforce has open-sourced a library called LAVIS (LAnguage-and-VISion intelligence) to make vision-language research more reproducible.
Inside you'll find a Pythonic API for 10+ vision and language tasks such as image retrieval, image captioning, visual question answering and more.
You'll also have access to 20+ vision and language datasets and 30+ pretrained state-of-the-art vision-language models such as BLIP and CLIP.
See more on the LAVIS GitHub page.
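To give you an idea of the API, here's roughly what image captioning with a pretrained BLIP model looks like in LAVIS (the image path is a placeholder):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load a pretrained BLIP captioning model and its matching preprocessors.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)

# Preprocess an image (placeholder path) and generate a caption for it.
raw_image = Image.open("my_image.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))
```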
I love this!
It's all about the embeddings!
Good embeddings (numerical representations of data) generally lead to good results.
And Towhee helps you turn almost any type of unstructured data (images, audio, text, 3D molecular structures) into embeddings (also called feature vectors).
My favourite is how quickly you can access Towhee's 700+ pretrained models. For example, check out this code snippet for creating a text-to-image search:
Installing and creating a text-to-image vector search with Towhee in ~10-lines of code. Source: Towhee GitHub.
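As a rough sketch of what that pipeline looks like in code (the operator and model names here are illustrative, grab the exact snippet from the Towhee GitHub):

```python
import towhee

# Embed a folder of images with a CLIP operator from the Towhee hub
# (operator/model names are illustrative).
image_embeddings = (
    towhee.glob["path"]("./images/*.jpg")
    .image_decode["path", "img"]()
    .image_text_embedding.clip["img", "vec"](
        model_name="clip_vit_base_patch16", modality="image"
    )
    .select["path", "vec"]()
)

# Embed a text query with the same model so both live in the same vector
# space, then rank images by similarity to the query vector.
text_embedding = (
    towhee.dc["text"](["a photo of a dog playing in snow"])
    .image_text_embedding.clip["text", "vec"](
        model_name="clip_vit_base_patch16", modality="text"
    )
    .select["vec"]()
)
```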
A fantastic breakdown of whether you need MLOps or not (chances are you're overthinking it).
Lak's main argument for building machine learning applications: KISS (keep it simple, stupid).
This is fairly new but it's exciting.
Imagine being able to go to a machine learning repo on GitHub, press "." and have a fully interactive, GPU-powered Jupyter server running right within the browser.
That's what's already possible with GitHub Codespaces on many different repos but the machine learning use-case hasn't quite been there (until now).
If it keeps going how it is, GitHub Codespaces could be a fantastic Google Colab alternative, with all the machine learning requirements (a GPU, a Jupyter notebook) right within GitHub.
I'm excited to try it out in the next couple of months!
I'm a big fan of the idea of coding in the browser.
The less setup on different local machines the better.
In an ideal world, I'd go to any GitHub repo, press a button and start coding straight away.
GitHub Codespaces is working towards making this happen (see the above link).
And so is Replit.
Especially with their new GhostWriter mode, which is their equivalent of GitHub Copilot but faster (according to them).
I've really liked using GitHub Copilot lately for helping me learn web development.
But GhostWriter looks incredible too and it's great to have more competitors in the space.
I really liked the blog post by Replit announcing how they created the model (and the challenges that came with deployment) as well as how they integrated it with their online app (as much of a challenge as training the model).
Different optimization techniques used by the Replit team to make their GhostWriter model available for pair-programming applications with an average response time of 400ms. Source: Replit blog.
My favourite quote from the release article (bold mine):
What do you do when you're not a multi-trillion multi-national corporation (yet) with tons of ML research scientists, infinite budget for training, billions in industry partnerships, and store most of the world's code but still want to bring state-of-the-art AI to production? You start from open-source!
A fantastic insight into why the rise of large language models (LLMs) such as GPT-3 and their integration into almost every kind of application is going to be the next phase of computing.
Always bet on text.
Netflix has long been one of the most open companies on how it uses machine learning to drive its business.
And now they've started a new blog series discussing how they use machine learning not only to curate media but to create media.
From using computer vision for video understanding and editing, to visual effects and computer graphics for generating media and digitizing actors, props and sets.
Tesla's AI Day 2022 is live!
Packed with AI-first updates on how they're building the world's largest fleet of self-driving cars as well as using the same technology to create the Tesla bot.
No exact spoilers from this one as it's only a day or two old and I haven't watched all of it yet (I'm currently on vacation in Europe).
But if this comment on the Learn PyTorch in a day video reveals anything, it's that the Zero to Mastery PyTorch course lines up pretty well with Tesla's philosophy:
Comment from the Learn PyTorch in a day. Literally. YouTube video.
I'll include my favourites from AI Day in next month's Machine Learning Monthly!
What a massive month for the ML world in September!
As always, let me know if there's anything you think should be included in a future post.
In the meantime, keep learning, keep creating, keep dancing.
See you next month, Daniel
By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.