38th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Hey there, Daniel here.
I'm a Machine Learning Engineer who also teaches the following beginner-friendly machine learning courses:
I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.
Since there's a lot going on, the utmost care has been taken to keep things to the point.
Enough about me! You're here for this month's Machine Learning Monthly Newsletter.
Typically a 500ish (+/-1,000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.
[Blog post] The Top 4 Reasons to Learn PyTorch
PyTorch is by far the most used deep learning framework in machine learning research and is used by companies such as OpenAI to build models like ChatGPT.
These are just two of the best reasons to learn the most fire framework out there! Read on to see how you can get into AI by learning PyTorch, especially through the ZTM PyTorch course ;).
A handful of large technology companies who use PyTorch to power their AI and machine learning workflows: OpenAI, Meta, Tesla and Amazon.
Saket Munda turned the FoodVision project from the ZTM TensorFlow course into a real working app!
Saket took the EfficientNetB0 computer vision model and deployed it to a web application called Wikifoodia.
The web application, deployed with Vercel, allows you to upload an image and the model will classify it into one of 101 different types of food.
It even worked on my lunch!
The Wikifoodia app built by Saket Munda using the FoodVision model from the ZTM TensorFlow course. Source: Wikifoodia app with my own image of a (delicious) omelette.
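If you'd like to try something similar, here's a minimal sketch (not Saket's exact code, just an illustration of the same idea) of setting up EfficientNetB0 as a feature extractor with a 101-class food output in TensorFlow/Keras:

```python
import tensorflow as tf

# Minimal sketch: EfficientNetB0 backbone + a 101-class head (Food101 has 101 classes).
# Illustrative only, not the exact code behind Wikifoodia.
base_model = tf.keras.applications.EfficientNetB0(include_top=False, weights="imagenet")
base_model.trainable = False  # freeze the pretrained backbone for feature extraction

inputs = tf.keras.layers.Input(shape=(224, 224, 3), name="input_layer")
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(101, activation="softmax", name="food_classes")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])
```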
The tutorial reads like a cooking recipe.
You take your ingredients: a large corpus of data (text) and a large language model.
And then you turn your data into embeddings and use the large language model to answer questions about your data, with the most relevant embeddings retrieved as context.
Check out the tutorial by the LangChain team to see how to structure the responses.
Overview of how to create a ChatGPT over your own data (all possible with LangChain!, which is also where this image comes from).
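If you'd like a rough feel for what that looks like in code, here's a minimal sketch using LangChain (the exact classes and imports depend on your LangChain version, and you'll need an OpenAI API key plus FAISS installed, so treat this as illustrative rather than a copy of the tutorial):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

# Illustrative sketch: embed your own documents, store them in a vector store,
# then let an LLM answer questions using the retrieved documents as context.
docs = ["Your first document...", "Your second document..."]  # your own text data

vectorstore = FAISS.from_texts(docs, embedding=OpenAIEmbeddings())

qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    retriever=vectorstore.as_retriever(),
)

print(qa_chain.run("What does my data say about X?"))
```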
Don't want to pay a company such as OpenAI or Cohere for a large language model API?
Well, Flan-T5 XXL is an open-source language model that's freely available on Hugging Face and achieves fantastic results.
Flan-T5 models from small to XXL are available as open-source models on Hugging Face.
And Philipp Schmid, technical lead at Hugging Face, has a tutorial on how you can fine-tune it on your own data to make it perform even better.
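For a quick taste before any fine-tuning, here's a minimal sketch of running a Flan-T5 checkpoint with the Hugging Face transformers library (it uses the smaller flan-t5-base checkpoint so it fits on modest hardware, swap in flan-t5-xxl if you've got the memory):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Minimal sketch: run a Flan-T5 checkpoint from the Hugging Face Hub.
model_name = "google/flan-t5-base"  # swap for "google/flan-t5-xxl" if you have the hardware
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer("Translate to German: I love machine learning.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```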
You know when you find someone's work and you immediately start to read it all?
That's what I've done with Jay's blog posts.
Not only is his tutorial on how to replicate GPT in 60 lines of NumPy code an incredible walkthrough of one of the most powerful architectures available today, but his other blog posts also contain terrific insights on other popular and useful machine learning topics.
Such as:
The modern AI stack involves a deep learning framework (such as TensorFlow or PyTorch) and one or more GPU(s).
Whenever I want to learn more about GPUs, I reread Tim Dettmers' guide on the best GPUs for deep learning.
And he's recently updated it to include the new 40-series NVIDIA RTX cards such as the RTX 4080 and RTX 4090.
Turns out they're quite the leap over the previous generation.
The new hardware inside is what powers a lot of the speedups available in PyTorch 2.0.
Performance numbers across a wide range of GPUs, notice the newer H100 and RTX 40- series right up the top. Source: Tim Dettmers blog.
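And if you want to cash in on the PyTorch 2.0 speedups mentioned above, it's mostly a one-line change. A minimal sketch (illustrative model, not a benchmark, and it assumes a CUDA-capable GPU):

```python
import torch
import torchvision

# Minimal sketch: torch.compile (new in PyTorch 2.0) can speed up training and
# inference, especially on newer GPUs such as the RTX 40-series and H100.
model = torchvision.models.resnet50().cuda()
compiled_model = torch.compile(model)  # the one-line change

x = torch.randn(8, 3, 224, 224, device="cuda")
with torch.no_grad():
    out = compiled_model(x)
print(out.shape)  # torch.Size([8, 1000])
```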
OpenAI defines artificial general intelligence (AGI) as "AI systems that are generally smarter than humans".
In the blog post, they lay out their ground rules for what getting there might look like:
Some of the above points have been cut off but you can read them in the full post.
It's hard to believe that this is becoming more and more a part of the conversation.
I mean, if the rate of improvement of AI systems keeps going the way it's been going for the past 10 years, it's hard to imagine AGI being too far off.
Or maybe we're just easy to trick.
Replicate helps you run machine learning models in the cloud.
You upload your model to Replicate, choose your compute service (e.g. CPU or GPU) and then you get an API you can send data to.
In other words, Replicate helps you with machine learning model deployment.
Replicate wants to make two things easier:
Their `replicate` package helps you run open-source models in a few lines of code:
import replicate

# Get the Stable Diffusion model hosted on Replicate
model = replicate.models.get("stability-ai/stable-diffusion")
version = model.versions.get("db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf")

# Send a prompt to the hosted model and get a prediction back
version.predict(prompt="an astronaut riding on a horse")
And their [cog package](https://github.com/replicate/cog) effectively helps turn your machine learning models into Docker containers.
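For a rough idea of what that looks like, here's a minimal sketch of cog's Predictor pattern (the load_my_model helper is hypothetical, you'd load your own model there, and you'd pair this predict.py with a cog.yaml describing your environment):

```python
# predict.py -- a minimal sketch of cog's Predictor pattern
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # Load your model into memory once when the container starts.
        self.model = load_my_model()  # hypothetical helper for your own model

    def predict(self, prompt: str = Input(description="Text to run the model on")) -> str:
        # Run a single prediction and return the result.
        return self.model(prompt)
```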
The LAION AI team just seem to keep releasing amazing tools.
One of their latest is the updated [clip-retrieval library](https://github.com/rom1504/clip-retrieval).
It now allows you to create a client service to query the indexed images/text of the LAION-5B dataset (5 billion pairs of images and text) using your own image/text requests.
What does this mean?
It means you could search across 5 billion image and text pairs to find images that suit your use case.
For example, say you wanted to build a machine learning model capable of classifying different cars, you could programmatically search the LAION-5B dataset for images of specific cars and only download those:
Searching the CLIP embeddings for a photo of a BMW M3 (you could repeat this for almost any kind of photo you'd like to search for). Source: CLIP front webpage.
See the `clip-retrieval` package on GitHub for doing searches programmatically.
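As a rough sketch of what a programmatic query might look like with the clip-retrieval client (the backend URL and index name below are placeholders based on the publicly hosted LAION service, check the repo README for the current values):

```python
from clip_retrieval.clip_client import ClipClient

# Illustrative sketch: query a hosted LAION-5B index with text.
# The backend URL and index name are examples -- see the clip-retrieval
# README for the currently hosted ones.
client = ClipClient(url="https://knn.laion.ai/knn-service", indice_name="laion5B-L-14")

results = client.query(text="a photo of a BMW M3")
for result in results[:5]:
    print(result["url"], result["caption"])
```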
The wonderful Hugging Face 🤗 team (leaders in the open-source ML space) have been on an absolute roll lately. Here are some cool things I found relating to them:
Meta AI open-sources (kind of) LLaMA (Large Language Model Meta AI)
There are four sizes of LLaMA: 7B, 13B, 33B and 65B, each with increasing performance compared to the previous.
But even though the models have close to 10x fewer parameters than some other models (Google's PaLM is 540B parameters), they still perform on par with or better than them (the 65B LLaMA model is better than GPT-3 175B on almost every task).
The models are available to researchers via application and cannot be used commercially.
Google brings self-supervised learning to anomaly detection
Anomaly detection is the important practice of finding data points which "might not belong" or are "of interest". For example, one fraud transaction out of 100,000 non-fraud ones. Or a damaged part in a factory line (using computer vision).
Their results show self-supervised anomaly detection and semi-supervised anomaly detection can even beat fully supervised methods.
Andrej Karpathy (former head of Tesla AI, now working at OpenAI) had an excellent Tweet last month about the hottest new programming language being English.
With the rise of large language models, you can now start writing code with English via prompt engineering. For example, ask GPT-3 to write some Python functions and you'll get pretty good results. Source: Andrej Karpathy Twitter.
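For example, here's a minimal sketch of "programming in English" via the OpenAI API (the model name and prompt are just examples, and it assumes the openai Python package with an OPENAI_API_KEY set):

```python
import openai  # pip install openai, requires OPENAI_API_KEY to be set

# Minimal sketch: ask a large language model to write the code for you.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Write a Python function that returns the n-th Fibonacci number.",
    max_tokens=200,
)

print(response["choices"][0]["text"])
```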
The Tweet itself is short and sweet, but the thread on Twitter contains a bunch of helpful resources for learning more about how to work with large language models (LLMs).
Tips such as:
What a massive month for the ML world in February 2023!
As always, let me know if there's anything you think should be included in a future post.
In the meantime, keep learning, keep creating, keep dancing.
See you next month,
Daniel
By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.