37th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Hey there, Daniel here.
I’m a Machine Learning Engineer who also teaches the following beginner-friendly machine learning courses:
I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.
Since there's a lot going on, the utmost care has been taken to keep things to the point.
Enough about me! You're here for this month's Machine Learning Monthly Newsletter.
Typically a 500ish (+/-1,000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.
Building Nutrify’s data engine — I’m working on a full-stack machine learning application with my brother called Nutrify. The tagline is “take a photo of food and learn about it”.
Along the way I’m planning on making videos/articles on how I’m doing it.
This month I released episode one, with a focus on how we’re copying Tesla’s data engine but for food images.
Left: Tesla’s data engine for building a dataset for self-driving cars (from their 2021 AI day video).
Right: Nutrify’s data engine for building a food image dataset.
Chip Huyen is one of my favourite writers and practitioners in the ML field.
And her latest article answers the question “what do people look for when hiring for ML roles?”
Chip’s working on her own ML company called Claypot AI for real-time ML inference so the answers are specific to her startup.
But after reading it, there’s plenty of good information I’d recommend for anyone looking to get a job in the ML field.
My personal favourite is working on your own specific projects (rather than projects from courses) and listing them on your resume:
Avoid listing “cookie-cutter” projects on your resume. These are good to learn with, but much of the time someone has already done the work for you.
Bonus: see my machine learning resources page for more resources like this one on getting a job.
Large Language Models (LLMs) are what have been powering all of the recent ChatGPT-like creations.
They make complex tasks accessible by allowing people to program with English (and other languages but mostly English for now) rather than computer code.
LangChain is an open-source library that allows you to combine different language models together.
The combination is referred to as a chain, hence “LangChain”.
For example, you may want to have one language model encode a large corpus of documents.
And then have another language model generate answers from those documents based on a certain search query, which was also encoded by a language model.
This is exactly what LangChain did for their own documentation-specific chatbot (a chatbot you can talk to and ask questions about the LangChain documentation).
Example of using LangChain’s dedicated chat window for help with their documentation. All answers are based on actual passages from the LangChain documentation. Try it out on their website.
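To make the idea of a "chain" a little more concrete, here's a minimal sketch in plain Python. It does not use LangChain's actual API. The `make_chain` helper, the `DOCS` list and the three stand-in steps (`encode`, `retrieve`, `generate`) are hypothetical names made up for illustration, and the "models" are simple functions rather than real language models.

```python
from functools import reduce

# Hypothetical mini document collection (not from the LangChain docs).
DOCS = [
    "A chain combines several language model calls into one pipeline.",
    "Bananas are a good source of potassium.",
]

def encode(query: str) -> dict:
    # Stand-in for an embedding model: represent the query as a set of words.
    return {"query": query, "words": set(query.lower().strip("?").split())}

def retrieve(state: dict) -> dict:
    # Stand-in for a search step: pick the document sharing the most words.
    def overlap(doc: str) -> int:
        return len(state["words"] & set(doc.lower().strip(".").split()))
    return {**state, "context": max(DOCS, key=overlap)}

def generate(state: dict) -> str:
    # Stand-in for a generative model: answer using the retrieved context.
    return f"Q: {state['query']} A (from docs): {state['context']}"

def make_chain(*steps):
    # Feed the output of each step into the next one, left to right.
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)

qa_chain = make_chain(encode, retrieve, generate)
answer = qa_chain("What is a chain?")
print(answer)
```

The point is only the shape: each step takes the previous step's output, so you can swap any stand-in for a real model call without changing the rest of the chain.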
Cool project idea: replicate the LangChain chatbot but for one of your own favourite open-source libraries (maybe PyTorch?).
See more about LangChain at:
P.S. These guys put together a handy little LangChain guide that covers all the basics to get you up to speed, even for less technical readers.
James Briggs is easily one of my favourite writers in the ML space on everything NLP (natural language processing), text embeddings and now large language models and their use cases.
His latest tutorial showcases how to create a generative question answer (GQA) pipeline using OpenAI’s API and Pinecone.
For example, with a GQA pipeline, you can:
1. Turn documents into embeddings (using OpenAI) and store them in a database (Pinecone).
2. Have a query turned into an embedding (using OpenAI) to search the database.
3. Get an answer returned (using OpenAI) based on the query as well as the embedded documents.
4. Get the source of where the answer was generated from.
Numbers 3 and 4 are important because they make sure the answer at least has a reference of where it came from (so you can check for further information).
A generative question and answer system overview. Answers are generated with context from a vector database of information that may contain the answer. You can swap Pinecone and OpenAI for other similar services.
Source: Pinecone blog by James Briggs.
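Here's a rough sketch of those pipeline steps in plain Python. It's a toy under loud assumptions: the `embed` function is a bag-of-words counter standing in for an embedding model, a Python list stands in for the vector database, the "generated" answer is simply the best-matching passage, and the file names are hypothetical. None of it uses the real OpenAI or Pinecone APIs.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: word counts. A real pipeline would call an embedding model.
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Documents are embedded and stored (a list stands in for the vector database).
# The source file names below are hypothetical.
documents = [
    {"source": "notes/regularization.txt", "text": "Weight decay shrinks model weights during training."},
    {"source": "notes/fruit.txt", "text": "Apples and bananas are popular fruits."},
]
index = [(embed(d["text"]), d) for d in documents]

def answer_question(query: str) -> dict:
    # The query is embedded and compared against every stored document.
    q = embed(query)
    score, best = max(((cosine(q, vec), doc) for vec, doc in index), key=lambda p: p[0])
    # Return the answer together with the source it came from.
    # (A real pipeline would pass the passage to a generative model here.)
    return {"answer": best["text"], "source": best["source"], "score": round(score, 3)}

result = answer_question("What is weight decay?")
print(result)
```

Returning the `source` alongside the answer is the part worth copying: it gives you something to check the answer against.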
In a similar light to the above, if ChatGPT gives you an answer for something, how can you make sure it’s factually correct?
Sebastian Raschka writes an article about how ChatGPT can often “hallucinate” the answer to something, such as the answer to “What is weight decay?”.
As in, it will give an answer to your question but will just make it up.
This is due to the nature of generative models.
Prompt them to do something and they will return something but that doesn’t mean it will be what you’re after/correct.
One way to help with this problem is to include references in the generated answer and generate the answer from known sources of truth (e.g. include context in the original prompt to make sure the returned answer at least includes that context).
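As a small illustration of putting context in the prompt, here's a sketch of a prompt builder that numbers each passage so the answer can cite it. The function name `build_grounded_prompt`, the prompt wording and the example passage (including its `ml-notes.md` source) are all assumptions for illustration, not something taken from Sebastian's article or any specific API.

```python
def build_grounded_prompt(question: str, passages: list) -> str:
    # Number each passage so the model can cite it as [1], [2], ...
    context = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite the passage numbers you used. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical passage and source name, for illustration only.
passages = [
    {"source": "ml-notes.md", "text": "Weight decay adds a penalty on large weights to reduce overfitting."},
]
prompt = build_grounded_prompt("What is weight decay?", passages)
print(prompt)
```

This doesn't guarantee a correct answer, but it does give the model something real to draw from and gives you a reference to check.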
With all this talk about semantic search, you may be looking for an introduction to its use cases.
Getting started with semantic search is a fantastic blog post by David Mezzetti who works on the open-source txtai library.
txtai offers an easy-to-use interface for creating semantic search applications in a few lines of code:
# Get started in a couple lines
from txtai.embeddings import Embeddings
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
embeddings.index([(0, "Correct", None), (1, "Not what we hoped", None)])
embeddings.search("positive", 1)
#[(0, 0.2986203730106354)]
See more about semantic search and txtai at:
A fantastic overview of techniques used to take inference on a BERT model from 100 samples per second to over 3,000 samples per second!
Techniques such as:
Converting the float32 datatype to the int8 datatype (far less storage/compute requirements).
Researchers from Google Brain and Harvard University have teamed up to create a simple and straightforward document for people looking to maximize the performance of deep learning models (all of us!).
It contains a bunch of great insights into how to squeeze the performance out of your deep models, such as:
And my personal favourite:
When starting a new project, go for a deep learning model that already works on a similar problem. Then try it on your own problem.
This process is known as transfer learning and is a big focus of the ZTM TensorFlow and PyTorch courses.
Source: Google Deep Learning Playbook.
The (truly) open-source AI research team LAION has published their best (and the current state-of-the-art) OpenCLIP model yet.
The model achieves 80.1% zero-shot accuracy on the ImageNet dataset.
This means that without being trained on any labelled ImageNet samples, the model can achieve 80.1% accuracy across the 1,000 classes of test samples.
What an outstanding result!
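If zero-shot classification is new to you, here's a minimal sketch of the general idea: compare an image embedding against the text embeddings of one prompt per class and pick the closest. The 3-dimensional vectors below are made up for illustration. A real CLIP-style model would produce much larger embeddings with its image and text encoders, and this is not the OpenCLIP API.

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Made-up text embeddings, one per class prompt.
text_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a banana": [0.0, 0.1, 0.9],
}
# Made-up image embedding (pretend it came from a photo of a dog).
image_embedding = [0.8, 0.2, 0.1]

def zero_shot_classify(image_vec, label_vecs):
    # Score every class prompt against the image and return the best match.
    scores = {label: cosine(image_vec, vec) for label, vec in label_vecs.items()}
    return max(scores, key=scores.get), scores

prediction, scores = zero_shot_classify(image_embedding, text_embeddings)
print(prediction)
```

No class-specific training happens here: adding a new class is just adding a new text prompt.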
I’m loving the work from the LAION team, be sure to check out the following:
I’ve been playing around with the open-source ImaginAIry library.
It offers a simple interface for generating images with Stable Diffusion on the command line whilst leveraging the GPU power of an M1/M2 chip or an NVIDIA CUDA-enabled GPU.
Seriously, you can get started with a few lines of code:
# 1. On macOS, make sure you have Rust (a programming language) installed first
# (Rust is required for the tokenizer library)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install setuptools-rust
# 2. Install ImaginAIry
pip install imaginairy
# 3. Generate images/gifs
imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"
# Stable Diffusion 2.1
imagine --model SD-2.1 "a forest"
# Make generation gif
imagine --gif "a flower"
I created this one with the prompt: imagine "student taking a machine learning class at college"
Will ChatGPT take over Google Search?
Perhaps.
For now, ChatGPT seems to return results that are specifically targeted to what you’re asking for.
However, Google still allows you to explore where you want to go in a serendipity-like fashion.
Shawn Wang (swyx) writes about several different arguments for and against OpenAI’s creations taking on Google.
My takeaway: perhaps we’ve just created an arms race to AI in which two large companies fight to the death to create it first (without really considering what the outcomes of what they’re making are).
Two epic papers I’m reading this month, both around the notion of combining vision and language.
The first helps with curating large-scale datasets for your own use case.
And the second brings a new opportunity for image-to-text communication.
What a massive month for the ML world in January!
As always, let me know if there's anything you think should be included in a future post.
In the meantime, keep learning, keep creating, keep dancing.
See you next month, Daniel
By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.