[August 2021] Machine Learning Monthly 💻🤖

20th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.

Daniel here, I'm 50% of the instructors behind Zero To Mastery's Machine Learning and Data Science Bootcamp course and our new TensorFlow for Deep Learning course! I also write regularly about machine learning on my own blog and make videos on the topic on YouTube.

Welcome to this edition of the Machine Learning Monthly Newsletter. A 500ish (+/-1000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.

Since there's a lot going on, the utmost care has been taken to keep things to the point.

What you missed in August as a Machine Learning Engineer…

My work 👇

The video version of this post: https://youtu.be/4zPc0Q63Tv0

Charlie Walks: A Novel

In 2017, I started writing a fiction book. Then I gave up. Then I started again. Gave up, started again and again and again. Then I put it down to finish the Zero to Mastery TensorFlow Course. Then I finished the TensorFlow Course and picked Charlie Walks back up.

And it's done.

The story of a machine learning engineer, Charlie, who wants to be a writer. He works for the largest technology company in the world and siphons the company's compute resources to power his secret project XK-1, a computer program that generates artificial worlds. At night, Charlie writes letters to his nephew Pauly about his past experiences as well as what he's learned from inside XK-1.

The digital version is coming out on 31 August 2021 (Australian time, so give or take a day) and the print version will be coming later in the year.

The first seven chapters are available online to read for free. And there'll be something fun coming to my YouTube channel soon.

If you've read my writing before, you'll love this book. It's the best thing I've ever made.

You can find all the details at charliewalks.com. For now, check out the cover:

Cover of Charlie Walks: A Novel by Daniel Bourke. Charlie, the protagonist, walking between city and nature.

From the community 🙌

Artificialis Medium Publication and Discord Server

I love seeing this! Several members of the Zero To Mastery Academy have banded together and created their own portals to share their work and talk with others interested in similar topics.

Artificialis is Latin for artificial, so expect to see works based on, but not limited to, the field of artificial intelligence. For example, the Medium publication contains articles such as Your Path to Become a Machine Learning Engineer by Alessandro Lamberti and Neural Networks 101 by Ashik Shaffi.

Read and follow the Artificialis Medium Publication
Join the Artificialis Discord Community and talk with people interested in artificial intelligence

Writing online

I'm a huuuuuuge advocate for writing online. I always say you should own your own corner of the internet in some way. And writing online is one of the best ways to do so. That's exactly what Pier and Nifesimi have done.

Last month's ML monthly (July 2021) covered Gradio, a way to quickly share functional demos of your ML models. Pier Pablo Ippolito's article covers the fundamentals of how to get started with Gradio and get your models into the hands of others.
Nifesimi Ademoye told me they'd love to one day be a technical writer. So what's happening? They're publishing technical articles. Start the job before you have it. Check out Nifesimi's blog for articles such as PCA explained by Elon Musk, a guide to AWS SageMaker (Amazon's tool for full-stack ML) and a review of Udacity's Machine Learning Nanodegree.

From the internet 🕸

Are you a data scientist, engineer, analyst who does ML? Or something else?

Erik Bernhardsson (featured in last month's machine learning monthly) is back asking what's the right level of specialization?

When it comes to data and tech roles, there's no shortage of titles. When someone asks me whether they should become a data scientist or machine learning engineer, my internal response is almost always that they imply each other.

If you're going to be working with data, you're eventually going to do some analysis, run experiments (science), do some modelling (machine learning) and modify, store and move the data (engineering). I'm biased though. I'm a big fan of the generalist: someone who can reach 80% of the level of the best in the world at a number of different things.

Erik's article expands on a tweet he posted about over-specialization causing confusion in the data industry.

The article comes from the angle that too much specialization can come from poor tools. As in, if a tool requires someone to be a specialist in that tool (e.g. Kubernetes) to use it, then arguably it's a bad tool.

Are you tools-orientated or goals-orientated?

I get a lot of people asking me whether they should use TensorFlow or PyTorch, or learn Python or R?

To which I usually respond that they're all good options and you should pick one and use it to do what you want to do.

You could spend months looking for the right hammer or you could just try one of the first ones you find and start trying to hammer nails. And if it doesn't work, switch hammers.

Read the full article on Erik's blog (I've subscribed to receive future posts too)

Want to become a data engineer? Here's a roadmap

Speaking of data roles, the data engineer is perhaps the most prevalent, since without data, none of the other roles exist. To build models, you need data, and you need to store it, change it and move it. That's what the data engineer does.

The team at datastacktv are here to help with their Modern Data Engineer Roadmap.

From computer science fundamentals (how does a computer work?) to the different kinds of databases (relational, non-relational), the roadmap covers the skills a modern data engineer should be aware of.

See the full Modern Data Engineer Roadmap on GitHub
Bonus: I made a video last year called the Machine Learning Roadmap. Most of it is still valid and it has many overlaps with the Data Engineer roadmap
Bonus #2: I'm a little biased here but our Zero To Mastery Machine Learning Career Path is also worth considering if you want to learn everything in one place

If you can't beat them, join them (CNNs + Transformers)

In recent times, Transformers (a type of machine learning model) have achieved incredible performance in the world of NLP.

Now, researchers are using them more and more for vision.

Some thought they might even replace convolutional neural networks (CNNs). Well, it turns out, CNNs perform so well they aren't being replaced just yet.

CNNs' strong inductive biases (pixels nearby are likely related and all pixels should be processed evenly) allow them to achieve fantastic results with minimal data. Transformers' weaker inductive biases (paying attention to where it needs to be paid) limit them on small data but enable them to flourish with large amounts of data.
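To make the difference in inductive bias concrete, here's a minimal toy sketch in plain Python. It is not how ConViT or any real vision model is implemented: the 1D "image", the kernel values and the projection-free attention are all assumptions made for illustration. The point is only that a convolution output depends on nearby inputs (with one shared kernel), whereas a self-attention output can depend on every input.

```python
import math

def conv1d(signal, kernel):
    """Slide one shared kernel over the signal (zero padding at the edges)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half  # only positions close to i are ever read
            if 0 <= j < len(signal):
                total += weight * signal[j]
        out.append(total)
    return out

def self_attention(signal):
    """Toy single-head self-attention over scalars, with no learned projections."""
    out = []
    for q in signal:
        scores = [q * k for k in signal]  # every position is compared with every other
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]  # numerically stable softmax
        norm = sum(exps)
        out.append(sum(e / norm * v for e, v in zip(exps, signal)))
    return out

signal = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
bumped = [0.0, 0.0, 1.0, 0.0, 0.0, 5.0]  # change only the last (far away) position

kernel = [0.25, 0.5, 0.25]  # hypothetical smoothing kernel
conv_a, conv_b = conv1d(signal, kernel), conv1d(bumped, kernel)
attn_a, attn_b = self_attention(signal), self_attention(bumped)

# Position 0 of the conv output only reads positions 0 and 1, so it is unchanged.
print(conv_a[0] == conv_b[0])  # True
# Position 0 of the attention output reads every position, so it changes.
print(attn_a[0] == attn_b[0])  # False
```

The convolution gets locality for free, which helps when data is scarce. The attention layer has to learn where to look, which is more flexible but needs more examples.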

Facebook AI's new ConViT (convolutional vision transformers) leverages the best parts of both architectures to achieve the most data-efficient transformer-based vision architecture yet (beating the previous best performance whilst using half as many parameters).

Read more about the ConViT architecture on the Facebook AI blog

Become an AI artist

Roboflow wrote an article detailing how you can use VQGAN + CLIP (a generative adversarial network and OpenAI's CLIP model that connects text and images) to create some wild artworks.

I tried it with my book cover.

Artwork generated from ~1000 iterations of VQGAN + CLIP with a starter image and a goal image and a prompt. I tried running it for another ~200 iterations and it didn't change much. I feel like if something doesn't come out how you want it after the first ~300-500 iterations, try again with a different seed.

You can start from scratch (like below) and create something random. Or you can feed the model a starter image and a goal image and a prompt. And the VQGAN + CLIP models will generate an artwork with what you've fed it.

A random image generated over ~1000 iterations with VQGAN + CLIP with no prompt, no starter image and no target image. I'm seeing some floating cities in a sea of purple flowers and waves.

Try it and see what you can create.

Self-driving cars are always 5 years away (but it looks like they're getting closer)

Tesla, Comma and Waymo, three of the largest self-driving car companies in the world, had announcements recently. And it's incredible to see where some of the best machine learning technology in the world is going.

Arguably the largest difference between them all would be the vision-only approach (Tesla, Comma) versus the multiple sensor approach (LIDAR, radar, vision, mapping and more from Waymo).

Tesla's AI Day keynote revealed how far they've come in the past few years, from using vision and radar to setting up the largest vision-only self-driving fleet in the world.


Some of my favourite takeaways from Tesla's recent AI Day keynote. From top left to bottom right:

how a clip goes from car camera to vector space
using the Tesla fleet to find similar examples of rare driving scenarios
creating automatic maps of an intersection thanks to multiple cars going through it over time
generating artificial scenarios for learning to drive
using Dojo (Tesla's custom AI training chip) with PyTorch
the Tesla Bot (Musk: "don't worry you'll be able to run away from it").

Some of my favourite takeaways from their presentation include:

The clip labelling lifecycle — How a clip goes from being recorded on a Tesla vehicle, to being labelled (automatic and manual, larger models can be used for labelling than inference because inference needs to be fast whereas labelling can take its time), to being used in a model.
Using the fleet to find rare scenarios — The huge benefit of the Tesla fleet is that there are thousands of vehicles on the road at any given time. This means more and more data is being collected every day. So if a rare event occurs such as "item falling off the back of a truck" and the machine learning team decides they need more labelled examples of that scenario to improve the current model, they can request them from the fleet.
Automatic mapping — The downside of going vision-only for a full self-driving system is the lack of localization (the exact position of a car on the road at any given time). However, given enough data, as Tesla have shown, you can create a map out of images only. Again, leveraging the fleet for many different examples of cars driving through the same intersection, Tesla showed you can create a map, a map which updates on its own as cars continually drive through it (this is another major benefit of using end-to-end machine learning).
Scenario generation — When it comes to driving, even the real world doesn't have all the scenarios. Anything could happen. And if a self-driving vehicle is going to be significantly better than a human at driving, it needs to know how to react in a multitude of scenarios. That's where Tesla's simulator comes in. Based on data collected from the real world, the models which are used on the cars in production are also tested (not trained) on simulated data. This enables engineers to see how the car will react in a variety of scenarios that may not be available even with all of the training data they have.
Using Dojo (Tesla's custom AI chip) — Innovation occurs at the speed of experimentation. If you can experiment faster, you can innovate faster. So in order to experiment faster, Tesla designed their own machine learning chip, a chip specifically designed for the workloads they need (lots of video training). The chip is still at the working prototype stage. However, early numbers show it's a substantial improvement (up to 4x training speed) on their current best. This means experiments could be cut from days to hours.
Tesla Bot — Since they're building all the pieces of the puzzle for autonomous vehicles, why not bridge those techniques to autonomous humanoids? The Tesla Bot uses the same vision system as the full self-driving computer in the Tesla cars. Don't worry, Elon says, you'll be able to run away from the bot and likely overpower it (the bot can only run at 5mph and has a max load of 45lbs). The use-case of the Tesla Bot is still largely unknown but that's how these things start, iterating over time.
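The "using the fleet to find rare scenarios" idea above can be pictured as a similarity search. Here's a minimal sketch of one common way to do that kind of lookup: turn each clip into an embedding vector and rank clips by cosine similarity to a query clip. This is an assumption for illustration, not a description of Tesla's actual pipeline. The clip names, the 3-number embeddings and the find_similar_clips helper are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_similar_clips(query, fleet_clips, top_k=2):
    """Return the ids of the top_k clips whose embeddings are closest to the query."""
    scored = [
        (cosine_similarity(query, embedding), clip_id)
        for clip_id, embedding in fleet_clips.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [clip_id for _, clip_id in scored[:top_k]]

# Hypothetical embeddings; in practice these would come from a vision model.
fleet_clips = {
    "clip_highway_clear": [0.9, 0.1, 0.0],
    "clip_truck_debris_a": [0.1, 0.9, 0.3],
    "clip_truck_debris_b": [0.2, 0.8, 0.4],
    "clip_parking_lot": [0.0, 0.1, 0.9],
}

# Embedding of the rare event we want more labelled examples of.
query = [0.1, 1.0, 0.3]

matches = find_similar_clips(query, fleet_clips)
print(matches)  # ['clip_truck_debris_a', 'clip_truck_debris_b']
```

With a real fleet you'd swap the dictionary for an approximate nearest-neighbour index, but the ranking idea is the same.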

Waymo also released a blog post on their approach to full self-driving. It has nice pictures, though it's not as in-depth as Tesla's presentation. Again, the main difference is the use of multiple sensors other than vision (LIDAR, radar, mapping).

This use of extra sensors is emphasised throughout the post as being instrumental to enabling Waymo to deploy the first real full self-driving system on the roads. This is true: although Tesla has the largest fleet by far, the Waymo Driver is the only system without any human driver that is currently deployed on public roads.

COMMA_CON, Comma AI's technology conference where they announced the Comma 3 (a device you can install in your car to give it driver-assist capabilities) also happened at the end of July.

It blows me away what's possible with a small but dedicated team. Comma employs 22 people, yet has the second largest fleet of self-driving cars in the world. Their approach is similar to Tesla's: full end-to-end vision-only self-driving.

Watch Tesla AI Day
Read Waymo's blog post on the Waymo Driver
Watch COMMA_CON

Machine learning in the wild (rapid fire round)

Applied ML is a GitHub repo by Eugene Yan that collects examples, papers, tutorials and best practices on a number of machine learning topics and applications from data quality to optimization to MLOps platforms.
Apple, arguably the company with ML running on the most devices in the world, recently published how they use machine learning for on-device photo recognition. Ever wonder how your face and your friends' faces get identified automatically (all privately) and put into albums for you? Apple's machine learning team shares how.

Outline schematic of how Apple's Photos app on iOS, iPadOS and macOS recognizes faces in photos on-device (your photos are never sent to Apple's servers, they stay on your devices at all times).

Google's AI team substantially improves their polyp detection algorithm. First featured in machine learning monthly a year ago (August 2020), Google's AI team built a computer vision algorithm to detect polyps (potential starting points for colon cancer) in the intestine. The new version of the algorithm (a CNN-based network) improves sensitivity (out of every 100 real polyps, how many the algorithm correctly detects) from 94.4% to 97%.
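If the sensitivity number above feels abstract, here's a minimal sketch of how it's calculated (it's also known as recall or true positive rate). The counts below are made up to illustrate the formula and are not taken from Google's evaluation.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real positive cases (e.g. polyps) that the model flagged."""
    total_real_positives = true_positives + false_negatives
    if total_real_positives == 0:
        raise ValueError("need at least one real positive case")
    return true_positives / total_real_positives

# Hypothetical example: 100 real polyps, the model finds 97 and misses 3.
print(f"{sensitivity(97, 3):.1%}")  # 97.0%
```

Note that sensitivity says nothing about false alarms. A model that flags everything as a polyp would score 100%, which is why it's usually reported alongside a false positive measure.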

See you next month!

What a massive month for the ML world in August!

As always, let me know if there's anything you think should be included in a future post.

Liked something here? Tell a friend!

In the meantime, keep learning, keep creating, keep dancing.

See you next month,

Daniel

www.mrdbourke.com | YouTube

PS. You can also see video versions of these articles on my YouTube channel (usually a few days after the article goes live).

By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a couple of our courses below or see all Zero To Mastery courses by visiting the courses page.