20th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Daniel here, I'm 50% of the instructors behind Zero To Mastery's Machine Learning and Data Science Bootcamp course and our new TensorFlow for Deep Learning course! I also write regularly about machine learning on my own blog, as well as make videos on the topic on YouTube.
Welcome to this edition of the Machine Learning Monthly Newsletter. A 500ish (+/-1000ish, usually +) word post detailing some of the most interesting things in machine learning I've found in the last month.
Since there's a lot going on, the utmost care has been taken to keep things to the point.
The video version of this post: https://youtu.be/4zPc0Q63Tv0
In 2017, I started writing a fiction book. Then I gave up. Then I started again. Gave up, started again and again and again. Then I put it down to finish the Zero to Mastery TensorFlow Course. Then I finished the TensorFlow Course and picked Charlie Walks back up.
And it's done.
The story of a machine learning engineer, Charlie, who wants to be a writer. Working for the largest technology company in the world, he siphons the company's compute resources to power his secret project XK-1, a computer program that generates artificial worlds. At night, Charlie writes letters to his nephew Pauly about his past experiences as well as what he's learned from inside XK-1.
The digital version is coming out on 31 August 2021 (Australian time, so give or take a day) and the print version will be coming later in the year.
The first seven chapters are available online to read for free. And there'll be something fun coming to my YouTube channel soon.
If you've read my writing before, you'll love this book. It's the best thing I've ever made.
You can find all the details at charliewalks.com. For now, check out the cover:
Cover of Charlie Walks: A Novel by Daniel Bourke. Charlie, the protagonist, walking between city and nature.
I love seeing this! Several members of the Zero To Mastery Academy have banded together and created their own portals to share their work and talk with others interested in similar topics.
Artificialis is Latin for artificial, so expect to see works based on, but not limited to, the field of artificial intelligence. For example, the Medium publication contains articles such as Your Path to Become a Machine Learning Engineer by Alessandro Lamberti and Neural Networks 101 by Ashik Shaffi.
I'm a huuuuuuge advocate for writing online. I always say you should own your own corner of the internet in some way. And writing online is one of the best ways to do so. That's exactly what Pier and Nifesimi have done.
Erik Bernhardsson (featured in last month's machine learning monthly) is back asking what's the right level of specialization?
When it comes to data and tech roles, there's no shortage of titles. When someone asks me whether they should become a data scientist or machine learning engineer, my internal response is almost always that they imply each other.
If you're going to be working with data, you're eventually going to do some analysis, run experiments (science), do some modelling (machine learning), and modify, store and move data (engineering). I'm biased though; I'm a big fan of the generalist: someone who can do a number of different things at 80% of the level of the best in the world.
Erik's article expands on a tweet he posted about over-specialization causing confusion in the data industry.
The article comes from the angle that too much specialization can come from poor tools. As in, if a tool requires someone to be a specialist in that tool (e.g. Kubernetes) to use it, then arguably it's a bad tool.
Are you tools-orientated or goals-orientated?
I get a lot of people asking me whether they should use TensorFlow or PyTorch, or learn Python or R?
To which I usually respond that they're all good options and you should pick one and use it to do what you want to do.
You could spend months looking for the right hammer or you could just try one of the first ones you find and start trying to hammer nails. And if it doesn't work, switch hammers.
Speaking of data roles, the data engineer is perhaps the most foundational, since without data, none of the other roles exist. To build models, you need data, and you need to store it, change it and move it. That's what the data engineer does.
The team at datastacktv are here to help with their Modern Data Engineer Roadmap.
From computer science fundamentals (how does a computer work?) to the different kinds of databases (relational, non-relational), the roadmap covers the skills a modern data engineer should be aware of.
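If you're wondering what that looks like day to day, here's a minimal sketch of a single extract-transform-load (ETL) step using only Python's standard library. The CSV file, column names and table are hypothetical, purely for illustration.

```python
# A minimal ETL sketch using only the Python standard library.
# The file name, columns and table are hypothetical, for illustration only.
import csv
import sqlite3

# Extract: read raw records from a CSV file.
with open("daily_sales.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Transform: clean types and filter out incomplete records.
cleaned = [
    (row["date"], row["product"], float(row["revenue"]))
    for row in rows
    if row.get("revenue")  # drop rows with missing revenue
]

# Load: store the cleaned records in a relational database.
connection = sqlite3.connect("warehouse.db")
connection.execute(
    "CREATE TABLE IF NOT EXISTS sales (date TEXT, product TEXT, revenue REAL)"
)
connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", cleaned)
connection.commit()
connection.close()
```

Real pipelines swap the CSV and SQLite for things like S3, Kafka and a data warehouse, but the extract, transform, load pattern stays the same.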
In recent times, Transformers (a type of machine learning model) have achieved incredible performance in the world of NLP.
Now, researchers are using them more and more for vision.
Some thought they might even replace convolutional neural networks (CNNs). Well, it turns out, CNNs perform so well they aren't being replaced just yet.
CNNs' strong inductive biases (nearby pixels are likely related, and every pixel gets processed the same way) allow them to achieve fantastic results with minimal data. Transformers' weaker inductive biases (attention goes wherever it's needed) limit them on small datasets but enable them to flourish with large amounts of data.
Facebook AI's new ConViT (convolutional vision transformer) leverages the best parts of both architectures to achieve the most data-efficient transformer-based vision architecture yet (beating the previous best performance whilst using half as many parameters).
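To make the contrast concrete, here's a tiny PyTorch sketch (an illustration of the two inductive biases, not ConViT itself): a convolution only ever looks at a small local neighbourhood with shared weights, while self-attention lets every image patch attend to every other patch and has to learn locality from data.

```python
# A minimal PyTorch sketch contrasting the two inductive biases.
import torch
import torch.nn as nn

image = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

# CNN: each output pixel only "sees" a small local neighbourhood (3x3 here),
# and the same filter weights slide over every location (translation equivariance).
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)
local_features = conv(image)  # (1, 64, 32, 32)

# Transformer-style: split the image into patches and embed them as tokens.
patch_embed = nn.Conv2d(3, 64, kernel_size=4, stride=4)  # 8x8 grid of 4x4 patches
tokens = patch_embed(image).flatten(2).transpose(1, 2)  # (1, 64 tokens, 64 dims)

# Self-attention relates every patch to every other patch: there's no built-in
# notion of locality, so the model has to learn it from data.
attention = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
global_features, attention_weights = attention(tokens, tokens, tokens)
```

ConViT's gated positional self-attention essentially bridges the two: its attention layers are initialised to behave like the convolution above and can learn to become more global as the data allows.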
Roboflow wrote an article detailing how you can use VQGAN + CLIP (a generative adversarial network and OpenAI's CLIP model that connects text and images) to create some wild artworks.
I tried it with my book cover.
Artwork generated from ~1000 iterations of VQGAN + CLIP with a starter image and a goal image and a prompt. I tried running it for another ~200 iterations and it didn't change much. I feel like if something doesn't come out how you want it after the first ~300-500 iterations, try again with a different seed.
You can start from scratch (like below) and create something random. Or you can feed the model a starter image, a goal image and a prompt, and VQGAN + CLIP will generate an artwork based on what you've fed it.
A random image generated over ~1000 iterations with VQGAN + CLIP with no prompt, no starter image and no target image. I'm seeing some floating cities in a sea of purple flowers and waves.
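Under the hood, the loop is conceptually simple: decode a latent into an image, score how well that image matches your prompt with CLIP, then nudge the latent to improve the score. Here's a heavily simplified sketch of that idea. The `vqgan` decoder is assumed (e.g. a pretrained model from the taming-transformers repo), and the latent shape, prompt and preprocessing are placeholders rather than the exact settings the public notebooks use.

```python
# Conceptual sketch of the VQGAN + CLIP loop, not a drop-in implementation.
# `vqgan` is assumed to be a pretrained VQGAN with a decode(z) method
# (e.g. from the taming-transformers repo); loading it is omitted here.
import torch
import torch.nn.functional as F
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)

# Encode the text prompt once: this is the target the image gets steered towards.
text = clip.tokenize(["floating cities in a sea of purple flowers"]).to(device)
text_features = clip_model.encode_text(text)

# Start from a random latent (or encode a starter image into the latent instead).
z = torch.randn(1, 256, 16, 16, device=device, requires_grad=True)  # shape is model-dependent
optimizer = torch.optim.Adam([z], lr=0.05)

for step in range(1000):
    image = vqgan.decode(z)                 # latent -> image (assumed decoder)
    image = F.interpolate(image, size=224)  # CLIP's normalization omitted for brevity
    image_features = clip_model.encode_image(image)
    # Maximize cosine similarity between the image and the prompt.
    loss = -F.cosine_similarity(image_features, text_features).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The public notebooks add tricks like random crops and augmentations of the generated image before scoring it with CLIP (which stabilises the optimisation a lot), but the core idea is just this gradient descent on the negative CLIP similarity.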
Try it and see what you can create.
Tesla, Comma and Waymo, three of the largest self-driving car companies in the world, all had announcements recently. And it's incredible to see where some of the best machine learning technology in the world is going.
Arguably the largest difference between them all would be the vision-only approach (Tesla, Comma) versus the multiple sensor approach (LIDAR, radar, vision, mapping and more from Waymo).
Tesla's AI Day keynote revealed how far they've come in the past few years: from using vision and radar to running the largest vision-only self-driving fleet in the world.
Some of my favourite takeaways from their presentation include:
Waymo also released a blog post on their approach to full self-driving, with nice pictures, though it's not as in-depth as Tesla's presentation. Again, the main difference is their use of multiple sensors beyond vision (LIDAR, radar, mapping).
This use of extra sensors is emphasised throughout the post as being instrumental in enabling Waymo to deploy the first real full self-driving system on the roads. This is true: although Tesla has the largest fleet by far, the Waymo Driver is the only system currently deployed on public roads without any human driver.
COMMA_CON, Comma AI's technology conference where they announced the Comma 3 (a device you can install in your car to give it driver-assist capabilities), also happened at the end of July.
It blows me away what's possible with a small but dedicated team. Comma employs 22 people, yet has the second largest fleet of self-driving cars in the world. Their approach is similar to Tesla's: full end-to-end, vision-only self-driving.
Outline schematic of how Apple's Photos app on iOS, iPadOS and macOS recognizes faces in photos on-device (your photos are never sent to Apple's servers, they stay on your devices at all times).
What a massive month for the ML world in August!
As always, let me know if there's anything you think should be included in a future post.
Liked something here? Tell a friend!
In the meantime, keep learning, keep creating, keep dancing.
See you next month,
Daniel
PS. You can also see video versions of these articles on my YouTube channel (usually a few days after the article goes live).
By the way, I'm a full-time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a couple of our courses below or see all Zero To Mastery courses by visiting the courses page.