January 1st, 2021 · 11 min read
12th issue! If you missed them, you can read the previous issues of the Machine Learning Monthly newsletter here.
Hey everyone, Daniel here, I'm 50% of the instructors behind the Complete Machine Learning and Data Science: Zero to Mastery course. I also write regularly about machine learning on my own blog, as well as make videos on the topic on YouTube.
Welcome to the 12th edition of machine learning monthly and the final edition of 2020. A 500ish (+/-1000ish, usually +) word post detailing some of the most interesting things on machine learning I've found in the last month.
Since there's a lot going on, the utmost care has been taken to keep things to the point.
I've been putting together a code-first introduction to deep learning with TensorFlow course. And last month, the GitHub repo went public.
If you've done a beginner machine learning course, the upcoming deep learning with TensorFlow course will be a great follow up.
Students of the course will get hands-on practice writing deep learning models with TensorFlow, learn common deep learning troubleshooting techniques, practise searching for how to solve a problem and more.
The best places to get updates as the course gets ready to go live will be:
The November 2020 issue of Machine Learning Monthly mentioned a new fork of TensorFlow, specifically for macOS ([tensorflow_macos](https://github.com/apple/tensorflow_macos)), which claimed incredible performance improvements for running TensorFlow natively on Macs.
Naturally, I wanted to test it. So I compared two M1-powered MacBooks (an Air and a 13-inch Pro) versus my almost top-spec Intel-powered 16-inch MacBook Pro.
After running a series of tests and comparing the results, I came away with the conclusion: the hype is real.
You can check out the full testing I conducted and results in video/blog post form:
Forget prebuilt machine learning models, how about prebuilt machine learning pipelines?
That's what jrieke's traingenerator does.
Enter a few parameters such as task (e.g. image classification), model (e.g. ResNet) and what kind of input data your model should accept (e.g. NumPy arrays) and traingenerator will produce boilerplate code ready to be adapted to your own problem.
The app is put together with Streamlit and hosted on Heroku.
Thank you Yogesh for the submission!
What it is: DeepMind have cracked a 50-year-old problem: predicting the shape of a protein based on its amino acid sequence. Why is the shape of a protein important? Because a protein's shape is generally indicative of its function (a very primitive example would be: ACGTCTT → helps with memory).
Using attention-based neural networks trained on 100-200 GPUs for a couple of weeks (the actual training details are sparse), DeepMind were able to create AlphaFold 2 which achieved a median 92.4 GDT (Global Distance Test, min: 0, max: 100) across all targets in CASP14 (Critical Assessment of protein Structure Prediction), a dramatic improvement over previous results.
Why it matters: AlphaFold 2 is a great example of how if you can turn your problem into a learnable problem, for example, "can you predict a protein's 3D structure based solely on its 1D amino acid structure?", as long as you've got the right model setup, you've got a good shot at figuring it out.
Of course, the above statement takes away from the complexity of the problem. A single protein can have 10^300 possible conformations (more conformations than there are atoms in the known, observable universe). But in nature, proteins take on their shape quickly and spontaneously. This contradiction (an astronomical number of possible conformations, yet proteins folding into specific shapes at high speed) is known as Levinthal's paradox.
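A quick back-of-the-envelope calculation shows the scale. The numbers below are my own illustrative assumptions (a modest 200-residue protein with ~3 conformations per residue), not figures from DeepMind:

```python
import math

# Illustrative assumptions: ~3 backbone conformations per residue,
# for a modest 200-residue protein.
conformations_per_residue = 3
num_residues = 200

# Total conformations = 3^200; work in log10 to avoid huge integers.
log10_total = num_residues * math.log10(conformations_per_residue)
print(f"~10^{log10_total:.0f} possible conformations")  # ~10^95
```

Even at this conservative estimate, checking conformations one by one would take far longer than the age of the universe, which is exactly why proteins folding in milliseconds is such a paradox.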
A quote caught my eye which said something along the lines of "This happened decades before experts thought it would...". A good reminder that we just can't predict when the next breakthrough will come.
That said, it's hard to predict what AlphaFold 2 may be used for. DeepMind mentions they were able to predict several protein structures of the SARS-CoV-2 virus, and another research group said they used AlphaFold 2 to understand how signals are transmitted across cell membranes. However, its long-term use for biology may be like what the telescope offers space exploration: a way to see into the unknown.
What it is: Previous works such as AlphaGo and AlphaZero achieved super-human results in games like Go and Chess using large amounts of compute and well-crafted algorithms. However, both of these knew the rules of the game ahead of time.
MuZero, DeepMind's latest reinforcement learning/model-based algorithm (it's a hybrid I'm not quite sure how to describe), does just as well as the previous two but without knowing the rules of the game. This means it can perform on perfect rule-based games such as Chess and Go but also obtain state-of-the-art results on games with "messy rules" such as Atari.
MuZero models a combination of: value (how good is the current position?), policy (which action is the best to take?) and reward (how good was the last action?).
A critical piece of the puzzle is that MuZero doesn't try to model the entire environment (too much) but instead learns the most important parts of the environment. For example, you don't need to model the direction of each individual raindrop to know an umbrella will help keep you dry.
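As a rough illustration of the three quantities mentioned above (my own toy sketch in plain Python, not DeepMind's code or architecture):

```python
from dataclasses import dataclass
from typing import List

# Toy sketch (not DeepMind's implementation): the three quantities
# MuZero learns to predict from its internal, learned representation
# of the environment.
@dataclass
class MuZeroPrediction:
    value: float          # how good is the current position? (expected return)
    policy: List[float]   # which action is best? (distribution over actions)
    reward: float         # how good was the last action?

# A hypothetical prediction for a 3-action toy environment.
pred = MuZeroPrediction(value=0.8, policy=[0.1, 0.7, 0.2], reward=1.0)
best_action = max(range(len(pred.policy)), key=pred.policy.__getitem__)
print(best_action)  # 1
```

The key point is that all three are learned from a compressed internal state, not from a hand-coded simulator of the game's rules.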
Why it matters: Real life rarely (read: never) has perfect rules. So developing an algorithm which can learn rules on the fly as well as navigate its environment would be a very big step towards true self-learning systems.
Imagine a self-driving car system which learns to adapt to scenarios based on the environment, whatever that environment is, rather than requiring every single object in the car's field of view to be labelled. More specifically, rather than labelling sets of traffic lights and what colour each light is, it could take in the whole scene and learn the best move to make.
I still need to dig deeper into MuZero but my feeling is once it starts to get applied to real-life scenarios, it'll start to make even more sense.
What it is: TensorFlow 2.4 is out with a bunch of quality of life and performance upgrades.
Why it matters: You're only as good as the tools you use. And as someone currently building a TensorFlow course (and using it daily), I'm grateful I get access to what the pioneers use.
My favourite is the mixed precision API in Keras. Now, instead of using the default float32 dtype, you can use the float16 dtype, which uses less memory and can improve model training speed by 3x on GPUs and up to 60% on TPUs. For more on this, check out the Mixed Precision tutorial.
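To see the memory half of that claim, here's a quick illustration using NumPy dtypes (this is my own analogy rather than TensorFlow-specific code; in Keras, if I recall the API correctly, mixed precision is switched on with `tf.keras.mixed_precision.set_global_policy("mixed_float16")`):

```python
import numpy as np

# float16 stores each number in 2 bytes versus 4 bytes for float32,
# so the same tensor takes half the memory.
weights_fp32 = np.zeros((1024, 1024), dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4194304 bytes (4 MB)
print(weights_fp16.nbytes)  # 2097152 bytes (2 MB)
```

The speedups on GPUs/TPUs come on top of this, because their hardware can execute float16 maths much faster than float32.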
What it is: If there's anything in life I'm looking for, it's good explanations. And Thinc's explanation of backpropagation (the algorithm many machine learning models use to learn patterns in data) is one of the best I've ever seen.
Using the example of workers performing a task and being given feedback for how well they did that task, Thinc's Backpropagation 101 article takes the reader on a learning journey paved with code and words.
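To make the worker-and-feedback idea concrete, here's a minimal backpropagation sketch in plain Python (my own toy example, not code from the Thinc article): a single linear "worker" computes y_hat = w * x, gets squared-error feedback, and adjusts its weight via the chain rule.

```python
# Toy backpropagation example (my own, not Thinc's):
# one weight, one input, squared-error feedback.
x, y_true = 2.0, 10.0   # input and target
w = 1.0                 # the worker's initial weight
lr = 0.1                # learning rate (how much to adjust per round)

for _ in range(50):
    y_hat = w * x                      # forward pass: do the task
    loss = (y_hat - y_true) ** 2       # feedback: how badly did we do?
    grad_w = 2 * (y_hat - y_true) * x  # backward pass: chain rule
    w -= lr * grad_w                   # update: adjust based on feedback

print(round(w, 3))  # ~5.0, since 5.0 * 2.0 = 10.0
```

Real networks do exactly this, just with millions of weights and the chain rule applied layer by layer.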
Why it matters: Too often I've been scared off learning something because it seemed too complex. I believed the silly mentality of "I can't learn that".
But it's because of people like the authors of the Thinc blog, I update my internal state (pun-intended) to believe "I can learn that".
Also thought-provoking, Thinc is a deep learning library which takes a different (but still compatible, confusing I know) approach to building models than frameworks such as TensorFlow and PyTorch.
Different? Won't that be less reliable?
It doesn't look like it. Thinc is from the makers of, and the engine behind, spaCy (an industrial-strength NLP library), so it's been battle-tested.
What it is: The wildly popular CS50 course (I used an earlier version of CS50 to kick off learning how to code) has been upgraded with an artificial intelligence lecture refresh for 2021.
Check it out to learn about important artificial intelligence concepts such as: decision trees, minimax, depth-first search, greedy best-first search, explore vs. exploit, genetic algorithms and neural networks (yes, all in one lecture).
Why it matters: If there's anything that 2020 has taught us, it's that every individual has to be in charge of their own health and their own education. If the last decade was the internet and online learning getting warmed up, this decade they'll be taking full stride.
And CS50's resources on learning computer science are some of the best in the world (alongside Zero To Mastery, of course ;)).
If you're looking at learning more about artificial intelligence in 2021, keep CS50's AI course on your radar.
What it is: Many machine learning models are built on the assumption that training and test data come from the same distribution. However, anyone who's deployed a machine learning model knows this assumption gets violated almost every time. New research out of Berkeley's AI Research (BAIR) group proposes a method, adaptive risk minimisation (ARM), to help deal with this violation.
From the blog post:
In particular, we meta-train the model using simulated distribution shifts, which is enabled by the training groups, such that it exhibits strong post-adaptation performance on each shift. The model therefore directly learns how to best leverage the adaptation procedure, which it then executes in the exact same way at test time.
The method achieves state-of-the-art results on several benchmarks.
Why it matters: As careful as you are creating a test set, the real test for any machine learning model is putting it into the hands of users and seeing what happens. But methods such as ARM show promise in helping mitigate the difference between test set results and real-life results.
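To caricature the adaptation idea in the quote above (this is my own loose toy construction, not BAIR's actual ARM algorithm, which meta-learns the adaptation procedure): imagine a model that estimates a deployment group's distribution shift from the unlabelled test batch itself, then corrects for it before predicting.

```python
import statistics

# Toy sketch of test-time adaptation (not ARM itself): assume training
# inputs were centred at 0, but each deployment "group" shifts its
# inputs by some unknown offset.
def adapt_and_predict(test_batch):
    # Adaptation step: estimate the group's shift from the unlabelled
    # batch itself, then undo it.
    estimated_shift = statistics.mean(test_batch)
    return [x - estimated_shift for x in test_batch]

# A test group whose inputs drifted by +100 relative to training.
shifted_batch = [99.0, 100.0, 101.0]
print(adapt_and_predict(shifted_batch))  # [-1.0, 0.0, 1.0]
```

ARM's contribution is learning this kind of adaptation end-to-end during training (via simulated shifts) rather than hand-coding it, as in this sketch.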
What it is: Look, this one is a big one. I haven't fully read through it. But even a skim shows it's probably the most extensive resource on the crossover of software engineering and artificial intelligence out there.
Why it matters: I often get asked how AI and software engineering cross over, but as someone who's more familiar with the AI side of things than the software engineering side (I'm getting better at the latter), I find it hard to answer. From now on, this'll be one of the articles I pass on.
Bored of your newsreader job? No problem. Just get an AI to do it for you.
An AI-powered news anchor just made its (her? I'm not sure here...) debut in South Korea. Apparently the AI-powered newsreader can read at 1,000+ words per minute. Wild.
And if replicating yourself with AI still doesn't relieve your boredom, you can always take a dance class from these groovy Boston Dynamics robot(s):
Serious question: If a robot does a particular dance move, is it always called the robot? 🤯
What a massive month for the ML world in December.
As always, let me know if there's anything you think should be included in a future post. Liked something here? Tell a friend!
In the meantime, keep learning, keep creating.
Have a great 2021 and I'll see you next month,
By the way, I'm a full time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a couple of our courses below or see all Zero To Mastery courses by visiting the courses page.