February 3rd, 2020 · 6 min read
1st issue! That's right, you're right at the beginning of this journey! If there is enough interest, I will keep doing these every month so please share it with your friends!
Being a Machine Learning Engineer is a fantastic career option, and Machine Learning is now one of the fastest-growing job markets (alongside Data Science). Job opportunities are plentiful, you can work around the world, and you get to solve hard problems. However, it's hard to stay up to date with the ever-evolving ecosystem.
This is where this newsletter comes in. Every month, it’ll contain some of my favourite things from the industry, keeping you up to date and helping you stay sharp without wasting your time.
Jeff Dean, SVP of Google Research and Health, posted a great summary of Google's machine learning accomplishments for 2019. Some of my favourites were:
At the end of the article, Dean also lays out some of Google's visions for machine learning going forward. Health gets another mention here. "How can we apply computation and machine learning to make advances in important new areas of science?"... such as healthcare and bioinformatics.
One of the questions I get most often is "how can I learn the math behind machine learning?". The canned response is to say something like: go and study linear algebra, calculus, probability, statistics and computer science. But that isn't really helpful, as you could spend years on each of these and still not know enough (how much is enough?).
Jason Brownlee from Machine Learning Mastery lays out a far more practical approach. One based on curiosity rather than logic. Trying to learn all of the above topics at once is like trying to boil the ocean. Instead of boiling the ocean, Brownlee advocates for starting with trying to boil a kettle first.
Choose a project to work on, something which interests you, and see if you can apply machine learning to it. When you get some small wins and solve a few problems, you'll have no choice but to want to dive deeper. Then you can use that curiosity to fuel your further understanding of the math.
This is the approach I take. Learning what you need to learn when you need to learn it.
When getting started, it's common practice to apply machine learning to a single source of input data, such as an algorithm looking at a single image and deciding whether or not there's a car in it.
But as Andrej Karpathy, head of AI (artificial intelligence) at Tesla, explains in a recent talk, self-driving cars are very much a multi-task problem. Rather than taking a single image as input and making a decision, a Tesla takes in information from 8 different images, stitches them together and then makes a decision based on the collective.
This, of course, is harder to do than working from a single input, but it's necessary for a domain such as self-driving cars.
You can imagine this is also the case in many other domains. We make decisions based on information from many input sources. As an example, imagine a doctor trying to prescribe a treatment based only on your age and nothing else. How well would it go?
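To make the multi-input idea concrete, here's a minimal sketch in NumPy. The camera count, feature sizes and "decision head" are all illustrative assumptions on my part, not Tesla's actual architecture — the point is simply that several inputs get stitched into one representation before a single decision is made:

```python
import numpy as np

# Hypothetical features extracted from each of 8 camera images
# (in practice these would come from a neural network backbone).
num_cameras = 8
rng = np.random.default_rng(42)
camera_features = [rng.normal(size=128) for _ in range(num_cameras)]

# Stitch the per-camera features into one combined representation...
combined = np.concatenate(camera_features)  # shape: (8 * 128,)

# ...then make a single decision based on the collective.
# A made-up stand-in "decision head": one linear layer plus a threshold.
weights = rng.normal(size=combined.shape)
decision = float(combined @ weights) > 0.0
print(combined.shape, decision)
```

Swap any single camera out and the combined representation (and potentially the decision) changes — which is exactly why reasoning from the collective is harder than reasoning from one image.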
Another gem from Google. In their Best Practices for ML Engineering guidelines, they outline a series of heuristics one can use for approaching a potential machine learning project.
My favourite is #1.
"Don't be afraid to launch a product without machine learning."
As powerful as machine learning is, if a simple rule-based system gets the job done, it should be used instead.
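To illustrate the heuristic, here's a toy rule-based classifier of my own invention (the keywords are made up) — no training data, no model, yet for many products something like this is a perfectly good first launch:

```python
# A plain rule-based "spam" filter that needs no machine learning at all.
SPAM_KEYWORDS = {"free money", "click here", "winner"}

def is_spam(message: str) -> bool:
    """Flag a message as spam if it contains any known spam keyword."""
    text = message.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)

print(is_spam("Click here to claim your FREE MONEY"))  # True
print(is_spam("Meeting moved to 3pm"))                 # False
```

If the rules start multiplying and the edge cases pile up, that's your signal machine learning might be worth it.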
The standard metric for evaluating classification models is accuracy.
But let's see where that fails.
Let's say 100,000 people board planes every day at airport X. And 1 of them has a disease. If the person with the disease gets on the plane, this could be problematic for the people on board.
So airport X is tasked with building a machine learning classifier to figure out who has the disease based on an eyeball scan at the terminal (remember, this is made up).
A machine learning model which predicted "no disease" for every single person would have an accuracy of 99.999%. Look at all those 9's! Not bad!
But now you start to realise where accuracy comes undone. Damien Martin discusses two metrics better suited to this problem, precision and recall, in an article which may also help you in a future interview. Give the example questions a try, they tripped me up.
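You can verify the airport numbers yourself with a few lines of plain Python (no libraries, just the made-up scenario from above):

```python
# 100,000 passengers, 1 with the disease, and a model that
# predicts "no disease" for every single person.
total = 100_000
y_true = [0] * (total - 1) + [1]  # 1 = has the disease
y_pred = [0] * total              # model always says "no disease"

# Tally the confusion-matrix cells.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / total                       # 0.99999 -- looks great!
recall = tp / (tp + fn)                            # 0.0 -- misses the one sick person
precision = tp / (tp + fp) if (tp + fp) else 0.0   # 0.0 -- no positive predictions at all
print(accuracy, precision, recall)
```

A recall of 0.0 tells you instantly that the model catches none of the people you actually care about, no matter how shiny the accuracy looks.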
Phew! 2020 is almost 10% over already and there's one thing for sure. There's plenty going on.
Stay playful, keep learning.
PS. If you have a suggestion you'd like to see in a future edition or some of your own work you'd like to share, let us know. See you next month!
How did you like this post? Let me know if there are any changes or improvements you’d like to see. Please share the post on Twitter if you enjoyed it and want me to keep writing them! Also, if you haven't already, subscribe below to receive Machine Learning Monthly next month and other exclusive ZTM posts.
By the way, I'm a full time instructor with Zero To Mastery Academy teaching people Machine Learning in the most efficient way possible. You can see a couple of our courses below or see all Zero To Mastery courses by visiting the courses page.