AMA Deep Dive on PyTorch, Deep Learning, Machine Learning + More

A quick note from the editor:

I recently got to join and moderate a live AMA call that Daniel Bourke did with our students in the Zero To Mastery virtual campus.

Students and I asked questions about Machine Learning and Deep Learning, how they work, the best ways to learn ML, how PyTorch compares to TensorFlow, some technical questions, and much more.

You can check out some of Daniel’s answers to the questions from his AMA call below, but first, let me introduce him…

Who is Daniel Bourke, and why should you listen to him?

Daniel is a self-taught Machine Learning Engineer. He’s worked at one of Australia's fastest-growing artificial intelligence agencies, Max Kelsen, and is now using his expertise to teach thousands of students Data Science and Machine Learning as an Instructor, here at the Zero to Mastery Academy. He has worked on Machine Learning and data problems across a wide range of industries including healthcare, eCommerce, finance, retail, and more.

Aside from building Machine Learning models on his own, Daniel loves writing about and making videos on the process. His Machine Learning articles and videos, on his personal blog and his YouTube channel, have collectively received over 5 million views! He also writes a monthly newsletter, Machine Learning Monthly.

Daniel knows what it's like to try to learn a new topic online and on your own, so he pours his soul into making his courses as accessible as possible. He takes complicated topics and explains them in an entertaining, yet simple and educational way.

Daniel is the creator of the Complete Machine Learning and Data Science Bootcamp, the TensorFlow Developer Certification, and the PyTorch for Deep Learning course!

That’s over 144 hours of in-depth training on Machine Learning, so as you can see, Daniel clearly knows what he's talking about!

Side Note:

One of Dan’s students here at ZTM recently got hired as a Machine Learning Engineer after taking one of his courses, just 6 weeks after finishing high school!

So with the introduction out of the way let’s dive into the questions 😊

🙋🏽‍♂️AMA with Daniel Bourke on Machine Learning and more

Want to watch the video, ask questions in our Discord, or join in live on our next expert AMA?

These are all exclusive to members of Zero To Mastery. You can find out more details about every course and the other benefits of being a ZTM member here.

What is the average salary for a Machine Learning Engineer?

According to ZipRecruiter, the average salary in the US for a Machine Learning Engineer is $142,988 per year.

What is PyTorch and how does it work?

Originally there was a Machine Learning framework called Torch written in another programming language called Lua.

However, a lot of Machine Learning programmers started using Python more and more, mainly because of Python's ease of access, and the fact that Python is one of the most popular programming languages in the world.

This led to Torch being ported to work with Python, and hence PyTorch was born. This meant that users could now use Python to write Machine Learning algorithms to process and find patterns in data.

As a simplified overview of how PyTorch works: you write your Machine Learning code in Python using PyTorch, which then triggers C++ code under the hood. This allows it to run very fast and on various types of hardware.

This is super important because when it comes to Machine Learning, your goal is to be able to process large data sets and find patterns, and so the ability to run faster and process quicker is vital.

Because PyTorch hands the heavy computation to compiled C++ code, which runs as optimized machine code, it’s one of the fastest and most efficient tools that you can use for Machine Learning and pattern recognition.

You could in theory write everything directly in C++, but it has a steep learning curve. That’s why PyTorch is so great: it allows us to use a more human-friendly, high-level language that then calls into the low-level code we need for the Machine Learning algorithms.
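To make the "Python on top, compiled code underneath" idea concrete, here is a small analogy in plain Python rather than PyTorch itself. It assumes the standard CPython interpreter, where the built-in sum is implemented in C. The helper python_loop_sum is a hypothetical function written only for this comparison, and the exact timings will vary from machine to machine.

```python
import timeit


def python_loop_sum(values):
    """Add the numbers one at a time in interpreted Python."""
    total = 0
    for value in values:
        total += value
    return total


values = list(range(200_000))

# Both approaches produce the same answer.
loop_result = python_loop_sum(values)
builtin_result = sum(values)  # in CPython, the built-in sum runs in compiled C

# Time each approach; the compiled built-in is usually noticeably faster.
loop_seconds = timeit.timeit(lambda: python_loop_sum(values), number=5)
builtin_seconds = timeit.timeit(lambda: sum(values), number=5)
print(f"Python loop: {loop_seconds:.4f}s, built-in sum: {builtin_seconds:.4f}s")
```

PyTorch applies the same principle on a much larger scale: you call a friendly Python function, and the number crunching happens in compiled code.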

If you want a deeper dive into exactly how this all works, we highly recommend this article.

Is PyTorch open source?

It sure is! You can go to the GitHub repository for PyTorch and download it there.

Sidenote:

PyTorch was created by Facebook, but they are not the only people pushing it forward. Multiple companies, such as Nvidia, Google, Amazon, Microsoft, and Facebook, have full-time employees who contribute to the main PyTorch repository.

Heck, you can even make your own contributions! In the last build, almost 40% of the commits were from people external to these major company contributors!

PyTorch has a thriving and helpful ecosystem which means you’ll never be stuck for help and support if you choose to learn and use it. (We also have our own Machine Learning and Data Science Discord server that you can use to ask questions).

What types of Machine Learning are TensorFlow and PyTorch used for?

TensorFlow is an end-to-end open-source platform that was created by Google. Originally designed for large numerical analysis, it works great for Deep Learning as well as traditional Machine Learning applications.

PyTorch is also an open-source Machine Learning framework, developed by Facebook. Its main use is training neural network Deep Learning models, but just like TensorFlow, it can also be used for traditional Machine Learning applications.

What is Deep Learning, are Machine Learning and Deep Learning the same thing, and what is a neural network?

If you want to get technical, then Deep Learning is a more specialized subset of Machine Learning.

It usually focuses on a specific type of Machine Learning algorithm called a neural network, and the ‘Deep’ in deep learning is actually referring to the number or 'depth' of layers in that neural network.

For a simple overview, imagine a neural network as a composition of layers, and each one of those layers is a function or mathematical operation.

Essentially what you have is an input or data set, and you pass that through several layers and mathematical operations, and then you have the output at the end of this.

The goal of these layers is to learn and find patterns within that data set so that they can map the input to the output and find connections that humans would miss. This way we can then learn the best paths and criteria to achieve the goal output more often.
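As a minimal sketch of "a composition of layers, each one a function", here is a tiny network in plain Python. The helper names (linear, relu, network) and the hand-picked weights are hypothetical, chosen only for illustration. In a real framework such as PyTorch the weights would be learned from data during training rather than typed in by hand.

```python
def linear(weights, biases):
    """Build a layer that computes a weighted sum of its inputs plus a bias."""
    def layer(inputs):
        return [
            sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)
        ]
    return layer


def relu(inputs):
    """Activation layer: replace negative values with zero."""
    return [max(0.0, x) for x in inputs]


def network(layers):
    """Compose layers so the output of one becomes the input of the next."""
    def forward(inputs):
        for layer in layers:
            inputs = layer(inputs)
        return inputs
    return forward


# Hypothetical, hand-picked weights: 2 inputs -> 2 hidden values -> 1 output.
model = network([
    linear([[1.0, -1.0], [0.5, 0.5]], [0.0, 1.0]),
    relu,
    linear([[2.0, 1.0]], [0.5]),
])

output = model([3.0, 1.0])
print(output)  # [7.5]
```

Training a network means adjusting those weights and biases until the outputs match the targets in the data set as closely as possible.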

For Example

Facebook uses this Machine Learning method in its ad platform.

It works like this. You set up an advert and choose your audience. Facebook then runs ads to a large audience base and tries to get the desired end response that you choose, such as a purchase.

If you can spend enough money to reach a large data set, eventually the ad platform will start to recognize data points among customers that you wouldn’t have thought to target, improving the performance and raising sales, while also lowering costs over time.

They find the patterns and continue to narrow them down.

Because the platform is so good at recognizing patterns, you can even specify cheaper events to act as a success trigger. Let’s say you know that blog subscribers are more likely to become customers. You could run an advert to target and find patterns for subscribers instead of customers.

Why bother?

Well, customers will usually subscribe way before they will buy. This means you can start using Machine Learning pattern recognition to get a large data set for much cheaper than it would cost you for purchase events.

This means you can leverage the benefits of Machine Learning even in these smaller datasets!

Which is better...Tensorflow or PyTorch?

Honestly, from a Developer's point of view, almost all Machine Learning frameworks are pretty much identical nowadays. They all have almost the exact same features, which means that if you have experience with one of them, you can pick up any of the others with a few weeks of practice.

The main deciding factor of which framework to use comes down to what you’re wanting to build (as some are used more often than others in some industries), and which framework the company you want to work for currently uses.

tl;dr

If you’re just getting started in learning Machine Learning, you won’t go wrong with either PyTorch or TensorFlow. Generally, most companies are using TensorFlow.
If your goal company uses a specific framework, learn that one first. Simply research the companies that you want to work with and find out what they use (check job requirements, etc.).
However, if you want to get into Machine Learning research, such as at a university, then we recommend learning PyTorch, as around 60% of the research papers that share code on GitHub use it, and the trend seems to be moving towards this framework.

Like I said earlier though, you can do everything that you want to in either framework. It just comes down to your end goal and what’s being used in that industry. You can also learn one and then pick up the other framework pretty fast if you want to.

What is the best way to learn PyTorch and is PyTorch difficult to learn?

If you have zero programming background, the learning curve for PyTorch can be a little steep. However, if you can follow a set path to learn you’ll find it much easier.

Here at ZTM, we have our own PyTorch course that's designed so that you can get started with only 2-3 months of Python experience.

Better still, we focus on learning by doing which means we teach the course and you write the code as you go, building on what you know.

Applying what you learn like this builds confidence but it’s also the fastest way to learn and solidify your knowledge.

Sidenote: If you're already learning PyTorch and having difficulties, then you might be interested in this guide to the 3 most common PyTorch errors, and how to solve them.

What is a convolutional neural network in PyTorch?

There are multiple types of neural network that we can use. When we say ‘neural network’, really we’re talking about a particular Deep Learning style of algorithm, i.e. one that has layers that are connected in some way.

A convolutional neural network is simply a style of neural network that uses convolutional layers. It is typically very good with computer vision datasets, but it can be used on almost any problem, such as text.

The reason they are often used for images is how the algorithm ‘convolves’ over each image. Imagine a small window sliding across the pixels of the image, scanning left to right and working its way down from top to bottom. This makes it very good at finding patterns in images compared to other methods.
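The sliding window can be sketched in a few lines of plain Python. This is an illustrative toy, not how PyTorch implements it: convolve2d is a hypothetical helper, the tiny image and kernel values are made up, and, as is the convention in Deep Learning, the window is applied without flipping the kernel (strictly speaking a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image, left to right and top to bottom."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1

    output = []
    for top in range(out_h):          # move the window down the image
        row = []
        for left in range(out_w):     # move the window across the image
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A made-up 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A made-up kernel that responds where brightness increases left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The large values in the middle column mark the spot where the dark region meets the bright one, which is the kind of pattern (an edge) that a convolutional layer learns to pick out.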

Convolutional neural networks are some of the most popular and widely used forms of neural network, simply because they are also among the most successful.

For example

Tesla uses convolutional neural networks in its self-driving cars, and Apple uses them in its audio recognition technology.

What is data normalization and why do we need it?

There are many different definitions of what data normalization is when we’re talking about Machine Learning.

Here’s a simple analogy to understand the broad concept:

Imagine that you have a set of steps that you want to walk up, but there’s an issue. Each step has different heights. Some are small and some are large, and so it’s difficult to walk up.

Ideally, we want each step to be pretty much identical in height because if they’re all different heights, you have to focus on them more and think about where to place and lift your foot each step you take.

Data normalization is similar to us trying to make the step height more uniform so that it’s less effort to concentrate on when we go up and down them. We smooth out the different sizes to create a uniform average height range.

Instead of steps though, imagine we have a large data set with thousands of numbers. Not only that, but these numbers fall into different ranges. Some are in the range of 0 to 1, others are in the range of 0 to 255 (which is common when looking at RGB colors).

Just like how trying to walk up different-sized steps would be more difficult and slower for us, when a Machine Learning algorithm has to process data in multiple ranges and large data sets like this, it can struggle.

With data normalization, we rescale the values into a common range to make things easier for the algorithm. It does affect the results slightly, but it also greatly reduces the time needed to process this information. We accept this error factor in exchange for faster learning.
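One common form of normalization is min-max scaling, sketched below for pixel values. The function name min_max_normalize and the sample values are hypothetical, and this is only one of the many definitions of normalization mentioned above.

```python
def min_max_normalize(values, low, high):
    """Rescale values from the range [low, high] into the range [0, 1]."""
    span = high - low
    return [(value - low) / span for value in values]


# Made-up RGB channel values in the usual 0 to 255 range.
pixels = [0, 51, 102, 255]

scaled = min_max_normalize(pixels, low=0, high=255)
print(scaled)  # approximately [0.0, 0.2, 0.4, 1.0]
```

After scaling, these values sit in the same 0 to 1 range as the rest of the data, so every "step" the algorithm takes is roughly the same height.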

You do however have to be careful that you’re recalibrating for new data sets.

For example

Imagine you have a self-driving car whose algorithm was set up only for daytime driving conditions, and you then want it to drive at night in low light. The performance would be very different, especially if you normalized the data using only the initial (daytime) data set.
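Here is a minimal sketch of why that happens, using made-up brightness readings. The normalization statistics (mean and standard deviation) are computed from the daytime data only, which is the assumption being illustrated; fit_standardizer and both data lists are hypothetical.

```python
from statistics import mean, pstdev


def fit_standardizer(training_values):
    """Learn the mean and spread from training data, then reuse them later."""
    mu = mean(training_values)
    sigma = pstdev(training_values)

    def standardize(values):
        return [(value - mu) / sigma for value in values]

    return standardize


# Made-up average image brightness readings (0 to 255 scale).
day_brightness = [180, 200, 220, 190, 210]
night_brightness = [20, 30, 25, 35, 15]

# Statistics come from the daytime data only.
standardize = fit_standardizer(day_brightness)

day_scaled = standardize(day_brightness)
night_scaled = standardize(night_brightness)

print(f"day mean after scaling:   {mean(day_scaled):.2f}")
print(f"night mean after scaling: {mean(night_scaled):.2f}")
```

The daytime values end up centered around zero, as intended, while the night values land far outside the range the model was trained on. That gap is the reason to recalibrate (or retrain with night data) before trusting the model in new conditions.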

What are common applications of Deep Learning in artificial intelligence?

There are so many applications of Deep Learning in A.I. In fact, every large tech service that you use right now, such as Facebook, YouTube, Google, and others, is using Deep Learning in some way.

For Example

Google Search is powered by this, and it’s actually learning and updating on its own. Case in point: long YouTube videos will often have chapters added to them manually by the creator. This helps improve the user experience but also helps their video rank in search for different keywords.

The thing is, YouTube and Google are starting to understand video content and are automatically adding chapters around topics in videos on their own!

Another example of Deep Learning AI is voice assistants such as Google Home or Siri. These tools have learned to the point where you can now use them to carry out more complex dictated tasks.

Sure, you might pause your Spotify or search for something, but you can now use a voice assistant along with GitHub Copilot to actually write code.

Crazier still, you tell it what you want to do and it will write the code for you instead of just transcribing the code that you say out loud!

Heck… search is even starting to learn semantic context.

You can ask it to find photos for you that you took, based on the details you give it. “Hey Siri, find photos of me and my partner at the airport in San Francisco” and it will search and find them!

This is just the tip of the iceberg of what Deep Learning is capable of when added to AI.

🔥 Want to learn more about Machine Learning and Data Science?

You can check out all 3 of Dan’s Machine Learning courses below:

You can take his Complete Machine Learning and Data Science Bootcamp
Check out his TensorFlow Developer Certification program
Or, you can learn everything about PyTorch for Deep Learning in his brand new course!

Go ahead and join any or all of them, and then ask questions in the dedicated Machine Learning Discord channel. You can ask questions there and get answers and help from Daniel, as well as from other experienced and beginner Developers.