Unless you’ve been living under a rock, you’ve no doubt heard all the chatter about Artificial Intelligence (AI) in the last 14 months or so.
ChatGPT burst onto the scene, and then every company everywhere seemed to have either:
The thing is, this isn't just some hot new trend that is going to come and go. It’s a rapidly growing industry with some serious potential.
In fact, the market size for the AI industry is projected to reach $305.9 billion by the end of 2024 and is estimated to hit $738.8 billion by 2030 (which honestly feels low, if I were a betting man).
To put that into perspective, here's a side by side comparison of some other tech sectors, their current industry size and projected growth:
As you can see, it’s pretty clear AI is expanding like crazy.
The best part? It's still so early, and AI is evolving so quickly that there aren't many people with hands-on experience in the field.
This means that with a dedicated 3-6 months of study, you can go from not knowing anything about the field to applying the latest state-of-the-art research.
Many people's first interaction with AI over the last year has only been ChatGPT, but that's just one general use case. AI has a role in almost any field you can think of.
Which is why in this guide, I’m going to walk you through the exact steps it takes to become an AI Engineer - one of the key new roles that has already started to emerge during this new AI era.
I’ll walk you through the job, what you’ll be doing, how much you can make, the skills required, and even give you a roadmap of what to learn and when. This way you can get hired, earn great money, and future-proof your career.
Sounds good?
Alright, let’s dive in…
Editor's Note: Given how quickly the industry is evolving, we'll also continue to update and evolve this guide as well, just like we do with all our courses.
To understand the AI Engineer role, you first need to understand what AI is.
The goal of AI (artificial intelligence) is to create machines and programs that can perform tasks that would typically require human intelligence, making our lives easier and our work more efficient.
Common examples of this are self-driving cars, or chat assistants like Siri, which can listen to a command or question and then present an answer.
There are 6 main areas of focus when it comes to AI:
As you might notice, these different areas of focus are often combined in a single system.
For example
A self-driving car will use all of these.
It has:
The short version:
Put most simply, an AI Engineer builds AI applications using pre-built LLMs (large language models) or other machine learning models.
The longer version:
To be honest with you, the answer to this question is still up for debate.
Given how quickly the AI field is evolving, people's definition of what an "AI Engineer" means is different from person to person right now.
It's like jobs in web development.
You can look at 10 job postings for a 'Web Developer' position, and the requirements will be different for each.
At their core, they're all building web applications using code, but what the work actually looks like will be different for each.
Well the same goes for AI Engineers.
While all AI Engineers may have the same job title, the actual work will look different for each of them depending on what that company wants to use AI for.
Some will be more specialized in certain areas. Some will do a bit of everything. But at their core, they're all building AI applications using LLMs or other machine learning models.
A large part of being an AI Engineer is organizing and collaborating with multiple multidisciplinary teams to complete a project.
That means everyone from stakeholders, domain experts, and users, to software developers, ML engineers, data analysts, and data scientists. As a result, both soft skills and technical skills are vital for this role.
An example work process can look something like this:
Work with stakeholders to understand and define the problem, and how AI could solve it.
Because a lot of AI is working with patterns in data, the next step is usually to collect data about the problem that needs to be solved and clean it up.
This can then be used to better understand the issue, and train the AI models.
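To make the collect-and-clean step a little more concrete, here's a minimal sketch using only Python's standard library. The records, the field names (`text` and `label`), and the cleaning rules are all made up for illustration. On a real project you'd more likely reach for a library such as pandas, with rules specific to your dataset.

```python
def clean_records(records):
    """Drop incomplete rows, normalise text, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        text = (row.get("text") or "").strip().lower()
        label = (row.get("label") or "").strip().lower()
        # Skip rows that are missing a value we need for training.
        if not text or not label:
            continue
        # Skip exact duplicates so they don't get over-weighted.
        key = (text, label)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"text": text, "label": label})
    return cleaned


# Hypothetical raw data, as it might arrive from a form or a scrape.
raw = [
    {"text": "  Great product ", "label": "Positive"},
    {"text": "great product", "label": "positive"},  # duplicate once normalised
    {"text": "Arrived broken", "label": None},       # missing label
    {"text": "", "label": "negative"},               # missing text
    {"text": "Would not buy again", "label": "negative"},
]

cleaned = clean_records(raw)
print(cleaned)
```

Out of the five raw rows, only two survive: one copy of the duplicated review and the one complete negative review.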
It's a step you might not think of, but AI Engineers also have to consider any ethical implications and potential biases that might occur by accident, that could affect the data and user experience.
The last thing you want is a racist chatbot trained off of the wrong forums, or a detection system that can recognise a bike, or a person, but not a person pushing a bike.
The next step is to decide on which algorithm to use to solve the problem.
These can be designed from scratch, but usually, if you can define the problem well enough, there are current Machine Learning models that can be used to fit your situation and then optimized further.
You can also start with API offerings such as GPT, Gemini, or Claude to bootstrap your way to an initial product/project stage before starting to think about using your own in-house models. (API stands for Application Programming Interface which typically means using some code someone else has written for your own use case)
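One way to keep that "API first, in-house later" option open is to hide the model behind a small interface. The sketch below is only an illustration: `TextModel`, `StubHostedModel`, `InHouseModel`, and `summarise_ticket` are hypothetical names, and neither stub makes a network call. In real code, the hosted class would wrap whichever vendor SDK you chose.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything that can turn a prompt into text (hypothetical interface)."""

    def generate(self, prompt: str) -> str: ...


class StubHostedModel:
    """Stand-in for a hosted LLM API client. No network call is made."""

    def generate(self, prompt: str) -> str:
        return f"[stub reply to: {prompt}]"


class InHouseModel:
    """Stand-in for a model you might train and host yourself later."""

    def generate(self, prompt: str) -> str:
        return f"[in-house reply to: {prompt}]"


def summarise_ticket(model: TextModel, ticket: str) -> str:
    # Application code depends only on the interface, not on the vendor.
    return model.generate(f"Summarise this support ticket: {ticket}")


print(summarise_ticket(StubHostedModel(), "App crashes on login"))
print(summarise_ticket(InHouseModel(), "App crashes on login"))
```

Because `summarise_ticket` only knows about `generate`, swapping the hosted API for your own model later shouldn't require touching the rest of the application.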
Once a model has been chosen, an AI Engineer (most likely with help from a collaborating team) will write code that uses either the models they've developed or models available through various APIs.
Common languages used are Python, Java, or C++.
That being said, Python is incredibly popular due to its adoption by Data Scientists and Machine Learning Engineers, so it's the language that will most likely be used.
Fun fact, the “Py” in PyTorch (a popular open-source machine learning framework) comes from Python.
The next stage is to train the model from scratch (if brand new) or fine-tune an existing model to suit your own use case.
This involves taking data and feeding it into the AI so that it can start to recognize patterns and make predictions on future data.
This is then refined further and further until it's ready for the next stage of testing.
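If "feeding data in so it can recognize patterns" feels abstract, here's a toy sketch of the idea in plain Python. It fits a straight line with gradient descent, which is far simpler than the models you'd train in practice. The data, learning rate, and number of epochs are assumptions chosen so the example converges.

```python
def train(xs, ys, learning_rate=0.02, epochs=5000):
    """Fit y = w * x + b with plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Nudge the parameters in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up training data that follows the pattern y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

w, b = train(xs, ys)
prediction = w * 6.0 + b  # predict for an input the model has never seen
print(round(w, 2), round(b, 2), round(prediction, 2))
```

The model is never told the rule. It should still land close to w = 2 and b = 1 just from the examples, which lets it predict roughly 13 for the unseen input 6.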
However, if you decide to use an existing API such as GPT, Claude, or Gemini, you may not need to fine-tune a model and can instead focus on prompt engineering. (This is a technique used to get LLMs to produce outputs specific to your use case).
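As a rough illustration of prompt engineering, here's a sketch of a "few-shot" prompt, where you show the model a couple of worked examples before the real input. The `build_prompt` helper, the task wording, and the example reviews are hypothetical. The resulting string is what you'd send to whichever LLM API you're using.

```python
def build_prompt(task, examples, user_input):
    """Assemble a few-shot prompt: instructions, worked examples, then the new input."""
    lines = [task, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # Leave the final output blank so the model fills it in.
    lines.append(f"Input: {user_input}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_prompt(
    task="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("Loved it, works perfectly.", "positive"),
        ("Broke after two days.", "negative"),
    ],
    user_input="Fast delivery and great quality.",
)
print(prompt)
```

Changing the task wording or the examples is often enough to steer the output, without touching the model itself.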
Now that the model is trained and validated, the next step is to implement it into software applications or systems - such as databases, applications, interfaces, or other elements.
This is so the model can be tested ‘in the wild’, but in a controlled environment.
Now that the model is implemented in the system or application, AI Engineers will run tests to make sure that it works as intended, before deploying it to the core users.
Often you’ll be looking for ways to not only check that it works, but also to improve efficiency further (as compute time equals cost, and if a model takes a long time to make a prediction, that isn’t a great experience).
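Here's a minimal sketch of how you might measure prediction time. `fake_predict` is a hypothetical stand-in for a real model call, and the number of runs is an arbitrary choice.

```python
import statistics
import time


def fake_predict(features):
    """Hypothetical stand-in for a real model's predict call."""
    return sum(features) / len(features)


def measure_latency(predict, sample, runs=200):
    """Return median and worst-case latency in milliseconds over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings), max(timings)


median_ms, worst_ms = measure_latency(fake_predict, [0.2, 0.4, 0.6])
print(f"median: {median_ms:.4f} ms, worst: {worst_ms:.4f} ms")
```

Looking at the worst case as well as the median matters, because a model that is usually fast but occasionally very slow still makes for a poor user experience.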
Notice the trend of continual testing. This is paramount with any machine learning or AI system. Due to the probabilistic nature of the models, their outputs can’t be guaranteed so they must be continually checked and refined.
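One common way to do that continual checking is to keep a small set of test cases and re-run it whenever the model, prompt, or data changes. The sketch below is only an illustration: `keyword_classifier` is a deliberately naive stand-in for a real model, and the test cases and the 0.7 threshold are made up.

```python
def keyword_classifier(text):
    """Hypothetical stand-in for a model. A real one would be probabilistic."""
    lowered = text.lower()
    return "positive" if "great" in lowered or "love" in lowered else "negative"


def evaluate(model, cases):
    """Return the fraction of cases where the model's output matches the expected label."""
    correct = sum(1 for text, expected in cases if model(text) == expected)
    return correct / len(cases)


EVAL_CASES = [
    ("I love this app", "positive"),
    ("Great support team", "positive"),
    ("Keeps crashing", "negative"),
    ("Not great, not terrible", "negative"),  # tricky case the stand-in gets wrong
]

MIN_ACCURACY = 0.7  # assumed quality bar; pick one that suits your use case

accuracy = evaluate(keyword_classifier, EVAL_CASES)
passed = accuracy >= MIN_ACCURACY
print(f"accuracy={accuracy:.2f}, passed={passed}")
```

The stand-in gets three of the four cases right, so it clears the bar here. If a later change dropped it below the threshold, you'd know before your users did.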
After successful testing in a controlled environment, it’s now time to deploy this into the real world.
This can still be in a smaller test group at first to see how it goes, before deploying further in staggered stages.
During those initial deployments, AI Engineers will continue to monitor the performance and identify any issues or improvements.
If that all goes well, they continue to release to the public and update and improve as needed.
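A staggered rollout like this is often done by assigning each user to a stable "bucket" and gradually raising the percentage of buckets that get the new model. Here's a minimal sketch. The `in_rollout` helper, the user IDs, and the 5% / 25% / 100% stages are assumptions for illustration.

```python
import hashlib


def in_rollout(user_id, percent):
    """Deterministically place a user in the first `percent` of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


# Hypothetical user IDs.
users = [f"user-{i}" for i in range(1000)]

for stage in (5, 25, 100):
    enabled = sum(in_rollout(user, stage) for user in users)
    print(f"{stage}% stage: {enabled} of {len(users)} users see the new model")
```

Because the bucket comes from a hash of the user ID rather than a random draw, anyone included at the 5% stage stays included at 25% and 100%, so their experience doesn't flip back and forth between releases.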
This is a question that comes up a lot, and a lot of people use these two terms interchangeably (in fact, I'm guilty of it at times too).
However, there are some key differences.
Although both roles have similar skill sets, ML Engineers are more focused on the machine learning models themselves and will even build them from scratch, whereas AI Engineers cover all aspects of developing an AI application, and usually use a pre-built AI model via an API or similar.
For example
ML Engineering focuses more on the creation and development of the AI models to help bring that project to life.
AI Engineering, by contrast, is more about planning, developing, and implementing an AI application/solution, and therefore requires a broader AI skillset.
If we look back at the AI Engineer's workflow from before, you can see the difference:
An AI Engineer is someone who builds something with AI as the main feature of a product.
For example
If you're building an application on top of ChatGPT or on top of StableDiffusion, you're an AI Engineer. You're not necessarily building your own AI, but you are using it predominantly.
To contrast that, an ML Engineer is someone who builds, tweaks and optimizes custom machine learning models.
That covers everything from collecting a dataset, to refining model architectures, to performing transfer learning to adapt pre-trained models to custom domains, to ensuring that their models can run on specific hardware.
Case in point, when I was building the computer vision models that power Nutrify, I would classify myself as a Machine Learning Engineer, as I had to curate a dataset (food images), tune a model via transfer learning and ensure the model ran fast on mobile devices (iPhones).
And when I use LLM APIs such as GPT, Gemini, or Claude to enrich our food image datasets with text descriptions, I’d classify myself as an AI Engineer (using a pre-built API to satisfy a use case rather than training my own models).
To summarize:
Or even:
However, the two overlap quite a bit. I switch several times a day between AI Engineer and ML Engineer.
Most new people to the field today will start as an AI Engineer (using existing AI tools) and then move to being an ML Engineer (building their own custom AI tools).
Next up is the question you really want answered, isn't it 😉?
Obviously this can vary based on location, experience, and the company you apply to.
However, according to ZipRecruiter, the average salary for an AI Engineer in the US is $101,752, and Senior AI Engineers get around $126,557 per year.
Some FAANG companies are even offering as high as $300,000 - $400,000 with stock options. That sounds pretty good to me!
Also, at the time of writing this, there are 31,156 remote AI Engineer jobs available in the US.
Not at all.
Although some FAANG companies may request a degree in CS or Mathematics, the majority of them will hire based on expertise instead.
This means that companies would rather have someone with hands-on experience and a portfolio of relevant projects, vs a degree only, as it shows you can do the work.
However, it should be said that because this is such a fast-paced and evolving industry, you are required to stay on top of your game and keep learning - regardless of your background.
That’s pretty much true for all areas of tech though.
When you enter the tech field, you’re signing up for a life-long learning journey.
It really depends on which path you choose to take. If you go for a Computer Science degree first, then you’re immediately adding 3 to 5 years to your timeline.
If you add a Masters or PhD on top of that so that you can apply for more Senior roles, then be prepared to add another 4-6 years or longer, as well as drop $40,000 - $80,000 in school fees.
Yeesh! The good news is that there is a faster and easier path (that costs considerably less), so let’s get into it.
If we break it down into simple terms, there are really 5 major milestones that you need to cover to become an AI Engineer:
Obviously, each of these steps can have multiple components, so let’s take a look at each of them in more detail.
Sidenote: We actually have an entire career path here that lays out what to learn, and in what order, to become an AI Engineer and get hired.
Be sure to check it out after you read this post, and then follow along. Trust me when I say that it’s a far quicker path than going the degree route.
When we interviewed students who had taken this online career path and got hired, we found that the time to learn and complete it fell into one of 3 time frames, depending on how much time they could commit to it per week:
That’s a bit better than 40 hours a week for 8 years and $80,000 in debt, right?
Go check it out now. Otherwise, I’ll walk you through the general steps here.
This is where you’ll spend the majority of your time learning to become an AI Engineer, as obviously, you need to learn how to do the job.
Most people struggle to learn new things simply because they lack systems to learn effectively. It’s not their fault. It’s generally not a skill taught in school, which is ironic.
This is why we have a course on the topic.
It’ll not only help you pick up new skills faster but also help you retain and understand them. When you’re learning a lot of new things, this can cut your time down considerably.
Although AI Engineers might use Python, Java, or C++, Python is the language that you're most likely to come across in the field.
Also, Python is an excellent first programming language to learn, so even if you pick up the others later on, you can start here and get moving and then come back to more languages when needed.
Learn machine learning fundamentals, including data preprocessing and various algorithms and models, and gain a fundamental understanding of specific subfields of AI, such as natural language processing, computer vision, and robotics.
You’ll also need to learn a programming language such as Python (Python is particularly popular in the AI community due to its simplicity and extensive libraries, including those for machine learning and data science).
Good news?
Large Language Models (LLMs) like the one that powers ChatGPT are only growing in popularity, so it’s worth spending some time to understand how they work.
We have a few other more specific courses on this that you can check out:
Check those out and see how they can help you.
Also, depending on when you read this, there may be new AI tools on the market that are specific to your role, so do a quick Google search to see if there is anything that can help, and play around with it.
Sidenote: You could probably start applying for jobs right now as a Junior AI Engineer, but let’s walk through the rest of the recommended education, so you can smash your interviews.
Familiarize yourself with popular AI frameworks and libraries like TensorFlow, PyTorch, and scikit-learn.
These tools are the building blocks of modern AI models and will give you an understanding of Deep Learning.
Because you’ll be collaborating with other teams and stakeholders, you need to be able to work and communicate with people effectively.
It’s one thing to build the thing, but it’s another to be able to explain it to someone.
We’ve got a great article that contains 7 tips for how to improve your soft skills.
Develop a solid understanding of how data structures and algorithms work. This will not only help you in your overall understanding but will also help you to ace the technical interview (more on this later).
The concepts here come up during interviews, so it's vital to learn them - even if you might never use them in your daily work.
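To give you a flavor of the kind of question that comes up, here's one classic example, sketched in Python: given a list of numbers, find two that add up to a target. This is my own pick as an illustration, not a question from any specific company. It shows how choosing the right data structure (a dictionary) turns a slow nested loop into a single pass.

```python
def two_sum(numbers, target):
    """Return the indices of two numbers that add up to target, or None."""
    seen = {}  # maps each value we've passed to the index where we saw it
    for index, value in enumerate(numbers):
        complement = target - value
        # A dictionary lookup is fast, so we avoid re-scanning the list.
        if complement in seen:
            return seen[complement], index
        seen[value] = index
    return None


print(two_sum([2, 7, 11, 15], 9))  # (0, 1)
```

In an interview, being able to explain why the dictionary version is faster than checking every pair is usually worth as much as the code itself.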
Like I said earlier, a lot of tech companies will hire based on your proven ability to do the work, so you have to be able to show them what you can do.
This means creating a project portfolio, as well as a GitHub profile to show your work, and then filling it with relevant examples.
The portfolio course above will show you how to create an awesome no-code site that will stand out with employers, as well as how to write your resume and application for later on, so don’t miss it.
Depending on the path you take, you should already have some projects completed from your learning journey so far.
For example
The course I shared earlier has 3 major projects for you to work on and add to your portfolio.
You can also supplement these with additional projects to help you further stand out, and also further refine your skills.
Here are 3 new projects that you can follow along with and create inside the ZTM Academy:
Although ZTM is packed with guided projects, we always recommend taking it a step further and creating your own projects or furthering the existing ones to practice figuring things out on your own.
My favorite piece of career advice for getting a job in tech, especially machine learning and AI, is simple: Start the job before you have it.
How? Start writing code to do the things you'd like it to do. It will be hard but let your curiosity pull you through.
Many of the latest AI and machine learning models can be applied in a few lines of code. But they still need to be directed where to go. Building something yourself is what gets you specific knowledge, the type of knowledge you can't get from courses.
Some people will tell you to apply for internships and things like that so that you can get in-person experience.
Here’s the thing though. Unless it’s your absolute dream company, and it’s the only way you’ll get your foot in the door, or you're learning this at 15 and too young to be hired, then don’t bother with internships.
Just apply for junior AI Engineering roles instead, as this is the best way to get hands-on experience, and will pay far better.
Also, if you follow the projects that we’ve shared above, most companies will be blown away by your current skill level. (This is a common quote from our students. We even just helped someone score a senior ML role at Nvidia after taking these same courses).
Don’t undersell yourself. Apply now and start practicing preparing for interviews.
If this is your first ever programming / technical job, you need to understand that the interview process at tech companies is different from what you might have experienced elsewhere.
Because they care more about whether you can do the work than about a degree or certificate, they not only want you to show your portfolio, but also want you to prove your skills across multiple stages of interviews.
It can vary based on the company, but the process is usually:
In reality, the most difficult part is the technical interview, as they’ll ask you about AI, and also about data structures and algorithms. However, with hands-on experience, you’ll be learning these things as you go.
Remember, your problem-solving process often matters more than an immediately correct solution, so if you don’t know something, talk through how you would go about trying to solve it.
I also recommend you check out my 'No BS' Guide To Getting A Machine Learning (or AI) Job.
Sidenote: If you’re applying for FAANG or similar-level companies, you might also get some more theoretical, CS degree type questions. You can learn how to pass these FAANG interview questions here.
So check out our ML + AI Engineering career path now to go from absolutely zero experience to getting hired.
Even if you can only manage 1-10 hours a week of diligent study and dedicated practice, you can still get hired in this career within 12 months, and even sooner if you can spare a few more hours.
What do you have to lose? Get started now!
That's one price for all those courses and more.
And better still? You also get access to the private ZTM Discord community, where you can ask questions of me, your other instructors, other students, and current working AI Engineers.