Preparing for a SQL interview doesn’t have to be stressful. I’ve been there, and I know how overwhelming it can feel. But here’s the good news - you can absolutely nail it with the right prep. Whether you’re just getting started or diving into more advanced topics, I’ve got you covered.
In this guide, I’ll walk you through some of the most common SQL interview questions, from beginner basics to the trickier advanced stuff. By the time you’re done, you’ll have the knowledge and confidence to tackle anything the interviewer throws your way.
So, let’s jump in and get you ready to ace that interview!
Sidenote: If you find that you’re struggling with the questions in this guide, or perhaps feel that you could use some more training, or simply want to build some more impressive projects for your portfolio, then check out my complete SQL course:
This SQL Bootcamp will take you from complete beginner to a master of SQL, database management, and database design. You'll learn by using fun exercises and working with all database types to give you real-world experience. No prior experience needed.
Updated with all modern SQL and Database (PostgreSQL, MySQL) features for 2024, you'll learn SQL from not one, but two industry experts. Learning SQL and database best practices from this bootcamp will help you level-up your skillset to earn a higher salary and even get you hired.
With that out of the way, let’s get into the questions!
SQL, or Structured Query Language, is the standard way to interact with relational databases.
It allows you to define, manipulate, and retrieve data using simple commands like SELECT
, UPDATE
, and DELETE
. Whether you're querying customer records or building entire databases, SQL is an essential skill for anyone working with data.
For instance, e-commerce platforms rely on SQL to handle product catalogs, customer accounts, and order tracking. Without SQL, interacting with relational data would be cumbersome and prone to errors.
SQL commands are categorized into five primary types:
CREATE
, ALTER
, and DROP
allow you to define tables, indexes, or viewsINSERT
, UPDATE
, and DELETE
for modifying data within tablesGRANT
and REVOKE
are used to manage permissions and control who can access or manipulate the dataCOMMIT
and ROLLBACK
ensure that transactions are completed properly, preventing issues like partial updates or system crashes from corrupting dataSELECT
command, which retrieves data based on conditions you specifyKnowing these command types is vital for effective database management.
For instance, when designing a new database schema, you’ll use DDL commands to create tables, while DML commands let you update records when new data comes in. While using TCL commands ensures that multi-step processes (such as financial transactions) are handled reliably.
A primary key uniquely identifies each record in a table. It ensures that no two rows can have the same value in the primary key column, and it cannot contain null values. The primary key is used in operations like retrieving specific records or establishing relationships with other tables.
Primary keys are essential for database design because they enforce uniqueness, which prevents issues like data duplication.
A foreign key is a column or group of columns that creates a relationship between two tables by referencing the primary key in another table. This ensures that data remains consistent between related tables.
Foreign keys are important because they maintain data integrity across your database. If a customer record is deleted, the foreign key constraint ensures that related orders in the Orders table are handled properly, either by cascading the deletion or preventing the removal until the related records are updated.
Aggregate functions perform calculations on a set of values and return a single result.
Examples include:
SUM()
adds up the values in a numeric columnAVG()
returns the average of the values in a columnCOUNT()
counts the number of rows in a columnMAX()
returns the largest value in a columnMIN()
returns the smallest valueThese functions are useful when you need to analyze data across a large dataset.
For example, if you’re working with sales data, you might use SUM()
to calculate total revenue for the quarter, or COUNT()
to determine how many orders were placed.
Aggregate functions provide key insights into trends and patterns within your data.
WHERE
clause. Each deleted row is logged, allowing the operation to be rolled back if neededTRUNCATE
is generally faster than DELETE
because it doesn't generate individual log entries for each row deleted, making it the better choice when clearing large datasets.
For example, if you're wiping data from a test environment, you don’t need to log each deletion, so TRUNCATE
speeds up the process.
However, DELETE
is more suitable when precision and the ability to undo the operation are required, such as when removing specific rows based on a condition or when transactions need to be controlled.
GROUP BY
clause to filter grouped dataFor example, if you’re querying sales data, you might use WHERE
to select only sales from a specific region. After grouping the data by product, you could use HAVING
to show only those products with sales totals greater than a specified amount.
The key distinction is that WHERE
works on raw data, while HAVING
operates on aggregated data.
Normalization is the process of organizing data in a database to reduce redundancy and improve integrity. It involves dividing large tables into smaller, related tables and defining relationships between them.
The different normal forms include:
Normalization is important because it improves data integrity and makes databases more efficient.
For example, in an e-commerce platform, normalization prevents storing redundant information like customer addresses in multiple places, which helps avoid inconsistencies when the data needs to be updated.
Joins in SQL allow you to retrieve data from multiple tables based on a related column. When working with normalized databases, data is often spread across multiple tables, and joins help you combine this data into a meaningful result set.
Types of joins:
INNER JOIN
: Returns only rows where there’s a match in both tablesLEFT JOIN
: Returns all rows from the left table and the matched rows from the right. Unmatched rows in the right table are filled with NULL
RIGHT JOIN
: Returns all rows from the right table and matched rows from the left. Unmatched rows in the left table are filled with NULL
FULL JOIN
: Returns all rows from both tables. Rows that don’t match in either table are filled with NULL
CROSS JOIN
: Returns the Cartesian product of both tables, combining every row from the first table with every row from the secondExample of an INNER JOIN
:
SELECT Customers.CustomerID, Orders.OrderID
FROM Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
This query retrieves all customers who have placed an order by joining the Customers
and Orders
tables on the CustomerID
field.
Indexes speed up data retrieval by allowing the database to find rows more quickly without having to scan the entire table.
For instance, when querying a table with millions of records, having an index on frequently searched columns like CustomerID
can make queries execute significantly faster.
Without indexes, the database would have to perform a full table scan for each query, which can be extremely slow for large datasets.
However, indexes also take up space and can slow down write operations (like INSERT
or UPDATE
), so it's important to use them judiciously.
A self-join is when a table is joined with itself. This type of join is particularly useful when you want to compare rows within the same table. It’s commonly used in hierarchical data structures, such as employee-manager relationships.
For example, in a table of employees, you might use a self-join to find each employee's manager by joining the table to itself based on the ManagerID
column:
SELECT e1.employee_id, e1.name AS employee_name, e2.name AS manager_name
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.employee_id;
Self-joins are essential for working with recursive or hierarchical data structures where relationships exist within the same entity, like organizational charts or product categories.
A view is a virtual table based on the result of a SELECT
query. Views do not store data but instead present data from one or more tables, making it easier to query complex datasets without having to rewrite complicated queries each time.
Why use views?
Example of creating a view:
CREATE VIEW SalesView AS
SELECT order_id, customer_id, total_amount
FROM orders;
This view returns only the FirstName
, LastName
, and Salary
of employees who work in the Sales department.
INNER JOIN
and OUTER JOIN
An INNER JOIN
returns only the rows that have matching values in both tables. It effectively filters out rows from either table that don’t have a corresponding match in the other table. This is the default join type and is commonly used when you want to retrieve only the rows that meet specific criteria in both tables.
An OUTER JOIN
returns all the rows from one or both tables, regardless of whether there is a match. If no match is found, the missing data is filled with NULL
values.
OUTER JOIN
can be divided into three types:
LEFT OUTER JOIN
: Returns all rows from the left table, along with matched rows from the right table. If there’s no match, NULL
is returned for columns from the right tableRIGHT OUTER JOIN
: Returns all rows from the right table, along with matched rows from the left table. If there’s no match, NULL
is returned for columns from the left tableFULL OUTER JOIN
: Returns all rows from both tables. If there’s no match in either table, NULL
is returned for the missing data from the other tableExample of a LEFT OUTER JOIN:
SELECT Customers.CustomerID, Orders.OrderID
FROM Customers
LEFT OUTER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;
This query retrieves all customers, including those who haven’t placed any orders. If a customer hasn’t placed an order, NULL
is returned for the OrderID
.
The key difference is that INNER JOIN
excludes non-matching rows, while OUTER JOIN
includes all rows from one or both tables and uses NULL
to fill in gaps for unmatched rows.
A subquery is a query nested inside another SQL query, often used in the WHERE
clause to filter results based on the output of another query. Subqueries can be either:
Example of a subquery used to find employees who work in the same department as 'John':
SELECT name
FROM employees
WHERE department_id = (SELECT department_id FROM employees WHERE name = 'John');
Subqueries are particularly useful when a single query isn’t sufficient to retrieve the needed data. For example, if you're looking for records that depend on the results of another query, subqueries provide a convenient solution without requiring multiple steps or temporary tables.
A CTE (Common Table Expression) is a temporary result set that you can reference within a SELECT
, INSERT
, UPDATE
, or DELETE
statement. It’s defined using the WITH
clause and can improve query readability and performance by breaking down complex queries into smaller, reusable parts.
Example of a CTE used to calculate total sales per salesperson:
WITH Sales_CTE AS (
SELECT salesperson_id, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY salesperson_id
)
SELECT salesperson_id, total_sales
FROM Sales_CTE
WHERE total_sales > 10000;
CTEs are especially helpful when working with recursive queries or when you need to reference the same result set multiple times in a query.
They improve query organization and make complex operations easier to manage.
Window functions perform calculations across a set of table rows related to the current row.
Unlike aggregate functions, window functions do not collapse rows into a single result; instead, they calculate values across a set of rows, maintaining individual row outputs.
For example:
Here you can see a window function that calculates the average salary for each department while showing individual salaries:
SELECT employee_id, department_id, salary,
AVG(salary) OVER (PARTITION BY department_id) AS avg_department_salary
FROM employees;
Window functions are used when you need to calculate running totals, moving averages, or rank rows within partitions of data.
They provide powerful analytical capabilities that are more flexible than simple aggregate functions, allowing for complex data analysis without restructuring your query results.
The ACID properties ensure that SQL transactions are processed reliably:
These properties are crucial in scenarios where data consistency and reliability are essential, such as in financial systems. If a bank transfer fails halfway through, atomicity ensures that the funds aren’t deducted from one account without being credited to the other.
Stored procedures are precompiled collections of SQL statements that are stored in the database and can be executed as a single unit. They can accept input parameters, perform operations, and return results.
Example of a stored procedure that retrieves employee details:
CREATE PROCEDURE GetEmployeeDetails
@EmployeeID INT
AS
BEGIN
SELECT * FROM Employees WHERE EmployeeID = @EmployeeID;
END;
Stored procedures are used to encapsulate repetitive tasks, enforce business logic at the database level, and improve performance by reducing the need to send multiple queries across the network.
Since they are stored directly in the database, they also offer better security and can be optimized for faster execution.
Triggers are special types of stored procedures that automatically execute (or “fire”) when certain events occur in the database, such as an INSERT
, UPDATE
, or DELETE
operation.
Triggers are often used to enforce business rules, maintain data integrity, or automate auditing.
For example:
Here you can see a trigger that logs every time a new employee is added:
CREATE TRIGGER trgAfterInsert
ON Employees
AFTER INSERT
AS
BEGIN
PRINT 'A new employee was added to the Employees table';
END;
Triggers are particularly useful for automatic tasks like logging changes, enforcing constraints, or cascading updates.
For example, in an HR system, a trigger can ensure that whenever an employee is added, a corresponding record is created in the payroll table.
OLTP (Online Transaction Processing)
These systems are designed for managing transaction-oriented applications where quick query processing and maintaining data integrity are important.
They support a high volume of short, atomic transactions like order processing, customer management, and financial operations.
OLAP (Online Analytical Processing)
These systems are optimized for complex queries and analysis, often using historical data.
They’re typically used in business intelligence and data warehousing, allowing for the analysis of large datasets and the generation of reports. For example, a retail company might use an OLTP system to handle customer orders in real-time, while using an OLAP system to analyze historical sales trends and create reports that inform business decisions.
UNION
and UNION ALL
UNION
combines the result sets of two or more SELECT
queries and removes duplicate rowsUNION ALL
combines the result sets of two or more SELECT
queries but keeps all duplicate rowsFor example:
SELECT customer_id FROM Customers
UNION
SELECT customer_id FROM Orders;
SELECT customer_id FROM Customers
UNION ALL
SELECT customer_id FROM Orders;
UNION
is useful when you want a clean, deduplicated list of results, such as when merging customer data from multiple sources.
However, UNION ALL
is faster because it doesn't remove duplicates, making it a better choice when you need all records, including duplicates, for analytical purposes.
A primary key is a column (or a set of columns) that uniquely identifies each row in a table. It ensures that no two rows can have the same primary key value, and it does not allow null values.
A table can have only one primary key, but it may consist of multiple columns (a composite key).
For example, in an Employees
table, EmployeeID
could be the primary key, ensuring that each employee has a unique identifier:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Position VARCHAR(50)
);
The primary key is crucial for maintaining data integrity and allows relational databases to efficiently index and access specific records.
Without a primary key, it would be difficult to ensure that each record is unique, leading to potential data duplication and inconsistency.
A table is a physical object in the database that stores data in rows and columns. You can perform various operations on a table, such as inserting, updating, or deleting records. Tables store data directly.
A view, on the other hand, is a virtual table that is the result of a SELECT
query.
Views do not store data directly; instead, they display data from one or more underlying tables. A view can be used to simplify complex queries, provide specific data to users without exposing the full table, or present data in a different format.
For example, a view might show only the name and salary of employees, hiding sensitive information like social security numbers:
CREATE VIEW EmployeeView AS
SELECT FirstName, LastName, Salary
FROM Employees;
Views provide an abstraction layer, which can help control access to data, streamline complex queries, and present data in a more user-friendly format.
A transaction in SQL is a sequence of one or more SQL operations that are executed as a single unit of work.
Transactions are used to ensure that operations on the database are executed reliably and that the database remains consistent even in the event of a failure.
The ACID properties ensure that SQL transactions are processed reliably:
For example, when transferring money between two bank accounts, a transaction ensures that either both the debit and credit operations are completed, or neither is. This guarantees the accuracy and integrity of the financial data:
BEGIN TRANSACTION;
UPDATE Accounts
SET Balance = Balance - 100
WHERE AccountID = 1;
UPDATE Accounts
SET Balance = Balance + 100
WHERE AccountID = 2;
COMMIT;
These properties are crucial in any environment where data accuracy and integrity are essential, such as in banking, ecommerce, or healthcare systems.
There you have it - 24 of the most common SQL questions and answers that you might encounter in your interview.
Preparing for a SQL interview requires a solid understanding of both fundamental and advanced SQL concepts, but by practicing these questions and understanding their underlying principles, you can confidently navigate through your SQL interview.
What did you score? Did you nail all 24 questions? If so, it might be time to move from studying to actively interviewing!
Didn't get them all? Got tripped up on a few? Don't worry; I'm here to help.
If you want to fast-track your SQL knowledge and interview prep, and get as much hands-on practice as possible, then check out my complete SQL course:
Like I said earlier, this SQL Bootcamp will take you from complete beginner to a master of SQL, database management, and database design. You'll learn by using fun exercises and working with all database types to give you real-world experience. No prior experience needed.
Plus, once you join, you'll have the opportunity to ask questions in our private Discord community from me, other students and working SQL Devs.
If you join or not, I just want to wish you the best of luck with your interview!