Welcome, would-be SQL warriors! Are you getting ready for an upcoming SQL interview? Are you feeling stressed out about all the questions that could be asked? Don’t worry—this complete guide will give you the knowledge and strategies to handle any SQL challenge that comes your way.
What is SQL?
Structured Query Language (SQL) is the standard language for working with relational databases. Data analysts, data scientists, and software developers need it because it lets them retrieve, modify, and analyze the data stored in those databases.
Preparing for the Interview
Before diving into specific questions, let’s establish a solid foundation for your interview preparation.
1. Brush up on the basics:
- Data types: Understand the different data types used in SQL, such as integers, strings, dates, and booleans.
- Operators: Familiarize yourself with various operators, including arithmetic, comparison, logical, and string operators.
- Control flow statements: Learn about control flow statements like IF-THEN-ELSE and CASE statements.
- Functions: Explore built-in functions for data manipulation, aggregation, and string manipulation.
- Joins: Master different join types like INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
2. Practice with sample questions:
- Toptal’s 41 Essential SQL Interview Questions: This comprehensive list covers a wide range of topics, from basic syntax to advanced concepts.
- InterviewBit’s SQL Interview Questions and Answers: This resource provides clear explanations and examples for various SQL interview questions.
- LeetCode’s SQL Interview Questions: This platform offers a variety of SQL problems to test your skills and prepare for real-world scenarios.
3. Sharpen your problem-solving skills:
- Focus on understanding the problem: Read the question carefully and identify the key requirements.
- Break down the problem into smaller steps: Decompose the problem into manageable chunks to simplify the solution.
- Write clear and concise SQL queries: Use proper syntax, indentation, and comments to enhance readability and maintainability.
- Test your queries: Ensure your queries return the expected results using sample data or a database.
Top 10 SQL Interview Questions
Now, let’s delve into the top 10 SQL interview questions you’re likely to encounter:
1. What does UNION do? What is the difference between UNION and UNION ALL?
UNION combines the contents of two structurally-compatible tables into a single table, eliminating duplicate records.
UNION ALL also combines tables but includes all records, even duplicates.
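To make the difference concrete, here is a minimal sketch that runs both operators against an in-memory SQLite database from Python. The tables `a` and `b` and their contents are made up purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x").fetchall()

# UNION ALL keeps every row, including the repeated 2 and 3.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x").fetchall()

print(union_rows)      # [(1,), (2,), (3,), (4,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,), (3,), (4,)]
```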
2. List and explain the different types of JOIN clauses supported in ANSI-standard SQL.
- INNER JOIN: Returns rows where there’s a match in BOTH tables.
- LEFT JOIN: Returns all rows from the left table and matching rows from the right table.
- RIGHT JOIN: Returns all rows from the right table and matching rows from the left table.
- FULL JOIN: Returns all rows from BOTH tables, regardless of whether there’s a match.
- CROSS JOIN: Returns all possible combinations of rows from the joined tables.
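The join types above can be compared side by side with a small sketch. It uses Python's built-in sqlite3 module, and the `emp` and `dept` tables and their rows are hypothetical sample data. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER, name TEXT);
    CREATE TABLE emp (id INTEGER, name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'Ops');
    INSERT INTO emp VALUES (1, 'Ann', 1), (2, 'Bob', NULL), (3, 'Cy', 1);
""")

# INNER JOIN: only employees whose dept_id matches a department.
inner = conn.execute("""
    SELECT e.name, d.name FROM emp e
    INNER JOIN dept d ON d.id = e.dept_id ORDER BY e.id""").fetchall()

# LEFT JOIN: every employee; unmatched ones get NULL (None) for dept columns.
left = conn.execute("""
    SELECT e.name, d.name FROM emp e
    LEFT JOIN dept d ON d.id = e.dept_id ORDER BY e.id""").fetchall()

# CROSS JOIN: every employee paired with every department (3 x 2 rows).
cross = conn.execute(
    "SELECT COUNT(*) FROM emp CROSS JOIN dept").fetchone()[0]

print(inner)  # [('Ann', 'Sales'), ('Cy', 'Sales')]
print(left)   # [('Ann', 'Sales'), ('Bob', None), ('Cy', 'Sales')]
print(cross)  # 6
```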
3. Given two tables, write a query to select records from one table that are not present in the other.
Use NOT IN (or NOT EXISTS, which behaves more predictably when the subquery can return NULLs) to keep only the records from the first table that have no matching value in the second table.
4. Given two tables, write a query to update a column in one table based on a condition in the other table.
Use the UPDATE statement with a JOIN clause to update the target column based on the condition in the other table.
5. What is the difference between the WHERE and HAVING clauses?
WHERE filters records before grouping. HAVING filters groups after aggregation.
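A short sketch can show the two filters acting at different stages. The `orders` table and its rows are assumptions invented for this example, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 10), ('ann', 60), ('bob', 5), ('bob', 7), ('cy', 100);
""")

# WHERE drops individual rows before grouping, so both of bob's
# small orders disappear and bob has no group at all.
where_rows = conn.execute("""
    SELECT customer, SUM(amount) FROM orders
    WHERE amount >= 10
    GROUP BY customer ORDER BY customer""").fetchall()

# HAVING tests each group after aggregation, so bob's total (12) qualifies.
having_rows = conn.execute("""
    SELECT customer, SUM(amount) FROM orders
    GROUP BY customer
    HAVING SUM(amount) >= 10 ORDER BY customer""").fetchall()

print(where_rows)   # [('ann', 70), ('cy', 100)]
print(having_rows)  # [('ann', 70), ('bob', 12), ('cy', 100)]
```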
6. Given a table, write a query to select the first 100 odd-numbered records.
Use the MOD operator to check if the record ID is odd, and limit the result to 100 records using TOP or LIMIT.
7. What is the difference between the RANK() and DENSE_RANK() functions?
RANK() skips rank values after a tie, resulting in gaps. DENSE_RANK() always assigns consecutive ranks, so ties never leave gaps.
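The sketch below applies both functions to a small set containing ties. It assumes an SQLite build with window-function support (3.25 or later, which recent Python releases bundle); the `scores` table and its values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (v INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)",
                 [(25,), (25,), (50,), (75,), (75,), (100,)])

# Rank the same ordered values with both functions to compare the results.
rows = conn.execute("""
    SELECT v,
           RANK()       OVER (ORDER BY v) AS rnk,
           DENSE_RANK() OVER (ORDER BY v) AS dense_rnk
    FROM scores
    ORDER BY v""").fetchall()

ranks = [r[1] for r in rows]
dense_ranks = [r[2] for r in rows]
print(ranks)        # [1, 1, 3, 4, 4, 6]
print(dense_ranks)  # [1, 1, 2, 3, 3, 4]
```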
8. Given a table, write a query to select all even-numbered records and all odd-numbered records.
Use the MOD operator to check if the record ID is even or odd and filter accordingly.
9. What is the difference between CHAR and VARCHAR2 data types?
CHAR allocates a fixed length and pads with blanks if the value is shorter. VARCHAR2 allocates space only for the actual value, which usually makes it more space-efficient for variable-length data.
10. Write a SQL query to transpose text, converting each character in a column into a separate row.
Use a combination of string functions and recursion to achieve the desired output.
By diligently following these tips and practicing with various SQL interview questions, you’ll be well-equipped to ace your upcoming interview. Remember, confidence, clarity, and a strong understanding of SQL concepts will set you apart from the competition.
Additional Resources
- SQL Tutorial: https://www.w3schools.com/sql/default.asp
- SQL Interview Questions and Answers: https://www.interviewbit.com/sql-interview-questions/
- Toptal’s 41 Essential SQL Interview Questions: https://www.toptal.com/sql/interview-questions
Remember, the road to SQL mastery is paved with practice and perseverance. Keep learning, keep practicing, and keep conquering those SQL challenges!
1. What does UNION do? What is the difference between UNION and UNION ALL?
UNION merges the contents of two structurally-compatible tables into a single combined table. The difference between UNION and UNION ALL is that UNION will omit duplicate records whereas UNION ALL will include duplicate records.
It is important to note that the performance of UNION ALL will typically be better than UNION, since UNION requires the server to do the additional work of removing any duplicates. So, in cases where it is certain that there will not be any duplicates, or where having duplicates is not a problem, use of UNION ALL would be recommended for performance reasons.
2. List and explain the different types of JOIN clauses supported in ANSI-standard SQL.
ANSI-standard SQL specifies five types of JOIN clauses as follows:
- INNER JOIN (a.k.a. “simple join”): Returns all rows for which there is at least one match in BOTH tables. This is the default type of join if no specific JOIN type is given.
- LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, plus the matching rows from the right table; i.e., the results will contain all records from the left table, even if the JOIN condition doesn’t find any matching records in the right table. This means that if the ON clause doesn’t match any records in the right table, the JOIN will still return a row for that record in the left table, but with NULL in each column from the right table.
- RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table, plus the matching rows from the left table. This is the exact opposite of a LEFT JOIN; i.e., the results will contain all records from the right table, even if the JOIN condition doesn’t find any matching records in the left table. This means that if the ON clause doesn’t match any records in the left table, the JOIN will still return a row for that record in the right table, but with NULL in each column from the left table.
- FULL JOIN (or FULL OUTER JOIN): Returns all rows for which there is a match in EITHER of the tables. Conceptually, a FULL JOIN combines the effect of applying both a LEFT JOIN and a RIGHT JOIN; i.e., its result set is equivalent to a UNION of the results of the left and right outer queries.
- CROSS JOIN: Returns all records where each row from the first table is combined with every row from the second table (i.e., returns the Cartesian product of the sets of rows from the joined tables). Note that a CROSS JOIN can be specified either (a) using the CROSS JOIN syntax (“explicit join notation”) or (b) by listing the tables in the FROM clause, separated by commas, without using a WHERE clause to supply join criteria (“implicit join notation”).
3. Given the following tables:
What will be the result of the query below?
Explain your answer and also provide an alternative version of this query that avoids the issue it exposes.
Surprisingly, given the sample data provided, the result of this query will be an empty set. The reason for this is that the outer query will return an empty set if the set being checked by the NOT IN condition has any null values, even if there are many runner IDs that match winner IDs in the races table.
Knowing this, a query that avoids this issue would be as follows:
Note, this is assuming the standard SQL behavior that you get without modifying the default ANSI_NULLS setting.
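The NULL trap described above can be reproduced with a small sketch in Python's sqlite3 module. The `runners` and `races` tables, their columns, and their rows are assumptions chosen to mirror the description (one race has no winner yet); they are not the exact tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE runners (id INTEGER, name TEXT);
    CREATE TABLE races (id INTEGER, winner_id INTEGER);
    INSERT INTO runners VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO races VALUES (1, 2), (2, NULL);
""")

# The NULL in winner_id makes every NOT IN comparison UNKNOWN,
# so no runner passes the filter.
naive = conn.execute("""
    SELECT name FROM runners
    WHERE id NOT IN (SELECT winner_id FROM races)""").fetchall()

# Excluding NULLs inside the subquery restores the expected result.
fixed = conn.execute("""
    SELECT name FROM runners
    WHERE id NOT IN (SELECT winner_id FROM races
                     WHERE winner_id IS NOT NULL)
    ORDER BY id""").fetchall()

print(naive)  # []
print(fixed)  # [('Ann',), ('Cy',)]
```

A NOT EXISTS subquery correlated on the runner ID is another common way to sidestep the problem.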
4. Given two tables created and populated as follows:
What will the result be from the following query:
The result of the query will be as follows:
The EXISTS clause in the above query is a red herring. It will always be true, since ID is not a member of dbo.docs. As such, it will refer to the envelope table, comparing itself to itself!
The idnum value of NULL will not be set, since a join on NULL does not return a result when attempting a match with any value of envelope.
5. Assume a schema of Emp (Id, Name, DeptId), Dept (Id, Name).
If the Emp table has 10 records and the Dept table has 5 records, how many rows will be shown when you run the following SQL query?
Because the query has no WHERE clause to supply join criteria, it performs a cross join (Cartesian product) and returns 10 × 5 = 50 rows.
6. Given two tables created as follows:
Write a query to fetch the values in table test_a that are not in table test_b, without using the NOT keyword.
Note, Oracle does not support the above INSERT syntax, so you would need this instead:
In SQL Server, PostgreSQL, and SQLite, this can be done using the EXCEPT keyword as follows:
In Oracle, the MINUS keyword is used instead. Note that if there are multiple columns, say ID and Name, the column should be explicitly stated in Oracle queries: SELECT ID FROM test_a MINUS SELECT ID FROM test_b
MySQL versions before 8.0.31 do not support the EXCEPT operator. However, there is a standard SQL solution that works in all of the above engines, including MySQL:
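As a hedged illustration, the sketch below runs both approaches against an in-memory SQLite database from Python. The sample rows are assumptions chosen for the demo, and the LEFT JOIN ... IS NULL form is one common engine-neutral way to write this without the NOT keyword; it may differ from the exact query the text refers to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_a (id INTEGER);
    CREATE TABLE test_b (id INTEGER);
    INSERT INTO test_a VALUES (10), (20), (30), (40), (50);
    INSERT INTO test_b VALUES (10), (30), (50);
""")

# Set difference with EXCEPT (SQL Server, PostgreSQL, SQLite).
except_rows = conn.execute("""
    SELECT id FROM test_a
    EXCEPT
    SELECT id FROM test_b
    ORDER BY id""").fetchall()

# Anti-join: keep the rows of test_a that found no partner in test_b.
join_rows = conn.execute("""
    SELECT a.id FROM test_a a
    LEFT JOIN test_b b ON a.id = b.id
    WHERE b.id IS NULL
    ORDER BY a.id""").fetchall()

print(except_rows)  # [(20,), (40,)]
print(join_rows)    # [(20,), (40,)]
```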
7. Write a SQL query to find the 10th highest employee salary from an Employee table. Explain your answer.
(Note: You may assume that there are at least 10 records in the Employee table.)
This can be done as follows:
This works as follows:
The inner query, SELECT DISTINCT TOP (10) Salary FROM Employee ORDER BY Salary DESC, selects the 10 highest distinct salaries, listed in descending order. Picking the first item on that list would give you the highest salary, not the tenth highest.
The outer query therefore re-sorts those 10 records in ascending order (the default sort order) and selects the top record, which is now the lowest of the 10 salaries, i.e., the 10th highest overall.
Not all databases support the TOP keyword. For example, MySQL and PostgreSQL use the LIMIT keyword, as follows:
Or even more concisely, in MySQL this can be:
And in PostgreSQL this can be:
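A minimal sketch of the LIMIT-based approach, run against SQLite from Python (SQLite accepts the same LIMIT ... OFFSET form as MySQL and PostgreSQL). The salary values are invented for the demo, and both queries are illustrations rather than the exact snippets referred to above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Salary INTEGER)")
# Hypothetical data: 100, 200, ..., 1200, with 1200 appearing twice.
salaries = [(s,) for s in range(100, 1300, 100)] + [(1200,)]
conn.executemany("INSERT INTO Employee (Salary) VALUES (?)", salaries)

# Two-step form: take the top 10 distinct salaries, then pick the lowest.
nested = conn.execute("""
    SELECT Salary FROM (
        SELECT DISTINCT Salary FROM Employee
        ORDER BY Salary DESC LIMIT 10
    ) ORDER BY Salary ASC LIMIT 1""").fetchone()[0]

# Concise form: skip the 9 highest distinct salaries and take the next one.
concise = conn.execute("""
    SELECT DISTINCT Salary FROM Employee
    ORDER BY Salary DESC LIMIT 1 OFFSET 9""").fetchone()[0]

print(nested, concise)  # 300 300
```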
8. Write a SQL query using UNION ALL (not UNION) that uses the WHERE clause to eliminate duplicates. Why might you want to do this?
You can avoid duplicates using UNION ALL and still run much faster than UNION DISTINCT (which is actually the same as UNION) by running a query like this:
The key is the AND a!=X part. This gives you the benefits of the UNION (a.k.a. UNION DISTINCT) command, while avoiding much of its performance hit.
9. Given the following tables:
Write a query to get a list of all the users who took the same training lesson more than once on the same day. The list should be sorted by user and training lesson, with the most recent lesson date first and the oldest last.
10. What is an execution plan? When would you use it? How would you view the execution plan?
An execution plan is basically a road map that shows, in text or graphical form, the data retrieval methods chosen by the SQL server’s query optimizer for a stored procedure or ad hoc query. Because the plan is what the server actually uses to run the query or stored procedure, execution plans are a great way for developers to understand and analyze its performance characteristics.
In many SQL systems, a textual execution plan can be obtained using a keyword such as EXPLAIN, and a visual representation can often be obtained as well. In Microsoft SQL Server, the Query Analyzer has an option called “Show Execution Plan” in the Query drop-down menu. If this option is turned on, it will display query execution plans in a separate window when a query is run.
11. List and explain each of the ACID properties that collectively guarantee that database transactions are processed reliably.
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. They are defined as follows:
- Atomicity. Atomicity requires that each transaction be “all or nothing”: if any part of the transaction fails, the whole transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in every situation, including power failures, errors, and crashes.
- Consistency. The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination of these.
- Isolation. The isolation property ensures that the concurrent execution of transactions results in the same system state that would be obtained if the transactions were executed serially, i.e., one after the other. Providing isolation is the main goal of concurrency control. Depending on the concurrency control method (i.e., if it uses strict, as opposed to relaxed, serializability), the effects of an incomplete transaction might not even be visible to another transaction.
- Durability. Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for example, once a group of SQL statements executes, the results need to be stored permanently, even if the database crashes immediately afterward. To defend against power loss, transactions (or their effects) must be recorded in non-volatile memory.
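Atomicity is the easiest of the four properties to demonstrate in a few lines. The sketch below is an illustration using Python's sqlite3 module; the `account` table, its CHECK constraint, and the balances are hypothetical. A two-step transfer fails on its second statement, and the rollback undoes the first one as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)
    )""")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

try:
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + 200 WHERE id = 2")
        # Account 1 would drop to -100, which violates the CHECK constraint.
        conn.execute(
            "UPDATE account SET balance = balance - 200 WHERE id = 1")
except sqlite3.IntegrityError:
    pass  # the whole transfer was rolled back

balances = conn.execute(
    "SELECT balance FROM account ORDER BY id").fetchall()
print(balances)  # [(100,), (50,)]
```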
12. Given a table dbo.users where the column user_id is a unique numeric identifier, how can you efficiently select the first 100 odd user_id values from the table?
(Assume the table contains well over 100 records with odd user_id values.)
SELECT TOP 100 user_id FROM dbo.users WHERE user_id % 2 = 1 ORDER BY user_id
13. What are the NVL and the NVL2 functions in SQL? How do they differ?
Both the NVL(exp1, exp2) and NVL2(exp1, exp2, exp3) functions check the value exp1 to see if it is null.
With NVL(exp1, exp2), the value of exp1 is returned if exp1 is not null. If exp1 is null, the value of exp2 is returned, cast to the same data type as exp1.
With NVL2(exp1, exp2, exp3), the value of exp2 is returned if exp1 is not null; otherwise, the value of exp3 is returned.
14. How can you select all the even number records from a table? All the odd number records?
To select all the even number records from a table:
To select all the odd number records from a table:
15. What is the difference between the RANK() and DENSE_RANK() functions? Provide an example.
The only difference between the RANK() and DENSE_RANK() functions is in cases where there is a “tie”, i.e., in cases where multiple values in a set have the same ranking. In such cases, RANK() assigns non-consecutive ranks to the values in the set, leaving gaps between the integer ranking values when there is a tie. DENSE_RANK(), on the other hand, assigns consecutive ranks, so there are no gaps between the integer ranking values even when there is a tie.
For example, consider the set {25, 25, 50, 75, 75, 100}. For this set, RANK() will return [1, 1, 3, 4, 4, 6] (note that 2 and 5 are skipped), whereas DENSE_RANK() will return [1, 1, 2, 3, 3, 4].
16. What is the difference between the WHERE and HAVING clauses?
When GROUP BY is not used, the WHERE and HAVING clauses are essentially equivalent.
However, when GROUP BY is used:
- The WHERE clause is used to filter records from a result. The filtering occurs before any groupings are made.
- The HAVING clause is used to filter values from a group (i.e., to check conditions after the data has been aggregated into groups).
17. What will happen if you run the following SQL query on a table called Employee that has columns called empName and empId?
“Order by 2” is only valid when at least two columns are used in the SELECT statement. This query, however, selects only one column from the Employee table, even though the table has two columns. As a result, “Order by 2” will cause the statement to fail when the above SQL query is run.
18. What will be the output of the below query, given an Employee table having 10 records?
This query will return 10 records, as TRUNCATE was executed inside the transaction and can therefore be rolled back. TRUNCATE does not itself keep a log, but BEGIN TRANSACTION keeps track of the TRUNCATE command.
19.
- What is the difference between single-row functions and multiple-row functions?
- What is the GROUP BY clause used for?
- Single-row functions work with a single row at a time. Multiple-row functions work with data from multiple rows at a time.
- The GROUP BY clause combines all the records that have identical values in a particular field or group of fields.
20. Imagine a table with only one column that has either a single digit (0–9) or a single character (a–z, A–Z). Write an SQL query that will print “Fizz” for any number value and “Buzz” for any letter value in that column.
Example:
[d, x, T, 8, a, 9, 6, 2, V]
…should output:
[Buzz, Buzz, Buzz, Fizz, Buzz, Fizz, Fizz, Fizz, Buzz]
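One possible approach, sketched here with Python's sqlite3 module: a digit is unchanged by both upper() and lower(), while a letter is not, so a CASE expression can tell them apart. The table name `chars` and column name `c` are hypothetical, and the query is only one of several ways to solve this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chars (c TEXT)")
conn.executemany("INSERT INTO chars VALUES (?)",
                 [(ch,) for ch in "dxT8a962V"])

# For a digit, upper(c) and lower(c) are identical; for a letter they differ.
rows = conn.execute("""
    SELECT CASE WHEN upper(c) = lower(c) THEN 'Fizz' ELSE 'Buzz' END
    FROM chars
    ORDER BY rowid""").fetchall()

labels = [r[0] for r in rows]
print(labels)
# ['Buzz', 'Buzz', 'Buzz', 'Fizz', 'Buzz', 'Fizz', 'Fizz', 'Fizz', 'Buzz']
```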
21. What is the difference between char and varchar2?
When stored in a database, varchar2 uses only the allocated space. E.g., if you have a varchar2(1999) and put 50 bytes in the table, it will use 52 bytes.
But when stored in a database, char always uses the maximum length and is blank-padded. E.g., if you have char(1999) and put 50 bytes in the table, it will consume 2000 bytes.
22. Write an SQL query to display the text CAPONE as:
Or in other words, an SQL query to transpose text.
In Oracle SQL, this can be done as follows:
23. Can we insert a row for identity column implicitly?
Yes, like so:
24. Given this table:
What will be the output of the snippet below?
25. The table is as follows:
ID | C1 | C2 | C3 |
---|---|---|---|
1 | Red | Yellow | Blue |
2 | NULL | Red | Green |
3 | Yellow | NULL | Violet |
Print the rows where one of columns C1, C2, or C3 has “Yellow,” but don’t use OR.
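One way to answer this is to flip the usual IN test around and check whether the literal appears in a list of columns. The sketch below tries this in SQLite via Python; the table name `colors` is an assumption, since the text does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE colors (ID INTEGER, C1 TEXT, C2 TEXT, C3 TEXT)")
conn.executemany("INSERT INTO colors VALUES (?, ?, ?, ?)", [
    (1, "Red", "Yellow", "Blue"),
    (2, None, "Red", "Green"),
    (3, "Yellow", None, "Violet"),
])

# 'Yellow' IN (C1, C2, C3) checks all three columns without writing OR.
rows = conn.execute("""
    SELECT ID FROM colors
    WHERE 'Yellow' IN (C1, C2, C3)
    ORDER BY ID""").fetchall()

ids = [r[0] for r in rows]
print(ids)  # [1, 3]
```

Row 2 is excluded because none of its non-NULL columns equals 'Yellow', so the comparison evaluates to UNKNOWN rather than TRUE.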
26. Write a query to insert/update Col2’s values to look exactly opposite to Col1’s values.
Col1 | Col2 |
---|---|
1 | 0 |
0 | 1 |
0 | 1 |
0 | 1 |
1 | 0 |
0 | 1 |
1 | 0 |
1 | 0 |
Or if the type is numeric:
27. How do you get the last id without the max function?
In MySQL:
In SQL Server:
28. What is the difference between IN and EXISTS?
IN:
- Works on List result set
- Doesn’t work on subqueries resulting in Virtual tables with multiple columns
- Compares every value in the result list
- Performance is comparatively SLOW for larger resultset of subquery
EXISTS:
- Works on Virtual tables
- Is used with correlated queries
- Exits comparison when match is found
- Performance is comparatively FAST for larger resultset of subquery
29. Suppose a table contains seven records.
The column is an identity column.
After the record with the identity value 7, the client wants the next record to be inserted with an identity value of 10.
Is it possible? If so, how? If not, why not?
Yes, it is possible, using a DBCC command:
30. How can you use a CTE to return the fifth highest (or Nth highest) salary from a table?
31. Given the following table named A:
Write a single query to calculate the sum of all positive values of x and the sum of all negative values of x.
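A possible answer uses conditional aggregation: two SUM calls over CASE expressions in a single pass. The sketch below runs it in SQLite from Python; the values stored in `A` are made up, since the contents of the table are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (x INTEGER)")
# Hypothetical contents for table A.
conn.executemany("INSERT INTO A VALUES (?)",
                 [(2,), (-2,), (4,), (-4,), (-3,), (0,), (2,)])

# Each CASE contributes x only when it has the wanted sign, otherwise 0.
positive_sum, negative_sum = conn.execute("""
    SELECT SUM(CASE WHEN x > 0 THEN x ELSE 0 END),
           SUM(CASE WHEN x < 0 THEN x ELSE 0 END)
    FROM A""").fetchone()

print(positive_sum, negative_sum)  # 8 -9
```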
32. Given the table mass_table:
weight |
---|
5.67 |
34.567 |
365.253 |
34 |
Write a query that produces the output:
weight | kg | gms |
---|---|---|
5.67 | 5 | 67 |
34.567 | 34 | 567 |
365.253 | 365 | 253 |
34 | 34 | 0 |
33. Consider the Employee table below.
Emp_Id | Emp_name | Salary | Manager_Id |
---|---|---|---|
10 | Anil | 50000 | 18 |
11 | Vikas | 75000 | 16 |
12 | Nisha | 40000 | 18 |
13 | Nidhi | 60000 | 17 |
14 | Priya | 80000 | 18 |
15 | Mohit | 45000 | 18 |
16 | Rajesh | 90000 | – |
17 | Raman | 55000 | 16 |
18 | Santosh | 65000 | 17 |
Write a query to generate below output:
Manager_Id | Manager | Average_Salary_Under_Manager |
---|---|---|
16 | Rajesh | 65000 |
17 | Raman | 62500 |
18 | Santosh | 53750 |
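One way to produce this output is a self-join that pairs each employee with their manager's row and then averages per manager. The sketch below is an assumed solution, run in SQLite from Python with the table data shown above (the missing Manager_Id for Rajesh is stored as NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employee
    (Emp_Id INTEGER, Emp_name TEXT, Salary INTEGER, Manager_Id INTEGER)""")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (10, "Anil", 50000, 18), (11, "Vikas", 75000, 16),
    (12, "Nisha", 40000, 18), (13, "Nidhi", 60000, 17),
    (14, "Priya", 80000, 18), (15, "Mohit", 45000, 18),
    (16, "Rajesh", 90000, None), (17, "Raman", 55000, 16),
    (18, "Santosh", 65000, 17),
])

# e is the employee, m is that employee's manager (same table, two aliases).
rows = conn.execute("""
    SELECT m.Emp_Id, m.Emp_name, AVG(e.Salary)
    FROM Employee e
    JOIN Employee m ON e.Manager_Id = m.Emp_Id
    GROUP BY m.Emp_Id, m.Emp_name
    ORDER BY m.Emp_Id""").fetchall()

print(rows)
# [(16, 'Rajesh', 65000.0), (17, 'Raman', 62500.0), (18, 'Santosh', 53750.0)]
```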
34. How do you copy data from one table to another table?
35. Find the SQL statement below that is equivalent to the following: SELECT name FROM customer WHERE state = 'VA';
- SELECT name IN customer WHERE state IN ('VA');
- SELECT name IN customer WHERE state = 'VA';
- SELECT name IN customer WHERE state = 'V';
- SELECT name FROM customer WHERE state IN ('VA');
Answer: SELECT name FROM customer WHERE state IN ('VA');
36. Given these contents of the Customers table:
Here is a query written to return the list of customers not referred by Jane Smith:
What will be the result of the query? Why? What would be a better way to write it?
Although there are four customers not referred by Jane Smith (including Jane Smith herself), the query will only return one: Pat Richards. All the customers who were not referred by anyone (those with NULL in their ReferredBy column) don’t show up. Yet those customers certainly were not referred by Jane Smith, and NULL is not the same as 2, so why didn’t they show up?
SQL Server uses three-valued logic, which can be troublesome for programmers accustomed to the more intuitive two-valued logic (TRUE or FALSE) that most programming languages use. In most languages, if you were presented with two predicates, ReferredBy = 2 and ReferredBy <> 2, you would expect one of them to be true and the other to be false for any given value of ReferredBy. In SQL Server, however, if ReferredBy is NULL, neither of them is true and neither of them is false. Anything compared to NULL evaluates to the third value in three-valued logic: UNKNOWN.
The query should be written in one of two ways:
…or:
Watch out for the following, though!
This will return the same faulty set as the original. We already talked about why: anything that is compared to NULL evaluates to UNKNOWN, the third value in the three-valued logic. Anything, even NULL! That’s why SQL Server has the IS NULL and IS NOT NULL operators to check for NULL in particular. Those particular operators will always evaluate to true or false.
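The behavior is not specific to SQL Server and can be reproduced in SQLite from Python. In the sketch below, the `Customers` rows are assumptions reconstructed to fit the description (Jane Smith has Id 2); names other than Jane Smith and Pat Richards are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (Id INTEGER, Name TEXT, ReferredBy INTEGER)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    (1, "John Doe", None), (2, "Jane Smith", None),
    (3, "Anne Jenkins", 2), (4, "Eric Branford", None),
    (5, "Pat Richards", 1), (6, "Alice Barnes", 2),
])

def names(where):
    """Return customer names matching the given WHERE condition, by Id."""
    sql = f"SELECT Name FROM Customers WHERE {where} ORDER BY Id"
    return [r[0] for r in conn.execute(sql)]

naive = names("ReferredBy <> 2")                        # NULL rows vanish
faulty = names("ReferredBy = NULL OR ReferredBy <> 2")  # = NULL is UNKNOWN
fixed = names("ReferredBy IS NULL OR ReferredBy <> 2")

print(naive)   # ['Pat Richards']
print(faulty)  # ['Pat Richards']
print(fixed)   # ['John Doe', 'Jane Smith', 'Eric Branford', 'Pat Richards']
```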
Even if a candidate doesn’t have much experience with SQL Server, diving into the intricacies of three-valued logic in general can give a good indication of how quickly they can grasp it or whether they will struggle with it.
37. Given a table TBL with a field Nmbr that has rows with the following values:
1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1
Write a query to add 2 where Nmbr is 0 and add 3 where Nmbr is 1.
This can be done as follows:
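A minimal sketch of one such statement, run in SQLite from Python: a single UPDATE with a CASE expression handles both values in one pass. This is an illustration under the stated table and column names, not necessarily the exact statement referred to above.

```python
import sqlite3

values = [1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TBL (Nmbr INTEGER)")
conn.executemany("INSERT INTO TBL VALUES (?)", [(v,) for v in values])

# One pass over the table: 0 becomes 2, and 1 becomes 4.
conn.execute("""
    UPDATE TBL
    SET Nmbr = CASE WHEN Nmbr = 0 THEN Nmbr + 2 ELSE Nmbr + 3 END""")

updated = [r[0] for r in conn.execute("SELECT Nmbr FROM TBL ORDER BY rowid")]
print(updated)  # [4, 2, 2, 4, 4, 4, 4, 2, 2, 4, 2, 4, 2, 4, 2, 4]
```

Doing this with two separate UPDATE statements would be error-prone, because rows changed by the first statement could be matched again by the second.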
38. Suppose we have a Customer table containing the following data:
Write a single SQL statement to concatenate all the customer names into the following single semicolon-separated string:
This is close, but will have an undesired trailing ;. One way of fixing that could be:
In PostgreSQL one can also use this syntax to achieve the fully correct result:
39. How do you get the Nth-highest salary from the Employee table without a subquery or CTE?
This will give the third-highest salary from the Employee table. Accordingly we can find out Nth salary using LIMIT (N-1),1.
But MS SQL Server doesn’t support that syntax, so in that case:
OFFSET’s parameter corresponds to the (N-1) above.
40. How to find a duplicate record?
- duplicate records with one field
- duplicate records with more than one field
- Duplicate records with one field (here, email): SELECT email, COUNT(email) FROM users GROUP BY email HAVING COUNT(email) > 1
- Duplicate records with more than one field (here, name and email): SELECT name, email, COUNT(*) FROM users GROUP BY name, email HAVING COUNT(*) > 1
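The GROUP BY ... HAVING pattern can be tried out with a few sample rows. The sketch below uses Python's sqlite3 module; the contents of the `users` table are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    ("Ann", "a@x.com"), ("Ann", "a@x.com"),
    ("Bob", "b@x.com"), ("Ann", "ann@y.com"),
])

# Group on every column that defines a duplicate, then keep only the
# groups that contain more than one row.
dupes = conn.execute("""
    SELECT name, email, COUNT(*)
    FROM users
    GROUP BY name, email
    HAVING COUNT(*) > 1""").fetchall()

print(dupes)  # [('Ann', 'a@x.com', 2)]
```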
41. Write an SQL query that will return a list of all the invoices based on the database schema shown in the SQL Server-style diagram below. For each invoice, show the Invoice ID, the billing date, the customer’s name, and the name of the customer who referred that customer (if any). The list should be ordered by billing date.
This question simply tests the candidate’s ability to take a plain-English requirement and write a corresponding SQL query. There is nothing tricky in this one; it just covers the basics:
- Did the candidate remember to use a LEFT JOIN instead of an inner JOIN when joining the customer table for the referring customer name? If not, any invoices from customers not referred by anyone will be left out altogether.
- Did the candidate alias the tables in the JOIN? Most experienced T-SQL programmers always do this, because repeating the full table name each time it needs to be referenced gets tedious quickly. In this case, the query would actually break if at least the Customer table wasn’t aliased, because it is referenced twice in different contexts (once as the table containing the name of the invoiced customer and once as the table containing the name of the referring customer).
- Did the candidate remember to disambiguate the Id and Name columns in the SELECT? Again, this is something most experienced programmers do automatically, whether or not there would be a conflict. And again, in this case there would be a conflict, so the query would break if the candidate neglected to do so.
Note that this query will not return Invoices that do not have an associated Customer. This may be the correct behavior for most cases (e.g., it is guaranteed that every Invoice is associated with a Customer, or unmatched Invoices are not of interest). However, to guarantee that all Invoices are returned no matter what, the Invoices table should be joined with Customers using LEFT JOIN:
There is more to interviewing than tricky technical questions, so these are intended merely as a guide. Not every good candidate for the job will be able to answer all of them, and answering all of them doesn’t mean they are a good candidate. At the end of the day, hiring remains an art, a science — and a lot of work.
FAQ
What is the correct order of a SQL?
How to prepare for a SQL interview?
You should also brush up on the tricky questions, as the interviewers like to use them to try and catch you off guard. If you like learning SQL using hands-on exercises, then you’ve got to try All Forever SQL Package. Some of the common tricky SQL interview questions for experienced users are presented below.
What questions are asked in a SQL interview?
In a SQL interview, you may be asked about advanced queries, such as subqueries (both nested and correlated) and finding the nth highest value in a column. Before asking you technical questions, your interviewer may ask you some general questions about your overall experience with SQL.
What questions are asked at an advanced SQL interview?
You’ll often get this type of question at your advanced SQL interview: you’ll be given a code and have to describe the query’s return. While writing and reading SQL code go hand-in-hand, it still feels different when you have to analyze the code someone else wrote. You have data in the table contributors: What will this code return?
How to write a SQL interview question for a freelancer?
Some of the common tricky SQL interview questions for experienced users are presented below. Write a query that selects all freelancers along with their task info: Include freelancers that don’t have any tasks assigned. Dataset: The dataset is of a company that employs freelancers on certain tasks. It consists of three tables.