Have you ever stared at your computer screen wondering why it is taking so long for your SQL query to run? You are not alone in this universe, we all have been through such situations quite often. Of those, a few have been lucky to get help from their colleagues and seniors in understanding the reason behind their query running slow.
I am one of those lucky guys. Having worked with some of the best minds, I have received tips from them on how to optimize SQL queries for faster execution. Today I will share some of the tips I have received from these smart folks and also a few from my own experience of working on data for the past few years.
Tips to Optimize SQL Queries
1. Use the correct JOIN
There are four kinds of joins that you can use based on your specific requirement, namely,
- Inner Join
- Outer Join
- Full Join
- Cartesian Join
Here is a diagram to show how these joins are different from each other:
But not all the Joins are equally efficient. This is how they rank in terms of efficiency:
So, use JOINS wisely. If Inner Join can give your required result, don’t go for the Outer Join.
2. While using JOIN, keep smaller table on the left for faster execution
3. Avoid HAVING clause when possible
HAVING clause applies the filter on the result of the query after aggregation has been done. Due to this second round of filtering, HAVING clause can increase the time taken to run a query significantly.
See this article to understand the order in which each clause of a query is executed.
4. Avoid subqueries in WHERE clause
The reason behind this again lies in the order in which each clause is executed.
5. Filter out the data in JOIN instead of WHERE clause
As you must have read by now in the link that I shared above, SQL queries are executed in the following order:
FROM > ON > JOIN > WHERE > GROUP BY > HAVING > SELECT > DISTINCT > ORDER BY
Since FROM & JOIN executes before WHERE, narrowing down the data in FROM clause can make the query efficient. So, whenever possible, filter out the data using the ON rules for JOIN. Thus, by the time we reach WHERE clause, we already have a smaller set of data.
Also, remember that use of WHERE with OUTER JOIN can lead to the incorrect results. Check this article for details.
6) Include partition columns in the WHERE clause
Before you start writing the query check the table to see if it has been partitioned and use the partitioned columns in the WHERE clause. This will ensure that the database looks for the data only in that particular partition instead of scanning the whole table.
7) Use UNION ALL instead of UNION whenever possible
Both UNION and UNION ALL are used to combine the results of two different SELECT statements. The only difference between the two is that UNION removes duplicate records whereas UNION ALL doesn’t. While you might want to get unique records in most of the cases, and hence want to use UNION, it comes with performance hit because the database server has to do extra work to remove the duplicates.
Hope these 7 tips will help you optimize SQL queries that you write. If you have some more tips to offer please share here using the comment section.
This post first appeared on my blog Let’s Talk Data.