10 SQL Tips and Tricks For Freshers

10 Mind-Blowing SQL Tricks Every Data Engineer Should Know

 

SQL is the backbone of data engineering and the foundation for many data-driven applications. Whether you're preparing for an interview or trying to optimize a complex query, knowing how to write efficient SQL is a game-changer. In this post, I’ll walk you through 10 essential SQL tricks that will help you write faster, cleaner, and more efficient queries. Let’s dive in!

 

1. CTE (Common Table Expressions) 

CTEs, or Common Table Expressions, are named, temporary result sets that exist only for the statement that defines them. They let you break a complex query into simpler steps: you define each step once with WITH and then reference it by name later in the same query, much like a subquery with a label.

 

Example: 

 

WITH EmployeeCTE AS (

    SELECT EmployeeID, Salary

    FROM Employees

    WHERE Department = 'Engineering'

)

SELECT EmployeeID, Salary

FROM EmployeeCTE

WHERE Salary > 80000;

 

 

Why Use It: 

CTEs are perfect for making long, complex queries more readable by isolating different parts of your logic.
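
A query can also define several CTEs in a row, with later ones building on earlier ones. The sketch below assumes SQL Server syntax and uses a hypothetical SampleEmployees CTE built from inline VALUES so it runs without any existing tables; in a real query that first step would be your Employees table.

```sql
-- Hypothetical sample data standing in for the Employees table.
WITH SampleEmployees AS (
    SELECT EmployeeID, Department, Salary
    FROM (VALUES
        (1, 'Engineering', 95000),
        (2, 'Engineering', 70000),
        (3, 'Sales',       60000)
    ) AS v (EmployeeID, Department, Salary)
),
-- Second CTE: one row per department with its average salary.
DepartmentAverages AS (
    SELECT Department, AVG(Salary) AS AvgSalary
    FROM SampleEmployees
    GROUP BY Department
)
-- Final step: employees who earn more than their department's average.
SELECT e.EmployeeID, e.Department, e.Salary, d.AvgSalary
FROM SampleEmployees e
INNER JOIN DepartmentAverages d ON d.Department = e.Department
WHERE e.Salary > d.AvgSalary;
```

With this sample data, only employee 1 earns more than the Engineering average, so only that row comes back.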

 

2. Window Functions: ROW_NUMBER 

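Window functions allow you to perform calculations across a set of rows that are related to the current row, without collapsing those rows the way GROUP BY does. One of the most common is ROW_NUMBER, which assigns a sequential number to each row in the order given by its ORDER BY clause, restarting at 1 for each group if you add PARTITION BY.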

 

Example: 

 

SELECT EmployeeID, FirstName, ROW_NUMBER() OVER (ORDER BY HireDate) AS RowNum

FROM Employees;

 

 

Why Use It: 

Use window functions like ROW_NUMBER, RANK, and DENSE_RANK to analyze data trends or organize results based on custom criteria.
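
The three ranking functions differ only in how they treat ties, which is easiest to see side by side. This is a small sketch assuming SQL Server syntax, with hypothetical inline sample rows in place of a real Employees table.

```sql
-- Employees 1 and 2 tie on salary within Engineering.
SELECT EmployeeID, Department, Salary,
       ROW_NUMBER() OVER (PARTITION BY Department ORDER BY Salary DESC) AS RowNum,
       RANK()       OVER (PARTITION BY Department ORDER BY Salary DESC) AS SalaryRank,
       DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS DenseSalaryRank
FROM (VALUES
    (1, 'Engineering', 95000),
    (2, 'Engineering', 95000),
    (3, 'Engineering', 70000),
    (4, 'Sales',       60000)
) AS e (EmployeeID, Department, Salary);
```

For the Engineering rows, ROW_NUMBER gives 1, 2, 3 (the order between the tied rows is arbitrary), RANK gives 1, 1, 3, and DENSE_RANK gives 1, 1, 2. PARTITION BY restarts the numbering for Sales.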

 

3. SQL CASE Statements 

The CASE statement allows you to perform conditional logic in SQL, similar to IF statements in other languages. It’s a great way to return customized results based on conditions.

 

Example: 

 

SELECT FirstName,

    CASE

        WHEN Salary > 100000 THEN 'High Salary'

        WHEN Salary BETWEEN 50000 AND 100000 THEN 'Medium Salary'

        ELSE 'Low Salary'

    END AS SalaryRange

FROM Employees;

 

 

Why Use It: 

The CASE statement is incredibly flexible for transforming data on the fly, whether for grouping, categorizing, or creating calculated fields.
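
CASE also works inside aggregate functions, which is a common way to count or sum only the rows that meet a condition. The following is a sketch with hypothetical inline sample data rather than a real Employees table.

```sql
-- Count all employees and only the high earners, per department.
SELECT Department,
       COUNT(*) AS Headcount,
       SUM(CASE WHEN Salary > 100000 THEN 1 ELSE 0 END) AS HighSalaryCount
FROM (VALUES
    (1, 'Engineering', 120000),
    (2, 'Engineering',  90000),
    (3, 'Sales',        60000)
) AS e (EmployeeID, Department, Salary)
GROUP BY Department;
```

Here Engineering reports a headcount of 2 with 1 high earner, and Sales reports a headcount of 1 with 0.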

 

4. SQL Merge Statements 

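The MERGE statement performs insert, update, and delete operations against a target table in a single statement, based on how its rows match a source. It’s especially useful when syncing datasets. The example below uses SQL Server syntax (for instance, the BY TARGET clause); other databases that support MERGE differ in the details.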

 

Example: 

 

MERGE INTO TargetTable AS Target

USING SourceTable AS Source

ON Target.EmployeeID = Source.EmployeeID

WHEN MATCHED THEN

    UPDATE SET Target.Salary = Source.Salary

WHEN NOT MATCHED BY TARGET THEN

    INSERT (EmployeeID, Salary) VALUES (Source.EmployeeID, Source.Salary);

 

 

Why Use It: 

MERGE simplifies combining data from multiple sources without writing separate insert, update, and delete queries.
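
The earlier example covers the update and insert branches. A full sync usually also removes target rows that no longer exist in the source. Below is a sketch of that, assuming SQL Server and using two hypothetical table variables so it can run on its own. Be careful with the DELETE branch on real data, since it removes every target row that has no match in the source.

```sql
-- Hypothetical tables standing in for TargetTable and SourceTable.
DECLARE @TargetTable TABLE (EmployeeID INT PRIMARY KEY, Salary INT);
DECLARE @SourceTable TABLE (EmployeeID INT PRIMARY KEY, Salary INT);

INSERT INTO @TargetTable VALUES (1, 50000), (2, 60000);
INSERT INTO @SourceTable VALUES (2, 65000), (3, 70000);

MERGE INTO @TargetTable AS Target
USING @SourceTable AS Source
ON Target.EmployeeID = Source.EmployeeID
WHEN MATCHED THEN
    UPDATE SET Target.Salary = Source.Salary
WHEN NOT MATCHED BY TARGET THEN
    INSERT (EmployeeID, Salary) VALUES (Source.EmployeeID, Source.Salary)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- $action reports which branch handled each row.
OUTPUT $action AS MergeAction,
       inserted.EmployeeID AS NewEmployeeID,
       deleted.EmployeeID  AS OldEmployeeID;

SELECT EmployeeID, Salary FROM @TargetTable ORDER BY EmployeeID;
```

After the merge, the target holds employee 2 with the updated salary of 65000 and the newly inserted employee 3, while employee 1 has been deleted.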

 

5. Recursive CTEs (With Fibonacci Example) 

Recursive CTEs are an advanced feature of CTEs, allowing you to perform recursive operations, such as calculating a Fibonacci sequence or handling hierarchical data.

 

Example (Fibonacci Sequence): 

 

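-- Each row carries the current value and the next one,

-- so the recursive step only needs the previous row.

WITH FibonacciCTE (n, Fibonacci, NextFibonacci) AS (

    SELECT 0, 0, 1

    UNION ALL

    SELECT n + 1, NextFibonacci, Fibonacci + NextFibonacci

    FROM FibonacciCTE

    WHERE n < 10

)

SELECT n, Fibonacci FROM FibonacciCTE;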

 

 

Why Use It: 

Recursive CTEs are perfect for working with self-referential data, such as org charts or hierarchical structures, and for performing mathematical calculations like this Fibonacci example.
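
For the org chart case, the anchor query selects the top of the hierarchy and the recursive step joins each employee to their manager's row. This is a sketch assuming SQL Server syntax (other databases typically need WITH RECURSIVE); the @Staff table variable and its ManagerID column are hypothetical.

```sql
-- Hypothetical staff table where ManagerID points at another EmployeeID.
DECLARE @Staff TABLE (EmployeeID INT, FirstName VARCHAR(50), ManagerID INT NULL);

INSERT INTO @Staff VALUES
    (1, 'Asha',  NULL),  -- top of the hierarchy
    (2, 'Ben',   1),
    (3, 'Chloe', 2);

WITH OrgChart AS (
    -- Anchor: employees with no manager.
    SELECT EmployeeID, FirstName, ManagerID, 0 AS Level
    FROM @Staff
    WHERE ManagerID IS NULL

    UNION ALL

    -- Recursive step: direct reports of the rows found so far.
    SELECT s.EmployeeID, s.FirstName, s.ManagerID, o.Level + 1
    FROM @Staff s
    INNER JOIN OrgChart o ON s.ManagerID = o.EmployeeID
)
SELECT EmployeeID, FirstName, Level
FROM OrgChart
ORDER BY Level;
```

Asha comes back at level 0, Ben at level 1, and Chloe at level 2.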

 

6. EXISTS vs. IN 

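Both EXISTS and IN can filter rows based on a subquery, but they behave differently. EXISTS only needs to find one matching row and is a natural fit for correlated subqueries, whereas IN reads well for short, static lists of values. Many query optimizers produce the same plan for both, so check the execution plan before assuming one is faster.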

 

Example with EXISTS: 

 

SELECT FirstName

FROM Employees e

WHERE EXISTS (SELECT 1 FROM Projects p WHERE p.EmployeeID = e.EmployeeID);

 

 

Why Use It: 

EXISTS can stop at the first matching row, which may make it more efficient than IN on large datasets, and its negated form, NOT EXISTS, avoids the NULL-handling surprises of NOT IN.
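
The NULL behavior is the difference most likely to cause a bug. The sketch below uses hypothetical table variables (SQL Server syntax) in which one project has no assigned employee.

```sql
DECLARE @Employees TABLE (EmployeeID INT, FirstName VARCHAR(50));
DECLARE @Projects  TABLE (ProjectID INT, EmployeeID INT NULL);

INSERT INTO @Employees VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO @Projects  VALUES (10, 1), (11, NULL);  -- project 11 is unassigned

-- Returns no rows: comparing against the NULL yields UNKNOWN,
-- so NOT IN can never be true for any employee.
SELECT FirstName
FROM @Employees e
WHERE e.EmployeeID NOT IN (SELECT p.EmployeeID FROM @Projects p);

-- Returns 'Ben': NOT EXISTS only asks whether a matching row was found.
SELECT FirstName
FROM @Employees e
WHERE NOT EXISTS (SELECT 1 FROM @Projects p WHERE p.EmployeeID = e.EmployeeID);
```

If you do need NOT IN with a subquery, filtering out NULLs inside the subquery gives the expected result.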

 

7. SQL Join Strategies 

SQL joins are essential for combining data from multiple tables. But not all joins are created equal—choosing the right join can impact query performance significantly. Understanding the difference between INNER, LEFT, RIGHT, and FULL OUTER JOIN is key.

 

Example (INNER JOIN): 

 

SELECT e.FirstName, d.DepartmentName

FROM Employees e

INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID;

 

 

Why Use It: 

Choosing the join type that matches the rows you actually need keeps results correct and avoids processing rows you would only discard later.
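
To see how the join type changes the result, here is a LEFT JOIN over hypothetical inline sample data (SQL Server syntax) in which some employees have no matching department.

```sql
-- Ben's department (2) is missing from the list and Chloe has none at all.
SELECT e.FirstName, d.DepartmentName
FROM (VALUES
    ('Asha',  1),
    ('Ben',   2),
    ('Chloe', NULL)
) AS e (FirstName, DepartmentID)
LEFT JOIN (VALUES
    (1, 'Engineering'),
    (3, 'Sales')
) AS d (DepartmentID, DepartmentName)
    ON e.DepartmentID = d.DepartmentID;
```

The LEFT JOIN returns all three employees, with a NULL DepartmentName for Ben and Chloe. Switching to INNER JOIN would return only Asha, and adding WHERE d.DepartmentID IS NULL to the LEFT JOIN would return only the employees without a matching department.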

 

8. STRING_SPLIT 

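If you're working with delimited data in SQL (like a list of items separated by commas), STRING_SPLIT makes it easy to parse and work with the individual elements. It is a SQL Server function, available from SQL Server 2016 onward, and it returns one row per element in a column named value.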

 

Example: 

 

SELECT value

FROM STRING_SPLIT('Apple,Banana,Orange', ',');

 

 

Why Use It: 

Parsing strings in SQL becomes easier with STRING_SPLIT, making it an invaluable tool when dealing with text-based data.
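
In practice the delimited text usually sits in a column rather than a literal. A sketch of that case, assuming SQL Server 2016 or later and a hypothetical inline Orders data set, pairs STRING_SPLIT with CROSS APPLY:

```sql
-- One output row per item, keeping the OrderID it came from.
SELECT o.OrderID,
       LTRIM(RTRIM(s.value)) AS Item  -- trim spaces left around the delimiter
FROM (VALUES
    (1, 'Apple, Banana'),
    (2, 'Orange')
) AS o (OrderID, Items)
CROSS APPLY STRING_SPLIT(o.Items, ',') AS s;
```

This yields Apple and Banana for order 1 and Orange for order 2. The order of the split rows is not guaranteed, so add an explicit ORDER BY if sequence matters.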

 

9. TRY_CAST for Safer Data Conversion 

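Converting data types in SQL can raise an error if the data is not in the expected format, and a single bad value is enough to fail the whole query. TRY_CAST (a SQL Server function) returns NULL when a conversion fails instead of raising an error, so the rest of the query can continue.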

 

Example: 

 

SELECT TRY_CAST('123' AS INT) AS IntValue,

       TRY_CAST('ABC' AS INT) AS FailedConversion;

 

 

Why Use It: 

Use TRY_CAST to avoid runtime errors when working with unpredictable or inconsistent data.
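
Because a failed conversion becomes NULL, TRY_CAST is also a handy way to find the rows that would have broken a plain CAST. A small sketch with hypothetical inline values (SQL Server syntax):

```sql
-- List the raw values that cannot be converted to INT.
SELECT r.RawValue
FROM (VALUES ('123'), ('ABC'), (NULL)) AS r (RawValue)
WHERE r.RawValue IS NOT NULL              -- ignore values that were NULL to begin with
  AND TRY_CAST(r.RawValue AS INT) IS NULL;
```

Only 'ABC' is returned: '123' converts cleanly, and the original NULL is excluded by the first condition.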

 

10. Indexing for Performance 

Proper indexing is key to improving query performance, especially with large datasets. Understanding when and how to use indexes can drastically reduce query execution time.

 

Example: 

 

CREATE INDEX idx_EmployeeID ON Employees (EmployeeID);

 

 

Why Use It: 

Indexing optimizes data retrieval, making it crucial for scaling databases and handling large amounts of data efficiently.
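
Indexes pay off most when they match the columns a query filters on. As an illustration, the sketch below builds a hypothetical temporary #Employees table (SQL Server syntax) and a composite index that fits the Department and Salary filter used in the first tip. The index name and column choice are assumptions for this example, not a general recommendation.

```sql
-- Temporary table so the example does not touch real data.
CREATE TABLE #Employees (
    EmployeeID INT PRIMARY KEY,
    Department VARCHAR(50) NOT NULL,
    Salary     INT NOT NULL
);

-- Equality column first (Department), range column second (Salary).
CREATE INDEX idx_Employees_Department_Salary
    ON #Employees (Department, Salary);

-- A query shaped like this can seek on the index instead of scanning the table.
SELECT EmployeeID, Salary
FROM #Employees
WHERE Department = 'Engineering'
  AND Salary > 80000;

DROP TABLE #Employees;
```

Every index also adds work to inserts and updates, so it is worth checking the execution plan to confirm an index is actually used before keeping it.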

 

Conclusion 

Mastering these SQL tricks will help you level up your data engineering skills and write more efficient, scalable queries. Whether you're a beginner looking to improve your SQL skills or a seasoned professional, these techniques will give you the edge in query performance and optimization.

 

Stay tuned for more tips on data engineering, and don’t forget to challenge yourself with these tricks in your daily workflow!
