Oracle SQL tutorial Lesson 10



Oracle SQL Basic Lesson 10



Introduction


In the world of database analytics and business intelligence, Oracle SQL analytical functions provide powerful tools for gaining insights from your data. Unlike group functions, analytical functions return a value for each row while still being able to perform calculations across a set of rows. This makes them ideal for running totals, rankings, comparisons, and much more—without losing the details of individual records.

In this blog, we’ll explore the most essential Oracle SQL analytical functions, provide two practical examples for each, and explain how they work in simple terms. Whether you're building reports or doing advanced analysis, mastering these functions will elevate your SQL skills.

Below are the functions we will discuss one by one with example queries 



  • ROW_NUMBER | Sequential number per row in partition
  • RANK | Rank with gaps for ties
  • DENSE_RANK | Rank without gaps
  • NTILE | Divides rows into equal-sized buckets
  • LAG Value from previous row
  • LEAD Value from next row
  • FIRST_VALUE  First value in window
  • LAST_VALUE  Last value in window
  • SUM() OVER Running or group total
  • AVG() OVER Moving or group average

1. ROW_NUMBER()


The ROW_NUMBER() function assigns a unique sequential number to rows within a partition of a result set.

Example 1:

SELECT employee_id, department_id,
       ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank
FROM employees;



- This query ranks employees by salary within each department.
- The highest-paid employee in each department gets row number 1.
- Useful for identifying top N performers per group.

Example 2:

SELECT * FROM (
  SELECT employee_id, ROW_NUMBER() OVER (ORDER BY hire_date) AS row_num
  FROM employees
)
WHERE row_num <= 5;




- This retrieves the first five employees based on hire date.
- Helps in getting earliest records without removing detail rows.
- Great for pagination or top-N queries.

2. RANK()


The RANK() function provides a ranking of rows within a partition, but gives the same rank to ties.

Example 1:

SELECT employee_id, salary,
       RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;



- Ranks employees by salary, assigning the same rank to ties.
- If two employees have the same salary, they receive the same rank.
- Gaps are introduced after ties.

Example 2:

SELECT department_id, employee_id, salary,
       RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_rank
FROM employees;



- Ranks salaries within each department.
- You can find the top earners in each department.
- A useful function in performance or compensation reports.

3. DENSE_RANK()


DENSE_RANK() is similar to RANK() but does not leave gaps after ties.

Example 1:

SELECT employee_id, salary,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

- Assigns ranks based on salary, without skipping ranks.
- If two employees share the same salary, they get the same rank.
- The next rank continues without a gap.

Example 2:

SELECT department_id, employee_id, salary,
       DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dept_dense_rank
FROM employees;

- This ensures tightly packed ranking inside each department.
- Ideal when reporting rank-based results without gaps.
- Helps in scoring systems or awards.

4. NTILE()


The NTILE(n) function divides rows into n approximately equal groups.

Example 1:

SELECT employee_id, salary,
       NTILE(4) OVER (ORDER BY salary) AS salary_quartile
FROM employees;

- Divides employees into 4 quartiles based on salary.
- Each group has roughly the same number of employees.
- Useful for statistical bucket analysis.

Example 2:

SELECT department_id, employee_id, NTILE(3) OVER (PARTITION BY department_id ORDER BY hire_date) AS cohort
FROM employees;

- Divides each department’s employees into 3 cohorts by hire date.
- Useful for time-based segmentation.
- Common in workforce and cohort analysis.

5. LAG()


The LAG() function allows access to a previous row's value without using a self-join.

Example 1:

SELECT employee_id, salary,
       LAG(salary, 1) OVER (ORDER BY hire_date) AS prev_salary
FROM employees;

- Shows the salary of the employee hired before the current one.
- Helps in comparing current vs. previous values.
- Useful in trend and delta analysis.

Example 2:

SELECT department_id, employee_id, salary,
       LAG(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS previous_dept_salary
FROM employees;

- Retrieves previous salary within the same department.
- Can be used to detect pay progression over time.
- Supports sequential analysis.

6. LEAD()


The LEAD() function accesses the next row’s value in the same result set.

Example 1:

SELECT employee_id, salary,
       LEAD(salary) OVER (ORDER BY hire_date) AS next_salary
FROM employees;

- Shows the salary of the next employee by hire date.
- Can be used to compare against future values.
- Useful for forecasting or benchmarking.

Example 2:

SELECT department_id, employee_id, salary,
       LEAD(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS next_dept_salary
FROM employees;

- Retrieves salary of the next employee in the same department.
- Helps in intra-departmental trend analysis.
- Very useful in comparing row progressions.

7. FIRST_VALUE()


The FIRST_VALUE() function returns the first value in an ordered set.

Example 1:

SELECT employee_id, department_id, salary,
       FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS first_salary
FROM employees;

- Gets the salary of the earliest hired employee in each department.
- Shows the baseline value in a group.
- Good for comparisons to initial values.

Example 2:

SELECT employee_id, hire_date,
       FIRST_VALUE(hire_date) OVER (ORDER BY salary DESC) AS first_high_paid
FROM employees;

- Finds hire date of the highest-paid employee.
- Useful in identifying trend starters.
- Aids in understanding who began certain trends.

8. LAST_VALUE()


The LAST_VALUE() function retrieves the last value in an ordered set.

Example 1:

SELECT employee_id, department_id, salary,
       LAST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY hire_date 
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_salary
FROM employees;

- Retrieves last salary within department by hire date.
- Useful with frame clauses to get actual last value.
- Helps in closing balance or final state analysis.

Example 2:

SELECT employee_id, hire_date,
       LAST_VALUE(hire_date) OVER (ORDER BY salary DESC 
       ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS latest_hire
FROM employees;

- Gets the most recent hire date after the current row.
- Requires proper frame to avoid unexpected results.
- Great for end-state comparisons.

9. SUM() OVER()


The SUM() function can be used with the OVER() clause to provide running totals or group-based sums.

Example 1:

SELECT employee_id, salary,
       SUM(salary) OVER (ORDER BY hire_date) AS running_total
FROM employees;

- Calculates running total of salaries by hire date.
- Each row shows cumulative total up to that row.
- Ideal for financial statements and trends.

Example 2:

SELECT department_id, employee_id, salary,
       SUM(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS dept_running_salary
FROM employees;

- Computes running total within each department.
- Maintains visibility into group-level aggregation.
- Very useful in cumulative group analysis.

10. AVG() OVER()


Like SUM, AVG() can be used to calculate moving or group-based averages without collapsing rows.

Example 1:

SELECT employee_id, salary,
       AVG(salary) OVER (ORDER BY hire_date 
       ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM employees;

- Calculates a moving average of salary over 3 rows.
- Helps smooth out data trends.
- Common in time series analysis.

Example 2:

SELECT department_id, employee_id, salary,
       AVG(salary) OVER (PARTITION BY department_id) AS dept_avg_salary
FROM employees;

- Displays department average salary for each employee.
- Useful for relative performance evaluation.
- Helps detect outliers or underpaid staff.



Below are some additional queries for practice purpose.



1. Find the Second Highest Salary in Each Department Using RANK()

SELECT * FROM (
  SELECT department_id, employee_id, salary,
         RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk
  FROM employees
)
WHERE rnk = 2;


2. Calculate Running Total of Salary per Department Using SUM()

SELECT department_id, employee_id, salary,
       SUM(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS running_total
FROM employees;


3. Compare Each Employee's Salary with Department Average Using AVG()

SELECT employee_id, department_id, salary,
       AVG(salary) OVER (PARTITION BY department_id) AS dept_avg,
       salary - AVG(salary) OVER (PARTITION BY department_id) AS salary_diff
FROM employees;


4. Identify Employees Hired Before the Highest Paid in Department

SELECT employee_id, department_id, hire_date,
       MAX(hire_date) OVER (PARTITION BY department_id ORDER BY salary DESC 
       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS latest_hire
FROM employees;


5. Assign Quartiles Based on Commission Using NTILE()

SELECT employee_id, commission_pct,
       NTILE(4) OVER (ORDER BY commission_pct) AS commission_quartile
FROM employees
WHERE commission_pct IS NOT NULL;

-
6. Find the Previous Hire Date in Each Department Using LAG()

SELECT employee_id, department_id, hire_date,
       LAG(hire_date, 1) OVER (PARTITION BY department_id ORDER BY hire_date) AS prev_hire_date
FROM employees;

-
7. Get First and Last Salary in Each Department

SELECT department_id, employee_id, salary,
       FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY hire_date) AS first_salary,
       LAST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY hire_date 
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_salary
FROM employees;


8. Calculate Moving Average of Salary over 3 Rows

SELECT employee_id, salary,
       AVG(salary) OVER (ORDER BY hire_date 
       ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_salary
FROM employees;




Conclusion


Oracle SQL analytical functions are game-changers for anyone working with large datasets, analytics, or business reporting. Unlike standard aggregate functions, these let you maintain row-level detail while still applying sophisticated calculations like ranks, trends, moving averages, and comparisons.



Post a Comment

0 Comments