🔢 Aggregate Functions
Common Aggregate Functions
| Function | Purpose | Example | Returns |
|---|---|---|---|
| COUNT(*) | Total rows | COUNT(*) AS total | Number |
| COUNT(col) | Non-NULL values | COUNT(salary) | Number |
| SUM(col) | Sum of values | SUM(salary) | Number |
| AVG(col) | Average value | AVG(salary) | Number |
| MIN(col) | Minimum value | MIN(salary) | Any type |
| MAX(col) | Maximum value | MAX(salary) | Any type |
| STDDEV(col) | Standard deviation | STDDEV(salary) | Number |
| VARIANCE(col) | Variance | VARIANCE(salary) | Number |
Basic Aggregate Queries
-- Simple aggregation
SELECT
COUNT(*) AS total_employees,
SUM(salary) AS total_payroll,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM employees;
-- Aggregation with DISTINCT
SELECT
COUNT(DISTINCT department_id) AS dept_count,
COUNT(DISTINCT manager_id) AS manager_count
FROM employees;
-- Conditional aggregation
SELECT
COUNT(CASE WHEN salary > 50000 THEN 1 END) AS high_earners,
COUNT(CASE WHEN salary <= 50000 THEN 1 END) AS low_earners
FROM employees;
🎯 GROUP BY Clause
Basic GROUP BY
-- Group by single column
SELECT
department_id,
COUNT(*) AS emp_count,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
ORDER BY avg_salary DESC;
-- Group by multiple columns
SELECT
department_id,
job_id,
COUNT(*) AS emp_count,
SUM(salary) AS total_salary
FROM employees
GROUP BY department_id, job_id
ORDER BY department_id, job_id;
-- Group with JOIN
SELECT
d.department_name,
COUNT(e.employee_id) AS emp_count,
AVG(e.salary) AS avg_salary
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY emp_count DESC;
🔍 HAVING Clause - Filter Groups
Filter groups after aggregation (similar to WHERE but for groups).
-- HAVING filters groups, WHERE filters rows
SELECT
department_id,
COUNT(*) AS emp_count,
AVG(salary) AS avg_salary
FROM employees
WHERE salary > 40000 -- Row filter (before grouping)
GROUP BY department_id
HAVING COUNT(*) > 5 -- Group filter (after grouping)
ORDER BY emp_count DESC;
-- Multiple HAVING conditions
SELECT
department_id,
COUNT(*) AS emp_count,
AVG(salary) AS avg_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 3
AND AVG(salary) > 60000
AND MAX(salary) > 100000;
-- HAVING with subquery
SELECT
department_id,
COUNT(*) AS emp_count
FROM employees
GROUP BY department_id
HAVING COUNT(*) > (
SELECT AVG(emp_per_dept)
FROM (
SELECT COUNT(*) emp_per_dept
FROM employees
GROUP BY department_id
)
);
📈 ROLLUP - Hierarchical Aggregation
Creates subtotals and grand total from left to right.
-- ROLLUP generates subtotals and grand total
SELECT
department_id,
job_id,
COUNT(*) AS emp_count,
SUM(salary) AS total_salary
FROM employees
GROUP BY ROLLUP (department_id, job_id)
ORDER BY department_id, job_id;
-- Results include:
- Detail rows (all combinations)
- Subtotals by department (job_id = NULL)
- Grand total (both NULL)
-- Identify subtotal rows using GROUPING
SELECT
CASE
WHEN GROUPING(department_id) = 1 THEN 'GRAND TOTAL'
WHEN GROUPING(job_id) = 1 THEN 'SUBTOTAL (Dept: ' || department_id || ')'
ELSE 'DETAIL'
END AS row_type,
department_id,
job_id,
COUNT(*) AS emp_count,
SUM(salary) AS total_salary
FROM employees
GROUP BY ROLLUP (department_id, job_id)
ORDER BY department_id, job_id;
🎲 CUBE - Cross-Tabulation
Creates all possible combinations of groupings (full cross-tabulation).
-- CUBE generates all combinations
SELECT
department_id,
job_id,
COUNT(*) AS emp_count,
SUM(salary) AS total_salary
FROM employees
GROUP BY CUBE (department_id, job_id)
ORDER BY department_id, job_id;
-- Results include:
- (dept, job) combinations
- (dept only) subtotals
- (job only) subtotals
- Grand total
-- Using GROUPING_ID for compact notation
SELECT
GROUPING_ID(department_id, job_id) AS grouping_level,
department_id,
job_id,
COUNT(*) AS emp_count
FROM employees
GROUP BY CUBE (department_id, job_id);
-- GROUPING_ID returns:
0 = Both specified
1 = department_id rolled up
2 = job_id rolled up
3 = Both rolled up (grand total)
⚡ ROLLUP vs CUBE Comparison
| Aspect | ROLLUP | CUBE |
|---|---|---|
| Combinations | Hierarchical (left to right) | All possible |
| Example: (A,B) | (A,B) → (A) → () | (A,B) → (A) → (B) → () |
| Row Count | Fewer rows | More rows (2^n) |
| Use Case | Hierarchical data (year → month → day) | Cross-tabulation (dept ↔ job) |
| Performance | ⭐⭐⭐ Better | ⭐⭐ Slower (more rows) |
Detailed Example: ROLLUP
-- Data: 4 employees in 2 departments with 2 jobs
SELECT
department_id,
job_id,
SUM(salary)
FROM employees
GROUP BY ROLLUP (department_id, job_id);
-- Results:
-- 10 ANALYST 50000 -- Detail
-- 10 NULL 100000 -- Subtotal for dept 10
-- 20 MANAGER 80000 -- Detail
-- 20 NULL 150000 -- Subtotal for dept 20
-- NULL NULL 250000 -- Grand Total
-- Total: 5 rows (from original 4)
Detailed Example: CUBE
-- Same data with CUBE
SELECT
department_id,
job_id,
SUM(salary)
FROM employees
GROUP BY CUBE (department_id, job_id);
-- Results:
-- 10 ANALYST 50000 -- (dept, job)
-- 10 MANAGER 50000 -- (dept, job)
-- 20 ANALYST 60000 -- (dept, job)
-- 20 MANAGER 80000 -- (dept, job)
-- 10 NULL 100000 -- Subtotal by dept
-- 20 NULL 140000 -- Subtotal by dept
-- NULL ANALYST 110000 -- Subtotal by job
-- NULL MANAGER 130000 -- Subtotal by job
-- NULL NULL 240000 -- Grand Total
-- Total: 9 rows (from original 4, = 2^2 + 4)
🚀 Advanced Aggregation Examples
Running Totals and Cumulative Sums
-- SUM with window function for running total
SELECT
employee_id,
salary,
SUM(salary) OVER (ORDER BY employee_id) AS running_total
FROM employees
ORDER BY employee_id;
-- Running total by department
SELECT
department_id,
employee_id,
salary,
SUM(salary) OVER (
PARTITION BY department_id
ORDER BY employee_id
) AS dept_running_total
FROM employees;
💡 Tip: Use ROLLUP for hierarchical data (time periods). Use CUBE for cross-tab analysis. Always use GROUPING() to identify subtotal rows.
✓ Learning Checklist - Module 7