📊 Module 16: Monitoring & Diagnostics

AWR, ASH, ADDM Reports, Performance Monitoring

AWR
Workload Data
ASH
Active Sessions
ADDM
Diagnostics
METRIC
Performance Data

📈 AWR (Automatic Workload Repository)

AWR Basics & Snapshots

-- Create manual snapshot
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT();
END;
/

-- View snapshots (last 10)
SELECT snap_id, snap_time, snap_level
FROM dba_hist_snapshot
ORDER BY snap_id DESC
FETCH FIRST 10 ROWS ONLY;

-- Set AWR retention (in minutes)
BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 10080,  -- 7 days
    interval => 60       -- 1 hour
  );
END;
/

-- View AWR settings
SELECT *
FROM dba_hist_wr_control;

Generating AWR Reports

-- Generate AWR HTML report (run as SYS/SYSDBA)
-- Use SQL*Plus: @$ORACLE_HOME/rdbms/admin/awrrpt.sql
-- Or programmatically:
SET heading OFF feedback OFF verify OFF
SET PAGESIZE 0 linesize 1000 long 20000

SELECT output
FROM TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(
  l_dbid => (SELECT dbid FROM v$database),
  l_inst_num => 1,
  l_begin_snap => 100,
  l_end_snap => 110
));

📊 ASH (Active Session History)

ASH Query Examples

-- Top wait events in last hour
SELECT event, count(*) wait_count
FROM v$active_session_history
WHERE sample_time > TRUNC(SYSDATE) + 1
GROUP BY event
ORDER BY wait_count DESC;

-- Top SQL by elapsed time
SELECT sql_id, count(*) sample_count,
       MAX(sql_exec_start) last_execution
FROM v$active_session_history
WHERE sql_id IS NOT NULL
  AND sample_time > SYSDATE - 1/24
GROUP BY sql_id
ORDER BY sample_count DESC
FETCH FIRST 10 ROWS ONLY;

-- Active sessions over time
SELECT TRUNC(sample_time, 'HH') hour,
       count(*) active_sessions
FROM v$active_session_history
WHERE sample_time > SYSDATE - 1
GROUP BY TRUNC(sample_time, 'HH')
ORDER BY hour DESC;

🔧 ADDM (Automatic Database Diagnostic Monitor)

Running ADDM Analysis

-- Analyze performance between snapshots
DECLARE
  task_id NUMBER;
BEGIN
  task_id := DBMS_ADDM.CREATE_TASK(
    task_name => 'my_addm_task',
    description => 'Database diagnostic analysis',
    dbid => (SELECT dbid FROM v$database),
    instance_number => 1,
    snapshot_begin => 100,
    snapshot_end => 110
  );
  
  DBMS_ADDM.EXECUTE_TASK(task_id);
  
  DBMS_OUTPUT.PUT_LINE('ADDM Task ID: ' || task_id);
END;
/

-- View ADDM findings
SELECT task_id, task_name, task_status
FROM dba_addm_tasks
ORDER BY task_id DESC;

-- View recommendations
SELECT finding_id, type, message, impact
FROM dba_addm_findings
WHERE task_id = 1
ORDER BY impact DESC;

📊 Performance Metrics Monitoring

Key Database Metrics

Metric Description Ideal Range
DB CPU Time Total CPU used by database Low to Moderate
Physical I/O Disk read/write operations Minimize
Logical I/O Buffer cache reads Reasonable
Library Cache Hit Ratio SQL reuse efficiency > 95%
Buffer Cache Hit Ratio Memory hit efficiency > 99%
Parse Time Ratio Time spent parsing SQL < 5%

Query Performance Metrics

-- Database performance metrics
SELECT metric_name, value, metric_unit
FROM v$sysmetric
WHERE metric_name IN (
  'Database CPU Time Ratio',
  'Memory Sorts Ratio',
  'Redo Allocation Hit Ratio'
);

-- Wait events breakdown
SELECT event, total_waits, total_timeouts,
       time_waited_micro / 1000 time_waited_ms
FROM v$system_event
WHERE event NOT LIKE 'SQL%'
ORDER BY time_waited_micro DESC
FETCH FIRST 15 ROWS ONLY;

-- Session resource usage
SELECT username, sid, physical_reads, logical_reads, cpu_time
FROM v$sess_io
ORDER BY cpu_time DESC
FETCH FIRST 10 ROWS ONLY;

🔍 Real-Time Monitoring Commands

Session & Lock Monitoring

-- Active sessions with their SQL
SELECT s.username, s.sid, s.serial#,
       sq.sql_text, s.status, s.wait_class
FROM v$session s
LEFT JOIN v$sql sq ON s.sql_id = sq.sql_id
WHERE s.username IS NOT NULL
ORDER BY s.status;

-- Blocking locks
SELECT blocking_session, sid, serial#, username
FROM v$session
WHERE blocking_session IS NOT NULL;

-- Long running operations
SELECT sid, serial#, username, opname,
       elapsed_seconds, time_remaining
FROM v$long_ops
WHERE sofar != totalwork;
💡 Monitoring Best Practices: Monitor AWR snapshots regularly (hourly). Check ASH for blocking issues. Run ADDM analysis weekly. Watch key metrics (CPU, I/O, memory). Monitor wait events. Check for long-running operations. Review redo log generation. Monitor archive log space. Check tablespace usage. Review alert log for errors.
✓ Learning Checklist - Module 16