🛡️ Module 22: High Availability

Data Guard & Oracle RAC - Disaster Recovery & Clustering

🔄 Oracle Data Guard

Data Guard Architecture

Component Description Purpose
Primary Database Production DB (read-write) Active workload
Standby Database Replica synchronized Disaster recovery
Redo Transport Ships redo logs Data replication
Apply Services Applies redo Keep standby sync'd
Role Management Switchover/failover DR automation

Standby Types

Type Data Format Use Case
Physical Standby Block-for-block Exact replica, fast failover
Logical Standby SQL-based Open for queries
Snapshot Standby Writable copy Testing, development

Enable Archive Log Mode

-- Check archive log status
SELECT LOG_MODE FROM V$DATABASE;

-- Enable archive log mode
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Set archive destination
ALTER SYSTEM SET LOG_ARCHIVE_DEST_1=
  'LOCATION=/u01/archive VALID_FOR=(ALL_LOGFILES,ALL_ROLES)';

⚙️ Configure Data Guard

Primary Database Setup

-- Enable force logging
ALTER DATABASE FORCE LOGGING;

-- Set standby redo destination
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
  'SERVICE=standby ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';

-- Configure redo transport
ALTER SYSTEM SET LOG_ARCHIVE_CONFIG=
  'DG_CONFIG=(primary,standby)';

ALTER SYSTEM SET LOG_ARCHIVE_FORMAT='%t_%s_%r.arc';

-- Add standby redo logs
ALTER DATABASE ADD STANDBY LOGFILE 
  '/u01/oradata/standby_redo01.log' SIZE 200M;

Create Physical Standby

-- Step 1: Backup primary
RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY;

-- Step 2: Restore on standby
RMAN> RESTORE CONTROLFILE FROM '/backup/control.ctl';
RMAN> STARTUP MOUNT;
RMAN> RESTORE DATABASE;
RMAN> RECOVER DATABASE;

-- Step 3: Start redo apply
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE 
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;
💡 Tip: Use Data Guard Broker (DGMGRL) for easier management and automated failover.

🔧 Data Guard Operations

Switchover (Planned)

-- On primary
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY 
  WITH SESSION SHUTDOWN;
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;

-- On standby (becomes primary)
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
ALTER DATABASE OPEN;

-- On old primary (now standby)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT;

Failover (Unplanned)

-- On standby: Activate as primary
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
ALTER DATABASE OPEN;

Active Data Guard

-- Open standby read-only with apply
ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT;

-- Check apply lag
SELECT NAME, VALUE, UNIT, TIME_COMPUTED
FROM V$DATAGUARD_STATS
WHERE NAME IN ('apply lag', 'transport lag');

Snapshot Standby

-- Convert to snapshot (writable)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
ALTER DATABASE OPEN;

-- Convert back to physical
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT;

📊 Monitor Data Guard

Monitoring Queries

-- Check database role
SELECT DATABASE_ROLE, OPEN_MODE FROM V$DATABASE;

-- Archive log status
SELECT DEST_ID, STATUS, DESTINATION, ERROR
FROM V$ARCHIVE_DEST
WHERE STATUS <> 'INACTIVE';

-- Redo apply progress
SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#
FROM V$MANAGED_STANDBY;

-- Apply/transport lag
SELECT NAME, VALUE, DATUM_TIME, TIME_COMPUTED
FROM V$DATAGUARD_STATS;

-- Archive gap detection
SELECT * FROM V$ARCHIVE_GAP;

Data Guard Broker

-- Enable broker
ALTER SYSTEM SET DG_BROKER_START=TRUE;

-- DGMGRL commands
DGMGRL> CONNECT sys/password@primary
DGMGRL> CREATE CONFIGURATION dg_config AS
        PRIMARY DATABASE IS primary
        CONNECT IDENTIFIER IS primary;

DGMGRL> ADD DATABASE standby AS
        CONNECT IDENTIFIER IS standby;

DGMGRL> ENABLE CONFIGURATION;
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SWITCHOVER TO standby;
⚠️ Warning: Always test switchover monthly and failover quarterly in non-production.

🔗 Oracle RAC

RAC Architecture

Component Description Purpose
Cluster Nodes Multiple servers Load balancing, HA
Shared Storage ASM or SAN Single DB, multiple instances
Cache Fusion Inter-instance comm Share data blocks
CRS Cluster Ready Services Node membership, failover
Interconnect High-speed network Cache fusion traffic

RAC Administration

-- Check RAC status
SELECT INST_ID, INSTANCE_NAME, HOST_NAME, STATUS
FROM GV$INSTANCE
ORDER BY INST_ID;

-- View active instances
SELECT INST_ID, INSTANCE_NUMBER, THREAD#
FROM GV$INSTANCE;

-- Check interconnect
SELECT NAME, VALUE 
FROM GV$SYSSTAT
WHERE NAME LIKE '%gc%'
ORDER BY INST_ID;

-- Services configuration
SELECT NAME, NETWORK_NAME 
FROM DBA_SERVICES;

RAC Service Management

-- Create RAC service
BEGIN
  DBMS_SERVICE.CREATE_SERVICE(
    service_name => 'app_service',
    network_name => 'app_service.example.com'
  );
  DBMS_SERVICE.START_SERVICE('app_service');
END;
/

-- SRVCTL commands (OS level)
$ srvctl status database -d orcl
$ srvctl start instance -d orcl -i orcl1
$ srvctl stop instance -d orcl -i orcl2
$ srvctl relocate service -d orcl -s app_service -i orcl1 -t orcl2

🎯 Best Practices

Data Guard Best Practices

RAC Best Practices

✓ Learning Checklist - Module 22