📊 Entity-Relationship Diagrams (ERD)
ERD Fundamentals
An Entity-Relationship Diagram visually represents database structure and relationships between entities:
| Notation | Meaning | Example |
|---|---|---|
| Rectangle | Entity (table) | EMPLOYEES, DEPARTMENTS |
| Oval | Attribute (column) | employee_id, first_name |
| Double Oval | Multi-valued attribute | phone_numbers |
| Line | Relationship | Works_In, Manages |
| PK (Primary Key) | Unique identifier | employee_id (underlined) |
| FK (Foreign Key) | Reference to other entity | dept_id references DEPARTMENTS |
Relationship Cardinalities
| Cardinality | Notation | Meaning | Example |
|---|---|---|---|
| One-to-One | 1:1 | One entity relates to exactly one other | Employee : Parking_Space |
| One-to-Many | 1:N | One relates to multiple others | Department : Employees (1:N) |
| Many-to-Many | M:N | Multiple relate to multiple | Students : Courses (M:N) |
| Self-Referencing | Reflexive | Entity relates to itself | Employee : Manager (Employee) |
🔧 Normalization Forms (1NF to 5NF)
1NF - First Normal Form (Eliminate Repeating Groups)
Rule: Remove duplicate columns, store one value per column
-- NOT 1NF: Multiple phone numbers in one column
StudentPhones:
StudentID | StudentName | Phones
1 | John | 555-1111, 555-2222, 555-3333
-- AFTER 1NF: Repeating groups eliminated
StudentPhones:
StudentID | StudentName | PhoneNumber
1 | John | 555-1111
1 | John | 555-2222
1 | John | 555-3333
2NF - Second Normal Form (Remove Partial Dependencies)
Rule: Remove attributes that depend on part of a composite key
-- NOT 2NF: InstructorName depends only on InstructorID, not on composite key
CourseInstructor:
CourseID | InstructorID | InstructorName
C101 | I001 | Dr. Smith
C101 | I002 | Dr. Brown
-- AFTER 2NF: Split into separate tables
CourseInstructor:
CourseID | InstructorID
C101 | I001
C101 | I002
Instructor:
InstructorID | InstructorName
I001 | Dr. Smith
I002 | Dr. Brown
3NF - Third Normal Form (Remove Transitive Dependencies)
Rule: Remove attributes that depend on non-key attributes
-- NOT 3NF: Department_Manager depends on DepartmentID, not PK
Employee:
EmpID | Name | DepartmentID | DepartmentName | DepartmentManager
E001 | John | D01 | Sales | Alice
E002 | Sarah | D02 | HR | Bob
-- AFTER 3NF: Remove transitive dependency
Employee:
EmpID | Name | DepartmentID
E001 | John | D01
E002 | Sarah | D02
Department:
DepartmentID | DepartmentName | DepartmentManager
D01 | Sales | Alice
D02 | HR | Bob
BCNF - Boyce-Codd Normal Form
Rule: Every determinant must be a candidate key (stricter than 3NF)
4NF & 5NF (Advanced Forms)
- 4NF: Remove non-functional multivalued dependencies
- 5NF (PJ/NF): Eliminate join dependencies (rare in practice)
💡 Practical Insight: Most business databases should reach 3NF. BCNF is good for complex scenarios. 4NF & 5NF are rarely needed in practice.
🎯 Database Design Patterns
1. Star Schema (Dimensional Modeling)
Used in data warehouses for analytical queries
-- Fact table (central)
SalesFact:
SalesID, DateID, ProductID, CustomerID, Amount, Quantity
-- Dimension tables (arms of star)
DateDimension: DateID, Date, Quarter, Year
ProductDimension: ProductID, ProductName, Category, Price
CustomerDimension: CustomerID, CustomerName, Region, Segment
2. Snowflake Schema
Normalized version of star schema, dimensions are normalized
3. Surrogate Key Pattern
Use system-generated keys instead of natural keys
-- Natural Key (business data)
Employee (SSN, FirstName, LastName)
-- Surrogate Key (system generated)
Employee (EmployeeID, SSN, FirstName, LastName)
- EmployeeID: Auto-increment primary key
- More flexible for schema changes
4. Soft Delete Pattern
Mark records as deleted instead of physically removing them
-- Add is_deleted flag
CREATE TABLE employees (
employee_id NUMBER PRIMARY KEY,
first_name VARCHAR2(50),
is_deleted CHAR(1) DEFAULT 'N'
);
-- Soft delete (reversible)
UPDATE employees SET is_deleted = 'Y' WHERE employee_id = 123;
-- Always filter active records
SELECT * FROM employees WHERE is_deleted = 'N';
5. Audit Trail Pattern
Track all changes for compliance and auditing
-- Main table
employees (employee_id, name, salary, version)
-- Audit table (automatic updates tracked)
employee_audit (
audit_id, employee_id, old_salary, new_salary,
changed_by, changed_date, change_reason
)
-- Trigger example:
TRIGGER emp_audit AFTER UPDATE ON employees
FOR EACH ROW BEGIN
INSERT INTO employee_audit
VALUES (audit_seq.NEXTVAL, :OLD.id, :OLD.salary, :NEW.salary,
USER, SYSDATE, 'Salary Update');
END;
6. Temporal Data Pattern (SCD - Slowly Changing Dimensions)
| Type | Description | Use Case |
|---|---|---|
| SCD Type 1 | Overwrite old data | Current data only (no history) |
| SCD Type 2 | Add new row with timestamps | Full history tracking |
| SCD Type 3 | Add previous value columns | Current + previous only |
⚖️ Denormalization Strategies
When to Denormalize
- Performance bottleneck caused by too many JOINs
- Heavy read-only analytical queries
- Data warehouse/OLAP systems
- Real-time reporting with tight SLAs
Denormalization Techniques
| Technique | Benefit | Cost |
|---|---|---|
| Redundant Columns | Eliminate JOIN, faster queries | Storage, update complexity |
| Derived Columns | Pre-calculated values | Update triggers, consistency |
| Materialized Views | Pre-computed aggregations | Storage, refresh overhead |
| Horizontal Partitioning | Parallel query execution | Distributed complexity |
-- Denormalization Example: Store total salary per department
CREATE TABLE departments (
dept_id NUMBER PRIMARY KEY,
dept_name VARCHAR2(50),
employee_count NUMBER, -- Redundant
total_salary NUMBER -- Derived
);
-- Maintain with trigger on salary changes
TRIGGER update_dept_totals AFTER UPDATE OF salary ON employees
FOR EACH ROW BEGIN
UPDATE departments
SET total_salary = total_salary + (:NEW.salary - :OLD.salary)
WHERE dept_id = :NEW.dept_id;
END;
⚠️ Warning: Denormalization adds complexity. Keep update logic simple and maintain referential integrity with triggers.
🏆 Design Best Practices
- Start with 3NF: Begin normalized, denormalize only if needed for performance
- Choose Keys Wisely: Use surrogate keys for primary key, natural keys for business rules
- Enforce Referential Integrity: Use foreign keys and constraints
- Document ERD: Keep design documentation current
- Plan for Growth: Consider scalability when designing tables
- Test Backup/Recovery: Ensure design supports backup strategies
- Use Sequences Carefully: Distribute SEQUENCE generation for RAC systems
✓ Learning Checklist - Module 2