Unit 11.2A · Term 2

Normalization (1NF, 2NF, 3NF)

Normalization is the process of organizing a database to reduce redundancy (repeated data) and dependency problems. It involves splitting large tables into smaller, well-structured ones linked by keys.

Learning Objectives

11.4.1.5 Define connections (1NF, 2NF, 3NF)

Lesson Presentation

11.2A-normalization.pdf · Slides for classroom use

View Download

Conceptual Anchor

The Filing Cabinet Analogy

Imagine a messy filing cabinet where each folder duplicates information (student name in every grade folder). Normalization is like reorganizing the cabinet: put student info in ONE folder, grade info in ANOTHER, and link them with a student ID number. Less duplication, fewer errors.

Rules & Theory

Why Normalize?

Problem	Without Normalization	With Normalization
Data Redundancy	Same data stored multiple times	Data stored once
Update Anomaly	Change in one place but not another	Change once, reflected everywhere
Insert Anomaly	Can't add data without unrelated data	Can add independently
Delete Anomaly	Deleting one thing removes other data	Deletion is isolated

The Three Normal Forms

Form	Rule	In Simple Words
1NF	No repeating groups; each cell has a single atomic value	"No lists in cells"
2NF	1NF + no partial dependencies (all non-key fields depend on the WHOLE PK)	"Depends on the whole key"
3NF	2NF + no transitive dependencies (non-key fields don't depend on other non-key fields)	"Depends on nothing but the key"

Memory Trick

"Every non-key attribute must depend on the key (1NF), the whole key (2NF), and nothing but the key (3NF) — so help me Codd!"

Step-by-Step Example

UNNORMALIZED (UNF):
┌──────────┬────────┬──────────────────────────────────┐
│ StudentID│ Name   │ Courses                          │
├──────────┼────────┼──────────────────────────────────┤
│ S001     │ Ali    │ Maths, Physics, CS               │ ← list in cell!
│ S002     │ Dana   │ CS, English                      │ ← list in cell!
└──────────┴────────┴──────────────────────────────────┘

1NF — Remove repeating groups (one value per cell):
┌──────────┬────────┬──────────┐
│ StudentID│ Name   │ Course   │
├──────────┼────────┼──────────┤
│ S001     │ Ali    │ Maths    │
│ S001     │ Ali    │ Physics  │ ← "Ali" repeated!
│ S001     │ Ali    │ CS       │ ← redundancy
│ S002     │ Dana   │ CS       │
│ S002     │ Dana   │ English  │ ← "Dana" repeated!
└──────────┴────────┴──────────┘
PK = (StudentID + Course)  ← composite key

2NF — Remove partial dependencies:
"Name" depends only on StudentID, not on the full PK

Students Table:           Enrolment Table:
┌──────────┬────────┐    ┌──────────┬──────────┐
│ StudentID│ Name   │    │ StudentID│ Course   │
├──────────┼────────┤    ├──────────┼──────────┤
│ S001     │ Ali    │    │ S001     │ Maths    │
│ S002     │ Dana   │    │ S001     │ Physics  │
└──────────┴────────┘    │ S001     │ CS       │
                         │ S002     │ CS       │
                         │ S002     │ English  │
                         └──────────┴──────────┘

3NF — Remove transitive dependencies:
Suppose Enrolment also had TeacherName (which depends on Course, not the PK):

Before 3NF:                        After 3NF:
┌─────────┬────────┬────────────┐  Enrolment:       Courses:
│ StudentID│ Course │ TeacherName│  ┌────────┬──────┐ ┌──────┬───────────┐
│ S001    │ Maths  │ Mr Smith   │  │StudentID│Course│ │Course│TeacherName│
│ S001    │ CS     │ Ms Lee     │  └────────┴──────┘ └──────┴───────────┘
│ S002    │ CS     │ Ms Lee     │  (FK to both)
└─────────┴────────┴────────────┘
TeacherName depends on Course (non-key → non-key) = transitive!

Common Pitfalls

Skipping Normal Forms

You cannot jump to 3NF without achieving 1NF and 2NF first. Normalization is sequential — each form builds on the previous one.

Over-Normalizing

In practice, going beyond 3NF can make queries very complex with many JOINs. 3NF is usually sufficient for most databases.

Tasks

Remember

State the rules for 1NF, 2NF, and 3NF.

Apply

Normalize the following table to 3NF: OrderID, CustomerName, CustomerAddress, ProductName, ProductPrice, Quantity.

Analyze

Explain what an "update anomaly" is and how normalization prevents it.

Self-Check Quiz

Q1: What does 1NF require?

No repeating groups — each cell must contain a single, atomic value (no lists or arrays in a cell).

Q2: What is a partial dependency?

When a non-key attribute depends on only PART of a composite primary key (not the whole key). Violates 2NF.

Q3: What is the memory trick for the 3 normal forms?

Every non-key attribute must depend on "the key (1NF), the whole key (2NF), and nothing but the key (3NF)".