Database Normalization

Normalization means evaluating and correcting table structures to minimize data redundancies. Normalization takes you through steps that lead to successively higher normal forms. In other words, First normal form (1NF), second normal form (2NF), third normal form (3NF), this is the normal form we should really look at achieving.

Before going on, we need to understand Functional Dependence. It plays quite a role with normalization. The definition of functional dependence is: The attribute A is fully functionally dependent on the attribute B if each value of B determines one and only one value of A.

Substituted into an example:

The attribute dbo.users.id is fully functionally dependent on the attribute dbo.users.username if each value of username determines one and only one value of id.

Most of the below is from support.microsoft.com (database-normalization-basics)

1NF

  • Eliminate repeating groups in individual tables.
  • Create a separate table for each set of related data.
  • Identify each set of related data with a primary key.

2NF

  • Create separate tables for sets of values that apply to multiple records.
  • Relate these tables with a foreign key.

3NF

  • Eliminate fields that do not depend on the key.
  • Eliminate computable fields (Use a stored procedure)

This goes up to 6NF however in general 3NF is acceptable.

Note that database normalization adds a performance impact and in some cases a de-normalization is needed.

See: https://en.wikipedia.org/wiki/Denormalization

References