This concept document provides a brief discussion of the topic of data normalization as it applies to the development
of the Artifact: Data Model. It does not provide a full treatment of
normalization, because the subject is quite broad and has been documented in many texts on database design. In [NBG01], normalization is defined as "an analytic technique used to produce a correct
relational database design." In practice, normalization is a procedure for eliminating redundancy in the Data Model by
means of applying restrictive rules. Elimination of data redundancy in the tables of the Data Model helps enforce
referential integrity of the data in the database.
Normalization is usually performed on the Data Model after an initial version of the tables and their relationships has
been developed in the model. The exact timing of when to apply normalization depends on the specific project
situation and is up to the Database Designer. The normalization process is applied to the
tables in the Data Model in series of steps in which each step applies rules that are stricter than the last.
Levels of Normalization
Normalization is hierarchically classified into numeric forms, with the most common being first, second, and third
normal form. Each level of normalization is more restrictive than the previous. The first three hierarchical levels of
normalization are:
-
First Normal Form-Repeating groups of data columns in tables have been eliminated such that data is
organized into atomic units.
-
Second Normal Form-Data is in first normal form, and redundancy on primary key fields has been eliminated
such that column values are wholly dependent on the primary key field.
-
Third Normal Form-Data is in second normal form, and each column is not dependent on any other non-key
column.
Other levels of normalization are possible but are not covered in this discussion. Information on additional
levels of normalization can be found in [DAT99]. The exact level of normalization to apply to the Data Model is a
decision that the database designer must make based on the specifics of the project situation.
|