Database Normalization
Database Design Process (review)
* Gather user and business requirements
* Develop an E-R model based on the user and business requirements
* Convert the E-R model to a set of relations (tables)
* Normalize the relations to remove anomalies
* Implement the database, creating one table for each normalized relation
Database Normalization
1. Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics—insertion, update, and deletion anomalies—that could lead to a loss of data integrity.(from http://en.wikipedia.org/wiki/Database_normalization)
2. Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. (from http://databases.about.com/od/specificproducts/a/normalization.htm)
* Normalization is the process of structuring the database so that most ambiguity is removed.
* Normalization proceeds in stages, from the mildest (1NF) to the most stringent (5NF).
* In practice, normalization usually stops at 3NF or BCNF, because that is already sufficient to produce good-quality tables.
Why is normalization done?
* Optimize table structures
* Increase speed
* Make data entry consistent
* Use storage media more efficiently
* Reduce redundancy
* Avoid anomalies (insertion anomalies, deletion anomalies, update anomalies).
* Improved data integrity
-> A table is said to be good (efficient), or normal, if it meets the following 3 criteria:
- If the table is decomposed, the decomposition must be guaranteed safe (lossless-join decomposition). That is, after the table is decomposed into new tables, joining those tables must reproduce exactly the original table.
- Functional dependencies are maintained when the data changes (dependency preservation).
- It does not violate Boyce-Codd Normal Form (BCNF).
-> If the third criterion (BCNF) cannot be met, then the table must at least not violate Third Normal Form (3NF).
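The lossless-join criterion can be illustrated with a small sketch. It assumes a relation is represented as a list of Python dicts, and the attribute names A, B, C and the sample rows are hypothetical. Note that it only tests one particular set of rows; a real guarantee has to be argued from the functional dependencies of the schema.

```python
def project(rows, attrs):
    """Project rows onto attrs; each result row is a sorted tuple of (attr, value)."""
    return {tuple(sorted((a, row[a]) for a in attrs)) for row in rows}


def natural_join(left, right):
    """Natural join of two projected relations on their shared attributes."""
    joined = set()
    for l_items in left:
        for r_items in right:
            l_row, r_row = dict(l_items), dict(r_items)
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.add(tuple(sorted({**l_row, **r_row}.items())))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the original rows."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical sample data in which B determines C.
rows = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 2, "B": "y", "C": "p"},
]

safe = is_lossless(rows, ["A", "B"], ["B", "C"])   # joined on B
lossy = is_lossless(rows, ["A", "C"], ["B", "C"])  # joined on C, creates spurious rows
print(safe, lossy)
```

Splitting on B keeps every row intact, while splitting on C pairs each A with every B and so produces rows that were never in the original table.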
Functional Dependency ~ 1
- A functional dependency describes a relationship between attributes in a relation.
- An attribute is said to be functionally dependent on another if the value of that other attribute can be used to determine its value.
- The symbol ( -> ) is used to represent a functional dependency. It is read as "functionally determines".
Functional Dependency ~ 2
- Notation : A -> B
A and B are attributes of a table. A functionally determines B, or B depends on A, if and only if whenever 2 rows of data have the same value of A, they also have the same value of B.
- Notation : A -//-> B or A X-> B
It is the opposite of the previous notation: A does not functionally determine B.
EXAMPLE:
Functional Dependency:
- NRP -> Nama
- Mata_Kuliah, NRP -> Nilai
Non Functional Dependency:
- Mata_Kuliah -> NRP
- NRP -> Nilai
Functional Dependency ~ 4
Functional dependencies of the Nilai (grades) table:
- NRP -> Nama
Because rows with the same NRP value always have the same Nama value.
- (Mata_Kuliah, NRP) -> Nilai
Because the value of Nilai depends on NRP and Mata_Kuliah together. In other words, rows with the same Mata_Kuliah and NRP also have the same Nilai, because Mata_Kuliah and NRP together form a key (are unique).
- Mata_Kuliah -//-> NRP
- NRP -//-> Nilai
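The definition of a functional dependency can be checked mechanically against sample data. The sketch below is only an illustration: the function name `holds` and the three rows are hypothetical, and only the attribute names NRP, Nama, Mata_Kuliah, and Nilai come from the example. A sample can refute a dependency, but it cannot prove that the dependency holds for every possible state of the table.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the grades table.
nilai = [
    {"NRP": "001", "Nama": "Ani", "Mata_Kuliah": "Basis Data", "Nilai": "A"},
    {"NRP": "001", "Nama": "Ani", "Mata_Kuliah": "Algoritma", "Nilai": "B"},
    {"NRP": "002", "Nama": "Budi", "Mata_Kuliah": "Basis Data", "Nilai": "B"},
]

print(holds(nilai, ["NRP"], ["Nama"]))                  # NRP -> Nama
print(holds(nilai, ["Mata_Kuliah", "NRP"], ["Nilai"]))  # (Mata_Kuliah, NRP) -> Nilai
print(holds(nilai, ["Mata_Kuliah"], ["NRP"]))           # refuted by the sample
print(holds(nilai, ["NRP"], ["Nilai"]))                 # refuted by the sample
```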
Normal Form
1. The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. (from http://en.wikipedia.org/wiki/Database_normalization)
First Normal Form (1NF)
A table is said to be in first normal form if it is not in unnormalized form, that is, it has no repeating fields and no fields that allow null (empty) values.
The following are not allowed:
- Multivalued attributes.
- Composite attributes, or a combination of both.
So:
The values in an attribute's domain must be atomic.
For example, student data as follows:
or
The tables above do not meet the requirements of 1NF.
Both tables are decomposed into:
Student Table:
Hobbies Table:
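As a rough illustration of this kind of split, the sketch below moves a multivalued attribute into its own table. The attribute names NRP, Name, and Hobby, the sample rows, and the helper `to_1nf` are all hypothetical and are not taken from the tables of the example.

```python
def to_1nf(rows, key, multivalued):
    """Split a multivalued attribute into a separate (key, value) table."""
    main, detail = [], []
    for row in rows:
        # The main table keeps every attribute except the multivalued one.
        main.append({k: v for k, v in row.items() if k != multivalued})
        # The detail table gets one atomic value per row.
        for value in row[multivalued]:
            detail.append({key: row[key], multivalued: value})
    return main, detail


# Hypothetical unnormalized data: Hobby holds a list, so it is not atomic.
unnormalized = [
    {"NRP": "001", "Name": "Ani", "Hobby": ["reading", "swimming"]},
    {"NRP": "002", "Name": "Budi", "Hobby": ["football"]},
]

students, hobbies = to_1nf(unnormalized, "NRP", "Hobby")
print(students)
print(hobbies)
```

Each row of the resulting hobbies table holds exactly one hobby, and the NRP column links it back to the student.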
Second Normal Form (2NF) ~ 1
- A table is in 2NF if it is in 1NF and all attributes other than the primary key have a full functional dependency on the primary key.
- A table does not meet 2NF if it has attributes whose functional dependency is only partial (depending on only part of the primary key).
- If there are attributes that have no dependence on the primary key, those attributes must be moved or removed.
Second Normal Form (2NF) ~ 2
- A functional dependency X -> Y is said to be full if deleting any attribute A from X means that Y is no longer functionally dependent on what remains.
- A functional dependency X -> Y is said to be partial if, after deleting some attribute A from X, Y is still functionally dependent on what remains.
- A relation schema R is in 2NF if every non-primary-key attribute A ∈ R is fully functionally dependent on the primary key of R.
EXAMPLE:
The following table meets 1NF, but not 2NF:
The table above does not meet 2NF. With (NIM, KodeMk) regarded as the primary key, the following dependencies hold, but several of them are only partial:
{ NIM, KodeMk } -> NamaMhs
{ NIM, KodeMk } -> Address
{ NIM, KodeMk } -> Matakuliah
{ NIM, KodeMk } -> SKS
{ NIM, KodeMk } -> NilaiHuruf
The table needs to be decomposed into several tables that satisfy 2NF.
Its functional dependencies are as follows:
- {NIM, KodeMk} -> NilaiHuruf (fd1)
- NIM -> {NamaMhs, Address} (fd2)
- KodeMk -> {Matakuliah, SKS} (fd3)
So that:
- fd1 (NIM, KodeMk, NilaiHuruf) -> Nilai (grades) table
- fd2 (NIM, NamaMhs, Address) -> Student table
- fd3 (KodeMk, Matakuliah, SKS) -> MataKuliah (course) table
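The decomposition along fd1, fd2, and fd3 amounts to three projections of the original table. The following sketch shows the idea with hypothetical rows; the attribute names are the ones used above, while the helper `project` and the data values are assumptions made for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs, keeping each distinct combination once."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


# Hypothetical 1NF rows: student and course details repeat on every grade row.
flat = [
    {"NIM": "101", "NamaMhs": "Ani", "Address": "Jl. Melati 1",
     "KodeMk": "MK1", "Matakuliah": "Basis Data", "SKS": 3, "NilaiHuruf": "A"},
    {"NIM": "101", "NamaMhs": "Ani", "Address": "Jl. Melati 1",
     "KodeMk": "MK2", "Matakuliah": "Algoritma", "SKS": 4, "NilaiHuruf": "B"},
    {"NIM": "102", "NamaMhs": "Budi", "Address": "Jl. Mawar 2",
     "KodeMk": "MK1", "Matakuliah": "Basis Data", "SKS": 3, "NilaiHuruf": "B"},
]

nilai = project(flat, ["NIM", "KodeMk", "NilaiHuruf"])       # fd1
student = project(flat, ["NIM", "NamaMhs", "Address"])       # fd2
matakuliah = project(flat, ["KodeMk", "Matakuliah", "SKS"])  # fd3

print(len(nilai), len(student), len(matakuliah))
```

After the split, each student and each course is stored once, and only the grade rows still carry the composite key.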
Third Normal Form (3NF) ~ 1
A table is in 3NF if it is in 2NF and no non-primary-key attribute depends on another non-primary-key attribute (a transitive dependency).
EXAMPLE:
The following Student table satisfies 2NF, but does not meet 3NF:
Because there are non-primary-key attributes (i.e., City and Province) that depend on another non-primary-key attribute (i.e., KodePos):
KodePos -> { City, Province }
So the table needs to be decomposed into:
- Student (NIM, NamaMhs, Road, KodePos)
- KodePos (KodePos, Province, City)
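One possible way to realize these two relations is sketched below with Python's built-in `sqlite3` module. The column types, the foreign key, and all data values are assumptions made for illustration; only the table and attribute names come from the decomposition above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
    CREATE TABLE KodePos (
        KodePos  TEXT PRIMARY KEY,
        Province TEXT NOT NULL,
        City     TEXT NOT NULL
    );
    CREATE TABLE Student (
        NIM     TEXT PRIMARY KEY,
        NamaMhs TEXT NOT NULL,
        Road    TEXT,
        KodePos TEXT REFERENCES KodePos(KodePos)
    );
""")

# Hypothetical data: two students share one postal code.
conn.execute("INSERT INTO KodePos VALUES (?, ?, ?)", ("10001", "Provinsi X", "Kota A"))
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [("101", "Ani", "Jl. Melati 1", "10001"), ("102", "Budi", "Jl. Mawar 2", "10001")],
)

# The city name is stored once, so a single UPDATE corrects it for every student.
conn.execute("UPDATE KodePos SET City = ? WHERE KodePos = ?", ("Kota B", "10001"))

rows = conn.execute("""
    SELECT s.NIM, s.NamaMhs, k.City, k.Province
    FROM Student AS s JOIN KodePos AS k ON k.KodePos = s.KodePos
    ORDER BY s.NIM
""").fetchall()
print(rows)
```

In the undecomposed table the same change would have to be repeated on every student row with that postal code, which is exactly the update anomaly that 3NF avoids.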
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form imposes a stronger constraint than Third Normal Form. To be in BCNF, a relation must be in First Normal Form, and every attribute must be functionally dependent on a superkey; that is, the determinant of every nontrivial functional dependency must be a superkey.
In the example below there is a Seminar relation whose primary key is NPM + Seminar.
Students may take one or two seminars. Each seminar requires 2 advisers (Pembimbing), and each student is supervised by one of the seminar's 2 advisers. Each adviser can lead only one seminar. In this example, NPM and Seminar together determine a Pembimbing.
The Seminar relation is in Third Normal Form but not in BCNF, because the seminar code is still functionally dependent on Pembimbing, given that each Pembimbing can teach only one seminar. Seminar thus depends on an attribute that is not a superkey, contrary to what BCNF requires. So the Seminar relation must be decomposed into two tables.
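The BCNF condition can be tested by computing attribute closures. The sketch below is a minimal illustration: the functions `closure` and `bcnf_violations` are hypothetical helpers, and the two functional dependencies are the ones described in the text (NPM + Seminar determines Pembimbing, and Pembimbing determines Seminar).

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the nontrivial dependencies whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not schema <= closure(lhs, fds)
    ]


seminar = {"NPM", "Seminar", "Pembimbing"}
fds = [
    ({"NPM", "Seminar"}, {"Pembimbing"}),
    ({"Pembimbing"}, {"Seminar"}),
]

violations = bcnf_violations(seminar, fds)
print(violations)

# Assumed textbook split on the violating dependency:
# (Pembimbing, Seminar) and (NPM, Pembimbing).
print(bcnf_violations({"Pembimbing", "Seminar"}, [({"Pembimbing"}, {"Seminar"})]))
```

The only violation reported is Pembimbing -> Seminar, since Pembimbing alone does not determine NPM. Splitting on that dependency is the usual textbook remedy and is an assumption here; in the resulting (Pembimbing, Seminar) relation, Pembimbing is a key, so the violation disappears.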
Fourth Normal Form (4NF) and Fifth Normal Form (5NF)
- A relation is in fourth normal form (4NF) if it is in BCNF and contains no multivalued dependencies. To remove a multivalued dependency from a relation, we split the relation into two new relations. Each new relation contains the two attributes that take part in one multivalued relationship.
- Fifth normal form (5NF) deals with the property called lossless join (a join without any loss of information). Fifth normal form is also called PJNF (projection-join normal form). Such cases are very rare and difficult to detect in practice.
References
1. Agus Sanjaya ER, S.Kom, M.Kom, presentation slides: Normalization
2. http://en.wikipedia.org/wiki/Database_normalization