There are several processes that must be done to design a database, namely:
- Gathering user needs / business
- Develop the ER model based on user needs / business
- Convert ER Model to set the relation (table)
- normalization of relations, for the anomaly
- to implement the database for each table to create relationships that have been in the normalization
Database Normalization
Normalization process is the establishment of the database structure so that most of the ambiguity can be removed. Normalization process of decomposition is also a relationship that is still "bad" with break the attributes to build some relationships. Normalization stage, starting from the most mild (1NF) to most stringent (5NF). Normalization is usually only up to the level of 3NF or BCNF because already sufficient to generate the table-a table of good quality.
Normalization is done at the design stage, before the database is implemented. Purpose to avoid update anomalies. To find out how to do normalization, we must first understand the concept of functional dependency and key.
Why do normalization?
- Optimization table structures
- Increase the speed
- Eliminate income data the same
- More efficient use of storage media
- Reduce redundancy
- Avoiding anomalies (insertion anomalies, deletion anomalies, update anomalies).
- Improved data integrity
A table saying good (efficient) or if the normal 3 to meet the following criteria:
- If there is decomposition (decomposition) table, then the decomposition will be guaranteed safe (Lossless-Join Decomposition). That is, after the table is described / in the decomposition into a new table-table, the table-table can generate a new table with the same exact.
- Maintain the functional dependence on the change data (Dependency preservation)
- No violate Boyce-Code Normal Form (BCNF)
If the three criteria (BCNF) can not be met, then at least the table does not violate the Normal Form of the third stage (3rd Normal Form / 3NF).
Normalization is required because of the redundancy relations, not relations of "good". Why?
- The main reason is the possibility of "update anomalies" (when the insert, delete, update) because can impact on the data inconsistencies
- The reason is other waste storage space (hard disk)
How to Handle anomaly?
1. Anomaly in the handle by programming language used to create the application database. Designer should note this anomaly and to tell a programmer.
2. Anomaly does not handle the system, but submitted to the operator to be careful in making modifications, inserts and deletes. It is human error risk.
3. Anomalies be avoided, with ways to secure the normalization.
Denormalization
Denormalization is a reverse process of normalization, ie, combine several relations, brought to normal form of a lower.
Functional Dependency (FD)
Functional dependency (abbreviated FD) is a restriction that comes from the meaning of attributes and relationships between attributes. Functional Dependency attributes describe the relationship in a relationship. An attribute said functionally dependant on the other, if we use the value attribute to determine the value of the other attributes.
Symbol that is used to represent --> functional dependency. --> read the functional set. FD from the fact that there is (obtained at the analysis system).
There is a functional dependency from A to B indicates that a value attribute to uniquely determine the value of attribute B.
- notation: A --> B
A and B are attributes of a table. A means of determining the functional B or B depends on A, if and only if there are 2 rows of data with the same value of A, then B is also the same value
- notation: A --> B or A x--> B
- It is the opposite of the previous notation.
Example:
Functional Dependency of the table value
- Nrp --> Nama
Because for each value Nrp the same, then the value of the same Nama
- {Mata_kuliah, NRP} --> Nilai
Because the Nilai of attributes depending on the NRP and Mata_kuliah together. In another sense Mata_kuliah for the NRP and the same, they also the same Nilai, because Mata_kuliah and the NRP is a key (is unique).
- Mata_kuliah --> NRP
- NRP --> Nilai
INFERENCE RULE TO FD
Example FD:
Suppose we are given a set of FD F, then we can reduce the FD-FD of F for the rule of inference. Example:
F = {SSN --> {EName, BDate, Address, DNumber}, {SSN, PNumber} --> Hours }
What is the FD that has a meaning similar to the F?
- SSN --> EName
- SSN --> BDate
- SSN --> Address
- SSN --> DNumber
- SSN, PNumber --> Hours
Amstrong’s Inference Rule
1. Reflective (IR1): If X --> Y, then X --> Y.
2. Augmentation (IR2): If {X --> Y} then XZ --> YZ.
3. Transitive (IR3): If {X--> Y, Y--> Z} then X --> Z.
Derived Inference Rule
Decomposition: If {X--> YZ} then X --> Y.
Additive (
Pseudotransitive: If {X -->Y, WY -->Z} then WX --> Z.
- closure (cover) from the set of functional dependency F, where F is F+ the set of all the FD has that can be derived from F.
- closure of the set of attribute X in relation to F is a set of X+ has all the attribute which is determined by the functional X.
- X+ can be searched by repeatedly applying IR1, IR2, IR3 with FD in the F
NORMAL FORM
Normal form is a condition (using the FD and key) that determines whether a scheme relationships meet certain criteria. There are several normal forms based on a number of criteria:
- Primary keys (1NF, 2NF, 3NF)
- All Candidate Keys (2NF, 3NF, BCNF)
- Multivalued dependencies (4NF)
- Join dependencies (5NF)
FIRST
A table on the form said to be normal if I did not reside in the unnormalized form of a table, where there is a kind of field multiplication and field that allows a null (empty) 1NF is not allowed on the:
- Attribute values, many (Multivalued attributes).
- Attribute a composite or a combination of both.
- nested relations.
- Price is the domain attribute must be atomic rates
- Ex Student Data as follows:
Or:
Table-top table in 1NF is not eligible, so the decomposition will be:
Mahasiswa Table
Hobby Table
Second
Normal form 2NF met in a table if it meets the form of 1NF, and all the attributes than the primary key, have a full Functional Dependency on primary key. A table does not meet 2NF said, if there are attributes that Functional Dependency are only partial (only depending on the part of the primary key). If there are attributes that have no dependence on the primary key, then the attributes must be moved or removed.
Functional dependency X --> Y if it is said of a remove attribute A from X means that Y is no longer dependent functional.
Functional dependency X --> Y if it is said partial delete an attribute A from X means that Y is functionally dependent.
Relation scheme R in the form 2NF if every non-primary key attribute A Î R is functionally dependent on the full primary key R.
The following table meet 1NF, 2NF, including but not
The table above does not meet 2NF, because (NIM, KodeMk) is regarded as the primary key:
{NIM, KodeMk} --> NamaMhs
{NIM, KodeMk} --> Alamat
{NIM, KodeMk} --> Matakuliah
{NIM, KodeMk} --> Sks
{NIM, KodeMk} --> NilaiHuruf
Table in the decomposition needs to be some of the table so that eligible 2NF, becomes:
- Functional dependency is as follows:
{NIM, KodeMk} --> NilaiHuruf (fd1)
NIM --> {NamaMhs, Alamat} (fd2)
KodeMk --> {Matakuliah, Sks} (fd3)
- So
fd1 (NIM, KodeMk, NilaiHuruf) --> Tabel Nilai
fd2 (NIM, NamaMhs, Alamat) --> Tabel Mahasiswa
fd3 (KodeMk, Matakuliah, Sks) --> Tabel MataKuliah
Third
Normal form 3NF fulfilled if the form meets 2NF, and if there are no non-primary key attribute that has a dependence on non-primary key attributes of the other (transitive dependencies).
Table following students eligible 2NF, 3NF, but does not meet
Because the table above there are still non-primary key attribute (ie,
KodePos --> {Kota, Provinsi}
So that the table in the decomposition needs to be:
- Mahasiswa (NIM, NamaMhs, Jalan, KodePos)
– KodePos (KodePos, Provinsi,
Boyce-Codd
Boyce-Codd Normal Form constraint has a stronger form of the
In the example below there is a relationship seminar, is the Primary Key NPM + Seminar.
Students may take one or two seminars. Each seminar requires 2 each of the students and led by one of the 2 seminar. Each leader can only take one seminar course. NPM and Seminar in this example and show a Pembimbing.
Seminar relations is a Third Normal Form, but not BCNF because Code Seminar is still dependent on the function Pembimbing, if any Pembimbing can only teach a seminar. Depending on the seminar is not a super key attributes such as required by BCNF. But relations Seminar should be parsed into two namely:
Normal Form of the Fourth and Fifth
- Relations in fourth normal form (4NF) if the relation in BCNF and does not contain a lot of dependence values. To remove the dependency of many values from a relation, we divide the relationship into two new relations. Each relation contains two attributes that have a lot of relationship value.
- Relations in fifth normal form (5NF) deal with the property called the join without any loss of information (lossless join). Fifth normal form also called the 5 NF PJNF (projection join normal form). The case is very rare and appear difficult to detect in practice.
References:
- ER Ngurah Agus Sanjaya. Slide Part 6 - NORMALISASI.
- Siti Aminah. Slide Review_Normalisasi_untuk_Vcon_7_des.