Normalization

Normalization is a formal method that applies a series of tests to help the database designer identify the optimal grouping of attributes for each relation in a relational schema. It can be applied to each relation individually, so that the database is normalized to a specific form and update anomalies are prevented.

Data Redundancy and Update Anomalies

A main purpose of database design is to group attributes into relations so that data redundancy and NULL values are minimized, which also saves storage space. Data redundancy leads to update anomalies, which are classified into three types:

• Insertion anomalies
• Deletion anomalies
• Modification anomalies

Insertion Anomalies
To insert the details of a new student into the Class_Info relation, we must also include the details of a lecturer and a subject; otherwise those attributes must be left NULL.

Deletion Anomalies
If we delete a lecturer from the Class_Info relation, the details of the students and subjects that appear only in that lecturer's rows are also lost from the database.

Modification Anomalies
If we want to change the value of an attribute of a particular student in the Class_Info relation, we must update every row associated with that student. If the modification is not carried out on all the appropriate rows, the database becomes inconsistent.
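The modification and deletion anomalies described above can be reproduced on a handful of Class_Info rows. The following is a minimal Python sketch, not part of the original slides: the helper name salaries_by_lecturer and the new salary 26000 are hypothetical, and only a few columns and rows of the relation are used.

```python
# A few Class_Info rows; each dict is one row (subset of the columns).
class_info = [
    {"LID": "E6001", "Lname": "Anan", "Salary": 24900, "Subject": "Optimization", "SID": "S8"},
    {"LID": "E6001", "Lname": "Anan", "Salary": 24900, "Subject": "Prob Stat", "SID": "S8"},
    {"LID": "E6002", "Lname": "Saeree", "Salary": 53020, "Subject": "Optimization", "SID": "S10"},
    {"LID": "E6002", "Lname": "Saeree", "Salary": 53020, "Subject": "Optimization", "SID": "S11"},
]


def salaries_by_lecturer(rows):
    """Collect every salary recorded for each lecturer (hypothetical helper)."""
    found = {}
    for row in rows:
        found.setdefault(row["LID"], set()).add(row["Salary"])
    return found


# Modification anomaly: the (hypothetical) raise reaches only one of Anan's rows.
class_info[0]["Salary"] = 26000
inconsistent = {lid for lid, s in salaries_by_lecturer(class_info).items() if len(s) > 1}

# Deletion anomaly: removing lecturer E6002 also removes students S10 and S11.
students_before = {row["SID"] for row in class_info}
remaining = [row for row in class_info if row["LID"] != "E6002"]
lost_students = students_before - {row["SID"] for row in remaining}

print(sorted(inconsistent))   # ['E6001']
print(sorted(lost_students))  # ['S10', 'S11']
```

Because the lecturer's salary is stored once per enrolled student, a single missed row is enough to make the relation contradict itself.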
Insertion Anomaly

Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Services    4       S2    Panu      2.45

New records:
NULL   NULL     NULL    NULL  NULL            NULL    S999  Luxana    NULL
E9999  Thana    17500   CPE   NULL            NULL    NULL  NULL      NULL
NULL   NULL     NULL    CPE   GIS             4       NULL  NULL      NULL

Inserting new records may cause data redundancy and NULL values in some fields.
Insertion Anomaly

Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E5001  Dusit    28700   EE    Power Control   3       S4    Panita    3.35
E5001  Dusit    28700   EE    Power Control   3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Power Control   3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Power Control   3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Services    4       S2    Panu      2.45

New records:
NULL   NULL     NULL    NULL  NULL            NULL    S999  Luxana    NULL
E9999  Thana    17500   CPE   NULL            NULL    NULL  NULL      NULL
NULL   NULL     NULL    CPE   Prob Stat       4       NULL  NULL      NULL

Inserting new records may cause data redundancy and NULL values in some fields. Here, adding the subject Power Control for lecturer E5001 repeats the lecturer's details and each student's details once more per student.
Deletion Anomaly

Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Services    4       S2    Panu      2.45

A deletion anomaly may cause the loss of other necessary data.

Modification Anomaly

Class_Info (changed values are shown as old -> new)
LID    Lname    Salary          Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700 -> 45000  EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700 -> 45000  EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700 -> 45000  EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700 -> 45000  EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900           IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900           IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900           IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900           IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020           IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020           IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500 -> 21000  CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500           CPE   Data Structure  3       S2    Panu      2.45 -> 2.67
E9001  Pattara  18500           CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500           CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500 -> 25000  CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500           CPE   Web Services    4       S2    Panu      2.45

The salary of lecturer E5001 is changed in all four of his rows, so the relation stays consistent. The salary of lecturer E9001 and the GPA of student S2 are changed in only some of their rows, so the relation now holds conflicting values for them.

If we want to change the value of one of the attributes of a particular entity in the relation, we must update all rows that relate to this entity. If this modification is not carried out on all the appropriate rows, the database will become inconsistent.

To solve update anomalies, a relation must be normalized: the normalization process removes the existing data redundancy.

Functional Dependency

One of the main concepts associated with normalization is functional dependency, which describes the relationship between attributes in a relation. For example, if A and B are attributes (or sets of attributes) of relation R, B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B.

The notation A → B can be read in any of the following ways:
• B is functionally dependent on A
• A determines B
• B depends on A
(Definition of Functional Dependency) Suppose that A and B are attributes of a relation. We say that B is functionally dependent on A (denoted A → B) if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.)

The notation A → B means:
• B is functionally dependent on A
• A functionally determines B
• B depends on A

Formally, the functional dependency α → β holds on schema R if, in any legal relation r, for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].

Equivalently, given a relation r, attribute y of r is functionally dependent on attribute x if and only if whenever two tuples of r agree on their x-value, they necessarily agree on their y-value. In other words, if α → β holds on R and t1[α] = t2[α], the DBMS must guarantee that t1[β] = t2[β].

When a functional dependency exists, the attribute or group of attributes on the left-hand side of the arrow is called the determinant. In A → B, A is the determinant.
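The definition above can be turned into a small check over a relation instance. This is an illustrative Python sketch, not part of the original slides: the function name fd_holds is an assumption, and the three rows are taken from the Class_Info sample data. Note that an instance can only refute a dependency; passing the check on sample rows does not prove that the dependency holds on the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first right-hand value seen for this left-hand value.
        if seen.setdefault(left, right) != right:
            return False
    return True


rows = [
    {"LID": "E6001", "Lname": "Anan", "Subject": "Optimization", "SID": "S8"},
    {"LID": "E6001", "Lname": "Anan", "Subject": "Prob Stat", "SID": "S8"},
    {"LID": "E6002", "Lname": "Saeree", "Subject": "Optimization", "SID": "S10"},
]

print(fd_holds(rows, ["LID"], ["Lname"]))    # True
print(fd_holds(rows, ["Subject"], ["LID"]))  # False: Optimization has two lecturers
```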
Example: Staff_No and Position

Position is functionally dependent on Staff_No (Staff_No → Position): each Staff_No, for example SL21, is associated with exactly one Position (System Engineer).

Staff_No is not functionally dependent on Position: one Position (System Engineer) is associated with more than one Staff_No (SL21 and SG5).

Functional dependencies of the Class_Info relation

(LID, Subject, SID) → Lname, Salary, Dept, Credit, Sname, GPA
LID → Lname, Salary, Dept
Subject → Credit
SID → Sname, GPA

Utilization of FD to decompose a relation

LID    Lname  Salary  Dept  Subject       Credit  SID  Sname    GPA
E5001  Dusit  28700   EE    Electronic 1  3       S4   Panita   3.35
E5001  Dusit  28700   EE    Electronic 1  3       S5   Sarun    2.96
E5001  Dusit  28700   EE    Electronic 1  3       S6   Kanok    2.75
E5001  Dusit  28700   EE    Electronic 1  3       S7   Vichu    3.15
E6001  Anan   24900   IE    Optimization  3       S8   Kitti    2.54
E6001  Anan   24900   IE    Optimization  3       S9   Chareon  3.08
...    ...    ...     ...   ...           ...     ...  ...      ...
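One way to read the heading above is that each functional dependency suggests a projection of Class_Info onto its determinant and dependent attributes. A minimal Python sketch under that assumption follows; the project helper is hypothetical and only four sample rows are used.

```python
COLUMNS = ("LID", "Lname", "Salary", "Dept", "Subject", "Credit", "SID", "Sname", "GPA")
SAMPLE = [
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S4", "Panita", 3.35),
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S5", "Sarun", 2.96),
    ("E6001", "Anan", 24900, "IE", "Optimization", 3, "S8", "Kitti", 2.54),
    ("E6001", "Anan", 24900, "IE", "Prob Stat", 4, "S8", "Kitti", 2.54),
]
class_info = [dict(zip(COLUMNS, values)) for values in SAMPLE]


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first-seen order)."""
    return list(dict.fromkeys(tuple(row[a] for a in attrs) for row in rows))


# One projection per functional dependency: determinant plus dependents.
lecturer = project(class_info, ["LID", "Lname", "Salary", "Dept"])
subject = project(class_info, ["Subject", "Credit"])
student = project(class_info, ["SID", "Sname", "GPA"])

print(lecturer)  # each lecturer now appears once
print(subject)
print(student)
```

A projection onto the key (LID, Subject, SID) would also be needed to remember which student takes which subject with which lecturer; the three projections alone do not preserve that link.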
Lecturer
LID    Lname    Salary  Dept
E5001  Dusit    28700   EE
E6001  Anan     24900   IE
E6002  Saeree   53020   IE
E9001  Pattara  18500   CPE

Subject
Subject         Credit
Electronic 1    3
Optimization    3
Prob Stat       4
Data Structure  3
Web Services    4

Student
SID  Sname    GPA
S1   Preeda   2.85
S2   Panu     2.45
S3   Vallapa  3.02
S4   Panita   3.35
S5   Sarun    2.96
S6   Kanok    2.75
S7   Vichu    3.15
S8   Kitti    2.54
S9   Chareon  3.08

The Process of Normalization

The process of normalization is a formal method that identifies relations based on their primary key (or candidate keys, in the case of BCNF) and the functional dependencies among their attributes. It proceeds through the following forms:

• Unnormalized Form (UNF)
• 1st Normal Form (1NF)
• 2nd Normal Form (2NF)
• 3rd Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)

Relationships of Normal Forms

Each normal form is stricter than the one before it: 1NF, 2NF, 3NF/BCNF, 4NF, 5NF, and higher normal forms such as DKNF.

Case Study

The DreamHome company manages property on behalf of the owners and, as part of this service, takes care of renting out the property. To simplify this example, we assume that a customer rents a given property only once and cannot rent more than one property at any one time.

Unnormalized form (UNF): A table that contains one or more repeating groups.
Customer_Rental Relation (unnormalized)

Cust_No  CName          Property_No  PAddress        Rent  RentStart  RentFinish  Owner_No  OName        O_addr
CR76     John Kay       PG4          6 Lawrence St,  350   1-Jul-94   31-Aug-96   CO40      Tina Murphy  ...
                        PG16         5 Norwar Dr     450   1-Sep-96   1-Sep-98    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG4          6 Lawrence St,  350   1-Sep-92   10-Jun-94   CO40      Tina Murphy  ...
                        PG36         2 Manor Rd,     375   10-Oct-94  1-Dec-95    CO93      Tony Shaw    ...
                        PG16         5 Norwar Dr     450   1-Jan-96   10-Aug-96   CO93      Tony Shaw    ...

A repeating group is an attribute or group of attributes within a table that occurs with multiple values for a single occurrence of the key attribute(s) of that table. The term key refers to the attribute(s) that uniquely identify each row within the unnormalized table. Here the key is Cust_No, and the property details form the repeating group.

To convert the unnormalized form to 1NF, remove the repeating groups so that the data form a relational table, with one value in every cell.

Customer_Rental Relation (1NF)

Cust_No  CName          Property_No  PAddress        Rent  RentStart  RentFinish  Owner_No  OName        O_addr
CR76     John Kay       PG4          6 Lawrence St,  350   1-Jul-94   31-Aug-96   CO40      Tina Murphy  ...
CR76     John Kay       PG16         5 Norwar Dr     450   1-Sep-96   1-Sep-98    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG4          6 Lawrence St,  350   1-Sep-92   10-Jun-94   CO40      Tina Murphy  ...
CR56     Aline Stewart  PG36         2 Manor Rd,     375   10-Oct-94  1-Dec-95    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG16         5 Norwar Dr     450   1-Jan-96   10-Aug-96   CO93      Tony Shaw    ...

First normal form (1NF): A relation in which the intersection of each row and column contains one and only one value.
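Removing a repeating group amounts to copying the key attributes onto every member of the group. A minimal Python sketch, assuming the unnormalized data is held as nested records (the field name rentals and the function name to_1nf are hypothetical; only some columns of Customer_Rental are shown):

```python
# Unnormalized form: each customer carries a repeating group of rentals.
unf = [
    {"Cust_No": "CR76", "CName": "John Kay", "rentals": [
        {"Property_No": "PG4", "Rent": 350, "RentStart": "1-Jul-94"},
        {"Property_No": "PG16", "Rent": 450, "RentStart": "1-Sep-96"},
    ]},
    {"Cust_No": "CR56", "CName": "Aline Stewart", "rentals": [
        {"Property_No": "PG4", "Rent": 350, "RentStart": "1-Sep-92"},
        {"Property_No": "PG36", "Rent": 375, "RentStart": "10-Oct-94"},
        {"Property_No": "PG16", "Rent": 450, "RentStart": "1-Jan-96"},
    ]},
]


def to_1nf(records, group_field):
    """Flatten one repeating group: emit one row per group member."""
    flat = []
    for record in records:
        parent = {k: v for k, v in record.items() if k != group_field}
        for member in record[group_field]:
            flat.append({**parent, **member})
    return flat


first_nf = to_1nf(unf, "rentals")
for row in first_nf:
    print(row)
```

The result has five rows, one per rental, and every cell holds a single value; the price is that CName is now repeated for each rental of the same customer.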
Customer_Rental Relation

Customer_No  Property_No  CName          PAddress        Rent  RentStart  RentFinish  Owner_No  OName
CR76         PG4          John Kay       6 Lawrence St,  350   1-Jul-94   31-Aug-96   CO40      Tina Murphy
CR76         PG16         John Kay       5 Norwar Dr     450   1-Sep-96   1-Sep-98    CO93      Tony Shaw
CR56         PG4          Aline Stewart  6 Lawrence St,  350   1-Sep-92   10-Jun-94   CO40      Tina Murphy
CR56         PG36         Aline Stewart  2 Manor Rd,     375   10-Oct-94  1-Dec-95    CO93      Tony Shaw
CR56         PG16         Aline Stewart  5 Norwar Dr     450   1-Jan-96   10-Aug-96   CO93      Tony Shaw

For the relational data model, only first normal form (1NF) is critical in creating appropriate relations. All the subsequent normal forms are optional. However, to avoid update anomalies, it is recommended that we proceed to at least 3NF.

Functional dependencies of the Customer_Rental relation

fd1  Customer_No, Property_No → RentStart, RentFinish                                        (Primary key)
fd2  Customer_No → CName                                                                     (Partial dependency)
fd3  Property_No → PAddress, Rent, Owner_No, OName                                           (Partial dependency)
fd4  Owner_No → OName, O_addr                                                                (Transitive dependency)
fd5  Customer_No, RentStart → Property_No, PAddress, RentFinish, Rent, Owner_No, OName       (Candidate key)
fd6  Property_No, RentStart → Customer_No, CName, RentFinish                                 (Candidate key)

Second normal form (2NF): A relation that is in first normal form and in which every non-primary-key attribute is fully functionally dependent on the primary key.

Full functional dependency: If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.

If B is a non-key attribute that is functionally dependent on only part of the primary key A, we say that B is partially dependent on A.
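Partial dependencies can be found mechanically by computing the attribute closure of every proper subset of the primary key. The sketch below is illustrative only; the names closure, partial_dependencies and fd are assumptions, and it uses fd1 to fd4 as listed above.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - set(key)
            if dependents:
                found[subset] = dependents
    return found


def fd(lhs, rhs):
    """Build one dependency as a (determinant, dependents) pair of frozensets."""
    return frozenset(lhs), frozenset(rhs)


FDS = [
    fd(["Customer_No", "Property_No"], ["RentStart", "RentFinish"]),  # fd1
    fd(["Customer_No"], ["CName"]),                                   # fd2
    fd(["Property_No"], ["PAddress", "Rent", "Owner_No", "OName"]),   # fd3
    fd(["Owner_No"], ["OName", "O_addr"]),                            # fd4
]
KEY = ["Customer_No", "Property_No"]

partial = partial_dependencies(KEY, FDS)
for subset, dependents in partial.items():
    print(subset, sorted(dependents))
```

O_addr shows up under Property_No because the closure follows fd4 as well; that extra hop is the transitive dependency that the 3NF step deals with.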
A partial dependency must be removed by moving the attributes involved into a new table, so that each non-key attribute becomes fully dependent on the primary key of its table.

In Customer_Rental (Customer_No, Property_No, CName, PAddress, RentStart, RentFinish, Rent, Owner_No, OName, O_addr):
• fd1 identifies the primary key (Customer_No, Property_No)
• fd2 is a partial dependency on Customer_No
• fd3 is a partial dependency on Property_No

Removing the partial dependencies gives three relations in 2NF:

Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, O_addr)

Customer Relation
Customer_No  CName
CR76         John Kay
CR56         Aline Stewart

Rental Relation
Customer_No  Property_No  RentStart  RentFinish
CR76         PG4          1-Jul-94   31-Aug-96
CR76         PG16         1-Sep-96   1-Sep-98
CR56         PG4          1-Sep-92   10-Jun-94
CR56         PG36         10-Oct-94  1-Dec-95
CR56         PG16         1-Jan-96   10-Aug-96

Property_Owner Relation
Property_No  PAddress        Rent  Owner_No  OName        O_addr
PG4          6 Lawrence St,  350   CO40      Tina Murphy  28 North Rye
PG16         5 Norwar Dr     450   CO93      Tony Shaw    550/8 Lake Shore Dr.
PG36         2 Manor Rd,     375   CO93      Tony Shaw    550/8 Lake Shore Dr.

2NF applies to relations with composite keys, that is, relations whose primary key is composed of two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF.

Transitive dependency

In Property_Owner, OName and O_addr depend on Owner_No, which in turn depends on the primary key Property_No. This transitive dependency remains after the 2NF step, which is why the owner details of CO93 are stored twice.
Transitive dependency: A condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Third normal form (3NF): A relation that is in first and second normal form, and in which no non-primary-key attribute is transitively dependent on the primary key.

Removing the transitive dependency from Property_Owner splits it into Property_for_Rent and Owner. The resulting 3NF relations are:

Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_for_Rent (Property_No, PAddress, Rent, Owner_No)
Owner (Owner_No, OName, O_addr)

Decomposition of Customer_Rental:
• 1NF: Customer_Rental
• 2NF: Customer, Rental, Property_Owner
• 3NF: Customer, Rental, Property_for_Rent, Owner

Customer Relation
Customer_No  CName
CR76         John Kay
CR56         Aline Stewart

Rental Relation
Customer_No  Property_No  RentStart  RentFinish
CR76         PG4          1-Jul-94   31-Aug-96
CR76         PG16         1-Sep-96   1-Sep-98
CR56         PG4          1-Sep-92   10-Jun-94
CR56         PG36         10-Oct-94  1-Dec-95
CR56         PG16         1-Jan-96   10-Aug-96

Property_for_Rent Relation
Property_No  PAddress        Rent  Owner_No
PG4          6 Lawrence St,  350   CO40
PG16         5 Norwar Dr     450   CO93
PG36         2 Manor Rd,     375   CO93

Owner Relation
Owner_No  OName        O_addr
CO40      Tina Murphy  28 North Rye
CO93      Tony Shaw    550/8 Lake Shore Dr.

From 3NF to Boyce-Codd Normal Form (BCNF)

BCNF is based on functional dependencies that take into account all candidate keys in a relation. For a relation with only one candidate key, 3NF and BCNF are equivalent.

The difference between 3NF and BCNF is that, for a functional dependency A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that, for this dependency to remain in a relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF: every relation in BCNF is also in 3NF.

Boyce-Codd normal form (BCNF): A relation is in BCNF if and only if every determinant is a candidate key.
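The BCNF rule can be checked by testing whether the closure of each determinant covers the whole relation. The following Python sketch is an illustration under simplifying assumptions: the function names are hypothetical, only fd3 and fd4 of the Customer_Rental case study are used, and dependencies are matched against a relation by a simple attribute test rather than by a full projection of the dependency set.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Determinants that apply to the relation but do not determine all of it."""
    relation = set(relation)
    bad = []
    for lhs, rhs in fds:
        # Consider only dependencies that say something about this relation.
        relevant = lhs <= relation and bool((rhs & relation) - lhs)
        if relevant and not closure(lhs, fds) >= relation:
            bad.append(set(lhs))
    return bad


FDS = [
    (frozenset({"Property_No"}), frozenset({"PAddress", "Rent", "Owner_No", "OName"})),  # fd3
    (frozenset({"Owner_No"}), frozenset({"OName", "O_addr"})),                            # fd4
]

PROPERTY_OWNER = ["Property_No", "PAddress", "Rent", "Owner_No", "OName", "O_addr"]
PROPERTY_FOR_RENT = ["Property_No", "PAddress", "Rent", "Owner_No"]
OWNER = ["Owner_No", "OName", "O_addr"]

print(bcnf_violations(PROPERTY_OWNER, FDS))     # [{'Owner_No'}]
print(bcnf_violations(PROPERTY_FOR_RENT, FDS))  # []
print(bcnf_violations(OWNER, FDS))              # []
```

Property_Owner is flagged because Owner_No is a determinant without being a key of that relation; after the split, each remaining determinant is the key of its own relation.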
Violations of BCNF are quite rare, since they can occur only under specific conditions. A relation has the potential to violate BCNF when it:
• contains two (or more) composite candidate keys, and
• those candidate keys overlap, that is, share at least one attribute in common.

Case Study

In this example, the Client_Interview relation is presented. It contains details of the arrangements for interviews of clients by members of staff of the DreamHome company. The members of staff involved in interviewing clients are allocated to a specific room on the day of the interview. However, a room may be allocated to several members of staff as required throughout a working day. A client is interviewed only once on a given date, but may be requested to attend further interviews at later dates.

This relation has three candidate keys: (Client_No, Interview_Date), (Staff_No, Interview_Date, Interview_Time), and (Room_No, Interview_Date, Interview_Time). The Client_Interview relation therefore has three composite candidate keys, which overlap by sharing the common attribute Interview_Date. We select (Client_No, Interview_Date) to act as the primary key for this relation.
Client_Interview (Client_No, Interview_Date, Interview_Time, Staff_No, Room_No)

The Client_Interview relation has the following functional dependencies:

fd1  Client_No, Interview_Date → Interview_Time, Staff_No, Room_No   (Primary key)
fd2  Staff_No, Interview_Date, Interview_Time → Client_No            (Candidate key)
fd3  Room_No, Interview_Date, Interview_Time → Staff_No, Client_No   (Candidate key)
fd4  Staff_No, Interview_Date → Room_No

Client_Interview Relation
Client_No  Interview_Date  Interview_Time  Staff_No  Room_No
CR76       13-May-98       10:30           SG5       G101
CR56       13-May-98       12:00           SG5       G101
CR74       13-May-98       12:00           SG37      G102
CR56       1-Jul-98        10:30           SG5       G102

The determinant of fd4, (Staff_No, Interview_Date), is not a candidate key, so Client_Interview is not in BCNF. Decomposing on fd4 gives two BCNF relations:

Interview (Client_No, Interview_Date, Interview_Time, Staff_No)
Staff_Room (Staff_No, Interview_Date, Room_No)

Interview Relation
Client_No  Interview_Date  Interview_Time  Staff_No
CR76       13-May-98       10:30           SG5
CR56       13-May-98       12:00           SG5
CR74       13-May-98       12:00           SG37
CR56       1-Jul-98        10:30           SG5

Staff_Room Relation
Staff_No  Interview_Date  Room_No
SG5       13-May-98       G101
SG37      13-May-98       G102
SG5       1-Jul-98        G102

Review of Normalization (1NF to BCNF)

The DreamHome company manages property on behalf of the owners, and as part of this service the company undertakes regular inspections of the property by members of staff. When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. However, a car may be allocated to several members of staff, as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is inspected only once on a given date.
Property_Inspection Relation (unnormalized)

Property_No  PAddress        IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          6 Lawrence St,  18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
                             22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
                             1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         5 Norwar Dr     22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
                             24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR

Property_Inspection (Property_No, PAddress, IDate, ITime, Comments, Staff_No, SName, Car_Reg)

1NF: Property_Inspection Relation

Property_No  IDate      ITime  PAddress        Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  6 Lawrence St,  Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  6 Lawrence St,  In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  6 Lawrence St,  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  5 Norwar Dr     Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  5 Norwar Dr     Good condition            SG37      Ann Beech   N721 HFR

Property_Inspection (Property_No, IDate, ITime, PAddress, Comments, Staff_No, SName, Car_Reg)

Functional dependencies of the Property_Inspection relation

FD1  Property_No, IDate → ITime, Comments, Staff_No, SName, Car_Reg   (Primary key)
FD2  Property_No → PAddress                                           (Partial dependency)
FD3  Staff_No → SName                                                 (Transitive dependency)
FD4  Staff_No, IDate → Car_Reg
FD5  IDate, ITime, Car_Reg → all other attributes                     (Candidate key)
FD6  IDate, ITime, Staff_No → all other attributes                    (Candidate key)

The potential to violate BCNF exists because the relation contains composite candidate keys that overlap, that is, share at least one attribute in common: (Property_No, IDate), (IDate, ITime, Car_Reg), and (IDate, ITime, Staff_No).

Remove the partial dependency (decompose the relation) to obtain 2NF

Property Relation
Property_No  PAddress
PG4          6 Lawrence St,
PG16         5 Norwar Dr

Property_Inspection Relation
Property_No  IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR

Remove the transitive dependency (decompose the relation) to obtain 3NF

Staff Relation
Staff_No  SName
SG37      Ann Beech
SG14      David Ford

Property_Inspection Relation
Property_No  IDate      ITime  Comments                  Staff_No  Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      N721 HFR

Remove the remaining anomalies from functional dependencies to obtain BCNF

In the 3NF Property_Inspection relation, the determinant of FD4, (Staff_No, IDate), is not a candidate key. Decomposing on FD4 gives:

Staff_Car (Staff_No, IDate, Car_Reg)
Inspection (Property_No, IDate, ITime, Comments, Staff_No)

The Property and Staff relations are unchanged.

From BCNF to Fourth Normal Form (4NF)

Although BCNF removes any anomalies due to functional dependencies, further research led to the identification of another type of dependency, the multi-valued dependency (MVD), which can cause similar design problems for relations in terms of data redundancy. Even though the following table is in BCNF, update anomalies still exist.

Lect_Sub_Research Relation
Lecturer_Name  Subject         Research
Yuen           Data Structure  Natural Language Processing
Yuen           Data Structure  Protocol Analyzer
Yuen           Discrete Math   Natural Language Processing
Yuen           Discrete Math   Protocol Analyzer
Yuen           Data Base       Natural Language Processing
Yuen           Data Base       Protocol Analyzer
Chalerrmsak    Data Structure  Protocol Analyzer
Chalerrmsak    Data Structure  Compiler Utilities
Chalerrmsak    Data Structure  Natural Language Processing

Multi-valued dependency (MVD): Represents a dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other.
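The independence of the two value sets can be tested by splitting the relation into (Lecturer, Subject) and (Lecturer, Research) pairs and joining them back: if the join reproduces exactly the original rows, the multi-valued dependencies hold on this instance. This is a minimal Python sketch; the function names split, join and mvd_holds are assumptions, and the rows are those of the Lect_Sub_Research table.

```python
ROWS = {
    ("Yuen", "Data Structure", "Natural Language Processing"),
    ("Yuen", "Data Structure", "Protocol Analyzer"),
    ("Yuen", "Discrete Math", "Natural Language Processing"),
    ("Yuen", "Discrete Math", "Protocol Analyzer"),
    ("Yuen", "Data Base", "Natural Language Processing"),
    ("Yuen", "Data Base", "Protocol Analyzer"),
    ("Chalerrmsak", "Data Structure", "Protocol Analyzer"),
    ("Chalerrmsak", "Data Structure", "Compiler Utilities"),
    ("Chalerrmsak", "Data Structure", "Natural Language Processing"),
}


def split(rows):
    """Project onto (lecturer, subject) and (lecturer, research)."""
    lec_sub = {(lect, subj) for lect, subj, _ in rows}
    lec_res = {(lect, res) for lect, _, res in rows}
    return lec_sub, lec_res


def join(lec_sub, lec_res):
    """Natural join of the two projections on the lecturer name."""
    return {(l1, subj, res) for l1, subj in lec_sub for l2, res in lec_res if l1 == l2}


def mvd_holds(rows):
    """True if subjects and research topics are independent per lecturer."""
    return join(*split(rows)) == set(rows)


lec_sub, lec_res = split(ROWS)
print(len(lec_sub), len(lec_res))  # 4 5
print(mvd_holds(ROWS))             # True

# Dropping a single row breaks the independence on this instance.
print(mvd_holds(ROWS - {("Yuen", "Data Base", "Protocol Analyzer")}))  # False
```

The nine original rows shrink to four plus five rows in the two projections, and adding a new subject for a lecturer then takes one row instead of one row per research topic.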
A →→ B
A →→ C

Lecturer →→ Subject
Lecturer →→ Research

Decomposing Lect_Sub_Research on these multi-valued dependencies gives two relations in 4NF:

Lec_Sub Relation
Lecturer_Name  Subject
Yuen           Data Structure
Yuen           Discrete Math
Yuen           Data Base
Chalerrmsak    Data Structure

Lec_Research Relation
Lecturer_Name  Research
Yuen           Natural Language Processing
Yuen           Protocol Analyzer
Chalerrmsak    Protocol Analyzer
Chalerrmsak    Compiler Utilities
Chalerrmsak    Natural Language Processing

The normalization steps, in order:

Unnormalized form (UNF)
    Remove repeating groups
First normal form (1NF)
    Remove partial dependencies
Second normal form (2NF)
    Remove transitive dependencies
Third normal form (3NF)
    Remove remaining anomalies from functional dependencies
Boyce-Codd normal form (BCNF)
    Remove multi-valued dependencies
Fourth normal form (4NF)