The Practical Need for Fourth Normal Form Margaret S. W% 425 Beldon Iowa City, Phone: Ave Iowa (319) 52246 335-0846 ABSTRACT Many practitioners encountered. data violating normal form and We report fourth academicians believe that upon of forty organizational a study normal is more form. important data Consequently, than previously violating fourth normal database$ the need to nine form is rarely of them understand contained and use fourth believed. INTRODUCTION A paramount issue in the design of any database is what data fields should be grouped together into records. In the relational model, the data fields are grouped into logical structures called relations. The determination of which data fields are placed together in a relation is based upon the concept of normal forms; the process is known as normalization. The set of data fields comprising the database is progressively organized into relations in first through fdth normal form placed upon the relations that the relation There fourth has no modification is some normal neglect form the topic Stamper normal Price in [10] are so rarely in thk book.” and 4NF highly theoretical step in the normalization process, we may find that the relations no longer require further reorganization, in that either ignore practitioners case, we say that final normal form has been achieved. Frequently, a set of data may be in final normal form Thus, the practical stops with normal created to replace 3NF. with 3NF in the normalization process. normal form called Domain Key not applied to normalize data but relation has been normalized to its Permission grantad direct to copy provided commercial that of the publication that copying and/or specific e 1992 ACM fee tha copies advantage, title Machinery. without To copy all or part the ACM otherwise, is placed after There is an addkional normal form which is serves to verify that a final form by showing of this are not made and its date is by permission BCNF copyright appear, material or distributed notica and notica of the Association or to republish, is and the is given The a fee 9..,$1.50 19 are by the the next section, for 4NF paper and discuss state the section, into this fourth section. Finally, given in the last section. in these the results are In The second of this study. of the empirical presented of need procedures the organization a summary results databases. of 4NF. investigation recc~rd objectives. the practical world in update process result methodology insights Additional during and normalization the and defined is to present we present gained were will the advantages the research issue. apparently normalization of real of in academicians investigated examination we theory in meeting which [1,6,9] view by Edwards forms 4NF ineffective of this The an academic some of the are still texts inconsistencies violating study 4NF third for are not “Although they 4NF. is merely the normal data purpose details 4NF termination for research. permission. 0-89791-468-6192100021001 that explain is represented BCNF database of an empirical the for Computing requires designs section for the the or prevent anomalies, 5NF, fifth business they that database use of normalization Because to fully that than Other 4NF and in hence, may in M [S. “fourth [8] states be less rare do not 3NF practitioners. In theory, or courses that obscure; Mittra view and thus encountered in nature.” he states (3NF). Because the definition of 3NF does not treat certain instances of data adequately, an additional normal form called Boyce-Codd normal form (BCNF) was order may regarding [3] where form state as to be almost described academicians management and applications anomalies. that as unimportant forms BCNF only to third (4NF) in database (5NF) according to constraints in each normal form. At any when it has been normalized evidence these In of data in results the is NORMALIZATION The deffition follows: OF DATA To normalize from frost normal a set of data items, we may proceed form (lNF) to second normal form A relation or groups only of fields attribute atomic values. normal form A holds in R, then in R such R. To ->-> Y (FY,Z) ad (XJ,Z’). A relation schema do not = t2[X]. fourth holds for tuples form, R(X,Y,Z) the is required. if and only concept of X MVD X ->-> a subset of X or XUY Y is said to be. trivial ifY and is Table nine databases 1 with data violating 4NF Functional Total Orszanization Area Fields Z!I&S Financial Firm Sales Management System 307 24 Physician Database Data on Individual Physician 68 6 63 9 55 8 47 17 42 7 33 5 18 7 16 4 Publishing ColIege Firm Job Apartment Complex Division Scheduling Student of Nursing Skills Maintenance of Sponsored Grants Database Tracking Database * Record Research Animal Laboratory Athletic Tutoring Automotive Parts Inventory Program Assignment Retailer Inventory real world of Animals of Tutors 20 data, we if R is in fidly to the original deftition given by Fagin However, the theoretical differences do not affect normalization of real world databases. = R. The of normrdizing BCNF and does not contain any nontrivial multivalued dependencies. We note that this simpler statement does not conform whenever are tuples of R, then so are (Ly’,z) LY’,z’) fact dependency does is necessary. To may restate the deftition of 4NF as follows: A relation R is in fourth normal form o3 An MVD if deftition of 4NF. For the purpose and t a superkey a obtain 4NF, the relation R is decomposed into two or more relations which meets the constraints given by the X -> tl if, w~enever Y is not independent, i.e., a mukivalued not exist, redundancy of data values single dependericy two X is called (MVD) are is in 4NF X ->-> about X. It is possible for Y to be called a multivalued fact about X and yet the relationship between X and Y is not a mukivalued dependency. When a multivalued fact R is in Boyce-Codd exist normal deDendencv lNF MVD column name of R . If X ->-> Y, we also say that Y is a mukivalued for each record. by a functional there tl[X] define An permitted if whenever that multivalued are repeated values R* [51 as holds for R , then so does the functional. dependency X -> A for every we can proceed directly from lNF to BCNF. The use of BCNF rather than 3NF is preferred because normalization to BCNF is easily understood by students and can be quickly implemented, in addition, a relation in BCNF is dSO in 3NF. To normalize a record type to lNF, any repeating fields schema nontrivial (2NF) to third normal form (3NF) to Boyce-Codd normal form (BCNF) to fourth normal form (4NF). In practice, The of 4NF is then given by Fagin [4]. the RESEARCH The for METHODS databases this division Each organization fum, or a division database required with the exception for within and the The the a large study 500 firm, a major financial for course. team record carefully reviewed to obtain types the organizations large was made for the data to resolve and the relationships as a were Additional then This study has at least constituting study. over Twenty forty indicates received by the organization. with relations to the Because of this in data violating normalized incidence, the 4NF. The database of normalized record The each tabulation magnitude of databases were represented average 4NF. of over is not yields Both in the 40 data large required financial eight and for databases the normal possible form by representing Similarly, by its COI)E we add the field a Y or N value corresponding value of FIELD PENDING We then have two relations as shown the (ACCOUNT ID, HOME CODE, HOME-PENDING(ACCOUNT ID, FIELD REQUIREMENT, P-REQTS- Y-N) sales These was found contain two and require an Another fields. OF FOURTH NORMAL It is sometimes if the value of identified to contain relations no further FORM CODE to avoid the use of fourth PENDING the data differently. codes. As an 21 field do not violate has only fourth normal form normalization. representation the use of buckets. AVOIDANCE has not been For example, PENDING PENDING small database firm of information document CODE-Y-N) Pending-Reqts-YN found of The PENDI~U in the HOME Home-Code-YN PENDING is also fields ID, FIELD each number picture study. fields of a major other of data value to a particular REQUIREMENT. below data for The particular P-REQTS-Y-N contained fields organization a general databases. The which tabulated. each the item field has been received or not. six study. so rare. has been for this regarding of data included system represent in databases by the 307 data management to violate these types types a corresponding BCNF record a particular whether the of record of the number organization in this comprising number types 4NF databases normalization 4NF organizational violating databases types information organizational at from 1 provides found of the record These 4NF. of given. 350 resulted percent Table percent the data organizational PENDI~U 01 is present for HOME PENDING CODE for a particular value of ACCOUNT ID, we know that the document identified as 01 has not been received. ‘To represent this information without violating fourth normal form, we will require that all values for HOME PENDING CODE be given and an additional field called HOME-PENDING-CODE-Y-N be included. This latter field will contain either a Y or N value to indiciite in the data. that in nine twenty of databases determined once to form two ID, HOME (ACCOUNT as necessary RESULTS OF THE STUDY occurred CODE) REQUIREMENT) It is possible to represent the attributes of HOME PENDING CODE and FIELD PENDING REQUIREMENT so that the data does not violate 4NF. The value of these fields is given by a numeric value that contact any ambiguities between PENDING is decomposed (ACCOUNT CODE) Pending-Reqts were teams reports 4NF, thk relation Home-Codes a for and Design by the author in 4NF. HOME PENDING a for of student These To obtain remainder Analysis ID, FIELD relations: for databases The by CODE, firms, system to present 3NF. and revised record the data studied in private database in the Systems Codes (ACCOUNT organizations. database 4NF. was required types were three the functional management violated were classes Each of These that organizations sales to was studied were entire a example of this approach, let us consider the database of the Financial Organization. This database in BCNF contains the following relation violating fourth normal form: organization which manufacturing the firm. limited government the and firm firm, university. of the twenty were the data undergraduate set university, included Fortune was area studied a small organizations two firm, examined was of one business remaining were of a major organization functional Eighteen reprographics the each for entirety. units organizations project. data its forty of a large The in from for In actuality, 19 possible REQUIREMENT this data is possible the HOME codes field while has only by PENDING the FIELD five possible Table Violations The 2 of Fourth Normal Form In this research study, nine databases were found to contain data violating fourth normal form. organizations is shown below in relations normalized to 4NF. 1. Financial Organization The relevant data for these nine - Sales Management System Home-ties (ACCOUNT ID, HOME PENDING CODE) Pending-Reqts (ACCOUNT ID, FIELD PENDING REQ UIREMEMP) 2. Physician Identification System Is-Member (PHYSICIAN Works-At (PHYSICIAN 3. Publishing Firm - Scheduling Associated-Job Colors 4. (JOB (JOB College of Nursing (COURSE, Family Housing (APT Division of Sponsored (GRANT 7. Animal Research CODE, Tutoring DATE) Database NUMBER NUMBER INVESTIGATOR ANIMAL NUMBER) ID) Assignment Service did not wish to classify individual a student Tutor-Subjects (TUT OR ID, SUBJECI_ Tutor-Courses (TUT OR ID, COURSE Automotive MAINTENANCE SPRAYED) KEYWORD) (ACCOUNT areas for which AREA of Animals System - Tutor subject CODE) CODE, ID, KEYWORD) Athletic The Athletic CODE) DISCIPLINE) CODE, (ACCOUNT Tutoring SKILL System APT - Grants Account-Animals Note: 9. DATE, - Inventory Account-Investigators 8. ADJOINfNG (UNfVERSITY Laboratory Tracking MAINTENANCE SPRAY (GRANT Faculty-Interests PREREQUIWI’E - Maintenance CODE, Grant-Keyword Database CODE) CODE, (APT Grant-Area Skills CODE, Spraying JOB NUMBER) REQUIRED) CODE, SKILL (AIT Maintenance 6. - Student Office Adjoining-Apt ASSOCIATED COLOR (SKILL Course System NUMBER NUMBERj Has-Prerequisite 5. ID, GROUP ID) ID) ID, LOCATION was deemed qualified courses by subject to sewe as a tutor. AREA) lD) Parts Retailer - Inventory Vehicle-Engine-Part (W3 HICLE, MODEL. YEAR ENGINE TYPE, ENGINE ENGINE-PART-NUMBER) Vehicle-Feature (VEHICLE, MODEL, Note: FEATURE consists The field YEAR FEATURE, ~ of several subfields. 22 FEATURE, area(s) but did wish to track both courses and Consequently, we may choose to store information in one relation as shown below thk 5. same “Multivalued Dependencies and a Fagin, Ronald. New Normal Form for Relational Databases+ ACM All-Codes (ACCOUNT ID, HOME-PENDING6. CODE-1, .... HOME-PENDING-CODE-19, FIELD-PENDING-REQUIREMENT-1, .... FIELD-PENDING-REQUIREMENT-5) 7. Although the relation All-Codes is technically in 4NF, the use of the bucket violates the principles of the relational model. The organization may choose the bucket representation because these data fields have been in use for several years and it is unlikely that change 8. would 9. occur. The bucket representation Desire of data databases showed that believed. Data violating 5NF did not occur in any of the databases which may indicate that 5NF is only an academic issue, Because data violating 4NF was found to occur in over twenty percent of the organizations in the study, it is advisable that practitioners understand 4NF. It is also imperative emphasized in the teaching and apply the constraints for that the rules for 4NF be of normalization theory. ACKNOWLEDGEMENT The author wishes to thank Jeff Hoffer for su~esting this research project. REFERENCES 1. Courtney, James F. and David B. Paradice. Database Svstems for Management. St. Louis: Times Mh-ror/Mosby College Publishing Co., 1988. 2. Date, C. J. An Introduction to Data Base m. 5th ed. Reading, Publishing Co., 1990. 3. Edwards, Joe B. Compromise} 4. 53-58. Elmasri, Ramez “High MA: Addison-Wesley Performance Datamation, ~, and Shamkant Without 13, (July 1, 1990), B. Navathe. Fundamentals of Database Svstems. Redwood City, CA: Benjamin/Cummings Publishing Co. 1989. 23 R. Inc., 1991. and Mana~ement: New York CONCLUSION The study of forty real world Svstems, ~, 3, Pratt, Philip J. Microcomputer Data Base Mana~ement Usimz dBASE III Plus. Boston, Boyd & Fraser Publishing Co., 1988. 10. Stamper, David and Wilson Price. Database potential problems due to its inflexibility. We note that the proper choice for data representation is dependent on the individual installation’s preference for data representation, the requirements for flexibility, the need 4NF occurs more often than we previously on Database Kent, W. “A Simple Guide to Five Normal Forms in Relational Data base Theory, Communications of the ACM, 26, 2, (Feb. 1983), 120-125. Mittra, Sitansu S. PrincirJes of Relational Database Svstems. Englewood Cliffs, N. Y.: Prentice-Hall, presents to minimize storage space, and the trade-off structure for ease of programming. Transactions (Sept. 1977), 262-278. Fife, Dennis, W. Terry Hardgrave, and Donald Deuth. Database Concepts. Cincinnati Southwestern Publishing Co., 1986. McGraw-Hill, An A~plied Inc., 1990. MA Ap~roach.
© Copyright 2026 Paperzz