Normalization and Entity-Relationship (ER) Modeling

Benefits of Normalization and ER Modeling
• Easily maps into a relational model
• Directly maps to relational database tables
• Can be used as a “blueprint” for a database design
• Simple to understand, short learning curve
• Useful to communicate with customers

Basic Concepts of ER Modeling
• Introduced in 1976 by MIT’s Professor Peter Chen
• Separates entities (objects, people, classes) from relationships (associations)
• Localizes data as attributes
• Entities play roles in relationships

Entities are nouns
– People
– Jobs, Accounts
– Objects (Chairs, Cars, Books)

Relationships can be expressed as verbs
– Owns, Rents, Borrows, Checks out, Buys
– has-a

Attributes are data associated with entities and relationships. Typically adjectives.
– Color, Height, Weight
– Name, Title, ID
– Date, Timespan

Roles are entities’ parts or jobs in relationships. Typically nouns.
– Further classify entities
– Generally unnecessary unless an entity may play multiple roles

Example: Husband, Wife
[ER diagram: a one-to-one (1:1) “married to” relationship with the roles husband and wife]

Example: Library checkout
[ER diagram: Customer (SSN, Name) and Book (ID) joined by the many-to-many (M:N) relationship Checks Out (Date, Due Date)]

Example: Student/Class
[ER diagram: Student (SSN, Student ID, Name) and Class (CourseCode, Section, Instructor, Capacity) joined by the many-to-many (M:N) relationship “takes a” (Date Enrolled)]

General form for ER modeling
[ER diagram: Entity/Role (attributes) joined M:N through Relationship (attributes) to Entity/Role (attributes)]

Normalization
• A technique used in designing relational databases
• Removes redundancy
• Minimizes dependencies
• Localizes data
• Simplifies modifications
• 5 normal forms defined
– First 3 are most important

First Normal Form
• Remove “repeating groups” (duplicate columns/variable length array)
• Example
– Order (ordernum, customer, (itemName, itemCost, itemQuantity))
– Parentheses (or an overbar) indicate a repeating group
– Create item table: Item (name, cost, …)
– Create order-item relationship table: OrderItem (ordernum, itemname, quantity)
– Typically uses composite keys

Before normalization
Order (ordernum, customer, item1name, item1cost, item1quantity, item2name, item2cost, item2quantity, item3name, item3cost, item3quantity)

After normalization
[ER diagram: Order (ordernum, customer) and Item (name, cost) joined by the many-to-many (M:N) relationship “contains” (quantity)]

Second Normal Form
• Remove attributes that do not depend on the whole key
– All attributes in a table must be dependent on the entire primary key
– Cannot be dependent on only part of the primary key
– Reduces redundancy and dependencies
• Applies only to tables with composite keys
• Example
– Part (part_number, supplier_name, price, supplier_address)
– Supplier address is dependent only on supplier
– Create supplier table: Supplier (supplier_name, supplier_address)

Before normalization
Part (part_number, supplier_name, price, supplier_address)

After normalization
Part (part_number, supplier_name, price)
Supplier (supplier_name, supplier_address)

Third Normal Form
• Non-key attributes must be independent of each other. No transitive dependencies
– i.e. non-key attributes must depend only on the primary key
– Reduces redundancy and dependencies
– Can be viewed as an extension to second normal form (i.e. 2NF = no partial dependencies; 3NF = no non-key dependencies)
• Example
– CD (title, artist, publisher, publisher_address)
– Publisher address depends on publisher
– Move publisher and publisher address to a separate table
• Example
– Order (ordernum, customer, unit_price, quantity, total)
– Total can be calculated from unit_price and quantity. It is therefore not independent.
– Total can simply be removed entirely

Anomalies
A goal of normalization is to remove (or at least reduce) anomalies.
Types of anomalies:
• Update
• Inconsistent Data
• Additions
• Deletions

Example from the Pratt paper using this relation: Order(Order#, Date, Part#, Description, Quantity)
o The relation is only in 1NF (not 2NF), since Description depends only on Part# and not on Order#.
o There will be redundancy, since the description is repeated in every record with the same Part#. The Update anomaly is that a single change to a part description requires changing multiple Description entries.
o Because of the duplication there could easily be Inconsistent Data, where the descriptions for the same part are different.
o A new part can’t be added until it has been ordered, since the Order# attribute is part of the PK and can’t be null. This is an Addition anomaly.
o If a record is deleted that is the only entry for a particular Part#, the Description of that part is lost. This is a Deletion anomaly.

Review of Normal Forms
• Unnormalized – Has repeating group
• First Normal Form (1NF) – Has no repeating group/duplicate columns
• Second Normal Form (2NF) – Has all non-key attributes dependent on “the whole key”
• Third Normal Form (3NF) – Has all non-key attributes dependent on “nothing but the key”

Unnormalized -> Remove Repeating Groups -> First Normal Form -> Remove Partial Dependencies -> Second Normal Form -> Remove Transitive Dependencies -> Third Normal Form
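
The anomalies in the Order(Order#, Date, Part#, Description, Quantity) example can be tried out with a small sketch using Python's built-in sqlite3 module. Everything concrete here is an assumption made for illustration and does not come from the Pratt paper: the table and column names (order_1nf, orders, part, order_line), the sample rows, and the premise that Date depends only on Order#. The sketch first reproduces the update and addition anomalies on the 1NF table, then splits it so that each non-key attribute depends on the whole key and nothing but the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1NF relation from the example; the composite key is (order_num, part_num).
cur.execute("""
    CREATE TABLE order_1nf (
        order_num   INTEGER NOT NULL,
        order_date  TEXT    NOT NULL,
        part_num    TEXT    NOT NULL,
        description TEXT    NOT NULL,
        quantity    INTEGER NOT NULL,
        PRIMARY KEY (order_num, part_num)
    )
""")
sample_rows = [  # hypothetical data
    (1, "2024-01-05", "P1", "Widget", 10),
    (2, "2024-01-06", "P1", "Widget", 4),
    (2, "2024-01-06", "P2", "Gadget", 1),
]
cur.executemany("INSERT INTO order_1nf VALUES (?, ?, ?, ?, ?)", sample_rows)

# Update anomaly: one description change touches every row that mentions P1.
cur.execute("UPDATE order_1nf SET description = 'Large widget' WHERE part_num = 'P1'")
rows_touched_1nf = cur.rowcount

# Addition anomaly: a part that has never been ordered has no order_num,
# and a key column cannot be null, so the insert is rejected.
try:
    cur.execute("INSERT INTO order_1nf (part_num, description) VALUES ('P3', 'Gizmo')")
    addition_blocked = False
except sqlite3.IntegrityError:
    addition_blocked = True

# Decomposition: order facts, part facts, and the order-part relationship.
# (SQLite checks REFERENCES only when PRAGMA foreign_keys = ON; here they
# simply document the links.)
cur.executescript("""
    CREATE TABLE orders (order_num INTEGER PRIMARY KEY, order_date TEXT NOT NULL);
    CREATE TABLE part   (part_num TEXT PRIMARY KEY, description TEXT NOT NULL);
    CREATE TABLE order_line (
        order_num INTEGER NOT NULL REFERENCES orders(order_num),
        part_num  TEXT    NOT NULL REFERENCES part(part_num),
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (order_num, part_num)
    );
    INSERT INTO orders     SELECT DISTINCT order_num, order_date FROM order_1nf;
    INSERT INTO part       SELECT DISTINCT part_num, description FROM order_1nf;
    INSERT INTO order_line SELECT order_num, part_num, quantity FROM order_1nf;
""")

# The same description change now touches exactly one row.
cur.execute("UPDATE part SET description = 'Widget XL' WHERE part_num = 'P1'")
rows_touched_3nf = cur.rowcount

# A part can be added before anyone orders it.
cur.execute("INSERT INTO part VALUES ('P3', 'Gizmo')")
part_count = cur.execute("SELECT COUNT(*) FROM part").fetchone()[0]

# Deleting the only order line for P2 no longer loses its description.
cur.execute("DELETE FROM order_line WHERE part_num = 'P2'")
p2_description = cur.execute(
    "SELECT description FROM part WHERE part_num = 'P2'"
).fetchone()

print(rows_touched_1nf, addition_blocked, rows_touched_3nf, part_count, p2_description)
```

The table is named orders rather than order because ORDER is a reserved word in SQL. With the descriptions stored once in part, inconsistent data for the same Part# also becomes impossible, since there is only one place to record each description.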