CSCC40 Analysis and Design of Information Systems
University of Toronto at Scarborough
ERDs and normalization

Please note…
These rules apply to structured analysis and design of relational databases. They are not exactly the same as the class associations in the class diagrams we will cover in later lectures.

Relational databases are a mature and well-supported technology. They were designed to:
- eliminate data redundancy
- provide the simplest possible representation of data
- allow complex queries

They are used for systems that were designed as structured or object-oriented.

The deliverables for structured logical modeling are:
- every data element is accounted for, whether it’s raw or derived
- relationships within the data are normalized
- data requirements are structured into relationships
- the physical relational database can be designed

A relational database model is a representation of data in tables or relations. Relations are named two-dimensional tables of data with named columns (attributes) and an arbitrary number of rows (records).

An entity is anything about which you need to carry information. Relationships between entities follow certain rules.
These are valid relationships between entities A and B:

- A B: for each occurrence of entity A, there is always exactly one occurrence of entity B, and vice versa
- A B: for each occurrence of entity A, there is sometimes one (but never more than one) occurrence of entity B, and vice versa
- A B: for each occurrence of entity A, there is always exactly one occurrence of entity B, but for each occurrence of entity B there is sometimes one (but never more than one) occurrence of entity A
- A B: for each occurrence of entity A, there is always at least one occurrence of entity B, but for each occurrence of entity B there is sometimes one (but never more than one) occurrence of entity A
- A B: for each occurrence of entity A, there can be zero, one or many occurrences of entity B, and for each occurrence of entity B, there is sometimes one (but never more than one) occurrence of entity A
- A B: for each occurrence of entity A there is at least one occurrence of entity B, and for each occurrence of entity B there is always exactly one occurrence of entity A
- A B: for each occurrence of entity A there can be zero, one or many occurrences of entity B, and for each occurrence of entity B there is always exactly one occurrence of entity A

A many-to-many relationship must be corrected by creating an associative entity. The associative entity’s primary key is a composite of the primary keys for the entities it associates. For example, students may take several courses and a course will usually have several students enrolled in it. The new associative entity will have a primary key of student id + course id. Note that the new entity is the perfect place to store a student’s grade for that course.

[Diagram: the problem (three variants of a many-to-many relationship between A and B) and the solution (an associative entity AB placed between A and B)]

Repeating information within an entity has to be removed. For example, an order may be for many items.
We solve this by creating an attributive entity where each occurrence of the entity carries one instance of the repeating data. The primary key for this attributive entity is also a composite key, consisting of the primary key of the original entity plus one other attribute that makes the composite key of the attributive entity unique. In the order example, we create a detail entity whose primary key is the order number (the primary key for the order) plus the product code. Note that now we carry the customer id in the order entity and the quantity ordered in the attributive entity for each item ordered.

[Diagram: entity A and its attributive entity AC]

Normalization means cleaning up the entities according to some clear rules. If you consider the data within an entity, it can be presented as a table with the following characteristics:

1. Each entry (row-column intersection) has only one value
2. All entries in a column are instances of the same attribute
3. Each row is unique
4. Sequence of columns is not important

We might get a table such as the one below after looking at some orders for a company that ships nuts, bolts, etc. This table satisfies the above criteria.

order number  customer      ship-to address  carrier  ship date  item ordered  quantity ordered  packaging
13405         Epsilon Ltd.  Sarnia           Fedex    June 15    nails         300               box
13405         Epsilon Ltd.  Sarnia           Fedex    June 15    bolts         2,300             bin
13405         Epsilon Ltd.  Sarnia           Fedex    June 15    screws        450               box
13405         Epsilon Ltd.  Sarnia           Fedex    June 15    nuts          2,300             bin
13406         Tau Corp.     Windsor          Fedex    June 16    bolts         4,000             bin
13406         Tau Corp.     Windsor          Fedex    June 16    nuts          4,000             bin
13407         Delta Inc.    Detroit          UPS      June 15    nails         369               box
13407         Delta Inc.    Detroit          UPS      June 15    screws        566               bin
13408         Epsilon Ltd.  Sarnia           Fedex    June 16    bolts         490               box
13409         Alpha Corp.   Chicago          UPS      June 16    bolts         650               bin

To put this table into first normal form (1NF) we must remove repeating data. We do this by considering an order and seeing what is being repeated.

order number  customer      ship-to address  carrier  ship date  item ordered
13405         Epsilon Ltd.  Sarnia           Fedex    June 15    nails
                                                                 bolts
                                                                 screws
                                                                 nuts
13406         Tau Corp.     Windsor          Fedex    June 16    bolts
                                                                 nuts
13407         Delta Inc.    Detroit          UPS      June 15    nails
                                                                 screws
13408         Epsilon Ltd.  Sarnia           Fedex    June 16    bolts
13409         Alpha Corp.   Chicago          UPS      June 16    bolts

We remove the repeating data and put it into an attributive entity. So now we get the following two tables.

The original table:

order number  customer      ship-to address  carrier  ship date
13405         Epsilon Ltd.  Sarnia           Fedex    June 15
13406         Tau Corp.     Windsor          Fedex    June 16
13407         Delta Inc.    Detroit          UPS      June 15
13408         Epsilon Ltd.  Sarnia           Fedex    June 16
13409         Alpha Corp.   Chicago          UPS      June 16

And a new attributive table for the repeating data. Note the composite key (order number + item ordered).

order number  item ordered  quantity ordered  packaging
13405         nails         300               box
13405         bolts         2,300             bin
13405         screws        450               box
13405         nuts          2,300             bin
13406         bolts         4,000             bin
13406         nuts          4,000             bin
13407         nails         369               box
13407         screws        566               bin
13408         bolts         490               box
13409         bolts         650               bin

To achieve second normal form (2NF) we remove non-key attributes that do not depend on the entire key. By talking to the client you found out that they always used bins if the quantities were large and boxes if they were not. You were able to define that business rule using this new table:

item    box max.
nails   600
bolts   800
screws  500
nuts    800

and remove the packaging attribute from the order/item table:

order number  item ordered  quantity ordered
13405         nails         300
13405         bolts         2,300
13405         screws        450
13405         nuts          2,300
13406         bolts         4,000
13406         nuts          4,000
13407         nails         369
13407         screws        566
13408         bolts         490
13409         bolts         650

Now if your client decided that you could carry more than 600 nails per box, orders would not need to be changed.

To achieve third normal form (3NF) we remove non-key attributes that depend on other non-key attributes. After checking with the client, you found out that they usually shipped by Fedex if the customer was in a Canadian city and by UPS if the customer was in a city in the USA. This means the carrier (shipper) was dependent on location, not on the order. So you create a new table defining this business rule.

ship-to address  carrier
Sarnia           Fedex
Windsor          Fedex
Detroit          UPS
Chicago          UPS

You also found out that customers only have one location for receiving products. This means that the customer’s ship-to address is dependent on the customer and not the order. So you need another table to hold this information.

customer      ship-to address
Epsilon Ltd.  Sarnia
Tau Corp.     Windsor
Delta Inc.    Detroit
Alpha Corp.   Chicago

And finally, your order table looks like this.

order number  customer      ship date
13405         Epsilon Ltd.  June 15
13406         Tau Corp.     June 16
13407         Delta Inc.    June 15
13408         Epsilon Ltd.  June 16
13409         Alpha Corp.   June 16

Now, for the five tables we ended up with, all non-key attributes depend on the whole primary key and nothing but the whole primary key.
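The five tables can be sketched as a small relational schema. The following is a minimal illustration using Python's built-in sqlite3 module, not a definitive implementation. The table and column names (orders, order_detail, box_max and so on) are assumptions chosen for this sketch, only orders 13405 and 13407 are loaded, and the comparison `quantity_ordered <= box_max` is an assumed reading of the "bins if the quantities were large" rule. The join shows how the wide table we started with can be rebuilt from the normalized tables, with packaging derived rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical names for the five 3NF tables described in the text.
conn.executescript("""
CREATE TABLE carrier (
    ship_to_address TEXT PRIMARY KEY,
    carrier_name    TEXT NOT NULL
);
CREATE TABLE customer (
    customer_name   TEXT PRIMARY KEY,
    ship_to_address TEXT NOT NULL REFERENCES carrier(ship_to_address)
);
CREATE TABLE item (
    item_id TEXT PRIMARY KEY,
    box_max INTEGER NOT NULL
);
CREATE TABLE orders (
    order_number  INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL REFERENCES customer(customer_name),
    ship_date     TEXT NOT NULL
);
CREATE TABLE order_detail (
    order_number     INTEGER NOT NULL REFERENCES orders(order_number),
    item_id          TEXT    NOT NULL REFERENCES item(item_id),
    quantity_ordered INTEGER NOT NULL,
    PRIMARY KEY (order_number, item_id)  -- composite key
);
""")

# A subset of the handout's data: orders 13405 and 13407 only.
conn.executemany("INSERT INTO carrier VALUES (?, ?)",
                 [("Sarnia", "Fedex"), ("Detroit", "UPS")])
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [("Epsilon Ltd.", "Sarnia"), ("Delta Inc.", "Detroit")])
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [("nails", 600), ("bolts", 800), ("screws", 500), ("nuts", 800)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(13405, "Epsilon Ltd.", "June 15"), (13407, "Delta Inc.", "June 15")])
conn.executemany("INSERT INTO order_detail VALUES (?, ?, ?)",
                 [(13405, "nails", 300), (13405, "bolts", 2300),
                  (13405, "screws", 450), (13405, "nuts", 2300),
                  (13407, "nails", 369), (13407, "screws", 566)])

# Rebuild the wide table by following primary and foreign keys.
# Packaging is derived from the item's box maximum (assumed rule).
rows = conn.execute("""
    SELECT o.order_number, o.customer_name, c.ship_to_address,
           k.carrier_name, o.ship_date, d.item_id, d.quantity_ordered,
           CASE WHEN d.quantity_ordered <= i.box_max THEN 'box' ELSE 'bin' END
    FROM orders o
    JOIN customer c     ON c.customer_name   = o.customer_name
    JOIN carrier k      ON k.ship_to_address = c.ship_to_address
    JOIN order_detail d ON d.order_number    = o.order_number
    JOIN item i         ON i.item_id         = d.item_id
    ORDER BY o.order_number, d.rowid
""").fetchall()

for row in rows:
    print(row)
```

Because the carrier lives in one row per location and the box maximum in one row per item, changing either is a single-row update and no order rows need to be touched.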
We have also avoided some common problems in database design:
- insertion: you don’t have to supply information about customers and carriers at the same time
- deletion: if you delete certain information you are not losing unrelated information
- modification: if you change a carrier, you don’t have to change all outstanding orders using the former carrier

Now we can draw the entity relationship diagram for our example. (The box maximum is dependent on the key for the item entity.)

[Entity relationship diagram: customer, order, carrier, order detail, item]

entity        primary key                         foreign key                         non-key attributes
customer      customer name                       ship-to address
order         order number                        customer name                       ship date
order detail  order number + item identification  order number + item identification  quantity ordered
item          item identification                                                     box maximum
carrier       ship-to address                                                         carrier name

A foreign key is an attribute that is a primary key for another table. That’s how you get all the information for each order in the original table we started with. Note that the order/detail table is an attributive entity that holds the repeating data for orders, but it is also an associative entity because it solves the many-to-many problem “an item can appear on many orders and an order can be for many items”.

So how do you complete the modeling for an entire system? Let’s combine the above model with the following information you need for billing.

item    price
nails   $0.15
bolts   $0.20
screws  $0.17
nuts    $0.10

customer      bill-to address
Epsilon Ltd.  Toronto
Tau Corp.     Hamilton
Delta Inc.    Chicago
Alpha Corp.   Chicago

This is a rather trivial example because we already have entities with the same primary keys. We simply add the new attributes to existing tables.

item    box max.  price
nails   600       $0.15
bolts   800       $0.20
screws  500       $0.17
nuts    800       $0.10

customer      ship-to address  bill-to address
Epsilon Ltd.  Sarnia           Toronto
Tau Corp.     Windsor          Hamilton
Delta Inc.    Detroit          Chicago
Alpha Corp.   Chicago          Chicago

But if we were presented with entirely new tables, we would have to make sure that the resulting information is in 3NF and that we could navigate using primary and foreign keys.

There are still a couple of interesting variations to cover.

Unary (self-referential) relationships
For example, in an organization where one employee reports to another, you might find the following employee table. A foreign key attribute holds the employee id of a person’s boss. Note that you could figure out the organization chart from this information. (Presumably the President’s foreign key is blank.)

entity    primary key  foreign key                    non-key attribute
employee  employee id  employee id (of the superior)  address, birthday, etc.

Is-a relationship (subclasses)
For example, we might be carrying different information about full-time and part-time employees. We solve this by creating more tables. It is not a problem that two tables have the same key. This is simply an instance of a primary key also functioning as a foreign key.

entity              primary key  non-key attribute
employee            employee id  address, birthday, etc.
part-time employee  employee id  hourly rate
full-time employee  employee id  salary
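Both variations above can be sketched in a few lines. This is an illustration under stated assumptions, again using Python's built-in sqlite3 module: the column names (boss_id, hourly_rate, salary), the three employees, and their addresses and pay figures are all hypothetical and not taken from the handout. The recursive query walks the self-referential foreign key to recover one branch of the organization chart, and the subclass tables reuse employee id as both primary key and foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    boss_id     INTEGER REFERENCES employee(employee_id),  -- blank (NULL) for the President
    address     TEXT
);
-- Subclass tables: the primary key is also a foreign key to employee.
CREATE TABLE part_time_employee (
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    hourly_rate REAL NOT NULL
);
CREATE TABLE full_time_employee (
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    salary      REAL NOT NULL
);
""")

# Hypothetical data: employee 1 is the President, 2 reports to 1, 3 reports to 2.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, None, "Toronto"), (2, 1, "Hamilton"), (3, 2, "Sarnia")])
conn.executemany("INSERT INTO full_time_employee VALUES (?, ?)",
                 [(1, 90000.0), (2, 60000.0)])
conn.execute("INSERT INTO part_time_employee VALUES (?, ?)", (3, 18.5))

# Walk up the reporting chain from one employee to the President.
chain_sql = """
    WITH RECURSIVE chain(employee_id, boss_id, depth) AS (
        SELECT employee_id, boss_id, 0 FROM employee WHERE employee_id = ?
        UNION ALL
        SELECT e.employee_id, e.boss_id, chain.depth + 1
        FROM employee e JOIN chain ON e.employee_id = chain.boss_id
    )
    SELECT employee_id FROM chain ORDER BY depth
"""
chain = [row[0] for row in conn.execute(chain_sql, (3,))]
print(chain)  # employee 3, then the boss, then the boss's boss

# Subclass attributes are reached by joining on the shared key.
part_timers = conn.execute("""
    SELECT e.employee_id, p.hourly_rate
    FROM employee e
    JOIN part_time_employee p ON p.employee_id = e.employee_id
""").fetchall()
print(part_timers)
```

The recursion stops at the President because a NULL boss id matches no employee row.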