SOLUTIONS - CIS209 - INTERNAL - 2002

PROBLEM 1 [25]

Question 1 [3]
Define the relational model. What is a relational database management system (DBMS)?

Answer
The relational model is a data model (i.e., a model for representing data). Its data structures are the domain and the relation. Its operators include the set operators (union, intersection, difference and Cartesian product) and the relation-specific operators (restriction, projection, join and division). [2]
A relational DBMS is a DBMS that implements the elements of the relational model (i.e., the relational data structures and the relational operators). [1]
TOTAL [3]

Question 2 [3]
Define the notion of “foreign key”. Give an example.

Answer
Consider two relations R1 and R2. A set of attributes S of R2 is a foreign key referencing R1 if and only if S is a candidate key in R1. [2]
For example:
    Customer (Customer-id, Name, Address, Job-details)
    Account (Account-number, Type, Overdraft-Limit, Balance, Customer-id)
Customer-id is a FK in Account referencing Customer. [1]
TOTAL [3]

Question 3 [4]
Explain the two types of program–data independence on the basis of the three level ANSI/SPARC architecture of a database system.

Answer
Introduction/description of the three ANSI/SPARC levels, i.e., internal (physical), conceptual and external. A diagram is sufficient (see Study Guide page 11), but other correct ways of defining/introducing these levels should be accepted. [2]
Physical program-data independence is the immunity of application programs to changes at the internal (or physical) level (assuming that the conceptual level does not change). [1]
Logical program-data independence is the immunity of application programs to changes at the conceptual level (assuming that the external level remains unchanged). [1]
TOTAL [4]

CIS209 – IS52003A 2002 Internal Solutions 1

Question 4 [5]
What is a system catalogue?
Give an example of two types of data/information it usually includes and explain what each type is (or can be) used for (one or two sentences per type of data/information).

Answer
A system catalogue is a component of a database system which contains information about the database. [1]
Types of data/information contained in a catalogue include:
1. schemas:
   1.1. conceptual schema (or description of base tables), used, for example, for checking the correctness of SQL queries;
   1.2. external schemas (or description of views), used for evaluating queries at the external level;
2. integrity rules, used for enforcing the integrity of the database between transactions;
3. security rules, used for enforcing the security of the database;
4. statistical information about the data (extension) of the database, used by the optimiser;
5. transaction log, used in data recovery.
Award 2 marks per correct example of a type of data/information contained in a catalogue, up to four marks. [4]
TOTAL [5]

Question 5 [5]
Data can be stored in files and application programs can share this data by having direct access to the respective files (refer to Diagram 1, below). However, data-centred applications normally employ a database management system (refer to Diagram 2, below).

Diagram 1: Program/Application 1 ... Program/Application n each access the data as files (disk) directly.
Diagram 2: Program/Application 1 ... Program/Application n access the data as files (disk) through a DBMS.

State why the latter approach (Diagram 2) is preferred for data-centred applications (refer to at least two features of a DBMS).

Answer
(a) Approach 2 is preferred because a DBMS provides many features for data access which would otherwise (approach 1) have to be implemented in each application, such as:
    - data definition and manipulation;
    - support for the integrity of the data;
    - support for the security of the data;
    - catalogue (description of the data in the database).
(b) Also, a DBMS provides features which would otherwise not be supported in approach 1:
    - concurrency control;
    - support for data recovery.
Award full marks if the answer states either (a) or (b) and makes reference to two DBMS features. [5]
TOTAL [5]

Question 6 [5]
Explain what is meant by impedance mismatch, in the context of relational database systems.

Answer
In applications based on relational databases, data has to be translated between the way it is represented in the database (the database’s data types) and the way it is represented in the application programs (the data types of the programming language). Usually, the data types used by a relational database do not coincide with the data types used by a programming language. This is called impedance mismatch, and it may cause the corruption of data. For example, an application A1 may be implemented in a strongly typed language and an application A2 in a less strongly typed language. The strong enforcement of types by A1 is lost if A1 and A2 share data via the database.
The example is not necessary if the other points are clearly made. [5]
TOTAL [5]

PROBLEM 2 [25]

Question 1 [3]
Can a set of data requirements be correctly modelled by two or more different ER diagrams? Explain your answer. You may use a small example, if you think it will help your explanation.

Answer
Yes. An ER model captures only certain aspects of a real life system. Thus, two different models of a real life system can both be correct, because each captures a different set of characteristics of the system.
It is also possible that the same characteristic of a system is correctly modelled via different elements of the ER model (e.g., the registration of a student for a course, with the attributes date and exam result, can be modelled either as a relationship with attributes or as an entity; see Question 3 below). Furthermore, models can describe a system at different levels of detail.
Award full marks if at least one of the above points is made or if the student comes up with another convincing explanation. [3]
TOTAL [3]

Question 2 [12]
Draw an ER diagram for the following description. Illustrate only the entity types (disregard the attributes), the relationships between them and the multiplicity of each relationship.

A company specialises in IT training. At present, the company has 20 instructors, provides 30 courses and can handle a maximum of 600 trainees. However, these numbers may increase in the future. Each trainee registers for a minimum of 1 and a maximum of 3 courses. The number of trainees that can register for a course is not limited. Each course is assigned to a maximum of 5 instructors. A course may be assigned to no instructors, if there are no trainees registered for it. An instructor may be assigned to a maximum of 10 courses. Each course is organised in 10 sessions. Each session is taught by one instructor only. An instructor may be in charge of any number of sessions (obviously, an implicit constraint exists, namely that an instructor cannot be in charge of more than 100 sessions, but you may disregard this constraint).

Answer
ER diagram with the entity types Instructor, Course, Trainee and Session and the following relationships:
    - AssignedTo (Course, Instructor): each course is assigned to 0..5 instructors; each instructor is assigned to 0..10 courses;
    - Registers (Trainee, Course): each trainee registers for 1..3 courses; each course has 0..* trainees;
    - Has (Course, Session): each course has 10 sessions; each session belongs to 1 course;
    - InChargeOf (Instructor, Session): each session has 1 instructor; each instructor is in charge of 0..* sessions.
Award
    4 marks for correct identification of entities (1 per entity) (the names of the entities may be different, provided they “preserve” the same meaning as above);
    4 marks for correct identification of relationships (1 per relationship) (the names and direction of the relationships may be different, provided they “preserve” the same meaning as above);
    4 marks for correct identification of the multiplicity of relationships (1 per relationship) (if the multiplicities 10 and 5 are represented as ‘*’, the answer is still considered correct).
TOTAL [12]

Question 3 [5]
Draw an ER diagram for the following description.
The students of a university register for different modules. One student may register for one or more modules (but not exceeding 24). One module, normally, has many students registered for it. If students fail a module they have to register again (they have to retake it). Therefore, the information relevant to registration is: date of registration and result.

Answer
Solution 1
    Entity types Student and Module, linked by the relationship Registers, which carries the attributes Date and Result. Multiplicity: each student registers for 1..* modules; each module has 0..* students.
Solution 2
    Entity types Student, Registration (with attributes Date and Result) and Module.
    - StudentRegistration (Student, Registration): each student has 1..* registrations; each registration belongs to 1 student.
    - ModuleRegistration (Module, Registration): each module has 0..* registrations; each registration refers to 1 module.
Both solutions are correct. Either should be awarded full marks.
Award 1 mark for the identification of the entities ‘Student’ and ‘Module’. Award 4 marks for the rest of the model (in either solution).
It would be a mistake to use the multiplicity ‘1..24’ instead of ‘1..*’, because a student may register more than once for a module (in case s/he fails). If this error occurs, take away one mark.
TOTAL [5]

Question 4 [5]
Consider the following ER diagram. Translate it into a relational model and specify the primary keys, foreign keys and foreign key rules for each of the resulting relations.

ER diagram:
    Residence (superclass): address {PK}, noOfBedrooms, noOfBathrooms, noOfKitchens, livingArea, price
    Specialisation {Mandatory, Or} into:
        Flat: floor, hasLift, hasAccessToGym, hasAccessToSauna
        House: type, areaOfGarden, leaseForGround, isListed

Answer
    Residence (address, noOfBedrooms, noOfBathrooms, noOfKitchens, livingArea, price)
        PK : address
    Flat (address, floor, hasLift, hasAccessToGym, hasAccessToSauna)
        PK : address
        FK : address REFERENCES Residence ON DELETE CASCADE, ON UPDATE CASCADE
    House (address, type, areaOfGarden, leaseForGround, isListed)
        PK : address
        FK : address REFERENCES Residence ON DELETE CASCADE, ON UPDATE CASCADE
Award:
    2 marks for the introduction of ‘address’ in the relations ‘Flat’ and ‘House’;
    1 mark for the specification of ‘address’ as PK in both ‘Flat’ and ‘House’;
    1 mark for the specification of ‘address’ as FK referencing ‘Residence’ in both ‘Flat’ and ‘House’;
    1 mark for FK rules.
TOTAL [5]

PROBLEM 3 [25]

Question 1 [3]
Explain how the process of normalisation can complement the process of ER modelling in database design.

Answer
ER modelling is a top-down design technique. It is recommended as the first step in database design. The relations of a relational model that results from an ER model are not guaranteed to be free of update anomalies. Therefore, as a second stage in database design, each relation can be subjected to a normalisation process. A good ER model may require little or no normalisation.
TOTAL [3]

Question 2 [4]
Consider the following relation:
    (Student-Name, Username, Email, Course, Exam-Date, Attempt, Result)
(a) Give examples of three possible non-trivial functional dependencies (FDs) and concisely explain why you consider them to be FDs. At least one FD should have a composite determinant. [3]
(b) Choose a primary key for this relation.
[1]

Answer
a)
    Username → Student-Name
    Username → Email
    Email → Username (possible, if a student has only one email account)
    (Username, Course, Exam-Date) → Attempt
    (Username, Course, Exam-Date) → Result
    (Username, Course, Attempt) → Exam-Date
    (Username, Course, Attempt) → Result
Award 1 mark for each correctly chosen FD, but not more than 3 marks. [3]
b) Possible PKs:
    (Username, Course, Exam-Date)
    (Username, Course, Attempt)
Award 1 mark if either of the above two possible PKs is chosen; if another PK is chosen, still award 1 mark if the corresponding semantic assumptions are correctly stated. [1]
TOTAL [4]

Question 3 [12]
Consider the following relation:
    (Patient, Disease, Doctor, Diagnosis, Treatment, Diet)
and the following functional dependencies:
    (Patient, Disease, Doctor) → Diagnosis
    (Patient, Disease) → Treatment
    Treatment → Diet
Assume they completely express all the functional dependencies existing in the given relation (i.e., the others are either trivial or can be deduced from the given ones).
Decompose/transform (non-loss) the given relation into a set of relations in BCNF. Explain how you apply Heath’s theorem for each decomposition you make. State the candidate keys for each resulting BCNF relation.

Answer
(1) Heath’s theorem for R (the initial relation), based on ‘Treatment → Diet’, leads to:
    R1 (Treatment, Diet)
    R2 (Patient, Disease, Doctor, Diagnosis, Treatment)
R1 is in BCNF; CK is (Treatment).
R2 is not in BCNF.
(2) Heath’s theorem for R2, based on ‘(Patient, Disease) → Treatment’, leads to:
    R21 (Patient, Disease, Treatment)
    R22 (Patient, Disease, Doctor, Diagnosis)
R21 is in BCNF; CK/PK is (Patient, Disease).
R22 is in BCNF; CK/PK is (Patient, Disease, Doctor).
Note that step 2 could be based on ‘(Patient, Disease, Doctor) → Diagnosis’; this would lead to R22 (Patient, Disease, Doctor, Diagnosis) (the same as above) and R21 (Patient, Disease, Doctor, Treatment).
R21 would then have to be decomposed based on ‘(Patient, Disease) → Treatment’, and this would lead to R211 (Patient, Disease, Treatment) (i.e., R21 above) and R212 (Patient, Disease, Doctor); the latter relation is subsumed by R22 and thus can be discarded. Thus, the same solution is obtained as above.
Result:
    (Treatment, Diet)
    (Patient, Disease, Treatment)
    (Patient, Disease, Doctor, Diagnosis)
Award 4 marks for step (1) and 8 marks for step (2). Alternatively, award 6 marks for the correct set of normalised relations (refer to “Result”, above; this includes the specification of CKs) and 6 marks for the correct process (application of Heath’s theorem + identification of relations in and not in BCNF).
TOTAL [12]

Question 4 [6]
Consider the following relation R:
    (Patient, Disease, Doctor, Treatment)
Consider the following functional dependencies for R:
    (Patient, Disease) → Doctor
    (Patient, Disease) → Treatment
    Doctor → Disease
Assume they completely express all the functional dependencies existing in R. Discuss the way in which these functional dependencies can be expressed via normal forms (decomposition) in parallel with the issue of dependency preservation.

Answer
By declaring ‘(Patient, Disease)’ as a CK (or as the PK) of R, the following FDs are expressed by R:
    (Patient, Disease) → Doctor
    (Patient, Disease) → Treatment
This solution is not in BCNF due to
    Doctor → Disease
I.e., the above FD is not expressed by/in R.
There are two ways to express the last FD above. The first is directly at the level of R. This leads to:
    (Doctor, Disease); CK (Doctor)
and
    (Patient, Doctor, Treatment); CK (Patient, Doctor)
This solution has lost the first two FDs above.
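
The two claims made so far in this answer (that ‘Doctor → Disease’ violates BCNF in R, and that the direct decomposition loses the first two FDs) can be checked mechanically. The following is a minimal Python sketch and not part of the marking scheme: the helper names `closure`, `is_superkey` and `preserved` are illustrative assumptions, and `preserved` uses the standard test that repeatedly closes the attribute set over the projection onto each decomposed relation.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    """attrs is a superkey if its closure covers every attribute of the relation."""
    return closure(attrs, fds) >= relation


def preserved(fd, decomposition, fds):
    """True if fd can still be derived from FDs local to the decomposed relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes derivable from what we have, restricted to relation ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


# Relation R and its FDs, as given in the question.
R = {"Patient", "Disease", "Doctor", "Treatment"}
FDS = [
    (frozenset({"Patient", "Disease"}), frozenset({"Doctor"})),
    (frozenset({"Patient", "Disease"}), frozenset({"Treatment"})),
    (frozenset({"Doctor"}), frozenset({"Disease"})),
]

# The direct decomposition based on Doctor -> Disease.
DIRECT = [{"Doctor", "Disease"}, {"Patient", "Doctor", "Treatment"}]

# BCNF requires every non-trivial determinant to be a superkey.
bcnf_violations = [fd for fd in FDS if not is_superkey(fd[0], R, FDS)]
lost = [fd for fd in FDS if not preserved(fd, DIRECT, FDS)]

print("BCNF violations:", [(sorted(l), sorted(r)) for l, r in bcnf_violations])
print("FDs lost by the direct decomposition:", [(sorted(l), sorted(r)) for l, r in lost])
```

Under these assumptions, the only determinant that fails the superkey test is (Doctor), and the two FDs with determinant (Patient, Disease) are the ones reported as lost.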
A better solution would be to first decompose R, based on either of the first two FDs, into
    (Patient, Disease, Treatment) with PK (Patient, Disease), which is in BCNF; and
    (Patient, Disease, Doctor).
The latter would then have to be decomposed into
    (Patient, Doctor)
    (Doctor, Disease)
Still, ‘(Patient, Disease) → Doctor’ is lost.
Award 6 marks if the point that there is no solution that can express all three FDs is clearly made. The answer does not have to be as detailed as above and it may follow a slightly different line of argument. However, the explanation has to be clear. [6]
TOTAL [6]

PROBLEM 4 [25]

Question 1 [6]
Write the SQL statements that implement the database schema that corresponds to the following ER model. The entity “Child” is a weak entity which depends on “Employee”. Your answer should also include the statement of the relevant integrity constraints. The answer can be given purely in terms of two CREATE statements.

ER model:
    Employee: empNo {PK}, name, jobTitle, department, salary
    Child: name, sex, dateOfBirth
    Each Employee (1) is related to 0..* Child entities.

Answer
    CREATE TABLE Employee (
        empNo       char(10),
        name        varchar(50) NOT NULL,
        jobTitle    varchar(100),
        department  char(5),
        salary      real CHECK (salary > 15000),
        PRIMARY KEY (empNo)
    );

    CREATE TABLE Child (
        empNo        char(10),
        name         varchar(50),
        sex          char(1) CHECK (sex IN ('M', 'F')),
        dateOfBirth  date,
        PRIMARY KEY (empNo, name),
        FOREIGN KEY (empNo) REFERENCES Employee
            ON DELETE CASCADE
            ON UPDATE CASCADE
    );
Award 2 marks for the first definition. Award 4 marks for the second definition (1 mark for PK, 1 mark for FK, 1 mark for FK rules and 1 mark for the rest). Full marks may be awarded if other attribute constraints, apart from PKs and FKs, are not specified.
TOTAL [6]

Description for the following two questions:
Consider a small database for a library. The database stores general information about books, the physical copies of each book they have in the library, their readers and the books/copies that were or are given out on loan.
This information is stored in the following relations (primary keys and foreign keys are noted after each relation):
    Book (ISBN, title, authors, publisher, year, price)
        PK : ISBN
    PhysicalCopy (catalogNo, ISBN, location, maxDaysLoan, overdueChargePerDay)
        PK : catalogNo
        FK : ISBN REFERENCES Book
    Reader (userName, name, address, maxNoBooksForLoan)
        PK : userName
    Loan (userName, catalogNo, dateOut, dateIn)
        FK : userName REFERENCES Reader
        FK : catalogNo REFERENCES PhysicalCopy

Question 2 [13]
Express the following natural language queries in SQL:

a) List the title, authors, and price for all the books published by Addison-Wesley in 2000, in alphabetical order with respect to titles. [2]
    SELECT   title, authors, price
    FROM     Book
    WHERE    publisher = 'Addison-Wesley' AND year = '2000'
    ORDER BY title;

b) List the titles of all the books that can be taken on loan for more than three days. [2]
    SELECT title
    FROM   Book B, PhysicalCopy C
    WHERE  B.ISBN = C.ISBN AND maxDaysLoan > 3;

c) List how many non-returned books (as in physical copies) the reader “Goldy Smith” has (hint: a non-returned book has no value for ‘dateIn’). [2]
    SELECT count(*)
    FROM   Loan L, Reader R
    WHERE  L.userName = R.userName AND name = 'Goldy Smith' AND dateIn IS NULL;

d) List all the readers (as name and address) who have books overdue, together with the titles of these books. A book is considered overdue if it has not yet been returned and it has been on loan for more than the maximum number of days allowed (‘maxDaysLoan’). (Hint: assume that the difference between two values of type DATE corresponds to the data type associated with ‘maxDaysLoan’; ‘CURRENT_DATE’ is an SQL function, taking no arguments, which returns the current date.)
[3]
    SELECT name, address, title
    FROM   Reader R, Book B, Loan L, PhysicalCopy C
    WHERE  L.userName = R.userName AND L.catalogNo = C.catalogNo AND C.ISBN = B.ISBN
           AND dateIn IS NULL AND (CURRENT_DATE - dateOut) > maxDaysLoan;

e) List the names of all the readers who have non-returned books together with the total number of non-returned books, but only if this total exceeds their quota (‘maxNoBooksForLoan’). [4]
    SELECT   name, count(*) AS totalNoOfBooksOnLoan
    FROM     Loan L, Reader R
    WHERE    L.userName = R.userName AND dateIn IS NULL
    GROUP BY name, maxNoBooksForLoan
    HAVING   count(*) > maxNoBooksForLoan;
TOTAL [13]

Question 3 [6]
Express the following integrity constraints in SQL:

a) Books located in ‘Reference’ should not be allowed to be borrowed, i.e., the ‘maxDaysLoan’ for all their copies should be zero (note that this will not stop an actual loan from happening, or even from being recorded in the database). [3]
    CREATE ASSERTION Cannot_borrow_reference_books
    CHECK ( NOT EXISTS (
        SELECT *
        FROM   PhysicalCopy
        WHERE  location = 'Reference' AND maxDaysLoan > 0 ));

b) Books whose price exceeds £100 should not be allowed to be borrowed (the same observation as above applies here, too). [3]
    CREATE ASSERTION Expensive_books_cannot_be_borrowed
    CHECK ( NOT EXISTS (
        SELECT *
        FROM   Book B, PhysicalCopy C
        WHERE  B.ISBN = C.ISBN AND price > 100 AND maxDaysLoan > 0 ));
TOTAL [6]

PROBLEM 5 [25]

Question 1 [7]
a) Explain the idea of query optimisation via a simple example. [5]
b) Enumerate some types of information about a database that may be used by an optimiser. Where is such information stored? [2]

Answer
a) The operators of a data manipulation language, such as SQL, are implemented through procedures; the process of executing a query through some procedures is called the evaluation of the query. The same operator may be implemented (in a DBMS) by more than one procedure.
By using different procedures, the execution of a query may lead to different execution times. Moreover, the order in which the procedures that implement a query are executed is relevant in terms of execution time. The evaluation of a query requires:
    a) the association of specific procedures with the operators used in the query;
    b) the specification of a certain order in which the operators are executed.
Consider, for example, the following SQL query:
    SELECT *
    FROM   Student S, Registrations R
    WHERE  S.sId = R.sId AND S.Age > 30 AND R.Result > 70;
This query can be evaluated by:
    - (1) restricting Student to Age > 30; (2) restricting Registrations to Result > 70; and (3) joining the respective results;
    - (1) joining Student with Registrations; (2) restricting to Age > 30; and (3) restricting to Result > 70;
    - etc.
Each order produces the same result, but may lead to a different execution time.
Award
    3 marks for the explanation (it only has to convey the main idea; it does not have to be as detailed as above);
    2 marks for the example (any example that makes reference to the order of execution or to the selection of procedures is sufficient).
TOTAL [5]

b) Examples of information utilised by an optimiser include:
    - number of tuples in each relation/table;
    - space occupied by each relation/table;
    - min, max and average values for numeric fields;
    - number of distinct values for each field;
    - histograms.
Such information is stored in the catalogue.
Award
    1 mark for at least one correct example;
    1 mark for the answer: catalogue.
TOTAL [2]
TOTAL [7]

Question 2 [8]
a) What is a transaction? Give a simple example. [4]
b) Explain the ACID properties of transactions (one/two sentences per property). [4]

Answer
a) A transaction is a sequence of database operations that represents a logical unit of work.
An example could be given in the context of a database that stores some redundant data (e.g., each loan, in a library database, is stored explicitly, but the total number of loans is also explicitly stored for each borrower); a transaction is required when such data is updated.
Award:
    2 marks for the definition;
    2 marks for the example.
TOTAL [4]

b)
    A - atomicity: all or nothing;
    C - consistency: the consistency of the database is preserved by the execution of a transaction;
    I - isolation: transactions are isolated from one another (i.e., no predetermined link exists between transactions);
    D - durability: once a transaction has been committed, its effects are guaranteed to persist (even in the event of a soft failure).
Award 1 mark per property.
TOTAL [4]
TOTAL [8]

Question 3 [5]
There are five types of transactions that can be identified when a system failure arises. Describe each of them, stating, in each case, the corresponding recovery action that a DBMS must take (a diagram may help your explanation).

Answer
[Diagram: a time line marking a checkpoint and a later system failure, with the five transactions T1 to T5 drawn against it.]
    T1 - completed and written on disk
    T2 and T4 - completed, but not completely written on disk
    T3 and T5 - not completed
    T2 and T4 must be redone
    T3 and T5 must be undone
Award
    3 marks for the explanation of the types of transaction (the diagram is not necessary);
    2 marks for the action.
TOTAL [5]

Question 4 [5]
Consider the following transaction, called T, represented in Diagram 1 (ti represent tuples). Explain the execution of T in time, in terms of locks, using the following primitives: request-for-lock(type, tuple); acquire-lock(type, tuple); wait; release-lock(type, tuple) and the time scale represented in Diagram 2. Each horizontal line on the time scale could represent the execution of an operation, provided the request for the corresponding lock is successful.
The evolution of locks on these tuples from the point of view of another transaction, executed concurrently with T, is also described in Diagram 2.

Diagram 1 (transaction T):
    BEGIN
    SELECT t1
    UPDATE t2
    UPDATE t3
    UPDATE t1
    COMMIT

Diagram 2 (time scale, events of the other transaction):
    another transaction has: acquire-lock(X, t2) and acquire-lock(S, t1)
    start T
    release-lock(X, t2)
    release-lock(S, t1)

Answer
    the other transaction        transaction T
                                 start T
                                 request-for-lock(S, t1)
                                 acquire-lock(S, t1)
                                 request-for-lock(X, t2)
                                 wait
                                 wait
    release-lock(X, t2)
                                 acquire-lock(X, t2)
                                 request-for-lock(X, t3)
                                 acquire-lock(X, t3)
                                 request-for-lock(X, t1)   // this is a promote, in fact
                                 wait
                                 wait
    release-lock(S, t1)
                                 acquire-lock(X, t1)
                                 release-lock(X, t1)
                                 release-lock(X, t2)
                                 release-lock(X, t3)
Award 5 marks for a correct answer (award fewer than 5 marks but more than 0 if a good attempt at the answer was made).
TOTAL [5]
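
The schedule in the answer to Question 4 follows from the usual compatibility rule for shared (S) and exclusive (X) locks: S is compatible with S, while X is compatible with nothing held by another transaction. The sketch below replays the schedule in Python under that assumption. The `LockTable` class and the transaction labels "T" and "other" are hypothetical names introduced for illustration only; a real DBMS lock manager would also queue waiting requests and detect deadlocks.

```python
class LockTable:
    """Toy lock table: maps each tuple to the locks held on it, per transaction."""

    def __init__(self):
        self.held = {}  # tuple name -> {transaction: "S" or "X"}

    def can_grant(self, txn, mode, item):
        # Only locks held by OTHER transactions can conflict with the request.
        others = [m for t, m in self.held.get(item, {}).items() if t != txn]
        if mode == "S":
            return all(m == "S" for m in others)
        return not others  # X requires that no other transaction holds any lock

    def acquire(self, txn, mode, item):
        """Grant the lock and return True, or return False if txn must wait."""
        if not self.can_grant(txn, mode, item):
            return False
        # Overwriting an S held by the same transaction models a promotion to X.
        self.held.setdefault(item, {})[txn] = mode
        return True

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)


locks = LockTable()

# State before T starts (Diagram 2).
locks.acquire("other", "X", "t2")
locks.acquire("other", "S", "t1")

granted = []
granted.append(locks.acquire("T", "S", "t1"))  # SELECT t1: S is compatible with S
granted.append(locks.acquire("T", "X", "t2"))  # UPDATE t2: blocked by the other X
locks.release("other", "t2")
granted.append(locks.acquire("T", "X", "t2"))  # retry succeeds after the release
granted.append(locks.acquire("T", "X", "t3"))  # UPDATE t3: nobody holds t3
granted.append(locks.acquire("T", "X", "t1"))  # promote S to X: blocked by other S
locks.release("other", "t1")
granted.append(locks.acquire("T", "X", "t1"))  # retry succeeds after the release

# COMMIT: T releases everything it holds.
for item in ("t1", "t2", "t3"):
    locks.release("T", item)

print(granted)
```

Each `False` entry in `granted` corresponds to a "wait" period in the answer above, and each retry that returns `True` corresponds to the matching acquire-lock step.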