Essay Question:
Question 1
Explain 4 different means by which constraints are represented in the
Conceptual Data Model (CDM).
By specifying participation conditions
By specifying the degree of relationship (1:1, 1:n, or n:m)
By specifying entity unique identifiers
By specifying additional constraints
Question 2
Briefly explain the different options usually available when specifying the
system action in response to an attempted deletion of a referenced tuple that
would result in a violation of the referential integrity constraint.
Restrict: Disallow the deletion.
Cascade: propagate the deletion by deleting all referring tuples.
Default: perform the deletion and apply a predefined default value in place of the
deleted value in the referring tuples.
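The three options can be tried out with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as the DBMS, and the department and employee tables are hypothetical, invented only for illustration. Other systems may spell or support these options differently.

```python
import sqlite3

def referring_rows_after_delete(action):
    """Try to delete a referenced department; return the employee rows afterwards.

    `action` is the ON DELETE rule of the foreign key. The tables are
    hypothetical and exist only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " dept_id INTEGER DEFAULT 0"
        f" REFERENCES department(dept_id) ON DELETE {action})"
    )
    # Department 0 plays the role of the predefined default value.
    conn.executemany("INSERT INTO department VALUES (?)", [(0,), (10,)])
    conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, 10), (2, 10)])
    try:
        conn.execute("DELETE FROM department WHERE dept_id = 10")
    except sqlite3.IntegrityError:
        pass  # Restrict: the deletion is disallowed
    rows = conn.execute(
        "SELECT emp_id, dept_id FROM employee ORDER BY emp_id"
    ).fetchall()
    conn.close()
    return rows

restrict_rows = referring_rows_after_delete("RESTRICT")    # referring tuples untouched
cascade_rows = referring_rows_after_delete("CASCADE")      # referring tuples deleted
default_rows = referring_rows_after_delete("SET DEFAULT")  # referring tuples point at 0
```

In this sketch the default value 0 must itself exist in department, otherwise the Default option would violate referential integrity again.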
Question 3
Briefly explain the main drawbacks of the client-multi-server approach of
distributed database management
The user must manage the connections to remote servers
The user must be aware of where the data is located, thus losing the
desirable property of location transparency
Question 4
Explain three important advantages of SQL routines.
SQL routines provide the following advantages:
Security: users can be granted only the execute privilege on a routine, without
any other explicit privileges on the underlying data. This enhances security.
Efficiency: SQL routines can be stored on the server side, thus resulting in
lower network traffic.
The separation of concerns: developers can separate the internal details of
how a certain procedure is implemented from the way it is used.
Question 5
Briefly explain what is meant by a lost dependency (also called a hidden
dependency), and how this problem can be corrected.
A lost dependency is a situation that arises due to the incompatibility of
normalization and the preservation of all functional dependencies in the ER
diagram. It results in not being able to directly represent some functional
dependencies in the ER diagram, thus losing part of the problem specifications.
This problem can be corrected by specifying constraints in the additional
constraints section to restore the lost dependencies.
Question 6
Explain the type of interface that was provided in the old file systems
approach.
The old file systems approach used a type of physical interface whereby the
user/application developer had to know the physical locations of the various
items in the file in order to retrieve the desired information. All information was
retrieved based on its position in the file.
Question 7
What was the main disadvantage of this old style of interface?
The main disadvantage of such an interface was the high program
maintenance overhead in response to changes in the file structure. Changes
in the file structures would require re-coding all application programs that
referred to affected areas of the original file, resulting in much wasted effort
and expense.
Question 8
Explain what is meant by “logical interface”, and why it is an improvement over the
interface utilized in the old file systems approach.
A logical interface is one that does not require the user to refer to data items
by location; rather, data items are retrieved based on their logical structure
and predefined names. It was an improvement over the interface utilized in
the old file systems approach because it avoids having to re-code applications
in response to changes in the file structures. Since items are referred to
logically, the item references will remain valid in the face of physical file
changes.
Question 9
How (i.e. in what form) is the logical interface implemented in modern
Relational Database Management Systems (RDBMS)? Give an example of a
logical request to a relational database.
It is implemented in modern relational databases through the relational
model and its widely used language SQL. As an example, to retrieve the
names and addresses of all students, we might issue the SQL statement:
SELECT name, address FROM student;
Question 10
Explain what is meant by unproductive maintenance problem in the old file
system approach and how the database approach addresses it.
Unproductive maintenance occurs in the old file system approach when application
programs need to be maintained or re-written due to restructuring in the underlying
file that is unrelated to the current application and does not lead to any
improvements in it. It is usually related to additional requirements in other
application programs, resulting in the need to restructure the underlying file(s). This
problem is addressed in database management systems by introducing the concept of
data independence in which application programs are independent of the file
structures.
Question 11
Describe briefly the differences between the Information Systems (IS),
Information Management (IM), and Information Technology (IT) strategies.
The Information Systems (IS) strategy focuses on business improvement and what
can be brought about by applying appropriate information systems. The
Information Management (IM) strategy focuses on people and the management
aspects of the information systems. Finally, the Information Technology (IT)
strategy focuses on what technologies should be used, and on the development
aspects of the technology.
Question 12
Explain the statement: “join is a refinement of Cartesian product”. Suppose we wish
to join tables S(A, B) and T(C, D) on attributes B and C. Write a simple SQL query to
make the join using a refinement of Cartesian product.
A Cartesian Product operation produces all possible pairs of tuples. We can achieve
the effect of join by further restricting the result of the Cartesian product to ensure
that the join attributes coming from both tables fulfill the join condition.
SELECT *
FROM S, T
WHERE S.B = T.C;
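The refinement can be checked on a small instance. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and made-up sample rows for S and T. It compares the unrestricted Cartesian product, the restricted product from the answer, and an explicit JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (A INTEGER, B INTEGER)")
conn.execute("CREATE TABLE T (C INTEGER, D TEXT)")
# Sample rows are invented for illustration only.
conn.executemany("INSERT INTO S VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
conn.executemany("INSERT INTO T VALUES (?, ?)", [(10, "x"), (20, "y"), (40, "z")])

# Cartesian product: every tuple of S paired with every tuple of T.
product = conn.execute("SELECT * FROM S, T").fetchall()

# Join expressed as a restriction of the Cartesian product.
restricted = conn.execute(
    "SELECT * FROM S, T WHERE S.B = T.C ORDER BY A"
).fetchall()

# The same join written with explicit JOIN syntax.
explicit = conn.execute(
    "SELECT * FROM S JOIN T ON S.B = T.C ORDER BY A"
).fetchall()
```

With three rows in each table the product holds nine pairs, of which only the two that satisfy S.B = T.C survive the restriction.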
Question 13
The basic operation to normalize a table R, regardless of which normal form, is
to identify an offending functional dependency (FD) and then to split R
into two tables, repeating until all tables are normalized. Describe how this can be
done in such a way as to guarantee non-loss decomposition.
In each step we decompose R by taking two projections. First, we project R over
the left- and right-hand sides of the FD to give S. Second, we project R over all the
attributes of R except those which form the right-hand side of the FD, to give T. This
process guarantees non-loss decomposition by Heath's theorem (I. J. Heath).
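The two projections can be illustrated with a minimal sketch, assuming SQLite through Python's sqlite3 module. The table R(emp, dept, location) and the offending FD dept → location are hypothetical, chosen only to show that joining S and T gives back exactly the rows of R.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table R with the offending FD: dept -> location.
conn.execute("CREATE TABLE R (emp TEXT, dept TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [("ann", "sales", "london"), ("bob", "sales", "london"), ("cat", "it", "leeds")],
)

# S: projection over the left- and right-hand sides of the FD.
conn.execute("CREATE TABLE S AS SELECT DISTINCT dept, location FROM R")
# T: projection over every attribute except the right-hand side of the FD.
conn.execute("CREATE TABLE T AS SELECT DISTINCT emp, dept FROM R")

original = conn.execute(
    "SELECT emp, dept, location FROM R ORDER BY emp"
).fetchall()
# Joining T and S on the left-hand side of the FD reconstructs R.
rejoined = conn.execute(
    "SELECT T.emp, T.dept, S.location FROM T JOIN S ON T.dept = S.dept ORDER BY T.emp"
).fetchall()
s_rows = conn.execute("SELECT dept, location FROM S ORDER BY dept").fetchall()
```

Note that S stores the location of each department once, which is the redundancy the split removes.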
Question 14
Explain four different means by which the relational model can represent
constraints.
1. By specifying primary keys
2. By specifying foreign keys
3. By specifying alternate keys
4. By specifying general constraints
5. By specifying domains
Question 15
In the pure relational model, explain why we can’t represent a relationship type by
a posted foreign key when both sides of the relationship have optional participation
“constraints”.
The pure relational model requires all fields to have a value. No matter which side
we use to post a foreign key, we will not be able to accommodate the optional
participation, since a tuple that does not take part in the relationship would
require the posted foreign key to have no value.
Question 16
Explain the statement: “The HAVING clause is to groups what the WHERE clause is
to rows.”
The HAVING clause operates at the level of groups, meaning that each group is
filtered individually by applying the condition in the HAVING clause to it. If the
group as a whole passes the test, it is included in the result; otherwise, it is
rejected.
The WHERE clause operates at the level of rows, meaning that each row is filtered
individually by applying the condition in the WHERE clause to it. If it passes the
test, it is included in the result.
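The contrast can be seen side by side in a minimal sketch, assuming SQLite through Python's sqlite3 module and a hypothetical enrolment table with invented marks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE enrolment (student TEXT, course TEXT, mark INTEGER)")
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?)",
    [
        ("ann", "db", 80), ("bob", "db", 40), ("cat", "db", 60),
        ("ann", "ai", 30), ("bob", "ai", 50),
    ],
)

# WHERE tests each row before grouping: only rows with mark >= 50 are counted.
where_result = conn.execute(
    "SELECT course, COUNT(*) FROM enrolment "
    "WHERE mark >= 50 GROUP BY course ORDER BY course"
).fetchall()

# HAVING tests each group after grouping: only courses whose average is >= 50 remain.
having_result = conn.execute(
    "SELECT course, AVG(mark) FROM enrolment "
    "GROUP BY course HAVING AVG(mark) >= 50 ORDER BY course"
).fetchall()
```

Both courses appear in the WHERE result because each has at least one qualifying row, while the HAVING result rejects the "ai" group as a whole.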
Question 17
A query written using subqueries can often be re-written using joins and vice versa.
Discuss when subqueries are necessary, when joins are necessary, and what factors
affect the choice when both options are feasible.
A join is necessary when the final table includes data from both tables because
fields in tables referred to only inside a subquery are inaccessible in the SELECT
clause.
A subquery is necessary when a comparison is to be made with an aggregate
function applied to the second table or lists of values from all the rows because
this requires two levels of evaluation, not available in queries with joins only.
When there is a genuine choice between a join and a subquery, there are two
factors to consider:
Which formulation seems more natural given the English request.
Which formulation is likely to be most efficient.
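The three cases can be sketched as follows, assuming SQLite through Python's sqlite3 module. The department and employee tables and their rows are hypothetical. For simplicity the aggregate here is taken over the same table as the outer query rather than a second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and data, for illustration only.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER, dept_id INTEGER)")
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "sales"), (2, "it")])
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("ann", 30, 1), ("bob", 50, 1), ("cat", 70, 2)],
)

# Join needed: the result shows columns from both tables.
joined = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e "
    "JOIN department d ON e.dept_id = d.dept_id ORDER BY e.name"
).fetchall()

# Subquery needed: each row is compared with an aggregate over all rows.
above_average = conn.execute(
    "SELECT name FROM employee "
    "WHERE salary > (SELECT AVG(salary) FROM employee) ORDER BY name"
).fetchall()

# Genuine choice: both formulations return the employees of the sales department.
via_subquery = conn.execute(
    "SELECT name FROM employee WHERE dept_id IN "
    "(SELECT dept_id FROM department WHERE dept_name = 'sales') ORDER BY name"
).fetchall()
via_join = conn.execute(
    "SELECT e.name FROM employee e JOIN department d ON e.dept_id = d.dept_id "
    "WHERE d.dept_name = 'sales' ORDER BY e.name"
).fetchall()
```

Which of the last two formulations runs faster depends on the DBMS and its optimizer, so this sketch makes no claim about efficiency.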
Question 18
Explain the concept of isolation levels and why this is needed in databases where
concurrent access is required? What is the highest isolation level?
Isolation levels define how strictly concurrently running transactions are isolated
from each other, and therefore the level of concurrent access that is allowed for a
transaction. User-selectable isolation levels allow a choice between accuracy of data
and speed of execution.
The highest level of isolation is known as serializable isolation, which guarantees
serializable execution of a transaction. Lower isolation levels may result in one of
the concurrency problems occurring, but generally allow quicker execution.
Question 19
Discuss the role played by the concept of separation of concerns in modern
database management.
The separation of concerns facilitates the design of database systems by allowing the
designers to focus more on data representations, requirements and processing. The
separation of concerns also simplifies the task for programmers developing user processes
because it relieves them from having to consider all the concerns simultaneously. It also
results in facilitating ad hoc querying because the DBMS can translate a logical user
query into the more complex physical requirements of the underlying computer system.
Finally, it leads to logical data independence, which in turn results in reduced
unproductive maintenance.
Question 20
Which data quality issue(s) (accuracy, completeness, timeliness, relevance,
understandability and trustability) is (are) being compromised in each of the
following situations? Explain why.
i. The Student Information System does not record the history of the transactions on
student marks, including the time and identity of the user who changed the marks in the
system.
Un-trusted. If an authorized person can maliciously alter students' marks in the
system without ever being identified, we lose trust in the system.
ii. A system posts a notice for a cancelled lecture 5 minutes before the lecture starts
Un-timely. Finding out about a cancelled lecture only 5 minutes before the lecture
starts is too late to avoid travelling to the university.
iii. Your request for your course marks returns the marks of all students in a different
course.
Irrelevant and un-trusted. Irrelevant because it is for a different class, and un-trusted
because a system that reveals student marks to unauthorized persons does not convey
trust.
iv. The registrar wishes to choose a suitable room for a certain class. A request to the
system returns a listing of all rooms on the first floor with the designation “small” or
“large”.
Incomplete and not understandable. What about the other floors? And what do the terms
“small” and “large” mean?
Question 21
Describe the three key activities of the waterfall model applied to database
development. Explain, for each key activity, what document is produced at
the end of the activity.
1. Establishing requirements: the users are interviewed and documents are examined to
determine the user requirements. The end product is a requirements document that is
important as it acts as a contract between the user and the developer.
2. Data analysis: the process of converting the requirements document into a
conceptual data model that is independent of the implementation environment. Its end
product is the conceptual data model and it is important because it acts as a formal
system specification throughout the development process to follow.
3. Database design: is the process of developing a system-specific design for the intended
database. Its end product is the logical schema and it is important because it forms the
basis of the subsequent system implementation.
Question 22
Explain what is meant by weak entity types in a conceptual data model?
A weak entity type is an entity type whose entities cannot exist without a related
entity of another, identifying entity type.
Question 23
Explain one key advantage of using SQL routines in database development.
1. Centralizing the location of functionality: Routines are held in a central place in the
database and are made accessible to all applications. In case the definition of a routine
needs to be changed, it will need to be updated only once, not in many places.
2. Improving security: Instead of granting access rights on the data to all users, we can
grant only the execute right on the stored routines, which will help control the types of
access allowed for the users.
3. Improving efficiency: Routines that are defined within the database on the server side
do not need to use communication networks between the client and the server,
thereby improving efficiency.
Question 24
Explain what is meant by Document Type Definition (DTD) in the context of the
eXtensible Markup Language (XML).
A DTD is a way to define the structure or type of an XML document and to define which
documents represent valid instances of this definition.
Question 25
Explain what is meant by transaction management and why it is important in
a multi-user environment.
Transaction management aims to eliminate the interference between multiple operations
on the same data while allowing a maximum amount of availability. It is especially
important in multi-user environments because it allows multiple users to access and
update the database efficiently without risk of interference among their transactions.
Question 26
Explain why a compiled program including embedded SQL statements is not
portable across different DBMSs from different vendors.
It would not be portable because the compiled SQL code would be specific to the
DBMS for which it was compiled. The Open Database Connectivity (ODBC)
approach has been developed to overcome this limitation.
Question 27
Explain why a composite query involving a correlated nested sub-query can
be inefficient.
A composite query involving a correlated nested subquery can sometimes be
inefficient because the subquery will need to be evaluated many times, once for
each tuple considered in the outer query. For example, returning the employees
whose salary is higher than the average salary of all employees in their department
will require the average salary for each department to be evaluated once for each
employee in it.
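The employee example can be written out as a minimal sketch, assuming SQLite through Python's sqlite3 module and a hypothetical employee table. The second query is one possible alternative that computes each department average once in a grouped derived table. Whether a given DBMS really re-evaluates the correlated subquery per row depends on its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("ann", "sales", 30), ("bob", "sales", 50), ("cat", "it", 70), ("dan", "it", 90)],
)

# Correlated form: the inner query refers to e.dept from the outer query,
# so conceptually it is evaluated once per employee.
correlated = conn.execute(
    "SELECT e.name FROM employee e "
    "WHERE e.salary > (SELECT AVG(e2.salary) FROM employee e2 WHERE e2.dept = e.dept) "
    "ORDER BY e.name"
).fetchall()

# Alternative form: each department average is computed once, then joined.
grouped = conn.execute(
    "SELECT e.name FROM employee e "
    "JOIN (SELECT dept, AVG(salary) AS avg_salary FROM employee GROUP BY dept) d "
    "ON d.dept = e.dept "
    "WHERE e.salary > d.avg_salary ORDER BY e.name"
).fetchall()
```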
Question 28
Explain when you would need to add a general constraint to a relational
database definition in the theoretical database model.
A general constraint would be needed if the constraint cannot be represented in the E-R
model or if a relationship with a mandatory participation condition is implemented using
a posted foreign key from the mandatory side to the other side.
Question 29
Explain when it would be necessary to use a subquery as opposed to using a
join?
A subquery is necessary when a comparison is to be made with an aggregate
function applied to the second table or lists of values from all the rows.
Question 30
Briefly explain the two possible outcomes of a transaction in a transaction
management system, and why transactions are needed in a shared
environment.
The two possible outcomes are: 1. The transaction succeeds and all its statements
are carried out, or 2. One of its statements fails, resulting in all the changes made
being rolled back. Transactions are needed in a shared environment to prevent
different processes from interfering with each other in such a way that incorrect
results are produced.
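Both outcomes can be observed in a minimal sketch, assuming SQLite through Python's sqlite3 module. The account table, the CHECK constraint and the transfer function are hypothetical, invented for illustration. The sketch shows the all-or-nothing behaviour only and does not demonstrate concurrent access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a balance may never become negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("ann", 100), ("bob", 0)])
conn.commit()

def transfer(amount):
    """Move `amount` from ann to bob as one transaction; return True if it committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'bob'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'ann'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall()

first_ok = transfer(60)      # outcome 1: both statements succeed and are kept
after_success = balances()
second_ok = transfer(60)     # outcome 2: the second statement fails, all changes undone
after_failure = balances()
```

In the failed transfer the credit to bob had already been applied when the debit from ann violated the constraint, yet the rollback leaves both balances as they were before the transaction began.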
Question 31
Briefly explain the main benefits of producing a normalized database.
1. Reducing unnecessary redundancy
2. The elimination of insertion, deletion and update anomalies
Question 32
Briefly explain the main phases of developing a database system using the
waterfall model.
1. establishing requirements, produces the data requirement specifications
2. data analysis, produces the conceptual model
3. database design, resulting in the logical schema
4. implementation, resulting in an initial schema and database
5. testing against the requirements specifications, resulting in the released schema
and database
6. maintenance, a continuous activity
Question 33
Explain the main advantages of using procedures in database systems.
1. Improving security by giving only execution rights to procedures on critical data.
2. Improving performance since procedures can be stored on the server, eliminating
the need to transfer large data sets to/from the server.
3. Achieving a better separation of concerns.
Explain the difference between the cascade, default and restrict options when deleting
a referenced tuple would result in violating referential integrity.
Question 34
Explain how the global and distribution schemas are used in a distributed
database system?
A global schema is like a logical schema for the entire database while a distribution
schema is like a storage schema which defines where data is located. They are used
to provide a single unified logical schema for the entire database and to help
achieve location transparency, where the user is not required to be aware of where
the data is located.
Question 35
What is the difference between a table and a relation? Mention the related rules.
A relation is an abstract concept consisting of a set of tuples of attribute values in any
order. Provided that it obeys the following rules, a table might be a concrete
depiction of a relation:
Rule 1: atomic entries
Rule 2: consistent column values
Rule 3: unique column names
Rule 4: distinct rows
Rule 5: insignificance of order
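Rules 4 and 5 follow directly from a relation being a set. The following minimal sketch models a relation as a Python frozenset of tuples. The student rows and the helper name as_relation are hypothetical, chosen only for illustration.

```python
def as_relation(rows):
    """Model a relation as an unordered set of tuples."""
    return frozenset(rows)

# Hypothetical student rows, as a table might list them.
table_rows = [("s1", "Ann"), ("s2", "Bob")]
reordered_rows = [("s2", "Bob"), ("s1", "Ann")]
with_duplicate = [("s1", "Ann"), ("s2", "Bob"), ("s1", "Ann")]

# Rule 5: listing the rows in a different order depicts the same relation.
same_relation = as_relation(table_rows) == as_relation(reordered_rows)

# Rule 4: a table with a repeated row has more rows than the relation has tuples,
# so it is not a faithful depiction of a relation.
obeys_rule_4 = len(with_duplicate) == len(as_relation(with_duplicate))
```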