CS4222
Principles of Database System
1. Introduction
Huiping Guo
Department of Computer Science
California State University, Los Angeles
Some concepts
 Database
 A very large, integrated collection of data.
 Data
 Known facts that can be recorded and have an implicit
meaning
 Database Management System (DBMS)
 a software package that enables users to create and
maintain databases
• Facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and
applications
 Database System:
 The DBMS software together with the data itself.
 Sometimes, the applications are also included.
1. Introduction
CS4222_Summer'17
2
Simplified database system
environment
Typical DBMS Functionality
 Define a particular database in terms of its data
types, structures, and constraints
 Construct or Load the initial database contents on
a secondary storage medium
 Manipulating the database:
 Retrieval: Querying, generating reports
 Modification: Insertions, deletions and updates to
its content
 Accessing the database through Web applications
 Processing and Sharing by a set of concurrent
users and application programs, yet keeping all
data valid and consistent
Typical DBMS Functionality
 Other features:
 Protection or Security measures to prevent unauthorized
access
 “Active” processing to take internal actions on data
 Presentation and Visualization of data
 Maintaining the database and associated programs over
the lifetime of the database application
• Called database, software, and system maintenance
Example of a Database
 Miniworld
 A database represents some aspect of the real world
 Mini-world for the example:
 Part of a UNIVERSITY environment.
 Some mini-world entities:
 STUDENTs
 COURSEs
 SECTIONs (of COURSEs)
 (academic) DEPARTMENTs
 INSTRUCTORs
Example of a Database
(with a Conceptual Data Model)
 Some mini-world relationships:
 SECTIONs are of specific COURSEs
 STUDENTs take SECTIONs
 COURSEs have prerequisite COURSEs
 INSTRUCTORs teach SECTIONs
 COURSEs are offered by DEPARTMENTs
 STUDENTs major in DEPARTMENTs
 Note: The above entities and relationships are
typically expressed in a conceptual data model,
such as the ENTITY-RELATIONSHIP data model
(to be discussed in detail later)
Example of a simple database
Main Characteristics of the
Database Approach
 Self-describing nature of a database system:
 A DBMS catalog stores the description of a particular
database (e.g. data structures, types, and constraints)
 The description is called meta-data.
 This allows the DBMS software to work with different
database applications.
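A minimal sketch of a self-describing database, assuming SQLite through Python's built-in sqlite3 module (the slides do not name a particular DBMS). SQLite exposes its catalog as the table sqlite_master, and PRAGMA table_info reports column meta-data. The course table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (cid TEXT PRIMARY KEY, cname TEXT NOT NULL, credits INTEGER)"
)

# The catalog is itself queryable: list the tables the DBMS knows about.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# Column-level meta-data: (name, declared type) for each column of course.
# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(tables)
print(columns)
```

The same two catalog queries work unchanged for any other table definition, which is the sense in which one DBMS can serve different database applications.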
Example of a simplified database catalog
Main Characteristics of the
Database Approach (continued)
 Insulation between programs and data:
 Called program-data independence.
 Allows changing data structures and storage
organization without having to change the DBMS
access programs
 The characteristic that allows program-data
independence is called data abstraction
 Support of multiple views of the data:
 Each user may see a different view of the database,
which describes only the data of interest to that user.
Main Characteristics of the
Database Approach (continued)
 Sharing of data and multi-user transaction
processing:
 Allowing a set of concurrent users to retrieve from
and to update the database.
 Concurrency control within the DBMS guarantees
that each transaction is correctly executed or
aborted
 Recovery subsystem ensures each completed
transaction has its effect permanently recorded in
the database
 OLTP (Online Transaction Processing) is a major part
of database applications. This allows hundreds of
concurrent transactions to execute per second.
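The "correctly executed or aborted" property can be sketched as follows, assuming SQLite via Python's sqlite3 module. The section and enrolled tables, the enroll helper, and the sample data are hypothetical. Enrolling inserts a row and decrements a seat counter as one transaction; if the counter would go negative, the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE section (sec_id TEXT PRIMARY KEY, "
    "seats_left INTEGER CHECK (seats_left >= 0))"
)
conn.execute("CREATE TABLE enrolled (sid TEXT, sec_id TEXT, PRIMARY KEY (sid, sec_id))")
conn.execute("INSERT INTO section VALUES ('CS4222-01', 1)")  # one seat available
conn.commit()

def enroll(conn, sid, sec_id):
    """Enroll a student as a single transaction: both updates apply, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO enrolled VALUES (?, ?)", (sid, sec_id))
            conn.execute(
                "UPDATE section SET seats_left = seats_left - 1 WHERE sec_id = ?",
                (sec_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the INSERT above was undone as well.
        return False

first = enroll(conn, "s1", "CS4222-01")   # takes the last seat
second = enroll(conn, "s2", "CS4222-01")  # no seat left, transaction aborted
enrolled_rows = conn.execute("SELECT sid FROM enrolled").fetchall()
seats_left = conn.execute("SELECT seats_left FROM section").fetchone()[0]
```

This sketch uses a single connection, so it shows atomicity only; concurrency control among several simultaneous users is handled inside the DBMS and is not demonstrated here.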
Database Users
 Users may be divided into
 Those who actually use and control the
database content, and those who design,
develop and maintain database applications
(called “Actors on the Scene”)
 Those who design and develop the DBMS
software and related tools, and the computer
systems operators (called “Workers Behind the
Scene”).
Database Users
 Actors on the scene
 Database administrators:
• Responsible for authorizing access to the database, for
coordinating and monitoring its use, acquiring software and
hardware resources, controlling its use and monitoring
efficiency of operations.
 Database Designers:
• Responsible for defining the content, the structure, the
constraints, and functions or transactions against the
database. They must communicate with the end-users and
understand their needs.
Advantages of Using the
Database Approach
 Controlling redundancy in data storage and in
development and maintenance efforts.
 Restricting unauthorized access to data.
 Providing Storage Structures (e.g. indexes) for
efficient Query Processing
 Providing persistent storage for program Objects
Advantages of Using the
Database Approach (continued)
 Providing backup and recovery services.
 Providing multiple interfaces to different classes
of users.
 Representing complex relationships among data.
 Enforcing integrity constraints on the database.
 Permitting inferences and actions using deductive
and active rules
File systems vs a DBMS
 Scenario
 A university wishes to store a large collection of data on
faculty, departments, students, and so on
 The data will be accessed by different faculty,
departments, students for various reasons
Storing the data in files (cont.)
 Goal
 Find the names of all faculty in a department
 Solution
 Write a procedure to search files for matching
faculty
 Problems
 The query is hard-coded and requires a program
 New queries require re-programming
 No theory for query optimization
Storing the data in files (cont.)
 Goal: Data should be free of inconsistencies
 A student cannot get both B and A in the same
course
 Solution
 make sure procedures verify data is valid
 Problems:
 integrity constraints cannot be found or changed
without re-programming
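By contrast, a DBMS lets the rule be declared with the data. A minimal sketch, assuming SQLite via Python's sqlite3 module; the transcript table, its columns, and the sample rows are hypothetical. The composite primary key allows at most one grade per student per course, so a second, conflicting grade is rejected by the DBMS itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transcript (
        sid   TEXT NOT NULL,
        cid   TEXT NOT NULL,
        grade TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),
        PRIMARY KEY (sid, cid)   -- at most one grade per student per course
    )
""")
conn.execute("INSERT INTO transcript VALUES ('s1', 'CS4222', 'B')")

try:
    # A second grade for the same student and course violates the key.
    conn.execute("INSERT INTO transcript VALUES ('s1', 'CS4222', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute(
    "SELECT grade FROM transcript WHERE sid = 's1' AND cid = 'CS4222'"
).fetchall()
```

The constraint is part of the table definition, so it can be read from the catalog and changed with DDL rather than by editing application procedures.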
Storing the data in files (cont.)
 Goal
 Data should be secure from unauthorized users
 Solution
 use file access permissions
 Problems:
 host OS may not provide any file access restrictions
 file access permission provided by host operating
system will be inadequate
Database solution
 Create an application to which we input:
 what our data will look like
 what constraints apply to the data
 allows us to pose arbitrary queries
 allows us to specify access controls
 monitors modification of data
 The database is generic
 works with anyone’s data or queries, at run time
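A minimal sketch of "arbitrary queries at run time", assuming SQLite via Python's sqlite3 module. The faculty table, its dept column, the run_query helper, and the sample rows are hypothetical. The point is that each new question is just a new query string, not a new search procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faculty (fid TEXT PRIMARY KEY, fname TEXT, dept TEXT, sal REAL)")
conn.executemany(
    "INSERT INTO faculty VALUES (?, ?, ?, ?)",
    [
        ("f1", "Ada", "CS", 90000.0),
        ("f2", "Grace", "CS", 95000.0),
        ("f3", "Emmy", "MATH", 88000.0),
    ],
)

def run_query(sql, params=()):
    """Run any query supplied at run time and return all result rows."""
    return conn.execute(sql, params).fetchall()

# The original goal: names of all faculty in a department.
cs_names = run_query("SELECT fname FROM faculty WHERE dept = ? ORDER BY fname", ("CS",))

# A different question needs no re-programming, only a different query.
dept_counts = run_query("SELECT dept, COUNT(*) FROM faculty GROUP BY dept ORDER BY dept")
```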
Describing and Storing Data in a
DBMS
 data model
 Data can be represented in different ways
• Trees, graphs, tables, etc.
 A data model is a way to represent data
 relational data model
 using tables to represent data
Relational data model

 Relation
 A table with rows and columns.
 Schema
 Conceptual schema (logical schema)
• Describes the columns, or fields of relations
 Physical schema
• File organizations
• Indices
 External schema
• A collection of views from the conceptual schema
Levels of Abstraction
[Figure: View 1, View 2 and View 3 sit above the Conceptual Schema, which sits above the Physical Schema, which maps to disk.]
 Schemas are defined using DDL; data is modified/queried using DML.
Example: University Database
 Conceptual schema:
 Courses(cid:string, cname:string, credits:integer)
 Faculty(fid:string, fname:string, sal:real)
 Teaches(cid:string, fid:string)
 Physical schema:
 Relations stored as unordered files.
 Index on first column of Students.
 External Schema (View):
 Course_info(cid:string, fname:string)
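The conceptual and external schemas above can be sketched in DDL, here run through Python's sqlite3 module (SQLite is an assumption; the slides do not name a DBMS). The slide does not say how Course_info is derived, so the join of Teaches and Faculty below is an assumption, and the sample rows are hypothetical. The physical schema is left to the DBMS and is not modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations from the slide.
    CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
    CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
    CREATE TABLE Teaches (cid TEXT REFERENCES Courses(cid),
                          fid TEXT REFERENCES Faculty(fid));

    -- External schema: a view exposing only course id and instructor name
    -- (assumed definition; salaries stay hidden from users of the view).
    CREATE VIEW Course_info AS
        SELECT T.cid, F.fname
        FROM Teaches T JOIN Faculty F ON F.fid = T.fid;

    -- Hypothetical sample data.
    INSERT INTO Courses VALUES ('CS4222', 'Databases', 3);
    INSERT INTO Faculty VALUES ('f1', 'Smith', 90000.0);
    INSERT INTO Teaches VALUES ('CS4222', 'f1');
""")

course_info = conn.execute("SELECT cid, fname FROM Course_info").fetchall()
```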
Data Independence
 Applications are insulated from how data is structured
and stored.
 Logical data independence
 Protection from changes in logical structure of data.
 Physical data independence
 Protection from changes in physical structure of data.
 One of the most important benefits of using a DBMS!
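Physical data independence can be sketched as follows, assuming SQLite via Python's sqlite3 module; the index name, the plan helper, and the sample rows are hypothetical. An index is added to the physical schema, the application query text stays the same, and it returns the same rows. Only the access plan chosen by the DBMS may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL)")
conn.executemany(
    "INSERT INTO Faculty VALUES (?, ?, ?)",
    [("f1", "Smith", 90000.0), ("f2", "Lee", 95000.0), ("f3", "Khan", 88000.0)],
)

QUERY = "SELECT fid FROM Faculty WHERE fname = ?"  # the application's query

def plan(sql, params):
    """Return the human-readable steps of SQLite's plan for a query."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

before = conn.execute(QUERY, ("Lee",)).fetchall()
plan_before = plan(QUERY, ("Lee",))

# Change the physical schema only: add an index on fname.
conn.execute("CREATE INDEX idx_faculty_fname ON Faculty (fname)")

after = conn.execute(QUERY, ("Lee",)).fetchall()
plan_after = plan(QUERY, ("Lee",))

print(plan_before)
print(plan_after)
```

The exact plan text depends on the SQLite version, so it is printed for inspection rather than relied on.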
Logical data independence example
 Suppose the logical schema has been changed, with Faculty split into:
 Faculty_public(fid:string, fname:string, office:string)
 Faculty(fid:string, sal:real)
 Need to change the definition of the view
Course_info
 The applications that use Course_info remain the
same
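A minimal sketch of this schema change, assuming SQLite via Python's sqlite3 module. The view definitions, the migration steps, and the sample rows are assumptions made for illustration. The application's query against Course_info is a fixed string that is never edited; only the view definition is replaced.

```python
import sqlite3

# The application only ever issues this query; it is never changed.
APP_QUERY = "SELECT cid, fname FROM Course_info ORDER BY cid"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
    CREATE TABLE Teaches (cid TEXT, fid TEXT);
    INSERT INTO Faculty VALUES ('f1', 'Smith', 90000.0);
    INSERT INTO Teaches VALUES ('CS4222', 'f1');
    CREATE VIEW Course_info AS
        SELECT T.cid, F.fname FROM Teaches T JOIN Faculty F ON F.fid = T.fid;
""")
before = conn.execute(APP_QUERY).fetchall()

# Logical schema change: split Faculty into Faculty_public and Faculty,
# then redefine the view over the new relations.
conn.executescript("""
    DROP VIEW Course_info;
    CREATE TABLE Faculty_public (fid TEXT PRIMARY KEY, fname TEXT, office TEXT);
    INSERT INTO Faculty_public SELECT fid, fname, NULL FROM Faculty;
    ALTER TABLE Faculty RENAME TO Faculty_old;
    CREATE TABLE Faculty (fid TEXT PRIMARY KEY, sal REAL);
    INSERT INTO Faculty SELECT fid, sal FROM Faculty_old;
    DROP TABLE Faculty_old;
    CREATE VIEW Course_info AS
        SELECT T.cid, P.fname
        FROM Teaches T JOIN Faculty_public P ON P.fid = T.fid;
""")
after = conn.execute(APP_QUERY).fetchall()
```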
Queries in a RDBMS
 Relational calculus
 A formal query language based on mathematical logic
 Relational algebra
 Another formal language based on a collection of
operators for manipulating relations
 Structured Query Language (SQL)
 Data Manipulation language (DML)
 Data Definition Language (DDL)
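To give a feel for relational algebra as "operators for manipulating relations", here is a toy sketch in plain Python, with relations modeled as lists of dicts. The function names select, project and natural_join, and the sample tuples, are hypothetical; a real DBMS implements these operators very differently. A roughly equivalent SQL query is shown in a comment.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out

faculty = [
    {"fid": "f1", "fname": "Smith", "sal": 90000.0},
    {"fid": "f2", "fname": "Lee", "sal": 95000.0},
]
teaches = [
    {"cid": "CS4222", "fid": "f1"},
    {"cid": "CS1010", "fid": "f2"},
]

# Names of faculty who teach CS4222:
#   project[fname]( select[cid = 'CS4222']( Teaches join Faculty ) )
# Roughly equivalent SQL:
#   SELECT DISTINCT F.fname FROM Teaches T JOIN Faculty F ON F.fid = T.fid
#   WHERE T.cid = 'CS4222'
result = project(
    select(natural_join(teaches, faculty), lambda t: t["cid"] == "CS4222"),
    ["fname"],
)
```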
What will be covered
 Database design
 Data modeling
• Entity-Relationship (ER) model
• ER diagram → relational model
 Normalization
 Database Implementation
 Data Definition Language (DDL)
 Authorization
 Integrity constraints
 Stored procedures and triggers
What will be covered
 Query Languages
 Relational calculus and algebra
 DML
• queries, subqueries, nested queries, joins
 Application Development
 Embedded SQL
 JDBC