Database

DATABASES
1
Conventional Files versus the
Database
File – a collection of similar records.
Files are unrelated to each other except
in the code of an application program.
Data storage is built around the
applications that use the files.
Database – a collection of interrelated files
 Records in one file (or table) are
physically related to records in another
file (or table).
Applications are built around the
integrated database
2
Files Versus Database
3
Pros and Cons of Conventional Files
Pros
Easy to design because of their single-application focus
Excellent performance due to optimized
organization for a single application
4
Cons
Harder to adapt to sharing across applications
Harder to adapt to new requirements
Need to duplicate attributes in several
files.
5
Pros and Cons of Databases
Pros
Data independence from applications
increases adaptability and flexibility
Superior scalability
Ability to share data across applications
Reduced, controlled redundancy (total
non-redundancy is not achievable)
6
Cons
More complex than file technology
Somewhat slower performance
Investment in DBMS and database experts
Need to adhere to design principles to
realize benefits
Increased vulnerability due to consolidating
data in a centralized database
7
Data is stored in some combination of:
Conventional files
Operational databases – databases
that support day-to-day
operations and transactions for an
information system. Also called
transactional databases.
8
Data warehouses – databases
that store data extracted from
operational databases.
To support data mining
Personal databases
Work group databases
9
A Modern Data Architecture
10
Data administrator – a database specialist
responsible for data planning, definition,
architecture, and management.
Database administrator – a specialist
responsible for database technology,
database design, construction, security,
backup and recovery, and performance
tuning.
A database administrator will
administer one or more databases.
11
Why Use A Database?
Data overload is a common problem in
business today. Corporations and
individuals have plenty of raw data, but
can't always find it or aren't aware that
they even have it. Raw data must be
filtered and organized to become useful
information. Databases are a primary tool
for the task; a tool which takes advantage
of the speed and power of modern
computers.
12
Why Design a Database?
Goal:
 To produce an information system that adds
value for the user
Reduce costs
Increase sales/revenue
Provide competitive advantage
Objective:
To understand the system
To improve it
To communicate with users and IT staff
13
Requirements Collection and Analysis
This task results in a concise set of user
requirements, which should be detailed and
complete.
The functional requirements should be
specified, as well as the data requirements.
Functional requirements consist of user
operations that will be applied to the database,
including retrievals and updates.
Functional requirements can be documented
using diagrams such as sequence diagrams,
data flow diagrams etc.
14
Designing Systems
Designs are a model of existing & proposed systems:
 They provide a picture or representation of reality
 They are a simplification
 Someone should be able to read your design
(model) and describe the features of the actual
system.
You build models by talking with the users
 Identify processes
 Identify objects
 Determine current problems and future needs
 Collect user documents (views)
Break complex systems into pieces and levels
15
Conceptual Design
 Once the requirements are collected and analyzed,
the designers go about creating the conceptual
schema (model).
 Conceptual schema: concise description of data
requirements of the users, and includes a detailed
description of the entity types, relationships and
constraints.
 The concepts do not include implementation details;
therefore the end users easily understand them, and
they can be used as a communication tool.
 The conceptual schema is used to ensure all user
requirements are met, and they do not conflict.
16
Entity Relationship (ER) Model
The most popular high-level conceptual data
model is the ER model. It is frequently used for
the conceptual design of database applications.
The diagrammatic notation associated with the
ER model is referred to as the ER diagram.
ER diagrams show the basic data structures
and constraints.
17
Entity Relationship (ER) cont…
The basic object of an ER diagram is the entity.
An entity represents a ‘thing’ in the real world.
An entity may be a physical object,
such as a student, a house, or a product, or
a conceptual object, such as a company, a job
position, or a course.
Entities have attributes, which basically are the
properties/characteristics of a particular entity.
18
Entity Relationship (ER) cont…
Entity: Car
Attribute    Value
Color        Red
Make         Volkswagen
Model        Bora
Year         2000
19
KEY
An important constraint on entities of an entity
type is the uniqueness constraint.
A key attribute is an attribute whose values are
distinct for each individual entity in the entity
set.
The values of the key attribute can be used to
identify each entity uniquely.
Sometimes a key can consist of several
attributes together, where the combination of
attributes is unique for a given entity. This is
called a composite key.
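The uniqueness constraint can be tried out with a small sketch in Python, using the standard sqlite3 module. The table and column names (Car.PlateNumber, Enrollment.StudentID, Enrollment.CourseID) are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute key: PlateNumber (a hypothetical attribute) identifies each car.
conn.execute("CREATE TABLE Car (PlateNumber TEXT PRIMARY KEY, Make TEXT, Model TEXT)")

# Composite key: neither StudentID nor CourseID is unique alone, but the pair is.
conn.execute(
    "CREATE TABLE Enrollment ("
    " StudentID INTEGER, CourseID TEXT, Grade TEXT,"
    " PRIMARY KEY (StudentID, CourseID))"
)


def try_insert(sql, params):
    """Return True if the row was stored, False if a key constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


car_ok = try_insert("INSERT INTO Car VALUES (?, ?, ?)", ("ABC123", "Volkswagen", "Bora"))
car_dup = try_insert("INSERT INTO Car VALUES (?, ?, ?)", ("ABC123", "Volkswagen", "Golf"))

enrol_ok = try_insert("INSERT INTO Enrollment VALUES (?, ?, ?)", (1, "DB101", "A"))
enrol_other = try_insert("INSERT INTO Enrollment VALUES (?, ?, ?)", (1, "CS102", "B"))
enrol_dup = try_insert("INSERT INTO Enrollment VALUES (?, ?, ?)", (1, "DB101", "C"))
```

The second car is rejected because its key value repeats. The same student may enrol in a second course, but the same (StudentID, CourseID) pair cannot be stored twice.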
20
Relationships
Each time an attribute of one entity type refers
to another entity type, some relationship exists.
In ER diagrams, these references should be
represented as relationships, rather than
attributes.
Relationships between entities are represented
using a diamond shape.
21
Relationships….
Employee --- Works for --- Department
22
Summary of ER, EER Diagram Notation

[Figure: notation symbols for strong entities, weak entities, attributes, multi-valued attributes, composite attributes, and relationships]
23
Constraints
1:N – One customer buys many products,
each product is purchased by only one
customer.
Customer (1) --- Purchases --- (N) Product
N:1 – Each customer buys at most one
product, each product can be purchased
by many customers.
Customer (N) --- Purchases --- (1) Product
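One common way to store a 1:N relationship is to put a reference to the "one" side in the table on the "many" side. The sketch below shows the first case (one customer, many products) in Python with sqlite3. The column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
# Each product row holds a single CustomerID, so a product has at most one buyer,
# while any number of product rows may point at the same customer.
conn.execute(
    "CREATE TABLE Product ("
    " ProductID INTEGER PRIMARY KEY, Name TEXT,"
    " CustomerID INTEGER REFERENCES Customer(CustomerID))"
)

conn.execute("INSERT INTO Customer VALUES (1, 'Jones')")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [(10, "Wheel", 1), (11, "Mirror", 1)],
)

products_for_customer = conn.execute(
    "SELECT COUNT(*) FROM Product WHERE CustomerID = 1"
).fetchone()[0]
```

For the N:1 case the reference would move to the other table: Customer would carry a single ProductID.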
24
ER DIAGRAM – Entity Types are:
EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT
25
COMPANY ER Schema Diagram
using (min, max) notation
26
Transforming an Entity Type to a
Relation
27
Figure Representing a 1:N
Relationship
28
CLASS DIAGRAMS
Class: Description of an entity, that includes
its attributes (properties) and behavior
(methods).
Object: One instance of a class with specific
data.
Property: A characteristic or description of a
class or entity.
Method: A function that is performed by the
class.
Association: A relationship between two or
more classes.
29
Entities/Classes
30
Association Example
Employee
EmployeeID  Name
11          Joe Jones
12          Maria Rio

Product
ProductID  Type  Name
A3222      X32   Corvette
A5411      B17   Camaro
…          …     …

Component
CompID  Type  Name
563     W32   Wheel
872     M15   Mirror
882     H32   Door hinge
883     H33   Trunk hinge
888     T54   Trunk handle

Assembly
EmployeeID  CompID  ProductID
11          563     A3222
11          872     A3222
11          563     A5411
11          872     A5411
12          563     A3222
12          882     A3222
12          888     A3222
12          883     A5411

Employee, Product, and Component each have a one-to-many (1 to *) association with Assembly.
Multiplicity is defined as the number of items
that could appear if the other N-1 objects
are fixed. Almost always “many.”
31
Example of relationships
[Figure: class diagram with four classes and their associations]
Customer: CustomerID, Phone, FirstName, LastName, Address, ZipCode, CityID, BalanceDue
Bicycle::Bicycle: BicycleID, …, CustomerID, StoreID, …
Customer Transaction: CustomerID, TransactionDate, EmployeeID, Amount, Description, Reference
Retail Store: StoreID, StoreName, Phone, ContactFirstName, ContactLastName, Address, ZipCode, CityID
Multiplicities shown on the associations: 1…1, 0…*, and 0…1.
32
[Figure: class diagram of a bicycle business database. Classes named in the diagram include Customer, Bicycle, BicycleTube, BikeTubes, BikeParts, CustomerTrans, RetailStore, StateTaxRate, ModelType, ModelSize, Paint, LetterStyle, Employee, TubeMaterial, Component, ComponentName, CompGroup, GroupCompon, Groupo, PurchaseOrder, PurchaseItem, Manufacturer, and ManufacturerTrans, each shown with its list of attributes.]
33
Planning and analysis
Data modeling is preceded by planning
and analysis.
The effort devoted to this stage is
proportional to the scope of the database.
The planning and analysis of a database
intended to serve the needs of an
enterprise will require more effort than one
intended to serve a small workgroup.
34
An accurate and up-to-date data
model can serve as an important
reference tool for DBAs,
developers, and other members
of a JAD (joint application
development) team.
35
By building quality into the project, the
team reduces the overall time it takes to
complete the project, which in turn
reduces project development costs.
An effective data model completely and
accurately represents the data
requirements of the end users. It is simple
enough to be understood by the end user
yet detailed enough to be used by a
database designer to build the database.
36
The model eliminates redundant
data, is independent of any
hardware and software constraints,
and can be adapted to changing
requirements with a minimum of
effort.
37
Data modeling is a bottom-up
process. A basic model, representing
entities and relationships, is
developed first. Then detail is added
to the model by including information
about attributes and business rules.
The information needed to build a
data model is gathered during the
requirements analysis.
38
The requirements analysis is usually done
at the same time as the data modeling.
As information is collected, data objects
are identified and classified as either
entities, attributes, or relationships;
assigned names; and defined using terms
familiar to the end-users. The objects are
then modeled and analyzed using an ER
or class diagram.
39
The diagram can be reviewed to
determine its completeness and
accuracy, and/or modified.
The review and edit cycle continues
until the model is certified as correct.
40
Points to note
a) Talk to the end users about their data
in "real-world" terms. Users do not
think in terms of entities, attributes,
and relationships but about the
actual people, things, and activities
they deal with daily.
41
b) Take the time to learn the basics about
the organization and its activities that
you want to model. Having an
understanding about the processes will
make it easier to build the model.
c) End-users typically think about and view
data in different ways according to their
function within an organization.
Therefore, it is important to interview as
many people as time permits.
42
What makes an object an entity or
attribute?
For example, given the statement
"employees work on projects", should
employees be classified as an entity or an
attribute? Very often, the correct answer
depends upon the requirements of the
database. In some cases, employee
would be an entity; in others it would be an
attribute.
43
Some commonly given guidelines are:
entities contain descriptive information
attributes either identify or describe
entities
relationships are associations
between entities
44
Achieving a Well-Designed
Database
A table should have an identifier.
A table should store only data for a single
type of entity.
A table should avoid nullable columns.
A table should not have repeating values
or columns.
45
Some Common Database Design
Mistakes
1. Poor design/planning
2. Ignoring normalization
3. Poor naming standards
4. Lack of documentation
5. Lack of testing
46
1. Poor Design/Planning
"If you don't know where you are going,
any road will take you there" –
George Harrison
47
2. Ignoring Normalization
Normalization defines a set of methods
to break down tables to their constituent
parts until each table represents one
and only one "thing", and its columns
serve to fully describe only the one
"thing" that the table represents.
48
Normalization
 Normalization is a database design
approach that seeks the following
four objectives:
i. minimization of data redundancy,
ii. minimization of data restructuring,
iii. minimization of I/O by reduction of
transaction sizes, and
iv. enforcement of referential integrity.
49
Normalization….
Consider an example Customer table
that also stores payment details.
A payment does not
describe a Customer
and should not be stored
in the Customer table.
Details of payments
should be stored in a
Payment table, in which
you could also record
extra information about
the payment, like when
the payment was made,
and what the payment
was for.
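A minimal sketch of that split, in Python with sqlite3, is given here. The column names (PaidOn, Amount, Purpose) and the sample rows are assumptions made for illustration; the slides only say that the Payment table can record when a payment was made and what it was for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Customer holds only facts that describe a customer.
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    -- Payment holds one row per payment and points back to its customer.
    CREATE TABLE Payment (
        PaymentID  INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
        PaidOn     TEXT NOT NULL,   -- when the payment was made
        Amount     REAL NOT NULL,
        Purpose    TEXT             -- what the payment was for
    );
    INSERT INTO Customer VALUES (1, 'Jones');
    INSERT INTO Payment VALUES (1, 1, '2024-01-05', 200.0, 'Invoice 17');
    INSERT INTO Payment VALUES (2, 1, '2024-02-05', 150.0, 'Invoice 23');
    """
)

customer_rows = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
total_paid = conn.execute(
    "SELECT SUM(Amount) FROM Payment WHERE CustomerID = 1"
).fetchone()[0]
```

The customer is stored once however many payments arrive, and new payments never require new columns in the Customer table.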
50
3. Poor naming standards
Consistency. The names you choose
are not just to enable you to identify the
purpose of an object, but to allow all
future programmers, users, and so on
to quickly and easily understand how a
component part of your database was
intended to be used, and what data it
stores.
51
Poor naming standards ……
Present to the users clear, simple,
descriptive names, such as
Customer and Address.
Avoid names such as:
- colVarcharAddress
- X304_DSCR
These mean nothing to the user.
The usage of dashes, spaces, digits
and special characters is discouraged.
52
4. Lack of Documentation
Poorly documented code is a
synonym for "job security."
Your goal should be to provide
enough information that when you
turn the database over to a support
programmer, they can figure out your
minor bugs and fix them.
53
Lack of Documentation…..
In many cases, you may want to
include sample values, where the
need arose for the object, and
anything else that you may want to
know in a year or two when "future
you" has to go back and make
changes to the code.
54
5. Lack of Testing
A proper test plan takes into
consideration all possible types of
failures, codes them into an
automated test, and tries them over
and over.
Good testing won't find all of the
bugs, but it will get you to the point
where most of the issues that
correspond to the original design are
ironed out.
55
DATABASE SECURITY
SECURITY CONCERNS
AND MEASURES
56
Database Integrity
Database integrity ensures that data entered
into the database is accurate, valid, and
consistent. Any applicable integrity constraints
and data validation rules must be satisfied
before permitting a change to the database.
Business applications have several similar
problems such as:
Multiple users trying to change the same data
Multiple changes need to be made concurrently
57
Database Integrity….
For example: A customer uses the ATM and
instructs it to transfer 20,000 shillings from the
savings account to the current account. This
transaction requires two steps:
1) subtracting money from the savings account
2) adding money to the current account
These are two updates or SQL statements. If
the system crashes in between, the customer
could lose their money.
58
Database Integrity…..
How does the computer know that both
operations must be completed at the same
time? As an application developer, you must
tell the computer system what operations
belong to a transaction.
You do this by marking the start and the end of
all transactions inside the code. This would
ensure that all the updates complete together or
fail together
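The ATM transfer can be sketched in Python with sqlite3, where a `with conn:` block marks the start and end of the transaction. The Account table, the opening balances, and the fail_midway switch that simulates a crash are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Name TEXT PRIMARY KEY, Balance INTEGER)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)",
    [("savings", 50000), ("current", 10000)],
)
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from savings to current as one transaction."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Name = 'savings'",
                (amount,),
            )
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Name = 'current'",
                (amount,),
            )
        return True
    except RuntimeError:
        return False


def balances():
    return dict(conn.execute("SELECT Name, Balance FROM Account"))


failed = transfer(20000, fail_midway=True)
after_failure = balances()   # the first update was rolled back
succeeded = transfer(20000)
after_success = balances()
```

When the simulated crash happens after the first update, the rollback restores the savings balance, so the customer's money is never lost between the two steps.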
59
Database Integrity……
Concurrent access can also be
problematic. An example is if two people
try to change the same data at the same
time. Some data could be overwritten and
lost. One solution is to prevent concurrent
access by forcing transactions to be
completely isolated.
60
Concurrent Access
Concurrent access: multiple users or processes changing the same data at the same time. Final data will be wrong!
Force sequential: locking; delayed, batch updates
Two processes: Receive payment ($200) and Place new order ($150)
Initial balance $800
Result should be $800 - $200 + $150 = $750
Interference result is either $600 or $950
Receive Payment: 1) Read balance (800)  2) Subtract pmt (-200)  4) Save new bal. (600)
Place New Order: 3) Read balance (800)  5) Add order (150)  6) Write balance (950)
Customers table: ID Jones, Balance $800, then $600, then $950
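The six numbered steps can be replayed in plain Python to see why the balance ends up wrong. This is only a simulation of the interleaving with local variables, not real concurrent code; the function names are made up for illustration.

```python
def interleaved(initial, payment, order):
    """Replay steps 1 to 6 in the order shown, with both processes reading 800."""
    balance = initial
    read_by_payment = balance                  # 1) Receive Payment reads balance
    after_payment = read_by_payment - payment  # 2) subtract payment
    read_by_order = balance                    # 3) Place New Order reads the old balance
    balance = after_payment                    # 4) Receive Payment saves 600
    after_order = read_by_order + order        # 5) add order to the stale value
    balance = after_order                      # 6) Place New Order overwrites with 950
    return balance


def sequential(initial, payment, order):
    """Run the two processes one after the other."""
    balance = initial
    balance = balance - payment
    balance = balance + order
    return balance


wrong = interleaved(800, 200, 150)
right = sequential(800, 200, 150)
```

With the interleaving shown, the payment is lost and the balance is 950. If the two writes happen in the other order, the order is lost and the balance is 600. Only the sequential run gives 750.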
61
Pessimistic Locks: Serialization
One answer to concurrent access is to prevent it.
When a transaction needs to alter data, it places
a SERIALIZABLE lock on the data used, so no
other transactions can even read the data until
the first transaction is completed.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ WRITE
Receive Payment: 1) Read balance (800)  2) Subtract pmt (-200)  4) Save new bal. (600)
Place New Order: 3) Read balance: receive error message that it is locked.
Customers table: ID Jones, Balance $800, then $600
62
Database Integrity
The concept of integrity is fundamental to
databases. One of the strengths of the
database approach is that the DBMS has
tools to handle the common problems. In
terms of transactions, many of these
concepts can be summarized in the
acronym ACID. The following figure shows
the meaning of the term.
63
ACID Transactions
 Atomicity: all changes succeed or fail together.
 Consistency: all data remain internally
consistent (when committed) and can be
validated by application checks.
 Isolation: The system gives each transaction
the perception that it is running in isolation.
There are no concurrent access issues.
 Durability: When a transaction is committed, all
changes are permanently saved even if there is
a hardware or system failure.
64
Referential Integrity
 Referential integrity is a property of data that
applies (or fails to apply) to a database as a whole.
In this sense, referential integrity means that in the
database as a whole, things are set up in such a
way that if a column exists in two or more tables in
the database (typically as a primary key in one
table and as a foreign key in one or more other
tables), then any change to a value in that column
in any one table will be reflected in corresponding
changes to that value where it occurs in other
tables. This means that the RDBMS must be set
up so as to take appropriate actions to spread a
change—in one table—from that table to the other
tables where the change must also occur.
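A small sketch of this behaviour in Python with sqlite3 follows. It uses a Supplier and PurchaseOrder pair like the one on the Separation of Duties slide; the new key value 773 and the ON UPDATE CASCADE choice are assumptions made for illustration. The sketch also shows the other half of referential integrity: a foreign key value with no matching primary key is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.execute("CREATE TABLE Supplier (SupplierID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE PurchaseOrder ("
    " OrderID INTEGER PRIMARY KEY,"
    " SupplierID INTEGER NOT NULL"
    "   REFERENCES Supplier(SupplierID) ON UPDATE CASCADE)"
)

conn.execute("INSERT INTO Supplier VALUES (772, 'Basic Tools')")
conn.execute("INSERT INTO PurchaseOrder VALUES (8882, 772)")

# Changing the primary key spreads to the foreign key in the other table.
conn.execute("UPDATE Supplier SET SupplierID = 773 WHERE SupplierID = 772")
cascaded_id = conn.execute(
    "SELECT SupplierID FROM PurchaseOrder WHERE OrderID = 8882"
).fetchone()[0]

# An order that points at a supplier that does not exist is refused.
try:
    conn.execute("INSERT INTO PurchaseOrder VALUES (8895, 9)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False
```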
65
Database Security
The major technical areas of computer security
are usually represented by the initials CIA:
confidentiality, integrity, and authentication or
availability. Confidentiality means that
information cannot be accessed by unauthorized
parties. Confidentiality is also known as
secrecy or privacy. Integrity means that
information is protected against unauthorized
changes that are not detectable to authorized
users; Authentication means that users are
who they claim to be. Availability means that
resources are accessible by authorized
parties.
66
Database Security
Database security is the system,
processes, and procedures that protect a
database from unintended activity.
Unintended activity can be categorized as
authenticated misuse, malicious attacks or
inadvertent mistakes made by authorized
individuals or processes. Database
security is also a specialty within the
broader discipline of computer security
67
Database Security cont….
Traditionally databases have been protected
from external connections by firewalls on the
network perimeter with the database
environment existing on the internal network.
Additional network security devices that detect
and alert on malicious database protocol traffic
include network intrusion detection systems
along with host-based intrusion detection
systems.
Database security is more critical as networks
have become more open.
68
Firewalls
A firewall is a part of a computer system or
network that is designed to block
unauthorized access while permitting
authorized communications. It is a device
or set of devices that is configured to
permit or deny network transmissions
based upon a set of rules and other
criteria.
69
Firewall Cont….
Firewalls can be implemented in either
hardware or software, or a combination of
both. Firewalls are frequently used to
prevent unauthorized Internet users from
accessing private networks connected to
the Internet, especially intranets. All
messages entering or leaving the intranet
pass through the firewall, which inspects
each message and blocks those that do
not meet the specified security criteria.
70
Firewall
71
Vulnerability Assessments
An important procedure when evaluating
database security is performing vulnerability
assessments against the database. A
vulnerability assessment attempts to find
vulnerability holes that could be used to break
into the database. Database administrators or
information security administrators run
vulnerability scans on databases to discover a
breach of controls, along with known
vulnerabilities within the database software.
The results of the scans should be used to
harden the database in order to mitigate the
threat of compromise by intruders.
72
Database Security Cont…
A database security program should
include the regular review of permissions
granted to individually owned accounts
and accounts used by automated
processes. The accounts used by
automated processes should have
appropriate controls around password
storage such as sufficient encryption and
access controls to reduce the risk of
compromise
73
Database Security cont…
In conjunction with a sound database
security program, an appropriate disaster
recovery program should exist to ensure
that service is not interrupted during a
security incident or any other incident that
results in an outage of the primary
database environment. An example is that
of replication for the primary databases to
sites located in different geographical
regions.
74
Database Security cont…
Native database audit capabilities are also
available for many database platforms.
The native audit trails are extracted on a
regular basis and transferred to a
designated security system where the
database administrators do not have
access. This ensures a certain level of
segregation of duties that may provide
evidence that the native audit trails were
not modified by authenticated administrators.
75
Database Forensics
A forensic examination of a database may
relate to the timestamps that apply to the
update time of a row in a relational table being
inspected and tested for validity in order to
verify the actions of a database user.
Alternatively, a forensic examination may focus
on identifying transactions within a database
system or application that indicate evidence of
wrong doing, such as fraud. The forensic study
of relational databases requires a knowledge of
the standard used to encode data on the
76
Physical Security
Hardware
  Preventing problems
    Fire prevention
    Site considerations
    Building design
  Hardware backup facilities
    Continuous backup (mirror sites)
    Hot sites
    Shell sites
    “Sister” agreements
  Telecommunication systems
  Personal computers
Data and software
  Backups
  Off-site backups
  Personal computers
    Policies and procedures
    Network backup
Disaster planning
  Write it down
  Train all new employees
  Test it once a year
  Telecommunications
Allowable time between disaster and business survival limits.
77
Threats
The primary threat to any company comes
from insiders. Employees must be trusted,
because in order for them to do their jobs
they need access to the computers and
the database. Once they are granted
access it becomes more difficult to control
what they do.
Another threat comes from programmers.
78
Threats…
One technique used by programmers is
to insert a time bomb in a program. A
time bomb requires a programmer to
enter a secret code every day. If the
programmer is sacked or leaves work
and cannot enter the code, the program
starts deleting files.
In other cases programmers have
deliberately created programs that alter
data or transfer money to their accounts.
79
Managerial Controls
“Insiders”
Hiring
Termination
Monitoring
Job segmentation
Physical access limitations
Locks
Guards and video monitoring
Badges and tracking
80
Managerial Controls…..
Consultants and Business
alliances
Limited data access
Limited physical access
Paired with employees
81
Logical Security
Unauthorized disclosure.
Unauthorized modification.
Unauthorized withholding.
Disclosure example: letting a competitor see the strategic marketing plans.
Modification example: letting employees change their salary numbers.
Withholding example: preventing a finance officer from retrieving data needed to get a bank loan.
82
Basic Security Ideas
Limit access to hardware
  Physical locks.
  Video monitoring.
  Fire and environment monitors.
  Employee logs / cards.
  Dial-back modems
Monitor usage
  Hardware logs.
  Access from network nodes.
  Software and data usage.
Background checks
  Employees
  Consultants
Dialback modem
  User calls modem
  Modem gets name, password
  Modem hangs up phone
  Modem calls back user
  Machine gets final
[Figure: dial-back sequence in five numbered steps between the user, the phone company, and a modem holding the list Jones 1111, Smith 2222, Olsen 3333, Araha 4444]
83
Separation of Duties
Supplier
SupplierID  Name  …
673         Acme Supply
772         Basic Tools
983         Common X

PurchaseOrder
OrderID  SupplierID
8882     772
8893     673
8895     009

Referential integrity links PurchaseOrder.SupplierID to the Supplier table.

Resource             Purchasing Manager              Purchasing Clerk
Supplier table       Select, Insert, Modify, Delete  Select
PurchaseOrder table  Select                          Select, Insert, Modify, Delete
PurchaseItem table

Purchasing manager can add new suppliers, but cannot add new orders.
Clerk must use SupplierID from the Supplier table, and cannot add a new supplier.
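The permission split can be written down as a small lookup table. The sketch below is plain Python rather than DBMS privileges, and the role names, action names, and is_allowed helper are assumptions made for illustration. Only the Supplier and PurchaseOrder rows are modelled.

```python
FULL_ACCESS = {"select", "insert", "modify", "delete"}
READ_ONLY = {"select"}

# Which actions each role may perform on each table.
PERMISSIONS = {
    "PurchasingManager": {"Supplier": FULL_ACCESS, "PurchaseOrder": READ_ONLY},
    "PurchasingClerk": {"Supplier": READ_ONLY, "PurchaseOrder": FULL_ACCESS},
}


def is_allowed(role, table, action):
    """Return True if the role holds the given permission on the table."""
    return action in PERMISSIONS.get(role, {}).get(table, set())


manager_adds_supplier = is_allowed("PurchasingManager", "Supplier", "insert")
manager_adds_order = is_allowed("PurchasingManager", "PurchaseOrder", "insert")
clerk_adds_order = is_allowed("PurchasingClerk", "PurchaseOrder", "insert")
clerk_adds_supplier = is_allowed("PurchasingClerk", "Supplier", "insert")
```

Because no single role can both create a supplier and place an order with it, paying a fictitious supplier would take two people acting together.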
84
Encryption
Protection for open transmissions
  Networks
  The Internet
  Weak operating systems
Single key (AES)
Dual key
  Protection
  Authentication
[Figure: single-key encryption, e.g., AES. Plain text message -> AES (Key: 9837362) -> Encrypted text -> AES (Key: 9837362) -> Plain text message]
85
Dual Key Encryption
[Figure: Bob sends a message to Alice. The message is encrypted with Bob’s private key and Alice’s public key (stages labeled Encrypt+T, Encrypt+M, Encrypt+T+M), transmitted, then decrypted with Alice’s private key and Bob’s public key.]
Public keys: Alice 29, Bob 17
Private keys: Alice 13, Bob 37
Using Bob’s private key ensures it came from him.
Using Alice’s public key means only she can read it.
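The two-key idea can be demonstrated with textbook RSA on very small numbers. This is a toy sketch only: the primes and exponents below are assumptions chosen so the arithmetic is easy to follow, they are not the key values from the figure, and numbers this small offer no real security. It needs Python 3.8 or later for the modular inverse form of pow().

```python
def make_keys(p, q, e):
    """Build a toy RSA key pair from two small primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # private exponent: the inverse of e modulo phi
    return (e, n), (d, n)      # (public key, private key)


def apply_key(value, key):
    """Encrypt or decrypt a number with one half of a key pair."""
    exponent, n = key
    return pow(value, exponent, n)


bob_public, bob_private = make_keys(61, 53, 17)      # modulus 3233
alice_public, alice_private = make_keys(89, 97, 5)   # modulus 8633, larger than Bob's

message = 65  # the message, encoded as a number smaller than Bob's modulus

# Bob: his private key proves origin, Alice's public key keeps it secret.
signed = apply_key(message, bob_private)
sealed = apply_key(signed, alice_public)

# Alice: her private key opens it, Bob's public key confirms it came from him.
opened = apply_key(sealed, alice_private)
recovered = apply_key(opened, bob_public)
```

Alice's modulus is deliberately larger than Bob's so that the signed value always fits inside her key. Real systems use standard libraries with padding and far larger keys.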
86
Backup and Recovery
Backups are crucial!
Offsite storage!
Scheduled backup.
  Regular intervals.
  Record time.
  Track backups.
Journals / logs
Checkpoint
Rollback / Roll forward
[Figure: a snapshot of the order table, followed by the changes recorded in the journal/log]
Snapshot
OrdID  Odate   Amount  ...
192    2/2/01  252.35  …
193    2/2/01  998.34  …
After the first logged change
OrdID  Odate   Amount  ...
192    2/2/01  252.35  …
193    2/2/01  998.34  …
194    2/2/01  77.23   ...
After the second logged change
OrdID  Odate   Amount  ...
192    2/2/01  252.35  …
193    2/2/01  998.34  …
194    2/2/01  77.23   …
195    2/2/01  101.52  …
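Rolling forward means starting from the last snapshot and replaying the journal. A minimal sketch in Python, using the order rows from the figure, is given here. The journal record layout (operation, order ID, row) and the roll_forward function are assumptions made for illustration.

```python
# Last backup: the order table as it was when the snapshot was taken.
snapshot = {
    192: ("2/2/01", 252.35),
    193: ("2/2/01", 998.34),
}

# Journal: every change made after the snapshot, in the order it happened.
journal = [
    ("insert", 194, ("2/2/01", 77.23)),
    ("insert", 195, ("2/2/01", 101.52)),
]


def roll_forward(snapshot, journal):
    """Rebuild the current table by replaying the journal over the snapshot."""
    orders = dict(snapshot)  # work on a copy so the backup itself is untouched
    for operation, order_id, row in journal:
        if operation == "insert":
            orders[order_id] = row
        elif operation == "delete":
            orders.pop(order_id, None)
    return orders


recovered = roll_forward(snapshot, journal)
```

Rollback works in the opposite direction: it undoes logged changes, typically those of transactions that never committed.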
87
Database Security
 Authorization, Access Control:
protect intranet from hordes: Firewalls
 Confidentiality, Data Integrity:
protect contents against snoopers: Encryption
 Authentication:
both parties prove identity before starting
transaction: Digital certificates
 Non-repudiation:
proof that the document originated from you and you
only: Digital signature
88
What can go wrong?
Security issues
Intruders
  Casual prying (read other people’s e-mail, documents, etc.)
  Snooping by insiders
  Determined attempt to make money
  Commercial or military espionage
  Simply for fun or to prove it can be done
How to deal with intruders
  Identify every user
  Advise users to log off when they leave their desk
  Limit the privileges of users
  Log files to monitor users’ activity
  Encryption
  Etc.
89
Insiders
What could some of the employees do?
Read other people’s emails
Attempt to read documents and access
information that is NOT intended for their
eyes
Commercial espionage
Install unauthorised software
90
Insiders…..
How to prevent all of the above?
Each employee should log in to the system
using a unique username / password
Advise all employees not to disclose their
password to anyone
Advise all employees to log off when they
leave their desk
Advise all employees to change their
password regularly
91
Insiders…..
Put in place a system that tracks
employees’ actions and network
resources accessed
Limit privileges of employees allowing
them to perform only authorised tasks
and obtain only authorised information
Encrypt or password protect all
confidential documents / data
Any other measures?
92
Outsiders
What could they do?
As a hobby, prove that “it can be done”
Commercial and military espionage
Access bank accounts
Access and use other people’s credit card
details
Shut down systems, etc.
93
Outsiders….
How to prevent outsiders gaining access to
resources
Identify every user of the system
Put in place a system that tracks users’ actions
and network resources accessed
Encrypt confidential documents / data
Put firewalls in place to protect the network
Keep all software and operating systems up to
date to prevent hackers from exploiting security holes
94
Have a security policy in place and
ENFORCE IT
Have clear guidelines as how security should
be implemented
Management has to make sure that all IT
technicians apply all the security measures
Management has to make sure that all
employees are aware of the security
measures and apply them
Technology used to implement security
guidelines
Sophisticated tools used to analyse,
interpret, configure and monitor the state of
the network security
95
Identify each user….
Install access control programs and physical
security devices on all systems. Access
control programs run extra checks on users
before allowing access. Physical security
devices include biometric scanning devices
fitted to a computer which check a user’s
face, retina, fingerprint, hand, voice, typing
rhythm, signature and so on against a set of
stored data for all legitimate users.
Make sure to delete the accounts of
employees no longer working for the company
96
Monitor the network
Security monitor
Test and monitor the state of the
network security
Technology used to monitor the network
Network log files that record
Who logged in, for how long, from
which computer, what resources they
have accessed, etc.
97
Monitor the network…..
Network vulnerability scanners
Antivirus software
Disaster recovery backup technology
Check security logs and audit trails
regularly
Regularly conduct a thorough risk analysis
of the network
Have a disaster recovery plan
98
Monitor and restrict access from outside
into the network
Monitor remote access into the network by
Allowing only a limited number of
attempts to log in
Block the account if all attempts to log in
are unsuccessful
Use log files to monitor the resources
accessed by remote users
Put firewalls in place before allowing
Internet access
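The rule of allowing a limited number of login attempts and then blocking the account can be sketched as follows. This is a minimal illustration, not a production design: the `LoginGuard` class, the limit of three attempts and the plain-text password table are assumptions made for the example (a real system would store password hashes and log every attempt).

```python
MAX_ATTEMPTS = 3  # assumed limit; the slides do not specify a number


class LoginGuard:
    """Hypothetical helper that blocks an account after repeated failures."""

    def __init__(self, passwords, max_attempts=MAX_ATTEMPTS):
        self.passwords = passwords      # user -> password (illustration only)
        self.max_attempts = max_attempts
        self.failures = {}              # user -> consecutive failed attempts
        self.blocked = set()            # accounts that are locked out

    def try_login(self, user, password):
        """Return 'ok', 'denied' or 'blocked' for one login attempt."""
        if user in self.blocked:
            return "blocked"
        if self.passwords.get(user) == password:
            self.failures[user] = 0     # a success resets the counter
            return "ok"
        self.failures[user] = self.failures.get(user, 0) + 1
        if self.failures[user] >= self.max_attempts:
            self.blocked.add(user)      # all allowed attempts used up
            return "blocked"
        return "denied"


guard = LoginGuard({"bob": "s3cret"})
results = [guard.try_login("bob", "wrong") for _ in range(3)]
after_block = guard.try_login("bob", "s3cret")
```

Note that once the account is blocked, even the correct password is refused until an administrator unblocks it.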
99
Database Security Summary
Stay aware of data security holes
Explore possible third-party options
Perform audit tests on your databases
regularly
Encryption of data in motion
Encryption of data at rest within the
database
100
Monitor your log files
Implement Intrusion Detection
P.S. Provide multiple levels of security
The data stored in a database is
managed by a Database
Management System (DBMS).
101
Data Warehousing
A data warehouse is a store in which
information is organized for quick
retrieval. The data is drawn from
different sources (usually databases)
that were set up for different purposes.
102
Differences to Traditional Database
Data is organized around major subjects
rather than individual transactions
Summarized data is used rather than
detailed data
Data is framed for long-term decision
making
They are organized for quick queries not
so much for efficient storage
103
Optimized for complex queries known
as OLAP (online analytical processing).
Allows managers to look at the data
along different dimensions
Allows easy access via data mining
software (siftware) that searches for
patterns and is able to identify
relationships
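To make "looking along different dimensions" concrete, the sketch below rolls up the same detailed sales rows by region, by product, or by region and month. The `sales` rows and the `roll_up` helper are invented for illustration; an OLAP tool performs this kind of aggregation at much larger scale.

```python
from collections import defaultdict

# Made-up detailed rows: (region, product, month, amount)
sales = [
    ("North", "milk", "2024-01", 120.0),
    ("North", "bread", "2024-01", 80.0),
    ("South", "milk", "2024-01", 50.0),
    ("North", "milk", "2024-02", 100.0),
    ("South", "bread", "2024-02", 70.0),
]

# Position of each dimension inside a row.
DIMENSIONS = {"region": 0, "product": 1, "month": 2}


def roll_up(rows, *dims):
    """Sum the amount column for every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)


by_region = roll_up(sales, "region")
by_product = roll_up(sales, "product")
by_region_month = roll_up(sales, "region", "month")
```

Each call returns summarized data rather than detailed data: the individual transactions are no longer visible in the result, only the totals per subject.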
104
Include multiple databases that have been
processed so that data is uniform (clean
data)
They include data from outside sources as
well as data generated internally
Building a warehouse is complex. An
analyst gathers information from a variety of
sources and translates it into a common form.
For example, one database could record gender
as "male" and "female", another as "M" and
"F", and a third as "0" and "1".
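The translation into a common form can be sketched as a simple lookup. The `GENDER_MAP` table and `clean_gender` function are hypothetical, and the meaning assigned to "0" and "1" is an assumption: the slides do not say which code stands for which value, so in practice the analyst has to confirm the coding of each source.

```python
# Map every source-specific code onto one common form ("M" / "F").
GENDER_MAP = {
    "male": "M", "female": "F",   # first source
    "m": "M", "f": "F",           # second source
    "0": "M", "1": "F",           # third source (assumed meaning of 0 and 1)
}


def clean_gender(value):
    """Return the common code for a raw value, or fail loudly if unknown."""
    key = str(value).strip().lower()
    if key not in GENDER_MAP:
        raise ValueError(f"unrecognized gender code: {value!r}")
    return GENDER_MAP[key]


raw_values = ["male", "F", 0, " Female ", "1"]
cleaned = [clean_gender(v) for v in raw_values]
```

Raising an error on unknown codes, rather than guessing, keeps dirty values from slipping silently into the warehouse.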
105
Once clean, the analyst has to decide how
to summarize data and predict the type of
queries that might be asked (details are
usually lost during summarization). The
warehouse is then designed both logically
and physically
Note: the analyst must know a lot about
the business.
Because of its size, a warehouse is
expensive
106
Data Mining
Data mining can identify patterns that
a human is unable to detect. Data
mining algorithms search data
warehouses for patterns. It is also
known as Knowledge Discovery in
Databases (KDD).
107
Software for Data Mining
Data mining tools, known as decision aids, include:
Statistical analysis software
Neural networks
Fuzzy networks
Intelligent agents
Logic and data visualization
108
Patterns that decision makers try to
identify include:
Associations: Patterns that occur
together at the same time. For example, a
person who buys milk usually buys bread.
Sequences: Actions that take place over
a period of time, e.g. if a family buys a
house this year, they will most likely buy a
fridge and cooker next year.
109
Clustering: A pattern that develops
among a group of people, e.g.
customers who live in a particular area
tend to buy a particular product.
Trends: Patterns that are noticed over
a period of time, e.g. customers may
move from buying processed food to
natural foods (herbal products) or
African attire.
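An association such as "a person who buys milk usually buys bread" is commonly measured with support (how often the items appear together) and confidence (how often the second item appears given the first). The sketch below computes both on a handful of invented baskets; the `baskets` data and the function names are illustrative assumptions, not taken from any particular data mining product.

```python
# Made-up shopping baskets, one set of items per purchase.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "tea"},
    {"bread", "jam"},
    {"milk", "bread", "jam"},
]


def count(items, baskets):
    """Number of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b)


def support(items, baskets):
    """Fraction of all baskets that contain the whole item set."""
    return count(items, baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Of the baskets containing `lhs`, the fraction that also contain `rhs`."""
    return count(set(lhs) | set(rhs), baskets) / count(lhs, baskets)


milk_bread_support = support({"milk", "bread"}, baskets)
milk_to_bread = confidence({"milk"}, {"bread"}, baskets)
```

Here milk and bread appear together in three of the five baskets, and three of the four milk baskets also contain bread, so the rule "milk implies bread" holds for most milk buyers in this toy data.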
110
Data mining is also used to target
customers, on the assumption that past
behavior is a good predictor of future
behavior. A large amount of data is
captured about a particular person, and
companies share this information. Credit
companies have taken advantage of this
to target customers.
111
Problems with Data Mining
Cost could be too high to justify data mining
Coordination of several customers or
departments could be problematic
Customers could resent their privacy being
invaded and reject the offers that are
coming their way
Erroneous profiles could be made of
people, stored, and not deleted. The police
could act on these profiles without meeting
the people
112
Ethical Issues
Analysts should take responsibility
for considering the ethical aspects of any
data mining project that is proposed.
The following questions should be raised
and considered with the client:
Length of time the material is kept
Privacy safeguards that should be installed
Confidentiality of the material
The uses to which inferences are put
113
The opportunities for abuse are
apparent and must be guarded
against. For consumers, data mining
is a push technology, and if
consumers do not want to be pushed,
data mining efforts could backfire.
114
Data Warehousing
[Figure: data warehouse architecture]
Internal data sources: operational,
accounting, customer, manufacturing and
historical databases
External data sources: external databases
Data extraction and transformation: extract,
filter, transform, classify, aggregate,
summarize
Data warehouses: customer data, product
data, sales data
Warehouse data is integrated,
subject-oriented, time-variant and
non-volatile
Data access and analysis (business
intelligence): OLAP, data mining, querying,
reporting
115
116