Data Service Models - Conference Sites hosted by Acadia University

DLI Training Workshop
Hosted by
Dalhousie University
March 2000
DLI Workshop -Mar 2000
Data Service Models
• Review data service models within the framework of:
  ▪ access and dissemination
  ▪ aggregate data and microdata
  ▪ statistics versus data
Data Service Models
• Models were presented as a continuum
during the 1997 DLI workshop
  ▪ “Order & Passthrough” Service
  ▪ Install Data and Provide Access
  ▪ Treat as a Collection and Provide Reference
Data Service Models
• Choose a model that matches your
staff and computing resources
Acquisition
  Fill a Request
    Locate data
    Order data & documentation
  Collection Development
    Select & Locate data
    Order data & documentation
    Catalogue data & documentation
    Install & Store (data & documentation)
Reference
  Search for data
  Interpret documentation
  Retrieve or download data
  Process data
    change formats
    subset cases or variables
    aggregate cases
    merge files
    analyze data
Acquisition
  Fill a Request
    Locate data
    Order data & documentation
  Collection Development
    Select & Locate data
    Order data & documentation
    Catalogue data & documentation
    Install & Store (data & documentation)
Reference
  Search for data
  Interpret documentation
  Retrieve or download data
  Process data
    change formats
    subset cases or variables
    aggregate cases
    merge files
    analyze data
Find a referral partner on campus (callout beside the Process data tasks)
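The "Process data" tasks in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the respondent records, the second "file" of education values, and the helper names `subset`, `aggregate_mean` and `merge` are all made up for the example and do not come from any DLI product.

```python
from collections import defaultdict

# Made-up respondent records, standing in for a survey microdata file.
respondents = [
    {"id": 1, "province": "NS", "age": 34, "income": 41000},
    {"id": 2, "province": "NS", "age": 61, "income": 28000},
    {"id": 3, "province": "NB", "age": 45, "income": 52000},
    {"id": 4, "province": "NB", "age": 29, "income": 36000},
]
# Made-up second file keyed on the same id, used for the merge step.
education = {1: "university", 2: "high school", 3: "college", 4: "university"}


def subset(records, variables, keep):
    """Subset cases (rows passing `keep`) and variables (named columns)."""
    return [{v: r[v] for v in variables} for r in records if keep(r)]


def aggregate_mean(records, group_var, value_var):
    """Aggregate cases: mean of value_var within each value of group_var."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_var]].append(r[value_var])
    return {g: sum(vals) / len(vals) for g, vals in groups.items()}


def merge(records, lookup, key, new_var):
    """Merge files: attach a value from `lookup` to each record by key."""
    return [{**r, new_var: lookup.get(r[key])} for r in records]


under_50 = subset(respondents, ["id", "province", "income"], lambda r: r["age"] < 50)
mean_income = aggregate_mean(respondents, "province", "income")
merged = merge(respondents, education, "id", "education")
```

In practice these steps are usually done in a statistical package; the sketch only shows what each task does to the cases and variables.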
1. Access and Dissemination
The Inventory Model
• In the traditional inventory model,
roughly half of the support goes to
putting items on the shelf, while the
other half goes to finding and getting
the items off the shelf.
Source: Darlene Fichter
The Access Model
• With the access model, support is
split between getting information into
a deliverable state and finding
appropriate ways of retrieving and
disseminating the information.
Access/Dissemination Issues
• managing vendor licenses
  ▪ are the license conditions realistic?
  ▪ what type of identification or authentication is required?
Access/Dissemination Issues
• matching products with technology
  ▪ is the product dependent on a specific operating system?
  ▪ is the product software dependent?
Access/Dissemination Issues
• determining access methods
  ▪ stand-alone, LAN or WAN?
  ▪ what are the finding tools?
Access/Dissemination Issues
• determining dissemination options
  ▪ what are the output formats?
  ▪ does the output require special storage considerations?
The Access Model
• These and other issues about access and dissemination will continue to come up in our discussions.
2. Aggregate Data and Microdata
Data Types
▪ The 1997 DLI workshop spent time discussing the differences between aggregate data and microdata.
▪ Each type has an impact on data access models.
Aggregate Data
▪ Aggregate data consist of statistical summaries derived from original data collections and organized in tables according to the following properties:
  • socio-economic phenomena
  • spatial representation
  • time
Aggregate Data
▪ Statistical summaries
  • these summaries take the form of counts, totals, sums, averages or percentages
[Table example] Spatial representation and Time are fixed; Age and Sex are displayed; cells contain counts.
[Table example] Spatial representation and Age are fixed; Year and Sex are displayed.
[Table example] Age and Time are fixed; Geography and Sex are displayed.
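The three table examples above show one idea: an aggregate table holds some properties fixed and displays the others as rows and columns. A minimal Python sketch of that idea follows. The counts, the dimension names in `DIMS` and the helper `cross_tab` are invented for illustration and are not taken from any Statistics Canada table.

```python
# Each cell of made-up counts is keyed by (geography, year, age, sex).
DIMS = ("geography", "year", "age", "sex")

cells = {
    ("NS", 1991, "0-14", "F"): 10, ("NS", 1991, "0-14", "M"): 11,
    ("NS", 1991, "15-64", "F"): 30, ("NS", 1991, "15-64", "M"): 29,
    ("NS", 1996, "0-14", "F"): 9, ("NS", 1996, "0-14", "M"): 10,
    ("NS", 1996, "15-64", "F"): 32, ("NS", 1996, "15-64", "M"): 31,
    ("NB", 1991, "0-14", "F"): 8, ("NB", 1991, "0-14", "M"): 9,
    ("NB", 1991, "15-64", "F"): 25, ("NB", 1991, "15-64", "M"): 24,
    ("NB", 1996, "0-14", "F"): 7, ("NB", 1996, "0-14", "M"): 8,
    ("NB", 1996, "15-64", "F"): 26, ("NB", 1996, "15-64", "M"): 25,
}


def cross_tab(cells, fixed, row_dim, col_dim):
    """Hold the `fixed` dimensions constant and display two others.

    Assumes every dimension is either fixed or displayed, so each
    (row, column) pair matches exactly one cell.
    """
    table = {}
    for key, count in cells.items():
        cell = dict(zip(DIMS, key))
        if all(cell[dim] == value for dim, value in fixed.items()):
            table.setdefault(cell[row_dim], {})[cell[col_dim]] = count
    return table


# Geography and time fixed; age and sex displayed.
age_by_sex = cross_tab(cells, {"geography": "NS", "year": 1996}, "age", "sex")
# Geography and age fixed; year and sex displayed.
year_by_sex = cross_tab(cells, {"geography": "NS", "age": "0-14"}, "year", "sex")
# Age and time fixed; geography and sex displayed.
geo_by_sex = cross_tab(cells, {"age": "0-14", "year": 1996}, "geography", "sex")
```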
Aggregate Data
▪ Aggregate data products
  • usually stored as a series of related tables in some type of database structure requiring special retrieval software (examples from Statistics Canada (STC) include C86, C91, CBP, CANSIM, etc.)
Microdata
▪ Microdata are
  • usually anonymised records of actual respondents from a survey
  • unsummarized, i.e., observations in the form in which the data were collected
  • in a raw format requiring some form of processing, typically a flat ASCII file
Microdata: Cases 3 & 4 from the GSS 2 Main File
0000312141100119820012122222210020982001212222224011
21111241112121112205020197111971021212222225211026121
2043001409557204113130221119999019787878797022214112
7141240031500061661123222222222111117262616221222266
6666636212000000020320222224222000022204141101101102
1111111221110000002100000000021000000000100000000002
00000423300200200100000100200
0000411001100111011021222222210020092002122222220211
11111231212111211208120193811938044122222221111052201
203901007504721031191012233520406058787870304221303
4207083004000014200071112221222117215756565655555556
66666656565000555500210222111111110000001111100001101
1122121221110110101100001101011000000000000000000000
00000000000000000000000000000
Microdata: First 14 Cases from the GSS 2 Episode File
000041144504000800024010000000012518733
000041144308000900006011222220012518733
000041141709000930003031222220012518733
000041141709301100009031222220012518733
000041141211001330015011222220012518733
000041149113301630018011222220012518733
000041141216301800009011222220012518733
000041143018002000012031222220012518733
000041147920002015001541222220012518733
000041143720152130007531222220012518733
000041147921302145001542221220012518733
000041144321452200001512221220012518733
000041147522002300006012221220012518733
000041144523002800030010000000012518733
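Reading a flat ASCII file like this means slicing each record at fixed column positions. The sketch below parses the 14 episode records shown above. The column positions are an assumption inferred from the records themselves, not taken from the GSS codebook: the first five characters look like a case identifier, and columns 11-14, 15-18 and 19-22 look like a start time, an end time and a duration in minutes. The field names are hypothetical.

```python
# The 14 episode records, copied from the listing above.
EPISODES = """\
000041144504000800024010000000012518733
000041144308000900006011222220012518733
000041141709000930003031222220012518733
000041141709301100009031222220012518733
000041141211001330015011222220012518733
000041149113301630018011222220012518733
000041141216301800009011222220012518733
000041143018002000012031222220012518733
000041147920002015001541222220012518733
000041143720152130007531222220012518733
000041147921302145001542221220012518733
000041144321452200001512221220012518733
000041147522002300006012221220012518733
000041144523002800030010000000012518733
""".splitlines()


def hhmm_to_minutes(hhmm):
    """Convert a clock string such as '0930' to minutes since midnight."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])


def parse_episode(line):
    """Slice one record at the assumed (inferred) column positions."""
    return {
        "case_id": line[0:5],          # assumed: case identifier
        "start": line[10:14],          # assumed: episode start, HHMM
        "end": line[14:18],            # assumed: episode end, HHMM
        "duration": int(line[18:22]),  # assumed: duration in minutes
    }


episodes = [parse_episode(line) for line in EPISODES]

# Sanity checks on the inferred layout: end minus start should equal
# the duration, and one diary day should add up to 24 hours.
consistent = all(
    hhmm_to_minutes(e["end"]) - hhmm_to_minutes(e["start"]) == e["duration"]
    for e in episodes
)
total_minutes = sum(e["duration"] for e in episodes)
```

With these positions the durations agree with the start and end times and sum to 1440 minutes, which supports the inferred layout, but the documentation remains the authority on the record layout.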
Impact on Data Access
▪ Aggregate data have been processed and organized in a database structure
  • must locate the table with desired data
  • must deal with each database structure
  • must deal with accompanying retrieval software
Impact on Data Access
▪ Microdata must be processed or subset for subsequent processing
  • must identify desired variables and cases (data documentation)
  • must deal with the raw data file structure
  • must address the issue of desired formats
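The three microdata steps above (pick variables and cases from the documentation, handle the raw file structure, choose an output format) can be sketched as a small extraction routine. Everything specific here is hypothetical: the `LAYOUT` dictionary stands in for a codebook, the raw records are made up to fit it, and comma-separated output is just one possible target format.

```python
import csv
import io

# Hypothetical record layout, standing in for what a codebook documents:
# variable name -> (start, end) as zero-based Python slice positions.
LAYOUT = {"case_id": (0, 5), "sex": (5, 6), "age": (6, 8), "province": (8, 10)}

# Made-up raw records that follow the hypothetical layout above.
RAW_LINES = ["0000113412", "0000226112", "0000314513"]


def extract(lines, layout, variables, keep):
    """Yield the chosen variables for each case that passes `keep`."""
    for line in lines:
        record = {name: line[a:b] for name, (a, b) in layout.items()}
        if keep(record):
            yield {name: record[name] for name in variables}


def to_csv(rows, variables):
    """Write the rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=variables)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


wanted = ["case_id", "age"]
selected = extract(RAW_LINES, LAYOUT, wanted, lambda r: r["sex"] == "1")
csv_text = to_csv(selected, wanted)
```

Keeping the layout in one dictionary means a different file only needs a different `LAYOUT`, not different code.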
Impact on Data Access
• These and other differences between aggregate data and microdata will also be part of our discussions about data access.
3. Statistics versus Data
Statistics versus Data
▪ The term statistics is commonly used to describe the numeric summaries, such as counts, totals, sums and averages, that people use to make a point in a study or report.
Statistics versus Data
▪ The term data refers to numeric files containing a collection of raw information with many observations that can be analyzed from a variety of perspectives.
Statistics versus Data
▪ Typically, generalizations are drawn from analyses of a data file.
▪ The information provided by all of the individuals in a survey is considered to be data, while the percent of respondents in a survey with a university degree is a statistic.
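The distinction can be made concrete with a few lines of Python: the list of respondent records is the data, and the single percentage computed from it is the statistic. The records and the function name `percent_with` are made up for this illustration.

```python
# Data: one made-up record per survey respondent.
respondents = [
    {"id": 1, "education": "university"},
    {"id": 2, "education": "high school"},
    {"id": 3, "education": "college"},
    {"id": 4, "education": "university"},
    {"id": 5, "education": "high school"},
]


def percent_with(records, variable, value):
    """Statistic: percent of records where `variable` equals `value`."""
    matches = sum(1 for r in records if r[variable] == value)
    return 100.0 * matches / len(records)


# Statistic: a single number summarizing the data file.
pct_university = percent_with(respondents, "education", "university")
```

The same data file could answer many other questions (by age, by region, and so on), while the statistic answers only the one it was computed for.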
Blurring Statistics and Data
▪ In the print world, statistical information is usually found in statistical abstracts, census monographs and serial publications by government agencies.
Blurring Statistics and Data
▪ In the digital world this numeric information is now appearing with electronic table access on CD-ROM, the Internet, or in electronic journals.
▪ Many aggregate data products now fall in this category.
Blurring Statistics and Data
In other instances, the responses in
the microdata file of a survey may
provide the answer to a statistical
question.
• For example, the percentage of the
population in Canada with high blood
pressure may be determined from the
National Population Health Survey.
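A statistical question of this kind is typically answered from a microdata file by weighting each respondent, since a survey record usually represents more than one person in the population. The sketch below shows the arithmetic only. The records, the weights and the variable names `high_bp` and `weight` are invented and are not the actual National Population Health Survey variables or values.

```python
# Made-up microdata: each record carries a yes/no response and a
# hypothetical survey weight (the number of people it represents).
records = [
    {"high_bp": True, "weight": 1200.0},
    {"high_bp": False, "weight": 800.0},
    {"high_bp": False, "weight": 1500.0},
    {"high_bp": True, "weight": 500.0},
]


def weighted_percent(records, flag, weight):
    """Percent of the weighted population for which `flag` is true."""
    total = sum(r[weight] for r in records)
    flagged = sum(r[weight] for r in records if r[flag])
    return 100.0 * flagged / total


pct_high_bp = weighted_percent(records, "high_bp", "weight")
```

Here the weighted figure differs from the simple share of records (2 of 4), which is why the documentation for a survey file needs to be consulted before quoting a percentage from it.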
Impact on Data Access
• The use of aggregate data products
and microdata files to answer
statistical questions will also
contribute to our discussions about
data access.
Context for Aggregate Data
• The motivation to simplify access to
aggregate data products exists
because of their utility in answering
general statistics questions.
• The demand for facts and figures at
the reference desk steadily
increases.
Aggregate Data Challenges
• The challenges of creating access to
aggregate data were summarized
earlier.
  ▪ how to find the table with the desired statistics
  ▪ how to deal with each database structure
  ▪ how to cope with the retrieval software
DLI Aggregate Data Sources
• Four major DLI aggregate data
sources have been chosen for
this workshop.
DLI Aggregate Data Sources
• CANSIM is a major source for economic data and social data reported over time.
DLI Aggregate Data Sources
• E-STAT is a popular resource for simple access to selected CANSIM series and some aggregate data sources considered useful in teaching.
DLI Aggregate Data Sources
• The 1996 Census aggregate data products are a valuable collection of electronic tables.
DLI Aggregate Data Sources
• The Health Indicators Database is a compilation of tables from several sources to provide a single-access tool about health status in Canada.
▪ Hands-on work with aggregate data
  • CANSIM
  • E-STAT
  • Census ’96
  • Health Indicators Database