Using Software in Teaching Statistics
Damon Berridge, Centre for Applied Statistics, Dept of Mathematics & Statistics
ESRC NCRM Meeting: Training the Trainers, London, 4 June 2007
Acknowledgement

This presentation is based on a set of PowerPoint slides, ‘Using Open Source Software to Teach Mathematical Statistics’, by Douglas M. Bates, University of Wisconsin-Madison:
http://www.stat.wisc.edu/~bates/JSM2001.pdf
Outline
• Discussion of general issues
• Introduction to R
• Obtaining & installing R
• Examples
• Useful resources
• Concluding remarks
Discussion of general issues
• It is common to use some statistical software in teaching statistics to social scientists.
• To be useful, software in statistics courses for social scientists must have a simple interface.
• Is it important that use of the computing system be integrated with the lectures and text?

Linking the computing to the text

Two ways of achieving this:
• Write a text that is tied to specific software and illustrates the use of that software, e.g. the CAS course on GLMs.
• Adopt a conventional text but provide examples from the text in the computing system, e.g. the CAS course on Duration Analysis.

Alternatively:
• Write self-standing material that is not tied to specific software, e.g. LEMMA.
Advantages of Open Source software
• No licence fees or management of licences.
• Software can be installed on all departmental PCs.
• Staff and students can install software on their own laptops without charge (and without violating licences).
• Open Source projects encourage contributions from users, so extensions are easier.
Introduction to R

• A number of CAS methodology courses use Open Source software such as R.
• R is an Open Source project based largely on the interactive programming language S.

• Initially developed by Ihaka & Gentleman at the University of Auckland in the early 1990s, and first described in a paper published in 1996.
Introduction to R – cont’d.

• R is now developed and maintained by a widely dispersed, international group of volunteers from academia and industry.
• The R project operates through web sites, archives and e-mail lists (see Useful resources).
What is R?

• An Open Source implementation of the programming language S.
• A language and environment for data analysis and graphics.
• A means of technology transfer through packages, e.g. SABRE in R.
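
As a minimal sketch of R as a language and environment for data analysis and graphics (not part of the original slides): the session below fits a simple linear regression to cars, a data set shipped with R, and draws the fitted line over the data. The choice of data set, model and output file is an assumption made purely for illustration.

```r
# 'cars' ships with R: stopping distance (dist) against speed.
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))             # coefficient table, R-squared, etc.

# Draw the scatterplot and fitted line into a temporary PDF file
# (the file location is an illustrative assumption).
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)                     # add the fitted regression line
invisible(dev.off())            # close the graphics device

# Add-on packages from CRAN are installed with install.packages()
# and loaded with library().
```

In an interactive session the pdf() and dev.off() calls can be dropped, so that the plot appears in a graphics window.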
What is R? – cont’d.

• A flexible data exchange mechanism accessing:
    – text files and saved R workspaces
    – S-PLUS data objects, SAS XPORT datasets, SPSS saved datasets, Minitab worksheets, etc.
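
The data exchange described above might look as follows in practice. This is a sketch under stated assumptions, not material from the original slides: the small scores data frame and all file names are hypothetical. The first two parts use base R only and run as written; the readers for other systems come from the foreign package and are shown as comments because they need real input files.

```r
# A small hypothetical data set.
scores <- data.frame(id = 1:3, mark = c(52, 67, 71))

# Round trip through a text file.
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)
from_text <- read.csv(csv_path)

# Round trip through a saved R workspace file.
rda_path <- tempfile(fileext = ".RData")
save(scores, file = rda_path)
restored <- new.env()
load(rda_path, envir = restored)   # recreates 'scores' inside 'restored'

# Other systems, via the 'foreign' package (file names are hypothetical):
# library(foreign)
# sas_data   <- read.xport("survey.xpt")                        # SAS XPORT
# spss_data  <- read.spss("survey.sav", to.data.frame = TRUE)   # SPSS
# mtb_data   <- read.mtp("survey.mtp")                          # Minitab portable worksheet
# splus_data <- read.S("survey_object")                         # S-PLUS data object
```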
How do I get R?

• Informational web site: http://www.r-project.org/
• CRAN – the Comprehensive R Archive Network
    – Primary CRAN site: http://cran.r-project.org/
    – Mirror sites, e.g. http://cran.uk.r-project.org/
• New releases occur frequently – be prepared to re-install!
Installing R in Windows

• Simple procedure: download and run the installer R-2.5.0-win32.exe
• CRAN sites also provide R for installation on other platforms, e.g. Macintosh, Linux and other Unix systems.
Example 1: Web-based resources
Registration required; example sessions available
Example 2: Support material

Course notes available for downloading from:
http://www.cas.lancs.ac.uk/short_courses/coursematerials.html

Audio, slides and demos, all based on the notes, are available at:
http://www.cas.lancs.ac.uk/e-learning/index.php
Useful resources

• Web site http://www.r-project.org/ and CRAN.
• FAQ list at http://cran.r-project.org/doc/FAQ/ is a good source of information.
• Manuals in documentation directory http://cran.r-project.org/doc/manuals
  See especially R-intro.pdf and R-data.pdf.
• Paper ‘Using the R statistical computing environment to teach social statistics courses’ by Fox & Andersen, Dept of Sociology, McMaster University
  http://socserv.mcmaster.ca/jfox/Teaching-with-R.pdf