Intro to R (Jenna Daly)

Data Tools: R and RStudio
Jenna Daly
March 24, 2017
2017 CTData Days:
Equip, Synthesize, and Mobilize with data
intRoduction

R is both a language and an environment



R is a programming language used for a wide range of statistical and graphical tasks
RStudio is an integrated development environment (IDE) for R, where you can visually manage your workspace
CRAN - The Comprehensive R Archive Network
- A network of web servers that store up-to-date versions of the code and documentation for R
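As a small illustration of the "statistical and graphical" side of R, here is a minimal sketch using only base R. It assumes the built-in mtcars example data set; the choice of columns and plot labels is ours and is not part of the original slides.

```r
# mtcars is a small example data set that ships with base R.
data(mtcars)

# Statistical functions: the mean and a five-number summary of miles per gallon.
mpg_mean <- mean(mtcars$mpg)
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Graphical function: scatter plot of car weight against miles per gallon.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")
```

Typing these lines into the RStudio console shows the summary in the console pane and the chart in the Plots pane.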
souRces:
https://www.r-project.org/about.html
http://www.gnu.org/
http://www.statmethods.net/about/learningcurve.html
http://analyticstrainings.com/?p=101
wheRe to download


R is free, open-source software, so everyone can use it!
Where to download both R and RStudio:
- Download R here: https://cran.r-project.org/
- Download RStudio here: https://www.rstudio.com/products/rstudio/download/
leaRn moRe


R is one of the most comprehensive statistical analysis tools available
Steep learning curve
- Functionality comes from thousands of user-contributed packages
- Iterative learning





Learn more:
https://www.rstudio.com/resources/training/
https://www.datacamp.com/courses/free-introduction-to-r
http://swirlstats.com/
http://www.statmethods.net/
souRces:
https://blogs.umass.edu/gwis/2015/05/21/crash-course-in-r-programming/
R demo

Data source: www.irs.gov
Data sets: SOI Tax Stats – Individual Income Tax Statistics (2011-2014)

What we’ll learn:





The data we download aren’t always as pretty as we want
We sometimes have to manipulate data before we can process it
What insights can we gather from the given data?
What variables can be calculated from the given data?
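The points above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data frame, its column names (year, returns, income) and all of its values are invented here and are not taken from the IRS files used in the demo.

```r
# Hypothetical raw data: numbers stored as text with commas, plus one missing value.
# Column names and values are made up for illustration; this is not IRS data.
raw <- data.frame(
  year    = c(2011, 2012, 2013, 2014),
  returns = c("1,200", "1,250", NA, "1,300"),
  income  = c("60,000", "65,000", "70,000", "78,000"),
  stringsAsFactors = FALSE
)

# Manipulate: strip the thousands separators and convert the text to numbers.
clean <- raw
clean$returns <- as.numeric(gsub(",", "", clean$returns))
clean$income  <- as.numeric(gsub(",", "", clean$income))

# Drop rows that still contain missing values before processing.
clean <- clean[complete.cases(clean), ]

# Calculate a new variable from the given ones: income per return.
clean$income_per_return <- clean$income / clean$returns

print(clean)
```

The same pattern (inspect, clean, then derive) should carry over to real downloads, although the actual column names and the cleaning steps needed will depend on the file.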