Designing a Spatial/GIS Project

2/10/2017
Research Question
• Does your question involve a geographic element?
• How is spatiality part of your research question (RQ)?
• Is geography a data source or a variable in itself?
• E.g., demographic data organized by area vs. how boundaries affect a population
• What are you trying to study?
• Avoid ecological fallacies: do not infer individual-level relationships from area-level aggregates
A visual representation of geographic data
extracted for a typical research project
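The ecological fallacy warning can be illustrated with a small numerical sketch. The data below are invented purely for illustration (three hypothetical areas, three individuals each) and the helper `pearson` is not from any particular package. Area means move together perfectly, yet within every area the individual-level relationship runs the opposite way.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical individual-level (x, y) observations, grouped by area.
areas = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Area-level view: correlate the area means of x and y.
mean_x = [sum(x for x, _ in obs) / len(obs) for obs in areas.values()]
mean_y = [sum(y for _, y in obs) / len(obs) for obs in areas.values()]
r_area = pearson(mean_x, mean_y)

# Individual-level view: correlate x and y separately within each area.
r_within = {
    name: pearson([x for x, _ in obs], [y for _, y in obs])
    for name, obs in areas.items()
}

print("area-level r:", round(r_area, 3))
print("within-area r:", {k: round(v, 3) for k, v in r_within.items()})
```

Reading the area-level correlation as a statement about individuals would get the sign wrong in this constructed case.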
Scope of Study and Unit of Analysis
• Consider what type of data you need
• (Vector) Point, line or polygon data?
• Points useful for events/counts, distance calculations
• Points are discrete and have zero dimensions
• Polygons are 2D and represent areas; they are the basic unit for census variables (e.g., demographics by county)
• At what level do you need the data?
• Usually best to go with small units, though this is not always possible
• Different levels offer different information and precision
• Balance efficiency against the information available
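A minimal sketch of how the two vector types behave in practice: a point is just a coordinate pair, so distances are easy to compute, while a polygon is an ordered ring of vertices with an area. The function names are illustrative only, and the sketch assumes planar (projected) coordinates rather than latitude/longitude.

```python
from math import hypot

def distance(p, q):
    """Straight-line distance between two points in projected coordinates."""
    return hypot(q[0] - p[0], q[1] - p[1])

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    `vertices` is a list of (x, y) tuples in order around the ring;
    the last vertex is implicitly joined back to the first.
    """
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical example: two event locations and one square areal unit.
event_a, event_b = (0.0, 0.0), (3.0, 4.0)
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

print(distance(event_a, event_b))
print(polygon_area(unit_square))
```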
Factors to Consider/Philosophize
• How were the boundaries created?
• Some boundaries are drawn at random (arbitrarily), others with a purpose
• Boundaries drawn at random leave similar populations separated by an arbitrary line
• A good setting for a Regression Discontinuity Design
• Boundaries drawn with a purpose group similar observations together, which raises endogeneity concerns
• Use a multi-level model, random effects, or cluster-robust standard errors
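The regression discontinuity idea for an arbitrary boundary can be sketched very roughly as follows. This is a naive comparison of mean outcomes within a narrow band on either side of the line, not a full local-linear RD estimator. The function name, the signed-distance convention (negative on one side, positive on the other), and the data are all assumptions made for illustration.

```python
def rd_local_mean_gap(observations, bandwidth):
    """Difference in mean outcome just across a boundary.

    `observations` is a list of (signed_distance, outcome) pairs, where
    signed_distance is negative on one side of the boundary and
    non-negative on the other. Only observations within `bandwidth`
    of the boundary are used.
    """
    left = [y for d, y in observations if -bandwidth <= d < 0]
    right = [y for d, y in observations if 0 <= d <= bandwidth]
    if not left or not right:
        raise ValueError("no observations within the bandwidth on one side")
    return sum(right) / len(right) - sum(left) / len(left)

# Hypothetical data: distance to the boundary (km) and some outcome.
obs = [(-2.0, 10), (-0.5, 11), (-0.2, 12), (0.1, 15), (0.4, 16), (3.0, 30)]

# Observations farther than 1 km from the line are ignored.
print(rd_local_mean_gap(obs, bandwidth=1.0))
```

The narrower the bandwidth, the more plausible it is that the two sides are comparable, at the cost of fewer observations.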
Extracting Data Organized by Geography
• Might simply need demographic info organized by geography
• These data can be acquired in a number of ways:
1. Read the dbf in R (via the foreign pkg; read.dbf("name.dbf"))
2. Export the data as a txt or xls file in ArcGIS
   1. Via the Copy Rows or Table to Excel commands
3. Simply download it from a site that has the info (e.g., Social Explorer)
   1. http://www.socialexplorer.com/
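Once a table has been exported as text, summarizing it by geography takes only a few lines. The sketch below assumes a comma-delimited export with a header row; the column names (`county`, `tract`, `population`) and the values are hypothetical, and an in-memory string stands in for the exported file.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported txt file; replace with open("export.txt", newline="")
# for a real file (the file name here is a placeholder).
exported = io.StringIO(
    "county,tract,population\n"
    "Adams,001,1200\n"
    "Adams,002,800\n"
    "Baker,001,1500\n"
)

def total_by_area(handle, area_field, value_field):
    """Sum an integer column within each value of a geographic identifier."""
    totals = defaultdict(int)
    for row in csv.DictReader(handle):
        totals[row[area_field]] += int(row[value_field])
    return dict(totals)

county_totals = total_by_area(exported, "county", "population")
print(county_totals)
```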
Obstacles: Switching Between Levels
• Might have data points that you would like as
part of polygons, or polygons that should be
points
• For event point data, join onto polygons
• For polygons -> points, use a random point generator
• (Switch to ArcMap)
• Remember, random point generators use a uniform distribution, so the generated points will not reflect any real clustering within the polygon
Map: point data generated entirely at random within each census block group
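Both level switches can be sketched without GIS software. The code below is a simplified stand-in for what a spatial join and a random point generator do, not the ArcMap implementation: a ray-casting point-in-polygon test, a count of event points per polygon, and uniform random points drawn by rejection sampling from the polygon's bounding box. All names and coordinates are hypothetical.

```python
import random

def point_in_polygon(pt, poly):
    """Ray-casting test: True if pt lies inside the simple polygon poly."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_points_by_polygon(points, polygons):
    """Points -> polygons: count events falling in each named polygon."""
    counts = {name: 0 for name in polygons}
    for pt in points:
        for name, poly in polygons.items():
            if point_in_polygon(pt, poly):
                counts[name] += 1
                break  # assign each point to at most one polygon
    return counts

def random_points_in_polygon(poly, n, rng):
    """Polygons -> points: n uniform random points via rejection sampling."""
    xs, ys = [v[0] for v in poly], [v[1] for v in poly]
    points = []
    while len(points) < n:
        candidate = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if point_in_polygon(candidate, poly):
            points.append(candidate)
    return points

# Hypothetical areal units and event locations.
tracts = {
    "west": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "east": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
events = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (3.0, 3.0)]

print(count_points_by_polygon(events, tracts))
print(random_points_in_polygon(tracts["west"], 3, random.Random(0)))
```

Because the sampling is uniform, the generated points spread evenly over each polygon regardless of where people or events actually concentrate.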
Analysis
• Use appropriate methods, as discussed to date
• E.g., controls for multilevel models, spatial clustering, etc.
• Might present primary dependent and
independent variables on maps
• Do not clutter the map; every variable adds a new dimension