An Introduction to R

Prof Chris Brunsdon

National Centre for Geocomputation

MUSSI

Maynooth University

September 22nd, 2022

What is R?

Figure 1: R and RStudio in action

  • R is a system for statistical computation and graphics. It provides, among other things, a programming language, high-level graphics, interfaces to other languages and software tools, and debugging facilities.

  • Versions exist for

    • Windows
    • Mac
    • Linux

A number of packages can be added to R to extend its functionality. Notably:

  • sf: Enables R to work with geospatial data
  • tidyverse: Lets R process data using a database-query-like syntax
  • Several packages for spatial statistical data analysis
    • mgcv: x,y,z trend modelling
    • terra: Processing raster data, e.g. remote sensing
  • tmap: Interactive web-based mapping
  • Fuller list here
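
As a minimal sketch of how packages are added, the snippet below checks which of the packages named above are already installed and prints the install.packages() call for any that are missing. It uses only base R. The variable names (pkgs, installed, missing_pkgs) are illustrative, and it assumes the packages are to be fetched from CRAN.

```r
# Packages named in the list above
pkgs <- c("sf", "tidyverse", "mgcv", "terra", "tmap")

# requireNamespace() returns TRUE if a package is installed and loadable,
# without attaching it to the search path
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  # Print the command rather than running it, so nothing is installed
  # without the user choosing to do so
  cmd <- sprintf("install.packages(c(%s))",
                 paste0('"', missing_pkgs, '"', collapse = ", "))
  message("To install the missing packages, run: ", cmd)
} else {
  message("All listed packages are already installed.")
}
```

Once a package is installed, library(sf) (for example) attaches it for the current session.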

R costs nothing.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form.

Tip

This also means that the code used to create R and all of its packages is open source, so all of it is open to scrutiny.

  • Use the CRAN website
    • This lets you download R and install it
  • Also go to the RStudio website and download RStudio
  • RStudio is an editor and development environment for R, where you can:
    • Write code
    • Test it
    • Check graphics
    • Publish blogs and web sites
    • More …

R in a Geospatial Context

Weapons Violations January 2022

Sources:

  • Crime Data:
    • Chicago Data Portal
  • Maps:
    • ESRI
    • OpenStreetMap
  • All freely available
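  • Mapping, database access and data processing are all done in R

# Load libraries
library(tmap)       # thematic and interactive maps
library(sf)         # spatial data classes
library(tidyverse)  # tibbles, filter() and the %>% pipe
library(RSocrata)   # read.socrata() for Socrata open data portals
library(glue)       # string interpolation

# Pull data and process it
stem <- "https://data.cityofchicago.org/resource/9hwr-2zxp.json"
query <- "date between '2022-01-01T00:00:00' and '2022-02-01T00:00:00'"
wv <- read.socrata(glue("{stem}?$where={query}")) %>%
  as_tibble() %>%
  filter(primary_type == "WEAPONS VIOLATION") %>%
  # st_as_sf takes coords as x then y, so columns 21 and 9 are assumed
  # to hold longitude and latitude; crs 4326 is WGS84
  st_as_sf(coords = c(21, 9), crs = 4326)

# Make interactive map, colouring each point by arrest status
tmap_mode("view")
tm_shape(wv) + tm_dots(col = "arrest")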
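
The listing above selects the coordinate columns by position. A possible alternative, sketched here on a hypothetical toy data frame, is to name the columns and to drop records without coordinates first, since st_as_sf() stops with an error on missing coordinates by default. The column names longitude and latitude and the values are assumptions for illustration; they are not confirmed against what the portal returns.

```r
library(sf)

# Hypothetical toy records standing in for the downloaded crime data
crimes <- data.frame(
  arrest    = c(TRUE, FALSE, TRUE),
  longitude = c(-87.63, NA, -87.70),
  latitude  = c(41.88, 41.90, 41.79)
)

# Keep only the records that have both coordinates
located <- crimes[!is.na(crimes$longitude) & !is.na(crimes$latitude), ]

# Name the coordinate columns explicitly: x (longitude) first, then y
pts <- st_as_sf(located, coords = c("longitude", "latitude"), crs = 4326)
```

Naming the columns keeps the conversion working if the order of the columns in the download changes.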

R is a hub as well as a data processor:

Diagram 1: Interactions Between R and Servers

Hotspots

Chance of Arrest

Further Issues

  • Conversion between package formats
    • e.g. spatstat and sf handle point data sets differently
  • Mainly works via coding
    • Many see that as an advantage!
  • Oriented towards in-memory operations
    • Can work with databases directly, but often:
      • Select subset or ‘chunk’ of data, read it in
      • Analyse
      • Re-save the chunk of data
  • Competing packages doing the same job
  • Quarto and Posit
    • Enhance the ability to merge R code into documents
    • Increase interoperability, e.g. mixing R and Python in the same document
  • More outreach to non-academic organisations
  • Enhancements of sf
    • Increased ability to manipulate spatial data
  • Incorporation of more spatial analysis tools
    • Some not yet invented …
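
The chunked workflow listed under in-memory operations above (select a subset, read it in, analyse, re-save) can be sketched in base R as follows. This is an illustration under stated assumptions, not a recommended pipeline: the data are a made-up CSV written to a temporary file, and the names path, out_path and chunk_size are hypothetical. With a real database, the select step would be a query rather than a block of lines.

```r
# Made-up data written to a temporary CSV so the sketch is self-contained
set.seed(1)
path <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
full <- data.frame(id = 1:1000, value = rnorm(1000))
write.csv(full, path, row.names = FALSE)

chunk_size <- 250
con <- file(path, open = "r")

# Read the header line once and strip the quotes written by write.csv()
header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])

total <- 0
n_rows <- 0
repeat {
  # Select the next chunk and read it into memory
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)

  # Analyse: add a derived column (done first, so the running total
  # updated below is not mistaken for part of the per-chunk analysis)
  chunk$value_sq <- chunk$value^2

  # Re-save: write column names with the first chunk only, then append
  first_chunk <- n_rows == 0
  write.table(chunk, out_path, sep = ",", row.names = FALSE,
              col.names = first_chunk, append = !first_chunk)

  # Update running totals across chunks
  total <- total + sum(chunk$value)
  n_rows <- n_rows + nrow(chunk)
}
close(con)

# Overall mean computed without ever holding the whole file in memory
chunk_mean <- total / n_rows
```

Only chunk_size rows are held in memory at any one time, which is the point of the pattern.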

Thank You