Chapter 1 Introduction

notebook filename | 01-Introduction.Rmd

1.1 Exercise book contents

This document describes R scripts used for demonstrations in the NOAA Satellite Course. The scripts highlight the uses of rerddapXtracto, an R package with functions that allow easy extraction of satellite data from ERDDAP servers. Some of these exercises are also in the rerddapXtracto vignette at https://cran.r-project.org/web/packages/rerddapXtracto/vignettes/UsingrerddapXtracto.html

This chapter will provide an overview of the main rerddapXtracto functions. Please review the information presented here before moving on to the other chapters.

The remaining chapters each contain a separate demonstration. The demonstrations are HTML or PDF versions of R notebooks. The source notebook files (.Rmd) are available for following along during the course or on your own.

Chapter 2 - Extract data within a boundary
Extract data from within the boundaries of the Monterey Bay National Marine Sanctuary and visualize the data on a map.

Chapter 3 - Matchups to ship or animal tracks
Extract satellite data around a set of points defined by longitude, latitude, and time coordinates, such as those produced by an animal telemetry tag, a ship track, or a glider track. The rxtracto function used here can now handle dataset requests that cross the dateline.

Chapter 4 - Create a transect and plot satellite data for it
Create a transect of stations between two points and create a Hovmöller plot of the satellite data showing distance along the transect against time. Thanks to Eli Holmes for supplying the code for this.

Chapter 5 - Create and plot timeseries
Extract a time series of monthly satellite chlorophyll data for the period 1997-present from four different monthly satellite datasets. Plot the results to examine the similarities and differences among the datasets. This exercise is useful for applications that require piecing together a long time series from several separate satellite missions.

Chapter 6 - Matchup satellite and buoy data
Extract SST buoy data from an ERDDAP tabular database and then extract the satellite SST data that are coincident with the buoy data.

Chapter 7 - TurtleWatch
Import SST data and apply a temperature threshold to identify turtle habitats.

Chapter 8 - Working with Projected Datasets
Download the grid for a projected sea ice dataset (coordinates are given in meters from the projection point) and calculate the indices to use to extract a subset of the projected data. Shows how to create the URL to download data directly in R, i.e. without using the rerddap or rerddapXtracto functions. Gives four different examples of ways to map this projected dataset.

1.2 rerddapXtracto R package

The “rerddapXtracto” package contains routines to simplify data extraction using ERD’s ERDDAP web service. The “rerddapXtracto” package subsets and extracts satellite and other oceanographic data from any ERDDAP server using the R package “rerddap”, developed by Scott Chamberlain and the people at rOpenSci (https://ropensci.org/).

The following is a description of the main functions of the “rerddapXtracto” package plus key functions from the “rerddap” package that are dependencies for the “rerddapXtracto” functions.

1.2.1 rxtracto function

Summary
Extracts environmental data from an ERDDAP server along an x, y, [z], and time trajectory, e.g. an animal or cruise track. The function lets you control the size of the box [cube] surrounding each x, y, [z] point that is used to determine means and other statistics. You can also control which ERDDAP server the data are pulled from.

Function
rxtracto <- function(dataInfo, parameter = NULL, xcoord = NULL, ycoord = NULL, zcoord = NULL, tcoord = NULL, xlen = 0., ylen = 0., zlen = 0., xName = 'longitude', yName = 'latitude', zName = 'altitude', tName = 'time', urlbase = 'https://coastwatch.pfeg.noaa.gov/erddap', verbose = FALSE)

Arguments

  • dataInfo
    The return from an rerddap “info” call to an ERDDAP server

  • parameter
    A character string containing the name of the parameter to extract

  • xcoord
    A comma separated array (list) of numbers containing the x-coordinates of the trajectory (if longitude, in decimal degrees East, either 0-360 or -180 to 180)

  • ycoord
    A comma separated array (list) of numbers containing the y-coordinates of the trajectory (if latitude, in decimal degrees N; -90 to 90)

  • zcoord
    A comma separated array (list) of numbers containing the z-coordinates of the trajectory (usually altitude or depth)

  • tcoord
    A comma separated array (list) of character strings with the times of the trajectory in “YYYY-MM-DD” format (for now restricted to be time).

  • xlen
    A comma separated array (list) of numbers defining the longitude box around the given point (xlen/2 around the point)

  • ylen
    A comma separated array (list) of numbers defining the latitude box around the given point (ylen/2 around the point)

  • zlen
    A comma separated array (list) of numbers defining the depth or altitude box around the given point (zlen/2 around the point)

  • xName
    A character string with name of the xcoord in the ERDDAP dataset (default “longitude”)

  • yName
    A character string with name of the ycoord in the ERDDAP dataset (default “latitude”)

  • zName
    A character string with name of the zcoord in the ERDDAP dataset (default “altitude”)

  • tName
    A character string with name of the tcoord in the ERDDAP dataset (default “time”)

  • urlbase
    A character string containing the base URL of the ERDDAP server being accessed (default “https://coastwatch.pfeg.noaa.gov/erddap”)

  • verbose
    A logical variable controlling whether the output of the URL request should be verbose (TRUE) or not (FALSE) (default FALSE)

Output

A dataframe containing:

  • column 1 - mean of data within search radius

  • column 2 - standard deviation of data within search radius

  • column 3 - number of points found within search radius

  • column 4 - time of returned value

  • column 5 - min longitude of call (decimal degrees)

  • column 6 - max longitude of call (decimal degrees)

  • column 7 - min latitude of call (decimal degrees)

  • column 8 - max latitude of call (decimal degrees)

  • column 9 - requested time in tag

  • column 10 - median of data within search radius

  • column 11 - median absolute deviation of data within search radius
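The role of xlen and ylen, and the statistics listed above, can be illustrated with a small base R sketch. This is not the package's implementation: box_summary is a hypothetical helper, the grid is synthetic, and the column names are made up for illustration. It only shows the idea of summarising the values that fall within xlen/2 and ylen/2 of a single point.

```r
# Hypothetical helper (not part of rerddapXtracto): summarise the gridded
# values inside a box centred on one (x, y) point.
box_summary <- function(grid, lons, lats, x, y, xlen = 0, ylen = 0) {
  # The box extends xlen/2 and ylen/2 on either side of the point
  lon_range <- c(x - xlen / 2, x + xlen / 2)
  lat_range <- c(y - ylen / 2, y + ylen / 2)
  in_lon <- lons >= lon_range[1] & lons <= lon_range[2]
  in_lat <- lats >= lat_range[1] & lats <= lat_range[2]
  vals <- as.vector(grid[in_lon, in_lat])
  vals <- vals[!is.na(vals)]
  data.frame(
    mean = mean(vals), stdev = sd(vals), n = length(vals),
    min_lon = lon_range[1], max_lon = lon_range[2],
    min_lat = lat_range[1], max_lat = lat_range[2],
    median = median(vals), mad = mad(vals)
  )
}

# Synthetic 5 x 5 grid with rows indexed by longitude, columns by latitude
lons <- seq(-123, -121, by = 0.5)
lats <- seq(36, 38, by = 0.5)
grid <- matrix(seq_len(25), nrow = 5, ncol = 5)

# A 1 x 1 degree box around (-122, 37) covers nine grid cells
result <- box_summary(grid, lons, lats, x = -122, y = 37, xlen = 1, ylen = 1)
print(result)
```

Increasing xlen and ylen widens the box, so more grid cells contribute to each statistic.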

Full reference

https://cran.r-project.org/web/packages/rerddapXtracto/rerddapXtracto.pdf

1.2.2 plotTrack function

Summary
plotTrack is a function to plot the results from rxtracto()

Function

plotTrack(resp, xcoord, ycoord, plotColor = "viridis", name = NA, myFunc = NA, shape = 20, size = 0.5)

Arguments

  • resp
    The data frame returned from rxtracto()

  • xcoord
    The comma separated array (list) of numbers containing the x-coordinates of the trajectory that was passed to rxtracto()

  • ycoord
    The comma separated array (list) of numbers containing the y-coordinate of the trajectory that was passed to rxtracto()

  • plotColor
    The color palette to use in the plot (the cmocean color palettes by Kristen Thyng, https://matplotlib.org/cmocean/#colormap-details)

  • name
    A name for color bar label

  • myFunc
    A function of one argument to transform the data

  • shape
    A numeric code that identifies the symbol used to mark the track (https://www.datanovia.com/en/blog/ggplot-point-shapes-best-tips/)

  • size
    The size of the symbol used to mark the track
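As a rough sketch of how rxtracto() and plotTrack() fit together, the wrapper below follows the signatures shown in this chapter. The function name match_track_to_sst, the three-point track, and the dataset ID and variable name ("jplMURSST41", "analysed_sst") are assumptions chosen for illustration; substitute a dataset available on your ERDDAP server. Calling the function needs the rerddap and rerddapXtracto packages and network access, so the call itself is left commented out.

```r
# Sketch only: the dataset ID and variable name below are assumptions.
match_track_to_sst <- function(track,
                               dataset_id = "jplMURSST41",
                               parameter = "analysed_sst") {
  # Step 1: fetch the dataset metadata that rxtracto() expects as dataInfo
  dataInfo <- rerddap::info(dataset_id,
                            url = "https://coastwatch.pfeg.noaa.gov/erddap/")
  # Step 2: extract statistics in a 0.2 x 0.2 degree box around each point
  resp <- rerddapXtracto::rxtracto(dataInfo, parameter = parameter,
                                   xcoord = track$lon, ycoord = track$lat,
                                   tcoord = track$date,
                                   xlen = 0.2, ylen = 0.2)
  # Step 3: plot the extracted values along the track
  rerddapXtracto::plotTrack(resp, track$lon, track$lat, name = "SST")
}

# Hypothetical three-point track with times given as "YYYY-MM-DD" strings
track <- data.frame(
  lon = c(-122.5, -122.8, -123.1),
  lat = c(36.5, 36.7, 36.9),
  date = c("2020-06-01", "2020-06-02", "2020-06-03"),
  stringsAsFactors = FALSE
)

# match_track_to_sst(track)  # uncomment to run against the live server
```

The same coordinate vectors are passed to both functions because plotTrack() needs the original track positions as well as the extraction result.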

Full reference

https://rmendels.github.io/rerddapXtracto_docs/reference/plotTrack.html#value

1.2.3 rxtracto_3D function

Summary
Extracts environmental data from an ERDDAP server in an (x, y, z, time) bounding box. The same call could be made directly from ERDDAP, but the function’s strength is the ability to extract data from polygons.

Function
extract <- rxtracto_3D(dataInfo, parameter = NULL, xcoord = NULL, ycoord = NULL, zcoord = NULL, tcoord = NULL, xName = "longitude", yName = "latitude", zName = "altitude", tName = "time", urlbase = "https://upwell.pfeg.noaa.gov/erddap/", verbose = FALSE)

Arguments

  • dataInfo
    the return from an rerddap “info” call to an ERDDAP server

  • parameter
    character string containing the name of the parameter to extract

  • xcoord
    a real array with the minimum and maximum x-coordinates of the bounding box (if longitude, in decimal degrees East, either 0-360 or -180 to 180)

  • ycoord
    a real array with the minimum and maximum y-coordinates of the bounding box (if latitude, in decimal degrees N; -90 to 90)

  • zcoord
    a real array with the z-coordinate (usually altitude or depth)

  • tcoord
    a character array with the minimum and maximum times of the bounding box in “YYYY-MM-DD” format (for now restricted to be time)

  • xName
    character string with name of the xcoord in the ERDDAP dataset (default “longitude”)

  • yName
    character string with name of the ycoord in the ERDDAP dataset (default “latitude”)

  • zName
    character string with name of the zcoord in the ERDDAP dataset (default “altitude”)

  • tName
    character string with name of the tcoord in the ERDDAP dataset (default “time”)

  • urlbase
    base URL of the ERDDAP server being accessed (default “https://upwell.pfeg.noaa.gov/erddap/”)

  • verbose
    logical variable (default FALSE) controlling whether the URL request should be verbose

Output

A dataframe containing:

  • extract$data - the data array with dimensions (lon,lat,time)

  • extract$varname - the name of the parameter extracted

  • extract$datasetname - ERDDAP dataset name

  • extract$longitude - the longitudes on the same scale as the request

  • extract$latitude - the latitudes always going south to north

  • extract$time - the times of the extracts
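To show how the returned structure can be used, the sketch below builds a mock object with the fields listed above and indexes it with base R. All values are synthetic and the dataset name is hypothetical; a real call to rxtracto_3D() would fill these fields from the server.

```r
# Mock object with the fields listed above; every value is synthetic.
extract <- list(
  data = array(1:24, dim = c(4, 3, 2)),     # dimensions (lon, lat, time)
  varname = "sst",
  datasetname = "hypothetical_dataset",
  longitude = c(-123, -122.5, -122, -121.5),
  latitude = c(36, 36.5, 37),
  time = as.Date(c("2020-01-01", "2020-01-02"))
)

# Map for the first time step: a lon x lat matrix
first_map <- extract$data[, , 1]

# Temporal mean at each grid cell (collapse the time dimension)
time_mean <- apply(extract$data, c(1, 2), mean, na.rm = TRUE)

# Spatial mean for each time step (collapse lon and lat)
series <- apply(extract$data, 3, mean, na.rm = TRUE)

print(dim(first_map))
print(series)
```

Because the array is ordered (lon, lat, time), the first index always selects a longitude, the second a latitude, and the third a time step.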

Full reference

https://rmendels.github.io/rerddapXtracto_docs/reference/rxtracto_3D.html

1.2.4 plotBBox function

Summary
plotBBox is a function to plot the results from rxtracto_3D().

Function
plotBBox(resp, plotColor = "viridis", time = NA, animate = FALSE, name = NA, myFunc = NA, maxpixels = 10000)

Arguments

  • resp
    data frame returned from rxtracto_3D()

  • plotColor
    the color palette to use in the plot (The cmocean color palette by Kristen Thyng https://matplotlib.org/cmocean/#colormap-details)

  • time
    a function that maps multiple time steps to a single one, or else identity for animation

  • animate
    animate the plot if there are multiple times (animate = TRUE to animate)

  • name
    name for color bar label

  • myFunc
    function of one argument to transform the data

  • maxpixels
    maximum number of pixels to use in making the map - controls resolution
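The sketch below chains rerddap::info(), rxtracto_3D(), and plotBBox() using the signatures shown in this chapter. The wrapper name plot_sst_box, the bounding box, the date, and the dataset ID and variable name ("jplMURSST41", "analysed_sst") are assumptions for illustration only. Running it needs both packages and network access, so the call is left commented out.

```r
# Sketch only: the dataset ID, variable name, box, and date are assumptions.
plot_sst_box <- function(dataset_id = "jplMURSST41",
                         parameter = "analysed_sst") {
  dataInfo <- rerddap::info(dataset_id,
                            url = "https://coastwatch.pfeg.noaa.gov/erddap/")
  # Request a single time step inside a longitude/latitude bounding box
  extract <- rerddapXtracto::rxtracto_3D(dataInfo, parameter = parameter,
                                         xcoord = c(-123, -121),
                                         ycoord = c(36, 38),
                                         tcoord = c("2020-06-01", "2020-06-01"))
  # Map the extracted field; name labels the color bar
  rerddapXtracto::plotBBox(extract, name = "SST", maxpixels = 10000)
}

# plot_sst_box()  # uncomment to run against the live server
```

Lowering maxpixels should produce a coarser map, which can help with large extractions.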

Full reference

https://rmendels.github.io/rerddapXtracto_docs/reference/plotBBox.html

1.2.5 rxtractogon function

Summary
The function rxtractogon() extracts a time-series of satellite data that are within a user supplied polygon.

Function
rxtractogon(dataInfo, parameter, xcoord = NULL, ycoord = NULL, zcoord = NULL, tcoord = NULL, xName = "longitude", yName = "latitude", zName = "altitude", tName = "time", urlbase = "https://upwell.pfeg.noaa.gov/erddap", verbose = FALSE)

Arguments

  • dataInfo
    the return from an rerddap “info” call to an ERDDAP server

  • parameter
    character string containing the name of the parameter to extract

  • xcoord
    a real array giving the longitudes (in decimal degrees East, either 0-360 or -180 to 180) of a polygon

  • ycoord
    a real array giving the latitudes (in decimal degrees N; -90 to 90) of a polygon

  • zcoord
    a real number with the z-coordinate (usually altitude or depth)

  • tcoord
    a character array of minimum and maximum times as ‘YYYY-MM-DD’

  • xName
    character string with name of the xcoord in the ERDDAP dataset (default “longitude”)

  • yName
    character string with name of the ycoord in the ERDDAP dataset (default “latitude”)

  • zName
    character string with name of the zcoord in the ERDDAP dataset (default “altitude”)

  • tName
    character string with name of the tcoord in the ERDDAP dataset (default “time”)

  • urlbase
    base URL of the ERDDAP server being accessed (default “https://upwell.pfeg.noaa.gov/erddap”)

  • verbose
    logical variable (default FALSE) controlling whether the URL request should be verbose

Output

A dataframe with the structure:

  • extract$data - the masked data array with dimensions (lon,lat,time)

  • extract$varname - the name of the parameter extracted

  • extract$datasetname - ERDDAP dataset name

  • extract$longitude - the longitudes on the same scale as the request

  • extract$latitude - the latitudes always going south to north

  • extract$time - the times of the extracts
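The idea of a masked data array can be sketched in base R. This is a conceptual illustration, not the package's code: it assumes that grid cells falling outside the polygon are set to NA, and point_in_polygon and mask_to_polygon are hypothetical helpers that use a standard ray-casting test on a synthetic grid.

```r
# Ray-casting test: TRUE if the point (px, py) lies inside the polygon
# whose vertices are (vx[i], vy[i]). Hypothetical helper for illustration.
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge between vertices j and i straddle the horizontal line at py?
    if ((vy[i] > py) != (vy[j] > py)) {
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Set every grid cell whose centre is outside the polygon to NA
mask_to_polygon <- function(grid, lons, lats, vx, vy) {
  masked <- grid
  for (i in seq_along(lons)) {
    for (k in seq_along(lats)) {
      if (!point_in_polygon(lons[i], lats[k], vx, vy)) masked[i, k] <- NA
    }
  }
  masked
}

# Synthetic 5 x 5 grid with rows indexed by longitude, columns by latitude
lons <- seq(-123, -121, by = 0.5)
lats <- seq(36, 38, by = 0.5)
grid <- matrix(seq_len(25), nrow = 5, ncol = 5)

# Hypothetical triangular polygon
poly_lon <- c(-122.75, -121.25, -122.75)
poly_lat <- c(36.25, 36.25, 38.25)

masked <- mask_to_polygon(grid, lons, lats, poly_lon, poly_lat)
print(masked)
```

Statistics computed on the masked array with na.rm = TRUE then reflect only the cells inside the polygon.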

Full reference

https://rmendels.github.io/rerddapXtracto_docs/reference/rxtractogon.html

1.2.6 rerddap::info function

Summary
A rerddap function to get information about an ERDDAP dataset.

Function
dataInfo <- rerddap::info(datasetID, url = 'https://coastwatch.pfeg.noaa.gov/erddap/')

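Arguments

  • datasetID
    character string containing the ID of the ERDDAP dataset

  • url
    base URL of the ERDDAP server being accessed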

Output

  • dataInfo$variables - a brief overview of the variables and range of possible values

  • dataInfo$alldata$longitude - all information on longitude

  • dataInfo$alldata$latitude - all information on latitude

  • dataInfo$alldata$[variable] - all information on a selected variable, e.g. dataInfo$alldata$chlorophyll

  • dataInfo$alldata$NC_GLOBAL$attribute_name - all global attribute names

  • dataInfo$alldata$NC_GLOBAL$value - all global attribute values
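The global attribute names and values listed above are parallel columns, so a single attribute can be looked up by name. The sketch below uses a mock object that mimics only the fields described here. The helper global_attribute, the variable_name column, and all values are assumptions for illustration; a real object would come from rerddap::info().

```r
# Hypothetical helper: return the value of one global attribute by name.
global_attribute <- function(dataInfo, name) {
  globals <- dataInfo$alldata$NC_GLOBAL
  globals$value[globals$attribute_name == name]
}

# Mock object mimicking only the fields listed above; values are synthetic.
dataInfo <- list(
  variables = data.frame(variable_name = "chlorophyll",
                         stringsAsFactors = FALSE),
  alldata = list(
    NC_GLOBAL = data.frame(
      attribute_name = c("title", "time_coverage_end"),
      value = c("Synthetic chlorophyll dataset", "2020-12-31T00:00:00Z"),
      stringsAsFactors = FALSE
    )
  )
)

# Look up the end of the dataset's time coverage
coverage_end <- global_attribute(dataInfo, "time_coverage_end")
print(coverage_end)
```

With a real dataInfo object, the same lookup is a quick way to check a dataset's time coverage before choosing tcoord values.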

Full reference

https://www.rdocumentation.org/packages/rerddap/versions/0.5.0/topics/info
