Chapter 1 Introduction
notebook filename | 01-Introduction.Rmd
1.1 Exercise book contents
This document describes R scripts used for demonstrations in the NOAA Satellite Course. The scripts highlight the uses of rerddapXtracto, an R package with functions that allow easy extraction of satellite data from ERDDAP servers. Some of these exercises are also in the rerddapXtracto vignette at https://cran.r-project.org/web/packages/rerddapXtracto/vignettes/UsingrerddapXtracto.html
This chapter will provide an overview of the main rerddapXtracto functions. Please review the information presented here before moving on to the other chapters.
The remaining chapters each contain a separate demonstration. The demonstrations are HTML or PDF versions of R notebooks. The source notebook files (.Rmd) are available for following along during the course or on your own.
Chapter 2 - Extract data within a boundary
Extract data from within the boundaries of the Monterey Bay National Marine Sanctuary and visualize the data on a map.
Chapter 3 - Matchups to ship or animal tracks
Extract satellite data around a set of points defined by longitude, latitude, and time coordinates, like those produced by an animal telemetry tag, a ship track, or a glider track. This function can now handle dataset requests that cross the dateline.
Chapter 4 - Create a transect and plot satellite data for it
Create a transect of stations between two points and create a Hovmöller plot of the satellite data showing distance along the transect against time. Thanks to Eli Holmes for supplying the code for this.
Chapter 5 - Create and plot timeseries
Extract a time-series of monthly satellite chlorophyll data for the period of 1997-present from four different monthly satellite datasets. Plot the results to examine the similarities and differences among the datasets. This exercise is useful for applications that require piecing together a long time series from several separate satellite missions.
Chapter 6 - Matchup satellite and buoy data
Extract SST buoy data from an ERDDAP tabular database and then extract the SST satellite data that is coincident with the buoy data.
Chapter 7 - TurtleWatch
Import SST data and apply a temperature threshold to identify turtle habitats.
Chapter 8 - Working with Projected Datasets
Download the grid for a projected sea ice dataset (coordinates are given in meters from the projection point) and calculate the indices to use to extract a subset of the projected data. Shows how to create the URL to download data directly in R, i.e. without using the rerddap or rerddapXtracto functions. Gives four examples of ways to map this projected dataset.
1.2 rerddapXtracto R package
The “rerddapXtracto” package contains routines to simplify data extraction using ERD’s ERDDAP web service. The “rerddapXtracto” package subsets and extracts satellite and other oceanographic data from any ERDDAP server using the R package “rerddap”, developed by Scott Chamberlain and the people at rOpenSci (https://ropensci.org/).
The following is a description of the main functions of the “rerddapXtracto” package plus key functions from the “rerddap” package that are dependencies for the “rerddapXtracto” functions.
1.2.1 rxtracto function
Summary
Extracts environmental data from an ERDDAP server along an x, y, [z], and time trajectory, e.g. an animal or cruise track. The function lets you control the size of the box [cube] surrounding each x, y, [z] point that is used to calculate means and other statistics. You can also choose which ERDDAP server to pull data from.
Function
rxtracto <- function(dataInfo, parameter = NULL, xcoord = NULL,
ycoord = NULL, zcoord = NULL, tcoord = NULL, xlen = 0., ylen = 0.,
zlen = 0., xName = 'longitude', yName = 'latitude', zName = 'altitude',
tName = 'time', urlbase = 'https://coastwatch.pfeg.noaa.gov/erddap', verbose = FALSE)
Arguments
dataInfo
The return from an rerddap “info” call to an ERDDAP server
parameter
A character string containing the name of the parameter to extract
xcoord
A comma separated array (list) of numbers containing the x-coordinates of the trajectory (if longitude, in decimal degrees East, either 0-360 or -180 to 180)
ycoord
A comma separated array (list) of numbers containing the y-coordinates of the trajectory (if latitude, in decimal degrees N; -90 to 90)
zcoord
A comma separated array (list) of numbers containing the z-coordinates of the trajectory (usually altitude or depth)
tcoord
A comma separated array (list) of character strings with the times of the trajectory in the format “YYYY-MM-DD” (for now restricted to be time)
xlen
A comma separated array (list) of numbers defining the longitude box around the given point (xlen/2 around the point)
ylen
A comma separated array (list) of numbers defining the latitude box around the given point (ylen/2 around the point)
zlen
A comma separated array (list) of numbers defining the depth or altitude box around the given point (zlen/2 around the point)
xName
A character string with the name of the xcoord in the ERDDAP dataset (default “longitude”)
yName
A character string with the name of the ycoord in the ERDDAP dataset (default “latitude”)
zName
A character string with the name of the zcoord in the ERDDAP dataset (default “altitude”)
tName
A character string with the name of the tcoord in the ERDDAP dataset (default “time”)
urlbase
A character string containing the base URL of the ERDDAP server being accessed (default “http://upwell.pfeg.noaa.gov/erddap”)
verbose
A logical variable controlling whether the verbosity of the URL request should be high (TRUE) or low (FALSE) (default FALSE)
Output
A dataframe containing:
column 1 - mean of data within search radius
column 2 - standard deviation of data within search radius
column 3 - number of points found within search radius
column 4 - time of returned value
column 5 - min longitude of call (decimal degrees)
column 6 - max longitude of call (decimal degrees)
column 7 - min latitude of call (decimal degrees)
column 8 - max latitude of call (decimal degrees)
column 9 - requested time in tag
column 10 - median of data within search radius
column 11 - median absolute deviation of data within search radius
Full reference
https://cran.r-project.org/web/packages/rerddapXtracto/rerddapXtracto.pdf
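As a rough illustration of how xlen and ylen relate to the extraction box, the sketch below computes the box edges (xlen/2 and ylen/2 on either side of each point) for a small track and then wraps the matching rxtracto() call in a function. The track values, the helper names box_bounds and extract_along_track, and the dataset_id and parameter placeholders are assumptions made for this example and do not come from the course material.

```r
# Hypothetical track: three positions with dates (all values are made up).
track <- data.frame(
  lon  = c(-122.5, -123.0, -123.6),
  lat  = c(36.6, 36.9, 37.2),
  date = c("2020-06-01", "2020-06-02", "2020-06-03"),
  stringsAsFactors = FALSE
)

# Box widths in degrees. The box spans xlen/2 and ylen/2 either side of a point.
xlen <- 0.2
ylen <- 0.2

# Illustrative helper: the box edges implied by xlen and ylen for each point.
box_bounds <- function(x, y, xlen, ylen) {
  data.frame(
    min_lon = x - xlen / 2, max_lon = x + xlen / 2,
    min_lat = y - ylen / 2, max_lat = y + ylen / 2
  )
}
bounds <- box_bounds(track$lon, track$lat, xlen, ylen)
print(bounds)

# The matching rxtracto() call, wrapped in a function so that nothing is
# downloaded until it is called. dataset_id and parameter are placeholders
# that must be replaced with a real ERDDAP dataset ID and variable name.
extract_along_track <- function(dataset_id, parameter, track, xlen, ylen,
                                erddap_url = "https://coastwatch.pfeg.noaa.gov/erddap/") {
  dataInfo <- rerddap::info(dataset_id, url = erddap_url)
  rerddapXtracto::rxtracto(
    dataInfo, parameter = parameter,
    xcoord = track$lon, ycoord = track$lat, tcoord = track$date,
    xlen = xlen, ylen = ylen
  )
}
```

Calling extract_along_track() requires the rerddap and rerddapXtracto packages and a network connection. The first part of the sketch runs in base R alone.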
1.2.2 plotTrack function
Summary
plotTrack is a function to plot the results from rxtracto()
Function
plotTrack(resp, xcoord, ycoord, plotColor = "viridis", name = NA, myFunc = NA, shape = 20, size = 0.5)
Arguments
resp
The data frame returned from rxtracto()
xcoord
The comma separated array (list) of numbers containing the x-coordinates of the trajectory that was passed to rxtracto()
ycoord
The comma separated array (list) of numbers containing the y-coordinates of the trajectory that was passed to rxtracto()
plotColor
The color palette to use in the plot (the cmocean color palette by Kristen Thyng https://matplotlib.org/cmocean/#colormap-details)
name
A name for the color bar label
myFunc
A function of one argument to transform the data
shape
A numeric code that identifies the symbol to use to mark the track (https://www.datanovia.com/en/blog/ggplot-point-shapes-best-tips/)
size
The size of the symbol to use to mark the track
Full reference
https://rmendels.github.io/rerddapXtracto_docs/reference/plotTrack.html#value
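A minimal sketch of the myFunc argument follows, assuming the data are better viewed on a log scale (often the case for chlorophyll). The names log_transform and plot_track_log are hypothetical and chosen for this example. The plotTrack() arguments follow the signature listed above.

```r
# A function of one argument, which is the form myFunc expects.
log_transform <- function(x) log(x)

# Wrapper around plotTrack(). resp is the object returned by rxtracto(), and
# xcoord and ycoord are the same coordinates that were passed to rxtracto().
# Nothing is plotted until the wrapper is called.
plot_track_log <- function(resp, xcoord, ycoord) {
  rerddapXtracto::plotTrack(
    resp, xcoord, ycoord,
    plotColor = "viridis",
    name = "log(value)",
    myFunc = log_transform
  )
}
```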
1.2.3 rxtracto_3D function
Summary
Extracts environmental data from an ERDDAP server in an (x, y, z, time) bounding box. The same call could be made directly from ERDDAP, but the function’s strength is the ability to extract data from polygons.
Function
extract <- rxtracto_3D(dataInfo, parameter = NULL, xcoord = NULL, ycoord = NULL,
zcoord = NULL, tcoord = NULL, xName = "longitude", yName = "latitude",
zName = "altitude", tName = "time",
urlbase = "https://upwell.pfeg.noaa.gov/erddap/", verbose = FALSE)
Arguments
dataInfo
the return from an rerddap “info” call to an ERDDAP server
parameter
character string containing the name of the parameter to extract
xcoord
a real array with the x-coordinates of the trajectory (if longitude, in decimal degrees East, either 0-360 or -180 to 180)
ycoord
a real array with the y-coordinates of the trajectory (if latitude, in decimal degrees N; -90 to 90)
zcoord
a real array with the z-coordinates (usually altitude or depth)
tcoord
a character array with the times of the trajectory in “YYYY-MM-DD” (for now restricted to be time)
xName
character string with the name of the xcoord in the ERDDAP dataset (default “longitude”)
yName
character string with the name of the ycoord in the ERDDAP dataset (default “latitude”)
zName
character string with the name of the zcoord in the ERDDAP dataset (default “altitude”)
tName
character string with the name of the tcoord in the ERDDAP dataset (default “time”)
urlbase
base URL of the ERDDAP server being accessed (default “http://upwell.pfeg.noaa.gov/erddap”)
verbose
logical variable (default FALSE) controlling whether the URL request should be verbose
Output
A dataframe containing:
extract$data - the data array with dimensions (lon,lat,time)
extract$varname - the name of the parameter extracted
extract$datasetname - ERDDAP dataset name
extract$longitude - the longitudes on the same scale as the request
extract$latitude - the latitudes always going south to north
extract$time - the times of the extracts
Full reference
https://rmendels.github.io/rerddapXtracto_docs/reference/rxtracto_3D.html
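To make the (lon,lat,time) layout of extract$data concrete, the sketch below builds a small stand-in object with the same element names as the output listed above and shows a few common ways to slice it. The numbers and the dataset name are invented for illustration. A real object would come from an rxtracto_3D() call.

```r
# Stand-in for an rxtracto_3D() result (all values are invented).
lon   <- c(-123.0, -122.5, -122.0)
lat   <- c(36.0, 36.5)
times <- c("2020-01-16", "2020-02-16")

extract <- list(
  data        = array(1:12, dim = c(length(lon), length(lat), length(times))),
  varname     = "sst",
  datasetname = "hypothetical_dataset",
  longitude   = lon,
  latitude    = lat,
  time        = times
)

# A single time step as a (lon, lat) matrix.
first_map <- extract$data[, , 1]

# Average over time for every grid cell (collapses the third dimension).
time_mean <- apply(extract$data, c(1, 2), mean, na.rm = TRUE)

# Spatial mean for every time step (one value per time).
series <- apply(extract$data, 3, mean, na.rm = TRUE)
print(data.frame(time = extract$time, mean = series))
```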
1.2.4 plotBBox function
Summary
plotBBox is a function to plot the results from rxtracto_3D().
Function
plotBBox(resp, plotColor = "viridis", time = NA, animate = FALSE, name = NA, myFunc = NA, maxpixels = 10000)
Arguments
resp
data frame returned from rxtracto_3D()
plotColor
the color palette to use in the plot (the cmocean color palette by Kristen Thyng https://matplotlib.org/cmocean/#colormap-details)
time
a function to map multi-time to one, or else identity for animation
animate
animate the plot if there are multiple times (animate = TRUE to animate)
name
name for the color bar label
myFunc
function of one argument to transform the data
maxpixels
maximum number of pixels to use in making the map - controls resolution
Full reference
https://rmendels.github.io/rerddapXtracto_docs/reference/plotBBox.html
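As a sketch of the time argument, the example below defines a function that reduces several time steps to a single value and passes it to plotBBox(). It assumes, based on the argument description above, that plotBBox() uses this function to collapse multiple times into one map. The names mean_over_time and plot_mean_map are hypothetical.

```r
# Collapse a vector of values through time into one value, ignoring missing data.
mean_over_time <- function(x) mean(x, na.rm = TRUE)

# Wrapper around plotBBox(). resp is the object returned by rxtracto_3D().
# Nothing is plotted until the wrapper is called.
plot_mean_map <- function(resp) {
  rerddapXtracto::plotBBox(
    resp,
    plotColor = "viridis",
    time = mean_over_time,
    name = "time mean"
  )
}
```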
1.2.5 rxtractogon function
Summary
The function rxtractogon() extracts a time-series of satellite data that are within a user supplied polygon.
Function
rxtractogon(dataInfo, parameter, xcoord = NULL, ycoord = NULL,
zcoord = NULL, tcoord = NULL, xName = "longitude", yName = "latitude",
zName = "altitude", tName = "time",
urlbase = "https://upwell.pfeg.noaa.gov/erddap", verbose = FALSE)
Arguments
dataInfo
the return from an rerddap “info” call to an ERDDAP server
parameter
character string containing the name of the parameter to extract
xcoord
a real array giving the longitudes (in decimal degrees East, either 0-360 or -180 to 180) of a polygon
ycoord
a real array giving the latitudes (in decimal degrees N; -90 to 90) of a polygon
zcoord
a real number with the z-coordinate (usually altitude or depth)
tcoord
a character array of minimum and maximum times as “YYYY-MM-DD”
xName
character string with the name of the xcoord in the ERDDAP dataset (default “longitude”)
yName
character string with the name of the ycoord in the ERDDAP dataset (default “latitude”)
zName
character string with the name of the zcoord in the ERDDAP dataset (default “altitude”)
tName
character string with the name of the tcoord in the ERDDAP dataset (default “time”)
urlbase
base URL of the ERDDAP server being accessed (default “http://upwell.pfeg.noaa.gov/erddap”)
verbose
logical variable (default FALSE) controlling whether the URL request should be verbose
Output
A dataframe with the structure:
extract$data - the masked data array with dimensions (lon,lat,time)
extract$varname - the name of the parameter extracted
extract$datasetname - ERDDAP dataset name
extract$longitude - the longitudes on the same scale as the request
extract$latitude - the latitudes always going south to north
extract$time - the times of the extracts
Full reference
https://rmendels.github.io/rerddapXtracto_docs/reference/rxtractogon.html
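The idea of a masked data array can be illustrated in base R: grid cells whose centers fall outside the polygon are set to NA, and a time series is then computed from what remains. The sketch below uses a simple ray-casting test written for this example. It is not the package's implementation, and the polygon, grid, and data values are invented.

```r
# Ray-casting point-in-polygon test (illustration only).
point_in_polygon <- function(px, py, poly_x, poly_y) {
  n <- length(poly_x)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the horizontal line y = py?
    if ((poly_y[i] > py) != (poly_y[j] > py)) {
      x_at_py <- poly_x[i] +
        (py - poly_y[i]) * (poly_x[j] - poly_x[i]) / (poly_y[j] - poly_y[i])
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Invented triangular polygon and a small regular grid.
poly_x <- c(0, 4, 0)
poly_y <- c(0, 0, 4)
lon <- c(0.5, 1.5, 2.5, 3.5)
lat <- c(0.25, 1.25, 2.25)

# Logical (lon, lat) mask: TRUE where the cell center lies inside the polygon.
mask <- outer(lon, lat, Vectorize(function(x, y) {
  point_in_polygon(x, y, poly_x, poly_y)
}))

# Invented data with dimensions (lon, lat, time): 1 at time 1, 2 at time 2.
data <- array(rep(c(1, 2), each = length(lon) * length(lat)),
              dim = c(length(lon), length(lat), 2))

# Apply the same mask to every time step, then average inside the polygon.
masked <- data
masked[rep(!mask, dim(data)[3])] <- NA
series <- apply(masked, 3, mean, na.rm = TRUE)
```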
1.2.6 rerddap::info function
Summary
An rerddap function to get information about an ERDDAP dataset.
Function
dataInfo <- rerddap::info(datasetID, url = 'https://coastwatch.pfeg.noaa.gov/erddap/')
Arguments
datasetID
the ERDDAP id for a dataset
url
the URL for the ERDDAP server to use
additional arguments
https://www.rdocumentation.org/packages/rerddap/versions/0.5.0/topics/info
Output
dataInfo$variables - a brief overview of the variables and range of possible values
dataInfo$alldata$longitude - all information on longitude
dataInfo$alldata$latitude - all information on latitude
dataInfo$alldata$[variable] - all information on a selected variable, e.g. dataInfo$alldata$chlorophyll
dataInfo$alldata$NC_GLOBAL$attribute_name - all global attribute names
dataInfo$alldata$NC_GLOBAL$value - all global attribute values
Full reference
https://www.rdocumentation.org/packages/rerddap/versions/0.5.0/topics/info
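The sketch below shows one way to look up attributes in the structure described above, using a small mock object so that it runs without a network connection. It assumes that each element of dataInfo$alldata is a table with attribute_name and value columns, as listed for NC_GLOBAL. The mock contents and the helper name get_attribute are hypothetical.

```r
# Mock of the parts of an info() result described above (contents are invented).
mock_info <- list(
  variables = data.frame(
    variable_name = "chlorophyll",
    data_type     = "float",
    actual_range  = "0.001, 100",
    stringsAsFactors = FALSE
  ),
  alldata = list(
    NC_GLOBAL = data.frame(
      attribute_name = c("title", "time_coverage_start"),
      value          = c("Hypothetical chlorophyll dataset", "1997-09-16T00:00:00Z"),
      stringsAsFactors = FALSE
    ),
    chlorophyll = data.frame(
      attribute_name = c("units", "long_name"),
      value          = c("mg m-3", "Chlorophyll concentration"),
      stringsAsFactors = FALSE
    )
  )
)

# Illustrative helper: return the value of one attribute for a variable,
# or for the global attributes when variable = "NC_GLOBAL".
get_attribute <- function(dataInfo, variable, attribute) {
  tbl <- dataInfo$alldata[[variable]]
  tbl$value[tbl$attribute_name == attribute]
}

print(get_attribute(mock_info, "NC_GLOBAL", "title"))
print(get_attribute(mock_info, "chlorophyll", "units"))
```

With a real dataset, mock_info would be replaced by the object returned from rerddap::info().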