ERDDAP Easier access to scientific data
|
Brought to you by
NOAA
NMFS
SFSC
ERD
ERDDAP (the Environmental Research Division's Data Access Program)
aggregates scientific data from diverse local and remote sources
and offers a simple, consistent way to download subsets of the data in common file formats and make graphs and maps.
The Problems that ERDDAP Tries To Solve
Without ERDDAP, when a person (or a computer program) looks on the Internet
for a specific type of scientific data (for example, satellite sea surface temperature data),
there are problems ...
- The datasets of interest are hard to find because they are at many different web sites.
- Each site requires a different protocol to request the data
(for example, HTTP GET, XML, SOAP+XML, DAP, WCS, WFS, SOS, or an HTML form).
- Each site returns the data in a different format
(for example,
XML, SOAP+XML, DAP binary data stream, ASCII text, HDF 4, HDF 5, NetCDF, ...)
and it isn't the common file format that you want.
- Data from different sites is hard to compare
because the dates+times are expressed in different formats
(for example, "Jan 2, 1985", "02-JAN-1985", "1/2/85", "2/1/85",
"1985-01-02", or days since Jan 1, 1980, or ...).
- ERDDAP can get data from local (on the server's hard drive) and remote (accessed via the web) data sources.
See the list of types of data sources that ERDDAP can access.
- ERDDAP can serve many types of scientific data.
It isn't just for oceanographic data.
ERDDAP is a Data Access Program that was written at NOAA NMFS SFSC ERD.
The ERDDAP server at ERD serves oceanographic data,
but ERDDAP (the program) can access and serve any gridded or tabular data.
- ERDDAP uses just two basic data structures to hold data.
- Since it is difficult for human clients and computer clients to deal with a large number of dataset structures,
ERDDAP uses just two basic data structures:
a grid data structure (for gridded data) and a table data structure (for tabular data).
- Certainly, not all data can be expressed in these structures, but much of it can.
Tables, in particular, are very flexible data structures
(look at the success of relational database programs).
- This makes data queries easier to construct.
- This makes data responses have a simple structure,
which makes it easier to serve the data in a wider variety
of standard file types (which often just support simple data structures).
- This, in turn, makes it very easy for us (or anyone) to write client software which works with all ERDDAP datasets.
- This makes it easier to compare data from different sources.
- And even if this sounds odd to you, most ERDDAP clients will never notice --
they will simply see that all of the datasets have a nice simple structure
and they will be thankful that they can get data from a wide variety of sources
returned in a wide variety of file formats.
- Other data structures could be supported in the future if called for and if we find a good approach to working with them.
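As a rough illustration of the two structures described above, the sketch below models a grid and a table in Python. The class names `Grid` and `Table`, their fields, and the sample values are hypothetical, chosen only for this example; they are not ERDDAP's internal classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Grid:
    """Gridded data: every data variable shares the same ordered axes."""
    axes: Dict[str, List[float]]   # axis name -> axis values, in axis order
    values: Dict[str, list]        # variable name -> nested lists, one level per axis

    def shape(self) -> tuple:
        # The shape follows directly from the axis lengths.
        return tuple(len(axis_values) for axis_values in self.axes.values())


@dataclass
class Table:
    """Tabular data: named columns of equal length, one row per observation."""
    columns: Dict[str, list]

    def n_rows(self) -> int:
        lengths = {len(column) for column in self.columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        return lengths.pop() if lengths else 0

    def row(self, i: int) -> dict:
        return {name: column[i] for name, column in self.columns.items()}


# Made-up sample data for illustration only.
sst = Grid(
    axes={"time": [0.0, 86400.0], "latitude": [30.0, 31.0, 32.0], "longitude": [-125.0, -124.0]},
    values={"sst": [[[14.1, 14.3], [13.9, 14.0], [13.5, 13.7]],
                    [[14.2, 14.4], [14.0, 14.1], [13.6, 13.8]]]},
)
buoy = Table(columns={"time": [0.0, 3600.0], "station": ["A", "A"], "wtmp": [13.2, 13.4]})
```

Because every dataset is one of these two shapes, a client only needs to know how to address axes (for a grid) or columns and rows (for a table).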
- ERDDAP offers several ways to search for datasets of interest,
for example, full text search and search by category.
- ERDDAP lets you make requests in a standardized way,
regardless of the data source's request protocol.
ERDDAP also provides Data Access Forms (web pages) which help humans create the DAP requests.
OPeNDAP's DAP is the recommended IOOS DMAC data transport mechanism and a NASA EOSDIS standard.
(DAP is great!)
ERDDAP translates your request from the DAP format to the data source's request format
and converts the response to one of the simple data structures.
Then ERDDAP reformats the data in the common file format of your choice
(for example, .html table, ESRI .asc, Google Earth .kml, .mat, .nc, .csv, .tsv, .json, .xhtml, .png)
and sends the file to you.
See the list of grid file types
and the list of table file types.
Other protocols for requesting the data (for example, WCS, WFS, and SOS)
may be added/supported in the future.
ERDDAP is structured for these additions and there don't seem to be any impediments.
- ERDDAP requests can be made in user units.
Although requests for gridded data in ERDDAP can be made with array indices (following the DAP specification),
requests can also be in user units (for example, degrees east), using
a parentheses notation.
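The difference between the two request styles can be sketched as follows. This is a minimal example that assumes the bracket syntax `[start:stride:stop]` for array indices and `[(start):stride:(stop)]` for user units; the variable name `sst`, the axis order (time, latitude, longitude), and all values are hypothetical.

```python
def index_range(start: int, stop: int, stride: int = 1) -> str:
    """Constraint for one axis using array indices (DAP style)."""
    return f"[{start}:{stride}:{stop}]"


def value_range(start, stop, stride: int = 1) -> str:
    """Constraint for one axis using user units, marked by parentheses."""
    return f"[({start}):{stride}:({stop})]"


def grid_query(variable: str, constraints: list) -> str:
    """Join a variable name with one constraint per axis, in axis order."""
    return variable + "".join(constraints)


# The same kind of subset, requested by index and by value.
by_index = grid_query("sst", [index_range(0, 0), index_range(100, 200), index_range(300, 400)])
by_value = grid_query("sst", [
    value_range("2010-01-01T00:00:00Z", "2010-01-01T00:00:00Z"),
    value_range(30, 40),      # latitude, degrees north
    value_range(-130, -120),  # longitude, degrees east
])

print(by_index)  # sst[0:1:0][100:1:200][300:1:400]
print(by_value)
```

With indices, the client must first look up which index corresponds to which latitude or time. With the parentheses form, the client states the desired values directly and the server finds the matching indices.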
- ERDDAP sends results in common data file formats.
The results can be returned in any of several common data file formats
(for example, .html table, ESRI .asc, Google Earth .kml, .mat, .nc, .csv, .tsv, .json, .xhtml),
instead of just the original format
or just the DAP transfer format (which has no standard file manifestation).
These files are created on-the-fly.
Since there are few internal
data structures, it is easy to add additional file-type drivers.
See the list of grid file types
and the list of table file types.
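Since the file type is part of the request, the same query can be reused for several formats. The sketch below assumes a URL layout of `{server}/{protocol}/{datasetID}{fileType}?{query}`; the server address, the dataset ID `exampleBuoys`, and the variable names are hypothetical.

```python
from urllib.parse import quote

BASE_URL = "https://example.com/erddap"   # hypothetical server
DATASET_ID = "exampleBuoys"               # hypothetical tabular dataset


def request_url(protocol: str, dataset_id: str, file_type: str, query: str) -> str:
    """Build a request URL, percent-encoding characters such as '>' in the query."""
    encoded = quote(query, safe="&=,:")
    return f"{BASE_URL}/{protocol}/{dataset_id}{file_type}?{encoded}"


# One query: which variables to return, plus a time constraint.
QUERY = "time,latitude,longitude,wtmp&time>=2010-01-01T00:00:00Z"

# The same data in three formats: only the file-type extension changes.
urls = {ft: request_url("tabledap", DATASET_ID, ft, QUERY) for ft in (".csv", ".json", ".nc")}

for file_type, url in urls.items():
    print(file_type, url)
```

Only the extension differs between the three URLs, so a client that can build one request can ask for any supported format.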
- ERDDAP standardizes the variable names
and units for longitude, latitude, altitude, and time in the results.
To facilitate comparisons of data from different datasets,
the requests and results in ERDDAP use standardized space/time axis units:
- longitude is always in degrees_east.
- latitude is always in degrees_north.
- altitude is always in meters with positive=up.
- time when formatted as a number is always in "seconds since 1970-01-01T00:00:00Z"
(which is UDUNITS-compatible) and, when formatted as a string, is formatted according to the
ISO 8601:2004 "extended" format standard
(YYYY-MM-DDThh:mm:ssZ, for example, "1985-01-02T00:00:00Z").
Also, to avoid time zone and daylight saving time confusion,
time values are always converted to the UTC time zone.
This makes it easy to specify constraints in requests without having to worry about
the altitude data format (are positive values up or down? in meters or fathoms?)
or the time data format (a nightmarish realm of possible formats and time zones).
This makes the results from different data sources easy to compare.
Because the longitude, latitude, altitude, and time variables are specifically
recognized, ERDDAP is aware of the geo/temporal features of each dataset.
This is useful when making images with maps or time-series,
and when saving data in geo-referenced file types (e.g., .esriAscii, .geoJson, and .kml).
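The time convention above can be illustrated with a short sketch using only Python's standard library. The helper names are hypothetical, and the ambiguous string "1/2/85" is assumed here to mean month/day/year. Three differently expressed source values all reduce to the same number and the same ISO 8601 string.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_seconds(dt: datetime) -> float:
    """Convert a time-zone-aware datetime to seconds since 1970-01-01T00:00:00Z."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: the time zone would be a guess")
    return dt.timestamp()


def to_iso_utc(epoch_seconds: float) -> str:
    """Format epoch seconds as an ISO 8601 extended string in UTC."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


# The same instant, as three differently formatted source values.
from_us_style = datetime.strptime("1/2/85", "%m/%d/%y").replace(tzinfo=timezone.utc)
from_day_count = datetime(1980, 1, 1, tzinfo=timezone.utc) + timedelta(days=1828)
from_local = datetime(1985, 1, 1, 16, 0, tzinfo=timezone(timedelta(hours=-8)))

for source_value in (from_us_style, from_day_count, from_local):
    seconds = to_epoch_seconds(source_value)
    print(seconds, to_iso_utc(seconds))  # 473472000.0 1985-01-02T00:00:00Z
```

Once every source is reduced to one numeric form and one string form, a time constraint written for one dataset can be reused unchanged for another.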
- ERDDAP adds metadata.
Many data sources have little or no metadata (for example, CF metadata) describing the data.
ERDDAP lets (and encourages) the administrator to specify metadata which
will be added to datasets and their variables on-the-fly.
See the addAttributes section of the directions for administrators.
- ERDDAP lets you request .png and .pdf image files with graphs and maps of the data
in addition to the actual data.
Three special uses of these images are:
- Web page authors can embed a graph with the latest data in a web page
using an HTML <img> tag.
- Anyone can use ERDDAP's Slide Sorter
to build a personal web page that displays graphs with the latest data
(or other images or HTML content), each in its own, draggable slide.
- Anyone can use or make Google Gadgets
to display images with the latest data on their iGoogle home page.
- Requesting Compressed Files
ERDDAP doesn't offer results stored in compressed (e.g., .zip or .gzip) files.
Instead, ERDDAP looks for
accept-encoding
in the HTTP GET request header sent by the client.
If a supported compression type ("gzip", "x-gzip", or "deflate") is found in the accept-encoding list,
ERDDAP includes "content-encoding" in the HTTP response header and compresses the data as it transmits it.
It is up to the client program to look for "content-encoding" and decompress the data.
Browsers and OPeNDAP clients do this by default. They request compressed data and decompress the returned data automatically.
Other clients (e.g., Java programs) have to do this explicitly.
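Python's `urllib` is another client that does not decompress responses automatically. The sketch below shows the explicit handling described above: it sends an accept-encoding header and then decompresses according to the content-encoding header in the response. The function names are hypothetical, and any URL passed to `fetch` is the caller's choice.

```python
import gzip
import urllib.request
import zlib


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Decompress a response body according to its content-encoding header."""
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)
    return body  # no content-encoding: the body was sent uncompressed


def fetch(url: str) -> bytes:
    """Request compressed data and decompress it explicitly."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip, deflate"})
    with urllib.request.urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding", "")
        return decode_body(response.read(), encoding)
```

Keeping `decode_body` separate from the network call makes the decompression logic easy to check without contacting a server.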
- ERDDAP offers email/URL and RSS subscription services,
so you can be notified whenever a dataset changes.
ERDDAP is very good at detecting changes to gridded datasets because it can detect
when the axis values (e.g., the time values) change.
ERDDAP is not very good at detecting changes to tabular datasets because there are usually
no changes to the metadata when new data is added.
ERDDAP will detect if a dataset becomes unavailable (but perhaps not immediately).
ERDDAP will detect when that dataset becomes available again.
ERDDAP makes no promises about the suitability or accuracy of these services (see the
DISCLAIMER OF LIABILITY).
Email/URL Subscriptions
(not available at some ERDDAP installations)
Whenever a dataset changes, the email/URL subscription system will immediately
send you an email or contact a URL that you specify.
To set up an email/URL subscription, click on one of the envelope icons
that appear at the far right on ERDDAP web pages with lists of datasets
and on the Data Access Forms and Make A Graph web pages for individual datasets,
if this ERDDAP installation supports email/URL subscriptions.
(Computer programmers: if you write web services, you can use the URL system
to have ERDDAP notify your web service immediately whenever a dataset changes.)
RSS Subscriptions
RSS is a standard system for notifying users when the content at a web site has changed.
Modern web browsers have an RSS client built in
or you can use a separate RSS Reader.
ERDDAP offers a separate RSS 2.01 feed for each dataset
so that you can find out when datasets of interest have changed.
To subscribe to a dataset's RSS feed, click on one of the RSS icons
that appear at the far right on ERDDAP web pages with lists of datasets
or on the Data Access Forms and Make A Graph web pages for individual datasets.
Comparison
The RSS service may be just what you are looking for. It is a nice standard.
But if you need to know as soon as possible when a dataset changes, use the email/URL system, not RSS.
RSS clients periodically (every hour?) request and read the RSS XML document to look for changes.
So typically, an RSS client will not detect a change to a dataset quickly (average 30 minutes?).
In contrast, the email/URL subscription system acts immediately whenever ERDDAP detects a change to a dataset.
The more active approach of the email/URL system is also much more efficient:
You may be able to set your RSS client to check for changes every minute (don't do it!),
but that would just lead to lots of unnecessary requests to the ERDDAP server
and it still wouldn't detect changes immediately.
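The trade-off in this comparison is simple arithmetic, sketched below under the assumption that a dataset change is equally likely at any moment between two polls. The function names are hypothetical.

```python
def average_detection_delay_minutes(poll_interval_minutes: float) -> float:
    """Average wait before a polling client notices a change.

    Assumes the change happens at a uniformly random time between polls,
    so on average half an interval passes before the next poll.
    """
    return poll_interval_minutes / 2


def requests_per_day(poll_interval_minutes: float) -> float:
    """Number of polling requests one client sends to the server per day."""
    return 24 * 60 / poll_interval_minutes


for interval in (60, 1):
    print(interval, average_detection_delay_minutes(interval), requests_per_day(interval))
# 60 30.0 24.0
# 1 0.5 1440.0
```

Polling every minute cuts the average delay to half a minute, but at the cost of 1440 requests per day per client, nearly all of which find no change. A push notification needs no polling requests at all.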
- ERDDAP is a web application (for humans with browsers) and a web service
(with services for computer programs).
- ERDDAP has REST- and ROA-style links to make its services available to computer programs.
These features can be used to build another web service on top of ERDDAP (making ERDDAP do all the work!).
ERDDAP is not intended to be a high-level data exploration/graphing service.
Instead, ERDDAP is intended to provide services for such web sites and programs.
So if you have an idea for a better interface to the data that ERDDAP serves,
we encourage you to build your own web application or web service, and use ERDDAP as the foundation.
Read more about ERDDAP's interface for Computer Programs.
- Security - By default, ERDDAP runs as an entirely public server
with no login system and no restrictions to data access.
However, an ERDDAP administrator can configure ERDDAP to restrict access to some or all datasets
to users who log in and have been assigned certain roles.
ERDDAP has built-in methods for authentication (logging in).
If an ERDDAP installation has authentication turned on, there will be a "log in" link at the top of each web page.
Users never have to log in to access the publicly available datasets.
Users who have logged in can access public datasets and the private datasets to which they are allowed access.
ERDDAP uses http URLs for users who aren't logged in, and https (Secure Sockets Layer)
URLs for users who are.
More information.
- ERDDAP processes data in chunks.
To save memory (a big issue) and make responses start sooner,
ERDDAP processes data requests in chunks --
repeatedly getting a chunk of data from the source, cleaning it up (for example, adding metadata),
and sending that to the client.
For many data sources, this means that the first chunk of data (for example, from the first sensor)
gets to the client in seconds
instead of minutes (for example, after data from the last sensor has been retrieved),
reassuring the client that the data is coming (albeit slowly).
From a memory standpoint, this allows numerous large requests
(each larger than available memory) to be handled simultaneously.
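The chunked approach can be sketched with a Python generator. This is an illustration of the general technique, not ERDDAP's implementation; the data source, sensor names, and metadata are all made up.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


def read_source(sensors: List[str]) -> Iterator[List[Row]]:
    """Hypothetical data source: yields one chunk of rows per sensor."""
    for sensor in sensors:
        yield [{"sensor": sensor, "temp": 10.0 + i} for i in range(3)]


def add_metadata(chunk: List[Row], units: str) -> List[Row]:
    """Clean-up step: attach units to every row in the chunk."""
    return [dict(row, units=units) for row in chunk]


def serve(sensors: List[str], write: Callable[[Row], None]) -> int:
    """Fetch, clean up, and send one chunk at a time; return the rows sent."""
    sent = 0
    for chunk in read_source(sensors):      # only one chunk is held in memory
        cleaned = add_metadata(chunk, units="degree_C")
        for row in cleaned:
            write(row)                      # the client sees data right away
        sent += len(cleaned)
    return sent


received: List[Row] = []
total = serve(["sensor1", "sensor2"], received.append)
print(total, received[0])
```

The first rows reach `write` before the second sensor has been read, and memory use depends on the chunk size rather than on the size of the whole response.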
- ERDDAP has a modular structure.
ERDDAP is structured so that it is easy to add different components
(for example, a class to request data from a SOS server and store it as a table).
The new component then gains all the features
and capabilities of the parent
(for example, support for DAP requests
and the ability to save the data in several common file formats).
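The modular idea can be sketched with a small class hierarchy. The names `TableDataset` and `InMemoryDataset` are hypothetical and only illustrate the pattern: a new component supplies the data as a table, and the parent supplies the output formats.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from typing import Dict


class TableDataset(ABC):
    """Hypothetical parent class: subclasses only say how to fetch a table."""

    @abstractmethod
    def get_table(self) -> Dict[str, list]:
        """Return a mapping of column name -> list of values."""

    def to_csv(self) -> str:
        table = self.get_table()
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(table.keys())            # header row
        writer.writerows(zip(*table.values()))   # one line per table row
        return buffer.getvalue()

    def to_json(self) -> str:
        return json.dumps(self.get_table())


class InMemoryDataset(TableDataset):
    """Stand-in for a new source type, such as a class that queries a SOS server."""

    def __init__(self, table: Dict[str, list]):
        self._table = table

    def get_table(self) -> Dict[str, list]:
        return self._table


dataset = InMemoryDataset({"time": [0, 60], "temp": [1.5, 1.6]})
print(dataset.to_csv())
```

The subclass contains no formatting code, yet it can already be saved as CSV or JSON. Adding a new output format to the parent would make it available to every source at once.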
Is ERDDAP a solution to everyone's data distribution / data access problems?
No. ERDDAP tries to find a sweet spot that is a really good solution to most
of the data distribution problems that we confronted.
ERDDAP can get data from most types of data servers.
ERDDAP can distribute data to most types of data clients and to common file formats.
It isn't that the remaining datasets (e.g., model data using a cubed sphere projection) aren't important.
It's just that we haven't yet found a simple organizing principle to deal with these diverse and more complex datasets.
Groups of researchers working with more complex data structures often already have specialized data servers
and specialized client software which are customized to their community's needs.
ERDDAP, as a general program, doesn't try to compete with these specialized data servers.
They are customized to the needs of their community and do a great job.
However, those datasets are often only "understood" by the specialized software in that community.
ERDDAP uses simple, common data structures, so that the data it serves can be represented
in many common file formats.
Often, complex datasets (e.g., model data using a cubed sphere projection) can be reprojected
to a simpler data structure (plate carrée lat lon?) which ERDDAP can work with.
This simpler data structure isn't meant to replace the original data structure,
but it can be a useful way to distribute the data to a wider audience.
You can set up your own ERDDAP server and serve your own data.
The small effort to set up ERDDAP brings many benefits.
Contact - If you have questions, suggestions, or comments about ERDDAP in general
(not this specific ERDDAP installation),
please email bob dot simons at noaa dot gov.
ERDDAP Version 1.22
Usage Limitations - The SeaWiFS images and data from this site may be used for free, but not redistributed;
all other images and data from this site may be used and redistributed for free.