NOAA   ERDDAP
Easier access to scientific data
   
Brought to you by NOAA NMFS SWFSC ERD    


ERDDAP (the Environmental Research Division's Data Access Program) is a data server that gives you a simple, consistent way to download subsets of scientific datasets in common file formats and make graphs and maps.


The Problems that ERDDAP Tries To Solve

Without ERDDAP, when a person (or a computer program) looks on the Internet for a specific type of scientific data (for example, satellite sea surface temperature data), there are problems ...
  • Interesting datasets are hard to find because they are at many different web sites.
     
  • Each site requires a different protocol to request the data (for example, HTTP GET, XML, SOAP+XML, OPeNDAP, WCS, WFS, SOS, or an HTML form).
     
  • Each site returns the data in a different format (for example, XML, SOAP+XML, OPeNDAP binary data stream, ASCII text, HDF 4, HDF 5, NetCDF, ...) and it isn't the common file format that you want.
     
  • Data from different sites is hard to compare because the dates+times are expressed in different formats (for example, "Jan 2, 1985", "02-JAN-1985", "1/2/85", "2/1/85", "1985-01-02", or days since Jan 1, 1980, or ...).

ERDDAP's Solutions

  • For a quick introduction to ERDDAP, watch the first half of this YouTube video. (5 minutes)
    In it, a scientist downloads ocean currents forecast data from ERDDAP to model a toxic spill in the ocean using NOAA's GNOME software (in 5 minutes!). Thanks to Rich Signell. (One tiny error in the video: when searching for datasets, don't use AND between search terms. It is implicit.)
     
  • ERDDAP can get data from local (on the server's hard drive) and remote (accessed via the web) data sources.
    See the list of types of data sources that ERDDAP can access.
     
  • ERDDAP can serve many types of scientific data, not just oceanographic data.
    ERDDAP is a Data Access Program that was written at NOAA NMFS SWFSC ERD. The ERDDAP server at ERD serves oceanographic data, but ERDDAP (the program) can access and serve any gridded or tabular data.
     
  • ERDDAP offers several ways to search for interesting datasets.
    For example, full text search, search by category (also known as faceted search), and Advanced Search. Advanced Search combines all of the search techniques and adds searches for datasets that have data within longitude, latitude, and time ranges, so you can search for datasets based on many different criteria simultaneously.
     
  • ERDDAP lets you request data in a standardized way,
    regardless of the data source's request protocol. ERDDAP also provides Data Access Forms (web pages) which help humans create the OPeNDAP requests. OPeNDAP's Data Access Protocol (DAP) is one of the recommended IOOS DMAC data transport mechanisms and a NASA EOSDIS standard. (OPeNDAP is great!) ERDDAP translates your request from the OPeNDAP, WMS, or SOS format to the data source's request format and converts the response to one of ERDDAP's internal data structures. Then ERDDAP reformats the data in the common file format of your choice (for example, as an .html table, ESRI .asc, Google Earth .kml, .mat, .nc, ODV .txt, .csv, .tsv, .json, .xhtml, .png, .pdf) and sends the file to you. See the list of griddap file types and the list of tabledap file types. Other protocols for requesting the data (for example, WCS) may be added in the future. ERDDAP is structured for these additions and there don't seem to be any impediments.
     
  • Requests for gridded data can be made in user units.
    Although requests for gridded data in ERDDAP can be made with array indices (following the OPeNDAP specification), requests can also be in user units (for example, degrees east), using a parentheses notation, since users think in those units, not indices.
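The parentheses notation can be illustrated with a small URL builder. This is a minimal sketch, not ERDDAP code: the server address, the dataset ID "exampleSST", the variable name "sst", and the helper names are all hypothetical, and the URL layout (base/griddap/datasetID.fileType?variable[start:stride:stop]...) is an assumption based on the usual griddap request form.

```python
def axis_range(start, stop, stride=1, user_units=True):
    """Format one axis constraint.

    With user_units=True the values are wrapped in parentheses, e.g.
    [(30):1:(40)], meaning "30 to 40 in the axis's own units".
    With user_units=False the values are plain array indices, e.g. [3:1:7].
    """
    if user_units:
        return f"[({start}):{stride}:({stop})]"
    return f"[{start}:{stride}:{stop}]"


def griddap_url(base, dataset_id, file_type, variable, ranges):
    """Join the pieces into one request URL (layout is an assumption)."""
    constraints = "".join(axis_range(*r) for r in ranges)
    return f"{base}/griddap/{dataset_id}{file_type}?{variable}{constraints}"


# Hypothetical dataset with axes [time][latitude][longitude].
url = griddap_url(
    "https://example.org/erddap",
    "exampleSST",
    ".csv",
    "sst",
    [
        ("1985-01-02T00:00:00Z", "1985-01-02T00:00:00Z"),  # one time point
        (30, 40),      # degrees_north
        (230, 240),    # degrees_east
    ],
)
print(url)
```

Some HTTP clients require the square brackets in the query to be percent-encoded before the request is sent.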
     
  • ERDDAP sends results in common data file formats.
    The results can be returned in any of several common data file formats (for example, .html table, ESRI .asc, Google Earth .kml, .mat, .nc, ODV .txt, .csv, .tsv, .json, .xhtml), instead of just the original format or just the OPeNDAP transfer format (which has no standard file manifestation). These files are created on-the-fly. Since there are few internal data structures, it is easy to add additional file-type drivers. See the complete list of grid file types and table file types.
     
  • ERDDAP standardizes the variable names and units for longitude, latitude, altitude, depth, and time in the results.
    To facilitate comparisons of data from different datasets, the requests and results in ERDDAP use standardized space/time axis units:
    • longitude is always in degrees_east.
    • latitude is always in degrees_north.
    • altitude is always in meters with positive=up.
    • depth is always in meters with positive=down.
    • time, when formatted as a number, is always in "seconds since 1970-01-01T00:00:00Z" (known as Unix time or epoch seconds, which is UDUNITS-compatible) and, when formatted as a string, is formatted according to the ISO 8601:2004 "extended" format standard (YYYY-MM-DDThh:mm:ssZ, for example, "1985-01-02T00:00:00Z"). (You can convert numeric times to/from ISO string times with ERDDAP's time converter.) Also, to avoid time zone and daylight saving time confusion, time values are always converted to the Zulu (UTC, GMT) time zone.
    This makes it easy to specify constraints in requests without having to worry about the altitude data format (are positive values up or down? in meters or fathoms?) or the time data format (a nightmarish realm of possible formats and time zones, for example, "Jan 2, 1985", "02-JAN-1985", "1/2/85", "2/1/85", "1985-01-02", or days since Jan 1, 1980). This makes the results from different data sources easy to compare.

    ERDDAP has a utility to Convert a Numeric Time to/from a String Time.
    For more details, see How ERDDAP Deals with Time.
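The relationship between the two time representations can be sketched in a few lines of Python. This is only an illustration using the standard library, not ERDDAP's converter; the function names are hypothetical.

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601 extended format, Zulu time


def epoch_to_iso(seconds):
    """Convert "seconds since 1970-01-01T00:00:00Z" to an ISO 8601 string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(ISO_FORMAT)


def iso_to_epoch(text):
    """Convert an ISO 8601 Zulu string back to epoch seconds."""
    parsed = datetime.strptime(text, ISO_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


print(epoch_to_iso(473472000))               # 1985-01-02T00:00:00Z
print(iso_to_epoch("1985-01-02T00:00:00Z"))  # 473472000.0
```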

    Because the longitude, latitude, altitude, and time variables are specifically recognized, ERDDAP is aware of the geo/temporal features of each dataset. This is useful when making images with maps or time-series, and when saving data in geo-referenced file types (e.g., .esriAscii, .geoJson, and .kml).

    Two common standards for writing units of measure are:

    • UDUNITS - from Unidata, which is used in COARDS, CF, and NetCDF data files. For example, UDUNITS has many options for degrees Celsius, including "degree_C" and "degC".
       
    • UCUM - the Unified Code for Units of Measure. OGC services such as SOS, WCS, and WMS often refer to UCUM as UOM (Units Of Measure). For example, UCUM has just one case-sensitive option for degrees Celsius: "Cel".
       
    Although ERDDAP doesn't require the use of either units standard, most ERDDAP installations favor one or the other. (ERDDAP administrators: you can specify this with the <units_standard> tag in setup.xml.) You can convert UDUNITS to/from UCUM units with ERDDAP's units converter. When you request data or a graph from a tabledap dataset, you can append &units("UDUNITS") or &units("UCUM") to the end of the URL to request UDUNITS or UCUM units.
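Appending the units parameter to a request might look like the sketch below. The helper name and the example URL are hypothetical. Percent-encoding the double quotes as %22 is an assumption about safe URL practice, not something stated above.

```python
from urllib.parse import quote

ALLOWED_STANDARDS = ("UDUNITS", "UCUM")


def with_units(tabledap_url, standard):
    """Append &units("...") to a tabledap URL that already has a query part."""
    if standard not in ALLOWED_STANDARDS:
        raise ValueError(f"unknown units standard: {standard!r}")
    # quote() turns each double quote into %22 so the URL stays valid.
    return tabledap_url + "&units(" + quote(f'"{standard}"') + ")"


# Hypothetical tabledap request for two variables.
request = "https://example.org/erddap/tabledap/exampleBuoys.csv?time,temperature"
print(with_units(request, "UCUM"))
```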
     
  • ERDDAP can add or modify metadata.
    Many data sources have little or no metadata (for example, CF metadata) describing the data. ERDDAP lets (and encourages) the administrator to describe metadata which will be added to datasets and their variables on-the-fly. See the addAttributes section of the directions for administrators.
     
  • ERDDAP lets you request .png and .pdf image files with graphs and maps
    of the data in addition to the actual data. And ERDDAP's Make A Graph lets you customize the images. Some special uses of these images are:
  • Requesting Compressed Files
    ERDDAP doesn't offer results stored in compressed (e.g., .zip or .gzip) files. Instead, ERDDAP looks for accept-encoding in the HTTP GET request header sent by the client. If a supported compression type ("gzip", "x-gzip", or "deflate") is found in the accept-encoding list, ERDDAP includes "content-encoding" in the HTTP response header and compresses the data as it transmits it. It is up to the client program to look for "content-encoding" and decompress the data. Browsers and OPeNDAP clients do this by default. They request compressed data and decompress the returned data automatically. Other clients (e.g., Java programs) have to do this explicitly.
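For a client that does not decompress automatically, the explicit handling described above could be sketched as follows in Python, whose urllib does not decompress responses by itself. The function names are hypothetical, and "deflate" is assumed to mean zlib-wrapped data, which is the common case.

```python
import gzip
import urllib.request
import zlib


def decode_body(body, content_encoding):
    """Decompress a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        return zlib.decompress(body)  # assumes zlib-wrapped deflate data
    return body  # no content-encoding: the body was sent uncompressed


def fetch(url):
    """Request compressed data and return the decompressed bytes."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(request) as response:
        return decode_body(
            response.read(), response.headers.get("Content-Encoding")
        )
```

Calling fetch() with a data URL from an ERDDAP installation would then return the same bytes a browser would save to disk.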
     
  • ERDDAP makes different types of data servers (OPeNDAP, OBIS, SOS, WMS, ...) interoperable.
    Different types of data servers are used in different scientific communities. In the foreseeable future, it is unlikely that any one type will become dominant and replace the others. So ERDDAP acts as a bridge between different types of client programs (web browsers, IDV, Matlab, netCDF programs, ODV, WMS clients, etc.) and the different types of data servers.
    1. ERDDAP accepts client requests for data in different formats (e.g., OPeNDAP, WMS).
    2. ERDDAP converts a given request into the request format used by the source data server (e.g., OPeNDAP, SOS, OBIS, ...) and sends that to the source data server.
    3. ERDDAP converts the response data from the source data server into an internal format, including converting all time data to a common format: "seconds since 1970-01-01T00:00:00Z".
    4. ERDDAP converts the data from the internal format into the file format requested by the client (e.g., .csv, Google Earth .kml, .htmlTable, .dods, .mat, .nc, ODV .txt, .png).
    Clients don't have to worry about, or know about, the type of the source data server. They just get the data they want, in the file format they want.
     
  • ERDDAP uses just two basic data structures to hold data.
    • Since it is difficult for human clients and computer clients to deal with a complex set of possible dataset structures, ERDDAP uses just two basic data structures: a grid data structure (for gridded data) and a table data structure (for tabular data).
    • Certainly, not all data can be expressed in these structures, but much of it can. Tables, in particular, are very flexible data structures (look at the phenomenal success of relational database programs).
    • This makes data queries easier to construct.
    • This makes data responses have a simple structure, which makes it easier to serve the data in a wider variety of standard file types (which often just support simple data structures). This is the main reason that we set up ERDDAP this way.
    • This, in turn, makes it very easy for us (or anyone) to write client software which works with all ERDDAP datasets.
    • This makes it easier to compare data from different sources, for example for an Integrated Ecosystem Analysis (IEA).
    • We are very aware that if you are used to working with data in other data structures you may initially think that this approach is simplistic or insufficient. But all data structures have tradeoffs. None is perfect. Even the do-it-all structures have their downsides: working with them is complex and the files can only be written or read with special software libraries. If you accept ERDDAP's approach enough to try to work with it, you may find that it has its advantages (notably the support for multiple file types that can hold the data responses). The original ERDDAP slide show (particularly the data structures slide) talks about these issues.
    • And even if this approach sounds odd to you, most ERDDAP clients will never notice -- they will simply see that all of the datasets have a nice simple structure and they will be thankful that they can get data from a wide variety of sources returned in a wide variety of file formats.
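To illustrate why a simple structure makes it easy to support many file types, here is a minimal sketch. The Table class and both writer functions are hypothetical stand-ins, not ERDDAP's internal classes, and the JSON layout is illustrative only.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical table: named columns holding the same number of values."""
    columns: dict  # column name -> list of values

    def rows(self):
        # Transpose the columns into a list of row tuples.
        return list(zip(*self.columns.values()))


def to_csv(table):
    """Write the table as comma-separated text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(table.columns.keys())
    writer.writerows(table.rows())
    return out.getvalue()


def to_json(table):
    """Write the same table as JSON (layout chosen for illustration)."""
    return json.dumps(
        {"columnNames": list(table.columns),
         "rows": [list(row) for row in table.rows()]}
    )


table = Table({"time": [0, 60], "temperature": [10.5, 10.7]})
print(to_csv(table))
print(to_json(table))
```

Each additional file type is one more small function over the same structure, which is the point made in the list above.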
       
  • ERDDAP offers email/URL and RSS subscription services, so you can be notified whenever a dataset changes.
    • ERDDAP is very good at detecting changes to gridded datasets because it can detect when the axis values (e.g., the time values) change.
    • ERDDAP is not very good at detecting changes to tabular datasets because there are usually no changes to the metadata when new data is added.
    • ERDDAP will detect if a dataset becomes unavailable (but perhaps not immediately).
    • ERDDAP will detect when that dataset becomes available again.
    • ERDDAP makes no promises about the suitability or accuracy of these services (see ERDDAP's DISCLAIMERS).

    Email/URL Subscriptions (not available at some ERDDAP installations): Whenever a dataset changes, the email/URL subscription system will immediately send you an email or contact a URL that you specify. To set up an email/URL subscription, click on one of the envelope icons that appear at the far right on ERDDAP web pages with lists of datasets and on the Data Access Forms and Make A Graph web pages for individual datasets, if this ERDDAP installation supports email/URL subscriptions. (Computer programmers: if you write web services, you can use the URL system to have ERDDAP notify your web service immediately whenever a dataset changes.)

    RSS Subscriptions: RSS is a standard system for notifying users when the content at a web site has changed. Modern web browsers have an RSS client built in or you can use a separate RSS Reader. ERDDAP offers a separate RSS 2.01 feed for each dataset so that you can find out when interesting datasets have changed. To subscribe to a dataset's RSS feed, click on one of the RSS icons that appear at the far right on ERDDAP web pages with lists of datasets or on the Data Access Forms and Make A Graph web pages for individual datasets.

    Comparison: The RSS service may be just what you are looking for. It is a nice standard. But if you need to know as soon as possible when a dataset changes, use the email/URL system, not RSS. RSS clients periodically (every hour?) request and read the RSS XML document to look for changes. So typically, an RSS client will not detect a change to a dataset quickly (average 30 minutes?). In contrast, the email/URL subscription system acts immediately whenever ERDDAP detects a change to a dataset. The more pro-active approach of the email/URL system is also much more efficient: You may be able to set your RSS client to check for changes every minute (don't do it!), but that would just lead to lots of unnecessary requests to the ERDDAP server and it still wouldn't detect changes immediately.
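The polling that an RSS client performs can be sketched as below. The sample feed text is invented for illustration and is not copied from a real ERDDAP feed; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented RSS 2.0 document standing in for one dataset's feed.
SAMPLE_FEED = (
    '<rss version="2.0"><channel><title>exampleSST</title>'
    "<item><title>This dataset changed</title>"
    "<pubDate>Sun, 02 Jan 2011 00:00:00 GMT</pubDate></item>"
    "</channel></rss>"
)


def newest_item(rss_xml):
    """Return (title, pubDate) of the first <item>, or None if there is none."""
    channel = ET.fromstring(rss_xml).find("channel")
    item = channel.find("item") if channel is not None else None
    if item is None:
        return None
    return (item.findtext("title", default=""),
            item.findtext("pubDate", default=""))


def has_changed(previous, rss_xml):
    """Compare the newest item with the one seen on the previous poll."""
    return newest_item(rss_xml) != previous


seen = newest_item(SAMPLE_FEED)
print(seen)
print(has_changed(seen, SAMPLE_FEED))  # False: nothing new since last poll
```

A client like this only learns about a change on its next poll, which is why the email/URL system is the better choice when prompt notification matters.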
     

  • ERDDAP is a web application (for humans with browsers)
    and a web service (with services for computer programs).

     
  • ERDDAP has REST- and ROA-style links to make its services available to computer programs.
    These features can be used to build another web service on top of ERDDAP (making ERDDAP do all the work!). ERDDAP is not intended to be a high-level data exploration/graphing service. Instead, ERDDAP is intended to provide services for such web sites and programs. So if you have an idea for a better interface to the data that ERDDAP serves, we encourage you to build your own web application or web service, and use ERDDAP as the foundation. Read more about ERDDAP's Services for Computer Programs.
     
  • Security - By default, ERDDAP runs as an entirely public server with no login system and no restrictions to data access. However, an ERDDAP administrator can configure ERDDAP to restrict access to some or all datasets to users who log in and have been assigned certain roles. ERDDAP has built-in methods for authentication (logging in). If an ERDDAP installation has authentication turned on, there will be a "log in" link at the top of each web page. Users never have to log in to access the publicly available datasets. Users who have logged in can access public datasets and the private datasets to which they are allowed access. ERDDAP uses http: URLs for users who aren't logged in, and https: (Secure Sockets Layer) URLs for users who are.
     
  • ERDDAP processes data in chunks.
    To save memory (a big issue) and make responses start sooner, ERDDAP processes data requests in chunks -- repeatedly getting a chunk of data from the source, cleaning it up (for example, adding metadata), and sending that to the client. For many data sources, this means that the first chunk of data (for example, from the first sensor) gets to the client in seconds instead of minutes (for example, after data from the last sensor has been retrieved), reassuring the client that the data is coming. From a memory standpoint, this allows numerous large requests (each larger than available memory) to be handled simultaneously.
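The chunk-by-chunk flow can be sketched with a Python generator. This is a simplified illustration under stated assumptions, not ERDDAP's implementation: the per-sensor chunking, the metadata added in clean(), and all function names are hypothetical.

```python
def read_source(sensors):
    """Yield one chunk per sensor instead of loading everything at once."""
    for sensor_id, values in sensors.items():
        yield sensor_id, values


def clean(chunk):
    """Clean up one chunk, here by attaching (made-up) metadata."""
    sensor_id, values = chunk
    return {"sensor": sensor_id, "units": "degree_C", "values": values}


def stream_response(sensors, send):
    """Send each cleaned chunk as soon as it is ready; return the chunk count."""
    count = 0
    for chunk in read_source(sensors):
        send(clean(chunk))  # the client sees this before later chunks exist
        count += 1
    return count


sent = []
total = stream_response({"a": [10.5, 10.7], "b": [11.2]}, sent.append)
print(total, sent[0])
```

Only one chunk is held in memory at a time, so the size of the whole response never has to fit in memory.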
     
  • ERDDAP has a modular structure.
    ERDDAP is structured so that it is easy to add different components (for example, a class to request data from a SOS server and store it as a table). The new component then gains all the features and capabilities of the parent (for example, support for OPeNDAP requests and the ability to save the data in several common file formats).
     
  • Data Dissemination / Data Distribution Networks: Push and Pull Technology
    Normally, ERDDAP acts as an intermediary: it takes a request from a user; gets data from a remote data source; reformats the data; and sends it to the user.

    Pull Technology: But ERDDAP also has the ability to actively get all of the available data from a remote data source and store a local copy of the data.

    Push Technology: By using ERDDAP's subscription services, other data servers can be notified as soon as new data is available so that they can request the data (by pulling the data). ERDDAP's EDDGridFromErddap and EDDTableFromErddap use ERDDAP's subscription services and flag system so that they will be notified immediately when new data is available.

    You can combine these to great effect: if you wrap an EDDGridCopy around an EDDGridFromErddap dataset (or wrap an EDDTableCopy around an EDDTableFromErddap dataset), ERDDAP will automatically create and maintain a local copy of another ERDDAP's dataset. Because the subscription services work as soon as new data is available, push technology disseminates data very quickly (within seconds).

    This architecture puts each ERDDAP administrator in charge of determining where the data for their ERDDAP comes from. Other ERDDAP administrators can do the same. There is no need for coordination between administrators. If many ERDDAP administrators link to each other's ERDDAPs, a data distribution network is formed. Data will be quickly, efficiently, and automatically disseminated from data sources (ERDDAPs and other servers) to data re-distribution sites (ERDDAPs) anywhere in the network. A given ERDDAP can be both a source of data for some datasets and a re-distribution site for other datasets. The resulting network is roughly similar to data distribution networks set up with programs like Unidata's IDD/LDM, but less rigidly structured.
     

Is ERDDAP a solution to everyone's data distribution / data access problems?
No. ERDDAP tries to find a sweet spot that is a really good solution to most of the data distribution problems that we confronted. ERDDAP takes a middleware approach: It can get data from lots of different types of remote data servers and it can give that data to clients in lots of different file formats. It is designed to be an agnostic solution which seeks to make other data servers (OPeNDAP, SOS, OBIS, WMS, ...) interoperable. Is there one perfect data server that meets everyone's needs perfectly? We don't think so. And even if you think there is or will be, it will be a long time before everyone switches to it, if ever. Until then, ERDDAP is available right now to make other data servers interoperable and to serve data right now.

ERDDAP can handle many/most datasets as is, but not all. It isn't that the remaining datasets (e.g., model data using a cubed sphere projection) aren't important. It's just that ERDDAP's goal of returning data in common file formats (some of which are pretty simple) precludes a more complex internal data structure. Groups of researchers working with more complex data structures often already have specialized data servers and specialized client software which are customized to their community's needs. ERDDAP, as a general purpose data server, doesn't try to compete with these specialized data servers. They are customized to the needs of their community and do a great job. However, those datasets are often only "understood" by the specialized software in that community.

A Work-Around for Complex Datasets - ERDDAP has a way to handle complex datasets that it can't handle directly. Just as a relational database can store a complex dataset by using just one simple data structure (a table), ERDDAP can serve the data from more complex datasets by breaking the source dataset into a few ERDDAP datasets, each with similar, simple data structures. For example, some gridded environmental model datasets can be stored in ERDDAP by putting the sea surface variables ([time][latitude][longitude]) in one ERDDAP dataset, and by putting the variables with altitude ([time][altitude][latitude][longitude]) in another ERDDAP dataset. We know this isn't ideal, but it is necessary to allow ERDDAP to return data in common file formats (some of which are pretty simple).

Another approach to dealing with complex datasets (e.g., for model data using a cubed sphere projection) is to also offer a reprojected version of the dataset ([time][altitude][latitude][longitude]) which ERDDAP can work with easily. These simpler data structures aren't meant to replace the original data structures, but they can be a useful way to distribute the data to a wider audience.

How to Cite ERDDAP in a Paper
If you want to cite ERDDAP itself in a scientific paper, please use something like

Simons, R.A. 2011. ERDDAP - The Environmental Research Division's 
Data Access Program. http://coastwatch.pfeg.noaa.gov/erddap . 
Pacific Grove, CA: NOAA/NMFS/SWFSC/ERD.

If you want to cite a specific dataset in ERDDAP, please generate the citation based on the information in the dataset's metadata. If you are referring to a specific subset of a dataset, please include the complete URL needed to replicate that download.

Guidelines for Data Distribution Systems
Bob's opinions about the design and evaluation of data distribution systems can be found here.

You can Set Up Your Own ERDDAP Server and serve your own data.

  • The small effort to set up ERDDAP brings many benefits.
  • If you already have a web service for distributing your data, you can set up ERDDAP to access your data via the existing service or via the source files or database. Then, people will have another way to access your data and will be able to download the data in additional file formats or as graphs or maps.
  • If you have datasets that are in high demand, you can install multiple ERDDAPs that work together to scale up and meet the needs of a large data distribution center.

Contact Us

If you have questions, suggestions, or comments about ERDDAP in general (not this specific ERDDAP installation), please send an email to bob dot simons at noaa dot gov and include the ERDDAP URL directly related to your question or comment.

 
ERDDAP, Version 1.46