NOAA ERDDAP
Easier access to scientific data
Brought to you by NOAA NMFS SWFSC ERD     

ERDDAP > Data Provider Form - Part 1

This is part 1 (of 4) of the Data Provider Form.
Need help? Send an email to the administrator of this ERDDAP (bob dot simons at noaa dot gov).
 

Your Contact Information

This will be used by the ERDDAP administrator to contact you. This won't go in the dataset's metadata or be made public.

What is your name?
What is your email address?
This dataset submission's timestamp is 2017-03-24T16:57:19.

The Data

ERDDAP deals with a dataset in one of two ways: as gridded data or as tabular data.

Gridded Data
ERDDAP can serve data from various types of data files (and from OPeNDAP servers like Hyrax, THREDDS, GrADS, ERDDAP) that contain multi-dimensional gridded data, for example, Level 3 sea surface temperature data (from a satellite) with three dimensions: [time][latitude][longitude].

The data for a gridded dataset can be stored in one file or many files (typically with one time point per file).
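As a rough illustration of the [time][latitude][longitude] layout described above, the following Python sketch stacks several two-dimensional grids (one per time point, as if each came from its own file) into one three-dimensional structure. The coordinate lists, the values, and the value_at helper are all made up for illustration. ERDDAP reads real data files and does not use this code.

```python
# Hypothetical coordinates and values, for illustration only.
times = ["2017-03-22", "2017-03-23", "2017-03-24"]
latitudes = [30.0, 31.0]
longitudes = [-120.0, -119.0, -118.0]

# One 2-D grid [latitude][longitude] per time point,
# as if each had been read from a separate file.
grids_by_time = {
    "2017-03-22": [[14.1, 14.3, 14.6], [13.8, 14.0, 14.2]],
    "2017-03-23": [[14.2, 14.4, 14.7], [13.9, 14.1, 14.3]],
    "2017-03-24": [[14.0, 14.2, 14.5], [13.7, 13.9, 14.1]],
}

# Stack the per-time grids in time order to get [time][latitude][longitude].
sst = [grids_by_time[t] for t in times]


def value_at(time, lat, lon):
    """Look up one value by coordinate values rather than by index."""
    return sst[times.index(time)][latitudes.index(lat)][longitudes.index(lon)]
```

Here value_at("2017-03-23", 31.0, -119.0) picks the second time point, the second latitude, and the second longitude.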

If your dataset is already served via an OPeNDAP server, skip this form and just email the dataset's OPeNDAP URL to the administrator of this ERDDAP (bob dot simons at noaa dot gov).

How is your gridded data stored?

Tabular Data
ERDDAP can also serve data that can be represented as a single, database-like table, where there is a column for each type of data and a row for each observation. This includes:

  • All data that is currently stored in a database.
    (See Data in Databases below for more information.)
  • All in situ data.
    Examples: a time series from an instrument or several similar instruments, profile data from a CTD or a group of CTDs, or data collected during a ship's cruise (or during similar cruises over several years).
  • Non-geospatial data that can be represented as a table of data.
    Examples: data from a laboratory experiment, genetic sequence data,
    or a list of bibliographic references.
  • Collections of other types of files (for example, image or audio files).
    ERDDAP can present the file names in a table and let users view or download the files.
The data for a tabular dataset can be stored in one file or many files (typically with data for one station, one glider, one animal, or one cruise per file). We recommend making one dataset that holds all of the data that is very similar, rather than many separate datasets. For example, you might make one dataset with data from a group of moored buoys, a group of gliders, a group of animals, or a group of cruises (for example, annually on one line).

How is your tabular data stored?

Frequency of Changes
Some datasets get new data frequently. Some datasets will never be changed.
How often will this data be changed?
 

Finished with part 1?

Click to send this information to the ERDDAP administrator and move on to part 2 (of 4).
 

Additional information about
Data in Databases

If your data is in a database, you need to make one denormalized table or view with all of the data that you want to make available as one dataset in ERDDAP. For large, complex databases, it may make sense to separate out several chunks as denormalized tables, each with a different type of data, which will become separate datasets in ERDDAP. Talk this over with the administrator of this ERDDAP (bob dot simons at noaa dot gov).

Making a denormalized table may sound like a crazy idea to you. Please trust us. The denormalized table solves several problems:

  • It's vastly easier for users.
    When ERDDAP presents the dataset as one simple, denormalized table, it is very easy for anyone to understand the data. Most users have never heard of normalized tables. Very few understand keys, foreign keys, or table joins, and they almost certainly don't know the details of the different types of joins or how to write the SQL for a join (or multiple joins). Using a denormalized table avoids all of those problems. This reason alone justifies the use of a denormalized table for presenting the data to ERDDAP users.
     
  • You can make changes for ERDDAP without changing your tables.
    ERDDAP has a few requirements that may be different from how you have set up your database.
    For example, ERDDAP requires that timestamp data be stored in 'timestamp with timezone' fields.
    By making a separate table/view for ERDDAP, you can make these changes when you make the denormalized table for ERDDAP. Thus, you don't have to make any changes to your tables.
     
  • ERDDAP will recreate some of the structure of the normalized tables.
    You can specify which columns of data come from the 'outer' tables and therefore have a limited number of distinct values. ERDDAP will collect all of the distinct values in each of these columns and present them to users in drop-down lists.
     
  • A denormalized table makes the data hand-off from you to the ERDDAP administrator easy.
    You're the expert for this dataset, so it makes sense that you make the decisions about which tables and which columns to join and how to join them. You don't have to hand us several tables and detailed instructions for several joins. You just have to give us access to the denormalized table.
     
  • A denormalized table allows for efficient access to the data.
    The denormalized form is usually faster to access than the normalized form. Joins can be slow. Multiple joins can be very slow.
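The idea of a denormalized view described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the station and observation tables, their columns, the sample rows, and the view name erddap_buoy_data are all hypothetical, and your own database will use its own SQL dialect and schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    -- Hypothetical normalized tables: one 'outer' table and one data table.
    CREATE TABLE station (
        station_id INTEGER PRIMARY KEY,
        name TEXT, latitude REAL, longitude REAL
    );
    CREATE TABLE observation (
        station_id INTEGER REFERENCES station(station_id),
        time TEXT, temperature REAL
    );
    INSERT INTO station VALUES (1, 'Buoy A', 36.7, -122.0),
                               (2, 'Buoy B', 34.2, -120.5);
    INSERT INTO observation VALUES
        (1, '2017-03-24T00:00:00Z', 12.9),
        (1, '2017-03-24T01:00:00Z', 12.8),
        (2, '2017-03-24T00:00:00Z', 14.1);

    -- The denormalized view: every row carries the station columns,
    -- so users never need to write a join.
    CREATE VIEW erddap_buoy_data AS
        SELECT s.name AS station, s.latitude, s.longitude,
               o.time, o.temperature
        FROM observation AS o
        JOIN station AS s ON s.station_id = o.station_id;
""")

rows = conn.execute(
    "SELECT * FROM erddap_buoy_data ORDER BY station, time"
).fetchall()

# Columns that came from the 'outer' table have few distinct values,
# which is what makes them suitable for drop-down lists.
stations = [r[0] for r in conn.execute(
    "SELECT DISTINCT station FROM erddap_buoy_data ORDER BY station")]
```

The join is written once, by the person who knows the schema. Everyone else only ever sees the flat result.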
In order to get the data from the database into ERDDAP, there are three options:
  • Recommended Option:
    You can create a comma- or tab-separated-value file with the data from the denormalized table.
    If the dataset is huge, then it makes sense to create several files, each with a cohesive subset of the denormalized table (for example, data from a smaller time range).

    The big advantage here is that ERDDAP will be able to handle user requests for data without any further effort by your database, so ERDDAP won't be a burden on your database or a security risk. This is the best option under almost all circumstances because ERDDAP can usually get data from files faster than from a database (if we convert the .csv files to .ncCF files). Part of the reason is that ERDDAP+files is a read-only system and doesn't have to deal with making changes while providing ACID (Atomicity, Consistency, Isolation, Durability) guarantees. Also, you probably won't need a separate server, since we can store the data on one of our RAIDs and access it with an existing ERDDAP on an existing server.

  • Okay Option:
    You set up a new database on a different computer with just the denormalized table.
    Since that database can be a free and open source database like PostgreSQL, this option needn't cost a lot.

    The big advantages here are that ERDDAP will be able to handle user requests for data without any further effort by your current database, so ERDDAP won't be a burden on your current database. This also eliminates a lot of security concerns, since ERDDAP will not have access to your current database.

  • Discouraged Option:
    We can connect ERDDAP to your current database.
    To do this, you need to:
    • Create a separate table or view with the denormalized table of data.
    • Create an "erddap" user who has read-only access to only the denormalized table(s).
       
    This is an option if the data changes very frequently and you want to give ERDDAP users instant access to those changes; however, even so, it may make sense to use the file option above and periodically (every 30 minutes?) replace the file that has today's data.
    The huge disadvantages of this approach are that ERDDAP user requests will probably place an unbearably large burden on your database and that the ERDDAP connection is a security risk (although we can minimize/manage the risk).
When you talk with the administrator of this ERDDAP (bob dot simons at noaa dot gov), you can discuss which of these options to pursue and how to handle the details.
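For the recommended file option, the export step might look something like the Python sketch below. It splits rows from the denormalized table into one comma-separated file per calendar year. It also writes each file to a temporary name and then swaps it into place, so a file that is replaced periodically is never seen half-written. The function names, the data_YYYY.csv naming pattern, and the assumption that times are ISO 8601 strings are ours, for illustration only. Agree on the actual file layout with the ERDDAP administrator.

```python
import csv
import os
import tempfile
from collections import defaultdict


def write_csv_atomically(path, header, rows):
    """Write a CSV to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
        # Replace the target in one step, so readers see either the
        # old file or the new one, not a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise


def export_by_year(header, rows, out_dir, time_column="time"):
    """Split rows into one CSV file per calendar year of the time column."""
    t = header.index(time_column)
    by_year = defaultdict(list)
    for row in rows:
        by_year[row[t][:4]].append(row)  # assumes ISO 8601 time strings
    paths = []
    for year in sorted(by_year):
        path = os.path.join(out_dir, f"data_{year}.csv")
        write_csv_atomically(path, header, by_year[year])
        paths.append(path)
    return paths
```

To refresh recent data, a scheduled job could call write_csv_atomically again for just the file that covers the current period and leave the older files untouched.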

 
ERDDAP, Version 1.74