The Climate Diagnostics Center (CDC) has historically processed and stored climatological data for its resident scientists as well as a relatively limited population of scientists outside the data center. Use of the World Wide Web (WWW) improves the way in which CDC advertises and distributes data to its users. This paper discusses the transition of CDC from a traditional data storage and distribution paradigm to a site that emphasizes the use of WWW capabilities.
CDC currently archives 17 sets of climatological data. Each of these datasets comprises between one and approximately 240 physical files, resulting in over 1300 files that need to be described and made available for distribution. By taking advantage of the hypertext interface supported by WWW browsers, our data presentation now allows scientists to evaluate the data only to the degree of detail that they require, from viewing the summary metadata to looking at the data themselves. We are currently automating our hypertext file management so that scientists are assured of finding all relevant links to their data of interest.
Our transition to a World Wide Web paradigm increased our data distribution options. WWW servers allow customized software to be executed when the user explores a particular link. Presently, we are implementing prototype custom software and hypertext forms which allow the user to electronically request data, even to the point of extracting a subset of the data.
More recently, use of the World Wide Web (WWW) (Berners-Lee, 1994) has dramatically changed data advertisement, presentation and distribution at CDC. The World Wide Web is a system of distributed hypermedia, i.e., media that contain pointers to other (possibly different types of) media (Boutell, 1994). This paper will discuss the transition of CDC from a traditional data storage and distribution paradigm to a site that emphasizes the use of WWW capabilities. We will review the modifications we made to data description and advertisement, as well as the resulting changes in the way scientists request and receive data. Joining the WWW community involves both benefits and risks, and these will also be discussed.
The CDC Data Directory

The CDC Data Directory consists of a collection of metadata which describe the CDC climatological data holdings. These metadata are in turn linked to the datasets themselves. The CDC data is archived in netCDF format. We have established some additional conventions for CDC netCDF files. The metadata may be searched by name, by a variable of interest, a statistic of interest, or by data level. Additionally, you may perform keyword searches on the metadata. A quick reference of all of the CDC datasets is also available.

This WWW page provides hyperlinks to keywords that are familiar to scientists as entry points to our data, i.e., keywords that are typically used for querying our database. We also include links to references about the data format. Effective use of the WWW browser interface requires that the information to be distributed be in HTML; thus, CDC converted the metadata for its data holdings from ASCII text to HTML-tagged text. This required designing the hypertext link architecture, as well as additional maintenance to ensure that these links remain current. From the initial data interface, scientists can traverse links which provide increasingly detailed information about our data holdings. If a user follows the link to dataset name and then selects the Comprehensive Ocean-Atmosphere Data Set (COADS), detailed metadata is presented which includes the following:

Archive parameters: File names are composed of variable abbreviations and statistic: (variable).(statistic).nc

"Observed" variables                                            Name  Units     Precision
Air temperature                                                 air   deg C     0.01
Sea level pressure                                              slp   mb        0.01
Sea surface temperature                                         sst   deg C     0.01
u-wind                                                          uwnd  m/s       0.01
v-wind                                                          vwnd  m/s       0.01
Scalar wind                                                     wspd  m/s       0.01
Cloudiness                                                      cldc  okta      0.1
Specific humidity                                               shum  g/kg      0.01
Derived variables
Relative humidity                                               rhum  %         0.1
Sensible heat parameter (sst - air) * wspd                      sflx  degC*m/s  0.1
Latent heat parameter ((saturation shum at sst) - shum) * wspd  lflx  g/kg      0.01
u-wind stress (wspd * uwnd)                                     ustr  m^2/s^2   0.1
v-wind stress (wspd * vwnd)                                     vstr  m^2/s^2   0.1

Statistic               Abbreviation
Mean                    mean
Number of Observations  nobs
Long Term Mean          ltm

Should a scientist follow the link to, e.g., Air temperature, he or she would be presented with a list of COADS files containing the air temperature variable. At this point, the user could directly browse the netCDF file metadata contents by exploring the hypertext link to the file. Because the Mosaic WWW browser has an interface to HDF/netCDF, no additional code is necessary to display this metadata, in a form similar to the following (edited for brevity):
Scientific Data Brows-o-rama
Datasets
There are 4 datasets and 4 global attributes in this file.
Available datasets:
[etc.]
- Dataset air has rank 3 with dimensions [211, 90, 180]. The dataset is composed of 32-bit floating point numbers. It has the following attributes :
  - Attribute dataset has the value : Monterey Real-time Marine S
  - Attribute var_desc has the value : Air Temperature A
  - Attribute level_desc has the value : Surface 0
  - Attribute statistic has the value : Mean M
  - Attribute actual_range has the value : -44.000000, 39.000000
  - Attribute units has the value : degC
  - Attribute title has the value : Air Temperature Monthly Mean at Surface
- Dataset time has rank 1 with dimensions [211]. The dataset is composed of 64-bit floating point numbers. It has the following attributes :
  - Attribute title has the value : Time
  - Attribute units has the value : yyyymmddhhmmss
  - Attribute delta_t has the value : 0000-01-00 00:00:00
  - Attribute avg_period has the value : 0000-01-00 00:00:00
  - Attribute valid_range has the value : 19770100000000.000000, 19940700000000.000000

Rather than rely on the Mosaic HDF interface, we could also store or generate this display of metadata using a local CGI script. Should the user decide to transfer the file to their local computer rather than browse its contents, the user would select the Mosaic Load to Local Disk option (or its counterpart in other WWW browsers), which transparently initiates an FTP transfer of the file.
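As a rough illustration of the CGI alternative, the following Python sketch prints a listing in the style shown above from a hard-coded description of a file. The dictionary layout, the function names, and the choice of Python are assumptions made for this example; they are not taken from the CDC software, and a real script would read the values from the netCDF file itself.

```python
#!/usr/bin/env python3
"""Sketch of a CGI-style script that prints a metadata listing for one file."""

# Hypothetical in-memory description of a file (an assumption for this
# example); a real script would read these values from the netCDF file.
EXAMPLE_FILE = {
    "air": {
        "dimensions": [211, 90, 180],
        "type": "32-bit floating point",
        "attributes": {"units": "degC", "statistic": "Mean"},
    },
    "time": {
        "dimensions": [211],
        "type": "64-bit floating point",
        "attributes": {"units": "yyyymmddhhmmss"},
    },
}


def describe_variable(name, info):
    """Return the listing lines for a single variable and its attributes."""
    dims = info["dimensions"]
    lines = [
        f"- Dataset {name} has rank {len(dims)} with dimensions {dims}. "
        f"The dataset is composed of {info['type']} numbers."
    ]
    for key, value in info["attributes"].items():
        lines.append(f"  - Attribute {key} has the value : {value}")
    return lines


def render_listing(variables):
    """Return the whole listing as one block of plain text."""
    lines = [f"There are {len(variables)} datasets in this file.", ""]
    for name, info in variables.items():
        lines.extend(describe_variable(name, info))
    return "\n".join(lines)


def cgi_response(variables):
    """A CGI response is a header block, a blank line, then the body."""
    return "Content-Type: text/plain\r\n\r\n" + render_listing(variables)


if __name__ == "__main__":
    print(cgi_response(EXAMPLE_FILE))
```

Generating the listing on the server keeps the display independent of any one browser's HDF/netCDF support, at the cost of maintaining the script.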
A list of matching files is returned. Each of the entries in the list is a hyperlink to the metadata matching the search term.
air temp
Index cdc_data contains the following 9 items relevant to 'air temp'. The first figure for each entry is its relative score, the second the number of lines in the item.
Since CDC data is archived in netCDF format, the file metadata is available, at its source, in electronic form. We are in the process of making use of the netCDF metadata to generate HTML-tagged metadata via software processes rather than manually.
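A minimal Python sketch of this kind of generation is given here for illustration. It assumes the attributes of a variable have already been read from the netCDF file into a plain mapping; the function name `attributes_to_html` and the choice of an HTML definition list are hypothetical and do not describe the actual CDC software.

```python
from html import escape


def attributes_to_html(title, attributes):
    """Render attribute names and values as an HTML definition list.

    `attributes` is assumed to be a mapping already extracted from the
    netCDF file; reading the file itself is outside this sketch.
    """
    parts = [f"<h2>{escape(title)}</h2>", "<dl>"]
    for name, value in attributes.items():
        # Escape every value so characters such as "<" or "&" in the
        # metadata cannot break the generated page.
        parts.append(
            f"  <dt>{escape(str(name))}</dt><dd>{escape(str(value))}</dd>"
        )
    parts.append("</dl>")
    return "\n".join(parts)


if __name__ == "__main__":
    air_attributes = {
        "units": "degC",
        "title": "Air Temperature Monthly Mean at Surface",
        "actual_range": "-44.000000, 39.000000",
    }
    print(attributes_to_html("air", air_attributes))
```

Because the HTML is derived from the file each time, it cannot drift out of step with the archived data in the way a hand-edited page can.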
Remote users must request custom data from CDC support personnel, who then generate the data file and deliver it by anonymous FTP or on magnetic tape. The HTTP server software provides for execution of custom software when the user explores a link. This capability allows CDC to provide an electronic interface to custom data requests. Such automation removes the bottleneck of human intervention, but it carries some security risks: the ability to execute scripts via a hypertext link also gives client software the opportunity to use the script for unexpected and destructive purposes (McCool, 1994). Keeping these risks in mind, CDC is exploring the use of HTML forms which provide the same functionality as the Extract application to fill specific data requests via the WWW. Additionally, CRDtools itself can be executed as the result of exploring a hypertext link.
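One common way to limit the risk described above is to accept only form values that appear on a fixed list before they reach any file name or command. The Python sketch below illustrates the idea using the COADS variable and statistic abbreviations and the (variable).(statistic).nc naming convention given earlier. The form field names `variable` and `statistic` and both function names are assumptions for this example, not part of the CDC forms.

```python
from urllib.parse import parse_qs

# Abbreviations from the COADS archive parameters; anything else is rejected.
VARIABLES = {
    "air", "slp", "sst", "uwnd", "vwnd", "wspd", "cldc", "shum",
    "rhum", "sflx", "lflx", "ustr", "vstr",
}
STATISTICS = {"mean", "nobs", "ltm"}


def requested_file_name(variable, statistic):
    """Map validated form values to a name of the form (variable).(statistic).nc."""
    if variable not in VARIABLES:
        raise ValueError(f"unknown variable: {variable!r}")
    if statistic not in STATISTICS:
        raise ValueError(f"unknown statistic: {statistic!r}")
    return f"{variable}.{statistic}.nc"


def file_for_query(query_string):
    """Parse a form query string (hypothetical field names) and validate it."""
    fields = parse_qs(query_string)
    variable = fields.get("variable", [""])[0]
    statistic = fields.get("statistic", [""])[0]
    return requested_file_name(variable, statistic)


if __name__ == "__main__":
    print(file_for_query("variable=air&statistic=mean"))
    try:
        file_for_query("variable=../../etc/passwd&statistic=mean")
    except ValueError as err:
        print("rejected:", err)
```

Because the file name is built only from values on the lists, a request cannot point the script at an arbitrary path or inject shell syntax.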
CDC Map Room Weather Products
Forecast: 500 mb (Last updated: Wednesday, 14-Sep-94 10:09:45 MDT)
500 mb Hgt. (Last updated: Wednesday, 14-Sep-94 02:36:03 MDT)

The CDC map room page is constructed of small inline graphics as shown above. These graphics are dynamically updated as the products are generated. Both the figure and descriptive text are linked to a larger version of the image. WWW browsers allow the client to specify how the larger image should be viewed when the link is explored. In the same fashion, CDC can also provide links to animations of climate products as we generate them.
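The structure of a map room entry (a small inline graphic and label, both linked to a larger image, plus a last-updated stamp) can be sketched in Python as follows. The function name and the image file names are hypothetical, and the sketch takes the timestamp as a ready-made string rather than guessing how CDC produces it.

```python
from html import escape


def map_room_entry(label, thumbnail_url, full_url, last_updated):
    """Return HTML for one product: thumbnail and label linked to the full image."""
    return (
        f'<a href="{escape(full_url)}">'
        f'<img src="{escape(thumbnail_url)}" alt="{escape(label)}"> '
        f"{escape(label)}</a> "
        f"(Last updated: {escape(last_updated)})"
    )


if __name__ == "__main__":
    # File names below are placeholders, not actual CDC paths.
    print(
        map_room_entry(
            "500 mb Hgt.",
            "hgt500_small.gif",
            "hgt500.gif",
            "Wednesday, 14-Sep-94 02:36:03 MDT",
        )
    )
```

Regenerating each entry whenever a product is rebuilt keeps the thumbnail, the link, and the timestamp consistent with one another.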