Live Visualization and Extraction of Climate Data with Mosaic and FERRET

Steve Hankin - National Oceanic and Atmospheric Administration,
Pacific Marine Environmental Laboratory, Seattle, Washington

Jerry Davison - Joint Institute for the Study of the Atmosphere and Ocean,
University of Washington

Abstract:

At NOAA's Pacific Marine Environmental Laboratory (PMEL) we have developed a Mosaic-based visualization and data extraction system that breaks through some of the traditional barriers to scientific data interchange. The distribution and interchange of scientific data are central to the progress of scientific research, yet the size and complexity of scientific data sets, in combination with the prevalent use of incompatible and discipline-specific formats, make it awkward and time-consuming to share data. Researchers are commonly forced to develop custom applications to utilize scientific data. Vital descriptive information, the metadata, is often disconnected and lost in the process.

The PMEL server uses Mosaic forms to provide a point-and-click front end to the scientific analysis and visualization program FERRET. The server will provide access to a large (>20 gigabytes), research-oriented data base of multi-dimensional, gridded, environmental data collected within NOAA and elsewhere. The data base is maintained by the Thermal Modeling and Analysis Project at PMEL. The server allows the user to graphically browse this data base, selecting data sets and variables and specifying the location, time, and viewing plane or axis. The system provides data visualizations and formatted extractions of any selected data.


Figure 1. Mosaic/FERRET Live Visualization and Extraction System.

Introduction:

The explosive growth of the World Wide Web and the integrating power of Mosaic have placed the scientific user a mouse click away from a vast universe of data. As the richness and complexity of the Web have increased, attention has focussed on the emerging challenge of data navigation -- how to locate the data of interest within the vastness of what is available. Less attention, however, has been paid to a previously unsolved problem: scientific data sets, once located, are often structurally complex and distributed in incompatible, discipline-specific formats that remain a barrier to usage.

Important efforts aimed at providing a degree of standardization continue: for example, the adoption of the HDF standard [1] by the NASA EOSDIS program and the development of the NetCDF standard [2] at Unidata. While these efforts have standardized the binary data encodings and the application programmers' interface (API), they fall short of scientists' needs for ready access. Scientific data sets must include metadata [3, 4]; encodings of metadata and structures to represent complex multi-dimensional data are not adequately standardized. Furthermore, the large size of many data sets prohibits downloading an entire data set over the Internet. Frequently, in the final stage of Internet delivery systems, the data are written to tape and mailed to the user.

One solution to these difficulties is for Internet servers to provide "live visualization and extraction". A live visualization and extraction server permits a user to interactively browse a data base and graphically visualize the data. It facilitates comparison of fields through simultaneous displays and graphical overlays. Such a server would have to generate graphics on the fly rather than retrieve pre-created image files -- hence the term "live access". An advanced server may perform transformations (differencing fields, smoothing, gap-filling, averaging, etc.) and provide visualization of these results. In such an environment the scientific user could obtain much of the desired knowledge without downloading data files. Should the user require the data on a local host he/she would know precisely the limits of the data required and would be in a position to request the minimally required subset. Such modest-sized data extractions can be transferred more efficiently over the bandwidth-limited Internet. The extraction portion of the server should be able to deliver the data in a user-specified format.

Building such a system can be reduced to a modest task by leveraging off an existing visualization and analysis application, in this case the FERRET system.

Climatological Data Base:

The Thermal Modeling and Analysis Project at NOAA/PMEL maintains an extensive data base of gridded climatological observations in support of its numerical climate studies. The data sets include global surface marine observations, upper air analyses, satellite observations, ship drift derived surface ocean currents, heat flux climatologies, rainfall records, and other observed and derived fields. These data sets collectively represent many years of effort by the individuals and organizations that created them. These include several institutions within NOAA (for example the National Meteorological Center and the environmental research laboratories), the National Center for Atmospheric Research, NASA, and universities and other research institutions, both in the US and in other nations.

The data span a range of temporal and spatial resolutions and include geometries from 1-dimensional (e.g. a time series at a specific location) to 4-dimensional (e.g. the time evolution of a 3D flow field). The largest of the data sets are gigabytes in size. NOAA has recognized the importance of centers of data such as this and is actively supporting this development and many others to fulfill its mission as a provider of global environmental data [3].

System Overview:

The user interacts with the Mosaic/FERRET live visualization and extraction system through a graphical user interface (GUI) (see figure 1). HTML forms provide a set of customary GUI controls: option menus, push buttons, radio buttons, and text input fields. With these controls a user can specify the choice of data set, variable name, view, and location of interest in space and time. The selection of views includes any 1-dimensional or 2-dimensional geometry that may be visualized for a particular data set. Thus, for example, the user may elect to view ocean temperature as a filled contour plot on axes of latitude and longitude; as a vertical cast; or as a Hovmoller plot which shows the evolution of temperature in time along the equator. Any data selection that may be visualized may also be extracted and downloaded to the user's computer via FTP.
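The relationship between a chosen view and the location the user must still specify can be sketched as follows. This is a minimal illustration in Python, not the server's actual code: the axis letters X, Y, Z, T (longitude, latitude, depth or height, time) and the function name split_axes are assumptions made for the example. The idea is that a view names the one or two axes that are plotted, and every remaining axis of the 4-dimensional grid must be fixed at a single point.

```python
# Hypothetical sketch: a "view" names the axes that are plotted;
# every other axis of the 4-D grid must be fixed at a single point.
AXES = ("X", "Y", "Z", "T")  # longitude, latitude, depth/height, time (assumed names)


def split_axes(view):
    """Return (plotted, fixed) axis tuples for a 1-D or 2-D view such as "XY" or "Z"."""
    plotted = tuple(view.upper())
    if not 1 <= len(plotted) <= 2:
        raise ValueError("a view must name one or two axes")
    if len(set(plotted)) != len(plotted) or any(a not in AXES for a in plotted):
        raise ValueError("unknown or repeated axis in view: %r" % view)
    fixed = tuple(a for a in AXES if a not in plotted)
    return plotted, fixed


# Latitude-longitude map: depth and time must be fixed.
print(split_axes("XY"))   # (('X', 'Y'), ('Z', 'T'))
# Vertical cast: a profile along depth at one place and time.
print(split_axes("Z"))    # (('Z',), ('X', 'Y', 'T'))
# Hovmoller plot along the equator: longitude versus time.
print(split_axes("XT"))   # (('X', 'T'), ('Y', 'Z'))
```

In a form-based interface, the fixed axes are the ones for which a single value would be requested, and the plotted axes are the ones for which a range would be requested.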

Figure 2. Schematic diagram of the Mosaic/FERRET interface.

When the user presses the button to submit the HTML form to its Mosaic server, the Mosaic Common Gateway Interface (CGI) forks a process to handle the user's request (see figure 2). This handler parses the user's request and in turn forks another process to run the FERRET program in command line mode. Standard I/O is redirected through pipes so that the handler can send ASCII commands to FERRET. FERRET, in turn, responds to these commands by opening the requested data files, extracting the desired data, producing graphics (GIF files), and writing the data into files of several possible formats. These files are returned to the client (user) in an HTML page or by FTP as appropriate.
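The handler-to-FERRET arrangement described above (a parent process writing ASCII commands into a pipe connected to a child's standard input) can be sketched in Python. This is an illustration under stated assumptions, not the PMEL handler: the function names build_commands and run_child are hypothetical, the command strings are only modeled loosely on FERRET's command language and have not been checked against any FERRET release, and a stand-in child that simply echoes its input is used in place of FERRET so that the sketch runs anywhere.

```python
import subprocess
import sys


def build_commands(dataset, variable, region, gif_name):
    """Assemble the ASCII command lines a handler might send (illustrative syntax)."""
    qualifiers = "".join("/%s=%s" % (axis, limits) for axis, limits in region)
    return [
        "USE %s" % dataset,
        "SET REGION%s" % qualifiers,
        "SHADE %s" % variable,
        "FRAME/FILE=%s" % gif_name,
    ]


def run_child(argv, commands):
    """Start argv as a child process, write commands to its stdin, return its stdout."""
    child = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    output, _ = child.communicate("\n".join(commands) + "\n")
    if child.returncode != 0:
        raise RuntimeError("child exited with status %d" % child.returncode)
    return output


# Stand-in for the real program: a child that echoes its standard input.
STAND_IN = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

commands = build_commands(
    dataset="example_dataset",          # hypothetical data set name
    variable="temp",                    # hypothetical variable name
    region=[("X", "130E:80W"), ("Y", "20S:20N")],
    gif_name="view.gif",
)
echoed = run_child(STAND_IN, commands)
print(echoed)
```

Replacing STAND_IN with the command line of the real analysis program would be the only change needed to the plumbing; the handler would then read the child's output to detect errors before returning the generated files.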

To enhance performance the server uses a GIF image cache. Frequently accessed views (those in which the user has not specified custom space/time locations) are saved on disk and reused when identical requests are made in the future.
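A disk cache of this kind can be sketched in Python as follows. This is a minimal sketch under assumptions, not the server's implementation: the function names, the custom_region field, and the use of a SHA-1 digest of the sorted request fields as the file name are all choices made for the example. Requests that carry a custom space/time location bypass the cache, as the paragraph above describes.

```python
import hashlib
import os
import tempfile


def cache_path(cache_dir, request):
    """Map a request (dict of form fields) to a GIF file name inside cache_dir."""
    # Sort the fields so that identical requests always produce the same key.
    canonical = "&".join("%s=%s" % (k, request[k]) for k in sorted(request))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".gif")


def get_image(cache_dir, request, render):
    """Return (gif_bytes, was_cached), calling render(request) only when needed."""
    # Assumed field name: requests with a custom location are never cached.
    if request.get("custom_region"):
        return render(request), False
    path = cache_path(cache_dir, request)
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read(), True
    data = render(request)
    with open(path, "wb") as handle:
        handle.write(data)
    return data, False


calls = []


def fake_render(request):
    """Stand-in for the expensive plotting step; records each invocation."""
    calls.append(request)
    return b"GIF89a-placeholder"


cache_dir = tempfile.mkdtemp()
req = {"dataset": "example_dataset", "variable": "temp", "view": "XY"}
first = get_image(cache_dir, req, fake_render)
second = get_image(cache_dir, dict(req), fake_render)
print(first[1], second[1], len(calls))  # False True 1
```

The second, identical request is served from disk without invoking the renderer again.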

A Mosaic form is a static interface; once displayed by the client the range of options offered by the form cannot be modified. Furthermore, only a single button can be meaningfully designated to "submit" the form. These limitations impose significant restrictions on the GUI design. For example, when the user wishes to change the choice of data set the list of variables in the variables option menu must be changed. The Mosaic/FERRET server sidesteps this limitation by providing separate forms for selecting the data set and view. When the user completes the selection of data set or view the main interface form is regenerated with the proper menu options included. (This work-around could be avoided if HTML forms supported multiple, simultaneous SUBMIT inputs of various widget types.) The data set and view selections are embedded invisibly in the form as hidden text: i.e., <INPUT TYPE="hidden" ...>. In this way the form contains complete information to make a data request -- a vital feature since the Mosaic/FERRET server must remain stateless to handle multiple simultaneous accesses.
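Regenerating the main form with earlier selections carried as hidden fields can be sketched in Python. This is an illustrative sketch only: the function name render_form, the field names dataset, view, and variable, and the ACTION path are assumptions, not the names used by the PMEL server.

```python
from html import escape


def render_form(action, dataset, view, variables):
    """Return an HTML form whose variable menu matches the chosen data set."""
    lines = ['<FORM METHOD="POST" ACTION="%s">' % escape(action, quote=True)]
    # Earlier selections travel inside the form as hidden fields, so the
    # server does not need to remember anything between requests.
    for name, value in (("dataset", dataset), ("view", view)):
        lines.append('<INPUT TYPE="hidden" NAME="%s" VALUE="%s">'
                     % (name, escape(value, quote=True)))
    lines.append('<SELECT NAME="variable">')
    for var in variables:
        lines.append("<OPTION>%s</OPTION>" % escape(var))
    lines.append("</SELECT>")
    lines.append('<INPUT TYPE="submit" VALUE="Plot">')
    lines.append("</FORM>")
    return "\n".join(lines)


form = render_form(
    action="/cgi-bin/example",       # hypothetical handler path
    dataset="example_dataset",       # hypothetical data set name
    view="XY",
    variables=["temp", "salt"],
)
print(form)
```

Because every submitted form carries the data set and view along with the newly chosen variable, each request is self-contained and any server process can answer it.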

FERRET:

The FERRET program is the engine for accessing the data base, performing transformations, generating GIF files, and creating files of extracted data in various formats. FERRET is a visualization and analysis environment originally designed to meet the needs of oceanographers and meteorologists analyzing general circulation model outputs. As such it was designed to work with very large (multi-gigabyte) data sets containing mixed three- and four-dimensional variables on grids which are often staggered and irregular.

FERRET offers an extensible approach to analysis; new variables may be defined as mathematical expressions involving data set variables and other user-defined variables. Calculations (integrations, averaging, etc.) may be applied over arbitrarily shaped four-dimensional regions. Fully documented graphics are produced with a single command. Graphics styles include line plots, scatter plots, rasters, contours, color-filled contours, vector plots, and wire frame plots. A version of the graphics package PLOT PLUS [7] is contained within FERRET. Data may be extracted to files in ASCII, IEEE binary, HDF Raster-8, or NetCDF format.

FERRET is available over anonymous FTP [abyss.pmel.noaa.gov] at no charge. It is distributed in binary form for a variety of Unix platforms, and a beta version is available for Macintosh.

Conclusion:

The Mosaic/FERRET live visualization and extraction server was developed with a minimum of resources by leveraging the capabilities of FERRET, a powerful visualization environment. The current HTML forms interface enables a user to quickly browse, visualize, and extract variables from a climatological data base. Future versions of the server will expand the list of data sets, offer a collection of mathematical transformations, and add pan and zoom options and graphical overlays. Additional controls will be provided to specify the format of extracted data files.

The "live access" nature of the server permits a user to interactively visualize variables in a particular region of interest. The images are transferred as GIF files -- highly compressed and efficient for network transfer. In cases where the user requires the data on a local host the ability to preview the data enables him/her to download minimum-sized subsets of the data which pass efficiently over the networks.

Acknowledgements:

The authors gratefully acknowledge Dr. D.E. Harrison, whose vision created the FERRET system and whose continual support made it grow; and Jim Holbrook, whose unflagging support for innovation has made this Mosaic development possible. This is JISAO contribution number 297.

References:

[1] The HDF Reference Manual, Version 3.3, February 1994 (http://hdf.ncsa.uiuc.edu:8001/refman/refmanual.html).

[2] Rew, R.K., and G.P. Davis, 1990: The Unidata netCDF: Software for scientific data access. Sixth International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography and Hydrology, Feb 5-9, 1990, Anaheim, CA.

[3] Report to the Senate Committee on Commerce, Science, and Transportation and the House of Representatives Committee on Science, Space, and Technology on a plan to modernize NOAA's Environmental Data and Information Systems based on the Needs Assessment for Data Management, Archival, and Distribution, March 1994.

[4] National Oceanic and Atmospheric Administration 1995-2005 Strategic Plan (ftp://nic.noaa.gov/strategic-plan).

[5] Hankin, S., J. Davison, K. O'Brien, and D.E. Harrison, 1992: FERRET, A computer visualization and analysis tool for gridded data. NOAA Data Report ERL PMEL-38, 164 pp.

[6] Hankin, S., and D.E. Harrison, 1993: FERRET: A Mathematica-style visualization and analysis tool for gridded oceanographic and meteorological data. Ninth International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography and Hydrology, Jan 17-22, 1993, Anaheim, CA.

[7] Denbo, D.W., PLOT PLUS Scientific Graphics System Users Guide, April 1987. Plot Plus Graphics, P.O. Box 4, Sequim, Washington.

About the Authors:

Steve Hankin is a computer scientist at NOAA's Pacific Marine Environmental Laboratory. He earned his Bachelor's degree in Physics from Reed College and his Master's degree in Applied Mathematics from the University of Washington. He is the principal developer of the FERRET visualization and analysis program. He has served on the ANSI Computer Graphics Committee, X3H3, and has a keen interest in computer standards issues.

Jerry Davison is a research scientist with the Joint Institute for the Study of the Atmosphere and Ocean at the University of Washington. He earned his Bachelor's and Master's degrees in Physics at the University of Missouri, Columbia. His contributions to this effort stem from a fascination with color and enthusiasm for environmental observation.

email: hankin@pmel.noaa.gov