The main concept behind the GCPS is to enhance the exploration of climatological data in ways that will allow an investigator to gain an understanding of the information contained in the data. This is accomplished by creating an environment that allows for the non-linear flow and visualization of information about the data. This enables an investigator to pose a particular question about climate change and use GCPS to provide quantitative information on the metadata, determine the credibility of the climatological datasets, graphically examine the data spatially and temporally, and obtain the data needed to answer the question.
An important aspect of this entire process is having an efficient way to navigate through the information and large datasets. This navigation process is a multilevel investigation of the information and data for a given inquiry. What conclusions have been drawn from the research using this dataset? What is the temporal range and spatial distribution of the data? Answers to these questions are linked using hypertext/media to journal articles, real-time displays of time series, contour plots, and ultimately access to the data, all of which are available on the Internet. These questions should be answered before embarking on other investigations with these data. Each scientist has a unique way of assimilating scientific information using this hierarchical information approach, but the information on a particular issue should be sufficient to be understood by individuals with different levels of knowledge of the subject.
The GCPS is being developed using this paradigm to explore climate change. It is a coordinated project involving groups within two major NOAA components: the National Climatic Data Center (NCDC) and the Climate Diagnostics Center (CDC).
MOSAIC uses hypertext/media and provides the framework for the hierarchical information delivery system. Information about particular scientific investigations is linked together using Hypertext Markup Language (HTML). For example, within the MOSAIC framework there is a discussion of global warming (Figure 2), which includes text and figures electronically linked to key words. Each of the small images can be "clicked on" to view it on the screen or download it to the client computer as a PostScript file. The user can also download the entire dataset via anonymous ftp within the MOSAIC framework, or a subset of the data in the time and space domain using the GCPS data access engine.
The interactive portion is shown in Figure 3. The user can define the following aspects of the query:
Figure 4 shows an example of an anonymous ftp dataset description page. Each dataset is displayed like a book. In this case we see the geographic distribution of all stations and an overview of the dataset's contents. Selected journal articles that show results from the analysis of this dataset are linked from the anonymous ftp page, and all documentation about the dataset is provided when it is downloaded.
The GCPS data access engine (Carroll and Baker 1994) consists of a relational database (Empress), the Naval Environmental Operational Nowcasting System (NEONS), an X Window System display server, and a Motif window manager. The access engine provides a method to browse and extract station and gridded data. The relational tables are an integral part of the information about the datasets residing in the RDBMS. The database access engine provides a user interface that is coupled to all databases within GCPS. The initial window is shown in Figure 5. This window displays the datasets from which subsets of data may be extracted in the time and space domain. Just below the datasets is the query for the time domain, where the user can choose the time increment. Below this is a world map, on which the user can box in any part of the world using the cursor. These two actions define the time and space domain for extracting a subset of the data. At this juncture a query is formulated, and the locations of the stations and any metadata associated with those stations appear (Figure 6). Figure 6 shows the station distribution for the chosen area and time range, together with the metadata, which in this case consists of latitude, longitude, station name, and period of record. Ultimately the data and/or the metadata can be written to a file in a format specified by the user.
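The time and space selection described above can be illustrated with a minimal sketch. It is not the GCPS access engine (which is built on Empress and NEONS); it only assumes that each station carries the metadata fields named in the text (latitude, longitude, station name, and period of record). The Station class, the select_stations function, and the sample stations are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    """Hypothetical station metadata record (fields follow the text)."""
    name: str
    lat: float        # degrees north
    lon: float        # degrees east, in the range -180..180
    first_year: int   # start of the period of record
    last_year: int    # end of the period of record


def select_stations(stations, lat_range, lon_range, year_range):
    """Return stations inside the box whose record overlaps the years.

    Assumes the box does not cross the date line.
    """
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    start, end = year_range
    return [
        s for s in stations
        if lat_min <= s.lat <= lat_max
        and lon_min <= s.lon <= lon_max
        # Two intervals overlap when each starts before the other ends.
        and s.first_year <= end
        and s.last_year >= start
    ]


# Made-up stations, for illustration only.
STATIONS = [
    Station("ALPHA", 35.6, -82.6, 1902, 1990),
    Station("BRAVO", 40.0, -105.3, 1950, 1993),
    Station("CHARLIE", -33.9, 151.2, 1880, 1993),
]

subset = select_stations(STATIONS, (30.0, 45.0), (-110.0, -80.0), (1900, 1940))
for s in subset:
    print(s.name, s.lat, s.lon, s.first_year, s.last_year)
```

Here the boxed region plays the role of the area drawn on the world map and the year range plays the role of the time-domain query; the printed rows correspond to the metadata listing of Figure 6.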
All of the above methods of exploring and examining the information and data can be done in any order the user would like to pursue. This eliminates the conventional problem of having to step through sequential processes. The MOSAIC interface allows one to explore information and data in a non-linear fashion on the Internet.
The temporal check flags extreme values based on limits determined from a multiple of the interquartile range (IR) calculated for each station/month. This approach is common in exploratory data analysis. An outlier is flagged when:
$$X_i - q_{50} > f \cdot \mathrm{IR} \tag{1}$$
where $X_i$ is the monthly mean for year $i$, $q_{50}$ is the median (the 50th percentile), and $f$ is the multiplication factor. A value of $f = 2.75$ was used for temperature and $f = 4.00$ for precipitation.
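The test in (1) can be sketched for a single station/month as follows. This is an illustration rather than the authors' code: the function name flag_outliers and the sample values are hypothetical, and the quartile convention (the "inclusive" method of Python's statistics.quantiles) is an assumption, since the text does not say how the quartiles are computed. The comparison follows the inequality exactly as printed, which is one-sided; a two-sided check would compare the absolute difference instead.

```python
import statistics


def flag_outliers(monthly_means, f):
    """Flag values whose excess over the median is more than f * IR.

    monthly_means: one value per year for a single station and month.
    f: multiplication factor (2.75 for temperature, 4.00 for
       precipitation, per the text).
    """
    # Quartile method is an assumption; the text does not specify one.
    q25, q50, q75 = statistics.quantiles(monthly_means, n=4, method="inclusive")
    iqr = q75 - q25
    # One-sided test, as printed in (1).
    return [x - q50 > f * iqr for x in monthly_means]


# Made-up monthly mean temperatures (one per year), for illustration only.
values = [10.0, 11.0, 12.0, 13.0, 14.0, 30.0]
flags = flag_outliers(values, f=2.75)
print(flags)
```

With these sample values the median is 12.5 and the interquartile range is 2.5, so only the last value exceeds the limit and is flagged.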
The spatial check uses nearby simultaneous values to calculate an estimated value at a target station over the period of time for which adequate data are available. There are numerous spatial interpolation methods available for point estimation with irregularly spaced data. Typically, the choice of methodology depends on several factors: the meteorological variable under consideration; the geographical area; the spatial distribution of surrounding observations; and the month/season for which the target station is to be estimated. Since these estimates are required for each month separately, over a variety of terrain and with differing numbers of available surrounding observations, six different methods are utilized and compared for each month for each station, and the best one is chosen (based on the correlation).
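The selection step (compare several estimators and keep the one whose estimates correlate best with the target station) can be sketched as below for one station and one month. The text does not name the six methods, so the two estimators here (a plain neighbor mean and an inverse-distance weighted mean) are stand-ins chosen only for illustration; all function names and sample numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighbor_mean(neighbor_values, distances_km):
    """Stand-in estimator: unweighted mean of the neighbors."""
    return sum(neighbor_values) / len(neighbor_values)


def inverse_distance(neighbor_values, distances_km):
    """Stand-in estimator: mean weighted by 1 / distance."""
    weights = [1.0 / d for d in distances_km]
    total = sum(w * v for w, v in zip(weights, neighbor_values))
    return total / sum(weights)


def best_method(target_series, neighbor_series, distances_km, methods):
    """Score each method by correlation with the target; return the best."""
    scores = {}
    for name, method in methods.items():
        estimates = [method(vals, distances_km) for vals in neighbor_series]
        scores[name] = pearson(target_series, estimates)
    return max(scores, key=scores.get), scores


# Made-up data: four years of one month at the target and two neighbors.
target = [1.0, 2.0, 3.0, 4.0]
neighbors = [(1.1, 3.0), (2.0, 1.0), (2.9, 4.0), (4.2, 2.0)]  # (near, far)
distances = [10.0, 100.0]  # km from the target station

methods = {"neighbor_mean": neighbor_mean, "inverse_distance": inverse_distance}
best, scores = best_method(target, neighbors, distances, methods)
print(best, scores)
```

In this toy case the nearby neighbor tracks the target closely while the distant one does not, so the distance-weighted estimator scores the higher correlation and is selected. In practice the comparison would be repeated for every station and month, as the text describes.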
The end result of these checks is a series of flags and a file of summary statistics for each station and month. All of this information is cross-linked through hypertext so that these datasets can be explored.
In addition to these datasets, GCPS provides 100- and 1000-year output from recent versions of the Geophysical Fluid Dynamics Laboratory Global Climate Model. There are temperature and precipitation model runs with CO2 fixed at the amount observed in 1958 (control run) and with an approximate 1% increase each year thereafter for the 100-year run, together with the control run output for the 1000-year run. The data are on a Gaussian grid with monthly resolution.
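To give a sense of scale for the increasing-CO2 run, the short calculation below assumes that the 1% increase compounds annually. That is an assumption: the text says only "approximate 1% increase each year". The function name co2_ratio is hypothetical.

```python
import math


def co2_ratio(years, annual_increase=0.01):
    """CO2 relative to the starting (1958) level, assuming compounding."""
    return (1.0 + annual_increase) ** years


ratio_100 = co2_ratio(100)                       # level at end of 100-year run
doubling_years = math.log(2.0) / math.log(1.01)  # years needed to double

print(f"ratio after 100 years: {ratio_100:.2f}")
print(f"doubling time: {doubling_years:.1f} years")
```

Under this assumption CO2 doubles after roughly 70 years and reaches about 2.7 times the 1958 level by the end of the 100-year run.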
Doty, B., and J. L. Kinter III, 1993: The Grid Analysis Display System (GrADS): An update. Proceedings of the Ninth International Conference on Interactive Information and Processing Systems for Meteorology, Oceanography, and Hydrology, pp. 165-167.
Eischeid, J. K., C. B. Baker, T. R. Karl, and H. F. Diaz, 1994: The quality control of long-term climatological data using objective data analysis. Submitted to J. Appl. Meteor.
Vose, R. S., T. C. Peterson, R. L. Schmoyer, J. K. Eischeid, P. M. Steurer, R. R. Heim, and T. R. Karl, 1993: The Global Historical Climatology Network: Long-term monthly temperature, precipitation, and pressure data. Fourth AMS Symposium on Global Change Studies, Anaheim, CA, January 17-22, 1993.