André Heck
Strasbourg Astronomical Observatory, France
The most prominent examples of combined metadatabases and yellow-page services in the field are the compilations set up at five different sites internationally by the AstroWeb consortium and the different complementary products of the Star*s Family.
This paper reviews the penetration of the Web in the fields concerned, illustrated by a variety of examples. Future needs are identified.
In mid-1993, astronomy WWW services started flourishing on the Web.
In July 1994, a server at NASA/JPL recorded one million document requests within a few days: this server was offering, not more than a few hours after their acquisition, images of the collision of comet Shoemaker-Levy 9 with Jupiter, observed by space probes or ground-based telescopes from all over the planet. This example is a striking illustration of the dramatic importance the Web has taken on in astronomy and related space sciences.
Our ultimate aim as astronomers or space scientists is to contribute to a better understanding of the universe and consequently to a better comprehension of the place and rôle of man in it. To this purpose, together with theoretical studies, we carry out observations to obtain data that will undergo treatments and studies leading to the publication of results. The whole procedure can include several internal iterations or interactions between the various steps as well as with external fields, scientific disciplines and information handling methodologies.
In the following pages, the concept of information will cover the observational material, the more or less reduced data extracted from it, the scientific results, as well as the accessory material used by the scientists in their work (bibliographical resources, yellow-page services, and so on).
The increasing efficiency of cameras and photon-collecting devices used by astronomers, whether on ground-based telescopes or in space experiments, is generating an unprecedented accumulation of data. The ability to store, manage, and give access to this huge quantity of data, and the associated documents, is one of the major challenges of our science (and of natural sciences in general) for the next decade.
In the 1980s, dedicated computer archives and databases, accessible remotely (when the networks allowed), were the appropriate answer to the data retrieval problem. Later on, in order to cope with the diversity and complexity of the data access that astronomers need as part of their research work, integrated information systems gradually started to build up: examples are the Astrophysics Data System (ADS) in the U.S. (Eichhorn, 1994), and the European Space Information System (ESIS) in Europe (Giommi, 1994).
A key lesson of these experiences is that the data should reside and be maintained where the expertise is located: i.e. as close as possible to the data providers, who are generally also the first users or Principal Investigators of the experiment, and who are able to understand and process the raw data and provide the routines for producing final data in physical units - a process which sometimes implies several years of iterative improvements.
However, the scientific teams are generally reluctant to devote time, manpower and money to the different aspects of data distribution to a wider community (documentation, homogenization, building of friendly user interfaces, etc.). The role of the data centers and information systems is thus to bridge the gap between the specialized approach of the scientific teams and the general approach of the wider community of researchers: a striking example is the need to cross-compare data acquired for the same astronomical objects under multiple identifications, and through different instruments working at different wavelengths (panchromatic astronomy). For more on these aspects, see Albrecht & Egret (1993), Wells (1992).
The recent developments of the World Wide Web bring interesting answers to these problems, by providing
The astronomical community showed a very early interest in the World Wide Web: see, e.g., the invited contribution by White (1993). Several dozen WWW servers were already available in mid-1993, and several hundred in mid-1994. The astronomical community was ready to leap onto the Web because of its familiarity with international collaboration through data networks, and because of its computer infrastructure.
Let us characterize some of these servers, and propose a typology of the existing astronomy services on the Web.
Examples: Astronomical Observatory of Bologna; Lund Observatory
Examples: The Canada-France-Hawaii Telescope; European Southern Observatory
Examples: Strasbourg Astronomical Data Center (CDS); NASA's Extreme Ultraviolet Explorer (EUVE); HEASARC
Examples: ADS Abstract Service; ADASS III Proceedings
Examples: WebStars: Astrophysics in Cyberspace; NASA/JPL server for Comet Shoemaker-Levy 9; Starlink
Examples: The Star*s Family products; The AstroWeb database
The number of on-line astronomical services (databases, datasets, catalogs, archives, ftp-services, gophers, WAIS indexes, information systems, directories of services, Web home pages, etc.) is such that we now need not simply directories, but meta-directories, and, of course, a homogeneous interface to the jungle of services.
The paradox is that, because of this extraordinary wealth of information pages, one may have trouble finding the appropriate server for answering a simple request, while, at the same time, sophisticated functions are made available as buttons on an X-window application.
Very often the user has no access to adequate information about these services: what are the actual data contents? How frequently are they updated? What are the data retrieval functionalities and limitations?
In order to tackle this problem, important efforts are being made, here and there, to help organize the information:
But these efforts are not sufficient to reach the remote user, who is most often specialized in his or her own field and is not interested in attending these meetings or reading these books ... until he or she faces the problem in his or her scientific activity, and has to solve it quickly.
In this context, the effort to combine yellow-page services and meta-databases of active pointers is a crucial solution to the data retrieval problem.
The concept of yellow-page services was probably publicly introduced for the first time in the astronomical community at the ALD-II Conference in Haguenau (Heck & Murtagh, 1992). Earlier efforts (described by Heck, 1991) were essentially classical publications on paper: directories of organizations, lists of electronic addresses, and so on. Some files started to be retrievable electronically (e.g. Benn & Martin's lists of e-mail addresses).
Meanwhile, the existing reference products have adapted to the new capabilities brought in by the evolution of information technology. New reference products have naturally been derived from the new media available.
Star*s Family. This is the generic name for a growing collection of directories, dictionaries and databases, organized around three sets of master files, of which we will mention here the WWW version:
All together, they can be reached via the CDS homepage giving access to the various CDS services, as well as to external astronomy resources such as AstroWeb (see below).
The hypertextual structure of the databases on the CDS Mosaic server includes, beyond the search mechanisms, general introductory documents, access to forms, tips for usage, hot news, e-mailing facilities, lists of national telephone, telefax and telex codes, and so on. All the facilities point to each other, with possible active navigation via retrieved URLs. At the time of writing, upgrading plans include some logical syntax capabilities, an underlying thesaurus structure and, last but not least, retrieval of observing facilities on the basis of their location on our planet (especially useful for observing campaigns).
AstroWeb (Jackson et al., 1994) is a collection of pointers to astronomically relevant information resources available on the Internet. It is maintained by the AstroWeb consortium, a group of scientists from CDS (Strasbourg Astronomical Data Center), MSSSO (Mount Stromlo and Siding Spring Observatories), NRAO (National Radio Astronomy Observatory), STScI (Space Telescope Science Institute) and ST-ECF (Space Telescope - European Coordinating Facility, hosted at the European Southern Observatory).
Each database entry includes at least one hyperlink to the actual resource, as well as a short and a long description, together with a classification (established by one of the consortium members) allowing the description to be listed under one or several of the following categories:
Five separate versions of the AstroWeb database are currently accessible, at each participating institution. These versions have different styles and contents, but all are computed from the same master database, with the help of a set of dedicated tools (Jackson, 1994).
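The entry structure and the single-master approach described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the consortium's actual tools or data format: the `ResourceEntry` class, the `group_by_category` and `render_listing` functions, the category labels and the `example.org` URLs are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceEntry:
    """One hypothetical record of a shared master database."""
    title: str
    urls: List[str]                 # at least one hyperlink to the resource
    short_description: str
    long_description: str
    categories: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The text states that every entry carries at least one hyperlink.
        if not self.urls:
            raise ValueError("an entry needs at least one hyperlink")


def group_by_category(entries: List[ResourceEntry]) -> Dict[str, List[ResourceEntry]]:
    """List each entry under every category it has been classified into."""
    grouped: Dict[str, List[ResourceEntry]] = {}
    for entry in entries:
        for category in entry.categories:
            grouped.setdefault(category, []).append(entry)
    return grouped


def render_listing(entries: List[ResourceEntry], verbose: bool = False) -> str:
    """Render one site-specific view (short or long style) from the master records."""
    lines: List[str] = []
    for category, members in sorted(group_by_category(entries).items()):
        lines.append(category)
        for entry in sorted(members, key=lambda e: e.title):
            text = entry.long_description if verbose else entry.short_description
            lines.append(f"  {entry.title} <{entry.urls[0]}>: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder records; titles, URLs and category labels are invented.
    master = [
        ResourceEntry("Example Observatory", ["http://example.org/observatory"],
                      "Home page", "Home page of a fictional observatory.",
                      ["category-1"]),
        ResourceEntry("Example Archive", ["http://example.org/archive"],
                      "Data archive", "Archive of a fictional space mission.",
                      ["category-1", "category-2"]),
    ]
    print(render_listing(master))                 # terse style
    print(render_listing(master, verbose=True))   # detailed style
```

Both listings are derived from the same `master` list, so a correction made once to a record shows up in every rendered version, which is the benefit of computing all site versions from a single master database.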
This explosion of documents on the Web is not a bed of roses. New facilities and new possibilities naturally bring in new questions and new problems. Some of the Mosaic servers have already reached a fair degree of maturity. Others are still at a rather wild stage, for lack of structure and homogeneity or simply because they offer, let us say it frankly, rubbish of little interest. Although quite a few features have been adopted de facto by the developers of documents on the Web, there is a definite need for an ethical charter. It could cover quite a number of features, from the substance of the documents themselves to their aesthetic presentation and a number of recommended functionalities. These questions, and more, are addressed in a review paper by A. Heck (1995).
This shows that there are non-negligible educational aspects to be taken into account as to the introduction and training of young and not-so-young people in the new technologies within the various communities. This is true not only for scientists, but also for librarians and documentalists, who will see their rôle change significantly within their institutions and who will deal with more and more virtual material.
It would be too hazardous to play the game of predicting the long-term impact of this ongoing evolution of information technology. The future is too fuzzy and all predictions are risky. Two years ago, Mosaic was unknown, while today it allows daily cyberspace navigation on a planetary scale. Who would still dare to predict the status of computer technology and information handling a couple of years ahead?