Information which describes a data set is known as "metadata". It is not the data itself but rather "data about the data". It is analogous to library catalogues which describe books yet are not the books themselves. The aim is that the prospective user should be able to find out about the data set without needing to access and investigate the data itself.
Metadata has two main functions: to enable the discovery of data sets, and to describe their fitness for use.
Until recently the approach has been to combine these two functions. Directories tried to comprehensively describe the data sets and simultaneously provide the means to discover them.
This approach has often failed. The main reason is that custodians find the task of maintaining documentation onerous. There is sometimes more work in completing the metadata description than in preparing the actual data. Under pressure and with tight deadlines, one may manage to get the report completed or a map generated but never seems to find enough time to properly document the product. Preparing informative maps and reports can be rewarding and even enjoyable, but documenting one hundred points about data quality is nobody's idea of fun!
As a consequence there is often little descriptive information about the data set available and sometimes none at all. When this happens the means to discover the existence of a data set is lost and so its utility is diminished. In fact the data sets themselves are often effectively lost. No-one knows about them and so they can sit in a cupboard and gather dust.
A new way of handling directories is needed.
One way of circumventing these problems is to separate the functions of the metadata. A minimal set of, say, 20 core fields would enable a large number of data sets to be documented in a reasonable time frame. Maintaining and updating the entries would also be easy. A comprehensive directory would enable users to discover relevant, useful data sets. This meets the first objective of metadata: discovery.
Using the multiformat capabilities of WWW and WAIS, a picture of the data set can be presented alongside the metadata. If the user feels that the data set might be useful, an embedded link (URL) in the metadata document can be followed to generate a complete metadata report based on the current state of the data.
This next phase meets the second objective of metadata: to describe fitness for use. In-depth documentation can then be performed as a separate process with separate objectives, priorities and resourcing. The success of the directory does not depend on this in-depth documentation, yet the directory can easily link to it where it exists.
This minimalist approach has a better chance of success. The smaller metadata file is less imposing and is more likely to be completed by the custodian. It will still encompass the key fields required for searching, such as theme, region and date, and it can still give a pointer to more comprehensive information. Ideally this pointer will be a link to the detailed metadata, but it may simply be the contact details of the custodian.
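As a rough illustration of the idea, the sketch below models a minimal directory entry that can be searched by theme, region and date, and that points either to detailed metadata or to the custodian. The field names, the `CoreRecord` class, the helper functions and the sample records are all hypothetical; they are not taken from the actual directory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoreRecord:
    """A minimal directory entry; all field names here are illustrative."""
    title: str
    theme: str
    region: str
    year: int
    custodian_contact: str
    detail_url: Optional[str] = None  # link to the full metadata report, if any


def search(records, theme=None, region=None, year=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for rec in records:
        if theme is not None and rec.theme.lower() != theme.lower():
            continue
        if region is not None and rec.region.lower() != region.lower():
            continue
        if year is not None and rec.year != year:
            continue
        hits.append(rec)
    return hits


def pointer(rec):
    """Prefer the link to detailed metadata; fall back to the custodian."""
    if rec.detail_url:
        return rec.detail_url
    return "Contact: " + rec.custodian_contact


# Made-up records, for illustration only.
records = [
    CoreRecord("Example vegetation survey", "vegetation", "Tasmania", 1992,
               "custodian@example.org", "http://example.org/meta/veg.html"),
    CoreRecord("Example coastal mapping", "coastline", "Queensland", 1991,
               "custodian@example.org"),
]

for rec in search(records, region="queensland"):
    print(rec.title, "->", pointer(rec))
```

Because only the search fields are required, an entry without a `detail_url` is still discoverable; the fallback to the custodian's contact details mirrors the weaker form of pointer described above.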
An example metadata file for the Australian World Heritage Areas ERIN (1993) data set is at [Appendix A].
This metadata file is a plain text file and, as such, could be maintained outside a database. Custodians without access to database facilities can still prepare and maintain data set directories. This makes it even more inviting to fill out a directory entry.
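To show how little tooling a plain text entry needs, here is a minimal sketch that reads a file of `Field: value` lines and reports which core fields are still empty. The `Field: value` layout, the list of required fields and the sample entry are assumptions made for illustration; the layout of the file in Appendix A may differ.

```python
def parse_metadata(text):
    """Parse 'Field: value' lines into a dict, skipping blanks and comments."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if ":" not in line:
            continue  # ignore lines that are not in 'Field: value' form
        field, value = line.split(":", 1)  # split once so URLs keep their colons
        record[field.strip().lower()] = value.strip()
    return record


def missing_fields(record, required=("title", "theme", "region", "date", "contact")):
    """List the required fields that are absent or left blank."""
    return [name for name in required if not record.get(name)]


# Hypothetical entry with the contact left blank.
sample = """\
# Hypothetical entry
Title: Example data set
Theme: vegetation
Region: Tasmania
Date: 1993
Contact:
"""

entry = parse_metadata(sample)
print(missing_fields(entry))
```

A custodian could run such a check with nothing more than a text editor and a script, without any database facilities.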
However, there are advantages in using a database, as it helps to manage this information. If the metadata is maintained in a database, the directory descriptions and accompanying images can be generated automatically on a regular basis. This helps to minimise the drudgery of the task and allows the information to be as up-to-date as possible.
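The automatic generation step might look something like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The table name `metadata`, its columns, the output layout and the sample rows are assumptions for illustration only and do not describe the database actually used.

```python
import sqlite3


def build_directory(conn):
    """Render one plain-text directory entry per row of the metadata table."""
    rows = conn.execute(
        "SELECT title, theme, region, year, contact FROM metadata ORDER BY title"
    )
    entries = []
    for title, theme, region, year, contact in rows:
        entries.append(
            f"Title: {title}\nTheme: {theme}\nRegion: {region}\n"
            f"Date: {year}\nContact: {contact}\n"
        )
    return entries


# Hypothetical table and rows, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata "
    "(title TEXT, theme TEXT, region TEXT, year INTEGER, contact TEXT)"
)
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?, ?, ?)",
    [
        ("Example vegetation survey", "vegetation", "Tasmania", 1992,
         "custodian@example.org"),
        ("Example coastal mapping", "coastline", "Queensland", 1991,
         "custodian@example.org"),
    ],
)

for entry in build_directory(conn):
    print(entry)
```

Running such a script on a schedule would regenerate every directory entry from the current database contents, so the published descriptions never lag far behind the data.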
We have endeavoured to follow the US "Content Standards for Digital Geospatial Metadata" [FGDC2]. Australia is adapting the "Spatial Data Transfer Standard" [SDTS] for local use.