NASA's Use of the World Wide Web to Deliver Shoemaker-Levy 9 Collision Data in Near-Real Time


Syed S. Towheed
Systems Programmer

A current version of this document is also available from http://nssdc.gsfc.nasa.gov/misc/www_conf/towheed.html.

Abstract

Thousands of scientists and non-scientists alike were fascinated by the recent spectacle of Jupiter being bombarded by the fragments of Comet Shoemaker-Levy 9. The National Space Science Data Center (NSSDC), located at Goddard Space Flight Center, the Jet Propulsion Laboratory (JPL), and the Space Telescope Science Institute (STScI) were among the NASA elements that made Shoemaker-Levy 9 (SL-9) images from a variety of sources available via the World Wide Web (WWW) in near-real time. The result was record-breaking usage of NASA's World Wide Web servers, as over 2,020,000 accesses were logged from July 15-27, 1994. Through the Shoemaker-Levy 9 collision event, NASA has demonstrated the "real-world" use of this new technology as an immensely powerful and cost-effective tool for the rapid development of data delivery and information systems, as well as for the massive transfer of data to a wide user base. This paper describes the methods and resources involved in creating the systems and analyzes their usage.


Summary of Servers Providing Shoemaker-Levy 9 Data

Space Telescope Science Institute

URL: http://marvel.stsci.edu/EPA/Comet.html

System Hardware and Software


CPU: Sun SS1000
OS: Solaris 2.3
RAM: 64 MB
Server: NCSA httpd (also ftpd, gopher, and freewais)
Network: T1

The Space Telescope Science Institute WWW service mostly provides images taken by the Hubble Space Telescope. However, selected data from ground-based observatories are also available. The home page has links to other pages which include thumbnail images, movies, and textual information. The home page also has links to other SL-9 WWW sites and STScI's Gopher service.

Jet Propulsion Laboratory

URL: http://newproducts.jpl.nasa.gov/sl9/sl9.html, http://navigator.jpl.nasa.gov/sl9/sl9.html

System Content (File type, file count, and total size in KB)


HTML: 460, 80
MPEG: 14, 9,761
GIF: 479, 28,377

System Hardware and Software

Initial Site (Newproducts)          Secondary Site (Navigator)

CPU: Sun SPARCstation 2             Sun SPARCstation 2
OS: SunOS 4.1.3                     SunOS 4.1.3
RAM: 32 MB                          16 MB
Server: NCSA httpd 1.3              NCSA httpd 1.3
Network: Two T1 lines into JPL

The JPL home page has links to News Flash, Background Information, Images, Animation, Comet Shoemaker-Levy 9 Impact Times, TV Coverage, Spacecraft Observations of the Impacts, Ground Based Observations, and links to Other Comet Shoemaker-Levy 9 Home Pages.

The JPL system provides a substantial amount of information about the SL-9 event, almost all of which is provided in HTML format. Images from 50 different observatories appear as near full-size in-line images on pages that also contain each image's caption. A full-size copy of an image can be retrieved by selecting it.

Goddard Space Flight Center

URL: http://nssdc.gsfc.nasa.gov/sl9/comet_images.html

System Content (File type, file count, and total size in KB)


TXT: 171, 190
HTML: 36, 94
GIF (full resolution): 215, 18,890
GIF (thumbnail): 214, 763

System Hardware and Software

Initial Site (Bolero)          Secondary Site (NSSDC)

CPU: DEC AXP 3000-400          DEC AXP 3000-600
OS: OSF/1 1.3                  OSF/1 2.0
RAM:  32 MB                    64 MB
Server: NCSA httpd 1.3         NCSA httpd 1.3
Network: T1 lines into GSFC

The NSSDC SL-9 system was designed to provide rapid access to data, so the whole system consists of only two layers. The first layer, or "home page," lists the 27 observatories and sources from which data are available. Links are also provided to 12 other SL-9 sites, including the JPL and STScI servers. Two indices are provided which list images by fragment and by image summary. The second layer consists of individual observatory pages with in-line thumbnail images. Full resolution images are obtained by selecting the corresponding thumbnail.


Detailed Description of NSSDC SL-9 WWW System Development

Justification

On the morning of Monday, July 18, 1994, both the JPL and STScI WWW servers were laboring under enormous loads and turning away many users. Out of frustration and excitement, users turned to the NSSDC hoping to find SL-9 observations. Our ambitions about providing data were rather modest: we would put up a few of the better observations. However, a hail of e-mail that same day from around the world asking where "the" images were convinced us that a change of plans was in order.

By midday both the author and Dave Williams, NSSDC's resident planetary scientist, had put together a dozen or so of the first observations. This was a difficult task to accomplish since we had to acquire the data from several observatory sites (usually via FTP) while networks had already begun to experience a global slowdown due to the JPL and STScI activity. We tested the system and made it available through our existing services on our primary WWW server, Bolero. Much later that night the author announced on several network newsgroups that the National Space Science Data Center was now providing SL-9 observations through its WWW service.

Primary Server Crash and Ethernet Meltdown

By 9:00 am the following morning, the load on the server Bolero was high enough that we closed down the service (see Figure 1). Bolero was spawning up to 30 httpd daemons simultaneously and was not able to service any of them. It went into a loop, spawning and closing down daemons. We made the decision to migrate the entire SL-9 system off Bolero to a new server. The new server, NSSDC, was to serve only SL-9 data while our traditional WWW services remained on Bolero.


Figure 1

At the same time, network administrators at Goddard were very concerned about the saturation of the backbone and the consequent loss or severe slowdown of other services. The Space Telescope Science Institute was experiencing similar difficulties, as Fred Romelfanger, developer of its SL-9 WWW system, noted:

"[unprecedented access]...cause our T1 link to saturate, and we went off the air that night while NSI moved our network link due to problems that the data transfers were causing other sites. The saturation lasted for about a week and a half. During this time access to the Internet was very slow, but not impossible to use."

Secondary Server Brought On-line Within Three Hours

The initial shock and the subsequent flood of e-mail as we shut down Bolero taught us a few lessons right away. We analyzed the JPL and STScI system architectures and the feedback we were getting from the user community to establish the following design criteria.

We assumed that the user was essentially interested in the latest observations and not necessarily in the background or other textual information. The whole system was only two layers deep so as to minimize browsing. We wanted the user to come in, go to the observatory or fragment of interest, view the browse image, click on the image and caption to get the originals, and leave. The home page listed all the observatories from which we had acquired data. It also included links to other sites that were providing data, because we wanted to encourage users to look elsewhere immediately if they did not see the observatory of their choice on our list. Next to the name of each observatory, we put a "time stamp" showing when that observatory page was last updated, so that users would not need to open the page to see whether any new data had arrived since their last visit. We also put a list of the comet fragments for which we had data next to the observatory names. Again, this was done to discourage unnecessary browsing.

After selecting the observatory of their choice, the user saw the second and last layer of the system. The "observatory page" simply had a collection of in-line "thumbnail" images sorted by observation date and fragment. Below each image, a graphic bullet linked to a text file containing the image caption. Full resolution images were retrieved by selecting the thumbnail.
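The two-layer layout described above can be illustrated with a short sketch. It is not the code NSSDC used; the function name, the dictionary keys, and the sample observatory entry are all hypothetical, and the markup is only an approximation of what a home page of that period might have contained.

```python
from html import escape

# Hypothetical template for one home-page entry: a link to the observatory
# page, followed by the last-update time stamp and the fragments covered.
ENTRY = '<li><a href="{href}">{name}</a> (updated {updated}; fragments: {fragments})</li>'

def render_home_page(observatories):
    """Return an HTML list with one entry per observatory."""
    items = []
    for obs in observatories:
        items.append(ENTRY.format(
            href=escape(obs["page"], quote=True),
            name=escape(obs["name"]),
            updated=escape(obs["updated"]),
            fragments=escape(", ".join(obs["fragments"])),
        ))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

# Made-up sample entry, for illustration only.
SAMPLE_OBSERVATORIES = [
    {"name": "Hubble Space Telescope", "page": "hst.html",
     "updated": "21 Jul 1994 06:00", "fragments": ["A", "G"]},
]

print(render_home_page(SAMPLE_OBSERVATORIES))
```

Showing the time stamp and fragment list on the first layer lets a returning visitor decide whether to open an observatory page at all, which is the browsing reduction the design aimed for.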

We knew our SL-9 system was going to be very heavily used. However, we were essentially constructing an image server, so somehow we needed to minimize the impact of transmitting large image files. We decided that we would not use any full resolution images on any page, but would instead use in-line browse images. We experimented with several sizes before deciding that 100x100 pixels was sufficiently large for the user to view the essential features of the image. With hindsight, halving the file size by going down to 70x70 pixels might have been a better choice.
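The "halving" figure follows from the pixel counts, as the small calculation below shows. It assumes that file size scales roughly with the number of pixels, which is only approximate for compressed GIF files; the function name is illustrative.

```python
def pixel_ratio(new_side, old_side):
    """Ratio of pixel counts between two square thumbnails."""
    return (new_side * new_side) / (old_side * old_side)

# 70x70 = 4,900 pixels against 100x100 = 10,000 pixels.
ratio = pixel_ratio(70, 100)
print(f"A 70x70 thumbnail has {ratio:.0%} of the pixels of a 100x100 one")
```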

Textual information such as captions was not included within the HTML pages, thus discouraging browsing, but was made available through a link below each image. We also consciously avoided including animations and motion pictures, realizing that their immense size would create a massive bottleneck for users who just wanted the latest observations. We made at least three updates daily to the SL-9 system to ensure that the most recent observations were available. Normally, this was done via FTP to the observatory sites very early each morning when network traffic was lowest.


Usage Statistics

Space Telescope Science Institute

The Space Telescope Science Institute WWW system may have experienced 400,000 accesses (1) between July 17-28, 1994. The STScI server experienced massive access over this period, resulting in critical server overload. The actual number might have been substantially higher had it not been for the failure of the primary server and the consequent loss of the server log.

Jet Propulsion Laboratory

The Jet Propulsion Laboratory SL-9 WWW system experienced over 1,200,000 accesses from July 15-27, 1994. During the period of actual comet impact (Saturday, July 16, to Friday, July 22, 1994), the total accesses were over 880,000. JPL set up a "mirrored" site with identical data on July 19, 1994, after experiencing massive access the day before. Unfortunately, the JPL server logs were not available at the time of writing this paper. A common problem faced by all three sites was the management of enormous log files.

Goddard Space Flight Center

The NSSDC SL-9 WWW system located at Goddard Space Flight Center experienced close to 420,000 accesses between July 18 and 27, 1994. The total data transfer for the SL-9 system was a little over 6,000 megabytes. The number of unique hosts (2) which accessed the system was 9,733. The most requested image was that of an early fragment A impact taken by the Hubble Space Telescope. Over 2,600 requests were made for this image alone.
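Figures of this kind (accesses, bytes transferred, unique hosts, most requested file) can be derived from a server access log. The sketch below assumes the log is in the Common Log Format; the paper does not state the exact format of the NSSDC logs, and the sample lines and host names are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)

def summarize(lines):
    """Count accesses, unique hosts, and bytes, and find the top file."""
    accesses = 0
    total_bytes = 0
    hosts = set()
    paths = Counter()
    for line in lines:
        match = CLF.match(line.strip())
        if match is None:
            continue  # skip lines that do not parse
        accesses += 1  # one access = one file transfer to a client
        hosts.add(match.group("host"))
        if match.group("size") != "-":
            total_bytes += int(match.group("size"))
        parts = match.group("request").split()
        if len(parts) >= 2:
            paths[parts[1]] += 1  # second token of the request is the path
    return {
        "accesses": accesses,
        "unique_hosts": len(hosts),
        "bytes": total_bytes,
        "most_requested": paths.most_common(1),
    }

# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    'host1.example.edu - - [21/Jul/1994:13:05:01 -0400] "GET /sl9/hst_a.gif HTTP/1.0" 200 1000',
    'host2.example.com - - [21/Jul/1994:13:06:10 -0400] "GET /sl9/hst_a.gif HTTP/1.0" 200 1000',
    'host1.example.edu - - [21/Jul/1994:14:00:00 -0400] "GET /sl9/comet_images.html HTTP/1.0" 200 500',
    'not a log line',
]

print(summarize(SAMPLE_LOG))
```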


Figure 2a, b

Figures 2a and 2b demonstrate that our assumption about the user being mostly interested in images was correct. We were transferring 3 times more image files than text files. The new server was also transferring data 4.6 times faster than Bolero had done at its peak before the crash (see Figure 3). We reached peak activity on NSSDC on Thursday, July 21, 1994, when we logged close to 6,000 accesses between 1:00 and 2:00 PM. By the end of that day we had logged 73,557 requests and transferred a little over 1,200 megabytes of data.
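A rough sketch of how an image-to-text file ratio and a peak hour could be computed from request paths and log time stamps follows. The grouping of file extensions into "image" and "text" is an assumption based on the file types listed for the NSSDC system, and the function names and sample values are hypothetical.

```python
from collections import Counter

IMAGE_EXT = (".gif",)          # assumed image types
TEXT_EXT = (".html", ".txt")   # assumed text types

def classify(path):
    """Label a request path as image, text, or other by its extension."""
    lowered = path.lower()
    if lowered.endswith(IMAGE_EXT):
        return "image"
    if lowered.endswith(TEXT_EXT):
        return "text"
    return "other"

def image_text_ratio(paths):
    """Image transfers divided by text transfers; None if no text files."""
    counts = Counter(classify(p) for p in paths)
    if counts["text"] == 0:
        return None
    return counts["image"] / counts["text"]

def peak_hour(timestamps):
    """Return (hour_key, count) for the busiest hour, or None if empty.

    Time stamps are expected in the form 21/Jul/1994:13:05:01; the key
    keeps everything up to and including the two-digit hour.
    """
    hours = Counter(t[:t.index(":") + 3] for t in timestamps)
    top = hours.most_common(1)
    return top[0] if top else None

# Invented sample values, for illustration only.
print(image_text_ratio(["/a.gif", "/b.GIF", "/c.gif", "/index.html"]))
print(peak_hour(["21/Jul/1994:13:05:01", "21/Jul/1994:13:40:00",
                 "21/Jul/1994:14:00:00"]))
```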


Figure 3


Figure 4

Figure 4 shows the daily accesses to the JPL and NSSDC servers plotted on a logarithmic scale. Note the dip in accesses to Newproducts on July 19. This is the day the JPL group began to set up the mirror site Navigator. It is interesting to note that although the JPL server Newproducts was handling substantially more accesses than NSSDC throughout this period, the group at JPL reported very little network slowdown. In fact, Roger Lighty (manager of Newproducts) reported that there was no site-wide slowdown, while Nick Christenson (manager of Navigator) stated that,

"...we were running about 10% above its [network's] capacity."

The anomaly of JPL not experiencing the critical network overload that both NSSDC and STScI experienced is rather intriguing. Unfortunately, the JPL access logs were not available at the time of writing this paper. It is worth noting that NSSDC's SL-9 server transferred, byte for byte, 4 times more image data than text. It would be interesting to see if the JPL servers achieved a similar image/text transfer ratio. A lower image/text transfer ratio is one possible scenario that could account for the higher access count but lower network load.


Commercial        Educational         Government

HP (20.84%)       Buffalo (15.18%)    NASA (64.60%)
IBM (20.56%)      Virginia (12.20%)   ANL (8.11%)
NETCOM (10.47%)   Berkeley (10.70%)   LLNL (4.93%)
DIGITAL (10.05%)  Stanford (10.65%)   LANL (4.48%)
AT&T (9.25%)       MIT (10.37%)        FNAL (3.95%)
Sun (7.06%)       UMich (10.09%)      LBL (3.93%)
Tandem (6.46%)    UTexas (9.31%)      ORNL (3.47%)
NYNEXST (6.04%)   UMN (7.18%)         NIH (2.22%)
MMC (4.99%)       UIUC (7.17%)        USGS (2.21%)
Wellfleet (4.28%) Texas A & M (7.13%) NOAA (2.11%)

Table 1

Figure 5 shows the different user domains that were accessing NSSDC between July 16-26, 1994. It is interesting to note the very heavy use of WWW technology in the commercial sector. Table 1 lists the top 10 sites in each of the commercial, educational, and government domains.
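A breakdown by user domain can be approximated by grouping client host names on their top-level domain. The sketch below is one plausible approach, not necessarily how Figure 5 was produced; the label mapping, the "Unresolved" bucket for bare IP addresses, and the sample host names are assumptions.

```python
from collections import Counter

# Assumed mapping from top-level domain to the classes used in Table 1.
DOMAIN_LABELS = {"com": "Commercial", "edu": "Educational", "gov": "Government"}

def domain_class(host):
    """Classify a client host name by its top-level domain."""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld.isdigit():
        return "Unresolved"  # a bare IP address with no host name
    return DOMAIN_LABELS.get(tld, "Other")

def domain_shares(hosts):
    """Percentage of hosts in each domain class."""
    counts = Counter(domain_class(h) for h in hosts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}

# Invented host names, for illustration only.
SAMPLE_HOSTS = ["a.example.com", "b.example.com", "c.example.edu",
                "d.example.gov", "192.0.2.1"]

print(domain_shares(SAMPLE_HOSTS))
```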


Figure 5

(1) An access is defined as a single file transfer to a client. The file may be an HTML page, an image element on the page, or a data file.
(2) Unique hosts is the number of distinct TCP/IP addresses that accessed the server.


Conclusions

One of the problems faced by all three groups was the management of the NCSA server log file. For all the systems, the log file was orders of magnitude larger than the entire SL-9 system content! A very useful enhancement to the NCSA server would be a user-configurable log rollover feature, so that separate logs are created daily, weekly, or monthly. There is also a need for real-time log analysis tools so that developers can visualize access to pages as they update and develop them. Of course, installation of a broadband network backbone at NASA GSFC would help ensure that similar network overloads do not occur as the use of WWW proliferates. Pundits who foretold the imminent "traffic jams" on the Internet due to the success of Mosaic and WWW were right! The SL-9 event may possibly be the first recorded case of such an occurrence.
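The daily rollover suggested above can be approximated after the fact by splitting an existing log on the date in each entry. This is a minimal sketch that assumes entries carry a bracketed time stamp such as [21/Jul/1994:13:05:01 -0400]; it is not a feature of the NCSA server, and the function name and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Matches the day, month, and year at the start of a bracketed time stamp.
DATE = re.compile(r"\[(\d{2})/(\w{3})/(\d{4}):")

def split_by_day(lines):
    """Group log lines by date, keyed as YYYY-Mon-DD."""
    buckets = defaultdict(list)
    for line in lines:
        match = DATE.search(line)
        if match:
            day, month, year = match.groups()
            key = f"{year}-{month}-{day}"
        else:
            key = "unparsed"  # keep lines without a date rather than drop them
        buckets[key].append(line)
    return dict(buckets)

# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    'host1.example.edu - - [20/Jul/1994:23:59:59 -0400] "GET /sl9/a.gif HTTP/1.0" 200 1000',
    'host2.example.com - - [21/Jul/1994:00:00:01 -0400] "GET /sl9/b.gif HTTP/1.0" 200 1000',
    'host2.example.com - - [21/Jul/1994:00:00:05 -0400] "GET /sl9/c.gif HTTP/1.0" 200 1000',
]

for day, entries in split_by_day(SAMPLE_LINES).items():
    print(day, len(entries))
```

Each bucket could then be written to its own file, which keeps any single log small enough to analyze while the service is running.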


Acknowledgments

The author is indebted to the following individuals for helping to create and manage NSSDC's WWW SL-9 system and for providing valuable information: Dave and Jim Williams, NSSDC, NASA Goddard Space Flight Center; Ron Baalke, Roger Lighty, and Nick Christenson, NASA Jet Propulsion Laboratory; Zoltan Levay and Fred Romelfanger, Space Telescope Science Institute. The author would also like to thank the many users at the 9,733 nodes world-wide who accessed the NSSDC system. Your endless notes of appreciation made the long days and nights bearable during that historic week.


Biography

Syed S. Towheed is a systems programmer at the National Space Science Data Center, NASA Goddard Space Flight Center. He is the coordinator of NSSDC's WWW development activities and has played a central role in applying WWW technology to NSSDC's mission of providing access to its data holdings. Mr. Towheed was one of the early proponents of using WWW in the NASA environment and was among the first to develop a unique look-and-feel for WWW services. Over the last year, Mr. Towheed has developed several systems, including the NSSDC CD-ROM Catalog, NSSDC Planetary Sciences, NASA Space Science Education, NSSDC Solar Physics, NSSDC Life Sciences, and the NSSDC Shoemaker-Levy 9 system.


Contact Information

The author welcomes your comments and feedback. He may be reached at the address below.

Syed S. Towheed
Code 633
National Space Science Data Center
NASA Goddard Space Flight Center
Greenbelt, MD 20771, USA.
Phone: (301)286-4136
Fax: (301)286-1771
E-mail: towheed@nssdca.gsfc.nasa.gov
