Interestingly, this project was not a result of high-level academic vision, but was in many ways a bottom-up, volunteer, emergent system. This could have meant a withering of enthusiasm, but instead translated into a revolutionary zeal.
Thanks to the microcomputer revolution, the information that the College maintains and distributes lives on disk drives throughout the university. The major challenge turned out to be not creating information, but locating the data and extracting it from its owners. The project's success is thus attributable more to personal networks and contacts than to technological brute force.
Frequently the data was available in raw ASCII and translated using scripts. We will show that it is possible and desirable to do what appears to be extremely complicated data extraction with relatively simple scripts.
The information that is available is extensive and comprehensive. Of particular note are:
It was and is a very large and complex project, and its evolution can perhaps be instructive not only to those working on campus Webs, but also to those working on any large project. I will talk about both the technical and the sociological issues that I encountered while constructing this large Web.
Having said that, I am glad that WWW was NOT developed fifteen years ago. All printed matter starts out its life on a disk drive now, but that was not true in 1979. People would have gotten pained looks on their faces if asked to put their information on the Web. They would have explained that they couldn't put their info on the Web because they didn't have the budget to hire someone to type it into the computer for them.
Couple that with the negative aesthetic value of ASCII viewed on a dumb terminal, and I suspect the WWW would have died stillborn for lack of content and interest. This could have killed all interest in it for possibly thirty years. "Yeah, it's a neat idea, but it has been tried. It didn't work."
Instead, we are in a situation where all the documents exist on a disk somewhere, just in the wrong format. The problem now is one of data translation, not one of data entry. While computers are lousy at reading documents, they are really good at repetitive translation!
For example, the UIUC course descriptions and timetable were already on disks. The timetable had been converted to a form that could be used by ph, and was publicly accessible:
    % ph type=fall name=fr134
    ----------------------------------------
    name: fr134 accelerated intermediate french, ii.
    text: fall94
        : prerequisite: fr 133 or 106, or fr 103 with department approval,
        : or three semesters of college french, or a placement score
        : showing high school achievement equivalent to fr 103.
        : 4 hours.
        : 03810 lect-disc c 10 mtu thf 325 greg hall
        : 03811 lect-disc d 11 mtu thf g30 for lang
        : 03812 lect-disc e 1 mtu thf g30 for lang
    ----------------------------------------

The courses catalog was straight text, grouped by department, and had entries like the following:
213. Aerodynamics, II. Equations of motion for a viscous, heat-conducting fluid; exact solutions of the Navier-Stokes' equations; boundary layer theory; inviscid approximations, vorticity, and circulation; potential flow; solutions of potential flow equations, sources, sinks, and Prandtl-Meyer flow; thin airfoil and slender body theory; and method of characteristics. Prerequisite: Aeronautical and Astronautical Engineering 212. 4 hours.
As long as you aren't impeded by the idea that "regularity" of databases implies one-word fields separated by tabs, you can see that you could HTMLify the course description pretty easily.
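To make the point concrete, here is a minimal sketch in Python (not the script I actually used) that pulls a catalog entry like the one above apart with a single regular expression. It assumes that every entry has the shape "number. title. description. Prerequisite: ... N hours." and that the title contains no period followed by a space; the function name and the choice of HTML tags are illustrative only.

```python
import html
import re

# Assumed shape of an entry: "NNN. Title. Description. [Prerequisite: ... .] N hours."
ENTRY = re.compile(
    r"^(?P<number>\d+)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<body>.+?)"
    r"(?:\s+Prerequisite:\s+(?P<prereq>.+?)\.)?"
    r"\s+(?P<hours>\d+)\s+hours?\.$",
    re.DOTALL,
)


def catalog_entry_to_html(entry: str) -> str:
    """Turn one plain-text catalog entry into a small HTML fragment."""
    # Collapse line breaks and runs of spaces so the pattern sees one line.
    flat = " ".join(entry.split())
    m = ENTRY.match(flat)
    if m is None:
        # Entry does not fit the assumed shape: pass it through as a paragraph.
        return "<p>" + html.escape(flat, quote=False) + "</p>"
    f = {k: html.escape(v, quote=False)
         for k, v in m.groupdict().items() if v is not None}
    parts = [f"<h3>{f['number']}. {f['title']}</h3>", f"<p>{f['body']}</p>"]
    if "prereq" in f:
        parts.append(f"<p><b>Prerequisite:</b> {f['prereq']}</p>")
    parts.append(f"<p><b>Credit:</b> {f['hours']} hours</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(catalog_entry_to_html(
        "213. Aerodynamics, II. Boundary layer theory; potential flow. "
        "Prerequisite: Aeronautical and Astronautical Engineering 212. 4 hours."
    ))
```

Entries that do not match the assumed shape fall through unchanged, which makes it easy to spot the irregular ones and fix them by hand.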
In addition to writing scripts to convert the timetable and courses catalog, I used some of the standard converters to translate the Engineering Student Handbook and the Engineering International Minors Program from MS Word to HTML. After painfully putting in all the links to classes by hand in the Student Handbook, I wrote a script (based on my timetable experience) to anchor classes automatically. I have discovered that the more information there is on the Web, the longer it takes to write / convert documents because there are more links to make!
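The automatic anchoring can be sketched along the following lines. This is an illustration in Python rather than the script I wrote; the department table, the short codes in it, and the "courses/aae212.html" naming scheme are all hypothetical stand-ins for whatever the real course pages are called.

```python
import re

# Hypothetical table mapping department names, as they appear in prose,
# to the short codes used in course page file names.
DEPARTMENTS = {
    "Aeronautical and Astronautical Engineering": "aae",
    "French": "fr",
}


def anchor_courses(text: str, base_url: str = "courses/") -> str:
    """Wrap every 'Department NNN' mention in a link to that course's page."""
    # Try longer names first so a name that contains another one still wins.
    names = sorted(DEPARTMENTS, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r") (\d{3})\b"
    )

    def link(m: re.Match) -> str:
        code = DEPARTMENTS[m.group(1)]
        return f'<a href="{base_url}{code}{m.group(2)}.html">{m.group(0)}</a>'

    return pattern.sub(link, text)


if __name__ == "__main__":
    print(anchor_courses(
        "Prerequisite: Aeronautical and Astronautical Engineering 212."
    ))
```

Note that this sketch does not check whether a mention is already inside an anchor, so it should be run once on unlinked text.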
If you have an ASCII file, but think you don't have the time to HTMLify it by hand or to write a script to beautify it, try the following script:
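A minimal sketch of such a script, written in Python and not the original, is shown here. It assumes the simplest possible approach: escape the characters that HTML treats specially and wrap the whole file in PRE tags so that the original line breaks and spacing are preserved. The function name and the default title are illustrative.

```python
import html
import sys


def ascii_to_html(text: str, title: str = "Untitled") -> str:
    """Wrap plain ASCII text in a bare-bones HTML page, layout preserved."""
    safe_title = html.escape(title, quote=False)
    # Escape &, < and > so the text cannot be mistaken for markup.
    body = html.escape(text, quote=False)
    return (
        "<html>\n"
        "<head><title>" + safe_title + "</title></head>\n"
        "<body>\n"
        "<h1>" + safe_title + "</h1>\n"
        "<pre>\n" + body + "\n</pre>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Usage: python ascii2html.py "Page title" < input.txt > output.html
    page_title = sys.argv[1] if len(sys.argv) > 1 else "Untitled"
    sys.stdout.write(ascii_to_html(sys.stdin.read(), page_title))
```

The result is not pretty, but it is readable in any browser and can be improved later.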
The Macs and PCs have different aspect ratios than UNIX machines, and don't fit as much information on one screenful. Thus the "index" pages are kept as terse as possible.
The Macs cannot handle extremely large files. For example, my Valued Associate Mike Grady discovered that the Music and Math departments' timetable and courses catalog entries caused Macs to hang. So when he modified my scripts to work for the whole campus, he had to change them from producing one file per department to producing one file per course.
The best tactic that I've found when people are waiting for The Word is to say, "You've heard of the Information Superhighway? This is it." (Some might quibble that WWW is not exactly equivalent to NIIS, but it is close enough, and the words make people happy.)
In many cases, my personal contacts lubricated the process enormously. I grew up in Champaign, IL. My parents were on the faculty, and my high school was riddled with the sons and daughters of faculty. I got my undergraduate degree at UIUC before going off to the West Coast for ten years. I frequently know The Boss personally. The Boss, knowing me, can feel comfortable that I am not going to take his or her ASCII and sell it to a direct-mail company, the Irish Republican Army, or Purdue.
Telling these people that you'll plaster ***UNOFFICIAL!!*** all over it helps, but frequently they need to go away and think about it and/or talk to The Boss.
I tell these people that no, there is nothing they can do to prevent fraudulent information from being distributed. However, it is harder to get anyone to pay attention to the fraudulent information if the official information comes out first and/or if an officially owned document links to the real information. (I also point out that these documents could be forged on paper as well.) This seems to calm them down.
When confronted by this suspicion, I explain how easy it is to write HTML, and give examples of how long it took to convert specific large documents. I also allow that at some point, their organization might want to run their own server and that it might cost some thousand dollars. Knowing that there might be a cost someday tends to get them to stop hunting for the "catch".
What I was failing to pay attention to is the fact that I have spent the past ten years with a fast computer on my desk, logged in essentially continuously, and with a high-bandwidth connection to the Internet. When I need a dictionary, I am usually at a computer, and it is faster for me to look up the spelling on Mosaic than it is to hunt down a dictionary and look it up.
But if you are a freshman at your fraternity house, your patterns are different. There might be a dictionary on your shelf, while to access Mosaic you might need to walk to campus, climb two flights of stairs, and log in. Even if you have a computer on your desk, it might be a Mac SE that can't handle having Word and MacMosaic both open. If you examine the entire process from the desire to know the spelling to the time the spelling is known, in some cases it is LOTS faster to look it up on paper.
The incremental additional search time due to finding and opening a browser isn't going to change until everybody is on-line for a significant period of time for a significant fraction of the days. At that point, we'll be able to use the Web for all kinds of interesting groupware. I imagine a day when my calendar manager will not only tell me about seminars that I am interested in, but will also have a link to the campus map with the building and seminar room highlighted. However, like email, there is a critical mass that needs to form before this is possible.
To achieve this critical mass, we have to offer these uninterested parties information for which the total search time is shorter on the Web than off it. This may mean providing information that the Web lets us organize better (e.g. hyperlinking the courses catalog and timetable), that is otherwise difficult or slow to obtain (e.g. corporate SEC filings or the NSF Proposal Guide), or that is otherwise impossible to obtain (e.g. interactive maps of campus).
When people are just flat-out uninterested, I try to guess or extract from them what information they need to do their job, and work from there. I tell faculty about the NSF Proposal Guides and search for their research area in Yahoo. I show seniors all the job vacancy information and urge them to post their resume in the resume book. I show freshmen the home page of their favorite rock band.
I'm willing to take their word for it, and let them know that they can come to me for help if they need it.
But for many of these institutional information sources (e.g. Admissions or the Department of General Engineering), it doesn't make sense for only one person to have write permission on the files. The content should be attached to the organization and not the individual. I don't have an answer for this one, short of having each information source get their own hardware and set it up far from CCSO's prying eyes. I'm doing what I can to change the policy, but I am a mere aphid on the lowest leaf of the CCSO organizational hierarchy.
Now I'm more like a farmer, dealing with tractor dealerships and grain silos. In both cases, food gets to the public, but in the first case there is a lot less negotiation that has to go on. Now that I am doing this as a job instead of a hobby, I find that I have to go get input from people, worry about maintenance issues, and do all kinds of other time-consuming things. I find myself asking permission instead of forgiveness, and that means that stuff doesn't get done as fast.
So if you are in a situation where you need information put on the Web quickly, you might want to avoid giving your cowboys an official status.
We've also given a large number of informal training sessions, basically to anyone who will listen. We've found that the smaller the group, the higher the chance that the people will "get it". When showing the Web to a small group, we can poll the people who don't seem to "get it" for something in their interest area. Then we show them not only the information, but how we went about finding the information. Note that it is important for the trainers to do a fair amount of surfing, so that they can quickly find web nodes for things that the audience is interested in. While I think that we should all mail a hundred dollars to Jerry Yang and David Filo and pray that they never get lives, Yahoo's Hotlist is a generalist's list. It is frequently useful to know other routes to information.
Given that the small demos take a lot of time, and that there are excellent and ubiquitous How To Use The Web documents out there, I have felt that it is more important to focus on developing content right now than to give lots of how-to seminars. This will change as people make the transition from being Web surfers to being Web spinners.
It should be noted that we have been aided immeasurably by our contact network. While it happens that I know The Bosses, my Valued Associate Brad Whitmore knows the power structure of the students, by virtue of his involvement with Engineering Council. This has meant that Brad has been able to influence enthusiastic, energetic society leaders to develop their own home pages.
But then I realized that I could use a different paradigm. Instead of visualizing myself trying to harness and drive a team of wild cattle, I had a vision of me instead giving more subtle direction. I'd tell them what needed doing, give them help as needed, but pretty much let them go run on their own. Instead of being a wagon driver, I'd be a sheepdog. So I wrote up detailed suggestions for small, self-contained projects: "Put the football schedule on line. Review a restaurant. Make a home page for the Young Republicans or Gay Illini. Or put your class notes on-line." And now I'm just sitting back and watching them play.
In addition to her work on the UIUC Web, she has developed Web pages with information on tourism in France, tips for travelers, a travelogue of New Zealand, a discussion of the educational merits of shaving her head, advice for women in engineering, and a used car lot. She was also a significant contributor to the information system that Enterprise Integration Technologies developed for the National Center for Manufacturing Sciences. She can be reached at ducky@netcom.com.