Robert K. Thralls
thralls@ibm.net
Systems that operate as WEB Applications need three essential guidelines in order to succeed. These three guidelines define for a user the look and feel of the application. For many applications, the correspondence of HTML Fill-out FORMs and backend CGI-compliant programs remains at one to one. One simple FORM and one simple program do not suffice in large-scale applications with multiple-step transactions unsuitable for simple hypertext; in such applications, one program outputs an HTML fill-out FORM for the next program. When faced with this level of complexity, one needs to institute the three basic guidelines.
First, an application must project a consistent user interface. Any and all interaction between man and machine must be performed in the same manner each and every time. Similar operations operate similarly. Above all, the user must feel a high level of comfort with this Fill-out FORM interface.
Second, programmers do not write HTML; programmers write programs that encapsulate data with appropriate and consistent HTML encoding. Generally, one needs a format for the description of a screen, its purpose and its functionality. From this description, programs generate an HTML page that conforms to the interface standards. Programs also generate programs from that description to handle the FORM data, if any, on that HTML page. Thus, the dynamics of the application hinge on the combination of data and logic (application) rather than data alone (hypertext).
Third, these applications must rely on HTML to carry the bulk of user interface processing. Just as any application relies on a 4GL for its user interface, WEB applications must rely on HTML to bear that burden as well. Thus, WEB applications look for more flexibility and functionality from HTML as a language and HTTP as a protocol.
This paper demonstrates the implementation of the three golden rules. It explores the pitfalls of HTML/HTTP in WEB application development and offers suggestions where modifications become useful.
Wherever possible, the author used examples from an HTML Application called WWWonder. The application lives behind a firewall, which prohibits access from many sites. A demonstration of the WWWonder system is scheduled during Workshop G of the Third International WWW Conference. The CDC plans to move the system outside the firewall later this year.
A consistent interface begins with a good screen definition language. Many screen languages exist with a well-thought-out structure. A philosophy that follows programmers to the grave never allows them to re-invent the wheel. One finds it difficult to choose between commonly used 4GLs and other screen generators. For the sake of porting MS-Windows applications, MS-Windows Resource (.rc) files prove well structured for a product base. Once chosen, the screen definition language needs processing to separate it from its native environment. The MS-Windows Resource Compiler is of no use to us for parsing the files, so we write our own resource compiler that produces HTML, screen-specific CGI backend source code, and a makefile.
Lexically converted into usable HTML and meaningful 'C' source code, the screen definition file becomes our application interface. Since we convert programmatically, we eliminate inconsistencies in the user interface definition. Like items always work alike. For instance, in an MS-Windows Resource file, statements such as "DIALOG", "EDIT" and "RADIOBUTTON" always convert directly to <FORM...>, <INPUT...>, and <INPUT TYPE="radio"...> respectively. This ensures that the user interface always takes on the same look and feel.
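The one-to-one conversion described above can be sketched in a few lines of C. This is only an illustration, not the WWWonder resource compiler: the function name rc_to_html, the simplified keyword-plus-name input, and the use of the name as the FORM's ACTION are all assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the HTML tag for one resource keyword into out.
 * Returns 0 on success, -1 for a keyword this sketch does not handle. */
int rc_to_html(const char *keyword, const char *name, char *out, size_t outlen)
{
    if (strcmp(keyword, "DIALOG") == 0)
        /* Assumption: the dialog name doubles as the backend program name. */
        snprintf(out, outlen, "<FORM METHOD=\"POST\" ACTION=\"%s\">", name);
    else if (strcmp(keyword, "EDIT") == 0)
        snprintf(out, outlen, "<INPUT TYPE=\"text\" NAME=\"%s\">", name);
    else if (strcmp(keyword, "RADIOBUTTON") == 0)
        snprintf(out, outlen, "<INPUT TYPE=\"radio\" NAME=\"%s\">", name);
    else
        return -1;
    return 0;
}
```

Because every "EDIT" statement passes through the same branch, every text field comes out with identical markup, which is the consistency argument in miniature.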
The key to consistency lies within the application's existing user base. The selection of the Screen Definition Language and "Look and Feel" standards already exists for most applications. The remaining effort focuses upon building a parser that converts current screens into HTML and hooks into backend source code to perform the normal application functionality. Once achieved, WYSIWYG screen building tools available for the chosen file format become the preferred manner for HTML FORMs definition. By keeping these development tools alive, one bridges the gap between an existing application and the new WEB Application. Current users make the transition to the WEB environment effortlessly. New users that the WEB brings can utilize the same on-line HELP functions that currently exist (with minor editing). This key to consistency allows users to migrate along with the application.
HTML code generation from a Screen Definition Language like MS-Windows Resources takes a steady hand to shape into place. Most screen definitions allow one to place fields precisely where one would like them to be displayed. HTML does not really allow for this type of granularity. Until HTML allows for a PREformatted Fill-out FORM, the task of keeping the HTML viewer from squashing everything to the left side of the screen seems virtually hopeless. Other problems arise from attempting to mimic MENU, DATE, and PUSHBUTTON functionality. Given the limitations of HTML and the lack of control over the viewer end of things, attempting to deal with a POPUP DIALOG BOX makes for some interesting backend code. Limitations aside, about 90% of most Screen Definition Languages can easily translate to a combination of HTML and CGI backend code. For instance, DATE validation is as easy as an HTML INPUT field with code associated with that field to PARSE and VALIDATE the input string. Even under the current limitations of FORMs processing in HTML, don't hesitate to implement with a combination of simple HTML and complex string massaging algorithms on the backend. A key to the implementation of resources like DATE is to augment HTML with appropriate field validation code before allowing the backend to accept the entire contents of the FORM.
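As a rough illustration of the PARSE and VALIDATE step for a DATE field, the backend might run something like the following on the submitted string. The MM/DD/YYYY layout and the function name valid_date are assumptions of this sketch; the paper does not specify the date format WWWonder accepts.

```c
#include <stdio.h>

/* Returns 1 if s is a valid calendar date in MM/DD/YYYY form, else 0.
 * The MM/DD/YYYY layout is an assumption of this sketch. */
int valid_date(const char *s)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int m, d, y, n = 0, max;

    /* %n records how many characters were consumed, so trailing junk is rejected. */
    if (sscanf(s, "%2d/%2d/%4d%n", &m, &d, &y, &n) != 3 || s[n] != '\0')
        return 0;
    if (m < 1 || m > 12 || d < 1 || y < 1)
        return 0;

    /* February gains a day in leap years (Gregorian rule). */
    max = days[m - 1];
    if (m == 2 && ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0))
        max = 29;
    return d <= max;
}
```

A backend would call valid_date on the field's value and refuse the whole FORM if it returns 0.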
For simplicity's sake, an example helps greatly to illustrate the process. First, let us take an MS-Windows compatible resource file and crunch it into HTML.
The Prevention Guidelines Data Set Resource File
The Prevention Guidelines HTML FORM
The Prevention Guidelines HTML Image
Note that although the Resource file is very complex, the generated HTML FORM is quite simple. A large portion of the defined processing must be pushed to the backend program because HTML provides neither field checking nor FORM validation. Thus, the backend picks up the slack of this simple HTML FORM to complete the processing as directed by the Resource file.
There are several steps to generating CGI-compliant backends. First, assemble a list of variables and declare some logical space for them. Then, map the logical names of variables to the physical space allocated. One may also want to check those variables that have only a few valid values (RADIOBUTTONs are either "ON" or "OFF" and can have an alternate value associated with either "ON" or "OFF", like "Yes" or "No") and convert to those values, if necessary. Again, let's look at a concrete example to demonstrate.
The Prevention Guidelines Data Set Resource File
The Prevention Guidelines Header (.h) File
The Prevention Guidelines CGI Source (.c) File
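The steps listed above (declare space, map logical names to that space, convert RADIOBUTTON values) might look roughly like this in the generated C. This is a hand-written sketch, not the actual generated header or source file; the table layout, the function store_field, and the variable RADIO1 are hypothetical.

```c
#include <string.h>

#define FIELD_LEN 64

/* Physical space for each FORM variable (RADIO1 is an illustrative name). */
static char KEY1[FIELD_LEN];
static char DAT1[FIELD_LEN];
static char RADIO1[FIELD_LEN];

struct field_map {
    const char *name;      /* logical name as it appears in the FORM */
    char       *space;     /* physical storage for the value */
    const char *on_value;  /* replacement for "ON", or NULL if not a radio button */
    const char *off_value; /* replacement for "OFF", or NULL */
};

static struct field_map fields[] = {
    { "KEY1",   KEY1,   NULL,  NULL },
    { "DAT1",   DAT1,   NULL,  NULL },
    { "RADIO1", RADIO1, "Yes", "No" },
};

/* Store one decoded name/value pair; returns -1 for an unknown field name. */
int store_field(const char *name, const char *value)
{
    size_t i;
    for (i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        struct field_map *f = &fields[i];
        if (strcmp(f->name, name) != 0)
            continue;
        /* Convert the few valid radio values to their alternates. */
        if (f->on_value && strcmp(value, "ON") == 0)
            value = f->on_value;
        else if (f->off_value && strcmp(value, "OFF") == 0)
            value = f->off_value;
        strncpy(f->space, value, FIELD_LEN - 1);
        f->space[FIELD_LEN - 1] = '\0';
        return 0;
    }
    return -1;
}
```

Keeping the mapping in a table means a generator only has to emit one table row per field in the resource file.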
The CGI Backend program can easily handle on-line transactions such as database lookups. The output from this type of backend program simply becomes the input for generating the next HTML page. This next page can display the final results of processing or become another auto-generated HTML page. A good example of the chaining of Fill-out FORMs is the creation of a dynamic list of resources to convert: the user picks one, and the next FORM is generated from that selection. In the following example, the user selects "Prevention Guidelines" and submits the FORM; he is handed a freshly generated Prevention Guidelines HTML FORM, and a CGI-compliant backend, also freshly generated, awaits a submission.
The Data Set Selection Resource File
The Data Set Selection HTML FORM
The Data Set Selection HTML Image
The Prevention Guidelines HTML FORM
The Prevention Guidelines HTML Image
There are some cases that require off-line or batch processing before a transaction can complete. Batch processing may mean that one services requests to database systems that are not always reachable (like some MAINFRAMEs), or that are in such high demand that one must "batch" requests to them (much like an e-mail message). Databases that operate in this mode easily fit the HTML structure of doing things. Since users come and go and may never be heard from again, the amount of time between the GET of a FORM and the submission of the FORM data to the backend is variable. This time gap may span any amount of time without affecting system integrity. Therefore, the amount of time between the submission of the request and the processing of the output of the CGI program can be equally sporadic without affecting system integrity. This allows the user to submit queries to database systems that may or may not be on-line and still have those requests processed as soon as possible. Instead of being returned immediately, the data can be stored in a temp file, awaiting further processing when the user comes back to check the progress of his query or transaction.
The multiple-step transaction works just as off-line or batch mode operations do. The user can perform the steps of a transaction at variable time intervals without damage to system integrity. This is made possible by allocating space on the WEB server to each individual user; the transaction space can be considered temporary, depending on the application. However, the capability of performing multiple-step and long-term transactions via a modeless connection such as HTTP depends on local storage of transaction data by the WEB server.
Other related enhancements to the Server code include the ability to specify a user subdirectory on a POST (allow "EXEC NAME/*/* filepath/*/*" in the rules config file). This allows the application to serve many users who have some disk space allocated to them for intermediate transaction files, temporary files, and specific HTML Application FORMs and CGI Backends generated on the fly for them. This has also been the integral hub allowing for multi-stage transactions and for batch mode queries.
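To make the per-user storage idea concrete, here is a minimal sketch of how a backend might build the path of a user's intermediate transaction file. The ROOT/USERID/BASE.vars layout, the ".vars" suffix, and the function name user_file_path are assumptions of this example and are not taken from the WWWonder server configuration.

```c
#include <stdio.h>
#include <string.h>

/* Build "<root>/<userid>/<base>.vars" in out.
 * Returns 0 on success, -1 if the user id is unsafe or the buffer is too small.
 * The directory layout and the ".vars" suffix are assumptions of this sketch. */
int user_file_path(const char *root, const char *userid, const char *base,
                   char *out, size_t outlen)
{
    int n;

    /* Refuse user ids that could escape the user's own subdirectory. */
    if (userid[0] == '\0' || strchr(userid, '/') != NULL ||
        strstr(userid, "..") != NULL)
        return -1;

    n = snprintf(out, outlen, "%s/%s/%s.vars", root, userid, base);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Each step of a multi-stage transaction can then read the file written by the previous step, however much time has passed between the two requests.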
To continue with the Prevention Guidelines example, the first program generated a query against a MAINFRAME database, and the results were returned. The results of the first query serve to build the next screen in the chain; it matters not whether the mode is on-line or batch, what matters is that one application writes the input for the next application. Let's look at the temporary results file from the first Prevention Guidelines query and the next Resource file and HTML FORM to get an idea of the work flow. Remember, regardless of whether data lives on-line or is batched, the steps in assembling the next step of the transaction remain the same.
The VARS file output from previous transaction step
The Next Prevention Guidelines Resource File
The Next Prevention Guidelines HTML File
The Next Prevention Guidelines HTML Image
The same example from earlier illustrates how one application is driven to produce another application in the chain. Note that the selection of the resource in the list box causes that resource to be converted to HTML and a backend program.
The Data Set Selection Resource File
The Data Set Selection HTML FORM
The Prevention Guidelines HTML FORM
The Prevention Guidelines HTML Image
This portion of the document is meant to provide feedback to those who develop the HTML language and viewers and the HTTP protocol and servers. This information comprises the setbacks that WEB Application implementors face as they find themselves stuck in the middle, lodged between HTML and HTTP.
The remainder of this document reflects the needs of HTML applications. Without these types of enhancements, one cannot truly offer through the WEB functionality identical to that of complex stand-alone applications. Admittedly, some enhancements only make things easier for the application programmer and/or are designed to push processing off to the client site. However, the fact remains that with the limited set of HTML Fill-out FORM functionality, the application suffers and the user tires of the interface. The interface feels clunky next to specific client applications that perform the same functionality. This section brings to light what many WEB Applications must work around.
This section takes the opinion that one does not write specialty viewers for the client end. Many WEB applications will have tens of thousands of users, all with various hardware and system software platforms; thus, developing a specialty viewer would be another major undertaking in cross-platform development. These ideas offer platform independence if and only if more brains are placed in the HTML viewers through an augmented language and more aspects of HTTP are controllable via HTML and CGI.
Applications often need a MENU to navigate through functionality. Currently, if a WEB application needs MENU functionality, it is centralized on an HTML form. However, a better WEB Application interface would have the application's MENU displayed at all times with the viewer's menu system. One needs a way to describe a programmable menu bar in HTML. Descriptions of menu bars contain METHOD and URL; menus belong to FORMs and therefore should change accordingly if there are multiple FORMs in an HTML document.
Applications often have very specific types of screen formatting in mind for FORMs. Since this is part of the look and feel of a particular application, formatting within a FORM should be more extensible to the programmer. HTML almost fulfills this request, but different viewers handle preformatted FORMs in various ways, and some very poorly. One suggestion is to make <PRE WIDTH=80> the standard for this and have viewers compensate for any extra width added by borders and scrollbars. Otherwise, augment HTML with a <FRAME> style markup tag to physically locate (contain) certain groups of input fields together. For instance, if one wanted a screen partitioned into a left and a right side, one would use <FRAME> to group like fields together, and it would be up to the viewer to ensure that those fields are placed together, even if as a top and a bottom; thus, at least the fields are located in proximity to each other. This mainly prevents the viewer from dropping significant whitespace and making inappropriate wrap decisions that make for a bad user interface. Any defined method of predetermined placement of individual HTML tags would improve the situation.
Multi-level INPUT fields are delicately integrated into most applications, allowing for user-definable data types. Mostly used for database-type lookups, pop-up query forms populate back to the original form upon completion. These types of application-specific data fields and functionality make for superior quality applications, allowing users to browse for data to use rather than pulling it from thin air or remembering it from somewhere else. Perhaps, if true CUT and PASTE were implemented in ALL HTML Viewers, this would not be an issue. However, the needs of complex applications require some type of functionality similar to this.
Making data entry complex for users is not the intention of the programmer, but in HTML fill-out FORMs, much data entry becomes a chore compared to a stand-alone application. Thus, the pop-up FORM that can paste back to the original FORM field in a preceding (or current) FORM needs to be implemented in HTML. For example, it would allow users to browse complex mortality and morbidity codes from databases, rather than entering them as a FREETEXT item in a plain INPUT field. This empowers the user with more flexibility and far more data entry accuracy.
Perhaps a twist on the above pop-up data entry FORM, a Data <PASTE> enhancement could allow empty Fill-out FORMs to be populated independently of the original file. The HTML page with a FORM is loaded by the client and is originally empty. A separate header, which separates the FORM data from the FORM, immediately follows to reference the data back to the FORM fields, thus allowing dynamic data population of a static HTML FORM.
A MIME header could identify FORM data for an empty loaded FORM on a page. The data from this section could easily populate the empty fields currently displayed.
To make a real twist on this topic, allow update/replacement data for any previously loaded FORM. Any backend could easily update previously loaded FORMs to reflect the current status. For instance, stock quotes could be updated continuously while the user viewed other FORMs; when he JUMPS back to the previous FORM, it does not need to be reloaded, because it has already been updated with the current information since the last time he communicated with that server. The protocol remains modeless and USER driven, but is enhanced to allow for the altering of previously loaded forms (automagically) so that the USER is not totally depended upon to drive the entire application (request/response). This would be akin to Dynamic Data Exchange (DDE) or similar atomic updates. One could go so far as to allow the client to accept socket connections for these types of updates. One could even see the dramatic reorganization of HTML forms. Instead of a stack of Documents that one jumps forward and back through, one sees a desktop where some Documents are hyperlinked together and thumbnails of those documents are neatly stacked in staggered piles while others float, sort of half-open, revealing real-time updates of stock quotes.... Just a thought. Hmm... Maybe next week...
This section devotes some words to making the HTML viewer FORM-smart, either via a mini field "POST" of field data at strategic times or by having HTML allow the Viewer to actually perform that task on its own...
This wish for an HTML extensible Type Checker raised some eyebrows the last time it was mentioned. The concept is as simple as sscanf formatting. As an example, one may want string types of four characters, with an alpha as the first char followed by 3 numeric chars; the alpha char can be an A, an X or a B. This kind of user-defined Type is essential for offloading error checking in HTML FORMs from the WEB Server (CGI Backend program).
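Until a viewer can enforce such a type, the check has to live in the backend. The following sketch hand-codes the example type just described (one of A, X or B, then three digits); the function name valid_code is hypothetical, and this is an ordinary C check rather than the proposed HTML extension.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if s is exactly four characters: one of 'A', 'X' or 'B'
 * followed by three decimal digits; otherwise returns 0. */
int valid_code(const char *s)
{
    size_t i;

    if (strlen(s) != 4)
        return 0;
    if (strchr("AXB", s[0]) == NULL)
        return 0;
    for (i = 1; i < 4; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    return 1;
}
```

With this in place, "A123" and "X007" are accepted while "C123", "A12" and "A12b" are rejected. An extensible type checker in the viewer would let the same rule run before the FORM ever reaches the server.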
Sometimes the standard field validation is not enough. The focus of the user is what is important here. The user is concentrating on a field, not the entire FORM. There is a focus problem when the user must redirect his attention to work he thought was already completed. I suggest CGI Hooks, just as many 4GLs allow programmers to perform specific type checking and field validation when the user sets the focus on a field control, during the data entry process, and immediately when the user leaves that field.
The Server performs more efficiently if FORMs are pre-validated, but at this time the relationship between the fields is the business of the CGI Backend program. Another enhancement that would help offload processing to the client, and reduce the bandwidth consumed by data entry mistakes, is to augment HTML with the ability to make some field relationship decisions about a FORM and so prevent multiple submissions of all the FORM data just to correct data entry errors.
'if DATE_1 > DATE_2 then "message:OK;You Can't Finish Before You Get Started"'
That simple statement holds the key to distributed processing within HTML Applications.
Of course, the philosophy of 'get a bigger server' seems to be catching like wildfire, but some of us are on a very teeny, tiny budget. If the user is going to wait for some type of special field or form validation, I'm sure he'd rather wait for his own machine to process it than wait for a CPU-starved server to process the data only to notify him of an entry error.
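For comparison, this is roughly what the field relationship rule quoted above costs when it has to be written as backend code today. The sketch assumes MM/DD/YYYY input and uses hypothetical names (date_key, check_date_order); it is not the datecmp routine used by the generated WWWonder backends.

```c
#include <stdio.h>

/* Convert MM/DD/YYYY to a sortable number YYYYMMDD; returns -1 if unparsable.
 * The MM/DD/YYYY layout is an assumption of this sketch. */
static long date_key(const char *s)
{
    int m, d, y;
    if (sscanf(s, "%d/%d/%d", &m, &d, &y) != 3)
        return -1;
    return (long)y * 10000L + m * 100L + d;
}

/* Returns NULL when the rule is satisfied, otherwise the message to show. */
const char *check_date_order(const char *date_1, const char *date_2)
{
    long a = date_key(date_1);
    long b = date_key(date_2);

    if (a < 0 || b < 0)
        return "Unreadable date";
    if (a > b)
        return "You Can't Finish Before You Get Started";
    return NULL;
}
```

Every failed check here costs the user a full round trip of all the FORM data, which is the expense a client-side rule would avoid.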
Another far-flung idea is to offload Clickable Image support directly to the client. Along with the image, the config file is also downloaded. This alleviates much extraneous communication just to find out what the user wants to do. Any time one has the chance to make the client CPU do the work, one should jump at that opportunity. Bandwidth for many is at a premium, and server CPU cycles need to be conserved as well.
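The work being moved to the client is small. A viewer that had downloaded the image's config file would only need a lookup along these lines; the rectangle-only region format, the sample table, and the URLs in it are hypothetical and stand in for whatever the real config file contains.

```c
#include <stddef.h>

/* One rectangular hot spot of a clickable image (illustrative layout). */
struct region {
    int x1, y1, x2, y2;   /* inclusive corners in image pixels */
    const char *url;      /* target to request when the region is clicked */
};

/* Hypothetical map for a two-item menu image. */
static const struct region menu_map[] = {
    {   0, 0,  99, 49, "/cgi-bin/menu/file" },
    { 100, 0, 199, 49, "/cgi-bin/menu/help" },
};

/* Return the URL of the first region containing (x, y), or NULL if none does. */
const char *hit_test(const struct region *map, size_t n, int x, int y)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (x >= map[i].x1 && x <= map[i].x2 &&
            y >= map[i].y1 && y <= map[i].y2)
            return map[i].url;
    return NULL;
}
```

Run on the client, this replaces a server request whose only purpose is to translate a pair of coordinates into a URL.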
One interesting idea that WWWonder has implemented is the ability to POST from clickable images. This allows the image to be not simply hypertext, but to actually run a program on the server and produce work, not just view it. The system generates clickable image config files with USER information embedded as normal (hidden field); this data is passed to the backend, which produces the next FILL-OUT FORM in sequence. The result is a handy menu system much like that of any graphical system around; one can overlay HTML pages with similar images to give the illusion of a drop-down MENU system.
One nice-to-have feature that goes along with the SUBMIT attribute would be to have the NAME attribute passed to the CGI Backend or placed in the Environment somehow. I think this requires a slight modification to HTTP, but it would save much clutter on the Server side, where links to the same executable are used to provide the different functions such as Submit, Help, Save, Upload, Download, etc. Currently, each function needs to be referenced in HTML as a different CGI Backend, although each is the same executable linked to many filenames. The executable then checks the name under which it was executed (argv[0]) and hence knows the function. Unfortunately, executables are generated on the fly in most cases and, although executables remove themselves after running, there is still the requirement of cleaning up external symbolic links. This type of mess can be alleviated easily by passing along the NAME of the submit button (INPUT).
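The argv[0] workaround described above can be sketched as follows. The link names "submit", "help" and "save" and the function action_from_argv0 are illustrative only; the actual link names used by WWWonder are not given in this paper.

```c
#include <string.h>

enum action { ACT_SUBMIT, ACT_HELP, ACT_SAVE, ACT_UNKNOWN };

/* Decide which function to perform from the name the program was run as.
 * The link names compared against here are hypothetical. */
enum action action_from_argv0(const char *argv0)
{
    /* Strip any leading directory so only the link's file name is compared. */
    const char *base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;

    if (strcmp(base, "submit") == 0) return ACT_SUBMIT;
    if (strcmp(base, "help") == 0)   return ACT_HELP;
    if (strcmp(base, "save") == 0)   return ACT_SAVE;
    return ACT_UNKNOWN;
}
```

A backend's main would call action_from_argv0(argv[0]) first and branch on the result. Each name handled here needs its own symbolic link on disk, which is exactly the clutter that passing the submit button's NAME would remove.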
User Authentication and Encryption need to be integrated into both HTML and HTTP. First and foremost, all data transmitted via a FORM should be ENCRYPTED, i.e.: < FORM ENCRYPTED ...> Simple and elegant, KEYS can be configured at both sites appropriately. This allows encrypted authentication to be controlled by the CGI Backend, which may want the USER Database managed by a real Database Engine, especially if one has 13,000 plus registered users and any number of anonymous users (still requiring restricted accounts). Having the entire Customer Service Staff adding and deleting users and changing passwords, with people logging in and out, is too much of a bottleneck. Perhaps the largest problem occurs when some applications already exist, the user database lives on its own machine, and HTTP is only one of the modes of communication in effect (we have SPX/IPX, Proprietary ASYNC, TCP/IP and HTTP) with several different client applications (again, we have WONDER MainFrame, WONDER/PC, WWWonder, and a barrage of windows applications for the WAN). Therefore, there needs to be a methodology that is accessible via HTML to send data encrypted or in plaintext, and a CGI way of decrypting the data input stream (or have it done by the server automagically before invocation).
\n" ); fail=1; } for( i= strlen( Filename ); i >= 0; i-- ) if( !strchr( "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", Filename[i] ) ) { printf( "ALLOW: (%s) char (%c) found in %s
\n","Filename", Filename[i], Filename ); fail=1; printf( "\tAllowable Characters are \"_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789\"
\n" ); } for( i= strlen( KEY1 ); i >= 0; i-- ) if( !strchr( "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789", KEY1[i] ) ) { printf( "ALLOW: (%s) char (%c) found in %s
\n","KEY1", KEY1[i], KEY1 ); fail=1; printf( "\tAllowable Characters are \"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789\"
\n" ); } for( i= strlen( KEY2 ); i >= 0; i-- ) if( !strchr( "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789", KEY2[i] ) ) { printf( "ALLOW: (%s) char (%c) found in %s
\n","KEY2", KEY2[i], KEY2 ); fail=1; printf( "\tAllowable Characters are \"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789\"
\n" ); } for( i= strlen( KEY3 ); i >= 0; i-- ) if( !strchr( "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789", KEY3[i] ) ) { printf( "ALLOW: (%s) char (%c) found in %s
\n","KEY3", KEY3[i], KEY3 ); fail=1; printf( "\tAllowable Characters are \"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789\"
\n" ); } for( i= strlen( KEY4 ); i >= 0; i-- ) if( !strchr( "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789", KEY4[i] ) ) { printf( "ALLOW: (%s) char (%c) found in %s
\n","KEY4", KEY4[i], KEY4 ); fail=1; printf( "\tAllowable Characters are \"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ *.- #&0123456789\"
\n" ); } if( !test_toggle( "DAT1" ) && datecmp( "DAT1", "DAT2" ) >= 0 ) { fail=1; printf( "DATE: %s (%s) must be BEFORE %s (%s)
\n", "DAT1", DAT1, "DAT2", DAT2 ); } if( !test_toggle( "DAT1" ) && datecmp( "DAT1", "DAT2" ) >= 0 ) { fail=1; printf( "DATE: %s (%s) must be BEFORE %s (%s)
\n", "DAT1", DAT1, "DAT2", DAT2 ); } if( !test_toggle( "DAT1" ) && datecmp( "DAT1", "DAT2" ) >= 0 ) { fail=1; printf( "DATE: %s (%s) must be BEFORE %s (%s)
\n", "DAT1", DAT1, "DAT2", DAT2 ); } if( !test_toggle( "DAT2" ) && datecmp( "DAT2", "DAT1" ) <= 0 ) { fail=1; printf( "DATE: %s (%s) must be AFTER %s (%s)
\n", "DAT2", DAT2, "DAT1", DAT1 ); } if( !test_toggle( "DAT2" ) && datecmp( "DAT2", "DAT1" ) <= 0 ) { fail=1; printf( "DATE: %s (%s) must be AFTER %s (%s)
\n", "DAT2", DAT2, "DAT1", DAT1 ); } if( !test_toggle( "DAT2" ) && datecmp( "DAT2", "DAT1" ) <= 0 ) { fail=1; printf( "DATE: %s (%s) must be AFTER %s (%s)
\n", "DAT2", DAT2, "DAT1", DAT1 ); } conversion(); if( !fail ) { print_request( "PREVENTION GUIDELINES", upload ); } else printf( "failure to pass field validation!\n" ); if( !fail ) self_rm( fail, USERID, ROOT, BASE ); return(0); } /* END GENERATED CODE */