The formidable problem of automatic or semi-automatic composition of existing Web services is the subject of much current attention. We address a particular subset of this problem with SWORD, a set of tools for the composition of a class of web services including ``information-providing'' services. In SWORD, a service is represented by a rule that expresses that given certain inputs, the service is capable of producing particular outputs. A rule-based expert system is then used to automatically determine whether a desired composite service can be realized using existing services. If so, this derivation is used to construct a plan that when executed instantiates the composite service. As our working prototype and examples demonstrate, SWORD does not require (but could benefit from) wider deployment of emerging service-description standards such as WSDL, SOAP, RDF and DAML. We also distinguish SWORD from some other plausible existing approaches, especially information integration. We show that although SWORD's expressive capabilities are weaker, the abstractions it exposes capture more appropriately the limited kinds of queries supported by typical Web services and thus result in simplicity and efficiency. [Word Count: 7950 words]
Keywords: Information integration, service composition, service integration, rule-based system.
Service composition: the problem of composing autonomous services to achieve new functionality, is generating considerable interest in recent years in several computer science communities. Service composition has the potential to reduce development time and effort for new applications. The web is a particularly interesting domain for service composition for several reasons. Firstly, increasing numbers of interesting services are moving online and the web is fast transforming from a collection of static pages to a provider of numerous useful services. Another reason is that web services conform to the standard HTTP protocol which makes it (relatively) easier to integrate them into a common framework. Third, because the web has several independent service providers providing related services, there is an inherent need for composing complementary services provided by independent providers to achieve the end-user's needs.
The service composition problem is particularly challenging because web services fall into many categories and composing such a diverse set of services may require several different tools, techniques, and technologies. In this paper, we propose one such toolset: SWORD. SWORD is a toolset that allows service developers to quickly compose base web services to realize new composite web services. SWORD can compose information providing services (such as the web services providing information about people, movies, theaters, restaurants, etc) and a class of other services (such as email and image conversion services). The key idea behind SWORD is as follows:
The contributions of SWORD are three-fold:
Note that in general, SWORD cannot currently handle web services with various side-effects, such as services involving account credits/debits, or various other business-business services. This is an area of future work and we plan to explore how the SWORD model can be (incrementally) expanded to include other types of web services.
In this paper, we describe the composition and execution model of SWORD and demonstrate the prototype implementation we have built. The paper is organized as follows: In section 2, we explain the SWORD service model and composition plan generation. In section 3, we analyze the strengths and weaknesses of the SWORD composition model. While doing so, we contrast the SWORD composition model with existing work on data integration or information integration which has been the focus of much research in the databases and AI communities for several years [19, 12]. We do so for two reasons:
In this section, we describe in greater detail the service model and the composition model. Then, we describe rule-based plan generation: the mechanism currently used in SWORD to generate composite service plans.
We model each service by its inputs and outputs. These inputs and outputs are specified in a world model. The world model currently needs be built at the ``composition site'' by the base service modelers (developers integrating underlying web services that will be used to create composite services). Using the terminology of the ER model [26], the world model consists of entities (such as people, images, emails, restaurants, stocks, and movies), and relationships among entities (a theater shows a movie, etc). Entities have attributes (such as the name of a person). Unlike the traditional use of ER model for modeling data, we use it to describe the inputs and outputs of web services. The following must be noted:
Every service has two types of inputs: conditional inputs (that are assertions specifying the entities the service operates upon and the relationship conditions between the entities) and data inputs (the actual data required by the services expressed in terms of the attributes of the involved entities). The same holds true for the outputs: there are both conditional and data outputs. As a concrete example, consider the Yahoo people lookup service. This service is modeled using the following inputs and outputs:
Entities involved: X
Condition Inputs: 
  Person(X) - indicates that X is a person entity
Data Inputs: 
  firstname(X), lastname(X), city(X), state(X) 
  - indicates that the service needs these four attributes of X
Condition Outputs: 
  none - no condition outputs for this service
Data Outputs: 
  streetaddress(X), phone(X)  
  - indicates that the street address and phone attributes of the 
    person are returned by the lookup service.
Two other examples are shown below: an email service and a driving directions service.
EMail service: Entities involved: X, Y Condition Inputs: EMailMessage(X) - indicates that X is an email message Data Inputs: subject(X), body(X), emailaddress(Y) Condition Outputs: MailSent(X, Y) Data Outputs: none Driving Directions service: Entities involved: X, Y Condition Inputs: none Data Inputs: streetaddress(X), city(X), state(X), streetaddress(Y), city(Y), state(Y) Condition Outputs: none Data Outputs: drivingdirections(X, Y)
In reality, this information (i.e., the condition and data inputs/outputs) alone is not sufficient to model a service. We also need run time information about how the service can be invoked and how it returns the results (if any). While we do not show this here, the complete service model descriptions also include this information. We will come back to this issue in greater detail when we discuss the SWORD execution model.
Suppose we wish to create a service that looks up the driving directions between two persons' homes given their names and cities. The inputs and outputs for this composite service are:
Names to Driving Directions service:
Entities involved: X, Y
Condition Inputs: Person(X), Person(Y)
Data Inputs: 
    firstname(X), lastname(X), city(X), state(X), 
    firstname(Y), lastname(Y), city(Y), state(Y) 
Condition Outputs: none
Data Outputs: drivingdirections(X, Y)
This composite service can be realized by composing the people lookup service and the driving directions service. SWORD attempts to determine this automatically from the service specifications. In general, this can be modeled as a planning problem as follows:
 ,
 ,  , .. ,
 , .. ,  ) is defined as
Known(attr,
 ) is defined as
Known(attr,  ,
 ,  , .. ,
 , .. ,  ). Thus, the known
fact corresponding to the data input: input(
 ). Thus, the known
fact corresponding to the data input: input(  ,
 ,  , .. ,
 , .. ,  ), is Known(input,
 ), is Known(input,  ,
 ,
 , .. ,
 , .. ,  ).
 ).Example:
People lookup service: precondition: Person(X) & Known(firstname, X) & Known(lastname, X) & Known(city, X) & Known(state, X) postcondition: Known(streetaddress, X), Known(phone, X) Driving directions service: precondition: Known(streetaddress, X), Known(city, X), Known(state, X) Known(streetaddress, Y), Known(city, Y), Known(state, Y) postcondition: Known(drivingdirections, X, Y) Names-to-driving-directions composite service: Initial state: Person(X), Person(Y), Known(firstname, X), Known(lastname, X), Known(city, X), Known(state, X) Known(firstname, Y), Known(lastname, Y), Known(city, Y), Known(state, Y) Desired final state: Known(drivingdirections, X, Y)
For all the services we currently model, the preconditions and postconditions can be expressed as conjunction of negation free facts. Further, we also avoid the use of function symbols or arithmetic predicates. (Some of these limitations can be overcome through a run time filter mechanism that will be explained later.) For this class of services, plan generation can be achieved efficiently using a rule-based expert system [24]. (Given a set of initial facts and a set of if-then rules, an expert system provides a mechanism of obtaining the facts that can be derived from the initial facts and rules. We use a rule-based engine implemented in Java, called Jess [15].)
When modeling a service, a base service modeler simply defines a
(Horn) rule of the form: precondition  postcondition.
postcondition.
Example: The rules corresponding to the people lookup service and the driving directions service are as follows:
Person(X) & Known(firstName, X) & Known(lastName, X) & Known(city, X) & Known(state, X) => Known(streetaddress, X) & Known(phone, X) Known(streetaddress, X) & Known(city, X) & Knwon(state, X) & Known(streetaddress, Y) & Known(city, Y) & Knwon(state, Y) => Known(drivingdirections, X, Y)
Note that these rules are part of the service models and need be defined manually only once per base service, and not every time a composite service developer wishes to create a composite service! To create a composite service, the developer only needs specify the initial and final state facts for the composite service. The composite service developer need not even be familiar with which underlying services are available. (For the sake of clarity, we distinguish the composite service developer from the base service modeler. Of course, they could be the same person. Also, an unqualified ``developer'' refers to a composite service developer.)
Given the initial and final state facts supplied by the
developer for the desired composite service, and given the rules
for the base services available in their respective service models,
SWORD uses the following algorithm to determine if a plan exists
for the composite service:
Step 1: SWORD asserts the rules corresponding to all the
available services in Jess. Thus, SWORD asserts the two rules
defined above (people lookup service rule and driving directions
service rule) and similarly defined rules corresponding to all the
other available services.
Step 2: SWORD asserts the initial state facts for the composite service. For the names-to-driving-directions service, SWORD asserts the following initial facts:
Person(A) Person(B) Known(firstname, A) Known(lastname, A) Known(city, A) Known(state, A) Known(firstname, B) Known(lastname, B) Known(city, B) Known(state, B)
Step 3: SWORD runs the engine which causes matching rules to fire. When the rules stop firing, SWORD queries the engine for the facts corresponding to the final state of the composite service. For the names-to-driving-directions composite service, SWORD queries the engine for the following fact:
Known(drivingdirections, A, B)
Th conditions on the rules (both LHS and RHS are conjunctions of facts that do not use negations or arithmetic/function symbols) ensure that:
Step 4: If the final state facts can be derived, then
SWORD constructs a plan from the derivation structure (which is
inferred from the call-backs that happen when rules fire) used to
derive these facts. There can be multiple ways of deriving the same
facts, and SWORD currently picks an arbitrary plan (since there is
no cost model yet for evaluating alternative plans).

Step 5: The constructed plan can be viewed by the developer using the GUI plan-viewer tool. Figure 1 shows a simplified view of the names-to-driving-directions plan.
In this section, we discuss the strengths and weaknesses of the SWORD composition model. We sometimes contrast the SWORD model with existing data integration work to better explain the issues. However, note that:
The goal of data integration is to provide a uniform interface to multiple (possibly distributed) data repositories [19, 12]. Here, we only focus on the LAV-approach [12], since it is most similar to SWORD. In the LAV approach, data sources are expressed as queries (or views) on a mediated schema consisting of virtual relations. User queries are also posed on the virtual relations and they need to be answered using the source relation views. In most cases, this problem of answering queries using views has proven intractable (NP-complete) [19]. Currently, the only algorithm for answering queries using views known to scale gracefully is the MiniCon algorithm [22].
While the data integration work applies (for example) to integrate a number of actual databases to create a web service, using this model for composing across web services creates several problems. First, even though they can sometimes be modeled as relations, typical informational web services allow only a limited number of pre-specified queries for various reasons (either because other queries cannot be logically answered or for administrative, engineering or privacy reasons). For example, the Yahoo people lookup service allows only looking up the address of a person by name, but not vice-versa. Similarly, the MapQuest driving directions service only allows looking up the driving directions between two addresses. If these were actual relations, they would allow other queries also such as find the names of persons living at a given address, find the destination address given the source address and driving directions, etc. (The second query cannot be logically answered since the driving directions are computed on-the-fly and there is no table listing all pairs of addresses in the world and the driving directions between them.)
Thus, most existing data integration work does not apply to web services. A section of the data integration research has addressed integration of data sources with query restrictions, which are referred to as binding pattern limitations [23, 9]. However, we still have the following problems in the context of web services:
Once again, we remind the readers that we do not claim that SWORD is a better model for data integration. The purpose of the above discussion is to show that while the data integration research solves the problem of integrating actual data sources, it does not apply well to the web service composition problem, because we have a different set of assumptions and requirements here. Also, data integration obviously does not handle composing non-query services such as conversion and email.
  Rule-based
chaining in SWORD can sometimes generate ``uncertain'' results. To
understand why this happens, consider an entity X with three
attributes a, b, c. Suppose the following two services are
available: a service S1 that provides the b(X) given a(X) and a
service S2 that provides c(X) given b(X). If it were desired to
have a composite service S3 that provides c(X) given a(X), SWORD
would simply chain S1 and S2 together to achieve S3. However, this
can cause uncertain results.
Example: Consider the following services: personname-to-address and address-to-phone. Further, assume (as is the case) that multiple people can live at the same address and that different people living at the same address may use different phones. For the composite service name-to-phone, SWORD would just chain the name-to-address and address-to-phone services. Now, if John and Jack live at the same address but have different phones numbers, the composite service would return both their phone numbers when asked for John's phone. The reason this happens is because address neither uniquely determines the phone number nor the name.
In general, for services S1 and S2 described as above, chaining
can cause uncertain results unless b  c or
b
 c or
b  a (where
 a (where  denotes a ``functional dependency'' [26] using database
terminology). This is because chaining is essentially like a join
and a join can be lossy [26] when the above
property is not satisfied.
 denotes a ``functional dependency'' [26] using database
terminology). This is because chaining is essentially like a join
and a join can be lossy [26] when the above
property is not satisfied.
While the fact that SWORD can yield uncertain results may seem alarming at first, note that:
The last bullet above implies that where SWORD yields uncertain
results, an equivalent query on a virtual relation may yield no
results at all (since there are no guaranteed correct answers
here).
Example: An obvious choice for a virtual relation in
the mediated schema for the name-address-phone example is the
``person'' relation: person(name,address,phone). (In reality, any reasonable person virtual relation will have
several other attributes.) The natural equivalent for the composite name-phone service
would be the following query (using the terminology of [23, 9]):
v(  ,
 ,  ) :- person(n,a,p). However,
this query would yield no results given only the name-address and
address-phone services. Thus, in this example, the tradeoff is
essentially getting a few phone numbers one of which could be the
actual number vs. getting no results at all.
 ) :- person(n,a,p). However,
this query would yield no results given only the name-address and
address-phone services. Thus, in this example, the tradeoff is
essentially getting a few phone numbers one of which could be the
actual number vs. getting no results at all.
The SWORD ``query model'' is weaker than the query languages used in traditional data integration approaches. For example, it does not allow the specification of arbitrary joins. Rather, joins can only be expressed as relationship conditions. For example, suppose there exist two services: a service that returns the director of a given movie, and a service that returns the theaters showing a given movie. Consider the following (simplified) choices for the world model (with the SWORD approach) and mediated schema (with traditional data integration approach):
SWORD world model:
Entities: Movie - Attributes: name, director
          Theater - Attributes: name, address
Relationships: Shows - Involved entities: Theater, Movie 
Mediated schema:
Movie(name, director)
Theater(name, address)
Shows(moviename, theatername)
Given the above mediated schema, all relational query languages would allow posing queries involving arbitrary joins such as the following (we use the notation commonly used in data integration literature here but also provide explanations):
These queries cannot be answered given only the two available web services. The main problem is that the exported schema gives little information about which queries can actually be answered. On the other hand, these unanswerable queries cannot be posed in SWORD. (Recall that all condition input/outputs must be entity or relationship assertions only, and all data input/outputs must be entity or relationship attributes). However, note that SWORD allows the query ``find all theaters showing a given movie'', which may be posed as follows:
Condition Inputs: Movie(X), Theater(Y), Shows(X, Y) Data inputs: name(X) Data outputs: name(Y) Condition outputs: --none--
Suppose a new web service becomes available that when given the name of a movie X, provides the names of all movies that were also directed by the director of X. Then, we can add a new relationship to the SWORD world model called ``Same-Director''. This can then be used to pose queries about movies directed by the same director.
In general, we believe that the weaker query model of SWORD is better suited for web services, since it results in a simple and efficient composition process, while still allowing almost all the answerable queries that may arise in practice.
Note that the SWORD query model can be augmented to allow
arbitrary queries such as the ones above. However, we do not wish
to forgo the simplicity and efficiency of the weaker query model.
An interesting alternative that emerges is to support both a simple
query model and a complicated one, where the simpler one is known
to be more efficient, and the complicated one is only used when
necessary, with the expectation that most queries using it may be
unanswerable.
Service model vs. rule-based plan generation: While our
service model presented in section 2.1 (describing a service
using its condition and data inputs/outputs) is fairly general, the
rule-based plan generation algorithm of section 2.3 is not as
flexible. For example, by using negation, disjunction, arithmetic
and function symbols, we could describe a large class of web
services by their condition and data input/outputs. However, the
simple rule-based plan generation algorithm would not work any more
(at least not in its current form) for these services. However,
there is a sufficiently interesting class of services that can be
modeled subject to the current SWORD constraints. In the future, as
we add services that can't be modeled with these constraints, we
will need to incrementally generalize the plan generation
algorithm.
Plan generation can fail: Note that one of the following can happen in the plan generation process:
In the event one of the above happens (or is anticipated in
advance), the developer has to resort to other techniques to create
and deploy the composite service. (For example, create the
composite service manually, or use other composition tools and
techniques).
Semantic issues in rule-base: When a new base service is added (i.e., modeled) by a base service modeler, the rule for the new service gets added to the ``rule-base'', and becomes available for all future compositions. However, an important issue is consistency across rules in the rule-base built over a period of time. The rules must consistently use the same facts for representing the same concepts. In SWORD, this must be ensured by the base service modelers.
Note that an ``improper'' choice of an initial set of entities
and relationships can hinder the graceful evolution of the
rule-base. A related important issue is the standardization of the
rules across service providers or across composition sites. If such
standardization becomes possible in the future (perhaps due to
deployment of standards such as RDF [27], DAML [14], etc), SWORD could prove more
effective. On the other hand, SWORD is useful even without such
standardization. (Base service modelers at each ``composition
site'' write their own wrappers and rules for each web service
using the world model of that composition site.)
Efficiency: The rule-based system we use is based on the
Rete [11] algorithm, and takes
only O(rfp) time per iteration where r is the
number of rules, f is the number of facts and p is
the number of patterns on the LHS. Thus, the overall time for
forward chaining is  where F is
the total number of facts (initial and derived). (In fact, for
informational services, the only facts that can be derived are the
known facts corresponding to the attributes of the entities
involved in the composite service inputs.) Also, rule-based systems
are well-understood and several practical implementations are
available.
 where F is
the total number of facts (initial and derived). (In fact, for
informational services, the only facts that can be derived are the
known facts corresponding to the attributes of the entities
involved in the composite service inputs.) Also, rule-based systems
are well-understood and several practical implementations are
available.

 
We have implemented a proof-of-concept prototype of the plan generation algorithm and the plan viewer tool described in section 2, and an execution system that can instantiate and execute the plans. We have used SWORD to create seven composite services from a set of ten base web services.
Figure 2 shows two different plans as viewed by the plan viewer. (Recall that figure 1 was a simplified illustration.)
We now demonstrate the movie, restaurant trip planner composite service (plan 1) in action. Figure 3(a) shows the interface to the trip planner as seen by an end-user on the web. As seen in the figure, the trip planner service prompts the user to enter the person name and the city, the movie name and a cuisine. The user enters this information and submits the form. As shown in the figure, say ``John White'' is the entered person name, ``Hardball'' is the entered movie name and ``Chinese'' is the entered cuisine. Suppose multiple matches are found for the name ``John White'' and the given city. Then, the trip planner service prompts the user with the matches found the requests the user to select the appropriate ``John White'' (figure 3(b)). The trip planner then locates the restaurants serving Chinese cuisine near the entered zip code and prompts the user to select one of the found restaurants (figure 3(c)). Similarly, the service finds the theaters showing the movie ``Hardball'' and prompts the user to select one of them (not shown here). Finally, the service finds the three pairs of driving directions for the trip and displays them to the user.
A composite service need not necessarily be interactive. For example, the ``stock notification composite service'' runs without any interaction with the user. Even the trip planner service can be run such that it does not prompt the user for selecting the restaurant etc., but (for example) just selects the first match and proceeds.
In the next section, we discuss plan instantiation and execution (the mechanism that translates the plans shown in figure 2 to the actions shown in figure 3). The run time mechanisms also compensate for some of the limitations of the current service representation and composition model.

If a plan viewed by the plan viewer is found to be appropriate, the developer can request that a persistent representation of the plan be generated. This representation is used for execution whenever the composite service is invoked by an end-user on the web.
The persistent XML representation for a plan lists the services that need to be invoked, the order in which they need be invoked and the flow of entities between them. Further, for each service, the plan representation includes the run time information - the script and the filter. This information is obtained from the respective service model descriptions. (Recall from section 2 that in addition to the condition and data input/outputs, each service model also includes the run time information. The script and the filter constitute the most important pieces of run time information for any service.)
The script points to a WebL [17] (a language custom designed for web scripting) script. We have not focussed on the problem of interacting with the web services and extracting the results from them. Apart from WebL, several other research and industry efforts are addressing this problem [28, 29], and SWORD can easily use these alternatives if widely deployed.
Once the XML representation is generated, the developer creates a suitable custom web interface for the composite web service, which points to the appropriate plan (such as the form shown in figure 3(a)).

In the following description, we will use the names-to-driving-directions composite service (shown in figure 1) to explain how the execution works. SWORD run time is built upon a data-flow engine called Paths [16]. The run time instantiates an (appropriately interconnected) operator for every service listed in the XML representation of the plan. Thus, at run time, we have a graph of operators interconnected by queues. The execution graph for the plan shown in figure 1 is shown in figure 4. The execution model can be thought of as a flow of entities through this graph. For example, suppose the composite service is invoked (from the custom created web interface) with the following data:
Person A: firstname: John lastname: Doe city: Los Angeles state: CA Person B: firstname: Joe lastname: Shmoe city: New York state: NY
The run time first creates a Person entity each for John Doe and Joe Shmoe. Each entity is identified by a unique identifier. Suppose the entity for John Doe has the identifier ``A101'' and the entity for Joe Shmoe has the identifier ``B200''. Then, the system sets the data inputs (if any) for these entities as follows:
A101.firstname = ``John'' A101.lastname = ``Doe' A101.city = ``Los Angeles'' A101.state = ``CA'' B200.firstname = ``Joe'' B200.lastname = ``Shmoe' B200.city = ``New York'' B200.state = ``CA''

Then, the system places the entity A101 in 'queue1' and the entity B200 in 'queue2'. As shown in figure 5, the operator corresponding to 'service1' picks up the entity A101, extracts the firstname, lastname, city and state attributes and launches the Webl script ``peopleLookup.webl'' with these four arguments. When this script returns, the operator extracts its results and assigns them to A101 as shown below (and in the figure):
A101.streetaddress = ``101 FooBar street'' A101.lastname = ``300-999-9999''

After this assignment, the operator places A101 in 'queue3' where it is picked by the operator corresponding to 'service3'. The execution at the other operators proceeds similarly. Figure 6 shows the flow of entities through the graph. When the execution completes, the system removes the entity pair (A101,B200) from 'queue5' and extracts the 'drivingdirections' attribute of this pair. Finally, the system invokes a result Java server page (which JSP to invoke is also contained in the plan) with the extracted driving directions as arguments. This Java server page formats the results appropriately and displays them to the user.
As observed in the demonstration of the trip planner composite service in section 4, base services may produce multiple outputs either expectedly (such as multiple Chinese restaurants in the same area) or unexpectedly (such as multiple matches for ``John White''). Both these cases cases are handled using a general filter mechanism in SWORD. (Thus, the web pages for user interaction shown in figures 3(b) and figures 3(c) were actually generated by filters). A filter may optionally be specified for each service in the service model. If a filter is specified for a service, the run time invokes the filter with the output of the service. When the filter returns, the run time extracts the output of the filter and places it in the output queue for the service. As viewed by the SWORD run time, the filter is a ``black-box'', and it can do anything: seek user intervention and modify the output, or perform some computation (eg. sorting) and modify the output. Since Java server pages can perform both arbitrary computation and provide a user interface, we have chosen them as the medium to implement the filter mechanism.
Note that the web pages for user interaction are difficult to generate automatically, especially since they may need be customized to the individual service, composite service, and/or the web site hosting the composite service. In addition to enabling user interactions, filters can sometimes compensate for the absence of arithmetic/function symbols in the composition logic, since they can perform arbitrary computation (such as sorting, searching, etc).
We have taken preliminary performance measurements for both plan generation and execution. Generating the movie, restaurant trip plan of figure 2(a) took 0.98 seconds, while generating the plan in figure 2(b) took 0.68 seconds. In both cases, the forward chaining itself took only about 50 milliseconds. (Some of the overhead is due to HTTP latency - the plan generator is also web-based). Executing the movie, restaurant trip plan in a non-interactive mode require 15.7 seconds. The run time overhead added by SWORD was 1.7 seconds, while the rest of the time was spent making the HTTP requests to the web services and extracting the results. The SWORD run time is not particularly optimized and we believe the run time overhead added by SWORD is acceptable.
Multiple entities selected: Suppose the people lookup service returns two matches for ``John Doe''. Also, assume that the end-user cannot decide between the two when asked to do so by the filter. The filter then returns a pair of street addresses and phone numbers (say [``101 FooBar street'', 300-999-9999], [``490 BarBaz street'', 300-888-8888]). When this happens, the operator ``clones'' the entity A101 to create another Person entity (say with identifier A102) and sets the following:
A101.streetaddress = ``101 FooBar street'' A101.phone = ``300-999-9999'' A102.streetaddress = ``490 BarBaz street'' A102.phone = ``300-888-8888''
Note that the other attributes of A101 and A102 will have
identical values since A102 was created by cloning A101. The
following 'service3' operator then finds the driving directions for
both the combinations [A101, B200] and [A102, B200].
Optional inputs: Many web services take multiple inputs, not all of which are mandatory. For example, the people lookup service may need only the lastname, city, and state as the mandatory inputs while the firstname may be an optional input. In these cases, we currently only use the mandatory inputs while specifying the service model. However, we allow for using optional inputs at run time wherever possible. While invoking any service at run time, we check if the values for optional inputs are known and include them in the request sent to the web service. This strategy has helped us make it possible to use optional inputs where available while keeping the composition process simpler. However, sometimes it is desirable to derive optional inputs using other services and include them while making the request to a particular service. This does not currently happen in SWORD.
Due to lack of space, we do not discuss other issues such as missing outputs and service invocation errors.
In this section, we discuss the benefits of building composite web services in an automated fashion. In addition, we also discuss an alternative mechanism for web service composition.
Building composite services with an automated/semi-automated tool like SWORD obviously requires less effort than creating the service manually. This is especially important because composite service creation is not necessarily a one-time effort. Rather, the composition may need to change as the underlying services evolve, new services become available and old ones cease to exist. We also have the following ``non-obvious'' advantages due to automation:
Also, while SWORD is currently intended as a toolkit for creating composite web services, many of the ideas could also be used for creating applications and agents intended for end-users.
SWORD currently does planning only at composition time and not at run time. Both online and offline planning have their own advantages. The primary advantage of offline planning in SWORD is predictability at run time. Once the developer verifies the generated plan, she can be (relatively) confident of the run time behavior. Another advantage is efficiency, since no planning overhead is incurred at run time for every request or for every entity flowing through the execution graph. Also, note that multiple alternative plans could possibly be generated at composition time itself to deal with different run time possibilities such as missing outputs, service invocation errors, etc.
As explained earlier, SWORD does not rely on, but could benefit from, the deployment of emerging standards such as SOAP [28], WSDL [29], UDDI [8] and DAML [14]. The semantic web initiative [3, 14, 10, 13, 20] can be broadly subdivided into two categories.
In principle, SWORD belongs to the latter category. However, SWORD does not rely on the actual deployment of any specific semantic markup language by service providers. Further, since it is unclear as to how rich a semantic markup language would get adopted in practice (if at all), we believe that the simple world model of SWORD is advantageous from a deployment perspective.
Research/industry efforts are also beginning to address web service discovery [8, 25]. Discovery and composition are complementary technologies and each could benefit from the other.
Much work has been done in the databases and AI communities on the problems of integrating relational databases [19, 12], as well as semi-structured and XML data [21, 2]. As already explained in section 3, SWORD has a different set of goals from data integration. In that section, we also distinguished SWORD from the relevant aspects of the data integration research. Research efforts have also addressed searching/crawling based technologies for extracting information from static web pages as well as the web's link structure. These technologies include question answering [18, 1] and topic distillation [7, 6, 4].
In this paper, we described SWORD, a toolset that allows developers to quickly compose existing web services to realize new, composite web services. SWORD provides a specific data-point in the obvious expressiveness-complexity-efficiency tradeoff that exists in web service composition. The key aspects of SWORD are summarized below:
Acknowledgments: We thank Mike Genesereth, Prasenjit Mitra, Mayank Bawa and Honglei Zeng for reviewing earlier drafts of this paper and providing valuable feedback that helped improve this draft. We are also grateful to Michael Michael, Jairam Ranganathan, Emre Kiciman and Laurence Melloul for help with various parts of the implementation.
http://www.cs.washington.edu/homes/alon/site/files/levy-di00.ps.