This paper proposes a new Web Service integration approach based on componential process construction and the Service Grid, which enables Web Services to be retrieved effectively and efficiently. The approach includes a component-based Web Service process definition tool, a mechanism for retrieving services from the Service Grid and UDDI repositories, algorithms for data flow integration, and rules for service verification. It has three advantages: the complex service integration process can be divided and conquered, the efficiency and effectiveness of service retrieval can be improved, and services can be used through a single semantic image. The proposed approach has been implemented, integrated into a Service Grid platform, and applied to an online book sale business.
Component, Integration, Service Grid, UDDI, Web Service.
Integrating Web Services for large-scale applications is a challenging problem because the efficiency and quality of the complex service processes involved are difficult to manage. A second issue arising from service integration is how to retrieve services accurately and efficiently from large, rapidly expanding service repositories.
This paper addresses the issue of complex service process construction by making use of component technology, an important way to decompose complex service processes. A component-based service process definition tool has been developed to help users transform a business process into a service process and then specify the requirements of the related service components, which are integrated by control flows and data flows. Interactions between the components in the service processes are based on XML, SOAP and WSDL. Data returned from multiple services may be heterogeneous in structure and semantics [1, 2], so we propose algorithms to integrate these heterogeneous data.
We improve the accuracy and efficiency of service retrieval by making use of the Service Grid, which organizes Web Services in an orthogonal multi-dimensional service space so that services can be retrieved efficiently and effectively [3].
The general architecture of the proposed Web Service integration approach is illustrated in Figure 1. It consists mainly of the following modules: process definition, service retrieval, service integration, verification and registration.
The main control flow process includes the following steps:
Step 1. Process definition. According to the business processes, users can define the service processes as components by using the definition tool.
Step 2. Definition verification. After definition, the completeness and time constraints of the defined process components are verified. If there is something wrong, modification is required. Otherwise, users can specify the requirements for the related service components by using the definition tool.
Step 3. Service retrieval. A SOAP message encapsulating the requirements for the components is first sent to the Service Grid. If the Service Grid fails to find the related services, the SOAP message is sent to the UDDI repositories. The SOAP message returned by the Service Grid or the UDDI repositories contains either a list of services satisfying the requirements or error messages. If no matching services are found, users have to develop the service themselves or modify the requirements and carry out a second round of retrieval.
Step 4. Service integration. The components in the service processes are integrated by using control flows and data flows. The control flow reflects the control dependence and the data flow reflects the data dependence among the components.
Step 5. Integration verification and service registration. The integration verification module checks the reachability, deadlock and the execution state of the service process. If there is nothing wrong, the new service will be registered at the Service Grid and also the UDDI repositories. Otherwise, the modification of the process definition is triggered.
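The retrieval order described in Step 3 can be sketched as follows. This is only a minimal illustration under stated assumptions: the SOAP transport is omitted, and the names `retrieve_services`, `make_registry`, `grid` and `uddi` are hypothetical stand-ins for the real Service Grid and UDDI endpoints, which are modeled here as in-memory lookups.

```python
from typing import Callable, Dict, List

# Hypothetical lookup signature: takes a requirements dict and
# returns the names of the matching services (empty list if none).
Lookup = Callable[[Dict[str, str]], List[str]]


def retrieve_services(requirements: Dict[str, str],
                      service_grid: Lookup,
                      uddi: Lookup) -> List[str]:
    """Query the Service Grid first; fall back to UDDI only on a miss."""
    matches = service_grid(requirements)
    if matches:
        return matches
    # An empty result from both sources means the user must develop the
    # service or relax the requirements and retry (second round).
    return uddi(requirements)


def make_registry(entries: Dict[str, str]) -> Lookup:
    """Build a toy registry mapping service name -> category (assumption)."""
    def lookup(requirements: Dict[str, str]) -> List[str]:
        wanted = requirements.get("category")
        return [name for name, category in entries.items()
                if category == wanted]
    return lookup


# Toy registries for an online book sale scenario (illustrative data only).
grid = make_registry({"BookSearch": "catalog"})
uddi = make_registry({"BookSearch": "catalog", "CardPayment": "payment"})
```

With these toy registries, a request for the `catalog` category is answered by the Service Grid alone, while a request for `payment` falls through to UDDI.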
Figure 1. General architecture of the proposed approach.
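The reachability part of the integration verification in Step 5 can be illustrated with a purely structural check on the control-flow graph. The sketch below is a simplification and not the verification rules of the approach itself: it assumes a single start node and a single end node, treats a node that can never lead to the end node as a potential deadlock, and ignores execution state. The function names and the sample process are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: return the set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def verify_process(graph, start, end):
    """Return (unreachable nodes, nodes that can never reach the end)."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    from_start = reachable(graph, start)
    unreachable = nodes - from_start

    # Reverse every arc so that a search from `end` finds all nodes
    # from which the end node can be reached.
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)
    can_finish = reachable(reverse, end)

    stuck = from_start - can_finish
    return sorted(unreachable), sorted(stuck)


# Illustrative control flow: node -> list of successor nodes.
book_sale = {
    "order": ["pay", "audit"],
    "pay": ["ship"],
    "ship": [],
    "audit": [],            # dead end: never reaches "ship"
    "gift_wrap": ["ship"],  # never reached from "order"
}
```

For this sample process the check reports `gift_wrap` as unreachable and `audit` as a node from which the process cannot complete, so the process definition would be sent back for modification.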
A Service Grid is a multi-dimensional service space with a set of uniform service operations (http://kg.ict.ac.cn). A referential model for the service space can be expressed as ServiceGrid=(classification-type, category, registry-node), where the classification-type dimension indicates the service classification standards, the category dimension reflects the detail-classified hierarchy related to a classification type, and the registry-node dimension describes the portal where business entities register their services.
Based on our previous work [4], we first define the multi-valued specialization relationships including identical, partial, extension and revision between services and then construct the similarity degree between services in each subspace, which corresponds to a high-level point of the service space. We adopt heuristic graph searching on the multi-valued specialization hierarchy to improve the query efficiency in each subspace. An interface of the Service Grid is shown in Figure 2.
Figure 2. An interface of the Service Grid.
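The three-dimensional referential model ServiceGrid=(classification-type, category, registry-node) can be pictured as a coordinate-indexed store. The following is a minimal sketch under that reading only; the class and method names are hypothetical, the sample coordinates are illustrative, and the specialization relationships and heuristic search are not modeled.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional


class Coordinate(NamedTuple):
    """One point of the service space (one value per dimension)."""
    classification_type: str
    category: str
    registry_node: str


class ServiceSpace:
    """Toy service space: services are stored at their coordinate."""

    def __init__(self) -> None:
        self._cells = defaultdict(list)

    def register(self, coord: Coordinate, service: str) -> None:
        self._cells[coord].append(service)

    def query(self,
              classification_type: Optional[str] = None,
              category: Optional[str] = None,
              registry_node: Optional[str] = None) -> List[str]:
        """Fix any subset of dimensions; None leaves a dimension open."""
        wanted = (classification_type, category, registry_node)
        found = []
        for coord, services in self._cells.items():
            if all(w is None or w == c for w, c in zip(wanted, coord)):
                found.extend(services)
        return found


# Illustrative registrations (all values are assumptions).
space = ServiceSpace()
space.register(Coordinate("UNSPSC", "books", "node-a"), "BookSearch")
space.register(Coordinate("UNSPSC", "payment", "node-a"), "CardPayment")
space.register(Coordinate("NAICS", "books", "node-b"), "BookReview")
```

Fixing one dimension selects a subspace: `space.query(category="books")` returns the services registered under that category regardless of classification type or registry node.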
A component-based Web Service process definition tool has been developed to help users transform a business process into a service process. The interface of the definition tool is shown in Figure 3. The menu and the tool bar for operating on the service process are arranged at the top. The scalable component hierarchy of the current process is shown on the left. The service process component being defined occupies the middle and right portion. The verification information, including the error location and error content, is displayed at the bottom. There are two kinds of basic elements in a service process: nodes and arcs. Nodes denote components of a service process, while arcs denote the data flows and control flows.
Figure 3. An interface of the Web Service process definition tool.
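The node-and-arc structure of a service process, together with the component hierarchy, can be represented with a small data model. This is a hedged sketch rather than the tool's actual internal format; `ArcKind`, `Arc`, `ProcessComponent` and the sample process are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ArcKind(Enum):
    CONTROL = "control"  # control dependence between components
    DATA = "data"        # data dependence between components


@dataclass(frozen=True)
class Arc:
    source: str
    target: str
    kind: ArcKind


@dataclass
class ProcessComponent:
    """A service process whose nodes may themselves be sub-processes."""
    name: str
    # node name -> nested component, or None for an atomic service
    nodes: Dict[str, Optional["ProcessComponent"]] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, name: str,
                 sub: Optional["ProcessComponent"] = None) -> None:
        self.nodes[name] = sub

    def connect(self, source: str, target: str, kind: ArcKind) -> None:
        # Arcs may only join nodes that were defined in this component.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError(f"unknown node in arc {source} -> {target}")
        self.arcs.append(Arc(source, target, kind))

    def depth(self) -> int:
        """Number of levels in the component hierarchy."""
        nested = [sub.depth() for sub in self.nodes.values() if sub is not None]
        return 1 + max(nested, default=0)


# Illustrative process: "pay" is itself a two-node sub-process.
payment = ProcessComponent("pay")
payment.add_node("authorize")
payment.add_node("capture")
payment.connect("authorize", "capture", ArcKind.CONTROL)

book_order = ProcessComponent("book_order")
book_order.add_node("search")
book_order.add_node("pay", payment)
book_order.connect("search", "pay", ArcKind.CONTROL)
book_order.connect("search", "pay", ArcKind.DATA)
```

Keeping control and data arcs as separate kinds on the same pair of nodes mirrors the distinction the tool draws between control dependence and data dependence.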
A service node may receive multiple data flows with different structures, so these flows must be integrated to reflect the complete semantics.
The first step is information integration, which determines the basic information items (i.e., the semantic structure) and the data dependence relationships among them (i.e., the semantic constraints). The determinant item is called the key, by analogy with the key in relational databases.
The second step is structure integration, which integrates the schemas of the data flows to generate a unified structure, i.e., the integrated schema. The idea of the schema integration is to load each data flow, traverse its schema recursively in depth-first order and append the corresponding nodes to the integrated schema. The algorithm considers the following cases:
1) If the current node is the key in the semantic constraints, an attribute of the root node will be created in the integrated schema.
2) If the current node is not the key and its value does not vary with the involved data flows, a leaf node will be created as a child of the root.
3) If the current node is not the key but its value varies with the involved data flows, a non-leaf node will be created as a child of the root node. In addition, children of this node will be created to distinguish the information coming from different data flows.
4) If the current node does not exist in the semantic structure, a corresponding node will be appended to a separate branch under the root node.
The third step is data integration, which is to traverse the data flows in the depth-first order and integrate data by referring to the node mapping information.
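The four schema cases and the data step can be sketched together on small XML fragments. This is a simplified illustration, not the algorithm as implemented: it assumes every data flow describes the same entity, that leaf tag names are unique within a flow, and that the key has the same value in every flow. It also fills in values while building the structure instead of using a separate node mapping. The function names, the `extra` branch name and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def leaf_values(elem, out=None):
    """Depth-first traversal collecting {tag: text} for leaf elements."""
    if out is None:
        out = {}
    children = list(elem)
    if not children:
        out[elem.tag] = (elem.text or "").strip()
    for child in children:
        leaf_values(child, out)
    return out


def integrate(flows, key, semantic_structure, root_tag="integrated"):
    """flows: dict mapping a flow name to the root Element of that flow."""
    values = {name: leaf_values(root) for name, root in flows.items()}

    # Collect leaf tags in first-seen order across all flows.
    tags = []
    for leaves in values.values():
        for tag in leaves:
            if tag not in tags:
                tags.append(tag)

    root = ET.Element(root_tag)
    extra = None  # separate branch for nodes outside the semantic structure
    for tag in tags:
        per_flow = {name: leaves[tag]
                    for name, leaves in values.items() if tag in leaves}
        distinct = set(per_flow.values())
        if tag == key:
            # Case 1: the key becomes an attribute of the root.
            root.set(tag, next(iter(per_flow.values())))
        elif tag not in semantic_structure:
            # Case 4: unknown node goes to a separate branch.
            if extra is None:
                extra = ET.SubElement(root, "extra")
            ET.SubElement(extra, tag).text = next(iter(per_flow.values()))
        elif len(distinct) == 1:
            # Case 2: same value in every flow -> single leaf under the root.
            ET.SubElement(root, tag).text = distinct.pop()
        else:
            # Case 3: value varies -> non-leaf node with one child per flow.
            node = ET.SubElement(root, tag)
            for name, value in per_flow.items():
                ET.SubElement(node, name).text = value
    return root


# Two illustrative data flows about the same book (sample data only).
store_a = ET.fromstring(
    "<book><isbn>123</isbn><title>Grid</title><price>30</price></book>")
store_b = ET.fromstring(
    "<item><isbn>123</isbn><title>Grid</title><price>28</price>"
    "<stock>5</stock></item>")
result = integrate({"storeA": store_a, "storeB": store_b},
                   key="isbn",
                   semantic_structure={"isbn", "title", "price"})
```

In the result, `isbn` is an attribute of the root, `title` is a single leaf, `price` has one child per source flow, and `stock` sits under the separate `extra` branch.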
To solve the issues of complexity and efficiency in large-scale service integration, this paper presents a new Web Service integration approach based on componential process construction and the Service Grid. The Service Grid is used to organize Web Services in a normalized service space so that the efficiency of service retrieval can be improved. The integration of semantic and structural data flows enables the integrated service to be used through a single semantic image.