Data warehousing has become an essential part of enterprise data storage. The idea is to let a single infrastructure serve as the data warehouse and to simplify the distribution and manipulation of data for its users and clients. Different techniques are used to achieve this, with the main emphasis on the performance of the data warehouse. Handling enterprise-scale data while still meeting performance targets is not easy. Performance may be the aim, but with the data of an entire enterprise held in one collection, security has to be a critical issue.
This paper proposes a security model that is based mostly on XML and related technologies.
Data Warehouse, Security, XML web services, Data Storage, Internet Enabled Data Warehouse, XML signatures, SOAP, XML Firewall, XML Security Model.
Data warehouses simplify the storage of an entire enterprise's data. They make the stored data easier to manage, and they let users and data manipulators work with it more effectively, without having to deal with the underlying storage formats or with extraction procedures specialized for a single type of data storage and management system. The aspect that is not given its due importance in the design and implementation strategy is security. The security of the stored data, of data in transit, and of data transferred to the users is all very important. With the introduction of Internet access to the data warehouse, security becomes an even more critical issue.
This paper proposes a security model that uses XML[1] and its related technologies to make data warehousing more secure.
Figure 1 shows the proposed model for implementing security in a data warehouse.
The different steps of implementing security are given below:
Security is implemented from the base of the data warehouse, i.e. by using encryption to scramble the data, with asymmetric keys used for encryption and decryption. This means that the size of the data warehouse will increase somewhat, but there will not be a large degradation in performance. The data is stored in scrambled form using triple DES[2], with the encryption and decryption keys generated accordingly.
A firewall is placed around the data warehouse. Traditional firewalls are not good enough because they can only filter at the packet level and cannot examine the contents of messages. A better option is to use XML[1] services in the network, which allows the use of XML firewalls. XML firewalls typically work by examining SOAP[3] message headers. In addition, they can look into the body of the message itself and examine it down to the tag level. XML firewalls can also provide authentication, decryption, and real-time monitoring and reporting.
Messages between the users and the data warehouse are passed using the SOAP[3] protocol over HTTP[4]. SOAP[3] messages are fundamentally one-way transmissions from a sender to a receiver, but they are often combined to implement patterns such as request/response.
The SOAP[3] header is also used to carry information about the services requested, so that the XML firewall can check these headers as a further security precaution.
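As an illustration of the message format described above, the following minimal sketch builds a SOAP 1.1 envelope whose header carries information about the requested service, and prepares (without sending) an HTTP POST for it. The application namespace `urn:example:dw`, the element names `RequestInfo`, `User`, `Service` and `Query`, and the endpoint URL are assumptions made for the example; they are not defined by the proposed model.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DW_NS = "urn:example:dw"  # hypothetical application namespace (assumption)


def build_request(user: str, service: str, query: str) -> bytes:
    """Build a SOAP 1.1 envelope with service information in the header."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("dw", DW_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header: metadata that an XML firewall could inspect.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    info = ET.SubElement(header, f"{{{DW_NS}}}RequestInfo")
    ET.SubElement(info, f"{{{DW_NS}}}User").text = user
    ET.SubElement(info, f"{{{DW_NS}}}Service").text = service

    # Body: the actual request payload.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, f"{{{DW_NS}}}Query").text = query

    return ET.tostring(envelope, encoding="utf-8")


def prepare_http_post(message: bytes, url: str) -> urllib.request.Request:
    """Wrap the envelope in an HTTP POST request object (nothing is sent here)."""
    return urllib.request.Request(
        url,
        data=message,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '""',  # SOAP 1.1 over HTTP expects this header
        },
        method="POST",
    )


message = build_request("alice", "sales_report", "Q1 totals")
request = prepare_http_post(message, "http://dw.example.com/soap")  # hypothetical URL
```

Passing `request` to `urllib.request.urlopen` would perform the actual transmission; the response would then be a second one-way SOAP message, giving the request/response pattern.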
To secure the network, XML signatures[5] are also used. Each user is identified by its signature.
SOAP[3] Security Extensions proposes a standard way to use XML-Signature[5] to sign SOAP[3] 1.1 messages by defining the
The components in the security model are as follows:
The authentication server is used to authenticate the users. It is placed outside the firewall to prevent malicious users from gaining access to the actual storage part of the warehouse.
The firewall represents the protected boundary around the data warehouse and its core components.
XML firewalls can also look into the body of the message itself and examine it down to the tag level. They can tell whether a message is an authorized one, or is coming from an authorized sender, and then take action based on that: for example, blocking traffic, sending it to a secure environment where it can be further examined, or allowing it to pass through. XML firewalls can understand metadata about the service requestor as well as metadata about the operation itself.
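The kind of inspection described here can be sketched as follows. This is only an illustration under stated assumptions: the `Sender` header element, the `urn:example:dw` namespace, the allow-lists, and the three outcomes `pass`, `quarantine` and `block` are hypothetical names chosen for the example, and a real XML firewall would use a hardened parser rather than the standard-library one.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DW_NS = "urn:example:dw"  # hypothetical application namespace (assumption)

# Hypothetical policy: who may send, and which tags may appear in the body.
ALLOWED_SENDERS = {"auth-server"}
ALLOWED_BODY_TAGS = {f"{{{DW_NS}}}Query", f"{{{DW_NS}}}Partition"}


def inspect(message: bytes) -> str:
    """Return 'pass', 'quarantine', or 'block' for a raw SOAP message."""
    try:
        root = ET.fromstring(message)
    except ET.ParseError:
        return "block"  # not well-formed XML

    if root.tag != f"{{{SOAP_NS}}}Envelope":
        return "block"  # not a SOAP envelope

    # Header check: is the message from an authorized sender?
    sender = root.findtext(f"{{{SOAP_NS}}}Header/{{{DW_NS}}}Sender")
    if sender not in ALLOWED_SENDERS:
        return "block"

    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        return "block"

    # Body check, down to the tag level: every descendant must be allow-listed.
    for element in body.iter():
        if element is body:
            continue
        if element.tag not in ALLOWED_BODY_TAGS:
            return "quarantine"  # send for further examination

    return "pass"
```

The three return values correspond to the three actions named above: blocking the traffic, diverting it to a secure environment for further examination, or letting it through.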
Once the authentication server has authenticated the user, it hands the request over to the request handler. If the user is privileged to access the requested data, the request handler sends the request to the extractor; otherwise it sends the request back to the authentication server with an error message.
The extractor is the mechanism that extracts data according to the valid request handed to it by the request handler.
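The routing decision made by the request handler can be sketched as a simple privilege lookup. The privilege table, the `DataRequest` type, and the destination labels below are hypothetical and serve only to illustrate the two branches.

```python
from dataclasses import dataclass

# Hypothetical privilege table: user name -> partitions the user may read.
USER_PARTITIONS = {
    "alice": {"sales", "inventory"},
    "bob": {"sales"},
}


@dataclass(frozen=True)
class DataRequest:
    user: str
    partition: str


def route(request: DataRequest) -> tuple:
    """Return (destination, error_message) for an already authenticated request."""
    allowed = USER_PARTITIONS.get(request.user, set())
    if request.partition in allowed:
        # Within the user's privilege level: forward to the extractor.
        return ("extractor", None)
    # Outside the user's privilege level: return to the authentication server.
    error = f"user {request.user!r} may not access partition {request.partition!r}"
    return ("authentication_server", error)
```

For example, `route(DataRequest("bob", "inventory"))` yields the `authentication_server` destination together with an error message, while a request from `alice` for the same partition is forwarded to the extractor.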
This represents the collection of data for all types of users. The data is divided into different partitions and encrypted using different keys for each partition.
The data is passed to the user in encrypted form. Depending on their privilege level, users are assigned private keys for the partitions to which they have access. The encrypted data is placed under a tag in XML[1] format to be sent to the user.
The users are the customers, partners, and clients using the data warehouse. Each user is issued a username and password, XML signatures[5], and private keys for each partition it has access to.
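The following sketch illustrates per-partition keys and the placement of encrypted data under an XML tag. It makes several loud assumptions. The cipher is a toy SHA-256 keystream XOR, used only because the Python standard library has no block ciphers; it is a stand-in, not triple DES[2], and must not be used to protect real data. The keys here are symmetric, whereas the model speaks of private keys. The tag name `PartitionData` and its `partition` attribute are hypothetical.

```python
import base64
import hashlib
import secrets
import xml.etree.ElementTree as ET


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Stand-in cipher: random nonce followed by plaintext XOR keystream."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


# A different key for each partition (partition names are hypothetical).
PARTITION_KEYS = {name: secrets.token_bytes(32) for name in ("sales", "hr")}


def wrap_partition(name: str, plaintext: bytes) -> bytes:
    """Encrypt data with the partition's key and place it under an XML tag."""
    element = ET.Element("PartitionData", {"partition": name})
    blob = toy_encrypt(PARTITION_KEYS[name], plaintext)
    element.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(element)


def unwrap_partition(xml_bytes: bytes, key: bytes) -> bytes:
    """User side: pull the payload out of the tag and decrypt it."""
    element = ET.fromstring(xml_bytes)
    return toy_decrypt(key, base64.b64decode(element.text))
```

A user who holds only the key for `sales` can recover data wrapped by `wrap_partition("sales", ...)`, but obtains unreadable bytes when trying that key on data from another partition.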
The Log server stores all the information about the processes taking place.
The working steps of the proposed security model are given below:
As an initial step, the user presents its issued username and password to the authentication server.
If the password is correct, the authentication server sends a document to be digitally signed by the user with its XML signature[5]. Part of the document is digitally signed by the authentication server itself to confirm its own identity. If the password is incorrect, the server generates an error message and writes it to the log file.
The user signs the document after verifying the digital XML-Signature[5] of the Authentication Server itself.
The authentication server verifies the user's XML signature[5] and checks the time stamp. If both are valid, it sends the request to the request handler inside the firewall, along with the log information placed in the SOAP[3] message.
The firewall checks the information in the header to verify that the message is from the authentication server itself and allows only the request part to be transmitted to the request handler.
The request handler checks whether the request is within the privilege level of the user. If so, it sends the request to the extractor. Otherwise it generates an error and returns the request to the firewall, which in turn attaches the error information and sends it back to the authentication server.
The Extractor extracts the required data from the data warehouse, which is in encrypted form.
The Extractor sends the data back to the Request Handler.
The Request Handler then creates a SOAP[3] message placing the required data in it and sends it to the firewall.
The firewall sends the data back to the authentication server along with a log of the requested information and the log is sent to the log server.
The authentication server sends the required information to the user.
The user then extracts the required data, which is still in scrambled form, and uses its private keys to decrypt it.
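The signing and time-stamp checks in the first steps of this procedure can be sketched as below. This is a simplified stand-in, not an implementation of XML-Signature[5]: it computes an HMAC-SHA256 over the canonical form of the document, using shared secrets in place of the key pairs a real deployment would use. The `Challenge` element, its attributes, the key values, and the five-minute freshness window are assumptions made for the example, and the initial password check is omitted.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

MAX_AGE_SECONDS = 300  # hypothetical freshness window for the time stamp


def sign(xml_text: str, key: bytes) -> str:
    """Sign the canonical form of the document, so formatting differences do not matter."""
    canonical = ET.canonicalize(xml_text)  # available since Python 3.8
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(xml_text: str, key: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(xml_text, key), signature)


def make_challenge(user: str, now: float) -> str:
    """Document the server sends to the user to be signed."""
    root = ET.Element("Challenge", {"user": user, "issued": str(int(now))})
    return ET.tostring(root, encoding="unicode")


def server_accepts(challenge: str, user_signature: str, user_key: bytes, now: float) -> bool:
    """Server side: the time stamp must be fresh and the user's signature valid."""
    issued = int(ET.fromstring(challenge).get("issued"))
    fresh = 0 <= now - issued <= MAX_AGE_SECONDS
    return fresh and verify(challenge, user_key, user_signature)


# Hypothetical keys standing in for the server's and the user's signing keys.
SERVER_KEY, USER_KEY = b"server-secret", b"user-secret"

challenge = make_challenge("alice", now=1000.0)
server_signature = sign(challenge, SERVER_KEY)        # server signs its part
assert verify(challenge, SERVER_KEY, server_signature)  # user checks the server first
user_signature = sign(challenge, USER_KEY)            # then the user signs
accepted = server_accepts(challenge, user_signature, USER_KEY, now=1100.0)
```

Canonicalizing before signing is the design choice worth noting: two serializations of the same document that differ only in attribute order or empty-element syntax produce the same signature, while a stale time stamp or a wrong signature is rejected.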
The security of a data warehouse is a critical issue. The added security may increase costs and decrease performance by a fraction, but the fact remains that it is better to be safe than sorry. A single technology that implements security while meeting the required performance target is yet to be devised, but the existing technologies can be used in combination to produce satisfactory results. XML[1], with the proper extensions to its different technologies, looks like a likely candidate for solving the security problems in data warehouses, but it is still a long way from perfection.