Trifox, Inc. -- Database Access Experts

Data Replication in VORTEX

Overview

This paper describes Trifox's solution for database-independent data replication, a product called VORTEXreplicator.

VORTEXreplicator works with VORTEXserver's channel, a virtual database interface that supports all the major databasess. Used in conjunction with VORTEXclient and VORTEXaccelerator (Transaction Multiplexor), user sites have found that performance often exceeds that of individual database vendors' own solutions.

The design and implementation of VORTEXreplicator is based on the notion that the many databases currently employed at a site can, and should, be integrated into the system on an as-needed basis. Also, existing applications, in particular table naming conventions, must remain intact. VORTEXreplicator meets both these criteria, functioning independently from existing database applications and allowing incremental additions of databases to its replication scheme.

Background

Over the years, database vendors have been adding distributed features to their flagship kernel products. However, the difficult maintenance and poor performance resulted in very few production installations. Two-phase commit was designed to overcome these difficulties. This approach ensured that all the transactions across a network were coordinated so participating nodes either all committed or rolled back, even if a network failure or a system failure on any node occurs during the process. This technology proved difficult to implement causing vendors to be late getting to market.

In the meantime, large corporations began to deploy production databases based on relational technology. To complicate the situation further, many organizations adopted different vendors' databases in different departments. Although, in the era of "rightsizing" it also becomes obvious that data across various types of machines must be shared, true two-phase commits across heterogeneous databases are not likely in the near future.

Data Replication

Data replication provides relief to the dilemma. By replicating data across several databases instead of keeping one copy of the data distributed over many databases, it can avoid many of the database-specific issues.

Several major database vendors have announced that they support data replication. Some have even stated that they can replicate data onto competitors' databases via gateways. This claim is mainly considered a "checkoff item" and in reality is a painfully slow exercise. A distributed system relying on data replication from a database vendor typically consists of databases from that vendor.

As delivered by Trifox, the replication is executed transparently, (that is, the application performs a single update which in turn results in one or more actual updates behind the scenes), further simplyfying the process.

During data retrieval, alternate readers (fail-over) provide service so that if a database is unavailable, the data can be fetched from an alternate database. Again, this activity is transparent to the client application.

VORTEXreplicator

The heart of VORTEXreplicator, is the routing tables. These tables are regular SQL tables that are created and maintained in any of the supported databases. Several sets of routing tables may be created in any number of the databases. VORTEXreplicator consults an initialization file to determine the location of the router tables. Each line in the initialization file corresponds to one set of router tables. If VORTEXreplicator cannot connect using the first line in an initialization file, it tries connecting using the next line and so on.

There are four (4) router tables:

» VORTEX_ROUTER_TABLE.
» VORTEX_ROUTER_DB. Contains the logical name for each database available to the router. It also contains four (4) flags for each database:

OFFLINE -- If accessible or not.
DEFAULT -- Database to use if table is not known to VORTEXreplicator.
READ -- Select allowed.
WRITE -- Insert/Update/Delete allowed.

» VORTEX_ROUTER_USER. Contains the connect string for each logical user and database pair.
» VORTEX_ROUTER_TABLE. Contains an entry for each table that is known to VORTEXreplicator. Each table can have multiple read, master, and slave databases associated with it.
» VORTEX_ROUTER_LOG. All errors are logged in this table. All vital information such as error code, error message, function code, time, etc... is kept.

Operation

VORTEXreplicator allows SELECT, INSERT, UPDATE, and DELETE statements as well as COMMIT and ROLLBACK. For obvious reasons it does not allow DDL commands.

On a SELECT statement the first table in the FROM list becomes the basis for the router table lookup. If the table is not found, the SELECT statement passes to the default database, as specified in VORTEX_ROUTER_DB. If the table is found, the SELECT passes to the primary read database. The router tries each read database in the list until one succeeds. Each failure is logged and if all read databases fail,VORTEXreplicator returns an error to the application.

The lookup for INSERT, UPDATE, and DELETE statements is similar to that of the SELECT statement. If the table is not found the statement is passed to the default database. If the table is found the following occurs:

» The statement is passed to each database in the MASTER list with each failure being logged.
» If one or more of the statements succeeded in the previous MASTER step the statement is passed to each database in the SLAVE list with each failure being logged.

If none of the statements in step 1 succeeds, the last error in step 1 is returned to the application.

Summary

VORTEXreplicator provides an easy-to-implement solution to a complex problem. If your environment includes heterogenous databases and you need that information to be shared across the network, VORTEXreplicator provides reliable performance, even in a high-traffic system.