Oracle Warehouse Builder (OWB) is at the end of its lifecycle. The ETL tool will not be supported by Oracle in the future. But many customers have built data warehouses with OWB. What can they do with their existing OWB projects? A Trivadis white paper describes different migration scenarios.
As explained in Oracle’s ODI and OWB Statement of Direction, Oracle Warehouse Builder will not be enhanced anymore after version 11g release 2 and is replaced by Oracle Data Integrator (ODI). OWB 188.8.131.52 is the final release of the tool, and its support is available until end of lifecycle of Oracle 12c Release 1. So, what can we do with our existing OWB projects in the future? What options do we have to reuse the existing ETL logic with other technologies in the future? With Oracle Data Integrator 12c, a powerful successor is available, and Oracle recommends to use this tool instead of OWB. This is definitely a good, but not the only option. My colleague Stanislav Lando and me described several scenarios for migrating OWB projects to different tools in a Trivadis white paper. Because this white paper is available in German only, here a short summary of the migration options.
Migration to Oracle Data Integrator
The easiest way to move an existing OWB project to a new technology is to use Oracle Data Integrator. This is the recommended migration path by Oracle. With ODI 12c, many concepts and features of OWB were added to ODI. For OWB developers, the change to ODI 12c is quite easy.
The OWB Migration Utility is used to migrate existing OWB projects and import them into an ODI master repository. It is a command line tool and allows to migrate most – but not all – objects from OWB to ODI. It is recommended to run several test migrations (called “dry runs”) before the real migration is started. Many issues can be solved before the migration, when the existing OWB mappings are adapted. This helps to reduce the manual post-editing steps afterwards. The migration itself is simply started by calling a script. The list of migrated objects as well as the errors and warnings of the migration can be reviewed in a log file.
Usually, not all objects and mappings can be migrated automatically. After running the migration utility, several post-processing steps are required. The effort of these additional work depends on the complexity of the OWB mappings and the operator used in the existing projects. Some objects cannot be migrated, for example dimensional OWB objects (dimensions, cubes), user defined PL/SQL procedures or process flows. For projects using these types of objects, a manual post-processing will definitely be necessary. For projects using just the standard operators of OWB, an automated migration with only few manual changes is possible.
Oracle Data Integrator is part of the Oracle Fusion Middleware and therefore not included in the database license. For customers that used only the basic functionality of OWB, that means additional license costs. For many customers this is a reason to prospect for other solutions.
Reuse OWB Generated PL/SQL Code
One possible solution is to reuse the PL/SQL code generated by OWB. For each mapping, Oracle Warehouse Builder generates a PL/SQL package. The OWB client is not required to run these packages, but the generated code contains many procedure and function calls of the package OWBSYS.WB_RT_MAPAUDIT (and some other packages) in the OWB runtime repository. Therefore, an OWB installation is typically needed to run the mappings.
One possibility is to write your own, simplified version of the package WB_RT_MAPAUDIT. It must contain the same signatures for all procedures and functions as the original package, but not all of the functionality has to be implemented. The package is only needed to be able to run the existing OWB mappings. This approach is only sufficient, as long as no mapping changes are required. Without an OWB installation, it is quite hard to change the existing mappings. Manual enhancements in the generated PL/SQL code are not really funny because the packages contain a lot of overhead for auditing, error handling, parameter settings, etc. Although these code lines are important and useful in case of an OWB implementation, they make the generated code extensive and hard to read.
If code changes of the existing mappings are planned, it is probably a better approach to extract only the relevant parts of the generated packages and save them in your own packages. For set-based OWB mappings, this is typically an INSERT or MERGE statement that can be extracted using the Intermediate Results functionality of OWB. Generate the mapping in OWB Design Center, select the attribute group of the target table and select Intermediate in the generated results. The main SQL statement of the mapping can now be copied to any other package and changed there like manually written PL/SQL code.
For a couple of OWB mappings, this way of extracting the SQL statements of the mappings is convenient, but if you have hundreds or thousands of mappings, you probably don’t want to extract all the code manually. In this case, an automated way must be provided. With a source code analyzer, for example the Trivadis PL/SQL Analyzer, it is possible to parse and extract the PL/SQL code for further analysis. Of course, this approach is extensive, but for large OWB projects with a lot of mappings it is more efficient than a manual extraction of the code parts.
Data Warehouse Automation Tools
In many data warehouses, large parts of the ETL processes are standardized. Standard transformations for data extraction, data cleansing, error handling, delta detection, surrogate key and code lookups or building slowly changing dimensions are similar for all tables. In such an environment, it does not make sense to manually migrate hundreds or thousands of mappings with a similar logic.
For standardized data warehouses, it is possible and recommended to use DWH automation tools. Such tools are able to generate table structures and ETL processes for the standard transformations. Let’s explain this approach with the Trivadis tool biGenius, a metadata-driven data warehouse automation tool. In contrast to OWB, ODI or other ETL/ELT tools, we don’t create individual tables for all DWH layers and implement the ETL processes to load these tables. In biGenius, logical entities, dimensions and facts are defined in a metadata repository. The metadata also contains information for each attribute about the related source attribute(s), the relationships to other entities, types of slowly changing dimensions, aggregation rules, cleansing rules, etc. All this metadata is required to generate the corresponding tables for each DWH layer (e.g. Staging Area, Cleansing Area, Core and Data Marts) and the ETL processes between the DWH layers. The code to be generated is defined in templates with placeholders, similar to the knowledge modules in ODI. In case of the Oracle implementation of biGenius, the tool generates SQL scripts to create tables and views, and PL/SQL packages for the ETL processes of all DWH layers.
A data warehouse automation tool is an option for OWB projects that contain mainly standardized ETL logic. In this case, it is recommended to define or import the metadata based on the existing data warehouse and generate all ETL processes with the automation tool. This is usually easier and more efficient than to migrate each of the existing OWB mappings. Of course, there will always be some special cases that have to be adapted manually. In biGenius, such cases can be handled with additional templates that contain the special business logic. Of course, this is only feasible if the DWH architecture uses any standards.
Migration to Other ETL Tools
Some customers are replacing OWB with ETL tools of other software vendors. There can be different reasons for this decision, for example license costs or existing knowhow of an ETL tool within the company. Of course, the migration effort to a third party ETL tool is higher than a migration to ODI, because the OWB migration utility cannot be used. A general migration approach is not possible because this depends on the functionality of the ETL tool that will be used in the future. The table structures of source and target tables are usually not an issue. They can be imported from the data dictionary into almost any ETL tool. But the ETL processes are harder to migrate. In the worst case, all the ETL mappings must be implemented from scratch.
A pragmatical way for such a migration is to extract the SQL statements with Intermediate Results in OWB (as described above) and to reuse them in the new ETL tool. Some ETL tools allow to run SQL queries in their mappings or jobs. For example, the OpenSource ETL tool Talend provides a component tOracleRow which allows to execute an SQL statement, e.g. an INSERT INTO … SELECT statement. The component name tOracleRow is a bit misleading, because the statement is executed set-based. This workaround allows a simple migration of set-based OWB mappings, but of course it is only a workaround. The ETL tool is used to run SQL statements in a Talend job, and all the rest of the Talend functionality is not used in this case.
A similar approach can be used for Informatica PowerCenter, one of the most common ETL tools in the market. PowerCenter provides a functionality called SQL Override (some ETL developers love it, others hate this feature). SQL Override allows to “replace” a source table by adding a SELECT statement in the source qualifier of the PowerCenter mapping. The SELECT statement can be extracted from the OWB mapping with Intermediate Results, too. In this case, only the Incoming part has to be copied. The PowerCenter mapping executes the SQL statement and writes it into a target table. No other functionality of PowerCenter is used.
Because most ETL tool provide a possibility to execute SQL statements, at least set-based OWB mappings can be migrated with a similar approach. But be careful: This kind of “migration” to another ETL tool allows to execute the existing transformation logic in a new ETL tool, but it is not a good practice for using these tools in this way. For future changes, the “hidden” SQL statements in the ETL jobs or mappings are hard to understand and to maintain, and all of the features of the ETL tool, especially impact analysis, cannot be used. Therefore this approach is only recommended as a temporary solution. The final solution is to replace all OWB mappings with new ETL jobs or mappings that use the recommended operators and components of the target ETL tool. And this is a lot of manual work.
Which Is The Best Approach?
Unfortunately, the answer to this question is the typical consulting answer: “It depends”. Each of the solutions described above have advantages and disadvantages. The decision of how to replace Oracle Warehouse Builder depends on several criteria, for example project size, complexity, data amount, performance, maintainability, technology of source and target systems and of course license costs. Therefore you have to decide which approach is the best for your OWB projects. Maybe, the migration scenarios described in this blog post and in the Trivadis white paper may help you with this decision.