In Oracle Autonomous Data Warehouse Cloud, External Tables can be used to read files from the cloud-based Object Storage. But take care to do it the official way, otherwise you will see a surprise, but no data.
Together with my Trivadis colleague Christian Antognini, I currently have the opportunity to do several tests in the Autonomous Data Warehouse Cloud (ADWC). We are checking out the features and the performance of Oracle’s new cloud solution for data warehouses. For the kick-off of this project, we met in an idyllic scenery in the garden of Chris’ house in Ticino, the southern part of Switzerland. So, I was able to work on a real external table.
Testing External Tables in the Cloud on an external table with view to the clouds.
A typical way to load data files into a data warehouse is to create an External Table for the file and then read the data from this table into a stage table. In ADWC, the data files must first copied to a specific landing zone, a Bucket in the Oracle Cloud Infrastructure Object Storage service. The first steps to do this are described in the Oracle Autonomous Data Warehouse Cloud Service Tutorial. The Oracle Cloud Infrastructure command line interface CLI can also be used to upload the files.
The tutorial uses the procedure DBMS_CLOUD.copy_data to load the data into the target tables. The procedure creates a temporary external table in the background and drops it at the end. Another procedure DBMS_CLOUD.create_exernal_table is available to create a reusable External Table on a file in the Object Storage. But is it possible to create an External Table manually, too? To check this, I extracted the DDL command of the table CHANNELS (created with DBMS_CLOUD.create_extrenal_table):
Then, I created a new table CHANNELS_2 with exactly the same definition, only with a different name. It seems to be obvious that both tables should contain the same data. But this is not the case, table CHANNEL_2 returns no data:
First, I was confused. Then I thought it has to do with missing privileges. Finally, I assumed to be dazed because of the heat in Chris’ garden. But the reason is a different one: CHANNELS_2 is not an External Table, but a normal heap-organized table. Even it was created with an ORGANIZATION EXTERNAL clause! Extracting the DDL command shows what happened:
What is the reason for this behavior? The explanation can be found in Appendix B: Autonomous Data Warehouse Cloud for Experienced Oracle Database Users (the most interesting part of the ADWC documentation): Most clauses of the CREATE TABLE command are either ignored or not allowed in the Autonomous Data Warehouse. In ADWC, you cannot manually define physical properties such as tablespace name or storage parameters. No additional clauses for logging, compression, partitioning, in-memory, etc. are allowed. They are either not supported in ADWC (like Partitioning), or they are automatically handled by the Autonomous Database. According to documentation, creating an External Table should not be allowed (i.e. return an error message), but instead, the clause is just ignored. The same happens for index-organized tables, by the way.
External Tables are supported (and even recommended) in the Autonomous Data Warehouse Cloud, but they cannot be created manually – we are in an Autonomous Database.
If you follow the steps explained in the documentation and use the provided procedures in package DBMS_CLOUD, everything works fine. If you try to do it the “manual way”, you will get a non-expected behavior and probably loose a lot of time to find your data in the files.
The PL/SQL package DBMS_CLOUD contains many additional useful procedures for file handling in the Cloud, but not all of them are documented. A complete reference of all its procedures with some examples can be found in Christian Antognini’s blog post DBMS_CLOUD Package – A Reference Guide.