Wednesday, April 15, 2020

Inline External Tables in Oracle Database 18c

Oracle Database 18c allows you to access data in flat files using an inline external table defined in a SELECT statement.

Oracle Database 18c, Oracle Database Learning, Oracle Database Tutorial and Material

This is a step further than the ability to override external table parameters from a query introduced in Oracle Database 12c Release 2 (12.2).

Setup


In order to demonstrate an inline external table we need some data in flat files. The following code spools out four CSV files with 1000 rows each.

CONN / AS SYSDBA
ALTER SESSION SET CONTAINER = pdb1;

SET MARKUP CSV ON QUOTE ON
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 0

SPOOL /tmp/gbr1.txt
SELECT 'GBR',
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id <= 2000
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/gbr2.txt
SELECT 'GBR',
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id BETWEEN 2000 AND 3999
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/ire1.txt
SELECT 'IRE',
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id <= 2000
AND    rownum <= 1000;
SPOOL OFF

SPOOL /tmp/ire2.txt
SELECT 'IRE',
       object_id,
       owner,
       object_name
FROM   dba_objects
WHERE  object_id BETWEEN 2000 AND 3999
AND    rownum <= 1000;
SPOOL OFF

SET MARKUP CSV OFF
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 14

Create two directory objects to access these files. In this case both are pointing to the same directory, but it will still allow us to see the syntax.

CONN / AS SYSDBA
ALTER SESSION SET CONTAINER = pdb1;

CREATE OR REPLACE DIRECTORY tmp_dir1 AS '/tmp/';
GRANT READ, WRITE ON DIRECTORY tmp_dir1 TO test;

CREATE OR REPLACE DIRECTORY tmp_dir2 AS '/tmp/';
GRANT READ, WRITE ON DIRECTORY tmp_dir2 TO test;

CONN test/test@pdb1

Inline External Table


Oracle Database 18c, Oracle Database Learning, Oracle Database Tutorial and Material

An inline external table allows the external table definition to be placed in the FROM clause of a SQL statement using the EXTERNAL clause, so there is no need for an external table to be explicitly created.

SELECT country_code, COUNT(*) AS amount
FROM   EXTERNAL (
         (
           country_code  VARCHAR2(3),
           object_id     NUMBER,
           owner         VARCHAR2(128),
           object_name   VARCHAR2(128)
         )
         TYPE oracle_loader
         DEFAULT DIRECTORY tmp_dir1
         ACCESS PARAMETERS (
           RECORDS DELIMITED BY NEWLINE
           BADFILE tmp_dir1
           LOGFILE tmp_dir1:'inline_ext_tab_%a_%p.log'
           DISCARDFILE tmp_dir1
           FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
           MISSING FIELD VALUES ARE NULL (
             country_code,
             object_id,
             owner,
             object_name 
           )
        )
        LOCATION ('gbr1.txt', 'gbr2.txt', 'ire1.txt', 'ire2.txt')
        REJECT LIMIT UNLIMITED
      ) inline_ext_tab
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE  AMOUNT
------------ ----------
GBR                2000
IRE                 2000

SQL>

In the following example we use a different directory object, and specify a different list of files in the LOCATION clause. This, not surprisingly gives us a different result.

SELECT country_code, COUNT(*) AS amount
FROM   EXTERNAL (
         (
           country_code  VARCHAR2(3),
           object_id     NUMBER,
           owner         VARCHAR2(128),
           object_name   VARCHAR2(128)
         )
         TYPE oracle_loader
         DEFAULT DIRECTORY tmp_dir2
         ACCESS PARAMETERS (
           RECORDS DELIMITED BY NEWLINE
           BADFILE tmp_dir2
           LOGFILE tmp_dir2:'inline_ext_tab_%a_%p.log'
           DISCARDFILE tmp_dir2
           FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
           MISSING FIELD VALUES ARE NULL (
             country_code,
             object_id,
             owner,
             object_name 
           )
        )
        LOCATION ('gbr1.txt', 'gbr2.txt')
        REJECT LIMIT UNLIMITED
      ) inline_ext_tab
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE    AMOUNT
------------ ----------
GBR                2000

SQL>

The inline external table definition is a little ugly, so you may prefer to put it into a WITH clause if you are planning to join it to other tables.

WITH inline_ext_tab AS (
  SELECT *
  FROM   EXTERNAL (
           (
             country_code  VARCHAR2(3),
             object_id     NUMBER,
             owner         VARCHAR2(128),
             object_name   VARCHAR2(128)
           )
           TYPE oracle_loader
           DEFAULT DIRECTORY tmp_dir2
           ACCESS PARAMETERS (
             RECORDS DELIMITED BY NEWLINE
             BADFILE tmp_dir2
             LOGFILE tmp_dir2:'inline_ext_tab_%a_%p.log'
             DISCARDFILE tmp_dir2
             FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
             MISSING FIELD VALUES ARE NULL (
               country_code,
               object_id,
               owner,
               object_name 
             )
          )
          LOCATION ('gbr1.txt', 'gbr2.txt')
          REJECT LIMIT UNLIMITED
        )
)
SELECT country_code, COUNT(*) AS amount
FROM   inline_ext_tab 
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE    AMOUNT
------------ ----------
GBR                2000

SQL>

Security Implications


Access to directory objects should be controlled carefully, regardless of if you use external tables or not.

As pointed out by Pete Finnigan, you need to consider the security implications of this functionality. At the time you define the external table you have made a conscious decision about the directory objects and file locations you are going to use. The ability to alter these parameters at runtime mean you need to pay special attention to the directory objects that are available to the user, or risk a security issue.

Create a second test user.

CONN / AS SYSDBA
ALTER SESSION SET CONTAINER = pdb1;

CREATE USER test2 IDENTIFIED BY test2;
GRANT CREATE SESSION TO test2;

Create the basic external table from the first example in the initial test user, and grant access to the new test user.

CONN test/test@pdb1

DROP TABLE tab_ext;

CREATE TABLE tab_ext (
  country_code  VARCHAR2(3),
  object_id     NUMBER,
  owner         VARCHAR2(128),
  object_name   VARCHAR2(128)
)
ORGANIZATION EXTERNAL (
  TYPE oracle_loader
  DEFAULT DIRECTORY tmp_dir1
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    BADFILE tmp_dir1
    LOGFILE tmp_dir1:'part_tab_ext_%a_%p.log'
    DISCARDFILE tmp_dir1
    FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    MISSING FIELD VALUES ARE NULL (
      country_code,
      object_id,
      owner,
      object_name
    )
  )
  LOCATION ('gbr1.txt', 'gbr2.txt', 'ire1.txt', 'ire2.txt')
)
REJECT LIMIT UNLIMITED;

GRANT SELECT ON tab_ext To test2;

Now attempt to use the external table from the new test user.

CONN test2/test2@pdb1

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) AS amount
FROM   test.tab_ext
GROUP BY country_code
ORDER BY 1;
            *
ERROR at line 2:
ORA-06564: object TMP_DIR1 does not exist

SQL>

As we can see, the external table is run in the context of the current user, which doesn't have access to the directory objects. Let's grant access and try again.

CONN / AS SYSDBA
ALTER SESSION SET CONTAINER = pdb1;

GRANT READ, WRITE ON DIRECTORY tmp_dir1 TO test2;


CONN test2/test2@pdb1

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) AS amount
FROM   test.tab_ext
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE  AMOUNT
------------ ----------
GBR 2000
IRE       2000

SQL>

This demonstrates the basic external table has now worked. We can see below we are able to alter the runtime parameters from the new test user.

SELECT country_code, COUNT(*) AS amount
FROM   test.tab_ext EXTERNAL MODIFY (
                      ACCESS PARAMETERS (
                        BADFILE tmp_dir1:'part_tab_ext_%a_%p.bad'
                        LOGFILE tmp_dir1
                        NODISCARDFILE
                      )
                      LOCATION ('gbr1.txt', 'ire1.txt', 'ire2.txt')
                      REJECT LIMIT 5
                    )
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE    AMOUNT
------------ ----------
GBR               1000
IRE                2000

SQL>

If another directory object is available to the new test user, we can also alter the directory references.

CONN / AS SYSDBA
ALTER SESSION SET CONTAINER = pdb1;

GRANT READ, WRITE ON DIRECTORY tmp_dir2 TO test2;

CONN test2/test2@pdb1

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) AS amount
FROM   test.tab_ext EXTERNAL MODIFY (
                      DEFAULT DIRECTORY tmp_dir2
                      ACCESS PARAMETERS (
                        BADFILE tmp_dir2:'part_tab_ext_%a_%p.bad'
                        LOGFILE tmp_dir2
                        NODISCARDFILE
                      )
                      LOCATION ('gbr1.txt', 'ire1.txt', 'ire2.txt')
                      REJECT LIMIT 5
                    )
GROUP BY country_code
ORDER BY 1;

COUNTRY_CODE   AMOUNT
------------   ----------
GBR                 1000
IRE                  2000

SQL>

So you must make sure you control what directory objects are visible to the user accessing the external table, or you could find yourself in a position where you are exposing data you shouldn't be.

Related Posts

0 comments:

Post a Comment