Performance Tips: PL/SQL Functions in SQL Queries

PL/SQL functions in SQL statements may cause major performance problems. If it is not possible to avoid these calls, there are several solutions to improve the performance. But not for all situations.

In my last blog post, I wrote about the impact of function calls in WHERE conditions to the performance of a SQL query. The best way to solve such issues is to avoid function calls. But there are also other solutions. With function-based indexes, virtual columns or extended statistics, it is possible to reduce or even avoid bad performance.

All these solutions work for PL/SQL functions, too. But there are some restrictions and preconditions you have to take care about. To explain the different options, I will use a simple example of a PL/SQL function and use it in SQL query.

For our example, I want to select different price categories in a table ORDER_ITEMS. Depending on the quantity and unit price of an order item, the price category is either low, medium or high. The (simple) business logic is defined in this PL/SQL function:


CREATE OR REPLACE FUNCTION f_price_category

   (in_quantity order_items.quantity%TYPE,

    in_price order_items.price_per_unit%TYPE) RETURN VARCHAR2

IS

   v_total_price order_items.price_per_unit%TYPE;

   v_category VARCHAR2(6);

BEGIN

   v_total_price := in_quantity * in_price;

   IF v_total_price < 100 THEN

      v_category := ‘low’;

   ELSIF v_total_price < 1000 THEN

      v_category := ‘medium’;

   ELSE

      v_category := ‘high’;

   END IF;

   RETURN v_category;

END f_price_category;

/
 


Let’s use this function in the following SQL query. As you can see, the query on table ORDER_ITEMS (828270 rows) runs for about 10 seconds:


SELECT COUNT(*), SUM(quantity)

  FROM order_items

 WHERE f_price_category(quantity, price_per_unit) = ‘high’;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

Elapsed: 00:00:10.275
 


Deterministic Functions

Can we improve the query performance with the features described in the last blog post? Let’s try to create a function-based index for the PL/SQL function:


CREATE INDEX ordi_price_cat_fbi

ON order_items(f_price_category(quantity, price_per_unit));

 

ORA-30553: The function is not deterministic
 


What does the error message ORA-30553 mean? A function is deterministic when it always returns the same result for the same input values. This is the case for most SQL functions, but it cannot be guaranteed for a self-written PL/SQL function. To tell Oracle that our function is deterministic, we must add the keyword DETERMINISTIC in the function declaration. This is required for all functions that are used in a function-based index, in a virtual column or for extended statistics.

 

CREATE OR REPLACE FUNCTION f_price_category

   (in_quantity order_items.quantity%TYPE,

    in_price order_items.price_per_unit%TYPE) RETURN VARCHAR2

DETERMINISTIC

IS

 

Be careful with defining deterministic functions! The keyword DETERMINISTIC is only a declaration, i.e. the PL/SQL compiler does not prove whether the logic of the function is deterministic or not. It is your responsibility as a developer to guarantee that the function is really deterministic. If it uses for example global package variables, calls of SYSDATE or contains SQL queries, the function is not deterministic. If you declare it with the keyword anyway, this can cause wrong results of your queries.

In our example, the function is deterministic. So, after adding the additional keyword, we can create the function-based index:

 

CREATE INDEX ordi_price_cat_fbi

ON order_items(f_price_category(quantity, price_per_unit));

 

Index ORDI_PRICE_CAT_FBI created.

 

Elapsed: 00:00:22.358

 

SELECT COUNT(*), SUM(quantity)

  FROM order_items

 WHERE f_price_category(quantity, price_per_unit) = ‘high’;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

 

Elapsed: 00:00:00.291

 


Creating the index takes a while, but after that the query with the function call runs much faster than before. This is the case because the PL/SQL function is not called anymore for each row. Instead, the pre-calculated result can be read from the index. It is obvious now why the function must be deterministic. Otherwise, it would not be guaranteed that the result in the index is the same as the result on the execution time of the query.

With a function-based index, the response time of our query is much faster than before. But we have to be aware that the index creation takes time and disk space, and an index range scan is not always a good choice. Therefore, we drop the function based index and try to find the root cause for the performance issue.

 

DROP INDEX ordi_price_cat_fbi;

 

 

Context Switches between PL/SQL and SQL

The main reason for the bad performance of our query are context switches between the SQL engine and the PL/SQL runtime engine of the Oracle database. Whenever a PL/SQL function is called in a SQL statement, or a SQL query is executed in a PL/SQL block, this causes a so-called “context switch”. A good explanation of this behaviour you can find in this blog post of Steven Feuerstein. Although, the performance of context switches is much better nowadays than in former Oracle versions, the problem still exists. The good news: in the current releases of the Oracle databases (since Oracle 12c), there are different ways to avoid context switches.

User Defined Function

One option is to create a User Defined Function (UDF). The PL/SQL function still looks the same, but the additional PRAGMA UDF tells the compiler that the function is mainly used in SQL statements.

 

CREATE OR REPLACE FUNCTION f_price_category

   (in_quantity order_items.quantity%TYPE,

    in_price order_items.price_per_unit%TYPE) RETURN VARCHAR2

IS

   PRAGMA UDF; 

   v_total_price order_items.price_per_unit%TYPE;

   v_category VARCHAR2(6);

BEGIN

   …

 


With PRAGMA UDF, the query is almost fast as with the function-based index before, but without the overhead of the index. Like the original query, it performs a full table scan on table ORDER_ITEMS, but in 0.3 instead of 10 seconds. In other words: 97% of the time in the first query was spent with context switches.

 

Elapsed: 00:00:00.368

 


Scalar SQL Macro

When you already work with Oracle 21c, you can use another feature to improve the performance of self-written functions: SQL Macros.

Unlike a PL/SQL function, a SQL macro is not called during the execution of a SQL statement, but replaced in the SQL code at parse time. The executed statement is therefore pure SQL code, so no context switch is needed anymore. There are two types of SQL macros:

  • SQL macros for tables can be used in the FROM clause of a SQL query. They were introduced with Oracle 21c and backported to Oracle 19c.
  • SQL macros for scalar expressions can be used in SELECT, WHERE, GROUP BY and ORDER BY clauses. They can be used since Oracle 21.

For our purpose, a Scalar SQL Macro is the suitable variant. Because the replaced part of the SQL statement must be pure SQL code, we have to rewrite our business logic. Instead of an IF … THEN … ELSE statement in PL/SQL, we write a CASE statement that is returned as a string in the SQL macro:

 

CREATE OR REPLACE FUNCTION f_price_category_macro

   (in_quantity VARCHAR2,

    in_price VARCHAR2) RETURN VARCHAR2 SQL_MACRO (SCALAR)

IS

   v_category VARCHAR2(4000);

BEGIN

   v_category := ‘CASE

                     WHEN in_quantity * in_price < 100 THEN ”low”

                     WHEN in_quantity * in_price < 1000 THEN ”medium”

                     ELSE ”high”

                  END’;

   RETURN v_category;

END f_price_category_macro;

/

 

With the scalar SQL macro, our example query runs fast. No context switch between SQL and PL/SQL takes place, and the response time is about the same as for the User Defined Function:

 

SELECT COUNT(*), SUM(quantity)

  FROM order_items

 WHERE f_price_category_macro(quantity, price_per_unit) = ‘high’;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

 

Elapsed: 00:00:00.369
 


PL/SQL Function with Embedded SQL

The business department is now happy with the performance of the query, but they want to have more flexibility. Instead of a hard-coded function or SQL macro for the calculation of the price category, they want to have a more dynamic solution. Th price categories are defined in a configuration table CATEGORIES:

 

SELECT * FROM categories;

 

CATEGORY  MIN_PRICE  MAX_PRICE

——– ———- ———-

low               0        100

medium          100       1000

high           1000     999999

 


For this purpose, we can change the source code of the PL/SQL function. Instead of an IF … THEN … ELSE statement, it contains a lookup on table CATEGORIES, implemented with an embedded SQL query in the function:

 

CREATE OR REPLACE FUNCTION f_price_category

   (in_quantity order_items.quantity%TYPE,

    in_price order_items.price_per_unit%TYPE) RETURN VARCHAR2

IS

   PRAGMA UDF;

   v_total_price order_items.price_per_unit%TYPE;

   v_category VARCHAR2(6);

BEGIN

   v_total_price := in_quantity * in_price;

 

   SELECT category

     INTO v_category

     FROM categories

    WHERE v_total_price >= min_price

      AND v_total_price <  max_price;

 

   RETURN v_category;

END f_price_category;

/

 

The result of the query is the same, but the response time is horrible:

 

SELECT COUNT(*), SUM(quantity)

  FROM order_items

 WHERE f_price_category(quantity, price_per_unit) = ‘high’;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

 

Elapsed: 00:01:31.045

 

What is the problem here? Although the function contains a PRAGMA UDF, there are still many context switches. For each of the 828270 rows in the ORDER_ITEMS table, a context switch for the SQL statement occurs, and the lookup on the CATEGORIES table must be executed. This leads to a dramatic increase in response time of the query.

How can we solve this problem? A function-based index is not possible, because the function is not deterministic anymore. An index on the CATEGORIES table does not help, because the table is very small. Something like a PRAGMA UDF within the SQL call is not available. There is no simple solution for this issue.

I remember such a situation in a customer project about 3 years ago: I was working on the balcony (like now) and did some performance tuning for a complex application. The PL/SQL package I reviewed contained several SQL statements on a view. This view was full of PL/SQL function calls, and these functions called other SQL statements, containing more function calls that executed SQL statements to… and so on. At the end, I spent several days to redesign a small part of the package and to explain the developers why this application design is not a good idea.

Performance tuning on the balcony
Source: Twitter (@dani_schnider)

After this experience (and many other similar situations), my recommendation is clear: Never use PL/SQL function calls with embedded SQL queries!

The only feasible solution is to rewrite the SQL query to avoid function calls. For our example query, we can do that is different ways. A straight-forward implementation is convert the embedded SQL call into a correlated subquery. This works, but is not very fast (but still much faster than the PL/SQL solution):

 

SELECT COUNT(*), SUM(quantity)

  FROM order_items

 WHERE (SELECT category 

          FROM categories

         WHERE quantity * price_per_unit >= min_price

           AND quantity * price_per_unit <  max_price) = ‘high’;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

 

Elapsed: 00:00:01.797

 

A better and faster solution is to write a simple SQL query that joins the ORDER_ITEMS table and the lookup table CATEGORIES with a non-equi join. This response time of this pure SQL query is faster than all other implementations described above.

 

SELECT COUNT(*), SUM(quantity)

  FROM order_items oi, categories cat

 WHERE cat.category = ‘high’

   AND oi.quantity * oi.price_per_unit >= cat.min_price

   AND oi.quantity * oi.price_per_unit <  cat.max_price;

 

  COUNT(*) SUM(QUANTITY)

———- ————-

     81115        268471

 

Elapsed: 00:00:00.127

 

 

Conclusion

Response times per solution
Response times for the different solutions showed in this blog post

As you can see from the simple example in this blog posts, the response times depend strongly on the implementation of the various solutions. If you can avoid PL/SQL function calls in a SQL statement, this is of course the easiest and fastest solution. But if PL/SQL functions are needed, there are multiple variants to improve the query performance. User Defined Functions (PRAGMA UDF) or Scalar SQL Macros are the best solutions for most cases. Depending on the selectivity of the query, a Function-based Index may be helpful, too.

What you must avoid under any circumstances are PL/SQL function calls in a SQL statement, if these contain other SQL statements within the function. These embedded SQL calls cause many nested SQL calls and context switches and therefore horrible response times.

5 thoughts on “Performance Tips: PL/SQL Functions in SQL Queries

  1. I seem to also recall that many years ago somebody else posted a blog about problems with SQL-within-PL/SQL-within-SQL, in that there are problems with read consistency, that the inner SQL might be as of a newer SCN than the outer query that’s driving the whole process.

    Liked by 1 person

  2. Jason, perhaps this relates to the issue you recall; 9i/10g docs used to say
    “Transactions containing DML statements with subqueries should use serializable isolation to guarantee consistent read.”
    e.g. https://docs.oracle.com/cd/B13789_01/server.101/b10743/consist.htm
    I do not see this warning note in later versions.
    Believe Metalink used to have a doc that demonstrated SCN of subquery was different to that of outer/driving query.

    Like

  3. just want to pick your thought is oracle for warehouse still staying strong with this cloud? I love oracle and somehow felt DB lost to cloud / nosql DB as it isnt that fast in adding their advantages to oracle db(i might be 100% not true). please share your thoughts.

    Like

      • Thanks for your time. Its just in the name of cloud I see some clients moving to platform owned dbs (redshift or synapse) or even databricks and I even saw shops which rewrote their entire(or in the process of ) informatica code in pyspark.
        Do you think its lack of understanding of what to use when ?.
        —- same SQL they are rewriting with “.” notation in pyspark.
        I am not able to understand which DB will stand out to be winner but looks like more cloud share vendor db will be the one?, am I correct on this?. Thanks as always!!

        Like

Leave a comment