APEX_COLLECTIONS - are you getting invalid datatype errors?

·

6 min read

The latest official documentation for Oracle APEX 23.1 defines APEX collections as follows:

Collections enable you to temporarily capture one or more nonscalar values. You can use collections to store rows and columns currently in session state so they can be accessed, manipulated, or processed during a user's specific session. You can think of a collection as a bucket in which you temporarily store and name rows of information.

They are "bit-buckets" where we developers can temporarily store data in a way that, to a large extent, mimics a global temporary table. One of the easiest examples to explain their usage is as a shopping basket for an e-commerce application (hence the blog image!) used to keep track of potential purchases as a user browses products.

Why can't we use a global temporary table in APEX? APEX uses a session pool to connect to the database from ORDS and the association of a single database session to an APEX session is not guaranteed so this prevents the use of global temporary tables.

So our thoughtful APEX team at Oracle provided the "collections" solution to provide this functionality and it is an essential tool for many applications. There is a PL/SQL API to work with these collections conveniently named APEX_COLLECTION and a view to see the data in SQL queries called APEX_COLLECTIONS.

An APEX collection has a name (COLLECTION_NAME) that you, the developer, can use to reference in both the API and the view and, using that as the key, you can store multiple rows of data in a generic manner. There are 50 VARCHAR2 columns, 5 NUMBER columns, 5 DATE columns, 1 CLOB column, 1 XMLTYPE column and 1 BLOB column. Oracle will provide each row with a unique ID within that collection name (SEQ_ID) and an MD5 hash of the data.

This ability to store data in such a generic format has led to some unexpected errors in the past when it is used in SQL queries via the view. I have seen several instances where a numeric key value is stored in the first character column in the collection (column C001), implicitly converted, and then used in a join to a table where the corresponding key is defined correctly as NUMBER datatype. This works fine for months, sometimes years and then all of a sudden users start to report they are seeing ORA-01722 errors regarding invalid numbers. This leaves the developer perplexed as they are certain that the only data going into that collection in the C001 column is valid numeric data that should convert without issue - so what is happening?

💡
This is also true for DATE data being stored as VARCHAR2 and joined to tables with columns defined as DATE or where an explicit TO_DATE is performed on the collection's character column.

It's all to do with how the data is being stored internally within the APEX installation schema. APEX has to use a single, conventional database table to store this data.....well two tables to be precise. WWV_FLOW_COLLECTIONS$ holds the metadata about the APEX_COLLECTION such as the all-important name along with the application ID, the user ID who owns it, and the session that created it to name but a few. WWV_FLOW_COLLECTION_MEMBERS$ holds the row data for the collection and has a foreign key back to WWV_FLOW_COLLECTIONS$.

Armed with that knowledge, many of you probably already have a strong suspicion of who the culprit is of the invalid datatype error and if you suspect it is Oracle's optimizer you'd be right. WWV_FLOW_COLLECTION_MEMBERS$ is a single table and holds all the collection members for all collections in all applications and some will be correctly using C001 to store character data.

The optimizer is influenced by many things, the presence (and absence) of indexes, table and index statistics, constraints and so on. If the optimizer believes the members$ table is small enough to pull into the plan in its entirety before pruning the data to just your specific collection, it will now have all values from all collections in the C001 column to work with. If it then opts to apply the join predicates before it filters down to just your collection, it will now be comparing your numeric values in C001 to the numeric primary key along with all the non-numeric C001 values from other collections....... and there's that lovely ORA-01722 error appearing on your users' screens.

💡
Adding an index on a foreign key column is a common practice to avoid table locks so it seems strange that the APEX team do not appear to have an index on the foreign key in WWV_FLOW_COLLECTION_MEMBERS$. This potential oversight is potentially exacerbated by the fact the foreign key is set to ON DELETE CASCADE which will now force Oracle to perform a full table scan to propagate the deletion of the parent to the child records, locking it for the duration. There is an APEX "housekeeping" job that runs periodically and one of its tasks is to purge collections for long-since-expired APEX sessions and I suspect it just deletes the parent records en masse in a single DML action allowing the foreign key to clean up the child records. While this job may run in the early hours, for a global application accessed at all hours of the day, this can impact users if their session is trying to work with collections at this precise moment.

So the obvious and correct solution is to use the N001 through N005 numeric columns in APEX_COLLECTIONS for the numeric keys but there are scenarios where this is simply not possible:

  1. You need a sixth numeric column to store the keys that you will join to other tables. (APEX Idea FR-599 indicates that more columns will be coming in a future release as it is on the APEX road-map.)

  2. You need to use the APEX_COLLECTION.CREATE_COLLECTION_FROM_QUERY% procedures to load the data as the alternative of adding members one at a time is prohibitively slow and the bulk processing these procedures offer is required. By design, these procedures force all values that are dynamically fetched exclusively into the 50 VARCHAR2 columns.

    💡
    Some use of custom bulk processing into associative arrays could be used with the APEX_COLLECTION.ADD_MEMBERS procedure but even that could prove cumbersome and you still have the limitation of only 5 numeric columns.

When the character columns are your only option to store these numeric keys there is some salvation. To be treated purely as a last resort and not a lazy "quick-fix" because refactoring of code is "too hard", we have an often overlooked feature of the TO_NUMBER function. The function provides an error handler for conversion errors that will suppress it and allow the query to complete.

TO_NUMBER(column DEFAULT NULL|<<number>> ON CONVERSION ERROR)

So we can simply set the non-numerics Oracle finds in the collection member's C00x column to NULL or a negative value (assuming only positive values are valid for the key column) so the join is attempted without causing an error.

A word of caution though, this workaround will also mask errors within your own data if non-numerics have somehow managed to creep into your collection and will lead to even more head-scratching as you try to debug why you aren't seeing all the data in your queries.

SELECT TO_NUMBER('non-numeric data here' DEFAULT 0 ON CONVERSION ERROR) AS CONV
FROM DUAL;
CONV
-------
0
💡
There is an equivalent error handler in the TO_DATE function with identical syntax.

To summarize, always use the N00x columns to hold the numeric data and the D00x columns to hold dates in your collections but when you have exhausted these columns, use the TO_NUMBER (or TO_DATE) function to explicitly convert your C00x column and add the exception handler syntax as a safeguard against other collections interfering with your data.

Â