Subscriber Count

    459

Subscribe to Posts by Email

Pages

Oracle 12.2 New Features : Approximate Query Processing

While sometime ago, I was looking for a database project requirement to process analytical workload. The organisation choice was Oracle and Developers choice was postgres or vertica. But to handle Vertica we do not have that much of data processing actually needed so the final contenders are Oracle and Postgres.

As per licensing guidelines I cannot publish the details of outcome , but I would say the results are not good for me as a Oracle DBA , atleast in processing aggregate functions Oracle for a 2 million record table with more than 200 columns doing a count (*). But indeed we want an enterprise database and stable one not the open source database. Well our data in terms of distinctivity does not change much so some sort of approximation is also fine for us.

That is what now in 12.2, with Approximate Query Processing which is as per documentation

Approximate Query Processing

The 12.2 release extends the area of approximate query processing by adding approximate percentile aggregation. With this feature, the processing of large volumes of data is significantly faster than the exact aggregation. This is especially true for data sets that have a large number of distinct values with a negligible deviation from the exact result.

Approximate query aggregation is a common requirement in today's data analysis. It optimizes the processing time and resource consumption by orders of magnitude while providing almost exact results. Approximate query aggregation can be used to speed up existing processing.

Oracle provides a set of SQL functions that enable you to obtain approximate results with negligible deviation from the exact result. There are additional approximate functions that support materialized view based summary aggregation strategies. The functions that provide approximate results are as follows:

APPROX_COUNT_DISTINCT
APPROX_COUNT_DISTINCT_DETAIL
APPROX_COUNT_DISTINCT_AGG
TO_APPROX_COUNT_DISTINCT
APPROX_MEDIAN
APPROX_PERCENTILE
APPROX_PERCENTILE_DETAIL
APPROX_PERCENTILE_AGG
TO_APPROX_PERCENTILE

Approximate query processing can be used without any changes to your existing code. When you set the appropriate initialization parameters, Oracle Database replaces exact functions in queries with the corresponding SQL functions that return approximate results.

Parameter that effects the Approximate Query Processing

SQL> show parameter approx

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
approx_for_aggregation               boolean     TRUE
approx_for_count_distinct            boolean     TRUE
approx_for_percentile                string      ALL
SQL>

When we use the functions , the value returns in hexadecimal to retrieve exact values we need to use associate to_* functions along with materialized views

SQL> select unit_code,APPROX_COUNT_DISTINCT_AGG(id) from (select unit_code,APPROX_COUNT_DISTINCT_DETAIL(ID) id from INSIGHTS.TABLE group by unit_code) group by unit_code;

UNIT_CODE       APPROX_COUNT_DISTINCT_AGG(ID)
--------------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------
MHIX115         0D0C0925009E00000000000000A1000000000000000000000040000000000000000000000000000000001000000000000000000000000000000000000000000000040000000000000000000000000000
POIX108         0D0C0DD500740000000000000075000000000000000000000000000080000000000000000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000
PHIX365         0D0C0DD500270000000000000027000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000

So, let me create a MV first with my query

CREATE MATERIALIZED VIEW test_approx_agg_mv AS
SELECT unit_code,
APPROX_COUNT_DISTINCT_DETAIL(id) id
FROM INSIGHTS.TABLE
GROUP BY unit_code;
Then let's access the query as below, as you see the values provided are approximate , when you compare the values with original count (second query)

SQL> SELECT unit_code, TO_APPROX_COUNT_DISTINCT(id) "ids" FROM test_approx_agg_mv ORDER BY unit_code;

UNIT_CODE              ids
--------------- ----------
ABR13                    2
ABT10                   42
ABT13                   63

SQL> SELECT unit_code, count(id) from INSIGHTS.TABLE where unit_code in ('ABR13','ABT10','ABT13') group by unit_code;

UNIT_CODE       COUNT(id)
--------------- -------------
ABR13                      11
ABT13                     264
ABT10                     202

If i directly try to access the column with regular count we get an error;

SQL>SELECT unit_code, count(id) from test_approx_agg_mv ORDER BY unit_code
                     *
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected - got BLOB
SQL>

So use the approximation function with associate function to retrieve the data, this works well with MV's rather plain queries

-Thanks

Suresh

 

 

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>