Wednesday, July 6, 2022

ANY_VALUE Aggregate Function in Oracle Database 21c

Oracle Database 21c, Oracle Database Exam Prep, Oracle Database Tutorial and Materials, Oracle Database Guides, Oracle Database Materials, Database Career, Database Skills, Database Jobs, Database Guides

The ANY_VALUE function allows us to safely drop columns out of a GROUP BY clause to reduce any performance overhead.

◉ Setup

The examples in this article require the following tables to be present.

-- drop table emp purge;

-- drop table dept purge;

create table dept (

  deptno number(2) constraint pk_dept primary key,

  dname varchar2(14),

  loc varchar2(13)

) ;

create table emp (

  empno number(4) constraint pk_emp primary key,

  ename varchar2(10),

  job varchar2(9),

  mgr number(4),

  hiredate date,

  sal number(7,2),

  comm number(7,2),

  deptno number(2) constraint fk_deptno references dept

);

insert into dept values (10,'ACCOUNTING','NEW YORK');

insert into dept values (20,'RESEARCH','DALLAS');

insert into dept values (30,'SALES','CHICAGO');

insert into dept values (40,'OPERATIONS','BOSTON');

insert into emp values (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,null,20);

insert into emp values (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);

insert into emp values (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);

insert into emp values (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,null,20);

insert into emp values (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);

insert into emp values (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,null,30);

insert into emp values (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,null,10);

insert into emp values (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,null,20);

insert into emp values (7839,'KING','PRESIDENT',null,to_date('17-11-1981','dd-mm-yyyy'),5000,null,10);

insert into emp values (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);

insert into emp values (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,null,20);

insert into emp values (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,null,30);

insert into emp values (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,null,20);

insert into emp values (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,null,10);

commit;

◉ The Problem

We want to return a list of departments with a count of the number of employees in the department, so we use the COUNT aggregate function and a GROUP BY clause.

select d.deptno,

       d.dname,

       count(e.empno) as employee_count

from   dept d

       left outer join  emp e on d.deptno = e.deptno

group by d.deptno, d.dname

order by 1;

DEPTNO DNAME   EMPLOYEE_COUNT 
10 ACCOUNTING 3
20  RESEARCH 
30  SALES 
40  OPERATIONS 

SQL>

We are forced to include all non-aggregate columns from the select list into the GROUP BY or we will get an error. In this case we don't really care about including the DNAME column in the GROUP BY, but we are forced to do so. Adding extra columns in the GROUP BY represents an overhead. To get around this, people will sometimes use the MIN or MAX functions.

select d.deptno,
       min(d.dname) as dname,
       count(e.empno) as employee_count
from   dept d
       left outer join  emp e on d.deptno = e.deptno
group by d.deptno
order by 1;

DEPTNO DNAME   EMPLOYEE_COUNT 
10 ACCOUNTING 3
20  RESEARCH 
30  SALES 
40  OPERATIONS 

SQL>

select d.deptno,
       max(d.dname) as dname,
       count(e.empno) as employee_count
from   dept d
       left outer join  emp e on d.deptno = e.deptno
group by d.deptno
order by 1;

DEPTNO DNAME   EMPLOYEE_COUNT 
10 ACCOUNTING 3
20  RESEARCH 
30  SALES 
40  OPERATIONS 

SQL>

This allows us to remove the DNAME column from the GROUP BY, but now we have additional work associated with the MIN and MAX functions, which is a new overhead.

◉ ANY_VALUE : The Solution

Oracle 21c introduced the ANY_VALUE aggregate function to solve this problem. We use it in the same way we would use MIN or MAX, but it is optimized to reduce the overhead of the aggregate function. Rather than doing any type of comparison, ANY_VALUE just presents the first non-NULL value it finds.

select d.deptno,

       any_value(d.dname) as dname,

       count(e.empno) as employee_count

from   dept d

       left outer join  emp e on d.deptno = e.deptno

group by d.deptno

order by 1;

DEPTNO DNAME   EMPLOYEE_COUNT 
10 ACCOUNTING 3
20  RESEARCH 
30  SALES 
40  OPERATIONS 

So now we can reduce the overhead of additional columns in the GROUP BY, without having to add the overhead of the MIN or MAX functions.

◉ Considerations

👉 It's non-deterministic. Don't assume any specific behaviour beyond what is stated.

👉 You are not going to notice dramatic performance improvements in small data sets like this, but as data sets increase in size, so do the overheads of additional columns in the GROUP BY or using the MIN and MAX functions.

👉 Using MIN and MAX have meaning. Someone new looking at the code won't know if you chose to use MIN or MAX just to remove the column from the GROUP BY, or if there was a specific reason you chose it. The ANY_VALUE aggregate function is non-deterministic, so using it is a clear message to any other developer that you are using it to drop the column out of the GROUP BY. This extra clarity is a good thing from a support perspective.

👉 The ANY_VALUE function supports ALL and DISTINCT keywords, but they have no function.

👉 NULL values in the expression are ignored, so ANY_VALUE will return the first non-NULL value it finds. If all values in the expression are NULL, then the value NULL will be returned.

👉 It supports any data type except XMLTYPE, ANYDATA, LOB, file, or collection data types, which result in an ORA-00932 error.

👉 Like most functions, the input expression can be a column, constant, bind variable, or an expression made up of them.

Source: oracle-base.com

Related Posts

0 comments:

Post a Comment