The ANY_VALUE function allows us to safely drop columns out of a GROUP BY clause to reduce any performance overhead.
◉ Setup
The examples in this article require the following tables to be present.
-- drop table emp purge;
-- drop table dept purge;
create table dept (
deptno number(2) constraint pk_dept primary key,
dname varchar2(14),
loc varchar2(13)
) ;
create table emp (
empno number(4) constraint pk_emp primary key,
ename varchar2(10),
job varchar2(9),
mgr number(4),
hiredate date,
sal number(7,2),
comm number(7,2),
deptno number(2) constraint fk_deptno references dept
);
insert into dept values (10,'ACCOUNTING','NEW YORK');
insert into dept values (20,'RESEARCH','DALLAS');
insert into dept values (30,'SALES','CHICAGO');
insert into dept values (40,'OPERATIONS','BOSTON');
insert into emp values (7369,'SMITH','CLERK',7902,to_date('17-12-1980','dd-mm-yyyy'),800,null,20);
insert into emp values (7499,'ALLEN','SALESMAN',7698,to_date('20-2-1981','dd-mm-yyyy'),1600,300,30);
insert into emp values (7521,'WARD','SALESMAN',7698,to_date('22-2-1981','dd-mm-yyyy'),1250,500,30);
insert into emp values (7566,'JONES','MANAGER',7839,to_date('2-4-1981','dd-mm-yyyy'),2975,null,20);
insert into emp values (7654,'MARTIN','SALESMAN',7698,to_date('28-9-1981','dd-mm-yyyy'),1250,1400,30);
insert into emp values (7698,'BLAKE','MANAGER',7839,to_date('1-5-1981','dd-mm-yyyy'),2850,null,30);
insert into emp values (7782,'CLARK','MANAGER',7839,to_date('9-6-1981','dd-mm-yyyy'),2450,null,10);
insert into emp values (7788,'SCOTT','ANALYST',7566,to_date('13-JUL-87','dd-mm-rr')-85,3000,null,20);
insert into emp values (7839,'KING','PRESIDENT',null,to_date('17-11-1981','dd-mm-yyyy'),5000,null,10);
insert into emp values (7844,'TURNER','SALESMAN',7698,to_date('8-9-1981','dd-mm-yyyy'),1500,0,30);
insert into emp values (7876,'ADAMS','CLERK',7788,to_date('13-JUL-87', 'dd-mm-rr')-51,1100,null,20);
insert into emp values (7900,'JAMES','CLERK',7698,to_date('3-12-1981','dd-mm-yyyy'),950,null,30);
insert into emp values (7902,'FORD','ANALYST',7566,to_date('3-12-1981','dd-mm-yyyy'),3000,null,20);
insert into emp values (7934,'MILLER','CLERK',7782,to_date('23-1-1982','dd-mm-yyyy'),1300,null,10);
commit;
◉ The Problem
We want to return a list of departments with a count of the number of employees in the department, so we use the COUNT aggregate function and a GROUP BY clause.
select d.deptno,
d.dname,
count(e.empno) as employee_count
from dept d
left outer join emp e on d.deptno = e.deptno
group by d.deptno, d.dname
order by 1;
DEPTNO | DNAME | EMPLOYEE_COUNT |
10 | ACCOUNTING | 3 |
20 | RESEARCH | 5 |
30 | SALES | 6 |
40 | OPERATIONS | 0 |
DEPTNO | DNAME | EMPLOYEE_COUNT |
10 | ACCOUNTING | 3 |
20 | RESEARCH | 5 |
30 | SALES | 6 |
40 | OPERATIONS | 0 |
DEPTNO | DNAME | EMPLOYEE_COUNT |
10 | ACCOUNTING | 3 |
20 | RESEARCH | 5 |
30 | SALES | 6 |
40 | OPERATIONS | 0 |
◉ ANY_VALUE : The Solution
Oracle 21c introduced the ANY_VALUE aggregate function to solve this problem. We use it in the same way we would use MIN or MAX, but it is optimized to reduce the overhead of the aggregate function. Rather than doing any type of comparison, ANY_VALUE just presents the first non-NULL value it finds.
select d.deptno,
any_value(d.dname) as dname,
count(e.empno) as employee_count
from dept d
left outer join emp e on d.deptno = e.deptno
group by d.deptno
order by 1;
DEPTNO | DNAME | EMPLOYEE_COUNT |
10 | ACCOUNTING | 3 |
20 | RESEARCH | 5 |
30 | SALES | 6 |
40 | OPERATIONS | 0 |
◉ Considerations
👉 It's non-deterministic. Don't assume any specific behaviour beyond what is stated.
👉 You are not going to notice dramatic performance improvements in small data sets like this, but as data sets increase in size, so do the overheads of additional columns in the GROUP BY or using the MIN and MAX functions.
👉 Using MIN and MAX have meaning. Someone new looking at the code won't know if you chose to use MIN or MAX just to remove the column from the GROUP BY, or if there was a specific reason you chose it. The ANY_VALUE aggregate function is non-deterministic, so using it is a clear message to any other developer that you are using it to drop the column out of the GROUP BY. This extra clarity is a good thing from a support perspective.
👉 The ANY_VALUE function supports ALL and DISTINCT keywords, but they have no function.
👉 NULL values in the expression are ignored, so ANY_VALUE will return the first non-NULL value it finds. If all values in the expression are NULL, then the value NULL will be returned.
👉 It supports any data type except XMLTYPE, ANYDATA, LOB, file, or collection data types, which result in an ORA-00932 error.
👉 Like most functions, the input expression can be a column, constant, bind variable, or an expression made up of them.
Source: oracle-base.com
0 comments:
Post a Comment