Skip to Main Content
  • Questions
  • Avoid Group by in subquery and update table

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, V.

Asked: December 14, 2015 - 12:54 pm UTC

Last updated: February 20, 2016 - 4:21 am UTC

Version: 11.2.0.4

Viewed 1000+ times

You Asked

Hi Tom

It looks simple but I am unable to do this efficiently.

CUST_LEGACY table has duplicate records on BRAND. So basically, I want to group by PRODUCT only.

create table customer(cust_id char(4), product varchar2(50), make varchar2(50), qty number);

insert into customer (cust_id) values ('ABR1');
insert into customer (cust_id) values ('ABR2');
insert into customer (cust_id) values ('CD01');
insert into customer (cust_id) values ('CD08');

create table cust_legacy(cust_id char(4), product varchar2(50), brand varchar2(50), qty number);

insert into cust_legacy values ('ABR1', 'DVD', 'Sony', 5);
insert into cust_legacy values ('ABR1', 'DVD', 'Canon',6);
insert into cust_legacy values ('ABR2', 'Camera','Sony',8);
insert into cust_legacy values ('ABR2', 'Camera', 'Nikon',23);
insert into cust_legacy values ('ABR1', 'Laptop', 'Dell', 2);


Now, I have to update the customer table based on CUST_ID and irrespective of the brand, we have to group by product only from cust_legacy and where there is no duplicate in cust_legacy.

update customer a set (product, make, qty) =
 (select product, brand, qty from cust_legacy b where
  product= (select product from cust_legacy c where a.cust_id=b.cust_id group by
    product having count(*) =1)
 );

SQL> select * from customer;

CUST PRODUCT  MAKE            QTY
---- -------- -------- ----------
ABR1 Laptop   Dell              2
ABR2
CD01
CD08



As cust_legacy table can have about 12 million records for updation in customer table, I am looking for an efficient SQL and I came up with this below but as you can see, it throws "invalid identifier" error.

UPDATE
   CUSTOMER A
SET
   (A.PRODUCT, A.MAKE, A.QTY) = 
( SELECT
     PRODUCT, BRAND, QTY FROM
    (SELECT PRODUCT, BRAND, QTY, COUNT(*) OVER (PARTITION BY PRODUCT) CNT FROM CUST_LEGACY B
   WHERE A.CUST_ID = B.CUST_ID)
 ) WHERE CNT=1
);


This throws A.CUST_ID as invalid identifier. I understand that table CUSTOMER is not visible inside the second subquery. How do I resolve this and also at the same time, efficiently update the CUSTOMER table?

Thanks,
VBala

and Chris said...

You can change the count(*) to include the cust_id in the partition. This enables you to find those customers and products with a single row. This allows you to reference cust_legacy just once. You can then join on cust_id in the subquery:

update customer a
set ( a.product, a.make, a.qty ) =
 (select product, brand, qty
 from
  (select cust_id, product, brand,
       qty, count ( * ) over ( partition by cust_id, product ) cnt
   from cust_legacy b
  ) b
 where cnt     =1
 and a.cust_id = b.cust_id
 ) ;

4 rows updated.
 
select * from customer;

CUST PRODUCT    MAKE              QTY
---- ---------- ---------- ----------
ABR1 Laptop     Dell                2
ABR2                                 
CD01                                 
CD08


I don't fully understand the logic though. Is it guaranteed that a customer will only have one product that doesn't have duplicates?

If not, you'll hit errors:

insert into cust_legacy values ('ABR1', 'Camera', 'Nikon',23);

1 row inserted.

update customer a
set ( a.product, a.make, a.qty ) =
 (select product, brand, qty
 from
  (select cust_id, product, brand,
       qty, count ( * ) over ( partition by cust_id, product ) cnt
   from cust_legacy b
  ) b
 where cnt     = 1
 and a.cust_id = b.cust_id
 ) ;

SQL Error: ORA-01427: single-row subquery returns more than one row

Rating

  (7 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

Subquery issue

V Bala, December 15, 2015 - 10:37 am UTC

Thanks very much for the response.

It is a tricky situation with regard to this requirement as we are migrating data from a legacy home-grown application (excel :)- into Oracle database and there are lot of such scenarios, one of which is picking the records with no duplicate and in some cases, pick only the last two, etc

I'll work on this and get back to you if I have any queries.

Thanks a lot
Vbala

Analytic usage

VBala, December 16, 2015 - 10:09 am UTC

Hi Chris

Thanks for that. Can you please help me with this too?

For the same migration, I have a different requirement. I have a date column in cust_visit table. This one stores when a customer enters the shop and what time he leaves the shop.

Just to simplify it, I'll give you only few columns
create table cust_visit(cust_id number , cust_name varchar2(25), shop_branch varchar2(10), visit_type varchar2(10), visit_date date);

insert into cust_visit values(111,'Joe','London','Enter', to_date('01/01/2015 09:00:00','dd/mm/yyyy hh24:mi:ss');
insert into cust_visit values(111,'Joe','London','Leave', to_date('01/01/2015 12:10:00','dd/mm/yyyy hh24:mi:ss');
insert into cust_visit values(222,'Mary','London','Enter', to_date('03/03/2015 14:04:00','dd/mm/yyyy hh24:mi:ss');
insert into cust_visit values(222,'Mary','London','Leave', to_date('03/03/2015 16:22:00','dd/mm/yyyy hh24:mi:ss');

I want to display the start time and leave time in the same row as follows

cust Name Branch   Enter_Time           Leave_Time
111, Joe, London, 01/01/2015 09:00:00, 01/01/2015 12:10:00
222, Mary, London, 03/03/2015 14:04:00, 03/03/2015 16:22:00


Can you help me please?

Thanks,
VBala
Connor McDonald
December 18, 2015 - 1:06 am UTC

Hi, Connor here - you just need a PIVOT

SQL> drop table cust_visit purge;

Table dropped.

SQL>
SQL> create table cust_visit(cust_id number , cust_name varchar2(25), shop_branch varchar2(10), visit_type varchar2(10), visit_date date);

Table created.

SQL>
SQL> insert into cust_visit values(111,'Joe','London','Enter', to_date('01/01/2015 09:00:00','dd/mm/yyyy hh24:mi:ss'));

1 row created.

SQL> insert into cust_visit values(111,'Joe','London','Leave', to_date('01/01/2015 12:10:00','dd/mm/yyyy hh24:mi:ss'));

1 row created.

SQL> insert into cust_visit values(222,'Mary','London','Enter', to_date('03/03/2015 14:04:00','dd/mm/yyyy hh24:mi:ss'));

1 row created.

SQL> insert into cust_visit values(222,'Mary','London','Leave', to_date('03/03/2015 16:22:00','dd/mm/yyyy hh24:mi:ss'));

1 row created.

SQL>
SQL>
SQL> SELECT *
  2  FROM   cust_visit
  3  PIVOT  (min(visit_date) AS visit_date FOR (visit_type) IN ('Enter' AS Enter, 'Leave' AS Leave));

   CUST_ID CUST_NAME                 SHOP_BRANC ENTER_VISIT_DATE    LEAVE_VISIT_DATE
---------- ------------------------- ---------- ------------------- -------------------
       111 Joe                       London     01/01/2015 09:00:00 01/01/2015 12:10:00
       222 Mary                      London     03/03/2015 14:04:00 03/03/2015 16:22:00

SQL>
SQL>



Review

VBala, December 17, 2015 - 4:45 pm UTC

Hi Chris

Can you please respond to my requirement of analytic function ?

Thanks
VBala
Chris Saxon
December 18, 2015 - 1:08 am UTC

We have lives outside of AskTom too :-)

Good solution

VBala, December 21, 2015 - 9:16 am UTC

Hi Connor

Sorry about that but thanks a lot for your solution.

Merry Christmas and happy new year

Regards,
VBala

How to pivot when there are non-unique values

VBala, December 29, 2015 - 11:18 am UTC

Hi Conor

Thanks for your help. I am now stuck with pivot query when it comes to non-unique values between corresponding records. I'll give the same example with an additional column SALES_MAN as we need to capture who attended the customer. I understand that in this case it cannot be pivoted as there can be different sales man.

drop table cust_visit purge;

create table cust_visit(cust_id number , cust_name varchar2(25), shop_branch varchar2(10), 
visit_type varchar2(10), sales_man varchar2(25), visit_date date);

insert into cust_visit values(111,'Joe','London','Enter', 'Mark Higgins', to_date('01/01/2015 09:00:00','dd/mm/yyyy hh24:mi:ss'));
insert into cust_visit values(111,'Joe','London','Leave', 'Ann Leeman', to_date('01/01/2015 12:10:00','dd/mm/yyyy hh24:mi:ss'));

insert into cust_visit values(222,'Mary','London','Enter', 'Mark Higgins', to_date('03/03/2015 14:04:00','dd/mm/yyyy hh24:mi:ss'));
insert into cust_visit values(222,'Mary','London','Leave', 'George Henman', to_date('03/03/2015 16:22:00','dd/mm/yyyy hh24:mi:ss'));

SELECT *
    FROM   cust_visit
    PIVOT  (min(visit_date) AS visit_date FOR (visit_type) IN ('Enter' AS Enter, 'Leave' AS 
Leave));

   CUST_ID CUST_NAME                 SHOP_BRANC SALES_MAN                 ENTER_VISIT_DATE     LEAVE_VISIT_DATE
---------- ------------------------- ---------- ------------------------- -------------------- -----
       222 Mary                      London     Mark Higgins              03-MAR-2015 14:04:00
       222 Mary                      London     George Henman                                  03-MAR-2015 16:22:00
       111 Joe                       London     Mark Higgins              01-JAN-2015 09:00:00
       111 Joe                       London     Ann Leeman                                     01-JAN-2015 12:10:00



But I need an output with both sales men in the same record as well as I do the enter time and leave time. Is it possible?

Something like this:

111, Joe, London, Mark Higgins, Ann Leeman,01-JAN-2015 09:00:00, 01-JAN-2015 12:10:00
222, Mary, London, Mark Higgins,George Henman,03-MAR-2015 14:04:00, 03-MAR-2015 16:22:00


Thanks in advance and happy new year to all Ask Tom group.

Best Regards,
VBala

Got it

VBala, December 29, 2015 - 2:00 pm UTC

Hi Conor

OK, I have pivotted the sales_man as well and I got the desired results

SELECT *
    FROM   cust_visit
    PIVOT  (min(visit_date) AS visit_date, min(sales_man) as sales FOR (visit_type) IN ('Enter' AS Enter, 'Leave' AS 
Leave));


This is the result
   CUST_ID CUST_NAME SHOP_BRANC ENTER_VISIT_DATE     ENTER_SALES    LEAVE_VISIT_DATE     LEAVE_SALES
---------- --------- ---------- -------------------- -------------- -------------------- -----------
       222 Mary      London     03-MAR-2015 14:04:00 Mark Higgins   03-MAR-2015 16:22:00 George Henman
       111 Joe       London     01-JAN-2015 09:00:00 Mark Higgins   01-JAN-2015 12:10:00 Ann Leeman


Thanks
VBala
Chris Saxon
December 30, 2015 - 3:41 am UTC

Nice work.

Improve performance

Vbala, February 19, 2016 - 11:53 am UTC

Hi Chris

Sorry, I am back again with a question on improving performance on this.

This one works perfectly fine as far as functionality is concerned but as the source table has 23 million records, the pivot analytic function takes too long to complete. It takes about 11 hours to insert the records from source table with pivot.

Can you please give me some tips? Would any index or FBI do the trick here in improving the performance?

Thanks
VBala
Connor McDonald
February 20, 2016 - 4:21 am UTC

OK, just so we'all on the same page here. There's been a fair few queries etc we've been through.

Please post the query you are *currently* referring to, and I'll take a look at how we might speed it up.

Cheers,
Connor

More to Explore

Analytics

Analytic SQL got you confused? Check out Connor McDonald's complete video course.