Skip to Main Content

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, Prasad.

Asked: January 18, 2001 - 7:06 pm UTC

Last updated: January 16, 2018 - 3:49 am UTC

Version: Oracle 8.0.5

Viewed 50K+ times! This question is

You Asked


How to find duplicate rows in a table ?

and Tom said...



select c1, c2, c3, ..., cn, count(*)
from T
group by c1, c2, c3, ...., cn
HAVING count(*) > 1
/

will do it. c1, c2, .. cn represents the set of columns that should be "unique" in T.

Rating

  (11 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

duplicate rows

Merle R. Vermuele, September 05, 2001 - 3:04 pm UTC

The answer was exactly what I was looking for to solve a problem I was having.

A reader, May 02, 2003 - 4:31 am UTC


Oracle Developer

Developer, January 14, 2005 - 2:44 pm UTC

This is a good way to find a duplicate record. But, there is over 50,000,000 records in my table which is partitioned. If I use this way to find a duplicate record, it will take a long long time. What is a best way?

Please advice.

Tom Kyte
January 14, 2005 - 8:43 pm UTC

even on 50,000,000 rows, a very small number today in the year 2005, that query won't take very long.

Duplication of Item Name

Muhammad Waseem Haroon, May 08, 2005 - 11:52 am UTC

I Have Created a Table Named Items_Master

CREATE TABLE items_master
(
item_code NUMBER(10) PRIMARY KEY,
item_name VARCHAR2(200),
item_creation_date DATE DEFAULT SYSDATE
);


There are 10,000 rows in ITEMS_MASTER tabe.
Let suppose there are 3 rows are same but the item_code is unique e.g.

ITEM_CODE ITEM_NAME item_creation_date
--------- -------------------------------------- ------------------
2658 CD-ROM 08-JAN-2005
5986 CDROM 10-FEB-2005
9982 CD ROM DRIVE 05-MAY-2005
1112 HARD DRIVE 08-JAN-2005
2225 HARD-DRIVE 09-AUG-2004

on the basis of ITEMS_MASTER table, many transactional tables are effected.
e.g. (items recieving & issuance)

for that problem i have given an special form/application to a user which is called ITEM MERGING FORM
to eliminate the duplicatin of item name. The purpose of this form is to Marge same items on one code,
for example.

user wants to marge item_code (5986,9982) with 2658.
then he will first merge 5986 with 9982 and second he will merge 9982 with 2658
and finally items_master table contain one row of 2658.
5986,9982 are deleted all transactional tables are updated.
Example :
-------------------------------------------------
| ITEM MERGING FORM |
| |
| Item Code : 9982 |
| |
| Merge With: 2658 |
| |
| |
| |
| OK CANCEL |
| |
|------------------------------------------------
On pressing OK Button item_code 9982 will be deleted from ITEMS_MASTER table and all the
transactional rows of 9982 will be updated with 2658.

suppose user has merged CD-ROM with HARD-DRIVE on one code and has taken many steps like that.
how can i create revert procedure for these things using tracking or from other way.

Please Suggest me that how can i eliminate the duplication and how can i create a
revert/undo procedure which will revert the steps to the initial stage.

Thanks in advance
Muhammad Waseem Haroon.

Tom Kyte
May 08, 2005 - 1:47 pm UTC

You are on "your own" for the revert -- it is basically "versioning"

there is the do it yourself way (design it). There is workspace manager (the database will do it).

But this is just a fairly simple application. If you want the "easiest", it might well be workspace manager, else it is a custom design of your choosing.

</code> http://docs.oracle.com/docs/cd/B10501_01/appdev.920/a96628/toc.htm <code>

The do it yourself is going to involve some amount of work and design - of which I cannot do, only you can. You'll need to save "before and after keys" in the child tables so you'd know what rows to update back to what.

nitin tyagi, May 16, 2005 - 9:57 pm UTC

This is very easy understandable and best answer to find dup row....but if there is a very big table with lots of coloum than can we hav other option too????????

Tom Kyte
May 17, 2005 - 8:11 am UTC

write a script to query the data dictionary to write the query for you. would be a very small plsql routine for you to develop. Have it return a dynamically opened ref cursor.

Great :)

Kompt de Yoyo, June 26, 2005 - 12:06 am UTC

This was just what I needed. Thanks alot :)

also just what i needed

A reader, January 19, 2006 - 11:11 am UTC


Eliminating duplicates

RajMohan, July 02, 2007 - 7:24 pm UTC

Hi Tom,


Is it possible to get the output mentioned below for the tables structure and data provided below.
Currently I use pl/sql program to get the said output, but I guess there is way to do it SQL.
Please help.


Tab1


Id project Date
--- ------- -----
1 2748 07/11/2007
2 2748 07/12/2007
3 2748 07/13/2007



SQL


create table tab1 (id int,project int, dt date);

Insert into tab1 values (1,2748,to_date('07/11/2007','mm/dd/yyyy'));
Insert into tab1 values (2,2748,to_date('07/12/2007','mm/dd/yyyy'));
Insert into tab1 values (3,2748,to_date('07/13/2007','mm/dd/yyyy'));


Tab2


Id Feature
-- -------

1 10
1 20
1 30
1 40

2 20
2 50

3 40
3 20
3 60




create table tab2 (id int, feature int);

Insert into tab2 values (1,10);
Insert into tab2 values (1,20);
Insert into tab2 values (1,30);
Insert into tab2 values (1,40);
Insert into tab2 values (2,20);
Insert into tab2 values (2,50);
Insert into tab2 values (3,40);
Insert into tab2 values (3,30);
Insert into tab2 values (3,60);

Tab1 and Tab2 can be joined using the 'Id' column


Expected Output sorted by feature
---------------------------------

feature should show up only once and should reflect the latest(max) Id it is associated with

Latest Id Feature
========= =======
1 10
2 20
3 30
3 40
2 50
3 60
Tom Kyte
July 03, 2007 - 10:30 am UTC

i did not get this at all. unless you just mean:

ops$tkyte%ORA10GR2> select max(id), feature from tab2 group by feature order by feature;

   MAX(ID)    FEATURE
---------- ----------
         1         10
         2         20
         3         30
         3         40
         2         50
         3         60

6 rows selected.


in which case, the relevance of tab1 is "not there"

Thanks Tom.

A reader, July 12, 2007 - 2:07 pm UTC

Thanks Tom. I am wondering what I was thinking when I asked you this. I presented a simple situation here but your answer gave me enough hints to solve the real issue I had.

Answer

Betty, March 07, 2016 - 4:22 pm UTC

I did exactly this before consulting the Ask Tom page but it returned a count without any of the related column information

Chris Saxon
March 07, 2016 - 5:30 pm UTC

What exactly did you do?

Need more

Jeff Fontenot, January 15, 2018 - 9:07 pm UTC

this inst useful if you want to see the actual duplicates as it groups them. I get a list of non-unique values but no way to update them as this doesnt give each of their row_keys