Creating a unique HASH value for the contents of a table

Tom Kyte

Thanks for the question, Udi.

Asked: October 21, 2009 - 1:51 pm UTC

Last updated: March 02, 2022 - 2:39 am UTC

Version: 11.1.0.7

Viewed 50K+ times!

You Asked

Hi Tom,

We run a large data warehouse with giant tables that have to be copied to multiple destination databases. We copy them via db-link / export-import, etc.

We need a dependable mechanism that would QA and ensure that the copies are precisely identical to the source table.

Ideally - we would like some kind of a HASH function that gets as parameter a table name and returns a unique number that's a product of the data in that table. It needs to do that consistently, of course, and then - by comparing the HASH values of a source table and one of its copies - you will be able to ascertain that they are equal.

Even a function that returns a consistent HASH value for one row is useful - we can then add the values for all the rows in the table - and identical sums would indicate identical tables on both sides with a high degree of confidence.

Ideas ? On Teradata there's an efficient builtin HASH function that gets as parameter a list of columns (typically the keys, or all the columns of a table) and returns this value for every row.

Any Oracle equivalent? Or any other slick mechanism in Oracle to compare 2 tables to see if they are precisely identical?

We are talking tables with billions of rows, so efficiency of the mechanism is of the essence.

Thanks,

Udi

and Tom said...

well, this sort of thing always surprises me

- if you use dblinks, you know the two are the same because you just did SQL - you either a) succeed, b) get an error message

- if you use datapump or even exp - you are hopefully using transportable tablespaces

there would be no need to compare a thing - especially in the case of transporting. Since you MUST be using the same character sets on source and target system (else the data would not compare!) - you can absolutely use transport.

You would load on system A (and index if you want)
and then transport (copy datafiles) to system B

done - no compare needed.

You have these as manual tools if you want (but you DO NOT WANT to - really - just transport, you cannot lose a single row during a transport)

o dbms_rectifier_diff
http://docs.oracle.com/cd/B28359_01/server.111/b28327/rarrectdifpac.htm#i94795

o ora_hash( c1||'/'||c2||'/'|| ... , .... )

http://docs.oracle.com/cd/B28359_01/server.111/b28286/functions112.htm#sthref1780

o dbms_crypto.hash

http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_crypto.htm#i1002022
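
For example, a minimal sketch of the ORA_HASH route (the column names, the '/' separator and the table/link names here are illustrative assumptions, not part of the original answer): compute a per-row hash, aggregate it, and compare the aggregate from each side.

-- run on the source
select sum(ora_hash(c1 || '/' || c2 || '/' || c3)) as table_hash
  from big_table;

-- run the identical query against the copy and compare the two numbers
select sum(ora_hash(c1 || '/' || c2 || '/' || c3)) as table_hash
  from big_table@copy_db;

Equal sums strongly suggest - but, as discussed further down, cannot prove - that the two copies hold the same rows.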



but again, the fastest way to do something is TO NOT DO IT and if you transport or use dblinks - I fail to see why you would even consider doing it

heck, even with export/import - as long as you review the logs - you are done there as well.


for billions of rows (I assume that is pretty big, billions of rows is sort of MEANINGLESS when you think about it. How big is a billion rows? Sort of like asking "how long is a piece of string" - until you measure it, you don't know) I would not be looking at verifying after the fact - I would be looking at using techniques that make it so I don't have to (transporting using data pump!)


Comments

The fastest way...

Dana, October 21, 2009 - 5:59 pm UTC

"We run a large data warehouse with giant tables that have to be copied to multiple destination databases."
I really like the answer: "the fastest way to do something is TO NOT DO IT"
Why would you really need multiple large data warehouse tables copied in multiple databases?

Thanks - it's not so simple

Udi Karni, October 21, 2009 - 6:20 pm UTC

Tom - thanks for the HASH links - sounds like that's the way to go. However - the criticism over our need to do this is misplaced. I was oversimplifying. Sorry to report that Oracle's Data Warehouse toolbox needs some beefing up.

(1) DB-Links - are the most convenient mechanism. You get the full power of SQL - you can add/remove columns, etc. But DB-Links are serial, do NOT parallelize (why ???), and are very CPU intensive (why ???). To move a Terabyte table between 2 multi-CPU servers you need to kick off as many single-threaded DB-Link jobs as there are CPUs, maybe one partition per job, "manually parallelizing" in order to max the pipe between the servers. Many jobs ==> very easy to miss or double transfer.

(2) Datapump - at the cost of making an intermediate flat file copy - this would seem like a good option (though not as flexible as DB-Link SQL). A single parallel job that does it all. But guess what - parallel datapump export/import of a multi-terabyte entire schema/tablespace simply doesn't work. Try to export 20 Terabytes in parallel - and you will encounter some very unpleasant ORA-600 errors.

(3) Transportable tablespaces - you're right - it's quite "safe" - but again - you get overwhelmed by the numbers. Tens of Terabytes mean hundreds of Oracle datafiles to be moved. And reconnected. Metadata extracted. And reapplied. Quite the busy scene.

With careful scripting - you can make it happen. But executives are funny - between HIPAA and SOX - they demand double and triple audits to ensure that the copy on server "B" is EXACTLY identical to that on server "A". And we just work here and need to deliver.

So - a magical function that will hash an entire table, or at least an entire partition - to serve as a "proxy" for the contents of the table/partition is crucially useful. Absent that - we will try and construct something row-based per your suggestions.

BTW - when I said "billions of rows" I was implying that process-intensive solutions like "MINUS-ing" 2 copies of a 2-billion row, 1 Terabyte table over a DB-Link looking for a "null set" result - are not a practical option.
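
For reference, the full comparison being ruled out here would look something like this (table and link names are made up):

select * from lab minus select * from lab@copy_db;
select * from lab@copy_db minus select * from lab;
-- both queries must return zero rows for the copies to be identical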

And yes, I know, if our entire universe resided in a single RAC database - there will be no need to move data between DBs.

But those same executives are reluctant to put all the eggs in one RAC basket and demand multiple machines/DBs. Hence the need to make copies.

Thanks for the kick start as always.

Udi
Tom Kyte
October 23, 2009 - 1:08 pm UTC

(1) so what? Not sure what you are saying there - because I didn't say "don't use dblinks"

I said "hey, if you are using a dblink and you say "insert this data over there", it will do it and you have procedural access to the results and you know what happened.

(2) but I would never even consider doing that.

(3) so? I don't know why you would move 10's of terabytes (sounds like you should be doing your tuning at your OPERATIONAL LEVEL, not in the database, why move that which need not be moved)

but even so, so what - who cares how many files there are. If you have 10's of terabytes and you unload it - I'll bet you have........

a lot of files - don't you.




Use transportable tablespaces, nuff said. You can use as large a file as you want, you could use bcv splits or whatever to make instance copies, you could use your backups (you can backup a tablespace on one system and restore to another)

think about it.
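
A rough outline of the transport flow being described, with an assumed tablespace name and file path (the expdp/impdp steps run at the operating-system prompt):

-- on the source: make the tablespace read-only, then export its metadata
alter tablespace ts_dw read only;
--   expdp system directory=dp_dir dumpfile=ts_dw.dmp transport_tablespaces=ts_dw
-- copy the datafiles and the dump file to the target, then plug them in:
--   impdp system directory=dp_dir dumpfile=ts_dw.dmp transport_datafiles='/u02/oradata/ts_dw01.dbf'
-- on the target, if the data should be writable again:
alter tablespace ts_dw read write;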

It's really not so simple

Udi Karni, October 21, 2009 - 6:46 pm UTC

Dana - you raise a good point - but we've learned this the hard way.

When you have a REALLY LARGE data warehouse you start hitting the limits of physics and scalability. Last year our warehouse required 500 CPUs and a ton of IO. This year we run on 1000 CPUs and next year 2000 CPUs, etc., with IO doubling correspondingly.

They simply don't sell multi-thousand CPU servers in your friendly vendor's catalog. And even if you manage to put one together - you can't just order an expansion box to double its capacity every year. At some point - you simply run into the law of seriously diminishing returns.

So - the only option (unless you have billion dollar budgets to order something completely custom) is to buy the largest commercially available servers - typically with no more than 64 or 128 CPUs - and just get a number of them, with multiple copies of your DB.

You can try to pursue a single DB RAC - but it's really hard to build an IO backbone powerful enough to support a RAC of 1000 CPUs.

At some point - you're forced to cut it up into a few identical copies that must be maintained... perfectly identical.

It's a completely different paradigm than datamarts.
Tom Kyte
October 23, 2009 - 1:12 pm UTC

you sound like a serious candidate for exadata - push the database down to the storage level itself.

Have you heard of it?
Have you looked at it?


Duplication of data warehouse data

Dana, October 22, 2009 - 2:37 pm UTC

Udi,

I don't disagree with your premise. And your solutions show hard work and creativity.
I do wonder if the premise is workable.
In my experience, business abhors unused capacity, like nature abhors a vacuum.

It would be very interesting to see the ROI statement requiring that much capital outlay, developer and administrative support. Does the ROI double each year like the resource requirement?

The proposed solutions are intriguing, and useful. I just don't know I can say the same about the requirements.

Re-write of SQL "could" reduce requirements so the original outlay of resources works. I believe SQL techniques advance faster than hardware. That's because I've been reading AskTom for so long. :)

Tom - you're making my point...

Udi Karni, October 23, 2009 - 4:41 pm UTC

... any solution means lots of detail. "bcv" is probably the safest because it's SAN hardware based, but SAN is a poor solution for data warehouses because it derives its performance from its cache which is typically too small and insignificant for large data warehouses.

This is why the smart people that built Exadata (which we are looking at) - who know a thing or two about data warehouses - are basically using "direct attached JBOD" in their storage cells.

As for why we're making all these copies - as I explained - we have too much processing for a single commercial machine to handle, so we have to split the load. We also don't have a Los Alamos budget to custom order a 10,000 CPU supercomputer, nor are there very many multi-thousand CPU Oracle RAC installations out there for us to learn from their mistakes (we checked).

The point is - making the copies involves a lot of details - and the executives aren't interested in technicals. Patients' lives are at stake and they want to feel confident that the LAB table on server A is precisely identical to the LAB table on server B.

They are business people - and in their world 2 tables are identical if their counts are the same, or the sum of their numeric columns is equivalent, or table A minus table B = table B minus table A = (null set). Stuff like that.

So - while they trust that we have taken every precaution - they also expect us to run these types of audits when all the loading is done - and generate a clean bill of health.

So - an Oracle builtin function that will return some kind of "number" that uniquely and consistently represents the data contents of a table or partition - would be very handy for us - because we would just run this function on the various copies and make sure the "number" for a given table on all the copies is the same.

Absent such a function - we'll have to continue to count, and sum, and build something ourselves - maybe using ORA_HASH - but it will definitely be much less efficient than an Oracle supplied "ORA_HASH on steroids" function.
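
A hypothetical example of the kind of homegrown audit described here - counts, numeric sums and a summed ORA_HASH per partition, run on every copy and diffed (the table, partition and column names are invented):

select count(*)                                     as row_cnt,
       sum(total_charge)                            as charge_sum,
       sum(ora_hash(patient_id || '/' || lab_code)) as row_hash_sum
  from lab partition (p_2009_10);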

Before we go all out - I wanted to pick your brain in the hope we can spare ourselves all this development.

Makes sense?

Udi
Tom Kyte
October 26, 2009 - 1:41 pm UTC

... we have too much
processing for a single commercial machine to handle, ...

probably not with exadata - which is commercial.


And if you have a hard time handling the load NOW, why would you want to do this hashing (read that as "extremely cpu intensive thing") for all of the data (read that as "excessive, unnecessary IO").

If you have the money to finance ALL of the redundant copies
And their hardware
And your time
And your design efforts

then you probably have twice as much money as you need to right-size a machine, with technology (such as exadata) that can scale it out.

... Patients' lives are at stake and they want to
feel confident that the LAB table on server A is precisely identical to the LAB
table on server B.
...

and now I'm really afraid, for you have introduced huge latencies into the dispersal of really important data and you have synchronization issues.


... They are business people - and in their world 2 tables are identical ..

and (i've written this before) it is our job as TECHNOLOGY PEOPLE to tell the BUSINESS PEOPLE "we don't tell you how to accomplish your task, you don't tell us how to accomplish ours - we provide a service, you tell us what needs to be in place and we do it". The cost of what they propose is the Los Alamos budget you were talking about.


.. when all the loading is done ... and if you do it the way I suggest (transport), there is no loading, hence - you are done, at 1% of the cost (no load, no unload, no reindex, no compare)

.... an Oracle builtin function that will return some kind of "number" that
uniquely ...

as long as you are aware that when you hash an infinite number of inputs into a finite space - the word UNIQUE cannot be used - right - you know that unique doesn't come into play here.

... I wanted to pick your brain in the hope we can spare
ourselves all this development.

...


and I'll keep telling you "TRANSPORT", then you are done


It was making sense until you kept insisting that you need to do this, because you do not.


I'd like to invite you to the trenches.....

Udi Karni, October 30, 2009 - 6:16 pm UTC

Tom - this is all good advice - if life were only so simple. In corporate environments you can't always do what's technically "correct".

(1) Exadata is great - but if it's an "XYZ" shop and there's reluctance to buy HP/Sun hardware - there won't be any Exadata.

(2) Transportable TBS are great - but they involve moving datafiles across machines - for which you need system permissions, that are not always easy to obtain. With DB-Links you create a "secure tunnel" between DBs that is under your control and allows you to get the work done without mountains of red tape - even if DB-links have their challenges.

(3) Plus, if you have many tables that get updated throughout the day, and you want to propagate them as soon as they are ready - putting each one in a tiny tablespace and shipping it across - is using an 18-wheeler to transport an envelope. Transportable tablespaces were meant to transport an entire system or schema, not one table. Yes you can transport one table or one partition - but the overhead of the process, I hope you will agree, is way greater than just saying "insert into destination@db-link select from source" (sketched just after this list).

(4) And I particularly like your advice about telling the Business Bosses "you don't tell me how to accomplish my job...". In the world of HIPAA and SOX - when execs are facing 25 years in the slammer for data inequities - they WILL tell us which audit procedures they are comfortable with - or WE will spend 25 years DBA'ing in the north pole outpost. Come on.
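
The per-chunk db-link copy referred to in (3) amounts to something like the following (table, partition and link names are assumptions):

insert into lab@dest_db
select * from lab partition (p_west_coast);
commit;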

Life isn't perfect, but we still manage to run a 100 Terabyte data warehouse, and since we can't get the one machine to do it - we make copies across a bunch of machines, spread the load across them, and make it work that way. Yes, it adds up to 1 Petabyte of spinning Oracle copies. Yes, there's the propagation overhead. Yes, there are QA issues. But overall - not too many other realistic options for us. Trust me - we've considered many.

Our size doubles every year. It's not easy to get a new giant machine every year that's 2x its predecessor. It is, however, easy to get new smaller machines and upgrade the copies one-by-one, or get a few extra ones to accommodate a surge in growth.

So - I very much appreciate your thoughts and suggestions. We could, however, use a function that "represents" the contents of a table or partition. If you think about it - a terabyte table is really an 8-trillion-bit string - so there's your unique representation. I was hoping, though, that the genius developers in Oracle could find a way to compress the 8 trillion bits into something more compact, just as WinZIP compresses large files to very small ones without losing contents. Then we could compare source and destination strings and verify that they are equal. Will it use up cycles? No doubt. But better expend some cycles than spend some long winters in our north pole facility. I like California better.

Thanks in any event for your time and thoughts.
Tom Kyte
November 09, 2009 - 11:57 am UTC

I live in the trenches? I work in a large company. And I violently disagree with your opening statement.

violently. hugely. 100%


It is your job to actually educate the business. If you sort of told them the cost benefit - you and your team, you know, the people that actually know the technology - I have a feeling it would be different.

and you know what, if you are forced to work in a place where you are doing things you know to be utterly stupid, wasteful, the wrong way - how does that make you feel at the end of the day? Are you happy with your current place?


(1) so what, but these same people would consider a Netezza or a Teradata proprietary solution.

and exadata is not just hp, it is a san you can connect to things. sun is a huge player as well.

(2) so what, so what, so what. period - all I say to you is "so what". Do you have any technical points to make? I made quite a few.


as for your secure tunnel. Hmm, I hope you are using ASO or some other encryption technology. And it is as secure as moving files would be. If you do things in a professional manner.


dblinks not only have their challenges - but for data moves of this size (which frankly, as stated, SHOULD NOT BE DONE IN THE FIRST PLACE - the first "not smart" thing we are doing here is copying a ton of data) they are about the slowest way to do it. Think about that, think about that cost. 'Business' is about cost and getting things done fast - you are doing it the most costly, slowest way possible.


(3) you didn't even say you were doing that, if you change the problem, then I will change the answer.

we've been talking about using things like BCV's to move tons and tons of data - not a trickle feed.

Fine, you want to trickle feed just changes - use streams. Done - but that doesn't make your need to move tons of data go away and we are back to either

(a) DO NOT DO IT (that is best)
(b) DO IT RIGHT using transports.


(4) you again need to educate, tell them how insecure what you are doing is - if what you say is true, copying the data all over the place is the LAST THING THEY WANT. give me a *break*.




... but we still manage to run a 100 Terabyte data warehouse,
...

that is a medium, almost small, database these days, you are doing what hundreds of thousands of people do.


And you could get a machine to do a lot more.


... But overall - not too many other realistic
options for us. Trust me - we've considered many.
...

I disagree. You know, as an Oracle shareholder, I should be promoting this - big time. How much money are you burning to copy this to so many machines.

But I'm not, I'll persist in pushing the right agenda, not giving a stamp of approval to a really bad idea.


... I was hoping, though, that the genious
developers in Oracle could find a way to compress the 8 trillion bits into
something more compact, ...

you know what, we can, you just have to whip through 8 trillion bits first. Your analogy with winzip causes me huge confusion - I see not a single parallel??

And you would hope the genius developers where you are - could do this smarter?



Beware of hash cracking

Another reader, November 02, 2009 - 5:15 am UTC

As Tom Kyte already pointed out, a hash mathematically is a mapping of a huge set of possible values to a smaller set. It cannot be unique because if it were it would need the same amount of space to represent the same cardinality.

So your bosses who are afraid of going "to the slammer" bet on statistics: they hope you will not hit a collision of your hashes in your lifetime. This sounds too optimistic considering 100 TB of data. Let us assume you have 256-bit hashes, which are accepted as "safe". 256 bits = 32 bytes. Let us further assume you map one 1-TB table's contents to one hash value. This means 1,099,511,627,776 bytes map into a 32-byte value. So you will have guaranteed collisions

2^(1,099,511,627,776 / 32) = 2^34,359,738,368 (around 10^10,343,312,041)

times! And your bosses rely on hash values? Hmm. Maybe mathematics can convince them if sound advice cannot.


That's a little extreme.....

Udi karni, November 02, 2009 - 1:20 pm UTC

Reader from Germany - you raise a very good point, but I think we have a little latitude here.

Maybe take uptime as a guide - executives equate a system with 5 "9"s to a system that "never goes down" and make such promises to the board, stockholders, and public.

A 5 "9"s system - if I did my math right - is a system that doesn't go down for more than 30 seconds out of the 30 million seconds in a year - so 1 out of a million is considered an acceptable error.

A hash function that can reduce the 8 trillion bits of a 1 trillion byte (1 Terabyte) table by a factor of 1 million - down to an 8 Mbyte string - might just be an acceptable error. You compare two 8 Mbyte hash strings and if they are equal - you assume the entire 1 Terabyte is equal.

We do the same when comparing DNA, when taking a thumb print or a retina scan and allowing access - I doubt the computer checks every pixel - there must be some "formula" derived from the picture and compared.

So - there has got to be some reasonable way to do this. Maybe a function with a throttle parameter - the more accuracy you want - the bigger the string will be, and then the users can make an educated determination of how far they want to extend it.

I am not an expert on this, but in my humble opinion - in the world of data warehouse - you often end up with large objects that are copies of each other - and it would be very useful to have a function that could quickly tell you whether they are identical - without comparing every bit.

Comparison Exactness

Another Reader, November 02, 2009 - 2:20 pm UTC

Hi Udi,

I just wanted to show the numbers to give you a helping hand when arguing with management ;-)

When Oracle compares two tables and finds they *are* the same, they *are* the same with regard to their use by the appropriate application layer. Not necessarily bit by bit on the physical layer (one disk may have a different representation of "1" and "0" than another), but with respect to "select * from tab;".

And here comes the point: when you use transportable tablespaces you have them read-only. So you can mount them read-only from several sides *and* you can ensure they *are* the same - not just the same with regard to the equivalence class (mathspeak) of the same hash value.

So by transportable tablespaces you might provide an even "better equality" than by comparing hash values. What about that?

You are right...

Udi Karni, November 06, 2009 - 5:21 pm UTC

Another reader from Germany - you are right, and I should clarify 2 things -
(1) I completely agree that transportable tablespaces are a great way to both move the data and guarantee equality
(2) when discussing the 8 trillion bits I was speaking metaphorically - of course the hash string needs to represent the data from a user's perspective - put the 2 tables "side-by-side" and every column of every row must have the same value.

As I stated before -

The problem with transportable tablespaces is just that they're really meant for shipping large chunks - whole tablespaces, entire schemas. You could of course put a tiny table in a tablespace and ship it - but it's not exactly how they were meant to be used. In our environment - parts of the tables get updated at different times of the day (east coast earlier than west coast) and need to be propagated immediately - so a DB-link transfer in a SQL script is a very easy way to ship one chunk of a table that has just been updated, while the transportable tablespace is a little more cumbersome.

Also - transportable tablespaces require jumping between an Oracle process that loads the data into the tablespace and a system process that extracts the metadata, moves the datafiles, loads, connects on the other side, etc. In a highly secure shop - this requires much more setup and permissions, and is a lot less flexible than a db-link. So - the ability to get things done quickly without involving other groups (network, system) gets a few points as well.

Oracle obviously has a ton of features, and every customer uses the ones that work for them. Sometimes I am just amazed at how resistant Oracle is to adding features even if they are only useful to a small segment of the user base.

Examples that come to mind -
(1) the ability to rename a tablespace - a seemingly "nothing-to-it" feature that for many long years wasn't available.
(2) the lack for all these years of an "IsNumeric" builtin function (Googling for it still brings up Tom Kyte's code-it-yourself workaround). All kinds of stuff that exists in SQL Server - that brings no end of joy to the Microsoft crowd - even though theirs, ultimately, is an inferior product.

In any event - thanks for your feedback !

Simply agree...

Dana, November 18, 2009 - 4:48 pm UTC

This very much sounds like a presentation given several years ago by an esteemed member of the db community.
He owned a consulting company where he solemnly agreed with everything the customer wanted.
Need 2X the CPU... Great!
Need 10X the disk... Even better!!!
Need 100X the memory... That'll do it!!!

Sadly for the premise, it was all delivered with tongue firmly in cheek. But talk about funny!

Oracle Hash Function accepting multiple columns in Exadata?

Amit Nair, January 02, 2013 - 4:18 pm UTC

Hi Tom,

Thanks for your inputs on Oracle hash functions through various forums.

I noticed that you mentioned the following for a multi-column hash.
ora_hash( c1||'/'||c2||'/'|| ... , .... )

With Exadata or Oracle 11g, did Oracle introduce a hash function which can accept multiple columns as arguments and give a hash value for those columns? Something like:

hash(col1, col2, col3, ... )

I need it to store natural keys in tables; it would allow me to compress a multi-column natural key into a single column.

I have used a similar function on Vertica DBMS, but have not found an equivalent function in Oracle.

Thanks!
Tom Kyte
January 04, 2013 - 2:56 pm UTC

No, but there is no reason you cannot write your own....


.. Need it to store natural keys in tables, allows me to compress a multi-column
natural key into a single column. ...

be careful with that, hashing an infinite set of values into a finite set will result in collisions.
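
A sketch of the "write your own" route, wrapping DBMS_CRYPTO.HASH (the function name, delimiter and column types are assumptions; DBMS_CRYPTO requires an EXECUTE grant):

create or replace function multi_col_hash(
  p_c1 in varchar2,
  p_c2 in varchar2,
  p_c3 in varchar2 default null
) return raw
is
begin
  -- the delimiter keeps ('AB','C') from hashing the same as ('A','BC')
  return dbms_crypto.hash(
           utl_raw.cast_to_raw(p_c1 || '|' || p_c2 || '|' || p_c3),
           dbms_crypto.hash_sh1 );
end;
/
-- usage: select multi_col_hash(col1, col2, col3) from t;

As Tom notes, the result is a surrogate for the key, not a guaranteed-unique replacement for it.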

Comparing

Ondrej, March 13, 2014 - 3:23 pm UTC

Hello,

Please, what do you suggest to make sure that a table in a live customer database has identical data as a table in a testing database?

Different organizations are responsible for the two databases. There are very limited possibilities to copy the data between the two databases because of the organizational responsibility. Live access is limited. There's a long way from corporation A to corporation B and we just want to make sure that no Joe from corporation B did something to the data prepared by Fred from corporation A, and if someone did, we want to identify it asap and simply.

In my opinion, this is a very valid scenario where I do want to compare two tables. I did not read all the long responses in detail, but I understood that your initial response was to tell me that I don't want to compare two tables. I have considered it, very seriously, but still I do insist. I do want it. In my opinion, above is a valid example of a situation where people do want it, and I believe there is quite a lot of other valid scenarios encountered by many people every day.

Limitations of DB_LINKS aren't considered in reply

george, February 28, 2022 - 10:14 pm UTC

Apparently, for example, I can't move a BLOB over a DB_LINK. So this generic problem of needing something that does go across the wire isn't a frivolous requirement.

In my case, the checksums aren't used on code but on images that are stored as BLOBs. I want to know what changed, to decide if I can safely use what's in my DEV.
Connor McDonald
March 02, 2022 - 2:39 am UTC

Hash them on each side of the db link, ie

select my_hash_function(blob_col) from mytable@db_link;

select my_hash_function(blob_col) from my_local_table;

They can be compared like that.
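
One possible body for the hypothetical my_hash_function, using the BLOB overload of DBMS_CRYPTO.HASH (an assumption, not Connor's definition); creating it in both databases keeps the hashing next to the data:

create or replace function my_hash_function(p_blob in blob) return raw
is
begin
  return dbms_crypto.hash(p_blob, dbms_crypto.hash_sh1);
end;
/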
