Skip to Main Content
  • Questions
  • A tool for data inflation for volume tests

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Orna.

Asked: May 26, 2002 - 4:35 pm UTC

Last updated: March 30, 2008 - 8:33 am UTC

Version: 8.1.7

Viewed 1000+ times

You Asked

Hi,

Are you aware of any good tool ( Oracle's, third party ) - that can artificialy inflate a small amount of data intelligently - so we can perform volume tests when we do not have a large volume yet ?

this is a very common situation with our projects - many times we still do not have large volume conversion or production data - but we still need to test our software on a large volume

thanks a lot
Orna

and Tom said...

No, I am not. I typically have as part of my implementation plan a test data generator when needed for volume testing like this.

Anyone out there aware of anything more "intelligent"?

Rating

  (6 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

A tool for data inflation for volume tests

George Spears, May 28, 2002 - 9:21 am UTC

You can use any of the following tools

PumpItUp, </code> http://www.hallogram.com/pumpitup/
CA-Datamacs - 
http://www3.ca.com/Files/ProductAnnouncements/datamacs_12pa.pdf
Data Xccelerator - 
http://www.tangiblesoftware.com/products/xccelerator/ <code>

another good tool

Paul Druker, May 28, 2002 - 10:23 am UTC

You may also want to look at DataFactory:

</code> http://www.quest.com/datafactory/ <code>

The solution provided was great...........But

shekar, March 27, 2008 - 12:49 pm UTC

The third party tools you given are helpful.... but how could i implement the same logic in my project without using any third party toos.......

Please suggest me if oracle provides any special utilities or functions for this purpose...

Thanks in Advance,
Tom Kyte
March 27, 2008 - 1:59 pm UTC

there is nothing included in the database to do that, no.

inflate a small amount of data intelligently?

Duke Ganote, March 27, 2008 - 2:50 pm UTC

Doesn't this require guesstimating data distributions, skew, and the like? We can use the 'infinite dual'
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:40476301944675
and DBMS_RANDOM, if we're willing to make assumptions.

For example, someone recently posted a question about a "query that's currently taking 2 hours to join the main table S_F_ASSET_VALUE with the Calendar table to get back one row for each day that falls between VALUATION_TYPE_DATE_FROM and VALUATION_TYPE_DATE_TO. So if VALUATION_TYPE_DATE_FROM is 01.Jan.2008 and VALUATION_TYPE_DATE_FROM is 13.Jan.2008, we'd get back 13 rows. It's possible that a row is "OPEN" until the end of times (our max date is 31.12.2099) ... there's also something called the Official date, which is the last day until which dates should be cloned...
There are about 600k rows in the main table when doing the initial load. Calendar's got all the rows from the 1950's until far in the future, and the cloning produces about 1.7 million rows in total."

They provided the query. I was able to construct a similar scenario using DBMS_RANDOM and the 'infinite dual' like so:

-- **********************************************
-- SET UP TABLES AND ROWS ***********************
-- **********************************************
create table d_calendar as
select to_number(
to_char(
to_date('19491231','YYYYMMDD')+level -- all the rows from the 1950's
,'yyyymmdd')
) calendar_dw_id
, to_date('19491231','YYYYMMDD')+level calendar_dt
from dual connect by level < 150*366 -- until far in the future
/
alter table d_calendar modify
( calendar_dw_id primary key
, calendar_dt unique )
/
create table S_F_ASSET_VALUE as
Select dt AS VALUATION_TYPE_DATE_FROM
, dt+DBMS_RANDOM.VALUE(1,10)
AS VALUATION_TYPE_DATE_TO
from (
select to_date('19491231','YYYYMMDD')
+ DBMS_RANDOM.VALUE(1,150*365)
AS dt
from dual
connect by level < 600000 -- 600k rows in the main table
)
/

-- **********************************************
-- QUERY ****************************************
-- **********************************************
SELECT count(*) FROM
S_F_ASSET_VALUE,
(select 20100101 DATE_OFFICIAL from dual )DATE_OFFICIAL,
d_calendar
where (calendar_dw_id between
greatest(
to_number(
to_char(VALUATION_TYPE_DATE_FROM ,'yyyymmdd')
),20000101
)
and
least(
to_number(
to_char(
VALUATION_TYPE_DATE_TO,'yyyymmdd')
),DATE_OFFICIAL)
)
or
(calendar_dw_id=to_number(to_char(VALUATION_TYPE_D ATE_FROM,'yyyymmdd'))
and calendar_dw_id<20000101)
/

Sure it's time-consuming (in terms of planning and the like), but generating data is pretty quick in implementation.
Tom Kyte
March 30, 2008 - 8:33 am UTC

... Doesn't this require guesstimating data distributions, skew, and the like? ...

yes, it does sort of, or at least analyzing the existing data and inflating it using the observed values - also you have domain value ranges to obey, uniqueness rules, foreign keys, etc...


if you just want random data, I've used this in the past:

create or replace procedure gen_data( p_tname in varchar2, p_records in number )
authid current_user
as
    l_insert long;
    l_rows   number default 0;
begin

    dbms_application_info.set_client_info( 'gen_data ' || p_tname );
    l_insert := 'insert /*+ append */ into ' || p_tname ||
                ' select ';

    for x in ( select data_type, data_length,
                nvl(rpad( '9',data_precision,'9')/power(10,data_scale),9999999999) maxval
                 from user_tab_columns
                where table_name = upper(p_tname)
                order by column_id )
    loop
        if ( x.data_type in ('NUMBER', 'FLOAT' ))
        then
            l_insert := l_insert || 'dbms_random.value(1,' || x.maxval || '),';
        elsif ( x.data_type = 'DATE' )
        then
            l_insert := l_insert ||
                  'sysdate+dbms_random.value+dbms_random.value(1,1000),';
        else
            l_insert := l_insert || 'dbms_random.string(''A'',' ||
                                       x.data_length || '),';
        end if;
    end loop;
    l_insert := rtrim(l_insert,',') ||
                  ' from all_objects where rownum <= :n';

    loop
        execute immediate l_insert using p_records - l_rows;
        l_rows := l_rows + sql%rowcount;
        commit;
        dbms_application_info.set_module( l_rows || ' rows of ' || p_records, '' );
        exit when ( l_rows >= p_records );
    end loop;
end;
/

Article on random data generation

Vinay, April 08, 2008 - 3:34 am UTC

This may be of help perhaps -

http://www.dbazine.com/olc/olc-articles/nanda13

URL for "The Art and Science of Randomization in Oracle"

Duke Ganote, April 14, 2008 - 11:28 am UTC

Vinay's URL is intended to point to Arup Nanda's article entitled "The Art and Science of Randomization in Oracle" which I found either by adding the slash at the end:
http://www.dbazine.com/olc/olc-articles/nanda13/
or via search:
http://www.dbazine.com/olc/olc-articles/nanda13/view?searchterm=nanda%20randomization