Question and Answer

Connor McDonald

Thanks for the question, swadesh.

Asked: September 28, 2016 - 8:31 pm UTC

Last updated: October 03, 2016 - 12:09 am UTC

Version: Oracle 11g R2

Viewed 1000+ times

You Asked

Hi Tom,

I have a table in a PCA database (11g R2) with 30 columns. It holds POS data. Data volume is very high: about 5 TB in the first year, and approximately 6 TB over the next 3 years. This table will be used for huge inserts (Informatica will be used for the load), around 25 million transactions per day from an open source table. The insertion must complete within 2-3 hours during non-business hours, probably at night. The table will also be used for analytics, reading around 25 million transactions per day, but the writes and reads will happen at different times.

1. There is an ID column in this table that is near to a unique key. Would hash partitioning on that ID into 8 partitions be best for this table's performance?
2. Or what partitioning scheme is required to make both inserts and reads very fast?

Please reply, this is very urgent.

Regards,
Swadesh Mondal

and Connor said...

Several things to consider here

1) A hash partitioned *table* is basically for taking something huge and making it into smaller, more manageable chunks. You might get administrative benefits, but you are unlikely to get performance benefits. In particular, the partitioning strategy for the *table* is more likely to be a combination/compromise between manageability and the types of queries you run against it (eg large date range queries possibly suggest partitioning by a date range).
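
For illustration only, here is a sketch of a date range partitioned table using interval partitioning (available in 11g); the table and column names are hypothetical, not taken from the question:

-- Sketch: one partition per day is created automatically as data arrives.
CREATE TABLE pos_txn
( txn_id            NUMBER,
  transaction_date  DATE,
  store_id          NUMBER,
  sales_amt         NUMBER
)
PARTITION BY RANGE (transaction_date)
INTERVAL (NUMTODSINTERVAL(1,'DAY'))
( PARTITION p_initial VALUES LESS THAN (DATE '2016-01-01') );

Date range queries can then prune to just the partitions they touch, and ageing out old data becomes a partition-level operation rather than a large delete.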

2) A hash partitioned *index* is for handling high concurrency insertion into the table. Note, this is not the same as high *volume*. If you have only a handful of sessions (or one) doing the large load, then hash partitioning the index is unlikely to be of use.
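
As a hedged sketch (hypothetical names again), a globally hash partitioned index spreads concurrent inserts on an ever-increasing key across several index segments:

-- Only worthwhile when many sessions insert concurrently and contend
-- for the right-hand edge of the index.
CREATE INDEX pos_txn_id_ix ON pos_txn (txn_id)
  GLOBAL PARTITION BY HASH (txn_id) PARTITIONS 8;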

The fastest insert is a combination of:
- the fewest indexes you can afford
- loading data in bulk (not one row at a time)
- using direct mode facilities where possible (sketched below)
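
A minimal direct-path load sketch, assuming a hypothetical staging table POS_TXN_STAGE that Informatica populates first:

-- The APPEND hint requests a direct-path insert: blocks are formatted in bulk
-- and written above the high water mark, with minimal undo.
INSERT /*+ APPEND */ INTO pos_txn
SELECT txn_id, transaction_date, store_id, sales_amt
FROM   pos_txn_stage;

-- The loaded data cannot be queried in this session until the commit.
COMMIT;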


Comments

Swadesh

A reader, September 29, 2016 - 4:38 am UTC

What is POS data?
What does "near unique key" mean?
What is the table definition?
What do the queries on this table look like?
What DML happens apart from the inserts?
What are the strategies regarding stats gathering?
...


swadesh Mondal, September 29, 2016 - 9:07 am UTC

Hi,

Thanks for the prompt reply.

Below I have given my table structure.

CREATE TABLE "C360_DWH"."STG_F_CDW_POS"
( "STORE_ID" NUMBER(4,0),
"TRANSACTION_DATE" DATE,
"NK_GTIN_PLU_ID" NUMBER(19,0),
"NK_ENCRYPTEDCARDNO" VARCHAR2(255 BYTE),
"NK_EVENT_ID" NUMBER(15,0),
"NK_OFFER_ID" NUMBER(15,0),
"NK_MECHANIC_ID" NUMBER(15,0),
"PEO_OFFER_ID" NUMBER(13,0),
"RECEIPT_DD_KEY" NUMBER(20,0),
"NK_HR_MIN_CHAR" VARCHAR2(8 BYTE),
"SALES_AMT_INCL_VAT" NUMBER(4,0),
"QUANTITY_MEASURE" NUMBER(4,0),
"VAT_AMT" NUMBER(4,0),
"DEPOSIT_VALUE_AMT" NUMBER(4,0),
"DISCOUNT_CENTRAL_AMT" NUMBER(4,0),
"DISCOUNT_LOCAL_AMT" NUMBER(4,0),
"DISCOUNT_PEO_OFFER_AMT" NUMBER(4,0),
"CURRENT_BUYING_PRICE" NUMBER(4,0),
"CURRENT_SALES_PRICE" NUMBER(4,0),
"QUANTITY_DISCOUNT" NUMBER(4,0),
"LOCAL_EVENT_IND" NUMBER(1,0),
"NON_SALABLE_IND" NUMBER(1,0),
"DISCOUNT_COUPON_AMT" NUMBER(4,0),
"COMPENSATION_VALUE_COUPON" NUMBER(4,0),
"COMPENSATION_VALUE_PEO" NUMBER(4,0),
"VAT_AMT_AFTER_DISC" NUMBER(4,0),
"NO_OF_ITEMS" NUMBER(9,0),
"WEIGHT_OR_VOLUME" NUMBER(4,0),
"RECEIPT_TYPE" VARCHAR2(255 BYTE),
"SELFSCANNING_WAS_USED" NUMBER(1,0),
"ICA_CARD" NUMBER(1,0),
"ICA_BONUS_ONLY" NUMBER(1,0),
"OTHER_CARD" NUMBER(1,0),
"PERS_CHECK" NUMBER(1,0),
"CASH" NUMBER(1,0),
"CREDIT" NUMBER(1,0),
"VOUCHER" NUMBER(1,0),
"REQUISITION" NUMBER(1,0),
"FOREIGN_CURRENCY" NUMBER(1,0),
"BONUS_CHECK" NUMBER(1,0),
"ROUNDING" NUMBER(1,0),
"INSERTED_DT" DATE
);

Here only one column, RECEIPT_DD_KEY, is unique.

On this table both bulk data loads and one-row-at-a-time loads will happen (Informatica will be used for the load). A maximum of 25 million rows will be inserted per day.

Also, the Adobe CRM tool will read 25 million rows per day for analytics purposes. In this case, what would be the ideal table structure?

Which partitioning can I use here, and on which column can I create an index, for the best performance for both reads and writes? Please advise.

(We are using Oracle 11g R2 on PCA. If we use an Exadata X3-2 quarter rack, are there any differences or extra configuration required? Please advise.)

Regards,
Swadesh Mondal

Connor McDonald
October 03, 2016 - 12:09 am UTC

On this table both bulk data loads and one-row-at-a-time loads will happen (Informatica will be used for the load). A maximum of 25 million rows will be inserted per day.

=> that depends. You don't say whether you are bulk loading 90% of the rows, or 0.9% of the rows.

Also, the Adobe CRM tool will read 25 million rows per day for analytics purposes. In this case, what would be the ideal table structure?

=> that depends on the queries you want to run.

Which partitioning can I use here, and on which column can I create an index, for the best performance for both reads and writes? Please advise.

=> that depends on the queries you want to run, which ones are most important, and what write options you have (eg direct load, exchange partition, etc).
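
For illustration, an exchange partition load might look broadly like the sketch below (hypothetical names, and it assumes the target table is range partitioned): the day's rows are loaded and indexed in a standalone staging table, then swapped into the partitioned table as a single dictionary operation.

-- Swap the staging segment in as partition P_20160929 of the partitioned table.
ALTER TABLE pos_txn
  EXCHANGE PARTITION p_20160929
  WITH TABLE pos_txn_stage
  INCLUDING INDEXES
  WITHOUT VALIDATION;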

(We are using Oracle 11g R2 on PCA. If we use an Exadata X3-2 quarter rack, are there any differences or extra configuration required? Please advise.)

=> I don't know what a "PCA" is. Private Cloud Appliance? But Exadata has different mechanisms for optimizing data access, which can impact your design decisions.
