Skip to Main Content
  • Questions
  • Varchar2 declared size and sql performance.

Breadcrumb

Question and Answer

Chris Saxon

Thanks for the question, Tomas.

Asked: May 28, 2024 - 7:53 pm UTC

Last updated: May 31, 2024 - 5:11 pm UTC

Version: 19c

Viewed 1000+ times

You Asked

Dear Ask Tom team,

Could you please clarify why using a longer declared size of VARCHAR2 affects CPU consumption during SQL execution? Assuming that there are no essential flaws in my benchmark design.

Based on my findings, the declared size of VARCHAR2 does impact SQL performance within an Oracle data warehouse database (not an autonomous database).

I understand that precise VARCHAR2 sizing is crucial as a constraint for data quality and for optimizing performance and memory allocation in client applications, as they allocate memory based on the declared VARCHAR2 size.

However, I have been pondering whether the declared size of VARCHAR2 columns also affects performance within the Oracle database itself.

Here’s the setup for my benchmark:

Two pairs of tables with identical data, structure, environment, and schema.
The only difference is that the first pair has all VARCHAR2 columns with a declared size of 1, while the other pair has columns with a declared size of 4000.
I conducted experiments with various lengths and joins, and consistently found that longer declared VARCHAR2 sizes were slower. The most significant difference occurred when joining the tables on VARCHAR2 columns (without indexes), which aligns with the common scenario in data warehouses of full scans and hash joins on relevant data.

As for the choice of 5 million rows, it is still relatively small compared to our data warehouse workload, but it provides a sufficiently large example. The decision to run the benchmark 15 times was driven by the desire for more representative results, leveraging big numbers and probability.

-- first pair

create table varchar2_precise1 (table_key integer,
                                 string1 varchar2(1), 
                                 string2 varchar2(1), 
                                 string3 varchar2(1),
                                 string4 varchar2(1),
                                 string5 varchar2(1));
                                 
 create table varchar2_precise2 (table_key integer,
                                 string1 varchar2(1), 
                                 string2 varchar2(1), 
                                 string3 varchar2(1),
                                 string4 varchar2(1),
                                 string5 varchar2(1));
                                 
 -- second pair    
 
                                 
                                 
  create table varchar2_long1 (table_key integer,
                                 string1 varchar2(4000), 
                                 string2 varchar2(4000), 
                                 string3 varchar2(4000),
                                 string4 varchar2(4000),
                                 string5 varchar2(4000));
                                 
                                 
  create table varchar2_long2 (table_key integer,
                                 string1 varchar2(4000), 
                                 string2 varchar2(4000), 
                                 string3 varchar2(4000),
                                 string4 varchar2(4000),
                                 string5 varchar2(4000));
    
  -- table for data, we need exactly the same data.
     
     create table varchar2_data1 (table_key integer,
                                 string1 varchar2(1), 
                                 string2 varchar2(1), 
                                 string3 varchar2(1),
                                 string4 varchar2(1),
                                 string5 varchar2(1));
  
    create table varchar2_data2 (table_key integer,
                                 string1 varchar2(1), 
                                 string2 varchar2(1), 
                                 string3 varchar2(1),
                                 string4 varchar2(1),
                                 string5 varchar2(1));

  -- write data into data tables.
   
    insert into varchar2_data1 select level,dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1) 
                                                 from dual connect by level <=5000000;
    commit;
    insert into varchar2_data2 select level,dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1),
                                               dbms_random.string(opt => 'a',len => 1)
                                                 from dual connect by level <=5000000;
    commit;
                                         
   -- write data into tables used in selects:

    insert into varchar2_precise1 select * from varchar2_data1;
    commit;                                           
    insert into varchar2_precise2 select * from varchar2_data2;
    commit;
    insert into varchar2_long1 select * from varchar2_data1;
    commit;                                          
    insert into varchar2_long2 select * from varchar2_data2;
    commit;
     
    -- count rows check.

       select count (*) from varchar2_precise1 t join varchar2_long1 i on t.table_key = i.table_key ;
      select count (*) from varchar2_precise1 t join varchar2_long2 i on t.table_key = i.table_key ;
      
   -- statistics, for replication, schema must be changed to your schema.  
      begin
        dbms_stats.gather_table_stats(ownname =>'L2_DYN_MAP' ,tabname =>'varchar2_precise1');
        dbms_stats.gather_table_stats(ownname =>'L2_DYN_MAP' ,tabname =>'varchar2_precise2');
        dbms_stats.gather_table_stats(ownname =>'L2_DYN_MAP' ,tabname =>'varchar2_long1');
        dbms_stats.gather_table_stats(ownname =>'L2_DYN_MAP' ,tabname =>'varchar2_long2');
      end;


  Benchmark for varchar2(1) tables hash join.

alter session set statistics_level = 'all';

begin 
  dbms_application_info.set_module(module_name => 'varchar2_precise_run1.1',action_name => 'varchar2_precise_run1.1');
end;

declare
   l_count integer;
    l_time_start_all  integer;
    l_time_end_all integer;
    l_time_all integer;
    l_time_start  integer;
    l_time_end integer;
    l_time integer;
  begin
      l_time_start_all := dbms_utility.get_cpu_time;
       for i in 1..15 loop
         
          l_time_start := dbms_utility.get_cpu_time;
           
          select /* MONITOR USE_HASH (t i) */ count(*) into l_count from varchar2_precise1 t join varchar2_precise2 i
                                                          on t.table_key = i.table_key
                                                           where t.string1 = i.string1;
          l_time_end := dbms_utility.get_cpu_time;
          l_time := (l_time_end - l_time_start);
          dbms_output.put_line('l_time run: '||i||' = '||l_time);
          
       end loop;
   
   
   l_time_end_all := dbms_utility.get_cpu_time;
   l_time_all := (l_time_end_all - l_time_start_all);
   dbms_output.put_line('l_count = '||l_count);
   dbms_output.put_line('l_time_all = '||l_time_all);
 end;

results:
l_time run: 1 = 127
l_time run: 2 = 103
l_time run: 3 = 115
l_time run: 4 = 112
l_time run: 5 = 126
l_time run: 6 = 116
l_time run: 7 = 114
l_time run: 8 = 117
l_time run: 9 = 118
l_time run: 10 = 112
l_time run: 11 = 137
l_time run: 12 = 117
l_time run: 13 = 111
l_time run: 14 = 125
l_time run: 15 = 121
l_count = 95942
l_time_all = 1771



 Benchamrk for varchar2(4000) :

alter session set statistics_level = 'all';
begin 
  dbms_application_info.set_module(module_name => 'varchar2_long_run1.1',action_name =>'varchar2_long_run1.1' );
end;

declare
   l_count integer;
    l_time_start_all  integer;
    l_time_end_all integer;
    l_time_all integer;
    l_time_start  integer;
    l_time_end integer;
    l_time integer;
  begin
      l_time_start_all := dbms_utility.get_cpu_time;
       for i in 1..15 loop
         
          l_time_start := dbms_utility.get_cpu_time;
           
          select /* MONITOR USE_HASH (t i) */ count(*) into l_count from varchar2_long1 t join varchar2_long2 i
                                                          on t.table_key = i.table_key
                                                           where t.string1 = i.string1;
          l_time_end := dbms_utility.get_cpu_time;
          l_time := (l_time_end - l_time_start);
          dbms_output.put_line('l_time run: '||i||' = '||l_time);
          
       end loop;
   
   
   l_time_end_all := dbms_utility.get_cpu_time;
   l_time_all := (l_time_end_all - l_time_start_all);
   dbms_output.put_line('l_count = '||l_count);
   dbms_output.put_line('l_time_all = '||l_time_all);
 end;

results:

l_time run: 1 = 160
l_time run: 2 = 172
l_time run: 3 = 162
l_time run: 4 = 166
l_time run: 5 = 151
l_time run: 6 = 193
l_time run: 7 = 165
l_time run: 8 = 163
l_time run: 9 = 169
l_time run: 10 = 171
l_time run: 11 = 164
l_time run: 12 = 164
l_time run: 13 = 154
l_time run: 14 = 171
l_time run: 15 = 147
l_count = 95942
l_time_all = 2472


dbms_xplan.display_cursor :


SQL_ID  4nhywrd3fza4s, child number 0
-------------------------------------
SELECT COUNT(*) FROM VARCHAR2_PRECISE1 T JOIN VARCHAR2_PRECISE2 I ON 
T.TABLE_KEY = I.TABLE_KEY WHERE T.STRING1 = I.STRING1
 
Plan hash value: 2910278931
 
-------------------------------------------------------------------------------------------------
| Id  | Operation           | Name              | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                   |       |       |       | 17797 (100)|          |
|   1 |  SORT AGGREGATE     |                   |     1 |    16 |       |            |          |
|*  2 |   HASH JOIN         |                   |  5000K|    76M|    95M| 17797   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| VARCHAR2_PRECISE1 |  5000K|    38M|       |  4155   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| VARCHAR2_PRECISE2 |  5000K|    38M|       |  4120   (1)| 00:00:01 |
-------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("T"."STRING1"="I"."STRING1" AND "T"."TABLE_KEY"="I"."TABLE_KEY")
 


SQL_ID  093xashuzfpxd, child number 0
-------------------------------------
SELECT COUNT(*) FROM VARCHAR2_LONG1 T JOIN VARCHAR2_LONG2 I ON 
T.TABLE_KEY = I.TABLE_KEY WHERE T.STRING1 = I.STRING1
 
Plan hash value: 1466879629
 
----------------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |       |       |       | 17554 (100)|          |
|   1 |  SORT AGGREGATE     |                |     1 |    16 |       |            |          |
|*  2 |   HASH JOIN         |                |  5000K|    76M|    95M| 17554   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| VARCHAR2_LONG1 |  5000K|    38M|       |  4016   (1)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| VARCHAR2_LONG2 |  5000K|    38M|       |  4016   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("T"."STRING1"="I"."STRING1" AND "T"."TABLE_KEY"="I"."TABLE_KEY")
 



and Chris said...

Thanks for the detailed test case.

Although you've only loaded single-length strings now, at any point someone could insert a value up to 4,000 bytes. This limits optimizations the database can do.

This is clearest with rowsets (added in 12.1). This allows operations to pass rows up to their parent operation in a plan in batches.

If you look at the projection information for the two queries, you'll see that the long tables have a maximum rowset=15. Whereas precise has rowset=256 when reading from the tables and rowset=1019 for the hash join:

select  count(*) from varchar2_long1 t join varchar2_long2 i
on t.table_key = i.table_key where t.string1 = i.string1;

select * from dbms_xplan.display_cursor ( format => 'ALLSTATS LAST +PROJECTION -PREDICATE');

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                |      1 |        |      1 |00:00:01.52 |   28900 |       |       |          |
|   1 |  SORT AGGREGATE     |                |      1 |      1 |      1 |00:00:01.52 |   28900 |       |       |          |
|   2 |   HASH JOIN         |                |      1 |   5000K|   5000K|00:00:01.50 |   28900 |   263M|    19M|  245M (0)|
|   3 |    TABLE ACCESS FULL| VARCHAR2_LONG1 |      1 |   5000K|   5000K|00:00:00.11 |   14450 |       |       |          |
|   4 |    TABLE ACCESS FULL| VARCHAR2_LONG2 |      1 |   5000K|   5000K|00:00:00.11 |   14450 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT(*)[22]
   2 - (#keys=2; rowset=15)
   3 - (rowset=15) "T"."TABLE_KEY"[NUMBER,22], "T"."STRING1"[VARCHAR2,4000]
   4 - (rowset=15) "I"."TABLE_KEY"[NUMBER,22], "I"."STRING1"[VARCHAR2,4000]

select count(*) from varchar2_precise1 t join varchar2_precise2 i
on t.table_key = i.table_key where t.string1 = i.string1;

select * from dbms_xplan.display_cursor ( format => 'ALLSTATS LAST +PROJECTION -PREDICATE');

------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name              | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                   |      1 |        |      1 |00:00:01.17 |   28900 |       |       |          |
|   1 |  SORT AGGREGATE     |                   |      1 |      1 |      1 |00:00:01.17 |   28900 |       |       |          |
|   2 |   HASH JOIN         |                   |      1 |   5000K|   5000K|00:00:01.17 |   28900 |   254M|    20M|  236M (0)|
|   3 |    TABLE ACCESS FULL| VARCHAR2_PRECISE1 |      1 |   5000K|   5000K|00:00:00.09 |   14450 |       |       |          |
|   4 |    TABLE ACCESS FULL| VARCHAR2_PRECISE2 |      1 |   5000K|   5000K|00:00:00.09 |   14450 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - (#keys=0) COUNT(*)[22]
   2 - (#keys=2; rowset=1019)
   3 - (rowset=256) "T"."TABLE_KEY"[NUMBER,22], "T"."STRING1"[VARCHAR2,1]
   4 - (rowset=256) "I"."TABLE_KEY"[NUMBER,22], "I"."STRING1"[VARCHAR2,1]


This is because the database needs to allow more memory for each row for the table with larger varchar2 columns - even if it doesn't need it.

So the precise table can read rows in batches (256/15) ~ 17 larger => faster queries. If you disable rowsets you'll find both queries are much slower and the difference between them is smaller.

Mike Dietrich discusses rowsets in more detail at https://mikedietrichde.com/2016/10/11/_rowsets_enabled-apply-patch-and-use-the-default/

Rating

  (1 rating)

Comments

Thank you.

Thomas, May 30, 2024 - 4:01 pm UTC

Thanks a lot dear Chris.

Excellent explanation.

All the best.
Thomas.
Chris Saxon
May 31, 2024 - 5:11 pm UTC

You're welcome

More to Explore

DBMS_XPLAN

More on PL/SQL routine DBMS_XPLAN here