You Asked

here is the test.

create table t1(id int, name char(10));
create table t2(id int, name char(12));
insert into t1 values(100, 'yang');
insert into t2 values(200, 'yang');
commit;

-- return 1 row
select * from t1, t2 where t1.name = t2.name and t1.name = 'yang';

-- return 1 row
select * from t1, t2 where t1.name = t2.name and t1.name = rpad('yang', 10);

-- return 1 row
with tmp as (
    select id, name from t2 where name = rpad('yang', 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = 'yang';

-- return 0 row
with tmp as (
    select id, name from t2 where name = rpad('yang', 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = rpad('yang', 10);

Why the last query return 0 row?

and Chris said...

The database sees t1.name = t2.name and t2.name = rpad('yang', 12), so has added the predicate t1.name = rpad('yang', 12). It does similar for rpad('yang', 10).

You can see this by looking at the predicates section of the plan:

set serveroutput off
with tmp as (
    select id, name from t2 where name = rpad('yang', 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = rpad('yang', 10);

select * from dbms_xplan.display_cursor();

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |       |       |     1 (100)|          |
|*  1 |  FILTER             |      |       |       |            |          |
|*  2 |   HASH JOIN         |      |     1 |    52 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| T1   |     1 |    25 |     3   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |     1 |    27 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(NULL IS NOT NULL)
   2 - access("T1"."NAME"="NAME")
   3 - filter(("T1"."NAME"='yang      ' AND "T1"."NAME"='yang        '))
   4 - filter(("NAME"='yang        ' AND "NAME"='yang      '))

But there's a mix of blank-padded and nonpadded comparisons in this case.

char_col = char_col => blank-padded comparison, so trailing spaces are added to make the strings the same length
char_col = rpad ( ... ) => nonpadded comparison, so the strings must match exactly - without adding trailing spaces.

So

t1.name = rpad ( 'yang', 12 ) is false => no rows as in your final query.

I believe adding the extra predicates in this case is incorrect (i.e. the last query should return 1 row) and have raised this internally.

Rating

(2 ratings)

Comments

Ummm...

Chuck Jolley, February 19, 2025 - 11:01 pm UTC

My takeaway is "Never use CHAR columns."

February 20, 2025 - 9:59 am UTC

They can be useful if you're working with strings that are always exactly N characters (think ISO country & currency codes). But yes, generally it's better to use varchar2 for character data.

DarkAthena, February 20, 2025 - 7:32 am UTC

using cast function or UDF rpad or sys.standard.rpad instead of internal rpad,can get right result

--return 1 row
with tmp as (
    select id, name from t2 where name = sys.standard.rpad('yang', 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = sys.standard.rpad('yang', 10);

--return 1 row
with tmp as (
    select id, name from t2 where name = cast('yang        ' as varchar2(12))
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = cast('yang      ' as varchar2(10));

--return 1 row

create or replace function  RPAD2(STR1 VARCHAR2 CHARACTER SET ANY_CS,
                LEN pls_integer,
                PAD VARCHAR2 CHARACTER SET STR1%CHARSET)
        return VARCHAR2 CHARACTER SET STR1%CHARSET is
begin
  return rpad(STR1,len,pad);
end;   
/
with tmp as (
    select id, name from t2 where name = rpad2('yang', 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = rpad2('yang', 10);

So, I guess the possible reasons are as follows:
when all the input parameters of the rpad function are constants, the SQL optimizer automatically optimizes it to a constant of type varchar2,
so the rpad function doesn't show up in the execution plan.
At this point, the filtering condition values are no longer a complex expression but just two simple constant values, making it quick to determine if these two constants are equal.
Since there's a condition "T1.NAME=NAME," the optimizer automatically adds the "NULL IS NOT NULL" filtering condition after comparing these two constants, leading to incorrect results.
If one of the parameters of the rpad function is changed to a non-constant value, like a column name, then the optimizer won't remove the rpad, so it can't compare the two values in advance, and "null is not null" won't be added.

--return 1 row
with tmp as (
    select id, name from t2 where name = rpad(name, 12)
    )
select * from t1, tmp where t1.name = tmp.name and t1.name = rpad(t1.name, 10);

Comparing different types of data like "char=varchar2" shouldn't lead to additional filtering conditions in the optimizer，but it do

February 20, 2025 - 10:00 am UTC

You're right, it shouldn't lead to additional filtering, which is why I've raised this internally.

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

Ummm...

More to Explore

DBMS_XPLAN

two table join with char return 0 row

Breadcrumb

Question and Answer

You Asked

and Chris said...

Rating

Comments

Ummm...

More to Explore

DBMS_XPLAN