Skip to Main Content
  • Questions
  • How can I implement a “natural sort” order-by in a SQL query using REGEXP_LIKE ?

Breadcrumb

May 4th

Question and Answer

Connor McDonald

Thanks for the question, AKASH.

Asked: January 20, 2017 - 7:25 am UTC

Last updated: January 20, 2017 - 4:48 pm UTC

Version: 11g

Viewed 1000+ times

You Asked

e.g.

@foo
foo
foo1
foo2
foo10
foo100

Rather than

foo1
foo10
foo100
foo2
@foo
foo


and Chris said...

Why regexp_like? Surely regexp_substr is what you're looking for?

You can use this to strip out all non-numbers, then find the just the numbers:

with rws as (
  select '@foo' x from dual union all
  select 'foo' x from dual union all
  select 'foo1' x from dual union all
  select 'foo2' x from dual union all
  select 'foo10' x from dual union all
  select 'foo100' x from dual 
)
  select x, regexp_substr(x, '[^0-9]*'), regexp_substr(x, '[0-9]+') from rws
  order  by regexp_substr(x, '[^0-9]*'), regexp_substr(x, '[0-9]+') nulls first;

X       REGEXP_SUBSTR(X,'[^0-9]*')  REGEXP_SUBSTR(X,'[0-9]+')  
@foo    @foo                                                   
foo     foo                                                    
foo1    foo                         1                          
foo10   foo                         10                         
foo100  foo                         100                        
foo2    foo                         2  


Note this only works if you're sure numbers will appear at the end of your string...

with rws as (
  select '@foo' x from dual union all
  select 'foo' x from dual union all
  select 'foo1' x from dual union all
  select 'foo2' x from dual union all
  select 'foo10' x from dual union all
  select 'foo100' x from dual union all
  select 'a123d123' x from dual union all
  select 'a123g2' x from dual union all
  select 'a123g10' x from dual  
)
  select x, regexp_substr(x, '[^0-9]*'), regexp_substr(x, '[0-9]+') from rws
  order  by regexp_substr(x, '[^0-9]*'), regexp_substr(x, '[0-9]+') nulls first;

X         REGEXP_SUBSTR(X,'[^0-9]*')  REGEXP_SUBSTR(X,'[0-9]+')  
a123g2    a                           123                        
a123g10   a                           123                        
a123d123  a                           123                        
@foo      @foo                                                   
foo       foo                                                    
foo1      foo                         1                          
foo10     foo                         10                         
foo100    foo                         100                        
foo2      foo                         2 

Rating

  (2 ratings)

Is this answer out of date? If it is, please let us know via a Comment

Comments

AKASH NAMDEV, January 20, 2017 - 11:59 am UTC

Hi Chris Saxon,

FUNCTION naturalsort
(p_str IN VARCHAR2, p_number_of_zeroes IN NUMBER DEFAULT 4) RETURN VARCHAR2 PARALLEL_ENABLE DETERMINISTIC
IS
tempStr VARCHAR2(4000);
tempChr VARCHAR2(4);
strResult VARCHAR2(4000);
indexN NUMBER;
BEGIN
strResult := '';
IF p_str IS NULL THEN
RETURN '';
ELSIF
length(p_str) = 0 THEN
RETURN '';
END IF;
indexN := 1;
tempStr := '_';

WHILE indexN <= length(p_str) LOOP
tempChr := substr(p_str,indexN,1);
IF ascii(tempChr) < 58 AND ascii(tempChr) > 47 THEN
tempStr := concat(tempStr,tempChr);
ELSIF length(tempStr) > 1 THEN
strResult := concat(strResult,lpad(substr(tempStr,2,p_number_of_zeroes),p_number_of_zeroes,'0'));
tempStr := '_';
strResult := concat(strResult,tempChr);
ELSE
strResult := concat(strResult,tempChr);
END IF;
IF length(tempStr) = p_number_of_zeroes+1 THEN
strResult := concat(strResult,substr(tempStr,2,p_number_of_zeroes));
tempStr := '_';
END IF;
indexN := indexN + 1;
END LOOP;
IF length(tempStr) > 1 THEN
strResult := concat(strResult,lpad(substr(tempStr,2,p_number_of_zeroes),p_number_of_zeroes,'0'));
END IF;

RETURN strResult;
END;



Right now i am using this for natural sorting but it's not fast so i wrote this

order by case when not regexp_like(NAME,'|^([0-9]+)$') then lpad(name,4) else name end ;

result :
Kazenlaan 1
Kazenlaan 11
Kazenlaan 2

but it's not correct it should be like

Kazenlaan 1
Kazenlaan 2
Kazenlaan 11




Chris Saxon
January 20, 2017 - 4:48 pm UTC

Gah, forgot to to_number the regexp getting the numbers:

with rws as (
  select 'Kazenlaan 1' name from dual union all
  select 'Kazenlaan 11' from dual union all
  select 'Kazenlaan 2' from dual  
)
  select name from rws
  order by  regexp_substr(name, '[^0-9]+'), to_number(regexp_substr(name, '[0-9]+'));

NAME          
Kazenlaan 1   
Kazenlaan 2   
Kazenlaan 11 

A reader, January 30, 2017 - 6:08 pm UTC

Hi Charis,

Thanks for the reply.