Skip to Main Content
  • Questions
  • JSON_VALUE returns on strings that contain unicode characters on an non-unicode database

Breadcrumb

Question and Answer

Connor McDonald

Thanks for the question, John.

Asked: September 29, 2020 - 12:21 pm UTC

Last updated: October 07, 2020 - 3:42 am UTC

Version: 19c (patch id 29859191)

Viewed 1000+ times

You Asked

Hi Tom!

In a database with EL8MSWIN1253 character set, json_value() returns value only for the keys prior to unicode key. In my example, json_value() returns result only for keys Val1 and Val2.
When I run it on a unicode instance, it works as expected. I have similar behavior on 12.2.0.1.

declare
js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
begin
  -- Non null result
  dbms_output.put_line('Val1:'||json_value(js, '$.Val1'));
  
  -- Non null result
  dbms_output.put_line('Val2:'||json_value(js, '$.Val2'));
  
  -- Null result
  dbms_output.put_line('Val3:'||json_value(js, '$.Val3'));

  -- Null result
  dbms_output.put_line('Val4:'||json_value(js, '$.Val4'));  
end;


Is the use of json_value() forbitten on national character sets?

According to https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adjsn/json-character-sets-and-encoding.html the database will automatically convert the input string to UTF8 in order to process it.

What am I missing here?

and Connor said...

This is easier seen when we add the ERROR clause

--
-- UTF
--
SQL> select parameter, value from v$nls_parameters;

PARAMETER                                                        VALUE
---------------------------------------------------------------- ------------------------------------
NLS_LANGUAGE                                                     AMERICAN
NLS_TERRITORY                                                    AMERICA
NLS_CURRENCY                                                     $
NLS_ISO_CURRENCY                                                 AMERICA
NLS_NUMERIC_CHARACTERS                                           .,
NLS_CALENDAR                                                     GREGORIAN
NLS_DATE_FORMAT                                                  DD-MON-RR
NLS_DATE_LANGUAGE                                                AMERICAN
NLS_CHARACTERSET                                                 AL32UTF8
NLS_SORT                                                         BINARY
NLS_TIME_FORMAT                                                  HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT                                             DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT                                               HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT                                          DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY                                                $
NLS_NCHAR_CHARACTERSET                                           AL16UTF16
NLS_COMP                                                         BINARY
NLS_LENGTH_SEMANTICS                                             BYTE
NLS_NCHAR_CONV_EXCP                                              FALSE

SQL> declare
  2  js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
  3  begin
  4    -- Non null result
  5    dbms_output.put_line('Val1:'||json_value(js, '$.Val1' error on error));
  6
  7    -- Non null result
  8    dbms_output.put_line('Val2:'||json_value(js, '$.Val2' error on error));
  9
 10    -- Null result
 11    dbms_output.put_line('Val3:'||json_value(js, '$.Val3' error on error));
 12
 13    -- Null result
 14    dbms_output.put_line('Val4:'||json_value(js, '$.Val4' error on error));
 15  end;
 16  /
Val1:2
Val2:Some nonUnicode text
Val3:Some Greek Text ΕΛΛΗΝΙΚΑ
Val4:123

--
-- non UTF
--
SQL> select parameter, value from v$nls_parameters;

PARAMETER                                                        VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
NLS_LANGUAGE                                                     AMERICAN
NLS_TERRITORY                                                    AMERICA
NLS_CURRENCY                                                     $
NLS_ISO_CURRENCY                                                 AMERICA
NLS_NUMERIC_CHARACTERS                                           .,
NLS_CALENDAR                                                     GREGORIAN
NLS_DATE_FORMAT                                                  DD-MON-RR
NLS_DATE_LANGUAGE                                                AMERICAN
NLS_CHARACTERSET                                                 EL8MSWIN1253
NLS_SORT                                                         BINARY
NLS_TIME_FORMAT                                                  HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT                                             DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT                                               HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT                                          DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY                                                $
NLS_NCHAR_CHARACTERSET                                           AL16UTF16
NLS_COMP                                                         BINARY
NLS_LENGTH_SEMANTICS                                             BYTE
NLS_NCHAR_CONV_EXCP                                              FALSE

SQL> declare
  2  js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
  3  begin
  4    -- Non null result
  5    dbms_output.put_line('Val1:'||json_value(js, '$.Val1' error on error));
  6
  7    -- Non null result
  8    dbms_output.put_line('Val2:'||json_value(js, '$.Val2' error on error));
  9
 10    -- Null result
 11    dbms_output.put_line('Val3:'||json_value(js, '$.Val3' error on error));
 12
 13    -- Null result
 14    dbms_output.put_line('Val4:'||json_value(js, '$.Val4' error on error));
 15  end;
 16  /
Val1:2
Val2:Some nonUnicode text
declare
*
ERROR at line 1:
ORA-40474: invalid UTF-8 byte sequence in JSON data
ORA-06512: at line 11


So we *did* try to convert it, but could not.

Bottom line - running a database in non-UTF nowadays is always asking for problem


Is this answer out of date? If it is, please let us know via a Comment

More to Explore

Design

New to good database design? Check out Chris Saxon's full fundamentals class.