Hi Tom!
In a database with EL8MSWIN1253 character set, json_value() returns value only for the keys prior to unicode key. In my example, json_value() returns result only for keys Val1 and Val2.
When I run it on a unicode instance, it works as expected. I have similar behavior on 12.2.0.1.
declare
js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
begin
-- Non null result
dbms_output.put_line('Val1:'||json_value(js, '$.Val1'));
-- Non null result
dbms_output.put_line('Val2:'||json_value(js, '$.Val2'));
-- Null result
dbms_output.put_line('Val3:'||json_value(js, '$.Val3'));
-- Null result
dbms_output.put_line('Val4:'||json_value(js, '$.Val4'));
end;Is the use of json_value() forbitten on national character sets?
According to
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adjsn/json-character-sets-and-encoding.html the database will automatically convert the input string to UTF8 in order to process it.
What am I missing here?
This is easier seen when we add the ERROR clause
--
-- UTF
--
SQL> select parameter, value from v$nls_parameters;
PARAMETER VALUE
---------------------------------------------------------------- ------------------------------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_CHARACTERSET AL32UTF8
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SQL> declare
2 js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
3 begin
4 -- Non null result
5 dbms_output.put_line('Val1:'||json_value(js, '$.Val1' error on error));
6
7 -- Non null result
8 dbms_output.put_line('Val2:'||json_value(js, '$.Val2' error on error));
9
10 -- Null result
11 dbms_output.put_line('Val3:'||json_value(js, '$.Val3' error on error));
12
13 -- Null result
14 dbms_output.put_line('Val4:'||json_value(js, '$.Val4' error on error));
15 end;
16 /
Val1:2
Val2:Some nonUnicode text
Val3:Some Greek Text ΕΛΛΗΝΙΚΑ
Val4:123
--
-- non UTF
--
SQL> select parameter, value from v$nls_parameters;
PARAMETER VALUE
---------------------------------------------------------------- ----------------------------------------------------------------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICA
NLS_NUMERIC_CHARACTERS .,
NLS_CALENDAR GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE AMERICAN
NLS_CHARACTERSET EL8MSWIN1253
NLS_SORT BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY $
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_COMP BINARY
NLS_LENGTH_SEMANTICS BYTE
NLS_NCHAR_CONV_EXCP FALSE
SQL> declare
2 js string(500 byte) := '{"Val1":2, "Val2":"Some nonUnicode text", "Val3":"Some Greek Text ΕΛΛΗΝΙΚΑ","Val4":123,"Val5":"2020-09-29"}';
3 begin
4 -- Non null result
5 dbms_output.put_line('Val1:'||json_value(js, '$.Val1' error on error));
6
7 -- Non null result
8 dbms_output.put_line('Val2:'||json_value(js, '$.Val2' error on error));
9
10 -- Null result
11 dbms_output.put_line('Val3:'||json_value(js, '$.Val3' error on error));
12
13 -- Null result
14 dbms_output.put_line('Val4:'||json_value(js, '$.Val4' error on error));
15 end;
16 /
Val1:2
Val2:Some nonUnicode text
declare
*
ERROR at line 1:
ORA-40474: invalid UTF-8 byte sequence in JSON data
ORA-06512: at line 11
So we *did* try to convert it, but could not.
Bottom line - running a database in non-UTF nowadays is always asking for problem