Database, SQL and PL/SQL

JSON in Motion

Create JSON documents by using plain SQL, text indexes for fuzzy searches, and relational views.

By Arup Nanda, Oracle ACE Director

November/December 2016

“You’re a victim of your own success, Mark,” pronounces Cindy, the chief information officer of Acme Bank. Last year Mark, the chief technology officer, introduced new Oracle Database 12c Release 1 features for storing and searching JavaScript Object Notation (JSON) documents directly in the database (read about it in “Run with JSON,” in Oracle Magazine January/February 2015). The bank associates now want more from JSON, and that’s the reason for the convergence of visitors in Mark’s office today.

Generation of JSON with SQL

Dave, the lead developer at Acme, starts with the first issue. JSON format, now more than ever, is used to exchange data between Acme’s internal systems as well as with partner banks, financial institutions, and even the government. But, as in many other organizations, much of Acme’s corporate data is stored in relational format in an Oracle Database instance. When Acme needs to send some data in JSON format, Dave’s team painstakingly writes complex programs in Java or PL/SQL, or both, to extract individual data elements from many sources and create a text file in JSON format—a process that is not only time-consuming but also error-prone. This delays the dispatch of requested data, introduces inaccuracies, and affects the business as a whole. The business users, already familiar with SQL but not Java, want something simpler and quicker for producing JSON documents. They like to spend time analyzing data—not producing JSON documents. Likewise, Dave’s team wants to write code to create systems that support Acme’s business—not produce JSON documents. Dave asks Mark: Is there a faster and easier way?

There is, Mark responds to Dave and the other hopeful visitors in Mark’s office. Oracle Database 12c Release 2 introduces many new JSON features, one of which is the ability to create JSON documents directly from the database, using built-in functions. Mark points to the familiar ACCOUNTS table that holds the information on all the account holders at the bank. Instead of writing a complex program to construct a JSON document of the data in that table, Acme can use a new function: JSON_ARRAY. He calls the function

select json_array(
    accno, accname, acctype, open_dt
)
from accounts;
JSON_ARRAY(ACCNO,ACCNAME,ACCTYPE,OPEN_DT)
————————————————————————————————————————————
[101,"John Smith","S","2015-04-24T16:33:13"]
[102,"Jane Smith","S","2015-08-02T16:38:06"]
[103,"John Doe","C","2015-11-10T16:38:06"]
[104,"John Doe","S","2015-06-13T16:38:06"]
[105,"Jane Doe","C","2016-02-18T16:38:06"]

and the output shows the data for each row of the table as a JSON array, with the proper notations: double quotation marks around the strings, the timestamp written with a T separating the date and the time, and so on. All of this involves simply calling the JSON_ARRAY function—without any programming. (To follow along with the examples in this article, download and run the setup.sql file.)
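
For readers who cannot run that script, a minimal, hypothetical sketch of the ACCOUNTS table the examples assume might look like the following; the actual setup.sql may differ.

-- Hypothetical sketch only: column names and types are inferred from the
-- queries and output shown in this article; the real setup.sql may differ.
create table accounts (
  accno    number primary key,  -- account number, such as 101
  accname  varchar2(50),        -- account holder name
  acctype  varchar2(1),         -- 'S' = Savings, 'C' = Checking
  open_dt  date                 -- account opening date and time
);

insert into accounts values (101, 'John Smith', 'S',
  to_date('2015-04-24 16:33:13', 'YYYY-MM-DD HH24:MI:SS'));
commit;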

Objects

JSON arrays are the building blocks of many JSON documents, but discovering such a simple method to create the arrays did not satisfy Dave. These are just arrays, Dave explains, and these arrays lack two major components of JSON documents:

  1. The labels for the data. For example, 101 in the first line of the output doesn’t mean much on its own. It has to be annotated with a label, such as "accountNumber":101.
  2. The curly braces around the document. These mark the start and end of the document’s contents.

Therefore, Dave adds, a program still needs to create a JSON document. There is no need, assures Mark, because another function, called JSON_OBJECT, creates the complete document. Dave perks up and gives Mark the requirements for the JSON document:

  1. The labels in the JSON document for the database columns ACCNO, ACCNAME, ACCTYPE, and OPEN_DT should be “accountNumber”, “accountName”, “accountType”, and “openingDate”, respectively.
  2. The ACCTYPE column stores the type of account as a one-letter abbreviation. It should be expanded in the JSON document. For example, the ACCTYPE column value “S” should appear as “Savings” in the JSON document.

Mark writes the following simple SQL statement to address both requirements:

select json_object (
    'accountNumber' is accno,
    'accountName'   is accname,
    'accountType'   is decode(acctype, 'S', 'Savings',
                              'C', 'Checking', 'Unknown'),
    'openingDate'   is open_dt
)
from accounts;
{"accountNumber":101,"accountName":"John Smith",
"accountType":"Savings","openingDate":"2015-04-24T16:33:13"}
… output truncated

In fact, he explains to everyone, the output can be formatted with standard SQL functions, such as writing the date in mm/dd/yyyy or dd/mm/yyyy format. It’s as simple as that.
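
For example, a quick sketch (using the same ACCOUNTS table) that renders the opening date with TO_CHAR in mm/dd/yyyy format might look like this:

-- Sketch: format the date with standard SQL before JSON_OBJECT serializes it.
select json_object (
    'accountNumber' is accno,
    'openingDate'   is to_char(open_dt, 'MM/DD/YYYY')
)
from accounts;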

Nested Output

Dave is happier but still doesn’t look impressed. The problem, he explains, is the way JSON documents are created. Relational databases store different types of data in different tables, because of rules of normalization. In this case, for example, the ACCOUNTS table holds account information whereas the TRANSACTIONS table holds account transactions such as deposits and withdrawals. There can be multiple transactions—or no transactions—for an account. In JSON all of the information is in one place—the account information is immediately followed by the transactions for that account:

{
   "accountNumber":101,
   … other elements of accounts …
   {
        "transId":1,
        "transAmount":1000,
        … other elements of transactions …
   }
}

Not only are the transaction details nested under the account details, Dave stresses, but only the transactions for a specific account are nested under that account. Looks like I will need to write a program after all, he sighs. Not quite, replies Mark. Another function—JSON_ARRAYAGG—aggregates the rows of a related table into a nested JSON array. He demonstrates how it is done (see Listing 1).

Code Listing 1: Nested JSON documents

select json_object (
   'accountNumber' is a.accno,
   'accountName'   is a.accname,
   'accountType'   is decode(a.acctype, 'S', 'Savings',
                             'C', 'Checking', 'Unknown'),
   'openingDate'   is a.open_dt,
   'transactions'  is (select
      json_arrayagg (
        json_object (
           'transID'     is t.trans_id,
           'transDate'   is t.trans_dt,
           'transType'   is decode(t.trans_type, 'D', 'Deposit',
                                   'W', 'Withdrawal', 'Unknown'),
           'transMode'   is t.trans_mode,
           'transAmount' is t.trans_amt
        )
      )
      from transactions t
      where t.accno = a.accno
    )
) as accounts_with_trans
from accounts a
order by a.accno
/
Output (formatted)
{
   "accountNumber":101,
   "accountName":"JohnSmith",
   "accountType":"Savings",
   "openingDate":"2015-04-24T16:33:13",
   "transactions":[
      {
         "transID":1,
         "transDate":"2016-05-28T16:50:42",
         "transType":"Deposit",
         "transMode":"Check",
         "transAmount":1000
      },
      {
         "transID":2,
         "transDate":"2016-04-08T16:55:56",
         "transType":"Deposit",
         "transMode":"ATM",
         "transAmount":2000
      },
      {
         "transID":3,
         "transDate":"2016-06-07T16:55:56",
         "transType":"Withdrawal",
         "transMode":"Transfer",
         "transAmount":1500
      }
   ]
}{
   "accountNumber":102,
   "accountName":"JaneSmith",
   "accountType":"Savings",
   "openingDate":"2015-08-02T16:38:06",
   "transactions":[
      {
         "transID":4,
         "transDate":"2016-05-28T16:55:56",
         "transType":"Deposit",
         "transMode":"Check",
         "transAmount":1200
      },
      {
         "transID":5,
         "transDate":"2016-02-18T16:55:56",
         "transType":"Withdrawal",
         "transMode":"Check",
         "transAmount":1100
      }
   ]
}{
   "accountNumber":103,
   "accountName":"JohnDoe",
   "accountType":"Checking",
   "openingDate":"2015-11-10T16:38:06",
   "transactions":[
      {
         "transID":6,
         "transDate":"2016-04-08T16:55:56",
         "transType":"Deposit",
         "transMode":"ATM",
         "transAmount":2000
      },
      {
         "transID":7,
         "transDate":"2016-05-08T16:55:56",
         "transType":"Deposit",
         "transMode":"Check",
         "transAmount":2500
      },
      {
         "transID":8,
         "transDate":"2016-06-17T16:55:56",
         "transType":"Withdrawal",
         "transMode":"Check",
         "transAmount":1000
      }
   ]
}{
   "accountNumber":104,
   "accountName":"JohnDoe",
   "accountType":"Savings",
   "openingDate":"2015-06-13T16:38:06",
   "transactions":null
}{
   "accountNumber":105,
   "accountName":"JaneDoe",
   "accountType":"Checking",
   "openingDate":"2016-02-18T16:38:06",
   "transactions":null
}

He points to the output and specifically to how the transactions have been listed under the respective account. Dave is ecstatic.

Key-Value Pair

Debbie, another development manager, presents a different problem. Acme often stores data in big data systems such as data lakes in a schemaless format with just two columns—a “key” that defines the data and its “value.” Each of these key-value pairs is identified by an ID and stored in a table named TRANS_KEYVALUE (created by the setup.sql script for this article). Debbie selects data from the table, as shown in Listing 2, to show everyone how it is stored. Pointing to the output, she explains that instead of storing the data in multiple columns such as Trans_Date, Trans_Amt, and so on, there are just three columns: ID, an identifier, which in this case is trans_id because this is a key-value pair about transactions; KEYNAME, the name of the data element (key); and KEYVALUE, the value of that element (value).

Code Listing 2: Key-value pair of transactions

SQL> select * from trans_keyvalue order by id;
        ID KEYNAME       KEYVALUE
—————————— ————————————— ————————
         1 transDate     05/28/16
         1 transAmount   1000
         1 transType     D
         1 accountNumber 101
         1 transMode     Check
         2 transMode     ATM
         2 transDate     04/08/16
         2 transAmount   2000
         2 accountNumber 101
         2 transType     D
         3 transType     W
         3 transDate     06/07/16
         3 accountNumber 101
         3 transAmount   1500
         3 transMode     Transfer
         4 transType     D
         4 accountNumber 102
         4 transDate     05/28/16
         4 transMode     Check
         4 transAmount   1200
         5 transType     W
         5 transMode     Check
         5 accountNumber 102
         5 transAmount   1100
         5 transDate     02/18/16
         6 transType     D
         6 transDate     04/08/16
         6 accountNumber 103
         6 transAmount   2000
         6 transMode     ATM
         7 transAmount   2500
         7 transDate     05/08/16
         7 transMode     Check
         7 transType     D
         7 accountNumber 103
         8 transType     W
         8 accountNumber 103
         8 transDate     06/17/16
         8 transMode     Check
         8 transAmount   1000
40 rows selected.

Debbie points to the first record in the Listing 2 output. The KEYNAME and KEYVALUE column values are “transDate” and “05/28/16,” which means that the transaction with ID = 1 occurred on 5/28/16. The subsequent records provide more details on the transaction. When a new data element is added, the table does not need a new column for that data to be inserted. For example, suppose the newly arrived input data contains a “checkNumber” element with the value “1001.” For traditional tables, you would need to create the column in the table before you could insert the data. In this case, however, the data can be inserted without any table changes. All Debbie does is store the record with “checkNumber” and “1001” as the values of the KEYNAME and KEYVALUE columns, respectively.
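
A sketch of that insert, assuming the three-column TRANS_KEYVALUE table shown in Listing 2 (the transaction ID used here is illustrative):

-- Sketch only: the new "checkNumber" element needs no DDL change, just a new row.
insert into trans_keyvalue (id, keyname, keyvalue)
values (1, 'checkNumber', '1001');
commit;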

Big data systems that have to store data someplace before processing, and that have no control over the variety of data elements sent by source systems, often use this simple key-value store to respond quickly to incoming data. But the JSON format still expects a fully annotated document with nested records, so Debbie has to write code to create JSON documents from this key-value store. When a data element changes or a new one arrives (for example, “checkNumber” is new), no change is needed in Acme’s TRANS_KEYVALUE table, but Debbie has to change her code to reflect the new element in the JSON document. She wants to know whether Mark has a solution that eliminates the need for those code changes.

There is a solution, Mark reassures her. Another new SQL function, JSON_OBJECTAGG, takes the values of the KEYNAME and KEYVALUE columns—grouped by ID number—from any key-value table and produces a JSON document for each group, as shown in Listing 3. The “group by” column (in this case ID) defines the scope of each JSON document, so the function knows where to put the curly braces that mark the start and end of the document. Debbie can call the function, and the function will create the document as needed, without any additional effort. She couldn’t be happier.

Code Listing 3: JSON from key value

select json_objectagg (keyname,keyvalue)
from trans_keyvalue
group by id;
{"accountNumber":"101","transMode":"Check","transAmount":"1000",
 "transDate":"05/28/16","transType":"D"}
{"accountNumber":"101","transMode":"ATM","transAmount":"2000",
 "transDate":"04/08/16","transType":"D"}
{"accountNumber":"101","transMode":"Transfer","transAmount":"1500",
 "transDate":"06/07/16","transType":"W"}
{"accountNumber":"102","transMode":"Check","transAmount":"1200",
 "transDate":"05/28/16","transType":"D"}
{"accountNumber":"102","transMode":"Check","transAmount":"1100",
 "transDate":"02/18/16","transType":"W"}
{"accountNumber":"103","transMode":"ATM","transAmount":"2000",
 "transDate":"04/08/16","transType":"D"}
{"accountNumber":"103","transMode":"Check","transAmount":"2500",
 "transDate":"05/08/16","transType":"D"}
{"accountNumber":"103","transMode":"Check","transAmount":"1000",
 "transDate":"06/17/16","transType":"W"}

Search Index

Jason is the analytics guru at Acme, and he has to analyze volumes of data to identify information patterns. He has to churn through JSON documents stored in the database, and he appreciates the fact that he can select specific data elements from the JSON documents in the database. (Read “Run with JSON” to learn how to store JSON documents in the database and retrieve elements from those documents directly.) At Acme a table named JSON_TRANS includes all the JSON documents (the table is created and populated with JSON documents by the setup.sql script described earlier in the article). When Jason wants to select a specific element—“accountNumber”, for example—inside the document, he uses the following SQL:

select json_value(trans_msg,'$.accountNumber') from json_trans t

However, this is not enough, he tells Mark. Jason often doesn’t know what specific elements he is looking for while mining information in the data. For example, he may be looking for a string “Stewart”—which can be a first name, a last name, the street name of the customer, the city, the spouse’s name, and so on. He has to perform a search in many ways and with different types of data, which results in full-table scans of the JSON document table, consumes significant CPU and I/O resources on the server, and makes the queries take so long that he has no time to analyze results. He looks to Mark to see if there are any enhanced approaches.

“In Oracle Database 12c Release 2,” Mark responds, “all of your problems are now past.” A new search index makes these searches far more efficient, reducing resource consumption. The search condition is written as a JSON path expression and placed in the predicate of the SQL statement.

Mark suggests an example: suppose I want to select the records where transMode = ‘Check’ and transAmount < 1500. Looking at the JSON document stored in the JSON_TRANS table, Mark points out that transMode is actually a child element under the “transactions” element. In a JSON path expression, the document root is represented by “$”; inside the filter condition of a search, “@” refers to the item currently being evaluated. Therefore, @.transactions.transMode represents the hierarchy “root” element -> “transactions” element -> “transMode” element. Because Mark passes multiple conditions (transMode = ‘Check’ and transAmount < 1500), he uses “&&” as the equivalent of “AND” in SQL. He uses TRANS_MODE and TRANS_AMT variables, each identified with a “$” prefix, and supplies their values in the PASSING clause. The query is shown in Listing 4.

Code Listing 4: Query of JSON_TRANS

col "Account No" format a15
col "Account Name" format a15
col "Account Type" format a15
set autot on explain stat
select
        json_value(trans_msg,'$.accountNumber') as "Account No",
        json_value(trans_msg,'$.accountName') as "Account Name",
        json_value(trans_msg,'$.accountType') as "Account Type"
from json_trans t
where json_exists (
        trans_msg,
        '$?(@.transactions.transMode == $TRANS_MODE &&
            @.transactions.transAmount < $TRANS_AMT)'
        passing 'Check' as "TRANS_MODE", 1500 as "TRANS_AMT"
)
/

To illustrate the effectiveness of the search indexes, Mark first selects data from the table, using the query shown in Listing 4. In that query, he includes the line

set autot on explain stat

to generate the execution plan and statistics of the query execution, shown in Listing 5.

Code Listing 5: Execution plan and stats (no index)

Execution Plan
———————————————————————————————
Plan hash value: 3001044691
—————————————————————————————————————————————————————————————————————————————————————
| Id  | Operation              | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
—————————————————————————————————————————————————————————————————————————————————————
|   0 | SELECT STATEMENT       |            |   408 |   394K|    33   (0)| 00:00:01 |
|   1 |  NESTED LOOPS          |            |   408 |   394K|    33   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL    | JSON_TRANS |     1 |   985 |     4   (0)| 00:00:01 |
|   3 |   JSONTABLE EVALUATION |            |       |       |            |          |
—————————————————————————————————————————————————————————————————————————————————————
Predicate Information (identified by operation id):
———————————————————————————————————————————————————
   2 - filter(JSON_EXISTS2("TRANS_MSG" FORMAT JSON ,
              '$?(@.transactions.transMode == $TRANS_MODE && @.transactions.transAmount <
              $TRANS_AMT)' PASSING 'Check' AS "TRANS_MODE" , 1500 AS "TRANS_AMT" FALSE ON
              ERROR)=1)
Statistics
———————————————————————————————————————————————————
        121  recursive calls
          0  db block gets
         88  consistent gets
          0  physical reads
          0  redo size
        813  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         11  sorts (memory)
          0  sorts (disk)
          3  rows processed

Next, he creates the search index, a new feature in Oracle Database 12c Release 2:

create search index injson_json_trans_01
on json_trans (trans_msg)
for json parameters ('sync (on commit) search_on text_value');

Then he performs the same query (Listing 4) and shows the output (Listing 6) to Jason. Mark points to the output, noting that the domain index was used (step 3 in the execution plan). Referring to the Predicate Information section, he points to the usage of the “sdatap” operator—new in this database release—which performs a fast numerical comparison.

Code Listing 6: Execution plan and stats (with index)

Execution Plan
—————————————————————————————
Plan hash value: 2164686598
—————————————————————————————————————————————————————————————————————————————————————————————————————
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
—————————————————————————————————————————————————————————————————————————————————————————————————————
|   0 | SELECT STATEMENT             |                      |     1 |  1003 |    33   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |                      |     1 |  1003 |    33   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| JSON_TRANS           |     1 |   997 |     4   (0)| 00:00:01 |
|*  3 |    DOMAIN INDEX              | INJSON_JSON_TRANS_01 |       |       |     4   (0)| 00:00:01 |
|   4 |   JSONTABLE EVALUATION       |                      |       |       |            |          |
—————————————————————————————————————————————————————————————————————————————————————————————————————
Predicate Information (identified by operation id):
———————————————————————————————————————————————————
   2 - filter(JSON_EXISTS2("TRANS_MSG" FORMAT JSON , '$?(@.transactions.transMode ==
              $TRANS_MODE && @.transactions.transAmount < $TRANS_AMT)' PASSING 'Check'
              AS "TRANS_MODE" ,
              1500 AS "TRANS_AMT" FALSE ON ERROR)=1)
   3 - access("CTXSYS"."CONTAINS"("T"."TRANS_MSG",'{Check} INPATH (/transactions/transMode)
              and sdatap(CTXSYS.JSON_SEARCH_GROUPNUM*  < 1500 /transactions/transAmount)')>0)
Statistics
———————————————————————————————————
         33  recursive calls
          0  db block gets
         19  consistent gets
          0  physical reads
          0  redo size
        813  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

There were changes in the query execution plan, but do they help reduce resource consumption? Jason asks.

Sure, responds Mark. Pointing to the Statistics section, he notes that “recursive calls” dropped from 121 to 33 after the index was created. That reduction results in a drop in “consistent gets” from 88 to 19. As a result, the bytes processed by the query (shown under “Bytes” in the Execution Plan section) dropped from 394K to only 1,003—an impressive 99.7 percent drop, leading to proportional drops in CPU; I/O; and, consequently, execution time. Considering that Jason is selecting from tables with billions of rows, the savings add up substantially. Jason is thrilled.

Data Guides

Mark notices Amy sitting by herself, obviously mulling over a question. Amy is the master data administrator at Acme, so when business analysts perform an analysis, they typically ask her what table and column to get the data from. In a relational database world, there are tables and columns with strictly defined meanings, and anyone can look them up in the data dictionary views. But in JSON there is no such metadata repository to guide the users. For example, she explains, the JSON_TRANS table contains JSON documents that, in turn, contain transaction information and fields such as accountNumber, but unless she selects from that table and looks over the result, she cannot confirm the presence of that field. Typically JSON documents are very long, with many child elements and several levels of nesting, so it’s impractical to examine document contents manually. Even if she does inspect and record metadata manually, the information will become inaccurate when JSON documents with newer fields arrive.

Lack of metadata in JSON is a problem, concedes Mark, but the good news is that in this release, data guides record metadata about JSON documents, such as the elements, the nesting levels, and so on. This metadata can also be saved permanently in a search index, and as new content is added, the metadata is updated automatically. What’s more, this information can be presented as a relational view, making it useful to anyone with SQL access.

Mark first creates the data guide:

create search index injson_json_trans_02
on json_trans (trans_msg)
for json parameters ('sync (on commit) search_on none dataguide on')
/

After the data guide is created, Mark selects from it to show the structure:

select dbms_json.get_index_dataguide (
        'ACME',
        'JSON_TRANS',
        'TRANS_MSG',
        dbms_json.format_hierarchical,
        dbms_json.pretty
) data_guide
from dual
/

Code Listing 7: Data guide output

{
  "type" : "object",
  "properties" :
  {
    "accountName" :
    {
      "type" : "string",
      "o:length" : 16,
      "o:preferred_column_name" : "TRANS_MSG$accountName"
    },
    "accountType" :
    {
      "type" : "string",
      "o:length" : 8,
      "o:preferred_column_name" : "TRANS_MSG$accountType"
    },
    "openingDate" :
    {
      "type" : "string",
      "o:length" : 32,
      "o:preferred_column_name" : "TRANS_MSG$openingDate"
    },
    "transactions" :
    {
      "oneOf" :
      [
        {
          "type" : "null",
          "o:length" : 4,
          "o:preferred_column_name" : "TRANS_MSG$transactions"
        },
        {
          "type" : "array",
          "o:length" : 512,
          "o:preferred_column_name" : "TRANS_MSG$transactions_1",
          "items" :
          {
            "properties" :
            {
              "transID" :
              {
                "type" : "number",
                "o:length" : 1,
                "o:preferred_column_name" : "TRANS_MSG$transID"
              },
              "transDate" :
              {
                "type" : "string",
                "o:length" : 32,
                "o:preferred_column_name" : "TRANS_MSG$transDate"
              },
              "transMode" :
              {
                "type" : "string",
                "o:length" : 8,
                "o:preferred_column_name" : "TRANS_MSG$transMode"
              },
              "transType" :
              {
                "type" : "string",
                "o:length" : 16,
                "o:preferred_column_name" : "TRANS_MSG$transType"
              },
              "transAmount" :
              {
                "type" : "number",
                "o:length" : 4,
                "o:preferred_column_name" : "TRANS_MSG$transAmount"
              }
            }
          }
        }
      ]
    },
    "accountNumber" :
    {
      "type" : "number",
      "o:length" : 4,
      "o:preferred_column_name" : "TRANS_MSG$accountNumber"
    }
  }
}

The output in Listing 7 shows JSON objects, elements, datatypes, lengths, and so on, which is very useful for someone who wants to use the data. But Amy looks a bit disappointed. You said the data guide was in a relational format, she comments, clearly puzzled about why this result is in JSON format. Of course, assures Mark, you can create a relational view from this data with this SQL:

begin
  dbms_json.create_view_on_path (
    viewName   => 'JSON_TRANS_VIEW',
    tableName  => 'JSON_TRANS',
    jcolname   => 'TRANS_MSG',
    path       => '$'
  );
end;
/

Code Listing 8: JSON_TRANS_VIEW

SQL> desc JSON_TRANS_VIEW
 Name                               Null?    Type
 —————————————————————————————————— ———————— ———————————————————————————
 ID                                 NOT NULL RAW(16)
 LOAD_DT                                     TIMESTAMP(6) WITH TIME ZONE
 TRANS_MSG$accountName                       VARCHAR2(16)
 TRANS_MSG$accountType                       VARCHAR2(8)
 TRANS_MSG$openingDate                       VARCHAR2(32)
 TRANS_MSG$transactions                      VARCHAR2(4)
 TRANS_MSG$accountNumber                     NUMBER
 TRANS_MSG$transID                           NUMBER
 TRANS_MSG$transDate                         VARCHAR2(32)
 TRANS_MSG$transMode                         VARCHAR2(8)
 TRANS_MSG$transType                         VARCHAR2(16)
 TRANS_MSG$transAmount                       NUMBER

This creates a view named JSON_TRANS_VIEW, described in Listing 8. Not only is this view useful for learning about various elements in the JSON document, but it also acts as a relational view to show data from the JSON document. Mark selects a few columns from JSON_TRANS_VIEW:

select
  "TRANS_MSG$transID",
  "TRANS_MSG$transDate",
  "TRANS_MSG$transMode",
  "TRANS_MSG$transType"
from json_trans_view;

Referring to the output of this SQL statement in Listing 9, Mark points out that JSON_TRANS_VIEW actually read and selected data from the underlying JSON documents. As new JSON records come in, the view shows them immediately when it is queried. In summary, Mark announces, this view provides both a data dictionary for and relational access to the JSON documents stored in the JSON_TRANS table. A dictionary view named USER_JSON_DATAGUIDES, Mark adds, lists all the data guides created so far; a quick query of that view appears after Listing 9.

Code Listing 9: SELECT output from JSON_TRANS_VIEW

TRANS_MSG$transID TRANS_MSG$transDate         TRANS_MS TRANS_MSG$transT
————————————————— ——————————————————————————— ———————— ————————————————
                1 2016-05-28T16:50:42         Check    Deposit
                2 2016-04-08T16:55:56         ATM      Deposit
                3 2016-06-07T16:55:56         Transfer Withdrawal
                4 2016-05-28T16:55:56         Check    Deposit
                5 2016-02-18T16:55:56         Check    Withdrawal
                6 2016-04-08T16:55:56         ATM      Deposit
                7 2016-05-08T16:55:56         Check    Deposit
                8 2016-06-17T16:55:56         Check    Withdrawal
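
A quick query of USER_JSON_DATAGUIDES might look like this (a minimal sketch; since the exact column list is release-specific, it simply selects everything):

-- Sketch: list the JSON columns in the current schema that have a data guide.
select * from user_json_dataguides;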

Renaming Columns

One thing to pay attention to, Mark warns his developer visitors, is that they need to enclose the complete case-sensitive column names—with $ signs—in double quotation marks. This, Amy notes, is confusing for relational database developers, who are not used to enclosing column names in double quotation marks. No worries, assures Mark: the columns can be renamed with the following SQL, where the JSON element “$.transactions.transID” is renamed TRANS_ID:

begin
        dbms_json.rename_column (
                'JSON_TRANS',
                'TRANS_MSG',
                '$.transactions.transID',
                dbms_json.type_number,
                'TRANS_ID'
        );
end;
/

After executing this statement, Mark re-creates JSON_TRANS_VIEW with the earlier SQL (with DBMS_JSON.CREATE_VIEW_ON_PATH). Then he describes the view as shown in Listing 10. Referring to the output, he notes that there is no longer a TRANS_MSG$transID column but that there is a new TRANS_ID column. The other columns can be changed the same way, he adds.

Code Listing 10: JSON_TRANS_VIEW with renamed column

SQL> desc json_trans_view
 Name                               Null?    Type
 —————————————————————————————————— ———————— ———————————————————————————
 ID                                 NOT NULL RAW(16)
 LOAD_DT                                     TIMESTAMP(6) WITH TIME ZONE
 TRANS_MSG$accountName                       VARCHAR2(16)
 TRANS_MSG$accountType                       VARCHAR2(8)
 TRANS_MSG$openingDate                       VARCHAR2(32)
 TRANS_MSG$transactions                      VARCHAR2(4)
 TRANS_MSG$accountNumber                     NUMBER
 TRANS_ID                                    NUMBER
 TRANS_MSG$transDate                         VARCHAR2(32)
 TRANS_MSG$transMode                         VARCHAR2(8)
 TRANS_MSG$transType                         VARCHAR2(16)
 TRANS_MSG$transAmount                       NUMBER

Virtual Columns

Amy has one more concern. The JSON documents arriving at Acme keep evolving, and a new document may contain an element that the current data guide and view know nothing about. For example, the document inserted in Listing 11 contains a new element, “specialAccount”, which appears nowhere in the view described in Listing 10. Does she have to spot every new element and re-create the view manually each time the document structure changes?

Code Listing 11: Insert with new columns

insert into json_trans
(
        ID,
        load_dt,
        trans_msg
)
values
(
        sys_guid(),
        systimestamp,
'{
   "accountNumber":105,
   "specialAccount":"Y",
   "accountName":"Charlie Doe",
   "accountType":"Checking",
   "openingDate":"2015-11-10T16:38:06",
   "transactions":[
      {
         "transID":10,
         "transDate":"2016-06-17T16:55:56",
         "transType":"Deposit",
         "transMode":"Check",
         "transAmount":1000
      }
   ]
}'
)
/

Mark agrees, but he notes that he can easily solve this problem by having the data guide respond to document changes automatically. He creates the search index with the special parameter ON CHANGE ADD_VC:

create search index injson_json_trans_03
on json_trans (trans_msg)
for json parameters ('sync (on commit) search_on none dataguide on change add_vc')
/

After he creates the index, Mark inserts the row, using the code in Listing 11, and he describes JSON_TRANS_VIEW once again:

SQL> desc json_trans_view
 Name                           Null?    Type
 —————————————————————————————— ———————— ————————————————————————————
 ID                             NOT NULL RAW(16)
 LOAD_DT                                 TIMESTAMP(6) WITH TIME ZONE
 TRANS_MSG                               CLOB
 TRANS_MSG$accountName                   VARCHAR2(16)
 TRANS_MSG$accountType                   VARCHAR2(8)
 TRANS_MSG$openingDate                   VARCHAR2(32)
 TRANS_MSG$transactions                  VARCHAR2(4)
 TRANS_MSG$accountNumber                 NUMBER
 TRANS_MSG$specialAccount                VARCHAR2(1)

Referring to the output, Mark points to the new TRANS_MSG$specialAccount column at the end of the list. But this column was only in the newly inserted row—it wasn’t there earlier, Amy points out. She wants to know how the relational view handles the column values for the earlier records. Easily, Mark responds: the view simply shows the value of that column as null for the previous records. He demonstrates by selecting that column from the view:

select "TRANS_MSG$accountNumber",
       "TRANS_MSG$specialAccount"
from json_trans_view;
TRANS_MSG$accountNumber TRANS_MSG$specialAccount
——————————————————————— ————————————————————————
                    101
                    102
                    103
                    104
                    105
                    104
                    105 Y

The “specialAccount” column value was automatically populated for the last row. The previous records didn’t include that element, so those records show null for that column, Mark explains. When new JSON records with new elements not present earlier get loaded into the database, the data guide and the view will be automatically updated. Amy couldn’t be more thrilled.

Summary

JSON is the lingua franca of data exchange for modern systems, from transactional to big data sources and services. In this article, you learned how to create JSON documents directly from the database and search database-stored JSON documents for faster access.

Next Steps
 LEARN more about Oracle Database 12c Release 2.
 LEARN more about JSON in Oracle Database.
 DOWNLOAD the setup script for this article.
 
