Database, SQL and PL/SQL

On Oracle Database In-Memory

Our technologist talks with Oracle product management about Oracle Database 12c’s Oracle Database In-Memory option.

By Tom Kyte

January/February 2015

Usually I take three or four user-submitted questions from the past two months and present those questions and answers in my Ask Tom column. In this issue, I’m taking a different approach to discuss the #1 new capability of Oracle Database 12c patch release 12.1.0.2. In July 2014, Oracle shipped this major patch release to Oracle Database 12c, and it included some significant new capabilities as well as one huge new capability: the Oracle Database In-Memory option. (See bit.ly/newin121 for a complete listing of new 12.1.0.2 features.) What follows is a conversation I had with the product manager for the Oracle Database In-Memory option, Maria Colgan. I hope you enjoy this change from my usual style.

Kyte: Maria, please tell us about yourself. What is it you do at Oracle? What did you do at Oracle before this? How did you get to where you are now?

Colgan: I’m the product manager for Oracle Database In-Memory, so I’m responsible for evangelizing the new functionality and getting the feedback from our customers and partners incorporated into future releases of the product. Prior to this, I was the product manager for the Oracle Database query optimizer, which earned me a number of nicknames, including Chief Apologizer for the Optimizer and, of course, Optimizer Lady.

But my career at Oracle didn’t start in product management. I started at Oracle as a developer for the IBM OS/2 product line, but I didn’t last very long in that role. About six months in, I knew I wasn’t going to be a very good coder, so I applied to become a member of the Real-World Performance team, where the likes of Andrew Holdsworth and Graham Wood taught me the fundamentals of SQL and database tuning—skills I’m still using today.

Kyte: Let’s talk about in-memory computing in general. It’s not new, but it’s getting a lot of attention right now. What’s the state of in-memory computing, and how has it evolved?

Colgan: That’s right: the general notion of storing data in memory and avoiding disk I/O is not new. The database buffer cache was widely adopted more than two decades ago to keep frequently accessed database blocks in memory, for example.

In fact, the recent in-memory trend got its start around the year 2000, with the advent of 64-bit operating systems, which vastly increased the amount of memory that could be addressed for databases and caches, as opposed to 32-bit operating systems, which were limited to around 2 gigabytes.

This evolution laid the groundwork for a surge of new in-memory techniques; algorithms; and, in some cases, on-chip routines (software in silicon) to take advantage of how data in memory can be processed for performance gains. These in-memory developments deliver capabilities that far exceed the basic in-memory advantage of removing physical I/O to speed performance.

Kyte: Now on to specifics: the new Oracle Database In-Memory option. What is it, and why now? Why not 10 or 20 years ago?

Colgan: Oracle Database In-Memory provides a unique dual-format architecture that enables Oracle Database tables to be simultaneously represented on disk and in memory, using a traditional row format and a new in-memory column format. The Oracle Database query optimizer automatically routes analytic queries to the column format and OLTP [online transaction processing] queries to the row format, transparently delivering best-of-both-worlds performance. Oracle Database 12c automatically maintains full transactional consistency between the row and column formats, just as it maintains consistency between tables and indexes today. The new column format is a pure in-memory format and is not persistent on disk, so there are no additional storage costs or storage synchronization issues.

As to why this was the right time to develop Oracle Database In-Memory, one reason is that massive in-memory processing is technically feasible for the first time. In the past, servers were limited by 32-bit (or less) operating systems with limited memory addressability, but with 64-bit operating systems, this is no longer a concern. For example, the SPARC-based Oracle M6-32 Big Memory Machine can be configured with 32 terabytes of DRAM [dynamic random access memory] and 384 processor cores. Additionally Oracle’s Exadata Database Machine X4-8 can be configured with as many as 36 servers with 216 terabytes of DRAM and 4,320 processor cores. These machines were not possible just a few years ago.

Another reason why Oracle Database In-Memory is here and important now is memory cost. Not too long ago, a terabyte of memory would have cost millions of dollars. Today the cost of a terabyte of memory is measured in single-digit thousands of dollars, so having large banks of memory is now economically feasible. Another reason why Oracle Database In-Memory is important to businesses now is that organizations are demanding to be able to analyze their OLTP information in real time—without having a negative impact on OLTP performance and without having to wait for the classic ETL [extract, transform, and load] process to load the data into a data warehouse. Analytic queries tend to hit a few columns out of millions and billions of rows, whereas OLTP applications hit all the columns of very few rows at a time. Having the data structured automatically for both—column-wise for analytic queries and row-wise for OLTP—is a capability that businesses demand.

Kyte: Yes, the hardware landscape and the amount of data businesses are processing have changed dramatically. Does Oracle Database In-Memory require specialized hardware such as an engineered system?

Colgan: No, that’s the beauty of Oracle Database In-Memory; it can be used on all hardware platforms that are supported by Oracle Database 12c.

Kyte: That’s great. What do you see as the major use cases for Oracle Database In-Memory?

Colgan: Speed-sensitive applications that require access to large amounts of data to answer business-driving questions are the obvious systems that would benefit from Oracle Database In-Memory. But hybrid applications that combine analytics with transactions for real-time commerce can also see some immediate returns from using Oracle Database In-Memory.

For example, in the financial industry, being able to manage exposure to risk, particularly credit risk and market risk, is critical. But if it takes too long to determine whether a particular investment is a good risk, there is a risk of losing the competitive edge or even the investment opportunity altogether.

The same is true for credit card fraud detection. A lot of analytics is required to determine whether a transaction is legitimate, but no one wants to experience delays in processing a credit card transaction. So the analysis has to be completed in less than a second.

Query response time is extremely important in both of these scenarios, because it is critical for the business to mitigate risk but also to make a decision in a very short amount of time. In-memory technology provides these capabilities, by enabling analytics to be run on a large amount of data in a very short time.

Another area that greatly benefits from in-memory technology is advertising-supported websites. Being able to predict which ad an anonymous user will click is crucial for retaining advertisers. These predictions require a lot of data mining and analytics to be done in a very short time.

In-memory technology also provides huge speedups for the most-active data of existing data warehouses or data marts.

o15asktom-f1

Figure 1: Row format versus column format processing

Kyte: I see. Now let’s get a little deeper into the technical bits and bytes. In general terms, how does Oracle Database In-Memory work?

Colgan: Oracle Database In-Memory enables data to be simultaneously populated in memory in both a row format (in the buffer cache) and a new in-memory column format. The Oracle Database query optimizer is fully aware of the column format: it automatically routes analytic queries to the column format and OLTP operations to the row format, ensuring outstanding performance and complete data consistency for all workloads without any application changes.

The database maintains full transactional consistency between the row and column formats, just as it maintains consistency between tables and indexes. What’s really great about this approach is that there is only a single copy of the table in storage, so there are no additional storage costs or synchronization issues.

Kyte: What do I need to do—as a DBA—to take advantage of this new capability?

Colgan: You can start taking advantage of Oracle Database In-Memory with two easy steps.

First you need to allocate an in-memory area in the SGA [system global area]. You can allocate as little or as much memory as you want to the in-memory area. The larger the in-memory area, the greater the number of database objects that can utilize it. The size of the in-memory area is controlled by the INMEMORY_SIZE initialization parameter.

Next you need to determine which database objects you want to store in memory and populate them in the in-memory column store. Unlike with a pure in-memory database, not all of the objects in an Oracle database need to be populated into the in-memory column store. Oracle recommends that the in-memory column store be populated with the most-performance-critical data in the database. Less-performance-critical data can reside on lower-cost flash or disks. Of course, if your database is small enough, you can populate all your tables into the in-memory column store.

To indicate that an object is a candidate to be populated into the in-memory column store, you need to specify a new INMEMORY attribute, which can be specified on a tablespace, table, subpartition, partition, or materialized view.

Kyte: That sounds pretty straightforward. What do I need to do as a developer to use Oracle Database In-Memory?

Colgan: Any application that connects to an Oracle database today and issues SQL should be able to take advantage of Oracle Database In-Memory, but there are some simple rules application developers should follow to get the best performance from Oracle Database In-Memory.

Data should be processed in the database and not in the application. By that I mean that the application shouldn’t just connect to the database and do a SELECT * FROM on all the tables. The business logic—such as joins and aggregation—should be built into the queries, so the database actually returns meaningful results.
The application should also use set-based processing instead of row-by-row processing. This will enable the database to process the large volumes of data typically used in analytics much more efficiently.
Gather a representative set of optimizer statistics. Like it or not, the optimizer gets to decide whether the in-memory column store will be used. The optimizer cost model has been enhanced to make it aware of the contents of the in-memory column store, but the model still relies on basic table-level statistics such as the number of rows in the table and the number of distinct values in a column.

Kyte: That sounds easy enough, but what if my application is doing row-by-row processing of billions of rows? Is Oracle Database In-Memory really going to help me there?

Colgan: If your application today doesn’t follow the simple rules I just mentioned, code changes will be required before that application will see the full benefit of Oracle Database In-Memory.

Kyte: That makes sense to me. Those rules are part of an approach we’ve been promoting for what seems like our entire careers! Now, how is it that Oracle Database In-Memory can actually make my OLTP faster? It seems like adding in-memory processing would necessarily mean doing additional work.

Colgan: Although OLTP transactions won’t use the in-memory column store, its presence may improve the performance of these transactions by reducing the need to create and maintain additional indexes to support analytical queries.

In a mixed-workload environment—one that has transactions and reports—there are two types of indexes. A small subset of indexes is used to support referential integrity and single-record lookups, typically the primary key and foreign key indexes, whereas a much larger set of indexes is used to improve the performance of analytic or reporting queries.

By replacing the larger set of reporting indexes with the in-memory column store, you can speed up the OLTP transactions, because you no longer have to maintain those indexes every time data changes.

But please don’t take this as an invitation to drop all your indexes! You definitely still need your primary key and foreign key indexes.

o15asktom-f2

Figure 2: Scan billions of rows per second per CPU core in memory.

Kyte: Interesting. How does Oracle Database In-Memory work with other database features? What restrictions does Oracle Database In-Memory entail?

Colgan: Because the in-memory column store is built into the heart of Oracle Database, it transparently operates with all the existing features of Oracle Database. So partitioning, security, parallel execution, Oracle RAC [Oracle Real Application Clusters], Oracle Automatic Storage Management, Oracle Multitenant, and Oracle Flashback query all work with Oracle Database In-Memory.

The only gotcha in this initial release comes with Oracle Active Data Guard. Queries on the Oracle Active Data Guard standby database will not be able to use the in-memory column store. But replicated systems that use logical standbys or Oracle GoldenGate can use the in-memory column store on the standby database.

Kyte: That’s great: a new feature without real restrictions. And it is all because Oracle Database In-Memory is just that—in memory—and doesn’t affect the disk format of anything. But tell me: if I can give Oracle Database In-Memory only a few gigabytes of memory, will it have any real impact?

Colgan: Yes, you will definitely still see an impact, even if you have only a couple of gigabytes to spare. Remember, not all the data in an Oracle database needs to be in the in-memory column store.

Plus the data populated into the in-memory column store is automatically compressed with a new set of compression techniques that not only expand the memory capacity but also improve query performance.

The compression ratios vary from 2 times to 20 times, depending on the compression option chosen and redundancy in the data. But the compression method can be different across columns or partitions in a table. For example, some table partitions can be optimized for scan speed and others for the memory footprint, whereas others can be optimized to efficiently handle frequent DML [data manipulation language] operations.

Kyte: That’s great. Maria, is there anything more you want to tell us about Oracle Database In-Memory before we wrap up?

Colgan: There’s one last thing I want to make folks aware of. Oracle Database has been optimized and tuned for decades to scale up on SMP [symmetric multiprocessing] servers and scale out on clusters of servers, and using the in-memory column store doesn’t change that.

Oracle Database In-Memory builds on these technologies to scale up to very high memory and CPU capacities on large SMP servers with terabytes of memory. Oracle Database In-Memory optimizes performance on large SMP servers, by preferentially scheduling threads on each CPU to access in-memory data located on DRAM local to that CPU.

In addition to being able to scale up, Oracle Database In-Memory can also scale out to very high memory and CPU capacities, by using all the memory and processors in a cluster of servers. Oracle Database In-Memory automatically distributes tables across the in-memory column stores of all the instances in a cluster. This distribution is similar to the striping of data in a storage subsystem. In-memory SQL execution transparently queries and combines data across all the instances of the cluster, using parallel execution processes. Oracle Database In-Memory further optimizes scale-out query processing by colocating frequently joined partitions on the same instance to enable local partition-wise joins.

Kyte: Thanks—and one last question: where should we go to get additional information and technical details and maybe even play with this new capability?

Colgan: The Oracle Database In-Memory page at oracle.com/us/products/database/options/database-in-memory is a great place to start.

The In-Memory blog (blogs.oracle.com/in-memory), another great resource, has plenty of how-to articles to get you started.

Kyte: Thanks again, Maria. This is truly a revolutionary new capability. A game-changer.

Next Steps

ASK Tom
Tom Kyte answers your most difficult technology questions. Highlights from that forum appear in this column.

Share

On Oracle Database In-Memory

Our technologist talks with Oracle product management about Oracle Database 12c’s Oracle Database In-Memory option.