RAC on VMWare, is it advisable and will it work?

Question and Answer

Tom Kyte

Thanks for the question, John.

Asked: July 30, 2012 - 6:53 pm UTC

Last updated: March 07, 2013 - 3:20 pm UTC

Version: 11.2.0.3

You Asked

Tom,

The company has decided that every system and piece of software will go on VMware. I am concerned about using Oracle RAC on VMware (vSphere), because Oracle RAC on VMware has only recently been supported by Oracle, as you already know. Also, I suspect that there are not many Oracle RAC on VMware setups in production, so I feel that we will be the first to face issues, some of which could be big. It also goes against common sense to use RAC on VMware when the intent of RAC is to make a big system from several small systems, while the intent of VMware is to break down a large system into several smaller systems.

RAC on VMWare, is it advisable and will it work? Please advise?


and Tom said...

There are two intents with RAC

o scalability (make a big system from several small systems)

o reduce planned downtime events (patching and the like) and reduce unplanned downtime (system panic, database crash, whatever)


You will still be getting point #2 from above: as long as the VMs are on different physical machines, you'll have the same sort of availability characteristics as you would on straight hardware.


You will not be the first - not by any means. There are quite a few people out there doing it. You will be in a smaller ecosystem - not everyone is doing it.

There could be cases where support would need you to reproduce an issue outside of VMware - if you run into funny operating system related issues - but for almost all other issues, support will support you.

Comments

Is RAC on VMWare advisable?

John Cantu, August 01, 2012 - 4:18 pm UTC

Tom, thank you for your responses.

However, is RAC on VMWare advisable?

Let me list three scenarios. Let's say that we have blades that have 2 CPUs and each CPU has 8 cores.

Customer one decides to build a 4 VM node RAC cluster, each with 4 CPUs. They follow Oracle's recommendation of making sure that the VM nodes are on separate physical blades.

Customer two decides not to use RAC or VMware, but instead uses the single 16 core blade to serve a non-RAC database.

Customer three decides to install RAC on a single 16 core blade to serve a RAC database with a single instance to allow future addition of RAC instances on separate nodes.

All three customers can schedule a maintenance window every few months and all have licenses for RAC and VMware already purchased.

Which customer has the significantly better performing system? Would you advise customer two to use VMware with RAC only to get the benefit of "point 2", reduced planned downtime? Which customer made the best decision?

BTW: at a Hotsos presentation, one of the presenters told us that a two node cluster should not be allowed. I don't remember the details of why. Do you think a two node RAC cluster is a bad idea?

Tom Kyte
August 01, 2012 - 4:29 pm UTC

However, is RAC on VMWare advisable?


I can say this "there are people doing it and they appear happy with the results"

do I think it is a good thing personally? My opinion says "no, probably not, I'd rather not with RAC"


Which customer has the significantly better performing system?

it depends, it depends on so many things.

BTW: at a Hotsos presentation, one of the presenters told us that a two node
cluster should not be allowed. I don't remember the details of why. Do you
think a two node RAC cluster is a bad idea?


oh, how I hate statements like that. "I heard from a friend that something isn't good, I don't remember why or when it isn't good. Do you agree?"



two node RAC can be awesome. Are there situations where it might not be the best - yes, but that is *true of everything* (and the main reason I don't like "best practices" too much - they are only best in CERTAIN SPECIFIC situations - not in general).



Probably the one where you used one 16 core server has the best performance, but then you are missing other things. Followed by RAC. Followed by RAC on VMware. Probably.

A reader, August 02, 2012 - 1:46 pm UTC

>> Which customer has the significantly better performing system?

Assuming the same load, application and database, I would say the customer that identifies the performance bottlenecks (server, network, storage, database,... ) and tunes accordingly.
Tom Kyte
August 02, 2012 - 5:35 pm UTC

hence "it depends"

RAC OneNode?

David Penington, August 02, 2012 - 6:49 pm UTC

I agree that the single physical server will probably be fastest. VMWare would add the ability to move to a different server without reinstalling Oracle, and some other maintenance and recovery benefits. You always need to fight the VMWare administrators to keep other things off your physical server.

People have been running RAC on VMware "unsupported" for quite a while. I have seen the benefits of VMware when hardware fails or is retired.

VMWare plus RAC OneNode could be a nice position, getting the maintenance benefits of RAC without as much complexity and without the multi-node scaling issues for poorly behaved applications.

Remember, Oracle have special licensing rules on VMware - some Oracle people say every physical node in the VMware cluster must be fully licensed to the same level, even if some never actually host an Oracle application. I haven't seen clear documentation on this.

Silly response gets a silly response back

John Cantu, August 03, 2012 - 1:33 pm UTC

"I would say the customer that identifies the performance bottlenecks (server, network, storage, database,... ) and tunes accordingly."

In the scenario I mentioned above, all customers are supported by Tom Kite.
Tom Kyte
August 17, 2012 - 9:52 am UTC

The golfer!!! :))

I didn't know he did databases :)

Mark, August 03, 2012 - 3:56 pm UTC

"Remember, Oracle have special licensing rules on VMware - some Oracle people say every physical node in the VMware cluster must be fully licensed to the same level, even if some never actually host an Oracle application. I haven't seen clear documentation on this."

This comment from David above should be taken to heart.

We looked into this and got a very big surprise from our sales rep on the licensing of Oracle on VMWare. Licensing needs to be looked at before any migration to VMWare.


Tom Kyte
August 17, 2012 - 9:55 am UTC

you have to fully license the box Oracle is running on - even if the VM is restricted to N out of M CPUs, you license M.

Only if you use a hard partitioning scheme for CPU isolation (VMware is 'soft') would you license N.
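
To make the N-versus-M rule concrete, here is a minimal sketch of the arithmetic applied to the four-blade scenario from earlier in the thread. The function name and parameters are made up for illustration, each VM CPU is assumed to count as one core, and the sketch deliberately ignores everything else that goes into a real license count (editions, core factors, contract terms), so treat it as an illustration of the rule as stated here and not as licensing advice.

```python
def cores_to_license(physical_cores: int, vm_cores: int, hard_partitioned: bool) -> int:
    """Cores to license on one host under the N-out-of-M rule described above.

    Hypothetical helper for illustration only; not an Oracle tool or formula.
    """
    if vm_cores > physical_cores:
        raise ValueError("a VM cannot be given more cores than the host has")
    # Soft partitioning (VMware, per the answer above): the whole box counts (M).
    # Hard partitioning: only the cores the VM is restricted to count (N).
    return vm_cores if hard_partitioned else physical_cores


# Scenario from the thread: a 4-node RAC, one 4-CPU VM on each of four 16-core blades.
blades = 4
soft_total = blades * cores_to_license(16, 4, hard_partitioned=False)
hard_total = blades * cores_to_license(16, 4, hard_partitioned=True)
print(soft_total, hard_total)
```

Under these assumptions the soft-partitioned cluster is counted on all 64 physical cores, while a hard-partitioned one would be counted on the 16 cores actually assigned to the VMs.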

Alexander, August 07, 2012 - 9:21 am UTC

Tom, what is the situation where a 2 node RAC cluster is problematic? I don't see how the number 2 matters, vs 3, 4 etc. Thanks.
Tom Kyte
August 17, 2012 - 12:33 pm UTC

where did I ever say it was?

two actually has different algorithms than 3 and above for discovering where a block is and can be a little more efficient because of that.
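
One way to picture why two nodes could be cheaper is a toy message-counting model. This is an assumption-laden sketch, not Oracle's actual algorithm: it supposes each block has a "master" node that knows where the block is cached, and a "holder" node that currently has it. The names `hops` and `worst_case` are hypothetical.

```python
from itertools import product


def hops(requester: int, master: int, holder: int) -> int:
    """Count interconnect messages needed to get a block from holder to requester."""
    messages = 0
    if requester != master:
        messages += 1  # requester asks the master where the block is
    if master != holder:
        messages += 1  # master forwards the request to the current holder
    messages += 1      # holder ships the block to the requester
    return messages


def worst_case(node_count: int) -> int:
    """Worst-case message count over all placements where the block is remote."""
    nodes = range(node_count)
    return max(
        hops(r, m, h)
        for r, m, h in product(nodes, repeat=3)
        if h != r  # only consider blocks cached on some other node
    )


for n in (2, 3, 4):
    print(n, worst_case(n))
```

In this model a two-node cluster never needs more than two messages, because the master is always either the requester or the holder, while three or more nodes can need a third message when master, holder and requester are all different.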

Pete, August 17, 2012 - 12:57 pm UTC

>> do I think it is a good thing personally? My opinion says "no, probably not, I'd rather not with RAC"

Tom, can you expand on that thinking? Our organization is contemplating this very thing and asking a lot of the same questions I see above. Mostly what I get out of this is "it depends" on our particular situation. We plan to test performance benchmarks so we know what we're getting into before we make the final leap, but beyond a likely performance hit is there anything in particular that drives your opinion?
Tom Kyte
August 17, 2012 - 3:35 pm UTC

because RAC is about availability in most people's eyes. And adding a layer that isn't necessary - in my experience - makes it 'less available', it is yet another moving piece.

or RAC is about performance, scaling horizontally, and having another moving piece inhibits that to a degree - a bare metal hypervisor won't, but a host system one does.


Or I see people running their VMs on the same machine with RAC and I just ask "why"?

so, it depends - probably not is my opinion. I can be convinced otherwise in the face of numerate, fact based information.

How is the cache shared?

Khalid R, March 07, 2013 - 1:30 pm UTC

Hello Tom,

Is there such a thing as cache fusion at all, especially if the nodes are sourced from separate physical machines? How is the cache shared between the nodes? I am not sure I understood this hypervisor architecture. What is the principle?

I have been looking at the various configuration options. Most recently I was reading this document, but I still don't see any interconnect (10G Ethernet, InfiniBand, fibre ...) or anything like that in those pictures

http://www.oracle.com/technetwork/articles/systems-hardware-architecture/rac-vmsrvrsparc-163927.pdf

Thanks,
Khalid



Tom Kyte
March 07, 2013 - 3:20 pm UTC

the cache is kept consistent between nodes - by sharing blocks over the interconnect.

It is a lot like the L1/L2 caches on a CPU in a multi-core machine and memory in DRAM. DRAM = disk, L1 cache = SGA. Each CPU may need access to the same bits and bytes. So, the CPUs keep their caches consistent with each other.

RAC is a lot like a symmetric multiprocessor machine: each node has its own cache, and the caches are "fused" together (kept consistent) by sharing blocks over the interconnect.

page 7 in that link has a picture of an interconnect, it is labeled "private network"
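
The idea of caches kept consistent by shipping blocks over the interconnect can be sketched with a toy model. This is a deliberately simplified illustration, not how Oracle implements cache fusion: the `Cluster` class, its counters and the invalidate-on-write rule are all assumptions made for the example.

```python
class Cluster:
    """Toy cluster: shared disk plus one private block cache per node."""

    def __init__(self, node_names):
        self.disk = {}                              # block id -> value on shared storage
        self.caches = {n: {} for n in node_names}   # per-node cache (think SGA)
        self.disk_reads = 0
        self.interconnect_transfers = 0

    def read(self, node, block):
        cache = self.caches[node]
        if block in cache:
            return cache[block]                     # local cache hit
        # Prefer a copy cached on another node over going to disk.
        for other, other_cache in self.caches.items():
            if other != node and block in other_cache:
                self.interconnect_transfers += 1
                cache[block] = other_cache[block]
                return cache[block]
        self.disk_reads += 1                        # nobody has it cached
        cache[block] = self.disk[block]
        return cache[block]

    def write(self, node, block, value):
        # Drop other nodes' copies so nobody keeps reading a stale version.
        for other, other_cache in self.caches.items():
            if other != node:
                other_cache.pop(block, None)
        self.caches[node][block] = value            # changed in cache, not yet on disk


cluster = Cluster(["node1", "node2"])
cluster.disk["blk7"] = "v1"
cluster.read("node1", "blk7")                       # first access comes from disk
cluster.write("node1", "blk7", "v2")                # node1 changes the block in its cache
value = cluster.read("node2", "blk7")               # node2 gets the current copy from node1
print(value, cluster.disk_reads, cluster.interconnect_transfers)
```

Here node2 sees the current version of the block even though the copy on disk is still the old one, which is the point of the analogy: the private network between the nodes is what carries the block.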

Have a better understanding now

Khalid R, March 07, 2013 - 4:59 pm UTC

Now that I am looking up my Clusterware Grid Infrastructure class notes, the private network looks a lot like the interconnect. I am glad I asked, that was troubling me.