Hardware redundancy
A reader, December 29, 2004 - 11:13 am UTC
"eg: I've got hardware based redundancy and
would like to use that -- so don't do that in ASM"
Can you expand on that?
The way I understand ASM to work is that you give it a bunch of disks and define disk groups (with failure groups and a redundancy level).
Since the OP has failure and redundancy covered by the hardware, are you suggesting that he define just a plain disk group and not the failure groups?
But then if a disk fails, the underlying storage system will kick in and search for a "hot spare", NOT ASM, right?
So, it doesn't really make sense to use ASM in a half-baked fashion like this, does it? In other words, just give it the physical disks and NOT a SAN/NAS volume?
Thanks
December 29, 2004 - 7:12 pm UTC
you do not need to set up failure groups or mirroring - just a diskgroup.
if ASM doesn't mirror, then either
a) something else is mirroring and it takes care of everything, or
b) you are not mirroring at all.
sure it does make sense. file systems are for -- well, files. file systems are not for databases -- file systems are themselves databases in fact, and putting a database on a database.....
it is not half-baked -- most of the text above wasn't about mirroring.
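For illustration, a disk group that relies entirely on the hardware for protection would be created along these lines (the device names here are made up):
create diskgroup data external redundancy
disk '/dev/rdsk/c1t1d0s4', '/dev/rdsk/c1t2d0s4';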
To ASM without real disk
sami, September 28, 2005 - 9:40 pm UTC
Dear Tom,
If I have an NFS mount point on Unix, I can use the following commands to create multiple files which can be used as ASM disks.
dd if=/dev/zero of=/u00/raw_01 bs=1M count=1024 (bs=block size)
dd if=/dev/zero of=/u00/raw_02 bs=1M count=1024 (bs=block size)
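On the ASM instance side I then point the disk string at these files and build a disk group from them, something like this (names are just examples):
alter system set asm_diskstring='/u00/raw_*';
create diskgroup play_dg external redundancy disk '/u00/raw_01', '/u00/raw_02';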
Is there anything similar available for Windows (just to play around with ASM)?
September 29, 2005 - 6:43 am UTC
no idea, not a windows person at all.
RAC on ASM
Yogesh, January 20, 2006 - 1:47 pm UTC
Can you please give me a link where I can get information about implementing RAC on ASM.
January 20, 2006 - 2:48 pm UTC
it is not any different from implementing ASM without RAC.
RAC ON ASM
Yogesh, January 23, 2006 - 5:33 am UTC
Well, I was specifically looking for a document on replacing OCFS with ASM. I'm in the process of building a high-level database architecture for one of the new projects.
I'll be using NetApp NAS as my data store & ASM for database file management. Being a critical and high usage system, I need good resilience, and for that I don't have any other option but to go for RAC. I'm not able to visualize how RAC will communicate with ASM/NAS. Can you please give me some inputs?
January 23, 2006 - 10:27 am UTC
it is the same as for replacing a unix file system without rac with ASM...
However you'll want to think about ocfs for trace files and such - so you have the alert logs for all instances available in one place.
You'll give the NAS disks to ASM, ASM will be used to create diskgroups, DBA uses diskgroups just like a file system.
Don't get confused by "rac" here, it is not really "relevant" to seeing how this works - it is the same as if there were no rac.
RAC ON ASM
Yogesh, January 23, 2006 - 1:31 pm UTC
But how will this setup simulate a cluster?
I want to create 2 node RAC cluster.
my understanding is:
1. I'll require some disk groups in ASM,
2. I'll have one ASM instance on each RAC node
But how will I be able to use the same disk groups from both RAC nodes? A bit confused....
Can I create trace files on one small LUN from NAS?
January 23, 2006 - 10:59 pm UTC
just use them, just like you would use a file system from OCFS.
if the 'nas' volume is a file system writeable and readable from all nodes, sure.
ASMLIB
Yogesh, January 26, 2006 - 11:22 am UTC
I was reading a couple of articles about ASM, where I came across something called 'ASMLIB'. After searching, I found that most of the information about ASMLIB is for Linux. Is it specific to Linux or generic across UNIX?
January 26, 2006 - 11:49 am UTC
Alexander, July 19, 2007 - 3:12 pm UTC
Hi,
We are doing some testing on NAS and trying to figure out why it's running slowly (disk I/O) vs another database on NAS that's older. We are testing by writing a bunch of data to a file.
The good environment has:
SunOS 5.9
Oracle 9.2.0.7
Mount options:/bg/hard/intr
Poor performing environment:
SunOS 5.10
Oracle 10.2.0.2
Mount options:/bg/hard/rsize=32768/wsize=32768/vers=3/nointr/proto=tcp/forcedirectio
If you can comment on that, it would be great but I don't expect you to be able to determine much from that.
I'm also curious as to what this means from the documentation
http://download.oracle.com/docs/cd/B19306_01/install.102/b15704/app_nas.htm#sthref970
"Use NFS version 3. Oracle recommends that you use NFS version 3 where available, unless the performance of version 2 is later."
What does "unless the performance of version 2 is later" mean? Thanks as always.
July 19, 2007 - 7:15 pm UTC
sorry, i won't be too much help on this one... not my specific area of expertise.
control file of asm
A reader, July 27, 2007 - 2:00 am UTC
Hello ,Mr. Tom!
I have a question about the control file of ASM:
I am sure the ASM instance does not need a control file, but:
export ORACLE_SID=+ASM1
sqlplus sys as sysdba
SQL*Plus: Release 10.2.0.2.0 - Production on Thu Jul 26 15:44:34 2007
Copyright (c) 1982, 2005, Oracle. All Rights Reserved.
Enter password:
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> show parameter control
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
control_file_record_keep_time integer 7
control_files string /u01/app/oracle/product/10.2.0
/rac_db/dbs/cntrl+ASM1.dbf
SQL>
And when I go to the control file directory, I cannot find the file 'cntrl+ASM1.dbf'.
Best Regards!
Alan
ASM Normal Redundancy to External Redundancy
Stefan Pastrilov, May 14, 2008 - 8:36 am UTC
Hello,
I have read most of this post but I couldn't find how I can change my ASM database from Normal Redundancy to External Redundancy. Is it possible to do it in a production database? I want to remove the ASM-level mirroring to gain more free space. I have a hardware RAID and I do not need that Oracle ASM option. But I was worried about this - when selecting from v$asm_disk:
select statement:
-----------------------------------------------
select disk_number, mount_status, redundancy
from v$asm_disk
-----------------------------------------------
result:
DISK_NUMBER MOUNT_S REDUNDANCY
----------- ------- ----------
0 OPENED UNKNOWN
1 OPENED UNKNOWN
What does that mean? That there is no redundancy, or something else? I am running Oracle 10.2.0.1 on Win2003.
Regards,
Stefan Pastrilov
asm
A reader, May 17, 2008 - 7:23 am UTC
Is there a way to move a diskgroup from normal to external redundancy ?
May 19, 2008 - 3:51 pm UTC
I plugged the text of your question into metalink and immediately hit on
Note 438580.1
Title: How To Move The Database To Different Diskgroup (Change
Diskgroup Redundancy)
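Sketching the general approach from that note (read the note itself before doing this; all names below are examples): create a new disk group with the redundancy you want, then move the database into it with RMAN image copies.
create diskgroup data_ext external redundancy disk '/dev/rdsk/c3t1d0s4', '/dev/rdsk/c3t2d0s4';
then in RMAN, with the database mounted:
backup as copy database format '+DATA_EXT';
switch database to copy;
recover database;
alter database open;
Redo logs, temp files, the controlfile and spfile have to be relocated separately, which the note covers.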
OMF
Aru, May 27, 2008 - 2:41 am UTC
Hi Tom,
Is it mandatory to use Oracle Managed Files with ASM?
Even when I do
alter system set db_create_file_dest='' scope=both;
The database creates datafiles as per OMF format.
Please explain,
Regards and thanks again,
ARU.
May 27, 2008 - 8:10 am UTC
no, OMF (oracle managed files) is always optional
the database uses OMF when you don't specify a datafile name. so, start specifying a datafile name when you deal with datafiles.
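For example (+DATA is just an illustrative disk group name):
create tablespace omf_ts datafile '+DATA' size 100m; -- only the diskgroup given, so an OMF name is generated
create tablespace named_ts datafile '+DATA/named_ts_01.dbf' size 100m; -- you supplied the name yourself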
OMF
Aru, May 28, 2008 - 6:21 pm UTC
Hi Tom,
Thanks for that.
But was I wrong in assuming that to enable OMF you had to set db_create_file_dest to some destination?
Setting :-
alter system set db_create_file_dest='' scope=both;
Still creates an OMF-type datafile, e.g. +ORCL_D1/orcl/datafile/abc.278.655456909.
Am a bit confused. Please help...
May 29, 2008 - 7:25 am UTC
can you show us....
show parameter db_create
create tablespace test datafile size 1m;
Conventional Tablespace and Datafiles on ASM
Arindam Mukherjee, March 05, 2010 - 10:47 pm UTC
Sir,
Could you kindly write a command to create a tablespace with an explicit path and datafile name in an ASM environment? Whenever I try, it fails, and according to the guidelines I can create only OMF files.
In addition to that, in the Oracle 10g Administrator's Guide (10g Release 2 (10.2) - B14231-01), the following is written.
Oracle-managed files are most useful for the following types of databases:
1. Databases that are supported by the following:
A logical volume manager that supports striping/RAID and dynamically extensible logical volumes
A file system that provides large, extensible files
2. Low end or test databases
Sir, huge numbers of transactions are made on our database daily. So should we go for OMF? Candidly speaking, we are not familiar with OMF, so we wish to use conventional datafiles and tablespaces on ASM. Please help us.
March 09, 2010 - 10:36 am UTC
'it fails'
'my car won't start'
tell me why my car won't start and i'll tell you why it fails.
create tablespace t datafile '+hello/world.dbf' size 1m;
would work if you have ASM set up with a disk group named hello.
OMF is not a performance thing at all, OMF just automagically names files for you - nothing more. Don't think of it for performance reasons.
Different tablespaces on different Disk Groups
arindam Mukherjee, March 27, 2010 - 1:04 am UTC
Sir,
Earlier, in a non-ASM setting, we maintained different tablespaces for database tables and for their indexes, in different locations. In an ASM environment, we use disk groups. So should we follow the same rule on ASM – master tables in one disk group, transaction tables in another disk group, and indexes on transaction and master tables in different disk groups? Please help us.
March 27, 2010 - 8:28 am UTC
use tablespaces to make your life more enjoyable.
they are not a performance thing, they are a book-keeping thing. If you find it easier to manage because indexes are in tablespace X and tables are in Y, go for it. If you find it easier to put both into tablespace A - go for it.
There is a reason you might put indexes (and other objects that are easy to recreate in the event of loss) in tablespaces separate from tables and other objects that are hard/impossible to recreate - and that would be because it is easier to recreate them in the event of media failure than to restore them.
Additional question on SAN use with ASM and RAC
A reader, May 26, 2010 - 1:53 pm UTC
I am preparing for 11.2 RAC installation using SAN (EMC). I am new to RAC and 11.2. So, my question is: is it necessary to use ASM for RAC (since I have SAN)? And if I do use ASM for its administrative advantages, would it not add quite a bit of overhead in terms of I/O processing when SAN is already using RAID and how to estimate this overhead?
May 27, 2010 - 7:09 am UTC
A SAN just provides disks that look as if they were internal disks.
We still need file systems - from someone, from somewhere
ASM is a file system manager - it can provide us with ASM diskgroups and/or a clustered file system.
You could buy things from other third parties (EMC included) that provide the same functionality you already own with ASM.
It would be up to you as to which you used - it is my opinion that ASM would be the easiest path (less integration on your part, just install and go).
The SAN does not necessarily have to do RAID, but if it does, RAID on RAID is not any sort of overhead (that is just striping). If you meant to say "the SAN is providing redundancy", then you can use ASM just as a file system manager, not to provide redundancy.
Does ASM take care of hot blocks??
Pattar, January 30, 2013 - 11:39 am UTC
If it does, then does it still make sense for us to do ASM over NetApp NFS? The only reason I ask is because NetApp NFS cannot rearrange the hot blocks online. That would be the only benefit I can see to using ASM.
January 31, 2013 - 2:16 pm UTC
ASM will only move stuff around during a rebalance when you add or remove storage.
the goal of ASM is to stripe everything across as many devices as you give it - avoiding hot spots.
ASM makes managing storage by the DBA easier, more flexible.
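For instance, when you do add capacity, the restriping happens as part of the rebalance (disk group and device names here are illustrative):
alter diskgroup data add disk '/dev/rdsk/c2t1d0s4' rebalance power 4;
and you can watch the progress in v$asm_operation.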
ASM serving several databases
Andre, May 03, 2013 - 2:49 pm UTC
Hi Tom,
The Group IT is looking into deploying ASM for a few Oracle databases (all hosting SAP Apps).
Please correct me if I am wrong.
1/ ASM can manage many disks provided by SAN
2/ ASM can allocate and manage space for several (7) databases
3/ As ASM becomes a single point of failure - should we consider:
-- a. ASM deployed on a dedicated server
-- b. Have an Oracle Data-Guard - physical stand-by for ASM
-- and/or
-- c. have ASM deployed on a 2-node RAC
the above points just to improve High Availability.
4/ Are there any other considerations you would suggest..?
Thank you
Kind regards
Andre
May 06, 2013 - 7:15 pm UTC
1) yes
2) yes
3) only inasmuch as a single machine does. access to the disk would have to go down - and that would mean the machine went down (ASM doesn't do the IO's, it just manages the storage; your dedicated server would do the actual IO's to disk). to protect against that, you would run clustered, yes.
4) test a lot :) get comfortable with it. test it to destruction.
ASM environment protection
Andre, May 08, 2013 - 10:41 am UTC
Hi Tom,
Many thanks for your confirmations / clarifications
Just on the item (3) - trying to understand it:
A./ ASM gets set up and storage gets configured for 7 databases. Now if the server running the ASM instance goes down - it would not have any effect on these databases - correct..? - or would it..???
<1> If it does NOT, then ASM should be regarded as superior to any O/S-managed file system
<2> However if it DOES - then the prospect of ALL 7 PROD databases going down is HUGE
B./ Your comment "your dedicated server would do the actual IO's to disk" I understand to mean that this should be our main concern = hence, for High Availability, RAC on the database servers - but not on ASM ... correct?
C./ Are there any production installations that have ASM itself configured in a RAC cluster...? - and if so, then the files used by ASM itself would need to be configured using ASM also... if I am not missing anything
Thank you
Kindly
Andre
May 08, 2013 - 1:22 pm UTC
a) if the server running ASM goes down, then the server running that database instance that uses that ASM instance went with it.
ASM runs on the machine with the database(s).
b) I don't know what you mean. If you are running RAC, you'll have ASM running on each node.
c) most all of them actually.
http://docs.oracle.com/cd/E18283_01/server.112/e16102/asmcon.htm#i1021337 read that guide from cover to cover.
Common ASM configuration for many databases
Andre, May 11, 2013 - 12:46 pm UTC
Hi Tom
Thank you.
<< Actually - I had downloaded the more recent doc on ASM - E18951-03 dated Feb. 2012 - and I had gone over the sections that deal with ASM configurations >>
- - however I was not 100% sure how to interpret the ASM configuration described there.
Your answer had confirmed that in order to have ONE COMMON ASM deployed for several App databases - I would need to:
>> Have ASM in its own RAC environment to span over ALL database servers (regardless whether they are RAC or not)
And if ASM is not deployed on its own RAC, then each ASM instance would ONLY manage a specific database deployed on the same server.
Considering very large capacity disk drives that are deployed on SAN and many small databases (we have several that are tiny = i.e. under 20GB each) - I am concerned that the RAC configuration on which ASM would be deployed may be too complex - if we want to optimize I/O and have, say, 20 such small 20GB databases with storage managed by ONE ASM.
What would be the practical MAX number of RAC nodes to hold ASM which would manage storage for these tiny databases.?
a) not greater than 8...?
b) not greater than 16..?
c) or it does not matter - and ASM deployed even on 100 nodes would be very much OK.
The Oracle docs do not go into such considerations - but various Forums report that RAC does not function well at 20 nodes or more - and with some hardware platforms the MAX number of nodes should be in a single digit.
+
The above considerations are with the view that SAN admin team provides a certain number of physical drives = say 500GB capacity.
In order to improve I/O with striping there would need to be several drives - perhaps at least 4 if not more... but just one single drive is more than is needed for 20 tiny databases (20 x 20GB = 400GB).
So - what is the optimal solution:
(1) get 8 mirrored (1:0) drives and use only 10% of their total 4TB capacity so that 20 tiny databases (20GB each) would be spread over all 8 drives?
(2) try to locate other (medium size) databases and deploy those along with the 20 tiny ones?
- but then - how to handle distribution of database files belonging to tiny but very I/O busy databases with other medium size or larger ones..?
or:
(3) request from SAN Admin team drive partitions and have those used by several ASM RAC configurations instead of just one ?
- but then - one would need to be very careful with optimal allocation of specific disk partitions to various ASM environments - which might work initially - but might cause problems later on when new drives would need to be added...
Above all - It goes without saying that it would be a difficult sell to finance people who approve a PO if a request was made for lots of SAN drives when only 10% would be utilized...
... or very hard sell to Group IT when their BAU team would see a very complex ASM configuration shaping up...
Tom - do you know of any specific and proven optimal deployments where such issues needed to be addressed... tests done at Oracle - or better still Oracle partners who had done a great deal of testing and optimizing and were so kind to share facts - rather than assertions..?
Thank you
Kindly
Andre
May 12, 2013 - 3:54 pm UTC
basically, you need to run ASM on every machine where a database instance that will be using ASM storage is running. If clustering is involved for the databases, then clustering would be involved for ASM.
I'm not sure how to parse "And if ASM is not deployed on its own RAC".
If your ASM instance is clustered, it can support clustered databases.
If your ASM instance is not, it will support single instance databases.
I'm not getting the tie-in between your concern about having many small databases and a RAC configuration.
The size of the cluster is limited by OS limits only - 100 would be a high number but not impossibly high. But tell me - are you going to run a 100 node database cluster? That is what would drive this number.
The optimal solution isn't anything I can come up with here. we don't know the IO needs of these databases. A tiny database might perform 1000 times more IOPS than a large database. A tiny database might need 1000 times more IO bandwidth than a multi-terabyte database. It comes down to - how much of what type of IO do these things need and what sort of devices do I need to support that. Do I need IOPS for OLTP or do I need bandwidth/transfer capabilities with reporting/warehousing. And what kind of hardware do I need underneath to support that.
forget ASM for a minute, pretend you weren't using it. What would you demand device wise from your SAN to support your IO needs? Once you have that number, just stick ASM back into the equation (it is just a file system manager).
Take oracle, asm, the database out of the equation. Do the math to determine what sort of IO you need - period. It doesn't matter what file system you use, you *need* that sort of IO rate (IOPS or GB's/sec transfer).
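A purely illustrative example of that math (every number below is invented):
20 small DBs x 500 IOPS peak each = 10,000 IOPS
1 reporting DB doing full scans = 400 MB/s of transfer
at roughly 150-200 IOPS per 15K spindle, that is on the order of 50-70 spindles for the IOPS alone, before any RAID write penalty.
Only once a target like that exists does the choice of ASM versus anything else even come into play.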
ASM infrastructure for many Oracle databases
Andre, May 13, 2013 - 3:01 pm UTC
Tom,
MANY THANKS for the above response.
The premise / Solution:
-----------------------
(A) There is a mixed bag of Oracle databases - a few tiny (under 20GB) and a few large (4-10 TB range)
(B) This landscape is under review to upgrade from various old Oracle releases (9i - 10g) into 11gR2 and migrate from a mix of h/w platforms into new h/w possibly running Oracle Linux - but maybe Solaris (TBD)
(C) The migration would also address the storage needs using SAN instead of the current mixed bag of storage incl local disks being used on Win-2003 environments
(D) ASM is being considered to be used as SHARED storage management for ALL databases although these databases do not and will not be deployed on RAC. The rationale for ASM is optimal, balanced I/O that may be achieved when the SAN Admin team just provides a number of disk drives for the total capacity of 25-30TB (possibly mirrored) and this disk farm is then managed by ASM configured in a RAC across all servers (initially about 10 - later 20 or more) where each DB server would run its own database.
< My understanding from reading Oracle docs is that for ASM to support shared storage for many databases would require RAC - even if individual databases hosting Apps run as single databases >
(E) IF ASM is not deployed then each single Oracle database would be provisioned with storage from SAN and managed by SAN Admin team - and it would be outside of any control or influence of the DBA team - to the extreme that regardless of any reasons - the tiny and very active databases might receive all of their (20GB) storage on one volume mapped on one single drive - or worse still drive partition with other partitions being used by equally I/O demanding tiny databases.
(F) With ASM deployed in RAC I can envisage thin stripes allocated for tiny but highly I/O active databases sharing the same disk groups with other large databases that are low on IOPS.
(G) Furthermore - having ASM and thus control over storage, it would be much easier to alter the storage configuration if/when I/O hot spots get discovered - something that is always a problem when dealing with a SAN team that is out there to provision storage initially and later add more when needed.
Just to clarify (..I'm not sure how to parse ..) -
(1) ASM would be used in the shared configuration so that the disk groups could be shared between databases - which would mean that ASM would need to be configured in RAC even when the databases using the ASM-allocated storage are not running RAC (this is my understanding from Oracle docs)
(2) My concern about the number of the DB servers over which ASM would be deployed in RAC had been derived from various Forums where people were stating that going over 20 nodes (RAC) would be a problem and on some h/w even over 8 RAC presents problems.
>> Your answer as "..100 would be a high number but not impossibly high.." has basically dismissed those arguments found on various Forums - which was my reason to ask, as I thought Oracle would run various stress tests trying to see the limits and derive some optimums.
Certainly I would not see in our case a need to go over 25 node RAC to deploy ASM so that it would manage storage for 25 single Oracle databases - so your comment about 100 had brought some comfort.
YES - I perfectly well understand the principal issue of IOPS as the factor to zero-in and not the size of any database.
And this is precisely the reason for my earlier questions as I believe the tiny active databases could or even should co-exist with large databases but with low IOPS.
If not then a great deal of storage would be wasted.
As before - I would welcome very much any document from Oracle or (third party) but available at Oracle which would show real-life implementations highlighting pros and cons to give more depth to what Oracle Concepts docs say.
YES again - I am into MATH (love it) and will be spending adequate time to assess IOPS and GB/sec rates - before committing to any deployment solution.
But I do NOT want to take Oracle/ASM OUT of equation, as this is what should be considered to avoid poor SAN storage allocation and get it fixed for years to come. I wish to SELL Oracle-ASM to the Group-IT management as an optimal solution - and in doing so I want to avoid any nasty surprises down the road - hence my questions regarding real life optimal deployments - like CASE Studies... if you can get any for me/us.
Thank you
Kindly
Andre
May 13, 2013 - 3:33 pm UTC
Sharing the storage - having servers 1, 2, 3, 4, .. N all reading and writing the same devices could be a recipe for disaster ultimately. When server 1 decides "I'm going to go to town on these disks" and servers 2..N start running really poorly for no apparently good reason whatsoever....
1) You would be using a clustered set of asm instances yes.
2) but you are not running any RAC databases. The asm instances just do the book-keeping for what bits of disk are owned by which files. The database instances themselves do all of the IO. There is no RAC here at all. There is no "20 nodes of RAC", they are all single instance databases.
the comments you've seen regard RAC. you are not running RAC, you are not doing cache fusion, you are not sharing blocks between instances, you are running single instance databases that each perform their own reads and writes. ASM is just a file system manager, it maps out disk, it doesn't really do anything at runtime - it isn't involved in the second-by-second access to the disk.
As before - I would welcome very much any document from Oracle or (third party)
but available at Oracle which would show real-life implementations highlighting
pros and cons to give more depth to what Oracle Concepts docs say.
there won't be anything since it has nothing to do with the database itself - not the database software, rather it is all about how many IOPS you need to support and how much bandwidth you need to be able to supply.
I'm not sure what such a document from Oracle would say? What do you think would be in such a document? given that the number of IOPS / bandwidth you need is something only you know?
On our engineered systems - we have specs that tell you how many IOPS you can perform, what sort of bandwidth from disk you can expect - but for the plain vanilla software you install on your own homegrown hardware - you are the ones that supply those numbers..
But I do NOT want to take Oracle/ASM OUT of equation, as this is what should be
considered to avoid poor SAN storage allocation and get it fixed for years to
come. I wish to SELL Oracle-ASM to the Group-IT management as an optimal
solution - and in doing so I want to avoid any nasty surprises down the road -
hence my questions regarding real life optimal deployments - like CASE
Studies... if you can get any for me/us.
that would be any SAN related paper with databases on SAN. ASM is just managing who gets what devices, just like any file system would - it is no different.
You need to make sure that machine A can get the IOPS it demands, machine B can get the bandwidth it deserves and so on. That is a matter of slicing up your disks, your controllers, your network IO to ensure that. There is nothing "special" about ASM there. ASM just remembers the mapping of disks that you came up with to ensure all of this.
storage / recipe for disaster - ASM or not
Andre, May 15, 2013 - 1:30 pm UTC
Hi Tom,
Thank you.
COMFORT: from your statement that ASM configured on RAC will be fine as the databases hosting Apps are not on RAC
CONCERN: from your statement "..same devices could be a recipe for disaster ultimately.."
I am afraid I had failed to communicate or I am missing this entirely.
We cannot afford misconceptions - so let me try again.
I understand that ASM is "..just managing who gets what devices.." - but isn't this exactly the reason for Oracle to develop ASM, so that Oracle DBA Group would be able to control this storage allocation.
Please consider the typical scenario:
(a) New Application needs to be installed on Oracle but requires a tiny storage = 20GB - this is less than 5% of the total disk capacity of one single DASD
(b) Infrastructure team quotes the cost of storage (among other things) and as they have no idea ..AND.. do NOT want to know or be bothered - they simply carve a 25GB slice (being generous) and map it to one mount point - done deal - fixed forever
(c) the remaining 475GB is sliced in a similar way and ends up in 19 more mount points to address other storage needs - all tiny Oracle databases
(d) performance is terrible - finger pointing directed at DBA team and Apps developers - but NOT SAN Team
Alternative scenario - with ASM:
(1) Oracle DBA team manages to secure 60 DASD devices - all mirrored = total 30TB
(2) DBA team deploys ASM across 20 RAC nodes to be able to manage shared storage for many databases
(3) The IOPS are considered and data is striped across many drives in their disk groups and allocated so that large database files known to store rarely accessed data are mixed with thin stripes for more active tiny databases
(4) If mistakes are made and DBA team discover I/O hot spots they can intervene - i.e. reallocation of what is stored where
+
IF you feel that BOTH of the above are "..recipe for disaster.." then would you suggest that every new database, no matter what the size, gets a few (say 8) DASD for its exclusive use - so in the case of a 20GB database 8 x 500GB is provisioned, thus only 0.5% of the provided disk storage is used and 99.5% is wasted..?
Nobody would ever sign a PO for that.
+
My understanding had been that with ASM we can optimize storage resources.
But if I quote your words about ASM = "..it is no different.." then the IT management would ask the SAN team to do storage management, which would mean allocating a fixed configuration and fixing it for years - and the DBA team would have to live with that.
Am I missing something..?
I am not sure anymore ...
Please clarify - when you can
Thanks
Regards
Andre
May 16, 2013 - 8:26 am UTC
I understand that ASM is "..just managing who gets what devices.." - but isn't
this exactly the reason for Oracle to develop ASM, so that Oracle DBA Group
would be able to control this storage allocation.
sure, but you are planning on having multiple disparate systems running on separate machines share storage.
just like a SAN administrator would.
My understanding had been that with ASM we can optimize storage resources.
only inasmuch as you have resources to manage.
why have a 20gb database? Why not consolidate so you have few very large databases to deal with?
A reader, May 15, 2013 - 4:36 pm UTC
>> Nobody would ever sign a PO for that.
Not really. It's a cost-benefit issue. If the business loses a lot of money because of poor performance, then they would be smart to spend that money.
To me, your main point is that the SAN team is not helpful in addressing storage performance issues and you are looking at a way to handle storage performance in your DBA group.
If your goal is to be a storage admin, ASM is something you can use, with a set of features. However, you could also use raw disk or a third party file system. You need to identify your requirements and compare the pros and cons of each solution.
SAN turf - vs - DBA with ASM turf
Andre, May 16, 2013 - 1:21 pm UTC
Tom & the reader from NY
Thank you both.
Former project
--------------
DBA team had requested from the Infrastructure team (SAN Admin) specifically to provide info showing what physical devices had been allocated to what logical volumes and what databases are using what.
ZERO response.
Repeated requests for Info have been sent through all levels of management during the period of May-2012 through Mar-2013
ALL IGNORED.
The management does NOT understand the importance of proper SAN deployment - while all fingers point at DBA + APPS development teams.
Furthermore - re that former project.
One server had been configured with TWO databases accessing disk volumes that have different names - for appearance's sake, as I/O tests had proven that when one database - using Tom's expression - "goes to town" on a specific volume, then the other database's I/O degrades right away.
DBA had quoted these measurements - but all of that is denied - again with finger pointing.
+
CONSOLIDATION ..?
Sure our team would love to consider this - but these are SAP applications which demand exclusive databases for each App and enforce tablespace naming conventions + Schema naming conventions - so consolidation is simply impossible.
However - my question now is Tom
What would be the actual difference between two Apps sharing storage, with ASM allocating it to them individually, as compared with the same disk groups being allocated to the two Apps consolidated into ONE DATABASE?
Sorry - but I fail to see the difference
Thanks
Andre
May 18, 2013 - 6:53 am UTC
You can resource manage two applications inside of a single database, you'll have one lgwr who is not contending with some other lgwr, you'll have a dbwr doing its batched writes without worrying about N other dbwrs all trying to do the same thing.
versus having N applications in N instances. And not being able to use the resource manager across those instances. And having multiple lgwr's writing to a scarce resource. And having many dbwrs going to town on the disks without coordinating.
you have scarce resources (few really big disks), you would like to be able to manage those scarce resources
by the way - what you wrote above is the best advertisement for engineered systems ever! There the 'SAN', the network connecting the san to the server, and the servers themselves are in the same box, the same cabinet. All under single control. You know how many IOPS you can do, you know your IO bandwidth, you know exactly what devices you have and how you are using them, you have an IO resource manager to allow you to resource manage across databases (not available except on engineered systems)...
A reader, May 17, 2013 - 1:37 pm UTC
>> What would be the actual difference between two Apps sharing storage with ASM allocating it
individually as compared with the same disk groups being allocated to consolidated two Apps into
ONE DATABASE.
The i/o load would be mostly the same. I guess there could be some slight differences for "internal" i/o due to the different databases.
Going back to the main point. I think you are approaching this problem from the wrong direction. Forget about ASM, SAN, etc. and think about the i/o load your environments need to support. No technology can fix poor i/o if the load is too great for the hardware. Identify your requirements and design and build a system that can support it.
High I/O to be handled by SAN or ASM
Andre, May 17, 2013 - 5:01 pm UTC
Thanks to The reader from NY - also checking with Tom
We already know about that very busy-bee tiny database that has high IOPS + several hundreds of users queueing for their JAVA calls through a connection pool.
As said earlier - if left to SAN team this App will be left on just one partition of a 500GB DASD. The data content is 13GB indexes approx 6GB - add SYSTEM + UNDO and we have 25GB - but let's say 50GB gets allocated.
Then the SAN team will try to impress the management with cutting costs and will allocate the remaining storage of the same DASD to other small Apps - as they do NOT CARE about IOPS - all they do understand is lean-capacity.
Even if consolidation is possible - then the server ends up running 2 Apps.
As per Tom (other posts) I remember Tom said - he would never ever run 2 databases on the same server.
Well, running 2 or 5 different Apps on the same RDBMS would be a bit different than having 5 physical databases - but not that much different.
However from the I/O perspective the larger database would have hundreds of GB sitting close to idle and this could be managed with ASM shared disk groups - or so I was hoping - until Tom said that it would be a disaster.
We already know that we have a DISASTER as far as I/O goes on the current systems and we have ZERO influence.
The difference is:
(1) SAN team is clueless as to what Apps need and they go for the "solution" that is suitable to them
(2) DBA team knows very well which database does what - so we feel that shared disk groups allocating the striped busy tiny database DBF's along with 6 out of 8TB of the large database would work.
But you and Tom seem NOT to agree - and the message I am getting is that ASM "is just another file system" and that the management should be convinced to get 8x500GB DASD mirrored to deploy 20GB of data and indexes while leaving 99.75% of DASD space totally unused.
CANNOT and WILL NOT happen - not in this organization.
I/we are trying to understand the options available and how best put ASM to use to get into an optimum - considering budget constraints.
PLEASE HELP me/us understand this.
Thank you
Andre
May 21, 2013 - 2:35 pm UTC
As per Tom (other posts) I remember Tom said - he would never ever run 2
databases on the same server.
not without an IO and CPU resource manager (we have those on Exadata for example) - a la virtualization at some level.
Well, running 2 or 5 different Apps on the same RDBMS would be a bit different
than having 5 physical databases - but not that much different.
It will be hugely different - and is one of the major new features of the next release of Oracle - a facility to make this sort of consolidation easier. But suffice it to say - it makes a big difference (I've said why in part above).
Go with ASM, but realize that running lots of tiny databases is going to lead you to the same issues - you don't have enough resources to give out to dozens of tiny databases, you do have enough for a few large databases.
I'm telling you "do not deploy lots of 20gb databases, deploy few much larger databases"
In your current situation, just stripe and mirror everything - you don't have enough disks to really do anything else. If you want lots of log writers, lots of dbwrs all contending with each other on limited resources, it won't matter which way you go. If you get it down to fewer resource contention points via consolidation (which will be easier to backup, manage, patch, upgrade, etc as well) - then you'll actually be able to think about how to slice and dice things up.
ASM/SAN - best ad for engineered system ever
Andre, May 18, 2013 - 1:53 pm UTC
Hi Tom,
Your last response said:
".. by the way - what you wrote above is the best advertisement for engineered systems ever! There the 'SAN', the network connecting the san to the server, and the servers themselves are in the same box, the same cabinet. All under single control. You know how many IOPS you can do, you know your IO bandwidth, you know exactly what devices you have and how you are using them, you have an IO resource manager to allow you to resource manage across databases (not available except on engineered systems)..."
Oracle doc on ASM "E18951-03 Feb 2012" shows 2 examples of ASM configurations (page 1-4).
Clearly Oracle docs suggest what you would never do - especially Figure 1-1, as you said before and again in response to my post: to consolidate and have several Apps on ONE SINGLE PHYSICAL DB and NOT on a few single DB's.
Figure 1-2 expands further showing ASM RAC configured supporting 3 Oracle DB instances (2 RAC + 1 NON-RAC).
Both these figures show how disk groups A and B are shared between several databases.
+
I will be talking to SAP people who will be soon joining this migration project and see if there is any chance to go for DB CONSOLIDATION.
If I get this confirmed by the SAP expert, then GREAT.
However - if the answer is "NO" - then we are back with considering option similar to the one presented on Fig 1-2 except that ALL 3 database instances are non-RAC, but ASM on RAC would provide shared disk groups.
IF/WHEN we are forced to this by not being able to consolidate all 3 DB's - then I would like to check with you the issues you had raised with LGWR and DBWR.
(a) REDO-LOGS can be placed on separate disk groups or striped across many disks to reduce contention..?
(b) DBWR processes would not impact one another as they would be writing to different devices
My proposal is to have the tiny busy-bee DB's DBF striped thin on a disk group of 8TB and use 25GB for this DB, while the remainder (say 80% of the remaining disk space) would be used for lots of data belonging to other DB/Apps that is not read very often and written to even less often
Then the busier parts of the other 2 Apps could get their storage on dedicated disk groups.
This is how I see the engineered system - should DB consolidation be impossible due to SAP restrictions.
Your comment on that ... please..?
Thanks
Andre
May 21, 2013 - 2:39 pm UTC
why couldn't you give a link to the docs...
please be much more clear in where you say the documentation conflicts with what I said.
I don't see any conflict.
It would be my approach to have few large databases. So what if they drew a picture with three instances all sharing the same storage. they are showing you what you can do, I said you can do that - but if you are worried about performance - think about it.
I'm still not getting this "RAC" stuff. If the databases are not RAC, there will be no cluster. you would be doing figure 1-1.
Optimal ASM w/ RAC to serve single-instance DB's
Andre, May 22, 2013 - 12:53 pm UTC
Hi Tom,
".. please be much more clear in where you say the documentation conflicts with what I said... I don't see any conflict.."
+ "..So what if they drew a picture with three instances all sharing the same storage. they are showing you what you can do, I said you can do that - but .."
OK - Sorry worded poorly on my part.
I understand that the docs show options of what is possible - while you are saying what you would preferably do.
I accept it - and happily.
+
"..It would be my approach to have few large databases... but if you are worried about performance - think about it."
And - YES, I would opt for this as well.
However SAP people do not like placing different Apps on one physical DB. The last meeting almost turned violent!!
Actually they said that there is nothing wrong with doing RFC (Remote Function Call) - which is like using database links, except we are talking here about:
(a) replicating the SAP way of 5 data streams with the total of over 200,000,000 rows - every day from one physical DB to another. This takes 12 hours to process
(b) users complain - performance needs to be improved but when I say that these 200mil RFC is work that can be wiped out if the Apps share the same DB - I am being overruled
and ... the company does not want to be out of SAP support
+
"..I'm still not getting this "RAC" stuff. If the databases are not RAC, there will be no cluster. you would be doing figure 1-1. .."
Actually it is on the Figure 1-3 which shows ASM clustered and servicing the three DB that are deployed on single instances.
>> Leading commentary in Oracle docs say:
A clustered storage pool can be shared by multiple single-instance Oracle databases as shown in Fig. 1-3. In this case, multiple databases share common disk groups. A shared Oracle ASM storage pool is achieved by using Oracle Clusterware. However, in such environments an Oracle RAC license is not required.
To share a disk group among multiple nodes you must install Oracle Clusterware on all of the nodes, regardless of whether you install Oracle RAC on the nodes.
<<
My understanding is:
(1) I can have 3 single-instance DB on dedicated servers
(2) When I know exactly what each is doing I can engineer a sensible (optimal) shared storage pool
(3) Then in order to get it utilized amongst all 3 DB I will need ASM on a Clusterware and RAC - although this RAC would ONLY be needed to support ASM
(4) Being on EE release the RAC license would not be required as it would only be used to deploy ASM itself but NOT the actual database instances that host the Apps.
Is my understanding correct..?
Thanks
Best wishes
Andre
May 22, 2013 - 1:57 pm UTC
... Is my understanding correct..?
...
yes, if you are going to share disks/devices without further partitioning these disks/devices amongst databases.
To Andre
A reader, May 22, 2013 - 5:56 pm UTC
Andre,
You said:
================================
My understanding is:
(1) I can have 3 single-instance DB on dedicated servers
(2) When I know exactly what which is doing I can engineer a sensible (optimal) shared storage pool
(3) Then in order to get it utilized amongst all 3 DB I will need ASM on a Clusterware and RAC -
although this RAC would ONLY be needed to support ASM
(4) Being on EE release the RAC license would not be required as it would only be used to deploy
ASM itself but NOT the actual database instances that host the Apps.
================================
When you install Oracle grid infrastructure, you must choose between single server or RAC install. If you choose RAC install, you will need RAC licenses. Please check with your Oracle sales representative.
SAN or ASM - your comments..?
Andre, June 05, 2013 - 3:29 pm UTC
Tom,
After a long 50-weeks of wait we have finally received a statement from the SAN Admin team "to address our queries":
They say:
======
>>>>>>
We have multiple enterprise storage servers. Each has a number of Raid-5 arrays with disks mapped through to our SVC environment. The arrays tend to be 8 disks (7+P or 6+S+P). There will be a couple of volumes presented from these arrays to the SVC as managed disks. We put a number of managed disks into a managed disk group and when we present a volume to the OS we are striping the volume across the disks in a managed disk group.
In simple terms each volume the OS sees is striped across about 80 to 100 spindles. There is also extensive cache available on both the SVC and backend storage servers. We have the ability to move the volumes around on the fly with no outage if required.
All the disks behind your systems are 450GB 15K. They are spread over 2 managed disk groups and each has 14 managed disks - so ballpark - 7 x 14 - 98 spindles per group.
Striping is in 256MB extents
If you are having performance problems - has anyone tuned the parameters/buffers on the cards ?
=========
<<<<<
How would you propose to configure a very busy small database I had described in my former posts...?
We cannot get any answers at all as to other Apps that are sharing the same managed disks - and if (as you warned me) one of those "goes to town on any of these managed disks" - then what..?
That is the reason behind getting allocation from SAN for all our 3 Apps under the same ASM - as we would be able to manage it better - we think.
Thank you for your most valuable comments in advance
Andre
+++
To the other reader who made the comment as quoted below
Page 1-4 in reference to Figure 1-3 of the ASM Admin Guide (Feb-2012) says:
"A clustered storage pool can be shared by multiple single-instance Oracle databases as shown in Figure 1-3. In this case, multiple databases share common disk groups. A shared Oracle ASM storage pool is achieved by using Oracle Clusterware. However, in such environments an Oracle RAC license is not required."
So is it..? or is it not..?
When you install Oracle grid infrastructure, you must choose between single server or RAC install.
If you choose RAC install, you will need RAC licenses. Please check with your Oracle sales
representative.
SAN = Recipe for DISASTER
Andre, July 01, 2013 - 11:43 am UTC
Dear Tom,
Further to my report above (posted a few weeks ago) - we are dealing with a major IT services provider who are in 100% control of the infra.
(FYI - their corp acronym has the same 3 letters as a small British air-line that competes with Easy-Jet or Ryanair on their short distance travels between the UK and the EU - you should have guessed by now what this IT services major corp is... right?)
And what they say ... goes.
And this includes SAN - as described above.
So -
(1) the latest restore process of just 80GB of database files had taken over 11 hours.
(2) the I/O is erratic = i.e. the same job that is 99% I/O dependent takes between 15mins and 5.5 hrs
And they (SAN team) say - "All is perfectly optimized + RAID5 is fine (we had proved it) as the SAN has a great amount of cache + the database files are nicely spread over the 96 physical disks - so it cannot be any better on ASM + we cannot give you a granular level of info as to what exactly the other 14 hosts are doing in terms of I/O at any given time"
Does it sound to you as a RECIPE for DISASTER..?
But under such circumstances - we are not hopeful to obtain RAW DISKS from the SAN to deploy within ASM - and if we get a few mount points that are mapped onto the same logical volume that in turn is mapped onto ONE SINGLE "managed disk" then we would be doomed for failure and have ASM to be "proven" useless.
And who can argue with such a large IT Services corp + and how can we prove that there is a much better configuration of the storage infra - - under such circumstances..???
Your comments - when you get a moment..?
Merci
Andre
July 01, 2013 - 9:33 pm UTC
just dd a couple of datafiles from one place to another - take the database out of the equation.
then you'll have an IO rate
and you'll find it erratic, unpredictable, and slower than the disks should be.
and it won't have anything to do with the database.
and ask them if that is the best they can do for you.
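A minimal sketch of that test (paths are placeholders; use a file much larger than the OS cache so caching doesn't flatter the numbers):
time dd if=/u02/oradata/prod/system01.dbf of=/dev/null bs=1048576
time dd if=/u02/oradata/prod/system01.dbf of=/u03/scratch/system01.copy bs=1048576
then divide the megabytes moved by the elapsed seconds to get a MB/s figure you can put in front of them.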
SAN config - recipe for disaster - PLEASE HELP
Andre, July 04, 2013 - 8:57 am UTC
Thank you Tom,
Your response "implicitly" confirms that the SAN config as given to us - is bad... right?
Going forward - I would appreciate your help
Specifically - I would need:
(a) set up a test that is recognized as valid proof of poor performance due to SAN set-up
(b) have at hand HARD DATA - that would show what I/O rates should be expected and at which point they would be deemed unacceptable.
In addition - I would need to be able to prove that the I/O rate is INDEPENDENT of the DB server itself.
I know of "dd" command in UNIX - but what we have here is:
1/ old Itanium 2 HP server "rx4640"
2/ Windows-Server 2003
I would be extremely grateful if you could let me know what I could execute under Windows + how to measure and present the data + with what I could compare these I/O rates - in order to actually PROVE anything
I would be happy to spend many hours - over the course of the next week or two - executing such commands at various times to be able to obtain irrefutable evidence - and it would be highly appreciated if you could help me in this.
Thanks
Andre
July 16, 2013 - 1:13 pm UTC
A reader, July 05, 2013 - 3:03 pm UTC
First, you need to identify the theoretical maximum performance YOUR server environment can provide. This calculation is based on the hardware specs and your configuration. As an example, a server that is connected to an iSCSI SAN with a single 1 Gb/s connection can never have a maximum throughput greater than 125 MB/s. It's math.
Next, you need to benchmark your environment using a synthetic benchmark utility. Tom mentioned dd, it's on all Unix servers and is easy to use. But, if your goal is a serious storage benchmark, I would recommend a specialized program like iozone ( http://www.iozone.org/ ). There are others.
Next, you need to configure benchmark tests that are typical to your usage types. Example, OLTP is different than Data Warehouse.
Run your tests and compare your actual results to the theoretical maximum performance your environment can provide. Of course, you will not hit the theoretical maximum performance, but how close are you? At 90%, you might say good enough and move on to other performance fixes like program code. But, if your actual is only 50% of theoretical maximum then you need to do more investigation. Where in the path from server to disk is the bottleneck? Too few spinning disks? Not enough I/O paths? Speed of server I/O cards are too slow? Server storage kernel parameters?
In the example above ( 1Gb/s iSCSI SAN ), if you only had one spinning disk, then it is not likely that you will get better than 65 MB/s. In that case, you would need to change your configuration to use more striped disks.
Keep in mind that these synthetic benchmark utilities are doing dumb reads and writes. Oracle doing disk I/O based on joins and functions, etc. will never be equal.
Now, when your app is slow, you need to use server, database, network, SAN utilities to identify where the bottleneck is. Maybe, you are CPU bound? Maybe, I/O? You can't fix it until you can measure it. You need to be able to demonstrate that the app is slow because ... and here are the metrics to prove it.
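If you can get iozone onto a box, a throughput run might look roughly like this (flags quoted from memory - confirm with iozone -h on your build; the sizes, thread count and file paths are only examples):
iozone -R -i 0 -i 1 -i 2 -r 1m -s 4g -t 4 -F /u02/ioz1 /u02/ioz2 /u02/ioz3 /u02/ioz4
which runs sequential write, sequential read and random read/write tests with a 1MB record size, 4GB per thread, across 4 concurrent threads.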
SAN - shared I/O - disaster..?
Andre, July 11, 2013 - 4:49 pm UTC
Dear Reader from NY,
Many thanks for your input ... that I failed to notice as Tom had taken a few days off.
With Win-2003 I had to resort to very basic BAT files that would echo the time + COPY file with /B option and echo the time.
I ran it many times on various days and hours - two different BAT files - one for a bundle of small-to-medium files (256MB - 2GB) and another just to copy a 33,554MB old copy of a database file.
Exec times in secs + rates in MB/sec
------------------ ---------------
1/ 1533 ...........: 21.88
2/ 863 ...........: 38.88
3/ 1007 ...........: 33.32
4/ 1492 ...........: 22.49
Your comment..?
Also - I had written a simple PL/SQL proc to BULK COLLECT 10000 ID values (very random by design) into an array - which takes under a second - and then read rows from a table using these ID values = this takes over 3 minutes.
Perhaps Tom may comment on the validity / usefulness of such tests and suggest some other approach.
I would be very happy to follow any such suggestions.
Many thanks
Regards - Andre
July 16, 2013 - 4:01 pm UTC
windows OS file system caching can have a big effect on this - ASM would be using unbuffered IO.
A reader, July 15, 2013 - 7:36 pm UTC
Personally, I do not feel that your test has any value. Is it random or sequential? What is the i/o size? Etc....
>> First, you need to identify the theoretical maximum performance YOUR server environment can provide.
For your environment: how many i/o paths in the server? What "speed" are they? What do these cards attach to? SVC or SAN switch, etc? How many paths go into the SAN? What are the performance specs? Your i/o will only be as fast as the slowest link.
After getting your specs, I would then start using benchmark apps like Iozone or iometer.
SAN => Erratic, unpredictable, and slower ...
Andre, July 16, 2013 - 12:14 pm UTC
To Tom and the “Reader from NY”
I am assuming that it is still the same person – “Reader from NY” – and I do not know your name, or your technical background – Oracle or h/w and SAN storage in particular.
So I would appreciate if you could kindly introduce yourself.
My expertise is with Oracle – with good knowledge of UNIX not so much with Win.
As to the tests that I had executed – it is just my first 2 steps – as per Tom’s suggestions.
I am fully aware of the “weakest link” syndrome – and I would like to get some sensible assessment of the available I/O bandwidth.
I am OPEN TO ANY SUGGESTIONS - which I would try out within the constraints imposed by the SYS Admin and SAN Admin teams – as they do not provide responses just because I ask.
Tom (in his follow-up of Jul 1, 9pm) had suggested:
“just dd a couple of data files from one place to another - - and ask them if that is the best they can do for you”
And I had done exactly that on a UNIX server which is another DB server mapped onto the very same managed-disk on the SAN (amongst 12 other servers).
So I also executed a dd on UNIX as well as cp of a 30GB file to get 47 and 39 MB/sec respectively.
I executed a series of such tests to find out that they too vary quite a bit, which was what Tom was saying:
“.. Erratic, unpredictable, and slower than the disks should be.”
This is only the starting point.
I am happy to try more under your specific guidance – so I would welcome ref materials (URL + white papers – anything) and further advice
Thank you.
Andre
July 16, 2013 - 4:59 pm UTC
A reader, July 18, 2013 - 5:35 pm UTC
My car gets 20 miles per gallon. Is that good? You don't know, because you don't know what kind of car I drive and what the EXPECTED gas mileage is.
Regardless of the testing tool you use ( and ORION is a good one ) you won't know how well you are doing until you know what to expect.
The reason to use a tool like ORION or IOZONE is that they offer parameters that allow you to modify the load. By modifying these parameters, you will ultimately get to the max performance. You can adjust the io size and the number of threads, etc. A simple copy won't allow for that.
I'm a DBA with 20+ years of experience with an interest in SAN performance.
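For reference, a basic ORION run looks roughly like this (sketched from memory - check the ORION readme for your version; mytest.lun is a file you create listing the devices to test, one per line, and the disk count is an example):
orion -run simple -testname mytest -num_disks 8
It steps through increasing small-I/O and large-I/O loads and reports IOPS, MB/s and latency at each point.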
ASM vs shared SAN
Andre, July 22, 2013 - 11:27 am UTC
Hi Tom + Reader (DBA) from NY.
I had done what I could do within the imposed constraints - i.e. copy files from one storage location to another - just like Tom originally prompted.
I understand perfectly that this is NOT a valid test.
I really wish it could be easier - but ...
The fact of the matter is that I have constraints:
(1) The machine is an OLD HP Itanium (EOL = 2009)
(2) It runs Win-2000 and hosts Oracle 9i (9.2.0.8)
(3) I/O tools cannot be installed just because I would like to
(4) Oracle docs - Chapter 8 makes refs to Async-IO and Oracle sys package Resource_Manager (Calibrate IO).
(5) SAN team say:
--- a. SAN is good
--- b. The fact that one volume is shared between 14 hosts is perfectly OK as the SAN has a large cache
--- c. RAID-5 is good - we had proven that to those who queried that in the past
--- d. Our tools deployed on SAN say that I/O tends to be under-utilized
--- e. If an APP has I/O issues it is the App and not the SAN
BTW - no specific info provided re any of the above "proven" facts.
+
I can either accept this - while being criticized for poor performance - or - try to prove the case.
As per Tom (in response to my earlier post on this issue) - under such a SAN configuration, I/O rates will be erratic, unpredictable and worse than what one should expect from disk I/O.
I am trying to get some measurements (WITHOUT TOOLS and UNDER CONSTRAINTS) to at the very least demonstrate that there is indeed AN ISSUE - so that it gets the attention it should deserve - and THEN we may even get tools.
BTW - Upgrade from 9i to 11g is being considered for 2.5 years and may ... finally be done in summer of 2014 - but ... in the meantime ...?
Thank you for any further suggestions
Andre
PS
I have worked with Oracle since 5.1.B (24 years) as DBA, architect, Apps developer as well as project leader etc...
> with MSc degrees in Applied Math and Computer Science
and I do the best I can to measure (typically execution times)
A reader, July 25, 2013 - 6:18 pm UTC
I asked before: What type of HBA cards ( Fibre Channel? ), what speed and how many paths does the server have to the SAN?
If you cannot run a benchmark app, you will have to measure and use Oracle (AWR,etc) and server performance stats. What do the stats look like when it is "fast" and what do they look like when "slow."
SAN shared between more than a dozen hosts
Andre, July 29, 2013 - 5:03 pm UTC
Hi Tom + the Reader from NY
As mentioned earlier - the SAN team is behind a few walls.
The info "NY" would expect is simply not provided.
The statements received are along the lines of George Orwell's Animal Farm:
"Our SAN's good - your App's baaaaaaaaaad".
What we are trying to do is find some evidence to the contrary - so as to at least justify the I/O benchmark tests
Once we manage to get some attention - tools like Orion might get approved + bench-mark tests to show SAN current performance in the shared mode vs ASM deployed on several raw disks might get approved also.
Then - once proved - we may get an approval for a different infra.
What is needed now - is some tests that can demonstrate that I/O is actually well below acceptance level.
Any suggestions..?
Thank you
Kind regards
Andre
August 02, 2013 - 5:46 pm UTC
we've mentioned tools above, like orion, that can be used.
this is a chicken and egg problem, you:
a) want numbers to show that a benchmark is warranted.
b) are not understanding that a benchmark is necessary to get numbers.
infinite loop.
A reader, August 01, 2013 - 5:26 pm UTC
>> What is needed now - is some tests
You have been given suggestions, but you say you can't run them.
>> that can demonstrate that I/O is actually well below acceptance level.
What is your acceptance level? What are you basing it on?
You run Oracle - Oracle and the native OS have tools to measure IO metrics.
shared SAN => root cause? - breaking the impasse
Andre, August 04, 2013 - 1:01 pm UTC
Tom
NO.
I DO INDEED understand that a benchmark is needed to get the numbers - there is no "infinite loop" problem as far as I am concerned - the problem is POLITICAL - i.e.
How to:
=======
(1) convince the management to allow getting tools in place?
(2) convince them to request the SAN team to provision raw devices instead of shared volumes?
(3) convince the management to come up with the budget for such tests using a better server + ASM deployed on raw DASD?
That is WHY I had asked for help
================================
- so please assume Tom (or the NY reader) that you had been invited to come on site and now are facing a challenge to make a case to get proper bench-mark tests and infra change by proving this in 10-15 mins in which time you would be able to show the tip of the iceberg and projecting the entire iceberg size without necessarily having the deep-sea exploration team in place...?
Let me reiterate:
(a) old HP server (this may also have an impact on IO)
(b) zero info provided as to what the other 12 hosts are doing in terms of I/O to the same shared logical volume
(c) old Oracle 9i with no tools available like Orion or iozone
(d) all you have is access to this storage area via its logical volume + local drive (partitioned into C + D)
As it is 9i - there are no HISTOGRAM views like v$event_histogram - but you can see MAXIORTM or MAXIOWTM getting as high as 300-400 range at "normal times" (unless there are some more severe incidents - when it gets to over 22,000 cs)
So:
How would you go about CONVINCING the management to allow you an acid test on I/O (with ZERO cost to them) only to get initial feedback warranting deeper digging -
The objective is to FIRST achieve recognition of the MAGNITUDE of the problem -
And then - once you get them to listen to you:
i. properly set up tests with tools like Orion
ii. on a server spec that would have good I/O
iii. exposing the actual root cause being shared SAN
If you tell me that you would walk away from such a project and are NOT INCLINED to even spend 15 mins on it =>> I can understand - but if you can see the way to BREAK this IMPASSE - I would be grateful.
Thank you
Kindly
Andre
August 08, 2013 - 4:22 pm UTC
(1) tell them to read this page I guess.
(2) see #1
(3) see #2
not sure what else we can say here.
ask your SAN people to tell management what the industry standard for IOPS and IO response times are. (5-10ms at worst)
ask them to report what you are receiving currently. (1-2 seconds. that is 4 orders of magnitude away from acceptable. FOUR)
ask them to explain the delta.
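One cheap way to get the "currently receiving" numbers straight out of the 9i database itself (no extra tools needed) is the file statistics - a sketch, with the centisecond-to-millisecond arithmetic worth double-checking against your version:
select f.name, s.phyrds, s.readtim,
round(s.readtim * 10 / greatest(s.phyrds, 1), 2) avg_read_ms
from v$filestat s, v$datafile f
where s.file# = f.file#
order by avg_read_ms desc;
READTIM is in hundredths of a second, so readtim/phyrds * 10 gives an average read time in milliseconds per datafile.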
benchmarking SAN I/O - vs ASM on raw devices
Andre, August 09, 2013 - 9:25 am UTC
Thank you very much Tom
This is succinct and good.
BTW - you said "4 orders of magnitude away from acceptable. FOUR"
with 5-10ms being "at worst" - comparing our ACTUALS of 3-4 sec => I infer that the acceptable I/O response times would need to be approx 400 us (0.4ms)
Whilst comparing to the 5ms "at worst" - this would be 3 orders of magnitude away - ALSO UNACCEPTABLE.
Thanks
Kindly
Andre
August 09, 2013 - 4:47 pm UTC
sorry - THREE orders of magnitude, 'just' three.