Skip to Main Content
  • Questions
  • Application design - logic in Database vs logic in Web (Application) server

Breadcrumb

Question and Answer

Tom Kyte

Thanks for the question, Marat.

Asked: November 24, 2009 - 11:37 pm UTC

Answered by: Tom Kyte - Last updated: July 11, 2012 - 4:52 pm UTC

Category: Database - Version: 10.2.0.4

Viewed 10K+ times! This question is

You Asked

Dear Tom,

Taking a web application - what is better: store programming logic in the database, or in web-server? Particullary, what (and why!) is the better application design:

1. [DB with application logic, WebToolkit] <--HTTP--> [Web server with UI-design logic]

or

2. [DB returning just data sets, WebToolkit] <--HTTP--> [Web server with application logic]


or something else?

Thanks a lot!

and we said...

Answer me this:

are you building an application that will last forever? Are the technologies and tools you are using today to craft your application going to be used 20 years from now? Will the language you used to code this application still be in vogue in 10 years?

When you ask that question of the data - the answer is easy, yes, this data is relevant and important, we'll be using it a long long time - much longer than the application we build this month (applications come, they go - they go really fast).


Think about this - the year is 1996, web applications are brand new - never existed before. Completely new way to approach things. People wanted web access to their data.

Too bad that the legacy applications were written as CICS transactions on a mainframe, front ended by an ISPF green screen panel.

OBSERVATION: it never fails to amuse me how dead similar CICS transactions are to middle tier applications and how similar the ISPF green screen panel is to a web browser. They are pretty much dead on *the same*

So, when they tried to move to the web, it was really really hard - why? Because the data logic, the data integrity rules, the security, everything - was tied up in the CICS transactions (written mostly in Cobol for example) and building an application that accessed the data directly was FORBIDDEN - for the simple reason that you could not do it safely.

So, many web stores were simply "lipstick on a pig" - they collected data and at the end of the day were batched up to run against the real system. Or, they interacted with the real system by pretending to be a 3278 terminal end user (screen scraping)


And here in the year 2009 - we are inventing 2020's CICS transactions with green screens (j2ee, web browsers).


http://asktom.oracle.com/pls/asktom/asktom.search?p_string=%22applications+come%2C+applications+go%22


Guess where I would encourage you to put your

o access control
o security in general
o data integrity rules


They belong right next to the data - allows you flexible access to the data - you can build a new application WITHOUT reinventing all of that stuff.


where does the logic to display a bunch of numbers in a pie chart belong? In the application.


where does the logic to ensure that if X=0, then Y must be >= 0? That belongs right next to the data.



So, programming logic - some of it in the application - user interface code, user interaction code, error handling, display of data, report generation - in the application.

Much of it in the database - security, access control, data integrity.

and you rated our response

  (95 ratings)

Is this answer out of date? If it is, please let us know via a Review

Reviews

Excellent!

November 26, 2009 - 5:13 pm UTC

Reviewer: Marat Tolgambayev from Melbourne, Australia

Tom,

thanks a lot! Your answer is just amazing!

I'm building my application in exactly the way you've described:
"programming logic - some of it in the application - user interface code, user interaction code, error handling, display of data, report generation - in the application.

Much of it in the database - security, access control, data integrity".

I just needed the right words to explain that to my customers - and you gave me the words!

Thank you again!

application

November 28, 2009 - 1:41 pm UTC

Reviewer: A reader

I do not think Tom like to put the UI code in the application.

UI changes constantly. Data and others stay the same.

I think Tom like UI be separate (outside DB), everything else (Business logic, integrity, security) belongs to the DB.


Tom Kyte

Followup  

November 28, 2009 - 4:09 pm UTC

no, the UI code is the application.

the stuff you are talking about - data integrity, security - is not application specific - it belongs near the data

UI stuff - application level

ui

November 28, 2009 - 7:58 pm UTC

Reviewer: A reader

Tom:

I thought UI = Data Entry screens = no data processing.
(i.e Blank HTML data entry form)

Just the visual interface.

I remember reading several posts where you said that is the only think you would support storing on the app/web server. Business logic like insert/update an emp, validating data entered by user should all be in database.
Tom Kyte

Followup  

November 29, 2009 - 8:34 am UTC

UI = user interface.

UI might be data entry screens, but they are still UI, they still process data - you might be making sure something is a number, date, etc. It is OK to *copy* the integrity constraints from the metadata in the database into the application when you generate a page (in fact, that makes a lot of sense), but the application is not responsible for validating the data - the database is. And if the rules in the database change - if you have done it right, the rules in the application change as well.


Find where I wrote that, you'll probably see it immediately followed by "does that mean never put the logic in the application" - the answer to that is "of course not - never say never, never say always, I always say"

parse that :)

logic

November 30, 2009 - 8:01 pm UTC

Reviewer: A reader

ttp://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:12083187196917

2nd followup and others later


application logic -- i agree.

BUSINESS LOGIC -- that, well, no, i agree to disagree. unless your business is J2EE and that
alone. business rules = data rules = close to data. a business rule could be "you can only be in
EMP if there exists a record in DEPT". would you argue that should be enforced outside?

application logic = something outside of the database.

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:883929178230
Tom Kyte

Followup  

December 01, 2009 - 3:37 am UTC

you do understand that a rule like "you can only be in
EMP if there exists a record in DEPT" would require a database hit right? So, you have to go to the database in order to validate that rule - therefore, just let the database do it.

Now, look at a rule like "column X is a number and must be between 1 and 10"

Should the application enforce that rule? Probably, definitely the number bit, maybe the 1-10 bit.

Does the database ALSO HAVE to enfore that rule? *Absolutely*, no question, yes. Because the application might not - or might not be able to.

Would it make sense to be in both places? Absolutely, better end user experience, better database utilization.

What is the drawback of that? You have to make sure the application is metadata driven and reads the rule at runtime from the database - so the rules are consistently maintained in a single source of truth - right next to the data.




And remember, I was responding to the statement

Having said the above, it does pay to maintain business logic outside the database and in Java
middle-tier components.


I say "no, it does not", you maintain it in the database. The middle tier logic is free to read that out and do something with it, but we are NOT maintaining the important rules in the application - we are doing it with the data.

You have to read everything in context, snippets don't count.

rules

December 01, 2009 - 5:18 pm UTC

Reviewer: A reader

Tom;

<<You have to make sure the application is metadata driven and reads the rule at runtime from the database - so the rules are consistently maintained in a single source of truth - right next to the data.

I have never seen that in any application. everything is hardocded in client. Is this easy thing to do?

I get confuse when you say application logic or business logic

Is application logic has to do with data validation and UI while business logic are DML transactions?
Tom Kyte

Followup  

December 01, 2009 - 10:19 pm UTC

... I have never seen that in any application. everything is hardocded in client. ...

Sam/SMK, I've asked you to use apex more than once (rather than hand code as you are wont to do) - if you would, you'd see that it'll use the metadata to build forms and whatnot.

everything is hardocded in client.


precisely and exactly where it should NOT be, else that client is the only stupid way to get to the data, to use the data. Therein lies the point.

I don't believe in the term business logic.

There is user interface code, get this data, transform it, interact with the end user with it.

There is data logic, get this data, transact against it, use it, verify it, make is sane.

Sometimes interface code looks like this so called business logic "when the inventory falls below 10, display in blinking red text". But if you have a rule "when inventory falls below 10, send alert to X" - that belongs next to the data - not in some application middle tier

Business Logic

January 24, 2010 - 4:05 am UTC

Reviewer: Valentin Chira from Vienna, Austria

Hi all. if you plan to have more than 1 client(and even if you don't) than i would separate the business logic from your presentation layer. if that business logic should be in the application layer or in database it's not easy to answer. if you implement it in a middle tier(java is the proper technology hier) you have the advantage of having the possibility to abstract you persistence details using OOP concepts that are much better suited than tables and triggers when it comes to describe the business data. i can elaborate on that if you wish. another huge advantage if you user a java middle tire is that you must not concern yourself with cross cutting concerns like security/transaction management/logging&tracing/etc... they are non intrusive implemented using AOP. There is another aspect hier and that is scalability . PLSQL is not scalable at all but a Java MW is easily scalable. there are million reasons why you shouldn't implement your business logic in DB with PLSQL and only one to do so... performance.
Tom Kyte

Followup  

January 26, 2010 - 1:28 am UTC

... f that
business logic should be in the application layer or in database it's not easy
to answer. ...

sure it is, it is trivial actually. If it involves manipulating data, it is data logic and belongs in a thing that starts with data.

... if you implement it in a middle tier(java is the proper technology
hier) you have the advantage of having the possibility to abstract you
persistence details using OOP concepts that are much better suited than tables
and triggers when it comes to describe the business data. ...

and totally and completely miss the point. the middle tier of 2010 is CICS of 1980 - nothing more, nothing less. It is "cool today, gone tomorrow" stuff. What will you use in the middle tier? RMI (not cool anymore), Corba (decidedly not cool anymore), HTTP (no, not really, but many have tried), SOAP (not quite good enough), Web services (not exactly), SOA (maybe today - but what about 2 years from now).

However, code I wrote in 1987 still functions and works and would look relatively fresh and new in 2010 - because all it did was connect to a database and do stuff. We can still all connect to databases and still all do stuff.


And if you are using OOP concepts to persist data and think "that is a good idea", then you have really missed the ideas put forth in this thread. An object view of data is a very very very extremely MYOPIC view of data. The reason SQL databases have actually worked and been successful (the paper describing relational databases and the paper describing smalltalk for example were released at the same time - one became a pragmatic way to manage data and one became a really good paper) is because they excel at presenting the data in many ways, shapes and forms - for all to use - NOT JUST YOUR APPLICATION.


.... another huge advantage if you user a java middle tire is
that you must not concern yourself with cross cutting concerns like
security/transaction management/logging&tracing/etc... ...

holy moley. I've no idea what you mean by that - but I hope it doesn't mean what I think you meant. I really do.


.. PLSQL is not scalable at all but a Java MW is easily scalable. ..

You just lost all credibility, all of it. 100% of it.


You know what sites I go to tune and re-architect? They are not the ones using things like APEX and plsql stored procedures.


so, if your goal is your last word -- performance -- then you might want to reconsider some of your thoughts...

Helsinki Declaration

January 26, 2010 - 2:07 am UTC

Reviewer: Toon from Netherlands

Mr. Valentin from Vienna might want to check out The Helsinki Declaration...
First post of which can be found here: http://thehelsinkideclaration.blogspot.com/2009/03/start-of-this-blog.html

:-)


Oops, another OOP freak

January 26, 2010 - 2:35 am UTC

Reviewer: Michael from Germany

Oh, how i love this java dogmatics!

... holy moley. I've no idea what you mean by that - but I hope it doesn't mean what I think you meant. I really do. ...

This hope is in vain, Tom: you know that he means exactly what you think he means. It is as worse as it is ...

But, wait a few seconds: then he'll speak the magic words "ORM" and/or "Hibernate" - get ready ...

... You just lost all credibility, all of it. 100% of it. ...

Very well said. But, of course, in the eyes of the java dogmatic this harsh (and btw correct) verdict from an "oldfashioned database guy" is the biggest compliment and the best confirmation for him.

Against dogmatics arguments are unfortunately not very effective.
For an amuzing evidence for that take a look at http://www.theflatearthsociety.org

What could "business logic" be?

January 26, 2010 - 3:26 am UTC

Reviewer: Stew Ashton from Paris, France


I developed for 25 years before truly discovering SQL, Oracle and this site at about the same time. More than once I've converted Java "business logic" into pure SQL that ran 100 to 500 times faster; it wasn't "business logic" after all, it was "data logic".

I say this to underline how much I agree with Tom and others about how important the data and "data logic" really are, and how surprised I am that others with my background don't wake up to the power of the database.

That being said, I think there is a place, in some complex applications, for a separate "business" or integration layer distinct from presentation code and the database. The most obvious example is when you have more than one data source or service provider, so you really need a transaction manager outside of the database to pull all that together.

I would try to avoid that complexity when possible, but when it is there I would not want to integrate in the database; I want Oracle on its host doing its thing without interference from extraneous processes.

So I do think we should aim for data-centric solutions which can do without the business/integration layer, but we should recognize the distributed exception when it arises and dedicate a layer to it.

Most applications should be designed to use one connection to one database, in which case two layers of code is fine and PHP/Oracle or APEX would be reasonable choices. Once you have multiple data sources or service providers or even multiple connections, that is real distributed processing and I would want a transaction manager handling that in a middle tier.

No flame war pls...

January 26, 2010 - 7:52 am UTC

Reviewer: Valentin from Vienna, Austria

I don't wanna start any flame war. I met a lot of guys like you over the years and i know you can't follow some other reason than performance. it was just my 2c. wanna listen gut..not.. again fine by me.
Tom Kyte

Followup  

January 29, 2010 - 2:36 pm UTC

but - did you listen to us...

I fear not.


Provide something to backup your statements and we'll be glad to discuss. But when you say stuff like:

.. PLSQL is not scalable at all but a Java MW is easily scalable. ..

sorry, but the BS detectors just go off big time. You have drunk someones drink and are blind to what is really true - if you in fact believe what you wrote and repeat it.

January 26, 2010 - 11:53 am UTC

Reviewer: Alexander

Here's what I don't understand about this debate that rages between application developers and dbas.....

Forget right or wrong for a second, it's obvious most of the time we won't come to an agreement. I don't understand why the developers insist they can do things better, despite the fact that they clearly have no interest in anything database related. I come to this conclusion because all of their "benefits" from doing things their way are just to AVOID things, things that are critical to successful applications (transactions, sql, etc).

Why do they not let us do what we do, what we enjoy doing, what we are trained to do? And they can focus on what they enjoy, building fancy windows or whatever. If the performance sucks, so what? That's our problem, and we'll deal with it. How many times do dbas come to developers and say, "Hey that log on screen really looks like crap, you should have done it this way...." We know they are the experts at this, and trust their abilities. We just don't get the same respect.

@Alexander

January 26, 2010 - 3:15 pm UTC

Reviewer: Stew Ashton from Paris, France


Alexander, I feel exactly the way you do, though in my case the desire preceded the training because I saw the underused potential for performance and centralization and reuse and security...

Basically I think there are different conceptions of what we do, or what I would call architectures. What do we produce: applications with "their" screens and "their" data, or databases plus applications that access them, or "services" plus applications that access them?

In my company we have big, consolidated DB/2 databases with many applications that access them through a service layer. On the other hand, each J2EE application has "its" Oracle database. Strangely enough, whoever accesses DB/2 has to go through the DBAs who validate everything, whereas the Oracle DBAs are just called in to install stuff and put out fires.

How is it that the DB/2 guys get to "do their job" as you put it, whereas Oracle is basically at the mercy of the developers? It's partly because in one case the database exists in and of itself, whereas each Oracle database is just an element of the application. It's also easier to administer 4 things than 400.

I think we have to break this one-to-one relationship between application and database. Some business sponsor needs to own the data as such, not just indirectly through ownership of some application that "uses" the data. Once the database exists independently from the application, the need for a database professional to be in charge of it should be obvious.

I would go a bit further and say we need to go outside the database and establish data access beachheads in the application. Our DB/2 folks insist on static SQL; we should insist on connection pools and prepared statement caches and bind variables and decent fetch sizes and enforce that in production. Ideally, we should be able to tell the developers what to do, why and to some extent how.

@Stew

January 27, 2010 - 6:19 am UTC

Reviewer: Michael from Germany

> How is it that the DB/2 guys get to "do their job" as you put it, whereas Oracle is basically at the mercy of the developers? It's partly because in one case the database exists in and of itself, whereas each Oracle database is just an element of the application.

This is certainly a very correct observation.
The simple fact is that in every typical solution the data is (or should be) primary and the application is (or should be) secondary.
The violation of this rule leads to these JEE nightmares (or even disasters).

Now, guess what these "developers" would do (if would you let them) to "modernize" or "improve" your DB/2 database/applications - where according to your description the data is (rightly) paramount: they would transform the database into an mere appendage of their applications.
It's quite likely that they would also tend to split the consolidated database into many - one database for every application.
That's often because their applications use different ORM (or better: let-me-neglect-the-database-completely) "tools" that come and go like the seasons and the data schema is also more often than not designed after the features or shortcomings of their ORM tools or applications.
Don't let this happen ...

Business (il)Logic -> the Mythical Business Layer

January 27, 2010 - 10:24 am UTC

Reviewer: Duke Ganote from Batavia Township, Clermont County, Ohio USA

I like Alex Papadimoulis' discussion of the Mythical Business Layer:
http://thedailywtf.com/Articles/The-Mythical-Business-Layer

Certainly, the swirling discussion is repeated else on Tom's site, e.g.
"Business Logic - PL/SQL Vs Java - Reg"
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:883929178230
"SQL In Presentation Layer"
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1924414300346524388

@Stew

January 29, 2010 - 11:55 am UTC

Reviewer: Duke Ganote from Batavia Township, Clermont County, Ohio USA

As a data warehouse architect, I at least get some visibility and say into how the application databases are set up because:
1) they're supposed to conform to the corporate data model
2) they're the source for reporting.

I can't think of many applications that lack some reporting/integration requirements.
<rant>
Everyone concerns themselves with the "User Experience Model" (J2EE term for the GUI), but somehow few ever think of a "Managerial Experience Model" (reporting/integration) to demonstrate the return-on-investment and effectiveness of the app for the company.
</rant>

Possibly the worst case is the mess that Tom pointed out on p.36 of his Effective oracle By Design... which he undoubtedly got from here:
http://asktom.oracle.com/pls/asktom/f?p=100:11:1501412268342338::::P11_QUESTION_ID:6692296628899

OHS or direct database calls from .NET?

February 07, 2011 - 6:48 am UTC

Reviewer: A reader

Dear Tom,

I'm creating a database driven application, based on HTTP/JSON web services (from my initial question "1. [DB with application logic, WebToolkit] <--HTTP--> [Web server with
UI-design logic]"). I'm going to use Oracle HTTP server with mod_plsql, database 11g.

However, when I discussed this architecture with an application developer, he was concerned about possible bottleneck of OHS when lots of requests come (say, millions per day). He suggested using .NET application based on server farm, which uses direct connection to the Oracle database (not via OHS).

Could you please clarify if OHS scalable enough (and, maybe suggest to what to pay attention) to take high workload.
And is it really much better to use Oracle data provider for .NET (or OCI) instead of HTTP calls to OHS?

Thank you!
Tom Kyte

Followup  

February 07, 2011 - 6:59 am UTC

insufficient data to answer.

If you use a connection pool and you have say 20 connections open - and they all have touched the remote data source (so the dblink is open) once - then you'll have 20 cached connections to the remote data source that everyone will share. You don't have to create a connection for each and every request, it'll be quite capable.

Guess the most appropriate thing to do when someone brings up a FUD (fear, uncertainty, doubt) factor is to.....


Benchmark it. See if you need to wasted time writing your own code or if the existing infrastructure can deal with it. In my experience that is the only way to handle FUD - to see if it is warranted or just FUD.

Connection pool will be used

February 07, 2011 - 7:56 am UTC

Reviewer: A reader

Thanks Tom.
That's what I exactly replied to the developer - we will use connection pool and test it...

But... From Your experience, Tom, is it a right idea at all - use this architecture (with OHS) - assuming that everything is setup correctly?
Do you have any examples of successfully working high-loaded web-sites with such an architecture (I know one - asktom.oracle.com :-) )?
I don't have working server yet, that's why I can't provide much details on that. But I'd like to predict problems in advance, before going to production...
Tom Kyte

Followup  

February 09, 2011 - 7:00 am UTC

In my experience - distributed database interaction should not be used by interactive applications - whether through database links or magic. If you need to access more than one data source for a given transaction - that transaction should be performed asynchronously.

the probability of at least one of the data sources not being available is greater if there is more than one data source, availability suffers.

Performance suffers

Complexity increases


I would not design something that required real time access to multiple data sources if at all possible. move the remote stuff into the background (message queues processed after the fact)

application

February 07, 2011 - 9:51 am UTC

Reviewer: sam

Tom:

<<sorry, but the BS detectors just go off big time>>

That was a very funny statement! It gave a good laugh.

I bet the detectors are ringing for you all the time.

constraints in the middle tier

February 08, 2011 - 7:02 am UTC

Reviewer: DMF

Very Good Article

There is a powerful movement once again to move everything outside the database into the Java Tier or Application Tier. As an example we enforce constraints within the database, however this has recently come under attack and particularly the NOT NULL constraint.

I received the following from the Java and Lead Architect
"Not null gives lack of flexibility, more costly evolution, inhibits the use of the database acting as a service to applications and makes it an inhibitor to evolution"
We all know why this is a twisted view of the universe for the reasons given here and in much of the very good literature produced within the Oracle community., Including life expectancty of the application, multiple software intercations with a single DB, use of constraints in the CBO, protection of the data etc. At the end of the day you don’t have to develop and unit test an Oracle Constraint so why not just use it.

What is intersting is how this is gaining momentum again, under the guise of a ‘Database as a Service’, which as far as I am concerned is being misinterprutted. I am waiting for a knock on the door to tell me to design the database and database ERD moving forward to be ‘database independent’.




Tom Kyte

Followup  

February 10, 2011 - 3:40 pm UTC

"Not null gives lack of flexibility, more costly evolution, inhibits the use of
the database acting as a service to applications and makes it an inhibitor to
evolution"


Never has someone used so many words to say so little.

How so on the lack of flexibility. Either:

a) the attribute is NULLABLE, in which case no one does anything, it is defined as "null"

b) the attribute it NOT NULLABLE, in which case the database should enforce it, not only for data integrity BUT FOR performance. The optimizer can use more access paths if it knows certain columns are NOT NULL.

c) the attribute is NULLABLE to day and NOT NULLABLE in the future. In that case, doing the constraint centrally ensures ALL CODE (no matter how poorly written) will have the constraint enforced - without having to track down all of the locations where it might be accessed.

d) the attribute is NOT NULLABLE today and NULLABLE in the future. In that case, we simply drop the constraint and all affected code is instantly relieved of the constraint without having to rewrite it all.

Where is the missing flexibility. An argument could easily be made that by placing it in the code (over and over again and again) is much less flexible than placing it in the database - but certainly NOT the other way around.

<What is intersting is how this is gaining momentum again, under the guise of
a ‘Database as a Service’, which as far as I am concerned is being
misinterprutted


It is not being misinterpreted - it is being purposely misconstrued by people that do not view the database as something of value. It is a lie in short.


February 08, 2011 - 3:36 pm UTC

Reviewer: JSunico from Canada

One defense I heard for putting the "business logic" in the application vs the database is to make the application database-independent, so that if the application changes the database (for example, from Oracle to MS SQL Server), then the logic still works. Basically, this allows the client the flexibility to choose whatever database they want, and the application will still work.


Tom Kyte

Followup  

February 10, 2011 - 4:01 pm UTC

I've seen people rewrite applications from the ground up a LOT more often than I've seen people actually change their database. So, once again, it actually works against them to not have a single concise point where all of the constraints are defined.


And when they do switch databases - they find that the concurrency models and way things work in general to be so different, they have to rewrite anyway.

Furthermore, the way you would enforce referential integrity in SQLServer in an application IS DIFFERENT from the way you would enforce it in Oracle due to the fact that reads don't block writes in Oracle and they typically do in SQLServer. Techniques that work in SQLServer do not work in Oracle and vice versa.

In order for the application to enforce a constraint in a database independent fashion and to get it correctly would mean that the developer of said constraint understands DB2, SQLServer, Oracle, Informix, etc etc etc - otherwise they cannot know how to code the logic safely for all (if in fact there is a single code solution that works for all). Show me a developer that knows all of those databases well enough to do that - and I'll show you a developer that would laugh at anyone that suggests doing it in the application. For you will have shown me a developer that understands and appreciates the database. That is, not your normal every day developer.

The above was written by a developer. Me

Isn't it the other way around.

February 09, 2011 - 7:58 am UTC

Reviewer: BillC from Ottawa

I think that it should be the other way around. The system should be application independent so that the database does not change no matter what is used to access the data. I think someone around here once said 'Applications come, applications go. The data lives on forever.'

The Myth Of Flexibility

February 09, 2011 - 8:46 am UTC

Reviewer: Vincent Palser from London UK

I can't help thinking this is merely a numbers game. There are many more Java bods out there than good DB people. The debate's less about what's the best long-term solution for the business, but more about the skill set of the employees in place. If they're Java trained, let's go for Java. That, and the natural desire of bored programmers to invent complexity to make their work more interesting. Why bother restricting data to being NOT NULL when you could write a "flexible" middle-tier rule for it?

And the old chestnut, the flexibility to be able to unplug a database and plug in a new one from another supplier - why doesn't anyone mention the same thing about the middle tier? Why can't I just replace that? Why's that so infallible?

oracle

February 09, 2011 - 10:00 am UTC

Reviewer: sam

To DMF

<<<"Not null gives lack of flexibility, more costly evolution, inhibits the use of the database acting as a service to applications and makes it an inhibitor to evolution">>.

Your lead should be writing Poetry instead of java archtiecture...

Anyone who says this statement does not qualify to be Java Architect and lead. This proves a high level of ignorance and I would recommend you find yourself a real knowledgeable Lead who has basic knowledge of SDLC and DBMS systems. I can see why 50% of IT projects end up with total failures.

As Tom always say the constraints should be enforced around the data. When you write ton of code outside database to enforce a data rule in *ONE* application, you still have many other application that are open. You do it once in the Database with one simple statement (NOT NULL) and it is enforced for everyone.

People keep talking about migrating databases but in real life it is mostly applications that are being getting rid of for many reasons. Databases stay for a long long time.

I found that people who keep complaining about oracle license costs mostly fall in this camp

"Penny wise, Pound foolish"

They spend millions of $$$ on writing unnecassry ton of code instead of maximizing investment in the database.



Wow...

February 09, 2011 - 4:39 pm UTC

Reviewer: djb from Seattle, WA

"Not null gives lack of flexibility, more costly evolution,
inhibits the use of the database acting as a service to
applications and makes it an inhibitor to evolution"

Wow, that just makes me want to cry...

applications

February 10, 2011 - 7:12 pm UTC

Reviewer: Sam

Tom:

<<<In order for the application to enforce a constraint in a database independent fashion and to get it correctly would mean that the developer of said constraint understands DB2, SQLServer, Oracle, Informix, etc etc etc - otherwise they cannot know how to code the logic safely for all (if in fact there is a single code solution that works for all).>>

I lost you on this one. If you are writing a NOT NULL (REQUIRED) constraint in a web application do not you just write it in JavaScript on the web for example. You block submission of the form(s) unless that field has data in it. Why do you need to understand the how DB works?

The constraint should be enforced both places but more importantly the DB first.
Tom Kyte

Followup  

February 14, 2011 - 7:06 am UTC

Sam - think about it, think about all constraints, think about all environments - in short - think about it.


Now, do a referential integrity check please.
And then a primary key constraint.
And then .... and so on.


I'll turn off javascript in my browser. Now what?

I'll build a URL that calls your process - and run it with wget, no browser at all - Now what?


This is an important question

February 12, 2011 - 1:38 am UTC

Reviewer: Amir Riaz

Hi,

I work with many vendors who works with DB2/oracle/exadata/sql server/netezza/teradata. in case of OLTP applications they use hiberante and spring. they use java hibernate for ORM and database independence. Almost 90% of products are currently made this way. similarly, oracle has a product called toplink. why dont you integrate toplink with Oracle database in such a way that oracle looks like a orm database. if you change the size of a string column in ORM its get automatically changed in database.
Tom Kyte

Followup  

February 14, 2011 - 7:31 am UTC

... if you change the size of a string
column in ORM its get automatically changed in database. ...

that is so so so very backwards. That makes the naive assumption, presumption that the "application owns the data" when that is not true. The data is used by everyone, applications come and go.


I love that database independence thingy. I've actually had a developer tell me (and half the room agreed with him) that if you "check transactional in hibernate- all of the generated code will be good - fully transactional". It took quite a while for me to explain to them why this is NOT TRUE - why your logic still counts.

Even though I had a clear cut sequence of events on a projected slide that would work entirely differently in DB2 than in Oracle. They believed "if we check transactional - hibernate will make sure it works the same". As if hibernate were magical or something (it isn't - unless you count making something that could be really fast and compact really slow and untunable 'magic' :) )

I'm not saying hibernate in itself is bad. I'm saying lots of developers that use hibernate....

February 14, 2011 - 7:43 am UTC

Reviewer: Oleksandr Alesinskyy

What you mean by ""if we check transactional - hibernate will make sure it works the same". What is "check transactional" in this sentence?
Tom Kyte

Followup  

February 14, 2011 - 7:52 am UTC

check the transactional box, ask hibernate to generate "transactional code". That was their term - they "checked transactional". Guess they are using "@Transactional" in the code. Whatever interface/IDE they were using must have had a checkbox for 'generate transactional code'

@JSunico

February 14, 2011 - 7:52 am UTC

Reviewer: Oleksandr Alesinskyy

"One defense I heard for putting the "business logic" in the application vs the database is to make
the application database-independent,"
Yes, it may be a valid defense - in relatively rare cases when your applications shall be simultaneously installed by several customers insisting of an usage of their DB of choice.
That means that either said application should be so small and simply that it is not much more than a child toy or you have an (almost) unlimited budget to hire experts in each of the target DBs, test thoroughly against each of them, performance tests against each of them, do the same for each of the DB versions to which your customers plan to migrate and so on. May be feasible if you developing something comparable to SAP/R3, otherwise it is a good paved way to the hell.

February 14, 2011 - 8:01 am UTC

Reviewer: Oleksandr Alesinskyy

check the transactional box, ask hibernate to generate "transactional code". That was their term - they "checked transactional".
Unfortunately in this thread it was present in your follow-up only, so it was unclear what is meant.

Guess they are using "@Transactional" in the code.
Not a bad idea though. Anyhow (in a vast majority of cases) a client should start and end transactions and this is one of the best ways to do it.
Tom Kyte

Followup  

February 14, 2011 - 9:05 am UTC

sure - but the point was they believed (honestly) that by using that - all of their code would be magically transactionally consistent - do the same thing in all databases.

Even with an example in front of them showing otherwise... "but hibernate says it will be *transactional* and transactional means it is correct"....

That is the scary thing - they actually believe that by checking a box off - their code will be written for them without them having to understand what a transaction is, how concurrency works, the differences between the databases.

lol

February 14, 2011 - 9:20 am UTC

Reviewer: Sokrates

"check transactional in hibernate- all of the generated code will be good - fully transactional"

rofl
that made my day !

database independence

February 14, 2011 - 2:29 pm UTC

Reviewer: sam

Tom:

Yes, I know you can edit any HTML form and disable any validation. That means for data independence to work the validatin code needs to written in the application code on server and it also means it has to be written in any other application that accesses the database. right? what a nightmare.

Some FUD&I people use PHP or PERL or JAVA for coding. They calim that these languages already have those database features built-in and instead of using a database feature or P/SQL they can use the other language.

My opinion is that those might be missing many if the database functionality but i am not sure i am correct.
Tom Kyte

Followup  

February 14, 2011 - 4:05 pm UTC

Sam - I can only comment on what you write, if you write "i'll do it in javascript", don't be offended when I talk about how easy it is to turn that off. Anticipate what others will say - people will be poking holes in it. Give a better example..


the problem with doing it in the application is that the applications come and go and there is typically MORE THAN ONE of them touching the data.

So now you have to do the security the same way everywhere (good luck enforcing that)

You have to attempt (stress the word attempt) to do data integrity the same everywhere.

and when you make a change to the rules - you better hit it everywhere.

and you better hope and pray that no one ever gets around your application.


I've never heard of php, perl or java having "database features built-in" myself.

@Transactional

February 20, 2011 - 8:00 am UTC

Reviewer: Amir Riaz

Hi Tom

Most important posting of you thread, I talk to one of our leads and he presented me with following code.

@Transactional
public interface OrderService {

void createOrder(Order order);
List queryByCriteria(Order criteria);
}

the problem i see is, createOrder(Order Order) procedure must have only statement or all the DML statement which either fail or succeed. The Order entity after modification must be passed to this procedure. I see it most like optimistic transcation managment. do you agree?


Tom Kyte

Followup  

February 20, 2011 - 12:42 pm UTC

I don't really know the code that would be generated by this - I don't know what 'transactional' means in hibernate.

All I know is the developers were sure that if they used "transactional", ALL OF THEIR CODE WOULD MAGICALLY BE TRANSACTIONALLY CONSISTENT AND CORRECT. Which is beyond naive - it is dangerous and "not smart"

February 21, 2011 - 4:53 am UTC

Reviewer: Oleksandr Alesinskyy

It is not "beyond naive" - it is almost correct.
If such code is executed in an appropriate environment (JEE container, Spring, ...) DB connections and transactions will be managed by the environment ensuring that specified transaction isolation levels and transaction propagation strategies are observed (@Transactional without any further qualifications denotes read committed and requires (latter means join existing transaction if it is in place otherwise start new one).

Definitely, it is developer's responsibility to ensure that these settings are suitable for the application running against the specific DB.

Tom Kyte

Followup  

February 22, 2011 - 9:51 pm UTC

HUH - I usually agree with you - but here - you are just wrong....

I wrote:


All I know is the developers were sure that if they used "transactional", ALL OF THEIR CODE WOULD MAGICALLY BE TRANSACTIONALLY CONSISTENT AND CORRECT. Which is beyond naive - it is dangerous and "not smart"


read that, re-read that. consistent AND CORRECT


The developers truly believed that no matter what they wrote, no matter their logic, no matter their approach -

IT WOULD BE TRANSACTIONALLY CONSISTENT and CORRECT.

eg: they did not need to think
eg: they did not need to understand
eg: they did not need to know how concurrency was dealt with in the database
eg: the same results would be found in all databases

that is beyond naive, that is reckless.

I don't think you actually think that way - that running the same code in DB2 would have the same effect in Oracle and vice versa. I may be wrong, but correct me if I'm wrong. If you believe that by checking "transactionally consistent" (which effectively means 'I will add a commit for you' and nothing more") that means 'my code is transactionally correct - it is perfect - it cannot be wrong' and we'll have to agree to disagree.

they truly believed it meant "same results everywhere, regardless of database, they are just bit buckets"


we are talking about people, that in general, do not understand the meaning of "transaction"

@Transactional

February 22, 2011 - 1:30 am UTC

Reviewer: Martin Vajsar from Prague, Czech Republic

Oleksandr,

the problem, as I see it, is that I could have several hundred of such services in a project, every flagged with an annotation. Is it really possible to flag every such piece of the application in such a way that any combination of them in any of existing code paths produces correct, desired outcome? Do the developers really consider all possible combinations (even as the application evolves and existing services are reused in new code paths), on all databases their database agnostic software is expected to run on?

It seems a tall order to me. Just curious.

I think Oleksandr Alesinskyy is correct.

February 22, 2011 - 1:34 am UTC

Reviewer: Amir Riaz

All the libraries which are used to connect to Oracle are those which we use in tom's described optimised example, further hibernate and spring take care of the following problems

cursor managment,
transcation managment,
connection managment,
ensure bind variable usage.

There are special Oracle transcation managment classes for oracle, connection manager does not uses its own jdbc drivers instead, it uses the Oracle jdbc drivers we have provided it. cursors are opened before each statement and closed after execution

This is the briefing i have so far and since i know java quite well, I must say, i am impressed.
Tom Kyte

Followup  

February 23, 2011 - 10:41 am UTC

He is correct in that there will be commits - there will be transactions, their work will be "transactionally consistent" inasmuch as they coded.

However, they (the developers) truly and honestly believed that by using 'transactional' their code would automagically be transactionally consistent and correct. No matter the database - it would be correct and do the "right thing". They were naive and ignorant of how the databases each implement concurrency and could not fathom the fact that their code would work differently on differing databases - resulting in different (incorrect) results on each.

@Transactional

February 22, 2011 - 4:51 am UTC

Reviewer: Martin Vajsar from Prague, Czech Republic

Amir, let's say I have ten services, all marked as requiring read committed isolation level, and one service requiring level serializable. I start executing first ten read-committed services and at the end use the service marked as serializable. Of course that I need the whole transaction to be serializable. How could the transaction manager know that it should have used serializable isolation level right from the start?

@Martin Vajsar

February 23, 2011 - 8:06 am UTC

Reviewer: Oleksandr Alesinskyy

What you mean by "whole transaction"?

Do you mean that service calls are nested?
In this case it would not matter if you use a declarative or programmatic approach to the transaction control. If the service that starts transaction does not know the isolation level required it cannot set it. And if it is known then just put the proper annotation.

Or you mean that some method starts transaction and the calls those services one after another? Then this "starter" method shall know required level and set it (either be annotated or do it programmatically).


February 23, 2011 - 8:20 am UTC

Reviewer: Oleksandr Alesinskyy


Followup February 22, 2011 - 9pm Central time zone:
HUH - I usually agree with you - but here - you are just wrong....

I wrote:

All I know is the developers were sure that if they used "transactional", ALL OF THEIR CODE WOULD MAGICALLY BE TRANSACTIONALLY CONSISTENT AND CORRECT.

And I wrote Definitely, it is developer's responsibility to ensure that these settings are suitable for the application running against the specific DB.

But anyhow the code would be magically transactionally consistent (according to an isolation level definition, expressed in terms of phenomena). Yes, it is clear that it would have quite different implications depending on DBMS (or even a mode, in which DBMS is operating). But it is a different matter - it does not repeal the fact that an operation would be transactionally consistent. As well as the fact that this consistency may be completely unsatisfactory or may have imtolerable side-effects rendering their application unusable.

Tom Kyte

Followup  

February 24, 2011 - 12:21 pm UTC

but my entire point was - the developers honestly and truly believed "hibernate would make my code CORRECT just by pressing the transactional button". Nothing else matters at that point, they are just wrong.

It is BEYOND naive to say that, it is beyond naive to say:

All I know is the developers were sure that if they used "transactional", ALL OF THEIR CODE WOULD MAGICALLY BE TRANSACTIONALLY CONSISTENT AND CORRECT.

February 24, 2011 - 2:17 pm UTC

Reviewer: Oleksandr Alesinskyy

Ok.

1. Hibernate does not have buttons. So it is just impossible to "press the transactional button".

2. @Transactional annotation put on the class or methods does make them transactional (assuming the code is executed in an appropriate environment).

3. It does not make the code automatically correct and well suited to an arbitrary DB.
Tom Kyte

Followup  

February 24, 2011 - 6:46 pm UTC

1) I understand that completely. I did not say that- whatever gui they were using to develop code must have had a button of some sort.

2&3) exactly - and exactly and precisely why I said what I said - that they were beyond naive.

Where to queue?

March 03, 2011 - 12:36 am UTC

Reviewer: Marat Tolgambayev

Tom,

back to my initial question and your followups: could you please suggest the better approach for handling high amount of requests to the database (via mod_plsql of OHS)?

What I've done is the following:

I have Oracle DB which exposes web services (with HTP package). Client requests come via OHS with mod_plsql. Most of stored procedures, which implement web services, are completed very quickly, but some of them might take longer time.

I wrap long-running procedures with jobs, so every appropriate HTTP request will activate a job. The jobs will do heavy SELECTs/INSERTs. The amount of such requests might be high - millions per day.

The system will be deployed to Amazon web services, so some additional queue services will be available (I don't think it worth to use them in my configuration though).

OHS will be running on the same (virtual) machine as the database. The database will be 11g SE initially.

Which bottlenecks would you see in this architecture? To what to draw attention?

Thank you!
Tom Kyte

Followup  

March 03, 2011 - 7:48 am UTC

I had to go lookup what the heck OHS stood for (I hate acronyms that are not spelled out at least once). We'll assume OHS = Oracle HTTP Server, in other words, the web server.


Limit the number of concurrent jobs PLUS the number of connections mod_plsql is allowed to make to the database so as to not overwhelm the resources you have available if they all become active concurrently.

Queue at the application server - not in the database. That is, if you have say 2 CPU's available to you - you probably don't want more than 4-8 "active" things in the database at any point in time. So configure maybe 2-4 job queues and 4-6 connections to the database from apache max (you'd be surprised how far that'll get you). In this fashion, the job queue jobs cannot overwhelm the machine and neither can the webserver.

mod_plsql

March 03, 2011 - 9:05 am UTC

Reviewer: sam

I wonder what Marat large number of requests equates to.

MOD_PLSQL scales very well. I am also using an older version of it that comes with the 10g app server.

I have an application that processes 20,000 - 40,000 requests per day without any scalability or performance issues. The database is set to handle 400 concurrent connections, All requests are coming through oracle HTTP server and mod-plsql and invoke a stored procedure.

Asktom site is running mod_plsql and that gets large number of requests per day.


Tom Kyte

Followup  

March 03, 2011 - 9:14 am UTC

... The database is set to handle 400
concurrent connections, ...

so, do you have 100 or so cpus?

Do the math, do a little division.

I hope for your sake you never actually get 400 concurrent requests because I have a feeling your database would fall over dead.

web

March 03, 2011 - 7:14 pm UTC

Reviewer: sam

What math? I do not have more than 100 sessions at same time.

The transactions are not not CPU intensive. It takes 2 second or less.

I think the machine has 16 CPUs I believe.

On average we have 100 sessions showing up in oracle. Many are idle waiting for mod_plsql to clean it up.

I am not sure why you sound surprised.

Is it too much to have 1,000 requests per hour or 15 requests per minute?
Tom Kyte

Followup  

March 04, 2011 - 8:37 am UTC

Sam

the math is rather simple. If you have a possibility of 400 connections (as you yourself wrote) - please do the math and tell me what would happen if all 400 connections happened? What would happen to your machine

do the math.

You have your 16 cpu database machine configured to accept 400 connections. Please do the math.

If you do the math, you will confirm that you have set up your machine to fully support a trivial denial of service attack (DOS). I can hit your machine using your web server and make you pull the plug it would be so busy. If I fired up some clients to repeatedly hit your webserver right now - and you went up to 400 connections, your machine would fall over. It would lock up. You would not be able to get on there to figure out what was going on.


So, I ask you - after you do the math, is 400 really the right number?

Or perhaps is 100 a better maximum number?


I don't care of they are idle - they have the ability to become "NOT IDLE" and they have the ability to far exceed 100.


Since you know, by doing the math, that you cannot really deal with 400 concurrent database sessions - why on earth would you even permit it.


I sound surprised Sam because we had this discussion a while ago - you asking how many connections to configure.

Having more than 10 times the number of cpu's configured is setting yourself up for a self inflicted DOS attack.

What happens if one of your procedures that used to take a second starts taking 10 seconds? Might the number of connection requests start to go up (because they are not coming back as fast as they used to). Now, these are taking 10 seconds too (a bad plan leaked into the system). So, the system starts running even slower - these 10 seconds procedures are now 30 second procedures. Which leads to people hitting stop and reload - causing more connections and the 30 second procedure becomes a 90 second procedure and so on

until you hit 400 connections and have to pull the plug.


That is what I meant by DO THE MATH. I meant for you to just do the math. Think about what would or could or more likely WILL happen to your database someday.


You can do a 1,000 seconds per second probably, the number of requests isn't at question here - the number of concurrent connections is.

Two seconds?

March 04, 2011 - 10:44 am UTC

Reviewer: Enrique Aviles from Florida

The transactions are not not CPU intensive. It takes 2 second or less.


That could still be slow.

If most transactions fall in the "or less" category, meaning less than a second then things look better than transactions taking close to 2 seconds. A considerably powerful server can do A LOT in two seconds which is something people tend to overlook. I've seen cases of "fast" queries that just take a couple of seconds perform thousands of consistent gets. Add the potentially high number of concurrent users you are allowing to hit the DB and I can clearly see how the server will flip over.

Tom Kyte

Followup  

March 04, 2011 - 11:37 am UTC

Agreed.

the goal, in system design, is to not allow the system to kill itself. A lot of people, for some reason, find it counterintuitive that FEWER CONNECTIONS can do a lot more work (better throughput) than A LOT OF CONNECTIONS can do.

Allowing a 16 cpu machine the potential to crank up to 400 concurrently active connections is a time bomb waiting to go off. All it takes is one "bad plan" or some transaction going awry in a way you didn't anticipate to start a connection storm that will end with a dead server - and with you looking bad, especially when you cannot tell them "why it happened" or what you are doing to make sure the thing you don't understand won't happen again.

connections

March 05, 2011 - 10:51 pm UTC

Reviewer: sam

Tom;


I cant do the math unless you tell me your secret formula for the corelation between CPUs and Number of concurrent connections.

Does not that *depends* on transactions begin processes and how much it use CPUs.

We have 16 CPUs on the machine but it also has another 100 oracle instances. I was not the architecture for this too.

Based on your earlier reponse, you say that with a 2 CPU machine you would set up 4 or 8 connections. That sounds awfully low to be able to support 8 users loggin in at same time.

FYI, I do not have 100 different users but i have application that keeps hitting the database which should be similar to people sending POST request anyway,




<<If you do the math, you will confirm that you have set up your machine to fully support a trivial denial of service attack (DOS).>>


Explain to me what difference does number of connections have to do with DOS attacks. What do you setup asktom at? Cant anyone use a client program to hit your server with thousands of requests per minute to shut down your site regardless of your setting for number of concurrent connections allowed. Does lowering the number of connections help you defend against DOS?


Tom Kyte

Followup  

March 07, 2011 - 11:03 am UTC

I cant do the math unless you tell me your secret formula for the corelation
between CPUs and Number of concurrent connections.


Ok, now you are scaring me a bit. As a person that works in Information Technology - working with computers - building things that run on computers - you do not know the basic fact of "how many things can a single cpu/core do at a single point in time" and "what happens when you run more things simultaneously on a single cpu/core"???

Really? If you start unzipping 50 files at once on your 1 or 2 core laptop or desktop computer - what would happen? How would it perform? Would it do very well? Or - would you be much much better off unzipping maybe 2 to 4 files at a times (depending on how many cpu/cores you have - the more you have the more you can possibly do at a single time) and when one of the unzippers finished - you would start another. You wouldn't want to overwhelm the machine - it would take much longer for 50 concurrent unzippers to finish then if you queued them up reasonably and did NOT overwhelm the machine.


Does not that *depends* on transactions begin processes and how much it use
CPUs.


to some degree yes, but 400 connections - unless you have at least 40 cpus and very very very IO bound transactions - you are way out of the ball park. close your eyes, imagine what would happen to your system if the 400 possible connetions you have all became (or in reality 'tried' to become) active at the same time. Re-read what I wrote above - I described a connection storm. I don't care if you haven't seen it happen YET - it will happen at some point.

What is the point of having 400 connections possible on a machine that cannot possibly do 400 concurrently active connections????


Do
The
Math


Based on your earlier reponse, you say that with a 2 CPU machine you would set
up 4 or 8 connections. That sounds awfully low to be able to support 8 users
loggin in at same time


why? How many active sessions to you have at a time (truly ACTIVE - not just trying to be active - but actually doing work - running on the cpu at the same time)???? Why do you need more? 8 connections can be used to support hundreds/thousands or more uses (using an application server that queues up requests and sends them as connections become inactive)



Explain to me what difference does number of connections have to do with DOS
attacks


re-read what I wrote above, read it slowly and critically... I wrote a very long reply on this already.

March 07, 2011 - 5:11 pm UTC

Reviewer: Alexander

"100 instances" on a machine should pretty much end the conversation. Forget any hope of achieving good performance. You have way bigger fish to fry than your number of connections.

#23: Database as a Fortress --[of] 97 Things Every Software Architect Should Know

March 20, 2011 - 2:58 pm UTC

Reviewer: Duke Ganote from 39.08,-84.33

Hip, hip, hurray!

"While business rules and user interfaces do evolve rapidly, the structures and relationships within the data you collect often do not. Therefore, it is critical to have your data model defined right from the start, both structurally and analytically. Migrating data from one schema to another in situ is difficult at best, time consuming always, and error prone often. While you can suffer bugs temporarily at the application layer, bugs in the database can be disastrous. Finding and fixing a data layer design problem does not restore your data once it has been corrupted.

"A solid data model is one that guarantees security of today’s data, but also extensible for tomorrow’s. Guaranteeing security means being impervious to bugs that will – despite your best efforts – be pervasive in an ever-changing application layer. It means enforcing referential integrity. It means building in domain constraints wherever they are known. It means choosing appropriate keys that help you ensure your data’s referential integrity and constraint satisfaction. Being extensible for tomorrow means properly normalizing your data so that you can easily add architectural layers upon your data model later. It means not taking shortcuts.

"The database is the final gatekeeper of your precious data. The application layer which is, by design ephemeral, cannot be its own watchdog...

http://softarch.97things.oreilly.com/wiki/index.php/Database_as_a_Fortress

#54 : It is all about the data

March 23, 2011 - 6:27 pm UTC

Reviewer: Duke Ganote from 39.08,-84.3

To DMF

November 18, 2011 - 11:25 am UTC

Reviewer: Mark from NY

"Not null gives lack of flexibility, more costly evolution, inhibits the use of the database acting as a service to applications and makes it an inhibitor to evolution"

Translation: "Preventing the application from doing whatever the heck it wants with the database is bad."

November 29, 2011 - 5:25 pm UTC

Reviewer: A reader

Hi Tom,

o access control
o security in general

Do you suggest every "business" user (i.e. online shopper, forum/social network user, game player etc etc) to be given "real" Oracle credentials? Given the most of them "come and go" - usually it's standard 80/20 rule for "registered users"/"regular users", I don't believe any of them deserve anything but a record in "SITE_USERS" table.
It might be a good idea, though, to have 2 separate connection pools, say, "SITE_USER" and "SITE_ADMIN" to prevent users doing dodgy stuff.

Regarding the "apps come and go" statement: databases do "come and go" as well. It often happens that the system is developed using MySQL/PostgreSQL/Firebird DB due to initial lack of resources, then goes to Oracle with the first round of investment ("Now we can afford the REAL database", some "bright" heads say) and finally dev team has to deal with MS SQL Server in production, because the company or its investor becomes "Microsoft Valued Partner" or being acquired by one. Nothing to do with wise tech decisions, everything to do with boardroom politics.
Tom Kyte

Followup  

November 30, 2011 - 7:10 am UTC

Do you suggest every "business" user (i.e. online shopper, forum/social network
user, game player etc etc) to be given "real" Oracle credentials?


No, that is probably impractical.

That said, that does not prevent, prohibit, inhibit you from using database security in any way shape or form. You have fine grained access control, you have secure application roles, you have many tools at your disposal.


the database didn't come and go in the above. The DATA STAYED AROUND FOREVER in your example. And if you were to have put the logic in the database - as you moved from system to system - you would still have one place to verify it all is in place.

Since the databases are different and a set of logic that WORKS on one database fails miserably on another - the more you have in the application, the more you have to review in the application.


And - apps come and go a lot faster than people decide to change databases.

November 30, 2011 - 2:50 am UTC

Reviewer: veggen

I have to strongly disagree with the idea of applications coming and going and databases being eternal. I've worked for many clients in different industries and have yet to see a confirmation for such a claim. I am working on replacing an Oracle DB with a different solution right now, for a fairly big media house. So it obviously does happen that the app stays and the DB gets replaced. Moral of the story: design your DB wisely, use it's features wisely, but do not make your code absolutely dependent on them (replacing Oracle's CONNECT BY PRIOR was particularly adventurous).
Tom Kyte

Followup  

November 30, 2011 - 7:14 am UTC

the apps you use today on a day to day basis.

did they exist 5 years ago? 10 years ago? In their current form (answer is no, 10 years ago there were more client server than web based)? Maybe the same fundemental "app" exists - the concept - but it has been written and re-written and SUPLEMENTED (mobile apps for example) 1,000 times over.



web

December 04, 2011 - 8:55 pm UTC

Reviewer: A reader


RE: replacing Oracle's CONNECT BY PRIOR was particularly adventurous

December 05, 2011 - 8:55 am UTC

Reviewer: Duke Ganote from 39.06,-84.24

@veggen. Should NOT be adventurous; recursive CTEs are available in any modern DBMS. Trying to replace built-in, close-to-the-data functionality with Java AND make it performant: that sounds adventurous!

December 11, 2011 - 6:32 pm UTC

Reviewer: A reader

>Since the databases are different and a set of logic that
>WORKS on one database fails miserably on another - the more
>you have in the application, the more you have to review in
>the application.

This is simply not true. The application has been successfully and almost seamlessly ported between 4 databases in its lifetime. Yes, we had EVERYTHING in the application, using database as a "data storage": no triggers, no stored procs were allowed. 99% of "porting" was simple Hibernate reconfiguration and everything just worked. We did have a problem with MySQL locking the entire table on "autoincrement" field during INSERT statement, though, but irony was that MySQL was the database we started with.

>The DATA STAYED AROUND FOREVER in your example.
Not true. There was no valuable data, because the app was brand-new.

>And - apps come and go a lot faster
>than people decide to change databases.

In corporate or government world this might be true. Smaller companies do merges and acquisitions, which often leads to "replace everything" command.


Tom Kyte

Followup  

December 12, 2011 - 5:59 am UTC

if the app is brand new, you are by definition at the start of the cycle. Your app will become "not relevant", the data - unless this is a game - will not.

if there was no valuable data, what is the point.

and I seriously wonder how your data integrity (oh wait, you'll tell us that is not important - because the data isn't valuable - which leads me back to "so why is your app interesting to anyone?") is... I've yet to see a "we do it all in the app" implementation that doesn't have data integrity issues.

December 11, 2011 - 6:46 pm UTC

Reviewer: A reader

>did they exist 5 years ago? 10 years ago? In their current
>form (answer is no, 10 years ago there were more client
>server than web based)? Maybe the same fundemental "app"
>exists - the concept - but it has been written and
>re-written and SUPLEMENTED (mobile apps for example) 1,000
>times over.

The same applies to the database, AND the "sacred" data as well - it simply goes to "archive" storage after certain period of time. Nobody really gives a damn about account details Joe Citizen used to buy few shares back in 1999 and sell them back in 2001 with 100K loss. Unless requested specifically, of course.
The other example would be my current project - 13 years old oracle forms/reports monster with business logic spread equally between packages, triggers, functions and FORMS code. The database changes accordingly - tables and packages modified or dropped, data archived and/or deleted, etc etc.
Tom Kyte

Followup  

December 12, 2011 - 6:00 am UTC

people do give a damn about their financial details, they frequent flyer history, their HR information, the amount of money they've paid into their 401k's, etc etc etc. people actually do care - you probably do too - you just don't even realize it.

December 11, 2011 - 7:17 pm UTC

Reviewer: A reader

>No, that is probably impractical.

>That said, that does not prevent, prohibit, inhibit you from
>using database security in any way shape or form. You have
>fine grained access control, you have secure application
>roles, you have many tools at your disposal.

Actually, it does. The application uses connection pool and generic authentication (say, SCOTT/TIGER). User authenticates against middleware container (being it web form or WS call, whatever). This is the most standard architecture one could possibly imagine.
Since secure application roles and, especially, VPDs (unfortunately you need Oracle Enterprise Edition to use them) are awesome tools, they bring code duplication issue: smart UIs disable/hide restricted elements and displays "NOT FOUND" (or "ACCESS DENIED") page if one's using request forgery attack to "enable" it back. Since it has already been covered in middle tier, why duplicate the same behavior in the database?
Additionally, f/g access control and secure application roles are exactly those kind of things which "works fine on DB X but fails miserable on DB Y (or even DB X version +Z)" because they are not standartised. Avoiding these problems are the very reason for popular "stick to SQL92 standard, vendor extensions prohibited" approach in modern application development.
Tom Kyte

Followup  

December 12, 2011 - 6:04 am UTC

they bring "multiple layers of defense".

doing it in the APP FORCES code duplication - as you'll have a mobile version of your web site, you'll have the android app, you'll have the iphone app, you'll have the full web site app and so on.


sql92 doesn't do anything for concurrency issues. Tell me - list out what the major difference between sql server and oracle (default installs, the most common settings).

Tell me - does the logic (transactional logic) for doing foreign key checking that works in sql server work in Oracle. Post your work, tell us step by step how you would implement a foreign key relationship. Are even just this: I need a business rules that makes "project_name" unique in this table when status = 'ACTIVE'. Inactive projects are allowed to have duplicate names:


create table proj( project_id int, project_name varchar2(30), status varchar2(10) );


so, what logic would you or your hibernate implement to enforce that simple business rule? Make it work on all databases consistently and correctly.

on "modern" application development

December 12, 2011 - 6:32 am UTC

Reviewer: Stew Ashton from Paris, France


One of the happiest times in my life was when I found out that "post-modern" art had arrived. It became official that "modern" was played out. This was before Java was born, by the way.

One thing J2EE and Hibernate have in common: they think only quick transactions exist. Hibernate is worse than useless for reporting and batch programs, and J2EE doesn't even know what batch is.

Your "modern" developments must not have any reporting or batch needs. Oh wait, you can just replace batch with messaging-based updates: good luck with scalability...

To belabor the point: over the last 4 years, I have consistently been able to speed up batch jobs from hours to minutes and occasionally from hours to seconds by replacing Java code with set-based SQL. Often, the application owner simply refuses to accept my solution on the grounds that "it's too complicated, my developers will never be able to maintain it."

The point is, Hibernate and other frameworks are imposed by management, not because they're "modern" or "better", but because the developer is expected to do his job without really knowing what's going on. Incompetence has become an IT strategy that drives "modern" application development.

December 12, 2011 - 9:31 am UTC

Reviewer: A reader


<quote>
Incompetence has become an IT strategy that drives "modern" application development.
</quote>

May sound rude ... it is 100% accurate .

I am glad , I am not only one with this idea.

Then it shouldn't matter

December 12, 2011 - 9:32 am UTC

Reviewer: Galen Boyer from Boston

> Since it has already been covered in middle tier, why duplicate
> the same behavior in the database?

So, then, it should not matter to you if the constraints are applied in the
database, correct? Your app takes care of all of it, right? But, a real issue
I found with more than one architect that had the same argument I see here was
that they were truly afraid that the constraints being applied in the database
might cause errors in the app. I would reply, "Well, you said your app handles
them all, why are you worried?". It would take only a small amount of time for
me to show you that you do have data integrity issues. And notice I'm saying
you have data integrity issues. And don't come back saying, we have no data
issues. You truly have no idea whether you have data issues because you haven't
asked the database to protect your data.

And, basically, that is the real issue with what your planned architecture ends
up with. It ends up with something that cannot be 100% trusted. If, when
showing some higher-up business sponsor the datamodel and he asks you, do we
have any data integrity issues, how do you answer them NO with 100% certainty?
You can't. There is no way for you to because your picture isn't guaranteed to
be what's in the database. Using a proper database design tool, you can say,
"See this line here with crows-feet and the customer# is NOT NULL? No issue
there, Oracle has guaranteed that every account has a primary customer". When
one looks at a picture of a datamodel that is used to create the schema deployed
in production, they trust it. When it does not translate to a schema, it isn't
fully 100% trusted that the data has no issues.

The only reason they might trust your saying "I am 100% certain we have no data
integrity issues" is that you are probably a smart guy that they trust. But,
when you say it that way, you are lying, whether you realize you are lying or
not.
Tom Kyte

Followup  

December 13, 2011 - 7:05 am UTC

change data integrity into security in the above and the paragraph would still be 100% true.

data integrity
security

they are somewhat relevant.

December 13, 2011 - 1:02 am UTC

Reviewer: A reader

>people do give a damn about their financial details, they
>frequent flyer history, their HR information, the amount of
>money they've paid into their 401k's, etc etc etc. people
>actually do care - you probably do too - you just don't even
>realize it.

Yes, that's why the data is saved in archives, not destroyed. If people want their "frequent flyer history" - they get a report, not 10 years worth of data from operative table.
Tom Kyte

Followup  

December 13, 2011 - 7:20 am UTC

my frequent flyer stuff better not be in any archives - nor my 401k, nor my financial stuff - I have stock holdings going back 10-20 or more years - I sort of want them *available*. My health records - same thing - I don't want a Dr making an uninformed decision about me because they forgot to look in some archive. All of that data I listed had better be online and available.

I think you are working on some very simple apps with relatively un-important data.

December 13, 2011 - 1:08 am UTC

Reviewer: A reader

>doing it in the APP FORCES code duplication - as you'll have
>a mobile version of your web site, you'll have the android
>app, you'll have the iphone app, you'll have the full web
>site app and so on.

This is where it gets wrong. What you've listed here are merely "facades", not the "apps". The "app" would be the middleware (written in java, c#, python, whatever), accessible locally or over the internet (WebServices, RESTful services, XML/RPC, Binary RPC - whatever) by android/iphone/web/whatever.
You seriously didn't mean storing your database connection details, including username and password, in iphone/android app, did you?
Tom Kyte

Followup  

December 13, 2011 - 7:22 am UTC

maybe it would, maybe it wouldn't. My android apps look a lot like client server apps.

And all of those 'technologies' you listed - hah, they come they go - they are very "fashionable" today. What about tomorrow. The database on the other hand - it'll still be there.

I could replace your webservices/restful blah blah blah with Corba/IDL, with "OO", with <fancy dancy whimsy of the moment>. Application development 'paradigms' come and go even faster than applications.


... You seriously didn't mean storing your database connection details, including
username and password, in iphone/android app, did you? ...

wtf? how did you even leap to that conclusion????? seriously??

December 13, 2011 - 1:14 am UTC

Reviewer: A reader

>sql92 doesn't do anything for concurrency issues. Tell me -
>list out what the major difference between sql server and
>oracle (default installs, the most common settings).

Optimistic vs pessimistic locking, yeah, right. 3% of performance sensitive code (bulk data processing etc) will be affected. Rule of thumb: if code is THAT performance sensitive, it goes to stored proc (if database supports that).
Apart from that 3% - no differences at "View/Create/Modify/Delete" screens (which are generally account for 90% of total app code) whatsoever.
Tom Kyte

Followup  

December 13, 2011 - 7:25 am UTC

oh boy, if you thought I meant optimistic/pessimistic - way off. I was talking about the inherit massive differences in the concurrency models as implemented by the various database vendors that makes a sequence of code executed simultaneously by many users result in different outcomes against the same exact set of data.


Rule of thumb: if code is
THAT performance sensitive, it goes to stored proc


are you the same "a reader" from above? did you just write that? That is sort of what we've been saying - but drop the word 'performance'.

rule of thumb: if the code is about data it goes to stored proc.



December 13, 2011 - 1:25 am UTC

Reviewer: A reader

>One thing J2EE and Hibernate have in common: they think only
>quick transactions exist. Hibernate is worse than useless
>for reporting and batch programs, and J2EE doesn't even know
>what batch is.

Agreed. JEE+Hibernate for data entry, database for batch processing and reporting - IF the "data" is not acquired from external sources (Google API, for instance).

>The point is, Hibernate and other frameworks are imposed
>by management, not because they're "modern" or "better",
>but because the developer is expected to do his job
>without really knowing what's going on. Incompetence has
>become an IT strategy that drives "modern" application
>development.

Almost valid point, but it's labor cost that causes all that mess. The mantra is - "whatever you well-paid consultant bastards do, must be understandable and maintainable by cheap junior developers when you go away". Too bad these cheap devs don't even understand how Hibernate works and usually panic when see something like @Transactional(REQUIRES_NEW).
Tom Kyte

Followup  

December 13, 2011 - 7:27 am UTC

Too bad these cheap
devs don't even understand how Hibernate works and usually panic when see
something like @Transactional(REQUIRES_NEW).


Even worse is when they believe if they see @transactional - it means all of the code that follows is perfectly "transactional and fail-proof and correct". Many more than once I've had a hibernate developer tell me "we don't need to think about concurrency issues, hibernate has this transactional thing that makes our code perfect". Hah.

December 13, 2011 - 1:34 am UTC

Reviewer: A reader

>So, then, it should not matter to you if the constraints are
>applied in the database, correct?

Hold your rant, please, it obviously is not correct. I never said CONSTRAINTS should not be applied in the database. They should.

We're discussing where the LOGIC should be, and I'm particularly interested in handling user security in multi-tiered application. When I had "Oracle-only, Enterprise edition OK" project, it was all great - we just used PVD to enforce customer security. Unfortunately, not many customers are that generous about databases.
Tom Kyte

Followup  

December 13, 2011 - 7:49 am UTC

I never said CONSTRAINTS
should not be applied in the database. They should.


where? you were asking above why this stuff should be duplicated in the database? You have a very confusing way of saying things if this most recent post is true. You wrote:

Yes, we had EVERYTHING in the application, using database as a "data
storage": no triggers, no stored procs were allowed. 99% of "porting" was simple Hibernate
reconfiguration and everything just worked. We did have a problem with MySQL locking the entire
table on "autoincrement" field during INSERT statement, though, but irony was that MySQL was the
database we started with.



EVERYTHING sounds like EVERYTHING to me - constraints, security, data processing - everything. You used the database as a bit bucket - nothing more.
.

...and I'm particularly interested in
handling user security in multi-tiered application....

that is just another type of constraint, it is no different from a constraint. it is there to protect the data.


December 13, 2011 - 1:52 am UTC

Reviewer: A reader

>Post your work, tell us step by step how you would implement
>a foreign key relationship

Why would I? Foreign keys are perfectly ok, it's security design I'm interested in.
Tom Kyte

Followup  

December 13, 2011 - 7:31 am UTC

since most every data breach out there is caused by holes in applications, the magical stuff in the middle tier - be it sql injection or whatever, and not database breaches, you might want to reconsider that.


Unless you are the smartest developer and everyone your work with is at least as smart and you develop perfect applications that have no holes.


Security is about defense in depth, multiple layers of defense, so that when one layer shows a weak spot - as it will - the other layers are there to help protect it still.

Implementing Security and Logic at the DB

December 13, 2011 - 11:47 am UTC

Reviewer: Forklift Six from USA

A Reader,

Your examples of why to put the security and data save/update code in the middle tier are exactly why we put it in the DB.

We have a near-terabyte-level DB that interacts w/ a dozen remote systems (custom apps, COTS products, etc). On top of this DB we have four web-based Apps (2x .Net 1.1, 1xCOTS, 1x .Net 3.5/4.0), several Python-based tool-sets, a Java utility, and 3 different reporting services. We are in the process of standing up two more .Net apps, an APEX app, and a COTS-based client-server editing tool. The only way we have been able to make this work is by putting all the security and logic in the DB. We also have some odd business and legal issues that (reluctantly) required us to provide direct SQL*Plus access to certain user groups. Attempting to manage security (and logging, and complex save behaviors, and ...) in each of these differing Apps and environments would be cost, time, and audit prohibitive.

The solution has been to implement security and business logic in Pl/SQL and internal security tables. We have a standard PL/SQL API that implements all business logic (to include security checks). We have wrapped the Security tables in a PL/SQL API, which we can further wrap in .Net, Java, APEX, Python, etc standard security APIs (we have a .Net Role provider that references these APIs, for example). Write it once, and all of our systems can use it natively - no special code required. Change it once, and the change cascades transparently across the all the stacks.

As for your earlier comment about exposing UI capabilities (edit buttons, etc), you are absolutely right - and this architecture lets us do so very, very easily. Our Apps query either their standard security providers (above) or a common "what can I do" PL/SQL call which gives them a list of allowed behaviors. From these they can configure and constrain the UI however they need to.

This also allows us to run a "legacy" app and its replacement side by side for an extended period w/o (in most cases) adversely affecting either one - invaluable for complex business transitions.

And while this is an Oracle-specific solution we could, if required, port it to some other DB by changing the one PL/SQL codebase. I'm not sure why we would, as Oracle gives us capabilities that would be problematic (but somewhat possible) to replicate in another environment. As far as DB upgrades go, however, it made our recent 10g-to-11gR2 RAC upgrade almost frightening smooth.

December 13, 2011 - 8:16 pm UTC

Reviewer: A reader

>I don't want a Dr making an uninformed decision about me
>because they forgot to look in some archive. All of that
>data I listed had better be online and available.

"Archive" does not necessarily mean "offline".

>I think you are working on some very simple apps with
>relatively un-important data.
WOW! Personal attack?

>And all of those 'technologies' you listed - hah, they
>come they go - they are very "fashionable" today. What
>about tomorrow. The database on the other hand - it'll
>still be there.
WHAT "database"? Mainframe used in early 90's? Oracle 6.0 that replaced it in 1998? SYBASE ASE 15 because the company has recently been acquired by big fat multinational and had to comply with its IT policies?
If your statement was true, we would see plenty of ancient Flat/Hierarchical/Network databases from the late 80's around, not to mention FOXPRO and PARADOX systems. RDBMS is just yet another "database technology".

>I could replace your webservices/restful blah blah blah
>with Corba/IDL, with "OO", with <fancy dancy whimsy of the
>moment>. Application development 'paradigms' come and go
>even faster than applications.

Transport is NOT an "application development paradigm". It really does not matter what protocol client uses to access application server - it's ol'good RPC (remote procedure call) and last time I checked, the concept didn't change since 1970's. Corba/IDL is just an RPC implementation, just like everything else, and is totally irrelevant.

>wtf? how did you even leap to that conclusion????? seriously??

hmmm, having all "data" and "logic" in the database (middle tier is evil), how would you connect your iPhone device to it?
Tom Kyte

Followup  

December 14, 2011 - 7:29 am UTC

"Archive" does not necessarily mean "offline".


why would I archive data I need then, what would be the purpose? other then to make things infinitely more complex and slower?

I either

a) need access to the data
b) don't need it

If I need is, 'archiving' it would make it that much harder to retrieve.

In a transactional system, the size of a table is not relevant. You are using an index to retrieve a manageable set of rows. I don't care if the table is 10,000 records or 10,000,000,000 - it takes about the same amount of time to retrieve a manageable set of rows from it. Archiving isn't something to do to that particular set of data.


WOW! Personal attack?


nope, statement of belief. Your data doesn't seem to be very important does it? It isn't financial data, it isn't related to health, it isn't employment records. It in short is data that has a very short shelf life and you don't seem to care too much about it after a while. You've said that. I therefore made a simple assumption about the nature of the data. The data isn't relevant to you - the application is.

WHAT "database"? Mainframe used in early 90's? Oracle 6.0 that replaced it in
1998?


sorry - you meant 1988, not 98'. And in 1987 I was programming on a mainframe using DB2 on MVS, SQL/DS on VM/CMS. And it was only 1987 because that is when I started programming - the systems had been in use for years.

And Oracle was on the mainframe then too - for many years. In fact, we wrote the system C compiler for the mainframes in order to get Oracle there. Oracle was on the mainframe in the 80's too.

If your statement was true, we would see plenty of ancient
Flat/Hierarchical/Network databases from the late 80's around, not to mention
FOXPRO and PARADOX systems. RDBMS is just yet another "database technology".


Umm, maybe you need to get out more? You know how many ancient flat/hierarchical/networked databases are out there? Quite a few. There is still more cobol code than anything else floating around. Just because you don't happen to see it where you work doesn't mean it doesn't exist.

and foxpro/paradox slowly evolved to use things like SQL and support ODBC access which allowed them to be more or less 'upgraded' over the years to more "enterprise-y" databases. Not a big porting issue since the databases were tiny to begin with.


Corba/IDL just an RPC implementation? Really. Interesting, I did not know that. oh wait, that is because that was never true:

http://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture

In addition to providing users with a language and a platform-neutral remote procedure call (RPC) specification, CORBA defines commonly needed services such as transactions and security, events, time, and other domain-specific interface models.

hmmm, sounds like a tad more than RPC doesn't it?


So is web services then
And restful
and whatever magical thing comes next to replace them (yes, replace them - rewrite it again for the next new fabulous thing)



December 13, 2011 - 8:52 pm UTC

Reviewer: A reader

>a sequence of code executed simultaneously by many users

Define "many".

>are you the same "a reader" from above? did you just write that? That is sort of what we've been saying - but drop the word 'performance'.

Yep, that's me.

>rule of thumb: if the code is about data it goes to stored proc.

WRONG. Up to 95% of code are doing just fine in the app server. Especially if "data" comes from multiple sources. It's just 5% that cause trouble and require special attention. Just like in early C days devs used assembler to eliminate bottlenecks.

>Even worse is when they believe if they see @transactional -
>it means all of the code that follows is perfectly
>"transactional and fail-proof and correct". Many more than
>once I've had a hibernate developer tell me "we don't need
>to think about concurrency issues, hibernate has this
>transactional thing that makes our code perfect". Hah.

First. @Transactional has NOTHING to do with Hibernate. It was Spring Framework annotation back in 2005, now it seems like every IoC container has it. Usually it means that container will join the existing transaction or start new transaction if there isn't one. Since most of these transactions are relatively short-lived, rarely span more than two tables and begin/end in or near "save button trigger" - your developer was pretty close to being right.

Second. Container-managed transactions work pretty well in EJB. Again, it is possible to fall back to Bean-managed transactions and begin/commit/rollback transactions manually. It was an XML Nightmare, though, to properly configure CMT on EJB1.1 and EJB2.0 and Spring's "lite" version was a huge leap forward.

>EVERYTHING sounds like EVERYTHING to me - constraints,
>security, data processing - everything. You used the
>database as a bit bucket - nothing more.

My bad, sorry. Database: data storage providing consistent data access and enforcing data integrity. That's how we used it. Disabling constraints is the most stupid thing one could probably imagine.

>that (security) is just another type of constraint, it is no
>different from a constraint. it is there to protect the data.

Application (middleware) is already securing the data. Why would you duplicate it?

>since most every data breach out there is caused by holes in
>applications, the magical stuff in the middle tier - be it
>sql injection or whatever, and not database breaches, you
>might want to reconsider that.

Most every data breach out there are privilege escalation (attacker possesses ungranted role), identity theft (attacker possesses stolen identity) and unwanted interaction (attacker makes user to unintentionally interact with the system, i.e. XSS attack). In every case, application will report "correct" credentials to the database.
Let's not discuss "dumbass attack" where user A can access every other user's data by simply modifying an ID on the browser URL - chances are idiots developing system like that would likely do the same on the database level (i.e. not checking requested data actually belongs to authenticated user).
Tom Kyte

Followup  

December 14, 2011 - 7:47 am UTC

Define "many".


anything more than one. I can usually demonstrate concurrency issues with just two users, three at the most.


>rule of thumb: if the code is about data it goes to stored proc.

WRONG.



it is not wrong, it is the way I've always done it. it is the way many do it. In your opinion, you do not agree, but your opinion does not make it wrong.


First. @Transactional has NOTHING to do with Hibernate.

whatever - read the thread. The hibernate guys said "we turned on transactional, our code therefore is perfectly safe in all databases"

Naive beyond belief.

your developer was pretty close to being right.

and you just joined that club. I don't care how big or small or long or short a transaction is - it is either

a) correct for a multiuser environment
b) or not

period, that is it. And a sequence of events run in Oracle can and will come to a different result than the same events in SQL Server. Nothing against oracle or sql server there - just a fact, their concurrency models are different - statements execute in a different mix - things that block in sql server and serialize will not in oracle and so on.


Second. Container-managed transactions work pretty well in EJB. Again, it is
possible to fall back to Bean-managed transactions and begin/commit/rollback
transactions manually.


forget about commit and rollback - that is the end. I'm worried about everything else that happened BEFORE the commit. That is what is screwed up.

Up to 95% of code are doing just fine in the app server.

and up to 100% of the code is doing just fine in the database. now what?


Disabling constraints is the
most stupid thing one could probably imagine.


well, we do agree on at least one thing.

Application (middleware) is already securing the data. Why would you duplicate
it?


as stated above, it is called defense in depth. Why would you have a single layer of defense in a layer written by programmers of multiple levels of capabilities? Why WOULDN'T you want multiple layers of defense.

Say your application is subject to a sql injection attack or - gasp - more generally - has a bug in it. Would it not be a good idea to have another couple of layers (like a database firewall, like VPD, like auditing, like ... ) to try and catch that. Yes, in fact it would be.

https://www.google.com/search?q=defense+in+depth


Let's not discuss "dumbass attack"

why not - since that is how Sony's network was shutdown, since that is the most prevalent form of attack. Why shouldn't we discuss the low hanging fruit.

Do you have evidence that "most every data breach out there are priv escalation"?

Identity theft is the result of a breach, not a cuase.

XSS is so very similar to sql injection - a variation on a theme. And it results in the applications security going away - be nice to have another line of defense.


one last thing on the dumbass attack. It is highly probable that the dumbass developer working in the middle tier would not be the same as the developer/DBA working on the database. Having multiple groups do the same thing is another good secure coding practice. In fact - in zero defect software development they often have many teams develop the same modules and run them all and compare outputs - to ensure that a single bug in a single module doesn't bring it all crashing down.

December 13, 2011 - 9:01 pm UTC

Reviewer: A reader

>We have a near-terabyte-level DB that interacts w/ a dozen
>remote systems

In our case we have a middleware server exposing "functions" (call it web service, or CORBA, or whatever - we support almost every possible transport protocol) to few hundreds remote systems developed by our customers (they usually integrate our stuff into their ERP/CMS/CRM/whatever systems). Nobody has direct access to our databases, nobody ever will.
Tom Kyte

Followup  

December 14, 2011 - 11:03 am UTC

I didn't see anyone proposing that - even though it is an option.

December 14, 2011 - 8:07 am UTC

Reviewer: Oleksandr Alesinskyy

And how "direct access to the database" is related to "logic in the database"? They are completely orthogonal.

December 14, 2011 - 5:31 pm UTC

Reviewer: A reader

>In a transactional system, the size of a table is not
>relevant.

That's the problem. This is only true for "unlimited resources" systems. Unfortunately, there are tons of "effective managers" out there forcing databases to virtual servers up to the point where you actually CAN see the difference.

>Your data doesn't seem to be very important does it? It
>isn't financial data, it isn't related to health, it isn't
>employment records.

So you just said that most of your statements are irrelevant to the systems that are "not financial data, not related to health, not employment records"? Well, fine then!

>sorry - you meant 1988, not 98'.
Yeah, my fault.

>Umm, maybe you need to get out more? You know how many
>ancient flat/hierarchical/networked databases are out
>there? Quite a few.
Not so much. And most of them are getting eventually replaced by RDBMS systems.

>hmmm, sounds like a tad more than RPC doesn't it?
Sophisticated RPC, yes. Just like WebServices or whatever comes next.

>and whatever magical thing comes next to replace them
>(yes, replace them - rewrite it again for the next new
>fabulous thing)

Transport layer. A set of facade classes to access interfaces exposed by the application (middleware system). We are now supporting REST, WS, XML-RPC, Hessian and few other things - usually requested by customers. No one asked us for CORBA yet, though.

Tom Kyte

Followup  

December 15, 2011 - 7:53 am UTC

That's the problem. This is only true for "unlimited resources" systems.
Unfortunately, there are tons of "effective managers" out there forcing
databases to virtual servers up to the point where you actually CAN see the
difference.


then please do describe how this archive data is ONLINE?? and available?? remember - that is what we are talking about here, that is the context.

and virtual server, smirtual server. A table in a VM with 100,000 rows and a table in a VM with 1,000,000,000 rows will perform about the same when returning 100 rows via an index. The 1,000,000,000 row table MIGHT do two or three extra logical IO's (106 logical IOs versus 103 logical IOs) but that is all.

Size of table - not relevant to OLTP performance if you are retrieving the same amount of rows. And if you say "but if the table is 1,000,000,000 rows we retrieve thousands of rows - but if it is 100,000 rows we retrieve 10's of rows" - I'll ask you how you can even consider archiving - you'd be LOOSING data. And if you say "but the archived data is not necessary", then I'd simply say "well then your QUERY is wrong, it should not be retrieving thousands of rows - you obviously forgot part of your where clause".

Size - not relevant.


Not so much. And most of them are getting eventually replaced by RDBMS systems.

You honestly need to get out more. Seriously. You would be surprised at how many places still run on mainframes - AS/400's and the like.

>hmmm, sounds like a tad more than RPC doesn't it?
Sophisticated RPC, yes. Just like WebServices or whatever comes next.


come on - you wrote that Corba was just an RPC. Corba was as sophisticated, neigh - I might say MORE sophisticated, then you restful web services, magic programming paradigm du jour.

No one asked us
for CORBA yet, though.


correct, because corba isn't cool anymore. Just like RMI isn't cool anymore. Just like X isn't cool anymore. They come and go as fast as the seasons change. That is the point.

December 14, 2011 - 6:23 pm UTC

Reviewer: A reader

>anything more than one. I can usually demonstrate
>concurrency issues with just two users, three at the most.

Yeah, right, I can think of a "very specific" case scenario too. How often does it happen in the real world?

>whatever - read the thread. The hibernate guys said "we
>turned on transactional, our code therefore is perfectly
>safe in all databases"
With most of "the code" being single INSERT or UPDATE statement per transaction - that's pretty much it.


>Nothing against oracle or sql server there - just a fact,
>their concurrency models are different - statements
>execute in a different mix - things that block in sql
>server and serialize will not in oracle and so on.

Can I please have at least ONE real example of code not portable, by nature, between ORACLE/MSSQL, where the problem cannot be solved by properly configuring Hibernate (providing it with locking optimizer hints etc) and/or further optimizing the code to some certain sanity level (i.e. replacing usual SELECT/DELETE/INSERT sequence junior devs provide with DELETE WHERE / INSERT etc) until it works fine in both Oracle and SQL Server? In other words, I would like to see an example where the real code change and recompilation is required to port both ways, with Oracle-based code failing in MSSQL AND MSSQL code failing in Oracle.

>and up to 100% of the code is doing just fine in the
>database. now what?
Maintenance and evolution. 100% of the code could be implemented in assembler language, but people are using high-level languages. Same thing here.

>Say your application is subject to a sql injection attack
>or - gasp - more generally - has a bug in it.
This would mean the application being successfully authenticated under admin instead of user. No problems from the database point of view - application reported valid "admin" credentials, access granted.

>Would it not be a good idea to have another couple of
>layers (like a database firewall, like VPD, like auditing,
>like ... ) to try and catch that. Yes, in fact it would be.

You're saying it like these extra layers are free and 100% error-prone. In reality, the same people would configure extra layers and they would do the same mistakes. This will also add overhead to further changes - now app devs will have to support both database and application security.

>why not - since that is how Sony's network was shutdown,
>since that is the most prevalent form of attack. Why
>shouldn't we discuss the low hanging fruit.

Sony's network had basically NO security. Another example of idiocy would be the recent Australian First State Super "security" breach: once you were logged in - you could simply change "?clientId=12345" part of the URL to gain access to other clients' data. In both cases, the level of stupidity is so high it makes no sense even discussing it. It's not even a "bug", it's almost criminal negligence and ignorance. If they were that dumb on app-level security, what makes you think they would be any smarter on db-level one?

>XSS is so very similar to sql injection - a variation on a
>theme. And it results in the applications security going
>away - be nice to have another line of defense.

What are you talking about? SQL injection is just request parameters forging in order to replace application's SQL string. XSS attack, on the other hand, requires logged-in user to click the forged link in order to issue an unwanted (but 100% legitimate from application and dataabase point of view) command. SQL injection allows attacker to issue his own SQL statement, whilst XSS attack does not.

>It is highly probable that the dumbass developer working
>in the middle tier would not be the same as the
>developer/DBA working on the database.

Not really, definitely not always. There are many cases when there's no such thing as "database developer / DBA" on the project.

>I didn't see anyone proposing that - even though it is an
>option.
Yes, it's something called SOA and this is what I'm talking about. If your opinion was based on idea of "multiple applications developed by separate project groups using the same database schema", then your approach would be the best (I still don't like the idea of few separate groups working on the same stored proc in order to implement slightly different business rules - this usually does not end well).
However, this kind of behavior is strictly prohibited in our company - the policy of inter-project communication is "want the data - use the service, hands off the database". Works pretty good as far as I can see.
Tom Kyte

Followup  

December 16, 2011 - 6:11 am UTC

Yeah, right, I can think of a "very specific" case scenario too. How often does
it happen in the real world?


more than zero is the factual answer to your question. MORE THAN ZERO. Which means your software design must accommodate it.

(my respect for you just dropped quite a few points there. What an attitude towards engineering and development. I guess the developers of the Tacoma Narrows Bridge thought much the same way - "it probably won't get windy like that, it probably will be ok...").


With most of "the code" being single INSERT or UPDATE statement per transaction
- that's pretty much it.


again, what an attitude. You know - the people that have a single insert/update in their transactions are also responsible for code that has, well, MORE THAN ONE. they consider themselves "software engineers" - they are not. They are faking it and they DON'T EVEN KNOW IT.

I'll tell you what, I'll just stop now. This isn't useful. I totally disagree with you - and these last two comments just put me over the top. Sorry - but there will be no conclusion to this since you believe that it is perfectly OK to ignore things that "might not happen" (they do, all of the time) and "most developers don't need to know because they work on trivial stuff" (they will work on non-trivial stuff too).

Securing the data in the application?

December 15, 2011 - 2:45 pm UTC

Reviewer: A reader

>Application (middleware) is already securing the data. Why would you duplicate it?

You have some middleware code that enforces the proper security when a user accesses the database through SQL*Plus?

Your Application (middleware) is not the only entity that can interact with the database. It CANNOT guarantee that the data is secure. Security HAS TO be implemented in the database, regardless of the security you build into your middleware. Why would YOU duplicate it?

I'm sorry, but if your security is not implemented in the database, then your data is not secure. Period. What assurances can you give me that your server admins are not connecting to the database and getting information they are not supposed to have access to? Security breaches come from inside organizations, too. Why do you think bank vaults have their own doors? Why are the locks on the outer doors not enough? Why do cars have keyed ignitions if they already have locks on their doors? If you are not putting your security in the database, you are leaving yourself open for a breach. You’ve posted a guard at one door and assumed that nobody can find the other entrances.

You get different answers

December 16, 2011 - 10:11 am UTC

Reviewer: Galen Boyer from BOSTON

> Can I please have at least ONE real example of code not portable, by nature,
> between ORACLE/MSSQL,

In SQLServer a writer will block a reader, in Oracle, a writer does not block a
reader. So, start a transaction updating a balance amount from say, $500 to
$1,000,000. Do not rollback or commit. In another session, run a report that
sums cash on hand. In that other SQLServer session, to have the report not be
blocked, you will set transaction isolation level "read_uncommitted". So, in
that reporting session, you get the answer of cash on hand is $100,000,000.
Now, in Oracle, no matter what isolation level you set, the second session would
report $99,000,500 as the value.

Two different answers and there is nothing you can do about it. No setting in
hibernate will solve it.

Now, in both those sessions behave like a user that just fatfingered the data,
rollback and reenter the correct transaction of $500 to $1000. Now, rerun the
report.

Oracle says, $99,001,000 and so does SQLServer. Oops. Now you are in a
quandary. At 10am, the business just bought into a money market fund with
another $999,500. At 2pm they find out that they didn't really have the extra
$999,500, but instead only an extra $1000. They have to sell out of the fund.
But, now, the fund they bought at 10am lost .5% of its value over the 4 hour
period. That database behaviour just cost the business 10 minutes of a
high-paid money manager's time, transaction fees and the loss of $4997.5.

Your business would decide to behave differently because of it. Now, switch to
Oracle, because you wrote database independent code. Well, guess what, the
behaviour of the business would now need to change again. That scenario would
be different, and hibernate ain't gonna help you solve it.
Tom Kyte

Followup  

December 16, 2011 - 5:56 pm UTC

I'll add one, cut and paste from my book, 'real world'

Note that - because of blocking reads - the developers logic would work dandy in MSSQL - in fact, they were an MSSQL programmer by trade and were very surprised to discover that "hey, these things are different"


<quote>


A developer was demonstrating to me a resource-scheduling program (for conference rooms, projectors, etc.) that he had just developed and was in the process of deploying. The application implemented a business rule to prevent the allocation of a resource to more than one person for any given period of time. That is, the application contained code that specifically checked that no other user had previously allocated the time slot (at least the developer thought it did). This code queried the SCHEDULES table and, if no rows existed that overlapped that time slot, inserted the new row. So, the developer was basically concerned with two tables:


ops$tkyte%ORA11GR2> create table resources
  2  ( resource_name varchar2(25) primary key,
  3    other_data    varchar2(25)
  4  );
Table created.

ops$tkyte%ORA11GR2> create table schedules
  2  ( resource_name varchar2(25) references resources,
  3    start_time    date,
  4    end_time      date
  5  );
Table created.



And, right after inserting a room reservation into SCHEDULES, and before committing, the application would query:

ops$tkyte%ORA11GR2> select count(*)
  2    from schedules
  3   where resource_name = :resource_name
  4     and (start_time < :new_end_time)
  5     AND (end_time > :new_start_time)
  6  /



It looked simple and bulletproof (to the developer anyway); if the count came back as onezero, the room was yours. If it came back greater than one, you could not reserve it for that period. Once I knew what his logic was, I set up a very simple test to show him the error that would occur when the application went live—an error that would be incredibly hard to track down and diagnose after the fact. You’d be convinced it must be a database bug.

extra note: in MSSQL, by default, the above query would BLOCK if someone had inserted that other row - these two sessions would deadlock and the transaction would fail. In Oracle - that would not happen



All I did was get someone else to use the terminal next to him. Both navigated to the same screen and, on the count of three, each hit the Go button and tried to reserve the same room for the exact same time. Both got the reservation. The logic, which worked perfectly in isolation, failed in a multiuser environment. The problem in this case was caused in part by Oracle’s non-blocking reads. Neither session ever blocked the other session. Both sessions simply ran the query and then performed the logic to schedule the room. They could both run the query to look for a reservation, even if the other session had already started to modify the SCHEDULES table (the change wouldn’t be visible to the other session until commit, by which time it was too late). Since they were never attempting to modify the same row in the SCHEDULES table, they would never block each other and, thus, the business rule could not enforce what it was intended to enforce.


The developer needed a method of enforcing the business rule in a multiuser environment—a way to ensure that exactly one person at a time made a reservation on a given resource. In this case, the solution was to impose a little serialization of his own. In addition to performing the preceding count(*),the developer first performed the following:

select * from resources where resource_name = :resource_name FOR UPDATE;;;



What he did here was to lock the resource (the room) to be scheduled immediately before scheduling it, in other words before querying the SCHEDULES table for that resource. By locking the resource he is trying to schedule, the developer ensures that no one else is modifying the schedule for this resource simultaneously. Everyone must wait until the transaction commits, at which point they would be able to see the schedule. The chance of overlapping schedules is removed.
Developers must understand that, in a multiuser environment, they must at times employ techniques similar to those used in multithreaded programming. The FOR UPDATE clause is working like a semaphore in this case. It serializes access to the RESOURCES tables for that particular row—ensuring no two people can schedule it simultaneously.

Using the FOR UPDATE approach is still highly concurrent as there are potentially thousands of resources to be reserved. What we have done is ensure that only one person modifies a resource at any time. This is a rare case where the manual locking of data we are not going to actually update is called for. You need to be able to recognize where you must manually lock and, perhaps as importantly, when not to (I’ll get to an example of this in a bit). Furthermore, the FOR UPDATE clause does not lock the resource from other people reading the data as it might in other databases. Hence the approach will scale very well.

Issues such as the ones I’ve described in this section have massive implications when you’re attempting to port an application from database to database (I return to this theme a little later in the chapter), and this trips people up time and time again. For example, if you are experienced in other databases where writers block readers and vice versa, you may have grown reliant on that fact to protect you from data integrity issues. The lack of concurrency is one way to protect yourself from this—that’s how it works in many non-Oracle databases. In Oracle, concurrency rules supreme and you must be aware that, as a result, things will happen differently (or suffer the consequences).

I have been in design sessions where the developers, even after being shown this sort of example, scoffed at the idea they would have to actually understand how it all works. Their response was “We just check the ‘transactional’ box in our Hibernate application and it takes care of all transactional things for us; we don’t have to know this stuff.” I said to them, “So Hibernate will generate different code for SQL Server and DB2 and Oracle, entirely different code, different amounts of SQL statements, different logic?” They said no, but it will be transactional. This misses the point. Transactional in this context simply means that you support commit and rollback, not that your code is transactionally consistent (read that as “not that your code is correct"). Regardless of the tool or framework you are using to access the database, knowledge of concurrency controls is vital if you want to not corrupt your data.


Ninety-nine percent of the time, locking is totally transparent and you need not concern yourself with it. It’s that other one percent you must be trained to recognize. There is no simple checklist of “if you do this, you need to do this” for this issue. Successful concurrency control is a matter of understanding how your application will behave in a multiuser environment and how it will behave in your database.

When we get to the chapters on locking and concurrency control, we’ll delve into this topic in much more depth. There you’ll learn that integrity constraint enforcement of the type presented in this section, where you must enforce a rule that crosses multiple rows in a single table or is between two or more tables (like a referential integrity constraint), are cases where you must always pay special attention and will most likely have to resort to manual locking or some other technique to ensure integrity in a multiuser environment.
</quote>

December 19, 2011 - 1:37 am UTC

Reviewer: A reader

In that other SQLServer session, to have the report not be
blocked, you will set transaction isolation level "read_uncommitted". So, in
that reporting session, you get the answer of cash on hand is $100,000,000.
Now, in Oracle, no matter what isolation level you set, the second session would
report $99,000,500 as the value.

Two different answers and there is nothing you can do about it.
hibernate will solve it.


WHOA! read_uncommitted? Seriously? What if your transaction is getting rolled back? What if there are 1000 transaction messing around the table? What's the use of your report?
Correct solution: Acquire exclusive lock, get needed data, release lock. Never use read_uncommitted
Tom Kyte

Followup  

December 19, 2011 - 7:41 am UTC

Reader - now maybe you are seeing why developers actually have to understand the tools they are using

December 19, 2011 - 1:55 am UTC

Reviewer: A reader

The application implemented a business rule to prevent the allocation of a resource to more than one person for any given period of time. That is, the application contained code that specifically checked that no other user had previously allocated the time slot (at least the developer thought it did).

Now this is getting ridiculous. I had EXACTLY the same task few years ago. This was addressed on the application level with extra check using DB trigger (exception raised if times overlap). Each "start a booking" would check the application cache for bookings in progress and create one if there wasn't one. Another check before actual database update ("check if there's a booking and it is actually MY booking, display error otherwise"). Since the application was the only owner of the data exposing "booking" services via remote EJBs, we didn't have to worry about rogue processes touching our table.
We had, though, a little problem with clustering. Replacing EHCache with JBoss Cache (now known as Infinispan) and configuring it for cluster use solved that little issue.
Tom Kyte

Followup  

December 19, 2011 - 7:49 am UTC

Reader - you mean you actually had to understand transactions, concurrency, how to program correctly against the database?

Bingo.


why is this ridiculous, this is COMMON PLACE. This is why I write so much (many times the same things over and over and over).


This was addressed on the application level with extra check using DB trigger (exception raised if times overlap).

tell me pray tell how you would do that in Oracle? Reads don't block writes, writes don't block reads (like they do by default in sqlserver - hence the developers logic would have worked perfectly in sqlserver - not portable to Oracle however since the databases do things DIFFERENTLY).

I'm not worried about rogue processes, I'm worried about a single application and its logic - period. If you do not use the select for update to lock the resource - please tell me what magic you would do in the trigger?


Each "start a booking" would check the application cache for bookings in progress and create one if there wasn't one

you mean the application developer had to follow a specific protocol to serialize, you mean that just tweaking a "transactional" setting isn't sufficient. You mean the developer has to understand concurrency controls, etc etc etc. You made our point, thanks very much.


We had, though, a little problem with clustering. Replacing EHCache with JBoss Cache (now known as Infinispan) and configuring it for cluster use solved that little issue.

and if you used a select for update of the single resource, followed by your database work on the child reservations table - you would have had no problems, no issues, correct data, from any environment.

December 19, 2011 - 2:00 am UTC

Reviewer: A reader

So Hibernate will generate different code for SQL Server and DB2 and Oracle, entirely different code, different amounts of SQL statements, different logic?” They said no, but it will be transactional.

In fact, they were wrong. Why don't you just read the actual Hibernate reference?
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/transactions.html
It actually DOES generate different code for Oracle and SQL server if you set the lock mode.
Tom Kyte

Followup  

December 19, 2011 - 7:53 am UTC

where does it say it will change the logic - generate an entirely different flow of sql statements.

Of course it generates the specific sql to start/end transactions. As we've already said many times, that is not relevant. We are concerned with the LOGIC in the transaction. That logic - the flow of the sql statements - what needs to be done - can change from database to database.

The logic, the flow, the code that works in database 1 does not necessarily work in database 2. Period. And hibernate will not, does not protect you from that.



It actually DOES generate different code for Oracle and SQL server if you set the lock mode.

maybe different sql syntax, but the same logic, the same flow. It still doesn't protect you from what we've been discussing.

December 19, 2011 - 2:08 am UTC

Reviewer: A reader

>You have some middleware code that enforces the proper
>security when a user accesses the database
>through SQL*Plus?

Given that there are only three "real", configured in the database, "users" are able to access the schema, two of them being SYSDBA and SYSOPER, and the third is "application" account - there's no need to have it.

>Your Application (middleware) is not the only entity that can
>interact with the database.

Actually, it IS. Nothing and no one apart from it interacts with the database directly. Third-party wants data - third-party uses the service provided by the middleware, not the database. Period.

>It CANNOT guarantee that the data is secure.
It can. No one, but our app, plays with the data. Not even in "read-only" mode.

>Security HAS TO be implemented in the database
And it is - to some extent. Like, "myapp/very_secure_password" is the only user, apart from SYSOPER/SYSDBA, which has any access to the schema.
Tom Kyte

Followup  

December 19, 2011 - 7:59 am UTC

Given that there are only three "real", configured in the database, "users" are able to access the schema, two of them being SYSDBA and SYSOPER, and the third is "application" account - there's no need to have it.

until someone gets around your application, sure.

that is why it is called "defense in depth".


There are lots more than three users in an Oracle database too. sysdba and sysoper are not users - they are roles. (sysdba is mapped to the SYS user, sysoper to the PUBLIC user).


Actually, it IS. Nothing and no one apart from it interacts with the database directly. Third-party wants data - third-party uses the service provided by the middleware, not the database. Period.


you have no dbas, you have no administrators, you have what must be the MOST SECURE system on the planet - beyond what the best of the security folks set up? Pretty nice, you should write up how you've done that. How you've made your application impervious, bug free. How you've locked your system down so tight that no one - not have your admins - can get to it.



what ever. dream on.

defense in depth.
never trust any single layer.
be paranoid to a degree.


if you did things in the database - really protect and secure and keep the data clean - at the database level - done by people that understand concurrency, transactions, data integrity, data security (rather than Joe/Josephine programmer whose head is filled with J2EE and considers the database a "black box") - you'd have something pretty secure. Then let them reinvent the wheel at their layer (multiple layers) - you'd be closer to what you think you have today.


To: A reader

December 19, 2011 - 8:05 am UTC

Reviewer: A curious reader

Hi,
I am curious to know two things:
a) You said that apart from your application, no one has access to data.

How do you diagnose and fix data related issues? Because of human error/application logic flaws/hardware failure, if wrong data gets inserted/updated/deleted, how do you fix the issue? Do you write another application to inspect each row of data in each table and fix the bad data?

b) You said that no one needs to know the password of application user. I am assuming that you have one super application user who owns all application related objects and data. If that application user password gets compromised, what then?

Does your CEO and security auditor know that the only thing protecting sensitive user data is a single password which is protected by "no one needs to know" technique? Data, which if compromised, can easily take the company deep into bankruptcy?

Thanks...

Tom Kyte

Followup  

December 19, 2011 - 3:54 pm UTC

a) and if they did that - it wouldn't be using their secure API, it would have to rebuild the security, integrity, etc - you have to use their application code to touch the data. There is no flexibility there.

b) exactly.


but I'm pretty sure it won't be a problem - because they'll say it is not a problem, they have the password secured.



This is why security in depth is rather crucial. Having a single layer of defense just doesn't cut it.

It just doesn't cut it.

December 19, 2011 - 9:44 am UTC

Reviewer: Galen Boyer from BOSTON

> WHOA! read_uncommitted? Seriously? What if your transaction is getting
> rolled back? What if there are 1000 transaction messing around the
> table? What's the use of your report? Correct solution: Acquire
> exclusive lock, get needed data, release lock. Never use
> read_uncommitted

Exactly! So, now, the users are impacted tremendously in one
environment, vs another. Different bigtime monetary and architectural
decisions with the business driving them will now be made based on the
database platform you are on.

And then, as Tom pointed out that you ask questions of the data, and
so does someone else, and the 2 answers, at the same exact time are
different. One is because you are in a session that is transactional
so your answer is based on your transactional state, while the other
session is getting a different answer based on different transactional
state, maybe none. But, the same exact sequence of transactional
events and querys submitted get you different behaviour on SQLServer.
No setting in Hibernate will solve it, sorry, just nothing will
actually get the two databases to behave the same exact way,
therefore, there is no way for an application to behave the same way,
unless you have 2 different code paths.

What's pretty darn frustrating with the java crowd is that the JVM did
a very spectacularly beautiful job at insulating the code from the OS.
You truly do have almost complete OS independence. But that is
because each JVM that is written has been coded for exactly the OS it
is running on. But, they did nothing in the database world. They
just threw up their hands and spec'd out JDBC. So, then, it is up to
the database vendors to somehow, run all queries the same, handle all
transactions the same, answer all questions the same way. What did
the J2EE spec designers think? Microsoft and Oracle and IBM were all
going to get together and say, "Hey, how can we make sure that the
JDBC support is the same across our databases? Lets all make sure
that no matter what query is run and no matter what transactional
setting of hibernate is set, we give the same answer and behaviour to
JDBC codebases."

December 19, 2011 - 2:52 pm UTC

Reviewer: A reader

>Given that there are only three "real", configured in the database, "users" are able to access the schema, two of them being SYSDBA and SYSOPER, and the third is "application" account - there's no need to have it.

>Actually, it IS. Nothing and no one apart from it interacts with the database directly. Third-party wants data - third-party uses the service provided by the middleware, not the database. Period.


What do you mean by, “there’s no need to have it”? Are you trying to say that you do not give SQL*Plus to your users because they will use your application to access the data? Is that why there is “no need to have it”; because your users will simply access the data through your application? If so, you seem to be missing the entire point of everything everyone is trying to explain to you, Oracle is STILL GOING TO ALLOW SQL*PLUS CONNECTIONS. Your application does not prevent a malicious user from installing SQL*Plus on their machine and accessing your database. Oracle will still accept those connections. Oracle will allow that user to do anything that your application can do. You are not secure at all.

“But how does that user get the username/password to connect to the database?” How does your application get the username/password to connect to the database? Do you store it in a config file on the application server like just about every application developer I have ever worked with? If so, it is in clear text on the server where anyone with access to that machine (like the server admins I talked about earlier) can easily get to with a simple search. Is it compiled into your build? If so, then it is stored in clear text in your source control, which those same server admins have access to. How many people have access to this password? If it is compiled in your build, then every developer has access to the PRODUCTION password that your application uses. And, since the database is just a black box, that basically means that every one of your developers are essentially a production DBA.

Besides that, I don’t believe that your application is the only tool accessing your database. Tom has already alluded to it. You do have DBA’s right? How do they administer the database? Through your application? If they use OEM, another tool is connecting to the database. If they use TOAD, another tool is connecting to the database. If they use ANYTHING OTHER THAN YOUR APPLICATION, another tool is connecting to the database. Now, explain to me what happens when an unauthorized user installs that same tool that we have already established is being allowed to connect to the database, uses the username/password that we have already established is stored in clear text somewhere on your network, and attempts to log into the database? I know what happens. They can do anything your application can do. They can select any data from any table. They can update any record in any table. Does your application delete records? You are putting a lot of faith in your network security. Certainly more than I would in mine. Does your application work with credit card information? If so, could you please remove any record you have of me in your system? Shouldn’t be too hard for you since your data has virtually no security on it.

Reports?

December 19, 2011 - 3:05 pm UTC

Reviewer: Jim from Oregon, USA

It sounds like no one can use a commercially available report writer. Which means that any reporting information has to be embedded in the application. Then only developers can develop report. Seems like that would be both a bottleneck and a much higher cost to write reports. No other system needs to access the data (eg a datawarehouse for data analysis etc.) Most commercial ETL tools would have a difficult time. I guess you can build your own report writers and etl tools but that seems like a very expensive proposition.(both in terms of time and expense) I would think from a business prospective that it would wipe out any competitive advantage you might have. (vis a vis other companies, they could get functionality done cheaper and faster)
Tom Kyte

Followup  

December 19, 2011 - 5:46 pm UTC

You are correct, you lose access to the data except through their API. That was one of my main points again.

We have come so far - only to find we've gone nowhere in the last 30 years.

logic

December 19, 2011 - 4:17 pm UTC

Reviewer: sam

<<if you did things in the database - really protect and secure and keep the data clean - at the database level - done by people that understand concurrency, transactions, data integrity, data security (rather than Joe/Josephine programmer whose head is filled with J2EE and considers the database a "black box") - you'd have something pretty secure. Then let them reinvent the wheel at their layer (multiple layers) - you'd be closer to what you think you have today. >>

Excellent Statement!


December 19, 2011 - 4:30 pm UTC

Reviewer: A reader

>Are you trying to say that you do not give
>SQL*Plus to your users because they will use your
>application to access the data?
Correct. I also do not give Oracle credentials to my users.

>If so, you seem to be missing the entire point of
>everything everyone is trying to explain to you
Not really.

>Oracle is STILL GOING TO ALLOW SQL*PLUS CONNECTIONS.
Correct.

>Your application does not prevent a malicious user from
>installing SQL*Plus on their machine
Correct.

>and accessing your database.
Incorrect. My "users" are just records in APPSCHEMA.USERS table. THEY DO NOT EXIST IN ORACLE, THEY CANNOT USE SQL*PLUS TO ACCESS THE DATABASE. That's what I'm trying to explain. Most of my "users" do not even belong to my organisation, they're just customers.

Of course, if one of those geniuses manage to steal SYSDBA, SYSOPER or APPSCHEMA password - then we're in trouble. But wait, there's no database-level security to prevent that, is there?
Tom Kyte

Followup  

December 19, 2011 - 5:59 pm UTC

Of course, if one of those geniuses manage to steal SYSDBA, SYSOPER or
APPSCHEMA password - then we're in trouble. But wait, there's no database-level
security to prevent that, is there?


actually, there is.


You have things like secure application roles - the appschema guy - he would have no privileges other than PUBLIC privs by default. Only by having the *application* enable the roles would they have anything. So, if and only if you used the application would you have access to things. But - that would require using the database security set.

sysoper - not really a security threat AND you'd have to be on the database machine to use it.

sysdba - lock it down man - you don't have to allow anyone to use it over the network if you don't want to. You can set up (getting repetitive here) multiple layers of defense. Access to the physical box being the last one. Almost no one should have the SYSDBA role granted to them (and remember, it is a role, not a user). You don't log with sysdba credentials - you need someone who has been granted sysdba (which, ahem, means you have more than three users by the way - since two of them ain't users).

And almost no one needs sysdba from day to day.


Also - if you have just "app schema" - and that owns "everything" - you've done it wrong. Not "you might have done it wrong", but "you've done it wrong"

You want a set of schemas to contain your tables. A set - because different bits of application functionality need different access rights. So, you'll have say two or three schemas with tables.

These schemas would grant select/insert/update/delete - whatever to a few code schemas. Here you would have views, you would have stored procedures. These application schemas would grant execute and grant as little as possible on the views they create to expose the data. This is continuing the use of "least privileges" - locking the data down from the very beginning.

Then finally you have your "appuser". It has access to a very specific set of views/tables (maybe)/procedures. It has minimal access to them (eg: it might have insert and select on an audit trail for your application but NOT delete or update).. It owns basically nothing. It can do ddl against basically nothing. This is the account that is the weakest link since your application server code will be using it.

In this fashion, when I find a sql injection one of your junior programmers (because we know you wouldn't have them) accidentally releases into the wild - I won't be able to exploit it to do too much damage - I won't be able to drop your tables, I won't be able to select from just anything, and so on.

defense in depth,
security in layers,
least privileges

starting at the database, on out.

December 19, 2011 - 5:14 pm UTC

Reviewer: A reader

>tell me pray tell how you would do that in Oracle? Reads
>don't block writes, writes don't block reads (like they do
>by default in sqlserver - hence the developers logic would
>have worked perfectly in sqlserver - not portable to Oracle
>however since the databases do things DIFFERENTLY).

Whatever the database is - user access is serialised in the application, where you can actually define the blocking logic you need. It's interesting, though, that you made think the "last resort" trigger written by our "Oracle Guru" (with the attitude like yours) does not really work and never did.
Tom Kyte

Followup  

December 19, 2011 - 6:03 pm UTC

It's interesting, though,
that you made think the "last resort" trigger written by our "Oracle Guru"
(with the attitude like yours) does not really work and never did.


You put up the word trigger - so not sure where you are coming from. You said you wrote such a trigger.

Knowing what I know about concurrency control, I was interested in the code of such a trigger - so that I could point out what you probably did wrong. Guess you don't want to do that.

You can have fun re-inventing what the databases invented over 30 years ago. Locking, security, consistency, concurrency. I hope you have the best team ever - because honestly, I don't see it happening out there in real life. The level of awareness of concurrency issues is way way way WAY down there.

A developer thinking to serialize access - hah, maybe you do it, maybe you are really really good. Maybe you are the 1% ;)



December 19, 2011 - 5:16 pm UTC

Reviewer: A reader

>Reader - now maybe you are seeing why developers actually
>have to understand the tools they are using

Never objected that

December 19, 2011 - 5:29 pm UTC

Reviewer: A reader

>Now, explain to me what happens when an unauthorized
>user installs that same tool that we have already
>established is being allowed to connect to the
>database, uses the username/password that we have already
>established is stored in clear text somewhere on your
>network, and attempts to log into the database?

Too many assumptions.
First, SSL or X.509 auth is used in production, no plaintext passwords are stored anywhere.
Second, you can only access production database port from production server.
Third, if rogue user manages to log in to production server (pfff), install second Oracle Connector there (pfffff) and then log in to the database (given that he managed to obtain username and password somewhere) - then yes, he could do whatever the application could.
Can you demonstrate any viable solution for this type of breach - i.e. user gets to PROD server then accesses PROD database using application credentials? What could be done on the DB level?
Tom Kyte

Followup  

December 19, 2011 - 6:07 pm UTC

you have bug free code? your security implementation would be certifiable at the highest levels?

security in depth
multiple layers

it keeps coming back to that.


I'm not saying we have bug free code - I can say much of our code has been NIST certified to provide a certain level of protection. but it is multiple layers, security in depth - that lets people sleep at night.


You've locked all of your data up in you API. If your api doesn't provide something - no go, you have to provide it. Very neat, very clean, very much closed.


I'll just reiterate from above:


And here in the year 2009 - we are inventing 2020's CICS transactions with green screens (j2ee, web browsers).


just have to update the date.

December 19, 2011 - 5:32 pm UTC

Reviewer: A reader

>You do have DBA’s right? How do they administer the database?

This question is irrelevant. DBA's can access everything, that's the nature of their work - has nothing to do with security.
Tom Kyte

Followup  

December 19, 2011 - 6:08 pm UTC

DBA's can access everything,

only if you let them - and you don't have to, in many cases - you shouldn't.

December 19, 2011 - 5:36 pm UTC

Reviewer: A reader

>It sounds like no one can use a commercially available
>report writer.

Why is that so? Crystal Reports are doing just fine with web services as a data source. Of course, reporting is developers' job - users are usually not SQL experts and most queries they build are far from being effective.
Tom Kyte

Followup  

December 19, 2011 - 6:09 pm UTC

reporting is developers' job

wow. ok. I did not know that. Please don't tell our user base - they may think you are correct and then where would they be....

To: A reader

December 20, 2011 - 11:49 am UTC

Reviewer: Curious reader

When it comes to security, all security professionals advise one thing: never put all your eggs in one basket. Security has to be implemented at multiple levels.

Things which are considered unbreakable today will not remain unbreakable forever.

Data has to be protected from unauthorized access, from both internal and external sources. Data can be compromised at various levels, in motion or at rest. Other than access via SQLPlus, network traffic sniffing, access to exports, access to backups, hard disk drive theft, data breach through social engineering... are some potential areas where such a compromise can happen.

SSL, X.509, firewalls, proxy servers, intrusion detection systems, physically secure location for servers, network traffic encryption, export encryption, backup encryption, tablespace encryption, employee vetting: they are all security mechanisms implemented OUTSIDE of application to keep data secure.

The use of these mechanisms does not, in any way, provide an excuse to exclude security implementation at database level even when the application level security is considered extremely robust.

December 20, 2011 - 10:52 pm UTC

Reviewer: A reader

>A developer thinking to serialize access - hah, maybe you do
>it, maybe you are really really good. Maybe you are the 1% ;)

You're saying it like The Sacred Oracle Database is a Godsend, not just yet another software product written by - surprise-surprise, developers.
Maybe we do not produce the best code from performance point of view, but it is pretty stable and works fine on a cluster. There are few million users as well. And no, we're not using Oracle, we're doing just fine with Firebird.
That's where I just stop - it really doesn't go anywhere.
Tom Kyte

Followup  

December 21, 2011 - 7:36 am UTC

Umm, no, I was saying it like "the sacred database is a godsend"

oracle has nothing to do with it. Firebird has its own concurrency control mechanisms and implementations - I hope for your few million users you understand how it works - that is all.


Just heard a great argument

January 04, 2012 - 2:50 pm UTC

Reviewer: Galen Boyer from Boston

Any operation you can perform without the need for a database connection is what is called, "business logic".
Tom Kyte

Followup  

January 04, 2012 - 3:46 pm UTC

unfortunately, that is also "all of them" to many people programming in the middle tier.

read the database into the middle tier - coherence or something like that - perform operations there.

exploit code,

May 01, 2012 - 11:15 am UTC

Reviewer: A reader

follow up,

May 01, 2012 - 4:05 pm UTC

Reviewer: A reader

What is wallets and certificates in reference to listeners? I have a work around solution for this

"The solution is to implement secure registration with the listener using wallets and certificates. This would require creating a wallet/certificate for each cluster node, setting up the listener.ora so the listener and scan listeners know to use the wallet and new TCPS port, adding the wallet config to the sqlnet.ora and modifying the remote_listener parameter for each database. ASM must also be configured. The listener and scan listener must be restarted as well as each database instance"

How did TNS Listener Poison attack happened which was not seen in so many years?

Thanks,


Learnt lesson hard way

May 02, 2012 - 4:19 pm UTC

Reviewer: Krishna from Orlando,FL USA

Hi:
I feel all the data centric logic or constraint must be close to the data i.e. the database.
Why? Following is the personal experience.
We developed a product with Oracle Database and Flex very recently. And wanted to catch all the constraints in the Front end. That includes unique constraint. Say Product code must be unique. We caught this logic in GUI only. We ended up with duplicate in the database and found that after release. Next is Concurrency and Consistency also will be an issue. What will the GUI developer do to ensure they are updating data that they queried and remains the same even when they update?
we used DB Package for all DB transactions thereby putting the business logic in DB. And GUI effects were taken care in flex. we did not do this 100% and that is causing maintenance issues.

APEX and code/data separation

July 09, 2012 - 1:26 pm UTC

Reviewer: Bill from Savannah, GA, USA

On September 15, 2009 you wrote:
"I use APEX.
My data logic is 100% contained in stored procedures.
APEX is the UI layer I put on top of that transactional API.

I use constraints for data integrity.
I use procedures for my transactions.
I use APEX for navigation and session state management."

Do you use any of the DML features of APEX? Are your stored procedures in the
same schema as the data? Are any of your stored procedures in APEX vice in a
separate schema? Do the features of APEX Interactive Reports and Tabular Form
violate the concept of maintaining data logic separate from application logic? In particular, the "automatic" insertion, updating and deletion these features use? How would suggest handling LOVs if we use a separate schema as a transaction schema.

We are consiering having an application package schema separate from the data
schema. Given that APEX performs many services for you, is the added layer and
separation a good value. I have posted a similair question in the APEX forums,
https://forums.oracle.com/forums/message.jspa?messageID=10440962&wwparam=1341844
480. I was hoping you could present your judgement whether the features that
APEX presents are a suffcient level of separation.

Thank you for your thoughts and for produccing such a valuable resource.

Best Regards,
Bill
Tom Kyte

Followup  

July 11, 2012 - 4:52 pm UTC

Do you use any of the DML features of APEX?

it depends on the nature of the application. For many things - no, I didn't use the CRUD screens. for some things - sure.

Are your stored procedures in the
same schema as the data?


optimally - no, they would not be. For security reasons. But if they are - make sure the schema your middle tier connects as is not the schema that owns the code and the tables.

Do the features of APEX Interactive Reports and Tabular Form
violate the concept of maintaining data logic separate from application logic?


not necessarily. They are sort of like views in a way (and views can be used to materialize these objects - not necessarily base tables)


I agreed with most of the comments on that thread with the definite exception of Andre.