This question took me back to a thread on the Oracle Technology Network or whatever it is called now, more than seven years back. https://forums.oracle.com/ords/apexds/post/how-can-i-trap-an-impossibly-large-date-6142
My posts in the latter part of that thread should be taken with a pinch of salt as I had just started, that same month, to learn Oracle and SQL; I am not a computer professional, and I had not worked with databases before then. However, it seems that most of what I reported there is unchanged in my current version of Oracle, which is 19.3.
The problem reported there was exactly that of invalid dates that make their way into the database. Some of those problems (bugs) survive in Oracle 19; there may have been other bugs that were fixed, but there are some that have not been fixed. I suspect this is still true for Oracle 23, or whatever the "current" version is as I write this.
To your main question - how do such invalid values get into the db in the first place. The most likely scenario is a direct load (simple copying of bytes with no validation whatsoever). Suppose you import data from some other system, not from Oracle db. You tell the Oracle db: "accept the incoming values as valid; I guarantee that they are" - but in fact they aren't valid. Specifically, Oracle uses one particular scheme for encoding dates - seven bytes where the first is the century, the second is the year within century, both in excess-over-100 encoding (so that negative years can also be encoded as positive integers in base 100). If the data source you import from uses a different scheme, data that is valid in the source, copied byte-for-byte to Oracle db without any checking, may be invalid in Oracle db.
This is why such raw copying of data should - in most cases - be used ONLY for Oracle db - to - Oracle db loads, not for importing data from other sources (in my opinion anyway).
You said you could fix the problematic rows with SQL. I wonder what you have in mind - other than, perhaps, trying to find all the invalid rows and setting the date values to NULL. It is not clear how you would be able to reconstitute the correct, intended date from the invalid values stored in the db. Most importantly, as I discovered in that old thread on OTN, in some cases an invalid date may actually be stored in the Oracle db in a seven-byte sequence that represents a valid
date - but one that has nothing to do with the intended value! Obviously such problems can't be fixed if you only look at invalid data to begin with.
Here is a quick demo of causing invalid data to be stored in the Oracle db, using purely Oracle SQL commands (which, again, is not the most likely reason for YOUR issue). It may be argued that "INSERT ... VALUES ..." uses a form of raw data manipulation, similar to batch loads; but the invalid dates should be caught earlier than that. In the examples I use date arithmetic (adding an insane number of days or of months to a valid "starting" date); that date arithmetic calculation should throw the error, regardless of what is intended to be done with the result of the calculation. Whether it is used in an INSERT statement or for any other purpose is irrelevant.
select banner_full from v$version;
Oracle Database 19c Enterprise Edition Release 184.108.40.206.0 - Production
drop table t purge;
create table t (id number, dt date);
insert into t (id, dt) values (1, date '2020-12-31' + 17500000);
insert into t (id, dt) values (2, add_months(date '2020-12-31', 574000));
insert into t (id, dt) values (3, date '2020-12-31' + 45000000);
insert into t (id, dt) values (4, add_months(date '2020-12-31', 65550*12));
alter session set nls_date_format='yyyy-mm-dd';
select id, dt, to_char(dt, 'yyyy-mm-dd') as dt_char, dump(dt) from t;
ID DT DT_CHAR DUMP(DT)
--- ---------- ---------- -----------------------------------
1 9998-05-17 9998-05-17 Typ=12 Len=7: 200,98,5,17,1,1,1
2 9918-04-30 9918-04-30 Typ=12 Len=7: 200,18,4,30,1,1,1
3 5846-10-25 0000-00-00 Typ=12 Len=7: 42,54,10,25,1,1,1
4 2034-12-31 2034-12-31 Typ=12 Len=7: 120,134,12,31,1,1,1
The example shows several interesting things. I was able to cause invalid data to be inserted in my table, using purely Oracle SQL code. The first two examples show how I can get dates in the 100-th century, either by adding days to a valid date or using ADD_MONTHS on a valid starting date. Note that in both cases the "year" byte is less than 100, which in Oracle encoding should mean the year is negative (B.C. or whatever the politically correct notation is for that these days). But the "century" byte is greater than 100; so this kind of DUMP can be immediately recognized as "invalid" for a date encoding.
The third example (ID = 3) results in both the "century" and the "year" bytes being less than 100. This should encode a B.C. year, but selecting the date with NLS_DATE_FORMAT as yyyy-mm-dd shows a date in year 5846 A.D.. Strangely, if we explicitly apply TO_CHAR to the date, we get the string 0000-00-00. Which means that something else is happening; converting the date to a string implicitly (in the DT column) vs. explicitly (in the DT_CHAR) column results in different output, even though both use the same format model.
The last example is the most interesting. We get a CORRECT dump for the date shown, 31 December 2034! However, that is NOT the correct result of the date expression in the INSERT statement; that expression should return 31 December in year 65550 + 2020 = 67570. If someone working in astronomy or particle physics, where dates in year 67570 may be valid, uses Oracle db for storing data and making computations, this would be a very definite bug. (Caused by their own incompetence - if they need years beyond 9999 they shouldn't be using Oracle db in the first place.)
One can play with this further. For example, I tried (not shown here) to create the same invalid results using date plus interval arithmetic: adding days or months using either interval literals or NUMTODSINTERVAL and NUMTOYMINTERVAL. Interestingly, that works (either way) for adding insanely many days, but it throws errors for adding insanely many months. Some checking seems to be happening, but not in a consistent manner.