You don't answer the question
Santiago Coria, March 15, 2022 - 6:03 pm UTC
Hans-Werner asks "How can I guess the encoding of submitted data?"
But the reply does not answer how, or even whether it is possible.
I understand that once the file is in a BLOB in the database the charset information is lost. Is that correct?
March 18, 2022 - 4:49 am UTC
There is no "character set" as such for the BLOB, because it is character-set independent (assuming a multibyte character set for the database, which is the default).
Mustafa KALAYCI, March 31, 2022 - 4:51 am UTC
Some developers store text data in a BLOB column without converting the text into raw data: they read the "text file" as a byte array and insert that byte array into the BLOB column (I am certainly against this method; always use a CLOB for text data). In this case the raw bytes of the file, not decoded text, are written into the BLOB, and when you try to read it as text (to_clob, dbms_lob.convertToClob, etc.) some characters may come out as non-alphabetical garbage because they are lost in the conversion.
To solve this, you must know the original encoding of the text file, and while converting the data from BLOB to CLOB you must pass the character set ID, for example to_clob(my_column, 46); 46 is the charset ID of the WE8ISO8859P15 character set. (At worst you can guess what it could be: use the nls_charset_id function to get the charset ID of a candidate character set and try the conversion with it; if there are no strange characters after the conversion, you may have found your encoding.)
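The guess-and-check idea above can be sketched outside the database as well: try each candidate encoding and keep the ones that decode the raw bytes without errors. This is a minimal Python illustration, not Oracle code; the candidate list and function name are my own, chosen for the example.

```python
# Guess-and-check: try candidate encodings against the raw bytes and
# keep the ones that decode cleanly. The candidate list is illustrative.
CANDIDATES = ["utf-8", "iso-8859-15", "cp1252"]

def plausible_encodings(raw: bytes) -> list[str]:
    """Return the candidate encodings that decode `raw` without errors."""
    ok = []
    for enc in CANDIDATES:
        try:
            raw.decode(enc)  # strict mode raises on invalid byte sequences
            ok.append(enc)
        except UnicodeDecodeError:
            pass
    return ok

# 'é' encoded as ISO-8859-15 is the single byte 0xE9, which is invalid
# as UTF-8, so only the single-byte encodings survive the check.
print(plausible_encodings("café".encode("iso-8859-15")))
# → ['iso-8859-15', 'cp1252']
```

Note that more than one encoding can decode the same bytes without error, which is exactly why this is a guess, not a proof: you still have to eyeball the result for strange characters.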
If you didn't check the encoding before inserting the file into the BLOB, then you can only "guess" the encoding of the file afterwards. There are examples online of how to guess a file's encoding by looking at its first bytes, but none of them is guaranteed.
I strongly suggest reading the text file's content and inserting it into a CLOB. Even if a CLOB is not an option for some reason, read the content of the file first, then convert it to RAW, and then insert it into the BLOB. That way you won't be dealing with conversions.
My two cents.