Some people are worried about the efficiency of storing large objects in databases and the implications for performance. They are not necessarily wrong in fearing that storing large objects in a database might be problematic at best, or might require a lot of parameter tweaking before the database can handle them at all. It all depends on how a database implements the storage of large objects.
Oracle comes with two completely different mechanisms for that:
- LONG datatypes
- LOB datatypes
When comparing the LONG datatypes such as LONG and LONG RAW to the LOB datatypes such as CLOB, BLOB and NCLOB, they don't read that different at first.
But there is a huge difference in how they are handled, both internally inside the database as well as when storing to and retrieving from a database client.
LONG fields are embedded directly into the table row. This has some consequences you should be aware of:
- If your record is made up of 5 VARCHAR fields with a maximum length of 40 bytes each plus one LONGVARCHAR, and you store 10 MB into the LONGVARCHAR column, your database row will grow to 10,000,200 bytes, or roughly 10 MB.
- The database always reads or writes the entire row. So if you SELECT only the VARCHAR fields in order to display their content in a user interface, as a basis for the user to decide whether he or she needs the content of the LONGVARCHAR at all, the database will already have fetched all 10 MB. If you SELECT and display 25 records, each with a 10 MB object in it, this means about 250 MB of I/O.
- When storing or fetching such a row you need to make sure your fetch buffer is sized appropriately.
In practice this cannot be efficient. It might work as long as you stay in the KB range, but you will most likely run into trouble as soon as it gets into the MBs per record. Additionally, there are more limitations to the concept of LONG datatypes, such as the fact that you can only have one of them per row and that you cannot index them. This is probably why Oracle decided to deprecate the LONG datatypes in favor of LOB columns.
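To make this concrete, here is a minimal JDBC sketch of the LONG-era approach. The DOCUMENTS table with its TITLE, AUTHOR and BODY (LONG) columns as well as the connection details are made up for illustration; the point is that the LONG value travels as part of the row and is best consumed as a stream on the client:

```java
import java.io.Reader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LongColumnRead {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details for a hypothetical Oracle instance.
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger");
             Statement stmt = con.createStatement();
             // DOCUMENTS is a hypothetical table with small VARCHAR2 columns
             // plus one LONG column called BODY, deliberately selected last.
             ResultSet rs = stmt.executeQuery(
                 "SELECT title, author, body FROM documents")) {

            while (rs.next()) {
                String title = rs.getString("title");
                String author = rs.getString("author");
                // The LONG content arrives as part of the row; reading it as a
                // character stream at least avoids building one giant String
                // in the client JVM.
                try (Reader body = rs.getCharacterStream("body")) {
                    // process the stream ...
                }
                System.out.println(title + " by " + author);
            }
        }
    }
}
```

Standard JDBC also offers Statement.setMaxFieldSize() to cap how much of a LONGVARCHAR column the driver returns, but that truncates the data rather than solving the underlying problem.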
A lot of non-Oracle-DBA people believe that LOB means "large OBject", because some other vendors have used the term BLOB for "Binary Large OBject" in their products. This is not only wrong but - even worse - misleading, because people then start asking: "What's the difference between large and long?" (Bear with all non-native English speakers here, please!)
Instead, LOB stands for Locator OBject, which describes exactly what it is: a pointer to the place where the actual data is stored. This locator occupies only a few bytes in the row and thus does not harm row size at all, so all the issues discussed above vanish immediately. For the sake of simplicity, think of a LOB as a pointer to a file on the database's internal file system that stores the actual content of that field. (Oracle might use plain files or different mechanisms in its implementation; we don't have to care.)
But there is always a trade-off: while LOBs are extremely handy inside a row, they are more complex to store and retrieve. As opposed to all other column types, their actual content stays where it is, even when you transfer the row from the database to the client. All that goes over the wire in that case is a token representing the actual LOB column content.
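A minimal sketch of what that looks like from a JDBC client (the DOCUMENTS_LOB table and its columns are again invented for illustration): fetching the row hands you a java.sql.Clob, and even asking for its length does not force the content across the wire:

```java
import java.sql.Clob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LobLocatorDemo {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger");
             PreparedStatement ps = con.prepareStatement(
                 "SELECT title, body FROM documents_lob WHERE id = ?")) {

            ps.setLong(1, 42L);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    // Only a locator for BODY has been transferred so far.
                    Clob body = rs.getClob("body");
                    // length() is answered via the locator, not by pulling
                    // the (potentially huge) content into the client.
                    System.out.println(rs.getString("title")
                            + " has a body of " + body.length() + " characters");
                    // The content itself is only streamed on demand, e.g.
                    // through body.getCharacterStream().
                    body.free();
                }
            }
        }
    }
}
```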
In order to read or write LOB content, the client needs to open a separate stream connection over the network that can be read from or written to, similar to a file on a network file system. JDBC (starting with version 2.0) comes with special objects such as java.sql.Blob and java.sql.Clob to access the content of LOBs; they do not represent character arrays or strings but streams!
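What that stream-style access looks like with plain JDBC is sketched below, again against the invented DOCUMENTS_LOB table: the BLOB content is written from an InputStream and read back through a stream obtained from the locator, so the full payload never has to sit in client memory as a single array:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class LobStreamDemo {
    public static void main(String[] args) throws Exception {
        Path file = Path.of("payload.bin"); // any local file, for illustration

        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger")) {

            // Writing: hand JDBC an InputStream; the driver streams the bytes
            // into the BLOB instead of requiring a huge byte[] in memory.
            try (PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO documents_lob (id, title, payload) VALUES (?, ?, ?)");
                 InputStream in = Files.newInputStream(file)) {
                ps.setLong(1, 42L);
                ps.setString(2, "big attachment");
                ps.setBinaryStream(3, in, Files.size(file));
                ps.executeUpdate();
            }

            // Reading: fetch the locator first, then open a stream on it.
            try (PreparedStatement ps = con.prepareStatement(
                     "SELECT payload FROM documents_lob WHERE id = ?")) {
                ps.setLong(1, 42L);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        Blob payload = rs.getBlob(1);
                        try (InputStream in = payload.getBinaryStream();
                             OutputStream out = Files.newOutputStream(Path.of("copy.bin"))) {
                            in.transferTo(out); // copy without loading it all at once
                        }
                        payload.free();
                    }
                }
            }
        }
    }
}
```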