Compromising between Java and Oracle to load LOB/CLOB/XML

Problem:

We have the challenge to load feed files, each with approximately 100k rows of trade data with XML which sizes vary between 30k to 300k, into an Oracle database, across the network, as quickly as possible.

Trade data is processed in a grid using a Java application, which in turn produces these XMLs to be inserted into the database. The Java application produces row by row insert statements against the database, and in effort to increase throughput, multiple streams of inserts will be run.  It was reported that performance was unsatisfactory, and that the assumption that the bottleneck is the database.

Constraints:

The use of external tables, sqlldr are not an option at this point, as it requires large changes to the java application which currently produces the required data.

Analysis:

Those who are familiar with Oracle will question the report that Oracle is struggling to load a mere 100k rows of data. On closer analysis, it is clear that the real bottleneck is network. 100k rows of data equates to about 7GB worth of data.


1) Tried using a new listener with 32k packet sizing, as opposed to the default 2k. I expected that this would reduce the amount of return trips that have to be made to transfer al the data. Good idea, but turned out that it hardly made any difference in throughput

2) We partitioned the target table to 20 partitions (hash), in the hope to load balance the concurrent loader to hit the different partitions. No improvement, and the main bottleneck is still network

3) OK, hardcore DBAs might frown on this one, but I think this is a very neat and interesting approach. We zipped the XML at source (at the java application side) to reduce the amount of data have to be shifted across the network. We later unzipped the XML with a simple trigger on the table, prior to writing it to the target column (yes, this works row by row still). This eliminates the network bottleneck, which is a very good thing, however performance is still not good enough

4) Altered the table and modify the lob column to use the CACHE option. This improved performance considerably, but now we were seeing a lot of “log buffer space” wait event. This was confusing me, because I had set the table to be NOLOGGING. After much digging, turned out that CACHE does not go hand in hand with NOLOGGING. If you cache, logging will be turned on automatically. The fact is, nocache nologging is much much slower than cache logging. So we will just have to deal with the “log buffer space” wait.

5) Increased the “log_buffer” parameter from 2MB to 8MB, as well as move the redo logs to the faster device. This reduced the log buffer space wait event, but introduced new wait “buffer busy waits”. The explanation for this is relatively simple, the concurrent inserts often try to write to the same data block. There are different options to deal with this. One is to use a tablespace with smaller block size to reduce the number of rows that can be fit into a single block. This will hopefully reduce the buffer busy waits, but not necessarily eliminate it. Or we can deal with this from the application side. We choose to do this instead.

6) We modified  the java code which control the job distribution across streams. A single stream is dynamically assigned to a unique partition. Therefore when the 40 streams are running, each one is guaranteed to be working against different blocks, therefore eliminating buffer busy waits completely.

And the results? 7GB worth of XML loaded within 5.5 minutes. Not bad.

Thursday, May 7th, 2009 at 15:40
No comments yet.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>