28 September 2006

StringBuilder vs StringBuffer

When Java 1.5 came out, many of us were eager to get our hands on StringBuilder. Since it is an unsynchronized (non-thread-safe) version of StringBuffer, one would imagine that it would pack some heat on the performance end for cases where you don't have multi-threaded appends.

To test the relative performance difference, I used a BufferedReader over a decently sized text file (622732 words, gathered by repeatedly pasting the Wikipedia article on the USA). Armed with data, I wrote and measured the following loop, which merely reads in the file and appends each line to either a StringBuffer or a StringBuilder.


....
for (int x = 0; x < 100; x++) {
    File f = new File("USA.txt");
    BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
    String line;
    // Swap the next two lines to test the other class
    StringBuffer sb = new StringBuffer();
    // StringBuilder sb = new StringBuilder();
    long start = System.currentTimeMillis();
    // Read and append; this timing includes the cost of readLine
    while ((line = in.readLine()) != null) {
        sb.append(line);
    }
    long mid = System.currentTimeMillis();
    String s = sb.toString();
    long done = System.currentTimeMillis();
    // Prints: <read + append ms> <toString ms>
    System.out.println((mid - start) + " " + (done - mid));
    in.close();
}
...


As is evident, I merely switched the constructor to a StringBuilder when I was done with the StringBuffer run. Each iteration prints the total time spent reading the file and appending, followed by the time taken by the toString call.
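For anyone who wants to repeat the comparison without a copy of USA.txt, here is a minimal self-contained sketch of the same idea. It is not the code I ran: the class name AppendTiming, the sampleLines helper and the synthetic text are all made up for illustration, it times append and toString together, and it uses System.nanoTime() (available since 1.5) rather than currentTimeMillis.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendTiming {

    // Hypothetical stand-in for the lines of USA.txt, so no file is needed.
    static List<String> sampleLines(int count) {
        List<String> lines = new ArrayList<String>(count);
        for (int i = 0; i < count; i++) {
            lines.add("line " + i);
        }
        return lines;
    }

    // Appends every line to a StringBuffer (synchronized methods).
    static String withBuffer(List<String> lines) {
        StringBuffer sb = new StringBuffer();
        for (String line : lines) {
            sb.append(line);
        }
        return sb.toString();
    }

    // Same loop against a StringBuilder (no synchronization).
    static String withBuilder(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> lines = sampleLines(200000); // assumed size, not the original data
        for (int run = 0; run < 10; run++) {
            long t0 = System.nanoTime();
            String a = withBuffer(lines);
            long t1 = System.nanoTime();
            String b = withBuilder(lines);
            long t2 = System.nanoTime();
            // Print the lengths too, so the results are visibly used.
            System.out.println("run " + run
                    + ": StringBuffer " + (t1 - t0) / 1000 + " us"
                    + ", StringBuilder " + (t2 - t1) / 1000 + " us"
                    + " (" + a.length() + "/" + b.length() + " chars)");
        }
    }
}
```

The two loops are kept as separate methods on purpose, so that each call site only ever sees one concrete class. The numbers will vary with the VM, the hardware and the input, so treat the output as a rough indication only.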



Clearly, they both perform almost identically. At JavaOne last year, there was a lot of talk about how well the Sun VM handles synchronization. This looks like proof.

But here's the dicey part. When we did a mass search and replace on our application, we found a massive boost. So, were we dreaming?

I was convinced that my findings were bogus, and that the overhead imposed by the readLine call on my measurement was hiding a performance difference. I was wrong. I changed the loop to measure like so:


...
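// Accumulated time spent in append only, excluding readLine
long appendMillis = 0;
while ((line = in.readLine()) != null) {
    // Caveat: currentTimeMillis is coarse for timing a single append
    long t1 = System.currentTimeMillis();
    sb.append(line);
    long t2 = System.currentTimeMillis();
    appendMillis += (t2 - t1);
}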
...


No difference. I'm happy about this because we need not start hunting down every StringBuffer and switching it to StringBuilder like crazy. But I'm still perplexed about what we saw earlier. So, I decided to run the same test on a JRockit VM (1.5 spec). Here's what I got:



JRockit seems to jump in performance steps, almost as if the VM is adjusting to the code. Notice the step-like decrease in append times for both the buffer and the builder, and the toString is clearly faster than on Sun. But there's something to be said for the sheer predictability of the Sun VM too.
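If the steps really are the VM optimizing the code as it runs (my guess, I have not verified it), then the early iterations and the late ones are measuring different things. A common way to deal with that is to run a few unmeasured warm-up rounds first and then report a median. The sketch below illustrates the idea; the class name WarmedUpTiming, the round counts and the sample text are assumptions for illustration, not values I tuned or used.

```java
import java.util.Arrays;

public class WarmedUpTiming {

    // Appends `piece` to a StringBuilder `count` times and returns the
    // resulting length, so the work has a result that gets used.
    static int appendMany(String piece, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString().length();
    }

    // Median of the timings (the upper middle value for an even count).
    static long median(long[] nanos) {
        long[] sorted = nanos.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        int warmupRounds = 20;   // assumed value
        int measuredRounds = 20; // assumed value
        long checksum = 0;

        // Unmeasured rounds, giving the VM a chance to optimize the loop.
        for (int i = 0; i < warmupRounds; i++) {
            checksum += appendMany("some sample text ", 100000);
        }

        // Measured rounds.
        long[] timings = new long[measuredRounds];
        for (int i = 0; i < measuredRounds; i++) {
            long start = System.nanoTime();
            checksum += appendMany("some sample text ", 100000);
            timings[i] = System.nanoTime() - start;
        }

        System.out.println("median " + median(timings) / 1000
                + " us (checksum " + checksum + ")");
    }
}
```

Swapping StringBuilder for StringBuffer in appendMany gives the other half of the comparison. A median is used rather than a mean so that one slow round, for example one hit by a garbage collection, does not dominate the result.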

So, my conclusion? Don't race to switch StringBuffer to StringBuilder: there doesn't seem to be a tangible difference in performance.

1 comment:

  1. It is unquestionable that StringBuffer will perform worse, even in an uncontended scenario. Synchronization should be avoided,
    and there are ways of doing that. For example, let us say that a thread has a private variable.
    If that variable is accessed within a synchronized block, it acts like a volatile variable,
    implying that the JVM is forced to sync up the value with main memory rather than depend on the thread's local registers,
    which is painful,
    particularly in cases where we are talking about very heavily loaded systems.
    So whatever they do with synchronization, it will remain slower.
    It is a check-and-set routine:
    check for a semaphore and, if it is free, set it.
    Any other thread that sees the set semaphore has to wait until the first thread unsets it.
    So even with uncontended blocks,
    you are forced to go through the check-and-set routine, even though you will always succeed on the first go,
    and thus the overhead.
