Return-Path: Received: from pacific-carrier-annex.mit.edu by po10.mit.edu (8.9.2/4.7) id LAA08698; Tue, 5 Mar 2002 11:52:20 -0500 (EST) Received: from hermes.sun.com (hermes.sun.com [64.124.140.169]) by pacific-carrier-annex.mit.edu (8.9.2/8.9.2) with SMTP id LAA00985 for ; Tue, 5 Mar 2002 11:52:19 -0500 (EST) Date: Tue, 5 Mar 2002 16:52:19 GMT+00:00 From: "JDC Tech Tips" To: alexp@mit.edu Message-Id: <10992886717319895@hermes.sun.com> Subject: JDC Tech Tips, March 5, 2002 (Performance) Precedence: junk Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Beyond Email J D C T E C H T I P S TIPS, TECHNIQUES, AND SAMPLE CODE WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, March 5, 2002. This issue covers: * String Concatenation and Performance * Improving Java I/O Performance Note that performance results vary depending on environment The results shown in these tips were obtained by the author -- you might see different results on your system. These tips were developed using Java(tm) 2 SDK, Standard Edition, v 1.3. You can view this issue of the Tech Tips on the Web at http://java.sun.com/jdc/JDCTechTips/2002/tt0305.html - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - STRING CONCATENATION AND PERFORMANCE The February 5, 2002 Tech Tip, "Writing toString Methods," (http://java.sun.com/jdc/JDCTechTips/2002/tt0205.html#tip1), included the following statement: Note that using "+" within toString to build up the return value is not necessarily the most efficient approach. You might want to use StringBuffer instead. A reader of the Tech Tip noted that the Java documentation says that StringBuffer is used to actually implement the + operator, and so the question is, what if anything do you gain in performance by explicitly using StringBuffer in your programs? This tip attempts to answer that question. To start, let's look at an example -- one where a string is built up by repeating the same character over and over: class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class AppDemo1 { static final int N = 47500; public static void main(String args[]) { // build up string using + MyTimer mt = new MyTimer(); String str1 = ""; for (int i = 1; i <= N; i++) { str1 = str1 + "*"; } System.out.println("elapsed time #1 = " + mt.getElapsed()); // build up string using StringBuffer mt = new MyTimer(); StringBuffer sb = new StringBuffer(); for (int i = 1; i <= N; i++) { sb.append("*"); } String str2 = sb.toString(); System.out.println("elapsed time #2 = " + mt.getElapsed()); // sanity check if (!str1.equals(str2)) { System.out.println("str1/str2 mismatch"); } } } When you run this program, the result should be something like this: elapsed time #1 = 61890 elapsed time #2 = 16 Approach #2 explicitly uses StringBuffer, while approach #1 uses it implicitly as part of the implementation of the + operator. You can examine the bytecodes used to implement the first approach by saying: javap -c -classpath . AppDemo1 Why such a huge time difference between the two approaches? The second approach appends characters to a StringBuffer, which is quite efficient. Doesn't the first approach do the same? Actually, it doesn't. The statement: str1 = str1 + "*"; does not append characters to the string referenced by str1. That's because Java strings are immutable, they don't change after they're created. What actually happens is that: o A StringBuffer is set up o str1 is copied to it o The "*" is appended to the buffer o The result is converted to a string o The str1 reference is made to point at that string. o The old string that str1 previously referenced is then made available for garbage collection. The loop in the demo goes through N iterations, and on each iteration, the contents of str1 (containing N-1 characters) must be copied to the buffer. This behavior implies that the first approach above has quadratic or worse performance. "Quadratic" means that the running time is proportional to the square of N. It is possible to effectively freeze an application by using this kind of loop. The AppDemo1 example illustrates the case where you're repeatedly concatenating one string to another, such that both strings must be copied to a temporary area (StringBuffer), a new string created, and then the original string reference made to point at the new string. But what if you're not doing this kind of operation, and instead, you simply have some code like this: public String toString() { return "X=" + x + " Y=" + y; } There's no loop or repetition here, and no string that keeps getting longer and longer. Is there any harm in using + instead of StringBuffer in this example? To answer that question, let's look at some more code: class MyPoint { private final int x, y; private final String cache; public MyPoint(int x, int y) { this.x = x; this.y = y; cache = "X=" + x + " Y=" + y; } public String toString1() { return "X=" + x + " Y=" + y; } public String toString2() { StringBuffer sb = new StringBuffer(); sb.append("X="); sb.append(x); sb.append(" Y="); sb.append(y); return sb.toString(); } public String toString3() { String s = ""; s = s + "X="; s = s + x; s = s + " Y="; s = s + y; return s; } public String toString4() { return cache; } } class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class AppDemo2 { static final int N = 1000000; public static void main(String args[]) { MyPoint mp = new MyPoint(37, 47); String s1 = null; String s2 = null; String s3 = null; String s4 = null; // toString1 MyTimer mt = new MyTimer(); for (int i = 1; i <= N; i++) { s1 = mp.toString1(); } System.out.println("toString1 " + mt.getElapsed()); // toString2 mt = new MyTimer(); for (int i = 1; i <= N; i++) { s2 = mp.toString2(); } System.out.println("toString2 " + mt.getElapsed()); // toString3 mt = new MyTimer(); for (int i = 1; i <= N; i++) { s3 = mp.toString3(); } System.out.println("toString3 " + mt.getElapsed()); // toString4 mt = new MyTimer(); for (int i = 1; i <= N; i++) { s4 = mp.toString4(); } System.out.println("toString4 " + mt.getElapsed()); // sanity check to make sure equivalent // results returned from each toString method if (!s1.equals(s2) || !s2.equals(s3) || !s3.equals(s4)) { System.out.println("check error"); } } } This demo sets up a class MyPoint, that is used to represent X,Y points. The demo implements a variety of toString methods for the class. The result of running the program should look something like this: toString1 2797 toString2 2703 toString3 5656 toString4 32 The first two approaches, using + and StringBuffer, have nearly identical performance. So you might conclude that these two approaches are in fact the same. The bytecodes generated for toString1 indicate that a StringBuffer is set up, and then the various strings are simply appended to it. The resulting code is very similar to toString2. But it's not quite that simple. The first problem is that you can't always formulate the value to be returned from toString as a single expression. toString3 and toString1 produce identical results. But toString3 takes twice as long to run, for the same reasons as described for the AppDemo1 example. The AppDemo2 example illustrates the situation where you need to build up the return value a piece at a time. In such a case, toString2, using an explicit StringBuffer, is a better choice. The other problem concerns a statement found in the Java Language Specification section 15.18.1.2, which says: An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression. This statement suggests that a Java compiler is not required to optimize an expression such as: str1 + str2 + str3 + ... as is currently done for the toString1 method above, but instead can create intermediate string objects. So it pays to be cautious in your use of the + operator, especially for long strings or within loops. Note that there's an even faster way to implement toString for this example. MyPoint is an immutable class, meaning that instances cannot be modified after creation. Given this, the value returned by toString will always be the same. Because the value is always the same, it can be computed once in the MyPoint constructor, and then simply returned by toString4. This kind of caching is often very useful, but not without drawbacks. If the class is mutable, then caching might not make sense. The same is true if computing the cache value is expensive, if the cache ties up a lot of memory, or if the toString method is called infrequently. For more information about this topic, see Section 15.18.1, String Concatenation Operator +, in "The Java Language Specification Second Edition" (http://java.sun.com/docs/books/jls/), and Chapter 5, Strings, in "Java Performance Tuning" by Jack Shirazi. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - IMPROVING JAVA I/O PERFORMANCE This tip looks at some of the issues and techniques around improving Java I/O performance. This is a complex subject, so the tip is only able to illustrate a few key points. The first thing to understand is that some factors that can significantly impact I/O performance exist primarily outside of Java applications. One of these items is operating system file caching. When you read a file from disk, the operating system often keeps a copy of the file in memory. The operating system does this so that no disk access is required if the file is read a second time. This technique can result in huge speedups, and you need to be careful to correct for this factor when doing performance analysis of your Java programs. This is one example where adding extra memory often improves application performance, even if the application does not directly need the memory for its internal data structures. Another performance factor is disk file fragmentation, where pieces of a file are scattered across a disk. Modern disk drives typically require 10-15 milliseconds to move from one part of a disk surface to another (seek time). This is a long time, especially compared to ultrafast processor speeds. The same issue occurs if you're using java.io.RandomAccessFile on a big database, and the database needs to move between items of information distributed across the disk. Yet another example is if you're copying one file to another, and the files are at widely-separated locations on the same disk so that the disk head must move back and forth repeatedly. This tip is not going to illustrate the caching or seek time issues with example code, but these issues are extremely important, and worth considering when you're designing your Java applications. Let's look at some code. The first issue to illustrate is the cost of enumerating and opening files. Suppose that you're writing a Java application that searches through all the files in a directory structure. Suppose too that as part of this project, you need to write some code that creates a list of path names based on a starting point in a directory. The code looks like this: import java.io.*; import java.util.*; class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class IODemo1 { // recursive method that walks directory // structure static void getFileList2(File f, List list) { if (f.isDirectory()) { String entries[] = f.list(); int maxlen = ( entries == null ? 0 : entries.length); for (int i = 0; i < maxlen; i++) { getFileList2( new File(f, entries[i]), list); } } else if (f.isFile()) { list.add(f.getPath()); } } // top-level method to get list of path names // starting at a specified point in directory // structure static List getFileList(String fn) { List list = new ArrayList(); getFileList2(new File(fn), list); return list; } public static void main(String args[]) { MyTimer mt = new MyTimer(); List list = getFileList(args[0]); // for (int i = 0; i < list.size(); i++) { // System.out.println(list.get(i)); // } System.out.println( "number of files = " + list.size()); System.out.println( "microseconds per file = " + mt.getElapsed() * 1000 / list.size()); } } If you want to see the resulting list of path names, you can uncomment the three lines in IODemo1, starting with for (int i = 0; ...). When you run the program, for example in Windows by saying: java IODemo1 C:\\ the result is something like: number of files = 11298 microseconds per file = 525 on the first run, and something like: number of files = 11298 microseconds per file = 157 on the second. The time discrepancy is due to caching effects, as mentioned above. The result of this demo indicates that it takes about 500 microseconds per pathname to enumerate a set of files, or about 2,000 files/second. So if you're writing an application, and it searches a directory structure of 100,000 files, it will take at least 50 seconds just to enumerate the files. This time is separate from actually opening and reading the files. Here's another example. It measures the time required to repeatedly open a file, read one byte, and close the file: import java.io.*; class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class IODemo2 { static final int N = 100000; static final String TESTFILE = "testfile"; public static void main(String args[]) throws IOException { // create file and write one byte into it FileOutputStream fos = new FileOutputStream(TESTFILE); fos.write('A'); fos.close(); // repeatedly open file and // read a byte from it MyTimer mt = new MyTimer(); for (int i = 1; i <= N; i++) { FileInputStream fis = new FileInputStream(TESTFILE); fis.read(); fis.close(); } System.out.println( "microseconds per open/read/close = " + mt.getElapsed() * 1000 / N); } } The result of running the program should look something like this: microseconds per open/read/close = 101 This result translates to about 10,000 file open/read/closes per second. It's hard to give simple advice on how to improve the performance of the IODemo1 and IODemo2 examples. For the IODemo1 example, you could do the enumeration once and keep the result in a cache. That approach could work if you know that the directory contents will not change. For the IODemo2 example, an obvious approach is to keep files open instead of repeatedly opening and closing them. These areas tend to be more amenable to changes in architecture than to applying performance techniques after the fact. For example, if you have 100,000 small files, and you want to search them, could you change the architecture to instead use 1,000 large files? Let's now look at performance issues related to reading bytes from a file. Suppose that you have a file containing a stream of ABABAB... bytes, and you'd like to count the number of "A" bytes. What's the fastest way of doing this? Here's some code that illustrates three approaches: import java.io.*; class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class IODemo3 { static final int N = 1000000; static final String TESTFILE = "testfile"; static int cnt1, cnt2, cnt3; static void write() throws IOException { // write a sequence of ABABAB... bytes to the file FileOutputStream fos = new FileOutputStream(TESTFILE); boolean flip = false; for (int i = 1; i <= N; i++) { fos.write(flip ? 'B' : 'A'); flip = !flip; } fos.close(); } static void read1() throws IOException { // read bytes from the file one at a time MyTimer mt = new MyTimer(); FileInputStream fis = new FileInputStream(TESTFILE); cnt1 = 0; int c; while ((c = fis.read()) != -1) { if (c == 'A') { cnt1++; } } fis.close(); System.out.println("read1 time = " + mt.getElapsed()); } static void read2() throws IOException { // read bytes from the file one at a time // with buffering MyTimer mt = new MyTimer(); FileInputStream fis = new FileInputStream(TESTFILE); BufferedInputStream bis = new BufferedInputStream(fis); cnt2 = 0; int c; while ((c = bis.read()) != -1) { if (c == 'A') { cnt2++; } } fis.close(); System.out.println("read2 time = " + mt.getElapsed()); } static void read3() throws IOException { // read from the file with buffering // and with direct access to the buffer MyTimer mt = new MyTimer(); FileInputStream fis = new FileInputStream(TESTFILE); cnt3 = 0; final int BUFSIZE = 1024; byte buf[] = new byte[BUFSIZE]; int len; while ((len = fis.read(buf)) != -1) { for (int i = 0; i < len; i++) { if (buf[i] == 'A') { cnt3++; } } } fis.close(); System.out.println("read3 time = " + mt.getElapsed()); } public static void main(String args[]) throws IOException { // write the test file write(); // read from the file the first time read1(); read2(); read3(); // sanity check if (cnt1 != cnt2 || cnt2 != cnt3) { System.out.println( "cnt1/cnt2/cnt3 mismatch"); } // read from the file the second time read1(); read2(); read3(); } } If you run the the program, you should see a result that looks something like this: read1 time = 4063 read2 time = 140 read3 time = 31 read1 time = 4079 read2 time = 140 read3 time = 16 The program does everything twice, just to make sure that there are no file caching effects. The read1 method opens a FileInputStream and then calls the read method repeatedly. The read2 method works similarly, but sets up a buffer on top of the input stream. The read2 method runs much faster than read1 -- in this case, about 25 times as fast. Why? The reason is that FileInputStream.read is implemented as a native method, and calls the underlying system for each byte that is read. This process may involve such things as invoking system calls, and is very expensive. The read2 method uses BufferedInputStream. This class overrides read to grab a byte out of a buffer. Only when the buffer is empty is a FileInputStream.read call made to refill it. Because the buffer is long (currently 2048 bytes), such a call occurs infrequently. The read3 method is even faster than read2. It uses an explicit buffer to grab chunks of bytes from the file. This approach is similar to the buffer used within BufferedInputStream. But instead of repeatedly calling the read method, it iterates over the buffer and directly accesses bytes from it. The difference between read1 and read2 is buffering. The difference between read2 and read3 is the elimination of method call overhead, and the processing that BufferedInputStream.read must do for each call. This processing includes checking whether the buffer is empty. Both of these techniques -- the use of buffering, and exercising caution in the use of per-byte or per-character I/O operations -- are central to improving I/O performance. It is possible to go too far with this kind of optimization. For example, the following line appears in the read3 code (it's used to iterate across the buffer): for (int i = 0; i < len; i++) { If the line was this: for (int i = 1; i < len; i++) { it would represent an off-by-one error. If you really need the last bit of performance, then read3 might be the appropriate approach. But getting into low-level details can be a big source of coding errors. Here's a final example, one that counts the number of lines in a text file: import java.io.*; class MyTimer { private final long start; public MyTimer() { start = System.currentTimeMillis(); } public long getElapsed() { return System.currentTimeMillis() - start; } } public class IODemo4 { static final int N = 1000000; static final String TESTFILE = "testfile"; static int cnt1, cnt2; static void write() throws IOException { // write a sequence of text lines // to the test file, // terminating with \r, \n, or \r\n FileWriter fw = new FileWriter(TESTFILE); BufferedWriter bw = new BufferedWriter(fw); String str = "test"; int slen = str.length(); int state = 0; for (int i = 1; i <= N; i++) { bw.write(str, 0, slen); if (state == 0) { bw.write('\r'); state = 1; } else if (state == 1) { bw.write('\n'); state = 2; } else { bw.write('\r'); bw.write('\n'); state = 0; } } bw.close(); } static void read1() throws IOException { // read from text file using readLine() MyTimer mt = new MyTimer(); FileReader fr = new FileReader(TESTFILE); BufferedReader br = new BufferedReader(fr); cnt1 = 0; while (br.readLine() != null) { cnt1++; } br.close(); System.out.println( "read1 " + mt.getElapsed()); } static void read2() throws IOException { // read from text file using a buffer // and reading individual characters MyTimer mt = new MyTimer(); FileReader fr = new FileReader(TESTFILE); BufferedReader br = new BufferedReader(fr); cnt2 = 0; int prev = -1; final int BUFSIZE = 1024; char cbuf[] = new char[BUFSIZE]; int currpos = 0; int maxpos = 0; while (true) { if (currpos == maxpos) { maxpos = br.read(cbuf, 0, BUFSIZE); if (maxpos == -1) { break; } currpos = 0; } int c = cbuf[currpos++]; if (c == '\r' || (c == '\n' && prev != '\r')) { cnt2++; } prev = c; } br.close(); System.out.println( "read2 " + mt.getElapsed()); } public static void main(String args[]) throws IOException { // write the test file write(); // read the test file the first time read1(); read2(); // sanity check if (cnt1 != cnt2) { System.out.println ("cnt1/cnt2 mismatch"); } // read the test file the second time read1(); read2(); } } The result of running this program should look something like this: read1 1125 read2 578 read1 1093 read2 563 The IODemo4 program creates some test data, consisting of a series of lines. A line of text is defined to end with carriage return, line feed, or carriage return line feed. All three of these cases are represented in the data. Two read methods are defined to count the number of lines. The first calls BufferedReader.readLine. The second uses a custom approach involving explicit buffering. Characters are read from the buffer and each line terminator is counted. The read2 method runs about twice as fast as read1. One reason for this difference is that the readLine method used by read1 returns a string for each line that is read. That string must be created and built up from individual characters. In this particular application, the string is not used for anything. It's inefficient to process the lines in this way -- unnecessary work is being done. But there's another side to this issue. In fact, you might want to use read1, even though it's slower. One problem with read2 is that the logic for detecting the end of line is tricky, and the three possible line terminators are explicitly coded into the read2 method. There's a good argument for leaving such details to the readLine method in the library. Also, what about an ambiguous case, where the last line of a text file does not end with a line terminator? Should it count as a line? In this case, what if read2 does something different than readLine? The extra speed of read2 might indeed be worth it, but it's a good idea to ask yourself whether the improved performance is worth the added complexity. The examples above touched on some of the core principles involved in speeding up I/O. There are other issues that have not been discussed. These include: o Performance issues for serialization o Byte/character and character/byte conversions o Converting numbers to strings and strings to numbers o Compression o Network I/O o Terminal I/O These areas are sometimes quite important. For more information about improving Java I/O performance, see Chapter 8, I/O, Logging, and Console Output, in "Java Performance Tuning" by Jack Shirazi. . . . . . . . . . . . . . . . . . . . . . . . IMPORTANT: Please read our Terms of Use, Privacy, and Licensing policies: http://www.sun.com/share/text/termsofuse.html http://www.sun.com/privacy/ http://developer.java.sun.com/berkeley_license.html * FEEDBACK Comments? Send your feedback on the JDC Tech Tips to: jdc-webmaster@sun.com * SUBSCRIBE/UNSUBSCRIBE - To subscribe, go to the subscriptions page, (http://developer.java.sun.com/subscription/), choose the newsletters you want to subscribe to and click "Update". - To unsubscribe, go to the subscriptions page, (http://developer.java.sun.com/subscription/), uncheck the appropriate checkbox, and click "Update". - To use our one-click unsubscribe facility, see the link at the end of this email: - ARCHIVES You'll find the JDC Tech Tips archives at: http://java.sun.com/jdc/TechTips/index.html - COPYRIGHT Copyright 2002 Sun Microsystems, Inc. All rights reserved. 901 San Antonio Road, Palo Alto, California 94303 USA. This document is protected by copyright. For more information, see: http://java.sun.com/jdc/copyright.html This issue of the JDC Tech Tips is written by Glen McCluskey. JDC Tech Tips March 5, 2002 Sun, Sun Microsystems, Java, and Java Developer Connection are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. To use our one-click unsubscribe facility, select the following URL: http://bulkmail.sun.com/unsubscribe?10992886717319895