Return-Path: Received: from pacific-carrier-annex.mit.edu by po10.mit.edu (8.9.2/4.7) id PAA04337; Tue, 10 Apr 2001 15:37:36 -0400 (EDT) Received: from hermes.java.sun.com (hermes.java.sun.com [204.160.241.85]) by pacific-carrier-annex.mit.edu (8.9.2/8.9.2) with SMTP id PAA20734 for ; Tue, 10 Apr 2001 15:38:16 -0400 (EDT) Date: Tue, 10 Apr 2001 15:38:16 -0400 (EDT) Message-Id: <200104101938.PAA20734@pacific-carrier-annex.mit.edu> From: "JDC Tech Tips" To: alexp@mit.edu Subject: JDC Tech Tips April 10, 2001 Reply-To: JDCTechTips@hermes.java.sun.com Errors-To: bounced_mail@hermes.java.sun.com Precedence: junk Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Beyond Email 2.2 J D C T E C H T I P S TIPS, TECHNIQUES, AND SAMPLE CODE WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, April 10, 2001. This issue covers: * Making Deep Copies of Objects * Using strictfp * Optimizing String Performance These tips were developed using Java(tm) 2 SDK, Standard Edition, v 1.3. You can view this issue of the Tech Tips on the Web at http://java.sun.com/jdc/JDCTechTips/2001/tt0410.html - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - MAKING DEEP COPIES OF OBJECTS The March 6, 2001 JDC Tech Tip "Cloning Objects" (http://java.sun.com/jdc/JDCTechTips/2001/tt0306.html#cloning) described how to make copies of objects, using a combination of Object.clone and your own clone methods. Object.clone, and sometimes user-defined clone methods, make what are called "shallow" copies, that is, they copy object fields and references, but they do not recursively copy the objects referenced by these fields. Suppose you have an ArrayList object. If you call its clone method, it will create a new ArrayList object, and copy into it the various element references of the original list. If these elements are mutable, that is, they can be changed, then changing one of them in the original list will also change the element in the new list. In other words, two ArrayList objects have common elements through shared references. But this sharing might not be what you want. You might want to make a list copy, but not share updates to elements in the two lists. These type of totally distinct copies are called "deep copies." Let's look at a few deep copy techniques in the following program: import java.io.*; import java.util.*; class A implements java.io.Serializable { private int x; public A(int i) { setX(i); } public void setX(int i) { x = i; } public int getX() { return x; } } public class DeepDemo { // shallow copy static ArrayList copy1(ArrayList list) { return (ArrayList)list.clone(); } // hand-coded deep copy static ArrayList copy2(ArrayList list) { ArrayList newlist = new ArrayList(); A aobj = (A)list.get(0); newlist.add(new A(aobj.getX())); return newlist; } // deep copy via serialization static ArrayList copy3(ArrayList list) throws Exception { // serialize ArrayList into byte array ByteArrayOutputStream baos = new ByteArrayOutputStream(100); ObjectOutputStream oos = new ObjectOutputStream(baos); oos.writeObject(list); byte buf[] = baos.toByteArray(); oos.close(); // deserialize byte array into ArrayList ByteArrayInputStream bais = new ByteArrayInputStream(buf); ObjectInputStream ois = new ObjectInputStream(bais); ArrayList newlist = (ArrayList)ois.readObject(); ois.close(); return newlist; } static final int NUMITERS = 50000; public static void main(String args[]) throws Exception { ArrayList listold, listnew; long currtime, elapsedtime2, elapsedtime3; // shallow copy listold = new ArrayList(); listold.add(new A(37)); listnew = copy1(listold); ((A)listold.get(0)).setX(47); System.out.print("copy1 old/new = "); System.out.print(((A)listold.get(0)).getX() + " "); System.out.println(((A)listnew.get(0)).getX()); // deep copy via hand-coded method listold = new ArrayList(); listold.add(new A(37)); currtime = System.currentTimeMillis(); for (int i = 1; i <= NUMITERS; i++) { listnew = copy2(listold); } elapsedtime2 = System.currentTimeMillis() - currtime; ((A)listold.get(0)).setX(47); System.out.print("copy2 old/new = "); System.out.print(((A)listold.get(0)).getX() + " "); System.out.println(((A)listnew.get(0)).getX()); // deep copy via serialization listold = new ArrayList(); listold.add(new A(37)); currtime = System.currentTimeMillis(); for (int i = 1; i <= NUMITERS; i++) { listnew = copy3(listold); } elapsedtime3 = System.currentTimeMillis() - currtime; ((A)listold.get(0)).setX(47); System.out.print("copy3 old/new = "); System.out.print(((A)listold.get(0)).getX() + " "); System.out.println(((A)listnew.get(0)).getX()); System.out.println(); System.out.println("copy2 time = " + elapsedtime2); System.out.println("copy3 time = " + elapsedtime3); } } The DeepDemo program uses an ArrayList object containing a single element of type A. Objects of type A are mutable through the setX method. The program defines three copy methods. The first method, copy1, does a shallow copy using the ArrayList.clone method. This method creates a new list and copies object references from the old list. After the copy1 method is called, the program changes the value of the single element in the old list and checks whether the value of the element in the new list is also changed. Here, a change to the new list indicates a shallow copy. The second approach to copying uses a hand-coded method, copy2, to make a deep copy. The method sets up a new ArrayList, gets the value of the element from the old list, and adds a new A object with this value to the new list. The third approach, copy3, uses serialization. It uses writeObject to convert the ArrayList into a byte stream stored in a ByteArrayOutputStream object, and then reverses the process, converting the stream of bytes back into an ArrayList. This process forces the creation of new ArrayList and A objects. When you run the program, the output should look something like this: copy1 old/new = 47 47 copy2 old/new = 47 37 copy3 old/new = 47 37 copy2 time = 47 copy3 time = 8500 Both hand-coded deep copying (copy2) and serialization (copy3) keep the copies distinct. But there's an apparent issue here -- performance. The serialization approach is much slower than the hand-coded approach. However, this time discrepancy doesn't mean that you should avoid serialization when making deep copies. The DeepDemo results show that it took 8,500 milliseconds to make the copies. If you examine DeepDemo, you'll see that it does the copying 50,000 times. Doing 50,000 copies in 8,500 milliseconds means that it didn't take very long to do a single copy. And time might not be a critical factor in the part of your application that makes these copies. There's another big issue here besides performance. The manual or hand-coded approach doesn't scale very well to complex data structures. For example, if you need to make a deep copy of a graph or tree structure, doing it yourself can get pretty complicated. So comparing the serialization approach to the manual approach, you can say that the serialization approach costs a lot more, measured in time, than the manual approach, but it does a lot more as well. A final point about copying through serialization is that all the objects that the serialization process encounters must be serializable. In the DeepDemo example, A got this property by implementing the java.io.Serializable marker interface. ArrayList does likewise. For more information about deep copying, see Section 3.9.3, Shallow versus Deep Cloning, and Section 15.7, Object Serialization, in "The Java(tm) Programming Language Third Edition" by Arnold, Gosling, and Holmes (http://java.sun.com/docs/books/javaprog/thirdedition/). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - USING STRICTFP The keyword "strictfp" was a late addition to the Java programming language. It is used to control certain aspects of floating-point arithmetic. It can be a bit difficult to describe and illustrate strictfp, so this tip will do it in a stepwise way. The tip starts with a few examples of syntax, then it presents an example that shows where strictfp might be important. You can use strictfp as a modifier of class, interface, and method declarations, like this: // legal uses of strictfp strictfp interface A {} public strictfp class FpDemo1 { strictfp void f() {} } You cannot use strictfp on constructors or methods within interfaces: // illegal uses of strictfp interface A { strictfp void f(); } public class FpDemo2 { strictfp FpDemo2() {} } The strictfp keyword is used to designate an expression as "FP-strict." If a class, interface, or method is declared using strictfp, then the class, interface, or method is FP-strict. So too are all classes, interfaces, methods, constructors, instance initializers, variable initializers, and static initializers within the declaration. An expression is FP-strict if it occurs anywhere within one of these FP-strict declarations, or if it is a compile-time constant expression. In practical terms, this means that if a class or method is declared with strictfp, any expression that occurs within the class or method is an FP-strict expression. Because constructors cannot be declared using strictfp, they are FP-strict only if their defining class is FP-strict. Methods within interfaces cannot be declared using strictfp because this is an implementation rather than an interface property. So what does FP-strict actually mean? Consider the following example: public strictfp class FpDemo3 { public static void main(String[] args) { double d = 8e+307; System.out.println(4.0 * d * 0.5); System.out.println(2.0 * d); } } The maximum value of a double (Double.MAX_VALUE) is approximately 1.8e+308. In the FpDemo3 example, the first expression is evaluated as: (4.0 * d) * 0.5 because the Java programming language guarantees a left-to-right order of evaluation. When the first part of the expression is evaluated, the result is 32e+307 or 3.2e+308, which is larger than Double.MAX_VALUE. Because the expression is FP-strict, the implementation is required to evaluate the whole expression as resulting in positive infinity (which the program prints as "Infinity"). This is true even though the later multiplication by 0.5 produces a final value for the expression of 1.6e+308, which is less than Double.MAX_VALUE. By contrast, if the expression is not FP-strict, an implementation is allowed to use an extended exponent range to represent intermediate results. In the FpDemo3 example, this could keep the expression from overflowing, and produce a final result that is within range. An implementation is not required to do this; it can, in effect, use FP-strict rules everywhere. Note that multiplication in this example is not associative, in that: (4.0 * d) * 0.5 != 4.0 * (d * 0.5) strictfp is important because its use guarantees common behavior across different Java implementations. In other words, you can know that the floating-point arithmetic in your application behaves the same when you move your application to a different Java implementation or hardware platform. For more information about strictfp, see Section 6.6.3, Strict and non-Strict Floating-Point Arithmetic in "The Java(tm) Programming Language Third Edition" by Arnold, Gosling, and Holmes (http://java.sun.com/docs/books/javaprog/thirdedition/); and Section 15.4, FP-strict Expressions, in "The Java(tm) Language Specification Second Edition" by Gosling, Joy, Steele, and Bracha (http://java.sun.com/docs/books/jls/). - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - OPTIMIZING STRING PERFORMANCE Suppose that you're working with a Java programming language application where you have lists of Integer wrapper objects. Suppose too that the application frequently converts these lists to strings. In other words, the application might have code like this: List list = new ArrayList(); list.add(new Integer(37)); list.add(new Integer(47)); ... String s = list.toString(); The toString method converts lists into a string, formatted like this: [37, 47] Is there a way you can optimize the conversions? To answer this question, it's useful to look at what happens internally when a list is converted to a string. A StringBuffer object is first set up. Then each list element is converted to a string and appended to the buffer. In the final step, the StringBuffer object itself is converted to a String object. List elements are converted to strings by calling a series of methods, the most important of which is Integer.toString(value, radix). This method takes an integer value, such as 59, and produces a new string from it. The conversion is expensive, in part because the value must be picked apart to obtain individual digits in the string representation. It's also expensive because a new string has to be created. Can you do better than the standard approach? The answer is yes, at least in some cases. Consider an example where the list elements are Integer objects with values in the range 0-99. In this case, it's possible to write a local version of toString that is faster than the standard method. The code looks like this: import java.util.*; public class PerfDemo { // length of ArrayList objects static final int MAXLISTLEN = 10000; // maximum value in Integer object static final int MAXNUM = 99; // cache of formatted strings for small integers static String cache[] = new String[MAXNUM + 1]; // fill cache at program startup static { for (int i = 0; i <= MAXNUM; i++) { cache[i] = Integer.toString(i); } } // local version of toString, that uses the cache static String localtoString(List list) { StringBuffer sb = new StringBuffer(); sb.append("["); int size = list.size(); for (int i = 0; i < size; i++) { Integer iobj = (Integer)list.get(i); sb.append(cache[iobj.intValue()]); if (i + 1 < size) { sb.append(", "); } } sb.append("]"); return sb.toString(); } public static void main(String args[]) { Random rn = new Random(0); List list = new ArrayList(); // fill list with random Integer values 0-99 for (int i = 1; i <= MAXLISTLEN; i++) { int r = rn.nextInt(MAXNUM + 1); list.add(new Integer(r)); } String s1 = null; String s2 = null; // do timing of standard approach long currtime = System.currentTimeMillis(); for (int i = 1; i <= 100; i++) { s1 = list.toString(); } long elapsed = System.currentTimeMillis() - currtime; System.out.println("standard toString time = " + elapsed); // do timing of custom approach currtime = System.currentTimeMillis(); for (int i = 1; i <= 100; i++) { s2 = localtoString(list); } elapsed = System.currentTimeMillis() - currtime; System.out.println("local toString time = " + elapsed); // check to make sure same strings are produced if (!s1.equals(s2)) { System.out.println("error"); } } } The PerfDemo program sets up a string cache for integer values 0-99. In other words, the program precomputes the string values that are the result of converting the integers 0-99 to strings. Using this cache, it's a simple matter within localtoString to call the intValue method on each Integer object, and then use the value as an index into the cache table. If you run PerfDemo, the result should look something like this: standard toString time = 1719 local toString time = 781 These results show that the local method runs more than twice as fast as the standard one. This is because the local method avoids integer-to-string conversions and string creation. The optimization is not general, in that you cannot expand the cache technique to handle any range of integers. You can term this performance technique "trading space for speed" or "precomputing results." Unfortunately, it becomes unreasonable if the cache table is too large. But many optimizations are of this type, that is, they achieve performance gains by exploiting knowledge about the particular data and operations in an application. Another angle on replacing standard library methods with your own, concerns reliability. The library methods have been exercised heavily, much more so that any method you are likely to write. Although it sometimes makes sense to use your own methods in preference to library methods, it's worth examining the tradeoffs. Looking at the PerfDemo example, you might ask whether you really care about the performance of the toString method, and whether speeding it up will actually improve the overall performance of your application. For more information about optimizing performance, see the book "Programming Pearls" by Jon Bentley. You can browse selections from the book at http://www.programmingpearls.com/. See especially Appendix 4 Rules for Code Tuning (http://www.programmingpearls.com/apprules.html). . . . . . . . . . . . . . . . . . . . . . . . - NOTE Sun respects your online time and privacy. The Java Developer Connection mailing lists are used for internal Sun Microsystems(tm) purposes only. You have received this email because you elected to subscribe. To unsubscribe, go to the Subscriptions page (http://developer.java.sun.com/subscription/), uncheck the appropriate checkbox, and click the Update button. - SUBSCRIBE To subscribe to a JDC newsletter mailing list, go to the Subscriptions page (http://developer.java.sun.com/subscription/), choose the newsletters you want to subscribe to, and click Update. - FEEDBACK Comments? Send your feedback on the JDC Tech Tips to: jdc-webmaster@sun.com - ARCHIVES You'll find the JDC Tech Tips archives at: http://java.sun.com/jdc/TechTips/index.html - COPYRIGHT Copyright 2001 Sun Microsystems, Inc. All rights reserved. 901 San Antonio Road, Palo Alto, California 94303 USA. This document is protected by copyright. For more information, see: http://java.sun.com/jdc/copyright.html - LINKS TO NON-SUN SITES The JDC Tech Tips may provide, or third parties may provide, links to other Internet sites or resources. Because Sun has no control over such sites and resources, You acknowledge and agree that Sun is not responsible for the availability of such external sites or resources, and does not endorse and is not responsible or liable for any Content, advertising, products, or other materials on or available from such sites or resources. Sun will not be responsible or liable, directly or indirectly, for any damage or loss caused or alleged to be caused by or in connection with use of or reliance on any such Content, goods or services available on or through any such site or resource. This issue of the JDC Tech Tips is written by Glen McCluskey. JDC Tech Tips April 10, 2001 Sun, Sun Microsystems, Java, and Java Developer Connection are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.