From: JDCTechTips@sun.com
Subject: JDC Tech Tips February 29, 2000
Date: Tue, 29 Feb 2000 22:55:13 GMT

J D C   T E C H   T I P S

TIPS, TECHNIQUES, AND SAMPLE CODE

WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips,
February 29, 2000. This issue focuses on serialization. The tip has
four parts:

    * Serialization in the Real World
    * Serialization and Class Versioning
    * Serialization and Secure Data
    * Serialization and the Complete Class Rewrite

This tip was developed using Java(tm) 2 SDK, Standard Edition,
v 1.2.2.

You can view this issue of the Tech Tips on the Web at
http://developer.java.sun.com/developer/TechTips/2000/tt0229.html.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

SERIALIZATION IN THE REAL WORLD

The Java(tm) serialization mechanism illustrates two of the best
characteristics of the Java(tm) programming language: simplicity and
flexibility. Serialization allows you to create persistent objects,
that is, objects that can be stored and then reconstituted for later
use. You might want to do this, for example, if you want to use an
object with a program and then use the object again with a later
invocation of the same program. The basic mechanism of serialization
is simple.
And it's flexible enough for you to customize default serialization
as needed. This tip shows you how to serialize objects. It then shows
you three situations where you can take advantage of the mechanism's
flexibility: introducing a new version of a class, securing protected
data, and completely rewriting a class.

First, take a look at this basic example:

    import java.io.*;

    public class Person implements Serializable {
        public String firstName;
        public String lastName;
        private String password;
        // transient: not written to the stream
        transient Thread worker;

        public Person(String firstName, String lastName,
                      String password) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.password = password;
        }

        public String toString() {
            return lastName + ", " + firstName;
        }
    }

    class WritePerson {
        public static void main(String [] args) {
            Person p = new Person("Fred", "Wesley", "cantguessthis");
            ObjectOutputStream oos = null;
            try {
                oos = new ObjectOutputStream(
                    new FileOutputStream("Person.ser"));
                oos.writeObject(p);
            }
            catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                if (oos != null) {
                    // errors during cleanup are deliberately ignored
                    try {oos.flush();} catch (IOException ioe) {}
                    try {oos.close();} catch (IOException ioe) {}
                }
            }
        }
    }

    class ReadPerson {
        public static void main(String [] args) {
            ObjectInputStream ois = null;
            try {
                ois = new ObjectInputStream(
                    new FileInputStream("Person.ser"));
                Object o = ois.readObject();
                System.out.println("Read object " + o);
            }
            catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                if (ois != null) {
                    // errors during cleanup are deliberately ignored
                    try {ois.close();} catch (IOException ioe) {}
                }
            }
        }
    }

Person is a class that represents data you'd like to make persistent.
You might want to archive it to disk and reload in a later session.
Java technology makes this easy. All you need to do is declare that
the Person class implements the java.io.Serializable interface. The
Serializable interface does not have any methods. It's simply a
"signal" interface that indicates to the Java virtual machine that
you want to use the default serialization mechanism.

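
As a minimal sketch of what the signal interface buys you, the
following program tries to serialize two nearly identical classes to
an in-memory buffer. The class names Marked, Unmarked, and MarkerDemo
are hypothetical and are not part of the Person example, and the code
assumes a current JDK rather than 1.2.2. The only difference between
the two data classes is that one implements Serializable; writing an
instance of the other should fail with
java.io.NotSerializableException.

```java
import java.io.*;

// Hypothetical class that opts in to default serialization.
class Marked implements Serializable {
    String name = "Fred";
}

// Hypothetical class with the same field but no Serializable marker.
class Unmarked {
    String name = "Fred";
}

class MarkerDemo {
    // Returns true if default serialization accepts the object,
    // false if it is rejected with NotSerializableException.
    static boolean canSerialize(Object o) {
        try {
            ObjectOutputStream oos =
                new ObjectOutputStream(new ByteArrayOutputStream());
            oos.writeObject(o);
            oos.close();
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e.toString());
        }
    }

    public static void main(String[] args) {
        System.out.println("Marked:   " + canSerialize(new Marked()));
        System.out.println("Unmarked: " + canSerialize(new Unmarked()));
    }
}
```

The in-memory ByteArrayOutputStream is used here only to keep the
sketch from touching the Person.ser file used in the rest of this tip.
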
Compile Person and then test the code by first running WritePerson.
WritePerson wraps a FileOutputStream for the file Person.ser in an
ObjectOutputStream and writes the Person object to it. This means it
formats the object as a stream of bytes and saves it in the
Person.ser file. Then, execute ReadPerson. This creates an
ObjectInputStream on a FileInputStream for Person.ser. In other
words, it reads the byte stream from Person.ser and reconstitutes the
Person object from it. ReadPerson then prints the object. You should
see:

    Read object Wesley, Fred

The serialization mechanism you just used is capable of handling a
wide variety of situations. When you serialize an object you save the
complete state of the object, including all of its fields. This even
includes fields marked private, such as the password field in the
Person example.

However, there are times when you don't want a field to be
persistent. In the Person example, the worker thread is tied to
resources that are specific to this session of the virtual machine.
It does not make any sense to serialize the thread for later use.
Fortunately, the Java(tm) programming language includes the
declaration transient. A field marked transient is not saved when an
object is serialized. Notice that the worker thread is declared
transient, so it is not saved when the Person object is serialized.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

SERIALIZATION AND CLASS VERSIONING

A place where default serialization usually runs into trouble is when
you make a simple enhancement to a class. Imagine that after shipping
your Person class, you decide to track a Person's age.
The modification to the Person class is straightforward:

    public class Person implements Serializable {
        public String firstName;
        public String lastName;
        int age;
        private String password;
        transient Thread worker;

        public Person(String firstName, String lastName,
                      String password, int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.password = password;
            this.age = age;
        }

        public String toString() {
            return lastName + ", " + firstName + " age " + age;
        }
    }

    class WritePerson {
        public static void main(String [] args) {
            Person p = new Person("Fred", "Wesley",
                                  "cantguessthis", 31);
            //everything past this point is the same as the original...

What happens if somebody tries to use this new version of Person to
stream in an old Person.ser file? Try it by executing ReadPerson
again. (Don't run WritePerson first, or you will overwrite the old
Person.ser file.) Notice that you can no longer read the file.
Instead, you get a java.io.InvalidClassException. This is because the
Java serialization mechanism is very cautious with modified classes.
When a class is serialized, a 64-bit "fingerprint" for the class is
calculated. This fingerprint, which is called the serialVersionUID,
is based on several pieces of class data, including all the
serializable fields. Because you added a new field (age) to the
class, the serialVersionUID no longer matches, and you cannot read
your old Person.ser file.

The cautious approach is nice, because it prevents nasty bugs that
might appear if two versions of a class were truly incompatible in
some way. However, you might reasonably argue that the new Person
class is compatible with the old one. Also, your code is aware that
the age value might not be set correctly when loading Person in its
original format. In this situation, you need a way to tell Java that
two classes are compatible. You can do this by explicitly setting the
serialVersionUID for the Person class.
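
The fingerprint can also be inspected from code. As a rough sketch,
java.io.ObjectStreamClass reports the serialVersionUID that the
serialization runtime associates with a class, whether the value was
computed from the class or declared explicitly. The classes Computed,
Declared, and UidDemo and the value 42L are hypothetical, chosen only
for illustration, and the code assumes a current JDK rather than
1.2.2.

```java
import java.io.*;

// Hypothetical class that lets the runtime compute its fingerprint.
class Computed implements Serializable {
    String name;
}

// Hypothetical class that declares its own fingerprint.
class Declared implements Serializable {
    static final long serialVersionUID = 42L;
    String name;
}

class UidDemo {
    // Returns the serialVersionUID that serialization uses for a
    // class. The class passed in must implement Serializable.
    static long uidOf(Class<?> c) {
        return ObjectStreamClass.lookup(c).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Computed: " + uidOf(Computed.class));
        System.out.println("Declared: " + uidOf(Declared.class));
    }
}
```

The computed value depends on the details of the class, so no
particular number is shown for it here. The declared value is
reported exactly as written.
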
If you add a line of the form:

    static final long serialVersionUID = /* some long integer */;

to a class, Java serialization will use that ID, instead of
calculating one for you. Of course, this piece of information is
coming a little late, since you already saved the original Person
using some Java-generated ID. Despair not. The serialver command-line
tool in JDK(tm) 1.2 lets you extract a serialVersionUID from an
existing class. Recompile the original Person class, and issue the
command:

    serialver Person

In response, you should see:

    static final long serialVersionUID = 4070409649129120458L;

Add this entire line to the new version of Person, and recompile. Now
you can successfully load the original Person by running ReadPerson.
The age is not correct (it's set to a default value, 0), because the
original format didn't have an age field. At the very least, you have
access to all of the data you serialized with the first version of
the Person class.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

SERIALIZATION AND SECURE DATA

Earlier, you saw that Java serialization works even with private
data. This is necessary because private data is usually an essential
part of an object's state. Without the private fields, serialization
would be meaningless. However, this presents a problem. In the Person
example above, Fred Wesley trusts that nobody can see his password,
since the password field is private. With serialization, you can
bypass this protection by dumping Fred's Person instance to a file.
Open the Person.ser file in a hex editor, and Fred's password
("cantguessthis") is visible to all the world.

To fix this security exposure, you need to control the way that data
is written to the stream.
The Java programming language allows you to do this with the
following two methods:

    private void writeObject(ObjectOutputStream stream)
        throws IOException;
    private void readObject(ObjectInputStream stream)
        throws IOException, ClassNotFoundException;

For serializable objects, the writeObject method allows a class to
control the serialization of its own fields. The readObject method
allows a class to control the deserialization of its own fields. What
this means is that if you implement these methods in a Serializable
class, they will replace the normal serialization behavior.

Using writeObject and readObject allows you to do almost anything
with the stream. But in the Person case all you really need to do is
encrypt the password. Once the password is encrypted, you can let the
normal serialization mechanism take over. You can defer to the normal
mechanism by calling the methods defaultReadObject and
defaultWriteObject. Here's a Person class that puts this all
together:

    import java.io.*;

    public class Person implements Serializable {
        public String firstName;
        public String lastName;
        int age;
        private String password;
        transient Thread worker;
        static final long serialVersionUID = 4070409649129120458L;

        //This is not a serious encryption algorithm! It works
        //but you should substitute something better.
        static String crypt(String input, int offset) {
            StringBuffer sb = new StringBuffer();
            for (int n=0; n