J D C   T E C H   T I P S

TIPS, TECHNIQUES, AND SAMPLE CODE

WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, April 11, 2000. This issue covers:

    * Formatting Decimal Numbers
    * Using Checksums

These tips were developed using Java(tm) 2 SDK, Standard Edition, v 1.2.2.

You can view this issue of the Tech Tips on the Web at http://developer.java.sun.com/developer/TechTips/2000/tt0411.html

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

FORMATTING DECIMAL NUMBERS

Suppose that you're developing a Java(tm) application that uses decimal numbers, and you'd like to control the formatting of these numbers for output purposes. How do you do this using the Java library?

Or perhaps you don't care about formatting, but you do care about making your application work in an international context. For example, a simple statement like:

    System.out.println(1234.56);

always prints "1234.56", whatever the locale; "." is used as a decimal point in the United States, but not necessarily everywhere else. How do you deal with this concern?

A couple of classes in the java.text package deal with these kinds of issues.
Here's a simple example that uses these classes to tackle the problem mentioned in the previous paragraph:

    import java.text.NumberFormat;
    import java.util.Locale;

    public class DecimalFormat1 {
        public static void main(String args[]) {
            // get format for default locale

            NumberFormat nf1 = NumberFormat.getInstance();
            System.out.println(nf1.format(1234.56));

            // get format for German locale

            NumberFormat nf2 =
                NumberFormat.getInstance(Locale.GERMAN);
            System.out.println(nf2.format(1234.56));
        }
    }

If you live in the United States and run this program, the output is:

    1,234.56
    1.234,56

In other words, different locales, in this case locales for the United States and for Germany, use different conventions for representing numbers.

NumberFormat.getInstance returns an instance of NumberFormat (actually a concrete subclass of NumberFormat such as DecimalFormat) that is suited for formatting numbers according to the default locale. You can also specify a non-default locale, such as "Locale.GERMAN". Then the format method is called to format a number according to the rules of a specific locale.

Note that the program could have done the formatting using a single expression:

    NumberFormat.getInstance().format(1234.56)

but it's more efficient to save a format and then reuse it.

Internationalization is a big issue when formatting numbers. Another is the ability to exercise fine control over formatting, for example, by specifying the number of decimal places.
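
As a rough sketch of what "fine control" can mean, the number of decimal places can be fixed on a locale-specific NumberFormat by setting its minimum and maximum fraction digits. The class name FractionDigitsSketch and its format helper are made up for illustration and are not part of the Java library; only the NumberFormat calls are library methods.

```java
import java.text.NumberFormat;
import java.util.Locale;

class FractionDigitsSketch {
    // Hypothetical helper: format a value with exactly `digits`
    // decimal places, following the conventions of `locale`.
    static String format(double value, int digits, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        nf.setMinimumFractionDigits(digits); // pad with zeros if needed
        nf.setMaximumFractionDigits(digits); // round if there are more
        return nf.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.56, 3, Locale.US));      // 1,234.560
        System.out.println(format(1234.56, 3, Locale.GERMANY)); // 1.234,560
    }
}
```

This keeps the locale's grouping and decimal symbols and changes only the digit counts. When the whole layout needs to be specified, a DecimalFormat pattern is the more direct tool.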
Here's another example that illustrates this idea:

    import java.text.DecimalFormat;
    import java.util.Locale;

    public class DecimalFormat2 {
        public static void main(String args[]) {
            // get format for default locale

            DecimalFormat df1 = new DecimalFormat("####.000");
            System.out.println(df1.format(1234.56));

            // get format for German locale

            Locale.setDefault(Locale.GERMAN);
            DecimalFormat df2 = new DecimalFormat("####.000");
            System.out.println(df2.format(1234.56));
        }
    }

In this example, a specific number format is set, using a notation like "####.000". This pattern means "four places before the decimal point, which are empty if not filled, and three places after the decimal point, which are 0 if not filled". The output of this program is:

    1234.560
    1234,560

In a similar way, it's possible to control exponent formatting, for example:

    import java.text.DecimalFormat;

    public class DecimalFormat3 {
        public static void main(String args[]) {
            DecimalFormat df = new DecimalFormat("0.000E0000");
            System.out.println(df.format(1234.56));
        }
    }

The output here is:

    1.235E0003

You can also work with percentages:

    import java.text.NumberFormat;

    public class DecimalFormat4 {
        public static void main(String args[]) {
            NumberFormat nf = NumberFormat.getPercentInstance();
            System.out.println(nf.format(0.47));
        }
    }

The output from this program is:

    47%

So far, you've seen various techniques for formatting numbers. What about going the other direction, that is, reading and parsing strings that contain formatted numbers? Parsing support is included in NumberFormat.
For example, you can say:

    import java.util.Locale;
    import java.text.NumberFormat;
    import java.text.ParseException;

    public class DecimalFormat5 {
        public static void main(String args[]) {
            // get format for default locale

            NumberFormat nf1 = NumberFormat.getInstance();
            Object obj1 = null;

            // parse number based on format

            try {
                obj1 = nf1.parse("1234,56");
            }
            catch (ParseException e1) {
                System.err.println(e1);
            }
            System.out.println(obj1);

            // get format for German locale

            NumberFormat nf2 =
                NumberFormat.getInstance(Locale.GERMAN);
            Object obj2 = null;

            // parse number based on format

            try {
                obj2 = nf2.parse("1234,56");
            }
            catch (ParseException e2) {
                System.err.println(e2);
            }
            System.out.println(obj2);
        }
    }

This example has two parts, both of them concerned with parsing an identical string: "1234,56". The first part uses the default locale, the second the German locale. When this program is run in the United States, the result is:

    123456
    1234.56

In other words, the string "1234,56" is interpreted as a large integer "123456" in the United States, but as a decimal number "1234.56" in the German locale.

There's one final point to be covered in this discussion of formatting. In the examples above, DecimalFormat and NumberFormat are both used. DecimalFormat is used to gain fine control over formatting, while NumberFormat is used to specify a locale other than the default. How do you combine these two classes?

The answer centers around the fact that DecimalFormat is a subclass of NumberFormat, a subclass whose instances are specific to a particular locale. So you can use NumberFormat.getInstance to specify a locale, and then cast the resulting instance to a DecimalFormat object. The documentation says that this technique will work in the vast majority of cases, but that you need to surround the cast with a try/catch block just in case it does not (presumably in a very obscure case with an exotic locale).
Such an approach looks like this:

    import java.text.DecimalFormat;
    import java.text.NumberFormat;
    import java.util.Locale;

    public class DecimalFormat6 {
        public static void main(String args[]) {
            DecimalFormat df = null;

            // get a NumberFormat object and cast it to
            // a DecimalFormat object

            try {
                df = (DecimalFormat)
                    NumberFormat.getInstance(Locale.GERMAN);
            }
            catch (ClassCastException e) {
                // no DecimalFormat is available, so stop here
                // rather than use a null reference below

                System.err.println(e);
                return;
            }

            // set a format pattern

            df.applyPattern("####.00000");

            // format a number

            System.out.println(df.format(1234.56));
        }
    }

The getInstance method obtains the format, then applyPattern is called to set a particular formatting pattern. The output of this program is:

    1234,56000

If you don't care about internationalization, it makes sense to use DecimalFormat directly.


USING CHECKSUMS

In the computer software field, a "checksum" is a value computed from a stream of bytes. The checksum is a signature for the bytes, that is, a value produced by combining the bytes using some algorithm. What's important is that changes or corruption in the byte stream can be detected with a high degree of probability.

An example of checksum use is found in data transmission. An application might transmit 100 bytes of information to another application across a network. The application appends to the bytes a 32-bit checksum that is computed from the values of the bytes. On the receiving end of the transmission, the checksum is computed again based on the 100 bytes that were received. If the checksum at the receiving end is different from the one computed at the transmitting end, then the data has been corrupted in some way.

A checksum is typically much smaller than the data it's calculated on. So it relies on a probabilistic model to catch most, but not all, errors in the data. Checksums closely resemble hash codes, in that an algorithm is applied in each case to compute a number from a sequence of bytes.

The class java.util.zip.CRC32 implements one of the standard checksum algorithms: CRC-32.
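
Before looking at a full application, here is a minimal sketch of the CRC32 class on its own. The class name Crc32Sketch and the helper crcOf are invented for this illustration, and the sketch assumes a JDK recent enough to provide java.nio.charset.StandardCharsets, which the 1.2.2 SDK used for these tips does not have. The string "123456789" is the conventional test input for CRC-32, and its checksum is the well-known value cbf43926 (hex).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class Crc32Sketch {
    // Hypothetical helper: CRC-32 of the ASCII bytes of a string.
    static long crcOf(String text) {
        CRC32 crc = new CRC32();
        crc.update(text.getBytes(StandardCharsets.US_ASCII));
        return crc.getValue();
    }

    public static void main(String[] args) {
        long original = crcOf("123456789");
        long altered = crcOf("123456780"); // last byte changed

        System.out.println(Long.toHexString(original)); // cbf43926
        System.out.println(original != altered);        // true
    }
}
```

The value comes back as a long because a 32-bit unsigned checksum does not fit in a Java int without going negative.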
To see how you might use checksums, consider the following application: you're writing some strings to a text file, and you'd like to know whether the string list has been modified after writing. For example, you'd like to find out if someone used a text editor to edit the file.

Here are two programs that make up the application. The first program writes a set of strings to a file, and computes a running checksum from the bytes of the string characters:

    import java.io.*;
    import java.util.zip.CRC32;

    public class Checksum1 {

        // list of names to write to a file

        static final String namelist[] = {
            "Jane Jones",
            "Tom Garcia",
            "Sally Smith",
            "Richard Robinson",
            "Jennifer Williams"
        };

        public static void main(String args[]) throws IOException {
            FileWriter fw = new FileWriter("out.txt");
            BufferedWriter bw = new BufferedWriter(fw);
            CRC32 checksum = new CRC32();

            // write the length of the list

            bw.write(Integer.toString(namelist.length));
            bw.newLine();

            // write each name and update the checksum

            for (int i = 0; i < namelist.length; i++) {
                String name = namelist[i];
                bw.write(name);
                bw.newLine();
                checksum.update(name.getBytes());
            }

            // write the checksum

            bw.write(Long.toString(checksum.getValue()));
            bw.newLine();

            bw.close();
        }
    }

The output of running this program is in a file "out.txt", with contents:

    5
    Jane Jones
    Tom Garcia
    Sally Smith
    Richard Robinson
    Jennifer Williams
    4113203990

The number on the last line is a checksum computed by combining all the bytes found in the string characters.
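
The following sketch tries to make "running checksum" concrete: feeding the names to one CRC32 object piece by piece gives the same value as feeding all of their bytes at once. The class name RunningChecksumSketch and the helper running are made up for illustration, and the explicit UTF-8 charset is an assumption of the sketch (the program above uses the platform default encoding).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class RunningChecksumSketch {
    // Hypothetical helper: update one CRC32 object with each piece
    // in turn, the way the loop in Checksum1 does for each name.
    static long running(String... parts) {
        CRC32 crc = new CRC32();
        for (String part : parts) {
            crc.update(part.getBytes(StandardCharsets.UTF_8));
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        long perName = running("Jane Jones", "Tom Garcia");
        long joined  = running("Jane JonesTom Garcia");
        long shifted = running("Jane Jone", "sTom Garcia");

        System.out.println(perName == joined);  // true
        System.out.println(perName == shifted); // true
    }
}
```

The second comparison shows a limit of checksumming only the name bytes: the line breaks and the count line are not covered, so moving a character from the end of one name to the start of the next leaves the value unchanged. One possible refinement, not part of the original programs, would be to update the checksum with a separator byte after each name.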
The second program reads the file:

    import java.io.*;
    import java.util.zip.CRC32;

    public class Checksum2 {
        public static void main(String args[]) throws IOException {
            FileReader fr = new FileReader("out.txt");
            BufferedReader br = new BufferedReader(fr);
            CRC32 checksum = new CRC32();

            // read the number of names from the file

            int len = Integer.parseInt(br.readLine());

            // read each name from the file and update the checksum

            String namelist[] = new String[len];
            for (int i = 0; i < len; i++) {
                namelist[i] = br.readLine();
                checksum.update(namelist[i].getBytes());
            }

            // read the checksum

            long cs = Long.parseLong(br.readLine());

            br.close();

            // if checksum doesn't match, give error,
            // else display the list of names

            if (cs != checksum.getValue()) {
                System.err.println("*** bad checksum ***");
            }
            else {
                for (int i = 0; i < len; i++) {
                    System.out.println(namelist[i]);
                }
            }
        }
    }

This program reads the list of names from the file and displays the names. If you edit "out.txt" with a text editor, and change one of the names, for example changing "Tom" to "Thomas", the program will compute a different checksum, and display a checksum error message.

Now, you might think that a person could maliciously change the text file, compute a new checksum, and change that as well. This is possible. CRC-32 is a published algorithm, so anyone who knows it is in use can recompute the value. The checksum guards against accidental corruption and against edits by a casual user, to whom the algorithm is not obvious, but not against a determined attacker.

Another way of using checksums is through the CheckedInputStream and CheckedOutputStream classes in java.util.zip. These classes support computation of a running checksum on an I/O stream.

. . . . . . . . . . . . . . . . . . . . . . .

- NOTE

The names on the JDC mailing list are used for internal Sun Microsystems(tm) purposes only. To remove your name from the list, see Subscribe/Unsubscribe below.

- FEEDBACK

Comments?
Send your feedback on the JDC Tech Tips to:

    jdc-webmaster@sun.com

- SUBSCRIBE/UNSUBSCRIBE

The JDC Tech Tips are sent to you because you elected to subscribe when you registered as a JDC member. To unsubscribe from JDC email, go to the following address and enter the email address you wish to remove from the mailing list:

    http://developer.java.sun.com/unsubscribe.html

To become a JDC member and subscribe to this newsletter go to:

    http://java.sun.com/jdc/

- ARCHIVES

You'll find the JDC Tech Tips archives at:

    http://developer.java.sun.com/developer/TechTips/index.html

- COPYRIGHT

Copyright 2000 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

    http://developer.java.sun.com/developer/copyright.html

This issue of the JDC Tech Tips is written by Glen McCluskey.

JDC Tech Tips
April 11, 2000