Received: from PACIFIC-CARRIER-ANNEX.MIT.EDU by po10.MIT.EDU (5.61/4.7)
        id AA10169; Wed, 17 Mar 99 04:55:44 EST
Received: from hermes.javasoft.com by MIT.EDU with SMTP
        id AA01523; Wed, 17 Mar 99 04:56:03 EST
Received: by hermes.java.sun.com (SMI-8.6/SMI-SVR4)
        id KAA08020; Wed, 17 Mar 1999 10:14:14 GMT
Date: Wed, 17 Mar 1999 10:14:14 GMT
Message-Id: <199903171014.KAA08020@hermes.java.sun.com>
X-Mailing: 99
From: JDCTechTips@sun.com
Subject: JDC Tech Tips Vol. 2 No. 8
To: JDCMember@sun.com
Reply-To: JDCTechTips@sun.com
Errors-To: JDCMailErrors@sun.com
Precedence: junk
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Beyond Email 2.1

J D C   T E C H   T I P S

TIPS, TECHNIQUES, AND SAMPLE CODE

WELCOME to the Java Developer Connection(sm) Tech Tips, Vol. 2 No. 8.
This issue covers:

        * Collators
        * Big Decimal

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

T I P S ,  T E C H N I Q U E S ,  A N D  S A M P L E  C O D E

COLLATORS

If you've used Java much at all, you've probably had occasion to
compare strings using the String.compareTo method. This method does
lexicographical comparison, which is a fancy way of saying that the
numerical values of corresponding Unicode characters in the strings
are compared. For example, the letter "a" has a numeric value of
0x61, "b" 0x62, and so on.

Such comparisons are useful, but they are not always adequate,
particularly in an internationalization context. Suppose, for
example, that you'd like lower and upper case characters to compare
as identical, or you want accents on letters to be ignored. The
collator classes in java.text can be used for this purpose, that is,
to build locale-sensitive string comparison methods.
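As a small illustration of the lexicographical comparison described
above, the following sketch prints a few String.compareTo results and
sorts a short list using the default string ordering. The class name
CompareToDemo, the helper sortedCopy, and the sample words are made up
for this illustration and are not part of the original tip.

```java
import java.util.Arrays;

// Hypothetical demo class (not from the original tip): shows that
// String.compareTo orders strings by the numeric values of their
// characters.
class CompareToDemo {
    // Returns a copy of the input sorted with String's natural
    // ordering, that is, by String.compareTo.
    static String[] sortedCopy(String[] words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // 'a' is 0x61 and 'b' is 0x62, so the difference is -1
        System.out.println("a".compareTo("b"));   // prints -1

        // 'a' is 0x61 and 'A' is 0x41, so the difference is 32
        System.out.println("a".compareTo("A"));   // prints 32

        // every upper case ASCII letter sorts before every lower
        // case one, so "Zebra" comes first
        String[] words = { "apple", "Zebra", "banana" };
        System.out.println(Arrays.toString(sortedCopy(words)));
        // prints [Zebra, apple, banana]
    }
}
```

An ordering that places "apple" before "Zebra" needs more than
compareTo, which is the gap the collator classes are meant to fill.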

To see how collators work, consider an example:

    import java.text.*;
    import java.util.*;

    public class collate {
        public static void main(String args[])
        {
            Collator coll = Collator.getInstance(Locale.US);

            // compare returns an int; a result of 0 means "equal"
            coll.setStrength(Collator.TERTIARY);
            System.out.println(
                coll.compare("a", "A") == 0);       // false

            coll.setStrength(Collator.SECONDARY);
            System.out.println(
                coll.compare("a", "A") == 0);       // true

            coll.setStrength(Collator.SECONDARY);
            System.out.println(
                coll.compare("a", "\u00e0") == 0);  // false

            coll.setStrength(Collator.PRIMARY);
            System.out.println(
                coll.compare("a", "\u00e0") == 0);  // true

            coll.setStrength(Collator.IDENTICAL);
            System.out.println(
                coll.compare("a", "b") == 0);       // false

            // collation keys are compared with compareTo
            CollationKey key1 = coll.getCollationKey("abc");
            CollationKey key2 = coll.getCollationKey("def");
            System.out.println(
                key1.compareTo(key2) == 0);         // false
        }
    }

The first line:

    Collator coll = Collator.getInstance(Locale.US);

retrieves a new collator, according to the locale settings applicable
to the United States.

Then a series of string comparisons is done, in each case setting a
strength before performing the comparison. Collator.compare returns
an integer that is negative, zero, or positive, so each println
reports whether the result is zero, that is, whether the two strings
are considered equal at the current strength.

A strength specifies what level of difference is considered important
in the comparison. Four different strengths can be defined:
IDENTICAL, PRIMARY, SECONDARY, and TERTIARY. PRIMARY is the most
lenient setting and IDENTICAL the strictest. The meaning of each
strength depends on the specific locale. For example, in the US
locale, upper versus lower case is considered a TERTIARY difference,
less important than a SECONDARY difference. If the strength is set
to TERTIARY, then case is significant.

An example of a SECONDARY difference is accents on letters. The
Unicode letter "\u00e0", defined to be:

    00E0;LATIN SMALL LETTER A WITH GRAVE

is considered different from "a" when comparing using a SECONDARY
strength setting, but identical when using a PRIMARY one. These
rules may of course be different for some other locale.

A final point about this example concerns efficiency.
If you are performing repeated string comparisons using collators,
it may be more efficient to use CollationKey objects instead of
Collator.compare. A CollationKey holds a string in a precomputed
form that can be compared directly against other keys produced by
the same collator, which aids performance when the same strings are
compared many times.

If you are developing applications that operate in an international
context, then this whole area is one that needs to be considered.


BIGDECIMAL

The java.math package contains two classes, BigInteger and
BigDecimal. BigInteger represents arbitrary-precision integers, with
arithmetic operations such as addition and division supported, along
with comparison and hashing methods. A BigDecimal consists of an
arbitrary-precision integer along with a scale, where the scale is
the number of digits to the right of the decimal point.

These classes can be used in applications requiring high-precision
numbers. Financial applications sometimes require such precision, as
do some kinds of numerical programming problems. One example is
computing numerical constants to a high degree of precision. The
mathematical constant "e" can be defined as the sum of the infinite
series:

    1/0! + 1/1! + 1/2! + 1/3! + ...

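To see why arbitrary precision is needed here, the series above can
first be summed with ordinary double arithmetic. The following is
only a sketch: the class name SeriesDemo, the method name partialSum,
and the choice of 20 terms are assumptions made for this
illustration. A double carries roughly 15 to 17 significant decimal
digits, so no number of terms can yield 40 places this way.

```java
// Hypothetical demo class (not from the original tip): sums the first
// terms of 1/0! + 1/1! + 1/2! + ... using double arithmetic.
class SeriesDemo {
    // Returns the sum of 1/k! for k = 0 .. terms-1.
    static double partialSum(int terms) {
        double sum = 0.0;
        double factorial = 1.0;      // holds k!, starting with 0! = 1
        for (int k = 0; k < terms; k++) {
            sum += 1.0 / factorial;
            factorial *= (k + 1);    // advance from k! to (k+1)!
        }
        return sum;
    }

    public static void main(String[] args) {
        // 20 terms is an arbitrary choice for this sketch
        System.out.println(partialSum(20));
        System.out.println(Math.E);  // library value, for comparison
    }
}
```

The two printed values agree to about the limit of double precision,
which is far short of 40 places.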

A program that uses BigDecimal to compute this constant to 40 places
is:

    import java.math.*;

    public class bige {
        public static void main(String args[])
        {
            BigDecimal one = new BigDecimal("1");
            BigDecimal curfact = new BigDecimal("1");
            BigDecimal factmul = new BigDecimal("1");
            BigDecimal curval = new BigDecimal("0");
            String curout = "";

            // number of desired decimal places
            final int NP = 40;

            for (;;) {

                // divide 1 by the current factorial
                BigDecimal x = one.divide(curfact, NP + 1,
                    BigDecimal.ROUND_HALF_EVEN);

                // add the result to the accumulated value
                curval = curval.add(x);

                // move to the next factorial value
                curfact = curfact.multiply(factmul);
                factmul = factmul.add(one);

                // check convergence of the current value
                String s =
                    curval.toString().substring(0, NP + 2);
                if (s.equals(curout)) {
                    System.out.println(s);
                    break;
                }
                curout = s;
            }
        }
    }

During the calculation, an extra digit is carried to help the
rounding behavior. The loop stops when the first 40 decimal places
of the accumulated value no longer change from one term to the next.

The rounding behavior itself is ROUND_HALF_EVEN, which means "round
toward the nearest neighbor", or if both neighbors are equidistant,
round toward the even one. For example, division using this rounding
mode, assuming one decimal place, comes out like this:

    755 / 100 = 7.6 (7.5 and 7.6 are equidistant, round toward 6)

    745 / 100 = 7.4 (7.4 and 7.5 are equidistant, round toward 4)

Because ties are rounded up about as often as they are rounded down,
this rounding method minimizes cumulative error over repeated
calculations.

The output of the program is:

    2.7182818284590452353602874713526624977572

which is a correct value for "e" to 40 places.

. . . . . . . . . . . . . . . . . . . . . . . .

- NOTE

The names on the JDC mailing list are used for internal Sun
Microsystems(tm) purposes only. To remove your name from the list,
see Subscribe/Unsubscribe below.

- FEEDBACK

Comments? Send your feedback on the JDC Tech Tips to:

JDCTechTips@Sun.com

- SUBSCRIBE/UNSUBSCRIBE

The JDC Tech Tips are sent to you because you elected to subscribe
when you registered as a JDC member.
To unsubscribe from JDC Email, go to the following address and enter
the email address you wish to remove from the mailing list:

http://developer.java.sun.com/unsubscribe.html

To become a JDC member and subscribe to this newsletter go to:

http://java.sun.com/jdc/

- ARCHIVES

You'll find the JDC Tech Tips archives at:

http://developer.java.sun.com/developer/javaInDepth/TechTips/index.html

- COPYRIGHT

Copyright 1999 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://developer.java.sun.com/developer/copyright.html

The JDC Tech Tips are written by Glen McCluskey.

JDC Tech Tips Vol. 2 No. 8
March 16, 1999