Date: Tue, 9 Jan 2001 21:37:48 GMT
From: JDCTechTips@sun.com
Subject: JDC Tech Tips January 9, 2001

 J D C   T E C H   T I P S

 TIPS, TECHNIQUES, AND SAMPLE CODE

WELCOME to the Java Developer Connection(sm) (JDC) Tech Tips, January 9, 2001. This issue covers:

    * Using the java.lang.Character Class
    * Handling Uncaught Exceptions

These tips were developed using Java(tm) 2 SDK, Standard Edition, v 1.3.

You can view this issue of the Tech Tips on the Web at
http://java.sun.com/jdc/JDCTechTips/2001/tt0109.html

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

USING THE JAVA.LANG.CHARACTER CLASS

java.lang.Character is a wrapper class for the primitive type char. Like other wrappers such as Integer, it is used to represent primitive values in object form, so that collection classes that only know about Object references can manipulate char values. The Character class is also used to group together methods and constants used in handling Unicode characters.

This tip looks at some of the ways you can use Character.
The first example shows how the class is used as a wrapper:

    import java.util.*;

    public class CharDemo1 {
        public static void main(String args[]) {
            List list = new ArrayList();
            list.add(new Character('a'));
            list.add(new Character('b'));
            list.add(new Character('c'));
            for (int i = 0; i < list.size(); i++) {
                System.out.println(list.get(i));
            }
        }
    }

In this example, three Character objects representing the letters a, b, and c are added to an ArrayList. Then the contents of the list are displayed.

The Character class contains a lot of "isX" methods, such as "isDigit". You might think that these methods aren't really necessary, because it's simpler to say something like:

    if (c >= '0' && c <= '9') ...

if you want to test whether a character is a digit. This code actually works in some contexts, but it has a big problem. It doesn't account for the fact that Java uses the Unicode character set rather than the ASCII character set. For example, if you run this program:

    public class CharDemo2 {
        public static void main(String args[]) {
            int dig_count = 0;
            int def_count = 0;
            for (int i = 0; i <= 0xffff; i++) {
                if (Character.isDigit((char)i)) {
                    dig_count++;
                }
                if (Character.isDefined((char)i)) {
                    def_count++;
                }
            }
            System.out.println("number of digits = " + dig_count);
            System.out.println("number of defined = " + def_count);
        }
    }

it reports that the Unicode character set contains 159 characters that are classified as digits. This example also illustrates another interesting point: not all possible Unicode character values have meaning. The program reports that Character.isDefined returns true for 47400 of 65536 characters.

Another place where the Character class is useful is in converting from upper case characters to lower case characters.
Here's an example:

    public class CharDemo3 {
        public static void main(String args[]) {
            char cupper = 'A';
            char clower;

            // convert to lower case using the ASCII convention
            clower = (char)(cupper + 0x20);
            System.out.println("cupper #1 = " + cupper);
            System.out.println("clower #1 = " + clower);
            System.out.println();

            // convert to lower case using Character.toLowerCase()
            clower = Character.toLowerCase(cupper);
            System.out.println("cupper #2 = " + cupper);
            System.out.println("clower #2 = " + clower);
        }
    }

If you've used the ASCII character set, it's common to convert to lower case by adding 0x20 (decimal 32) to an upper case letter. This approach works in the demo program, but again fails to take into account the Unicode character set. The key obstacle is that in Unicode, upper and lower case equivalents aren't guaranteed to be exactly 0x20 values apart. So in this situation, it's preferable to use the toLowerCase method of the Character class.

The Character class also contains several methods for converting between character and integer values. These are used, for example, in Integer.parseInt, to convert number strings in a specified base into integers, as is the case in the following statement:

    Integer.parseInt("-ff", 16) == -255

Here's a program that illustrates these methods:

    public class CharDemo4 {
        public static void main(String args[]) {
            // return the numeric value of 'z' considered as
            // a digit in base 36
            int dig = Character.digit('z', 36);
            System.out.println(dig);

            // return the character value for the
            // specified digit in base 36
            char cdig = Character.forDigit(dig, 36);
            System.out.println(cdig);

            // return the numeric value of \u217c
            int rn50 = Character.getNumericValue('\u217c');
            System.out.println(rn50);
        }
    }

Character.digit returns the numeric value of a character considered as a digit in a given radix. So, for example, in base 36, digits have the values 0-9 and a-z, and thus 'z' has the value 35.
Character.forDigit reverses the process; the appropriate digit as a character for the value 35 in base 36 is 'z'. Character.getNumericValue returns the numeric value of a character digit, using the value specified in an internal table called the Unicode Attribute Table. For example, the Unicode character \u217c is the small Roman numeral fifty (it looks like a lower case "l"), which has a value of 50.

The Unicode Attribute Table is also used to specify the type of a Unicode character. Types are categories such as punctuation, currency symbols, letters, and so on. Here's a simple program that displays the hexadecimal values of all the characters classified as currency symbols:

    public class CharDemo5 {
        public static void main(String args[]) {
            for (int i = 0; i <= 0xffff; i++) {
                if (Character.getType((char)i) ==
                        Character.CURRENCY_SYMBOL) {
                    System.out.println(Integer.toHexString(i));
                }
            }
        }
    }

There are 27 such symbols. The first one listed, 0x24, corresponds to the familiar '$' character.

A final example of how you can use the Character class has to do with Unicode character blocks. These blocks are used to group related characters. Some examples are BASIC_LATIN, ARABIC, GEORGIAN, ARROWS, and KANBUN. Here's a demo program that prints all character values in the GREEK character block:

    public class CharDemo6 {
        public static void main(String args[]) {
            for (int i = 0; i <= 0xffff; i++) {
                if (Character.UnicodeBlock.of((char)i) ==
                        Character.UnicodeBlock.GREEK) {
                    System.out.println(Integer.toHexString(i));
                }
            }
        }
    }

To learn more about java.lang.Character, see section 11.1.3 Character, and Table 7 Unicode Character Blocks in Appendix B Useful Tables in "The Java Programming Language Third Edition" by Arnold, Gosling, and Holmes (http://java.sun.com/docs/books/javaprog/thirdedition/).


HANDLING UNCAUGHT EXCEPTIONS

If you've done much programming in the Java(tm) programming language, you've probably encountered applications that terminate abnormally with an uncaught exception.
Here's a program that does just that:

    public class ExcDemo1 {
        public static void main(String args[]) {
            int vec[] = new int[10];
            vec[10] = 37;
        }
    }

In this example, the program terminates abnormally due to an uncaught exception. The program throws an exception because of an illegal array access to vec[10] (the valid indexes of vec are 0 to 9).

Before examining some techniques for handling uncaught exceptions, let's look at the rules for how the Java(tm) Virtual Machine* terminates a program. The first rule is that an uncaught exception terminates the thread in which it occurs. The second rule is that a program terminates when there are no more user threads available. Here's an example:

    class MyThread extends Thread {
        public void run() {
            try {
                Thread.sleep(5 * 1000);
            }
            catch (InterruptedException e) {
                System.err.println(e);
            }
            System.out.println("MyThread thread still alive");
        }
    }

    public class ExcDemo2 {
        public static void main(String args[]) {
            new MyThread().start();
            int vec[] = new int[10];
            vec[10] = 37;
        }
    }

The main thread shuts down almost immediately, due to an unhandled exception. But there's an instance of MyThread that remains active for approximately five seconds, and completes normally.

So how do you handle uncaught exceptions? The first approach is very simple -- you put a try...catch block around the top-level method that invokes the application:

    public class ExcDemo3 {
        static void app() {
            int vec[] = new int[10];
            vec[10] = 37;
        }

        public static void main(String args[]) {
            try {
                app();
            }
            catch (Exception e) {
                System.err.println("uncaught exception: " + e);
            }
        }
    }

All the exceptions that an application typically tries to catch are subclasses of java.lang.Exception, and so are caught by this technique. This excludes throwables like OutOfMemoryError, which are descendants of java.lang.Error. If you really want to catch everything (not necessarily a good idea), you need to use a "catch (Throwable e)" clause.

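To make the difference between the two catch clauses concrete, here is a minimal sketch. The class name CatchScopeDemo, its two helper methods, and the two task classes are hypothetical names invented for this illustration, and the OutOfMemoryError is constructed by hand to simulate a failure rather than produced by actually exhausting memory.

```java
public class CatchScopeDemo {

    // Hypothetical helper: runs the task behind a "catch (Exception e)" clause.
    static String runCatchingException(Runnable task) {
        try {
            task.run();
            return "completed";
        }
        catch (Exception e) {
            return "caught Exception: " + e.getClass().getName();
        }
    }

    // Hypothetical helper: runs the task behind a "catch (Throwable t)" clause.
    static String runCatchingThrowable(Runnable task) {
        try {
            task.run();
            return "completed";
        }
        catch (Throwable t) {
            return "caught Throwable: " + t.getClass().getName();
        }
    }

    // Fails with a subclass of java.lang.Exception (same bug as ExcDemo3).
    static class BadIndexTask implements Runnable {
        public void run() {
            int vec[] = new int[10];
            vec[10] = 37;
        }
    }

    // Fails with a subclass of java.lang.Error; the error is simulated.
    static class ErrorTask implements Runnable {
        public void run() {
            throw new OutOfMemoryError("simulated");
        }
    }

    public static void main(String args[]) {
        // An Exception subclass is handled by either clause.
        System.out.println(runCatchingException(new BadIndexTask()));

        // An Error is handled by the Throwable clause ...
        System.out.println(runCatchingThrowable(new ErrorTask()));

        // ... but passes straight through the Exception clause.
        try {
            runCatchingException(new ErrorTask());
        }
        catch (Error e) {
            System.out.println("escaped catch (Exception e): " + e);
        }
    }
}
```

In this sketch the array index failure is reported by the Exception clause, while the simulated OutOfMemoryError is reported only by the Throwable clause and otherwise propagates to the caller.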
What if you want to extend this technique to multiple threads? An obvious approach is to say:

    class MyThread extends Thread {
        public void run() {
            int vec[] = new int[10];
            vec[10] = 37;
        }
    }

    public class ExcDemo4 {
        public static void main(String args[]) {
            try {
                new MyThread().start();
            }
            catch (Exception e) {
                System.err.println("uncaught exception: " + e);
            }
        }
    }

Unfortunately, this approach doesn't actually work -- it simply catches exceptions caused by the start method itself, namely IllegalThreadStateException which is thrown when the thread has previously been started.

So it's necessary to get a little more sophisticated, and override the uncaughtException method in the ThreadGroup class. A ThreadGroup object represents a group of threads. There is a method in ThreadGroup that is called when a thread within the group is about to die because of an uncaught exception. Here's what the code looks like:

    class MyThreadGroup extends ThreadGroup {
        public MyThreadGroup(String s) {
            super(s);
        }

        public void uncaughtException(Thread t, Throwable e) {
            System.err.println("uncaught exception: " + e);
            //super.uncaughtException(t, e);
        }
    }

    class MyThread extends Thread {
        public MyThread(ThreadGroup tg, String n) {
            super(tg, n);
        }

        public void run() {
            int vec[] = new int[10];
            vec[10] = 37;
        }
    }

    public class ExcDemo5 {
        public static void main(String args[]) {
            ThreadGroup tg = new MyThreadGroup("mygroup");
            Thread t = new MyThread(tg, "mythread");
            t.start();
        }
    }

The code example creates a subclass of ThreadGroup, and overrides the uncaughtException method. This overridden method is called for a dying thread; the thread object and exception are passed as parameters to the method.

By default, uncaughtException invokes the uncaughtException method on the thread group's parent group object. If there is no such group, the exception's printStackTrace method is called to display a stack trace.
You can see what the default behavior looks like by commenting out the "System.err.println" line and uncommenting the "super.uncaughtException(t, e)" line.

Further reading: sections 10.12 Thread and Exceptions, and 18.3 Shutdown in "The Java Programming Language Third Edition" by Arnold, Gosling, and Holmes (http://java.sun.com/docs/books/javaprog/thirdedition/).

. . . . . . . . . . . . . . . . . . . . . . .

- NOTE

Sun respects your online time and privacy. The Java Developer Connection mailing lists are used for internal Sun Microsystems(tm) purposes only. You have received this email because you elected to subscribe. To unsubscribe, go to the Subscriptions page (http://developer.java.sun.com/subscription/), uncheck the appropriate checkbox, and click the Update button.

- SUBSCRIBE

To subscribe to a JDC newsletter mailing list, go to the Subscriptions page (http://developer.java.sun.com/subscription/), choose the newsletters you want to subscribe to, and click Update.

- FEEDBACK

Comments? Send your feedback on the JDC Tech Tips to:

jdc-webmaster@sun.com

- ARCHIVES

You'll find the JDC Tech Tips archives at:

http://java.sun.com/jdc/TechTips/index.html

- COPYRIGHT

Copyright 2001 Sun Microsystems, Inc. All rights reserved.
901 San Antonio Road, Palo Alto, California 94303 USA.

This document is protected by copyright. For more information, see:

http://java.sun.com/jdc/copyright.html

This issue of the JDC Tech Tips is written by Glen McCluskey.

JDC Tech Tips
January 9, 2001

* As used in this document, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.