com.dalsemi.io
Class ByteToCharConverter

java.lang.Object
  |
  +--com.dalsemi.io.ByteToCharConverter

Direct Known Subclasses:
ByteToCharISO8859_1, ByteToCharUTF8

public abstract class ByteToCharConverter
extends Object

This class defines an interface to allow conversion of bytes to characters for a particular encoding scheme. Encoding converters should reside in the com.dalsemi.io package.

Many encoding schemes need to take state into account during conversion. That is, the conversion to a char might depend on the bytes converted before it. To accommodate this, a ByteToCharConverter can remember state between conversions (that is, between calls to convert()). The caller should therefore call the flush() method once all input has been supplied, to finalize the conversion and reset the converter's internal state.

Subclasses of this abstract class need to implement getMaxCharCount(), convert(), flush(), and getName().
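
As an illustration of those four methods, here is a minimal sketch of a stateless ISO 8859-1 style converter. SketchByteToCharConverter is a hypothetical stand-in that only mirrors the abstract signatures listed in this document; it is not the real com.dalsemi.io class, and the name returned by getName() is an assumption. The decode() helper calls the converter directly only to show the size, convert, flush sequence.

```java
import java.io.CharConversionException;

// Hypothetical stand-in mirroring the four abstract signatures in this document.
// It is NOT the real com.dalsemi.io.ByteToCharConverter.
abstract class SketchByteToCharConverter {
    abstract int getMaxCharCount(byte[] forThis, int start, int end);
    abstract int convert(byte[] src, int srcStart, int srcEnd,
                         char[] dst, int dstStart, int dstEnd) throws CharConversionException;
    abstract int flush(char[] buff, int start, int end) throws CharConversionException;
    abstract String getName();
}

// ISO 8859-1 maps each byte 0x00-0xFF to the char with the same value,
// so this converter never needs to carry state between calls.
final class SketchLatin1Converter extends SketchByteToCharConverter {
    int getMaxCharCount(byte[] forThis, int start, int end) {
        return end - start; // exactly one char per byte
    }

    int convert(byte[] src, int srcStart, int srcEnd, char[] dst, int dstStart, int dstEnd) {
        int out = dstStart;
        for (int i = srcStart; i < srcEnd; i++) {
            dst[out++] = (char) (src[i] & 0xFF); // mask to avoid sign extension
        }
        return out - dstStart; // number of chars stored
    }

    int flush(char[] buff, int start, int end) {
        return 0; // nothing is ever left pending
    }

    String getName() {
        return "ISO8859_1"; // assumed name, chosen for illustration
    }

    // Caller-side sequence: size the destination, convert, then flush.
    static String decode(byte[] bytes) {
        SketchByteToCharConverter c = new SketchLatin1Converter();
        char[] chars = new char[c.getMaxCharCount(bytes, 0, bytes.length)];
        try {
            int n = c.convert(bytes, 0, bytes.length, chars, 0, chars.length);
            n += c.flush(chars, n, chars.length);
            return new String(chars, 0, n);
        } catch (CharConversionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] { 0x54, 0x49, 0x4E, 0x49 })); // prints TINI
    }
}
```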

Programs should not call into a converter directly. A better method of executing byte array conversions is to use the java.lang.String(byte[],String) constructor.


      ...
      //byte[] preConvertedBytes is previously declared and
      //has a sequence of UTF8 encoded bytes
      String str = new String(preConvertedBytes,"UTF8");
      ...
 

This will convert the bytes stored in preConvertedBytes into a String according to the UTF8 encoding scheme.
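
A self-contained version of the fragment above might look like the following sketch. The class name, the helper method, and the sample bytes are made up for illustration; the String(byte[],String) constructor declares java.io.UnsupportedEncodingException, which the helper rethrows unchecked.

```java
import java.io.UnsupportedEncodingException;

final class StringDecodeSketch {
    // Hypothetical helper: decodes UTF8 bytes through the String constructor.
    static String decodeUtf8(byte[] bytes) {
        try {
            return new String(bytes, "UTF8");
        } catch (UnsupportedEncodingException e) {
            // Thrown when no converter is available for the named encoding.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] preConvertedBytes = { 0x54, 0x49, 0x4E, 0x49 }; // sample ASCII-range data
        System.out.println(decodeUtf8(preConvertedBytes));     // prints TINI
    }
}
```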

See Also:
CharToByteConverter

Constructor Summary
ByteToCharConverter()
           
 
Method Summary
abstract  int convert(byte[] src, int srcStart, int srcEnd, char[] dst, int dstStart, int dstEnd)
          Converts the specified byte array into a char array based on this ByteToCharConverter's encoding scheme.
abstract  int flush(char[] buff, int start, int end)
          Tells the ByteToCharConverter to convert any unconverted data it has internally stored.
static ByteToCharConverter getConverter(String name)
          Dynamically loads a ByteToCharConverter for the specified encoding scheme.
static ByteToCharConverter getDefaultConverter()
          Returns the default ByteToCharConverter for the system.
abstract  int getMaxCharCount(byte[] forThis, int start, int end)
          Returns the maximum number of characters that converting the specified byte sequence will produce.
abstract  String getName()
          Returns the name of this encoding scheme.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ByteToCharConverter

public ByteToCharConverter()

Method Detail

getConverter

public static ByteToCharConverter getConverter(String name)
Dynamically loads a ByteToCharConverter for the specified encoding scheme. All converters should be placed in the com.dalsemi.io package, and have class name ByteToCharNAME, where NAME is the encoding scheme. For example, the UTF8 ByteToCharConverter is called com.dalsemi.io.ByteToCharUTF8.
Parameters:
name - the name of the encoding scheme
Returns:
converter for the specified encoding scheme, or null if none could be found
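
The naming convention can be illustrated with a small sketch. How getConverter() actually loads the class is not stated here; the reflection-based lookup below is an assumption, and both helper names are hypothetical.

```java
final class ConverterLookupSketch {
    // Builds the class name implied by the ByteToCharNAME convention.
    static String classNameFor(String encoding) {
        return "com.dalsemi.io.ByteToChar" + encoding;
    }

    // Assumed lookup strategy: instantiate the class by name through its
    // no-argument constructor, and return null if that fails for any reason.
    static Object loadOrNull(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String name = classNameFor("UTF8");
        System.out.println(name); // prints com.dalsemi.io.ByteToCharUTF8
        System.out.println(loadOrNull(name) != null);
    }
}
```

On a JVM where the TINI classes are not on the classpath, the second line is expected to print false, which mirrors the documented null return.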

getDefaultConverter

public static ByteToCharConverter getDefaultConverter()
Returns the default ByteToCharConverter for the system. The name of the default encoding scheme is stored in the system property "file.encoding". This method finds the name of the default encoding scheme, and calls getConverter() with that name as its argument.
Returns:
converter for the system's default file encoding property, or null if the converter could not be found
See Also:
getConverter(java.lang.String)
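
A rough sketch of the lookup described above, assuming the property value is appended unchanged to the ByteToChar prefix. The helper name is hypothetical, and it takes a Properties argument only so that the behavior can be exercised without depending on the host system.

```java
import java.util.Properties;

final class DefaultEncodingSketch {
    // Returns the converter class name implied by "file.encoding",
    // or null when the property is not set.
    static String defaultConverterClassName(Properties props) {
        String encoding = props.getProperty("file.encoding");
        return encoding == null ? null : "com.dalsemi.io.ByteToChar" + encoding;
    }

    public static void main(String[] args) {
        // The value printed depends on the host system's configuration.
        System.out.println(defaultConverterClassName(System.getProperties()));
    }
}
```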

getMaxCharCount

public abstract int getMaxCharCount(byte[] forThis,
                                    int start,
                                    int end)
Returns the maximum number of characters that converting the specified byte sequence will produce. For instance, in UTF8 encoding, a one-, two-, or three-byte sequence may convert to a single char. This method should always be called before the convert() method. Because of conversion errors, the value returned may be larger than the number of characters actually produced, but it is never smaller.
Parameters:
forThis - the byte sequence to determine the required encoding size
start - offset into the byte array to begin processing
end - the ending offset in the byte array to stop processing. The number of processed bytes will then be (end-start).
Returns:
The maximum number of characters required to hold the converted byte sequence
See Also:
convert(byte[],int,int,char[],int,int)
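
One plausible way to compute such an upper bound for UTF8 is to count the bytes that are not continuation bytes (those of the form 10xxxxxx), since each sequence has exactly one lead byte. This is a sketch, not necessarily what ByteToCharUTF8 does. It assumes the range starts on a sequence boundary and contains only the one- to three-byte sequences mentioned above.

```java
final class Utf8MaxCharCountSketch {
    // Counts lead bytes in [start, end). Assumes no partial sequence is
    // pending from an earlier call and no four-byte sequences are present.
    static int maxCharCount(byte[] bytes, int start, int end) {
        int count = 0;
        for (int i = start; i < end; i++) {
            if ((bytes[i] & 0xC0) != 0x80) { // not a 10xxxxxx continuation byte
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 'A' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes)
        byte[] bytes = { 0x41, (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC };
        System.out.println(maxCharCount(bytes, 0, bytes.length)); // prints 3
    }
}
```

A simpler but looser bound is (end - start), because no UTF8 sequence produces more chars than it has bytes.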

convert

public abstract int convert(byte[] src,
                            int srcStart,
                            int srcEnd,
                            char[] dst,
                            int dstStart,
                            int dstEnd)
                     throws CharConversionException
Converts the specified byte array into a char array based on this ByteToCharConverter's encoding scheme. getMaxCharCount() should always be called first to find out how much room is required in the destination char array.
Parameters:
src - the same byte array passed to getMaxCharCount()
srcStart - the same starting offset as passed to getMaxCharCount()
srcEnd - the same ending offset as passed to getMaxCharCount()
dst - the destination character array.
dstStart - the offset at which to begin storing converted characters in the destination array
dstEnd - the ending offset for storing converted characters in the destination array. Implementations may ignore this argument and continue converting until all input has been consumed.
Returns:
number of characters created and stored from this byte sequence
Throws:
CharConversionException - if an illegal byte sequence is encountered that cannot be converted
See Also:
getMaxCharCount(byte[],int,int), flush(char[],int,int)

flush

public abstract int flush(char[] buff,
                          int start,
                          int end)
                   throws CharConversionException
Tells the ByteToCharConverter to convert any unconverted data it has internally stored. Some ByteToCharConverters store state between calls to convert(). Because the converter may therefore be left holding a partially converted sequence, it should be flushed to notify it that no more input will be received. This gives the converter a chance to complete any unfinished conversions before its output is used.
Parameters:
buff - the destination character array
start - the next available offset into the destination array
end - offset in the destination array to stop placing data (may be ignored by some algorithms)
Returns:
number of characters that were stored in the destination array from this call to flush()
Throws:
CharConversionException - if the pending data contains an illegal byte sequence that cannot be converted
See Also:
convert(byte[],int,int,char[],int,int)
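
The need for a final flush is easiest to see when a multi-byte sequence is split across two chunks of input. The sketch below uses the standard java.nio.charset.CharsetDecoder as an analogy only; it is not the com.dalsemi.io API. One difference to note: CharsetDecoder leaves an incomplete trailing sequence in the caller's input buffer, whereas the converter described here is said to store unconverted data internally. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

final class SplitSequenceSketch {
    // Decodes UTF-8 input that arrives in two chunks, where a multi-byte
    // sequence may straddle the chunk boundary.
    static String decodeInTwoChunks(byte[] first, byte[] second) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        int total = first.length + second.length;
        ByteBuffer in = ByteBuffer.allocate(total);
        CharBuffer out = CharBuffer.allocate(total); // chars never exceed bytes in UTF-8

        in.put(first);
        in.flip();
        decoder.decode(in, out, false); // false: more input will follow
        in.compact();                   // keep any bytes of an unfinished sequence

        in.put(second);
        in.flip();
        decoder.decode(in, out, true);  // true: this is the last of the input
        decoder.flush(out);             // finalize; analogous to flush() above

        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00E9 is encoded as 0xC3 0xA9; the two bytes arrive in different chunks.
        byte[] first = { 0x61, (byte) 0xC3 };
        byte[] second = { (byte) 0xA9 };
        System.out.println(decodeInTwoChunks(first, second).length()); // prints 2
    }
}
```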

getName

public abstract String getName()
Returns the name of this encoding scheme. For example, "UTF8".
Returns:
String representing the name of this encoding scheme.