A CharsetDecoder is a
"decoding engine" that converts a
sequence of bytes into a sequence of characters based on the encoding
of some charset. Obtain a CharsetDecoder by calling the
newDecoder( ) method of the Charset that represents the charset to
be decoded.
If you have a complete sequence of bytes to be decoded in a
ByteBuffer, you can pass that buffer to the
one-argument version of decode( ).
This convenience method decodes the bytes and stores the resulting
characters into a newly allocated CharBuffer,
resetting and flushing the decoder as necessary. It throws a
CharacterCodingException if there are problems with the bytes to be
decoded.
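As a minimal sketch of the one-argument form, the following decodes a complete byte array in a single call. The class and method names (OneShotDecode, decodeUtf8) are hypothetical and chosen only for illustration, as is the choice of UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical helper class, not part of java.nio.charset.
class OneShotDecode {
    // Decodes a complete byte sequence as UTF-8 in one call.
    static String decodeUtf8(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
        // The one-argument decode( ) resets, decodes, and flushes for us.
        CharBuffer chars = decoder.decode(ByteBuffer.wrap(bytes));
        return chars.toString();
    }

    public static void main(String[] args) {
        try {
            byte[] good = { 0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9 };
            System.out.println(decodeUtf8(good));
            // A lone 0xC3 is a truncated two-byte sequence.
            decodeUtf8(new byte[] { (byte) 0xC3 });
        } catch (CharacterCodingException e) {
            System.out.println("Could not decode: " + e);
        }
    }
}
```

Because the default error action is REPORT, the second call in main( ) should end in the catch block rather than return a string.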
Typically, however, the three-argument version of decode( )
is used in a multistep decoding process:
1. Call the reset( ) method, unless this is the first
   time the CharsetDecoder has been used.
2. Call the three-argument version of decode( ) one
   or more times. The third argument should be true
   on, and only on, the last invocation of the method. The first
   argument to decode( ) is a
   ByteBuffer that contains bytes to be decoded. The
   second argument is a CharBuffer into which the
   resulting characters are stored. The return value of the method is a
   CoderResult object that specifies the state of the
   ongoing decoding operation. The possible
   CoderResult return values are detailed below. In a
   typical case, however, decode( ) returns after it
   has decoded all of the bytes in the input buffer. In this case, you
   would then typically fill the input buffer with more bytes to be
   decoded, and read characters from the output buffer, calling its
   compact( ) method to make room for more. If an
   unexpected problem arises in the CharsetDecoder
   implementation, decode( ) throws a
   CoderMalfunctionError.
3. Pass the output CharBuffer to the flush( ) method
   to allow any remaining characters to be output.
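The multistep process can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the class StreamingDecode and its methods are hypothetical, the input is a byte array fed in small slices to imitate a stream, and inSize is assumed to be large enough (8 or more for UTF-8) to hold the longest multibyte sequence of the charset. The output buffer is drained completely after each call, so it is cleared rather than compacted.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical helper class, not part of java.nio.charset.
class StreamingDecode {
    static String decodeInSlices(byte[] data, Charset cs, int inSize)
            throws CharacterCodingException {
        CharsetDecoder decoder = cs.newDecoder();
        ByteBuffer in = ByteBuffer.allocate(inSize);
        CharBuffer out = CharBuffer.allocate(4); // deliberately small
        StringBuilder sb = new StringBuilder();
        int offset = 0;
        boolean endOfInput;
        CoderResult cr;

        decoder.reset(); // harmless on a new decoder, required on reuse
        do {
            // Fill the input buffer with the next slice of bytes.
            int n = Math.min(in.remaining(), data.length - offset);
            in.put(data, offset, n);
            offset += n;
            endOfInput = (offset == data.length);
            in.flip();
            // Decode, draining the output buffer whenever it fills up.
            do {
                cr = decoder.decode(in, out, endOfInput);
                drain(out, sb);
            } while (cr.isOverflow());
            if (cr.isError()) cr.throwException();
            in.compact(); // keep any bytes of an incomplete character
        } while (!endOfInput);

        // Final step: let the decoder emit any remaining characters.
        do {
            cr = decoder.flush(out);
            drain(out, sb);
        } while (cr.isOverflow());
        return sb.toString();
    }

    // Moves the decoded characters into sb and empties the buffer.
    private static void drain(CharBuffer out, StringBuilder sb) {
        out.flip();
        sb.append(out);
        out.clear();
    }

    public static void main(String[] args) throws CharacterCodingException {
        Charset utf8 = Charset.forName("UTF-8");
        byte[] data = "h\u00e9llo w\u00f6rld \u2603".getBytes(utf8);
        System.out.println(decodeInSlices(data, utf8, 8));
    }
}
```

With a real stream you would refill the input buffer from a channel instead of an array, and pass true for the third argument only once the channel reports end-of-stream.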
The decode( ) method returns a
CoderResult that indicates the state of the
decoding operation. If the return value is
CoderResult.UNDERFLOW,
decode( ) returned because all bytes from the
input buffer have been read and more input is required. If the
return value is CoderResult.OVERFLOW,
decode( ) returned because the output
CharBuffer is full and no more characters can be
decoded into it. Otherwise, the return value is a
CoderResult whose isError( )
method returns true. There are two basic types of
decoding errors. If isMalformed( ) returns
true, the input included bytes that are not
legal for the charset. These bytes start at the position of the input
buffer and continue for length( ) bytes.
Otherwise, if isUnmappable( ) returns
true, the input bytes include a character for
which there is no representation in Unicode. The relevant bytes start
at the position of the input buffer and continue for length( ) bytes.
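The following sketch shows one way to inspect a CoderResult after a single three-argument decode( ) call. The class InspectResult, the method describe( ), and the wording of the returned strings are all hypothetical; UTF-8 is assumed only so that the example has a concrete charset to work with.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical helper class, not part of java.nio.charset.
class InspectResult {
    // Decodes bytes as UTF-8 and describes the resulting CoderResult.
    static String describe(byte[] bytes) {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder();
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never yields more chars than bytes, so this is enough.
        CharBuffer out = CharBuffer.allocate(bytes.length);
        CoderResult cr = decoder.decode(in, out, true);

        if (cr.isUnderflow()) {
            return "underflow: all input consumed";
        } else if (cr.isOverflow()) {
            return "overflow: output buffer is full";
        } else if (cr.isMalformed()) {
            // The bad bytes begin at the input buffer's position.
            return "malformed: " + cr.length()
                + " byte(s) at input position " + in.position();
        } else { // cr.isUnmappable()
            return "unmappable: " + cr.length()
                + " byte(s) at input position " + in.position();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new byte[] { 'o', 'k' }));
        // 0xFF can never appear in well-formed UTF-8.
        System.out.println(describe(new byte[] { 'a', 'b', (byte) 0xFF, 'c' }));
    }
}
```

To continue past an error that has been reported this way, a caller would typically advance the input buffer's position by length( ) bytes before calling decode( ) again.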
By default a CharsetDecoder reports all malformed
input and unmappable character errors by returning a
CoderResult object as described above. This
behavior can be altered, however, by passing a
CodingErrorAction to onMalformedInput( )
and onUnmappableCharacter( ). (Query
the current action for these types of errors with
malformedInputAction( ) and
unmappableCharacterAction( ).)
CodingErrorAction defines three constants that
represent the three possible actions. The default action is
REPORT. The action IGNORE tells
the CharsetDecoder to ignore (i.e., skip) malformed
input and unmappable characters. The REPLACE action
tells the CharsetDecoder to replace malformed
input and unmappable characters with the replacement string. This
replacement string can be set with replaceWith( ),
and can be queried with replacement( ).
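A short sketch of the three actions applied to the same malformed input follows. The class ErrorActions and the method decodeWith( ) are hypothetical, and the "?" replacement string is an arbitrary choice made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, not part of java.nio.charset.
class ErrorActions {
    // Decodes bytes as UTF-8 using the given action for both error types.
    static String decodeWith(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
            .onMalformedInput(action)
            .onUnmappableCharacter(action);
        if (action == CodingErrorAction.REPLACE) {
            decoder.replaceWith("?"); // arbitrary replacement string
        }
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        byte[] bytes = { 'a', (byte) 0xFF, 'b' }; // 0xFF is malformed UTF-8
        try {
            System.out.println(decodeWith(bytes, CodingErrorAction.IGNORE));
            System.out.println(decodeWith(bytes, CodingErrorAction.REPLACE));
            System.out.println(decodeWith(bytes, CodingErrorAction.REPORT));
        } catch (CharacterCodingException e) {
            System.out.println("REPORT raised: " + e);
        }
    }
}
```

The configuration methods return the decoder itself, which is what allows the calls to be chained.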
averageCharsPerByte( ) and
maxCharsPerByte( ) return the average and maximum
number of characters that are produced by this decoder per decoded
byte. These values can be used to help you choose the size of the
CharBuffer to allocate for decoding.
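As an illustration of sizing, the sketch below allocates an output buffer for the worst case reported by maxCharsPerByte( ), which lets a known quantity of bytes be decoded in one pass. The class BufferSizing and its methods are hypothetical names introduced for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical helper class, not part of java.nio.charset.
class BufferSizing {
    // Allocates a buffer big enough for the worst case of byteCount bytes.
    static CharBuffer allocateFor(CharsetDecoder decoder, int byteCount) {
        double worstCase = byteCount * (double) decoder.maxCharsPerByte();
        return CharBuffer.allocate((int) Math.ceil(worstCase));
    }

    // Decodes all of bytes in one pass using a worst-case output buffer.
    static String decodeAll(byte[] bytes, Charset cs)
            throws CharacterCodingException {
        CharsetDecoder decoder = cs.newDecoder();
        CharBuffer out = allocateFor(decoder, bytes.length);
        CoderResult cr = decoder.decode(ByteBuffer.wrap(bytes), out, true);
        if (cr.isError()) cr.throwException();
        if (cr.isOverflow() || decoder.flush(out).isOverflow()) {
            // Not expected when the buffer is sized for the worst case.
            throw new IllegalStateException("output buffer too small");
        }
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        Charset latin1 = Charset.forName("ISO-8859-1");
        System.out.println(decodeAll(new byte[] { 'h', 'i' }, latin1));
    }
}
```

Using averageCharsPerByte( ) instead gives a smaller initial buffer at the cost of having to handle OVERFLOW and grow or drain the buffer.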
CharsetDecoder is not a thread-safe class. Only
one thread should use an instance at a time.
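One possible way to respect this restriction, sketched here as an assumption rather than a recommended pattern, is to give each thread its own decoder through a ThreadLocal. The class PerThreadDecoder and its decode( ) method are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical helper class, not part of java.nio.charset.
class PerThreadDecoder {
    // Each thread lazily creates and then reuses its own decoder.
    private static final ThreadLocal<CharsetDecoder> DECODER =
        new ThreadLocal<CharsetDecoder>() {
            @Override
            protected CharsetDecoder initialValue() {
                return Charset.forName("UTF-8").newDecoder();
            }
        };

    static String decode(byte[] bytes) throws CharacterCodingException {
        // The one-argument decode( ) resets the decoder before use,
        // so the per-thread instance can be reused safely.
        return DECODER.get().decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        System.out.println(decode(new byte[] { 'h', 'i' }));
    }
}
```

Creating a new decoder for each operation with Charset.newDecoder( ) is a simpler alternative when decoders are not needed often.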
CharsetDecoder is an abstract class. Implementors
defining new charsets will need to subclass
CharsetDecoder and define the abstract
decodeLoop( ) method, which is invoked by
decode( ).
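A minimal sketch of such a subclass follows. It maps each byte directly to the character with the same value, which is how ISO-8859-1 behaves. The class name SimpleLatin1Decoder is hypothetical, and for brevity it borrows the existing ISO-8859-1 Charset object; a real charset implementation would pass its own Charset subclass to the constructor.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical decoder used only to illustrate decodeLoop( ).
class SimpleLatin1Decoder extends CharsetDecoder {
    SimpleLatin1Decoder(Charset cs) {
        super(cs, 1.0f, 1.0f); // exactly one char per byte
    }

    @Override
    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW; // caller must drain out
            }
            // Mask to treat the byte as an unsigned value 0..255.
            out.put((char) (in.get() & 0xFF));
        }
        return CoderResult.UNDERFLOW; // all input consumed
    }

    public static void main(String[] args) throws CharacterCodingException {
        CharsetDecoder decoder =
            new SimpleLatin1Decoder(Charset.forName("ISO-8859-1"));
        byte[] bytes = { (byte) 0xE9, 'A' };
        System.out.println(decoder.decode(ByteBuffer.wrap(bytes)));
    }
}
```

Every byte value is legal in this encoding, so decodeLoop( ) never needs to return a malformed-input or unmappable-character result; decoders for other charsets would report those cases with CoderResult.malformedForLength( ) and CoderResult.unmappableForLength( ).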
public abstract class CharsetDecoder {
// Protected Constructors
protected CharsetDecoder(Charset cs,
float averageCharsPerByte, float maxCharsPerByte);
// Public Instance Methods
public final float averageCharsPerByte( );
public final Charset charset( );
public final java.nio.CharBuffer decode(java.nio.ByteBuffer in)
throws CharacterCodingException;
public final CoderResult decode(java.nio.ByteBuffer in,
java.nio.CharBuffer out, boolean endOfInput);
public Charset detectedCharset( );
public final CoderResult flush(java.nio.CharBuffer out);
public boolean isAutoDetecting( ); // constant
public boolean isCharsetDetected( );
public CodingErrorAction malformedInputAction( );
public final float maxCharsPerByte( );
public final CharsetDecoder onMalformedInput(CodingErrorAction newAction);
public final CharsetDecoder onUnmappableCharacter(CodingErrorAction
newAction);
public final String replacement( );
public final CharsetDecoder replaceWith(String newReplacement);
public final CharsetDecoder reset( );
public CodingErrorAction unmappableCharacterAction( );
// Protected Instance Methods
protected abstract CoderResult decodeLoop(java.nio.ByteBuffer in,
java.nio.CharBuffer out);
protected CoderResult implFlush(java.nio.CharBuffer out);
protected void implOnMalformedInput(CodingErrorAction
newAction); // empty
protected void implOnUnmappableCharacter(CodingErrorAction
newAction); // empty
protected void implReplaceWith(String
newReplacement); // empty
protected void implReset( ); // empty
}