java.lang.Object
- org.apache.lucene.analysis.util.CharacterUtils

```
public abstract class CharacterUtils
extends java.lang.Object
```
CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations based on a Version instance.

Nested Class Summary

Nested Classes
Modifier and Type Class Description

static class CharacterUtils.CharacterBuffer
A simple IO buffer to use with fill(CharacterBuffer, Reader).

Constructor Summary

Constructors
Constructor Description

CharacterUtils()

Method Summary

Modifier and Type	Method	Description
`abstract int`	`codePointAt(char[] chars, int offset, int limit)`	Returns the code point at the given index of the char array where only elements with index less than the limit are used.
`abstract int`	`codePointAt(java.lang.CharSequence seq, int offset)`	Returns the code point at the given index of the `CharSequence`.
`abstract int`	`codePointCount(java.lang.CharSequence seq)`	Returns the number of code points in `seq`.
`boolean`	`fill(CharacterUtils.CharacterBuffer buffer, java.io.Reader reader)`	Convenience method which calls `fill(buffer, reader, buffer.buffer.length)`.
`abstract boolean`	`fill(CharacterUtils.CharacterBuffer buffer, java.io.Reader reader, int numChars)`	Fills the `CharacterUtils.CharacterBuffer` with characters read from the given `Reader`.
`static CharacterUtils`	`getInstance(Version matchVersion)`	Returns a `CharacterUtils` implementation according to the given `Version` instance.
`static CharacterUtils`	`getJava4Instance()`	Returns a `CharacterUtils` instance compatible with Java 1.4.
`static CharacterUtils.CharacterBuffer`	`newCharacterBuffer(int bufferSize)`	Creates a new `CharacterUtils.CharacterBuffer` and allocates a `char[]` of the given bufferSize.
`abstract int`	`offsetByCodePoints(char[] buf, int start, int count, int index, int offset)`	Returns the index within `buf[start:start+count]` that is offset from `index` by `offset` code points.
`int`	`toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff)`	Converts a sequence of Unicode code points to a sequence of Java characters.
`int`	`toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff)`	Converts a sequence of Java characters to a sequence of Unicode code points.
`void`	`toLowerCase(char[] buffer, int offset, int limit)`	Converts each Unicode code point to lower case via `Character.toLowerCase(int)` starting at the given offset.
`void`	`toUpperCase(char[] buffer, int offset, int limit)`	Converts each Unicode code point to upper case via `Character.toUpperCase(int)` starting at the given offset.

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - CharacterUtils
```
public CharacterUtils()
```
- Method Detail
  - getInstance
```
public static CharacterUtils getInstance(Version matchVersion)
```
    Returns a CharacterUtils implementation according to the given Version instance.
    
    Parameters:
    
    matchVersion - a version instance
    
    Returns:
    
    a CharacterUtils implementation according to the given Version instance.
  - getJava4Instance
```
public static CharacterUtils getJava4Instance()
```
    Returns a CharacterUtils instance compatible with Java 1.4.
  - codePointAt
```
public abstract int codePointAt(java.lang.CharSequence seq,
                                int offset)
```
    Returns the code point at the given index of the CharSequence. Depending on the Version passed to getInstance(Version), this method mimics the behavior of Character.codePointAt(CharSequence, int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
    
    Parameters:
    
    seq - a character sequence
    
    offset - the index of the char value in the sequence to be converted
    
    Returns:
    
    the Unicode code point at the given index
    
    Throws:
    
    java.lang.NullPointerException - if the sequence is null.
    
    java.lang.IndexOutOfBoundsException - if the value offset is negative or not less than the length of the character sequence.
  - codePointAt
```
public abstract int codePointAt(char[] chars,
                                int offset,
                                int limit)
```
    Returns the code point at the given index of the char array where only elements with index less than the limit are used. Depending on the Version passed to getInstance(Version) this method mimics the behavior of Character.codePointAt(char[], int) as it would have been available on a Java 1.4 JVM or on a later virtual machine version.
    
    Parameters:
    
    chars - a character array
    
    offset - the offset to the char values in the chars array to be converted
    
    limit - the index after the last element that should be used to calculate the code point.
    
    Returns:
    
    the Unicode code point at the given index
    
    Throws:
    
    java.lang.NullPointerException - if the array is null.
    
    java.lang.IndexOutOfBoundsException - if the value offset is negative or not less than the length of the char array.
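    The difference between the two behaviors can be illustrated with plain JDK calls. The following is a minimal sketch, not the library's implementation: the class and method names are hypothetical, and the assumption is that the Java 1.4 style treats every char as its own code point while the later style delegates to Character.codePointAt(char[], int, int).

```java
public class CodePointAtSketch {

    // Assumed Java 1.4 style: each char is returned as-is, surrogates included.
    static int codePointAtJava4(char[] chars, int offset, int limit) {
        if (offset < 0 || offset >= limit) {
            throw new IndexOutOfBoundsException("offset must be in [0, limit)");
        }
        return chars[offset];
    }

    // Supplementary-aware style: a valid surrogate pair below the limit is combined.
    static int codePointAtJava5(char[] chars, int offset, int limit) {
        return Character.codePointAt(chars, offset, limit);
    }

    public static void main(String[] args) {
        // 'a' followed by U+1D11E, which UTF-16 encodes as a surrogate pair.
        char[] chars = "a\uD834\uDD1E".toCharArray();

        System.out.println(Integer.toHexString(codePointAtJava4(chars, 1, chars.length))); // d834
        System.out.println(Integer.toHexString(codePointAtJava5(chars, 1, chars.length))); // 1d11e

        // With limit = 2 the low surrogate is excluded, so only the high surrogate is returned.
        System.out.println(Integer.toHexString(codePointAtJava5(chars, 1, 2))); // d834
    }
}
```

    The last call shows why limit matters: elements at or beyond limit are never consulted, even when they would complete a surrogate pair.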
  - codePointCount
```
public abstract int codePointCount(java.lang.CharSequence seq)
```
    Returns the number of code points in seq.
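    As a rough illustration of how a code point count can differ from a char count, here is a small sketch built on the JDK's Character.codePointCount(CharSequence, int, int). The class name and helper are hypothetical and are not part of CharacterUtils.

```java
public class CodePointCountSketch {

    // Counts code points in the whole sequence; a valid surrogate pair counts once.
    static int codePointCount(CharSequence seq) {
        return Character.codePointCount(seq, 0, seq.length());
    }

    public static void main(String[] args) {
        // 'a', U+1D11E (one code point, two chars), 'b'
        String s = "a\uD834\uDD1Eb";
        System.out.println(s.length());        // 4
        System.out.println(codePointCount(s)); // 3
    }
}
```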
  - newCharacterBuffer
```
public static CharacterUtils.CharacterBuffer newCharacterBuffer(int bufferSize)
```
    Creates a new CharacterUtils.CharacterBuffer and allocates a char[] of the given bufferSize.
    
    Parameters:
    
    bufferSize - the internal char buffer size, must be >= 2
    
    Returns:
    
    a new CharacterUtils.CharacterBuffer instance.
  - toLowerCase
```
public final void toLowerCase(char[] buffer,
                              int offset,
                              int limit)
```
    Converts each Unicode code point to lower case via Character.toLowerCase(int) starting at the given offset.
    
    Parameters:
    
    buffer - the char buffer to lowercase
    
    offset - the offset to start at
    
    limit - the index after the last char in the buffer to lower case
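    In-place case conversion by code point can be sketched with JDK calls only. This is an illustrative sketch under stated assumptions, not the library's code: the class name is hypothetical, limit is treated as an exclusive end index, and the lower-case form is assumed to occupy the same number of chars as the original code point.

```java
public class LowerCaseSketch {

    // Lower-cases buffer[offset, limit) one code point at a time, in place.
    static void toLowerCase(char[] buffer, int offset, int limit) {
        for (int i = offset; i < limit; ) {
            int codePoint = Character.codePointAt(buffer, i, limit);
            // Character.toChars writes 1 or 2 chars and returns how many it wrote.
            i += Character.toChars(Character.toLowerCase(codePoint), buffer, i);
        }
    }

    public static void main(String[] args) {
        // The last two chars encode U+10400, a supplementary upper-case letter.
        char[] buffer = "HeLLo \uD801\uDC00".toCharArray();
        toLowerCase(buffer, 0, buffer.length);
        // The supplementary letter becomes U+10428; a char-by-char loop would leave it unchanged.
        System.out.println(new String(buffer).equals("hello \uD801\uDC28")); // true
    }
}
```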
  - toUpperCase
```
public final void toUpperCase(char[] buffer,
                              int offset,
                              int limit)
```
    Converts each Unicode code point to upper case via Character.toUpperCase(int) starting at the given offset.
    
    Parameters:
    
    buffer - the char buffer to upper case
    
    offset - the offset to start at
    
    limit - the index after the last char in the buffer to upper case
  - toCodePoints
```
public final int toCodePoints(char[] src,
                              int srcOff,
                              int srcLen,
                              int[] dest,
                              int destOff)
```
    Converts a sequence of Java characters to a sequence of Unicode code points.
    
    Returns:
    
    the number of code points written to the destination buffer
  - toChars
```
public final int toChars(int[] src,
                         int srcOff,
                         int srcLen,
                         char[] dest,
                         int destOff)
```
    Converts a sequence of Unicode code points to a sequence of Java characters.
    
    Returns:
    
    the number of chars written to the destination buffer
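    The two conversions, toCodePoints and toChars, are inverses of each other for well-formed input. The round trip can be sketched with JDK calls as follows. The class name is hypothetical, the parameter names mirror the signatures above, and the bodies are one plausible reading of the description rather than the library's code.

```java
import java.util.Arrays;

public class CodePointConversionSketch {

    // Decodes src[srcOff, srcOff + srcLen) into code points; returns how many were written.
    static int toCodePoints(char[] src, int srcOff, int srcLen, int[] dest, int destOff) {
        int written = 0;
        for (int i = 0; i < srcLen; ) {
            int codePoint = Character.codePointAt(src, srcOff + i, srcOff + srcLen);
            i += Character.charCount(codePoint);
            dest[destOff + written++] = codePoint;
        }
        return written;
    }

    // Encodes src[srcOff, srcOff + srcLen) as UTF-16 chars; returns how many were written.
    static int toChars(int[] src, int srcOff, int srcLen, char[] dest, int destOff) {
        int written = 0;
        for (int i = 0; i < srcLen; i++) {
            written += Character.toChars(src[srcOff + i], dest, destOff + written);
        }
        return written;
    }

    public static void main(String[] args) {
        char[] chars = "a\uD834\uDD1Eb".toCharArray(); // 4 chars, 3 code points
        int[] codePoints = new int[chars.length];      // never more code points than chars
        int count = toCodePoints(chars, 0, chars.length, codePoints, 0);
        System.out.println(count); // 3

        char[] back = new char[count * 2];             // at most 2 chars per code point
        int length = toChars(codePoints, 0, count, back, 0);
        System.out.println(Arrays.equals(chars, Arrays.copyOf(back, length))); // true
    }
}
```

    Sizing the destination as shown (one int per source char, two chars per code point) is a safe upper bound in both directions.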
  - fill
```
public abstract boolean fill(CharacterUtils.CharacterBuffer buffer,
                             java.io.Reader reader,
                             int numChars)
                      throws java.io.IOException
```
    Fills the CharacterUtils.CharacterBuffer with characters read from the given Reader. This method tries to read numChars characters into the CharacterUtils.CharacterBuffer; each call to fill starts filling the buffer at offset 0 and reads up to numChars characters. Because a code point can span two Java chars, this method may fill only numChars - 1 characters so as not to split a surrogate pair, even if characters remain in the Reader.
    Depending on the Version passed to getInstance(Version), this method is aware of supplementary characters when filling the given buffer. For every Version later than 3.0, fill(CharacterBuffer, Reader, int) guarantees that the given CharacterUtils.CharacterBuffer never contains a high surrogate as its last element unless that surrogate is the last available character in the reader. In other words, high and low surrogate pairs are always preserved across buffer borders.
    
    A return value of false means that this method call exhausted the reader, but some characters may still have been read; verify this by checking whether buffer.getLength() > 0.
    
    Parameters:
    
    buffer - the buffer to fill.
    
    reader - the reader to read characters from.
    
    numChars - the number of chars to read
    
    Returns:
    
    false if and only if reader.read returned -1 while trying to fill the buffer
    
    Throws:
    
    java.io.IOException - if the reader throws an IOException.
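    The surrogate-preserving contract described above can be sketched in plain Java. Everything in this sketch is an assumption made for illustration: Buffer is a hypothetical stand-in for CharacterUtils.CharacterBuffer, and holding back a trailing high surrogate for the next call is one way to meet the guarantee, not necessarily how the library does it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SurrogateSafeFill {

    /** Hypothetical stand-in for CharacterUtils.CharacterBuffer. */
    static final class Buffer {
        final char[] chars;
        int length;                 // number of valid chars after the last fill
        char pendingHighSurrogate;  // high surrogate held back by the last fill, 0 if none

        Buffer(int size) {
            if (size < 2) throw new IllegalArgumentException("size must be >= 2");
            chars = new char[size];
        }
    }

    // Returns false only if the reader signalled end of stream during this call.
    static boolean fill(Buffer buffer, Reader reader, int numChars) throws IOException {
        if (numChars < 2 || numChars > buffer.chars.length) {
            throw new IllegalArgumentException("numChars must be in [2, buffer size]");
        }
        int start = 0;
        if (buffer.pendingHighSurrogate != 0) {
            // Re-install the high surrogate that the previous call held back.
            buffer.chars[start++] = buffer.pendingHighSurrogate;
            buffer.pendingHighSurrogate = 0;
        }
        int read = 0;
        int wanted = numChars - start;
        while (read < wanted) {
            int n = reader.read(buffer.chars, start + read, wanted - read);
            if (n == -1) break;
            read += n;
        }
        buffer.length = start + read;
        if (buffer.length < numChars) {
            return false; // reader exhausted; buffer may still hold some chars
        }
        if (Character.isHighSurrogate(buffer.chars[buffer.length - 1])) {
            // Hold the high surrogate back so the pair is not split across fills.
            buffer.pendingHighSurrogate = buffer.chars[--buffer.length];
        }
        return true;
    }

    // Reads the whole input and returns the contents of the buffer after each fill.
    static List<String> chunks(String input, int bufferSize) {
        List<String> out = new ArrayList<>();
        Buffer buffer = new Buffer(bufferSize);
        try (Reader reader = new StringReader(input)) {
            boolean more;
            do {
                more = fill(buffer, reader, bufferSize);
                if (buffer.length > 0) out.add(new String(buffer.chars, 0, buffer.length));
            } while (more);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // With a 3-char buffer the first fill would end on a high surrogate,
        // so it yields only 2 chars and the pair arrives intact in the second chunk.
        for (String chunk : chunks("ab\uD834\uDD1Ecd", 3)) {
            System.out.println(chunk.length() + " chars");
        }
    }
}
```

    Note how the caller loops until fill returns false and still consumes whatever the final call left in the buffer, which matches the advice to check the buffer length after a false return.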
  - fill
```
public final boolean fill(CharacterUtils.CharacterBuffer buffer,
                          java.io.Reader reader)
                   throws java.io.IOException
```
    Convenience method which calls fill(buffer, reader, buffer.buffer.length).
    
    Throws:
    
    java.io.IOException
  - offsetByCodePoints
```
public abstract int offsetByCodePoints(char[] buf,
                                       int start,
                                       int count,
                                       int index,
                                       int offset)
```
    Returns the index within buf[start:start+count] that is offset from index by offset code points.
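    Moving by code points rather than by chars can be illustrated with the JDK's Character.offsetByCodePoints(char[], int, int, int, int). The sketch below is not the library's implementation: the class and method names are hypothetical, and the assumption is that a Java 1.4 style variant simply adds the offset to the index.

```java
public class OffsetByCodePointsSketch {

    // Assumed Java 1.4 style: one char is one code point, so offsets are plain arithmetic.
    static int offsetJava4(char[] buf, int start, int count, int index, int offset) {
        int result = index + offset;
        if (result < start || result > start + count) {
            throw new IndexOutOfBoundsException("result outside buf[start:start+count]");
        }
        return result;
    }

    // Supplementary-aware style: a surrogate pair is stepped over as a single code point.
    static int offsetJava5(char[] buf, int start, int count, int index, int offset) {
        return Character.offsetByCodePoints(buf, start, count, index, offset);
    }

    public static void main(String[] args) {
        // 'a', U+1D11E (two chars), 'b'
        char[] buf = "a\uD834\uDD1Eb".toCharArray();
        System.out.println(offsetJava4(buf, 0, buf.length, 0, 2));  // 2, lands inside the pair
        System.out.println(offsetJava5(buf, 0, buf.length, 0, 2));  // 3, lands on 'b'
        System.out.println(offsetJava5(buf, 0, buf.length, 3, -1)); // 1, start of the pair
    }
}
```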
