com.ibm.icu.lang.UCharacterCategory Java Examples

The following examples show how to use com.ibm.icu.lang.UCharacterCategory.
Example #1
Source File: SpoofChecker.java    From fitnotifications with Apache License 2.0
/**
 * Computes the set of numerics for a string, according to UTS 39 section 5.3.
 */
private void getNumerics(String input, UnicodeSet result) {
    result.clear();

    for (int utf16Offset = 0; utf16Offset < input.length();) {
        int codePoint = Character.codePointAt(input, utf16Offset);
        utf16Offset += Character.charCount(codePoint);

        // Store a representative character for each kind of decimal digit
        if (UCharacter.getType(codePoint) == UCharacterCategory.DECIMAL_DIGIT_NUMBER) {
            // Store the zero character as a representative for comparison.
            // Unicode guarantees it is codePoint - value
            result.add(codePoint - UCharacter.getNumericValue(codePoint));
        }
    }
}
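
The zero-digit trick in Example #1 can be tried without ICU on the classpath. The following is only a sketch under that assumption: it swaps in the JDK's java.lang.Character for UCharacter and UCharacterCategory, and returns a plain Set instead of a UnicodeSet. The class and method names (NumericsSketch, zeroDigits) are hypothetical and do not come from the original project.

```java
import java.util.Set;
import java.util.TreeSet;

// JDK-only sketch: java.lang.Character stands in for ICU's UCharacter/UCharacterCategory.
final class NumericsSketch {

    // Returns the zero digit of every decimal-digit system that appears in the input.
    static Set<Integer> zeroDigits(String input) {
        Set<Integer> result = new TreeSet<>();
        for (int offset = 0; offset < input.length();) {
            int codePoint = input.codePointAt(offset);
            offset += Character.charCount(codePoint);
            if (Character.getType(codePoint) == Character.DECIMAL_DIGIT_NUMBER) {
                // The ten digits of one system are contiguous, so subtracting
                // the digit's value gives that system's zero.
                result.add(codePoint - Character.getNumericValue(codePoint));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // ASCII '7' and ARABIC-INDIC DIGIT THREE (U+0663) come from two digit systems.
        for (int zero : zeroDigits("a7\u0663")) {
            System.out.printf("U+%04X%n", zero);
        }
    }
}
```

A string that mixes digits from more than one system yields more than one zero, which is the signal the UTS 39 check looks for.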
 
Example #2
Source File: UCharacterName.java    From fitnotifications with Apache License 2.0
/**
* Gets the character extended type
* @param ch character to be tested
* @return extended type it is associated with
*/
private static int getType(int ch)
{
    if (UCharacterUtility.isNonCharacter(ch)) {
        // not a character, so we return an invalid category count
        return NON_CHARACTER_;
    }
    int result = UCharacter.getType(ch);
    if (result == UCharacterCategory.SURROGATE) {
        if (ch <= UTF16.LEAD_SURROGATE_MAX_VALUE) {
            result = LEAD_SURROGATE_;
        }
        else {
            result = TRAIL_SURROGATE_;
        }
    }
    return result;
}
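
As a rough illustration of the lead/trail split in Example #2, here is a JDK-only sketch. It assumes java.lang.Character in place of the ICU classes, and the enum Kind and the class SurrogateKindSketch are hypothetical stand-ins for the private constants (LEAD_SURROGATE_, TRAIL_SURROGATE_) that the excerpt does not define.

```java
// JDK-only sketch: java.lang.Character stands in for ICU's UCharacter and UTF16.
final class SurrogateKindSketch {

    // Hypothetical labels; the original method returns private int constants instead.
    enum Kind { LEAD_SURROGATE, TRAIL_SURROGATE, OTHER }

    static Kind classify(int ch) {
        if (Character.getType(ch) != Character.SURROGATE) {
            return Kind.OTHER;
        }
        // Surrogates occupy U+D800..U+DFFF; the lead (high) half ends at U+DBFF.
        return ch <= Character.MAX_HIGH_SURROGATE
                ? Kind.LEAD_SURROGATE
                : Kind.TRAIL_SURROGATE;
    }

    public static void main(String[] args) {
        System.out.println(classify(0xD83D));
        System.out.println(classify(0xDE00));
        System.out.println(classify('A'));
    }
}
```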
 
Example #3
Source File: UCharacterName.java    From trekarta with GNU General Public License v3.0
/**
* Gets the character extended type
* @param ch character to be tested
* @return extended type it is associated with
*/
private static int getType(int ch)
{
    if (UCharacterUtility.isNonCharacter(ch)) {
        // not a character, so we return an invalid category count
        return NON_CHARACTER_;
    }
    int result = UCharacter.getType(ch);
    if (result == UCharacterCategory.SURROGATE) {
        if (ch <= UTF16.LEAD_SURROGATE_MAX_VALUE) {
            result = LEAD_SURROGATE_;
        }
        else {
            result = TRAIL_SURROGATE_;
        }
    }
    return result;
}
 
Example #4
Source File: CaseMapImpl.java    From trekarta with GNU General Public License v3.0
private static boolean isLNS(int c) {
    // Letter, number, symbol,
    // or a private use code point because those are typically used as letters or numbers.
    // Consider modifier letters only if they are cased.
    int gc = UCharacterProperty.INSTANCE.getType(c);
    return ((1 << gc) & LNS) != 0 ||
            (gc == UCharacterCategory.MODIFIER_LETTER &&
                UCaseProps.INSTANCE.getType(c) != UCaseProps.NONE);
}
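
Example #4 tests membership in several categories at once by shifting 1 by the category value and masking. The sketch below shows the same idea with JDK constants only. The mask LETTER_OR_DIGIT_MASK is hypothetical and is not the LNS constant from CaseMapImpl, whose definition is not part of the excerpt. java.lang.Character category values may also differ from the ICU ones, so masks should not be shared between the two APIs.

```java
// JDK-only sketch of a general-category bitmask test.
final class CategoryMaskSketch {

    // Hypothetical mask: one bit per accepted java.lang.Character category.
    static final int LETTER_OR_DIGIT_MASK =
            (1 << Character.UPPERCASE_LETTER)
            | (1 << Character.LOWERCASE_LETTER)
            | (1 << Character.TITLECASE_LETTER)
            | (1 << Character.OTHER_LETTER)
            | (1 << Character.DECIMAL_DIGIT_NUMBER);

    // True if the code point's category bit is set in the mask.
    static boolean inMask(int codePoint, int mask) {
        return ((1 << Character.getType(codePoint)) & mask) != 0;
    }

    public static void main(String[] args) {
        System.out.println(inMask('A', LETTER_OR_DIGIT_MASK));
        System.out.println(inMask(' ', LETTER_OR_DIGIT_MASK));
    }
}
```

A single AND replaces a chain of equality checks, which is presumably why the original keeps modifier letters out of the mask and handles them in a separate clause.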
 
Example #5
Source File: UCharacterProperty.java    From fitnotifications with Apache License 2.0
@Override
int getMaxValue(int which) {
    return UCharacterCategory.CHAR_CATEGORY_COUNT-1;
}
 
Example #6
Source File: UCharacterProperty.java    From trekarta with GNU General Public License v3.0
@Override
int getMaxValue(int which) {
    return UCharacterCategory.CHAR_CATEGORY_COUNT-1;
}
 
Example #7
Source File: Characters.java    From es6draft with MIT License
/**
 * Unicode category "Zs" (space separator)
 * 
 * @param c
 *            the character
 * @return {@code true} if the character is space separator
 */
public static boolean isSpaceSeparator(int c) {
    return UCharacter.getType(c) == UCharacterCategory.SPACE_SEPARATOR;
}
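
To see what a "Zs" check like the one in Example #7 accepts and rejects, the following JDK-only sketch may help. It assumes java.lang.Character.SPACE_SEPARATOR as a stand-in for UCharacterCategory.SPACE_SEPARATOR, and the class name SpaceSeparatorSketch is hypothetical.

```java
// JDK-only sketch: java.lang.Character stands in for ICU's UCharacter.
final class SpaceSeparatorSketch {

    // True only for general category Zs; tabs and line breaks are not in Zs.
    static boolean isSpaceSeparator(int c) {
        return Character.getType(c) == Character.SPACE_SEPARATOR;
    }

    public static void main(String[] args) {
        int[] samples = { ' ', 0x00A0, 0x3000, '\t', 0x2028 };
        for (int c : samples) {
            System.out.printf("U+%04X %b%n", c, isSpaceSeparator(c));
        }
    }
}
```

Note that TAB is a control character (Cc) and U+2028 is a line separator (Zl), so a category test on Zs alone is narrower than a general whitespace test.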