Open Chinese Convert  1.0.3
A project for conversion between Traditional and Simplified Chinese
opencc::UTF8Util Class Reference

UTF8 string utilities.

#include <UTF8Util.hpp>

Static Public Member Functions

static void SkipUtf8Bom (FILE *fp)
 Detects a UTF8 BOM at the start of the file and skips it.
 
static size_t NextCharLengthNoException (const char *str)
 Returns the length in bytes of the next UTF8 character. On error returns 0.
 
static size_t NextCharLength (const char *str)
 Returns the length in bytes of the next UTF8 character.
 
static size_t PrevCharLength (const char *str)
 Returns the length in bytes of the previous UTF8 character.
 
static const char * NextChar (const char *str)
 Returns a char* pointer advanced past the next UTF8 character.
 
static const char * PrevChar (const char *str)
 Returns a char* pointer moved back to the start of the previous UTF8 character.
 
static size_t Length (const char *str)
 Returns the length, in UTF8 characters, of a valid UTF8 string.
 
static const char * FindNextInline (const char *str, const char ch)
 Finds a character in the same line.
 
static bool IsLineEndingOrFileEnding (const char ch)
 Returns true if the character is a line ending or end of file.
 
static string FromSubstr (const char *str, size_t length)
 Copies a substring with given length to a new std::string.
 
static bool NotShorterThan (const char *str, size_t byteLength)
 Returns true if the given string is at least byteLength bytes long.
 
static string TruncateUTF8 (const char *str, size_t maxByteLength)
 Truncates a string to a maximum length in bytes. No UTF8 character will be broken.
 
static void ReplaceAll (string &str, const char *from, const char *to)
 Replaces all patterns in a string in place.
 
static string Join (const vector< string > &strings, const string &separator)
 Joins a vector of strings into a single string, inserting a separator between elements.
 
static string Join (const vector< string > &strings)
 Joins a vector of strings into a single string.
 
static void GetByteMap (const char *str, const size_t utf8Length, vector< size_t > *byteMap)
 

Detailed Description

UTF8 string utilities.

Member Function Documentation

static const char* opencc::UTF8Util::FindNextInline ( const char *  str,
const char  ch 
)
inline static

Finds a character in the same line.

Parameters
str    The text to search.
ch     The character to find.
Returns
A pointer to the found character in str, or to the EOL/EOF position if ch does not occur on the current line.
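
The documented behavior can be illustrated with a small standalone sketch. It does not use UTF8Util.hpp; the names findNextInlineSketch and isLineOrFileEnd are hypothetical, and the choice of NUL, LF and CR as terminators is an assumption rather than something this page states. The sketch also advances one byte at a time, which is only safe when ch is an ASCII character.

```cpp
#include <cassert>

// Hypothetical stand-in for IsLineEndingOrFileEnding.
// Assumption: NUL, LF and CR count as "line ending or end of file".
inline bool isLineOrFileEnd(char ch) {
  return ch == '\0' || ch == '\n' || ch == '\r';
}

// Sketch: scan forward until `ch` or a line/file terminator is reached,
// and return a pointer to whichever comes first.
// Byte-wise scanning is used for brevity; it is safe for an ASCII `ch`
// because UTF-8 continuation bytes never collide with ASCII bytes.
inline const char* findNextInlineSketch(const char* str, char ch) {
  assert(str != nullptr);
  while (!isLineOrFileEnd(*str) && *str != ch) {
    ++str;
  }
  return str;
}
```

A caller would typically check the byte at the returned pointer: if it equals ch the character was found, otherwise the search stopped at the end of the line or of the text.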
static size_t opencc::UTF8Util::NextCharLengthNoException ( const char *  str)
inline static

Returns the length in bytes of the next UTF8 character.

Returns 0 on error.
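
As an illustration of how such a length can be derived, the sketch below classifies the lead byte of a UTF-8 sequence. It is a standalone example, not the library's code: the name nextCharLengthSketch is hypothetical, it accepts only the 1 to 4 byte forms of modern UTF-8, and it inspects the lead byte alone without validating continuation bytes. Which inputs the real function treats as errors is not stated on this page.

```cpp
#include <cassert>
#include <cstddef>

// Sketch: derive the byte length of the next UTF-8 character from its
// lead byte. Returns 0 when the byte cannot start a character
// (for example a stray continuation byte 10xxxxxx).
inline std::size_t nextCharLengthSketch(const char* str) {
  assert(str != nullptr);
  const unsigned char lead = static_cast<unsigned char>(*str);
  if ((lead & 0x80) == 0x00) return 1;  // 0xxxxxxx: ASCII
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;                             // not a valid lead byte
}
```

Returning 0 instead of throwing lets a caller handle malformed input with a plain comparison, at the cost of having to check the result before advancing the pointer.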

static string opencc::UTF8Util::TruncateUTF8 ( const char *  str,
size_t  maxByteLength 
)
inline static

Truncates a string to a maximum length in bytes.

No UTF8 character is split: the result always ends on a character boundary.
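
The sketch below shows one way to truncate at a character boundary. It is a standalone example under stated assumptions, not the library's implementation: the names truncateUtf8Sketch and leadByteLength are hypothetical, only 1 to 4 byte sequences are recognized, and the sketch simply stops at the first malformed or incomplete sequence, which may differ from how the real function reacts to invalid input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: byte length implied by a UTF-8 lead byte, 0 if invalid.
inline std::size_t leadByteLength(unsigned char lead) {
  if ((lead & 0x80) == 0x00) return 1;
  if ((lead & 0xE0) == 0xC0) return 2;
  if ((lead & 0xF0) == 0xE0) return 3;
  if ((lead & 0xF8) == 0xF0) return 4;
  return 0;
}

// Sketch: copy whole characters while they still fit in maxByteLength bytes.
inline std::string truncateUtf8Sketch(const char* str,
                                      std::size_t maxByteLength) {
  assert(str != nullptr);
  std::size_t len = 0;
  while (str[len] != '\0') {
    const std::size_t charLen =
        leadByteLength(static_cast<unsigned char>(str[len]));
    // Stop on an invalid lead byte or when the next character would not fit.
    if (charLen == 0 || len + charLen > maxByteLength) break;
    // Stop if the terminator cuts the sequence short (assumed policy).
    bool complete = true;
    for (std::size_t i = 1; i < charLen; ++i) {
      if (str[len + i] == '\0') { complete = false; break; }
    }
    if (!complete) break;
    len += charLen;
  }
  return std::string(str, len);
}
```

With an 8-byte input made of two 3-byte characters followed by "ab", a limit of 4 keeps only the first character (3 bytes), because including the second one would require 6 bytes.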

