Libcroco
cr-utils.c File Reference

Miscellaneous utility functions used in libcroco.

#include "cr-utils.h"
#include "cr-string.h"


Functions

enum CRStatus cr_utils_utf8_str_len_as_ucs4 (const guchar *a_in_start, const guchar *a_in_end, gulong *a_len)
 Given a UTF-8 string buffer, calculates the length this string would have if it were encoded in UCS-4.

enum CRStatus cr_utils_ucs4_str_len_as_utf8 (const guint32 *a_in_start, const guint32 *a_in_end, gulong *a_len)
 Given a UCS-4 string, returns the size (in bytes) this string would occupy if it were encoded in UTF-8.

enum CRStatus cr_utils_ucs1_str_len_as_utf8 (const guchar *a_in_start, const guchar *a_in_end, gulong *a_len)
 Given a UCS-1 string, returns the size (in bytes) this string would occupy if it were encoded in UTF-8.

enum CRStatus cr_utils_utf8_to_ucs4 (const guchar *a_in, gulong *a_in_len, guint32 *a_out, gulong *a_out_len)
 Converts a UTF-8 buffer into a UCS-4 buffer.

enum CRStatus cr_utils_read_char_from_utf8_buf (const guchar *a_in, gulong a_in_len, guint32 *a_out, gulong *a_consumed)
 Reads a character from a UTF-8 buffer.

enum CRStatus cr_utils_utf8_str_len_as_ucs1 (const guchar *a_in_start, const guchar *a_in_end, gulong *a_len)

enum CRStatus cr_utils_utf8_str_to_ucs4 (const guchar *a_in, gulong *a_in_len, guint32 **a_out, gulong *a_out_len)
 Converts a UTF-8 string into a UCS-4 string.

enum CRStatus cr_utils_ucs4_to_utf8 (const guint32 *a_in, gulong *a_in_len, guchar *a_out, gulong *a_out_len)
 Converts a UCS-4 buffer into a UTF-8 buffer.

enum CRStatus cr_utils_ucs4_str_to_utf8 (const guint32 *a_in, gulong *a_in_len, guchar **a_out, gulong *a_out_len)
 Converts a UCS-4 string into a UTF-8 string.

enum CRStatus cr_utils_ucs1_to_utf8 (const guchar *a_in, gulong *a_in_len, guchar *a_out, gulong *a_out_len)
 Converts a UCS-1 buffer into a UTF-8 buffer.

enum CRStatus cr_utils_ucs1_str_to_utf8 (const guchar *a_in, gulong *a_in_len, guchar **a_out, gulong *a_out_len)
 Converts a UCS-1 string into a UTF-8 string.

enum CRStatus cr_utils_utf8_to_ucs1 (const guchar *a_in, gulong *a_in_len, guchar *a_out, gulong *a_out_len)
 Converts a UTF-8 buffer into a UCS-1 buffer.

enum CRStatus cr_utils_utf8_str_to_ucs1 (const guchar *a_in, gulong *a_in_len, guchar **a_out, gulong *a_out_len)
 Converts a UTF-8 string into a UCS-1 string.

gboolean cr_utils_is_white_space (guint32 a_char)
 Returns TRUE if a_char is a white space as defined in chapter 4.1.1 of the CSS spec.

gboolean cr_utils_is_newline (guint32 a_char)
 Returns TRUE if the character is a newline as defined in chapter 4.1.1 of the CSS spec.

gboolean cr_utils_is_hexa_char (guint32 a_char)
 Returns TRUE if the character is a hexadecimal digit, i.e. hexa_char ::= [0-9A-F].

gboolean cr_utils_is_nonascii (guint32 a_char)
 Returns TRUE if the character is a nonascii character, as defined in chapter 4.1.1 of the CSS spec.

void cr_utils_dump_n_chars (guchar a_char, FILE *a_fp, glong a_nb)
 Dumps a character a_nb times on a file.

void cr_utils_dump_n_chars2 (guchar a_char, GString *a_string, glong a_nb)

GList * cr_utils_dup_glist_of_string (GList const *a_list_of_strings)
 Duplicates a list of GString instances.

GList * cr_utils_dup_glist_of_cr_string (GList const *a_list_of_strings)
 Duplicates a GList whose GList::data is a CRString.
 

Detailed Description

Miscellaneous utility functions used in libcroco.

Note that throughout this file I will refer to the CSS specification published by the W3C. You can find that document at http://www.w3.org/TR/REC-CSS2/ .

Definition in file cr-utils.c.

Function Documentation

◆ cr_utils_dump_n_chars()

void cr_utils_dump_n_chars ( guchar  a_char,
FILE *  a_fp,
glong  a_nb 
)

Dumps a character a_nb times on a file.

Parameters
a_char  the character to dump.
a_fp  the destination file pointer.
a_nb  the number of times a_char is to be dumped.

Definition at line 1260 of file cr-utils.c.

◆ cr_utils_dump_n_chars2()

void cr_utils_dump_n_chars2 ( guchar  a_char,
GString *  a_string,
glong  a_nb 
)

◆ cr_utils_dup_glist_of_cr_string()

GList* cr_utils_dup_glist_of_cr_string ( GList const *  a_list_of_strings)

Duplicate a GList where the GList::data is a CRString.

Parameters
a_list_of_strings  the list to duplicate.
Returns
the duplicated list, or NULL if something bad happened.

Definition at line 1314 of file cr-utils.c.

References cr_string_dup().

◆ cr_utils_dup_glist_of_string()

GList* cr_utils_dup_glist_of_string ( GList const *  a_list_of_strings)

Duplicates a list of GString instances.

Parameters
a_list_of_strings  the list of strings to be duplicated.
Returns
the duplicated list of GString instances, or NULL if something bad happened.

Definition at line 1288 of file cr-utils.c.

◆ cr_utils_is_hexa_char()

gboolean cr_utils_is_hexa_char ( guint32  a_char)

Returns TRUE if the character is a hexadecimal digit, i.e. hexa_char ::= [0-9A-F].

Definition at line 1224 of file cr-utils.c.
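
As a rough illustration of the grammar above, the check could be sketched in plain C as follows. The name is_hexa_char is a hypothetical stand-in, and standard C types replace the GLib ones; this is not the library's code. Note that the documented grammar lists uppercase letters only, so the sketch rejects 'a' through 'f'.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cr_utils_is_hexa_char(); it follows the
 * documented grammar hexa_char ::= [0-9A-F] (uppercase letters only). */
static bool is_hexa_char(uint32_t c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F');
}
```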

◆ cr_utils_is_newline()

gboolean cr_utils_is_newline ( guint32  a_char)

Returns TRUE if the character is a newline as defined in chapter 4.1.1 of the CSS spec.

nl ::= \n|\r\n|\r|\f

Parameters
a_char  the character to test.
Returns
TRUE if the character is a newline, FALSE otherwise.

Definition at line 1206 of file cr-utils.c.

◆ cr_utils_is_nonascii()

gboolean cr_utils_is_nonascii ( guint32  a_char)

Returns TRUE if the character is a nonascii character, as defined in chapter 4.1.1 of the CSS spec:

nonascii ::= [^\0-\177]

Parameters
a_char  the character to test.
Returns
TRUE if the character is a nonascii char, FALSE otherwise.

Definition at line 1244 of file cr-utils.c.

◆ cr_utils_is_white_space()

gboolean cr_utils_is_white_space ( guint32  a_char)

Returns TRUE if a_char is a white space as defined in chapter 4.1.1 of the CSS spec.

white-space ::= ' '|\t|\r|\n|\f

Parameters
a_char  the character to test.
Returns
TRUE if the character is a white space, FALSE otherwise.

Definition at line 1181 of file cr-utils.c.

Referenced by cr_input_consume_white_spaces().
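
A minimal sketch of such a test, assuming the five characters of the CSS 2 white-space production (space, tab, carriage return, line feed, form feed). The function name is_css_white_space is hypothetical and the sketch uses standard C types rather than GLib ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for cr_utils_is_white_space(): accepts exactly
 * space, tab, carriage return, line feed and form feed. */
static bool is_css_white_space(uint32_t c)
{
    switch (c) {
    case ' ':
    case '\t':
    case '\r':
    case '\n':
    case '\f':
        return true;
    default:
        return false;
    }
}
```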

◆ cr_utils_read_char_from_utf8_buf()

enum CRStatus cr_utils_read_char_from_utf8_buf ( const guchar *  a_in,
gulong  a_in_len,
guint32 *  a_out,
gulong *  a_consumed 
)

Reads a character from a UTF-8 buffer.

Decodes the next character code (Unicode code point) in the buffer and returns it.

Parameters
a_in  the starting address of the UTF-8 buffer.
a_in_len  the length of the UTF-8 buffer.
a_out  out parameter. The character that was read.
a_consumed  out parameter. The number of bytes consumed to decode the returned character code.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 428 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_END_OF_INPUT_ERROR, and CR_OK.

Referenced by cr_input_peek_char(), and cr_input_read_char().
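
To make the decoding step concrete, here is a hedged sketch of reading one character from a UTF-8 buffer. Everything in it is an assumption for illustration: the function name read_utf8_char, the enum status values (loosely modelled on the CRStatus codes referenced above), and the restriction to sequences of at most four bytes. It is not the library's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes standing in for enum CRStatus. */
enum status { ST_OK, ST_BAD_PARAM, ST_END_OF_INPUT, ST_ENCODING_ERROR };

/* Decodes the first code point of `in` (at most 4 bytes in this sketch). */
static enum status read_utf8_char(const unsigned char *in, size_t in_len,
                                  uint32_t *out, size_t *consumed)
{
    uint32_t c;
    size_t extra, i;

    if (!in || !out || !consumed)
        return ST_BAD_PARAM;
    if (in_len == 0)
        return ST_END_OF_INPUT;

    /* The lead byte gives the payload bits and the number of
     * continuation bytes that must follow. */
    if (in[0] < 0x80) {
        c = in[0];
        extra = 0;
    } else if ((in[0] & 0xE0) == 0xC0) {
        c = in[0] & 0x1F;
        extra = 1;
    } else if ((in[0] & 0xF0) == 0xE0) {
        c = in[0] & 0x0F;
        extra = 2;
    } else if ((in[0] & 0xF8) == 0xF0) {
        c = in[0] & 0x07;
        extra = 3;
    } else {
        return ST_ENCODING_ERROR;
    }

    if (extra >= in_len)
        return ST_END_OF_INPUT;   /* sequence is truncated */

    /* Each continuation byte has the form 10xxxxxx and adds 6 bits. */
    for (i = 1; i <= extra; i++) {
        if ((in[i] & 0xC0) != 0x80)
            return ST_ENCODING_ERROR;
        c = (c << 6) | (in[i] & 0x3F);
    }

    *out = c;
    *consumed = extra + 1;
    return ST_OK;
}
```

A caller would typically advance its read pointer by the consumed count after each successful call.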

◆ cr_utils_ucs1_str_len_as_utf8()

enum CRStatus cr_utils_ucs1_str_len_as_utf8 ( const guchar *  a_in_start,
const guchar *  a_in_end,
gulong *  a_len 
)

Given a UCS-1 string, this function returns the size (in bytes) this string would occupy if it were encoded in UTF-8.

Parameters
a_in_start  a pointer to the beginning of the input buffer.
a_in_end  a pointer to the end of the input buffer.
a_len  out parameter. The computed length.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 230 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, and CR_OK.

Referenced by cr_utils_ucs1_str_to_utf8().
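
The length computation can be sketched as below. A UCS-1 byte below 0x80 takes one UTF-8 byte and any other value takes two. The helper name ucs1_len_as_utf8 is hypothetical, and the sketch assumes the end pointer is inclusive (it points to the last byte), which is how the sibling function cr_utils_utf8_str_len_as_ucs4() documents its a_in_end parameter.

```c
#include <stddef.h>

/* Hypothetical helper: number of UTF-8 bytes needed for the UCS-1
 * bytes in [start, end]. Assumes `end` points to the last byte. */
static size_t ucs1_len_as_utf8(const unsigned char *start,
                               const unsigned char *end)
{
    size_t len = 0;
    const unsigned char *p;

    for (p = start; p <= end; p++)
        len += (*p < 0x80) ? 1 : 2;   /* 0x80..0xFF need two bytes */
    return len;
}
```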

◆ cr_utils_ucs1_str_to_utf8()

enum CRStatus cr_utils_ucs1_str_to_utf8 ( const guchar *  a_in,
gulong *  a_in_len,
guchar **  a_out,
gulong *  a_out_len 
)

Converts a UCS-1 string into a UTF-8 string.

Parameters
a_in  the beginning of the input string to convert.
a_in_len  the length of the input string to convert.
a_out  out parameter. The converted string.
a_out_len  out parameter. The length of the converted string.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 941 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_OK, cr_utils_ucs1_str_len_as_utf8(), and cr_utils_ucs1_to_utf8().

◆ cr_utils_ucs1_to_utf8()

enum CRStatus cr_utils_ucs1_to_utf8 ( const guchar *  a_in,
gulong *  a_in_len,
guchar *  a_out,
gulong *  a_out_len 
)

Converts a UCS-1 buffer into a UTF-8 buffer.

The caller must know the size of the resulting buffer and allocate it prior to calling this function.

Parameters
a_in  the input UCS-1 buffer.
a_in_len  in/out parameter. The length of the input buffer. After return, points to the number of bytes actually consumed, even in case of an encoding error.
a_out  out parameter. The converted UTF-8 output buffer.
a_out_len  in/out parameter. The size of the output buffer. If the output buffer is smaller than the size actually needed, this function just converts what it can.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 886 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, and CR_OK.

Referenced by cr_utils_ucs1_str_to_utf8().
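
The in/out length convention described above can be illustrated with a small sketch. The function ucs1_to_utf8 below is a hypothetical stand-in using standard C types; in particular, reporting the number of bytes written through out_len is an assumption of this sketch, since the text above does not say what a_out_len holds after return. The conversion stops early when the output buffer is full.

```c
#include <stddef.h>

/* Hypothetical stand-in for cr_utils_ucs1_to_utf8().
 * On entry *in_len and *out_len are the buffer sizes; on return they
 * hold the bytes consumed and (an assumption) the bytes written. */
static void ucs1_to_utf8(const unsigned char *in, size_t *in_len,
                         unsigned char *out, size_t *out_len)
{
    size_t i = 0, o = 0;

    while (i < *in_len) {
        if (in[i] < 0x80) {
            if (o + 1 > *out_len)
                break;                       /* output buffer is full */
            out[o++] = in[i];
        } else {
            if (o + 2 > *out_len)
                break;                       /* no room for two bytes */
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
        i++;
    }
    *in_len = i;
    *out_len = o;
}
```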

◆ cr_utils_ucs4_str_len_as_utf8()

enum CRStatus cr_utils_ucs4_str_len_as_utf8 ( const guint32 *  a_in_start,
const guint32 *  a_in_end,
gulong *  a_len 
)

Given a UCS-4 string, this function returns the size (in bytes) this string would occupy if it were encoded in UTF-8.

Parameters
a_in_start  a pointer to the beginning of the input buffer.
a_in_end  a pointer to the end of the input buffer.
a_len  out parameter. The computed length.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 187 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, and CR_OK.

Referenced by cr_utils_ucs4_str_to_utf8().
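
For UCS-4 input the byte count depends on the range each code point falls in. The sketch below is an illustration under stated assumptions, not the library's code: the name ucs4_len_as_utf8 is hypothetical, the end pointer is taken to be inclusive, and only code points up to U+10FFFF (at most four UTF-8 bytes) are considered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: UTF-8 byte count for the code points in
 * [start, end]. Assumes `end` points to the last element and that
 * every code point is at most U+10FFFF. */
static size_t ucs4_len_as_utf8(const uint32_t *start, const uint32_t *end)
{
    size_t len = 0;
    const uint32_t *p;

    for (p = start; p <= end; p++) {
        if (*p < 0x80)
            len += 1;        /* U+0000..U+007F   */
        else if (*p < 0x800)
            len += 2;        /* U+0080..U+07FF   */
        else if (*p < 0x10000)
            len += 3;        /* U+0800..U+FFFF   */
        else
            len += 4;        /* U+10000 and above */
    }
    return len;
}
```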

◆ cr_utils_ucs4_str_to_utf8()

enum CRStatus cr_utils_ucs4_str_to_utf8 ( const guint32 *  a_in,
gulong *  a_in_len,
guchar **  a_out,
gulong *  a_out_len 
)

Converts a UCS-4 string into a UTF-8 string.

Parameters
a_in  the input string to convert.
a_in_len  in/out parameter. The length of the input string. After return, points to the actual number of characters consumed. This can be useful for debugging the input string in case of an encoding error.
a_out  out parameter. Points to the output string. It is allocated by this function and must be freed by the caller.
a_out_len  out parameter. The length (in bytes) of the output string.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 845 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_OK, cr_utils_ucs4_str_len_as_utf8(), and cr_utils_ucs4_to_utf8().

◆ cr_utils_ucs4_to_utf8()

enum CRStatus cr_utils_ucs4_to_utf8 ( const guint32 *  a_in,
gulong *  a_in_len,
guchar *  a_out,
gulong *  a_out_len 
)

Converts a UCS-4 buffer into a UTF-8 buffer.

Parameters
a_in  the input UCS-4 buffer to convert.
a_in_len  in/out parameter. The size of the input buffer to convert. After return, this parameter contains the actual number of characters consumed.
a_out  the converted UTF-8 output buffer. Must be allocated by the caller.
a_out_len  in/out parameter. The size of the output buffer. If this size is smaller than the size actually needed, the function just converts what it can and returns a success status. After return, this parameter points to the actual number of bytes in the buffer.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 748 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_ENCODING_ERROR, and CR_OK.

Referenced by cr_utils_ucs4_str_to_utf8().
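
The core of such a conversion is encoding a single code point. A minimal sketch follows, assuming code points up to U+10FFFF; the helper name encode_utf8 and its return convention (number of bytes written, 0 for an out-of-range value) are illustrative choices, not part of libcroco.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the UTF-8 form of `c` into `out` and
 * returns the number of bytes written, or 0 if c is above U+10FFFF. */
static size_t encode_utf8(uint32_t c, unsigned char out[4])
{
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (c >> 18));
        out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}
```

A buffer-level converter would call such a helper once per input character and stop when the remaining output space is too small, which matches the "converts what it can" behaviour described above.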

◆ cr_utils_utf8_str_len_as_ucs1()

enum CRStatus cr_utils_utf8_str_len_as_ucs1 ( const guchar *  a_in_start,
const guchar *  a_in_end,
gulong *  a_len 
)

Definition at line 569 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_ENCODING_ERROR, and CR_OK.

◆ cr_utils_utf8_str_len_as_ucs4()

enum CRStatus cr_utils_utf8_str_len_as_ucs4 ( const guchar *  a_in_start,
const guchar *  a_in_end,
gulong *  a_len 
)

Given a UTF-8 string buffer, calculates the length this string would have if it were encoded in UCS-4.

Parameters
a_in_start  a pointer to the beginning of the input UTF-8 string.
a_in_end  a pointer to the end of the input UTF-8 string (points to the last byte of the buffer).
a_len  out parameter. The calculated length.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 69 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_ENCODING_ERROR, and CR_OK.

Referenced by cr_utils_utf8_str_to_ucs1(), and cr_utils_utf8_str_to_ucs4().

◆ cr_utils_utf8_str_to_ucs1()

enum CRStatus cr_utils_utf8_str_to_ucs1 ( const guchar *  a_in,
gulong *  a_in_len,
guchar **  a_out,
gulong *  a_out_len 
)

Converts a UTF-8 string into a UCS-1 string.

Parameters
a_in  the start of the input buffer.
a_in_len  the length of the input buffer.
a_out  out parameter. The resulting converted UCS-1 buffer. Must be freed by the caller.
a_out_len  out parameter. The length of the converted buffer.
Returns
CR_OK upon successful completion, an error code otherwise. Note that the out parameters are valid if and only if this function returns CR_OK.

Definition at line 1141 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_OK, cr_utils_utf8_str_len_as_ucs4(), and cr_utils_utf8_to_ucs1().

◆ cr_utils_utf8_str_to_ucs4()

enum CRStatus cr_utils_utf8_str_to_ucs4 ( const guchar *  a_in,
gulong *  a_in_len,
guint32 **  a_out,
gulong *  a_out_len 
)

Converts a UTF-8 string into a UCS-4 string.

Parameters
a_in  the input string to convert.
a_in_len  in/out parameter. The length of the input string. After return, points to the actual number of bytes consumed. This can be useful for debugging the input stream in case of an encoding error.
a_out  out parameter. Points to the output string. It is allocated by this function and must be freed by the caller.
a_out_len  out parameter. The length of the output string.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 710 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_OK, cr_utils_utf8_str_len_as_ucs4(), and cr_utils_utf8_to_ucs4().
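
The references above suggest a two-pass pattern: measure the output length first, then allocate and convert. The sketch below illustrates that pattern only. The names count_code_points and utf8_to_ucs4_alloc are hypothetical, the input is assumed to be valid UTF-8 with sequences of at most four bytes, and malloc/free stand in for whatever allocator the library actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Pass 1 (hypothetical helper): every byte that is not a continuation
 * byte (10xxxxxx) starts a new code point. Assumes valid UTF-8. */
static size_t count_code_points(const unsigned char *in, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if ((in[i] & 0xC0) != 0x80)
            n++;
    return n;
}

/* Pass 2 (hypothetical helper): allocate exactly what pass 1 measured,
 * then decode. The caller frees the returned buffer with free(). */
static uint32_t *utf8_to_ucs4_alloc(const unsigned char *in, size_t len,
                                    size_t *out_len)
{
    size_t n = count_code_points(in, len), i = 0, o = 0;
    uint32_t *out = malloc((n ? n : 1) * sizeof *out);

    if (!out)
        return NULL;

    while (i < len && o < n) {
        unsigned char b = in[i++];
        uint32_t c;
        int extra;

        if (b < 0x80) {
            c = b;
            extra = 0;
        } else if ((b & 0xE0) == 0xC0) {
            c = b & 0x1F;
            extra = 1;
        } else if ((b & 0xF0) == 0xE0) {
            c = b & 0x0F;
            extra = 2;
        } else {
            c = b & 0x07;
            extra = 3;
        }
        /* Fold in the continuation bytes, 6 bits each. */
        while (extra-- > 0 && i < len)
            c = (c << 6) | (in[i++] & 0x3F);
        out[o++] = c;
    }
    *out_len = o;
    return out;
}
```

Measuring first keeps the allocation exact, at the cost of scanning the input twice.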

◆ cr_utils_utf8_to_ucs1()

enum CRStatus cr_utils_utf8_to_ucs1 ( const guchar *  a_in,
gulong *  a_in_len,
guchar *  a_out,
gulong *  a_out_len 
)

Converts a UTF-8 buffer into a UCS-1 buffer.

The caller must know the size of the resulting converted buffer and allocate it prior to calling this function.

Parameters
a_in  the input UTF-8 buffer to convert.
a_in_len  in/out parameter. The size of the input UTF-8 buffer. After return, points to the number of bytes consumed by the function, even in case of an encoding error.
a_out  out parameter. Points to the resulting buffer. Must be allocated by the caller. If a_out is smaller than the required size, this function converts what it can and returns a success status.
a_out_len  in/out parameter. The size of the output buffer. After return, points to the number of bytes consumed, even in case of an encoding error.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 995 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, CR_ENCODING_ERROR, and CR_OK.

Referenced by cr_utils_utf8_str_to_ucs1().

◆ cr_utils_utf8_to_ucs4()

enum CRStatus cr_utils_utf8_to_ucs4 ( const guchar *  a_in,
gulong *  a_in_len,
guint32 *  a_out,
gulong *  a_out_len 
)

Converts a UTF-8 buffer into a UCS-4 buffer.

Parameters
a_in  the input UTF-8 buffer to convert.
a_in_len  in/out parameter. The size of the input buffer to convert. After return, this parameter contains the actual number of bytes consumed.
a_out  the converted UCS-4 output buffer. Must be allocated by the caller.
a_out_len  in/out parameter. The size of the output buffer. If this size is smaller than the size actually needed, the function just converts what it can and returns a success status. After return, this parameter points to the actual number of characters decoded.
Returns
CR_OK upon successful completion, an error code otherwise.

Definition at line 270 of file cr-utils.c.

References CR_BAD_PARAM_ERROR, and CR_OK.

Referenced by cr_utils_utf8_str_to_ucs4().