Allegro can manipulate and display text using any character values from 0 right up to 2^32-1 (although the current implementation of the grabber can only create fonts using characters up to 2^16-1). You can choose between a number of different text encoding formats, which control how strings are stored and how Allegro interprets strings that you pass to it. This setting affects all aspects of the system: whenever you see a function that returns a char * type, or that takes a char * as an argument, that text will be in whatever format you have told Allegro to use.
By default, Allegro uses UTF-8 encoded text (U_UTF8). This is a variable-width format, where characters can occupy anywhere from one to four bytes. The nice thing about it is that characters ranging from 0-127 are encoded directly as themselves, so UTF-8 is upwardly compatible with 7-bit ASCII ("Hello, World!" means the same thing regardless of whether you interpret it as ASCII or UTF-8 data). Any character value of 128 or above, such as accented vowels, the UK currency symbol, and Arabic or Chinese characters, will be encoded as a sequence of two or more bytes, each in the range 128-255. This means you will never get what looks like a 7-bit ASCII character as part of the encoding of a different character value, which makes it very easy to manipulate UTF-8 strings.
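The byte layout described above can be sketched in plain C. This is only an illustration of standard UTF-8 encoding, not Allegro's own code: sketch_utf8_encode() is a hypothetical helper name, and it assumes a character value no larger than 0x10FFFF.

```c
/* Hypothetical helper, not part of Allegro: encodes the character value c
 * (assumed to be at most 0x10FFFF) as UTF-8 into out, which must have room
 * for four bytes. Returns the number of bytes written. */
int sketch_utf8_encode(unsigned long c, unsigned char *out)
{
   if (c < 0x80) {
      /* 0-127 are stored as themselves, exactly like 7-bit ASCII */
      out[0] = (unsigned char)c;
      return 1;
   }
   if (c < 0x800) {
      out[0] = (unsigned char)(0xC0 | (c >> 6));
      out[1] = (unsigned char)(0x80 | (c & 0x3F));
      return 2;
   }
   if (c < 0x10000) {
      out[0] = (unsigned char)(0xE0 | (c >> 12));
      out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
      out[2] = (unsigned char)(0x80 | (c & 0x3F));
      return 3;
   }
   out[0] = (unsigned char)(0xF0 | ((c >> 18) & 0x07));
   out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
   out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
   out[3] = (unsigned char)(0x80 | (c & 0x3F));
   return 4;
}
```

Every byte produced for a value of 128 or above has its top bit set, which is why a byte below 128 in a UTF-8 string can always be trusted to be a complete ASCII character.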
There are a few editing programs that understand UTF-8 format text files. Alternatively, you can write your strings in plain ASCII or 16-bit Unicode formats, and then use the Allegro textconv program to convert them into UTF-8.
If you prefer to use some other text format, you can set Allegro to work with normal 8-bit ASCII (U_ASCII), or 16-bit Unicode (U_UNICODE) instead, or you can provide some handler functions to make it support whatever other text encoding you like (for example it would be easy to add support for 32 bit UCS-4 characters, or the Chinese GB-code format).
There is some limited support for alternative 8-bit codepages, via the U_ASCII_CP mode. This is very slow, so you shouldn't use it for serious work, but it can be handy as an easy way to convert text between different codepages. By default the U_ASCII_CP mode is set up to reduce text to a clean 7-bit ASCII format, trying to replace any accented vowels with their simpler equivalents (this is used by the allegro_message() function when it needs to print an error report onto a text mode DOS screen). If you want to work with other codepages, you can do this by passing a character mapping table to the set_ucodepage() function.
Note that you can use the Unicode routines before you call install_allegro() or allegro_init(). If you want to work in a text mode other than UTF-8, it is best to set it with set_uformat() just before you call these.
Although you can change the text format on the fly, this is not a good idea. Many strings, for example the names of your hardware drivers and any language translations, are loaded when you call allegro_init(), so if you change the encoding format after this, they will be in the wrong format, and things will not work properly. Generally you should only call set_uformat() once, before allegro_init(), and then leave it on the same setting for the duration of your program.
   U_ASCII - fixed size, 8-bit ASCII characters
   U_ASCII_CP - alternative 8-bit codepage (see set_ucodepage())
   U_UNICODE - fixed size, 16-bit Unicode characters
   U_UTF8 - variable size, UTF-8 format Unicode characters
See also: get_uformat, register_uformat, set_ucodepage, set_uformat, uconvert, ustrsize, ugetc, ugetx, usetc, uwidth, ucwidth, uisok, uoffset, ugetat, usetat, uinsert, uremove, allegro_init.
Examples using this: exunicod.
   switch (get_uformat()) {
      case U_ASCII:
         do_something();
         break;
      case U_UTF8:
         do_something_else();
         break;
      ...
   }
Return value: Returns the currently selected text encoding format. See the documentation of set_uformat() for a list of encoding formats.
See also: set_uformat.
See also: set_uformat, uconvert, ugetc, ugetx, usetc, uwidth, ucwidth, uisok.
The `table' parameter points to an array of 256 shorts, which contain the Unicode value for each character in your codepage. The `extras' parameter, if not NULL, points to a list of mapping pairs, which will be used when reducing Unicode data to your codepage. Each pair consists of a Unicode value, followed by the way it should be represented in your codepage. The list is terminated by a zero Unicode value. This allows you to create a many->one mapping, where many different Unicode characters can be represented by a single codepage value (eg. for reducing accented vowels to 7-bit ASCII).
Allegro will use the `table' parameter when it needs to convert an ASCII string to a Unicode string. But when Allegro converts a Unicode string to ASCII, it will use both parameters. First, it will loop through the `table' parameter looking for an index position pointing at the Unicode value it is trying to convert (ie. the `table' parameter is also used for reverse matching). If that fails, the `extras' list is used. If that fails too, Allegro gives up on the conversion and stores the character `^' instead.
Note that Allegro comes with default `table' and `extras' parameters set internally. The default `table' will convert 8-bit characters to `^'. The default `extras' list reduces Latin-1 and Extended-A characters to 7 bits in a sensible way (eg. an accented vowel will be reduced to the same vowel without the accent).
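The lookup order described above (reverse match in `table', then the `extras' pairs, then `^') can be sketched as follows. Both function names are hypothetical and this is not Allegro's implementation; the table built here only imitates the described default, where 7-bit values map to themselves and 8-bit values map to `^'.

```c
#include <stddef.h>

/* Hypothetical helper: builds a table like the default one described above,
 * with 7-bit ASCII mapped to itself and every 8-bit entry mapped to '^'. */
void sketch_fill_ascii_table(unsigned short table[256])
{
   int i;

   for (i = 0; i < 256; i++)
      table[i] = (unsigned short)((i < 128) ? i : '^');
}

/* Hypothetical helper: reduces the Unicode value c to a codepage value.
 * extras may be NULL; otherwise it is a list of (Unicode value, codepage
 * value) pairs terminated by a zero Unicode value. */
int sketch_unicode_to_codepage(int c, const unsigned short *table,
                               const unsigned short *extras)
{
   int i;

   /* first try a reverse match against the main table */
   for (i = 0; i < 256; i++)
      if (table[i] == c)
         return i;

   /* then try the many->one mapping pairs */
   if (extras != NULL) {
      for (i = 0; extras[i] != 0; i += 2)
         if (extras[i] == c)
            return extras[i + 1];
   }

   /* nothing matched, so give up */
   return '^';
}
```

With an `extras' list of { 0xE9, 'e', 0xE8, 'e', 0 }, both accented forms of `e' reduce to a plain `e', while a character with no entry anywhere (a Chinese ideograph, say) comes out as `^'.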
See also: set_uformat.
if (need_uconvert(text, U_UTF8, U_CURRENT)) { /* conversion is required */ }
Return value: Returns non-zero if any conversion is required or zero otherwise.
See also: set_uformat, get_uformat, do_uconvert, uconvert.
   length = uconvert_size(old_string, U_CURRENT, U_UNICODE);
   new_string = malloc(length);
   /* ustrcpy() would copy the bytes without converting them */
   do_uconvert(old_string, U_CURRENT, new_string, U_UNICODE, length);
Return value: Returns the number of bytes required to store the string after conversion.
See also: need_uconvert, do_uconvert.
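To give a feel for what such a size calculation involves, here is a minimal sketch for one specific case: UTF-8 input converted to fixed 16-bit characters. It assumes well-formed UTF-8 and that the result includes the terminating null character, as the malloc() usage above suggests. sketch_utf8_to_u16_size() is a hypothetical name, not an Allegro function.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes needed to hold the
 * well-formed UTF-8 string s as 16-bit characters, terminator included. */
size_t sketch_utf8_to_u16_size(const char *s)
{
   size_t chars = 0;

   for (; *s != '\0'; s++) {
      /* continuation bytes look like 10xxxxxx; any other byte starts a
       * new character */
      if (((unsigned char)*s & 0xC0) != 0x80)
         chars++;
   }

   /* two bytes per character, plus two more for the terminating null */
   return (chars + 1) * 2;
}
```

Note that an empty string still needs two bytes here, which is the same point the destination size warnings for the conversion routines make.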
Note that, even for empty strings, your destination string must have at least enough bytes to store the terminating null character of the string, and your parameter `size' must reflect this. Otherwise, the debug version of Allegro will abort at an assertion, and the release version of Allegro will overrun the destination buffer.
   char temp_string[256];
   do_uconvert(input_string, U_CURRENT, temp_string, U_ASCII, 256);
See also: uconvert.
As a convenience, if `buf' is NULL it will convert the string into an internal static buffer and the `size' parameter will be ignored. You should be wary of using this feature, though, because that buffer will be overwritten the next time this routine is called, so don't expect the data to persist across any other library calls. The static buffer may hold less than 1024 characters, so you won't be able to convert large chunks of text. Example:
   char buffer[256];
   char *p = uconvert(input_string, U_CURRENT, buffer, U_ASCII, 256);
Return value: Returns a pointer to `buf' (or the static buffer if you used NULL) if a conversion was performed. Otherwise returns `s' itself, without copying it. In either case, you should use the return value rather than assuming that the string will always be moved to `buf'.
See also: set_uformat, need_uconvert, uconvert, uconvert_ascii, uconvert_toascii, do_uconvert.
See also: uconvert.
Examples using this: exunicod.
See also: uconvert.
int first_unicode_letter = ugetc(text_string);
Return value: Returns the character pointed to by `s' in the current encoding format.
See also: ugetx, usetc, uwidth, ucwidth, uisok.
   char *p = string;
   int first_letter, second_letter, third_letter;
   first_letter = ugetx(&p);
   second_letter = ugetx(&p);
   third_letter = ugetx(&p);
Return value: Returns the character pointed to by `s' in the current encoding format, and advances the pointer to the next character after the one just returned.
See also: ugetc, usetc, uwidth, ucwidth, uisok.
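For readers curious about what "read a character and advance the pointer" means for a variable-width format, the following is a rough sketch of a UTF-8 reader. It is not Allegro's code: sketch_utf8_getx() is a hypothetical name, the input is assumed to be well-formed, and sequences longer than four bytes are not handled.

```c
/* Hypothetical helper: decodes one character from the well-formed UTF-8
 * string at *s, moves *s past it, and returns the character value. */
int sketch_utf8_getx(const char **s)
{
   const unsigned char *p = (const unsigned char *)*s;
   int c = *p++;
   int extra = 0;

   /* the lead byte says how many continuation bytes follow */
   if (c >= 0xF0) {
      c &= 0x07;
      extra = 3;
   }
   else if (c >= 0xE0) {
      c &= 0x0F;
      extra = 2;
   }
   else if (c >= 0xC0) {
      c &= 0x1F;
      extra = 1;
   }

   /* each continuation byte contributes six more bits */
   while (extra-- > 0 && (*p & 0xC0) == 0x80)
      c = (c << 6) | (*p++ & 0x3F);

   *s = (const char *)p;
   return c;
}
```

A caller would normally stop as soon as zero is returned, since the pointer has by then moved past the terminator.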
Return value: Returns the number of bytes written, which is equal to the width of the character in the current encoding format.
See also: ugetc, ugetx, uwidth, ucwidth, uisok.
Return value: Returns the number of bytes occupied by the first character of the specified string, in the current encoding format.
See also: uwidth_max, ugetc, ugetx, usetc, ucwidth, uisok.
Return value: Returns the number of bytes that would be occupied by the specified character value, when encoded in the current format.
See also: uwidth_max, ugetc, ugetx, usetc, uwidth, uisok.
Return value: Returns non-zero if the value can be correctly encoded, zero otherwise.
See also: ugetc, ugetx, usetc, uwidth, ucwidth.
int from_third_letter = uoffset(text_string, 2);
Return value: Returns the offset in bytes to the specified character.
See also: ugetat, usetat, uinsert, uremove.
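The difference between a character index and a byte offset only shows up in variable-width formats. As an illustration, this sketch computes the byte offset of a character in a well-formed UTF-8 string. sketch_utf8_offset() is a hypothetical name; it handles only non-negative indices and stops at the end of the string if the index is too large.

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset in bytes of the character at
 * position index in the well-formed UTF-8 string s, or the offset of the
 * terminator if the string has fewer characters than that. */
size_t sketch_utf8_offset(const char *s, size_t index)
{
   const unsigned char *p = (const unsigned char *)s;

   while (index > 0 && *p != '\0') {
      p++;                            /* step over the lead byte */
      while ((*p & 0xC0) == 0x80)     /* and over any continuation bytes */
         p++;
      index--;
   }

   return (size_t)(p - (const unsigned char *)s);
}
```

In the string "a", e-acute, "z" the third character sits at byte offset 3 rather than 2, because the accented letter occupies two bytes.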
int third_letter = ugetat(text_string, 2);
Return value: Returns the character value at the specified index in the string.
See also: uoffset, usetat, uinsert, uremove.
usetat(text_string, 2, letter_a);
Return value: Returns the number of bytes by which the trailing part of the string was moved. This is of interest only with text encoding formats where characters have a variable length, like UTF-8.
See also: uoffset, ugetat, uinsert, uremove.
uinsert(text_string, 0, prefix_letter);
Return value: Returns the number of bytes by which the trailing part of the string was moved.
See also: uoffset, ugetat, usetat, uremove.
   int length_in_bytes = ustrsizez(text_string);
   ...
   length_in_bytes -= uremove(text_string, -1);
Return value: Returns the number of bytes by which the trailing part of the string was moved.
See also: uoffset, ugetat, usetat, uinsert.
See also: ustrsizez, empty_string.
Examples using this: exunicod.
See also: ustrsize, empty_string.
Examples using this: exunicod.
char *temp_buffer = malloc(256 * uwidth_max(U_UTF8));
See also: uwidth, ucwidth.
See also: utoupper, ugetc, ugetx, usetc, uwidth, ucwidth, uisok.
See also: utolower, ugetc, ugetx, usetc, uwidth, ucwidth, uisok.
   for (counter = 0; counter < ustrlen(text_string); counter++) {
      if (uisspace(ugetat(text_string, counter)))
         usetat(text_string, counter, '_');
   }
See also: uisdigit, ugetc, usetc, uwidth, ucwidth, uisok.
   for (counter = 0; counter < ustrlen(text_string); counter++) {
      if (uisdigit(ugetat(text_string, counter)))
         usetat(text_string, counter, '*');
   }
See also: uisspace, ugetc, usetc, uwidth, ucwidth, uisok.
   void manipulate_string(const char *input_string)
   {
      char *temp_buffer = ustrdup(input_string);
      /* Now we can modify temp_buffer */
      ...
Return value: Returns the newly allocated string. This memory must be freed by the caller. Returns NULL if it cannot allocate space for the duplicated string.
See also: _ustrdup, uconvert, ustrsize, ustrsizez.
Examples using this: exconfig.
See also: ustrdup, uconvert, ustrsize, ustrsizez.
Return value: Returns the value of dest.
See also: uconvert, ustrzcpy, ustrncpy.
Examples using this: exunicod.
Note that, even for empty strings, your destination string must have at least enough bytes to store the terminating null character of the string, and your parameter `size' must reflect this. Otherwise, the debug version of Allegro will abort at an assertion, and the release version of Allegro will overrun the destination buffer.
Return value: Returns the value of `dest'.
See also: uconvert, ustrcpy, ustrzncpy.
Examples using this: ex3buf, exgui.
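The meaning of the `size' parameter in the size-limited routines can be illustrated with a byte-oriented sketch. This is not Allegro's code: sketch_zcpy() is a hypothetical helper that assumes a single-byte encoding, whereas the real routines must also avoid splitting a multibyte character. Where Allegro asserts or overruns on a size too small for the terminator, the sketch simply does nothing.

```c
#include <stddef.h>

/* Hypothetical helper: copies src into dest, writing at most size bytes
 * including the terminating null. Returns dest. */
char *sketch_zcpy(char *dest, size_t size, const char *src)
{
   size_t i = 0;

   if (size == 0)
      return dest;     /* no room even for the terminator, so touch nothing */

   /* always leave one byte free for the terminating null */
   while (i + 1 < size && src[i] != '\0') {
      dest[i] = src[i];
      i++;
   }

   dest[i] = '\0';
   return dest;
}
```

Copying "hello" into a four-byte buffer therefore yields "hel", and a one-byte buffer can only ever hold the empty string.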
Return value: Returns the value of `dest'.
See also: uconvert, ustrzcat, ustrncat.
Examples using this: exunicod.
Note that, even for empty strings, your destination string must have at least enough bytes to store the terminating null character of the string, and your parameter `size' must reflect this. Otherwise, the debug version of Allegro will abort at an assertion, and the release version of Allegro will overrun the destination buffer.
Return value: Returns the value of `dest'.
See also: uconvert, ustrcat, ustrzncat.
Examples using this: exgui.
See also: uconvert, ustrsize, ustrsizez.
Return value: Returns zero if the strings are equal, a positive number if `s1' comes after `s2' in the ASCII collating sequence, else a negative number.
See also: uconvert, ustrsize, ustrsizez, ustrncmp, ustricmp, ustrnicmp.
Note that if `src' is longer than `n' characters, `dest' will not be null-terminated.
Return value: The return value is the value of `dest'.
See also: uconvert, ustrcpy, ustrzncpy.
Note that, even for empty strings, your destination string must have at least enough bytes to store the terminating null character of the string, and your parameter `size' must reflect this. Otherwise, the debug version of Allegro will abort at an assertion, and the release version of Allegro will overrun the destination buffer.
Return value: The return value is the value of `dest'.
See also: uconvert, ustrzcpy, ustrncpy.
Examples using this: exkeys.
Return value: The return value is the value of `dest'.
See also: uconvert, ustrcat, ustrzncat.
Return value: The return value is the value of `dest'.
See also: uconvert, ustrzcat, ustrncat.
if (ustrncmp(prefix, long_string, ustrlen(prefix)) == 0) { /* long_string starts with prefix */ }
Return value: Returns zero if the substrings are equal, a positive number if `s1' comes after `s2' in the ASCII collating sequence, else a negative number.
See also: uconvert, ustrsize, ustrsizez, ustrcmp, ustricmp, ustrnicmp.
if (ustricmp(string, user_input) == 0) { /* string and user_input are equal (ignoring case) */ }
Return value: Returns zero if the strings are equal, a positive number if `s1' comes after `s2' in the ASCII collating sequence, else a negative number.
See also: uconvert, ustrsize, ustrsizez, ustrnicmp, ustrcmp, ustrncmp.
Examples using this: exconfig.
if (ustrnicmp(prefix, long_string, ustrlen(prefix)) == 0) { /* long_string starts with prefix (ignoring case) */ }
Return value: Returns zero if the substrings are equal, a positive number if `s1' comes after `s2' in the ASCII collating sequence, else a negative number.
See also: uconvert, ustrsize, ustrsizez, ustricmp, ustrcmp, ustrncmp.
char buffer[] = "UPPER CASE STRING"; allegro_message(ustrlwr(buffer));
Return value: The return value is the value of `s'.
See also: uconvert, utolower, ustrupr.
char buffer[] = "lower case string"; allegro_message(ustrupr(buffer));
Return value: The return value is the value of `s'.
See also: uconvert, utolower, ustrlwr.
char *p = ustrchr("one,two,three,four", ',');
Return value: Returns a pointer to the first occurrence of `c' in `s', or NULL if no match was found. Note that if `c' is the null character (zero), this will return a pointer to the end of the string.
See also: uconvert, ustrrchr, ustrstr, ustrpbrk, ustrtok.
char *p = ustrrchr("one,two,three,four", ',');
Return value: Returns a pointer to the last occurrence of `c' in `s', or NULL if no match was found.
See also: uconvert, ustrchr, ustrstr, ustrpbrk, ustrtok.
char *p = ustrstr("hello world", "world");
Return value: Returns a pointer within `s1', or NULL if `s2' wasn't found.
See also: uconvert, ustrchr, ustrrchr, ustrpbrk, ustrtok.
char *p = ustrpbrk("one,two-three.four", "-. ");
Return value: Returns a pointer to the first match, or NULL if none are found.
See also: uconvert, ustrchr, ustrrchr, ustrstr, ustrtok.
   char *word;
   char string[]="some-words with dashes";
   char *temp = ustrdup(string);
   word = ustrtok(temp, " -");
   while (word) {
      allegro_message("Found `%s'\n", word);
      word = ustrtok(NULL, " -");
   }
   free(temp);
Return value: Returns a pointer to the token, or NULL if no more are found.
See also: uconvert, ustrchr, ustrrchr, ustrstr, ustrpbrk, ustrtok_r, allegro_message, ustrncpy.
Examples using this: exgui.
   char *word, *last;
   char string[]="some-words with dashes";
   char *temp = ustrdup(string);
   /* tokenize the duplicate, leaving the original string untouched */
   word = ustrtok_r(temp, " -", &last);
   while (word) {
      allegro_message("Found `%s'\n", word);
      word = ustrtok_r(NULL, " -", &last);
   }
   free(temp);
Return value: Returns a pointer to the token, or NULL if no more are found. You can free the memory pointed to by `last' once NULL is returned.
See also: ustrtok.
Return value: Returns the equivalent value, or zero if the string does not represent a number.
See also: uconvert, ustrtol, ustrtod.
   char *endp, *string = "456.203 askdfg";
   long number = ustrtol(string, &endp, 10);
Return value: Returns the string converted as a value of type `long int'. If nothing was converted, returns zero with `*endp' pointing to the beginning of `s'.
See also: uconvert, ustrtod, uatof.
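The `endp' behaviour described above matches that of the standard C strtol(), so it can be demonstrated without Allegro. In the following sketch, sketch_parse_long() is a hypothetical wrapper that reports how much of the string was consumed. It says nothing about how Allegro implements the Unicode-aware version.

```c
#include <stdlib.h>

/* Hypothetical helper: parses a base 10 number from the start of s, stores
 * it in *value, and returns how many characters were consumed. A return of
 * zero means nothing was converted. */
size_t sketch_parse_long(const char *s, long *value)
{
   char *end;

   *value = strtol(s, &end, 10);

   /* end points at the first character that was not part of the number */
   return (size_t)(end - s);
}
```

For the string "456.203 askdfg" this yields the value 456 with three characters consumed, because an integer conversion stops at the decimal point. Comparing the end pointer with the start of the string is the only reliable way to tell a genuine zero from a failed conversion.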
char *endp, *string = "456.203 askdfg"; double number = ustrtod(string, &endp);
Return value: Returns the string converted as a value of type `double'. If nothing was converted, returns zero with `*endp' pointing to the beginning of `s'.
See also: uconvert, ustrtol, uatof.
   PACKFILE *input_file = pack_fopen("badname", "r");
   if (input_file == NULL)
      allegro_message("%s\nSorry!\n", ustrerror(errno));
Return value: Returns a pointer to a static string that should not be modified or freed. If you make subsequent calls to ustrerror(), the string will be overwritten.
See also: uconvert, allegro_error.
Return value: Returns the number of characters written, not including the terminating null character.
See also: uconvert, uszprintf, uvsprintf.
Examples using this: exkeys.
   char buffer[10];
   int player_score;
   ...
   uszprintf(buffer, sizeof(buffer), "Your score is: %d", player_score);
Return value: Returns the number of characters that would have been written had no truncation occurred (as with usprintf), not including the terminating null character.
See also: uconvert, usprintf, uvszprintf.
Examples using this: exgui.
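Because the return value counts the characters that would have been written, it can be used to detect truncation. The standard C snprintf() follows the same convention for single-byte text, so the idea can be sketched without Allegro. sketch_score_was_truncated() is a hypothetical helper, and snprintf() counts bytes where the Allegro routine counts characters.

```c
#include <stdio.h>

/* Hypothetical helper: formats the score message into buf (size bytes) and
 * returns non-zero if the text did not fit and had to be truncated. */
int sketch_score_was_truncated(char *buf, size_t size, int score)
{
   /* snprintf reports the length the full text would have had */
   int needed = snprintf(buf, size, "Your score is: %d", score);

   return (needed < 0) || ((size_t)needed >= size);
}
```

With a ten-byte buffer, as in the example above, only "Your scor" fits, so the function reports truncation. A 32-byte buffer holds the whole message.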
Return value: Returns the number of characters written, not including the terminating null character.
See also: uconvert, usprintf, uvszprintf.
   #include <stdarg.h>

   void log_message(const char *format, ...)
   {
      char buffer[100];
      va_list parameters;

      va_start(parameters, format);
      uvszprintf(buffer, sizeof(buffer), format, parameters);
      va_end(parameters);

      append_buffer_to_logfile(log_name, buffer);
      send_buffer_to_other_networked_players(multicast_ip, buffer);
      and_also_print_it_on_the_screen(cool_font, buffer);
   }

   void some_other_function(void)
   {
      log_message("Hello %s, are you %d years old?\n", "Dave", 25);
   }
Return value: Returns the number of characters that would have been written had no truncation occurred (as with uvsprintf), not including the terminating null character.
See also: uconvert, uszprintf, uvsprintf.