Museum

Home

Lab Overview

Retrotechnology Articles

Online Manuals

⇒ (2) — Plan9 4th Edition

Media Vault

Software Library

Restoration Projects

Artifacts Sought

Related Articles

utf(6)

tcs(1)

RUNE(2)

NAME

runetochar, chartorune, runelen, runenlen, fullrune, utfecpy, utflen, utfnlen, utfrune, utfrrune, utfutf − rune/UTF conversion

SYNOPSIS

­#include <u.h>
­#include <libc.h>

intrunetochar(char ∗s, Rune ∗r)

intchartorune(Rune ∗r, char ∗s)

intrunelen(long r)

intrunenlen(Rune ∗r, int n)

intfullrune(char ∗s, int n)

char∗utfecpy(char ∗s1, char ∗es1, char ∗s2)

intutflen(char ∗s)

intutfnlen(char ∗s, long n)

char∗utfrune(char ∗s, long c)

char∗utfrrune(char ∗s, long c)

char∗utfutf(char ∗s1, char ∗s2)

DESCRIPTION

These routines convert to and from a UTF byte stream and runes. 

­Runetochar copies one rune at ­r to at most ­UTFmax bytes starting at ­s and returns the number of bytes copied.  UTFmax, defined as ­3 in <libc.h>, is the maximum number of bytes required to represent a rune. 

­Chartorune copies at most ­UTFmax bytes starting at ­s to one rune at ­r and returns the number of bytes copied.  If the input is not exactly in UTF format, ­chartorune will convert to 0x80 and return 1. 

­Runelen returns the number of bytes required to convert ­r into UTF. 

­Runenlen returns the number of bytes required to convert the ­n runes pointed to by ­r into UTF. 

­Fullrune returns 1 if the string ­s of length ­n is long enough to be decoded by ­chartorune and 0 otherwise.  This does not guarantee that the string contains a legal UTF encoding.  This routine is used by programs that obtain input a byte at a time and need to know when a full rune has arrived. 

The following routines are analogous to the corresponding string routines with ­utf substituted for ­str and ­rune substituted for chr. 

­Utfecpy copies UTF sequences until a null sequence has been copied, but writes no sequences beyond es1. If any sequences are copied, ­s1 is terminated by a null sequence, and a pointer to that sequence is returned.  Otherwise, the original ­s1 is returned. 

­Utflen returns the number of runes that are represented by the UTF string s.

­Utfnlen returns the number of complete runes that are represented by the first ­n bytes of UTF string s. If the last few bytes of the string contain an incompletely coded rune, ­utfnlen will not count them; in this way, it differs from utflen, which includes every byte of the string.

­Utfrune (utfrrune) returns a pointer to the first (last) occurrence of rune ­c in the UTF string s, or 0 if ­c does not occur in the string.  The NUL byte terminating a string is considered to be part of the string s.

­Utfutf returns a pointer to the first occurrence of the UTF string ­s2 as a UTF substring of s1, or 0 if there is none. If ­s2 is the null string, ­utfutf returns s1.

SOURCE

­/sys/src/libc/port/rune.c
­/sys/src/libc/port/utfrune.c

SEE ALSO

utf(6), tcs(1)

Plan 9  —  March 02, 2002

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026