unix/fiss

lib/libutf/rune.3 in master

.deEX
.ift .ft5
.nf
..
.deEE
.ft1
.fi
..
.TH RUNE 3
.SH NAME
runetochar, chartorune, runelen, runenlen, fullrune, utfecpy, utflen, utfnlen, utfrune, utfrrune, utfutf \- rune/UTF conversion
.SH SYNOPSIS
.ta \w'\fLchar*xx'u
.B #include <utf.h>
.PP
.B
int	runetochar(char *s, Rune *r)
.PP
.B
int	chartorune(Rune *r, char *s)
.PP
.B
int	runelen(long r)
.PP
.B
int	runenlen(Rune *r, int n)
.PP
.B
int	fullrune(char *s, int n)
.PP
.B
char*	utfecpy(char *s1, char *es1, char *s2)
.PP
.B
int	utflen(char *s)
.PP
.B
int	utfnlen(char *s, long n)
.PP
.B
char*	utfrune(char *s, long c)
.PP
.B
char*	utfrrune(char *s, long c)
.PP
.B
char*	utfutf(char *s1, char *s2)
.SH DESCRIPTION
These routines convert between a
.SM UTF
byte stream and runes.
.PP
.I Runetochar
copies one rune at
.I r
to at most
.B UTFmax
bytes starting at
.I s
and returns the number of bytes copied.
.BR UTFmax ,
defined as
.B 3
in
.BR <utf.h> ,
is the maximum number of bytes required to represent a rune.
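.PP
The encoding step can be pictured with the following minimal sketch.
It is not the library source: the name sketch_runetochar, the local Rune
typedef, and the UTFmax value of 3 (taken from this page) are assumptions
made for illustration, and runes above 0xFFFF are not handled.
.EX
```c
/* Assumed stand-ins for the real definitions in <utf.h>. */
typedef unsigned int Rune;
enum { UTFmax = 3 };	/* value stated on this page */

/*
 * Hypothetical runetochar-style encoder.
 * Writes the UTF form of *r at s and returns the byte count (1..UTFmax).
 * Assumes *r <= 0xFFFF, so three bytes always suffice.
 */
int
sketch_runetochar(char *s, const Rune *r)
{
	Rune c = *r;

	if (c < 0x80) {			/* 0xxxxxxx */
		s[0] = (char)c;
		return 1;
	}
	if (c < 0x800) {		/* 110xxxxx 10xxxxxx */
		s[0] = (char)(0xC0 | (c >> 6));
		s[1] = (char)(0x80 | (c & 0x3F));
		return 2;
	}
	/* 1110xxxx 10xxxxxx 10xxxxxx */
	s[0] = (char)(0xE0 | ((c >> 12) & 0x0F));
	s[1] = (char)(0x80 | ((c >> 6) & 0x3F));
	s[2] = (char)(0x80 | (c & 0x3F));
	return 3;
}
```
.EE
.PP
A caller would supply a buffer of at least UTFmax bytes; the sketch, like
the routine it imitates, does not append a terminating NUL.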
.PP
.I Chartorune
copies at most
.B UTFmax
bytes starting at
.I s
to one rune at
.I r
and returns the number of bytes copied.
If the input is not exactly in
.SM UTF
format,
.I chartorune
will convert to 0x80 and return 1.
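.PP
The decoding behavior described above, including the error case, might look
roughly like this sketch.
The name sketch_chartorune and the local Rune typedef are hypothetical;
the error value 0x80 is the one given on this page, and the input is
assumed to be NUL-terminated and limited to sequences of three bytes.
.EX
```c
typedef unsigned int Rune;	/* assumed stand-in for the typedef in <utf.h> */
enum { Bad = 0x80 };		/* error value described on this page */

/*
 * Hypothetical chartorune-style decoder.
 * Stores one rune at r and returns the number of bytes consumed.
 * Malformed or overlong input yields Bad and a return value of 1.
 */
int
sketch_chartorune(Rune *r, const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	Rune c;

	if (p[0] < 0x80) {			/* single byte */
		*r = p[0];
		return 1;
	}
	if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
		c = ((Rune)(p[0] & 0x1F) << 6) | (Rune)(p[1] & 0x3F);
		if (c >= 0x80) {		/* reject overlong forms */
			*r = c;
			return 2;
		}
	} else if ((p[0] & 0xF0) == 0xE0 && (p[1] & 0xC0) == 0x80
	    && (p[2] & 0xC0) == 0x80) {
		c = ((Rune)(p[0] & 0x0F) << 12)
		    | ((Rune)(p[1] & 0x3F) << 6)
		    | (Rune)(p[2] & 0x3F);
		if (c >= 0x800) {		/* reject overlong forms */
			*r = c;
			return 3;
		}
	}
	*r = Bad;				/* not well-formed UTF */
	return 1;
}
```
.EE
.PP
Because the error path still consumes one byte, a loop that advances by the
return value always makes progress through damaged input.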
.PP
.I Runelen
returns the number of bytes
required to convert
.I r
into
.SM UTF.
.PP
.I Runenlen
returns the number of bytes
required to convert the
.I n
runes pointed to by
.I r
into
.SM UTF.
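.PP
Both length routines reduce to a range check on the rune value.
The sketch below assumes the three-byte limit stated on this page; the
names sketch_runelen and sketch_runenlen and the local Rune typedef are
hypothetical.
.EX
```c
typedef unsigned int Rune;	/* assumed stand-in for the typedef in <utf.h> */

/* Bytes needed to encode one rune, assuming r <= 0xFFFF. */
int
sketch_runelen(long r)
{
	if (r < 0x80)
		return 1;
	if (r < 0x800)
		return 2;
	return 3;
}

/* Total bytes needed to encode the n runes starting at r. */
int
sketch_runenlen(const Rune *r, int n)
{
	int total = 0;

	while (n-- > 0)
		total += sketch_runelen((long)*r++);
	return total;
}
```
.EE
.PP
A result from the second function can be used to size a buffer before
encoding; one more byte is needed if the caller wants a terminating NUL.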
.PP
.I Fullrune
returns 1 if the string
.I s
of length
.I n
is long enough to be decoded by
.I chartorune
and 0 otherwise.
This does not guarantee that the string
contains a legal
.SM UTF
encoding.
This routine is used by programs that
obtain input a byte at
a time and need to know when a full rune
has arrived.
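.PP
A length-only check of this kind can be sketched as follows.
The name sketch_fullrune is hypothetical, and the sketch assumes sequences
of at most three bytes, as stated on this page.
It inspects only the first byte and the count, which is why a result of 1
says nothing about whether the bytes are well formed.
.EX
```c
/*
 * Hypothetical fullrune-style check.
 * Returns 1 if the first n bytes of s are enough for a decoder to
 * consume one rune (or report an error), and 0 otherwise.
 */
int
sketch_fullrune(const char *s, int n)
{
	unsigned char c;

	if (n <= 0)
		return 0;
	c = (unsigned char)s[0];
	if (c < 0x80)		/* single-byte rune */
		return 1;
	if (c < 0xE0)		/* two-byte lead, or a stray byte */
		return n >= 2;
	return n >= 3;		/* three-byte lead */
}
```
.EE
.PP
A reader that receives one byte at a time would append each byte to a small
buffer and call the decoder only once this check returns 1.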
.PP
The following routines are analogous to the
corresponding string routines with
.B utf
substituted for
.B str
and
.B rune
substituted for
.BR chr .
.PP
.I Utfecpy
copies UTF sequences until a null sequence has been copied, but writes no
sequences beyond
.IR es1 .
If any sequences are copied,
.I s1
is terminated by a null sequence, and a pointer to that sequence is returned.
Otherwise, the original
.I s1
is returned.
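.PP
One way to meet this contract is sketched below.
It is an illustration, not the library source: the name sketch_utfecpy is
hypothetical, the source is assumed to be well-formed UTF, and the exact
truncation point chosen by the real routine may differ.
The sketch never splits a multi-byte sequence when the destination is too
small.
.EX
```c
#include <string.h>

/*
 * Hypothetical utfecpy-style bounded copy.
 * Copies s2 into [s1, es1), always NUL-terminating when there is room
 * for at least one byte, and returns a pointer to that NUL.
 * Returns s1 unchanged when the destination is empty.
 */
char *
sketch_utfecpy(char *s1, char *es1, const char *s2)
{
	size_t room, len, n;

	if (s1 >= es1)
		return s1;
	room = (size_t)(es1 - s1) - 1;	/* bytes available before the NUL */
	len = strlen(s2);
	n = len;
	if (n > room) {
		n = room;
		/* s2[n] is the first byte left out; if it is a
		 * continuation byte, back up to a sequence boundary. */
		while (n > 0 && ((unsigned char)s2[n] & 0xC0) == 0x80)
			n--;
	}
	memcpy(s1, s2, n);
	s1[n] = 0;
	return s1 + n;
}
```
.EE
.PP
Returning the address of the terminator lets a caller chain several copies
into one buffer by passing the previous result as the next destination.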
.PP
.I Utflen
returns the number of runes that
are represented by the
.SM UTF
string
.IR s .
.PP
.I Utfnlen
returns the number of complete runes that
are represented by the first
.I n
bytes of
.SM UTF
string
.IR s .
If the last few bytes of the string contain an incompletely coded rune,
.I utfnlen
will not count them; in this way, it differs from
.IR utflen ,
which includes every byte of the string.
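.PP
The difference between the two counts can be seen in a small sketch.
The names sketch_utflen, sketch_utfnlen, and seqlen are hypothetical, and
the code assumes well-formed input with sequences of at most three bytes;
the real routines also have to cope with malformed bytes.
.EX
```c
/* Length implied by a lead byte, assuming well-formed input. */
static int
seqlen(unsigned char c)
{
	if (c < 0x80)
		return 1;
	if (c < 0xE0)
		return 2;
	return 3;
}

/* Hypothetical utflen: count runes up to the terminating NUL. */
int
sketch_utflen(const char *s)
{
	int count = 0;

	while (*s != 0) {
		s += seqlen((unsigned char)*s);
		count++;
	}
	return count;
}

/* Hypothetical utfnlen: count runes wholly inside the first n bytes. */
int
sketch_utfnlen(const char *s, long n)
{
	int count = 0;

	while (n > 0 && *s != 0) {
		int len = seqlen((unsigned char)*s);

		if (len > n)	/* rune is cut off by the byte limit */
			break;
		s += len;
		n -= len;
		count++;
	}
	return count;
}
```
.EE
.PP
With a limit that falls in the middle of a sequence, the bounded count
stops at the last complete rune rather than rounding up.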
.PP
.I Utfrune
.RI ( utfrrune )
returns a pointer to the first (last)
occurrence of rune
.I c
in the
.SM UTF
string
.IR s ,
or 0 if
.I c
does not occur in the string.
The NUL byte terminating a string is considered to
be part of the string
.IR s .
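.PP
A forward search of this kind can be sketched on top of the standard C
string routines.
The name sketch_utfrune is hypothetical, and the sketch assumes well-formed
input and runes no larger than 0xFFFF.
It relies on the fact that in well-formed UTF the encoding of one rune
never appears inside the encoding of another, so a byte search for the
encoded rune is sufficient.
.EX
```c
#include <string.h>

/*
 * Hypothetical utfrune-style search.
 * Returns a pointer to the first occurrence of rune c in s, or 0.
 * As with strchr, searching for 0 finds the terminating NUL.
 */
char *
sketch_utfrune(char *s, long c)
{
	char enc[4];

	if (c < 0x80)			/* single byte: plain strchr */
		return strchr(s, (int)c);
	if (c < 0x800) {		/* two-byte sequence */
		enc[0] = (char)(0xC0 | (c >> 6));
		enc[1] = (char)(0x80 | (c & 0x3F));
		enc[2] = 0;
	} else {			/* three-byte sequence */
		enc[0] = (char)(0xE0 | ((c >> 12) & 0x0F));
		enc[1] = (char)(0x80 | ((c >> 6) & 0x3F));
		enc[2] = (char)(0x80 | (c & 0x3F));
		enc[3] = 0;
	}
	return strstr(s, enc);
}
```
.EE
.PP
A reverse search in the style of utfrrune could reuse the same encoding
step and keep the last match instead of the first.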
.PP
.I Utfutf
returns a pointer to the first occurrence of
the
.SM UTF
string
.I s2
as a
.SM UTF
substring of
.IR s1 ,
or 0 if there is none.
If
.I s2
is the null string,
.I utfutf
returns
.IR s1 .
.SH SOURCE
.B https://9fans.github.io/plan9port/unix
.SH SEE ALSO
.IR utf (7),
.IR tcs (1)