Download Transcoder source tarball v0.0
(transcoder transcoder) is a Guile library that converts between
character sets. It follows the R6RS idea of transcoders quite
closely. The following functions work as described in R6RS:
make-transcoder
native-transcoder
transcoder-codec
transcoder-eol-style
transcoder-error-handling-mode
latin-1-codec
utf-8-codec
utf-16-codec
eol-style
native-eol-style
error-handling-mode
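As a minimal sketch, assuming these procedures mirror their R6RS counterparts as stated above, a transcoder can be built with an explicit end-of-line style and error-handling mode and then inspected with the accessors. The choices lf and replace are only illustrative.

```scheme
(use-modules (transcoder transcoder))

;; Build a UTF-8 transcoder with explicit options, using the R6RS
;; argument order: codec, eol-style, error-handling-mode.
;; Assumption: the library accepts the same optional arguments as R6RS.
(define tc
  (make-transcoder (utf-8-codec)
                   (eol-style lf)
                   (error-handling-mode replace)))

;; The accessors recover the settings the transcoder was built with.
(write (transcoder-eol-style tc))
(newline)
(write (transcoder-error-handling-mode tc))
(newline)
```

Calling make-transcoder with only a codec, as the later examples do, leaves the remaining settings at their defaults.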
There are four additional functions.
u32vector->locale-string vector transcoder
locale-string->u32vector string transcoder
Given TRANSCODER, a transcoder object as returned by the procedure
make-transcoder, u32vector->locale-string converts a vector of
codepoints into an encoded string, and locale-string->u32vector
converts an encoded string into a vector of codepoints.
read-codepoint port transcoder
write-codepoint codepoint port transcoder
Given an open PORT and a TRANSCODER object as returned by
make-transcoder, read-codepoint reads one codepoint from the port
and returns it as an integer, and write-codepoint writes the
integer CODEPOINT to the port.
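The sketch below shows one way write-codepoint might be combined with a string port to encode a list of codepoints. The helper name codepoints->encoded-string is hypothetical and not part of the library; it only uses the procedures described above plus Guile's standard string ports.

```scheme
(use-modules (transcoder transcoder))

(define tc (make-transcoder (utf-8-codec)))

;; Hypothetical helper (not provided by the library): encode a list
;; of integer codepoints by writing each one to a string port
;; through the transcoder, then return the accumulated string.
(define (codepoints->encoded-string codepoints)
  (let ((port (open-output-string)))
    (for-each (lambda (cp) (write-codepoint cp port tc))
              codepoints)
    (get-output-string port)))

;; 72, 105 and 33 are the ASCII codepoints for H, i and !
(display (codepoints->encoded-string '(72 105 33)))
(newline)
```

A string port is used here only because it makes the result easy to inspect; any open output port should serve the same purpose.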
Here is an example that converts between encoded strings and codepoints:
(use-modules (transcoder transcoder))
(setlocale LC_ALL "")
(define tc (make-transcoder (utf-8-codec)))

(write (locale-string->u32vector "a b c" tc))
(newline)
;; Returns #u32(97 32 98 32 99)

(write (locale-string->u32vector "á ñ ö" tc))
(newline)
;; Returns #u32(225 32 241 32 246)

;; Here are variations on the letter A
(display (u32vector->locale-string #u32(#xC1 #xE1 #x103 #x1ce #xc2) tc))
(newline)
;; Returns ÁáăǎÂ
Here is an example that scans a file for non-ASCII characters:
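(use-modules (transcoder transcoder))

(define tc (make-transcoder (utf-8-codec)))
(define iport (open-file "demo.scm" "r"))
(define oport (open-output-string))

;; Copy every codepoint above the ASCII range (0-127) to OPORT,
;; then print the collected characters at end of file.
(let loop ((cp (read-codepoint iport tc)))
  (if (not (eof-object? cp))
      (begin
        (if (> cp 127)
            (write-codepoint cp oport tc))
        (loop (read-codepoint iport tc)))
      (display (get-output-string oport))))
(newline)
(close-port iport)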