On Wed, 16 Jun 2004 sam+dillo@chaosring.org wrote:
Jorge, in a message with the subject "UTF-8", sent on June 3rd, wondered why someone would encode an English-language webpage with UTF-8.

[ ... reasons snipped ... ]

Now, obviously, considering Dillo's target audience, it doesn't need full Unicode support. But it would be nice if it could handle basic UTF-8 encoded Latin-1 characters in web pages [1].

[...snip...]

[1] Some hacky code I once wrote, that reads UTF-8 from standard input, and outputs ISO 8859-1:
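    if (c < 128) {
        printf("%c", c);
    } else if (c < 0xe0) {
        /* two-byte sequence: 110xxxxx 10yyyyyy */
        v = c & 0x1f;
        v <<= 6;
        c = getc(stdin);
        v = v + (c & 0x3f);
        printf("%c", v);
    } else {
        /* longer sequence, not representable in ISO 8859-1:
           skip its continuation bytes (10xxxxxx).
           Note the parentheses: == binds tighter than &. */
        c = getc(stdin);
        while (c != EOF && (c & 0xc0) == 0x80) {
            c = getc(stdin);
        }
        if (c != EOF) {
            ungetc(c, stdin); /* first byte of the next character */
        }
    }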
(This code, FWIW, is public domain)
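For anyone who wants to try the idea without wiring it to stdin, here is a minimal self-contained sketch of the same approach working on a buffer. The function name utf8_to_latin1 is hypothetical, and replacing anything outside Latin-1 with '?' is my assumption (the snippet above silently drops such characters instead):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Dillo.
 * Converts the UTF-8 bytes in[0..n) to ISO 8859-1 in out[].
 * out must have room for at least n bytes.
 * Code points above U+00FF and malformed bytes become '?' (an assumption).
 * Returns the number of bytes written. */
size_t utf8_to_latin1(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);
    while (i < n) {
        unsigned char c = in[i++];

        if (c < 0x80) {
            /* plain ASCII, copied through unchanged */
            out[o++] = c;
        } else if ((c == 0xc2 || c == 0xc3) && i < n &&
                   (in[i] & 0xc0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            out[o++] = (unsigned char)(((c & 0x1f) << 6) | (in[i++] & 0x3f));
        } else {
            /* longer or malformed sequence: skip any continuation
               bytes (10xxxxxx) and emit a single placeholder */
            while (i < n && (in[i] & 0xc0) == 0x80) {
                i++;
            }
            out[o++] = '?';
        }
    }
    return o;
}
```

Every output byte consumes at least one input byte, so the output never grows beyond the input length.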
It is probably better to simply use iconv.
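A rough sketch of what the iconv(3) route could look like, assuming a system that provides <iconv.h> and accepts the encoding names "UTF-8" and "ISO-8859-1". The wrapper name utf8_to_latin1_iconv is hypothetical:

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper around iconv(3), not part of Dillo.
 * Converts in[0..in_len) from UTF-8 to ISO 8859-1 into out[0..out_cap).
 * Returns the number of bytes written, or (size_t)-1 on failure. */
size_t utf8_to_latin1_iconv(const char *in, size_t in_len,
                            char *out, size_t out_cap)
{
    iconv_t cd;
    char *src = (char *)in; /* iconv() wants char **, hence the cast */
    char *dst = out;
    size_t in_left = in_len, out_left = out_cap, rc;

    assert(in != NULL && out != NULL);
    cd = iconv_open("ISO-8859-1", "UTF-8"); /* order is (tocode, fromcode) */
    if (cd == (iconv_t)-1) {
        return (size_t)-1; /* conversion not supported here */
    }
    rc = iconv(cd, &src, &in_left, &dst, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        return (size_t)-1; /* e.g. invalid input or output buffer too small */
    }
    return out_cap - out_left;
}
```

How characters with no Latin-1 equivalent are handled differs between iconv implementations, so a caller should not rely on any particular behaviour there.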
Just as a note, I find no iconv(1) or iconv(3) on my (old!) system, so ./configure would have to be able to make some decisions if things went that way...

--
-- David McKee
-- dmckee@jlab.org
-- (757) 269-7492 (Office)