Hi!
Before I start digging into the specs, someone may be able to answer this question faster: should stripping spaces from attribute values be done before or after the conversion of entities? E.g., if I have something like
<td align="char" char="&#32;">
is the value of the "char" attribute " " (one space) or "" (empty)? Dillo does the second, so it is not possible to align characters at spaces.
BTW, does someone know a good and simple way to retrieve information about locales other than the one the user has specified via $LC_...? I need this for the default value of the "char" attribute, which depends on the language of the document (or of a part of its content).
Sebastian
* Sebastian Geerken <s.geerken@ping.de> [2003-01-02 17:53] : [...]
BTW, does someone know a good and simple way to retrieve information about locales other than the one the user has specified via $LC_...? I need this for the default value of the "char" attribute, which depends on the language of the document (or of a part of its content).
Probably use the nl_langinfo C function with the item CODESET. From the manpage:
nl_langinfo - query language and locale information
CODESET (LC_CTYPE)
Return a string with the name of the character encoding used in the selected locale, such as "UTF-8", "ISO-8859-1", or "ANSI_X3.4-1968" (better known as US-ASCII). This is the same string that you get with "locale charmap". For a list of character encoding names, try "locale -m", cf. locale(1).
Fred
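To illustrate the suggestion above, here is a minimal sketch of querying CODESET. The helper name user_codeset is made up for this example; setlocale() and nl_langinfo() are the standard C/POSIX calls.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: name of the character encoding of the user's
 * locale, e.g. "UTF-8". The returned string belongs to the C library
 * and may be overwritten by later nl_langinfo() or setlocale() calls. */
const char *user_codeset(void)
{
    /* Adopt the locale from the environment ($LC_ALL, $LC_CTYPE, $LANG).
     * Without this call the program stays in the default "C" locale. */
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

Note that this only reports the encoding of the user's own locale, not anything about the language of a document.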
Hi, On Thu, Jan 02, Frederic Bothamy wrote:
* Sebastian Geerken <s.geerken@ping.de> [2003-01-02 17:53] : [...]
BTW, does someone know a good and simple way to retrieve information about locales other than the one the user has specified via $LC_...? I need this for the default value of the "char" attribute, which depends on the language of the document (or of a part of its content).
Probably use the nl_langinfo C function with the item CODESET. From the manpage: [...]
What I need is rather a function
char *nl_langinfo(nl_item *item, char *locale);
since the locale is set in the HTML document, not by the user (e.g. if a German user reads an English web page). Perhaps a temporary setlocale() may work. (Currently, neither localeconv nor nl_langinfo works at all; perhaps I am missing some files.)
Anyway, thanks for the answer.
Sebastian
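The "temporary setlocale()" idea could look roughly like the sketch below. The function name radix_for_locale and the "." fallback are assumptions made for this example; RADIXCHAR is the standard nl_langinfo() item for the decimal-point string.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the decimal-point string of the locale
 * `locale_name` (e.g. "de_DE.UTF-8") into buf. Returns 0 on success and
 * -1 otherwise; on failure buf holds the fallback "." whenever it has
 * room for it (size >= 2). */
int radix_for_locale(const char *locale_name, char *buf, size_t size)
{
    const char *current;
    char *saved;
    int rc = -1;

    if (buf == NULL || size < 2)
        return -1;
    strcpy(buf, ".");                     /* assumed fallback */

    /* The string returned by setlocale(cat, NULL) may be overwritten by
     * the next setlocale() call, so keep a private copy for restoring. */
    current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL)
        return -1;
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    /* Switch only LC_NUMERIC, query, then switch back. */
    if (setlocale(LC_NUMERIC, locale_name) != NULL) {
        const char *radix = nl_langinfo(RADIXCHAR);
        if (radix != NULL && radix[0] != '\0' && strlen(radix) < size) {
            strcpy(buf, radix);
            rc = 0;
        }
    }
    setlocale(LC_NUMERIC, saved);
    free(saved);
    return rc;
}
```

Two caveats: setlocale() changes process-wide state, so this is not safe if other threads format numbers at the same time, and it only works for locales that are actually installed on the system. Later POSIX revisions added nl_langinfo_l(), which takes an explicit locale object and is close to the function asked for here.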
On Thu, 2003-01-02 at 19:31, Sebastian Geerken wrote:
What I need is rather a function
char *nl_langinfo(nl_item *item, char *locale);
since the locale is set in the HTML document, not by the user (e.g. if a German user reads an English web page). Perhaps a temporary setlocale() may work.
Surely the HTML document gives you the encoding directly, not a locale string? Can you give a slightly more concrete description of what it is you're trying to do? p.
On Thu, Jan 02, Philip Blundell wrote:
On Thu, 2003-01-02 at 19:31, Sebastian Geerken wrote:
What I need is rather a function
char *nl_langinfo(nl_item *item, char *locale);
since the locale is set in the HTML document, not by the user (e.g. if a German user reads an English web page). Perhaps a temporary setlocale() may work.
Surely the HTML document gives you the encoding directly, not a locale string? Can you give a slightly more concrete description of what it is you're trying to do?
I'm working on character alignment, and the attribute CHAR specifies the character at which the columns are aligned. If it is not specified, the language-dependent character for the decimal point must be used. This is not covered by the encoding, since two languages using the same encoding may differ in this.
Sebastian
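As a rough illustration of what character alignment involves, the sketch below computes how far each cell of a column must be shifted so that the alignment character lands in the same position. It is not Dillo's code: the names align_offset and align_column are invented, offsets are counted in bytes of single-byte text rather than in rendered pixel widths, and a cell without the alignment character is assumed to end at the alignment position.

```c
#include <stddef.h>
#include <string.h>

/* Number of bytes before the first occurrence of align_char in text,
 * or the full length of text if the character does not occur. */
size_t align_offset(const char *text, char align_char)
{
    const char *hit = strchr(text, align_char);
    return hit != NULL ? (size_t)(hit - text) : strlen(text);
}

/* Fill pad[i] with the number of leading spaces cell i needs so that
 * the alignment character of every cell ends up in the same column. */
void align_column(const char *const cells[], size_t n,
                  char align_char, size_t pad[])
{
    size_t widest = 0;
    size_t i;

    /* First pass: find the largest offset in the column. */
    for (i = 0; i < n; i++) {
        size_t off = align_offset(cells[i], align_char);
        if (off > widest)
            widest = off;
    }
    /* Second pass: every cell is shifted right up to that offset. */
    for (i = 0; i < n; i++)
        pad[i] = widest - align_offset(cells[i], align_char);
}
```

With the cells "1.5", "12.25" and "100" and '.' as the alignment character, the offsets are 1, 2 and 3, so the cells get 2, 1 and 0 spaces of padding.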
On Thu, Jan 02, Sebastian Geerken wrote:
Before I start digging into the specs, someone may be able to answer this question faster: should stripping spaces from attribute values be done before or after the conversion of entities? E.g., if I have something like
<td align="char" char="&#32;">
is the value of the "char" attribute " " (one space) or "" (empty)? Dillo does the second, so it is not possible to align characters at spaces.
Currently (as of the latest commits), the HTML parser treats an empty string as one space, so that CHAR="" is equivalent to CHAR=" " or CHAR="&#32;". I'll search in the specs.
Sebastian
Sebastian, On Thu, 2 Jan 2003, Sebastian Geerken wrote:
Hi!
Before I start digging into the specs, someone may be able to answer this question faster: should stripping spaces from attribute values be done before or after the conversion of entities?
Most probably the first.
E.g., if I have something like
<td align="char" char="&#32;">
is the value of the "char" attribute " " (one space) or "" (empty)? Dillo does the second, so it is not possible to align characters at spaces.
I think "&#32;" should become " " and " " should become "" when taken as simple CDATA (for instance as in %URI), but when the attribute value happens to be a %Character (again CDATA), then " " should be " ". AFAIS, treating " " and "" the same in the CHAR attribute is a good workaround.
You may like to read sgmltut.html from the HTML 4.01 spec (html4/intro/sgmltut.html). At least that's the one I use.
Cheers
Jorge.-
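The order suggested in this reply (strip white space first, convert entities afterwards) plus the empty-means-space workaround for CHAR can be sketched as follows. This is only an illustration, not Dillo's parser: attr_value and char_attr are invented names, and only decimal numeric character references in the ASCII range are decoded.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Strip leading and trailing white space from raw, then decode decimal
 * numeric character references such as "&#32;". Only code points
 * 1..127 are decoded in this sketch; everything else is copied as-is. */
void attr_value(const char *raw, char *out, size_t size)
{
    const char *end;
    size_t n = 0;

    if (size == 0)
        return;

    /* Step 1: stripping happens on the raw text, before any decoding. */
    while (*raw != '\0' && isspace((unsigned char)*raw))
        raw++;
    end = raw + strlen(raw);
    while (end > raw && isspace((unsigned char)end[-1]))
        end--;

    /* Step 2: decode references in what is left. */
    while (raw < end && n + 1 < size) {
        if (raw[0] == '&' && raw[1] == '#' && isdigit((unsigned char)raw[2])) {
            char *stop;
            long code = strtol(raw + 2, &stop, 10);
            if (*stop == ';' && stop < end && code > 0 && code < 128) {
                out[n++] = (char)code;
                raw = stop + 1;
                continue;
            }
        }
        out[n++] = *raw++;
    }
    out[n] = '\0';
}

/* CHAR attribute: a single alignment character. Following the
 * workaround from the thread, an empty value is treated as one space. */
char char_attr(const char *raw)
{
    char buf[8];

    attr_value(raw, buf, sizeof buf);
    return buf[0] != '\0' ? buf[0] : ' ';
}
```

With this order, "&#32;" survives the stripping step and decodes to a single space, while a literal " " is stripped to the empty string; char_attr() then maps both to a space, so alignment at spaces remains possible.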
participants (4)
- Frédéric Bothamy
- Jorge Arellano Cid
- Philip Blundell
- Sebastian Geerken