Re: [Dillo-dev]Re: White spaces handling (was: Weird glitch with rendering)

May 22, 2004


      On Fri, May 21, Jorge Arellano Cid wrote:
...
[...]
...
...
If  we  "collapse" as the SPEC says should be done, we have two
possibilities:
What part of the spec do you refer to?
HTML-4.01, sec 9.1:
<q>
 [...]
 Note that a sequence of white spaces between words in the source
 document may result in an entirely different rendered inter-word
 spacing (except in the case of the PRE element). In particular,
 user agents should collapse input white space sequences when
                               ^^^^^
Again, as I understand it, this refers to the lowest processing
level; it is what the HTML parser has already done beforehand.
...
 producing output inter-word space. This can and should be done
 even in the absence of language information (from the lang
 attribute, the HTTP "Content-Language" header field (see
 [RFC2616], section 14.12), user agent settings, etc.).
 [...]
</q>
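
To make the collapsing rule quoted above concrete, here is a minimal
sketch in Python. It is not Dillo code; the function name is made up for
illustration, and the character set is the one HTML-4.01 sec 9.1 lists as
white space (space, tab, form feed, zero-width space, plus the line break
characters CR and LF).

```python
import re

# White space characters per HTML 4.01 sec. 9.1: space, tab, form feed,
# zero-width space, plus the line break characters CR and LF.
_HTML_WS = re.compile("[ \t\f\u200b\r\n]+")


def collapse_white_space(text):
    """Replace each run of HTML white space with a single space.

    Hypothetical helper for illustration only; it says nothing about
    which processing level (parser or renderer) should call it.
    """
    return _HTML_WS.sub(" ", text)
```

Note that this only collapses runs; whether a leading or trailing space
of a word is kept is exactly the question of levels discussed in this
thread, so the sketch leaves those single spaces in place.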
[...]
...
Generally, I'd like to stick to this tree view, and especially regard,
in this example, the words "Some " and "text" as lying in different
levels in the tree, not within a flat list. In the history, this was
not always very clear for HTML, but it is much clearer for XHTML. The
current parser does not actually build a tree, but should a bit like
as it does.
Sorry, my English: what does "a bit like as it does" mean?
Sorry, this should say "[should behave] a bit like as *if* it does
[build a tree]".
...
[...]
...
...
Now, this solution would also account for the special SGML line
break rules:
<q source='HTML-4.01 SPEC B.3.1'>
SGML  (see  [ISO8879], section 7.6.1) specifies that a line break
immediately following a start tag must be ignored, as must a line
break  immediately  before  an  end tag. This applies to all HTML
elements without exception.
The following two HTML examples must be rendered identically:
<P>Thomas is watching TV.</P>
<P>
Thomas is watching TV.
</P>
So must the following two examples:
<A>My favorite Website</A>
<A>
My favorite Website
</A>
</q>
This is actually something different: it is only about line breaks,
and it applies to *all* elements, including <pre>.
If   you  consider  that  line  breaks  are  also  white  space
characters  (HTML-4.01  sec  9.1), this becomes a special case of
general white space handling.
Since this rule always applies, it should be handled at a level below
(if I understand this correctly). I.e., first remove these line
breaks (also for <pre>), and then (except for <pre>) collapse white
spaces. BTW, I do not know whether this is also valid for XML; I did
not find anything equivalent in the XML spec.
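
The two-level ordering described above could be sketched roughly as
follows, in Python. This is only an illustration under simplifying
assumptions, not Dillo's parser: it looks at the text content of a single
element (no nested tags), and the names strip_edge_line_breaks,
element_text and the preformatted flag are hypothetical.

```python
import re

# Line break forms checked in this order so that CR LF counts as one break.
_LINE_BREAKS = ("\r\n", "\r", "\n")


def strip_edge_line_breaks(content):
    """Level 1: drop one line break right after the start tag and one
    right before the end tag. Applies to all elements, including <pre>."""
    for nl in _LINE_BREAKS:
        if content.startswith(nl):
            content = content[len(nl):]
            break
    for nl in _LINE_BREAKS:
        if content.endswith(nl):
            content = content[:-len(nl)]
            break
    return content


def element_text(content, preformatted=False):
    """Level 2: collapse white space runs, except inside <pre>."""
    content = strip_edge_line_breaks(content)
    if not preformatted:
        content = re.sub("[ \t\f\u200b\r\n]+", " ", content)
    return content
```

With this ordering, element_text("\nThomas is watching TV.\n") and
element_text("Thomas is watching TV.") give the same string, as the
SPEC's <P> example requires, while <pre> content loses only the two edge
line breaks and keeps its inner spacing.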
...
PS:  It  seems  like we'll have to modify the parser to produce a
parsing tree for CSS/XML to be supported properly.
This is already halfway done in the CSS prototype, since CSS does
indeed depend on a document tree (mostly because CSS is processed
asynchronously). It still has to be considered which kind of white
space handling is done at which level.

Sebastian

Sebastian Geerken