Re: [Dillo-dev]Re: White spaces handling (was: Weird glitch with rendering)

May 21, 2004

      On Fri, May 14, Jorge Arellano Cid wrote:
...
[...]
  For instance:
...
A different case is "<u>Some </u> text". Your patch will turn it into
"<u>Some </u>text", but it should really be
"<u>Some</u> text".
Yes, I agree, "collapsing" here should be:
'<u>Some </u> text'  =>
   '<u>Some</u> text'
as you note.
but what do we do with this:
'<u>Some </u>text'
If we ignore white space after the start tag and before the end
tag, it becomes
'<u>Some</u>text'       (with no space at all!)
Which I indeed find reasonable. (See below for my reasons.)
...
If  we  "collapse" as the SPEC says should be done, we have two
possibilities:
What part of the spec do you refer to? From what I have understood,
spaces should be collapsed at the raw data level. That is, if you have
the part "<p><u>Some </u>text</p>", it will be parsed into the following
tree:

   ...
   `- <p>
      +- <u>
      |  `- "Some "
      `- "text"

Then, the question is, what should be done with the space at the end
of "Some ".

Generally, I'd like to stick to this tree view, and especially regard,
in this example, the words "Some " and "text" as lying in different
levels in the tree, not within a flat list. Historically, this was
not always very clear for HTML, but it is much clearer for XHTML. The
current parser does not actually build a tree, but it should behave a
bit as if it did. This approach also makes the new HTML parser in the
CSS prototype simpler.
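
A minimal sketch of the tree view described above, assuming Python's
standard html.parser module stands in for the real parser. The Node and
TreeBuilder classes and the build_tree helper are hypothetical names
used only for illustration; this is not Dillo's parser. It shows that
"Some " and "text" end up on different levels of the tree.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical tree node: a tag name plus child nodes or text strings."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # Node instances or plain str (raw text)


class TreeBuilder(HTMLParser):
    """Builds a simple element tree; no error recovery for broken markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Only close the element if the end tag matches the open element.
        if self.current.parent is not None and self.current.tag == tag:
            self.current = self.current.parent

    def handle_data(self, data):
        # Raw text is stored untouched, including trailing spaces.
        self.current.children.append(data)


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


root = build_tree("<p><u>Some </u>text</p>")
p = root.children[0]
u = p.children[0]
print(u.tag, u.children)   # text inside <u>
print(p.tag, p.children[1])  # text directly inside <p>
```

With the text stored per element, the open question becomes purely
local: what to do with the trailing space of the string inside <u>.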
...
'<u>Some </u>text'      (as it was: underline the whitespace)
and
'<u>Some</u> text'      (move the space out of the tag)
AFAICT,  the  SPEC  leaves  the  choice  open, and advises HTML
authors against whitespace inside the tags.
IMO,  always  collapsing  white  space  after the start tag and
                  ^^^^^^^^^^
I'd say we should (for most elements) simply ignore whitespace after
the opening tag and before the closing tag. This generally solves the
problem with "Some <u> underlined </u> text.", and we should not
relate spaces in different elements, e.g. by collapsing them.
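
The rule proposed above can be sketched as follows. This is only an
illustration under stated assumptions: elements are represented as
hypothetical (tag, children) tuples, trim_element_edges is an invented
name, and treating <pre> as the exception to "most elements" is an
assumption, not something this message spells out.

```python
HTML_SPACE = " \t\n\r\f"


def trim_element_edges(node, preserve=frozenset({"pre"})):
    """Drop whitespace right after the start tag and right before the end tag.

    node is a (tag, children) tuple; children are str or nested tuples.
    Elements listed in `preserve` (an assumption) are returned unchanged.
    """
    tag, children = node
    if tag in preserve:
        return node
    children = [
        trim_element_edges(c, preserve) if isinstance(c, tuple) else c
        for c in children
    ]
    # Only the first and last text children touch the element's own tags.
    if children and isinstance(children[0], str):
        children[0] = children[0].lstrip(HTML_SPACE)
    if children and isinstance(children[-1], str):
        children[-1] = children[-1].rstrip(HTML_SPACE)
    # Remove text children that became empty.
    children = [c for c in children if c != ""]
    return (tag, children)


tree = ("p", ["Some ", ("u", [" underlined "]), " text."])
print(trim_element_edges(tree))
```

Each element is trimmed on its own, so spaces in different elements
are never related to each other. The same rule applied to
'<u>Some </u>text' yields "Some" directly followed by "text", i.e. no
space at all, which is the consequence discussed earlier in the thread.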
...
before  the  end  tag is the simplest to implement. Even more, as
the  SPEC  doesn't define what to do in this case, it's an option
left to the User Agent:
<q source='HTML4.01 SPEC, 9.1'>
  In  order  to  avoid  problems  with  SGML  line  break rules and
  inconsistencies  among extant implementations, authors should not
  rely  on  user  agents  to render white space immediately after a
  start tag or immediately before an end tag. Thus, authors, and in
  particular authoring tools, should write:
<P>We offer free <A>technical support</A> for subscribers.</P>
and not:
<P>We offer free<A> technical support </A>for subscribers.</P>
</q>
So, at least, the authors are warned ;-)
...
Now, this solution would also account for the special SGML line
break rules:
<q source='HTML-4.01 SPEC B.3.1'>
SGML  (see  [ISO8879], section 7.6.1) specifies that a line break
immediately following a start tag must be ignored, as must a line
break  immediately  before  an  end tag. This applies to all HTML
elements without exception.
The following two HTML examples must be rendered identically:
<P>Thomas is watching TV.</P>
<P>
Thomas is watching TV.
</P>
So must the following two examples:
<A>My favorite Website</A>
<A>
My favorite Website
</A>
</q>
This is actually something different: It is only about line breaks,
and it applies to *all* elements, including <pre>.
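
To make the difference concrete, here is a small sketch of the line
break rule alone. It assumes the quoted rule covers exactly one line
break on each side and that a line break is one of CR LF, LF or CR;
the function names are hypothetical. All other whitespace is left
untouched, which is why this can apply to every element.

```python
LINE_BREAKS = ("\r\n", "\n", "\r")  # check CR LF first so it counts as one break


def strip_one_leading_break(text):
    """Remove a single line break at the very start of text, if present."""
    for br in LINE_BREAKS:
        if text.startswith(br):
            return text[len(br):]
    return text


def strip_one_trailing_break(text):
    """Remove a single line break at the very end of text, if present."""
    for br in LINE_BREAKS:
        if text.endswith(br):
            return text[:-len(br)]
    return text


def element_text(raw):
    """Apply the rule to the raw text between a start tag and its end tag."""
    return strip_one_trailing_break(strip_one_leading_break(raw))


# The two <P> examples from the quoted section give the same content.
print(element_text("Thomas is watching TV.") == element_text("\nThomas is watching TV.\n"))
```

Spaces and any further line breaks survive, so preformatted content
keeps its layout apart from the single break next to each tag.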

Sebastian


Sebastian Geerken