On Wed, Oct 06, 2004 at 03:47:32PM +0200, Bjoern Brill wrote:
Hello,
Hi Björn.
I've tried out dillo-0.8.3-rc1 for a few days on a Debian GNU/Linux 3.0 "woody"/i386 system.
It compiled and installed cleanly, and I did not encounter any regressions against 0.8.2 yet. The parser changes are a real improvement -- several formerly problematic sites I've tried work now, and the "Detected HTML errors" view produces much more meaningful results.
Thanks for the good report!
Two small problems I've encountered:
- It seems that, if a pair of <a href="..."> </a> tags contains something illegal, like <div>, then the <a> tags are ignored, rather than the illegal stuff inside. That's suboptimal.
Can you elaborate on "suboptimal"?

The reason why <a> is closed is that INLINE elements can't contain BLOCK elements, so any inline elements left open are closed. This cleanup has proven very healthy.

From the SPEC:

<q> The DIV and SPAN elements, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. These elements define content to be inline (SPAN) or block-level (DIV) but impose no other presentational idioms on the content. Thus, authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes. </q>
As an example, look at http://www.kmelektronik.de/ (admittedly a very broken site, standards-wise). The text fragments "Light-Version Versand" and "Light-Version Shop" near the page bottom should really be links. If this is hard to fix, then don't bother.
Yes. The only other site I've found is www.lynucs.org, which I expect will correct the problem when told.

I tried a small hack, but it has the side effect of not cleaning up any INLINE element when a <div> opens. This solves the problem with the above-mentioned pages, but may create bigger problems than it solves. For instance, INLINES include: TT, I, B, U, S, STRIKE, BIG, SMALL, FONT, EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR, ACRONYM, SUB, SUP, Q, SPAN, BDO, A, OBJECT, APPLET, PARAM, IMG, BASEFONT, BR, SCRIPT, MAP, AREA, INPUT, SELECT, OPTGROUP, TEXTAREA, LABEL, BUTTON.

Not having them closed (cleaned) when a <div> opens makes me shudder...

Maybe a good solution is to allow an exception only when <a> immediately precedes the <div>. This would be much safer. Of course it wouldn't work with <a ...><b><div> </div></b></a>. Now, considering the small number of sites doing this, it may be overkill. Please share your thoughts.
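
To make the trade-off concrete, here is a rough Python sketch of the cleanup and of the proposed <a> exception. It is only an illustration, not dillo's actual C code: the names `open_element`, `INLINE`, `BLOCK` and the `anchor_exception` flag are made up for this example, and the inline set is abbreviated.

```python
# Illustrative sketch only; all names here are hypothetical, not dillo's API.
INLINE = frozenset({
    "tt", "i", "b", "u", "s", "strike", "big", "small", "font",
    "em", "strong", "code", "span", "a",
})
BLOCK = frozenset({"div"})


def open_element(stack, tag, anchor_exception=False):
    """Open `tag` on the stack of open elements.

    When `tag` is block-level, open inline elements on top of the stack
    are closed first. With `anchor_exception=True` the cleanup is skipped
    if the innermost open element is <a>.

    Returns the list of tags that were closed, innermost first.
    """
    closed = []
    if tag in BLOCK:
        skip = anchor_exception and bool(stack) and stack[-1] == "a"
        if not skip:
            while stack and stack[-1] in INLINE:
                closed.append(stack.pop())
    stack.append(tag)
    return closed


# Current behaviour: <a><div> closes the <a>, so the link is lost.
stack = ["body", "a"]
print(open_element(stack, "div"), stack)

# With the exception: <a> stays open around the <div>.
stack = ["body", "a"]
print(open_element(stack, "div", anchor_exception=True), stack)

# <a><b><div>: the innermost element is <b>, so both are still closed.
stack = ["body", "a", "b"]
print(open_element(stack, "div", anchor_exception=True), stack)
```

The third case is the <a ...><b><div> situation mentioned above: the exception only looks at the innermost open element, so it does not help there.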
- This one is not new, but I fixed it a while ago for myself and then forgot about it: the "This page uses the NON-STANDARD meta refresh tag..." warning usually occurs inside <head>, but is immediately sent to the parser, which in turn switches its HTML processing state from IN_HEAD to IN_BODY prematurely. After that, head-only tags like <title> and <base> are ignored. A minimal fix is at the end of the mail.
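
The premature state switch described here can be modelled with a small Python sketch. This is not the patch referred to above and not dillo's code: the `Parser` class, its methods and the `defer` flag are hypothetical, and holding the warning back until the body opens is just one assumed way of avoiding the early switch.

```python
# Toy model of the IN_HEAD / IN_BODY problem; every name is hypothetical.
IN_HEAD, IN_BODY = "IN_HEAD", "IN_BODY"
WARNING = "This page uses the NON-STANDARD meta refresh tag..."


class Parser:
    def __init__(self):
        self.state = IN_HEAD
        self.title = None
        self.body = []
        self.pending = []  # warnings held back while still in <head>

    def write_body(self, text):
        # Any rendered content forces the parser into the body state.
        self.state = IN_BODY
        self.body.append(text)

    def handle_title(self, text):
        # Head-only tag: ignored once the parser has left <head>.
        if self.state == IN_HEAD:
            self.title = text

    def handle_meta_refresh(self, defer):
        if defer and self.state == IN_HEAD:
            self.pending.append(WARNING)   # keep the head state intact
        else:
            self.write_body(WARNING)       # switches to IN_BODY at once

    def open_body(self):
        self.state = IN_BODY
        self.body.extend(self.pending)     # emit held-back warnings now
        self.pending.clear()


# Warning sent straight to the parser: the later <title> is dropped.
p = Parser()
p.handle_meta_refresh(defer=False)
p.handle_title("Example")
print(p.state, p.title)

# Warning deferred: <title> is still seen, warning appears in the body.
q = Parser()
q.handle_meta_refresh(defer=True)
q.handle_title("Example")
q.open_body()
print(q.state, q.title, q.body)
```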
Thanks. Most probably it will make its way into rc2.

-- 
Cheers
Jorge.-