Hi Richard,

thanks for your comments!

On Fri, Oct 15, 2004 at 04:35:29PM -0400, Richard Page-Wood wrote:
I do know a bit about HTML and XHTML, though, and I was wondering about your doctype-sniffing patch. As I understand it, you're trying to distinguish between HTML 4.01 and XHTML 1.x by sniffing the doctype and giving appropriate warnings for invalid markup. Are you also going to alter the rendering for XHTML?
Certainly not as part of my patch. Its origin was simply the observation that Dillo refused anchor names like "Dürst", which are allowed in HTML if defined with the "name" attribute (see Section 12.2.3 of the HTML 4.01 spec).
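To make the difference concrete, here is a minimal sketch in C of the two rules as I read them in HTML 4.01: an "id" value must be an ID token (an ASCII letter followed by letters, digits, "-", "_", ":" or "."), whereas the "name" attribute of <a> is CDATA and may contain any characters. The function names are hypothetical and this is not the code from the patch or from Dillo; the non-empty requirement for names is my own assumption.

```c
#include <stdbool.h>

/* ASCII-only checks, so the result does not depend on the C locale. */
static bool is_ascii_letter(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static bool is_ascii_digit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical helper: true if s is a valid HTML 4.01 ID token,
 * i.e. a letter followed by letters, digits, '-', '_', ':' or '.'. */
bool is_html4_id_token(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    if (!is_ascii_letter(*p))
        return false;               /* also rejects the empty string */
    for (++p; *p != '\0'; ++p) {
        if (!is_ascii_letter(*p) && !is_ascii_digit(*p) &&
            *p != '-' && *p != '_' && *p != ':' && *p != '.')
            return false;
    }
    return true;
}

/* Hypothetical helper: <a name="..."> is CDATA, so any characters
 * are accepted. Rejecting the empty string is an assumption made
 * for this sketch, not something taken from the patch. */
bool is_html4_anchor_name(const char *s)
{
    return s[0] != '\0';
}
```

With these rules, "Dürst" fails the ID-token check (the bytes of "ü" are not ASCII letters) but passes as an anchor name, which is the case that Dillo rejected.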
There's a good article here:
http://www.hixie.ch/advocacy/xhtml
which talks, amongst other things, about the impossibility of correctly identifying an XHTML document, and which might be of interest to you.
Having looked at this article and the references given therein, I no longer feel that it would be a good idea to try to figure out whether the document type is HTML or XHTML. I still like the idea of supporting XHTML in some way, mostly because XML lacks many of the strange features of SGML that make parsing difficult. For example, "<" and "&" are not allowed as ordinary characters in XML; they have to be escaped. But this has nothing to do with anchor names, so I will remove the XHTML parts of the patch (unless someone complains).

Jorge: Are you still interested in evaluating <!DOCTYPE> to figure out the HTML version? Maybe it would be ok for a small browser like Dillo to stick to HTML 4.01.

All the best,
--
Matthias Franz
Section de Mathématiques, Université de Genève, Suisse