Tim wrote:
There is a bug in the HTML parser: tags inside quoted attribute values are interpreted as markup:
<input type="text" name="test" value="<p>asdf</p>" />
I think it is line 3754f (src/html.cc) that causes the unwanted behaviour. I'd fix this bug by introducing two variables: the first states whether we're currently inside a quote, and the second stores its type (single or double quote). While we are inside a quote, every character is ignored until one matching the stored quote type is found. Of course, we should also be able to deal with escaped quotes, allowing constructs similar to the following one:
<input type="text" name="test" value="<p>\"asdf\"</p>" />
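The two-variable idea described above can be sketched roughly as follows. This is only an illustration, not the code in src/html.cc: the function name findTagEnd and the names inQuote and quoteType are hypothetical, and the backslash handling follows the proposal here rather than anything the HTML specification defines.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from src/html.cc): returns the index of the '>'
// that closes the tag beginning at `start`, skipping any '>' that appears
// inside a quoted attribute value. Returns std::string::npos if none is found.
std::size_t findTagEnd(const std::string &buf, std::size_t start)
{
   bool inQuote = false;   // are we currently inside a quoted value?
   char quoteType = 0;     // '"' or '\'' while inQuote is true

   for (std::size_t i = start; i < buf.size(); i++) {
      char c = buf[i];
      if (inQuote) {
         if (c == '\\' && i + 1 < buf.size()) {
            i++;                    // assumed escape rule: skip e.g. \"
         } else if (c == quoteType) {
            inQuote = false;        // matching quote ends the value
         }
      } else if (c == '"' || c == '\'') {
         inQuote = true;
         quoteType = c;
      } else if (c == '>') {
         return i;                  // unquoted '>' closes the tag
      }
   }
   return std::string::npos;        // unterminated tag or quote
}
```

With both example tags above, the function should return the index of the final '>' rather than the one in "<p>".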
Strictly speaking, for attributes that are CDATA, at least, my impression is that the element is closed if the parser encounters an (SGML jargon!) end-tag open (ETAGO, i.e., "</") "delimiter-in-context" (meaning, I think, that it is followed by certain characters such as ASCII letters). Of course, SGML is a horrible, loony monstrosity, and we don't follow it to the letter to begin with, so I'm not sure that I'd necessarily be opposed to making the parser behave as you say. (I'm reminded of http://lists.auriga.wearlab.de/pipermail/dillo-dev/2008-January/003668.html)
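For comparison, the stricter ETAGO reading could be sketched like this, assuming "delimiter-in-context" simply means "</" immediately followed by an ASCII letter, which is my impression and not something checked against the standard. The names findEtago and isAsciiLetter are made up for the illustration and do not exist in Dillo.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true for 'A'-'Z' and 'a'-'z' only.
static bool isAsciiLetter(char c)
{
   return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Hypothetical helper: index of the first "</" that is directly followed by
// an ASCII letter (the assumed ETAGO delimiter-in-context), searching from
// `start`. Returns std::string::npos if there is none.
std::size_t findEtago(const std::string &buf, std::size_t start = 0)
{
   for (std::size_t i = start; i + 2 < buf.size(); i++) {
      if (buf[i] == '<' && buf[i + 1] == '/' && isAsciiLetter(buf[i + 2]))
         return i;
   }
   return std::string::npos;
}
```

Under this reading, the "</p>" in the example value would count as an ETAGO, which is exactly the behaviour the bug report objects to.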