On Mon, May 02, 2016 at 09:59:09PM -0300, Jorge Arellano Cid wrote:
Hi,
On Sun, May 01, 2016 at 04:06:51PM +0100, Nick Warne wrote:
I have been looking at this for about 5 hours now, and it has fried my brain.
Reading the code, and then reading the email thread about the deprecated meta tag 'refresh', I agree with Dillo's actions.
But what I found was that on some sites (mine included), if there was a tag:
<meta http-equiv="refresh" content="15" />
Dillo shows this warning message:
The author wanted you to go _here_ after 10 seconds.
What happens now is the link (_here_) gets appended with the refresh value, so for example:
www.example.com/refresh.html
becomes:
www.example.com/10 (or 15 or whatever the refresh value is).
I have found that if the meta tag is right:
<meta http-equiv="refresh" content="5;URL='./index.html'" />
then the code creates the right URL (but it's not seen, as no warning is given).
In html.cc, from line 3218:
If the meta refresh tag is correct HTML on the web page, mr_url returns the correct path to the refresh. If it isn't, then mr_url returns the value of the refresh, which fubars the warning message's link.
I tested this running dillo from the commandline with this insertion and testing the correct and wrong meta tags:
      }
      /* Skip to anything after "URL=" or ";" if "URL=" is not found */
      if ((p = dStriAsciiStr(content, "url=")))
         content = p + strlen("url=");
      else if ((p = strstr(content, ";")))
         content = p + strlen(";");
      /* Handle the case of a quoted URL */
      if (*content == '"' || *content == '\'') {
         if ((p = strchr(content + 1, *content)))
            mr_url = dStrndup(content + 1, p - content - 1);
         else
            mr_url = dStrdup(content + 1);
      } else {
         mr_url = dStrdup(content);
      }
//nick
printf ("%s\n", mr_url);
      new_url = a_Html_url_new(html, mr_url, NULL, 0);
//      new_url = a_Html_url_new(html, ".", NULL, 0);
Not being very good with C++, I can't work out what goes wrong here, but it looks like if neither 'URL=' nor ';' is found, the code still looks for quotes, which does something to the string?
I have two pages you can test on, one with the proper meta tag and one without:
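To make the failure mode concrete, here is a minimal standalone sketch of the parsing steps quoted above. It is not Dillo's code: find_ascii_nocase() is a hypothetical stand-in for dStriAsciiStr(), std::string replaces the dStrdup()/dStrndup() calls, and the function name extract_refresh_url_unpatched() is made up for illustration. It suggests that when neither "url=" nor ";" is present, content is never advanced, so the whole attribute value (the delay) ends up as the URL.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for Dillo's dStriAsciiStr(): case-insensitive
// ASCII substring search. Returns a pointer to the match, or nullptr.
static const char *find_ascii_nocase(const char *haystack, const char *needle)
{
   const std::size_t n = std::strlen(needle);
   for (; *haystack; ++haystack) {
      std::size_t i = 0;
      while (i < n && haystack[i] &&
             std::tolower((unsigned char)haystack[i]) ==
             std::tolower((unsigned char)needle[i]))
         ++i;
      if (i == n)
         return haystack;
   }
   return nullptr;
}

// Mirrors the quoted logic without any fix applied.
// Returns the string that would be stored in mr_url.
std::string extract_refresh_url_unpatched(const char *content)
{
   assert(content != nullptr);
   const char *p;

   /* Skip to anything after "URL=" or ";" if "URL=" is not found */
   if ((p = find_ascii_nocase(content, "url=")))
      content = p + std::strlen("url=");
   else if ((p = std::strstr(content, ";")))
      content = p + std::strlen(";");
   // No final else: for content="15", content still points at "15".

   /* Handle the case of a quoted URL */
   if (*content == '"' || *content == '\'') {
      if ((p = std::strchr(content + 1, *content)))
         return std::string(content + 1, (std::size_t)(p - content - 1));
      return std::string(content + 1);
   }
   return std::string(content);
}
```

Under these assumptions, extract_refresh_url_unpatched("15") yields "15", which would then be resolved as a relative URL, while "5;URL='./index.html'" yields "./index.html". The quote handling itself leaves the string alone when there are no quotes.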
http://irpi.linicks.net:8080/static_simple.html <- good
http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the bug
If possible view both with that printf I put in - you can see what mr_url returns.
Please try to be clear & concise.
There's a bug, yes. You may try this:
diff -r c20e74568504 src/html.cc
--- a/src/html.cc	Sun May 01 10:49:17 2016 -0300
+++ b/src/html.cc	Sun May 01 23:26:48 2016 -0300
@@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml
          content = p + strlen("url=");
       else if ((p = strstr(content, ";")))
          content = p + strlen(";");
+      else
+         content = "";
       /* Handle the case of a quoted URL */
       if (*content == '"' || *content == '\'') {
          if ((p = strchr(content + 1, *content)))
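As a rough illustration of what the added else branch changes, the sketch below restates the patched logic with std::string in a standalone function. The name refresh_target() and the std::string formulation are assumptions made for this example, not Dillo's API; the real code works on C strings inside Html_tag_open_meta.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns the URL part of a meta refresh "content"
// value, or an empty string when the value holds only a delay.
std::string refresh_target(const std::string &content)
{
   // Lower-cased copy so that "URL=", "url=" and "Url=" all match.
   std::string lower(content);
   std::transform(lower.begin(), lower.end(), lower.begin(),
                  [](unsigned char c) { return (char)std::tolower(c); });

   std::string rest;
   std::string::size_type pos = lower.find("url=");
   if (pos != std::string::npos)
      rest = content.substr(pos + 4);          // text after "url="
   else if ((pos = content.find(';')) != std::string::npos)
      rest = content.substr(pos + 1);          // text after ";"
   // else: nothing follows the delay, so rest stays empty.
   // This corresponds to the patch's `content = "";`.

   /* Handle the case of a quoted URL */
   if (!rest.empty() && (rest[0] == '"' || rest[0] == '\'')) {
      const std::string::size_type end = rest.find(rest[0], 1);
      return end == std::string::npos ? rest.substr(1)
                                      : rest.substr(1, end - 1);
   }
   return rest;
}
```

With this version, refresh_target("15") is empty instead of "15", so the delay is no longer treated as a path; how the empty string is then resolved depends on a_Html_url_new and is not shown here.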
@Johannes, does it look OK to you?
Looks ok to me, but the whole code there is a bit hairy.
Cheers,
Johannes