I have been looking at this for about 5 hours now, and it has fried my brain. Reading the code, and then reading the email thread about depreciated meta tag 'refresh', I agree with Dillo's actions. But what I found was that on some sites (mine included), if there was a tag: <meta http-equiv="refresh" content="15" /> Dillo reports and the warning message reports: The author wanted you to go _here_ after 10 seconds. What happens now is the link (_here_) gets appended with the refresh value, so for example: www.example.com/refesh.html becomes: www.example.com/10 (or 15 or whatever the refresh value is). I have found that if the meta tag is right: <meta http-equiv="refresh" content="5;URL='./index.html'" /> then the code creates the right URL (but it's not seen as no warning is given. html.cc from line 3218 if the meta tag refresh is correct HTML on the web page, mr_url returns the correct path to the refresh. If it isn't, then mr_url returns the value of the refresh - which fubars the warning messages link. I tested this running dillo from the commandline with this insertion and testing the correct and wrong meta tags: } /* Skip to anything after "URL=" or ";" if "URL=" is not found */ if ((p = dStriAsciiStr(content, "url="))) content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content))) mr_url = dStrndup(content + 1, p - content - 1); else mr_url = dStrdup(content + 1); } else { mr_url = dStrdup(content); } //nick printf ("%s\n", mr_url); new_url = a_Html_url_new(html, mr_url, NULL, 0); // new_url = a_Html_url_new(html, ".", NULL, 0); Not being very good with C++, I can't work out what goes wrong here - but it looks like that if 'URL=' or';' isn't found, then the code still looks for quotes - which does something to the string? I have two pages you can test on - one with the proper meta tag, and one with out: http://irpi.linicks.net:8080/static_simple.html <- good http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the bug If possible view both with that printf I put in - you can see what mr_url returns. Phew* Nick -- Gosh that takes me back... or is it forward? That's the trouble with time travel, you never can tell." -- Doctor Who "Androids of Tara"
Hi, On Sun, May 01, 2016 at 04:06:51PM +0100, Nick Warne wrote:
I have been looking at this for about 5 hours now, and it has fried my brain.
Reading the code, and then reading the email thread about depreciated meta tag 'refresh', I agree with Dillo's actions.
But what I found was that on some sites (mine included), if there was a tag:
<meta http-equiv="refresh" content="15" />
Dillo reports and the warning message reports:
The author wanted you to go _here_ after 10 seconds.
What happens now is the link (_here_) gets appended with the refresh value, so for example:
www.example.com/refesh.html
becomes:
www.example.com/10 (or 15 or whatever the refresh value is).
I have found that if the meta tag is right:
<meta http-equiv="refresh" content="5;URL='./index.html'" />
then the code creates the right URL (but it's not seen as no warning is given.
html.cc from line 3218
if the meta tag refresh is correct HTML on the web page, mr_url returns the correct path to the refresh. If it isn't, then mr_url returns the value of the refresh - which fubars the warning messages link.
I tested this running dillo from the commandline with this insertion and testing the correct and wrong meta tags:
} /* Skip to anything after "URL=" or ";" if "URL=" is not found */ if ((p = dStriAsciiStr(content, "url="))) content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content))) mr_url = dStrndup(content + 1, p - content - 1); else mr_url = dStrdup(content + 1); } else { mr_url = dStrdup(content); } //nick printf ("%s\n", mr_url); new_url = a_Html_url_new(html, mr_url, NULL, 0); // new_url = a_Html_url_new(html, ".", NULL, 0);
Not being very good with C++, I can't work out what goes wrong here - but it looks like that if 'URL=' or';' isn't found, then the code still looks for quotes - which does something to the string?
I have two pages you can test on - one with the proper meta tag, and one with out:
http://irpi.linicks.net:8080/static_simple.html <- good
http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the bug
If possible view both with that printf I put in - you can see what mr_url returns.
Please try to be clear & concise. There's a bug, yes. You may try this: diff -r c20e74568504 src/html.cc --- a/src/html.cc Sun May 01 10:49:17 2016 -0300 +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300 @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); + else + content = ""; /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content))) @Johannes, does it look OK to you? -- Cheers Jorge.-
On 03/05/16 01:59, Jorge Arellano Cid wrote:
Please try to be clear & concise.
I thought I was :)
There's a bug, yes. You may try this:
diff -r c20e74568504 src/html.cc --- a/src/html.cc Sun May 01 10:49:17 2016 -0300 +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300 @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); + else + content = ""; /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content)))
@Johannes, does it look OK to you?
OK, that doesn't quite work correctly - it removes the warning messages altogether. Using: content = "./"; produces the warning message and with the top level URL - but, a case like this: www.example.com/refresh.html should produce a refresh to that URL, but it drops the page and points to www.example.com/ Nearly there... Nick -- Gosh that takes me back... or is it forward? That's the trouble with time travel, you never can tell." -- Doctor Who "Androids of Tara"
On Mon, May 02, 2016 at 09:59:09PM -0300, Jorge Arellano Cid wrote:
Hi,
On Sun, May 01, 2016 at 04:06:51PM +0100, Nick Warne wrote:
I have been looking at this for about 5 hours now, and it has fried my brain.
Reading the code, and then reading the email thread about depreciated meta tag 'refresh', I agree with Dillo's actions.
But what I found was that on some sites (mine included), if there was a tag:
<meta http-equiv="refresh" content="15" />
Dillo reports and the warning message reports:
The author wanted you to go _here_ after 10 seconds.
What happens now is the link (_here_) gets appended with the refresh value, so for example:
www.example.com/refesh.html
becomes:
www.example.com/10 (or 15 or whatever the refresh value is).
I have found that if the meta tag is right:
<meta http-equiv="refresh" content="5;URL='./index.html'" />
then the code creates the right URL (but it's not seen as no warning is given.
html.cc from line 3218
if the meta tag refresh is correct HTML on the web page, mr_url returns the correct path to the refresh. If it isn't, then mr_url returns the value of the refresh - which fubars the warning messages link.
I tested this running dillo from the commandline with this insertion and testing the correct and wrong meta tags:
} /* Skip to anything after "URL=" or ";" if "URL=" is not found */ if ((p = dStriAsciiStr(content, "url="))) content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content))) mr_url = dStrndup(content + 1, p - content - 1); else mr_url = dStrdup(content + 1); } else { mr_url = dStrdup(content); } //nick printf ("%s\n", mr_url); new_url = a_Html_url_new(html, mr_url, NULL, 0); // new_url = a_Html_url_new(html, ".", NULL, 0);
Not being very good with C++, I can't work out what goes wrong here - but it looks like that if 'URL=' or';' isn't found, then the code still looks for quotes - which does something to the string?
I have two pages you can test on - one with the proper meta tag, and one with out:
http://irpi.linicks.net:8080/static_simple.html <- good
http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the bug
If possible view both with that printf I put in - you can see what mr_url returns.
Please try to be clear & concise.
There's a bug, yes. You may try this:
diff -r c20e74568504 src/html.cc --- a/src/html.cc Sun May 01 10:49:17 2016 -0300 +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300 @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); + else + content = ""; /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content)))
@Johannes, does it look OK to you?
Looks ok to me, but the whole code there is a bit hairy. Cheers, Johannes
On Tue, May 10, 2016 at 08:59:49AM +0200, Johannes Hofmann wrote:
On Mon, May 02, 2016 at 09:59:09PM -0300, Jorge Arellano Cid wrote:
[...]
I have two pages you can test on - one with the proper meta tag, and one with out:
http://irpi.linicks.net:8080/static_simple.html <- good
http://fishpi.linicks.net:8081/static_simple.html <- bad - that shows the bug
If possible view both with that printf I put in - you can see what mr_url returns.
Please try to be clear & concise.
There's a bug, yes. You may try this:
diff -r c20e74568504 src/html.cc --- a/src/html.cc Sun May 01 10:49:17 2016 -0300 +++ b/src/html.cc Sun May 01 23:26:48 2016 -0300 @@ -3220,6 +3220,8 @@ static void Html_tag_open_meta(DilloHtml content = p + strlen("url="); else if ((p = strstr(content, ";"))) content = p + strlen(";"); + else + content = ""; /* Handle the case of a quoted URL */ if (*content == '"' || *content == '\'') { if ((p = strchr(content + 1, *content)))
@Johannes, does it look OK to you?
Looks ok to me, but the whole code there is a bit hairy.
Yes. This stems from two reasons: the meta-refresh element has no defined standard, and our code has evolved accomodating to work with what we've found along the way. Do you have a testcase battery for meta refresh? Currently, I'm working on the BODY and HTML element handling by the parser (special cases since their implicit optional-magic open and close). Fortunately I have the test cases for BODY and HTML. I also found my old set for META refresh, but I don't have the case you made the last patch for. It would be great to consolidate them. -- Cheers Jorge.-
On Tue, May 10, 2016 at 10:38:57AM -0300, Jorge Arellano Cid wrote:
Yes. This stems from two reasons: the meta-refresh element has no defined standard, and our code has evolved accomodating to work with what we've found along the way.
They do have refresh in html5: https://www.w3.org/TR/html51/document-metadata.html#statedef-http-equiv-refr...
On Tue, 10 May 2016 19:20:21 +0000 eocene <eocene at gmx.com> wrote:
On Tue, May 10, 2016 at 10:38:57AM -0300, Jorge Arellano Cid wrote:
Yes. This stems from two reasons: the meta-refresh element has no defined standard, and our code has evolved accomodating to work with what we've found along the way.
They do have refresh in html5: https://www.w3.org/TR/html51/document-metadata.html#statedef-http-equiv-refr...
Quick reading, that is the way to go. Nick -- Gosh that takes me back... or is it forward? That's the trouble with time travel, you never can tell." -- Doctor Who "Androids of Tara"
participants (4)
-
eocene@gmx.com
-
jcid@dillo.org
-
johannes.hofmann@gmx.de
-
nick@linicks.net