Hi,

attached is a preliminary patch for white-space: nowrap handling. It's supposed to fix the following issues:

* Weird page width in combination with white-space:nowrap sequences (e.g. on http://en.wikipedia.org/wiki/Web_browser).
* Line breaks at tags (e.g. <b>D</b>illo should not break after the "D"). (similar issue on http://fltk.org/newsgroups.php?gfltk.development+T)

without breaking:

* zero width space handling.
* ideographic character handling.

It's not ready for commit yet, as google results don't render deterministically the same way. Nevertheless please give it a try and report how it works for you.

Cheers,
Johannes
Hi Johannes,

On Tue, Sep 27, 2011 at 09:30:24PM +0200, Johannes Hofmann wrote:
> [...]
> It's not ready for commit yet, as google results don't render deterministically the same way. Nevertheless please give it a try and report how it works for you.

Good! I'm reviewing the patch and it seems to work well so far.

Please explain the non-deterministic behaviour at google.

--
Cheers
Jorge.-
Jorge wrote:
> [...]
> I'm reviewing the patch and it seems to work well so far.

I just noticed that page source isn't wrapping.
On Thu, Sep 29, 2011 at 03:02:24PM +0000, corvid wrote:
> [...]
> I just noticed that page source isn't wrapping.

Ah, good point. It seems I broke white-space:pre-wrap. I will try to fix that.
On Wed, Sep 28, 2011 at 05:49:10PM -0300, Jorge Arellano Cid wrote:
> [...]
> Please explain the non-deterministic behaviour at google.

If I search for e.g. dillo on google, the first line (the link) of a result entry is sometimes broken into two lines, sometimes not.

Cheers,
Johannes
On Fri, Sep 30, 2011 at 08:54:39AM +0200, Johannes Hofmann wrote:
> [...]
> If I search for e.g. dillo on google, the first line (the link) of a result entry is sometimes broken into two lines, sometimes not.

Thanks for the explanation.

After some time reviewing it and guessing, I managed to isolate a small testcase, which I'm reviewing now.

--
Cheers
Jorge.-
On Sat, Oct 01, 2011 at 11:03:18AM -0300, Jorge Arellano Cid wrote:
> [...]
> After some time reviewing it and guessing, I managed to isolate a small testcase, which I'm reviewing now.

Sorry, it was not a reliable testcase... :-P

--
Cheers
Jorge.-
Jorge wrote:
> [...]
> Sorry, it was not a reliable testcase... :-P

I haven't looked into this, but do you suppose it could have to do with the part in html.cc that goes

   if (isspace(buf[buf_index])) {
      /* whitespace: group all available whitespace */
      while (++buf_index < bufsize && isspace(buf[buf_index])) ;
      Html_process_space(html, buf + token_start, buf_index - token_start);
      token_start = buf_index;
   } ...

that is, sends off whatever it's accumulated at the end of a packet?
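To make the concern in this question concrete, here is a minimal, self-contained sketch. It is not dillo's html.cc: `countSpaceTokens` and the chunk model are hypothetical names introduced only for illustration, under the assumption that each network packet is tokenized as its own buffer. It shows that a tokenizer which groups whitespace per buffer can emit one whitespace run as two tokens when the run straddles a buffer boundary.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model, not dillo code: count how many whitespace tokens a
// per-buffer tokenizer emits when the same text arrives in different chunks.
int countSpaceTokens(const std::vector<std::string> &chunks) {
   int tokens = 0;
   for (const std::string &buf : chunks) {
      std::size_t i = 0;
      while (i < buf.size()) {
         if (std::isspace(static_cast<unsigned char>(buf[i]))) {
            // Group all whitespace available in *this* buffer only.
            while (++i < buf.size() &&
                   std::isspace(static_cast<unsigned char>(buf[i])))
               ;
            ++tokens; // the run is sent off here, even at the buffer end
         } else {
            ++i;
         }
      }
   }
   return tokens;
}
```

In this model the same bytes produce a different token stream depending on where the buffers happen to be cut, which is the kind of timing-dependent input the question is asking about. Whether that actually affects dillo's rendering is exactly what the thread leaves open.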
On Sat, Oct 01, 2011 at 04:45:52PM +0000, corvid wrote:
> I haven't looked into this, but do you suppose it could have to do with the part in html.cc that goes
> [...]
> that is, sends off whatever it's accumulated at the end of a packet?

Not sure, but I don't think so.

Without the whitespace patch, dillo also doesn't seem to get the available horizontal space for the link right [1].

To me it looks related to the way ParMin propagates upwards in a textblock, but I'm just starting to look into it.

BTW, adding these lines to style.css helps to visualize it:

   table {border: 1px solid red !important}
   td {border: 1px solid green !important}
   div {border: 1px solid black !important}

[1] http://www.google.com/search?ie=UTF-8&oe=UTF-8&q=dillo

--
Cheers
Jorge.-
On Sat, Oct 01, 2011 at 03:17:09PM -0300, Jorge Arellano Cid wrote:
> [...]
> To me it looks related to the way ParMin propagates upwards in a textblock, but I'm just starting to look into it.

Yes, I also think it's related to that. Textblock::getExtremesImpl() starts from wrapRef and uses the precomputed line->maxWordMin, line->maxParMax, line->parMin, and line->parMax. From there it computes the remaining values on its own.

So if the computation of line->maxWordMin, line->maxParMax, line->parMin, and line->parMax in wordWrap() doesn't agree with what Textblock::getExtremesImpl() does afterwards, the result becomes unreliable.

Cheers,
Johannes
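The disagreement described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not dillo's Textblock code: `Word`, `minWidth` and `extremesMin` are hypothetical, `wrapRef` is simplified to a word index, and "minimum width" is taken to be the widest run of words that may not be broken apart. The cached part (words before `wrapRef`) and the remainder are each computed with their own rule, so the two rules can be made to agree or disagree.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model of a word: its width and whether a line break is
// allowed after it (false models a word glued to the next one by nowrap).
struct Word {
   int width;
   bool breakAfter;
};

// Minimum width of words[from, to): the widest unbreakable run.
// With respectNowrap == false every word is treated as breakable.
int minWidth(const std::vector<Word> &words, std::size_t from,
             std::size_t to, bool respectNowrap) {
   int best = 0, run = 0;
   for (std::size_t i = from; i < to; i++) {
      run += words[i].width;
      best = std::max(best, run);
      if (words[i].breakAfter || !respectNowrap)
         run = 0;
   }
   return best;
}

// Combine a "precomputed" value for words before wrapRef with a fresh
// computation for the rest, each using its own rule.
int extremesMin(const std::vector<Word> &words, std::size_t wrapRef,
                bool cacheRespectsNowrap, bool restRespectsNowrap) {
   int cached = minWidth(words, 0, wrapRef, cacheRespectsNowrap);
   int rest = minWidth(words, wrapRef, words.size(), restRespectsNowrap);
   return std::max(cached, rest);
}
```

With the words {30, no break}, {30, break}, {10, break}, the model returns 60 for any `wrapRef` when both rules respect nowrap. If only the second rule does, it returns 60 for `wrapRef` 0 but 30 for `wrapRef` 2, so the answer depends on how much had already been wrapped when the extremes were requested.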
On Sun, Oct 02, 2011 at 11:09:25AM +0200, Johannes Hofmann wrote:
> [...]
> So if the computation of line->maxWordMin, line->maxParMax, line->parMin, and line->parMax in wordWrap() doesn't agree with what Textblock::getExtremesImpl() does afterwards, the result becomes unreliable.

FWIW, this old comment in the code (not present today) may be a good hint of what happens now:

textblock.cc, Textblock::wordWrap()

   /* NOTE: Most code relies on that all values of nowrap are equal for all
    * words within one line. */

--
Cheers
Jorge.-
On Sat, Oct 01, 2011 at 01:11:05PM -0300, Jorge Arellano Cid wrote:
> [...]
> Sorry, it was not a reliable testcase... :-P

Attached is a small and very interesting testcase. (I had a hard time reducing google's page :-P).

Compare with Firefox/Iceweasel rendering.

--
Cheers
Jorge.-
On Tue, Oct 04, 2011 at 01:57:05PM -0300, Jorge Arellano Cid wrote:
> [...]
> Attached is a small and very interesting testcase. (I had a hard time reducing google's page :-P).
>
> Compare with Firefox/Iceweasel rendering.

There's a "min-width" CSS property that dillo doesn't support:

   #center_col{min-width:562px}

Taking it out of the equation makes Firefox render the page quite similarly to dillo; so maybe it is not a rewrap bug, but a missing implementation of CSS min-width.

--
Cheers
Jorge.-
On Tue, Oct 04, 2011 at 04:13:57PM -0300, Jorge Arellano Cid wrote:
> [...]
> Taking it out of the equation makes Firefox render the page quite similarly to dillo; so maybe it is not a rewrap bug, but a missing implementation of CSS min-width.

More precisely:

1.- Without min-width, Firefox shrinks the contents down to a thin column following the window width (similar to dillo), but
2.- Firefox keeps the width of the two innermost DIVs touching the right margin of the parent DIV (unlike dillo).

The second point looks like a bug in dillo, and it happens with and without the patch.

--
Cheers
Jorge.-
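As a rough illustration of the first point, the effect of min-width on a block's width can be sketched as a simple lower bound. This is a simplified sketch, not dillo or Firefox code: `usedWidth` is a hypothetical helper, and max-width, percentages and the other inputs of the real CSS width computation are ignored.

```cpp
#include <algorithm>

// Hypothetical helper: the width a block ends up with when the width the
// container offers is bounded from below by the CSS min-width (in pixels).
int usedWidth(int offeredWidth, int minWidthPx) {
   return std::max(offeredWidth, minWidthPx);
}
```

With the 562px rule from the testcase, a narrow window can no longer squeeze #center_col below 562 pixels, whereas without the rule the column simply follows the window width, which matches the "thin column" behaviour described in point 1.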
On Tue, Oct 04, 2011 at 04:13:57PM -0300, Jorge Arellano Cid wrote:
> [...]
> Taking it out of the equation makes Firefox render the page quite similarly to dillo; so maybe it is not a rewrap bug, but a missing implementation of CSS min-width.

A colleague just found a simple test case, and the fix for it also seems to fix the google case - but it needs some more testing.

Testcase:

   <b>Hello World</b><div>Hello world</div>

Notice how the first line breaks without need when narrowing the window.

Cheers,
Johannes
On Wed, Oct 05, 2011 at 01:32:29PM +0200, Johannes Hofmann wrote:
> [...]
> Testcase:
>
>    <b>Hello World</b><div>Hello world</div>
>
> Notice how the first line breaks without need when narrowing the window.

Good testcase!

Here, the first sentence is broken regardless of the window size, and FWIW, this patch solves it:

@@ -1825,6 +1825,7 @@ void Textblock::addParbreak (int space,
    word = addWord (0, 0, 0, style);
    word->content.type = core::Content::BREAK;
+   word->content.breakType = core::Content::BREAK_OK;
    word->content.breakSpace = space;
    wordWrap (words->size () - 1);
 }
@@ -1846,6 +1847,7 @@ void Textblock::addLinebreak (core::styl
    word = addWord (0, 0, 0, style);
    word->content.type = core::Content::BREAK;
+   word->content.breakType = core::Content::BREAK_OK;
    word->content.breakSpace = 0;
    wordWrap (words->size () - 1);
 }

(this patch doesn't solve the google testcase, so I thought it was better to let you know of it).

--
Cheers
Jorge.-
On Wed, Oct 05, 2011 at 12:48:56PM -0300, Jorge Arellano Cid wrote:
> [...]
> (this patch doesn't solve the google testcase, so I thought it was better to let you know of it).

Thanks! I already have a fix and I think the whole thing can be committed tonight or tomorrow.

Cheers,
Johannes
On Wed, Oct 05, 2011 at 05:54:22PM +0200, Johannes Hofmann wrote:
> [...]
> Thanks! I already have a fix and I think the whole thing can be committed tonight or tomorrow.

Ok, another round of testing. Please check how the attached patch works for you.

Cheers,
Johannes

PS: Kudos to Florian, who by accident found the nice test case!
participants (4)
- corvid@lavabit.com
- jcid@dillo.org
- Johannes.Hofmann@gmx.de
- johannes.hofmann@gmx.de