On Fri, Dec 14, 2012 at 04:48:23PM +0100, Sebastian Geerken wrote:
Somehow this post got lost ...
Date: Fri, 14 Dec 2012 15:43:41 +0100
From: Sebastian Geerken <sgeerken at dillo.org>
Subject: Re: [Dillo-dev] Dillo early exit
To: Dillo mailing list <dillo-dev at dillo.org>
Mail-Followup-To: Sebastian Geerken <sgeerken at dillo.org>, Dillo mailing list <dillo-dev at dillo.org>
On Fri, Dec 14, Jorge Arellano Cid wrote:
Just noticed this:
Nav_open_url: new url='http://news.bress.net/search.php?feed=149'
Dns_server [0]: news.bress.net is 67.205.59.213
Connecting to 67.205.59.213
NumPendingStyleSheets=1
*** [dillo/3.0.2] This should not happen! ***
Aborted
This is new, as dillo from Nov 14 doesn't exit. Any clues?
Debugging shows the same issue as Alexander's problem:
On Fri, Dec 14, Alexander Voigt wrote:
with the current Dillo development version 2672:4d0bdcf10ee7 (Fri Dec 14 12:24:54 2012 +0100) I get a segfault when I try to access the Dillo bug database.
[...]
#3 _nextUtf8Char (s=<value optimized out>) at unicode.cc:92
#4 0x000000000047163c in lout::unicode::nextUtf8Char (s=0x98d10c "\267", len=1) at unicode.cc:114
#5 0x000000000044e8d0 in dw::Textblock::addText (this=<value optimized out>, text=0x98d10c "\267", len=<value optimized out>, style=<value optimized out>) at textblock.cc:1430
The HTML parser passes invalid UTF-8 to dw::Textblock. I will make nextUtf8Char more robust (of course, dillo should not crash), but note that Jorge's page is encoded in ISO-8859-1, not UTF-8, as seen here:
000009b0 34 38 22 3e 2d 20 4b 65 79 73 74 72 6f 6b 65 20 |48">- Keystroke |
000009c0 4c 6f 67 67 69 6e 67 20 77 69 74 68 20 42 65 61 |Logging with Bea|
000009d0 63 6f 6e 20 ab 20 53 74 72 61 74 65 67 69 63 20 |con . Strategic |
                     ^^
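Both the "\267" byte (0xb7) in the backtrace and the 0xab byte in the dump have the bit pattern 10xxxxxx, i.e. they are UTF-8 continuation bytes showing up without a lead byte. A minimal sketch of what a more defensive step function could look like follows. The name utf8StepLength is hypothetical and this is not dillo's actual nextUtf8Char; it only checks the structure of the sequence (lead byte, length, continuation bytes), not overlong forms or surrogates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, NOT dillo's nextUtf8Char: returns how many bytes the
// character starting at s occupies, inspecting at most len bytes.
// Invalid or truncated sequences count as one byte, so a caller always
// makes progress and never reads past the end of the buffer.
std::size_t utf8StepLength(const char *s, std::size_t len)
{
    assert(s != nullptr);
    if (len == 0)
        return 0;

    const unsigned char lead = static_cast<unsigned char>(s[0]);
    std::size_t expected;
    if (lead < 0x80)
        expected = 1;                      // 0xxxxxxx: ASCII
    else if ((lead & 0xE0) == 0xC0)
        expected = 2;                      // 110xxxxx
    else if ((lead & 0xF0) == 0xE0)
        expected = 3;                      // 1110xxxx
    else if ((lead & 0xF8) == 0xF0)
        expected = 4;                      // 11110xxx
    else
        return 1;                          // stray continuation byte (0xab, 0xb7) or invalid lead

    if (expected > len)
        return 1;                          // sequence is cut off by the buffer end

    for (std::size_t i = 1; i < expected; i++) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c & 0xC0) != 0x80)
            return 1;                      // expected 10xxxxxx, got something else
    }
    return expected;
}
```

With the arguments from frame #4 (s pointing at "\267", len=1), this returns 1 instead of stepping beyond the single available byte.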
It seems that the FLTK functions perform some validity checks, and sometimes fall back to decoding as ISO-8859-1.
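To illustrate what such a fallback could look like, here is a rough sketch that decodes one character as UTF-8 and, if the bytes are not valid UTF-8, treats the first byte as ISO-8859-1 (where the byte value equals the code point). The names Decoded and decodeUtf8OrLatin1 are made up for this example; this is neither FLTK's nor dillo's code, just an assumption about how the behaviour could be implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: the decoded code point and the bytes consumed.
struct Decoded {
    unsigned codePoint;
    std::size_t length;
};

// Hypothetical decoder, NOT FLTK's or dillo's implementation.
// Valid UTF-8 is decoded normally; anything else is read as one
// ISO-8859-1 byte, whose value is directly its Unicode code point.
Decoded decodeUtf8OrLatin1(const char *s, std::size_t len)
{
    assert(s != nullptr && len > 0);
    const unsigned char *p = reinterpret_cast<const unsigned char *>(s);
    const unsigned char lead = p[0];

    std::size_t n;
    unsigned cp;
    if (lead < 0x80)
        return {lead, 1};                          // ASCII
    else if ((lead & 0xE0) == 0xC0) { n = 2; cp = lead & 0x1Fu; }
    else if ((lead & 0xF0) == 0xE0) { n = 3; cp = lead & 0x0Fu; }
    else if ((lead & 0xF8) == 0xF0) { n = 4; cp = lead & 0x07u; }
    else
        return {lead, 1};                          // stray byte: read as Latin-1

    if (n > len)
        return {lead, 1};                          // truncated: read as Latin-1

    for (std::size_t i = 1; i < n; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return {lead, 1};                      // broken sequence: read as Latin-1
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }

    // Reject overlong encodings, surrogates and values beyond U+10FFFF.
    static const unsigned minValue[] = {0, 0, 0x80, 0x800, 0x10000};
    if (cp < minValue[n] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return {lead, 1};

    return {cp, n};
}
```

With this, the lone 0xab byte from the dump decodes to U+00AB (the left guillemet in ISO-8859-1), the same code point that the proper UTF-8 sequence c2 ab yields.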
AFAIR from comments in FLTK, some UTF-8 functions dealt with mixed Latin-1, UTF-8 and some Windows codec. They got into it because the mix was inevitable for them.

--
Cheers
Jorge.-