[gzip decoder]

Hi,

On Mon, Nov 12, 2007 at 02:52:43PM +0000, place wrote:
> Hmm, how unfortunate. It had been working just fine for me, though I
> took it out yesterday so that I could have a clean tree... I wonder
> whether we're constantly going to see timing problems or something,
> where the people with fast machines write things that work on fast
> machines and the person with a slow machine will write things that
> work on a slow machine.
Good news! Committed.

I found a workaround for the segfault and decided to commit it, to
allow further investigation and polishing of the in-CVS code.

It doesn't look like a race condition, but rather a problem in the
handling of redirections and the null_decoder in cache.c (the
redirection code is somewhat ad hoc, not well designed yet).

Attached goes a page to reproduce the segfault (without the patch in
CVS). For instance:

  1- Save the attached page in /tmp
  2- dillo-fltk /tmp
  3- Click on the page
  4- Go back
  5- Go forward (segfault) // you may need to repeat steps 4 and 5

BTW, the workaround is mainly:

-   dStr_append_l(entry->Data, buf, (int)buf_size);
+   /* Assert we have a Decoder.
+    * BUG: this is a workaround, more study and a proper design
+    * for handling redirects is required */
+   if (entry->Decoder != NULL) {
+      decodedBuf = a_Decode_process(entry->Decoder, buf, buf_size);
+      dStr_append_l(entry->Data, decodedBuf->str, decodedBuf->len);
+      dStr_free(decodedBuf, 1);
+   } else {
+      dStr_append_l(entry->Data, buf, buf_size);
+   }

With regard to:

> // Doesn't work. I could make TotalSize into something like
> // BytesRemaining, seeing whether it goes precisely to 0.

I'd prefer TransferSize (this is the whole HTTP transfer size minus
the header length, i.e. Content-Length). There's a rough sketch of
what I mean at the end of this mail.

With regard to the iconv decoder: please note that we may still need
the original data, to be able to save it verbatim. That is, if the
original page is encoded in latin2 (with a <meta http-equiv charset
line in the source) and it is saved translated to UTF-8, we have two
problems: the now-misleading "meta", and the fact that the user gets
a page that's not a verbatim copy of the original.

One way to solve this is to re-encode into the original charset at
save time. This is not 8-bit clean, but could work most of the time
(see the iconv sketch at the end of this mail). Another way is to
keep a copy of the verbatim data. In this case we're 8-bit clean, and
it would only take more memory when the original is not UTF-8.

I *feel* 8-bit clean is the correct path, and there are lots of ways
to optimize here. For instance, with UTF-8 pages we don't need an
extra buffer. For pages that do need one, we can deallocate the UTF-8
encoded copy when leaving the page and re-create it if the page is
visited again (sketched at the end of this mail as well).

This is just some food for thought.

-- 
Cheers
Jorge.-
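
P.S. In case it helps the discussion, here is a minimal sketch of the
TransferSize idea. All names in it (CacheEntry_t, transfer_done, ...)
are only illustrative assumptions, not actual cache.c code:

/* Hypothetical sketch: TransferSize is the whole HTTP transfer size
 * minus the header length, i.e. Content-Length (-1 when unknown). */
#include <stdlib.h>
#include <string.h>

typedef struct {
   long TransferSize;    /* expected body size, from Content-Length */
   long BytesReceived;   /* body bytes received so far */
} CacheEntry_t;

/* Pull Content-Length out of a raw header block (simplified: a real
 * parser would match case-insensitively and validate the value). */
static long get_content_length(const char *header)
{
   const char *p = strstr(header, "Content-Length:");
   return p ? strtol(p + 15, NULL, 10) : -1;
}

/* The transfer is complete exactly when the body bytes received match
 * TransferSize; no decreasing BytesRemaining counter is needed. */
static int transfer_done(const CacheEntry_t *entry)
{
   return entry->TransferSize >= 0 &&
          entry->BytesReceived == entry->TransferSize;
}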
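
And a sketch of the "re-encode at save time" option, using iconv(3).
Error handling is simplified (a real version would loop on E2BIG and
flush shift state), and the function name is an assumption:

/* Hypothetical sketch: re-encode a UTF-8 buffer back into the page's
 * original charset at save time. Not 8-bit clean: bytes that didn't
 * survive the decode round-trip can't be reproduced, and unmappable
 * characters simply make iconv() fail here. */
#include <iconv.h>
#include <stdlib.h>

/* Returns a malloc'd buffer with the re-encoded data (length in
 * *out_len), or NULL on failure. */
static char *reencode_for_save(const char *utf8, size_t len,
                               const char *orig_charset, size_t *out_len)
{
   iconv_t cd = iconv_open(orig_charset, "UTF-8");
   char *out, *in_p, *out_p;
   size_t out_size, in_left, out_left;

   if (cd == (iconv_t)-1)
      return NULL;

   out_size = len * 2 + 1;    /* generous bound for legacy charsets */
   out = malloc(out_size);
   in_p = (char *)utf8;       /* iconv's prototype isn't const-clean */
   out_p = out;
   in_left = len;
   out_left = out_size;

   if (iconv(cd, &in_p, &in_left, &out_p, &out_left) == (size_t)-1) {
      free(out);
      out = NULL;
   } else {
      *out_len = out_size - out_left;
   }
   iconv_close(cd);
   return out;
}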
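
Finally, a sketch of the 8-bit clean option: the verbatim buffer is
authoritative and always kept, and the UTF-8 copy is treated as a
disposable cache. Again, the struct and function names are
assumptions:

/* Hypothetical sketch: "save" always works from the verbatim bytes;
 * the UTF-8 copy exists only while the page needs rendering. */
#include <stdlib.h>

typedef struct {
   char  *verbatim;      /* original bytes, always kept (8-bit clean) */
   size_t verbatim_len;
   char  *utf8;          /* NULL for pages that are UTF-8 already; for
                          * the rest, freed on leave and re-created by
                          * re-running the charset decoder on revisit */
   size_t utf8_len;
} PageData_t;

/* On leaving the page, reclaim only the UTF-8 copy. */
static void page_leave(PageData_t *p)
{
   free(p->utf8);
   p->utf8 = NULL;
   p->utf8_len = 0;
}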