Hi,
[gzip decoder]
On Mon, Nov 12, 2007 at 02:52:43PM +0000, place wrote:
> Hmm, how unfortunate. It had been working just fine for me, though I took it
> out yesterday so that I could have a clean tree... I wonder whether
> we're constantly going to see timing problems or something, where the
> people with fast machines write things that work on fast machines and
> the person with a slow machine will write things that work on a slow machine.
Good news!
Committed.
I found a workaround for the segfault and committed it, so that further
investigation and polishing can proceed from the in-CVS code.
It doesn't look like a race condition, but rather a problem in the handling of
redirections and the null_decoder in cache.c (the redirection code is still
somewhat ad hoc and not well designed yet).
Attached is a page that reproduces the segfault (without the patch now in CVS).
For instance:
1- save the attached page in /tmp
2- dillo-fltk /tmp
3- click on the page
4- go back
5- go forward (segfault) // you may need to repeat steps 4 and 5
BTW, the workaround is mainly:
- dStr_append_l(entry->Data, buf, (int)buf_size);
+ /* Assert we have a Decoder.
+ * BUG: this is a workaround, more study and a proper design
+ * for handling redirects is required */
+ if (entry->Decoder != NULL) {
+ decodedBuf = a_Decode_process(entry->Decoder, buf, buf_size);
+ dStr_append_l(entry->Data, decodedBuf->str, decodedBuf->len);
+ dStr_free(decodedBuf, 1);
+ } else {
+ dStr_append_l(entry->Data, buf, buf_size);
+ }
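
Just to make the idea behind that check concrete: the null decoder is
basically a pass-through stage, so every entry could always carry a valid
Decoder and the append path wouldn't need the NULL test. A rough,
self-contained illustration of the concept (not dillo's actual code; the
names here are made up):

#include <stdlib.h>
#include <string.h>

typedef struct {
   char *str;
   size_t len;
} Chunk;

/* A null decoder just copies its input verbatim, so callers can treat
 * encoded and plain transfers uniformly and free the result either way */
static Chunk *null_decode(const char *buf, size_t buf_size)
{
   Chunk *out = malloc(sizeof(Chunk));
   out->str = malloc(buf_size);
   memcpy(out->str, buf, buf_size);
   out->len = buf_size;
   return out;
}

static void chunk_free(Chunk *c)
{
   free(c->str);
   free(c);
}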
With regard to:
// Doesn't work. I could make TotalSize into something like BytesRemaining,
// seeing whether it goes precisely to 0.
I'd prefer TransferSize (i.e. the size of the whole HTTP transfer minus the
header length, which amounts to the Content-Length).
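
In other words (a trivial sketch, with made-up names), it would be a fixed
value taken from the header instead of a counter that gets decremented:

#include <stddef.h>

/* TransferSize as described above: the whole HTTP transfer minus the
 * header, i.e. the body bytes, matching Content-Length when present */
static size_t transfer_size(size_t whole_transfer, size_t header_len)
{
   return whole_transfer - header_len;
}

A "finished" test could then compare the raw bytes received so far against
it, instead of watching a remainder go down to zero.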
With regard to the iconv decoder:
Please note that we may still need the original data, to be
able to save it verbatim. That is, if the original page is encoded
in latin2 (with a <meta http-equiv charset line in the source)
and it is saved translated to UTF-8, we have two problems: the
now-misleading "meta" line, and the fact that the user gets a page
that's not a verbatim copy of the original.
One way to solve this is to re-encode into the original charset
at save time. This is not 8bit clean but could work most of the
time.
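
A rough sketch of that save-time path with iconv(3), just to show the shape
of it (names and error handling are simplified, and it assumes a single-byte
original charset like latin2):

#include <iconv.h>
#include <stdlib.h>

/* Convert the UTF-8 buffer back to the page's original charset for "Save".
 * Not 8bit clean: if some byte doesn't round-trip, iconv() fails and we
 * give up (return NULL). */
static char *reencode_for_save(const char *utf8, size_t len,
                               const char *orig_charset, size_t *out_len)
{
   iconv_t cd = iconv_open(orig_charset, "UTF-8");
   char *out, *inp = (char *)utf8, *outp;
   size_t inleft = len, outleft = len;   /* single-byte output never grows */

   if (cd == (iconv_t)-1)
      return NULL;
   outp = out = malloc(len);
   if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
      free(out);
      iconv_close(cd);
      return NULL;
   }
   *out_len = len - outleft;
   iconv_close(cd);
   return out;
}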
Another way is to keep a copy of the verbatim data. In this
case we're 8bit clean and it would only take more memory when the
original is not UTF-8.
I *feel* 8bit clean is the correct path, and here there are lots
of ways to optimize. For instance, with UTF-8 pages we don't need
an extra buffer. For pages that need one, we can deallocate the
UTF-8 encoded one when leaving the page (and re-create it if the
page is visited again).
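
One possible shape for that (field and helper names are made up; I'm
assuming dillo's Dstr from dlib):

#include "dlib/dlib.h"   /* Dstr; header path assumed */

typedef struct {
   Dstr *RawData;    /* verbatim bytes as received: 8bit clean, used by "Save" */
   Dstr *Utf8Data;   /* NULL when the original is already UTF-8 */
} PageText;

/* What the parser/renderer should see */
static Dstr *page_text(PageText *p)
{
   return p->Utf8Data ? p->Utf8Data : p->RawData;
}

/* On leaving the page: drop the converted copy; re-create it on revisit */
static void page_leave(PageText *p)
{
   if (p->Utf8Data) {
      dStr_free(p->Utf8Data, 1);
      p->Utf8Data = NULL;
   }
}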
This is just some food for thought.
--
Cheers
Jorge.-