Hi, [gzip decoder] On Mon, Nov 12, 2007 at 02:52:43PM +0000, place wrote:
Hmm, how unfortunate. It had been working just fine for me, though I took it out yesterday so that I could have a clean tree... I wonder whether we're constantly going to see timing problems or something, where the people with fast machines write things that work on fast machines and the person with a slow machine will write things that work on a slow machine.
Good news! Committed. I found a workaround for the segfault and decided to commit to allow further investigation and polishing from in-CVS code. It doesn't look like a race condition, but a problem of handling redirections and the null_decoder in cache.c (redirections is a somewhat ad-hoc code, not well designed yet). Attached goes a page to reproduce the segfault (without the patch in CVS). For instance: 1- save the attached page in /tmp 2- dillo-fltk /tmp 3- click on the page 4- go back 5- go forward (segfault) // you may need to repeat this 4 and 5 BTW, the workaround is mainly: - dStr_append_l(entry->Data, buf, (int)buf_size); + /* Assert we have a Decoder. + * BUG: this is a workaround, more study and a proper design + * for handling redirects is required */ + if (entry->Decoder != NULL) { + decodedBuf = a_Decode_process(entry->Decoder, buf, buf_size); + dStr_append_l(entry->Data, decodedBuf->str, decodedBuf->len); + dStr_free(decodedBuf, 1); + } else { + dStr_append_l(entry->Data, buf, buf_size); + } With regard to: // Doesn't work. I could make TotalSize into something like BytesRemaining, // seeing whether it goes precisely to 0. I'd prefer TransferSize (this is the whole http transfer size minus the Header length --i.e. Content-Length). With regard to the iconv decoder: Please note that we may still need the original data, to be able to save verbatim. This is, if the original page is encoded in latin2 (with a <meta http-equiv charset line in the source), and it is saved translated to UTF-8 we have two problems (the misleading "meta" and that the user will get a page that's not a verbatim copy of the original). One way to solve this is to re-encode into the original charset at save time. This is not 8bit clean but could work most of the time. Another way is to keep a copy of the verbatim data. In this case we're 8bit clean and it would only take more memory when the original is not UTF-8. I *feel* 8bit clean is the correct path, and here there're lots of ways to optimize. For instance, with UTF-8 pages we don't need an extra buffer. For pages that need one, we can deallocate the UTF-8 encoded one when leaving the page (and re-create it if the page is visited again). This is just some food for thought. -- Cheers Jorge.-
Jorge wrote:
It doesn't look like a race condition, but a problem of handling redirections and the null_decoder in cache.c (redirections is a somewhat ad-hoc code, not well designed yet).
Attached goes a page to reproduce the segfault (without the patch in CVS). For instance:
1- save the attached page in /tmp 2- dillo-fltk /tmp 3- click on the page 4- go back 5- go forward (segfault) // you may need to repeat this 4 and 5
Perhaps along these same lines, I did manage to break my image-loading code once when I clicked a few times on an image that wanted to redirect. I couldn't get it to crash again, though.
With regard to the iconv decoder:
Please note that we may still need the original data, to be able to save verbatim. This is, if the original page is encoded in latin2 (with a <meta http-equiv charset line in the source), and it is saved translated to UTF-8 we have two problems (the misleading "meta" and that the user will get a page that's not a verbatim copy of the original).
One way to solve this is to re-encode into the original charset at save time. This is not 8bit clean but could work most of the time.
Another way is to keep a copy of the verbatim data. In this case we're 8bit clean and it would only take more memory when the original is not UTF-8.
I *feel* 8bit clean is the correct path, and here there're lots of ways to optimize. For instance, with UTF-8 pages we don't need an extra buffer. For pages that need one, we can deallocate the UTF-8 encoded one when leaving the page (and re-create it if the page is visited again).
This is just some food for thought.
You had mentioned once before that you wanted to keep the original in cache, and I agree completely.
On Tue, Nov 13, 2007 at 11:11:29AM -0300, Jorge Arellano Cid wrote:
Hi,
[gzip decoder]
On Mon, Nov 12, 2007 at 02:52:43PM +0000, place wrote:
Hmm, how unfortunate. It had been working just fine for me, though I took it out yesterday so that I could have a clean tree... I wonder whether we're constantly going to see timing problems or something, where the people with fast machines write things that work on fast machines and the person with a slow machine will write things that work on a slow machine.
Good news! Committed.
Is decode.h perhaps missing in cvs? cheers, Johannes
On Tue, Nov 13, 2007 at 09:10:08PM +0100, Johannes Hofmann wrote:
On Tue, Nov 13, 2007 at 11:11:29AM -0300, Jorge Arellano Cid wrote:
Hi,
[gzip decoder]
On Mon, Nov 12, 2007 at 02:52:43PM +0000, place wrote:
Hmm, how unfortunate. It had been working just fine for me, though I took it out yesterday so that I could have a clean tree... I wonder whether we're constantly going to see timing problems or something, where the people with fast machines write things that work on fast machines and the person with a slow machine will write things that work on a slow machine.
Good news! Committed.
Is decode.h perhaps missing in cvs?
Ooops! Yes, sorry... Committed both: decode.[ch] -- Cheers Jorge.-
Now that I'm on gcc4, let's get rid of the embarrassing signedness warnings: diff -pur dillo2/src/decode.c dillo2-cur/src/decode.c --- dillo2/src/decode.c 2007-11-14 11:40:05.000000000 +0000 +++ dillo2-cur/src/decode.c 2007-11-14 19:40:29.000000000 +0000 @@ -32,10 +32,10 @@ static Dstr *Decode_gzip(Decode *dc, con Dstr *output = dStr_new(""); while ((rc == Z_OK) && (inputConsumed < inLen)) { - zs->next_in = (char *)inData + inputConsumed; + zs->next_in = (Bytef *)inData + inputConsumed; zs->avail_in = inLen - inputConsumed; - zs->next_out = dc->buffer; + zs->next_out = (Bytef *)dc->buffer; zs->avail_out = bufsize; rc = inflate(zs, Z_SYNC_FLUSH); @@ -69,7 +69,7 @@ static Dstr *Decode_charset(Decode *dc, Dstr *input, *output; char *inPtr, *outPtr; - int inLeft, outRoom; + size_t inLeft, outRoom; output = dStr_new("");
participants (3)
-
jcid@dillo.org
-
Johannes.Hofmann@gmx.de
-
place@gobigwest.com