Latest on invalid image callback references
The valgrind logs are still showing invalid memory accesses that happen when image callbacks refer to DilloImages that have been freed by a_Web_free(). This is after the imgref patch was applied, since when I've been running an essentially unpatched Dillo. My image tracing still shows that the image pipeline sometimes continues to run after a_Web_free() frees the image it is attached to. As far as I can tell there has been no recent change in the behaviour of this problem. I have just applied a patch I was considering earlier that makes a_Cache_stop_client() close the client. I've attached it for comments, but I won't suggest applying it until we can see whether it's made any difference. That will take a few days, as I still haven't found a reliable way to reproduce the bug so I will just have to wait a few days and see if it stops appearing in the logs. Please comment on whether this patch is the right way to fix this. Regards, Jeremy Henty PS. You'll probably notice fewer valgrind reports in my log pages in future because I just returned to full-time work so I won't be surfing from home nearly as much.
On Tue, May 19, 2009 at 07:34:30AM +0100, Jeremy Henty wrote:
The valgrind logs are still showing invalid memory accesses that happen when image callbacks refer to DilloImages that have been freed by a_Web_free(). This is after the imgref patch was applied, since when I've been running an essentially unpatched Dillo. My image tracing still shows that the image pipeline sometimes continues to run after a_Web_free() frees the image it is attached to. As far as I can tell there has been no recent change in the behaviour of this problem.
I see.
I have just applied a patch I was considering earlier that makes a_Cache_stop_client() close the client. I've attached it for comments, but I won't suggest applying it until we can see whether it's made any difference. That will take a few days, as I still haven't found a reliable way to reproduce the bug so I will just have to wait a few days and see if it stops appearing in the logs. Please comment on whether this patch is the right way to fix this.
A long time ago when coding the bindings for the new imgbuf, I seem to remember seeing a corner case that would leak. Back then the priority was to get the port into a working state so it was left for future fixing (as many other things). Since then, the image code has undergone major changes and cleanups and it may take some effort to get to this bug. One chance I checked yesterday was a mismatched image type (e.g. PNG served as GIF) but it wasn't the cause. As we're almost done with rc1 for the dillo-2.1 release. I'd prefer to work on it after the release, or to dedicate more time to it only if there's a suitable window in the release process. A huge step forward would be to find a way to repoduce the bug. WRT the patch, it may be the fix, but there's a need to check for the current semantic of stop_clients in all cases. This function may only remove the client but not shut the decoding process (think of a repush for instance), and BTW some parts of the code use a flag (as "force"). My guess is that this semantic is still fuzzy so it may be necessary to study the whole, define an accurate behaviour, code accordingly, check the corner cases and to test in the "wild". -- Cheers Jorge.-
On Fri, May 22, 2009 at 02:31:58PM -0400, Jorge Arellano Cid wrote:
On Tue, May 19, 2009 at 07:34:30AM +0100, Jeremy Henty wrote:
I have just applied a patch I was considering earlier that makes a_Cache_stop_client() close the client.
And now I have just unapplied it because it crashed Dillo all the time. <sigh> If this patch isn't wrong, then something else is.
A huge step forward would be to find a way to repoduce the bug.
Believe me, I know. Unfortunately my current patches do little more than confirm at the end of the session that sometime previously, something, somewhere went wrong. Which is great for confirming the existence of the bug, but for fixing it, not so much. What's really needed is complete logging of the CCC, the cache and the dicache. Back to the drawing board...
My guess is that this semantic is still fuzzy ...
It's nice to know that it isn't just me who doesn't understand the code! :-) Just what is a repush anyway? (BTW, I'm not complaining, just pining for the good old days when Dillo was so buggy that it was easy for me to find something I could actually fix. :-) :-) ) Regards, Jeremy Henty PS. I agree that this should not block Dillo-2.1 . Just think of it as the Moby Dick to my Captain Ahab. From Hell's heart I stab at thee, freed struct DilloImage! Dillo-2.1 *will* rock, whether or not this is fixed.
participants (2)
-
jcid@dillo.org
-
onepoint@starurchin.org