Nav_repush() is hurting valgrind
The valgrind logs[1] show a recent burst of errors from calls to Nav_repush(). Also, I'm currently trialling a patch to fix the image access errors from a_Cache_stop_client() [2]. I don't think it can be the cause of the Nav_repush() problems but I'd like a second opinion (I don't yet fully understand how this code should work). Regards, Jeremy Henty [1] http://starurchin.singularity/dillo/valgrind.html [2] http://starurchin.singularity/dillo/patch/make-a_Cache_stop_client-close-cli...
On Sat, May 09, 2009 at 10:27:34AM +0100, Jeremy Henty wrote:
The valgrind logs[1] show a recent burst of errors from calls to Nav_repush().
Oh, here I can't reproduce that. I've tried with some sites and the valgrind log here is clean (a bit better than it used to be). Do you have an example URL or test case? BTW, I'm talking about the vanilla repository. Are you testing with a patched repo?
Also, I'm currently trialling a patch to fix the image access errors from a_Cache_stop_client() [2]. I don't think it can be the cause of the Nav_repush() problems but I'd like a second opinion (I don't yet fully understand how this code should work).
[1] http://starurchin.singularity/dillo/valgrind.html [2] http://starurchin.singularity/dillo/patch/make-a_Cache_stop_client-close-cli...
The DNS here can't resolve the hostname. -- Cheers Jorge.-
On Sat, May 09, 2009 at 10:00:55AM -0400, Jorge Arellano Cid wrote:
On Sat, May 09, 2009 at 10:27:34AM +0100, Jeremy Henty wrote:
The valgrind logs[1] show a recent burst of errors from calls to Nav_repush().
Oh, here I can't reproduce that. I've tried with some sites and the valgrind log here is clean (a bit better than it used to be). Do you have an example URL or test case?
Not yet. I'll see what I can dig out of the logs.
BTW, I'm talking about the vanilla repository. Are you testing with a patched repo?
Yes. If it continues I'll back out any relevant patches and see if it changes anything. I've added a list of my local patches to the valgrind log page so that developers can see what I've done. http://starurchin.org/dillo/valgrind.html The only two that could (I think) possibly be relevant are: http://starurchin.org/dillo/patch/img_ref.diff.txt http://starurchin.org/dillo/patch/make-a_Cache_stop_client-close-client.txt The first patch was suggested to me by the list when I first reported the image access problems, the second is my attempt to fix it. I'd really appreciate any comments on whether they are doing things the right way.
The DNS here can't resolve the hostname.
D'oh! My bad, that's the copy on my local machine. The correct links are above. Regards, Jeremy Henty
On Sat, May 09, 2009 at 04:13:08PM +0100, Jeremy Henty wrote:
On Sat, May 09, 2009 at 10:00:55AM -0400, Jorge Arellano Cid wrote:
On Sat, May 09, 2009 at 10:27:34AM +0100, Jeremy Henty wrote:
The valgrind logs[1] show a recent burst of errors from calls to Nav_repush().
Oh, here I can't reproduce that. I've tried with some sites and the valgrind log here is clean (a bit better than it used to be). Do you have an example URL or test case?
Not yet. I'll see what I can dig out of the logs.
I've found this: ==9830== 1,420 (352 direct, 1,068 indirect) bytes in 11 blocks are definitely lost in loss record 153 of 192 ==9830== at 0x4025D2E: malloc (vg_replace_malloc.c:207) ==9830== by 0x8081270: dMalloc (dlib.c:36) ==9830== by 0x807E46A: a_Image_new (image.cc:40) ==9830== by 0x80703C5: a_Html_image_new(DilloHtml*, char const*, int, _DilloUrl*) (html.cc:2067) ==9830== by 0x8074F49: _ZL16Html_input_imageP9DilloHtmlPKci (form.cc:1905) ==9830== by 0x8078C5A: Html_tag_open_input(DilloHtml*, char const*, int) (form.cc:484) ==9830== by 0x8071F71: _ZL16Html_process_tagP9DilloHtmlPci (html.cc:3455) ==9830== by 0x80725E8: _ZL14Html_write_rawP9DilloHtmlPcii (html.cc:3756) ==9830== by 0x807276F: DilloHtml::write(char*, int, int) (html.cc:573) ==9830== by 0x8072802: _ZL13Html_callbackiP12_CacheClient (html.cc:3647) ==9830== by 0x8060BC7: Cache_process_queue (cache.c:1154) ==9830== by 0x80603A0: a_Cache_process_dbuf (cache.c:876) in www.lacuarta.cl
BTW, I'm talking about the vanilla repository. Are you testing with a patched repo?
Yes. If it continues I'll back out any relevant patches and see if it changes anything. I've added a list of my local patches to the valgrind log page so that developers can see what I've done.
http://starurchin.org/dillo/valgrind.html
The only two that could (I think) possibly be relevant are:
http://starurchin.org/dillo/patch/img_ref.diff.txt http://starurchin.org/dillo/patch/make-a_Cache_stop_client-close-client.txt
The first patch was suggested to me by the list when I first reported the image access problems, the second is my attempt to fix it. I'd really appreciate any comments on whether they are doing things the right way.
Now I'm working on this, comments should follow along the way. Please check whether the reported errors also happen in the vanilla repo. -- Cheers Jorge.-
On Sat, May 09, 2009 at 12:19:17PM -0400, Jorge Arellano Cid wrote:
I've found this:
==9830== 1,420 (352 direct, 1,068 indirect) bytes in 11 blocks are definitely lost in loss record 153 of 192 [...] in www.lacuarta.cl
I can reproduce this with a vanilla Dillo, but the leak disappears if I apply the imgref patch that was suggested to me a while back. (Patch attached.)
Please check whether the reported errors also happen in the vanilla repo.
OK, I'll back out the image patches and see what the logs show then. (It may take a few days before I can be sure if it's made a difference.) I'd still suggest applying the imgref patch to the mainline as it fixes your leak above. It looks as though DilloImages aren't getting unrefed enough, and the patch fixes that. Regards, Jeremy Henty
On Sat, May 09, 2009 at 08:16:05PM +0100, Jeremy Henty wrote:
On Sat, May 09, 2009 at 12:19:17PM -0400, Jorge Arellano Cid wrote:
I've found this:
==9830== 1,420 (352 direct, 1,068 indirect) bytes in 11 blocks are definitely lost in loss record 153 of 192 [...] in www.lacuarta.cl
I can reproduce this with a vanilla Dillo, but the leak disappears if I apply the imgref patch that was suggested to me a while back. (Patch attached.)
Good! Image reference counting looks fishy inside the code. I'll review the patch.
Please check whether the reported errors also happen in the vanilla repo.
OK, I'll back out the image patches and see what the logs show then. (It may take a few days before I can be sure if it's made a difference.) I'd still suggest applying the imgref patch to the mainline as it fixes your leak above. It looks as though DilloImages aren't getting unrefed enough, and the patch fixes that.
I still don't know what kind of valgrind problems are happening in vanilla dillo. This is useful information, but getting logs of a patched dillo is certainly confusing. -- Cheers Jorge.-
On Sat, May 09, 2009 at 12:19:17PM -0400, Jorge Arellano Cid wrote:
I've found this:
==9830== 1,420 (352 direct, 1,068 indirect) bytes in 11 blocks are definitely lost in loss record 153 of 192 ==9830== at 0x4025D2E: malloc (vg_replace_malloc.c:207) ==9830== by 0x8081270: dMalloc (dlib.c:36) ==9830== by 0x807E46A: a_Image_new (image.cc:40) ==9830== by 0x80703C5: a_Html_image_new(DilloHtml*, char const*, int, _DilloUrl*) (html.cc:2067) ==9830== by 0x8074F49: _ZL16Html_input_imageP9DilloHtmlPKci (form.cc:1905) ==9830== by 0x8078C5A: Html_tag_open_input(DilloHtml*, char const*, int) (form.cc:484) ==9830== by 0x8071F71: _ZL16Html_process_tagP9DilloHtmlPci (html.cc:3455) ==9830== by 0x80725E8: _ZL14Html_write_rawP9DilloHtmlPcii (html.cc:3756) ==9830== by 0x807276F: DilloHtml::write(char*, int, int) (html.cc:573) ==9830== by 0x8072802: _ZL13Html_callbackiP12_CacheClient (html.cc:3647) ==9830== by 0x8060BC7: Cache_process_queue (cache.c:1154) ==9830== by 0x80603A0: a_Cache_process_dbuf (cache.c:876)
in www.lacuarta.cl
Here are two more that show the same thing. With a vanilla dillo they leak DilloImages. The imgref patch that I attached earlier fixes the leaks. http://stephenlaw.blogspot.com/2009/05/careful-how-you-use-bogus.html http://www.nybooks.com/articles/22694 I've attached the valgrind logs. The "__0.txt" versions are from the vanilla repo. The others are after applying the imgref patch. No other patches were applied. I think it's clear that image reference counting is currently broken and the imgref patch, or something like it, is needed to fix it. Regards, Jeremy Henty
On Sun, May 10, 2009 at 08:20:09AM +0100, Jeremy Henty wrote:
Here are two more that show the same thing. With a vanilla dillo they leak DilloImages. The imgref patch that I attached earlier fixes the leaks.
http://stephenlaw.blogspot.com/2009/05/careful-how-you-use-bogus.html http://www.nybooks.com/articles/22694
I've attached the valgrind logs. The "__0.txt" versions are from the vanilla repo. The others are after applying the imgref patch. No other patches were applied.
Oops, I screwed up and overwrote one of the logs before attaching it. Ignore the one that reports a segfault. The others are OK: it's definitely the case that the vanilla Dillo leaks DilloImages on these URLs and that the imgref patch fixes the leak. Regards, Jeremy Henty
On Sun, May 10, 2009 at 08:20:09AM +0100, Jeremy Henty wrote:
On Sat, May 09, 2009 at 12:19:17PM -0400, Jorge Arellano Cid wrote:
I've found this:
==9830== 1,420 (352 direct, 1,068 indirect) bytes in 11 blocks are definitely lost in loss record 153 of 192 ==9830== at 0x4025D2E: malloc (vg_replace_malloc.c:207) ==9830== by 0x8081270: dMalloc (dlib.c:36) ==9830== by 0x807E46A: a_Image_new (image.cc:40) ==9830== by 0x80703C5: a_Html_image_new(DilloHtml*, char const*, int, _DilloUrl*) (html.cc:2067) ==9830== by 0x8074F49: _ZL16Html_input_imageP9DilloHtmlPKci (form.cc:1905) ==9830== by 0x8078C5A: Html_tag_open_input(DilloHtml*, char const*, int) (form.cc:484) ==9830== by 0x8071F71: _ZL16Html_process_tagP9DilloHtmlPci (html.cc:3455) ==9830== by 0x80725E8: _ZL14Html_write_rawP9DilloHtmlPcii (html.cc:3756) ==9830== by 0x807276F: DilloHtml::write(char*, int, int) (html.cc:573) ==9830== by 0x8072802: _ZL13Html_callbackiP12_CacheClient (html.cc:3647) ==9830== by 0x8060BC7: Cache_process_queue (cache.c:1154) ==9830== by 0x80603A0: a_Cache_process_dbuf (cache.c:876)
in www.lacuarta.cl
Here are two more that show the same thing. With a vanilla dillo they leak DilloImages. The imgref patch that I attached earlier fixes the leaks.
http://stephenlaw.blogspot.com/2009/05/careful-how-you-use-bogus.html http://www.nybooks.com/articles/22694
I've attached the valgrind logs. The "__0.txt" versions are from the vanilla repo. The others are after applying the imgref patch. No other patches were applied.
I think it's clear that image reference counting is currently broken and the imgref patch, or something like it, is needed to fix it.
Patch reviewed and committed! Now, with this patch, is there any other Image-related problem detected by valgrind? AFAIS, it was made by Johannes, corvid and you? Just to give due credit in the ChangeLog. -- Cheers Jorge.-
On Sun, May 10, 2009 at 01:47:56PM -0400, Jorge Arellano Cid wrote:
AFAIS, it was made by Johannes, corvid and you? Just to give due credit in the ChangeLog.
It was sent to me a while back by either Johannes or corvid, I'm afraid I forget which. No credit is due to me as I used it exactly as it was given to me. Regards, Jeremy Henty
On Sun, May 10, 2009 at 08:20:09AM +0100, Jeremy Henty wrote:
Here are two more that show the same thing. With a vanilla dillo they leak DilloImages. The imgref patch that I attached earlier fixes the leaks.
http://stephenlaw.blogspot.com/2009/05/careful-how-you-use-bogus.html http://www.nybooks.com/articles/22694
I've attached the valgrind logs. The "__0.txt" versions are from the vanilla repo. The others are after applying the imgref patch. No other patches were applied.
I see "bogus_log.txt" has a SIGSEGV. Does it mean you get a segfault exit with the patch applied on that page? (it doesn't happen here, and I've got no segfault in a long time). -- Cheers Jorge.-
On Sun, May 10, 2009 at 01:53:34PM -0400, Jorge Arellano Cid wrote:
On Sun, May 10, 2009 at 08:20:09AM +0100, Jeremy Henty wrote:
I've attached the valgrind logs. The "__0.txt" versions are from the vanilla repo. The others are after applying the imgref patch. No other patches were applied.
I see "bogus_log.txt" has a SIGSEGV.
Ignore that. I accidentally overwrote the original log while I was fixing problems due to my image tracing code interacting badly with recent changes. (That was also almost certainly the cause of the burst of valgrind errors on the 9th, so ignore those too.) Regards, Jeremy Henty
Replying to myself. On Sun, May 10, 2009 at 08:20:09AM +0100, Jeremy Henty wrote:
Here are two more that show the same thing. With a vanilla dillo they leak DilloImages. The imgref patch that I attached earlier fixes the leaks.
http://stephenlaw.blogspot.com/2009/05/careful-how-you-use-bogus.html http://www.nybooks.com/articles/22694
And they are now leak-free after pulling the latest tip. Awesome! Jeremy Henty
participants (2)
-
jcid@dillo.org
-
onepoint@starurchin.org