Proposed Patch for Zombies Left by Dpi_start_dpid() in dpi.c
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what. I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux). ######## BEGIN PATCH ######## --- a/src/IO/dpi.c 2013-01-27 12:26:38.000000000 -0500 +++ b/src/IO/dpi.c 2013-08-27 14:58:07.995235867 -0400 @@ -33,6 +33,8 @@ #include <netinet/tcp.h> #include <arpa/inet.h> #include <netdb.h> +#include <sys/wait.h> +#include <signal.h> #include "../msg.h" #include "../klist.h" @@ -331,16 +333,33 @@ } } +static void Sigchld_handler (int signum) { + int status; + waitpid (-1, &status, WNOHANG); + } + /* * Start dpid. * Return: 0 starting now, 1 Error. */ static int Dpi_start_dpid(void) { + static int install_sigchld_handler_flag = TRUE; pid_t pid; int st_pipe[2], ret = 1; char *answer; + if ( install_sigchld_handler_flag ) { + /* Install SIGCHLD handler. */ + struct sigaction sa; + sa.sa_handler = Sigchld_handler; + sigemptyset(&sa.sa_mask); + sa.sa_flags = SA_RESTART | SA_NOCLDSTOP; + if (sigaction(SIGCHLD, &sa, 0)) + MSG ("Unable to install SIGCHLD handler.\n"); + install_sigchld_handler_flag = FALSE; + } + /* create a pipe to track our child's status */ if (pipe(st_pipe)) return 1; ######## END PATCH ########
Hi JG, On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it: diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@ // Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN); /* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1); Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place. Cheers, Johannes
Hi, On Wed, Aug 28, 2013 at 07:51:34PM +0200, Johannes Hofmann wrote:
Hi JG,
On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it:
diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@
// Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN);
/* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1);
Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place.
Oh, I'm looking at it now too! For me this is an old TODO: with subtle issues lurking. Somehow, setting SIG_IGN to a child rings me wrong... :-) After some digging (Manual page sigaction(2)): ... "A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged." ... " POSIX.1-1990 disallowed setting the action for SIGCHLD to SIG_IGN. POSIX.1-2001 allows this possibility, so that ignoring SIGCHLD can be used to prevent the creation of zombies (see wait(2)). Nevertheless, the historical BSD and System V behaviors for ignoring SIGCHLD differ, so that the only completely portable method of ensuring that terminated children do not become zombies is to catch the SIGCHLD signal and perform a wait(2) or similar." So, I'm more inclined to using the same method as in dpid and downloads.dpi ... -- Cheers Jorge.-
Hi Jorge, On Wed, Aug 28, 2013 at 08:48:35PM -0400, Jorge Arellano Cid wrote:
Hi,
On Wed, Aug 28, 2013 at 07:51:34PM +0200, Johannes Hofmann wrote:
Hi JG,
On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it:
diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@
// Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN);
/* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1);
Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place.
Oh, I'm looking at it now too! For me this is an old TODO: with subtle issues lurking.
Somehow, setting SIG_IGN to a child rings me wrong... :-)
After some digging (Manual page sigaction(2)):
...
"A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged."
I guess this is no problem, as dpid explicitely adds a signal handler for SIGCHLD again, but I'm happy if you take care of the whole issue :-) Cheers, Johannes
Hi Johannes, JG: On Thu, Aug 29, 2013 at 09:13:49AM +0200, Johannes Hofmann wrote:
Hi Jorge,
On Wed, Aug 28, 2013 at 08:48:35PM -0400, Jorge Arellano Cid wrote:
Hi,
On Wed, Aug 28, 2013 at 07:51:34PM +0200, Johannes Hofmann wrote:
Hi JG,
On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it:
diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@
// Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN);
/* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1);
Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place.
Oh, I'm looking at it now too! For me this is an old TODO: with subtle issues lurking.
Somehow, setting SIG_IGN to a child rings me wrong... :-)
After some digging (Manual page sigaction(2)):
...
"A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged."
I guess this is no problem, as dpid explicitely adds a signal handler for SIGCHLD again, but I'm happy if you take care of the whole issue :-)
The attached program is a stress test for SIGCHLD. The idea is to find a formula that works well even in such cases. IIRC both of you have access to BSD machines so please test it there. The program spawns 255 children which sleep 5 seconds and terminate generating a SIGCHLD avalanche, which should be handled cleanly. It works OK in Debian. $gcc -W -Wall sigcld.c -o sigcld $./sigcld in another terminal (during the 5 secs. sleep window): $ps aux|grep sigcld (shows the long list) then while main is Sleeping 10 seconds: $ps aux|grep sigcld (shows only one sigcld process) after it exits: $ps aux|grep sigcld (no process left) TIA. -- Cheers Jorge.-
Hi Jorge, On Fri, Aug 30, 2013 at 12:41:41PM -0400, Jorge Arellano Cid wrote:
Hi Johannes, JG:
On Thu, Aug 29, 2013 at 09:13:49AM +0200, Johannes Hofmann wrote:
Hi Jorge,
On Wed, Aug 28, 2013 at 08:48:35PM -0400, Jorge Arellano Cid wrote:
Hi,
On Wed, Aug 28, 2013 at 07:51:34PM +0200, Johannes Hofmann wrote:
Hi JG,
On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it:
diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@
// Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN);
/* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1);
Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place.
Oh, I'm looking at it now too! For me this is an old TODO: with subtle issues lurking.
Somehow, setting SIG_IGN to a child rings me wrong... :-)
After some digging (Manual page sigaction(2)):
...
"A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged."
I guess this is no problem, as dpid explicitely adds a signal handler for SIGCHLD again, but I'm happy if you take care of the whole issue :-)
The attached program is a stress test for SIGCHLD. The idea is to find a formula that works well even in such cases. IIRC both of you have access to BSD machines so please test it there.
The program spawns 255 children which sleep 5 seconds and terminate generating a SIGCHLD avalanche, which should be handled cleanly. It works OK in Debian.
$gcc -W -Wall sigcld.c -o sigcld $./sigcld
in another terminal (during the 5 secs. sleep window):
$ps aux|grep sigcld
(shows the long list)
then while main is Sleeping 10 seconds:
$ps aux|grep sigcld
(shows only one sigcld process)
after it exits:
$ps aux|grep sigcld
(no process left)
Works fine on DragonFlyBSD with the #include <wait.h> line removed. Cheers, Johannes
On Sun, Sep 01, 2013 at 08:49:06PM +0200, Johannes Hofmann wrote:
Hi Jorge,
On Fri, Aug 30, 2013 at 12:41:41PM -0400, Jorge Arellano Cid wrote:
Hi Johannes, JG:
On Thu, Aug 29, 2013 at 09:13:49AM +0200, Johannes Hofmann wrote:
Hi Jorge,
On Wed, Aug 28, 2013 at 08:48:35PM -0400, Jorge Arellano Cid wrote:
Hi,
On Wed, Aug 28, 2013 at 07:51:34PM +0200, Johannes Hofmann wrote:
Hi JG,
On Tue, Aug 27, 2013 at 03:15:14PM -0400, JG wrote:
It seems that the dpid children forked by Dpi_start_dpid() in dpi.c are not reaped when they terminate themselves after a period of inactivity while the parent dillo process is still running. The patch below is simply to reap these zombies. The SIGCHLD handler should probably do more, but I am not yet familiar enough with Dillo to know what.
I believe this patch adheres to the Dillo rules of coding style, but am new to the list and a novice at Dillo patching. And hence also apologies to Jorge for this duplication of a previous communication by private email (although with one small change to get the patch to compile under OpenBSD as well as Linux).
I can reproduce the issue here by killing dpid manually. For me the following also fixes it:
diff -r 659dc205c377 src/dillo.cc --- a/src/dillo.cc Wed Aug 21 11:42:38 2013 +0200 +++ b/src/dillo.cc Wed Aug 28 19:48:10 2013 +0200 @@ -329,6 +329,7 @@
// Some OSes exit dillo without this (not GNU/Linux). signal(SIGPIPE, SIG_IGN); + signal(SIGCHLD, SIG_IGN);
/* Handle command line options */ opt_argv = dNew0(char*, numOptions(Options) + 1);
Can you please check whether it works for you too? It would have the advantage that we have all signal related stuff in one place.
Oh, I'm looking at it now too! For me this is an old TODO: with subtle issues lurking.
Somehow, setting SIG_IGN to a child rings me wrong... :-)
After some digging (Manual page sigaction(2)):
...
"A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged."
I guess this is no problem, as dpid explicitely adds a signal handler for SIGCHLD again, but I'm happy if you take care of the whole issue :-)
The attached program is a stress test for SIGCHLD. The idea is to find a formula that works well even in such cases. IIRC both of you have access to BSD machines so please test it there.
The program spawns 255 children which sleep 5 seconds and terminate generating a SIGCHLD avalanche, which should be handled cleanly. It works OK in Debian.
$gcc -W -Wall sigcld.c -o sigcld $./sigcld
in another terminal (during the 5 secs. sleep window):
$ps aux|grep sigcld
(shows the long list)
then while main is Sleeping 10 seconds:
$ps aux|grep sigcld
(shows only one sigcld process)
after it exits:
$ps aux|grep sigcld
(no process left)
Works fine on DragonFlyBSD with the #include <wait.h> line removed.
Good! I'll patch and commit soon. -- Cheers Jorge.-
On Sun, Sep 01, 2013 at 02:58:02PM -0400, Jorge Arellano Cid wrote:
On Sun, Sep 01, 2013 at 08:49:06PM +0200, Johannes Hofmann wrote:
Hi Jorge, [...] Works fine on DragonFlyBSD with the #include <wait.h> line removed.
Good!
I'll patch and commit soon.
Committed. -- Cheers Jorge.-
participants (4)
-
jcid@dillo.org
-
jgaffney@lawrenceville.org
-
Johannes.Hofmann@gmx.de
-
johannes.hofmann@gmx.de