Another race in signal handling

Chris Shoemaker c.shoemaker at cox.net
Mon Jan 21 17:25:49 CET 2008


On Mon, Jan 21, 2008 at 05:57:36AM +0100, Marc Lehmann wrote:
> On Wed, Jan 16, 2008 at 09:15:50AM -0500, Chris Shoemaker <c.shoemaker at cox.net> wrote:
> > The condition is true, and gotsig becomes 1.  The signal handler returns.
> > 
> > However, as soon as the sighandler returns, the full signal mask is
> > removed, so a new signal may be received at any time.  If a signal is
> > received before sigcb() clears gotsig, the sighandler will not record
> > it, because (!gotsig) will still be false.
> 
> Thats wrong, just look at the code you quote:
> 
> >    signals [signum - 1].gotsig = 1;
> 
> Here the signal gets recorded, which is independent of the global gotsig
> variable, so what you write cannot be true, as the signal will be recorded
> regardless of the value of gotsig.
> 
> indepdnent means the signal occurance will be recorded regardless of the
> value of "gotsig".
> 
> gotsig simply controls wether the event loop should be woken up, signals
> cannot go lost even when gotsig is forced to be 1 all the time (but it can
> delay signal handling indefinitely).

Ok, I like your terms better than mine.  Since I know libev has
recorded the receipt of the signal, and I'm waiting for the ev_signal
watcher's callback to be called, this "indefinite delay" (infinite
delay, really, since the callback will never be called) is rather
troublesome.

> > But, I don't understand the motive for the flag in the first place, so
> 
> Are you sure you didn't tinker with the code in other places and this is a
> pristine copy of libev you are looking at?

It's pristine except that I have disabled the childev watcher in exactly
the same way it's disabled on win32, like so:

@@ -1227,11 +1227,14 @@ ev_default_loop (unsigned int flags)
           siginit (EV_A);
 
 #ifndef _WIN32
+#if 0
           ev_signal_init (&childev, childcb, SIGCHLD);
           ev_set_priority (&childev, EV_MAXPRI);
           ev_signal_start (EV_A_ &childev);
           ev_unref (EV_A); /* child watcher should not keep loop alive */
 #endif
+#endif
         }
       else
         ev_default_loop_ptr = 0;
@@ -1248,9 +1251,12 @@ ev_default_destroy (void)
 #endif
 
 #ifndef _WIN32
+#if 0
   ev_ref (EV_A); /* child watcher */
   ev_signal_stop (EV_A_ &childev);
 #endif
+#endif
 
   ev_ref (EV_A); /* signal watcher */
   ev_io_stop (EV_A_ &sigev);


> In any case, it would be better not to make (wrong) analysis of the
> perceived problems you see, but instead describe your problem, maybe
> deliver a testcase and maybe make a (possibly wrong) analysis of the
> problem (which is often helpful).
> 
> Without the actual problem report, such analyses are, however, useless.

I figured that describing the exact time when interuption by a signal
causes signal watchers to be "indefinitely delayed" would be helpful,
but if you find testcases more helpful, I'm very glad to oblige.  Here
it is:

----- cut -----
#include <ev.h>
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <sys/time.h>

struct ev_signal signal_watcher;
static void signal_cb (EV_P_ struct ev_signal *w, int revents) {
  exit(99);
}
static void _preempt (int signum) {
  printf ("preempt\n");
}

int main (int args, char *argv[]) {
  struct ev_loop *loop = ev_default_loop(EV_FORK_ENABLE);
  pid_t pid;

  int pdes[2], status, res;
  char buf[4];

  ev_loop(loop, EVLOOP_ONESHOT | EVLOOP_NONBLOCK);

  while (1) {
  pipe(pdes);
  if ((pid = fork()) == 0) {
    struct itimerval new, old;
    close(pdes[0]);

    new.it_interval.tv_usec = 50;
    new.it_interval.tv_sec = 0;
    new.it_value.tv_usec = new.it_interval.tv_usec;
    new.it_value.tv_sec = 0;
    signal(SIGALRM, _preempt);
    setitimer(ITIMER_REAL, &new, &old);

    ev_default_fork();
    ev_signal_init (&signal_watcher, signal_cb, SIGHUP);
    ev_signal_start (loop, &signal_watcher);

    ev_loop(loop, EVLOOP_NONBLOCK);

    write(pdes[1], "baz", 4);
    close(pdes[1]);

    while (1) {
      ev_loop(loop, 0);
    }
    exit(0);

  } else {
    close(pdes[1]);
    read(pdes[0], buf, 4);  /* After this point, the ev_signal has started */
    close(pdes[0]);

    kill(pid, SIGHUP);
    res = waitpid(pid, &status, 0);
    printf("Result = %d, ExitStatus = %d\n", res, WEXITSTATUS(status));
    if (WEXITSTATUS(status) != 99) {
      break;
    }
  }
  }
  printf("done\n");
  return 0;
}
----- end cut ----


With the global gotsig variable, this program usually hangs within 20
seconds or so.(*)  Without it, the loop runs indefinitely.  The general
test pattern is: Cause the signal you're watching for to be delivered
while gotsig is 1.

So far, I'm operating under the assumption that this global gotsig
variable serves no useful purpose.  If I'm wrong, please correct me.

-chris

(*) The frequency of the deadlock can be affected by changing the value of
"new.it_interval.tv_usec = 50".

> 
> -- 
>                 The choice of a       Deliantra, the free code+content MORPG
>       -----==-     _GNU_              http://www.deliantra.net
>       ----==-- _       generation
>       ---==---(_)__  __ ____  __      Marc Lehmann
>       --==---/ / _ \/ // /\ \/ /      pcg at goof.com
>       -=====/_/_//_/\_,_/ /_/\_\



More information about the libev mailing list