Disabling the SIGCHLD handler

Marc Lehmann schmorp at schmorp.de
Wed Jan 16 04:17:10 CET 2008


On Tue, Jan 15, 2008 at 10:47:33AM -0500, Chris Shoemaker <c.shoemaker at cox.net> wrote:
> > handler before creating the process, or before the process exits, only
> > before you poll for more events.
> 
> I'm glad we finally agree, in practice at least.

No, we don't. Your claim was wrong and is wrong. We do not agree in the
least. I don't understand why you cannot accept that.

> In my opinion, this is a rather special and uninteresting exception to
> the general rule that running the default event loop will eagerly reap
> the exit status of children, even before an ev_child has started.

Despite your opinion, it is by far the most common and important
case. Your case, on the other hand, is very uncommon, and pessimising that
a bit in favor of providing good and simple support for the common case
makes sense (simple things should be simple, complex ones possible).

> > Well, you run the loop before doing that, obviously, you have to start your
> > child handler before doing that.
> 
> In my case, I may not know that I even care to start an ev_child until a
> long, long time after the child has exited.  And, when I do eventually care,
> I won't know whether the child has exited or not.  Obviously, I can't
> avoid running the default loop indefinitely.

As this is a rather marginal and untypical case, I would suggest the
simplest solution is to provide your own sigchld handler.

> > There really is no other sensible way to do it, and you can always structure
> > your program to make it work.
> 
> I suppose you mean by using my own sigchld handler.

No, but that is the simplest solution.

> > > If I understand what you're claiming, then that program should print the
> > > exit status 99, and then terminate.  In fact, it does neither.
> > 
> > I am not claiming anything. You were claiming that you cannot catch the exit
> > status of a process that exited before starting an ev_child watcher, and that
> > was and is simply untrue.
> > 
> > example quote: "it does force an application to choose between either:
> > a) not having access to the exit status of children that exited before a
> > ev_child was started, OR b)" [...]
> > 
> > And this claim is untrue.
> 
> You deleted an important part: "b) not being able to use ev_signal"

No, it is not important. You can use ev_signal handlers and have access to
the exit status of children etc. etc.

> That would require running the default event loop.

Yes.

> We agree that a) is only true if the default event loop is run.

No, a) is not true at all, regardless of which loop you run, as I have
explained a number of times. Please understand that I will ignore your
repetitions of this wrong claim, it doesn't get any truer.

> Thus, the two options are exclusive.  I never claimed A, I claimed A XOR
> B.

And your claim was and is wrong, as I explained (and you chose to ignore)
a number of times.

> Fair enough.  You could say I'm expecting too much, to be able to get
> the exit status of a child that died long before I started an
> ev_child.

Yes.

> But, as POSIX allows the waitpid caller to get the exit status long
> after the child has died, and as I must offer the POSIX semantics, I
> must also allow this.

Yes, libev does not do POSIX emulation, it is a higher-level interface,
anything else would be pointless.

The important point is that libev doesn't preclude you from doing it
differently.

> It's O(N) in the number of child watchers, and runs only upon SIGCHLD.
> Is that really so bad?

The same could be said about select: It's O(n) in the number of fds and runs
only when the process has nothing better to do, is that really so bad?

The answer is, yes, sometimes it is.

> I'd love to know of a more efficient way, but efficiency is secondary to
> correctness, so even if this handler is less efficient than libev's, I
> need it because it doesn't reap the children eagerly.

And you can provide it without any problems.

> > libev handles ECHILD just fine, what makes you think it doesn't
> > handle this case?
> 
> I meant actually generating an event for ECHILD, which libev doesn't
> do, but which I prefer.

And which makes no sense in an event-based program, as it would be a bug
in the application.

Your application isn't event-based, and this isn't well-supported by libev
indeed, but libev doesn't try to support non-event-based programming
styles, for obvious reasons.

> > libev doesn't make it as if a child died more than once, even if there are
> > multiple ev_child watchers. this isn't possible with the unix semantics.
> 
> Eh?!  Sometimes I wonder if we're reading the same code.  Of course it does!
> Two watchers for the same pid.  How many get triggered?

Both.

> TWO!  It had better be so, because that's exactly what the code does:

Indeed.

> It loops over ev_childs, feeding all the events.  Remember?:

I know my code well, thank you very much. Please stop insulting me.

> why I warned you that my patch changed that.)  Understand, I'm not
> criticizing the design, different apps have different goals.  Again, I
> really have to stick to POSIX semantics, so a child only dies once.

The child dies only once, and this is reported properly by libev to all
interested watchers. In no way does libev make it as if a child died more
than once.

Back up that point, or take it back please. But since, in the past,
you didn't take back your other wrong claims, I have little hope for
that. You really should, though, I don't like it when people make wrong
claims and don't take them back, that certainly doesn't improve my
relationship with them.

> > in fact, your algorithm contains a race condition where an ev_child
> > watcher gets the exit status of the wrong process (one that is started
> > between the two waitpid calls), something libev avoids.
> 
> I assume you're talking about:
> 
> pid = waitpid (w->pid, &status, WNOHANG | WUNTRACED | WCONTINUED);
> /* What if a new process is started right now? */
> if (WCONTINUED && pid < 0 && errno == EINVAL) {
>    pid = waitpid (w->pid, &status, WNOHANG | WUNTRACED);
> }
> 
> I don't understand what you mean about sending the status of the
> wrong process.  If the watcher gets an event, I think it's always for
> a "matching" process. (It might match the process group, btw.)

a) if you have a watcher that waits for pid 5, and one that waits for -1,
   only one of them gets the event, when both should.
b) if you have a watcher that waits for pid -1 and later start one for pid 5
   but the process was already reaped, you lost the event.
c) if you have two watchers that wait for pid 5, then the second one might
   not get the exit status at all, or the exit status of the wrong one.

there are other problems and races.
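
Problems a) and c) go away when each child is reaped exactly once and that
single status is then handed to every matching watcher. The following is a
sketch of that idea only, not libev's actual code: the struct and function
names are hypothetical, and -1 stands for "any child" to match the numbering
above.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

#define ANY_CHILD ((pid_t)-1) /* hypothetical wildcard value */

/* hypothetical watcher record, much simpler than a real ev_child */
struct child_watcher
{
  pid_t pid;     /* pid of interest, or ANY_CHILD */
  int   fired;   /* number of exits delivered to this watcher */
  pid_t rpid;    /* pid of the last delivered exit */
  int   rstatus; /* wait status of the last delivered exit */
};

/* hand one already-reaped (pid, status) pair to all matching watchers;
   returns how many watchers received it */
static size_t feed_child_exit (struct child_watcher *ws, size_t n,
                               pid_t pid, int status)
{
  size_t delivered = 0;

  for (size_t i = 0; i < n; ++i)
    if (ws[i].pid == pid || ws[i].pid == ANY_CHILD)
      {
        ws[i].rpid    = pid;
        ws[i].rstatus = status;
        ++ws[i].fired;
        ++delivered;
      }

  return delivered;
}

/* reap every child that has exited so far, one waitpid call per child
   (plus one final call that finds nothing), independent of the number
   of watchers; returns the number of children reaped */
static size_t reap_and_feed (struct child_watcher *ws, size_t n)
{
  size_t reaped = 0;
  int status;
  pid_t pid;

  while ((pid = waitpid (-1, &status, WNOHANG)) > 0)
    {
      feed_child_exit (ws, n, pid, status);
      ++reaped;
    }

  return reaped;
}
```

With two watchers on pid 5 and one on -1, a single exit of pid 5 reaches all
three with the same status, and the registration order plays no role.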

> > And the solution is still the same one, and I still haven't heard why you
> > don't just use your own sigchld watcher.
> 
> Actually, I am.  Somehow I got the impression that you were interested
> to know why the existing sigchld watcher was unsuitable for our
> purposes.

No, I am quite aware of that: your purpose is not event-based, and forcing a
non-event-based model onto a generic event library does not work well, whether
it's signals or file descriptors.

> > You couldn't wait race-free for any child, and which watcher gets invoked
> > is then a matter of registration order. That's not acceptable to me. The
> > point of libev is to provide a generic interface that doesn't suffer from
> > races or non-deterministic child reaping. It also shouldn't have O(n)
> > complexity with a high constant factor due to the syscall-per-pid.
> 
> I still don't understand the allusion to race, but you're absolutely
> right about registration order (but that's not non-deterministic, is
> it?)

libev makes no guarantees about ordering or reordering of child watchers,
it just works correctly. By introducing ordering you would rely on
undocumented (and possibly changing) internals of libev that could well
depend on the actions of other code, which in turn could depend on
external events, making it non-deterministic at worst and version-specific
at best.

> I concede that my needs are significantly different from libev's
> sigchld handler, and that my sigchld handler will make O(n_watchers *
> n_sigchlds) waitpid calls while libev's will only make O(1 *
> n_sigchlds).

Your needs are dictated by a non-event-based model. It makes total sense
that it doesn't map well onto an event library.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg at goof.com
      -=====/_/_//_/\_,_/ /_/\_\


