Disabling the SIGCHLD handler

Marc Lehmann schmorp at schmorp.de
Tue Jan 15 04:38:54 CET 2008

On Mon, Jan 14, 2008 at 10:38:51AM -0500, Chris Shoemaker <c.shoemaker at cox.net> wrote:
> I think you might be taking this personally.  Please don't.  I'd

I don't.

> really like a productive dialog here.  Perhaps we're talking past each
> other.

Me too, maybe. I just don't understand why you ignore the simple solution to
your problem: providing your own sigchld handler.

> Perhaps we're not looking at the same version of libev.  (I'm

We are.

> looking at 2.01.)  Perhaps I'm an idiot.  However, even if I'm an

You didn't give me that impression.

> idiot, I'm not attacking you.  A little politeness doesn't cost you
> much.

I am polite. Claiming I am not is impolite, no?

> On Mon, Jan 14, 2008 at 03:06:41AM +0100, Marc Lehmann wrote:
> Notice that nothing prevents the waitpid from reaping any child at all.

Right, that's how it was designed.

> All that is then required is for someone to start a ev_child for the
> child we just reaped.  That event will never trigger.

If you start the ev_child handler _after_ handling child events, that's
true.  You need to start it before. But you do not need to start the child
handler before creating the process, or before the process exits, only
before you poll for more events.

>   ev_loop(loop, EVLOOP_NONBLOCK);

Well, you run the loop before starting the watcher. Obviously, you have to
start your child handler before running the loop.

There really is no other sensible way to do it, and you can always structure
your program to make it work.

> If I understand what you're claiming, then that program should print the
> exit status 99, and then terminate.  In fact, it does neither.

I am not claiming anything. You were claiming that you cannot catch the exit
status of a process that exited before starting an ev_child watcher, and that
was and is simply untrue.

example quote: "it does force an application to choose between either:
a) not having access to the exit status of children that exited before a
ev_child was started, OR b)" [...]

And this claim is untrue.

This works perfectly well:

  pid = fork ();
  // child exits here
  cw.pid = pid;
  ev_child_start (&cw);

> Notice that the child (21046) has been reaped, with exit status 99,
> _before_ the ev_child has been started. 

In *some* cases this is true, but not in general. Please note that
registering a watcher "too late" is a problem for all other methods, too:
every method will fail if you register interest in it too late.

> And as a matter of fact, this is the same reason why it would prevent my
> own waitpid from finding the already-reaped child.

I cannot comment on your waitpid, but libev certainly allows you to
register child watchers after the child exited, but before the exit status
was fetched. This is true for *all* methods, even the one you outlined.

> I don't follow, really, but here's basically how I would do it:

That's very slow and causes high overhead. The basic promise of libev is
that it is efficient, not that it makes dozens/hundreds/thousands of
syscalls to reap a single child.

> Perhaps you wanted to avoid handling the ECHILD, but I would need it,
> and using rpid == -1 seems rather like waitpid returning -1.

libev handles ECHILD just fine. What makes you think it doesn't
handle this case?

> Now, I realize that this might not offer the behavior you desire in terms
> of multiple ev_childs registering for the same pid.  But this is ok for
> me, since I'm fine with the POSIX waitpid semantics of each child only
> unblocking one waitpid, and other waitpids getting ECHILD.  I don't want
> it to appear like the child died more than once, even if there are multiple
> ev_child watcher.

libev doesn't make it as if a child died more than once, even if there are
multiple ev_child watchers. This isn't possible with the unix semantics.

In fact, your algorithm contains a race condition where an ev_child
watcher gets the exit status of the wrong process (one that is started
between the two waitpid calls), something libev avoids.

In any case, the best way to proceed would be to go by my original
recommendation and just use your own sigchld handler.

> > And even if, you can always provide your own child reaper. I just do not
> > see your problem.
> I hope I've been clearer?

Not sure. Your claim was wrong and is wrong; whether it became clearer or
not is not really relevant.

And the solution is still the same one, and I still haven't heard why you
don't just use your own sigchld watcher.

> > No, signals are an unsharable resource just like sigchld. It just cannot
> > be done with posix, sorry.
> I do realize it couldn't be shared, I meant to offer another loop type
> to be used _instead_ of the default loop.

And on what grounds? Couldn't you just tell me why providing your own sigchld
handler wouldn't work?

> In any case, it's easy enough
> to disable in the code.

And completely superfluous, yes, as libev supports this out of the box.

> > > Instead, I loop over only list of outstanding calls to waitpid,
> > 
> > I assume you do this on every call to waitpid, too...
> No, on every SIGCHLD.

Then you have a bug, as you could have received the SIGCHLD earlier.

libev handles this by not calling waitpid unless told to, i.e., outside the
sigchld handler.

> > > realized that it would be quite easy to modify libev to provide the
> > > behavior I want.  I would just remove the waitpid(-1) call, and put a
> > > waitpid(pid) call inside the loop over childs[].  As an added benefit,
> > 
> > That would break it, however.
> I guess that depends on your definition of "break".

My definition of break is that it breaks the documented libev API, so
no need to put "break" into quotes: this is the libev mailing list, and
breaking obviously means breaking the designed behaviour of libev (whether
documented or not).

> It would, however, function exactly the same as my "legacy" sigchld
> handler, which is good for me at least. :)

You couldn't wait race-free for any child, and which watcher gets invoked
is then a matter of registration order. That's not acceptable to me. The
point of libev is to provide a generic interface that doesn't suffer from
races or non-deterministic child reaping. It also shouldn't have O(n)
complexity with a high constant factor due to the syscall-per-pid.

> Good question.  I guess it feels a bit cleaner to use the ev_child
> interface.

That's a valid reason, feel free to patch it then.

> > For some reason, you keep ignoring the obvious solution that has been
> > mentioned a few times already.
> Well, since I've completely ignored the obvious solution that has been
> mentioned a few times already, it's increasingly likely that... what?
> Why do you think that might be?

I have no clue. I asked you, and I really don't like to joke around. The
only person to know that is you; don't make me guess.

> > No, I am not interested in introducing bugs.
> :( That's really quite impolite.  What have I done to earn such a
> response?  It's just code, remember.

Your patch introduces bugs. If you think pointing this out is impolite, and
that knowingly breaking a documented and race-free API is polite, then I
will choose impolite but correct any day.

If you can't take criticism or people pointing out facts and claim that
that is impolite, that's entirely your problem. I can't see how that could
be impolite.

                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg at goof.com
      -=====/_/_//_/\_,_/ /_/\_\
