Bug Report: SIGPIPE during call to ev_async_send

Marc Lehmann schmorp at schmorp.de
Mon Oct 12 15:00:34 CEST 2015


On Sun, Oct 11, 2015 at 04:42:03PM -0700, Benjamin Mahler <benjamin.mahler at gmail.com> wrote:
> Interesting, from what I can tell we don't trip the postfork=1 code in the
> link below within our hot paths,

That's good from a performance standpoint. Since you say it's a rare event in
production code, it makes sense that it isn't triggered in the hot path.

> We had tried to avoid ignoring SIGPIPE because we use libev in an
> asynchronous / actor library that we maintain independently from Mesos

Well, unless you fork and try to use the loop in the child, that is fine. The
rationale behind making this a requirement on fork, rather than some other
alternative, is that fork has a lot of other process-wide requirements as
well.
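
For readers who want to see what "ignoring SIGPIPE and handling EPIPE"
amounts to, here is a minimal POSIX sketch. It is not taken from libev or
libprocess; the helper names ignore_sigpipe, write_byte and demo_epipe are
made up for illustration. With SIGPIPE ignored, a write to a pipe that has no
reader fails with EPIPE instead of killing the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper (not a libev API): ignore SIGPIPE process-wide, so
   writing to a pipe without a reader fails with EPIPE instead. */
static int ignore_sigpipe (void)
{
  struct sigaction sa;
  sa.sa_handler = SIG_IGN;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction (SIGPIPE, &sa, 0);
}

/* Hypothetical helper: write one byte, retrying on EINTR.
   Returns 0 on success, otherwise the errno value of the failed write. */
static int write_byte (int fd)
{
  char c = 0;
  ssize_t r;

  do
    r = write (fd, &c, 1);
  while (r < 0 && errno == EINTR);

  return r < 0 ? errno : 0;
}

/* Demo: close the read end of a fresh pipe, then write to the write end.
   Returns the error reported by the write, or -1 if setup failed. */
static int demo_epipe (void)
{
  int fds[2];
  int err;

  if (ignore_sigpipe () < 0 || pipe (fds) < 0)
    return -1;

  close (fds[0]);              /* no reader is left */
  err = write_byte (fds[1]);   /* without SIG_IGN this would raise SIGPIPE */
  close (fds[1]);
  return err;
}
```

Calling demo_epipe () is expected to return EPIPE. Note that the SIG_IGN
disposition is process-wide and is inherited across fork.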

> So in theory shell filters could be built atop libprocess, but it seems
> reasonable to ask users of the library to handle EPIPE in 2015.

Shell filters can still be built - as long as you don't fork and do event
processing with the same loop in the child, that should be fine. It's not
really different from using threads: threads put a lot more limitations on
fork, and trying to do multiprocessing in forked processes is generally
asking for trouble, so this is not seen as a particularly bad extra burden.

Again, you don't have to worry about it if other parts of the program fork,
as long as they don't use your library in the child.

Alternatively, somebody might come up with a solution, but the problem is
that we can't use locks (it has to work from signal handlers), and the other
thread or signal handler can be delayed without any upper bound.
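
To illustrate the constraint, here is a minimal self-pipe sketch in plain
POSIX C. It is an assumption-laden stand-in, not libev's actual
implementation; the names wake_fds, wake_init, wake_send, wake_drain and
demo_wake are hypothetical. The sending side only calls write, which is
async-signal-safe, and takes no locks, so it can run from a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical wake-up pipe: [0] is the read end, [1] the write end. */
static int wake_fds[2] = { -1, -1 };

/* Create the pipe and make both ends non-blocking. */
static int wake_init (void)
{
  int i;

  if (pipe (wake_fds) < 0)
    return -1;

  for (i = 0; i < 2; ++i)
    {
      int fl = fcntl (wake_fds[i], F_GETFL);

      if (fl < 0 || fcntl (wake_fds[i], F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    }

  return 0;
}

/* Safe to call from a signal handler: no locks, only write (), errno saved. */
static void wake_send (void)
{
  int saved_errno = errno;
  char c = 1;
  ssize_t r = write (wake_fds[1], &c, 1);

  (void) r;   /* a full pipe (EAGAIN) means a wake-up is already pending */
  errno = saved_errno;
}

static void on_signal (int signo)
{
  (void) signo;
  wake_send ();
}

/* Read everything currently in the pipe; returns the number of bytes seen. */
static int wake_drain (void)
{
  char buf[64];
  int total = 0;
  ssize_t n;

  while ((n = read (wake_fds[0], buf, sizeof buf)) > 0)
    total += (int) n;

  return total;
}

/* Demo: install the handler, raise the signal, count pending wake-ups. */
static int demo_wake (void)
{
  struct sigaction sa;

  sa.sa_handler = on_signal;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;

  if (wake_init () < 0 || sigaction (SIGUSR1, &sa, 0) < 0)
    return -1;

  raise (SIGUSR1);   /* the handler has run by the time raise () returns */
  return wake_drain ();
}
```

Because the handler may run at any point, including while other code is
about to replace wake_fds[1], nothing in this sketch protects against the
pipe being swapped underneath it. That gap is the problem described here.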

So while we can replace the write fd atomically (from a userspace
perspective), the Linux kernel will still suffer from a race, where one CPU
replaces the fd with another file description, but the other CPU has already
fetched it and locked it, and will then trigger the SIGPIPE. (Basically, when
you call write in one thread and close on the same fd in another, the kernel
might suffer from bad effects, for example, the write might block forever.)

The only solution we found does not prevent this in general, but avoids
this case entirely in important situations, such as when epoll recreates the
poll set - we don't have to rebuild the event pipe in this case, so we don't.

We do have to rebuild the event pipe, though, in a child process, before
using the associated loop (and only if using the loop), and then I don't see
a way to avoid a race between a signal handler and the pipe rebuild code (or
a newly started thread).
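
In libev itself, the documented way to tell a loop about a fork is to call
ev_loop_fork in the child before the loop is used again. The underlying idea
can be sketched without libev in plain POSIX C; the struct wake_pipe and the
functions wake_pipe_open, wake_pipe_rebuild and demo_fork_rebuild below are
hypothetical names for illustration, not libev internals.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the event pipe owned by a loop. */
struct wake_pipe
{
  int rfd;   /* read end */
  int wfd;   /* write end */
};

static int wake_pipe_open (struct wake_pipe *p)
{
  int fds[2];

  if (pipe (fds) < 0)
    return -1;

  p->rfd = fds[0];
  p->wfd = fds[1];
  return 0;
}

/* Child side: the inherited ends are shared with the parent, so drop them
   and create a private pipe. Between close () and the new pipe (), a signal
   handler or thread writing to p->wfd would hit a stale descriptor; this
   sketch does not attempt to close that window. */
static int wake_pipe_rebuild (struct wake_pipe *p)
{
  close (p->rfd);
  close (p->wfd);
  return wake_pipe_open (p);
}

/* Demo: fork, rebuild the pipe in the child, and check that it works there.
   Returns 0 if the child succeeded, 1 if it failed, -1 on setup errors. */
static int demo_fork_rebuild (void)
{
  struct wake_pipe p;
  pid_t pid;
  int status;

  if (wake_pipe_open (&p) < 0)
    return -1;

  pid = fork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      char out = 1, in = 0;
      int ok = wake_pipe_rebuild (&p) == 0
               && write (p.wfd, &out, 1) == 1
               && read (p.rfd, &in, 1) == 1
               && in == 1;

      _exit (ok ? 0 : 1);
    }

  if (waitpid (pid, &status, 0) < 0)
    return -1;

  close (p.rfd);
  close (p.wfd);
  return WIFEXITED (status) && WEXITSTATUS (status) == 0 ? 0 : 1;
}
```

The rebuild is only needed when the child actually uses the pipe. A child
that execs or exits right away can leave the inherited descriptors alone.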

> I'm curious, do you know when this would make its way into a release? No
> worries if not.

Hopefully "soon" there will be a release, and it will have that code. I am
mainly waiting for you to test whether the change in libev seems to fix your
problem or not.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp at schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\


