epoll status

Nulik Nol nuliknol at gmail.com
Sat Sep 10 15:57:04 CEST 2011

Hi,
I have been reading the manual page for ev, and it says the following
about epoll:

          "EVBACKEND_EPOLL"   (value 4, Linux)
               Use the linux-specific epoll(7) interface (for both pre- and
               post-2.6.9 kernels).

               For few fds, this backend is a bit little slower than poll and
               select, but it scales phenomenally better. While poll and
               select usually scale like O(total_fds) where n is the total
               number of fds (or the highest fd), epoll scales either O(1) or
               O(active_fds).

               The epoll mechanism deserves honorable mention as the most
               misdesigned of the more advanced event mechanisms: mere
               annoyances include silently dropping file descriptors,
               requiring a system call per change per file descriptor (and
               unnecessary guessing of parameters), problems with dup,
               returning before the timeout value, resulting in additional
               iterations (and only giving 5ms accuracy while select on the
               same platform gives 0.1ms) and so on. The biggest issue is fork
               races, however - if a program forks then both parent and child
               process have to recreate the epoll set, which can take
               considerable time (one syscall per file descriptor) and is of
               course hard to detect.

               Epoll is also notoriously buggy - embedding epoll fds should
               work, but of course doesn’t, and epoll just loves to report
               events for totally different file descriptors (even already
               closed ones, so one cannot even remove them from the set) than
               registered in the set (especially on SMP systems). Libev tries
               to counter these spurious notifications by employing an
               additional generation counter and comparing that against the
               events to filter out spurious ones, recreating the set when
               required. Last not least, it also refuses to work with some
               file descriptors which work perfectly fine with "select"
               (files, many character devices...).

               Epoll is truly the train wreck analog among event poll
               mechanisms, a frankenpoll, cobbled together in a hurry, no
               thought to design or interaction with others.

               While stopping, setting and starting an I/O watcher in the same
               iteration will result in some caching, there is still a system
               call per such incident (because the same file descriptor could
               point to a different file description now), so its best to
               avoid that. Also, "dup ()"’ed file descriptors might not work
               very well if you register events for both file

               Best performance from this backend is achieved by not
               unregistering all watchers for a file descriptor until it has
               been closed, if possible, i.e. keep at least one watcher active
               per fd at all times. Stopping and starting a watcher (without
               re-setting it) also usually doesn’t cause extra overhead. A
               fork can both result in spurious notifications as well as in
               libev having to destroy and recreate the epoll object, which
               can take considerable time and thus should be avoided.

               All this means that, in practice, "EVBACKEND_SELECT" can be as
               fast or faster than epoll for maybe up to a hundred file
               descriptors, depending on the usage. So sad.

               While nominally embeddable in other event loops, this feature
               is broken in all kernel versions tested so far.

               This backend maps "EV_READ" and "EV_WRITE" in the same

My question is: is this still valid? Has the speed of epoll improved,
and have the bugs been fixed? I assume this was written a long time
ago... or wasn't it?
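
For anyone who wants to check two of the quoted claims on their own
kernel, here is a minimal sketch. It is not libev code and it rests on
some assumptions: it is Linux-only, it uses Python's select.epoll
wrapper around the epoll syscalls rather than the C API, and the
function names closed_fd_still_reported and regular_file_rejected are
made up for illustration. It only probes the "events for already closed
fds" behaviour (via dup) and the refusal to accept regular files. It
says nothing about speed or the SMP-related problems.

```python
import errno
import os
import select
import tempfile


def closed_fd_still_reported():
    """Register a pipe's read end, dup it, close the original, then poll.

    Returns (registered_fd, reported_fds). If epoll still reports the
    closed fd number, registered_fd shows up in reported_fds.
    """
    r, w = os.pipe()
    ep = select.epoll()
    keep_alive = -1
    try:
        ep.register(r, select.EPOLLIN)
        keep_alive = os.dup(r)  # second fd for the same open file description
        os.close(r)             # the registered fd number is now closed
        os.write(w, b"x")       # make the pipe readable
        events = ep.poll(1.0)   # wait at most one second
        return r, [fd for fd, _mask in events]
    finally:
        ep.close()
        if keep_alive != -1:
            os.close(keep_alive)
        os.close(w)


def regular_file_rejected():
    """Try to register a regular file with epoll.

    Returns the errno from the failed registration, or None if the
    kernel accepted the file.
    """
    ep = select.epoll()
    try:
        with tempfile.TemporaryFile() as f:
            try:
                ep.register(f.fileno(), select.EPOLLIN)
            except OSError as exc:
                return exc.errno
        return None
    finally:
        ep.close()


if __name__ == "__main__":
    fd, reported = closed_fd_still_reported()
    print("registered fd:", fd, "reported fds:", reported)
    err = regular_file_rejected()
    print("regular file:", errno.errorcode.get(err, "accepted"))
```

Both behaviours are described in epoll(7) and epoll_ctl(2), so seeing
them does not by itself mean a kernel bug; it only shows which parts of
the quoted text describe the interface as it is designed.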

Thanks in advance

The power of zero is infinite
