epoll status
Nulik Nol
nuliknol at gmail.com
Sat Sep 10 15:57:04 CEST 2011
Hi,
I have been reading the manual page for ev, and it says the following
about epoll:
----------------------
"EVBACKEND_EPOLL" (value 4, Linux)

Use the linux-specific epoll(7) interface (for both pre- and post-2.6.9
kernels).

For few fds, this backend is a bit little slower than poll and select,
but it scales phenomenally better. While poll and select usually scale
like O(total_fds) where n is the total number of fds (or the highest
fd), epoll scales either O(1) or O(active_fds).

The epoll mechanism deserves honorable mention as the most misdesigned
of the more advanced event mechanisms: mere annoyances include silently
dropping file descriptors, requiring a system call per change per file
descriptor (and unnecessary guessing of parameters), problems with dup,
returning before the timeout value, resulting in additional iterations
(and only giving 5ms accuracy while select on the same platform gives
0.1ms) and so on. The biggest issue is fork races, however - if a
program forks then both parent and child process have to recreate the
epoll set, which can take considerable time (one syscall per file
descriptor) and is of course hard to detect.

Epoll is also notoriously buggy - embedding epoll fds should work, but
of course doesn’t, and epoll just loves to report events for totally
different file descriptors (even already closed ones, so one cannot
even remove them from the set) than registered in the set (especially
on SMP systems). Libev tries to counter these spurious notifications by
employing an additional generation counter and comparing that against
the events to filter out spurious ones, recreating the set when
required. Last not least, it also refuses to work with some file
descriptors which work perfectly fine with "select" (files, many
character devices...).

Epoll is truly the train wreck analog among event poll mechanisms, a
frankenpoll, cobbled together in a hurry, no thought to design or
interaction with others.

While stopping, setting and starting an I/O watcher in the same
iteration will result in some caching, there is still a system call per
such incident (because the same file descriptor could point to a
different file description now), so its best to avoid that. Also,
"dup ()"’ed file descriptors might not work very well if you register
events for both file descriptors.

Best performance from this backend is achieved by not unregistering all
watchers for a file descriptor until it has been closed, if possible,
i.e. keep at least one watcher active per fd at all times. Stopping and
starting a watcher (without re-setting it) also usually doesn’t cause
extra overhead. A fork can both result in spurious notifications as
well as in libev having to destroy and recreate the epoll object, which
can take considerable time and thus should be avoided.

All this means that, in practice, "EVBACKEND_SELECT" can be as fast or
faster than epoll for maybe up to a hundred file descriptors, depending
on the usage. So sad.

While nominally embeddable in other event loops, this feature is broken
in all kernel versions tested so far.

This backend maps "EV_READ" and "EV_WRITE" in the same way as
"EVBACKEND_POLL".
------------------------------------
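
To make the part about events for already closed descriptors concrete,
here is a minimal sketch. It assumes Linux and Python 3's standard
select module, not libev itself, and the function name is made up for
illustration. It registers the read end of a pipe, dup()s it, closes
the registered descriptor, and then asks epoll what it reports. The
registration belongs to the open file description, which the dup keeps
alive, so the event should still arrive under the old fd number.

```python
import os
import select


def closed_fd_still_reported():
    """Return (closed_fd, events) where events is what epoll reports
    after the registered descriptor itself has been closed."""
    r, w = os.pipe()
    ep = select.epoll()
    dup_r = None
    try:
        ep.register(r, select.EPOLLIN)
        dup_r = os.dup(r)   # second fd for the same open file description
        os.close(r)         # the registered fd number no longer exists
        os.write(w, b"x")   # make the pipe readable
        try:
            # Removal by fd number cannot work once that fd is closed.
            # Newer Python versions raise OSError (EBADF) here, older
            # ones ignore the error silently.
            ep.unregister(r)
        except OSError:
            pass
        events = ep.poll(0.5)
        return r, events
    finally:
        ep.close()
        if dup_r is not None:
            os.close(dup_r)
        os.close(w)


if __name__ == "__main__":
    fd, events = closed_fd_still_reported()
    print("closed fd:", fd, "events reported:", events)
```

This only shows the dup interaction on a single pipe. It says nothing
about the SMP cases or about how libev's generation counter behaves.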
My question is: is this still valid? Has epoll's speed improved, and
have the bugs been fixed? I assume this was written a long time ago...
or wasn't it?
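
One claim that is easy to check on whatever kernel is at hand is that
epoll refuses regular files which select accepts. The sketch below is
only a probe under stated assumptions: Linux, Python 3's standard
select and tempfile modules rather than libev, and a helper name
(probe_regular_file) invented here. It returns the errno from the epoll
registration attempt (0 if it succeeded) and whether select considered
the same file readable.

```python
import errno
import select
import tempfile


def probe_regular_file():
    """Return (epoll_errno, select_ready) for a regular temporary file."""
    with tempfile.TemporaryFile() as f:
        ep = select.epoll()
        try:
            # epoll_ctl(EPOLL_CTL_ADD) on a regular file is expected to fail.
            ep.register(f.fileno(), select.EPOLLIN)
            epoll_errno = 0
        except OSError as exc:
            epoll_errno = exc.errno
        finally:
            ep.close()
        # select treats regular files as always ready.
        readable, _, _ = select.select([f], [], [], 0)
        return epoll_errno, bool(readable)


if __name__ == "__main__":
    epoll_errno, select_ready = probe_regular_file()
    name = errno.errorcode.get(epoll_errno, "none")
    print("epoll register errno:", name, "| select ready:", select_ready)
```

On the kernels I know of, the registration fails with EPERM while
select reports the file as ready, but the probe says nothing about
speed or about the other bugs mentioned in the manual.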
Thanks in advance
--
==================================
The power of zero is infinite