libeio, Windows

Marc Lehmann schmorp at schmorp.de
Tue Jun 3 10:05:13 CEST 2008


On Mon, Jun 02, 2008 at 11:47:42PM -0600, Tony Arcieri <tony at medioh.com> wrote:
> I'm really looking for something that acts as a scheduler.  Rev, for

What you should look for first is a mailer that quotes correctly. About
half of the passages that are unquoted are actually written by me, without
any "> " prefix. This is very confusing :) It seems that your mailer forgets
to quote the first line of some paragraphs.

> What would be ideal is libev as the core scheduler, providing readiness
> notifications where they're explicitly needed, and libeio handling
> asynchronous I/O. (Let's forget Windows here for a moment)

Libev isn't a good scheduler, though - it tries to get events through to
you as fast as possible, but it doesn't guarantee any notion of fairness or
low latency - it tries to be as fair and quick as possible, but in the end,
all it cares about is overall efficiency.

Not saying that it's bad, just that it might not be the solution for
everybody.

> task to sleep until it receives a message (letting other tasks run).  When
> the I/O operation completes, the Scheduler wakes the task back up by sending
> it an Event object with the requested data.
> 
> Seems like a pretty good use case for libeio...

Absolutely. Perfect fit so to say.

Now, getting adequate scheduling is not that hard (by basically ignoring
the problem), but getting good I/O scheduling is, so far, a completely
unsolved problem for me.

libeio provides a minimum/maximum parallelity option, but selecting values
for those is not easy - some problems might want low values, other high
values. The IO::AIO perl module has some more features here (for example,
aio_scandir, which doesn't have an eio equivalent, lets you specify the
parallelity), but it is still primitive.

Of course, perfect values depend on stuff like disks, filesystem, OS etc.,
too.

The best option I found so far was to mostly ignore the issue. But one has
to keep that in mind when designing a possibly long-term API.

> The best way to approach this is to use a library that emulates all this for
> > you already, e.g. cygwin (there are others).
> 
> What about MinGW?

AFAICR, mingw is not such a library. I could be wrong, but last I looked,
mingw's goal was to use native apis as much as possible.

But it does have some unix emulation, the question is whether it is useful
enough to make libeio useful.

> That's what the Ruby people on Windows seem to be using nowadays, as Ruby
> incorporates a lot of the POSIX semantics into the core language.

A good idea in general, a bad idea if one wants high performance on windows.

But then, that's microsoft's fault entirely - if windows performs worse,
but correctly, that's enough for me as a goal.

> But for libeio, I see no hope of ever making a sensible windows port without
> > something like cygwin - too much is simply missing or totally different.

^ this is such a case of mis-quoting.

> > ...it doesn't do I/O readiness notifications, so is not
> > an option for libev for example
> 
> In most of the use cases I'm intended, I'd really need both.  Certain
> libraries (namely OpenSSL) wrap up their functionality in such a way that
> you *must* let the library do the I/O due to the nature of its API

That is news to me, and certainly wrong:

Openssl doesn't actually enforce this, it's just the simplest mode of
operation with blocking sockets. If you want to do your own I/O, you are
free to do so (and in fact, if you are non-blocking, then doing that is
much less of a hassle than trying to use openssl on non-blocking sockets
directly. The retry semantics will drive you mad).

In fact, openssl completely decouples I/O from TLS handling.

> those cases I'd still need something like libev to give me readiness
> notifications.

Not if you do the I/O on your own. AnyEvent::Handle for example does this: it
uses EV for readiness notifications, but it could just as well use an
asynchronous read or something else.

The code is very new and might still be buggy, but it outlines the
principles: use a memory stream, which will avoid all issues with blocking
in openssl (this might be interesting to somebody else besides us, who
knows :)

   set up code:
   http://ue.tst.eu/f7383715351b4427b6b9db660b6289f8.txt

You tell your ssl context the server/client mode, create two BIO_s_mem
streams and connect to it, then you poll those regularly (this could be
optimised a bit in C, but perl lacks the API definitions and I wasn't keen
on adding them):

   poll code, $self->{_tls_wbuf} is unencrypted data from the app, _wbuf the socket
   write buffer, _rbuf the unencrypted app read buffer.
   http://ue.tst.eu/2f4cfee5166db411bd85a40e06de6dbf.txt
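
The same memory-stream idea can be sketched in Python, whose ssl module
exposes memory BIOs directly. This is only an illustration under stated
assumptions, not the Perl code linked above: the names incoming, outgoing,
pump_handshake and the host name "example.invalid" are made up for the
example, and no real socket is involved.

```python
import ssl

# Two in-memory streams stand in for the socket: the TLS engine reads
# ciphertext from `incoming` and writes ciphertext to `outgoing`.
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()

# Client-mode context; "example.invalid" is a placeholder host name.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.invalid")


def pump_handshake(tls_obj):
    """Advance the handshake as far as the buffered data allows.

    Returns True once the handshake is complete, False if the TLS engine
    needs more ciphertext from the peer first. It never blocks, because
    the engine only ever touches the memory streams.
    """
    try:
        tls_obj.do_handshake()
        return True
    except ssl.SSLWantReadError:
        return False


done = pump_handshake(tls)

# Whatever the engine produced must now be moved to the real socket by
# the application, using whatever I/O mechanism it prefers.
to_socket = outgoing.read()
```

Here the application, not the TLS library, decides when and how bytes move:
data arriving from the peer would be fed in with incoming.write(), and
to_socket would be appended to the socket write buffer.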

Note that, for sockets, using async-i/o is the wrong approach (mostly
because it is not efficient and ties up the write buffer with interesting
semantics): it doesn't scale, so readiness notifications are the way to
go. Surprisingly, they are supported by windows more efficiently than by
using select, but you _have_ to use multiple threads, which slows you down
again, so it's not a clear trade-off to go from select to something else.

> In all other cases, I think it's best for the underlying library to
> perform the I/O for me.

I strongly disagree - for everything that supports non-blocking operations
async I/O is almost always wrong, because you don't know whether data is
there.

Letting the library do I/O ties up a lot of resources that an event-based
program would avoid.

It might be easiest/laziest etc., which *could* be "best" in certain
situations, though :)

> > And for libev, the solution is to realise that libev's model (the posix
> > model) simply doesn't map on windows (its not just that the functions are
> > missing or named differently - they have different and often incompatible
> > semantics as well).
> 
> I think there's a pretty limited problem domain that actually requires
> readiness notifications vs. async I/O.  That problem domain is pretty much
> limited to OpenSSL.

Well, it's not limited to openssl, async I/O is simply the wrong approach
for sockets. Non-blocking I/O *is* the right approach for sockets from an
efficiency standpoint.
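
A minimal sketch of what readiness-driven, non-blocking socket I/O looks
like, written with Python's selectors module instead of libev purely for
illustration. The socket pair and the b"ping" payload are assumptions made
for the example.

```python
import selectors
import socket

# A connected pair of sockets stands in for a real network connection.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(b, selectors.EVENT_READ)

# Nothing has been written yet, so no readiness event is reported and
# no buffer or thread is tied up waiting for this socket.
idle = sel.select(timeout=0)

a.send(b"ping")

# Now the selector reports that `b` is readable; only at this point does
# the program spend a read call (and a buffer) on it.
ready = sel.select(timeout=1)
received = [key.fileobj.recv(4096) for key, _events in ready]

sel.unregister(b)
sel.close()
a.close()
b.close()
```

The program only touches a socket once it is known that data is there,
which is what lets one loop serve many mostly idle connections.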

For files, non-blocking I/O would be wrong (well, no OS supports it
anyways), there, async I/O is the right approach, because you can decide
within a finite timeframe whether data is available or not.
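
For comparison, asynchronous file I/O can be sketched as a request that is
handed to a worker thread and reports back on completion. The sketch below
uses Python's ThreadPoolExecutor as a stand-in and is not libeio's API:
read_at, the temporary file and the pool size of 4 are assumptions made for
the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_at(path, offset, length):
    """Blocking read of `length` bytes at `offset`; runs in a worker."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)


# Create a small file to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
    path = tmp.name

completed = []

# max_workers bounds how many requests run in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    request = pool.submit(read_at, path, 6, 5)
    # The completion callback is where an event loop would be notified.
    request.add_done_callback(lambda f: completed.append(f.result()))

os.unlink(path)
```

The caller never blocks on the disk itself; it submits the request and
learns of the result through the callback, which is the shape that fits
files, where a read always finishes in finite time.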

> > Or in other words: you cannot share code that is efficient on windows and
> > the rest of the world. Microsoft deliberately and deeply was designed to
> > be incompatible with anything else.
> 
> Yeah, that's pretty Machiavellian of them.

The right word is "monopolism": they simply try to tie their customers
to them, as far fewer people would stay with windows if switching were
easier.  This is not their only strategy, of course, breaking the law and
making illegal secret contracts with resellers is the other important
method.

> It sounds like while it'd be possible to expose the same API as libeio
> and libev (with libev being comparatively broken), the internal

The point is that libev simply is limited to sockets on windows - not even
pipes work, not to mention standard input or other sources (devices!).

Also, libev's api is not the only issue - making a non-blocking connect on
windows is quite different than elsewhere, send() has different semantics,
which only ever bites you under very high load etc. etc.

"Getting it right" is very hard work, and I see little code attempting it.

libev's api is too low, and libeio's api is too high, for efficient
implementations on windows.

libevent has this I/O stream abstraction, which only works for sockets,
which could, in theory, be implemented using iocp, but I think they are
mostly the wrong model, and benchmarks are not clearly pointing at them
for being performant. At the end of the day, windows as a platform might
simply be slow and too ridden by bugs to be performant.

As a typical example, take tcp socket writes: windows can return ENOBUFS, and
the application has no (defined) way to recover from that - maybe adding more
RAM will help, maybe waiting a second and retrying will help, maybe trying a
smaller write size might help.

Here, iocp might work better, but they suffer from similar quirks.

Working around all these quirks will cause bad performance - even tiny
things such as libev having to provide and check an exceptfds argument to
work around windows bugs causes some minor performance issues.

> implementation would be considerably different versus other platforms,
> and hence the semantics.

Yes.

If your goal is to just ignore windows performance, and just make it
work, then I think that's the best thing you can do.

Having it work on windows, but being able to point at better operating
systems when people want performance, is inherently good. It isn't as if
people couldn't get performance when they want it.

> Perhaps you could expose just the libeio API on Windows, and concede that
> anyone who needs readiness notifications is screwed?

I have no clue what that means - libev is for non-blocking I/O, something
libeio by design cannot do, as it was meant for asynchronous I/O.

Those things are incompatible with each other, have different use cases
etc.

> If you limit yourself to sockets, this shouldn't be such a big problem. I
> > don't know how ruby deals with sockets on windows - perl squeezes them
> > into file descriptors just like everything else, so using libev on windows
> > + perl is close to trivial.
> >
> 
> Yes, that's what Ruby does as well...

Then chances are, it might work out of the box.

Now, select on windows doesn't use file descriptors, and oftentimes, systems
like ruby have their own wrapper.

You can make use of that wrapper by defining EV_SELECT_IS_WINSOCKET to
0, or you can let libev use the winsockets select function by not doing
anything.

Then everything will work out of the box, AFTER you cleaned up your code to
work with the horrible windows include files and fixed any incompatibilities
:)

For that, you can easily wait for a victim^Wvolunteer to do it. In my
experience, they won't come, as people who know enough about windows
usually do not touch anything that comes from unix.

> Rubinius's scheduler gives you the option of either receiving an event
> notification for I/O readiness or having the scheduler perform the I/O for
> you.  In order to replace the latter with libeio I'd still need hooks into
> libev for the readiness notifications.

I don't see why - you never have a file descriptor that sensibly can be used
with both libev and libeio - it's apples and chairs, so to say.

If you force libeio to do sockets I/O for you, you will deadlock
eventually, as async I/O doesn't scale in the same way as sockets scale,
and libeio doesn't (officially) support sockets - it will work, somewhat,
if you pass in a negative offset value, but that's not meant for sockets.

Mixing the two simply makes no sense.

> So really I'd need to use both alongside each other...

You would, because they are made for different things - libev doesn't work
on files, libeio doesn't on sockets.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg at goof.com
      -=====/_/_//_/\_,_/ /_/\_\
