alternate approach to timer inaccuracy due to cached times

Marc Lehmann schmorp at
Fri Oct 14 10:59:49 CEST 2011

On Thu, Oct 13, 2011 at 07:08:08PM -0700, Shaun Lindsay <srlindsay at> wrote:
> I think my description of the issue was lacking.  This is not specific to

I do think I understand you, and I do maintain this is a design shortcoming
in gevent that is not shared by other code based on libev (or other event

You do some heavy processing, so ev_now() is no longer accurate enough for
your case, and in theory, ev_now can always be "not enough".

(1) This can be fixed the good way - by a better design. For example, your
timeout relies on some other event _not_ occurring. If this is an I/O event,
give it higher priority. You can always add your own events by using an
ev_check watcher, too.
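
To illustrate (1), here is a toy model in plain C of a single loop
iteration. It is not libev code: struct pending, the KIND_* values and
dispatch() are names made up for this sketch. In libev itself the
corresponding knob would be ev_set_priority () on the I/O watcher, since
pending watchers with higher priority are invoked first.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of one loop iteration, NOT libev itself: every pending event
   carries a priority, and higher priorities are dispatched first. */
enum kind { KIND_IO, KIND_TIMEOUT };

struct pending {
    enum kind kind;
    int priority;                  /* larger value = dispatched earlier */
};

/* qsort comparator: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct pending *pa = a, *pb = b;
    return (pb->priority > pa->priority) - (pb->priority < pa->priority);
}

/* Dispatches all pending events. Returns 1 if the timeout was acted upon,
   0 if an I/O event dispatched before it reset the timeout. */
int dispatch(struct pending *ev, size_t n)
{
    int timeout_armed = 1;
    int timed_out = 0;

    assert(ev != NULL || n == 0);
    qsort(ev, n, sizeof *ev, by_priority_desc);

    for (size_t i = 0; i < n; i++) {
        if (ev[i].kind == KIND_IO)
            timeout_armed = 0;     /* activity seen: the timeout is reset */
        else if (timeout_armed)
            timed_out = 1;         /* nothing reset it: a real timeout */
    }
    return timed_out;
}
```

When the I/O event and the timeout are pending in the same iteration, the
I/O event wins as long as its priority is higher, so a delayed iteration
does not turn into a spurious timeout.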

(2) It can be fixed the lazy but correct way - if you don't know *when* an
event occurred, you have to ask the OS. If gettimeofday (usually a fast
userspace function) is good enough for you, you can just call it when the
other event occurs and base your timeout on that.
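
A minimal sketch of (2) in plain C, under the assumption that the timer
counts its relative delay from the loop's cached time (which is what ev_now
is in libev). wall_time() and timer_after() are hypothetical helper names,
not libev functions; with libev you would feed ev_time () and ev_now () into
the same arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in seconds, obtained via gettimeofday(). */
double wall_time(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Relative delay to hand to a timer that counts from loop_now (the loop's
   cached time) so that it expires `timeout` seconds after event_ts, the
   timestamp taken when the event was actually handled. */
double timer_after(double event_ts, double timeout, double loop_now)
{
    assert(timeout >= 0.);
    return timeout + (event_ts - loop_now);
}
```

If the callback runs 0.25s after the cached time was taken, a 0.1s timeout
is started with a delay of 0.35s, so it still expires 0.1s after the event
was handled. The cost is one time query per timestamped event.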

(3) And lastly, if you can't fix the design, and the lazy fix is too much
overhead, you can create a simple list and just start your timers in the
ev_prepare watcher.
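
A rough model of (3) in plain C. All names here (struct deferred_timer,
defer_timer, flush_deferred, MAX_DEFERRED) are invented for the sketch. In a
real program flush_deferred() would be the body of an ev_prepare callback
and would call ev_timer_start () for each entry; how `now` is obtained there
(for example one ev_now_update () per iteration) is left to the caller and
is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 64            /* arbitrary limit for the sketch */

struct deferred_timer {
    double after;                  /* requested relative timeout, seconds */
    double expires_at;             /* absolute expiry, set when flushed */
};

static struct deferred_timer *queue[MAX_DEFERRED];
static size_t queued;

/* Called from event callbacks instead of starting the timer directly. */
void defer_timer(struct deferred_timer *t, double after)
{
    assert(queued < MAX_DEFERRED);
    t->after = after;
    queue[queued++] = t;
}

/* Called once per iteration, after all callbacks have run. Starts every
   queued timer relative to `now` and returns how many were started. */
size_t flush_deferred(double now)
{
    size_t n = queued;

    for (size_t i = 0; i < n; i++)
        queue[i]->expires_at = now + queue[i]->after;

    queued = 0;
    return n;
}
```

All timers queued during one iteration share a single time base, so the
cost of refreshing the time is paid once per iteration and not once per
timer.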

Note that the (2) fix is really the only correct fix if you want
"accurate" timeouts even when your processing overhead is high - if you
don't know when an event occurred, you have to timestamp it - anything
else is more or less a hack that might work splendidly for you. Of course, (2)
cannot be guaranteed to work on unix either, because unix does not have a
timestamping facility, so everything is just hope.

The (1) solution is best in almost all cases, because the only case where it
fails is when you receive more data while processing your event, so by the
time you read you already get newer data. The error here is, however, in the
right direction (timeouts are only being delayed), so it seems the (1)
solution does what you want, without any hacks.

I don't see a problem that would prevent this from working with coroutines
or threads either (Coro for perl implements it for example).

Now, the thing is, currently libev gives you all those options. The main
problem with your patch is that it removes some of these options, while what
it does is trivially implementable by you.

But I really would recommend trying to fix your design, because then you
have no need for grossly inexact hacks.

> > > so I'd need to call ev_update_now() before every timer start.
> >
> > In most cases, you would have events that invalidate the error case -
> > having these at higher priority would solve that problem in a better way.
> The only way to invalidate the error case is to actually check the time in
> the callback for any timer event, which, again, puts us back in the system
> call per timer case.

Well, you can normally use ev_now for that, because after all, you got
actual I/O events before the ev_now timestamps, so the times compare well.

But yes, to be really exact, you need ev_time() or equivalent calls, which are 1)
not so costly and 2) unlikely to be very common (timeouts really aren't

> > > timeouts where immediate, nondeterministic expiration is an acceptable
> > > condition,
> >
> > You are not talking about libev, are you? Even without calling
> > ev_update_now there is nothing nondeterministic, and timers don't timeout
> > immediately in any case.
> Yes, I am talking about libev.

Then you are wrong - the timer handling is absolutely deterministic.

> As illustrated in the example, the amount of work done before starting a
> timer negatively affects the real time on that timer.

Yeah, but there is no way to know what the "real time" is under POSIX, so you
always need some approximation - in this case, it's ev_now, and there are
many ways to make it more accurate.

> scenario, the amount of work done during event dispatch will be dependent of
> whatever network traffic you're serving, so the actual values of your
> timeouts are then subject to that workload and thus nondeterministic

If your workload is nondeterministic and your design flawed, then yes, but
both of these are outside the scope of libev.

> repro case is an example of the timer expiring immediately.

I see what you mean with immediate, but this is not true - immediate relative
to what? to an I/O event that caused a lot of processing? no, it expires
after at least the processing time.

And if your timeouts are lower than your processing time, then your
timeouts are simply too low.

If you only want to protect yourself against the occasional delay causing
such problems, then you can implement the right fix by checking the reset
event (I/O...) before the timeout.

> Not really.  The problem is that the amount of time taken to handle the
> event callbacks is unknown.

Well, it's trivial to measure without any extra syscalls (ignoring time
jumps), though, so if your problem is not knowing this time, then the fix
is trivial: just check ev_now between ev_check and ev_prepare.

> The time taken to add timers to the heap,
> however, is relatively bounded and, even for large numbers of timers, still
> on the order of microseconds.

Yeah, but making libev inexact for everybody without workaround is hardly a
good solution.

To make this viable, it doesn't help if you repeat this again and again - you
would have to explain why your method is the only method that works for you,
even though we think better methods exist (and are used in practice).

> then add the timers, the disparity is then limited to the time taken to
> start the timers alone, decoupling it from the time taken to handle all the
> event callbacks.

No, you just move any timing slips into the other direction. Why not use a
random time instead and rely on random sampling :)

> > Libev guarantees that timeouts never trigger before the requested time
> > already.
> Already addressed.


> > The latter can only happen if you use two different clocks to compare
> > times.  This is rather meaningless - libev will never timeout a timer
> > requesting a 100ms timeout before or even at 100ms after the requested
> > time.
> Not meaningless at all.  If I call ev_timer_init with a requested time of
> 0.1, I expect it to fire 100ms after that function call or later.  As shown
> in the repro code, that is not the actual behavior.

The repro code shows no such thing - please compare the actual event
timestamp, which you get with ev_now ().

> > I think if you are hitting such a problem in gevent, this is a design issue
> > in gevent - any event system should be able to deal with delays in event
> > processing in a sane way.
> I wasn't actually seeing that issue;

Sure, that's what this mailing list is for.

> I was just positing that my patch to libev could actually cause that
> behavior and was entirely unrelated to gevent.

Since the specific design of gevent causes these issues, it is very
related to gevent.

> If you're at all interested in the patch after reading this, I made some
> changes regarding where the deferred timers were actually started in ev_run.
>  I can send out the updated version, assuming you agree that this is
> actually a problem and actually worth fixing.

I think it is actually a problem and actually worth fixing. I do think your
patch doesn't fix anything at all, though, and that the only place where the
fix can be done is in the code that causes it - in gevent.

All your patch does is delay timers w.r.t. the actual event time and makes it
hard to implement correct behaviour - I can see how that works for you, but
there are other, and imho better, ways to handle this issue.

So, what's wrong with the fixes that we suggest are better (namely (2)
above)? Why would that not fix your problem in a better way?

                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_    
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp at
      -=====/_/_//_/\_,_/ /_/\_\
