failures with "libev: ev_io_start called with corrupted watcher", ((WL)w)->next != (WL)w)

Kirill Timofeev kirill.timofeev at hulu.com
Mon Dec 15 02:35:22 CET 2014


Hi Marc,

Thanks for the quick response, and sorry for the unreadable code. Please 
find the complete source attached.

The failing code uses preallocated memory for its watchers. I went through 
the code once more and still don't see where I could have missed a call to 
ev_io_stop(). I would really appreciate it if you could have a look at this 
code and let me know if you see any issues. The code is currently 
open source: https://github.com/hulu/statsd-router, but the attached version 
has some changes that are not on GitHub yet. Please let me know if you 
have any questions about the implemented logic (I added a lot of 
comments, but I can imagine that something is still unclear). Also 
please let me know if adding some logging would help.

Thanks,
Kirill.

On 12/14/2014 04:32 PM, Marc Lehmann wrote:
> On Sun, Dec 14, 2014 at 04:11:08PM -0800, Kirill Timofeev <kirill.timofeev at hulu.com> wrote:
>> Dec 14 03:55:48 els-abacus-prod-01 statsd-router: statsd-router:
>> ev.c:3552: ev_io_start: Assertion `("libev: ev_io_start called with
>> corrupted watcher", ((WL)w)->next != (WL)w)' failed.
> This assert was added to catch some common usage bugs - most likely, you
> overwrote or freed an I/O watcher without stopping it, and later tried to
> start a new watcher at the same address, which got detected by libev.
>
>> Should I do call connect() in else branch and check it for EISCONN?
> I haven't checked your code (it was badly garbled by your mailer), but the
> general rule is to stop watchers before you reuse the memory for something
> else (a new watcher, in this case). Maybe you simply forgot to stop the
> I/O watcher when you freed your connection data in an error case?
>
> You could try to run with a libev compiled with -DEV_FREQUENT_VERIFY=3,
> which will make very frequent (and slow) checks, in the hope of catching
> this bug earlier, but this kind of corruption is usually hard for libev to
> detect in time, as it usually happens far away from any libev call.
>
> I would suggest going through your code and checking for cases where
> you free memory, or reuse structures with I/O watchers inside, without
> stopping active ones.
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: statsd-router.c
Type: text/x-csrc
Size: 30660 bytes
Desc: not available
URL: <http://lists.schmorp.de/pipermail/libev/attachments/20141214/a4b9ac65/attachment-0001.c>

