failures with "libev: ev_io_start called with corrupted watcher", ((WL)w)->next != (WL)w)

Kirill Timofeev kirill.timofeev at hulu.com
Mon Dec 15 01:11:08 CET 2014


Hi folks,

I have an application using libev that runs for months without issues.
But sometimes it fails with the following messages in the log:

Dec 14 03:55:48 els-abacus-prod-01 statsd-router: 2014-12-14 03:55:47 ERROR ds_health_check_timer_cb: before ev_io_init else branch
Dec 14 03:55:48 els-abacus-prod-01 statsd-router: 2014-12-14 03:55:47 ERROR ds_health_check_timer_cb: before ev_io_init else branch
Dec 14 03:55:48 els-abacus-prod-01 statsd-router: 2014-12-14 03:55:47 ERROR ds_health_check_timer_cb: before ev_io_start
Dec 14 03:55:48 els-abacus-prod-01 statsd-router: 2014-12-14 03:55:47 ERROR ds_health_check_timer_cb: before ev_io_start
Dec 14 03:55:48 els-abacus-prod-01 statsd-router: 2014-12-14 03:55:47 ERROR ds_health_check_timer_cb: before ev_io_init else branch
Dec 14 03:55:48 els-abacus-prod-01 statsd-router: statsd-router: ev.c:3552: ev_io_start: Assertion `("libev: ev_io_start called with corrupted watcher", ((WL)w)->next != (WL)w)' failed.

Here is the function where the failure occurs:

void ds_health_check_timer_cb(struct ev_loop *loop, struct ev_periodic *p, int revents) {
    int i;
    int health_fd;
    struct ev_io *watcher;

    for (i = 0; i < global.downstream_num; i++) {
        watcher = (struct ev_io *)(&global.downstream[i].health_watcher);
        health_fd = watcher->fd;
        if (health_fd < 0) {
            health_fd = socket(AF_INET, SOCK_STREAM, 0);
            if (health_fd == -1) {
                log_msg(ERROR, "%s: socket() failed %s", __func__, strerror(errno));
                continue;
            }
            if (setnonblock(health_fd) == -1) {
                close(health_fd);
                log_msg(ERROR, "%s: setnonblock() failed %s", __func__, strerror(errno));
                continue;
            }
            if (connect(health_fd,
                        (struct sockaddr *)(&global.downstream[i].sa_in_health),
                        sizeof(global.downstream[i].sa_in_health)) == -1 && errno == EINPROGRESS) {
                log_msg(ERROR, "%s: before ev_io_init if branch", __func__);
                ev_io_init(watcher, ds_health_connect_cb, health_fd, EV_WRITE);
            } else {
                log_msg(ERROR, "%s: connect() failed %s", __func__, strerror(errno));
                close(health_fd);
                continue;
            }
        } else {
            log_msg(ERROR, "%s: before ev_io_init else branch", __func__);
            ev_io_init(watcher, ds_health_send_cb, health_fd, EV_WRITE);
        }
        log_msg(ERROR, "%s: before ev_io_start", __func__);
        ev_io_start(loop, watcher);
    }
}

This failure happens during network outages such as a switch failure.
Should I call connect() in the else branch and check for EISCONN? Please
let me know if there is a good practice for situations like this.
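
One possible explanation, sketched under assumptions: the libev
documentation says a watcher must not be re-initialised while it is
active. If the health watcher is still active when the timer fires (for
example because the previous check never completed during the outage),
the else branch would call ev_io_init on it, clearing its active flag
while it is still linked into the watcher list of its fd. The following
ev_io_start would then link it a second time, leaving it pointing at
itself, which is what the assertion checks. The model below is not libev
code; model_watcher, model_init, model_start and model_stop are
hypothetical names used only to illustrate that mechanism.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a watcher; NOT libev's real struct. */
struct model_watcher {
    int active;                  /* nonzero while linked into the list */
    struct model_watcher *next;  /* intrusive list link */
};

/* Models re-initialisation: the watcher forgets that it is linked. */
void model_init(struct model_watcher *w) {
    w->active = 0;
}

/* Pushes w at the head of the list unless it is marked active. */
void model_start(struct model_watcher **head, struct model_watcher *w) {
    if (w->active)
        return;
    w->active = 1;
    w->next = *head;
    *head = w;
}

/* Unlinks w from the list if it is marked active. */
void model_stop(struct model_watcher **head, struct model_watcher *w) {
    struct model_watcher **pp;

    if (!w->active)
        return;
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == w) {
            *pp = w->next;
            break;
        }
    }
    w->active = 0;
    w->next = NULL;
}

/* Returns 1 if init-while-active leaves the watcher pointing at itself. */
int model_reinit_without_stop_corrupts(void) {
    struct model_watcher *head = NULL;
    struct model_watcher w = {0, NULL};

    model_start(&head, &w);
    model_init(&w);          /* still linked, but now flagged inactive */
    model_start(&head, &w);  /* links w in front of itself */
    return w.next == &w;
}

/* Same sequence, but the watcher is stopped before it is re-initialised. */
int model_reinit_after_stop_corrupts(void) {
    struct model_watcher *head = NULL;
    struct model_watcher w = {0, NULL};

    model_start(&head, &w);
    model_stop(&head, &w);   /* unlink first */
    model_init(&w);
    model_start(&head, &w);
    return w.next == &w;
}
```

If this is indeed the cause, the corresponding guard in the real callback
would presumably be to call ev_io_stop(loop, watcher) before ev_io_init,
or to skip the iteration while ev_is_active(watcher) is true. Whether
that fits depends on what ds_health_connect_cb and ds_health_send_cb do
with the watcher, which is not shown here.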

Thanks,
Kirill


