ocserv-main segfault in libev.so.4.0.0
Marc Lehmann
schmorp at schmorp.de
Fri Jun 21 02:21:11 CEST 2019
On Thu, Jun 20, 2019 at 10:37:25PM +0800, §汤圆§/ty <20207593 at qq.com> wrote:
> I am a small development team here, about 20 users connect to the ocserv server through cisco anyconnect. I don't know how to manually reproduce this problem, but in my scenario, the ocserv-main process will exit with segment-fault. All users will drop from the vpn at the same time. Every day, it will fail 1-2 times. It is not a fault at startup. Failure after running for a while.
Hi!
> #0 0x0000000000000000 in ?? ()
> (gdb) where
> #0 0x0000000000000000 in ?? ()
> #1 0x00007f305fc553d5 in ev_invoke_pending (loop=0x7f305fe5ea40 <default_loop_struct>) at ev.c:3322
> #2 0x00007f305fc585b5 in ev_run (loop=0x7f305fe5ea40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
> #3 0x000055d4d269d7da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
Good that you provided a backtrace.
This doesn't really look like a crash/bug in libev, but more like an
outside bug (e.g. specifying an invalid event callback or something like
that).
If you want to debug this further, you should compile the whole program
with debugging info and (if possible) without optimisation, e.g. -ggdb -O0
or at least -g.
That way, you could see in what file and what line it crashes, and this
might shed a lot of light on what is causing the problem. If the last frame
is still ev_invoke_pending, you can enter it and "print *p" and "print
*p->w", which should tell you what watcher is involved, and what callback
is specified.
> Program terminated with signal 11, Segmentation fault.
> #0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
> 2658 if ((w->pid == pid || !w->pid)
> (gdb) where
> #0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
> #1 childcb (loop=0x7f49ca995a40 <default_loop_struct>, sw=<optimized out>, revents=<optimized out>) at ev.c:2690
> #2 0x00007f49ca78c3d5 in ev_invoke_pending (loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:3322
> #3 0x00007f49ca78f5b5 in ev_run (loop=0x7f49ca995a40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
> #4 0x0000559f444867da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
> p w->pid is a wild pointer.
That's a very different crash. Maybe you have some random memory corruption
going on - it's hard to explain why a registered watcher pointer would
suddenly become invalid - most likely, this is again a bug in ocserv-main
somewhere.
The most common problem is that people reuse/overwrite/zero/corrupt/free a
watcher structure while it is still in use by libev.
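To illustrate that lifetime rule, here is a minimal sketch using a toy model, not libev's actual code. All toy_* names are hypothetical and exist only for this example. The model assumes the loop keeps active watchers on an intrusive linked list, so a watcher that is freed or overwritten while still linked leaves a dangling pointer that the next reap pass will dereference. In real libev code the corresponding step is to call the watcher's stop function (ev_child_stop for a child watcher) before freeing or reinitialising the structure.

```c
#include <stdlib.h>

/* Toy stand-in for a child watcher (hypothetical, not libev's ev_child). */
struct toy_child {
    struct toy_child *next; /* link used by the loop while the watcher is active */
    int pid;                /* 0 means "match any child" in this model */
    int active;             /* nonzero while linked into the loop */
    int hits;               /* how often the reaper matched this watcher */
};

/* Toy stand-in for the event loop: just the list of active watchers. */
struct toy_loop {
    struct toy_child *children;
};

/* Link the watcher into the loop. From here on the loop reads its memory. */
static void toy_child_start(struct toy_loop *loop, struct toy_child *w)
{
    if (w->active)
        return;
    w->next = loop->children;
    loop->children = w;
    w->active = 1;
}

/* Unlink the watcher. Only after this may its memory be reused or freed. */
static void toy_child_stop(struct toy_loop *loop, struct toy_child *w)
{
    struct toy_child **pp;

    if (!w->active)
        return;
    for (pp = &loop->children; *pp; pp = &(*pp)->next) {
        if (*pp == w) {
            *pp = w->next;
            break;
        }
    }
    w->next = NULL;
    w->active = 0;
}

/* Walk every linked watcher, as a reaper would. A watcher that was freed
 * while still linked would be dereferenced here, which is where a crash
 * on w->pid would show up. Returns the number of matching watchers. */
static int toy_child_reap(struct toy_loop *loop, int pid)
{
    struct toy_child *w;
    int matched = 0;

    for (w = loop->children; w; w = w->next) {
        if (w->pid == pid || !w->pid) {
            ++w->hits;
            ++matched;
        }
    }
    return matched;
}

/* Safe teardown: stop first, then free. Calling free() alone on an active
 * watcher is the bug pattern described above. */
static void toy_child_destroy(struct toy_loop *loop, struct toy_child *w)
{
    toy_child_stop(loop, w);
    free(w);
}
```

The point of toy_child_destroy is the ordering: the watcher leaves the loop's data structures before its memory is released, so a later reap pass never sees it.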
You can recompile libev with -DEV_VERIFY=3 - this will check all available
data structures at almost every call. This can slow down event handling a
lot, but it can also crash much earlier - more likely at the place where the
problem actually happens.
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp at schmorp.de
-=====/_/_//_/\_,_/ /_/\_\