Re: ocserv-main segfault in libev.so.4.0.0
§汤圆§/ty
20207593 at qq.com
Wed Jul 3 05:01:38 CEST 2019
Hi, Marc Lehmann,
Thank you very much for your reply.
We submitted more information to the author of Ocserv in accordance with your suggestion, and he has fixed the bug.
Thank you, you have helped us a lot.
------------------ Original Message ------------------
From: "Marc Lehmann"<schmorp at schmorp.de>;
Sent: Friday, June 21, 2019, 8:21 AM
To: "§汤圆§/ty"<20207593 at qq.com>;
Cc: "libev"<libev at lists.schmorp.de>;
Subject: Re: ocserv-main segfault in libev.so.4.0.0
On Thu, Jun 20, 2019 at 10:37:25PM +0800, §汤圆§/ty <20207593 at qq.com> wrote:
> We are a small development team; about 20 users connect to the ocserv server through Cisco AnyConnect. I don't know how to reproduce the problem manually, but in our setup the ocserv-main process exits with a segmentation fault, and all users are dropped from the VPN at the same time. This happens 1-2 times a day. It does not fail at startup, only after running for a while.
Hi!
> #0 0x0000000000000000 in ?? ()
> (gdb) where
> #0 0x0000000000000000 in ?? ()
> #1 0x00007f305fc553d5 in ev_invoke_pending (loop=0x7f305fe5ea40 <default_loop_struct>) at ev.c:3322
> #2 0x00007f305fc585b5 in ev_run (loop=0x7f305fe5ea40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
> #3 0x000055d4d269d7da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
Good that you provided a backtrace.
This doesn't really look like a crash/bug in libev, but more like an
outside bug (e.g. specifying an invalid event callback or something like
that).
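A frame #0 at address 0 is what a call through a null function pointer
typically looks like. The following is a minimal sketch of that failure mode
and a guard against it. It is a toy model, not libev code: demo_watcher,
demo_invoke and on_event are hypothetical names introduced only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a watcher; NOT libev's real structure. */
typedef struct demo_watcher {
    void (*cb)(struct demo_watcher *w, int revents); /* event callback */
    int calls;                                       /* how often cb ran */
} demo_watcher;

static void on_event(demo_watcher *w, int revents) {
    (void)revents;
    w->calls++;
}

/* Invoke the callback the way an event loop would, but refuse to jump
 * through a null pointer. Returns 0 on success, -1 if no callback is set. */
static int demo_invoke(demo_watcher *w, int revents) {
    if (w->cb == NULL)
        return -1; /* an unguarded call here would jump to address 0 */
    w->cb(w, revents);
    return 0;
}

/* Small self-check: one valid watcher, one with a missing callback. */
int demo_invoke_check(void) {
    demo_watcher good = { on_event, 0 };
    demo_watcher bad  = { NULL, 0 };
    assert(demo_invoke(&good, 1) == 0 && good.calls == 1);
    assert(demo_invoke(&bad, 1) == -1);
    return 0;
}
```

An event loop normally performs no such check on every invocation, so a
watcher whose callback was never set, or was overwritten later, only shows up
as a crash at address 0.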
If you want to debug this further, you should compile the whole program
with debugging info and (if possible) without optimisation, e.g. -ggdb -O0
or at least -g.
That way, you could see in what file and what line it crashes, and this
might shed a lot of light on what is causing the problem. If the last frame
is still ev_invoke_pending, you can enter it and "print *p" and "print
*p->w", which should tell you what watcher is involved, and what callback
is specified.
> Program terminated with signal 11, Segmentation fault.
> #0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
> 2658 if ((w->pid == pid || !w->pid)
> (gdb) where
> #0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
> #1 childcb (loop=0x7f49ca995a40 <default_loop_struct>, sw=<optimized out>, revents=<optimized out>) at ev.c:2690
> #2 0x00007f49ca78c3d5 in ev_invoke_pending (loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:3322
> #3 0x00007f49ca78f5b5 in ev_run (loop=0x7f49ca995a40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
> #4 0x0000559f444867da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
> p w->pid is a wild pointer.
That's a very different crash. Maybe you have some random memory corruption
going on - it's hard to explain why a registered watcher pointer would
suddenly become invalid - most likely, this is again a bug in ocserv-main
somewhere.
The most common problem is that people reuse/overwrite/zero/corrupt/free a
watcher structure while it is still in use by libev.
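The lifetime rule can be sketched with a toy event loop that keeps an
intrusive list of started watchers, which is roughly why a freed or
overwritten watcher leaves a dangling pointer behind. This is an illustrative
model only: toy_loop, toy_child and the toy_* functions are hypothetical and
are not libev's data structures. In libev itself the corresponding step is
calling the matching stop function (for example ev_child_stop) before the
memory is released or reused.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy watcher and loop; hypothetical names, not libev's structures. */
typedef struct toy_child {
    struct toy_child *next; /* link owned by the loop while started */
    int active;             /* 1 while linked into the loop */
    int pid;
} toy_child;

typedef struct toy_loop {
    toy_child *children;    /* list of started watchers */
} toy_loop;

static void toy_child_start(toy_loop *loop, toy_child *w) {
    if (w->active) return;
    w->next = loop->children;
    loop->children = w;
    w->active = 1;
}

static void toy_child_stop(toy_loop *loop, toy_child *w) {
    toy_child **pp;
    if (!w->active) return;
    for (pp = &loop->children; *pp; pp = &(*pp)->next) {
        if (*pp == w) {      /* unlink w so the loop no longer refers to it */
            *pp = w->next;
            break;
        }
    }
    w->next = NULL;
    w->active = 0;
}

/* Safe release of a heap-allocated watcher: unlink first, then free. */
static void toy_child_destroy(toy_loop *loop, toy_child *w) {
    toy_child_stop(loop, w);
    free(w);
}

static int toy_count(const toy_loop *loop) {
    int n = 0;
    const toy_child *w;
    for (w = loop->children; w; w = w->next) n++;
    return n;
}

/* Self-check: after each destroy the list holds only live watchers. */
int toy_lifetime_check(void) {
    toy_loop loop = { NULL };
    toy_child *a = calloc(1, sizeof *a);
    toy_child *b = calloc(1, sizeof *b);
    if (!a || !b) { free(a); free(b); return -1; }
    a->pid = 100;
    b->pid = 200;
    toy_child_start(&loop, a);
    toy_child_start(&loop, b);
    assert(toy_count(&loop) == 2);
    toy_child_destroy(&loop, a);
    assert(toy_count(&loop) == 1 && loop.children == b);
    toy_child_destroy(&loop, b);
    assert(loop.children == NULL);
    return 0;
}
```

Calling free on a watcher without the stop step would leave the list pointing
at released memory, and the next walk over the list would read garbage.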
You can recompile libev with -DEV_VERIFY=3 - this will check all available
data structures at almost every call. This can slow down event handling a
lot, but it can also crash much earlier - more likely at the place where the
problem actually happens.
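To illustrate why such verification tends to fail closer to the real bug,
here is a toy consistency check over a loop's watcher list. It is only a
sketch of the idea, with hypothetical names (mini_loop, mini_watcher,
mini_verify); the checks libev performs under EV_VERIFY are different and
more thorough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy watcher and loop; hypothetical, not libev's structures. */
typedef struct mini_watcher {
    struct mini_watcher *next;
    int active;                         /* 1 while started */
    void (*cb)(struct mini_watcher *w); /* must be set while started */
} mini_watcher;

typedef struct mini_loop {
    mini_watcher *head;
    int count; /* number of watchers the loop believes are started */
} mini_loop;

static void mini_noop(mini_watcher *w) { (void)w; }

static void mini_start(mini_loop *loop, mini_watcher *w) {
    w->next = loop->head;
    loop->head = w;
    w->active = 1;
    loop->count++;
}

/* Walk the list and check simple invariants.
 * Returns 0 if consistent, a negative code otherwise. */
static int mini_verify(const mini_loop *loop) {
    int seen = 0;
    const mini_watcher *w;
    /* the bound on seen keeps the walk finite even if the list is cyclic */
    for (w = loop->head; w && seen <= loop->count; w = w->next) {
        if (!w->active) return -1;    /* linked but not marked started */
        if (w->cb == NULL) return -2; /* started without a callback */
        seen++;
    }
    return seen == loop->count ? 0 : -3; /* list length disagrees with count */
}

/* Self-check: zeroing a started watcher is detected by the next verify. */
int mini_verify_check(void) {
    mini_loop loop = { NULL, 0 };
    mini_watcher a = { NULL, 0, mini_noop };
    mini_watcher b = { NULL, 0, mini_noop };
    mini_start(&loop, &a);
    mini_start(&loop, &b);
    assert(mini_verify(&loop) == 0);
    memset(&b, 0, sizeof b); /* modelled bug: watcher cleared while in use */
    assert(mini_verify(&loop) != 0);
    return 0;
}
```

Running a check like this after every operation is slow, but it reports the
inconsistency right after the corrupting write instead of at some later,
unrelated callback.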
--
The choice of a Deliantra, the free code+content MORPG
-----==- _GNU_ http://www.deliantra.net
----==-- _ generation
---==---(_)__ __ ____ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schmorp at schmorp.de
-=====/_/_//_/\_,_/ /_/\_\