ocserv-main segfault in libev.so.4.0.0
§汤圆§/ty
20207593 at qq.com
Thu Jun 20 16:37:25 CEST 2019
Hello, libev team, sorry to bother you.
We are a small development team; about 20 users connect to an ocserv server through Cisco AnyConnect. I don't know how to reproduce the problem manually, but in our setup the ocserv-main process exits with a segmentation fault, and all users are dropped from the VPN at the same time. This happens once or twice a day. It is not a startup fault: the crash occurs after the server has been running for a while.
Judging from the core dumps, I think this is a problem in libev.
The dmesg -T log below lists the faults from the most recent week.
This problem is very frustrating for us because it interrupts our work.
The details below are for the latest fault, from today.
[admin@vpn ~]$ uname -a
Linux vpn.kofo.io 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[admin@vpn ~]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[admin@vpn ~]$ rpm -qa | grep ocserv
ocserv-0.12.3-1.el7.x86_64
ocserv-debuginfo-0.12.3-1.el7.x86_64
I tested two versions of libev:
==========with libev-4.15-7.el7.x86_64.rpm:==========
[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid1778-time1561007979
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ocserv...Reading symbols from /usr/lib/debug/usr/sbin/ocserv.debug...done.
done.
warning: core file may not match specified executable file.
[New LWP 1778]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main '.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000000000 in ?? ()
(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00007f305fc553d5 in ev_invoke_pending (loop=0x7f305fe5ea40 <default_loop_struct>) at ev.c:3322
#2 0x00007f305fc585b5 in ev_run (loop=0x7f305fe5ea40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#3 0x000055d4d269d7da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
1222 static void syserr_cb (const char *msg)
1223 {
1224 main_server_st *s = ev_userdata(loop);
1225
1226 mslog(s, NULL, LOG_ERR, "libev fatal error: %s", msg);
1227 abort();
1228 }
1229
1230 int main(int argc, char** argv)
1231 {
(gdb) quit
[admin@vpn ~]$ dmesg -T | tail
[Mon Jun 17 20:07:38 2019] traps: ocserv-main[4398] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:10 2019] traps: ocserv-main[4708] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:28 2019] traps: ocserv-main[4743] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:56 2019] traps: ocserv-main[4767] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:13:34 2019] traps: ocserv-main[4819] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:16:38 2019] ocserv-main[14426]: segfault at 55686ec0add8 ip 00007fb9b653aabc sp 00007ffcfdb6cec0 error 6 in libev.so.4.0.0[7fb9b6536000+d000]
[Tue Jun 18 03:30:28 2019] traps: ocserv-main[5392] general protection ip:7f5e1e8bcc48 sp:7ffc2e520af0 error:0 in libev.so.4.0.0[7f5e1e8b8000+d000]
[Tue Jun 18 12:47:01 2019] ocserv-main[6841]: segfault at 0 ip (null) sp 00007fffd1ace668 error 14 in ocserv[558a01f44000+5c000]
[Tue Jun 18 20:20:22 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]
[Thu Jun 20 13:18:57 2019] ocserv-main[1778]: segfault at 0 ip (null) sp 00007ffe0e0a4858 error 14 in ocserv[55d4d2691000+5c000]
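To compare these faults, it can help to turn each logged instruction pointer into an offset inside the mapped library (ip minus the base address shown in brackets). The sketch below does that for the two dmesg line formats shown above. The function name fault_offset and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches both "ip:7fb9b653e35c ..." and "ip 00007fb9b653aabc ..." forms,
# followed by "in <module>[<base>+<size>]". Lines with "ip (null)" do not match.
FAULT_RE = re.compile(
    r"\bip[: ]([0-9a-f]+) .* in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (module, offset) for a dmesg fault line, or None if not parseable."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group(1), 16)
    module = m.group(2)
    base = int(m.group(3), 16)
    size = int(m.group(4), 16)
    offset = ip - base
    # Reject instruction pointers that fall outside the reported mapping.
    if not 0 <= offset < size:
        return None
    return module, offset


samples = [
    "traps: ocserv-main[4398] general protection ip:7fb9b653e35c "
    "sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]",
    "ocserv-main[14426]: segfault at 55686ec0add8 ip 00007fb9b653aabc "
    "sp 00007ffcfdb6cec0 error 6 in libev.so.4.0.0[7fb9b6536000+d000]",
    "traps: ocserv-main[25818] general protection ip:7f49ca78cc48 "
    "sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]",
]

for sample in samples:
    module, offset = fault_offset(sample)
    print(module, hex(offset))  # offsets: 0x835c, 0x4abc, 0x4c48
```

Each offset can then be passed to a symbolizer such as addr2line (with -e pointing at the installed libev.so.4.0.0), provided matching debug information is available. The library path depends on the system.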
==========with libev 4.25 (compiled and installed manually):==========
dmesg -T:
[Tue Jun 18 20:20:21 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]
[admin@vpn tmp]$ sudo file /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
/tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'ocserv-main', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/sbin/ocserv', platform: 'x86_64'
Unix time 1560860462 = 2019-06-18 20:21:02 CST (UTC+8)
[admin@vpn tmp]$ sudo chmod +r core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Reading symbols from /usr/sbin/ocserv...Reading symbols from /usr/lib/debug/usr/sbin/ocserv.debug...done.
done.
warning: core file may not match specified executable file.
[New LWP 25818]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main '.
Program terminated with signal 11, Segmentation fault.
#0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
2658 if ((w->pid == pid || !w->pid)
(gdb) where
#0 child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
#1 childcb (loop=0x7f49ca995a40 <default_loop_struct>, sw=<optimized out>, revents=<optimized out>) at ev.c:2690
#2 0x00007f49ca78c3d5 in ev_invoke_pending (loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:3322
#3 0x00007f49ca78f5b5 in ev_run (loop=0x7f49ca995a40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#4 0x0000559f444867da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
2653 ev_child *w;
2654 int traced = WIFSTOPPED (status) || WIFCONTINUED (status);
2655
2656 for (w = (ev_child *)childs [chain & ((EV_PID_HASHSIZE) - 1)]; w; w = (ev_child *)((WL)w)->next)
2657 {
2658 if ((w->pid == pid || !w->pid)
2659 && (!traced || (w->flags & 1)))
2660 {
2661 ev_set_priority (w, EV_MAXPRI); /* need to do it *now*, this *must* be the same prio as the signal watcher itself */
2662 w->rpid = pid;
(gdb) p w
$2 = (ev_child *) 0x2d3832312d534541
(gdb) p w->pid
Cannot access memory at address 0x2d3832312d53456d
w is a wild pointer, so w->pid cannot be read.
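One further check that may help narrow this down: a wild pointer sometimes turns out to be overwritten with text. The sketch below reinterprets the value of w from the gdb session as little-endian bytes (x86_64 byte order) and prints them as ASCII. The helper name pointer_as_ascii is hypothetical. What the result means is an assumption, not a conclusion: printable text in a list pointer would be consistent with the watcher memory having been freed and reused, or overwritten by string data, but it does not by itself show where the corruption comes from.

```python
def pointer_as_ascii(value, width=8):
    """Render an integer pointer value as its little-endian bytes in ASCII.

    Non-printable bytes are shown as '.'.
    """
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the gdb output above.
w = 0x2D3832312D534541
fault_addr = 0x2D3832312D53456D

print(pointer_as_ascii(w))   # AES-128-
print(hex(fault_addr - w))   # 0x2c, distance from w to the address gdb failed to read
```

The 0x2c distance matches gdb trying to read the pid member through the corrupted w, so the failing address is just the bad pointer plus a member offset rather than a second, independent bad value.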