ocserv-main segfault in libev.so.4.0.0

§汤圆§/ty 20207593 at qq.com
Thu Jun 20 16:37:25 CEST 2019


Hello, libev team, sorry to bother you.


We are a small development team; about 20 users connect to our ocserv server through Cisco AnyConnect. I don't know how to reproduce the problem manually, but in our setup the ocserv-main process exits with a segmentation fault and all users are dropped from the VPN at the same time. This happens 1-2 times a day. It is not a startup failure: the crash occurs after the server has been running for a while.


Based on the core dumps, I think this is a problem in libev.



The dmesg -T output further down lists the faults from the most recent week.
This problem is very frustrating for us because it interrupts our work.
The most recent fault occurred today.



[admin@vpn ~]$ uname -a
Linux vpn.kofo.io 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[admin@vpn ~]$ cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core) 
[admin@vpn ~]$ rpm -qa | grep ocserv
ocserv-0.12.3-1.el7.x86_64
ocserv-debuginfo-0.12.3-1.el7.x86_64




I tested two versions of libev; both crash.



==========with libev-4.15-7.el7.x86_64.rpm:==========
[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid1778-time1561007979
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ocserv...Reading symbols from /usr/lib/debug/usr/sbin/ocserv.debug...done.
done.


warning: core file may not match specified executable file.
[New LWP 1778]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main                                                                   '.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in ?? ()
#1  0x00007f305fc553d5 in ev_invoke_pending (loop=0x7f305fe5ea40 <default_loop_struct>) at ev.c:3322
#2  0x00007f305fc585b5 in ev_run (loop=0x7f305fe5ea40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#3  0x000055d4d269d7da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
1222	static void syserr_cb (const char *msg)
1223	{
1224		main_server_st *s = ev_userdata(loop);
1225	
1226		mslog(s, NULL, LOG_ERR, "libev fatal error: %s", msg);
1227		abort();
1228	}
1229	
1230	int main(int argc, char** argv)
1231	{
(gdb) quit


[admin@vpn ~]$ dmesg -T | tail
[Mon Jun 17 20:07:38 2019] traps: ocserv-main[4398] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:10 2019] traps: ocserv-main[4708] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:28 2019] traps: ocserv-main[4743] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:12:56 2019] traps: ocserv-main[4767] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:13:34 2019] traps: ocserv-main[4819] general protection ip:7fb9b653e35c sp:7ffcfdb6ce50 error:0 in libev.so.4.0.0[7fb9b6536000+d000]
[Mon Jun 17 20:16:38 2019] ocserv-main[14426]: segfault at 55686ec0add8 ip 00007fb9b653aabc sp 00007ffcfdb6cec0 error 6 in libev.so.4.0.0[7fb9b6536000+d000]
[Tue Jun 18 03:30:28 2019] traps: ocserv-main[5392] general protection ip:7f5e1e8bcc48 sp:7ffc2e520af0 error:0 in libev.so.4.0.0[7f5e1e8b8000+d000]
[Tue Jun 18 12:47:01 2019] ocserv-main[6841]: segfault at 0 ip           (null) sp 00007fffd1ace668 error 14 in ocserv[558a01f44000+5c000]
[Tue Jun 18 20:20:22 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]
[Thu Jun 20 13:18:57 2019] ocserv-main[1778]: segfault at 0 ip           (null) sp 00007ffe0e0a4858 error 14 in ocserv[55d4d2691000+5c000]





==========with libev 4.25 (compiled and installed manually):==========


dmesg -T:
[Tue Jun 18 20:20:21 2019] traps: ocserv-main[25818] general protection ip:7f49ca78cc48 sp:7ffe69d8fc50 error:0 in libev.so.4.0.0[7f49ca788000+d000]


[admin@vpn tmp]$ sudo file /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
/tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'ocserv-main', real uid: 0, effective uid: 0, real gid: 0, effective gid: 0, execfn: '/usr/sbin/ocserv', platform: 'x86_64'


Unix time 1560860462 = 2019-06-18 20:21:02 CST


[admin@vpn tmp]$ sudo chmod +r core-ocserv-main-sig11-user0-group0-pid25818-time1560860462


[admin@vpn ~]$ gdb /usr/sbin/ocserv /tmp/core-ocserv-main-sig11-user0-group0-pid25818-time1560860462
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/ocserv...Reading symbols from /usr/lib/debug/usr/sbin/ocserv.debug...done.
done.


warning: core file may not match specified executable file.
[New LWP 25818]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ocserv-main                                                                   '.
Program terminated with signal 11, Segmentation fault.
#0  child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
2658	      if ((w->pid == pid || !w->pid)
(gdb) where
#0  child_reap (status=0, pid=1619, chain=1619, loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:2658
#1  childcb (loop=0x7f49ca995a40 <default_loop_struct>, sw=<optimized out>, revents=<optimized out>) at ev.c:2690
#2  0x00007f49ca78c3d5 in ev_invoke_pending (loop=0x7f49ca995a40 <default_loop_struct>) at ev.c:3322
#3  0x00007f49ca78f5b5 in ev_run (loop=0x7f49ca995a40 <default_loop_struct>, flags=flags@entry=0) at ev.c:3726
#4  0x0000559f444867da in main (argc=<optimized out>, argv=<optimized out>) at main.c:1440
(gdb) l
2653	  ev_child *w;
2654	  int traced = WIFSTOPPED (status) || WIFCONTINUED (status);
2655	
2656	  for (w = (ev_child *)childs [chain & ((EV_PID_HASHSIZE) - 1)]; w; w = (ev_child *)((WL)w)->next)
2657	    {
2658	      if ((w->pid == pid || !w->pid)
2659	          && (!traced || (w->flags & 1)))
2660	        {
2661	          ev_set_priority (w, EV_MAXPRI); /* need to do it *now*, this *must* be the same prio as the signal watcher itself */
2662	          w->rpid    = pid;
(gdb) p w
$2 = (ev_child *) 0x2d3832312d534541
(gdb) p w->pid
Cannot access memory at address 0x2d3832312d53456d


w is a wild pointer, so w->pid cannot be read.