RejectedSoftware Forums

Vibe.d not quitting after a null ptr deref

On Linux, when I get any sort of uncaught Error/Throwable, the process just reports:

Task terminated with unhandled exception:

without any other information, and then just sits there taking up 100% CPU.

Is this known? Can I do something about it? I need to have it die normally.

Cheers,
-M

Re: Vibe.d not quitting after a null ptr deref

In reply to Márcio Martins's post of 22.04.2015, 21:32:

For segmentation faults, the process directly exits with -11 for me,
but for Throwables (non-Exceptions) I also get a segfault in
__d_throwc. It seems like throwing across C boundaries (libevent's
event_base_loop) may be the issue here. I couldn't reproduce the
hanging process, though.

Since the goal is to eventually switch to the libasync backend by
default, can you try whether that works better for you? It should at
least have pure D call stacks.

For the libevent driver, the best option may be to simply directly call
C's exit() when an uncaught error happens in one of the C event callbacks.
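
To illustrate the exit() idea above, here is a minimal sketch in D. It is not the actual vibe.d or libevent driver code: `Handler`, `Action`, `classify` and `guardedCallback` are hypothetical names made up for this example, and only `fprintf`, `stderr`, `exit` and `EXIT_FAILURE` come from druntime's core.stdc bindings.

```d
import core.stdc.stdio : fprintf, stderr;
import core.stdc.stdlib : exit, EXIT_FAILURE;

/// Hypothetical D-side handler type (assumption, not a vibe.d type).
alias Handler = void function(void* ctx);

/// What the wrapper should do after a handler threw.
enum Action { resume, terminate }

/// Exceptions are treated as recoverable; any other Throwable
/// (Error and friends) means program state can no longer be trusted.
Action classify(Throwable t)
{
    return (cast(Exception) t !is null) ? Action.resume : Action.terminate;
}

/// Hypothetical wrapper with C linkage, standing in for a function that
/// would be registered as a callback with a C event loop.
/// Nothing is allowed to unwind past this frame into C code.
extern(C) void guardedCallback(void* ctx, Handler handler)
{
    try
    {
        handler(ctx);
    }
    catch (Throwable t)
    {
        // msg is a D string (not zero-terminated), hence "%.*s".
        fprintf(stderr, "callback failed: %.*s\n",
                cast(int) t.msg.length, t.msg.ptr);
        if (classify(t) == Action.terminate)
            exit(EXIT_FAILURE); // die here instead of throwing across C frames
    }
}
```

With a wrapper like this, an Exception is logged and control returns to the event loop, while an Error ends the process with a non-zero status so that a supervisor can restart it.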

Re: Vibe.d not quitting after a null ptr deref

In reply to Sönke Ludwig's post of Sun, 26 Apr 2015 13:49:59 +0200:

Thanks Sönke!

I remember that I removed this line in core.d, and it suddenly worked great:

			// always pass Errors on
			if (auto err = cast(Error)th) throw err;

It has the side effect of continuing execution after any type of error, if it's in a fiber, which is sort of cool, but certainly not right for every application. I'd rather shut down and restart than risk any sort of corruption, even if the chances are very slim.
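
The trade-off can be sketched in a few lines of D. This is only an illustration under assumptions, not vibe.d's task code: `runTask` and its `passErrorsOn` flag are hypothetical, and the flag merely toggles a check modeled on the quoted line.

```d
/// Hypothetical task runner. Returns true if `task` ran to completion and
/// false if it was terminated by a Throwable that was swallowed here.
bool runTask(void delegate() task, bool passErrorsOn = true)
{
    try
    {
        task();
        return true;
    }
    catch (Throwable th)
    {
        // Modeled on the quoted line: Errors are rethrown to the caller.
        if (passErrorsOn)
            if (auto err = cast(Error) th) throw err;

        // Without the rethrow, even an Error only ends this one task
        // and the rest of the program keeps running.
        return false;
    }
}
```

With `passErrorsOn` set to false, an Error thrown inside the task is swallowed just like an Exception, which matches the "continues execution" behavior described above. With the default, the Error reaches whatever called `runTask`.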

I will try libasync. Is it ready for production use?

Re: Vibe.d not quitting after a null ptr deref

In reply to Márcio Martins's post of Mon, 27 Apr 2015 12:43:17 GMT:

So I tried libasync for a day, and I concluded it's not quite as stable as libevent yet.
It does terminate correctly more often! It never hung on the frontend servers, but it did hang a couple of times locally, on my machine and on another dev's.

The bigger issue is that every now and then it crashes with this call stack:

[27E14F55:1C528955 2015.04.27 22:44:46.956 CRITICAL] Task terminated with uncaught exception: Connection closed
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] object.Exception@.dub/packages/vibe-d-0.7.23/source/vibe/core/drivers/libasync.d(1267): Connection closed
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] ----------------
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.core.core.VibeDriverCore.yieldForEvent()+0x47) [0x8d1877]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(bool vibe.core.drivers.libasync.LibasyncTCPConnection.waitForData(core.time.Duration)+0x1ea) [0x8ddb62]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.http.server.handleHTTPConnection(vibe.core.net.TCPConnection, vibe.http.server.HTTPServerListener)+0x72) [0xa1588a]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.http.server.listenHTTPPlain(vibe.http.server.HTTPServerSettings).doListen(vibe.http.server.HTTPServerSettings, ulong, immutable(char)[])._lambda4(vibe.core.net.TCPConnection)+0x33) [0xa10b1b]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.core.drivers.libasync.LibasyncTCPConnection.onConnect()+0x47) [0x8def4f]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.core.core.makeTaskFuncInfo!(void delegate()).makeTaskFuncInfo(ref void delegate()).callDelegate(vibe.core.core.TaskFuncInfo*)+0x40) [0x8d5d18]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void vibe.core.core.CoreTask.run()+0x20a) [0x8d0f22]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(void core.thread.Fiber.run()+0x2a) [0xb3443e]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] www(fiber_entryPoint+0x61) [0xb34341]
[27E14F55:1C54AF55 2015.04.27 22:44:46.961 ERR] [(nil)]

From here on, every connection attempt is refused, while the process is still running.

Another issue, which I didn't have time to investigate further, is that it very frequently hangs on startup. Basically, it never starts to listen; there are no errors or anything, it just hangs there. I noticed it doesn't lower privileges when this happens, so it must be hanging at or before that point.
It seems to happen only on release builds, so it might be a threading issue or perhaps an environment issue on that host...

Finally, just an observation: for that reason, it won't be as fast as libevent, at least for those of us compiling with DMD. I got 55 ± 10 vs. 100 ± 5 requests/sec with libasync vs. libevent. I'm not sure the gap is entirely due to compiling with DMD, but it is a considerable difference nonetheless.

Re: Vibe.d not quitting after a null ptr deref

In reply to Márcio Martins's post of 28.04.2015, 14:41:

Thanks for sharing those results. There are definitely still issues (the
vibe.core.sync unit tests currently fail on Posix for example) and I
think we need to somehow make the libasync backend more prominent to
facilitate more thorough testing.

I'll have a look at the issues and ping Etienne (if he hasn't already
read this).