Hey guys,
I think I've run into something that may be a bug. I haven't entirely ruled out issues in my own code, but given the nature of the problem I think it's likely that vibe is involved. Granted, I may just be missing something about the semantics, but I'm at a bit of a loss after debugging my code for a while.
The issue crops up with a sequence that looks something like this:
1) Several outstanding client fibers all get disconnected by the remote end at roughly the same time - exceptions are thrown in the tasks and some scope(exit) blocks get triggered.
2) The scope(exit) blocks call a REST interface (via vibe's REST wrapper). That call will of course block and switch to other fibers.
So generally this all works fine, but on Linux (Ubuntu 14.04, dmd 2.068.2, vibe 0.7.25) this will sometimes trigger a segfault somewhere in the REST call. Enabling "trace" logging suggests that it is related to allocating a new connection from the requestHTTP connection pool, although the crash itself seems to occur after that, when trying to "resume" some task (perhaps the one that is doing the REST request). Here's a portion of the trace - note that the lines that start with the date/time come from application logging:
HTTP client request:
PUT ... HTTP/1.1
User-Agent: vibe.d/0.7.25 (HTTPClient, +http://vibed.org/)
Connection: keep-alive
Accept-Encoding: gzip, deflate
Host: localhost
Content-Type: application/json
Content-Length: 3783
evbufferadd (fd 51): 235 B
evbufferadd (fd 51): 3783 B
buffereventflush
HTTP client reading status line
leastSize waiting for new data.
Socket event on fd 46: 17 (13D55D8 vs 13D55D8)
Connection was closed (fd 46).
resuming corresponding task...
2015-09-27 20:42:29: remove_connection test6: start
2015-09-27 20:42:29: remove_connection test6: found client
2015-09-27 20:42:29: Room 7: Removed client 'test6', ID 1073741829 (0 connections total)
2015-09-27 20:42:29: localhost: Sending room data to login server (20 rooms)
returning HTTPClient connection 6 of 7
HTTP client request:
PUT ... HTTP/1.1
User-Agent: vibe.d/0.7.25 (HTTPClient, +http://vibed.org/)
Connection: keep-alive
Accept-Encoding: gzip, deflate
Host: localhost
Content-Type: application/json
Content-Length: 3783
evbufferadd (fd 55): 235 B
evbufferadd (fd 55): 3783 B
buffereventflush
HTTP client reading status line
leastSize waiting for new data.
Socket event on fd 47: 17 (13D6758 vs 13D6758)
Connection was closed (fd 47).
resuming corresponding task...
2015-09-27 20:42:29: remove_connection test7: start
2015-09-27 20:42:29: remove_connection test7: found client
2015-09-27 20:42:29: Room 8: Removed client 'test7', ID 1073741830 (0 connections total)
2015-09-27 20:42:29: localhost: Sending room data to login server (20 rooms)
creating new HTTPClient connection, all 7 are in use
... 7EFF3111D280
Now got 8 connections
resuming corresponding task...
Segmentation fault (core dumped)
I'm going to try to get gdb set up to get a proper stack trace, though I'm not sure how hard that will be. I generally do my debugging on Windows and just deploy to Linux, but I cannot reproduce this issue on Windows at all. In the meantime I wanted to post to see if anyone else has run into anything like this.
It seems I can work around this particular issue by moving the REST call into a deferred task/timer, but ideally I'd like to understand the cause, particularly if it's something in my code, so that I can avoid doing it in the future.
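To make the workaround concrete, here is a minimal sketch of the shape of it, not the actual project code. Every name in it (EventLog, sendRoomData, serveClient, defer) is made up for illustration, and the vibe dependency is replaced by a plain `defer` delegate so the snippet compiles on its own. In the real application `defer` would presumably wrap vibe's runTask or a timer, and sendRoomData would be the blocking REST call.

```d
import std.stdio : writeln;

// All names below are hypothetical stand-ins, not taken from the project.
alias Job = void delegate();

// Collects events so the ordering can be inspected afterwards.
class EventLog
{
    string[] lines;
}

// Stand-in for the blocking call made through vibe's REST wrapper.
void sendRoomData(string client, EventLog log)
{
    log.lines ~= "sent room data for " ~ client;
}

// Serves one client and always ends with a simulated remote disconnect.
// `defer` schedules a job to run later, outside of the unwinding code path.
void serveClient(string client, void delegate(Job) defer, EventLog log)
{
    scope (exit)
    {
        // Only local, non-blocking cleanup runs while the stack unwinds.
        log.lines ~= "removed " ~ client;
        // The blocking call is queued instead of being made right here.
        defer({ sendRoomData(client, log); });
    }
    throw new Exception("remote disconnect");
}

void main()
{
    auto log = new EventLog;
    Job[] pending;
    // Simple queue in place of a real scheduler (assumption: with vibe this
    // would hand the job to runTask or a timer instead).
    void delegate(Job) defer = (Job job) { pending ~= job; };

    foreach (client; ["test6", "test7"])
    {
        try
            serveClient(client, defer, log);
        catch (Exception e)
            log.lines ~= client ~ " disconnected";
    }

    // Deferred jobs run only after every handler has finished unwinding.
    foreach (job; pending)
        job();

    foreach (line; log.lines)
        writeln(line);
}
```

The point of the pattern is just that scope(exit) no longer yields to other fibers while an exception is propagating; whether that is the actual cause of the segfault is still an open question.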
The full project is a bit much to post (and a pain to set up and run, as it spans multiple processes), but if need be I can try to boil it down to a fairly minimal reproducer.
Thanks in advance!
PS: Just need to call out again how awesome vibe is - the REST interface generator in particular has allowed me to scale out my application to multiple servers while barely changing any of the code and letting vibe just nicely do the RPC via the REST stuff. So great!