I'd still wait a little. It looks like there might be more optimization
opportunities and I'll create a new pre-release version when those have
been investigated. But in any case, at least the latest alpha release
should be used, as that fixes the multi-core scaling issues.
I would like to share my thoughts about vibe.d performance.
I use Linux and git master with the latest multi-core fixes and improvements. I think vibe.d's bottleneck is now in the libevent2 driver, specifically the Libevent2TCPConnection
class. I am not an expert in the libevent library, but I am sure that Libevent2TCPConnection currently uses an expensive and inefficient call sequence. I wrote another implementation, libevent2_tcp.d.
It only works for small requests, like the hello-world one from WebFrameworkBenchmark/benchmarks/vibed, but its throughput is about 2.5 times that of the current version. The main idea is simple: read all the data from one libevent2 chunk at once and do not call bufferevent_read in the read method. You can take a look at the peek() and read() methods in my implementation. I could not find a correct way to advance reading to the next libevent2 data chunk and integrate it into vibe.d.
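The "one chunk at once" idea can be illustrated with a minimal, library-free sketch in C. This is not the code from libevent2_tcp.d and it does not call libevent: chunk_source, conn, conn_fetch, conn_peek and conn_read are hypothetical names, and the chunk list only stands in for whatever the event library queues up. The point is that peek() and read() are served from a cached pointer and length, and the backend is asked for data only when that cache is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the event library's input queue: a fixed
   list of chunks that are handed out one at a time. */
typedef struct {
    const char **chunks;  /* chunk payloads (NUL-terminated for brevity) */
    size_t count;         /* number of chunks */
    size_t next;          /* index of the next chunk to hand out */
    size_t fetches;       /* how often the backend was asked for data */
} chunk_source;

/* Connection that caches the current chunk as pointer + length. */
typedef struct {
    chunk_source *src;
    const char *cur;      /* unread part of the cached chunk */
    size_t cur_len;       /* bytes left in the cached chunk */
} conn;

/* Ask the backend for the next chunk; returns 0 when none is left. */
static int conn_fetch(conn *c)
{
    if (c->src->next >= c->src->count)
        return 0;
    c->cur = c->src->chunks[c->src->next++];
    c->cur_len = strlen(c->cur);
    c->src->fetches++;
    return 1;
}

/* peek(): expose the cached chunk without copying or consuming it. */
static const char *conn_peek(conn *c, size_t *len)
{
    while (c->cur_len == 0) {
        if (!conn_fetch(c)) {
            *len = 0;
            return NULL;
        }
    }
    *len = c->cur_len;
    return c->cur;
}

/* read(): copy up to n bytes into dst from the cached chunk.
   The backend is only touched when the cache is empty. */
static size_t conn_read(conn *c, char *dst, size_t n)
{
    assert(c != NULL && dst != NULL);
    size_t done = 0;
    while (done < n) {
        if (c->cur_len == 0 && !conn_fetch(c))
            break;
        size_t take = n - done < c->cur_len ? n - done : c->cur_len;
        memcpy(dst + done, c->cur, take);
        c->cur += take;
        c->cur_len -= take;
        done += take;
    }
    return done;
}
```

In this sketch a request that arrives in one chunk costs a single backend call no matter how many small read() calls the parser makes. The unsolved part mentioned above, advancing to the next libevent2 chunk correctly, is exactly what conn_fetch glosses over here.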
I also suspect that the read method itself is a problem. I do not think it is important right now, but it takes a ubyte[] argument, which makes a zero-copy approach impossible: I always have to copy the data in this method. That may be a problem for high-speed processing with zero-copy solutions like PFQ, DPDK, or Netmap.
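The difference between the two API shapes can be sketched in plain C. Everything here is hypothetical (rx_buffer, byte_view, rx_read_copy, rx_borrow, rx_consume) and is neither vibe.d's nor libevent's API: rx_read_copy mirrors a read(ubyte[] dst) style call, where the caller owns the destination and a memcpy is unavoidable, while rx_borrow lends out a view into the buffer that already holds the bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Borrowed view of bytes that are still owned by the receive buffer. */
typedef struct {
    const unsigned char *ptr;
    size_t len;
} byte_view;

/* Hypothetical receive buffer, standing in for a NIC ring or an
   event library's input buffer. */
typedef struct {
    unsigned char data[256];
    size_t start;  /* first unread byte */
    size_t end;    /* one past the last valid byte */
} rx_buffer;

/* Copying read, shaped like read(ubyte[] dst): always one memcpy. */
static size_t rx_read_copy(rx_buffer *b, unsigned char *dst, size_t n)
{
    assert(b != NULL && dst != NULL);
    size_t avail = b->end - b->start;
    size_t take = n < avail ? n : avail;
    memcpy(dst, b->data + b->start, take);
    b->start += take;
    return take;
}

/* Zero-copy read: lend the caller a view of the unread bytes. */
static byte_view rx_borrow(const rx_buffer *b)
{
    byte_view v = { b->data + b->start, b->end - b->start };
    return v;
}

/* Mark n borrowed bytes as processed; earlier views are stale afterwards. */
static void rx_consume(rx_buffer *b, size_t n)
{
    assert(n <= b->end - b->start);
    b->start += n;
}
```

The borrow/consume pair trades convenience for lifetime rules: the caller must finish with the view before consuming, which is the usual price of zero-copy interfaces.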
Test results for my version:
wrk -t 4 -d 2s "http://localhost:8081/"
Running 2s test @ http://localhost:8081/
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.10ms    5.24ms  48.73ms   94.95%
    Req/Sec    56.81k    13.56k   78.04k    65.85%
  463299 requests in 2.10s, 78.32MB read
  Socket errors: connect 0, read 717, write 0, timeout 0
  Non-2xx or 3xx responses: 717
Requests/sec: 220691.82
Transfer/sec:     37.31MB
Please note that my version has 717 errors even with small requests, and its average latency is worse, at over 2 ms.
git master:
wrk -t 4 -d 2s "http://localhost:8081/"
Running 2s test @ http://localhost:8081/
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   318.67us    1.60ms  24.78ms   97.33%
    Req/Sec    21.92k     2.33k   30.23k    73.49%
  180981 requests in 2.10s, 30.20MB read
Requests/sec:  86188.69
Transfer/sec:     14.38MB