https://www.techempower.com/benchmarks/previews/round14
It finally seems to at least pass all the tests. Also it uses the new vibe-core beta.
But I would expect a much higher position, at least in the plaintext test.
On Thu, 20 Apr 2017 16:24:59 GMT, Tomáš Chaloupka wrote:
https://www.techempower.com/benchmarks/previews/round14
It finally seems to at least pass all the tests. Also it uses the new vibe-core beta.
But I would expect a much higher position, at least in the plaintext test.
I've tried the plaintext test on my Broadwell notebook with an i5-5300U.

new vibe-core:

[tomas@E7450 wrk]$ ./wrk -c 10 -d 30 -t 4 http://localhost:8080/plaintext
Running 30s test @ http://localhost:8080/plaintext
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.86ms    2.16ms  83.77ms   99.19%
    Req/Sec     2.64k   291.25     3.40k    82.08%
  315459 requests in 30.02s, 50.24MB read
Requests/sec:  10507.38
Transfer/sec:      1.67MB

old:

[tomas@E7450 wrk]$ ./wrk -c 10 -d 30 -t 4 http://localhost:8080/plaintext
Running 30s test @ http://localhost:8080/plaintext
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   430.97us    0.86ms  20.36ms   96.28%
    Req/Sec     6.04k     1.24k    9.09k    68.58%
  721024 requests in 30.03s, 111.39MB read
Requests/sec:  24012.03
Transfer/sec:      3.71MB

Compiled with dmd 2.074.0 with dub -b release.

Compared to, for example, go-std:

[tomas@E7450 wrk]$ ./wrk -c 10 -d 30 -t 4 http://localhost:8080/plaintext
Running 30s test @ http://localhost:8080/plaintext
  4 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   728.27us    2.72ms 107.30ms   96.02%
    Req/Sec     6.62k     1.14k   10.02k    74.67%
  790419 requests in 30.04s, 95.73MB read
Requests/sec:  26315.10
Transfer/sec:      3.19MB

Compiled with go1.7.5.
Am 20.04.2017 um 21:33 schrieb Tomáš Chaloupka:
On Thu, 20 Apr 2017 16:24:59 GMT, Tomáš Chaloupka wrote:
https://www.techempower.com/benchmarks/previews/round14
It finally seems to at least pass all the tests. Also it uses the new vibe-core beta.
But I would expect a much higher position, at least in the plaintext test.
I've tried the plaintext test on my Broadwell notebook with an i5-5300U.
It almost looks like it never scales beyond a single core for some reason. I'll have to start another profiling round to be sure, but it could be related to the switch to std.experimental.allocator. Maybe the GC is now suddenly the bottleneck.

BTW, thanks a lot for fixing the benchmark suite! This is something that I always had in mind as an important issue, but could never find the time for. I'll try to look into the performance issue within the next few days.
On 2017-04-23 13:53, Sönke Ludwig wrote:
It almost looks like it never scales beyond a single core for some
reason.
I've seen similar behavior when I was running performance tests on a small vibe.d application last year. It had the same performance regardless of whether it was running single- or multi-threaded.
BTW, how does vibe.d's multi-threading functionality work? Does it spread the fibers across multiple threads or does it use multiple event loops?
/Jacob Carlborg
Am 23.04.2017 um 16:48 schrieb Jacob Carlborg:
On 2017-04-23 13:53, Sönke Ludwig wrote:
It almost looks like it never scales beyond a single core for some reason.

I've seen similar behavior when I was running performance tests on a small vibe.d application last year. It had the same performance regardless of whether it was running single- or multi-threaded.

BTW, how does vibe.d's multi-threading functionality work? Does it spread the fibers across multiple threads or does it use multiple event loops?
It starts one event loop per thread and lets the OS distribute incoming connections across threads (using SO_REUSEPORT). However, usually the better method is to actually start up one process per CPU core, as that avoids issues like the GC lock bringing everything to a crawl.
On 2017-04-23 13:53, Sönke Ludwig wrote:
It almost looks like it never scales beyond a single core for some reason. I'll have to start another profiling round to be sure, but it could be related to the switch to std.experimental.allocator. Maybe the GC is now suddenly the bottleneck.
I didn't post it, but I also tested with a profile-gc build, and the new vibe-core allocates vastly more than the old one in the same simple plaintext test, so that might well be the case.