Re: My benchmarks - RejectedSoftware Forums

Posted Tue, 11 Feb 2014 13:41:21 GMT in reply to Sönke Ludwig

On Tue, 11 Feb 2014 13:34:09 GMT, Sönke Ludwig wrote:

On Tue, 11 Feb 2014 11:31:43 GMT, Rikki Cattermole wrote:

I've been working towards getting more requests per second.

I'm doing this on a not particularly good machine, so take this with a grain of salt (HP elite 190a). E.g. my hard drives are quite old and worn, with noticeable speed issues. Even things like AV affect its performance, so they are disabled for this test.
To get there, I basically commented out the code at https://github.com/rejectedsoftware/vibe.d/blob/06adc36e73a588f4563d33396a95c3891b2bb373/source/vibe/http/server.d#L1431

You mean the foreach (log; context.loggers)? If you don't have accessLogToConsole or accessLogFile set in the HTTPServerSettings, I don't see how that could possibly have a measurable influence on performance.

I'm currently getting around 5.3k requests per second.

That's still quite low actually, which OS is that on?

The other thing I have done differently is that I have implemented a router that generates its checking code at compile time via delegates (because routes are known at CTFE).
There are around 18 routes currently being served, of which only one (index) is being tested. It outputs basically just a HEAD request.

My idea was rather to optimize the existing runtime router by creating a decision tree from all the routes, so that the selection time goes down from O(n) to O(log(n)), which should be plenty sufficient if the server serves any kind of non-trivial content.
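To make the decision-tree idea concrete, here is a minimal, language-neutral sketch in Python rather than D. It is not vibe.d's router; the names `TreeRouter`, `Node`, `add` and `match` are made up for illustration. Routes are split into path segments and stored in a tree, so a lookup walks one node per segment instead of testing every registered route in turn.

```python
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[Dict[str, str]], str]


def _split(path: str) -> List[str]:
    # "/users/42/" -> ["users", "42"]; "/" -> []
    return [seg for seg in path.split("/") if seg]


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}             # literal segment -> child
        self.param: Optional[Tuple[str, "Node"]] = None   # (name, child) for ":name"
        self.handler: Optional[Handler] = None


class TreeRouter:
    """Hypothetical segment-tree router (illustration only)."""

    def __init__(self) -> None:
        self.root = Node()

    def add(self, pattern: str, handler: Handler) -> None:
        node = self.root
        for seg in _split(pattern):
            if seg.startswith(":"):
                # Simplification: one placeholder per node, first name wins.
                if node.param is None:
                    node.param = (seg[1:], Node())
                node = node.param[1]
            else:
                node = node.children.setdefault(seg, Node())
        node.handler = handler

    def match(self, path: str) -> Optional[Tuple[Handler, Dict[str, str]]]:
        node = self.root
        params: Dict[str, str] = {}
        for seg in _split(path):
            child = node.children.get(seg)       # literal segments take priority
            if child is None and node.param is not None:
                name, child = node.param
                params[name] = seg
            if child is None:
                return None                      # no route: caller can serve a 404
            node = child
        if node.handler is None:
            return None
        return node.handler, params
```

With hashed children the lookup cost depends on the number of path segments rather than on the number of routes. The sketch does no backtracking and ignores HTTP methods and wildcards, which a real router would have to handle.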

Strangely manually specifying a HEAD request with ab is slower by around 100ms.

For routes that don't exist, I'm getting around 5.2k requests/s. And with -i on ab around 4.2k, which is quite a bit different. However, my router handles 404s and so should not be causing an exception.

I think I'll replace that exception in the HTTP server by a manual call to the error page handler. Even if exceptions will be optimized (as they definitely need to be), there is no need to go with an exception there, except for saving a tiny bit of code duplication.
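As a rough illustration of that change (again a Python sketch, not the actual vibe.d server code; `dispatch` and `error_page` are hypothetical names), the not-found path can call the error page handler directly, so no exception is thrown and unwound per missing route:

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, str]  # (status code, body)


def error_page(status: int, message: str) -> Response:
    # Hypothetical error page handler: renders a plain-text error body.
    return status, f"{status} {message}"


def dispatch(routes: Dict[str, Callable[[], Response]], path: str) -> Response:
    handler = routes.get(path)
    if handler is None:
        # Direct call instead of raising and catching an exception.
        return error_page(404, "Not Found")
    return handler()
```

The cost is the small duplication mentioned above: the error page call now appears at the not-found site as well as in whatever generic exception handler remains.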

I rewrote my router specifically to test this out. Beforehand the most I could get was 1.3k. I don't know if it's an accumulation of factors or simply the regex doing the harm.

When I get the time, I'll start another profiler run in VTune. After my last optimization run some months (a year?) ago, almost all of the time was taken by the kernel for I/O related things. Since nothing has changed since then regarding the router, logging or similar, my guess would be that something like a GC allocation or some performance hungry debug assertion has slipped back in somewhere and causes the slowdown*.

I'm a lot happier with this now than I was a few days ago.

What I can say is that I've got up to 80k requests/s on my AMD Phenom-II quad-core on Linux (should be considerably slower than your i7). However, it made a huge difference whether the benchmark app (ab) was run on the same machine or on a different one. The client and server processes can have surprising interactions when they run on the same machine and on the loopback device. There is also weighttp, which has a much lower overhead than ab. The only drawback is that it doesn't output statistics as detailed as ab's.

* it should be said that I've used HTTPServerOption.distribute and -version=VibeManualMemoryManagement for the benchmarks.

What speaks against using -version=VibeManualMemoryManagement as a default anyway? Is this whole option documented somewhere?