I know you asked Sönke, but the question was too interesting and I felt tempted to throw in some of my own perspective, since I've had these same questions myself before.

On 2014-07-22 5:31 AM, zhaopuming wrote:

The problem is we might have 5 billion ~ 10 billion requests per day. And we need latency as short as possible.

That's really huge - I'm expecting something in that order as well. It's safer to make it possible to scale onto multiple servers behind load balancers. I would suggest hash-based distribution across multiple Redis servers using the Redis key hashes, though you should calculate how many of them you'll need up front, so you don't have to keep moving keys around until you find a "load % sweet spot" (they're scaled by hash range).
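
To make this concrete, here's a minimal sketch (plain D, not a vibe.d or Redis API) of mapping a key onto a fixed number of shards; the CRC32 choice and the shardIndex name are just my own example:

    import std.digest.crc : crc32Of;
    import std.bitmanip : littleEndianToNative;

    // Map a key onto one of N shards by hashing it and taking the result
    // modulo the shard count. Fixing the shard count up front is what lets
    // you avoid reshuffling keys later.
    size_t shardIndex(string key, size_t shardCount)
    {
        ubyte[4] digest = crc32Of(key);
        return littleEndianToNative!uint(digest) % shardCount;
    }

    unittest {
        enum shards = 8;
        assert(shardIndex("user:42", shards) < shards);
    }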

  1. Does vibe.d use a lot of classes instead of structs? Does vibe.d allocate much?

The less you allocate through the GC, the less it will need to collect, BUT collections are still very quick - I've benchmarked this extensively and they're not even on the order of a hundred microseconds for a 1-2 GB heap. If you compile with VibeManualMemoryManagement, most allocations are scoped and go through the freelists in vibe.utils.memory, which are stack-like; it's very light and avoids the GC entirely for that part. But you should still use the GC for smaller data like string appenders, because the GC's free lists grow by powers of 2 while the freelists in vibe.utils.memory grow by powers of 10 (which increases memory use). Packets won't cause allocations unless you want them to: they're placed into a circular buffer for you to fetch through the InputStream.read(ubyte[]) interface that TCPConnection implements.
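
As a sketch of that read path (my own example - only listenTCP and InputStream.read are actual vibe.d API; the buffer size and handler name are arbitrary):

    import vibe.core.net : listenTCP, TCPConnection;

    // A fixed buffer is reused for every read, so steady-state packet
    // handling never touches the GC.
    void handleConnection(TCPConnection conn)
    {
        ubyte[4096] buf;
        while (!conn.empty) {
            auto len = cast(size_t) conn.leastSize;
            if (len > buf.length) len = buf.length;
            conn.read(buf[0 .. len]);   // drains the circular buffer into the slice
            // ... parse buf[0 .. len] in place, no per-packet allocation ...
        }
    }

    // registered with something like:
    // listenTCP(12345, (TCPConnection conn) { handleConnection(conn); });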

The libevent engine that handles the connections is a VERY fast library (written in C), and I don't think it allocates.

  2. Does the client support timers and wait/drop?

The timers are carefully optimized, imo: they're kept in expiration-sorted arrays that use malloc with a doubling growth policy, which is the kind of place where compiler optimizations kick in. I haven't seen a connection-timeout feature, but I suppose it would be easy to implement by starting the connections in runTask and closing them from a timer so the pending operation throws.
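
Here's a sketch of that idea (not a built-in vibe.d feature; the 30-second value and the handler are my own example, only setTimer and the close-from-a-timer trick come from the above):

    import vibe.core.core : setTimer;
    import vibe.core.net : TCPConnection;
    import core.time : seconds;

    // Arm a timer when the task starts handling the connection; if it fires,
    // the connection is closed and the pending read throws, unwinding the task.
    void handleWithTimeout(TCPConnection conn)
    {
        auto watchdog = setTimer(30.seconds, { conn.close(); });
        scope (exit) watchdog.stop();   // disarm on normal completion

        ubyte[256] buf;
        conn.read(buf[]);   // throws once the timer has closed the connection
        // ... rest of the handler ...
    }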

  3. How's vibe.d's JSON support in this scenario? Would it hamper the async mode because there is too much CPU work during serialization/deserialization?

There are some allocations in the JSON deserializer. The stack is used as much as possible, though, via recursion, and the CPU usage is amazingly low thanks to compile-time traits, which I think are 10-100x faster than the run-time introspection you'd find in interpreted languages. You won't see any problems there; this is where you'll see D at its most powerful.
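
For reference, the whole round trip is just a couple of calls, with the (de)serializer generated from the struct layout at compile time (the Request struct and the sample values here are only an example shape):

    import vibe.data.json;

    struct Request
    {
        string user;
        int    count;
    }

    void example()
    {
        // deserializeJson/serializeToJson are generated per type at
        // compile time - no run-time reflection involved.
        auto req  = deserializeJson!Request(`{"user":"puming","count":3}`);
        auto json = serializeToJson(req);
        assert(json["user"].get!string == "puming");
    }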

  4. How do we utilize multiple cores in this setup? Multiple processes?

The server automatically scales to all processors: multiple vibe.d worker tasks are started with libevent and listen on the same socket, which means your handlers will run on every available processor. D's druntime takes care of keeping variables thread-local. So multi-core optimizations come for free. However, the bulk of the improvement will come from your own usage of tasks, so don't be afraid to abuse starting new vibe.d tasks if you insist on never seeing any blocking - they're so much faster than threads, with the same benefits.
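
For instance (my own minimal example; runTask and logInfo are the actual API, the rest is arbitrary):

    import vibe.core.core : runTask;
    import vibe.core.log : logInfo;

    void handleRequest(string key)
    {
        // Push work that might block this handler into its own task;
        // fibers are cheap, so this is the normal pattern, not a luxury.
        runTask({
            // e.g. query Redis, push a metric, notify another service...
            logInfo("background work for %s done", key);
        });
    }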

  5. How does vibe.d handle network-related errors? (connection breaks, etc.)

The task throws with the error number (usually this happens in a libevent callback), drops the connection and cleans up automatically. The other connections continue as if nothing happened. Every error case is handled on its way up to your application's code, and you can log it or not depending on the log level you set with setLogLevel.
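
In practice that means a plain try/catch around the handler body is enough if you want the error in your own logs (the handler below is my own example; setLogLevel, LogLevel and logError are the actual vibe.d logging API):

    import vibe.core.log : setLogLevel, logError, LogLevel;
    import vibe.core.net : TCPConnection;

    void safeHandler(TCPConnection conn)
    {
        try {
            ubyte[1024] buf;
            conn.read(buf[]);   // throws if the peer drops the connection
            // ... handle the request ...
        } catch (Exception e) {
            logError("connection failed: %s", e.msg);
        }   // vibe.d closes and cleans up the connection either way
    }

    // e.g. at startup, to silence everything below warnings:
    // setLogLevel(LogLevel.warn);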

Finally, compared to Java's bytecode, my bet is that machine-code and compile-time optimizations have put D in a much better place than even C, so there's definitely a head start. The rest depends, of course, on the quality of your code and your ability to keep the data near the computation with the tools you're given - and D has those tools ;)