On Sat, 05 Jan 2013 04:00:41 GMT, zhaopuming wrote:

On Fri, 04 Jan 2013 17:39:54 GMT, Sönke Ludwig wrote:

One way to achieve this would be to use a timer and a signal. The requests are all run in parallel in different tasks/fibers using runTask, and the original task waits until either the timeout is reached or all requests have been answered. The only drawback is that the late requests will continue to run after the timeout - it shouldn't have any serious impact on anything, though; their connections will time out eventually, or the request just finishes a bit later.

Thanks :-) The timer and signal mechanism is exactly what we were expecting.
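For reference, here is how I picture that timer-plus-signal pattern in code. This is only a sketch based on my reading of the docs - I'm assuming createSignal(), Signal.acquire()/release()/emit(), setTimer() and the requestHttp() callback signature; the actual names/signatures in the current vibe.d may differ:

```d
import vibe.core.core;
import vibe.http.client;
import core.time;

// Fan out N requests, wait at most `timeout`, then move on.
// Late replies keep running in their own tasks and are simply ignored.
void fanOut(string[] urls, Duration timeout)
{
    auto done = createSignal(); // assumed API
    size_t remaining = urls.length;

    foreach (url; urls) {
        runTask({
            scope (exit) { remaining--; done.emit(); }
            requestHttp(url,
                (req) { /* set up the request */ },
                (res) { /* consume the response */ }); // signature assumed
        });
    }

    bool timedOut = false;
    auto timer = setTimer(timeout, { timedOut = true; done.emit(); });

    done.acquire();
    scope (exit) done.release();
    while (remaining > 0 && !timedOut)
        rawYield(); // woken by done.emit() from the tasks or the timer
}
```

Does that roughly match what you had in mind?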

But there is still a question: we have to use keep-alive connections, and even HTTP pipelining, for better performance.
In this scenario, though, the replies that survive past the timeout will affect later requests, because on a keep-alive, pipelined connection a new request won't be sent until the last response has been received (otherwise the ordering of HTTP requests/responses is messed up).

So if a huge number of incoming requests fills up all connections in the connection pool, and some of them time out,
then new requests would have to wait for those timed-out replies to come back, which is unnecessary and wastes their own precious time, because their timers are already running! Is there a way to abort an HttpClient request without closing the current connection and without affecting the connection pool's ability to immediately handle new incoming requests?

One way would be to make the connection pool more elastic to handle this: it could keep a timer on each connection,
and when a connection has timed out, it won't receive new requests for the time being; the request goes to an 'available' connection instead.
If there is no 'available' connection, which means the current connections can't handle the traffic, the pool could create new connections,
or abort the oldest connection (since it has already timed out, the current request on that connection is useless anyway).

I know this seems like a very strange requirement, but our first priority is throughput and quick response; all timed-out requests
should be thrown away.

The connection pool will automatically create another connection to the same server if there are no existing unused connections left. So without terminating them, the timed-out requests will stack up a number of connections, but they won't directly influence later requests. For termination, I'll first have to add a new function to expose TcpConnection.close in some way.

It's planned for later to have more facilities for controlling tasks (e.g. waiting for their end or terminating them) and also to have a generic broadcast class that could be made general enough to handle this situation. I think such a broadcast class is quite important because generally it shouldn't be necessary to work with such low-level details such as rawYield. But I cannot really say when I'm able to get that done...

I don't quite understand emit() and rawYield() yet, I'll try the code later :-)

Now that you say it, there is a bug in that code. The signal also needs to be acquired/released during the while loop, just as the timer.

But basically, the Signal.emit() call will cause every fiber that has currently acquired the signal to continue execution if it is stuck in a rawYield() (which goes down to Fiber.yield()), so this is essentially just a fiber-based condition variable. The same goes for the timer when it fires.
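In other words, the idiom boils down to something like the following sketch (the exact Signal API - createSignal(), acquire(), release(), emit() - is assumed here; the important part is the acquire/rawYield/emit dance):

```d
import vibe.core.core;
import core.time;

void example()
{
    auto sig = createSignal(); // assumed API
    bool ready = false;

    runTask({
        sleep(100.msecs);     // simulate some work
        ready = true;
        sig.emit();           // resumes every fiber that acquired sig
                              // and is currently blocked in rawYield()
    });

    sig.acquire();            // register this fiber with the signal
    scope (exit) sig.release();
    while (!ready)
        rawYield();           // Fiber.yield() under the hood; emit() wakes us
}
```

So a fiber acquires the signal, checks its condition in a loop, and rawYield()s until some other fiber (or a timer) emits.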

Btw. using requestHttp() will keep a connection open to each server automatically using a ConnectionPool internally. So there is no need to explicitly store HttpClient instances.

I have a question: each time requestHttp() is called for the same server, will it create a new ConnectionPool, or does it somehow
reuse the ConnectionPool created the first time requestHttp() was called for that server?

There is basically one static connection pool per thread. Each time a request is made, the pool is first searched for an existing (keep-alive) connection to that server. If one is found, it will be used for the request. Otherwise a new connection is made and put into the pool after the request. So effectively there will always be n active connections for n concurrent requests.