RejectedSoftware Forums


General questions on implementation

Hi!

I have two general questions about the implementation. Maybe someone could give brief answers or some links with explanations?

  1. vibe.d claims to have asynchronous I/O via fibers. As I understand it, the flow looks something like this: while handling a request we reach a point where we need to, e.g., query a record from the DB. We issue an asynchronous request and call something like yield so as not to waste CPU time. Someone wakes us later, we check the status of the request, and proceed if it's ready, or yield again otherwise.

So that raises the question: who is responsible for calling yield? I assume that if I'm not calling it myself, it should be called by the vibe.d functions. So, if I want to read from a file, I shouldn't use readText, but some vibe.d yield-aware alternative. Is that the case?

  2. Suppose I have a class instance with state - e.g. a cached value of something. In each request I check whether the value should be updated, possibly update it, and proceed using this value. How should I make that work in vibe.d? If I create the class instance in shared static this(), then, as I understand it, the instance is shared among all of the threads, so using the cached value without synchronization is a race. Is that so? How can I avoid it?

Re: General questions on implementation

On 2013-12-06 1:06 PM, Sergei Nosov wrote:


1)
ctx.core.yieldForEvent is called automatically in
vibe/core/drivers/libevent2_tcp.d in many places after reading or
writing data to or from buffers.

The fiber is resumed when ctx.core.resumeTask is called after a callback
is triggered from libevent2 here:

bufferevent_setcb(bufevent, &onSocketRead, &onSocketWrite,
&onSocketEvent, client_ctx);

If you need to run routines that take a lot of CPU cycles, you can call
yield() every now and then to make sure other requests aren't starved
while that operation runs.

2)

You should place variables shared between fibers outside the scope of
shared static this().

Fibers don't race with each other; they are run sequentially in an
event loop as they are paused and resumed. Only threads are exposed to
data races, and by default vibe.d runs on one thread only. Regardless,
D's variables are all thread-local by default as well. You can use the
__gshared storage class if you want to share a variable globally; this is
the only way a data race can occur. It's recommended to use D's
synchronized statement to lock a mutex around any code that needs to
access a __gshared object.

Read more here: http://dlang.org/migrate-to-shared.html
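The thread-local-by-default behaviour can be demonstrated with plain D, no vibe.d required (`tlsCounter`, `sharedCounter`, and `bump` are made-up names for illustration):

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

int tlsCounter;               // thread-local by default in D
__gshared int sharedCounter;  // explicitly shared across all threads
__gshared Mutex counterLock;

shared static this()
{
    counterLock = new Mutex;  // runs once, before other threads start
}

void bump()
{
    ++tlsCounter;                // safe: each thread has its own copy
    synchronized (counterLock)   // needed: sharedCounter is __gshared
        ++sharedCounter;
}

void main()
{
    auto t = new Thread({ bump(); });
    t.start();
    t.join();
    bump();
    assert(tlsCounter == 1);     // main thread's copy saw only its own bump
    assert(sharedCounter == 2);  // both threads updated the shared one
}
```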

Re: General questions on implementation

On 2013-12-06 4:34 PM, Etienne Cimon wrote:

If you want to read from a file, you should use

http://vibed.org/api/vibe.core.file/openFile

It takes care of yield() by itself.
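A sketch of what that looks like (assuming vibe.d's vibe.core.file API; the function name and path handling are made up):

```d
import vibe.core.file : openFile, FileMode;

string readWholeFile(string path)
{
    // openFile suspends the calling fiber, not the whole thread.
    auto file = openFile(path, FileMode.read);
    scope (exit) file.close();

    auto buf = new ubyte[cast(size_t) file.size];
    file.read(buf);   // also yields internally while waiting on I/O
    return cast(string) buf;
}
```

vibe.core.file also has convenience functions such as readFileUTF8 that do this in one call.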

Re: General questions on implementation

On Fri, 06 Dec 2013 16:34:17 -0500, Etienne Cimon wrote:

If you need to run routines that take a lot of CPU cycles, you can call
yield() every now and then to make sure other requests aren't left out
while this operation runs.

OK, that's pretty much what I thought.

Only threads are exposed to
data races, but by default, vibe.d runs on one thread only.

I'm not really familiar with web server development, but isn't that sub-optimal? I would expect the "main loop" to have a pool of threads (capacity = number of CPU cores) and a queue of fibers, assigning fibers to free threads as they become available. If we're running on a single thread, we're not utilizing more than one core. Is my understanding correct? If so, what is the proposed way to utilize more cores?

Re: General questions on implementation

On Sat, 07 Dec 2013 04:06:29 GMT, Sergei Nosov wrote:

I'm not really familiar with web server development, but isn't that sub-optimal? I would expect the "main loop" to have a pool of threads (capacity = number of CPU cores) and a queue of fibers, assigning fibers to free threads as they become available. If we're running on a single thread, we're not utilizing more than one core. Is my understanding correct? If so, what is the proposed way to utilize more cores?

As long as everything is I/O bound, which can be considered the common case when DB queries are issued and sizable pages are generated, a single thread is the most efficient choice. An important point is that it greatly simplifies the threading model and doesn't require synchronization primitives to avoid low-level race conditions, which can give it an advantage over multi-threaded operation even when the program is CPU bound. If that is not the case, there are two options for employing threads:

  1. Currently, to enable multi-threaded processing in general, vibe.core.core.enableWorkerThreads needs to be called, which creates a worker thread per CPU core (this won't be necessary anymore in a later version). Then HTTPServerOption.distribute needs to be added to HTTPServerSettings.options. All incoming TCP connections will then be distributed evenly among the worker threads by the OS.

  2. A good alternative is to do complex computations using runWorkerTask (which lets the task run in one of the worker threads) and let all I/O and request processing still happen in the main thread. This can often avoid explicit synchronization while still fully exploiting the CPU's cores, which simplifies the overall program architecture and may give a performance advantage.
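A sketch combining both options (assuming a vibe.d project; the port, handler, and `heavyComputation` are made up for illustration):

```d
import vibe.core.core : enableWorkerThreads, runWorkerTask;
import vibe.http.server;

shared static this()
{
    // Option 1: one worker thread per core; the OS spreads
    // incoming connections across them.
    enableWorkerThreads();

    auto settings = new HTTPServerSettings;
    settings.port = 8080;
    settings.options |= HTTPServerOption.distribute;
    listenHTTP(settings, &handleRequest);
}

void handleRequest(HTTPServerRequest req, HTTPServerResponse res)
{
    // Option 2: keep I/O in this thread, push CPU-heavy work to a worker.
    runWorkerTask(&heavyComputation);
    res.writeBody("working on it\n");
}

void heavyComputation()
{
    // CPU-bound code runs here without stalling the request thread.
}
```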

Re: General questions on implementation

On Sat, 07 Dec 2013 08:33:22 GMT, Sönke Ludwig wrote:

As long as everything is I/O bound, which can be considered a common case when DB queries and sizable pages are generated, a single thread is the most efficient choice.

Thanks, that actually makes a lot of sense.

  2. A good alternative is to do complex computations using runWorkerTask (which lets the task run in one of the worker threads) and let all I/O and request processing still happen in the main thread. This can often avoid explicit synchronization while still fully exploiting the CPU's cores, which simplifies the overall program architecture and may give a performance advantage.

I believe this can be thought of as treating the heavy-CPU parts of the program as just more asynchronous I/O operations. Thank you for the insights!

Re: General questions on implementation

On Sat, 07 Dec 2013 08:51:22 GMT, Sergei Nosov wrote:

I believe this can be thought of as treating the heavy-CPU parts of the program as just more asynchronous I/O operations. Thank you for the insights!

Also, if there is a continuous CPU-heavy task, it may be effective to delegate it to its own thread with hard processor affinity set (I don't know if vibe.d / Phobos provides wrappers for that) - the impact of an effective instruction cache can sometimes be huge. As a general rule of thumb: every time you lock on shared resources in multiple threads, you have an opportunity to improve your program's concurrent architecture ;)

Re: General questions on implementation

On 2013-12-06 21:34:17 +0000, Etienne Cimon said:

If you need to run routines that take a lot of CPU cycles, you can call
yield() every now and then to make sure other requests aren't left out
while this operation runs.

As an aside, I was having trouble with my fibers not being woken back
up. What exactly causes them to wake back up?

-Shammah

Re: General questions on implementation

On 2013-12-09 14:46:02 +0000, Dicebot said:

Also, if there is a continuous CPU-heavy task, it may be effective to delegate it to its own thread with hard processor affinity set (I don't know if vibe.d / Phobos provides wrappers for that) - the impact of an effective instruction cache can sometimes be huge. As a general rule of thumb: every time you lock on shared resources in multiple threads, you have an opportunity to improve your program's concurrent architecture ;)

What do you mean? Is there a good way to avoid locking for shared resources?

-Shammah

Re: General questions on implementation

On 11.12.2013 12:19, Shammah Chancellor wrote:

On 2013-12-06 21:34:17 +0000, Etienne Cimon said:

If you need to run routines that take a lot of CPU cycles, you can
call yield() every now and then to make sure other requests aren't
left out while this operation runs.

As an aside, I was having trouble with my fibers not being woken back
up. What exactly causes them to wake back up?

-Shammah

In the case of yield(), what happens is that the event loop is invoked
once to process any pending events, followed by resuming any
fibers/tasks in the same order in which they called yield.
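The underlying mechanics are visible with plain core.thread.Fiber, which vibe.d tasks are built on: a paused fiber only continues when something explicitly resumes it (a minimal self-contained sketch, not vibe.d's actual scheduler):

```d
import core.thread : Fiber;

void main()
{
    int[] order;
    auto a = new Fiber({ order ~= 1; Fiber.yield(); order ~= 3; });
    auto b = new Fiber({ order ~= 2; Fiber.yield(); order ~= 4; });

    a.call(); b.call();   // each fiber runs until its yield
    a.call(); b.call();   // resumed in the same order they yielded
    assert(order == [1, 2, 3, 4]);
}
```

In vibe.d, the calls to resume a task come from the event loop when the awaited event fires, which is why a task waiting on an event that never arrives stays parked forever.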

For implicit yielding based on I/O, timers or other events, only the
matching event wakes up the task. So if a specific task isn't woken up
it usually simply indicates that an expected event hasn't yet fired -
or, of course, a bug ;)

One reason things may stall is that the thread is continually blocked
by non-asynchronous, blocking operations. Another common reason is that
a task is waiting in vibe.core.concurrency.receive or in
TaskMutex.lock, but the task that was supposed to send a message or
give up the mutex was terminated by an exception.
