RejectedSoftware Forums

thread-safe MemorySessionStore ?

Quoting http://vibed.org/api/vibe.http.session/MemorySessionStore

If the server is running as a single instance (no thread or process clustering), this kind of session store provides the fastest and simplest way to store sessions. In any other case, a persistent session store based on a database is necessary.

Is my understanding correct that the HTTP server does not do any locking when interacting with MemorySessionStore, and that basically the only way to use it with multiple worker threads is to write a similar implementation with a mutex guarding every method call? What is the general policy in vibe.d server code for global mutable entities?
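To make the question concrete, this is roughly the kind of wrapper I have in mind - just a sketch, with illustrative get/set/destroySession methods rather than the actual SessionStore interface:

import core.sync.mutex : Mutex;

// In-memory store where every call goes through one mutex, so multiple worker
// threads can use it safely (leaving aside how the reference itself would be
// shared between threads).
final class LockedMemoryStore
{
    private Mutex m_mutex;
    private string[string][string] m_sessions; // session id -> key/value pairs

    this() { m_mutex = new Mutex; }

    string get(string id, string key)
    {
        synchronized (m_mutex) {
            if (auto sess = id in m_sessions)
                if (auto val = key in *sess)
                    return *val;
            return null;
        }
    }

    void set(string id, string key, string value)
    {
        synchronized (m_mutex) {
            auto sess = (id in m_sessions) ? m_sessions[id] : null;
            sess[key] = value;     // allocates the inner map on first insert
            m_sessions[id] = sess; // store back in case the reference changed
        }
    }

    void destroySession(string id)
    {
        synchronized (m_mutex) { m_sessions.remove(id); }
    }
}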

Re: thread-safe MemorySessionStore ?

On Fri, 28 Jun 2013 21:14:50 GMT, Dicebot wrote:

...

Have asked in a more detailed form via e-mail.

Re: thread-safe MemorySessionStore ?

On 28.06.2013 23:14, Dicebot wrote:

Quoting http://vibed.org/api/vibe.http.session/MemorySessionStore

If the server is running as a single instance (no thread or process clustering), this kind of session store provides the fastest and simplest way to store sessions. In any other case, a persistent session store based on a database is necessary.

Is my understanding correct that the HTTP server does not do any locking when interacting with MemorySessionStore, and that basically the only way to use it with multiple worker threads is to write a similar implementation with a mutex guarding every method call? What is the general policy in vibe.d server code for global mutable entities?

Yes, this is still unprotected from the early times with no threading support - needs to be changed. If everything is to be handled correctly, the HttpServerSettings.sessionStore would also have to be a shared(SessionStore). I'm a bit reluctant to go forward with that on a broad scale, though, as D still needs some work in that area and some tendencies in recent discussions made me a bit nervous that a final solution might be incompatible with the shared system that I envision and try to emulate with the stuff in vibe.core.concurrency.

Re: thread-safe MemorySessionStore ?

On Sun, 30 Jun 2013 20:24:51 +0200, Sönke Ludwig wrote:

Yes, this is still unprotected from the early times with no threading support - needs to be changed. If everything is to be handled correctly, the HttpServerSettings.sessionStore would also have to be a shared(SessionStore). I'm a bit reluctant to go forward with that on a broad scale, though, as D still needs some work in that area and some tendencies in recent discussions made me a bit nervous that a final solution might be incompatible with the shared system that I envision and try to emulate with the stuff in vibe.core.concurrency.

And, as far as I remember, event loops are thread-local too, sharing a single socket with the OS's help?

I don't think that simply marking something with "shared" will do anything positive. On its own it is a no-op at best and a potential source of qualified type mismatch issues. It could be made shared AND the server code updated with internal async locking mechanisms to make things magically work out of the box, but that does not seem like a good approach. It is extremely hard to do such things in a generic form without severely harming concurrent performance - and that is the main reason to use worker threads after all.

The perfect approach, in my opinion, would be to treat worker threads as independent processing nodes, with easier tools for inter-node data sharing in user code, but completely thread-local on the server side. However, to be actually usable that requires configurable load-sharing algorithms for incoming requests, so that requests from the same source always go to the same thread. Does that sound implementable?

Re: thread-safe MemorySessionStore ?

On 01.07.2013 11:44, Dicebot wrote:

On Sun, 30 Jun 2013 20:24:51 +0200, Sönke Ludwig wrote:

Yes, this is still unprotected from the early times with no threading support - needs to be changed. If everything is to be handled correctly, the HttpServerSettings.sessionStore would also have to be a shared(SessionStore). I'm a bit reluctant to go forward with that on a broad scale, though, as D still needs some work in that area and some tendencies in recent discussions made me a bit nervous that a final solution might be incompatible with the shared system that I envision and try to emulate with the stuff in vibe.core.concurrency.

And, as far as I remember, event loops are thread-local too, sharing a single socket with the OS's help?

Exactly.

I don't think that simply marking something with "shared" will do anything positive. On its own it is a no-op at best and a potential source of qualified type mismatch issues. It could be made shared AND the server code updated with internal async locking mechanisms to make things magically work out of the box, but that does not seem like a good approach. It is extremely hard to do such things in a generic form without severely harming concurrent performance - and that is the main reason to use worker threads after all.

Together with the lock() function in vibe.core.concurrency and Isolated!T it becomes a pretty useful tool to get type-safe, race-free shared memory access. lock() will internally cast away shared in a controlled scope, but only if it is safe to do so (e.g. it can be statically proven that no reference to shared memory escapes the scope).

The recent improvements in the compiler regarding unique/"isolated" expressions will help to make this even more useful, although some current limitations have actually made DMD 2.063 more cumbersome to work with (http://d.puremagic.com/issues/show_bug.cgi?id=10012 unfortunately broke some valid constructor calls and now requires using a cast).
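To illustrate the principle in isolation (this is hand-written for this post, not the actual vibe.core.concurrency code, and Counter/increment are made up): take the object's monitor and cast away shared strictly within that scope, so the unshared reference never escapes it.

final class Counter { int value; }

void increment(shared(Counter) c)
{
    synchronized (c) {                   // take the object's monitor
        auto unshared = cast(Counter)c;  // cast away shared - safe only inside this scope
        unshared.value++;
        // the unshared reference must not escape the synchronized block
    }
}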

The perfect approach, in my opinion, would be to treat worker threads as independent processing nodes, with easier tools for inter-node data sharing in user code, but completely thread-local on the server side. However, to be actually usable that requires configurable load-sharing algorithms for incoming requests, so that requests from the same source always go to the same thread. Does that sound implementable?

Worker threads can be considered independent processing nodes (with the exception of the shared SessionStore). But adjusting the distribution of requests among threads will unfortunately not work, as solely the OS decides which incoming connection is handled by which thread (keep-alive connections will stay on the same thread, though).

But I guess having a thread-safe session object will be good enough in most cases. Keep in mind that even on a single thread it will be possible to have high-level race conditions due to multiple "parallel" fibers running, so forcing the same thread per source would not help much there.

Re: thread-safe MemorySessionStore ?

On Wed, 03 Jul 2013 09:49:56 GMT, Sönke Ludwig wrote:

Together with the lock() function in vibe.core.concurrency and Isolated!T it becomes a pretty useful tool to get type-safe, race-free shared memory access. lock() will internally cast away shared in a controlled scope, but only if it is safe to do so (e.g. it can be statically proven that no reference to shared memory escapes the scope).

The recent improvements in the compiler regarding unique/"isolated" expressions will help to make this even more useful, although some current limitations have actually made DMD 2.063 more cumbersome to work with (http://d.puremagic.com/issues/show_bug.cgi?id=10012 unfortunately broke some valid constructor calls and now requires using a cast).

I have no objections against the fact that it will be convenient and type-safe. My point is that any globally synchronized access is an enormous performance killer compared to an "all thread-local" scenario when it comes to high-concurrency models. Making it the default may limit further options. I am not sure the result is worth the change.

Worker threads can be considered independent processing nodes (with the exception of the shared SessionStore). But adjusting the distribution of requests among threads will unfortunately not work, as solely the OS decides which incoming connection is handled by which thread (keep-alive connections will stay on the same thread, though).

Well, keep-alive connections are technically one and the same connection, so no wonder there :) I guess I need to study some docs on UNIX sockets to see if that can be controlled on the OS level. What I am trying to do here is to create simple end-user guidelines regarding concurrency architecture that will scale flawlessly for typical simple web apps while still allowing the developer to use one of the most seductive vibe.d features - great performance with minimal effort.

Unfortunately, my earlier experience with this type of service is with kernel-mode processing, where one has full control over the whole network stack and scheduling - I am still having a lot of difficulty connecting that experience to a user-space environment :(

But I guess having a thread-safe session object will be good enough in most cases. Keep in mind that even on a single thread it will be possible to have high-level race conditions due to multiple "parallel" fibers running, so forcing the same thread per source would not help much there.

I was quite sure that race conditions can never happen between fibers within one thread, as their contexts get switched only at certain predefined places (usually after registering an event for an async I/O operation). Am I wrong?

Re: thread-safe MemorySessionStore ?

On Wed, 03 Jul 2013 11:12:17 GMT, Dicebot wrote:

On Wed, 03 Jul 2013 09:49:56 GMT, Sönke Ludwig wrote:

Together with the lock() function in vibe.core.concurrency and Isolated!T it becomes a pretty useful tool to get type-safe, race-free shared memory access. lock() will internally cast away shared in a controlled scope, but only if it is safe to do so (e.g. it can be statically proven that no reference to shared memory escapes the scope).

The recent improvements in the compiler regarding unique/"isolated" expressions will help to make this even more useful, although some current limitations have actually made DMD 2.063 more cumbersome to work with (http://d.puremagic.com/issues/show_bug.cgi?id=10012 unfortunately broke some valid constructor calls and now requires using a cast).

I have no objections against the fact that it will be convenient and type-safe. My point is that any globally synchronized access is an enormous performance killer compared to an "all thread-local" scenario when it comes to high-concurrency models. Making it the default may limit further options. I am not sure the result is worth the change.

I see. However, at least with current hardware, compared with all the other things that happen during a request this will still be a small fraction of the total CPU time (as it is highly unlikely to ever have actual contention for the MemorySessionStore lock, it should be well below 1000 CPU cycles for a single lock/unlock). An alternative idea could be to use thread-local (or even task-local) session instances internally and move them lazily between threads/tasks as needed. This would then only incur an inter-thread dependency whenever there actually are two requests for the same session in different threads (which is less likely due to keep-alive connections).

But apart from this I also suppose that in many large-scale scenarios sessions will be stored in an external database (such as Redis), which is then possibly distributed among multiple servers. Compared to the I/O overhead there, a simple atomic CAS will always be negligible.

Anyway, the API needs to be fixed somehow. My idea was to make Session a struct instead of a class, so that at least that part of the API is invariant to shared vs. non-shared vs. moving between threads. It also saves one unnecessary allocation per session. For SessionStore I think there is no way around making it shared, since it is in fact shared between threads and that should be documented. Whether it then uses locks internally or some other means to protect the data is an implementation detail.
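Roughly this shape - just a sketch of the direction, with made-up method names rather than a final API:

// The store itself stays shared; its methods are callable on a shared reference
// and protect their data internally (lock, CAS, database - an implementation detail).
interface SessionStore
{
    string get(string sessionID, string key) shared;
    void set(string sessionID, string key, string value) shared;
}

// Session becomes a small value type: just the id plus a reference to the store,
// so no extra allocation per session and no shared vs. non-shared distinction
// in the user-facing API.
struct Session
{
    private shared(SessionStore) m_store;
    private string m_id;

    string opIndex(string key) { return m_store.get(m_id, key); }
    void opIndexAssign(string value, string key) { m_store.set(m_id, key, value); }
}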

Worker threads can be considered independent processing nodes (with the exception of the shared SessionStore). But adjusting the distribution of requests among threads will unfortunately not work, as solely the OS decides which incoming connection is handled by which thread (keep-alive connections will stay on the same thread, though).

Well, keep-alive connections are technically one and the same connection, so no wonder there :) I guess I need to study some docs on UNIX sockets to see if that can be controlled on the OS level. What I am trying to do here is to create simple end-user guidelines regarding concurrency architecture that will scale flawlessly for typical simple web apps while still allowing the developer to use one of the most seductive vibe.d features - great performance with minimal effort.

Unfortunately, my earlier experience with this type of service is with kernel-mode processing, where one has full control over the whole network stack and scheduling - I am still having a lot of difficulty connecting that experience to a user-space environment :(

Yeah, abstractions can sometimes make simple things really obscure... I'm currently having similar fun with WinRT ;)

But I guess having a thread-safe session object will be good enough in most cases. Keep in mind that even on a single thread it will be possible to have high-level race conditions due to multiple "parallel" fibers running, so forcing the same thread per source would not help much there.

I was quite sure that race conditions can never happen between fibers within one thread, as their contexts get switched only at certain predefined places (usually after registering an event for an async I/O operation). Am I wrong?

For low-level (memory/execution-model related) race conditions this is right, but higher-level ones are still possible:

req.session["test"] = (req.session["test"].to!int() + 1).to!string;

If executed concurrently in multiple tasks on the same thread, this could drop some increments if the session object for some reason involves a blocking operation (e.g. because it uses a database as the backing store). But never mind... since this wasn't even the concern...
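Spelled out, with made-up get/set delegates standing in for the session backend, the single statement above is really a read, a computation and a write:

import std.conv : to;

void bumpCounter(string delegate(string) get, void delegate(string, string) set)
{
    auto old = get("test");                   // task A reads "0"
    // if get() blocked on I/O, the fiber yielded here and task B may have run
    // the same code in the meantime, also reading "0"
    auto next = (old.to!int() + 1).to!string; // both tasks compute "1"
    set("test", next);                        // the later write wins - one increment is lost
}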

Re: thread-safe MemorySessionStore ?

On Wed, 03 Jul 2013 14:15:34 GMT, Sönke Ludwig wrote:

I see. However, at least with current hardware, compared with all the other things that happen during a request this will still be a small fraction of the total CPU time (as it is highly unlikely to ever have actual contention for the MemorySessionStore lock, it should be well below 1000 CPU cycles for a single lock/unlock).

You are 100% right if you are speaking about raw throughput on low-to-normal concurrency targets. But I am speaking about c10k and up from there, not interested in anything simpler :P The time for a single locking operation grows linearly with the number of locking operations, in other words, with the simultaneous request count (as each request implies at least one locking operation here). I truly believe that vibe.d is the closest of all competitors to reaching this target on carefully crafted non-synthetic applications, and I don't want to spoil that opportunity.

But apart from this I also suppose that in many large-scale scenarios sessions will be stored in an external database (such as Redis), which is then possibly distributed among multiple servers. Compared to the I/O overhead there, a simple atomic CAS will always be negligible.

Does it really work without some form of local caching? I have no web dev experience, unfortunately, but it sounds like communication with an external database may soon become a weak spot then. Can you provide any more input on this?

...

Speaking with you has inspired one possible architectural proposal - what about splitting SessionStore into two parts, SessionStore and SessionStoreCache, the latter always being thread-local and the former implementing some sort of synchronisation facility for a background task? The SessionStore here could be a shared memory store or an external database - it does not really matter. Does the vibe.d event model provide convenient means to fire such a low-priority sync event for every worker thread?
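Something along these lines, purely illustrative - none of these types exist in vibe.d, the names and methods are made up:

// The authoritative, shared part - could be an in-memory store or a database client.
interface SessionStore
{
    string get(string sessionID, string key) shared;
    void set(string sessionID, string key, string value) shared;
}

// The thread-local part: each worker thread owns one instance, so the hot path
// needs no locking; a background task flushes dirty entries back to the backend.
final class SessionStoreCache
{
    private shared(SessionStore) m_backend;
    private string[string][string] m_cache; // session id -> key/value pairs
    private bool[string] m_dirty;           // session ids with unflushed writes

    this(shared(SessionStore) backend) { m_backend = backend; }

    string get(string id, string key)
    {
        if (auto cached = id in m_cache)
            if (auto hit = key in *cached)
                return *hit;                  // cache hit: thread-local, lock-free
        auto val = m_backend.get(id, key);    // miss: ask the shared backend
        auto sess = (id in m_cache) ? m_cache[id] : null;
        sess[key] = val;
        m_cache[id] = sess;
        return val;
    }

    void set(string id, string key, string value)
    {
        auto sess = (id in m_cache) ? m_cache[id] : null;
        sess[key] = value;
        m_cache[id] = sess;
        m_dirty[id] = true;                   // picked up later by the sync task
    }

    // Intended to be run as the low-priority sync event on each worker thread.
    void flush()
    {
        foreach (id; m_dirty.byKey)
            foreach (key, value; m_cache[id])
                m_backend.set(id, key, value);
        m_dirty = null;
    }
}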

Re: thread-safe MemorySessionStore ?

On Thu, 04 Jul 2013 10:19:07 GMT, Dicebot wrote:

On Wed, 03 Jul 2013 14:15:34 GMT, Sönke Ludwig wrote:

I see. However, at least with current hardware, compared with all the other things that happen during a request this will still be a small fraction of the total CPU time (as it is highly unlikely to ever have actual contention for the MemorySessionStore lock, it should be well below 1000 CPU cycles for a single lock/unlock).

You are 100% right if you are speaking about raw throughput on low-to-normal concurrency targets. But I am speaking about c10k and up from there, not interested in anything simpler :P The time for a single locking operation grows linearly with the number of locking operations, in other words, with the simultaneous request count (as each request implies at least one locking operation here). I truly believe that vibe.d is the closest of all competitors to reaching this target on carefully crafted non-synthetic applications, and I don't want to spoil that opportunity.

But it grows only once contention happens. As long as the lock duration << average request duration, this is not likely. But I agree that it may become an issue once you are on many cores and have really short request times. I think this is a relatively special scenario, but then again that of course doesn't make it less worthwhile to support...

But apart from this I also suppose that in many large-scale scenarios sessions will be stored in an external database (such as Redis), which is then possibly distributed among multiple servers. Compared to the I/O overhead there, a simple atomic CAS will always be negligible.

Does it really work without some form of local caching? I have no web dev experience, unfortunately, but it sounds like communication with an external database may soon become a weak spot then. Can you provide any more input on this?

I don't really know more details, but local caching would compromise (at least any hard) data consistency when multiple database clients are involved, so that would probably not work in many setups. But if you take into consideration that there are actually real sites running on Ruby, that little network overhead is absolutely negligible as long as the database is fast ;)

Speaking with you has inspired one possible architectural proposal - what about splitting SessionStore into two parts, SessionStore and SessionStoreCache, the latter always being thread-local and the former implementing some sort of synchronisation facility for a background task? The SessionStore here could be a shared memory store or an external database - it does not really matter. Does the vibe.d event model provide convenient means to fire such a low-priority sync event for every worker thread?

Funnily, I was writing almost the same thing before finally reading your second paragraph and then dumping it ;)

The SessionStoreCache could then also offer different consistency models to optimize for speed when the data model allows it. Sounds like a good way to go no matter what the outcome of the distribution strategy investigation will be.

Re: thread-safe MemorySessionStore ?

On Thu, 04 Jul 2013 11:41:35 GMT, Sönke Ludwig wrote:

I don't really know more details, but local caching would compromise (at least any hard) data consistency when multiple database clients are involved, so that would probably not work in many setups. But if you take into consideration that there are actually real sites running on Ruby, that little network overhead is absolutely negligible as long as the database is fast ;)

I had the impression that the recent web trend is that data consistency is not a primary goal unless it is some kind of banking or similar finance-related service. Performance and high availability are generally considered more important. Well, at least that is what you see reading reddit ;)

Funnily, I was writing almost the same thing before finally reading your second paragraph and then dumping it ;)

The SessionStoreCache could then also offer different consistency models to optimize for speed when the data model allows it. Sounds like a good way to go no matter what the outcome of the distribution strategy investigation will be.

Well, unless someone else appears and says this idea sucks, I think this can be approved as a nice way to go, despite being slightly more complicated than straightforward sharing.