Thread-specific data becomes easier in .NET 4.0 via ThreadLocal<T>
Occasionally, when dealing with multithreaded code, there are uses for data that is kept unique to the currently running thread. This is accomplished by using thread-local storage. Typically, in .NET 3.5, this was handled via the [ThreadStatic] attribute; however, this puts certain restrictions on usage. The main problem with ThreadStatic is that it is exactly that: static data stored per thread. This can be problematic, especially if you’re using ThreadPool threads, as the data is never released cleanly.
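As a minimal sketch of the [ThreadStatic] approach described above (the class and field names are hypothetical, chosen only for illustration), note that the field has to be static, and that each thread sees its own copy, starting from the default value:

```csharp
using System;
using System.Threading;

static class ThreadStaticDemo
{
    // Each thread gets its own copy of this static field.
    [ThreadStatic]
    private static int perThreadValue;

    public static int[] Run()
    {
        // Set the copy that belongs to the calling thread.
        perThreadValue = 42;

        // A second thread reads its own copy, which was never assigned.
        int seenByOtherThread = -1;
        var other = new Thread(() => seenByOtherThread = perThreadValue);
        other.Start();
        other.Join();

        return new[] { perThreadValue, seenByOtherThread };
    }

    static void Main()
    {
        int[] values = Run();
        Console.WriteLine("{0}, {1}", values[0], values[1]); // prints "42, 0"
    }
}
```

Nothing in this sketch ever clears the value; it stays attached to the thread for as long as the thread lives.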
Currently, if you want to use data kept at a local scope with a copy per thread, the solution is to use a LocalDataStoreSlot to manage your threaded data. This works, but is not very developer-friendly. It is not type-safe; all data is stored as a System.Object. It is clunky at best to use, especially if you want to free the data, as you have to use and track strings for named data slots in order to release the memory. Version 4 of the .NET Framework fixes all of these issues by introducing ThreadLocal<T>.
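To make that clunkiness concrete, here is a rough sketch of the named-slot approach. The slot name "ExampleSlot" and the SlotDemo class are assumptions made up for this illustration:

```csharp
using System;
using System.Threading;

static class SlotDemo
{
    // Hypothetical slot name; the same string is needed later to free the slot.
    private const string SlotName = "ExampleSlot";

    public static int RoundTrip(int value)
    {
        // Looks up the named slot, allocating it if it does not exist yet.
        LocalDataStoreSlot slot = Thread.GetNamedDataSlot(SlotName);
        try
        {
            // Data is stored as System.Object, so the int is boxed here...
            Thread.SetData(slot, value);

            // ...and has to be cast back when it is read.
            return (int)Thread.GetData(slot);
        }
        finally
        {
            // Releasing the slot means tracking the string name.
            Thread.FreeNamedDataSlot(SlotName);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(42)); // prints "42"
    }
}
```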
The new ThreadLocal<T> class provides us with a strongly typed, locally scoped object we can use to set up data that is kept separate for each thread. This allows us to use data stored per thread, without having to introduce static variables into our types. Internally, the ThreadLocal instance will automatically set up the static data, manage its lifetime, and do all of the casting to and from our specific type. This makes developing much simpler.
Here’s a simple but complete example showing how this is used:
    namespace CSharpConsoleApplication
    {
        using System;
        using System.Threading;
        using System.Threading.Tasks;

        class Program
        {
            static void Main(string[] args)
            {
                // Create our ThreadLocal<T>, with a Func<T> used to initialize
                using (ThreadLocal<int> dataPerThread = new ThreadLocal<int>(() => Thread.CurrentThread.ManagedThreadId))
                {
                    // Loop a few times, to see the output
                    Parallel.For(0, 10, (i) =>
                    {
                        Console.WriteLine("dataPerThread {0}/{1} for iteration {2}", dataPerThread.IsValueCreated, dataPerThread.Value, i);
                    });
                }

                // Wait for a keypress to exit
                Console.WriteLine("Press any key to exit...");
                Console.ReadKey();
            }
        }
    }
When we run, we’ll get results similar to the following:
    dataPerThread False/9 for iteration 0
    dataPerThread True/9 for iteration 2
    dataPerThread False/10 for iteration 1
    dataPerThread True/10 for iteration 5
    dataPerThread True/10 for iteration 6
    dataPerThread True/10 for iteration 7
    dataPerThread True/10 for iteration 8
    dataPerThread True/10 for iteration 9
    dataPerThread False/12 for iteration 4
    dataPerThread True/9 for iteration 3
    Press any key to exit...
Your results will obviously be different, as the order is somewhat random. (We are multi-threading here!)
Here are a few things to note about this sample and these results:
First, when we create the ThreadLocal<int> instance, we have the option of providing a Func<T> to the ThreadLocal<T> constructor. This allows us to provide an initialization routine which will be used the first time a given thread accesses the Value property. This greatly eases our construction burden, since we can just wrap everything in a delegate (potentially using closures around our class variables) to have complex initialization of our type without having to “think about” it during our work.
Second, when we first access our dataPerThread variable, it is uninitialized. If we need to do complex initialization, we can check the ThreadLocal<T>.IsValueCreated property. This will tell us whether or not a specific value has been created for the currently running thread. When we run the sample, you can see this: the first time a specific thread (identified by the number after the slash on each line, here 9, 10, or 12) accesses dataPerThread, IsValueCreated returns false. When we then read the value, our Func<T> will execute, and set the data to the current managed thread ID. The subsequent times we try to read the data, it’s already been created.
Third, ThreadLocal<T> implements IDisposable. This provides a much simpler way of handling cleanup than trying to deal with static variables per thread.
So the final question I’ll address here: Why is this useful, and where would I want to use this class? In my case, I already have one very good use case. I will be using ThreadLocal<T> in order to handle a thread-specific cache.
We have some algorithms which are parallelizable. As we run from one case to the next, there is a chance we can save quite a bit of computation by caching and reusing certain steps, memoizing the results. However, since we’re threading the algorithm, sharing this cache amongst the threads is problematic: it requires adding locking within the algorithm in order to have a thread-safe cache. The locking nearly wipes out the speed gains we achieve via the caching.
By using ThreadLocal<T> to store our cache information, each thread can have its own copy of the cache, created once per run (local scope). This provides us with completely thread-safe, lock-free memoization, at the cost of two threads potentially recalculating the same results. Profiling our algorithm proved that this provided a dramatic speed improvement.
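The following is only a rough sketch of that idea, not the actual algorithm described above. ExpensiveStep is a hypothetical stand-in for the real computation, and the cache is assumed to be a plain Dictionary keyed by the input:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ThreadLocalCacheDemo
{
    // Hypothetical stand-in for an expensive, side-effect-free computation.
    static long ExpensiveStep(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += (long)i * i;
        }
        return sum;
    }

    public static long[] Run(int[] inputs)
    {
        var results = new long[inputs.Length];

        // Each thread lazily gets its own private Dictionary from the factory delegate.
        using (var cache = new ThreadLocal<Dictionary<int, long>>(() => new Dictionary<int, long>()))
        {
            Parallel.For(0, inputs.Length, i =>
            {
                // Only the current thread ever touches this dictionary, so no lock is needed.
                Dictionary<int, long> local = cache.Value;

                long value;
                if (!local.TryGetValue(inputs[i], out value))
                {
                    value = ExpensiveStep(inputs[i]);
                    local[inputs[i]] = value;
                }

                // Every iteration writes to its own index of the result array.
                results[i] = value;
            });
        } // The cache is scoped to this run and disposed here.

        return results;
    }

    static void Main()
    {
        long[] results = Run(new[] { 3, 4, 3, 4, 3 });
        Console.WriteLine(string.Join(", ", results)); // prints "14, 30, 14, 30, 14"
    }
}
```

Two threads may each compute ExpensiveStep for the same input, which is the recalculation cost mentioned above; in exchange, the loop body contains no locking.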
Howdy, Reed – I stumbled across this while trying to find the answer to a question I just posted on SO: http://stackoverflow.com/questions/2202735/what-are-the-advantages-of-thread-local-storage. I’m digesting your example at the end of this article & suspect it may answer my question. Might you have a pithy, SO-friendly answer that explains the use case for instance-level thread-local storage? 🙂
Jeff,
I responded to your SO post with details. Hopefully that will help you a bit.
If you’re still having problems, or have more questions, let me know.
-Reed
Thanks, Reed, for the post. MSDN wasn’t helping me grasp ThreadLocal. Your post surely did.
How do I use it if I want to access dataPerThread.Value in some other function?
I mean some other function called from this method but executing under the same thread context.
Gurmeet,
This can be accessed like any other variable. The method in question will need to have some way to access the variable – whether it’s a ThreadLocal or a normal variable. If you pass a reference to the ThreadLocal to the method, you can use it there directly. Alternatively, you could wrap this in some other class to provide cleaner access, if required.
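A small sketch of the "pass a reference" option, where ValueMatchesCurrentThread is a hypothetical helper name used only for this illustration:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class PassingDemo
{
    // Hypothetical helper: it receives the ThreadLocal<int> itself and reads
    // the value that belongs to whichever thread is calling it.
    static bool ValueMatchesCurrentThread(ThreadLocal<int> data)
    {
        return data.Value == Thread.CurrentThread.ManagedThreadId;
    }

    public static bool Run()
    {
        int mismatches = 0;

        using (var dataPerThread = new ThreadLocal<int>(() => Thread.CurrentThread.ManagedThreadId))
        {
            Parallel.For(0, 10, i =>
            {
                // The helper runs on the same thread as this iteration,
                // so it sees that thread's value.
                if (!ValueMatchesCurrentThread(dataPerThread))
                {
                    Interlocked.Increment(ref mismatches);
                }
            });
        }

        return mismatches == 0;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "True"
    }
}
```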
-Reed
Excellent article. ThreadLocal is incredibly powerful – it allows you to convert any class that is not thread safe into one that is. Each class runs in its own “sandbox”; the universe of variables that it sees is based on the ThreadID.
I actually manually implemented the effect of ThreadLocal a week ago – it’s a nice feeling knowing I can use ThreadLocal to replace all of my prior code.
To implement the effect of ThreadLocal yourself, create a wrapper over a non-thread-safe class which uses a dictionary to create a new class instance for each ThreadID (the key for the dictionary is the ThreadID). This means that if one thread is in the middle of doing something, another thread cannot come along and corrupt any class variables (by calling .ResetCounter(), for example).
Your article also beats the MSDN one for understandability by an order of magnitude.
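The dictionary-per-thread wrapper described in the comment above could be sketched roughly as follows. Counter and PerThread<T> are hypothetical names, and a ConcurrentDictionary is assumed here because the dictionary itself is shared between threads:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical non-thread-safe class standing in for the one being wrapped.
class Counter
{
    private int count;
    public int Increment() { return ++count; }
    public void ResetCounter() { count = 0; }
}

// Hypothetical wrapper: keeps one instance of T per managed thread ID.
class PerThread<T> where T : new()
{
    private readonly ConcurrentDictionary<int, T> instances =
        new ConcurrentDictionary<int, T>();

    public T Current
    {
        get
        {
            // Create the instance the first time a given thread asks for it.
            return instances.GetOrAdd(Thread.CurrentThread.ManagedThreadId, id => new T());
        }
    }
}

static class PerThreadDemo
{
    public static int Run()
    {
        var counters = new PerThread<Counter>();
        counters.Current.Increment();
        counters.Current.Increment();

        // Another thread resets its own counter, not the calling thread's.
        var other = new Thread(() => counters.Current.ResetCounter());
        other.Start();
        other.Join();

        return counters.Current.Increment();
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "3"
    }
}
```

This sketch never removes entries from the dictionary, so instances outlive the threads that created them. That lifetime management is part of what ThreadLocal<T> takes care of through disposal.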
I believe that ThreadLocal still has the problem that data might leak when used within the context of a ThreadPool (contrary to what your first paragraph seems to hint at). See https://plumbr.io/blog/locked-threads/how-to-shoot-yourself-in-foot-with-threadlocals . Or am I missing something?
Dejan,
The key to ThreadLocal here is that it’s IDisposable, so it uses the standard .NET disposal patterns to manage the lifetime of the created data, which in turn makes it “safe” (if used properly) to use, even in the thread pool. The article you are linking to is referring to the JVM version, which doesn’t have the disposal option (which in turn means there is no deterministic cleanup available).
*duh*, sorry about that article being JVM. I see that the .NET ThreadLocal implements IDisposable. But I suspect that when used in the context of a ThreadPool, it’ll still leak data because there’s no certainty about *when* the thread is being disposed, right? At least https://stackoverflow.com/questions/561518/is-thread-local-storage-persisted-between-backgroundworker-invocations and https://stackoverflow.com/questions/6944938/does-net-threadpool-thread-get-reset-when-it-goes-back-to-the-pool seem to indicate so.
If you look at the usage, both here and in the article, it’s disposed within the threadpool call – which makes it “safe”. If you don’t do that, then it’s going to have similar issues.