System.WeakReference internals and side-effects
After discussing the uses of WeakReference, I decided to dig into the internal implementation in .NET a bit more. I thought I’d write a short bit about how the WeakReference class works internally, and some side effects of using this vs. a standard strong reference.
First off, I’m going to mostly ignore long weak references. In terms of usage, they behave very similarly to short weak references. (The difference is whether the object is still reachable through the reference after it has been finalized, but before its memory is actually reclaimed.) I have yet to find a case where they are necessary, and nearly everything I’ve read suggests avoiding them completely.
Secondly, I’d like to credit Jeffrey Richter’s excellent article on memory management for many of the details of how the CLR handles weak references internally.
That being said, when you create a short weak reference, a few things happen. First, a new object (the WeakReference instance) is constructed with your object. The WeakReference instance internally stores an IntPtr to a GCHandle, which is allocated with GCHandleType.Weak (or GCHandleType.WeakTrackResurrection for long weak references). The WeakReference itself keeps no strong reference to your object; the weak handle is its only link to it.
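To make the handle allocation concrete, here is a minimal sketch that uses the public GCHandle API directly. The WeakHandleDemo class and its method names are hypothetical and exist only for illustration; the real WeakReference goes through internal runtime calls that I am not reproducing here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not how the BCL is written.
public static class WeakHandleDemo
{
    // Allocates a weak GC handle for 'target' and returns it as a raw IntPtr,
    // the same kind of value the text says WeakReference stores internally.
    public static IntPtr AllocateWeak(object target, bool trackResurrection)
    {
        GCHandleType type = trackResurrection
            ? GCHandleType.WeakTrackResurrection   // long weak reference
            : GCHandleType.Weak;                   // short weak reference
        return GCHandle.ToIntPtr(GCHandle.Alloc(target, type));
    }

    // Reads the object currently stored in the handle's slot
    // (null once the target has been collected).
    public static object Read(IntPtr handle)
    {
        return GCHandle.FromIntPtr(handle).Target;
    }

    // Frees the slot; the handle must not be used after this call.
    public static void Release(IntPtr handle)
    {
        GCHandle.FromIntPtr(handle).Free();
    }

    public static void Main()
    {
        object data = new object();
        IntPtr handle = AllocateWeak(data, false);

        // 'data' is still strongly held (see KeepAlive below), so the weak
        // handle resolves to the very same object.
        Console.WriteLine(ReferenceEquals(Read(handle), data)); // True

        GC.KeepAlive(data);
        Release(handle);
    }
}
```

Note that the handle does nothing to keep data alive; only the local variable does. Unlike a WeakReference, a raw GCHandle has no finalizer, so forgetting to call Release leaks the table slot.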
This is where the magic happens…
The CLR takes this GCHandle and stores it in an internal table of weak references, a list of handles the runtime maintains separately from your object graph. When a garbage collection happens, the GC builds a full graph of the objects reachable from your application’s roots. Prior to doing any cleanup, it scans the weak reference table and sets to null any entry that points to an object outside of that graph. However, your WeakReference instance still points to this same slot in the weak reference table.
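The effect of that table scan can be observed from user code. The following is a rough sketch, assuming a forced collection via GC.Collect; the WeakCollectionDemo class and the CreateUnrooted helper are names I made up for the example. When the unrooted object actually disappears is up to the runtime (debug builds in particular tend to keep things alive longer), so I deliberately print the results instead of promising them.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class for illustration only.
public static class WeakCollectionDemo
{
    // Created in a separate, non-inlined method so that no local variable
    // in the caller keeps the target strongly referenced.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnrooted()
    {
        return new WeakReference(new byte[1024]);
    }

    public static void Main()
    {
        byte[] rooted = new byte[1024];
        WeakReference weakToRooted = new WeakReference(rooted);
        WeakReference weakToUnrooted = CreateUnrooted();

        // Force a collection; this is for demonstration, not production code.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // The rooted target must survive. The unrooted one is usually gone
        // by now, but the exact timing is decided by the runtime.
        Console.WriteLine("rooted alive:   " + weakToRooted.IsAlive);
        Console.WriteLine("unrooted alive: " + weakToUnrooted.IsAlive);

        // Keeps 'rooted' reachable until this point.
        GC.KeepAlive(rooted);
    }
}
```

Both WeakReference instances stay perfectly valid objects after the collection; only the table slot behind the unrooted one is expected to have been nulled.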
When you use a WeakReference, a few things happen. Upon accessing WeakReference.Target, the WeakReference class first checks whether its IntPtr handle is IntPtr.Zero, which is the case once the WeakReference itself has been finalized. If it’s not, it asks the runtime to retrieve the “real” object stored in the table slot for the WeakReference’s GCHandle and assigns it to a System.Object reference. If the object has been collected, that slot was set to null during the collection, so Target returns null. This is why you must always check for null: the entire point of a WeakReference is to allow the GC to clean up your object if it wants to, and you don’t know when or if that’s happened. If the object is still alive, the returned System.Object is a strong reference, which now prevents it from being collected. For as long as you hold on to that returned reference, the object stays rooted and will not be collected.
Once the WeakReference instance itself is no longer referenced, it gets finalized, at which point its table entry is freed and becomes available for use by another weak reference in the future.
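In practice, this means reading Target into a local variable exactly once and testing that local for null. Here is a small sketch of the pattern; the CachedReport class, its BuildCount counter, and the report contents are all invented for the example.

```csharp
using System;

// Hypothetical cache for illustration only.
public sealed class CachedReport
{
    // Starts out pointing at nothing; Target is null until the first build.
    private readonly WeakReference _cache = new WeakReference(null);

    // Counts how often the expensive build actually ran.
    public int BuildCount { get; private set; }

    public string GetReport()
    {
        // Copy Target into a local first. The local is a strong reference,
        // so the object cannot be collected between the null check and use.
        string report = _cache.Target as string;
        if (report == null)
        {
            // Never built, or collected since the last call: rebuild it.
            report = BuildReport();
            _cache.Target = report;
        }
        return report;
    }

    private string BuildReport()
    {
        BuildCount++;
        return "report built at tick " + DateTime.UtcNow.Ticks;
    }
}

public static class CachedReportDemo
{
    public static void Main()
    {
        CachedReport cache = new CachedReport();
        string first = cache.GetReport();
        string second = cache.GetReport();

        // 'first' is still held here, so the second call reuses the same object.
        Console.WriteLine(ReferenceEquals(first, second));
        Console.WriteLine(cache.BuildCount);
    }
}
```

The tempting alternative, checking IsAlive and then reading Target, is racy: a collection can happen between the two calls, so Target can still come back null right after IsAlive returned true.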
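The whole lifecycle described so far (allocate a weak handle, look it up on access, free it on finalization) can be approximated in a few lines. This is only a sketch built on the public GCHandle API, under the assumption that the behavior above is accurate; SketchWeakReference is a made-up type and is not the BCL source.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical re-creation for illustration; not the BCL implementation.
public sealed class SketchWeakReference
{
    private IntPtr _handle;                    // slot in the runtime's handle table
    private readonly bool _trackResurrection;  // short (false) or long (true)

    public SketchWeakReference(object target, bool trackResurrection)
    {
        _trackResurrection = trackResurrection;
        GCHandleType type = trackResurrection
            ? GCHandleType.WeakTrackResurrection
            : GCHandleType.Weak;
        _handle = GCHandle.ToIntPtr(GCHandle.Alloc(target, type));
    }

    public bool TrackResurrection
    {
        get { return _trackResurrection; }
    }

    public object Target
    {
        get
        {
            // First check: has this wrapper already released its handle?
            IntPtr h = _handle;
            if (h == IntPtr.Zero) return null;

            // Second check is the caller's: the slot holds null once the
            // target has been collected.
            object target = GCHandle.FromIntPtr(h).Target;
            GC.KeepAlive(this); // keep the wrapper (and its handle) alive during the read
            return target;
        }
    }

    // Runs when the wrapper itself is unreachable: give the table slot back.
    ~SketchWeakReference()
    {
        IntPtr h = Interlocked.Exchange(ref _handle, IntPtr.Zero);
        if (h != IntPtr.Zero) GCHandle.FromIntPtr(h).Free();
    }
}

public static class SketchWeakReferenceDemo
{
    public static void Main()
    {
        object data = new object();
        SketchWeakReference weak = new SketchWeakReference(data, false);

        // 'data' is strongly held until KeepAlive, so the lookup succeeds.
        Console.WriteLine(ReferenceEquals(weak.Target, data)); // True
        GC.KeepAlive(data);
    }
}
```

The sketch makes no attempt at full thread safety, and it skips the Target setter and IsAlive entirely; it is only meant to show where the IntPtr, the boolean, and the finalizer fit together.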
Now, the important part: what does this mean in terms of using WeakReferences in our applications? Any time we use a WeakReference instead of a standard, strong reference, a few things will happen. First, we have the extra overhead of allocating a separate object (the WeakReference), which contains an IntPtr and a boolean (to track whether it’s a short or long reference). Second, any time we access the WeakReference.Target property, we’re effectively doing two null checks plus a method call on a GCHandle to retrieve our “real” object. Third, we have to add extra logic on our end to test whether the WeakReference is pointing to an object that’s been collected. Finally, the GC has one more handle to check each time it runs a collection.
There is definitely overhead involved in this, but surprisingly, it’s fairly minor: less than many property accessors used in common patterns. This is great news, especially considering how common the usage of WeakReference is becoming… Although you may not realize it, WeakReferences are used frequently throughout the .NET Framework libraries, especially in Windows Presentation Foundation, where they are a core element of data binding.
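If you want a rough feel for the access cost on your own machine, a crude timing loop along these lines is one way to look at it. The WeakAccessTiming class is hypothetical, and this is a sketch, not a proper benchmark: the JIT can optimize the strong-reference loop very aggressively, and the numbers vary with runtime version and build configuration, so treat the output as an order-of-magnitude hint at best.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration only.
public static class WeakAccessTiming
{
    // Times 'iterations' reads through an ordinary strong reference.
    public static long TimeStrong(object target, int iterations)
    {
        int hits = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            object o = target;
            if (o != null) hits++;
        }
        watch.Stop();
        GC.KeepAlive(hits); // use the counter so the loop has an observable result
        return watch.ElapsedTicks;
    }

    // Times 'iterations' reads through WeakReference.Target.
    public static long TimeWeak(WeakReference weak, int iterations)
    {
        int hits = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            object o = weak.Target;
            if (o != null) hits++;
        }
        watch.Stop();
        GC.KeepAlive(hits);
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        const int iterations = 10000000;
        object target = new object();
        WeakReference weak = new WeakReference(target);

        // Warm-up pass so JIT compilation is not part of the measurement.
        TimeStrong(target, 1000);
        TimeWeak(weak, 1000);

        Console.WriteLine("strong: " + TimeStrong(target, iterations) + " ticks");
        Console.WriteLine("weak:   " + TimeWeak(weak, iterations) + " ticks");

        GC.KeepAlive(target);
    }
}
```

This only measures the Target lookup. The allocation of the WeakReference itself and the extra handle the GC has to scan are not captured by a loop like this.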