I write alot of very heavily multi-threaded code. In fact, nearly all the code that
I write ends up being called in a multi-threaded environment. As a result of this,
I try very hard to pick the proper algorithms and techniques to minimize the amount
of locking that I need to do. One of the classes introduced in .Net 2.0 is the ReadOnlyCollection.
This sounds ideal, as much of the locking code that I end up having to write looks
like:
lock (o)
{
foreach (string s in userNames)
{
// Do Something here
}
}
The code above is pretty bad, as it means my lock is held during the entire iteration
over the List. If this is something that's happening frequently, or my List has
many items in it, then I could end up paying a fairly large penalty. A Monitor just
isn't really the right locking pattern for this type of code, and if we run on a
16 core machine it really stinks to profile the code and see 15 cores all blocked
waiting for the Monitor.
Most of the time a
ReaderWriterLock is the best locking pattern for code that I write. I tend
to have 1 thread doing sporadic updates, and many threads operating on the contents
of the List. That code tends to look like:
_rwlock.AcquireReaderLock(Timeout.Infinite);
try
{
foreach (string s in userNames)
{
// Do Something here
}
}
finally
{
_rwlock.ReleaseReaderLock();
}
Now, for those of you paying attention, you already know that the ReaderWriterLock
in .Net stinks. This has been reported all over the place, by the very guys who
work on it.
Joe Duffy
has said, "It was no secret that the current ReaderWriterLock
type was such a pig, costing somewhere around 6X the cost of a monitor acquisition
for uncontended write lock acquires, that most people avoided it entirely."
Jeff Richter
isn't much of a fan either, having said, "While it is nice that a class like this exists, there are several problems with its implementation
and I recommend you do not use it."
Vance Morrison
has also chimed in with, "The current .NET Runtime implementation
of ReaderWriterLock is about 8 times slower than System.Monitor (normal locks),
in the common case where there is no lock contention."
Clearly, the ReaderWriterLock isn't the great answer we all hoped it would be. Monitors
work very well in some cases, but as the number of processing cores increase and
we're doing an ever increasing number of tasks in parallel, the Monitor becomes
less and less desirable.
To avoid this problem, we take several approaches in our codebase. Often we grab
a Monitor, clone an object graph, release the monitor, and the perform operations
on the cloned graph. The lets up avoid a number of locks at the (usually small)
price of increased memory and cpu usage.
In order to avoid even this penalty, we switched over a number of our methods to
no longer return an
IList interface, but to instead return a
ReadOnlyCollection. Based on name alone, it's obvious that the contents
of the collection won't change, and that there's no need to grab a Reader Lock or
a Monitor prior to doing an iteration over the collection. This makes it safe to
write code that looks like:
ReadOnlyCollection<string> items = DoSomething();
foreach (string s in items)
{
// Look Mom, no locks!
}
When I checked the documentation for ReaderWriteLock, I saw: An instance of
the ReadOnlyCollection generic class is always read-only. This is ideal.
It's perfect. It's exactly what I've been looking for.
Then one day I saw this:
System.InvalidOperationException: Collection was modified;
enumeration operation may not execute.
at System.ThrowHelper.ThrowInvalidOperationException
(ExceptionResource resource)
at System.Collections.Generic.List`1.Enumerator.MoveNext()
at ReadonlyCollectionCode.Form1.Iterate(Object o)
It turns out that the ReadOnlyCollection wasn't readonly after all! Upon some inspection
with reflector, the ReadOnlyCollection is only a thin wrapper around an underlying
generic IList, and any changes to that list are reflected in the ReadOnlyCollection.
This is not at all obvious at first glance, as the IList itself is what creates
the ReadOnly Collection via:
MyReadOnlyCollection = MyItems.AsReadOnly();
After seeing this, I read-read the documentation and noticed a line I missed the
first time around: A collection that is read-only is simply a collection with
a wrapper that prevents modifying the collection; therefore, if changes are made
to the underlying collection, the read-only collection reflects those changes.
All of a sudden, the ReadOnlyCollection has lost it's luster.
The following code demonstrates the problem. In this code I kick off a method that
continually iterates over the ReadOnlyCollection. If an Exception happens, that
exception is printed out and the method exits. Meanwhile, the foreground thread
adds some data into the original collection.
private List<string> MyItems = new List<string>();
private ReadOnlyCollection<string> MyReadOnlyCollection;
private void Iterate(object o)
{
while (true)
{
try
{
foreach (string s in MyReadOnlyCollection)
{
// don't bother to do anything
Thread.Sleep(1);
}
}
catch (Exception ex)
{
Debug.WriteLine(ex.ToString());
return;
}
}
}
private void button3_Click(object sender, EventArgs e)
{
MyReadOnlyCollection = MyItems.AsReadOnly();
ThreadPool.QueueUserWorkItem(Iterate, null);
// Give the Iterate method a moment to get started
Thread.Sleep(1000);
for (int i = 0; i < 1000; i++)
{
MyItems.Add(i.ToString());
}
}
While the behavior of ReadOnlyCollection isn't exactly breaking news, hopefully it
serves to highlight some of the perils of assumptions, failure to properly read
documentation, and the general problems associated with threaded code. I hope anyone
out there who writed threaded code either already knows about this behavor, doesn't
make the same assumptions that I've made, or reads this blog!
Technorati Tags: Microsoft MVP,
.Net
My Technorati Profile