Parallelism in .NET – Part 10, Cancellation in PLINQ and the Parallel class

Many routines are parallelized because they are long running processes.  When writing an algorithm that will run for a long period of time, its typically a good practice to allow that routine to be cancelled.  I previously discussed terminating a parallel loop from within, but have not demonstrated how a routine can be cancelled from the caller’s perspective.  Cancellation in PLINQ and the Task Parallel Library is handled through a new, unified cooperative cancellation model introduced with .NET 4.0.

Read more

Parallelism in .NET – Part 9, Configuration in PLINQ and TPL

Parallel LINQ and the Task Parallel Library contain many options for configuration.  Although the default configuration options are often ideal, there are times when customizing the behavior is desirable.  Both frameworks provide full configuration support.

Read more

Parallelism in .NET – Part 8, PLINQ’s ForAll Method

Parallel LINQ extends LINQ to Objects, and is typically very similar.  However, as I previously discussed, there are some differences.  Although the standard way to handle simple Data Parellelism is via Parallel.ForEach, it’s possible to do the same thing via PLINQ.

PLINQ adds a new method unavailable in standard LINQ which provides new functionality…

Read more

Parallelism in .NET – Part 7, Some Differences between PLINQ and LINQ to Objects

In my previous post on Declarative Data Parallelism, I mentioned that PLINQ extends LINQ to Objects to support parallel operations.  Although nearly all of the same operations are supported, there are some differences between PLINQ and LINQ to Objects.  By introducing Parallelism to our declarative model, we add some extra complexity.  This, in turn, adds some extra requirements that must be addressed.

In order to illustrate the main differences, and why they exist, let’s begin by discussing some differences in how the two technologies operate, and look at the underlying types involved in LINQ to Objects and PLINQ .

Read more

Parallelism in .NET – Part 6, Declarative Data Parallelism

When working with a problem that can be decomposed by data, we have a collection, and some operation being performed upon the collection.  I’ve demonstrated how this can be parallelized using the Task Parallel Library and imperative programming using imperative data parallelism via the Parallel class.  While this provides a huge step forward in terms of power and capabilities, in many cases, special care must still be given for relative common scenarios.

C# 3.0 and Visual Basic 9.0 introduced a new, declarative programming model to .NET via the LINQ Project.  When working with collections, we can now write software that describes what we want to occur without having to explicitly state how the program should accomplish the task.  By taking advantage of LINQ, many operations become much shorter, more elegant, and easier to understand and maintain.  Version 4.0 of the .NET framework extends this concept into the parallel computation space by introducing Parallel LINQ.

Read more

Parallelism in .NET – Part 5, Partitioning of Work

When parallelizing any routine, we start by decomposing the problem.  Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element.  This process is called partitioning.

Partitioning our tasks is a challenging feat.  There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle.  Trying to work the perfect balance between the two extremes is the goal for which we should aim.  Luckily, the Task Parallel Library automatically handles much of this process.  However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions.

Read more

Parallelism in .NET – Part 4, Imperative Data Parallelism: Aggregation

In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel.  Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation.

Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria.  This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately.  The Task Parallel Library has tools to assist in this synchronization.

Read more

Parallelism in .NET – Part 3, Imperative Data Parallelism: Early Termination

Although simple data parallelism allows us to easily parallelize many of our iteration statements, there are cases that it does not handle well.  In my previous discussion, I focused on data parallelism with no shared state, and where every element is being processed exactly the same.

Unfortunately, there are many common cases where this does not happen.  If we are dealing with a loop that requires early termination, extra care is required when parallelizing.

Read more

Parallelism in .NET – Part 2, Simple Imperative Data Parallelism

In my discussion of Decomposition of the problem space, I mentioned that Data Decomposition is often the simplest abstraction to use when trying to parallelize a routine.  If a problem can be decomposed based off the data, we will often want to use what MSDN refers to as Data Parallelism as our strategy for implementing our routine.  The Task Parallel Library in .NET 4 makes implementing Data Parallelism, for most cases, very simple.

Read more

Parallelism in .NET – Part 1, Decomposition

The first step in designing any parallelized system is Decomposition.  Decomposition is nothing more than taking a problem space and breaking it into discrete parts.  When we want to work in parallel, we need to have at least two separate things that we are trying to run.  We do this by taking our problem and decomposing it into parts.

There are two common abstractions that are useful when discussing parallel decomposition: Data Decomposition and Task Decomposition.  These two abstractions allow us to think about our problem in a way that helps leads us to correct decision making in terms of the algorithms we’ll use to parallelize our routine.

Read more

« Previous PageNext Page »