In my previous post on Declarative Data Parallelism, I mentioned that PLINQ extends LINQ to Objects to support parallel operations. Although nearly all of the same operations are supported, there are some differences between PLINQ and LINQ to Objects. By introducing Parallelism to our declarative model, we add some extra complexity. This, in turn, adds some extra requirements that must be addressed.
In order to illustrate the main differences, and why they exist, let’s begin by discussing some differences in how the two technologies operate, and look at the underlying types involved in LINQ to Objects and PLINQ .
When working with a problem that can be decomposed by data, we have a collection, and some operation being performed upon the collection. I’ve demonstrated how this can be parallelized using the Task Parallel Library and imperative programming using imperative data parallelism via the Parallel class. While this provides a huge step forward in terms of power and capabilities, in many cases, special care must still be given for relative common scenarios.
C# 3.0 and Visual Basic 9.0 introduced a new, declarative programming model to .NET via the LINQ Project. When working with collections, we can now write software that describes what we want to occur without having to explicitly state how the program should accomplish the task. By taking advantage of LINQ, many operations become much shorter, more elegant, and easier to understand and maintain. Version 4.0 of the .NET framework extends this concept into the parallel computation space by introducing Parallel LINQ.
When parallelizing any routine, we start by decomposing the problem. Once the problem is understood, we need to break our work into separate tasks, so each task can be run on a different processing element. This process is called partitioning.
Partitioning our tasks is a challenging feat. There are opposing forces at work here: too many partitions adds overhead, too few partitions leaves processors idle. Trying to work the perfect balance between the two extremes is the goal for which we should aim. Luckily, the Task Parallel Library automatically handles much of this process. However, there are situations where the default partitioning may not be appropriate, and knowledge of our routines may allow us to guide the framework to making better decisions.
In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel. Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation.
Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria. This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately. The Task Parallel Library has tools to assist in this synchronization.
Although simple data parallelism allows us to easily parallelize many of our iteration statements, there are cases that it does not handle well. In my previous discussion, I focused on data parallelism with no shared state, and where every element is being processed exactly the same.
Unfortunately, there are many common cases where this does not happen. If we are dealing with a loop that requires early termination, extra care is required when parallelizing.
In my discussion of Decomposition of the problem space, I mentioned that Data Decomposition is often the simplest abstraction to use when trying to parallelize a routine. If a problem can be decomposed based off the data, we will often want to use what MSDN refers to as Data Parallelism as our strategy for implementing our routine. The Task Parallel Library in .NET 4 makes implementing Data Parallelism, for most cases, very simple.
The first step in designing any parallelized system is Decomposition. Decomposition is nothing more than taking a problem space and breaking it into discrete parts. When we want to work in parallel, we need to have at least two separate things that we are trying to run. We do this by taking our problem and decomposing it into parts.
There are two common abstractions that are useful when discussing parallel decomposition: Data Decomposition and Task Decomposition. These two abstractions allow us to think about our problem in a way that helps leads us to correct decision making in terms of the algorithms we’ll use to parallelize our routine.
Parallel programming is something that every professional developer should understand, but is rarely discussed or taught in detail in a formal manner. Software users are no longer content with applications that lock up the user interface regularly, or take large amounts of time to process data unnecessarily. Modern development requires the use of parallelism. There is no longer any excuses for us as developers.
Learning to write parallel software is challenging. It requires more than reading that one chapter on parallelism in our programming language book of choice…
This series introduces the Model-View-ViewModel Pattern from the point of view of a Windows Forms developer. The goal is not to introduce WPF, but to demonstrate some of the new features within Windows Presentation Foundation, and show how they should force every WPF developer to re-think how they design their applications.
The Model-View-ViewModel pattern is introduced after a discussion of three of the main features in WPF which enable it’s usage. In order to illustrate this, three versions a single application were written:
- A Windows Forms application
- A WPF Version of the application, using the same style
- A WPF Version of the application, built using MVVM
This allows a detailed understanding of the reasons behind MVVM, as well as the technology that enables the pattern.
Windows Presentation Foundation provides us with new opportunities to build applications that are very flexible to design, easy to maintain, and clear to understand. By taking advantage of Data Binding, Commands, and Templating, we can rethink the way we build our applications, and design them using the Model-View-ViewModel Pattern.
Now that I’ve walked through how we do this, I will revisit our original RSS Feed Reader application, and show samples of how this changes the design and code in this simple application.