LINQ Aggregates in VB and C#

An aggregate is a function that takes a collection of values and returns a scalar value. Examples from T-SQL include min, max, and sum. Both VB and C# have support for aggregates, but in very different ways.
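As a quick illustration of that definition, here is a minimal sketch of the LINQ counterparts to the T-SQL aggregates mentioned above. The sample array is hypothetical, and the snippet assumes C# 9 or later so that it can use top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: any IEnumerable<int> would do.
int[] values = { 3, 1, 4, 1, 5 };

// Each call walks the collection once and returns a single scalar.
int smallest = values.Min();
int largest = values.Max();
int total = values.Sum();

Console.WriteLine($"{smallest} {largest} {total}"); // prints "1 5 14"
```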

Both languages expose aggregates as extension methods. Using dot notation, you simply call the method on an IEnumerable object, such as the result of a query. For example:

var totalVirtualMemory =
    (from p in Process.GetProcesses()
     select p.VirtualMemorySize64).Sum();



Dim totalVirtualMemory = _
(From p In Process.GetProcesses _
Select p.VirtualMemorySize64).Sum

As you can see, the VB and C# versions are nearly identical. VB also exposes a LINQ syntax specifically for aggregates.

Dim totalVirtualMemory = Aggregate p In Process.GetProcesses _
    Into Sum(p.VirtualMemorySize64)

If this were the only difference, there wouldn't be anything to talk about. But things get interesting when you want to operate on more than one "column" at a time. For the sake of example, let's say you were interested in both the total virtual memory and the total working set (physical memory) currently in use.

Using anonymous classes, you could easily create one variable with both those values.

var totals = new
{
    totalVirtualMemory = (from p in Process.GetProcesses()
                          select p.VirtualMemorySize64).Sum(),
    totalWorkingSet = (from p in Process.GetProcesses()
                       select p.WorkingSet64).Sum()
};

The problem with this is that GetProcesses() is called twice. That means the OS has to be queried twice, and the resulting collection is looped through twice. A faster way would be to cache the result of GetProcesses().

var processes = (from p in Process.GetProcesses()
                 select new { p.VirtualMemorySize64, p.WorkingSet64 }
                ).ToList();


var totals2 = new
{
totalVirtualMemory = (from p in processes
select p.VirtualMemorySize64).Sum(),
totalWorkingSet = (from p in processes
select p.WorkingSet64).Sum()
};

While closer, there are still two loops through the collection. To fix this, a custom aggregator is needed, as well as a named class to hold the results.

// Extension methods must be declared inside a static class.
public static ProcessTotals Sum(this IEnumerable<Process> source)
{
    var totals = new ProcessTotals();
    foreach (var p in source)
    {
        totals.VirtualMemorySize64 += p.VirtualMemorySize64;
        totals.WorkingSet64 += p.WorkingSet64;
    }
    return totals;
}

public class ProcessTotals
{
public long VirtualMemorySize64 { get; set; }
public long WorkingSet64 { get; set; }
}

var totals3 = (from p in Process.GetProcesses() select p).Sum();

A developer could do the same thing in Visual Basic, but there is another option.

Dim totals3 = Aggregate p In Process.GetProcesses _
Into virtualMemory = Sum(p.VirtualMemorySize64), _
workingSet = Sum(p.WorkingSet64)

Just like in the last C# example, you end up with a variable that has two fields. But unlike the C# example, you do not have to choose between writing your own aggregate function and class, and wasting cycles looping through the collection twice.

To be fair, C# does still have one more trick up its sleeve. Unlike VB, which only supports single-line anonymous functions, C# can make them as complex as necessary. This gives it the ability to create anonymous aggregate functions when needed.

var processes =
    (from p in Process.GetProcesses()
     select new { p.VirtualMemorySize64, p.WorkingSet64 });
var totals4 = processes.Aggregate(new ProcessTotals(), (sum, p) =>
{
    sum.WorkingSet64 += p.WorkingSet64;
    sum.VirtualMemorySize64 += p.VirtualMemorySize64;
    return sum;
});

Note that the class ProcessTotals is still needed. An anonymous class cannot be used with this lambda, which mutates its accumulator in place, because C# anonymous classes are immutable. Visual Basic allows for mutable anonymous classes, but that does not help here because VB cannot create the multi-line anonymous function.
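The immutability restriction only rules out changing the accumulator in place. As a hedged aside, Aggregate will also accept a lambda that returns a new accumulator on every step, and in that form an anonymous class can serve as the result. The sketch below is one way this might look, not part of the original examples. It assumes C# 9 or later for top-level statements, and it uses hypothetical stand-in data in place of the Process.GetProcesses() query so that the output is predictable.

```csharp
using System;
using System.Linq;

// Stand-in sample data (hypothetical values) with the same shape as the
// projection used earlier; swap in the Process.GetProcesses() query for real numbers.
var processes = new[]
{
    new { VirtualMemorySize64 = 100L, WorkingSet64 = 10L },
    new { VirtualMemorySize64 = 200L, WorkingSet64 = 20L },
    new { VirtualMemorySize64 = 300L, WorkingSet64 = 30L },
};

// The seed fixes the accumulator type. The lambda never mutates it; it returns
// a fresh anonymous object with the same property names, types, and order,
// which the compiler treats as the same anonymous type.
var totals = processes.Aggregate(
    new { VirtualMemory = 0L, WorkingSet = 0L },
    (sum, p) => new
    {
        VirtualMemory = sum.VirtualMemory + p.VirtualMemorySize64,
        WorkingSet = sum.WorkingSet + p.WorkingSet64
    });

Console.WriteLine($"{totals.VirtualMemory} {totals.WorkingSet}"); // prints "600 60"
```

The tradeoff is one small allocation per element instead of a single mutable ProcessTotals instance, in exchange for a single pass with no named class.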

While Visual Basic and C# have significantly more power than before, each illustrates areas where the other can be improved.
