Key Takeaways
- Follow the .NET Framework Design Guidelines. They are just as relevant today as when they were first published over a decade ago.
- API design is paramount. A badly designed API can dramatically increase bugs while simultaneously decreasing reusability.
- Always keep the "pit of success" philosophy in mind: make it easy to do the right thing and hard to make mistakes.
- Remove "line noise" and "boilerplate" code so the business logic can take center stage.
- Think carefully before sacrificing clarity for performance.
C# 7 is a major update with a lot of interesting new capabilities. And while there are plenty of articles on what you can do with it, there aren't quite as many on what you should do with it. Using the principles found in the .NET Framework Design Guidelines, we're going to take a first pass at laying down strategies for getting the most from these new features.
Tuple Returns
In normal C# programming, returning multiple values from one function can be quite tedious. Output parameters are an option, but not if you are exposing an asynchronous method, because async methods cannot declare out parameters. Tuple<T> is verbose, allocates memory, and doesn't have descriptive names for its fields. Custom structs are faster than Tuples, but litter the code with lots of single-use types. And finally, anonymous types combined with dynamic are very slow and lack static type checks.
All of these problems are solved with C#'s new tuple return syntax. Here is an example of the basic syntax:
public (string, string) LookupName(long id) // tuple return type
{
return ("John", "Doe"); // tuple literal
}
var names = LookupName(0);
var firstName = names.Item1;
var lastName = names.Item2;
The actual return type of this function is ValueTuple<string, string>. As the name suggests, this is a lightweight struct resembling the Tuple<T> class. This solves the type bloat issue, but leaves us with the same lack of descriptive names Tuple<T> suffers from. You can fix that by naming the elements in the return type:
public (string First, string Last) LookupName(long id)
var names = LookupName(0);
var firstName = names.First;
var lastName = names.Last;
The return type is still ValueTuple<string, string>, but now the compiler adds a TupleElementNames attribute to the function. This allows code that consumes the function to use the descriptive names instead of Item1/Item2.
WARNING: The TupleElementNames attribute is only honored by compilers. If you use reflection on the return type, you will only see the naked ValueTuple<T> struct. Because the attribute is on the function itself, that information is lost by the time you get a result.
The compiler maintains the illusion of extra types as long as it can. For example, consider these declarations:
var a = LookupName(0);
(string First, string Last) b = LookupName(0);
ValueTuple<string, string> c = LookupName(0);
(string make, string model) d = LookupName(0);
From the compiler's perspective, a is a (string First, string Last) just like b. Since c is explicitly declared as a ValueTuple<string, string>, there is no c.First property.
Example d shows where this design breaks down and causes you to lose a measure of type safety. It is really easy to accidentally rename fields, allowing you to assign one tuple into a different tuple that happens to have the same shape. Again, this is because the compiler doesn't really see (string First, string Last) and (string make, string model) as different types.
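To make the relabeling hazard concrete, here is a minimal sketch. The class name TupleNamesSketch and the Describe method are hypothetical; LookupName mirrors the example above and returns made-up data.

```csharp
using System;

public static class TupleNamesSketch
{
    // Same shape as the LookupName example; the data is made up.
    public static (string First, string Last) LookupName(long id)
    {
        return ("John", "Doe");
    }

    public static string Describe()
    {
        // This compiles: the element names are relabeled, not checked.
        (string make, string model) car = LookupName(0);

        // var keeps the names chosen by the API author.
        var person = LookupName(0);

        return $"{car.make} {car.model} / {person.First} {person.Last}";
    }
}
```

The person's name ends up stored in a variable whose fields claim to hold a car's make and model, and the compiler accepts the assignment because both tuples are a ValueTuple<string, string>.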
ValueTuple is Mutable
An interesting note about ValueTuple is that it is mutable. Mads Torgersen explains why:
The reasons why mutable structs are often bad, don't apply to tuples.
If you write a mutable struct in the usual encapsulated way, with private state and public, mutator properties and methods, then you are in for some bad surprises. The reason is that whenever those structs are held in a readonly variable, the mutators will silently work on a copy of the struct!
Tuples, however, simply have public, mutable fields. By design there are no mutators, and hence no risk of the above phenomenon.
Also, again because they are structs, they are copied whenever they are passed around. They aren't directly shared between threads, and don't suffer the risks of "shared mutable state" either. This is in contrast to the System.Tuple family of types, which are classes and therefore need to be immutable to be thread safe.
Note he said "fields", not "properties". This may cause problems with reflection-based libraries that consume the results of a tuple-returning function.
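The following sketch shows what a reflection-based consumer would see when it inspects a tuple result. TupleReflectionSketch and its helper methods are hypothetical names; only standard System.Reflection calls are used.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class TupleReflectionSketch
{
    public static (string First, string Last) LookupName(long id) => ("John", "Doe");

    // Returns the public instance field names that reflection reports for the result.
    public static string[] FieldNames()
    {
        object boxed = LookupName(0);
        return boxed.GetType()
                    .GetFields(BindingFlags.Public | BindingFlags.Instance)
                    .Select(f => f.Name)
                    .OrderBy(n => n)
                    .ToArray();
    }

    // True if reflection can find a field or property called "First" on the result.
    public static bool HasMemberNamedFirst()
    {
        Type t = LookupName(0).GetType();
        return t.GetField("First") != null || t.GetProperty("First") != null;
    }
}
```

FieldNames reports Item1 and Item2, and HasMemberNamedFirst returns false. A serializer or data binder that only looks for public properties, or that expects the descriptive names, would not find what it needs.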
Guidelines for Tuple Returns
✔ CONSIDER using tuple returns instead of out parameters when the list of fields is small and will never change.
✔ DO use PascalCase for descriptive names in the return tuple. This makes the tuple fields look like properties on normal classes and structs.
✔ DO use var when reading a tuple return without deconstructing it. This avoids accidentally mislabeling fields.
✘ AVOID returning value tuples if reflection is expected to be used on the returned value.
✘ DO NOT use tuple returns on public APIs if there is a chance additional fields will need to be returned in future versions. Adding fields to a tuple return is a breaking change.
Deconstructing Multi-Value Returns
Going back to our LookupName example, it seems somewhat annoying to create a names variable that will only be used momentarily before it is replaced by separate locals. C# 7 also addresses this using what it calls "deconstruction". The syntax has several variants:
(string first, string last) = LookupName(0);
(var first, var last) = LookupName(0);
var (first, last) = LookupName(0);
(first, last) = LookupName(0);
In the last line of the above example, it is assumed the variables first and last were previously declared.
Deconstructors
Though similar in name to "destructor", a deconstructor has nothing to do with destroying an object. Just as a constructor combines separate values into one object, a deconstructor takes one object and separates it. A deconstructor allows any class to offer the deconstruction syntax described above. Let's consider the Rectangle class. It has this constructor:
public Rectangle(int x, int y, int width, int height)
When you call ToString on a new instance you get "{X=0,Y=0,Width=0,Height=0}". The combination of these two facts tells us what order to present the fields in our custom deconstruction method.
public void Deconstruct(out int x, out int y, out int width, out int height)
{
x = X;
y = Y;
width = Width;
height = Height;
}
var (x, y, width, height) = myRectangle;
Console.WriteLine(x);
Console.WriteLine(y);
Console.WriteLine(width);
Console.WriteLine(height);
You may be wondering why output parameters are used instead of a return tuple. Part of the reason may be performance, as this reduces the amount of copying that needs to occur. But the main reason cited by Microsoft is it opens the door for overloading Deconstruct.
Continuing our case study, we note Rectangle has a second constructor:
public Rectangle(Point location, Size size);
We answer this with a matching deconstruct method:
public void Deconstruct(out Point location, out Size size);
var (location, size) = myRectangle;
This works so long as each deconstruct method has a different number of parameters. If two overloads have the same number of parameters, the compiler won't be able to determine which Deconstruct method to use, even if you explicitly list out the types.
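As a self-contained illustration of overloading by parameter count, here is a sketch built around a hypothetical Box struct (used instead of Rectangle so the example needs no external types). The two-value overload uses tuples as stand-ins for Point and Size.

```csharp
public struct Box
{
    public Box(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }

    public int X { get; }
    public int Y { get; }
    public int Width { get; }
    public int Height { get; }

    // Four-value form, in the same order as the constructor.
    public void Deconstruct(out int x, out int y, out int width, out int height)
    {
        x = X; y = Y; width = Width; height = Height;
    }

    // Two-value form; it can coexist because the parameter count differs.
    public void Deconstruct(out (int X, int Y) location, out (int Width, int Height) size)
    {
        location = (X, Y);
        size = (Width, Height);
    }
}

public static class DeconstructSketch
{
    public static int Area(Box box)
    {
        var (_, _, width, height) = box;   // binds to the four-parameter overload
        return width * height;
    }

    public static int Left(Box box)
    {
        var (location, _) = box;           // binds to the two-parameter overload
        return location.X;
    }
}
```

The compiler picks the overload purely from the number of variables on the left-hand side, which is why two overloads of the same arity cannot be told apart.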
In terms of API design, structs would usually benefit from deconstruction. Classes, especially models or DTOs such as Customer and Employee, probably shouldn't have a deconstruct method. There is no way to resolve questions such as "Should it be (firstName, lastName, phoneNumber, email) or (firstName, lastName, email, phoneNumber)?" in a way that will make everyone happy.
Guidelines for Deconstructors
✔ CONSIDER using deconstruction when reading tuple return values, but be aware of mislabeling mistakes.
✔ DO provide a custom deconstruct method for structs.
✔ DO match the field order in a class's constructor, ToString override, and Deconstruct method.
✔ CONSIDER providing secondary deconstruct methods if the struct has multiple constructors.
✔ CONSIDER deconstructing large value tuples immediately. A large value tuple has a total size of more than 16 bytes, which may be expensive to repeatedly copy. Note, reference variables always count as 4 bytes on a 32-bit OS and 8 bytes on a 64-bit OS.
✘ DO NOT expose Deconstruct methods on classes when it isn't obvious what order the fields should appear in.
✘ DO NOT expose multiple Deconstruct methods with the same number of parameters.
Out variables
C# 7 offers two new syntax options for calling functions with "out" parameters. You can now declare variables in function calls.
if (int.TryParse(s, out var i))
{
Console.WriteLine(i);
}
The other option is to ignore the output parameter entirely using a "wildcard".
if (int.TryParse(s, out _))
{
Console.WriteLine("success");
}
If you worked with the C# 7 preview, you may notice a change from using an asterisk (*) to using an underscore for the ignored parameters. This syntax won in part because the underscore is commonly used in functional programming languages for the same purpose. Other options considered included a keyword such as "void" or "ignore".
While the wildcards can be convenient, they imply a design flaw in the API. Under most circumstances, it would be better to simply offer an overload that omits the out parameters when they would otherwise normally be ignored.
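A minimal sketch of that advice, assuming a hypothetical SettingsStore class: the one-argument TryGet overload exists so that callers who only need the yes/no answer never have to write a wildcard themselves.

```csharp
using System;
using System.Collections.Generic;

public class SettingsStore
{
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    public void Set(string key, string value) => _values[key] = value;

    // Full form: reports success and hands back the value.
    public bool TryGet(string key, out string value) => _values.TryGetValue(key, out value);

    // Overload for callers who only care whether the key exists.
    public bool TryGet(string key) => TryGet(key, out _);
}
```

The wildcard still appears, but only once, inside the library, rather than at every call site.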
Guidelines for Out Variables
✔ CONSIDER providing a tuple return alternative to out parameters.
✘ AVOID using out or ref parameters. [See Framework Design Guidelines]
✔ CONSIDER providing overloads that omit the out parameters so wildcards are not needed.
Local Functions and Iterators
Local functions are an interesting construct. At first glance they appear to be a slightly cleaner syntax for creating anonymous functions. Here you can see the differences.
public DateTime Max_Anonymous_Function(IList<DateTime> values)
{
    Func<DateTime, DateTime, DateTime> MaxDate = (left, right) =>
    {
        return (left > right) ? left : right;
    };
    var result = values.First();
    foreach (var item in values.Skip(1))
        result = MaxDate(result, item);
    return result;
}
public DateTime Max_Local_Function(IList<DateTime> values)
{
    DateTime MaxDate(DateTime left, DateTime right)
    {
        return (left > right) ? left : right;
    }
    var result = values.First();
    foreach (var item in values.Skip(1))
        result = MaxDate(result, item);
    return result;
}
However, once you start digging into them, some interesting properties emerge.
Anonymous Functions vs. Local Functions
When you create a normal anonymous function, it always creates a matching hidden class to store the function. An instance of this class is created and stored in a static field on the same hidden class. Thus once created, there is no further overhead.
Local functions are different in that no hidden class is needed. Instead the function is represented as a static function in the same class as its parent function.
Closures
If your anonymous or local function refers to a variable in the containing function, it is called a "closure" because it closes over or captures the local variable. Here is an example:
public DateTime Max_Local_Function(IList<DateTime> values)
{
    int callCount = 0;
    DateTime MaxDate(DateTime left, DateTime right)
    {
        callCount++; // The variable callCount is being closed over.
        return (left > right) ? left : right;
    }
    var result = values.First();
    foreach (var item in values.Skip(1))
        result = MaxDate(result, item);
    return result;
}
For anonymous functions, this requires a new instance of the hidden class each time the containing function is called. This ensures each call to the function has its own copy of the data that is shared between the parent and anonymous function.
The downside of this design is each call to the containing function requires instantiating a new object. This can make it expensive to use, as it puts pressure on the garbage collector.
With a local function, a hidden struct is created instead of a hidden class. This allows it to continue storing per-call data while eliminating the need to instantiate a separate object. Similar to the anonymous function, the local function is physically stored in the hidden struct.
Delegates
When creating an anonymous or local function, you'll often want to package it in a delegate so that you can use it in an event handler or LINQ expression.
Anonymous functions are, by definition, anonymous. So in order to use them, you always need to store them in a variable or argument as a delegate.
Delegates cannot point to structs (unless they are boxed, which has weird semantics). So if you create a delegate that points to a local function, the compiler creates a hidden class instead of a hidden struct. And if that local function is a closure, a new instance of the hidden class is created each time the parent function is called.
Iterators
In C#, functions that use yield return to expose an IEnumerable<T> cannot immediately validate their parameters. Instead, the parameter validation doesn't occur until MoveNext is called on the anonymous enumerator that was returned.
This isn't a problem in VB because it supports anonymous iterators. Here is an example from MSDN:
Public Function GetSequence(low As Integer, high As Integer) _
As IEnumerable
' Validate the arguments.
If low < 1 Then Throw New ArgumentException("low is too low")
If high > 140 Then Throw New ArgumentException("high is too high")
' Return an anonymous iterator function.
Dim iterateSequence = Iterator Function() As IEnumerable
For index = low To high
Yield index
Next
End Function
Return iterateSequence()
End Function
In the current version of C#, GetSequence and its iterator need to be entirely separate functions. With C# 7, these can be combined through the use of a local function.
public IEnumerable<int> GetSequence(int low, int high)
{
    if (low < 1)
        throw new ArgumentException("low is too low");
    if (high > 140)
        throw new ArgumentException("high is too high");
    IEnumerable<int> Iterator()
    {
        for (int i = low; i <= high; i++)
            yield return i;
    }
    return Iterator();
}
Iterators require building a state machine, so they behave like closures returned as a delegate in terms of hidden classes.
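The difference in validation timing can be observed directly. In this sketch, the names IteratorValidationSketch, LazyRange, EagerRange, and ThrowsOnCall are all hypothetical; the only thing being compared is whether calling the method, without enumerating the result, throws.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorValidationSketch
{
    // Plain iterator: the whole body, including validation, is deferred.
    public static IEnumerable<int> LazyRange(int low, int high)
    {
        if (low < 1)
            throw new ArgumentException("low is too low");
        for (int i = low; i <= high; i++)
            yield return i;
    }

    // Wrapper plus local iterator: validation runs when the method is called.
    public static IEnumerable<int> EagerRange(int low, int high)
    {
        if (low < 1)
            throw new ArgumentException("low is too low");

        IEnumerable<int> Iterator()
        {
            for (int i = low; i <= high; i++)
                yield return i;
        }
        return Iterator();
    }

    // Reports whether merely calling the factory (without enumerating) throws.
    public static bool ThrowsOnCall(Func<int, int, IEnumerable<int>> factory)
    {
        try { factory(0, 5); return false; }
        catch (ArgumentException) { return true; }
    }
}
```

ThrowsOnCall(LazyRange) returns false because the exception is postponed until the first MoveNext, while ThrowsOnCall(EagerRange) returns true because the check sits outside the iterator.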
Guidelines for Anonymous and Local Functions
✔ DO use local functions instead of anonymous functions when a delegate is not needed, especially when a closure is involved.
✔ DO use local iterators when an iterator method returning IEnumerable<T> or IEnumerator<T> needs to validate its parameters.
✔ CONSIDER placing local functions at the very beginning or end of a function to visually separate them from their parent function.
✘ AVOID using closures with delegates in performance sensitive code. This applies to both anonymous and local functions.
Ref Returns, Locals, and Properties
Structs have some interesting performance characteristics. Since they are stored in line with their parent data structure, they don't have the object header overhead of normal classes. This means you can pack them very densely in arrays with little or no wasted space. Besides reducing your overall memory overhead, this gives you great locality, making your CPU's tiny cache much more efficient. This is why people working on high performance applications love structs.
But if your struct is too large, you have to be careful about making unnecessary copies. Microsoft's guideline for this is 16 bytes, which is enough for 2 doubles or 4 integers. That's not much, though sometimes you can stretch it using bit-fields.
You also have to be extremely careful with mutable structs. It is easy to accidentally make changes to a copy of the struct when you were intending to modify the original.
Ref Locals
One way around this is to use smart pointers so you never need to make a copy. Here is some performance sensitive code from an ORM I've been working on.
for (var i = 0; i < m_Entries.Length; i++)
{
    if (string.Equals(m_Entries[i].Details.ClrName, item.Key, StringComparison.OrdinalIgnoreCase)
        || string.Equals(m_Entries[i].Details.SqlName, item.Key, StringComparison.OrdinalIgnoreCase))
    {
        var value = item.Value ?? DBNull.Value;
        if (value == DBNull.Value)
        {
            if (!ignoreNullProperties)
                parts.Add($"{m_Entries[i].Details.QuotedSqlName} IS NULL");
        }
        else
        {
            m_Entries[i].ParameterValue = value;
            m_Entries[i].UseParameter = true;
            parts.Add($"{m_Entries[i].Details.QuotedSqlName} = {m_Entries[i].Details.SqlVariableName}");
        }
        found = true;
        keyFound = true;
        break;
    }
}
The first thing you'll note is it doesn't use for-each. To avoid the copy, it has to use the old style for loop. And even then, all reads and writes are performed directly against the value in the m_Entries array.
With C# 7's ref locals, you could significantly reduce the clutter without changing the semantics.
for (var i = 0; i < m_Entries.Length; i++)
{
    ref Entry entry = ref m_Entries[i]; //create a reference
    if (string.Equals(entry.Details.ClrName, item.Key, StringComparison.OrdinalIgnoreCase)
        || string.Equals(entry.Details.SqlName, item.Key, StringComparison.OrdinalIgnoreCase))
    {
        var value = item.Value ?? DBNull.Value;
        if (value == DBNull.Value)
        {
            if (!ignoreNullProperties)
                parts.Add($"{entry.Details.QuotedSqlName} IS NULL");
        }
        else
        {
            entry.ParameterValue = value;
            entry.UseParameter = true;
            parts.Add($"{entry.Details.QuotedSqlName} = {entry.Details.SqlVariableName}");
        }
        found = true;
        keyFound = true;
        break;
    }
}
This works because a "ref local" is really a safe pointer. We say it is "safe" because the compiler won't allow you to point to anything ephemeral such as the result of a normal function.
And in case you are wondering, "ref var entry = ref m_Entries[i];" is valid syntax. You cannot, however, have it unbalanced. Either ref is used for both the declaration and the expression, or for neither.
Ref Returns
Complementing the ref local feature is ref return. This allows you to create copy-free functions. Continuing our example, we can pull out the search behavior into its own static function.
static ref Entry FindColumn(Entry[] entries, string searchKey)
{
    for (var i = 0; i < entries.Length; i++)
    {
        ref Entry entry = ref entries[i]; //create a reference
        if (string.Equals(entry.Details.ClrName, searchKey, StringComparison.OrdinalIgnoreCase)
            || string.Equals(entry.Details.SqlName, searchKey, StringComparison.OrdinalIgnoreCase))
        {
            return ref entry;
        }
    }
    throw new Exception("Column not found");
}
In this example we returned a reference to an array element. You can also return references to fields on objects, ref properties (see below), and ref parameters.
ref int Echo(ref int input)
{
    return ref input;
}
ref int Echo2(ref Foo input)
{
    return ref input.Field; // Field is an int field on Foo
}
An interesting feature of ref returns is the caller can choose whether or not to use it. Both of the following lines are equally valid:
Entry copy = FindColumn(m_Entries, "FirstName");
ref Entry reference = ref FindColumn(m_Entries, "FirstName");
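The practical difference between the two call styles shows up as soon as you write to the result. This is a minimal sketch with hypothetical names (Counter, RefReturnSketch, First); it assumes a non-empty array.

```csharp
public struct Counter
{
    public int Hits;
}

public static class RefReturnSketch
{
    // Returns a reference to the first element; assumes a non-empty array.
    public static ref Counter First(Counter[] counters)
    {
        return ref counters[0];
    }

    public static int Demo()
    {
        var counters = new Counter[1];

        Counter copy = First(counters);               // value copy
        copy.Hits = 10;                               // changes only the copy

        ref Counter reference = ref First(counters);  // reference into the array
        reference.Hits = 1;                           // changes counters[0]

        return counters[0].Hits;
    }
}
```

Demo returns 1: the write through the copy never reaches the array, while the write through the ref local does.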
Ref Returns and Properties
You can create a ref return style property, but only if the property is read only, meaning it has a getter and no setter. For example:
public ref int Test { get { return ref m_Test; } }
For immutable structs, this pattern seems like a no brainer. There's no extra cost to the consumer, who can choose to read it as either a ref or normal value as they see fit.
For mutable structs, things get interesting. First of all, this fixes the old problem of accidentally trying to modify a struct returned by a property, only to have the modification lost to the ether. Consider this class:
public class Shape
{
Rectangle m_Size;
public Rectangle Size { get { return m_Size; } }
}
var s = new Shape();
s.Size.Width = 5;
In C# 1, the size wouldn't be changed. In C# 6, this code would trigger a compiler error. In C# 7, we just add ref and everything works.
public ref Rectangle Size { get { return ref m_Size; } }
At first glance it looks like this will prevent you from overriding the whole size at once. But as it turns out, you can still write code such as:
var rect = new Rectangle(0, 0, 10, 20);
s.Size = rect;
Even though the property is "read-only", this works exactly as expected. One just has to understand one isn't getting back a Rectangle, but a pointer to a location that holds Rectangles.
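Both behaviors can be checked with a small, self-contained sketch. Extent is a hypothetical stand-in for Rectangle, and RefPropertySketch.Demo is a hypothetical helper that records what the property holds after each write.

```csharp
public struct Extent
{
    public int Width;
    public int Height;
}

public class Shape
{
    private Extent m_Size;

    // Getter-only ref property: callers receive a reference to m_Size.
    public ref Extent Size { get { return ref m_Size; } }
}

public static class RefPropertySketch
{
    public static (int AfterFieldWrite, int FinalWidth, int FinalHeight) Demo()
    {
        var s = new Shape();

        s.Size.Width = 5;                                 // writes through to m_Size
        int afterFieldWrite = s.Size.Width;

        s.Size = new Extent { Width = 10, Height = 20 };  // replaces the whole value

        return (afterFieldWrite, s.Size.Width, s.Size.Height);
    }
}
```

Demo returns (5, 10, 20), showing that both the single-field write and the whole-value assignment land in the private field.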
Now we've got a problem. Our immutable struct is no longer immutable. Even though individual fields cannot be altered, the whole value can be replaced via the ref property. C# will warn you about this by disallowing this syntax:
readonly int m_LineThickness;
public ref int LineThickness { get { return ref m_LineThickness; } }
Since there is no such thing as a read-only ref return, you can't create a reference to a read-only field.
Ref Returns and Indexers
Probably the biggest limitation of ref returns and locals is they require a fixed point to reference. Consider this line:
ref int x = ref myList[0];
This won't work because a list, unlike an array, makes a copy of the struct when you read its value. Below is the actual implementation of the List<T> indexer's getter from Reference Source.
public T this[int index] {
    get {
        // Following trick can reduce the range check by one
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        return _items[index]; // return makes a copy
    }
}
This also affects ImmutableArray<T> and normal arrays when accessed via the IList<T> interface. However, you could create your own version of List<T> that defines its indexer as a ref return.
public ref T this[int index] {
    get {
        // Following trick can reduce the range check by one
        if ((uint) index >= (uint)_size) {
            ThrowHelper.ThrowArgumentOutOfRangeException();
        }
        Contract.EndContractBlock();
        return ref _items[index]; // return ref makes a reference
    }
}
If you do this, you'll need to explicitly implement the IList<T> and IReadOnlyList<T> interfaces. This is because ref returns have a different signature than normal returns and thus don't satisfy the interface's requirements.
Since indexers are actually just specialized properties, they have the same limitations as ref properties: you can't explicitly define a setter, yet the indexer is still writable through the returned reference.
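Here is a stripped-down sketch of such a collection. RefList<T> is a hypothetical name, it omits the IList<T> plumbing discussed above, and it throws directly instead of using the framework's internal ThrowHelper.

```csharp
using System;

public class RefList<T>
{
    private T[] _items = new T[4];
    private int _size;

    public int Count => _size;

    public void Add(T item)
    {
        if (_size == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);
        _items[_size++] = item;
    }

    // Getter-only, yet writable through the returned reference.
    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_size)
                throw new ArgumentOutOfRangeException(nameof(index));
            return ref _items[index];
        }
    }
}
```

With this in place, both "ref int slot = ref list[0];" and "list[0] = 7;" compile for a RefList<int>. One design caveat of this sketch: a ref local obtained before an Add that triggers a resize keeps pointing at the old backing array, so references should not be held across calls that can grow the list.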
Guidelines for Ref Returns, Locals, and Properties
✔ CONSIDER using ref returns instead of index values in functions that work with arrays.
✔ CONSIDER using ref returns instead of normal returns for indexers on custom collection classes that hold structs.
✔ DO expose properties containing mutable structs as ref properties.
✘ DO NOT expose properties containing immutable structs as ref properties.
✘ DO NOT expose ref properties on immutable or read-only classes.
✘ DO NOT expose ref indexers on immutable or read-only collection classes.
ValueTask and Generalized Async Return Types
When the Task class was created, its primary role was to simplify multi-threaded programming. It created a channel that let you push long running operations into the thread pool and read back the results at a later date on your UI thread. And when using fork-join style concurrency, it performed admirably.
With the introduction of async/await in .NET 4.5, some of its flaws started to show. As we reported in 2011 (see Task Parallel Library Improvements in .NET 4.5), creating a Task object took longer than was acceptable and thus the internals had to be reworked. This resulted in "a 49 to 55% reduction in the time it takes to create a Task<Int32> and a 52% reduction in size".
That's a good step, but Task still allocates memory. So when you are using it in a tight loop such as seen below, a lot of garbage can be produced.
while (await stream.ReadAsync(buffer, offset, count) != 0)
{
//process buffer
}
And as has been said many times before, the key to high performance C# code is in reducing memory allocations and the subsequent GC cycle. Joe Duffy of Microsoft wrote in Asynchronous Everything:
First, remember, Midori was an entire OS written to use garbage collected memory. We learned some key lessons that were necessary for this to perform adequately. But I'd say the prime directive was to avoid superfluous allocations like the plague. Even short-lived ones. There is a mantra that permeated .NET in the early days: Gen0 collections are free. Unfortunately, this shaped a lot of .NET's library code, and is utter hogwash. Gen0 collections introduce pauses, dirty the cache, and introduce beat frequency issues in a highly concurrent system.
The real solution here is to create a struct-based task to use instead of the heap-allocated version. This was actually created under the name ValueTask<T> and was published in the System.Threading.Tasks.Extensions library. And because await already works on anything that exposes the right method, you can use it today.
Manually Exposing ValueTask<T>
The basic use case for ValueTask<T> is when you expect the result to be synchronous most of the time and you want to eliminate unnecessary memory allocations. To start with, let's say you have a traditional task-based asynchronous method.
public async Task<Customer> ReadFromDBAsync(string key)
Then we wrap it in a caching method:
public ValueTask<Customer> ReadFromCacheAsync(string key)
{
Customer result;
if (_Cache.TryGetValue(key, out result))
return new ValueTask<Customer>(result); //no allocation
else
return new ValueTask<Customer>(ReadFromCacheAsync_Inner(key));
}
And add a helper method to build the async state machine.
async Task<Customer> ReadFromCacheAsync_Inner(string key)
{
var result = await ReadFromDBAsync(key);
_Cache[key] = result;
return result;
}
With this in place, consumers can call ReadFromCacheAsync with exactly the same syntax as ReadFromDBAsync;
async Task Test()
{
var a = await ReadFromCacheAsync("aaa");
var b = await ReadFromCacheAsync("bbb");
}
Generalized Async
While the above pattern is not difficult, it is rather tedious to implement. And as we know, the more tedious the code is to write, the more likely it is to contain simple mistakes. So the current proposal for C# 7 is to offer generalized async returns.
Under the current design, you can only use the async keyword with methods that return Task, Task<T>, or void. When complete, generalized async returns will extend that capability to anything "tasklike". Something is considered to be tasklike if it has an AsyncMethodBuilder attribute. This indicates the helper class used to create the tasklike object.
In the feature design notes, Microsoft estimates maybe five people will actually create tasklike classes that gain general acceptance. Everyone else will most likely use one of those five. Here is our above example using the new syntax:
public async ValueTask<Customer> ReadFromCacheAsync(string key)
{
    Customer result;
    if (_Cache.TryGetValue(key, out result))
    {
        return result; //no allocation
    }
    else
    {
        result = await ReadFromDBAsync(key);
        _Cache[key] = result;
        return result;
    }
}
As you can see, we've eliminated the helper method and, other than the return type, it looks just like any other async method.
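For readers who want to try the pattern end to end, here is a self-contained sketch. CustomerCache, the string payload, and the DatabaseReads counter are hypothetical; the fake ReadFromDBAsync stands in for a real database call. On older frameworks it assumes the System.Threading.Tasks.Extensions package is referenced; newer .NET versions ship ValueTask<T> in the box.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public class CustomerCache
{
    // Not thread-safe; a real cache would need synchronization.
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public int DatabaseReads { get; private set; }

    // Stand-in for a real database call; always completes asynchronously.
    private async Task<string> ReadFromDBAsync(string key)
    {
        await Task.Yield();
        DatabaseReads++;
        return "customer:" + key;
    }

    public async ValueTask<string> ReadFromCacheAsync(string key)
    {
        if (_cache.TryGetValue(key, out var cached))
            return cached;                 // synchronous path, result is wrapped directly

        var result = await ReadFromDBAsync(key);
        _cache[key] = result;
        return result;
    }
}
```

The first call for a given key goes through the asynchronous path and populates the cache. A second call for the same key returns a ValueTask<string> that is already completed, which is the case the type is designed to make cheap.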
When to Use ValueTask<T>
So should you use ValueTask<T> instead of Task<T>? Not necessarily. It can be a little hard to find, so we'll quote the documentation:
Methods may return an instance of this value type when it's likely that the result of their operations will be available synchronously and when the method is expected to be invoked so frequently that the cost of allocating a new Task<TResult> for each call will be prohibitive.
There are tradeoffs to using a ValueTask<TResult> instead of a Task<TResult>. For example, while a ValueTask<TResult> can help avoid an allocation in the case where the successful result is available synchronously, it also contains two fields whereas a Task<TResult> as a reference type is a single field. This means that a method call ends up returning two fields worth of data instead of one, which is more data to copy. It also means that if a method that returns one of these is awaited within an async method, the state machine for that async method will be larger due to needing to store the struct that's two fields instead of a single reference.
Further, for uses other than consuming the result of an asynchronous operation via await, ValueTask<TResult> can lead to a more convoluted programming model, which can in turn actually lead to more allocations. For example, consider a method that could return either a Task<TResult> with a cached task as a common result or a ValueTask<TResult>. If the consumer of the result wants to use it as a Task<TResult>, such as to use with in methods like Task.WhenAll and Task.WhenAny, the ValueTask<TResult> would first need to be converted into a Task<TResult> using ValueTask<TResult>.AsTask, which leads to an allocation that would have been avoided if a cached Task<TResult> had been used in the first place.
As such, the default choice for any asynchronous method should be to return a Task or Task<TResult>. Only if performance analysis proves it worthwhile should a ValueTask<TResult> be used instead of Task<TResult>. There is no non-generic version of ValueTask<TResult> as the Task.CompletedTask property may be used to hand back a successfully completed singleton in the case where a Task-returning method completes synchronously and successfully.
This is a rather long passage, so we've summarized it in our guidelines below.
Guidelines for ValueTask<T>
✔ CONSIDER using ValueTask<T> in performance sensitive code when results will usually be returned synchronously.
✔ CONSIDER using ValueTask<T> when memory pressure is an issue and Tasks cannot be cached.
✘ AVOID exposing ValueTask<T> in public APIs unless there are significant performance implications.
✘ DO NOT use ValueTask<T> when calls to Task.WhenAll or WhenAny are expected.
Expression Bodied Members
An expression bodied member allows one to eliminate the brackets for simple functions. This takes what is normally a four-line function and reduces it to a single line. For example:
public override string ToString()
{
return FirstName + " " + LastName;
}
public override string ToString() => FirstName + " " + LastName;
Care must be taken to not go too far with this. For example, let's say you need to avoid the leading space when the first name is empty. You could write:
public override string ToString() => !string.IsNullOrEmpty(FirstName) ? FirstName + " " + LastName : LastName;
But then you might want to check for a missing last name.
public override string ToString() => !string.IsNullOrEmpty(FirstName) ? FirstName + " " + LastName : (!string.IsNullOrEmpty(LastName) ? LastName : "No Name");
As you can see, one can get carried away quite quickly when using this feature. So while you can do a lot by chaining together multiple conditional or null-coalescing operators, you should exhibit restraint.
Expression Bodied Properties
C# 6 introduced expression bodied read-only properties. C# 7 extends the syntax to individual get and set accessors, which is useful when working with MVVM style models that use a Get/Set method for handling things such as property notifications.
Here is the C# 6 code:
public string FirstName
{
get { return Get<string>(); }
set { Set(value); }
}
And the C# 7 alternative:
public string FirstName
{
get => Get<string>();
set => Set(value);
}
While the line count hasn't gone down, much of the line-noise is gone. And with something as small and repetitive as a property, every little bit helps.
For more information on how Get/Set works in these examples, see "CallerMemberName" in the news report titled C#, VB.NET To Get Windows Runtime Support, Asynchronous Methods.
Expression Bodied Constructors
Also new to C# 7 are expression bodied constructors. Here is an example:
class Person
{
public Person(string name) => Name = name;
public string Name { get; }
}
The use here is very limited. It really only works if you have zero or one parameters. As soon as you add a second parameter that needs to be assigned to a field/property, you have to switch to a traditional constructor. You also can't initialize other fields, hook up event handlers, etc. (Parameter validation is possible, see "Throw Expressions" below.)
So our advice is to simply ignore this feature. It is going to make your single-parameter constructors look different from all of your other constructors while offering only a very small reduction in code size.
Expression Bodied Destructors
In an effort to make C# more consistent, destructors are allowed to be expression bodied members just like methods and constructors.
For those who have forgotten, a destructor in C# is really an override of the Finalize method on System.Object, though C# doesn't express it that way:
~UnmanagedResource()
{
ReleaseResources();
}
One problem with this syntax is it looks a lot like a constructor, and thus can be easily overlooked. Another is that it mimics the destructor syntax in C++, which has completely different semantics. But that ship has sailed, so let's move on to the new syntax.
~UnmanagedResource() => ReleaseResources();
Now we have a single, easily missed line that brings the object into the finalizer queue lifecycle. This isn't a trivial property or ToString method; it is something important that needs to be visible. So again I advise against using it.
Guidelines for Expression Bodied Members
✔ DO use expression bodied members for simple properties.
✔ DO use expression bodied members for methods that just call other overloads of the same method.
✔ CONSIDER using expression bodied members for trivial methods.
✘ DO NOT use more than one conditional (a ? b : c) or null-coalescing (x ?? y) operator in an expression bodied member.
✘ DO NOT use expression bodied members for constructors and finalizers.
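To make the guidelines concrete, here is a small sketch that applies them to a single class. The Greeter type and its members are invented for this example; the point is only where expression bodies are used and where they are not.

```csharp
// Hypothetical class, for illustration only.
public class Greeter
{
    // Constructor keeps the traditional block body, per the guideline above.
    public Greeter(string name)
    {
        Name = name;
    }

    public string Name { get; }

    // Simple property: a good fit for an expression body.
    public string Greeting => Greet();

    // Method that just calls another overload of the same method.
    public string Greet() => Greet("Hello");

    // Trivial method with no conditional or null-coalescing operators.
    public string Greet(string salutation) => salutation + ", " + Name;
}
```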
Throw Expressions
Superficially, programming languages can generally be divided into two styles:
- Everything is an expression
- Statements, declarations, and expressions are separate concepts
Ruby is an instance of the former, where even declarations are expressions. By contrast, Visual Basic represents the latter, with a strong distinction between statements and expressions. For example, there is a completely different syntax for "if" when it stands alone and when it appears as part of a larger expression.
C# is mostly in the second camp, but due to its C heritage it does allow you to treat assignment statements as if they were expressions. This allows you to write code such as:
while ((current = stream.ReadByte()) != -1)
{
//do work;
}
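For the first time, C# 7 allows a non-assignment statement to be used as an expression. Without any changes to the syntax of throw itself, you can now place a "throw" in several positions that expect a normal expression: either branch of a conditional operator, the right side of a null-coalescing operator, and the body of an expression bodied member or lambda. Here are some examples from Mads Torgersen's press release: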
class Person
{
public string Name { get; }
public Person(string name) => Name = name ?? throw new ArgumentNullException("name");
public string GetFirstName()
{
var parts = Name.Split(' ');
return (parts.Length > 0) ? parts[0] : throw new InvalidOperationException("No name!");
}
public string GetLastName() => throw new NotImplementedException();
}
In each of these examples, it is pretty obvious what's going on. But what if we move the throw expression?
return (parts.Length == 0) ? throw new InvalidOperationException("No name!") : parts[0];
Now it isn't quite so clear. While the left and right clauses are related, the middle clause has nothing to do with them. Seen pictorially, the first version has the "happy path" on the left and the error path on the right. The second version has the error path splitting the happy path in half, breaking the flow of the whole line.
Let's look at another example. Here we are including a function call in the mix.
void Save(IList<Customer> customers, User currentUser)
{
if (customers == null || customers.Count == 0) throw new ArgumentException("No customers to save");
_Database.SaveEach("dbo.Customer", customers, currentUser);
}
void Save(IList<Customer> customers, User currentUser)
{
_Database.SaveEach("dbo.Customer", (customers != null && customers.Count > 0) ? customers : throw new ArgumentException("No customers to save"), currentUser);
}
Already we can see the length alone is problematic, though long lines are not unheard of with LINQ. But to get a better idea of how one reads the code, we'll color the conditional orange, the function call blue, the function arguments yellow, and the error path red.
Again, you can see how context shifts as the parameters change location.
Guidelines for Throw Expressions
✔ CONSIDER placing throw expressions on the right side of conditional (a ? b : c) and null-coalescing (x ?? y) operators in assignments/return statements.
✘ AVOID placing throw expressions in the middle slot of a conditional operator.
✘ DO NOT place throw expressions inside a function's parameter list.
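Here is a minimal sketch of code that follows these guidelines. The NameList class and its methods are made up for this example: both throw expressions sit on the right-hand side of an assignment or return statement, and the argument is validated into a local before it is passed to another function.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class NameList
{
    public static string First(IList<string> names)
    {
        // Throw on the right side of ?? in an assignment.
        var list = names ?? throw new ArgumentNullException(nameof(names));

        // Happy path on the left, error path on the right.
        return list.Count > 0 ? list[0] : throw new InvalidOperationException("No names!");
    }

    public static string JoinAll(IList<string> names)
    {
        // Validate first, so the call below keeps a plain argument list.
        var list = names ?? throw new ArgumentNullException(nameof(names));
        return string.Join(", ", list);
    }
}
```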
For more information on how exceptions affect API design, see Designing with Exceptions in .NET.
Pattern Matching and Enhanced Switch Blocks
Pattern matching, which among other things enhances switch blocks, doesn't have any impact on API design. So while it certainly can make working with heterogeneous collections easier, it is still better to use shared interfaces and polymorphism when possible.
That said, there are some implementation details one should be aware of. Consider this example from the announcement in August:
switch(shape)
{
case Circle c:
WriteLine($"circle with radius {c.Radius}");
break;
case Rectangle s when (s.Width == s.Height):
WriteLine($"{s.Width} x {s.Height} square");
break;
case Rectangle r:
WriteLine($"{r.Width} x {r.Height} rectangle");
break;
default:
WriteLine("<unknown shape>");
break;
case null:
throw new ArgumentNullException(nameof(shape));
}
Previously, the order in which case labels occurred didn't matter. In C# 7, as in Visual Basic, switch statements are evaluated almost strictly in order. This is what makes when clauses possible: the square case above only works because it is tested before the general Rectangle case.
Practically then, you want your most common cases to be first in the switch block, just as you would in a series of if-else-if blocks. Likewise, if any check is particularly expensive to make, then it should go near the bottom so it is executed only when necessary.
The exception to the strict ordering rule is the default case. It is always processed last, regardless of where it actually appears in the order. This can make the code harder to understand, so I recommend always placing the default case last.
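The ordering rules can be seen in a short sketch. The Circle and Rectangle classes below are simplified stand-ins written for this example, and the default case is deliberately placed first, against the recommendation, to show that it is still only chosen when nothing else matches.

```csharp
// Simplified stand-in types, for illustration only.
public class Circle
{
    public int Radius { get; set; }
}

public class Rectangle
{
    public int Width { get; set; }
    public int Height { get; set; }
}

public static class ShapeDescriber
{
    public static string Describe(object shape)
    {
        switch (shape)
        {
            default: // written first, but only used when no other case matches
                return "unknown shape";
            case Rectangle s when s.Width == s.Height: // must precede the general Rectangle case
                return $"{s.Width} x {s.Height} square";
            case Rectangle r:
                return $"{r.Width} x {r.Height} rectangle";
            case Circle c:
                return $"circle with radius {c.Radius}";
            case null:
                return "null";
        }
    }
}
```

Swapping the two Rectangle cases would be rejected by the compiler, because the general case would then handle every rectangle before the square case could be reached.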
Pattern Matching Expressions
While switch blocks will probably be the most common use for pattern matching in C#, they are not the only place patterns can appear. Any Boolean expression evaluated at runtime can include a pattern expression.
Here is an example that determines if the variable 'o' is a string, and if so tries to parse it as an integer.
if (o is string s && int.TryParse(s, out var i))
{
Console.WriteLine(i);
}
Note how a new variable named 's' is created by the pattern expression, then reused later by TryParse. This technique can be chained together for even more complex expressions:
if ((o is int i) || (o is string s && int.TryParse(s, out i)))
{
Console.WriteLine(i);
}
For the sake of comparison, here's what the above code would typically look like in C# 6.
int i; // C# 6 has no "out var", so the out variable is declared up front
if (o is int)
{
Console.WriteLine((int)o);
}
else if (o is string && int.TryParse((string) o, out i))
{
Console.WriteLine(i);
}
It is too soon to tell if the new pattern matching code is more efficient than the older style, but it can potentially eliminate some of the redundant type checks.
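For anyone who wants to experiment with the chained pattern expression, it can be wrapped in a small helper like the sketch below. The IntReader class and ReadInt method are names invented for this example. Note that 'i' is definitely assigned whenever the whole condition is true, whichever side of the || succeeded, while 's' is not and so cannot be used inside the block.

```csharp
// Hypothetical helper, for illustration only.
public static class IntReader
{
    // Returns the integer held in o (a boxed int or a numeric string), or null otherwise.
    public static int? ReadInt(object o)
    {
        if ((o is int i) || (o is string s && int.TryParse(s, out i)))
        {
            // i was assigned by either the type pattern or TryParse.
            return i;
        }

        return null;
    }
}
```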
Let's Make This a Living Document
The features in C# 7 are still new and there is much to be learned about how they work in the real world. So if you see something that you don't agree with, or is missing from these guidelines, let us know.
About The Author
Jonathan Allen got his start working on MIS projects for a health clinic in the late 90's, bringing them up from Access and Excel to an enterprise solution by degrees. After spending five years writing automated trading systems for the financial sector, he became a consultant on a variety of projects including the UI for a robotic warehouse, the middle tier for cancer research software, and the big data needs of a major real estate insurance company. In his free time he enjoys studying and writing about martial arts from the 16th century.