One of the things that I've been spending a lot of time on is two aspects of .NET 4.0. I've been spending a lot of time on Parallel and the extensions in that space, but also a lot of time on Entity Framework and the features that will go into Entity Framework. It's a big release and there is an awful lot in it. We are a little team of 3 in the UK looking at the technology and at how we work with some early adopters, and it's too much for us to take in. Each product group is trying to release something big as part of the .NET 4.0 release.
We have language space and we have an awful lot of work to make it easier to do exactly the same things in VB.NET and C# - so we've got a lot of work there. We've got a lot of work in dynamic to make those languages work better with dynamic languages and also work better with the existing COM applications. There is a lot of good stuff going on with languages. We add another language to the box, we add F#, so that will take us up to VB.NET, C# and F# in the box. If we go beyond languages, we are doing work on Workflow, so we are doing new versions of Workflow, which is quite a major overhaul on the way we tackle Workflow and the tools we do in Workflow.
We do work on SharePoint, so if you are a SharePoint developer and you are looking at where we are taking development tools, expect to see something significant with Visual Studio 2010. But the first thing you'd probably notice would be the development experience. We are taking the editor that is part of Visual Studio and we are rewriting it in Windows Presentation Foundation. To do that, we need to be pretty sure that we are adding something significant - it's a big thing for us to take an editor that's been stable for a very long time, that all of us have got used to using, and then say "We'll take that, we'll throw that one away and we'll add a new editor."
I'm using some of the internal builds at the moment and what we have in the internal builds at the moment - and it's only an internal build! - is an editor that is frustrating and annoying, but I can see the promise. The promise comes from leveraging Windows Presentation Foundation. It allows both Microsoft and its partners to build add-ins for Visual Studio that do some really creative visualizations in the editor. I'll give you an example of that: if you wanted to, you could actually graphically display classes. If you were following a standard where all your classes were named after animals, and you wanted them to appear as animals in the core editor, you could do that, which is something we were not able to do using the old kind of editor.
That's the weird level we can take it to, but obviously we'll expect people to do more simple and interesting things and useful things. When you think, if you are dealing with XML schema or comments, we can visualize those in a very different way once we've leveraged WPF as the runtime engine for displaying the designer. There is lots of stuff in it - it's a really big release. As I say for me, Parallel and Entity Framework is where I'm spending a lot of time at the moment. Next for me it will be Workflow. Workflow Foundation was a big technology for us in .NET 3.0. We did some work on it in .NET 3.5, a little bit of work in .NET 3.5 SP 1; in .NET 4.0 we take a lot of the feedback from the early adopters. They said "We love what you did here, but these are real issues for us adopting it widely. Could you look again at how you are implementing that stuff?" That's one of the things that our team has done on that.
They are very pleased with their initial results. What's interesting about playing with it at the moment - somewhat internally - is that each team is at a different point in its development cycle. Some bits are really solid and you could start shipping them tomorrow; with other bits the teams are still playing around with what's the best way of doing it - there is a lot in there.
My background is very much the semicolon guy. I come from the C background. If I go back far enough, I worked with many different programming languages - whichever one was required on a project. I've done a lot of work with C# and then, in July last year, I decided to switch to VB.NET. The reason is that I'm a returning developer - I had a few years away from developing - and I figured I had missed C# 2.0 and C# 3.0, so why not come back and do VB.NET?
It's been quite interesting for me, because apart from doing that I started finding more projects built in VB.NET, spending time with people, understanding where we are. If I look at the UK, we have a lot of VB.NET developers. We primarily, as Microsoft, talk C#; we talk C# pretty much 100% of the time. If we do an event, a conference, a blog post, we do it in C#. Yet there is at least one VB.NET developer for every 2 C# developers. Depending on what parts of the industry you look at, it becomes 50-50, so we have an awful lot of VB.NET developers. The reason we have a lot is because it's a great leverage. You can build the same class of application in VB.NET as you can in C#.
However, those 2 languages, while they both sit on top of the CLR, were produced by different teams that were quite competitive inside Microsoft. Those teams have done different things in different ways. Where we run up against some difficulties is that the C# team has introduced a number of things ahead of the VB team, and if you are picking up a framework that leverages those or a sample that takes advantage of them, as a VB.NET developer that is just an extra complication that you've got to work around.
There is a bunch of things that you'll see in a C# example that you can do in VB.NET, but you do it in a different way - things like statement lambdas, anonymous methods.
We didn't have a way of doing that in VB.NET version 9, which is the VB 2008 version. We change that in .NET 4.0. We talked strongly at PDC in October about something called the Co-evolution of Languages, which is our commitment to make these 2 languages actually do the same things at the same point in time. That works both ways.
There is a bunch of stuff that VB.NET is way better at achieving than C#.
In anything to do with COM, in anything to do with dynamic typing, VB.NET is much better than C#, so the C# team are adding a bunch of stuff to bring them to parity with the VB.NET team and the VB.NET team are adding a bunch of stuff to bring them to parity with the C# team. We will be pretty much identical, to the point where we can take a sample from either language and, just by changing the syntax, get it to run without too much effort. We've not been there for a little while, so that's a good thing. Interestingly enough, the VB team was on different stuff, but at the moment it's co-evolution we are working towards.
F# is interesting for me. I've sat and watched people presenting it; I've watched developers coding in it, but I've never tried following through on any F# project myself. It's a very different language from VB.NET and C#. I see it when I'm working in the City - functional programming is quite popular in the City, in the financial institutions in the UK. The co-evolution commitment is very much about VB.NET and C#. F# is the new kid that comes in. I don't know what it does differently in terms of the class of app it can build - can it build the same class of app? I think that's true, but it's not something I'll spend much of my time on.
There are a couple of things. The first one for me is I would like to see the product group do their backlog. If you talk to anyone in the product group about what they are shipping for .NET 4.0, they already have a very long list of stuff that they can't get into that time line. If you are speaking to someone in the Entity Framework team about what they are doing around object relational mapping, what they are doing around working with SQL Server, they'll all say "We are adding these 50 great features in .NET 4.0. And, by the way we have another 150 on our backlog."
They have loads of wish-list stuff that they need to get through. The same goes for Parallel - the Parallel folks have got a major release of Parallel stuff, and they've already got a great list of stuff that they want to add as well. Each team is already thinking about "This is the stuff I'd like to add". On a personal level, one of the things I would like to see is language integration of Parallel.
The way they've done it is interesting - whenever you're adding new technology to something like a Framework, you can either add it as a library, or you can add it as language extensions that may be taking advantage of a library, or you can add it as language extensions that just go and produce a lot of code that does something clever for you. If you take something like LINQ - Language Integrated Query - that was done as a language extension. That meant that the C# team had to implement the language extension, the VB.NET team had to implement the language extension and then, if you had another language, you didn't have the language extension.
LINQ only really works well with VB.NET and C#. Parallel, they decided to do as a library and I like that because it makes the Parallel capabilities available to all .NET languages. If you want to take advantage of multicore you can just take the Parallel library in .NET 4.0 and use it from whatever language you want, as long as it's a .NET language. However, when you're working with a library, it just doesn't feel as smooth, and something like Parallel really is becoming core to us in the future. For me, I would like to see the capabilities of the library become part of the language in a future version beyond .NET 4.0.
So, .NET 4.0 would be VB 10.0 and C# 4.0; let's say in VB 11.0 and C# 5.0 let's have actual Parallel extensions as part of the language. Interestingly enough, and it caught me out, the VB team actually started talking about something called Concurrent Basic. It's just a research project that they've thrown out there on Channel 9 and they said "Look! Have a look at what we're playing with: we're playing with extensions to language, which won't be in the .NET 4.0 time frame, may never happen, but take a look at it and give some feedback." That looks really sweet, how they've extended the language. Instead of still thinking in terms of a library - "I need a new one of those, a new one of those" - it's all about language extensions.
Parallel for me has become quite interesting over the last 12 months and everything that makes up Parallel. When I'm thinking about Parallel, I'm thinking about "I genuinely want 2 bits of code to run at the same point in time because I have more than one core on my box." If you take the notebooks that I see in this room, we probably got dual core machines. If I tell you the kind of notebook I'll buy next year, it will almost certainly be a quad core machine and I'll probably have hyper-threading on it. I'm looking at 8 virtual cores in my box. Then you're left with "As a developer, how does my application take advantage of it?"
There are a few things there: one is "Will the operating system take advantage of having 8 cores? Will Windows?" - Absolutely! We have Windows running on 64 cores, so Windows will run great. Then, "Will the .NET Framework Runtime work great on it?" - Yes, the Runtime will. That's been architected to work across multiple cores. "Will our server elements of the .NET Framework library work well on it?" - Yes. That's been architected to work across multiple cores because that's a server technology.
The vast majority of the rest of the .NET Framework, though, was designed from the outset to work single core and it has all sorts of comments in it about "This works only single core.
We are single threaded. We run on a thread. If you try to do 2 things with this structure at the same time, bad things will happen." Most developers I know are not doing any kind of parallel programming. The reason that we've got away with that is, broadly speaking, that every year we get faster machines to run our single thread on. We wrote something 5 years ago, it runs fast today.
If you look at what AMD and Intel are doing, the next release of their CPUs is going to run at the same clock speed. In fact, they have thought about running at slower clock speeds to get more cores on the same CPU. As developers, we need to start thinking about threads more, we need to think about multicore more and that's just incredibly hard to do. No matter whether you are doing VB.NET, C# or another programming language, it's just really difficult to do that sort of stuff.
In .NET Framework 4.0 we are doing a lot of work in that area to try and make it way easier to work in that space. I just have a couple of bits that stick out for me. The first one is, at the moment, if you are writing for multicore, you tend to think in terms of threads and you start thinking about your application and how you are going to run on many threads. The fundamental problem with that is we don't make it easy for you to think about how you might partition your application and then how you might schedule threads on the processor. We need to separate the partitioning of the application from the number of threads that we run on.
We might look at an application that we're writing and think "Do you know I can see 1,000 things that potentially can be done in parallel, but I've only got a 4-core box." Actually, if these were CPU bound things, I only want 4 threads. I've got 1,000 potentials, I've got 4 threads - that's a difficult thing to do. If you try to do 1,000 threads on a 4-core box, this will be bad. Each thread will try to take 1 MB of memory, will be switching threads and it's not going to be a good thing. So we introduced a new thing called the Task. .NET 4.0 introduces something called the Task Parallel Library, which has a new type in it called the Task and we can now create instances of the tasks. Then, the CLR thread pool will ultimately end up running these things on threads, but we can create 5,000 tasks, if we want, 10,000 tasks, 20,000 tasks.
Ultimately, the CLR thread pool will do the right thing when it comes to the number of threads that these tasks will run on. A task has much more power than a thread, for us as developers. We can have relationships between parent tasks and child tasks, we can fault from the parent to the child and from the child to the parent, we can pass back return values - lots of things you can't do when you are just dealing with a thread as your primitive.
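The separation described here, many small tasks scheduled onto a few threads, can be sketched outside .NET. The following is a minimal Python analogy using the standard `concurrent.futures` pool, not the Task Parallel Library itself; the function names `work` and `failing_work` are made up for illustration. Note also that CPython threads do not run CPU-bound Python code on several cores at once, so this shows only the scheduling shape (1,000 tasks, a handful of threads, return values and faults flowing back to the caller), not a speed-up.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def work(n):
    """One small unit of work; returns a value the caller can collect."""
    return n * n, threading.current_thread().name


def failing_work():
    """A task that fails, to show the fault reaching the code that waits on it."""
    raise ValueError("task failed")


# Size the pool from the machine, not from the amount of work.
worker_count = os.cpu_count() or 4

with ThreadPoolExecutor(max_workers=worker_count) as pool:
    # 1,000 tasks are queued, but at most worker_count threads ever exist.
    futures = [pool.submit(work, n) for n in range(1000)]
    results = [f.result() for f in futures]

    # A fault inside a task is re-raised when the waiting code asks for the result.
    failed = pool.submit(failing_work)
    try:
        failed.result()
        error_seen = None
    except ValueError as exc:
        error_seen = str(exc)

squares = [value for value, _ in results]
threads_used = {name for _, name in results}
print(len(squares), "tasks ran on", len(threads_used), "thread(s)")
```

The point of the design is that the code describing the work never mentions a thread count; only the pool does.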
Above that, we add things like Parallel.For, Parallel.ForEach and Parallel.Invoke and they do what they sound like, and this is what you want to think in terms of. If you just had 28 methods and you wanted them to run in parallel, you list the 28 in Parallel.Invoke and they would just run in parallel for you.
We make it much easier to work with that. Underneath that we add things like concurrent data structures because a lot of our collections and dictionaries and lists are not threadsafe, so we need to have a way of having many threads work with these things and we need to understand what the answer should be when one thread is enumerating through a list and another thread changes a value before the first thread gets to it. What does that mean? Those are the concurrent data structures that we are working on.
A lot of the work is for managed code, which is what I've been referring to today with the Task Parallel Library, but if you are a native developer with C++, there is a version for you for C++ as well. I'll go slightly further: you've got a parallel library, you've got the Parallel.For, the Parallel.ForEach, you've got the concurrent data structures, and we also have some primitive messaging types that the C# and the C++ team have added to the Framework.
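As a rough illustration of what "invoke these methods in parallel" and "run this body for each item" look like as library calls, here is a small Python sketch. It is an analogy only: `invoke_all` and `for_each` are hypothetical helper names written for this example, not the .NET methods, and the three `load_*` functions are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke_all(*actions):
    """Run every zero-argument callable and return once all have finished."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(action) for action in actions]
        # result() waits for completion and re-raises any exception.
        return [f.result() for f in futures]


def for_each(items, body):
    """Apply body to every item, letting the pool decide the threading."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(body, items))


def load_customers():
    return "customers"


def load_orders():
    return "orders"


def load_prices():
    return "prices"


# List the methods; the helper worries about threads.
invoked = invoke_all(load_customers, load_orders, load_prices)

# Same idea for a loop body applied to each element.
lengths = for_each(["alpha", "be", "gam"], len)
```

The caller names the work and nothing else, which is the ease-of-use argument being made for this layer.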
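To make the thread-safety problem concrete, here is a minimal Python sketch of a collection that several threads can update safely. It is an assumption-laden illustration: `ConcurrentCounter` is a hypothetical class that uses one coarse lock, whereas the .NET concurrent collections are a separate design that this does not reproduce.

```python
import threading


class ConcurrentCounter:
    """A dictionary of counts whose updates are safe across threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def increment(self, key):
        # Read-modify-write must happen as one step, so hold the lock throughout.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1

    def snapshot(self):
        # Hand back a copy so a reader can enumerate while writers keep going.
        with self._lock:
            return dict(self._counts)


counter = ConcurrentCounter()


def record_hits():
    for _ in range(10_000):
        counter.increment("hits")


threads = [threading.Thread(target=record_hits) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

totals = counter.snapshot()
```

Returning a snapshot is one possible answer to the "what does enumeration mean while another thread is writing" question: the reader sees a consistent copy from one moment in time.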
Yes. One of the things the team have done is take what they did with the Task Parallel Library, those kind of primitive ways of working with tasks instead of threads, and do an implementation of something called Parallel LINQ, and it is quite impressive. That applies as long as you are dealing with LINQ to Objects, not LINQ to relational - parallelism of a relational query is something you expect your database server to do. If you are dealing with LINQ to Objects, a very small change to your code and you'll end up running in parallel on a multicore box. It really is as simple as adding a single adornment to your code. You could have a big LINQ query - from o in objects, where, select - and with a single adornment on that code you'll actually be using a whole different set of code, which will be Parallel LINQ. Broadly speaking, you can do that without side effects for a whole bunch of LINQ to Objects queries. You have to be careful. By default when you do that, we no longer try to bring back things in the same order you would have got if you had done it sequentially, but you can force it to try and do that as well. If you were doing some kind of LINQ across a directory structure and it came back in a certain order when you did it on a single thread, then if you made it Parallel LINQ it might come back in a different order. We can force that as well. We could say "Parallel LINQ, please, but could you keep your order?". We don't have quite the same benefit but we get an improvement.
Good question! Thanks for asking me because I do have a terrible tendency to forget C++, which is terrible when I think that quite a lot of the companies I work with actually have a significant amount of C++ development going on. C++ is still going strong. I think it's fair to say that from the outside looking in you could see that we sort of de-emphasized our investments in C++ for a while. I think we started to address that with Visual Studio 2008, so the C++ team began to start talking bullishly about "We're going to add all these capabilities, we're going to add all these features." They've got a long list coming in Visual Studio 2010 and the .NET Framework 4.0 stuff. If you take Parallel, which we mentioned earlier, they are doing more than the managed guys are doing.
They've got a more complete implementation of Parallel capabilities for C++ developers than we have for C# or VB.NET developers. They've just gone a little bit further, so they are spending a lot of time on that. C++ is still an extremely popular language, especially in finance where I see a lot of C++ development going on. If we survey the UK, there are relatively few C++ developers as a percentage - it's round about 10% - 15% - but that's still a huge number, so there are a lot of folks around. If you are a C++ developer and you look at Visual Studio 2010, you'll be impressed with what we're adding, so it's a big investment effort for us.
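The ordering trade-off can be shown with a small Python analogy (again, not Parallel LINQ itself; the `measure` function and the sample data are made up). Taking results as they complete gives no ordering guarantee, while asking for input order means the consumer may have to wait on an earlier item even when later ones are already done.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

words = ["pear", "fig", "banana", "kiwi", "apple"]


def measure(word):
    """The per-item projection, standing in for the 'select' part of a query."""
    return word, len(word)


with ThreadPoolExecutor(max_workers=4) as pool:
    # Unordered: take each result as soon as it is ready; order is not guaranteed.
    futures = [pool.submit(measure, w) for w in words]
    unordered = [f.result() for f in as_completed(futures)]

    # Ordered: map() hands results back in input order, at the cost of
    # sometimes waiting for a slow early item.
    ordered = list(pool.map(measure, words))
```

Both runs produce the same set of results; only the sequence can differ, which is why code that depends on sequence has to ask for it explicitly.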
8. What exactly is Cloud Computing?
For me, it's been something that I've been involved in for about 3-4 years now. My involvement stems from working with independent software vendors - ISVs. It was with the advent of software as a service and Salesforce and NetSuite and those guys; they were talking about, instead of running stuff on premise, running it in the Cloud, so off premise, on-demand applications. That's when I started looking at that whole world of - let's say - taking my calculator application and making it run somewhere out there in the Internet, in somebody else's data center. And not only that, but the results of the calculations stick out there as well.
I have quite a simple picture of it at the moment, although the business implications are quite complicated, but my simple vision of it is: it's about doing computing in someone else's data center. It's about running stuff, it's about storing stuff somewhere else. It's pretty fascinating watching companies handling that potential change to their model. If you are a traditional ISV, who has always sold on premise and you are now faced with competitors using cloud technology to deliver something to get it switched on in minutes in your customer base, please try our application. Switch it on! You can now try it!
There is a traditional view that you have to send someone out and it will take them 2 weeks to set up a box. It's been interesting watching the implications of it, but from a technology perspective it's pretty straightforward: I want to somehow be able to run code and store results in somebody else's data center, in a way that is somehow more attractive than doing it on my own servers, in my own data center. Microsoft has been working on it for a few years now. About 3-4 years ago we were spending most of our time on guidance, we were helping companies with Windows development to understand how they might take advantage of running in a Cloud, but we switched quite significantly to building services for Cloud computing round about 2 years ago.
For me, the first Cloud-based technology I got to play with was something called SQL Server Data Services - SSDS - and that was storage of data in the cloud, and that was right about March 2008 - that was my first. This is a real service from Microsoft for a developer that I'm going to use, versus Cloud services from Microsoft, which we've had for some time - things like Xbox Live, Live Meeting - some technology that we've had for some time. That was the first one for me: SQL Server Data Services and then, of course, in October, we announced the Azure Services Platform at PDC.
9. Can you tell us a little bit more about Azure?
It's a big thing, the Azure Services Platform. I think we really did a great job in October of trying to talk about many different technologies from many different product groups without confusing people too much, but nevertheless it's still a little bit confusing. I'm trying to clarify that one. The Azure Services Platform, I tend to say - and I like this sort of take on things - is a marketing term and it allows us to bring together a bunch of real tangible services that are useful to developers under one brand. We'll draw a boundary around those things. Inside the Azure Services Platform we have a number of pieces that you can either use individually or you can use together.
At the base, if you look at any kind of diagram of the Azure Services Platform, you see something called Windows Azure, which is different to the Azure Services Platform. We refer to Windows Azure as an operating system for the cloud. It's built on Windows Server technology - so it really is using Windows Server - but it does many of the things you'd associate with an operating system. It is responsible for, if I - as a developer - build some code that I want to run in the Cloud on top of Windows Azure, it will take that and it will do the right kind of magic to make that highly available, highly scalable, keep my data durable and give multiple writes for my data source in many places.
It will do that stuff for me, it will roll up logging information for me, it will allow me to get trace information, it will allow me to control authentication and authorization and stuff like this for me. It's the sort of thing I would associate with an OS and it masks away the fact that my application will be running on many nodes, each running Windows Server, so it sits as a layer above Windows Server. Windows Azure gives me the ability to write ASP.NET applications to run in the Cloud, so I can take my existing skills for building web form applications or web services using ASP.NET, whatever model I want to choose - traditional web forms, MVC - take that capability and those skills and deploy that into the Cloud to be run in the Microsoft data center.
I can take pretty much an application that I've already built in ASP.NET and run it out there and things will just work beautifully for me and it'll do all the right stuff. It gets to be more interesting around storage, where I actually store stuff. If I'd built a traditional application that would probably be talking to SQL Server or Oracle or DB2, I'd be working with a relational database, I'd be going through ADO.NET to achieve that. When I go and say "I want to run in Windows Azure", I would use Windows Azure storage, which isn't relational. It's a completely different way of storing data and retrieving data. It's RESTful, so I'd certainly be into the whole world of HTTP and GET/POST/PUT/DELETE verbs.
I'd be working in that space and I would have to do some significant work to take my data access layer that works with relational and then rebuild that to go into Windows Azure storage. It's not insurmountable; there are a bunch of great samples out there where people have taken existing ASP.NET applications that use relational databases and re-architected them for a store that isn't relational, but it's still a lot of work, and it gets increasingly hard the more complicated your application is and the greater the use you made of relational.
Windows Azure is our OS and it has its own storage, but then we have something called SQL Data Services. When it first came out it was called SQL Server Data Services; we renamed it in October and we positioned it as something that would give you relational capabilities in the Cloud. We had a CTP release of that, we had that in the market for round about 1 year. We had a lot of feedback on it. If you start from scratch, a lot of people liked it; they said "Loved it! I can see how it goes on scalability. I can see it's semi-relational, but it's doing about just enough in some places by allowing you to add some more features".
But if you are a company or an individual with an existing application that relied on relational, you looked at SQL Data Services and you said "Hey! I want it to do more relational. I want it much more like a relational database management system". It's only last week that we went public with the fact that we are changing the model for SQL Data Services and we are making it into a full-on relational engine in the Cloud. We are going to use Transact-SQL, which is the version of SQL that SQL Server uses, so you'll use Transact-SQL with SQL Data Services - you'll get to use stored procedures, triggers, views, but not everything. You don't get everything you'd associate with SQL Server, but you get a good chunk of it.
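To give a feel for what "RESTful, with HTTP verbs" means for a data access layer, here is a toy in-memory sketch in Python. Everything in it is hypothetical: `ResourceStore`, the paths and the status-code choices are illustrative conventions, and nothing here is the Windows Azure storage API or goes over a network. It shows only the shape of the programming model, where data is a resource at a path and the verb says what to do with it.

```python
class ResourceStore:
    """A toy in-memory store addressed by path, driven by HTTP-style verbs."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, verb, path, body=None):
        """Return (status_code, payload) for one request."""
        if verb == "GET":
            if path in self._items:
                return 200, self._items[path]
            return 404, None
        if verb == "PUT":
            # PUT creates or replaces the resource at exactly this path.
            created = path not in self._items
            self._items[path] = body
            return (201 if created else 200), body
        if verb == "POST":
            # POST adds a new child under a collection; the store picks the name.
            new_path = f"{path}/{self._next_id}"
            self._next_id += 1
            self._items[new_path] = body
            return 201, new_path
        if verb == "DELETE":
            if path in self._items:
                del self._items[path]
                return 204, None
            return 404, None
        return 405, None  # verb not supported


store = ResourceStore()
created = store.handle("PUT", "/customers/42", {"name": "Ada"})
fetched = store.handle("GET", "/customers/42")
posted = store.handle("POST", "/orders", {"total": 10})
deleted = store.handle("DELETE", "/customers/42")
missing = store.handle("GET", "/customers/42")
```

There is no query language and no join in this model, which is where the rework for an application built around a relational database comes from.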
That means that, as a developer, I can use Windows Azure as my execution engine for running code - ASP.NET code, VB.NET, C#, etc. - and then, for storage, I've got Windows Azure storage and I've got SQL Data Services, depending on what I'm trying to achieve with it. Around all that there is a whole bunch of other stuff. We've got .NET services, live services; we have some stuff around dynamics and we have some stuff around SharePoint, but for me and the work we've been spending the time on it's been really about "I want to compute something in the Cloud. I want to store something in the Cloud" - and that's Windows Azure and SQL Data Services, technologies to look at.
We are definitely taking a different approach. We've got Amazon that has done a fantastic job of enabling a bunch of companies and individuals to do things that would have been extremely hard to do - you would need a big wallet, big checkbook to get to the part where you could have done the interesting stuff that you can do thanks to their Amazon Web Services. But we do a quite different approach to it. Amazon S3, which is the most dominant storage solution, is really about storing an object with the key and a little bit of metadata. It's a long way from relational to "I just want to keep storing objects. I have a key that allows me to get them back and have a little bit of metadata wrapped around them."
Even our basic storage solution - Windows Azure storage - is above that. Windows Azure storage has first class concepts of queues, of blobs and tables. In Windows Azure storage you can say "I want a table, these are my columns, I want to store some information in it." and you can do queries against that. You're not using SQL syntax, but you can do queries against that. If you store blobs, you can store huge blobs. If you want a 50 GB blob, you can put it up there; you can do it as a bunch of blocks, but you get to the point where eventually you have a 50 GB blob out there. We also have queues, so we can easily pass stuff in and have lots of workers listening at the other end, which is part of Windows Azure to do that.
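The queue pattern mentioned here, where one side passes work in and several workers listen at the other end, can be sketched in-process with Python's standard `queue` module. This is only an analogy under stated assumptions: a cloud queue is a network service with its own API, which this does not model, and the message strings and the `STOP` sentinel are made up for the example.

```python
import queue
import threading

work_queue = queue.Queue()
processed = []
processed_lock = threading.Lock()
STOP = object()  # sentinel telling a worker to shut down


def worker():
    """Listen on the queue and handle messages until told to stop."""
    while True:
        message = work_queue.get()
        if message is STOP:
            work_queue.task_done()
            break
        # Stand-in for real work: record an upper-cased copy of the message.
        with processed_lock:
            processed.append(message.upper())
        work_queue.task_done()


workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

# The front end just drops messages in; it does not know which worker takes them.
for message in ["resize image", "send email", "build report"]:
    work_queue.put(message)

# One stop signal per worker, then wait for the queue to drain.
for _ in workers:
    work_queue.put(STOP)
work_queue.join()
for w in workers:
    w.join()
```

Because the producer and the workers share only the queue, the number of workers can change without touching the code that submits the work.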
Even in our basic one, it gives much more than Amazon S3. Of course, SQL Data Services gives you full relational. Someone said Amazon SimpleDB, which is a little bit relational. It has a bunch of restrictions, everything has to be a string; when you ask for something back, I think it comes back in 250 chunks - it used to be only 250! We don't have those kind of restrictions, we have types, we don't have as many types in Windows Azure storage as you would expect for relational, but in SQL Data Services we have lots of types.
We have all sort of types you'd associate with SQL Server, except things like your own custom types, say, CLR types. So, we don't have CLR integration in that. At the data store level, I think we do a whole bunch of things more. That will make it easier for developers to go "I want to write a really complex application and this is doing an awful lot of data". When it comes to the compute part - if I want to build a calculator in the Cloud, Amazon has a clever but very simple model, which is "We will give you a virtual machine" and "Here is your virtual machine. Run inside the virtual machine and you can just build a normal app and you can do whatever you want to do with that."
It's just about having one or more virtual machines. We do a whole bunch of abstractions above our nodes running Windows Server. We do use a custom version of Hyper-V that the Azure team have worked on because they know exactly what they want to do with virtualization. From a developer perspective it doesn't feel like I'm deploying a virtual machine. I would build an application and I'd say "Here is my web app", I'd do File, New, web project, write some code and then I'd go to the XML configuration and say "Could you just run 25 of those, please?" and when it runs, Azure does its magic and it will spin up a number of virtual machines.
But from a developer perspective, it feels like one platform I'm deploying to, it doesn't feel like I'm managing 25 virtual machines or I'm trying to use a tool across that. It feels like an operating system in the Cloud. It simplifies a lot of the stuff for us. We are learning from our own experiences of doing this sort of stuff what the people are doing and just trying to make it easier for developers to get this stuff up and running.
The flip side of that is you can't quite do things just the same way you would have done if you owned a box and were building an ASP.NET application for it. Most of the stuff works, let's say, but not quite everything. There is guidance in the documentation about what will keep working. For example, you can't quite rely on the local file system - if you write something, it might go away, because you have to allow for the fact that you've written to what looks like a file system, but it might go away. If you want it to be permanent, stick it in Azure storage or stick it in SQL Data Services - so there is some guidance that we are working on.