The elevator pitch for Rust from Mozilla’s perspective is that in order to compete in the marketplace of web browsers, in order to innovate with new web browser technology, we have to be performant. There is no way we can survive and keep a solid user base without the kind of performance that you get right now from C and C++ development. But at the same time it’s too expensive to keep developing using C and C++, because it really hinders the speed at which you can innovate new techniques, especially for parallel computation. Everyone recognizes that the future is in making better use of the parallel processing that’s available on commodity hardware nowadays, and the hard part is that there are a lot of great ideas at Mozilla, even for rendering standard web pages, things like doing CSS selector matching in parallel, great ideas that are really hard to implement properly in C and C++. And so the goal with Rust is to provide a better infrastructure for developing the next-generation web browser that Mozilla can then deploy in the future for its users.
2. Essentially, it’s defined as a sane C and C++?
That’s an interesting way of putting it.
Werner: Just to be sure: I said that, so you don’t get any flak from C++ users.
No, we are very interested in appealing to the C/C++ community; we are all too aware of how important it is to make sure that we don’t scare off those people (which includes me) too much. I am a C/C++ developer too, so that’s important. But at the same time, to say it’s a sane C/C++: if you’re going to make a new language, there is also this desire to not just fix the warts but revisit some of the basic assumptions, and one of the goals for Rust was to say, let’s find all of the interesting programming language ideas that are old and were proven back when they were invented 15-20 years ago, and put them together in a sane way in a single language. That, rather than developing a new web browser, was the driving force of the original design when Graydon Hoare, the inventor of Rust, was first working on it.
So, I wouldn’t just say it’s a sane C/C++; it’s more something that is trying to take all the good things from the functional programming community, and also the good things from the communities using the actor model as their basis for computation, so the Erlang language, its runtime and model, was a big influence on some of the design choices that were made in Rust. And also C/C++, in the sense that there were some good ideas there that we wanted to keep. A really interesting one that’s taken me years to appreciate, because I didn’t grow up on C/C++ (or I didn’t grow up on C++, I should just say), is the destructor semantics, actually having Resource Acquisition Is Initialization (RAII). That kind of model is something that was really pioneered in C++, and we decided that was going to be part of Rust’s model as well. So, we’re happy to take inspiration from places like that, too; we’re trying to find a good mix of a lot of great ideas. And another fun one that is just a personal favorite of mine: we also have a macro system so that people can do syntactic extensions, but unlike the C macro system, which is just a very simple-minded ...
Werner: Text expander, essentially.
Exactly. This is a hygienic macro system in the sense of Scheme, with a little more power to it in terms of being flexible about parsing more interesting grammars than just parenthesis languages.
Werner: The problem is there are so many interesting things, I'm trying to figure out what to start on first, you intrigued me with the macro system.
But I also should say there are a number of things in Rust that are in development. So, let me put it this way: here at CodeMesh one of the ideas is that languages are undergoing a lot of innovation, and even now you see a lot of new languages coming out, and I feel that Rust itself, just the one language, is undergoing a lot of churn. We’re trying to settle in on a good place for 1.0, a place where we can say you can write code that’s backwards compatible with this. Because of that we are taking certain features and marking them as feature-gated: if you want to use such a feature, you have to opt in to using it, because we don’t guarantee it’s going to have the same semantics in the future. So we don’t really get rid of it; we just don’t want someone to think they’re writing backwards-compatible code when they aren’t. So I would rather stick to the things that are on the backwards-compatible list for 1.0, and unfortunately macros are not one of those things.
Werner: Macros are ...
They are going to be in the language. Syntax extensions are going to be in the language in the sense that you will definitely be using the macros that are provided by the standard library for certain kinds of coding patterns. For example, our print-line routine is a macro. With printf in C, you get a template string and then a series of arguments, and woe is you if you didn’t actually have a match-up between the parameters in the template string and the arguments that you fed in; that’s your problem. Rust is completely different: the print routine that is analogous to printf is actually a macro that analyzes the template string you provide and makes sure there is a correspondence between the parameters within the template string and the arguments you pass, not only in terms of the number matching up, but also in terms of types. You have to provide, in each slot, an argument whose type is compatible with the type specified for that part of the format string. So there are things like that that are definitely going to be in the language; there is no way that’s going away. We are making heavy use of macro technology and that’s going to stay. But user-defined macros are the thing I don’t want to go on talking too much about, because they are not going to be part of the official backwards-compatible feature set for 1.0, in terms of the syntax used for defining them and what their uses are.
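A minimal sketch of that format checking in modern Rust syntax (the interview predates 1.0, so details differ):

```rust
fn main() {
    let name = "Rust";
    let count = 3;
    // The format string is analyzed at compile time: the number of
    // parameters and their trait requirements must match the arguments.
    println!("{} ships with {} examples", name, count);

    // A mismatch is a compile-time error, not runtime misbehavior:
    // println!("{} ships with {} examples", name); // error: missing argument
}
```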
Yes. So, you are going to keep going down this path, I guess.
Werner: Just a quick overview.
Yes, I know. The answer is pretty easy: in the macro definition that you write as a user, you have to specify the grammatical class for the arguments that are being fed into your macro, and there are different cases that you can define for a certain macro keyword. Every macro keyword is given its own special unique token that is globally understood in the compiler; it might have scope in the end. The important thing is that the parser can tell “here is a macro, I am going to go look it up and find out what the possible non-terminals are that I am going to look for inside of the macro”, and that’s how we get the ability to let the parser make more progress, in terms of knowing whether it’s supposed to look for an expression or an identifier or a type expression, for example. So, yes, that is part of the macro system. As for the technique that we’re using for doing it, currently the statement is that it’s an Earley parser, an old general context-free grammar parsing technique.
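A small sketch of what declaring those grammatical classes looks like, in the macro_rules! syntax that later stabilized (the pre-1.0 form discussed here differed):

```rust
// Each macro argument declares its grammatical class (a "fragment
// specifier"): `expr` for expressions, `ident` for identifiers, `ty` for
// types, and so on. That is what lets the parser know which non-terminal
// to look for inside the invocation.
macro_rules! max {
    ($a:expr, $b:expr) => {
        if $a > $b { $a } else { $b }
    };
}

fn main() {
    println!("{}", max!(2 + 3, 4)); // prints 5
}
```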
Werner: Sort of like PEGs.
No, PEGs aren’t the same, because they don’t handle arbitrary context-free grammars; they are an entirely different kind of grammar specification. This is more like, I forget the names, there are three different classical context-free parsing algorithms, and my memory is that they are all super-linear; they are all not that great, and they can’t be in the general case. Earley is particularly interesting because I believe it is cubic in the general case, but if you give it an unambiguous grammar it is then quadratic. But there might be other parsing techniques that also have that property. Anyway, the point is that in name it’s supposed to be an Earley parser, but I haven’t looked at that code, and I’ve heard it said that it’s not actually an Earley parser, that it doesn’t actually fit those asymptotic time guarantees in general. So, I don’t want to say anything one way or another because I am just not sure; it could very well be Earley or it might not be. But we definitely have things in the parser that know how to deal with these structures.
That’s a great question. It’s an interesting problem in Rust because we allow user-defined destructors. As an example of the kind of decision that was made here: we adopted Erlang’s task model, where you spawn off parallel lightweight tasks that will then be multiplexed onto whatever hardware you happen to have, and the important thing is that when a task fails, it’s supposed to bring itself down properly. That’s where destructor semantics come in: we have stack allocation just like C/C++, so when you hit a failure somewhere deep within the stack of your task, it needs to actually unwind the stack of the task properly and run destructors properly. That’s fine, and it’s a model for encoding a lot of interesting patterns: mutexes, for instance, that you take by just declaring that you’re grabbing the mutex, putting it on the stack, and either you return normally or you have a failure and have to unwind; either way you naturally unwind and release the mutex when you run its destructor. There is a subtle detail here about what you do about a failure within a destructor, and there are a couple of different stories about that which we haven’t finalized; there is one thing specified in the language definition that hasn’t been implemented yet, so it’s not clear whether we will actually go through with that or not. But the essential idea is to come up with some sane semantics that involves doing the best job we can at a graceful failure and running user-defined destructors as necessary.
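A minimal sketch of that unwind-and-release pattern, using modern Rust’s std::sync::Mutex rather than the pre-1.0 task API described here:

```rust
use std::sync::Mutex;

fn main() {
    let counter = Mutex::new(0);

    let result = std::panic::catch_unwind(|| {
        let mut guard = counter.lock().unwrap(); // lock acquired on the stack
        *guard += 1;
        panic!("failure deep in the task");
        // Unwinding still runs `guard`'s destructor, releasing the lock.
    });
    assert!(result.is_err());

    // The lock was released; it is now "poisoned" to signal the panic.
    assert!(counter.lock().is_err());
}
```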
Werner: These things are destructors, which means they are deterministic, unlike finalizers.
That’s right. That is actually another interesting topic for me; I’m a GC guy and I really grew up on some of this stuff. The current situation in the current implementation of Rust is that you can have finalizers on automatically managed data. I didn’t mention this before, so I should probably mention it right now: Rust has a nice mix of different kinds of objects that you can create. You can do stack-allocated objects, like I already mentioned, and you can also allocate objects on what we call the exchange heap, where an object can be transferred from one task to another. For those objects we require an ownership semantics: when you allocate an object, it belongs to that task, and there is a known owning pointer for that object. If you transfer that object to another task, the owning task has to send it along and relinquish its owning pointer, saying “I no longer have any references”, because the owning pointer is the one that defines the scope for any other possible pointers to that object, in the sense that they all have to be within that lifetime, the dynamic extent of that owning pointer. So, once you transfer an object on the exchange heap over to some task, you get a static guarantee that the task that had owned it no longer has any outstanding references to it.
So, that’s ownership semantics for the exchange heap. The obvious downside of this is that you can only have one owning pointer, and certain kinds of algorithms are difficult, if not impossible, to encode using that style of programming. So, Rust offers automatic memory management as well; however, the crucial thing here, very much like what Erlang does, is that the automatically managed memory is task-local: you cannot transfer things that are allocated on the garbage-collected heap from task to task. What this means is that we don’t need to have a concurrent garbage collector; we don’t need a garbage collector that’s in charge of collecting from all the heaps for all the tasks. Instead, each task has its own garbage collector, which means that if you write a piece of code that was highly optimized and designed not to incur any GC pauses at all, by not allocating any objects in the GC heap, that task will never invoke the garbage collector: no more pauses for you, at least not those kinds of pauses. I can’t speak for whether your destructors will run in finite time, but that’s your problem and you have control over that. Meanwhile, someone who writes code for a different task that doesn’t have the same latency requirements can choose to use GC’d objects, and for them, they will get occasional interruptions from the GC on their task during the time it’s scheduled, and thus it shouldn’t interrupt the flow of other tasks.
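A sketch of the ownership-transfer idea in the Rust that eventually stabilized: a heap allocation is moved to another thread through a channel, and the compiler statically rejects any further use by the sender.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    let data = Box::new(vec![1, 2, 3]); // owned heap allocation
    tx.send(data).unwrap();             // ownership moves out of this thread
    // println!("{:?}", data);          // compile error: value moved

    let handle = thread::spawn(move || {
        let received = rx.recv().unwrap(); // the receiving thread now owns it
        println!("received {:?}", received);
    });
    handle.join().unwrap();
}
```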
So, to bring it back to the original question, which was about finalizers: right now, I have to admit, the current Rust implementation does not have a tracing garbage collector. What it has instead is reference counting. That is not part of the language definition; it’s just an artifact of the implementation strategy chosen at the time for those features. As part of that, the reference-counted objects can be things with destructors, because when the reference count goes to zero, you run the destructor. For cyclic data structures within a task, we run their destructors when the task itself dies: we go through the remaining things that are still on the GC heap and run all of their destructors. That’s the current semantics that one sees with the current implementation of GC pointers. Which brings up another thing I mentioned before, about features being gated: macros are one of them, and another thing that is feature-gated is GC objects, because we know the semantics are going to change for those, and we know the syntax for how we work with them is going to change, so you have to opt in to use those as well.
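A small illustration of deterministic destruction under reference counting, using the Rc type from the standard library that later stabilized (the feature-gated GC pointers described here were eventually removed):

```rust
use std::rc::Rc;

struct Resource(&'static str);

impl Drop for Resource {
    fn drop(&mut self) {
        println!("destructor ran for {}", self.0);
    }
}

fn main() {
    let a = Rc::new(Resource("shared"));
    let b = Rc::clone(&a);
    drop(a); // count 2 -> 1: destructor does not run yet
    drop(b); // count 1 -> 0: destructor runs, deterministically
}
```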
My hope and plan for where we will go for finalization is that I would like to see us adopt something more like reference queues, or what’s known as guardians in the Lisp and Scheme community, instead of having asynchronous finalization that happens in an unpredictable manner, which means you have to deal with the potential for interleavings and concurrent behavior between you and the finalizers of the things that are being garbage collected. We’re trying to avoid that entirely; we’re trying to have clean task separation. So instead of having asynchronous finalizers, what I’d like to see us do is have these reference queues or guardians, where when you allocate an object you register it on some reference queue somewhere, and then the task implementer is in charge of actually polling their reference queues at whatever rate they choose for their application. You can have multiple reference queues; that way you can put objects that belong to different kinds of categories in different reference queues and thus poll them with different frequencies. This is a really important feature, because there are certain kinds of algorithms where it’s really important that you run the destructors quickly, and there are other kinds where the destructors might take a long time and it’s not as important that you finalize them immediately, so separating things this way is a really important idea.
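A hypothetical sketch of that polling model using channels; the names here are illustrative only, since nothing like this landed in the standard library:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// Illustrative only: an object registered on a "reference queue". Its
// destructor enqueues a notification instead of doing the cleanup itself.
struct Guarded {
    id: u32,
    queue: Sender<u32>,
}

impl Drop for Guarded {
    fn drop(&mut self) {
        let _ = self.queue.send(self.id);
    }
}

// The owning task polls the queue whenever, and as often as, it chooses.
fn poll_queue(rx: &Receiver<u32>) {
    for id in rx.try_iter() {
        println!("cleaning up after object {}", id);
    }
}

fn main() {
    let (tx, rx) = channel();
    {
        let _a = Guarded { id: 1, queue: tx.clone() };
        let _b = Guarded { id: 2, queue: tx };
    } // both dropped here: notifications are queued, not acted on yet

    poll_queue(&rx); // cleanup runs only when the owner polls
}
```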
I’ve seen it used pretty successfully in Scheme systems myself. It’s one of those things where I am not as familiar with how it’s used in Java, in the reference-queue system there, but my understanding is you can get something like this going in Java; it just isn’t always done, because Java has finalizers anyway. So, my hope is that we don’t have to work with finalizers, just because I think they are too much of a burden for the programmer to try to work with and understand. It would take us back to the problem I was talking about earlier, about how it’s too difficult to innovate on and maintain a system with certain kinds of features, and I think asynchronous finalizers are an example of something that’s really tricky to work with. It should be hard to opt in to using them; there should be better, easier options first, and I think guardians or reference queues would be a good option for this. But we haven’t implemented it yet, so we don’t know what the real-world story is about that yet.
The model as originally envisioned was to have M:N threads. The original goal was to have really lightweight tasks, to the point of what Erlang provides, with segmented stacks that grow on demand: you make a really small stack when you first create a task and have it grow its stack as it needs to. But there are a lot of different people in the community, and a number of them have pointed out that the overheads we incur with this strategy seem bad enough that we should be considering other things. So, we had a bit of debate about this amongst our team, and at this point there have been a couple of shifts. One is that we’ve abandoned segmented stacks. I don’t remember the exact API that we have instead now for stacks, but I believe it is something along the lines of: you have to say up front how big a stack you want, and when you hit the end of it you’ll get a shutdown failure that is still clean, in the sense that you’ll get a controlled shutdown, not just a segmentation fault that aborts everything.
The goal is to have something that cleanly shuts down in the case of stack overflow, but really what we would like is to not have stack overflow at all, or rather to give the user the ability to say “here is what I expect the stacks to be like”. We already have some analyses in place that try to ensure, for example, that if you call out to a C function in native code, you do it from a function that offers enough stack for that to be a reasonable thing to do. So that’s one way things have shifted a little bit away from the idealized model I mentioned earlier. The other way in which things are shifting a bit is that there is also a push to offer 1:1 threading: people who really want the model to be that you fire off a task and it gets a dedicated OS thread, or some way of expressing that it really is an isolated thing and you won’t get scheduling within the system itself.
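A sketch of the “say up front how big a stack you want” model in the 1:1 thread API that Rust ended up stabilizing:

```rust
use std::thread;

fn main() {
    let handle = thread::Builder::new()
        .name("worker".into())
        .stack_size(64 * 1024) // request a 64 KiB stack up front
        .spawn(|| {
            // Exceeding this stack trips a guard page and fails loudly,
            // rather than silently corrupting memory.
            println!("running on a small, fixed-size stack");
        })
        .unwrap();
    handle.join().unwrap();
}
```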
The plan there is basically: we still want M:N threading. We think the reality is that users don’t want to have to think about how many cores are actually on their system, at least that’s not my mental model for how to do things, and there are a lot of situations where it isn’t even realistic to try to make that kind of decision statically; you’ve got dynamic behaviors in the system that you have to respond to dynamically in terms of resource consumption. So because of that, my take on this, and I think much of the team’s take and the community’s take, is that we want M:N, but we also want 1:1. So the goal is to actually support both well; we would like our standard library to be deployable in a 1:1 setting and in an M:N setting. That’s the goal. I don’t know exactly how to get there; there are some people who are thinking very hard about it, and I am hoping they will come to some good answers about how to get there. So my answer is sort of a non-answer in terms of whether it’s M:N or 1:1: the plan is to support both. It would be nice if the M:N part were layered on top of the 1:1 part, if we can make that work nicely, but I don’t know what the reality is going to be there.
Werner: I guess not committing to certain decisions is important for you, because you need the flexibility for your programs to run on embedded systems where you can’t tolerate high latencies, but also on desktop machines.
It’s very true. Even for Mozilla itself, alone, we care very much about deployment on desktops but also on mobile devices; we care very much about the mobile space, having Firefox on Android and Firefox OS, which has been recently released. The next-generation browser project, Servo (its code name or official name, I don’t know how to classify it at this point), is also planned to be deployed on both desktop and mobile targets. So, for Mozilla, you are absolutely right, we care about handling both of these cases as best we can and having the flexibility to make different choices. But there are also people doing things with Rust that Mozilla itself isn’t actively pursuing internally, things like writing OS kernels, or writing a whole OS. We would like to think that Rust can work in these contexts, and I don’t know any immediate reasons why it couldn’t. It’s just like C in the sense that, if you do it the right way, it will compile to some object code; you can tell it “I don’t want the runtime system, I am not going to use your garbage collector or the other facilities your runtime system offers”, and you can get object code out of that. Because of that, people are deploying Rust experimentally in a lot of very interesting contexts, which is all the more reason to have flexibility about these decisions, about what kind of threading model it’s going to have.
We have traits and we have structs. As for our model here, we might have classes in the future; there are discussions about whether we made the right decisions in terms of just the names of certain kinds of patterns. But the important thing is that, by my definition, object-oriented programming is all about inheritance and dynamic dispatch, having virtual method tables that you then call methods on.
Werner: So, polymorphism, essentially.
There are different kinds of polymorphism, that’s the question here. In the ML or Haskell situation, let’s just say ML at least, the polymorphism would be parametric polymorphism, and Rust has that kind of polymorphism as well, where you can have type parameters on your methods or on your functions.
Werner: So, that’s generics, essentially?
Exactly, generics. And the point here is that you get static dispatch on generics, at least with our implementation strategy; we’re using the C++ style of implementation for generics, not in the sense of how we typecheck them, but in terms of how we implement them: you just expand out into the specialized code and let the compiler go haywire, trying to use as much knowledge as it can about the particular types that you instantiate with. So, there is some code bloat associated with that; that’s life. It’s not like C++ templates in terms of the type system model, though. In particular, if you define a trait or some parameterized thing that’s got a type parameter, we’ve been very careful to ensure that for any method you call on that type parameter, you have to say a bound up front that basically says “here are the methods that I am allowed to call on this type parameter”. I believe the analogous idea is called Concepts; it’s a relatively new thing in C++.
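A minimal sketch of a bound declared up front, with static dispatch via monomorphization (modern Rust syntax):

```rust
// The `T: Ord + Copy` bound says up front which operations the body may
// use on `T`; misuse is caught at the definition, not at instantiation.
fn largest<T: Ord + Copy>(items: &[T]) -> T {
    let mut best = items[0];
    for &item in &items[1..] {
        if item > best {
            best = item; // `>` is permitted only because of the `Ord` bound
        }
    }
    best
}

fn main() {
    // Each call expands into specialized, statically dispatched code.
    println!("{}", largest(&[3, 7, 2]));       // instantiated for i32
    println!("{}", largest(&['a', 'z', 'm'])); // instantiated for char
}
```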
Werner: There were supposed to be in the next version, but they were kicked out because they were too complex.
I see. By next version you mean C++ 14?
Werner: No, C++ 11 I think, I don’t know about C++ 14.
It might be in C++14, then. Java has them; the idea is called F-bounded polymorphism. So we have that: you get up-front detection of whether your template or trait or parameterized function makes any sense, as opposed to C++, where that discovery might be far removed from some random instantiation that you used down the line. So, it’s really important to me to distinguish between the implementation strategy in the compiler and what the type system is actually going to check for you. So, we have static dispatch, static parametricity, in terms of writing code that’s generic over a type, but that doesn’t give you enough expressiveness. In particular, with something that’s a static method over a list of T, where T is the parameter, you have now forced that list to have the same T for all of the entries in the array, or list, or whatever. That’s the usual interpretation of this; it’s different in OO languages where you have a subtype relationship and might have a subtype of T, but in the simple model ML has, if you have a list of T you are forcing everything to be a T. You can express that in Rust, but you can also, very importantly, express the idea of “I have a list of pointers to some trait definition”. Let’s say, what’s a good example of a common trait definition?
Werner: Like an iterable?
Yes, or let’s say observable, something where you can fire off observer responses or something like that. That’s a great example, actually: something where you tell the people that are registered with you as observers that some event happened to you. If you tried to encode that using a static-dispatch generic system, you’d be stuck with a list of observables where they are all supposed to be the same instance of the observable type, some O that implements observable, but it would all be the same O. The important thing here is that instead we can have pointers to observable, where there is actually a vtable that you can use to do the dispatch. That is really important because, yes, it’s dynamic dispatch, so it’s slower than static dispatch, but the whole point is that it’s more expressive: you get to mix and match the different kinds of objects that all implement the observer trait. So, it’s got OO in it. I like to focus on the FP-style stuff myself, because I think of myself as a functional programmer first, so I really enjoy things like its expression-oriented syntax. It’s got specialized syntax for things like record update, for replacing a single field: if you have a structure with many fields and you want to make a new one where you change just one of the fields, there is a specialized syntax for that which doesn’t involve rewriting all of the same members all over again; you just say “copy the rest from the other guy”. That’s something that’s pretty standard in certain functional programming languages, and it was nice to see it here. Even simple examples: if-expressions are actual expressions in Rust, so you don’t have to resort to the ternary question-mark-colon operator; it’s an actual, proper if.
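A sketch of those three points in the Rust that stabilized: a heterogeneous list of observers dispatched through a vtable, the record-update syntax, and if as an expression.

```rust
trait Observer {
    fn notify(&self, event: &str);
}

struct Logger;
struct Counter(u32);

impl Observer for Logger {
    fn notify(&self, event: &str) {
        println!("log: {}", event);
    }
}

impl Observer for Counter {
    fn notify(&self, event: &str) {
        println!("counter {} saw: {}", self.0, event);
    }
}

struct Config {
    threads: u32,
    verbose: bool,
}

fn main() {
    // Mixed concrete types behind one trait: dynamic dispatch via vtable.
    let observers: Vec<Box<dyn Observer>> =
        vec![Box::new(Logger), Box::new(Counter(7))];
    for o in &observers {
        o.notify("something happened");
    }

    // Record-update syntax: change one field, copy the rest.
    let base = Config { threads: 4, verbose: false };
    let noisy = Config { verbose: true, ..base };

    // `if` is an expression; no ternary operator needed.
    let label = if noisy.verbose { "verbose" } else { "quiet" };
    println!("{} with {} threads", label, noisy.threads);
}
```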
Werner: It’s how it’s supposed to be.
That’s right.
Werner: Take that, Java. So, I guess we could say, let’s call it functional, since we’re at CodeMesh.
The important thing, though, is that it really does try to draw from all these different kinds of languages. When I first learnt Scheme, the claim they made was that this language is not just functional; it’s totally impure, it’s got impurity all over the place. You can write imperative code in it, you can write functional programs in it, you can write object-oriented programs in it. Some variants of Scheme do a better job than others at actually providing good libraries for object-oriented programming, but it does exist. So it would be a mistake to call Scheme just a functional programming language, and I would like to think that Rust is in a similar sweet spot, taking a lot of nice inspiration from these different paradigms and giving you access to them.
7. So it’s not dogmatic OOP, not dogmatic functional, it’s the best of both worlds?
Yes, I think so. There are people who wish there were a rustic way to write things, but I certainly haven’t internalized that myself, and I am one of the developers.
Werner: I guess you have to wait for 1.0.
That’s right. Once 1.0 is out, maybe then I will actually know what rustic means, but for now I am still in the mode of playing with all kinds of different paradigms and seeing what they look like in the language.
The website is rust-lang.org, and there is a GitHub repo with the source that you can grab and build yourself; we are probably going to start having nightly builds available soon. The current model is that you grab the source code and run a brilliant little configure script that does the magic of downloading a snapshot. I didn’t make this clear before, and otherwise this isn’t going to make sense: Rust is self-hosting, built entirely using Rust right now. The compiler is written in Rust, and thus it needs a version of Rust in order to compile itself, so we need some way of getting the whole thing bootstrapped, and the way we do that is by having a snapshot server from which you download snapshots. But you don’t download them by hand; the beautiful thing here is that the configure script is actually smart enough to inspect the state of the repository and say “I know which version of the snapshot I need for this version of Rust”, and if you go back in time with Git to a previous version, it will grab an earlier snapshot as necessary. So, you can clone the repository (I believe there are instructions for it on the website), run configure and make on your system, and then you’ve got a Rust compiler.
There is a chat room with people that are really eager to help. I am actually super-impressed by our community, and I think that’s in part because Graydon and the rest of the team worked hard to establish a culture there that is respectful, with rules in place to try to limit the number of flame wars that arise. Because when you are working on a language, people are going to come in and complain about syntax, or say “wait a minute, why should I use this instead of Scala, or instead of Go”, and these conversations are permitted, but as soon as one starts turning into a flame war, the moderators come in and say “cool it”. I think things like that, and a general atmosphere in the community of really wanting to succeed, help people write really good Rust code and understand the interesting type error messages they see. Because of the ownership semantics that I talked about earlier, all kinds of interesting type system problems come up, and the error messages are, I think, pretty good considering how advanced the type system is; I’ve certainly seen far worse error messages from compilers in my time, but they are still daunting at times, even for experienced developers.
Werner: Well, I think our audience is not easily daunted, and they are all going to check out Rust, but only if they know how to behave. Thank you, Felix.
Thank you, Werner.