
Neha Narula on the Latest Research in Databases, Transactions, Distributed Programming


1. We are here at Craft Conf 2015 in Budapest. I am sitting here with Neha Narula. Neha, who are you?

Hi. My name is Neha. I am a PhD candidate at MIT, where I am getting a PhD in computer science. I focus on systems, specifically databases in multi-core and distributed contexts. I used to work at Google as a software engineer before I went and got my PhD, and I worked on a bunch of projects there. The last one was a project called Native Client, known as NaCl.

   

2. So, NaCl. What is the basic idea behind it?

The basic idea behind Native Client is that the languages that we have right now for writing web applications are not sufficient for performance. So, JavaScript and HTML are amazing because they are ubiquitous, they can run on any system, they are a standard, they can run on any browser and you can change the code any time you want and it always gets reloaded from a server. So, JavaScript and HTML have enabled a lot of innovation on the internet which could not have happened if we just had stand-alone applications we had to download and install. However, at least up until a certain point, JavaScript was extremely slow and that has gotten a little bit better with things like V8 and the work that Firefox has done. But it used to be that JavaScript was very slow and if you wanted to do something that had really rich, 3D graphics as an example or needed to do a lot of physics rendering, then it was difficult to get that kind of performance out of JavaScript and therefore, the kind of applications you could build and run on the internet were somewhat limited.

So, Native Client was an effort to get native performance through the browser. The idea was that you could write your programs in C or C++ using our toolchain, and they would be recompiled in a way that they would run safely on another computer. Part of the reason why you do not want to just run arbitrary code that you download off a web page is that it could be malicious, and JavaScript and the browser have a lot of built-in security to make sure that your computer can't do something bad. This is compromised all the time, but at a high level, JavaScript and your browser are supposed to provide a kind of container. We wanted to execute outside that container. So, Native Client was about doing this safely and getting really good performance.

   

3. So, how is it different from ActiveX? What makes it safe?

That is a great question. My understanding is that ActiveX is proprietary: you can't just write in C and C++; you have to use their own language. The idea here is that there is a lot of code – a lot of graphics libraries – that has already been written, and we want to be able to use that large amount of code in new applications for the web. That is one of the major differences, I would say. And we had a somewhat new way of ensuring security, which was analyzing the code with our compiler and running it in a sandbox in the browser.

   

4. How is the sandbox implemented? I think I read something about it using segmentation tricks and the CPU's memory model?

Well, I can't say how it is implemented today because I worked on it a long time ago, but when I worked on it, we started out with a sandbox that used ptrace to trace the process and disallow certain system calls. I think that since then they have moved to a better model than ptrace, because ptrace is kind of terrible, but I am not exactly sure what they are doing right now.

   

5. Let's move on to your current work. You are in academia. You are at CSAIL?

The Computer Science and Artificial Intelligence Laboratory.

   

6. So, what are you up to? Is it more computer science, more AI?

I work in systems, in computer science, and my area of focus is really on databases and transactions and how to make them go faster in different contexts. That is what I work on!

Werner: But databases are a solved problem. I just get MySQL and it works.

If only that were the case! Actually, that is something that I say to a lot of people. I have gotten this question a lot. I get the question “Which database should I use?” So, my friends are starting out, they have a start-up, they are working on their own side project, whatever – which database should I use? And I tell everyone: “Use MySQL or Postgres. They are very well tested, they have been through a lot, they have been used in a lot of different applications, so just set one up and go”. The problem is when you start to reach the limits of what MySQL and Postgres can do. They are designed to work best on a single server, a single machine and when you want to have things like fault tolerance for your data, or scalability, or geo-replication, then you need to start thinking about solutions that work well on many machines and that is when things get really complicated.

   

7. What is the problem with replication? Is it just a delay to copy data over?

Replication is really important for fault tolerance, obviously. What is challenging about it is that there are basically two main ways to do replication: asynchronous and synchronous. Asynchronous replication means that in the background, you copy writes over to the backup – you can think of it as having a primary and a backup, and as writes come into the primary, the primary is in charge of ordering them and making sure things happen correctly, and then it just sort of replicates to the backup, offline. The problem with that is that if the primary fails, it might be the case that you lose some data. It has not made it to the backup yet, right?

It is also difficult to negotiate who is the primary between those two – that can be very tricky. Synchronous replication is a lot nicer because you ensure that all servers have that data before you even say “Yes, it’s OK. This query succeeded, this data committed.” But that makes everything slower. It increases the latency, it reduces your throughput. That is the problem with synchronous replication! So, there is a wide swath of work that is all about how do we do this faster. If we do asynchronous replication, how do we recover from that, in case we do lose data? So, there is tons of work in this area.
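The trade-off described here can be sketched in a toy model. The classes and names below are purely illustrative – a hypothetical primary/backup pair, not any real database's replication protocol:

```python
class Replica:
    """A trivial key-value store standing in for one database server."""
    def __init__(self):
        self.data = {}

class ReplicatedStore:
    """Toy primary/backup pair illustrating synchronous vs. asynchronous replication."""
    def __init__(self, synchronous):
        self.primary = Replica()
        self.backup = Replica()
        self.synchronous = synchronous
        self.pending = []  # writes acknowledged but not yet copied to the backup

    def write(self, key, value):
        self.primary.data[key] = value
        if self.synchronous:
            # Synchronous: the backup must hold the write before we acknowledge,
            # which costs an extra round trip on every commit.
            self.backup.data[key] = value
        else:
            # Asynchronous: acknowledge immediately, ship the write later.
            self.pending.append((key, value))
        return "committed"  # what the client sees either way

    def flush(self):
        """Background replication step for the asynchronous case."""
        for key, value in self.pending:
            self.backup.data[key] = value
        self.pending.clear()

async_store = ReplicatedStore(synchronous=False)
async_store.write("balance", 100)  # client is told "committed"...
# ...but if the primary dies before flush() runs, the backup never saw it:
lost = "balance" not in async_store.backup.data  # True

sync_store = ReplicatedStore(synchronous=True)
sync_store.write("balance", 100)
safe = sync_store.backup.data["balance"] == 100  # True, at the cost of latency
```

The asynchronous store is faster per write but can silently lose acknowledged data on failover; the synchronous store never loses an acknowledged write but pays for it in latency and throughput, which is exactly the tension described above.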

   

8. Does this fall into the consistency topic, or is it a different area?

Yes, it is all kind of related. So, I think, ideally, as developers, when we write our programs, we want to think about them running sequentially, one step at a time. I write a function, I call it, it executes; I read a piece of data, then I write it. Those happen in the right order. That is the best way to think about writing your programs. And if that were good enough for performance, we would just run all our programs that way. That would be ideal – we would run everything on single processors, everything would execute one at a time, and life would be great. We have not been able to do that for a very long time. Especially now, getting more performance out of our systems involves parallelization – being able to execute things in parallel – and when that happens, order goes out the window. So, these two things are fundamentally at odds. It is difficult to reason about things happening at the same time, but you need to do things at the same time to get good performance.

   

9. When you say parallelization, do you mean that on a larger scale just doing tasks like indexing in parallel, or at a CPU scale?

That is a great question. I think that this applies at different scales, actually, and that is what is so fascinating. One thing that is really interesting to me is how multi-core systems are related to distributed systems. So, yes, there is definitely running things in parallel on a multi-core system: you literally have several different cores and you want to be able to execute CPU instructions simultaneously on the different cores. There is also getting parallel performance in a distributed system, which means you add another machine and it is able to do work in parallel with the other machines, so you don't have coordination or synchronization slowing things down.

   

10. So, you have a paper called “A multi-core DB is not a distributed system”. How does that relate?

There has been a lot of research in the operating system community which equates a multi-core system to a distributed system and also in the database community. There have been a lot of papers that advocate viewing a system with a lot of cores, a lot of CPUs as a distributed system and running a database node per core. So, assume each database node is actually just single threaded and each one is running on its own core and then use whatever technique you would use in a distributed system to coordinate between them. So, this is tremendously expensive and it is unfortunate, actually, that people think that this is a good idea because our hardware is very carefully designed.

Companies like Intel and AMD put a lot of effort into building really fast processors, fast memory doing all sorts of crazy kinds of caching, and branching and pipelining and all these insane techniques. Really, we have shared memory. Shared memory is pretty fast compared to sending network RPCs to a different processor, a different server. So, what that paper was about was kind of saying “Hey, let’s stop pretending like they are the same thing and actually build real systems and look at what works in a multi-core system versus a distributed system for databases”.

Werner: That is interesting. So, what you are saying is that we should make use of the shared memory and the shared memory capabilities rather than going with the dogma.

Exactly.

   

11. OK. It is interesting that in the industry there is this trend – there are some research operating systems, like Barrelfish a few years ago, that are going "Let's treat each core as a system and just communicate by caches". Do you have an opinion on that?

Yes, I know that paper. Research can be very forward looking, so, definitely, in the future we might have machines that look more like distributed systems, where techniques like that might make sense and we have to start using them. But I do not think we are there yet. So, right now, we should not jump the gun with the systems that we have, the hardware that we have.

   

12. So there are systems, research like Barrelfish that tries to treat each CPU as a system that only communicates via caches. What do you think about that in that context?

Sure. So, there is a lot of research in this area and I know that paper. It is really good work. So, I think that in research, we have to be really forward thinking and it is possible in the future that single server machines will start to look more like distributed systems and cache coherency will be something that we cannot rely on any more and it will become too expensive to implement. But, I do not think we are there yet.

Werner: So, we should thank the good people at Intel for giving us all these powers.

Exactly. Thank you for building really complicated CPUs.

   

13. Well, that is a good point. So what if ten years from now we are all running some simple ARM cores? Will that change the equation, do you think?

Yes, absolutely. We are already seeing some interesting research about using different cores on phones. In the mobile space this is particularly important, and power is really, really important. Often, there will be one beefy, very powerful core that sucks up a lot of energy and then a much, much weaker, low-energy core, and while your phone is on standby, you can put the big, heavy, expensive one to sleep and then only wake the little lightweight one every so often to check and see if there are any notifications or messages or things like that. So, this can be great for power consumption. The problem is that these two cores look very different, and most traditional operating systems are not designed to deal with very heterogeneous hardware. So, how do we handle this? What is the right kind of design to use? What are the right abstractions?

   

14. Another one of your papers or your talks was “Phase Reconciliation for Contended In-memory Transactions”. So, let’s unwrap this title.

Ok. It is a very long title. So, this paper was about building a database on a multi-core system. There had been a lot of research recently about how to build a fast database for multi-core – how do we remove all the different sources of conflict between the different cores in the general database structures. And that work was very effective at what it did. The problem is that oftentimes data access is skewed, so what is actually happening is that you have transactions that conflict on data. For example, if an auction item is extremely popular, a lot of users are bidding on the same item; aggregates are another case; really popular tweets get a lot of retweets and favorites – those are examples of when this happens.

This type of contention can be brutal in a multi-core system. So, we were looking at ways to remove this as a source of conflict, and we built this database that used a technique called phase reconciliation – that is what we named it. The idea behind it was that by looking at what kinds of operations were happening on records inside transactions and using the semantics of those operations, we could execute more of these operations in parallel and get better performance from multiple cores.

Werner: So you distribute them across cores.

Exactly. We distribute the transactions across cores and these are transactions that would normally conflict on the same data and therefore would require locking between all the cores, but we managed to split up the operations so they do not conflict. We do them in phases, essentially.

   

15. So, can you give an example of phases: inserting something, or?

Sure. Increment is by far the simplest example. You could imagine that you had all these transactions that were doing something fairly complicated and at the end, they all incremented the same counter, let's say, keeping track of the number of these transactions you have run. Normally, these transactions would all conflict on that counter, and so, if you were using a concurrency control algorithm like two-phase locking, the transactions would block, waiting to get a lock on that counter. If you were using optimistic concurrency control, then a lot of those transactions would abort because, during validation, they would see that someone else had already written the counter.

So, basically you would get a lot of slow-down in running your transactions. They would all serialize on this counter. So, even though we might have 32 or 64 cores, you are not getting that amount of performance because a lot of your cores are stuck waiting. The idea behind this is that, actually, if it is a counter and we are doing increments, we can do those increments in parallel. We don't really have to serialize on those increments. So, we are using that insight: how do we build a database that takes advantage of this, does things in parallel and gets good performance?
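The counter example can be illustrated with a short sketch. Per-thread counters stand in for per-core state, and summing them at the end stands in for the reconciliation phase – a deliberate simplification of the paper's technique, using Python threads rather than a real database:

```python
import threading

NUM_THREADS = 4
INCREMENTS_PER_THREAD = 10_000

# Contended version: every increment takes the same lock, roughly what
# two-phase locking does to transactions that all touch one hot counter.
shared_counter = 0
lock = threading.Lock()

def contended_worker():
    global shared_counter
    for _ in range(INCREMENTS_PER_THREAD):
        with lock:
            shared_counter += 1

# Split version: each thread increments only its own slot, and the slots
# are reconciled into one total afterwards. Increments commute, so the
# final sum is the same no matter how the threads interleave.
per_thread = [0] * NUM_THREADS

def split_worker(i):
    for _ in range(INCREMENTS_PER_THREAD):
        per_thread[i] += 1

def run_threads(target, args_list):
    threads = [threading.Thread(target=target, args=args) for args in args_list]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run_threads(contended_worker, [()] * NUM_THREADS)
run_threads(split_worker, [(i,) for i in range(NUM_THREADS)])

reconciled = sum(per_thread)  # the "reconciliation" step: 40,000 either way
```

Both versions end at the same total, but the split version never makes one thread wait on another – which is the point: commutative increments do not need to serialize.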

   

16. So, basically, finding the components or the tasks that can run in parallel or are commutative.

Commutative – that is the right word. It is about taking the commutative operations and running them in parallel.

Werner: OK. Commutativity is always good. Everybody in the distributed world likes it.

Yes, commutativity is fantastic. It makes things a lot easier. One of the key things behind this work is marrying the benefit you get from commutativity, pulling out the commutative bits with serializable transactions. So, everything does not have to be commutative. You could still do many non-commutative operations like reads, reading data you just wrote as an example. But, we can sort of extract out the parallelism of the commutative operations in the transactions.

Werner: As I like to say in these interviews, math works.

Math does work, yes.

Werner: Or Maths, for British people, I think. Is it Maths in America or?

It is “Math” in America and “Maths” in the UK.

   

17. Ok. So, the audience can pick and choose. So, we are talking about transactions – are there any other challenges in transactions that you see?

Yes, I think this area could benefit from research, actually. I do not think we have explored all the different ways that we can get better performance. So, I think that right now, a lot of people are settling for systems. Developers are settling for systems that do not really give them really good semantics. So, it can be very difficult to understand the results that they are getting out of their database as an example. So, if you use a database and you do not use the strongest isolation level, if you use a distributed database and it does not support transactions, then it can be actually really hard to write your code because you can have a lot of things happening in parallel, a lot of things happening concurrently and they can interleave in all sorts of ways which is just extremely difficult to reason about. It is like a factorial explosion of interleavings, right?

So, concepts like serializability, which is the idea of transactions in databases – serializable transactions – give us the illusion that the code we write is running one piece at a time, that it is the only thing running in the system, and that is so much easier to reason about as a developer. We do not have to think about concurrency and interleaving. So, I think that this concept is really important. A lot of people think they do not need serializability but end up writing code and finding bugs later that happened because of weird interleavings they did not think about. So, I think that if we can program with strong consistency guarantees, everything is great.

We will have far fewer bugs, and it will be much more reasonable to write code using stateful databases. The problem is it does not always perform well, and so it is a constant battle to try to figure out how we can get better performance out of our systems. At first, we jumped from the SQL to the NoSQL world, which was a big jump to get better performance while giving up everything. It was this idea that the whole database is expensive, so let's just give it all up and use NoSQL. In more recent years, I think, we have been coming back from that, and we are realizing that actually there are a lot of good things about SQL as a language, but also about databases and transactions, and that we should think about how to do them efficiently. So, I think there is going to be a lot more work in this area. There is still a lot more performance to get.
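The "weird interleaving" bug warned about above can be reproduced deterministically in a few lines – a classic lost update, with a barrier forcing the bad interleaving (the balance and amounts are made up for illustration):

```python
import threading

# Two "transactions" read the same balance, add to it, and write back,
# with no isolation between them. A barrier forces both reads to happen
# before either write, so one deposit is always lost.
balance = {"value": 100}
barrier = threading.Barrier(2)

def deposit(amount):
    snapshot = balance["value"]            # read
    barrier.wait()                         # both transactions have now read 100
    balance["value"] = snapshot + amount   # write back a stale value

t1 = threading.Thread(target=deposit, args=(50,))
t2 = threading.Thread(target=deposit, args=(25,))
t1.start(); t2.start()
t1.join(); t2.join()

# Run serially, the deposits would yield 175; here the result is 150 or 125,
# depending on which write lands last - one update has vanished.
lost_update = balance["value"] != 175
```

A serializable database would make the two deposits behave as if they ran one at a time; without that guarantee, the burden of finding interleavings like this falls entirely on the developer.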

Werner: I guess the pendulum is still swinging back.

Yes, the pendulum is swinging. It always does that. It goes from one extreme to the other.

Werner: That is a lesson to learn. Never believe that a pendulum is just going to fly off in one direction.

Yes.

   

18. So, to wrap up: if you could pick one project or one research area to watch at CSAIL, what would it be?

Oh, that is really hard.

Werner: You can get two.

I can get two research areas? So, this is actually pretty difficult because as a graduate student, I have been pretty heads-down recently. So, I haven’t been looking at other people’s research so much. One area that really fascinates me is machine learning and just all of the new techniques that are coming up in that area with natural language processing, with understanding the deluge of data that we are all dealing with. And also, machine learning and systems. So, combining those two areas because a lot of machine learning algorithms get better the more data you throw at them. But, it is difficult to build systems that can operate efficiently on large amounts of data. And so, how do we build systems that are optimized for machine learning algorithms? I think that is going to be pretty exciting.

   

19. It is definitely an interesting area. So, if we want to read all your papers, where could people go?

I have a website and it is just my first name and my last name – http://nehanaru.la – that is the name of my website. You can also find it on the MIT CSAIL PDOS website – PDOS is the name of my group at MIT and there is a link to it there as well. I am also on Twitter at @neha.

Werner: OK. And just to get people to go there, I think that we can also find out about your adventures in Iceland.

Yes.

Werner: So we are not going to tell them about it. They have to go to the website.

Yes, you can hear about an art project we did in Iceland.

Werner: So everybody go to Neha’s web site. Thank you, Neha.

Thank you.

Jun 08, 2015
