Hi, Ralph. I am Paul Fremantle. I started a company called WSO2 about ten years ago and did a lot of open source middleware. Over the last few years I got really interested in how middleware intersects with the Internet of Things, and I started doing a PhD about a year and a half ago. So my interest is how to add scalable security to the Internet of Things, especially the backend clouds that support devices and the connections between those.
I think there is a huge amount of buzz and hype, of course, and I think that is going to play out and we are going to see a rump of things. I think you are right, of course: there has been SCADA, for instance (supervisory control and data acquisition is an old-fashioned term), and then M2M, and now IoT, so this is not, strictly speaking, new. I think there are three factors playing into this. One is the reduction of costs in devices, which has changed the economics of this. I think the second is that previously a lot of systems worked on private networks, in private ways, and now people are opening up those systems to the internet and building much more publicly available and more flexible open systems. And the third one is, I think, that there is a new movement of open source hardware and hackable hardware, with Arduino and all the things that have followed since Arduino, that has created a different level of entry into creating hardware, and of course 3D printing and new manufacturing technologies are also changing that. So, instead of needing a large enterprise that knows how to manufacture stuff, you can prototype something very simply, you can send it off to China and have a sample built for you in a week and couriered back to you. So there is a different barrier to entry, which is creating a new creative atmosphere and a whole lot of new projects and new opportunities. So, I think that confluence is creating a different environment in which there is something new happening, even though it looks similar to some stuff that’s happened before.
Ralph: So, if some kind of electronic device is now say more than $300, there is no reason why it shouldn’t be connected anymore, not from a cost perspective.
If it’s more than $3 there is no reason for it not to be connected anymore. There are small chips that cost $3 and have Wi-Fi and full connectivity on them that you can embed into your device. And that’s $3 at the retail price, that’s what you can buy this for on eBay, so the high-volume OEM price is much lower. So the change in economics is very important, yes.
I do feel that very strongly that the hype around IoT is about wearables, it’s about fitness trackers, it’s about watches, I am not personally into those, I have a smartwatch and I ended up not wearing it. I see a huge growth in industrial, in governmental, in public infrastructure. For example, one of my customers at WSO2 is a company called Trimble, they do GPS navigation, they have devices in tractors, in all sorts of systems. They envisage a world where every plant will have its own custom watering based on monitoring that device connected through the internet into a central system because someone at Princeton university has found out that if you water a plant precisely throughout its life based on that plant, you get 20% more yield out of it. So if the cost of the device to water it reduces enough, you can do that. Of course, they obviously don’t live in England, where I am, because here we have no control over how much watering happens to plants, it just happens anyway, in the US they can do this. But there’s an industrial agricultural example, there are people fitting IoT devices to cows to monitor when they are on heat because there is a real value proposition around that.
So, I personally see while the hype is very much about personals and wearables and such, I think it’s those industrial cases where the cost differential is suddenly going to make it massively more cost effective and build real business cases about monitoring and connecting plants, machinery, tools. Another example is a company Hilti, who are one of the top tool makers in the world, they are making smart tools. And one of the interesting things there, that you are seeing in many industries, is a shift that IoT is enabling from selling a device or selling a system to leasing it. So changing from a capital expense to an operating expense. And another example there is a company in Dubai called Pacific Controls who manage all sorts of buildings, security, all around the whole of Dubai. And for example they are talking about changing the business model from selling an elevator to renting it based on how much it goes up and down, so usage based tracking, usage based monitoring and IoT is enabling this shift from a sale to a rental model.
So Hilti, and other tool companies, are shifting from selling a jackhammer to setting up a rental counter on the construction site, where the construction workers come and check out a jackhammer and then check it back in. Before, if you sold a jackhammer to a construction company and it walked off their construction site, then that was great, you’d sell another jackhammer; now, if it goes missing, you pay for that. So these kinds of economics are changing very much. It’s similar to how software people have seen the change from buying hardware to using the cloud, which has been based on a technological shift; it’s exactly the same thing with IoT that we are seeing in industrial cases.
Ralph: It’s more like pay as you use and pay how you use?
Exactly. With the sensors and the actuators inside these devices allowing that to happen and monitoring where the devices are and how much they are being used, are they working well, switching them off if you stop paying, all this kind of stuff. Yes, exactly.
Ralph: How important is sharing data in the field of Internet of Things? Because, in my opinion, it all gets smart when devices start to communicate and exchange information.
Yes, Ralph, this is exactly it. The thing is, there are two levels. The first level is the sharing of data from a device to the rest of the world. A couple of days ago I heard that in Moscow they’ve got a system where they’ve embedded a small device into the curb and it can sense if there is a car parked next to it, and it transmits that through a new protocol called Weightless, which has a very low power requirement; it’s only a hundred baud, but at about five kilometers distance. So it’s a new protocol really designed for these outdoor remote cases, and so this device has a battery life of years and it sits there, built into the pavement, and now you can find out if there is a car parking space. You don’t have to drive round and round endlessly; there is a little app that says ah, there is a free space, three streets down, round the corner.
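As a rough back-of-the-envelope check on what "a hundred baud" means for a sensor like this, the sketch below estimates how long one status message spends on the air. It assumes one bit per symbol (so roughly 100 bits per second) and ignores preambles, headers and retries; the 12-byte message size is invented for illustration and is not taken from the Weightless specification.

```python
def airtime_seconds(payload_bytes: int, bits_per_second: float = 100.0) -> float:
    """Time to transmit a payload, ignoring preambles, headers and retries."""
    return payload_bytes * 8 / bits_per_second


# A hypothetical 12-byte report (sensor id, occupied flag, battery level).
print(airtime_seconds(12))  # 0.96
```

Under these assumptions a tiny report takes about a second to send, which suits a device that only needs to say "occupied" or "free" a few times a day.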
So that sharing of the data is amazing and enables things, but then there is a second level. And that second level is where the good and the bad stuff happens, and that’s when you correlate data from multiple devices. So if you can correlate that parking data with traffic data, now you can create a navigation of the quickest way to the parking space. That’s a bad example, but you could imagine, for example, if every car had some kind of smart technology in it and you also have pollution sensors, you can solve problems about how to reduce pollution in cities, which is a big problem here in London. So you can see that as you get these multiple data streams and you share data and then you combine them, then you get amazing power. But you also get amazing problems, because when you start to share that data, that’s when you can de-anonymize data, that’s when you can start to pull together things that infringe on people’s personal liberties, on their situation.
We have this problem here in the UK that they actually installed scanners in the dust bins, they built smart bins in London and they installed scanners that are picking up people’s MAC addresses and using that to track people and then if you take that information and you put it together with some other information, people’s check-ins on Foursquare or something like that, then you might be able to say “I figured out that’s Ralph’s MAC address, now I know exactly where Ralph is” and then I look at the car emissions and you could pull all this data together and start to really infringe on people’s liberty, on their privacy, and enable all kinds of criminal acts, it’s not just government snooping on you, it’s criminals, it’s all kinds of situations. So I think there are some really scary scenarios there as well as you get that shared data and you start putting things together.
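The kind of correlation described here can be sketched in a few lines. This is a minimal illustration, not a description of any real system: the MAC addresses, place names, times and the five-minute matching window are all made up, and the approach (intersecting the devices seen near each public check-in) is just one simple way such a linkage could work.

```python
from datetime import datetime, timedelta

# Hypothetical "anonymous" sightings from smart bins: (mac, location, time).
sightings = [
    ("aa:bb:cc:00:00:01", "Cheapside", datetime(2014, 3, 1, 9, 0)),
    ("aa:bb:cc:00:00:02", "Cheapside", datetime(2014, 3, 1, 9, 1)),
    ("aa:bb:cc:00:00:01", "Moorgate", datetime(2014, 3, 1, 13, 30)),
    ("aa:bb:cc:00:00:02", "Bank", datetime(2014, 3, 1, 13, 31)),
]

# Hypothetical public check-ins: (name, location, time).
checkins = [
    ("Ralph", "Cheapside", datetime(2014, 3, 1, 9, 2)),
    ("Ralph", "Moorgate", datetime(2014, 3, 1, 13, 29)),
]


def candidate_macs(name, window=timedelta(minutes=5)):
    """Return the MACs seen near every one of this person's check-ins."""
    candidates = None
    for who, place, when in checkins:
        if who != name:
            continue
        # Devices seen at the same place within the time window.
        nearby = {mac for mac, loc, seen in sightings
                  if loc == place and abs(seen - when) <= window}
        candidates = nearby if candidates is None else candidates & nearby
    return candidates or set()


print(candidate_macs("Ralph"))  # {'aa:bb:cc:00:00:01'}
```

Neither data set names a device owner on its own, but two check-ins are enough here to narrow the candidates down to a single MAC address.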
There are a number of risks, because we often think of these just as sensors, and there is obviously data theft, maybe burgling your house because you know it’s empty, because the Nest has decided you are not home and therefore I know it’s a safe time to go and rob your house. Maybe you have a smart lock and I can hack the smart lock and I can get in more easily. But then there are also much, much scarier things when you start to affect the real world. So, every car in Europe has an OBD2 port on it, and many of them have no security on that at all. So if somebody gets access to your car, they’ve got a small device, they plug it in, and then remotely they can actually accelerate your car or brake your car without your control, and there is no way you can stop it. You could easily cause death by hacking a smart car, or even a non-smart car by adding a smart device in. But the internet is connected through the smart car system into the OBD2 port. So, there are some really scary challenges here. You could even imagine that in a smart city there might be systems where you could simply cause mass panic by affecting aspects of people’s environments, and that would cost lives or cause people to get hurt. So there are a lot of potential dangers when you start embedding the internet into the real world and connecting the real world to it. With any new phase of technology you get people scaremongering about security. If you go to any major bit of technology that’s come out, whether it’s PCs or client server or cloud or anything, you will see people standing up and saying there are big security challenges here, this is going to kill it, you can’t do this, it’s not going to work. And this is not my attitude. I’m very positive about technology; I think we can overcome these issues and we can build a secure model. But we do have to think about it and we have to take these challenges seriously.
So I am not trying to scaremonger, I am not trying to scare people, but I am trying to say that this needs to be taken seriously.
Ralph: I think that one of the problems is probably that there are so many different devices, so many different technologies, different protocols and stuff like that, and it’s really hard to define end to end security for all these ecosystems. Because as you said, we have PCs and we have some threats by viruses or whatever, but it’s a PC, there are many of them, but every time it’s the same more or less. But the Internet of Things is so different, so that’s going to be hard.
Yes, I think there are two aspects of what you are saying there, Ralph. There’s the fact that, for example, in a web world, in the traditional internet, you have a web browser, which has a very limited function, and then that talks to a web server which is behind a firewall, which is protected, which is managed very carefully. So those server endpoints have a lot of effort expended to make them secure, and even in that quite constrained model we have big security hacks and flaws and millions of user IDs and passwords being stolen, and credit card numbers and so forth. So there are major security challenges in that. And now suddenly, as you say, we have not just a variety of devices but constrained devices, some of them with very limited ability to do encryption and security, very small footprints, very low margins. You may think of a browser as free, but it’s running on a very powerful machine that has a lot of capabilities, and suddenly you are replacing that with something that maybe costs $10 or $20 to make. So the way I see it, you are fundamentally creating a fractal perimeter. Fractals are these odd shapes that look simple and beautiful, but if you actually measure the perimeter of a fractal, it’s infinite, even though it’s bounded. And I feel we are getting into the same situation with the Internet of Things: previously we relied on perimeter security, VPNs, firewalls, things to control the edges, and now you have no edge, it’s an infinite edge.
So, it’s not just the end to end problem, it’s the scale of the problem that’s very, very difficult to secure. But that doesn’t mean it’s impossible. It’s a question of good engineering, learning the best practices that we’ve applied and applying those to this problem, and really trying to make sure we take it seriously as a community. And as a community we create the best practices for how to secure this and take them on board, and that’s what the web community has done. Of course, there are always failures, but overall you have to say it’s been a success in the web space. If you imagine going back 20 years and think about the shift from computing as a really private thing that happened inside companies with no external boundaries to where we’ve now expanded that boundary, this is just the next stage of that. So, I think it’s achievable, we can solve this, but it’s not necessarily easy.
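The fractal comparison can be made concrete with one well-known fractal, the Koch snowflake, chosen here purely as an illustration (the interview does not name a specific shape). Each refinement step multiplies the perimeter by 4/3, while the enclosed area converges to 8/5 of the starting triangle's area. The sketch below uses those standard closed-form expressions.

```python
import math


def koch_snowflake(n: int, side: float = 1.0):
    """Perimeter and area of a Koch snowflake after n refinement steps."""
    # Every step replaces each edge with four edges one third as long.
    perimeter = 3 * side * (4 / 3) ** n
    base_area = math.sqrt(3) / 4 * side ** 2
    # Closed form of the area series; tends to (8/5) * base_area.
    area = base_area / 5 * (8 - 3 * (4 / 9) ** n)
    return perimeter, area


for n in (0, 5, 20, 50):
    p, a = koch_snowflake(n)
    print(f"n={n:2d}  perimeter={p:12.1f}  area={a:.4f}")
```

The perimeter grows without limit while the area stays below a fixed bound, which is the sense in which a bounded system can still have an edge that is effectively endless to defend.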
One of the things I was talking about in the talk was that the Oyster card, which everyone here in London knows, is actually something that was originally called a MIFARE Classic. It’s an embedded RFID chip in a card, and you might have thought that that was pretty secure because it’s hard to see what’s happening in it. But some researchers at a university in the Netherlands, led by Dr. Flavio Garcia, really started looking at this device in some seriousness, and they used a scanning electron microscope and all sorts of technology that we in the software world are not used to. In the software world you try to hack a web server and try to find out what’s going on behind that, and you have a few attack points. In the hardware world, especially with chip devices, you can get these devices and open them up and apply all sorts of technology. So there are companies out there who have scanning electron microscopes and will use those to change a bit on hardware that is meant to be unchangeable. On the chip, they will literally take the top off the chip and zap one little transistor inside that chip that will switch a lock bit that’s meant to be locked, and now you can read the code on that.
This idea of security through obscurity is worse in the hardware world, and we’ve seen this with the Xbox, for instance. So, the Xbox got hacked, the security keys on it got hacked, and then everybody had open access to the Xbox. So I don’t believe that security by obscurity will work in this case, but at the same time I am not yet convinced that the community is mature enough to act as a community. So in the software world we’ve had open source for 15 years, and we have a lot of people who contribute best practice to open source. In hardware, open source is a pretty new concept, and I think a lot of the people who are manufacturing devices and building IoT systems are not there yet in terms of sharing their best practice and working as a community in the same way the software world is. So, I am not saying we are there yet, but I definitely don’t think obscurity is going to work, so, no.
There are a couple of research groups who have built, I think they are companies, because I don’t think there is any open research on this, car companies have built a firewall for OBD2, for the CAN-bus in the car, so effectively creating a firewall between different parts, and you can certainly see that’s a good intermediate step. But take a Tesla, for example. A Tesla is not just a simple CAN-bus, there’s an ethernet network on which everything is going, and I think this perimeter model of separating stuff out is not a good replacement for fundamentally creating a secure network from the ground up. So I think that if we are really going to be serious about connected cars in 20, 30, 40 years, which we are going to have to be if they are becoming self-driving, security is going to be more important then. Can you imagine a car kidnapping somebody just by reprogramming their self-driving car to take them somewhere? So in those worlds I think you need to build security from the ground up, and it’s fundamentally the same thing that happened inside enterprise networks, where 20 years ago we just assumed everything inside the network was secure, and if you are outside you are outside and if you are inside you are inside. I don’t think that kind of world is going to work; I think we are going to have to come up with much better ways in the future. But it’s hard to do. The whole car industry is based around the standard, it’s very handy, it means that when your car breaks down, you can go to any mechanic, they plug in the same place, they can read your. So, those kinds of shifts can take a lot of work to do. So firewalls, CAN-bus firewalls, for example, are a great intermediate step to get us there.
Let me give you a good example of that. So, Bluetooth low energy is the latest cool thing; almost every IoT project I hear about has got BLE. BLE door locks, BLE trackers, a Bluetooth low energy dummy for a baby that measures its temperature so you can check on your baby when they are ill without having to go stick something in their ear, everything. So BLE, the original spec, unfortunately had a major security flaw: they used effectively secure encryption, but the key exchange at the start had a flaw in it. A researcher found this, and so if you happen to be in the area when that key exchange happens and you pick it up, then you can read all future traffic. So they fixed that in the standard, but the hardware implementing the new standard, the chips, are not out there yet. The standard is fixed, the bug is found, but there are billions of devices out there that have this bug built into them now, and it’s going to take five, maybe even ten years to get past that. I mean, London Underground still accepts broken MIFARE Classic cards even though they are hackable, and it’s at least five years since that was found out. So, you are absolutely right, this is the other challenge. In the software world, you can at least try to persuade people to update their systems and get clean code out that hasn’t got the bug when the bug is found. In the hardware world that is so much harder. So you are right, the pace of this development and the pace of the security holes and challenges, compared to the pace of hardware refresh and devices going out and getting people, for instance, to upgrade their home router or upgrade their in-house smart home system, that’s a big challenge in IoT, definitely.
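To give a feel for what a CAN-bus firewall might do at its simplest, here is a minimal allowlist sketch. It is an assumption-laden illustration, not how any particular vendor's product works: the `CanFrame` type, the `forward` function and the policy set are hypothetical, and the only identifiers allowed through are two commonly used OBD-II diagnostic request IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanFrame:
    arbitration_id: int  # 11-bit CAN identifier
    data: bytes          # up to 8 payload bytes on classic CAN


# Hypothetical policy: identifiers that a device on the diagnostic port may
# send into the vehicle network. Real policies are manufacturer-specific.
ALLOWED_FROM_DIAGNOSTIC_PORT = {0x7DF, 0x7E0}


def forward(frame: CanFrame) -> bool:
    """Return True if a frame from the diagnostic port may cross the gateway."""
    if len(frame.data) > 8:
        return False  # malformed for classic CAN
    return frame.arbitration_id in ALLOWED_FROM_DIAGNOSTIC_PORT


# A diagnostic read request passes; a made-up control frame is dropped.
print(forward(CanFrame(0x7DF, bytes([0x02, 0x01, 0x0C]))))
print(forward(CanFrame(0x244, bytes([0xFF]))))
```

A filter like this only separates network segments; it does not authenticate who is sending the allowed frames, which is why it reads as an intermediate step and not as security built in from the ground up.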
Ralph: I think so.
We’re making it sound really depressing.
Ralph: But it’s so interesting.
I don’t feel depressed, I feel like there are a lot of opportunities to fix this. I think there are going to be some scares, especially over the next couple of years; I can imagine big scares. My big scare, my big worry, is drones. I am sure that some terrorist is going to do a drone attack, that’s my prediction for this year, that there is going to be a terrorist drone attack, because why wouldn’t they? It’s $1,000 for a drone. So I think there are going to be some IoT scares. They have already happened: back in 2011 Fitbit was publishing people’s sexual activity by default to their website, so that was quite funny. If you happened to put your full name in there, then someone could click on your link and see 15 minutes last night, “is that long enough? I don’t know.” It was quite amusing, but that’s not a really bad scare yet, we are laughing about it, whereas there will probably be some scares where people aren’t laughing.
I think there are two aspects. One is that anonymization is challenging, so for example there have been a lot of good cases where anonymized data has been de-anonymized. The state of Massachusetts published anonymous health care data for researchers, and researchers de-anonymized this and went to visit the governor and showed him his family’s health records, which he didn’t much appreciate. In another example, Netflix published anonymized Netflix viewing data, and by tying it to IMDB reviews, because somebody rents a film and then they post their review on IMDB, researchers managed to de-anonymize that and then know what films people had been watching. So, it’s especially when you can correlate multiple different event streams that it’s possible to de-anonymize data. So that’s one challenge, and the other challenge is how do you get meaningful consent. So for example, I am personally a diabetic. There are devices, they are quite expensive in the UK, with which you can monitor your sugar 24/7. I might be willing to share my average data with my doctor, but do I want to share my full data with my health insurer? There are some challenges here about consent, about managing that, and about getting the balance right, as you said, between complete privacy, in which case there is no point being an IoT device, and sharing it. And so that’s really the focus of my PhD research, and I am still quite early on in this and there are other people doing things. So, some of the things I have been looking at for that are, for example, the OAuth spec, where you can issue a token to a device or to another system giving some access. So for example I might issue a token to my doctor saying you can access the average blood sugar levels but not the hour by hour, minute by minute blood sugar levels, which I might want to hide for some reason or I might not want to share.
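The blood sugar example maps naturally onto OAuth-style scopes. The sketch below shows only the resource-side check, under stated assumptions: the scope names `glucose:average` and `glucose:detail`, the token strings, the in-memory token table and the readings are all hypothetical, and a real deployment would validate tokens issued by an authorization server instead of looking them up in a dictionary.

```python
from statistics import mean

# Hypothetical glucose readings in mmol/L, oldest first.
READINGS = [5.4, 7.9, 6.1, 4.8, 9.2]

# Hypothetical token table: token -> scopes the patient granted.
TOKENS = {
    "doctor-token": {"glucose:average"},
    "patient-token": {"glucose:average", "glucose:detail"},
}


def get_glucose(token: str, detail: bool = False):
    """Serve data only if the token carries the scope the request needs."""
    scopes = TOKENS.get(token, set())
    needed = "glucose:detail" if detail else "glucose:average"
    if needed not in scopes:
        raise PermissionError(f"token lacks scope {needed!r}")
    return READINGS if detail else round(mean(READINGS), 2)


print(get_glucose("doctor-token"))                # average only
print(get_glucose("patient-token", detail=True))  # full readings
```

With this split, the doctor's token can never be used to pull the minute-by-minute data, however the client application behaves, because the check happens where the data lives.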
There is a very interesting standard called UMA, User-Managed Access, which has not got as much traction as I think it should do, partly because it’s a very complex standard, and that’s about trying to let users actually choose what they want. So with OAuth, they ask you a question, “would you like to share a, b, and c with Facebook”, say, and you can only say yes or no, whereas maybe what you want to do is say actually I would love to share a, but I don’t want to share b and c. So User-Managed Access is trying to enable that and there is some work going on fitting that together with IoT. That’s an interesting aspect and I think this is an area that really needs to develop is how do I control the sharing, how do I stay in control of my data, whether I am a person, a corporation, anyone using these devices and how do I share just what I want with the right level and protect against these kinds of attacks and I think that’s an open problem, there are some solutions, but it’s not solved by any means at all.
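The difference between a yes-or-no consent screen and per-item consent can be shown with a small sketch. This illustrates only the idea, not the UMA protocol itself; the function names and the item names a, b and c are hypothetical.

```python
def all_or_nothing(requested: set, user_says_yes: bool) -> set:
    """Typical consent screen: the user accepts or rejects the whole request."""
    return set(requested) if user_says_yes else set()


def user_managed(requested: set, user_allows: set) -> set:
    """Per-item consent: grant only what was both requested and allowed."""
    return requested & user_allows


requested = {"a", "b", "c"}
print(sorted(all_or_nothing(requested, True)))  # ['a', 'b', 'c']
print(sorted(user_managed(requested, {"a"})))   # ['a']
```

In the second model the requesting application has to cope with receiving less than it asked for, which is part of what makes this style of consent harder to adopt.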
Ralph: Yes, and even if you solve that problem there is still the risk of somebody selling your data, at least I heard it from digital health cards that insurers are really, really looking to get data just to keep people out that might not be part of the business case.
Absolutely. And we’ve seen this with credit ratings, for example: your credit rating is now sold around the place, and your personal data, your name and address, is sold for spam purposes. There are lots of corporations selling data in ways that we might not personally like, but we click through some 28-page small print, and somewhere in there it said that they can do this. So that’s a big challenge, how we prevent that, and there are technical and legislative things here as well. There are the objectives that you have for your privacy, and then you need some way of enforcing them if somebody breaks that. Once you’ve given them your data, you can have some technical rules about why you’ve given it to them, but once you’ve given it to them the only way to stop them sharing it is legislative. There are lots of protocols devised by researchers where you can have shared encrypted data and do things in ways that people can only extract certain amounts of it, but these kinds of cryptographic protocols are so expensive and slow that they never get used in real life. So, you can do an SQL query against data and you can’t read the data, but you have permission through a cryptographic model to do a particular SQL query and get some of that back. So there is some amazing research in this space, but it is so complex and so slow that none of it has filtered into real life. So for the moment we have to rely on things like European Union rules and data protection acts and stuff like that. So it’s got to be a combination of technical and legislative. I am personally more into the technical stuff: what can I do to minimize the data I give you so that you have less to sell?
I think they will recognize it; I think people recognize the internet and the benefits it has brought to us. There is definitely going to be a point in time, in 20 years’ time or 30 or whatever, when this is just the world, and that’s why we have to worry about the security now, because if every part of every surface in the world has an IP address, an IPv6 address, and every part of it is measured and monitored and so forth, that’s when we have to be really scared. You think the NSA knows a lot now; imagine what they are going to do in 20 years’ time. So I do think we need to care about these things.
Ralph: Yes, that’s right. So, I think that’s a really important topic for the coming years, because as you said it’s cheaper, there is no reason why it shouldn’t happen, and I guess there are some interesting business cases. So, I guess we will see a lot of this in the coming years.
Definitely, yes.
Ralph: So, thanks a lot for sharing your experience with us and have fun at QCon.
Thanks very much for taking the time out to chat with me, it was very fun, good conversation. Bye now.