Is fast and secure video transfer in a browser possible without third-party applications? It is, and depending on your needs, there’s more than one way to add WebRTC to your website.
WebRTC (Web Real-Time Communication) is an open source technology for real-time multimedia communication directly in the web browser. It sets up a peer-to-peer connection between two or more people, which makes it well suited to transferring media (audio and video streams). The technology is supported by Google Chrome, Mozilla Firefox and Opera. You do not need any additional plug-ins for these browsers; just open a web page and start a conversation. Safari and IE have no native support, but it can be added with special plug-ins.
The idea is simple. First, the browser signals to the WebRTC server that the user wants to initiate a call. After getting a link from the server, the user sends it to the other participant. The browser then shows a pop-up asking for permission to access the webcam and microphone. If the page is served over HTTPS, the browser can remember your choice (allow or deny). If it is served over HTTP, the browser asks for permission each time you try to connect.
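The permission prompt described above is triggered by the standard `navigator.mediaDevices.getUserMedia` call. The sketch below is one possible way to wrap that call, not the code used in this project. The names `requestCameraAndMic`, `MediaDevicesLike` and `AccessResult` are hypothetical, and the small interface stands in for the browser object so that the logic can also be exercised outside a browser.

```typescript
// Minimal shape of the part of navigator.mediaDevices used here (an assumption
// made for this sketch; the real browser object has many more members).
interface MediaDevicesLike<S> {
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<S>;
}

// Either the user allowed access and we hold a stream, or access failed.
type AccessResult<S> =
  | { granted: true; stream: S }
  | { granted: false; reason: string };

// Ask for camera and microphone access and report the outcome
// instead of letting the rejection propagate.
async function requestCameraAndMic<S>(
  devices: MediaDevicesLike<S>
): Promise<AccessResult<S>> {
  try {
    const stream = await devices.getUserMedia({ audio: true, video: true });
    return { granted: true, stream };
  } catch (err) {
    // Browsers reject with an error whose name describes the cause.
    const reason = err instanceof Error ? err.name : String(err);
    return { granted: false, reason };
  }
}
```

In a page, the call would look like `requestCameraAndMic(navigator.mediaDevices)`; a refused prompt typically surfaces as `NotAllowedError` in `reason`.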
It sounds simple if you’re a user. But what about developers? How can you implement it on your website? And what if you want more than a video chat: say, recording from a webcam and sending the recording to Amazon S3 storage, where a third party can view it? This is useful for collecting video reviews of goods and services, recording video messages when one participant in a discussion is absent, interviewing job candidates, web support, and so on.
As you know, there are many video file formats, and most of them have advantages in particular situations. I could have chosen a Flash format (FLV, F4V) for recording video, for example, but that technology is losing relevance on the web. Browser vendors have announced that they will stop supporting Flash, which is another reason why I chose WebRTC. Flash uses the H.264 video codec, while WebRTC uses VP8. H.264 support is specified in the WebRTC standards but is not yet widespread. VP8 is royalty-free (H.264 is not), and the quality and size of the resulting video file are almost the same.
At first, I chose the free JavaScript library MediaStreamRecorder, created for cross-browser audio/video recording. This library is very often associated with WebRTC implementations. Unfortunately, I ran into a number of technical difficulties along the way. For example, each time a recording finished, the browser stopped responding, and there were many lags.
I then tried the WebRTC Experiments library. It uses the same technology, but with a different implementation. The lags at the end of a recording disappeared, and the browser behaved properly.
My conclusions from working with this library are as follows.
Pros:
- No need to run the server, the entire load falls on the browser.
- It is easy to control the recording: pause, start a new recording, or stop.
- There were no delays because everything was done in Kurento (I describe it in detail below).
Minuses:
- Google Chrome records video and audio separately, so you end up with two files: one with audio, and one with video but no audio.
- When a recording lasts longer than 5 minutes, the browser stops responding and ignores user commands.
- After recording stops, it takes a few seconds to decode the video and receive the data.
- It takes time to send the video to the repository. Data is lost if the user closes the browser tab while the file is being sent.
In the end, this result was not acceptable for me, although you can use this method if the drawbacks listed above are not critical for you. Since the idea was to achieve the best possible result, I began to look for another solution.
I needed a server that would accept the video stream from the client and record it. A suitable option for me was Kurento Media Server, which I set up on Amazon EC2 together with STUN/TURN servers. I implemented the frontend using the Kurento libraries (without an intermediate server). Kurento (the Esperanto term for the English word ‘stream’) is an open source framework that provides a standards-based media server capable of arbitrary media processing.
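On the browser side, using the STUN/TURN servers mentioned above usually comes down to passing their addresses to the peer connection. The sketch below builds such a configuration object under stated assumptions: `buildIceConfig` is a hypothetical helper, `turn.example.com` and the credentials are placeholders, and 3478 is the conventional default port for STUN and TURN rather than a value taken from this project.

```typescript
// Shape of one entry in the `iceServers` option of RTCPeerConnection.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface TurnCredentials {
  username: string;
  credential: string;
}

// Build the configuration that would be passed to `new RTCPeerConnection(config)`.
// `host` is a placeholder for the machine running the STUN/TURN server.
function buildIceConfig(
  host: string,
  turn: TurnCredentials,
  port: number = 3478
): { iceServers: IceServer[] } {
  return {
    iceServers: [
      // STUN lets the browser discover its public address.
      { urls: `stun:${host}:${port}` },
      // TURN relays media when a direct connection is impossible.
      { urls: `turn:${host}:${port}`, username: turn.username, credential: turn.credential },
    ],
  };
}

// Placeholder values for illustration only.
const exampleIceConfig = buildIceConfig("turn.example.com", {
  username: "demo",
  credential: "secret",
});
console.log(JSON.stringify(exampleIceConfig, null, 2));
```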
Sometimes a single recording is not enough. What should you do if you need to divide the stream into parts, for example because your application accepts only three- or five-minute video recordings (to avoid overloading)? I thought about how to make 12 five-minute recordings within an hour, each saved in a separate file. I tried stopping the recording each time, disconnecting from the server, and starting a new recording (connecting to the server once again). This didn’t work well with Kurento, because every time I reconnected to the server I lost a few seconds (the connection is not instantaneous).
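However the stream ends up being cut, the boundaries of the parts follow from simple arithmetic. As a minimal sketch (the `planSegments` function and the `Segment` type are hypothetical names, not part of Kurento or any library), an hour of video divided into five-minute parts can be planned like this:

```typescript
// One part of the recording: where it starts and how long it lasts.
interface Segment {
  index: number;
  startSec: number;
  durationSec: number;
}

// Split a recording of `totalSec` seconds into parts of at most `segmentSec` seconds.
// The last part is shorter when the total is not an exact multiple.
function planSegments(totalSec: number, segmentSec: number): Segment[] {
  if (totalSec < 0 || segmentSec <= 0) {
    throw new RangeError("totalSec must be >= 0 and segmentSec must be > 0");
  }
  const segments: Segment[] = [];
  for (let startSec = 0; startSec < totalSec; startSec += segmentSec) {
    segments.push({
      index: segments.length,
      startSec,
      durationSec: Math.min(segmentSec, totalSec - startSec),
    });
  }
  return segments;
}

// One hour in five-minute parts.
const hourPlan = planSegments(60 * 60, 5 * 60);
console.log(hourPlan.length); // 12
```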
As a result, I decided to record from the webcam non-stop over a single connection. To split the recording into parts afterwards and send them to Amazon S3, I decided to use the node-fluent-ffmpeg library. I set up a new server for this, but ran into another difficulty. Kurento saves files in WebM format, which Google Chrome played back incorrectly (Mozilla Firefox had no such problem): the video stream played faster than the audio stream. The best fix was to re-encode the files to MP4, which solved the problem.
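To make the splitting and re-encoding step more concrete, here is a rough sketch. It does not reproduce the node-fluent-ffmpeg API; instead it builds the argument list for the `ffmpeg` command-line tool, which such wrappers run under the hood. The function name `buildSegmentArgs`, the file names, and the choice of the `libx264` and `aac` encoders are assumptions for illustration, since the article only states that the files were re-encoded to MP4.

```typescript
// Build ffmpeg arguments that cut one part out of a WebM recording
// and re-encode it to MP4.
function buildSegmentArgs(
  input: string,
  startSec: number,
  durationSec: number,
  output: string
): string[] {
  return [
    "-ss", String(startSec),    // seek to the start of the part (before -i: input seeking)
    "-i", input,                // source recording
    "-t", String(durationSec),  // length of the part
    "-c:v", "libx264",          // assumed video encoder for MP4
    "-c:a", "aac",              // assumed audio encoder for MP4
    "-movflags", "+faststart",  // move the index to the front for web playback
    "-y",                       // overwrite the output file if it exists
    output,
  ];
}

// Second five-minute part of a hypothetical recording.
const exampleArgs = buildSegmentArgs("recording.webm", 300, 300, "part-001.mp4");
console.log(["ffmpeg", ...exampleArgs].join(" "));
```

In Node.js the resulting array could be handed to `spawn("ffmpeg", exampleArgs)` from `node:child_process`, provided an `ffmpeg` build with these encoders is installed.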
Finally, everything worked out. I built the entire process on WebRTC technology and tailored it to the task at hand. Of course, the implementation is not 100% perfect, and it has both strengths and weaknesses:
Pros:
- Audio and video are in the same file, with no split into two separate files.
- Video is recorded in real time, so there is no chance of losing the video file. With client-side recording, by contrast, the file has to finish uploading to the repository before the user leaves the service.
Minuses:
- It takes time to connect to Kurento Media Server.
- Traffic: the client’s bandwidth must be sufficiently high.
- Implementation is costly in both resources and time. I had to set up Kurento Media Server, coturn, and Node.js for transcoding the video to another format, dividing it into parts, and sending everything to S3.
WebRTC is a new technology, so it is fair to ask about its security. Its encryption is built on DTLS, a datagram variant of the TLS protocol. Security flaws can be found, but they are more likely problems of the browser through which the signal is transmitted than of WebRTC itself. The main problem is that WebRTC very easily and quickly reveals the user’s real IP address, which cannot be hidden by a proxy, a VPN, Tor, or popular plug-ins such as Ghostery. To set up an audio or video conversation over WebRTC, two computers must send their IP addresses to each other (not only the public address, but also the local one). These addresses can be requested with a simple JavaScript script, which is a huge problem for personal data protection, and one that can be solved only by deactivating WebRTC.
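To see why the addresses are so easy to read, note that the browser hands ICE candidates to page scripts as plain text lines (in a page they arrive through the `icecandidate` event of `RTCPeerConnection`). The following sketch only parses such lines; `parseIceCandidate` and `listAddresses` are hypothetical helper names, and the sample candidates use made-up addresses from private and documentation ranges.

```typescript
interface ParsedCandidate {
  address: string;
  port: number;
  type: string; // for example "host" (local) or "srflx" (public, learned via STUN)
}

// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
const CANDIDATE_RE = /^(?:a=)?candidate:\S+ \d+ \S+ \d+ (\S+) (\d+) typ (\S+)/;

// Extract the address, port and type from one candidate line, or null if it does not match.
function parseIceCandidate(line: string): ParsedCandidate | null {
  const match = CANDIDATE_RE.exec(line.trim());
  if (match === null) return null;
  const address = match[1];
  const port = match[2];
  const type = match[3];
  if (address === undefined || port === undefined || type === undefined) return null;
  return { address, port: Number(port), type };
}

// Collect the distinct addresses found in a list of candidate lines.
function listAddresses(lines: string[]): string[] {
  const seen: string[] = [];
  for (const line of lines) {
    const parsed = parseIceCandidate(line);
    if (parsed !== null && seen.indexOf(parsed.address) === -1) {
      seen.push(parsed.address);
    }
  }
  return seen;
}

// Made-up sample candidates: one local (host) and one public (srflx).
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 192.168.1.5 46154 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 46154",
];
console.log(listAddresses(sampleCandidates));
```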
There’s a lot of room for improvement, starting with proper browser support.
As you see, there is no fairy-tale answer. This technology has its pluses and minuses, but WebRTC is rapidly gaining popularity. The usage statistics to date are impressive: 47% of businesses planned to use it within the following 12 months or already used it, 90% believed WebRTC has the potential to improve contact center services, and over 720 companies are using WebRTC in some form.
We can expect WebRTC to influence the communications market. There are many ways to use it. For example, imagine an online store with a “customer support” link that you can click to enter a video chat with a consultant, ask your questions, and get advice, directly from your browser, without any additional applications or plug-ins. This opens up great prospects for improving the quality of customer service and, as a result, for increasing profit. It means that companies that want to be market leaders will adopt it soon, and the rest will follow.
There are many areas of life where traditional phone calls are being replaced by quick link clicks or by messages in web messengers. It is much easier to call a taxi, order dinner, and so on by clicking a link on a website than by making a phone call. This is a field where WebRTC technology can be used. It is starting to compete with common video chats (such as Skype) and with phone calls. Who knows, maybe WebRTC will change our everyday habits in the next few years.
About the Author
Nikolai Bezruk, JavaScript developer at Qualium Systems, a company creating web and mobile applications for startups and digital agencies. Nikolai is a team lead and software architect, doing full stack web programming. He also teaches new employees.