Securing data exchanges is hard in a world where both freelance hackers and government agencies are out to get you. Public key encryption is the basis for most forms of secure exchange, but it depends on trust - and those who are out to get your data are experts at abusing trust. Supposedly-secure connections can be downgraded to the point where they are easily broken, and even at full strength most forms of encryption are vulnerable to data capture and later decryption if your private keys are exposed.
But there are still ways to protect your secrets, especially if you have some control over both ends of the exchange. In this article you'll learn how you can secure application data exchanges in ways that block exposure both now and in the foreseeable future, and see examples of applying the security techniques in Java code. Along the way you'll also get some pointers on how you can protect your own online access.
Trusting authorities
Public-key cryptography is the basis for most data exchange security. The basic principle of public-key cryptography is that you work with a pair of keys, a private key that needs to be kept absolutely secret, and a public key you can distribute any way you want. The two keys work together – anyone holding the public key can encrypt data so that only you can decrypt it using the private key, and you can encrypt data using the private key that can be decrypted by anyone holding the public key.
Encrypting with the public key gives obvious benefits, since it provides a way for others to send you secret messages. Encrypting with the private key is less obvious in application, but an equally important part of security in that it allows signing of messages. You can compute a secure digest (basically a hash code that's very difficult to reverse) for any data you want to send, encrypt that digest with your private key and send both the data and the encrypted digest together. Anyone with your public key can then decrypt the digest you sent and re-run the digest computation on the data they received. This lets them verify both that the data they received matches what you sent (if the digest values match) and that it really was you who sent it (because your public key was used to decrypt the encrypted digest).
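To make this concrete, here's a minimal sketch of sign-and-verify using the JDK's standard java.security.Signature class (the class name, message text, and on-the-fly key pair are just for illustration; in practice your private key would be long-lived and carefully guarded):

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
    public static void main(String[] args) throws Exception {
        byte[] data = "message to be signed".getBytes("UTF-8");

        // generate a key pair just for this demo
        KeyPairGenerator kpgen = KeyPairGenerator.getInstance("RSA");
        kpgen.initialize(2048);
        KeyPair pair = kpgen.generateKeyPair();

        // sender: digest the data and encrypt the digest with the private key
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] signature = signer.sign();

        // receiver: recompute the digest and check it against the signature
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(data);
        System.out.println("Signature valid: " + verifier.verify(signature));
    }
}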
Public keys are usually distributed in certificate form, meaning they're wrapped in an envelope that identifies who you are. The envelope is signed by a certificate authority that vouches for your identity. Some certificate authorities are root authorities, meaning their own certificates need to be trusted implicitly. Other authorities use certificates that have been signed by some third-party authority, establishing a chain of trust ultimately based on one of the root authorities.
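You can inspect this identity information yourself with a few lines of code. Here's a quick sketch using the JDK's certificate support (the file name server.cer is a placeholder for any certificate you've saved in PEM or DER form):

import java.io.FileInputStream;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class ShowCert {
    public static void main(String[] args) throws Exception {
        FileInputStream in = new FileInputStream("server.cer");
        X509Certificate cert = (X509Certificate)
            CertificateFactory.getInstance("X.509").generateCertificate(in);
        in.close();
        // the "envelope" contents: who the key belongs to and who vouches for it
        System.out.println("Subject: " + cert.getSubjectDN());
        System.out.println("Issuer: " + cert.getIssuerDN());
        System.out.println("Valid until: " + cert.getNotAfter());
    }
}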
This is a great system for verifying identity as long as the certificate authorities are completely trustworthy, meaning they're both honest and competent. A number of incidents over the last 20 years, where someone was able to get a certificate issued in some other organization's name, have shown the competence of these authorities is not always at the level we'd like to see. Even worse have been cases where the certificate authority was itself compromised, with the authority's secret key being stolen, allowing the thief to create their own trusted certificates.
Mozilla products (including Firefox) use 166 trusted root certificates representing at least 70 different organizations. That itself seems like a large number of organizations to be granted such a high degree of trust over secure communications, but it's only the tip of the iceberg. Many other authorities are also trusted by virtue of having their certificates signed by these root authorities, and this delegated authority can even be extended to more levels. The total number of trusted certificate authorities is certainly in the hundreds, and possibly in the thousands (nobody really knows for sure, but the EFF SSL Observatory found 650+ from public websites as of 2010).
Security and middle men
So there are a lot of certificate authorities that are all trusted by your browser (or your application code – Java 7, for instance, has 81 trusted root certificates in the standard distribution) to sign certificates. Why does that matter?
When you make a connection using the SSL protocol (actually TLS, the modern replacement for SSL based on the same principles) you rely on a certificate provided by the server to keep the initial data exchanges secret. Your connection will be routed through some number of intermediaries to get to the server, and as long as all these intermediaries are “honest” (i.e., they pass the data in both directions without modification) you can trust that your connection is more-or-less secure (we'll get into the more-or-less part later).
But if one of the intermediaries is dishonest you can be vulnerable to a man-in-the-middle attack. The easiest form of man-in-the-middle attack against TLS uses a falsified certificate for the server – one that says it's for the server, but has actually been generated by an attacker using their own private key. The dishonest intermediary can send this falsified certificate to your client so that it connects to the intermediary, then separately establish a secure connection to the server. By forwarding everything received on one connection to the other connection it can give you the illusion that you're connecting directly to the server. This allows the intermediary to both observe everything sent between you and the server and change anything it wants.
How realistic is this scenario of a dishonest intermediary? Unfortunately, it's very possible. Routers frequently have vulnerabilities that allow them to be taken over by hackers (either freelance or government-sponsored). Reputable manufacturers will issue firmware updates for routers when they learn of vulnerabilities, but most of us don't bother to update the firmware on our home or office routers, so known problems may allow them to be taken over. Recent reports suggest that in some cases router manufacturers may even deliberately create backdoors in their routers to allow government access (something the U.S. ironically accused Chinese manufacturer Huawei of doing).
There's not much you can do about these types of issues for enterprise routers, but at the home and small business level you can often replace the firmware supplied by the manufacturer with an open source alternative. Choices such as DD-WRT, Tomato, or OpenWrt work with most wired and wifi routers to provide both enhanced features and the assurance that you're running open source code which has been seen by many people (not a guarantee against vulnerabilities, but it means the obvious ones have probably been plugged).
In some cases (think free wifi access at diplomatic meetings, for instance), organizations deliberately use dishonest routers so they can observe data exchanges over supposedly-secure connections. On a larger scale, internet service providers may provide direct government access to their routers.
Even without taking over a router, bad guys can sometimes use DNS spoofing to implement a man-in-the-middle attack. This involves confusing a DNS server by giving it the wrong IP address for a host, generally the address of a system controlled by the bad guy. When your client wants to connect to a host it asks the DNS server for the IP address of the host, and the DNS server gives you the (wrong) address from its cache. You then connect to the wrong system, which can either directly pretend to be the system you wanted or set up its own connection to the original target system, acting as a dishonest intermediary.
So every time you make a secure connection there's the possibility that someone else has tapped into the exchange between you and the server. This wouldn't be a big problem if certificates were really ironclad guarantees of identity – but unfortunately, they're not. Hundreds (or thousands) of certificate authorities are trusted by your client, and all it takes for a man-in-the-middle attack to succeed is one bad authority to issue a certificate impersonating your target host. That's scary enough when you just consider the issues of authorities being hacked or not doing a proper job of verifying identity, but you can add the concern that many of these certificate authorities are either directly under the control of or subject to intimidation from authoritarian governments. So the public certificate system is effectively broken and should always be regarded as unsafe, even while we continue to use it where necessary.
Safer certificates
There are some steps you can take to protect yourself from falsified certificates. Browsers normally accept any certificate signed by an accepted authority without question, but browser extensions may be available to give you a warning when the certificate used by a site changes. In the case of Firefox, the Certificate Patrol extension does this (there's also an enhancement request to implement this certificate warning in the core Firefox code, but for now the extension is the best you can do). Using a browser extension of this type means you'll at least know that something is potentially wrong when you return to a secure site you've previously visited and find that a different certificate has been presented – but you'll then need to look into the details and decide whether to trust the new certificate or not, as certificates are often replaced for legitimate reasons (including getting too old, since certificates are normally only valid for a couple of years from the issue date). You'll also get a lot of warnings for large sites that use multiple servers with different certificates.
You have a wider range of options in your application code. If you're writing Java code, for instance, you can override the default certificate handling to implement a more secure technique that bypasses the certificate authority mechanism. For client code, this can be as simple as storing the expected server certificate in a keystore and only allowing that particular certificate to be used when establishing a connection. If you want this keystore to be used for all connections from the application you can just set it as the JVM system property javax.net.ssl.trustStore={keystore-path} and the associated javax.net.ssl.trustStorePassword={password}, where {keystore-path} is the actual file system path to your keystore and {password} is the keystore password. Once you've set this, only the certificates you've stored in the keystore (or certificates signed by one of those certificates, if it's authorized as a certificate authority) will be accepted for TLS connections.
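If you'd rather set these from code than on the java command line, it's just a pair of system property calls made before the first secure connection is attempted (the path and password here are placeholders for your own values):

// same effect as the -Djavax.net.ssl.trustStore=... JVM arguments; must be
// set before the JVM creates its first TLS connection
System.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks");
System.setProperty("javax.net.ssl.trustStorePassword", "changeit");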
You can also control the allowed certificates per connection. Here's an example of that approach, using several classes in the standard java.security and javax.net.ssl packages to implement the handling:
import java.io.FileInputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

public static void main(String[] args) throws IOException,
        GeneralSecurityException {

    // open the keystore (name and password are defined in the sample code's
    // Constants class)
    KeyStore keyStore = KeyStore.getInstance("JKS");
    FileInputStream fis = new FileInputStream(Constants.TRUSTSTORE_NAME);
    try {
        keyStore.load(fis, Constants.TRUSTSTORE_PASS.toCharArray());
    } finally {
        fis.close();
    }

    // create trust manager that trusts only the server certificate
    String alg = TrustManagerFactory.getDefaultAlgorithm();
    TrustManagerFactory tmfact = TrustManagerFactory.getInstance(alg);
    tmfact.init(keyStore);
    X509TrustManager tm = (X509TrustManager)tmfact.getTrustManagers()[0];

    // create the connection (and make sure it's secured)
    URL url = new URL(args[0]);
    HttpURLConnection conn = (HttpURLConnection)url.openConnection();
    if (!(conn instanceof HttpsURLConnection)) {
        System.err.println("Connection is not secured!");
        return;
    }

    // configure SSL connection to use our trust manager
    SSLContext context = SSLContext.getInstance("TLS");
    context.init(null, new TrustManager[] { tm }, null);
    SSLSocketFactory sockfactory = context.getSocketFactory();
    ((HttpsURLConnection)conn).setSSLSocketFactory(sockfactory);

    // open connection to the server
    conn.connect();
    conn.getInputStream();
    System.out.println("Got connection to server!");
}
As you can see from the comments, this reads a keystore containing a certificate, creates a TrustManager which only trusts that certificate, and forces the SSL/TLS connection handling to use that TrustManager when establishing the connection. If you try to connect to a server that uses a different certificate the authentication will fail and the connection will be blocked.
The full code is available in the github repository, as the class com.sosnoski.certs.UseCert. The class com.sosnoski.certs.GetCert retrieves the certificate provided by a server and stores it in the keystore used by UseCert, so it gives you an easy way to get started trying things out. Note, though, that large secure sites may have multiple servers with different certificates, so you may find that the certificate you store for a site doesn't always work.
Be the authority
Sometimes you need to work with many certificates, such as when you're hosting one or more services and want to use individual certificates for each allowed client (with client authenticated TLS). Trying to maintain up-to-date truststores for the servers can turn into a huge administrative burden in this type of situation, especially as client certificates expire and need to be replaced. Fortunately, there's an alternative that allows you to maintain access control without as much work – you can implement your own certificate authority, and only trust certificates issued by that authority.
There are several commercial or open source tools that support this type of custom authority. OpenSSL (http://www.openssl.org/) is one of the most widely used tools for this purpose (though there are probably better choices for implementing a fully-functional certificate authority; see the next section). As you might expect from the name, OpenSSL is an open source implementation of the SSL/TLS protocols. It also includes full support for both working with existing certificates and creating new ones, and runs on a range of platforms including Linux, Mac OS X, and Windows.
There are several online guides available for implementing your own certificate authority using OpenSSL (such as this one for Ubuntu), so we won't go into the full details here. The basic principles are pretty easy, as shown by these Linux/Unix console commands:
- Create a default OpenSSL certificate authority directory structure:
  mkdir demoCA
  mkdir demoCA/private
  mkdir demoCA/newcerts
  echo '01' > demoCA/serial
  touch demoCA/index.txt
- Create the private-public key pair for the certificate authority, and export the private key into the directory structure:
  openssl genrsa -out ca-keypair.pem 2048
  openssl pkey -in ca-keypair.pem -out demoCA/private/cakey.pem
- Create a self-signed certificate good for 10 years for the certificate authority, and export the certificate into the directory structure:
  openssl req -new -x509 -days 3650 -key ca-keypair.pem -sha256 -out demoCA/cacert.pem
  [respond to prompts with certificate authority identity]
- Create the private-public key pair for the user, and generate the request for a certificate to be issued by the authority (run on the system which will own the certificate):
  openssl genrsa -out user-keypair.pem 2048
  openssl req -new -key user-keypair.pem -sha256 -out user-cert.req
  [respond to prompts with user identity]
- Sign the certificate (run on the certificate authority system):
  openssl ca -out user-cert.crt -policy policy_anything -md sha256 -infiles user-cert.req
  [respond to prompts to sign and commit certificate]
The above uses 2048-bit RSA keys, generally considered safe at present. A stronger level of protection is given by 3072-bit keys, while the truly paranoid are using 4096-bit keys. Longer key sizes add some overhead when establishing a secure connection, but for all except the busiest sites this probably won't be a major concern. To use a different key size, just substitute that value for “2048” in the two “openssl genrsa ...” commands.
Naturally, you need to make sure that access to your certificate authority is kept under tight control, and especially that the private key used for signing the issued certificates is never exposed. You also need to have administrative procedures in place to verify the identity and authority of anyone applying for a certificate.
It's easy to make the earlier Java code work with your certificate authority. Just put the certificate authority's certificate in your truststore instead of a particular server's certificate, and any certificates signed by that authority will be accepted by the client. No changes to the code are required.
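As an example of setting that up, you could import the authority certificate created above into a JKS truststore with the JDK's keytool utility (the alias, keystore name, and password here are arbitrary placeholders):

keytool -importcert -alias demo-ca -file demoCA/cacert.pem -keystore truststore.jks -storepass changeit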
The authority giveth, the authority taketh away
There's actually one more step you need to take if you're building a full-fledged certificate authority: Support for revoking certificates after they've been issued.
Certificate revocation means that a previously-issued certificate is no longer valid, either because you've terminated the owner's access to your services or because they've been issued a replacement certificate. This type of replacement would be required if the holder of one of your certificates suspected their system had been accessed and their private key potentially revealed. Ideally it's something that will never happen, but by publishing and checking a certificate revocation list (CRL) you'll be protecting yourself in case it does.
There's also a newer alternative to direct CRL checking, called Online Certificate Status Protocol (OCSP). OCSP allows code to check the current status of a certificate directly, by sending off a request to a server associated with the certificate authority that issued the certificate. If a certificate is being used for the common case of establishing a secure connection, the OCSP check can proceed in parallel with the connection process to minimize the overhead involved; there's no harm in negotiating a connection and then finding out the certificate is bad, as long as you get the result back before you actually start using the secure connection to exchange data. Alternatively, there's a technique called OCSP stapling that lets the server proactively get its certificate checked and verified with a signed timestamp from the issuing authority, so that clients know the certificate is still vouched for by the authority.
If you're going to build a certificate authority including revocation handling you're best off using a tool such as EJBCA. EJBCA is an open source enterprise certificate authority implementation based on Java EJB technology. It provides full support for generating and managing certificates, including direct CRL and OCSP support.
Most current browsers implement OCSP checking of certificates by default. In Java code, you can enable revocation checks using the system property com.sun.net.ssl.checkRevocation=true, and OCSP checking with the security property ocsp.enable=true together with the system property com.sun.security.enableCRLDP=true (or equivalents, for non-Sun/Oracle JVMs).
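Since ocsp.enable is a security property rather than a system property, it takes a different call to set from code. Here's a sketch of enabling all three checks before any connections are opened (the class name is just for illustration):

import java.security.Security;

public class EnableRevocationChecks {
    public static void main(String[] args) throws Exception {
        // revocation checking during certificate path validation (system property)
        System.setProperty("com.sun.net.ssl.checkRevocation", "true");
        // OCSP lookups (a security property, set through the Security class)
        Security.setProperty("ocsp.enable", "true");
        // also follow CRL distribution points listed in certificates
        System.setProperty("com.sun.security.enableCRLDP", "true");
        // ... open secure connections as usual after this point
    }
}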
Negotiated weakness
Fake certificates are the most direct way for a dishonest router or other intermediary to compromise a supposedly-secure connection, but they're not the only way. Another type of attack is based on subverting the negotiation that takes place between client and server in the process of establishing the connection. The object of this type of attack is to weaken the security of the final connection to the point where it can easily be broken.
To understand how this type of attack works you need to first know a little about how TLS works. A TLS connection uses a session similar to what you have when you log into a web site from a browser. In the case of TLS, the session is maintained separately by the client and server, but on either end it represents the shared state information that allows the two to communicate securely. This includes a secret key used to encrypt information sent between the two, but also sequencing and other control information used to maintain security (since TLS provides not only secrecy, but also message integrity checks to detect when information is modified in transit).
The initialization of a TLS session uses an exchange of messages called the “handshake”. A client gets things started with a request to the server, telling the server the highest TLS version it supports along with a list of usable cipher suites (which identify the security algorithms for a connection) and other supporting information. The server responds with a protocol version and cipher suite selected from those supported by the client, and also sends its certificate. The client verifies the server certificate, then generates a secret value and sends it to the server encrypted with the public key embedded in the server certificate. This secret value is then used as the base for computing the secret keys used to encrypt and sign data exchanged over the secure connection.
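You can see what your own client is prepared to offer in that first handshake message with a few lines of Java (a sketch using the JVM's default context):

import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

public class ShowDefaults {
    public static void main(String[] args) throws Exception {
        // the protocol versions and cipher suites the default client
        // context will offer in its handshake message
        SSLParameters params = SSLContext.getDefault().getDefaultSSLParameters();
        System.out.println("Protocols: " + Arrays.toString(params.getProtocols()));
        System.out.println("Suites: " + Arrays.toString(params.getCipherSuites()));
    }
}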
A dishonest intermediary can potentially interfere with the initial stages of this handshake, since the first messages are exchanged without encryption or signing. In particular, it can downgrade the TLS version and/or strip the stronger cipher suites from the client's offer. For instance, the intermediary could modify the message so that only SSL 2.0 appears to be supported.
What makes this downgrading of the protocol dangerous is that older versions of the SSL protocol have a number of known vulnerabilities. Having assured that the connection uses a vulnerable protocol, the dishonest intermediary can then use one or more of the vulnerabilities to break the security of the connection.
TLS 1.1 and higher implement protections against this type of downgrade attack, including using a signature at the end of the handshake that covers all messages involved. But flaws are sometimes present in these protections, and at least for the case of application data exchange there's a solid way of avoiding even the possibility of such an attack.
Negotiating from strength
The best remedy to downgrade attacks is simple: Only allow secure connections to use specific trusted versions of the TLS protocol. Unfortunately this approach is not always practical when you're using a browser, since the “secure” web sites you want to use may not be up-to-date in their protocol support. Most browsers at least let you limit your secure connections to relatively recent versions of the protocol, and this is about the best you can achieve over the wild web.
But restricting protocol versions works very well if you control both ends of the connection, and is a best practice for enterprise data exchange. You should make sure that either your client code, your server code, or both, enforce this approach.
The easiest way to do this in Java client code is to use the system property https.protocols=TLSv1.1 (or TLSv1.2). The only limitation of this approach is that it forces all secure connections to use the same protocol. If you want to control the protocol on a per-connection basis you can tweak the SSLSocketFactory returned by javax.net.ssl.SSLContext to do this. See the com.sosnoski.tls.ForceTls class from the github repository for an example of this approach.
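The heart of the per-connection approach is just restricting the enabled protocols on the socket before the handshake starts, along these lines (a sketch with a placeholder host; the ForceTls class wraps the same idea in a custom socket factory for use with HttpsURLConnection):

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class ProtocolCheck {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory)SSLSocketFactory.getDefault();
        SSLSocket socket = (SSLSocket)factory.createSocket("www.example.com", 443);
        try {
            // refuse to negotiate anything below TLS 1.1
            socket.setEnabledProtocols(new String[] { "TLSv1.1", "TLSv1.2" });
            socket.startHandshake();
            System.out.println("Negotiated " + socket.getSession().getProtocol());
        } finally {
            socket.close();
        }
    }
}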
On the server end, you'd generally control this with options for your application server. For instance, when using Tomcat with Java connectors you can restrict the protocol suites using the sslEnabledProtocols attribute of the <Connector> element, so the equivalent of the above client configuration would be <Connector …sslEnabledProtocols="TLSv1.1" …/>. When using the APR connector you'd instead use the SSLProtocol attribute.
Anything you've ever said or done can and will be used against you
So far you've seen some techniques for making it more difficult for anyone to break into your secure connections. These techniques won't help if the server's private key is compromised, since anyone with access to the private key can decrypt the message exchanges during session establishment to view the session key, then use the session key to decrypt all data exchanged during the session. That's why it's so important to keep your private keys absolutely private, and never expose them outside the owning system.
But sometimes private keys are compromised, whether through a hacked system, an insider with access, or a government order. If you know the key has been exposed you can generate a new private-public keypair and get a new certificate, revoking the old certificate so that it will no longer be accepted by clients (an important step, since otherwise whoever obtained your old private key will be able to use the old certificate to impersonate you). That takes care of future data exchanges... but if whoever stole your private key monitored and recorded prior sessions they'll be able to use that key to decrypt and view all the data exchanged in those sessions.
There's a way to safeguard your secure connections against even this type of retrospective breaking. It's called “perfect forward secrecy”, and what it does is provide a way for the client and server to create a shared secret key for a session without exposing that key to anyone monitoring the exchange (even if the monitor has access to the server's private key, as long as it's just a monitor and not a man-in-the-middle).
The standard implementation of forward secrecy uses some form of Diffie-Hellman key exchange. The basic idea of Diffie-Hellman is to use the multiplicative properties of groups of integers modulo a prime number. It's easy to compute the value of a number raised to a power in a group, even if both the group and the power are large. But with known mathematics it's computationally very hard to go the other way – that is, given a number and a value, find what power of the number that value represents in a group (called the discrete logarithm problem). Each side in a Diffie-Hellman key exchange uses the same group and base number but generates a different value for the power, then they exchange values generated by the powers. They each end up with a shared value that combines the two secret powers. Anyone monitoring the exchange would need to work backward to solve the computationally difficult discrete logarithm problem in order to get at the secret, and with large enough values that's going to take far too long to be practical with any technology we can currently project.
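You can watch this work using nothing but the JDK's standard crypto classes. Here's a self-contained sketch (the key size is limited to 1024 bits, the SunJCE ceiling in Java 7; use a larger group where your provider supports it) in which both parties arrive at the same secret without it ever crossing the wire:

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.util.Arrays;
import javax.crypto.KeyAgreement;
import javax.crypto.interfaces.DHPublicKey;

public class DhDemo {
    public static void main(String[] args) throws Exception {
        // first party generates a key pair, fixing the group parameters
        KeyPairGenerator agen = KeyPairGenerator.getInstance("DH");
        agen.initialize(1024);
        KeyPair akeys = agen.generateKeyPair();
        // second party generates its own key pair in the same group
        KeyPairGenerator bgen = KeyPairGenerator.getInstance("DH");
        bgen.initialize(((DHPublicKey)akeys.getPublic()).getParams());
        KeyPair bkeys = bgen.generateKeyPair();
        // each side combines its own private key with the other's public key
        KeyAgreement aagree = KeyAgreement.getInstance("DH");
        aagree.init(akeys.getPrivate());
        aagree.doPhase(bkeys.getPublic(), true);
        KeyAgreement bagree = KeyAgreement.getInstance("DH");
        bagree.init(bkeys.getPrivate());
        bagree.doPhase(akeys.getPublic(), true);
        // both computations yield the same shared secret
        System.out.println("Secrets match: "
            + Arrays.equals(aagree.generateSecret(), bagree.generateSecret()));
    }
}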
The bad news for browsing is that many web servers don't support perfect forward secrecy for secure connections. This is partially because of some added overhead (which can be reduced by using a Diffie-Hellman variation based on elliptic curves rather than discrete logarithms), but often just a matter of using outdated security implementations. Google has been notably proactive in implementing perfect forward secrecy support, and other major sites have recently been moving to support it. In the Chrome browser you can click the green lock icon for a secure site to see if perfect forward secrecy is being used (look at the Connection details to see if it gives a key exchange that includes the letters “DHE”, such as “ECDHE_RSA”).
It's pretty easy to require forward secrecy in Java application code. You can again use a system property for this purpose, as https.cipherSuites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA. If you want to control the protocol on a per-connection basis you can use the same approach as for controlling the protocol version, as shown in the com.sosnoski.tls.ForceSuite class from the github repository.
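Per-connection control works the same way as for protocol versions, just using setEnabledCipherSuites instead. A sketch, again with a placeholder host:

import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class SuiteCheck {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory)SSLSocketFactory.getDefault();
        SSLSocket socket = (SSLSocket)factory.createSocket("www.example.com", 443);
        try {
            // only allow an ephemeral Diffie-Hellman suite, so the
            // connection is guaranteed to use forward secrecy
            socket.setEnabledCipherSuites(
                new String[] { "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA" });
            socket.startHandshake();
            System.out.println("Negotiated " + socket.getSession().getCipherSuite());
        } finally {
            socket.close();
        }
    }
}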
That cipher suite name looks pretty cryptic, but if you break it down one part at a time it becomes understandable. “TLS” naturally refers to the TLS protocol. “ECDHE” says to use elliptic curve Diffie-Hellman key exchange with ephemeral keys (meaning the keys are created fresh for each session and are not remembered afterward). “RSA” says RSA asymmetric encryption is used to authenticate the TLS handshake. “AES_128_CBC” says to use AES symmetric encryption with 128-bit keys in cipher-block chaining mode to protect the actual data exchange. Finally, “SHA” says to use the SHA-1 secure hash algorithm for message integrity checks.
This is probably one of the best cipher suites available for use with TLS 1.1 on Java 7 from Oracle. If you instead use TLS 1.2 you can upgrade to TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, which uses the stronger (and more secure) SHA2 digest algorithm. Ideally it would be better to use a different mode such as GCM (Galois/counter mode) instead of CBC, but that's not supported by Oracle's Java 7 JSSE (Java Secure Sockets Extension) implementation.
You may also want to consider dropping the “EC” from “ECDHE”. Cryptography guru Bruce Schneier now recommends preferring discrete-logarithm approaches over elliptic curve cryptography (along with many other excellent recommendations in the linked article on how you can keep yourself, and your data exchanges, secure). Elliptic curve techniques are faster, but may potentially have weaknesses deliberately introduced by government organizations. Most commercial secure sites appear to support only the elliptic curve versions of Diffie-Hellman at present, though, which is why it's used in this example.
On the server end, you'd generally control the cipher suite with configuration options for your application server. For instance, when using Tomcat with Java connectors you can restrict the suites using the ciphers attribute of the <Connector> element, so the equivalent of the above client configuration would be <Connector … ciphers="TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA" …/>. When using the APR connector you'd instead use the SSLCipherSuite attribute.
It's an interesting historical note that although Diffie-Hellman key exchange was first published in 1976 (by Whitfield Diffie and Martin Hellman) it had actually been invented a few years earlier, by people working for the British GCHQ (one of the organizations receiving a lot of attention currently for their monitoring of – well, basically everything done online). For some reason the GCHQ did not want to make their invention public, so it fell to Diffie and Hellman to independently invent and publish the technique.
Time and security
Security is always relative, and continuing advances in both cryptographic knowledge and computer technology weaken security over time. What was regarded as state-of-the-art security 30 years ago can now be broken easily, and there's every reason to expect that the same principle will apply in the future.
If properly implemented, the techniques discussed in this article will give you good protection against the most important currently known attacks, including brute force attacks that seek to find session keys in a reasonable amount of time (even from organizations with really big budgets for special-purpose hardware). This should prevent anyone from successfully tampering with your data exchanges, and (concealed backdoors in algorithms aside) should also keep your data secret for at least the next 10-20 years and probably considerably longer. But it likely will not keep your data secret 100 or 200 years from now, if it is recorded and anyone is sufficiently motivated to gain access using the tools that will be available by then. For most business or personal data that's not a problem, since we're much more concerned about near-term secrecy than long-term secrecy, but it's a good thing to keep in mind when thinking about data security.
For ongoing applications you need to review your security protocols and suites every few years, and when appropriate revise the application to upgrade security to a new standard. That's still not going to prevent someone from capturing and eventually decrypting the data, but at least it will continue to protect against tampering in the moment and against viewing data in the near term.
Wrap up
There are many threats to data communication security in today's world, and the situation promises to get even worse in the future – especially where hacking at the government level is concerned. Several advanced countries already have massive programs for data collection and cyber-sabotage. Those that don't already have such programs are likely to build them in the future as the cyber arms race heats up, so even if you trust your own government with your data you need to be concerned about the actions of other governments which may not be so “well-intentioned.”
In this article you've learned some ways of making it more difficult for anyone to see or alter your data exchanges. This type of hardening approach is an important part of security, but it's far from the only part. For instance, it doesn't do much good to use perfect forward secrecy for your data exchanges if the data is then just stored in a poorly-secured database. Real security requires multiple layers of protection around your systems and applications. It also requires defensive thinking, where you treat everything that comes into your system as suspect until it's proven safe. So be careful what and who you trust, and keep your secrets safe!
About the Author
Dennis Sosnoski is a Java and Scala developer with a strong security background developed over many years of work on data communications and enterprise systems. He provides consulting and training services to clients worldwide, especially in the area of web services (including his Web Services Security class), and is soon to release the first of several products for securing personal data exchanges through his new site Azdeo. Dennis is also active in the Java community, as a frequent speaker at users groups and conferences, a writer for online technical publications, and contributor to a range of open source projects (including the Apache CXF web services framework, where he's currently integrating WS-ReliableMessaging with WS-SecureConversation). See his consulting and training services website for details on his background and services.