Wednesday, December 11, 2013

Perfect Forward Secrecy

There comes a time in every security consultant's career when they must step up to the plate and rehash some old topics in a blog. It is a right of passage to go out and share some insight into an interesting technical topic, often one that has already been covered to death, with hopes that their point of view will be just a tiny bit more interesting or understandable than the last article's.  Well this is that time for me and I've decided to write on the topical subject of Perfect Forward Secrecy.

I like the idea of discussing Perfect Forward Secrecy (PFS) for a few reasons. First, the term has become a very notable buzz word lately in relation to the recent government scandals we've all grown to love. Second, the math behind it is fairly simple to understand and it requires only a basic understanding of SSL and Public-key cryptography to get a grasp of the ideas. Third, it is not supported nearly enough and hopefully, when more people understand why it's absolutely necessary, more organizations will decide it is important enough to require.

Perfect Forward Secrecy is a property of cryptography. It is implemented in certain cipher suites of SSL and offers tremendous privacy benefits. But before we go into what PFS accomplishes and how it works, lets talk about what we are trying to fix.

What is the problem?


We need the ability to communicate privately over the web. This fact is not debatable. A modern user of the web must have the ability to visit sites and communicate knowing that their sensitive data (passwords, credit card numbers, etc) will arrive where it is intended to go without being viewed or tampered with. Right now SSL/TLS is our browsers' primary protection from eavesdropping over the net.  It uses a combination of asymmetric and symmetric cryptography to accomplish this.

When you enter a website and you see the little lock symbol next to the URL this informs you that your session is encrypted over SSL.  A lot of things are happening in the background during this communication.

Lets say for example you are visiting your bank's website that communicates over SSL.  A very simplified view of this connection is the following:
  1. Your browser asks for the bank's certificate, or public key, and verifies it is legitimate.
  2. Your browser creates a new, random, one-time, shared secret key.
  3. You encrypt this new secret with the banks public key and send it to the bank over the internet.
  4. The various properties of asymmetric cryptography prove that the only way to decrypt a message encrypted with a public key is with the matching private key. The bank is the only entity with this matching private key and uses it to decrypt the message and obtain the shared secret from earlier.
  5. All communications onward for the duration of the session are encrypted with the shared secret. Only you and the bank know this secret and thus only you and the bank can decrypt these messages.
  6. Once your session has completed the shared secret is discarded by both parties and never used again.
                  You                                  Internet                                      Bank                          
_____________________________________________________________________________

1)          [ pubkey ]             <-------------------------------------                  [ pubkey ]

2)           [ secret ]

3)     [ encrypt(secret, pubkey) ]         --------------->                [ encrypt(secret, pubkey) ]

4)                                                                              [ decrypt(encrypt(secret,pubkey),privkey) ]
                                                                                                           =
                                                                                                       [ secret ]



5)    [ encrypt(message, secret) ]        -------------->             [ encrypt(message, secret) ]

                                                                            [ decrypt(encrypt(message, secret),secret) ]
                                                                                                           =
                                                                                                    [ message ]


       [ encrypt(message2, secret) ]            <-----------           [ encrypt(message2, secret) ]

 [ decrypt(encrypt(message2, secret),secret) ]
                         =
                  [ message2 ]                                                                        
_____________________________________________________________________________


Now, there are a lot of points in this process where things can go drastically wrong. The public key used in step 1 can be fake or compromised through various methods, the secret/private keys may be small enough to bruteforce or crack, the cryptographic functions used to encrypt and decrypt might be provably insecure, or the random number generators might be compromised.  All of these are valid threat vectors for breaking SSL, but we are not going to focus on any of them today. We are going to assume that everything above just works.

Under our current assumptions we assume that this methodology accomplishes what it is set out to do.  You are choosing a secret that only you and your bank's website know and using it to encrypt all of your communication.  You are able to tell the bank, and only the bank, this secret because you encrypted it with their real public key and the only way they can decrypt it is with their real private key.

The pressure point we are going to explore here is around an attacker stealing the bank's private key.  If an attacker obtains a private key they will have the ability to impersonate whomever the true owner is and it is game over.  The attacker can trick your browser into thinking it is the bank in question and decrypt any messages you send to it without any warning.

But what makes this problem an interesting problem?  There are already various methods of responding to this kind of breach in ways where the attacker can't use old and stolen private key. These include revoking the certificate and in some cases shutting down the service altogether.

What makes this threat interesting is when an attacker is storing all the encrypted internet traffic from across the planet in their colossal data centers until the end of time.  With the protocol defined above an attacker would only need to obtain a private key,  possibly months or even years later, and decrypt the message sent in step 3.  By decrypting that message the attacker obtains the shared secret and can decrypt all the private communications of that session.

We are now aware that this not only something that can be done, but something that is actively being done.

This is the problem.  This is what Perfect Forward Secrecy so elegantly solves.

What is PFS?


Perfect Forward Secrecy is a concept where communication between two parties is not only encrypted, but will remain encrypted even if their primary encryption keys are later stolen. In the context of SSL it is an exchange that allows a browser and a website to agree on a shared secret without ever needing to say what that secret is over the wires. This way even when private keys are stolen, they can not be used to decrypt the shared secret. Since the shared secret is needed to decrypt every subsequent message in the session the stored communication remains indecipherable.

The math behind this is surprisingly simple and intuitive.  It uses the Diffie-Hellman key exchange to create the shared secret using some simple properties of integers.

When simplified, Diffie-Hellman works something like this:

  1. Your browser comes up with two numbers that anyone can know, p and g. lets say ours are p=5 and g=3.
  2. Your browser then picks a secret number that only it knows, a.  Lets say a=4
  3. Your browser sends the bank those two public numbers, p and g, along with (g^a) mod p.  
    • (3^4) mod 5 = 81 mod 5 = 1
    • (g^a) mod p = 1
  4. The bank then picks a secret number b, we'll say b=7, and sends back (g^b) mod p.
    • (3^7) mod 5 = 2187 mod 5 = 2
    • (g^b) mod p =2
  5. Since ( (g^a)^b) and ( (g^b)^a) are equal mod p, both parties can calculate ( (g^a)^b) mod p and agree that this value will be the shared secret.  
    • The bank can compute this because it knows b, p, and (g^a) mod p:
      • ( ( (g^a) mod p) ^ b) mod p = ( (g^a)^b) mod p
      • ( (1) ^ 7) mod 5 = 1
      • Shared Secret = 1
    • The browser can compute this because it knows a, p, and (g^b) mod p:
      • ( ( (g^b) mod p) ^ a) mod p = ( (g^a)^b) mod p
      • ( (2) ^ 4) mod 5 = 16 mod 5 = 1
      • Shared Secret = 1
    • Most interestingly, anyone who has intercepted the traffic will know p, g(g^a) mod p, and (g^b) mod p, but there is no efficient way to generate ( (g^a)^b) mod p from those values. Thus an attacker who can see this traffic will still not know the shared secret.
    * Note: the algorithm above was highly simplified and used small numbers for clarity.  To ensure the shared secret is not cracked p must be prime, g  must be primitive root mod p, and all integers must be much larger.

If you add this key exchange method into the graph from before it will look something like this:

  1. Your browser asks for the bank's certificate, or public key, and verifies it is legitimate.
  2. Your browser generates p, g and a for its side of the Diffie-Hellman exchange.
  3. You  encrypt the Diffie-Hellman parameters with the banks public key and send it to the bank over the internet.
  4. The bank generates a value for b and responds in the clear with its Diffie-Hellman parameters.
  5. Both the browser and the bank now agree on a shared secret that was never explicitly stated. 
  6. All the rest of the communication is encrypted with that shared secret that only you and the bank know.
  7. Once your session has completed the shared secret, variable a, and variable b are all discarded by both parties and never used again.


                  You                                       Internet                                Bank                          
_____________________________________________________________________________

1)             [ pubkey ]             <-----------------------------------                [ pubkey ]

2)                [ a ]

            [ p,g, (g^a) mod p ]

3)    [ encrypt((p,g, (g^a) mod p), pubkey) ]      ----->        [ encrypt((p,g, (g^a) mod p), pubkey) ]

                                                               [ decrypt(encrypt((p,g, (g^a) mod p),pubkey),privkey) ]
                                                                                                                 =
                                                                                                    [ p,g, (g^a) mod p ]



4)                                                                                                       [ b ]

               [ (g^b) mod p ]                     <-----------------------          [ (g^b) mod p ]

         

 5)     secret = [ ( (g^a)^b) mod p ]                                        secret = [ ( (g^a)^b) mod p ]  



 6)    [ encrypt(message, secret) ]        ----------------->          [ encrypt(message, secret) ]

                                                                              [ decrypt(encrypt(message, secret),secret) ]
                                                                                                             =
                                                                                                      [ message ]


       [ encrypt(message2, secret) ]            <-------------               [ encrypt(message2, secret) ]

 [ decrypt(encrypt(message2, secret),secret) ]
                         =
                  [ message2 ]                                                                      
_____________________________________________________________________________

The server's public key is still used to guarantee authenticity of the web server. The new shared secret is freshly generated every session and once the Diffie-Hellman parameters have been deleted, the private key can not be used against stored traffic.

What's next?


We have to live with the fact that all encrypted internet traffic will be stored permanently.  We also have to live with the fact that the websites we trust so dearly may eventually have to surrender their private keys.  What we don't have to live with is that those private keys will be able to decrypt our session's data years later.  There are already SSL cipher suites that support PFS such as the following:

ECDHE-RSA-AES128-GCM-SHA256
ECDHE-RSA-RC4-SHA
DHE-RSA-AES256-SHA
To identify suites with PFS look for either DHE (Ephemeral Diffie-Hellman) or ECDHE (Elliptic Curve Ephemeral Diffie-Hellman) in the name. In order to mitigate this threat these suites must be supported and prioritized by browsers and web servers by default.

Hopefully over time web servers will begin to support PFS suites by default in the same way that many sites are now migrating to SSL all the time. I'm crossing my fingers that this happens sooner than later.