This document attempts to describe Secure Sockets Layer, or SSL, in simple terms. A full explanation is a daunting task, as SSL depends on several layers of complex software and some mathematical concepts, but once these are grasped, SSL itself is not too bad!

The first thing you need to know about Secure Sockets Layer is that it is
no longer called that. For purist reasons, the protocol formerly known as
SSL is now called Transport Layer Security, or TLS. The reasons for this
name change are fairly esoteric and originate partly in a description of
networking architecture as the Open
Systems Interconnection seven-layer networking model. SSL hovered
uncomfortably between the transport layer (4) and the presentation layer
(6), and some experts felt, long after the OSI model had fallen out of
fashion, that SSL was not really a layer at all. In the more recent TCP/IP
model, which has fewer layers, SSL operates somewhere between the
transport and application layers. Also, the protocol can, in principle,
be applied to networking protocols other than just sockets, even though
the vast majority of global communication is now made using the socket
programming interface.

So, just at the time when the terms SSL and Secure Sockets Layer were becoming familiar to the general public as the secure networking standard that all Internet services should aspire to, the guardians of the protocols decreed that it should be renamed, to the confusion of all. The same purist mentality led to the emerging popular term of URL being replaced by URI (a general term that incorporates both URNs and URLs).

In spite of the name change, TLS is not vastly different from SSL.
Internally, messages contain an SSL `ProtocolVersion` number
within them. This was 0002 for the original Version 2 of SSL, and 0300 for
SSL Version 3. (SSL Version 1 was so insecure that it was never released.)
When TLS Version 1 was introduced, the internal protocol version number
was only increased to 0301. TLS 1.1 became 0302, TLS 1.2 became 0303,
and TLS 1.3 will probably be 0304.
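As a small illustration, the version numbers quoted above can be turned into names with a lookup table. This is only a sketch: the table and the function name `describe_version` are invented for this example, and it assumes the two bytes are read as a major number followed by a minor number.

```python
# Hypothetical lookup table built from the version numbers quoted in the text.
PROTOCOL_VERSIONS = {
    (3, 0): "SSL 3.0",
    (3, 1): "TLS 1.0",
    (3, 2): "TLS 1.1",
    (3, 3): "TLS 1.2",
}


def describe_version(raw: bytes) -> str:
    """Name the protocol for a two-byte ProtocolVersion field (major, minor)."""
    if len(raw) != 2:
        raise ValueError("ProtocolVersion is exactly two bytes")
    major, minor = raw[0], raw[1]
    return PROTOCOL_VERSIONS.get((major, minor), f"unknown ({raw.hex()})")


print(describe_version(bytes.fromhex("0301")))  # TLS 1.0
print(describe_version(bytes.fromhex("0303")))  # TLS 1.2
```

Notice that a message labelled TLS 1.0 still carries a major version of 3, which shows how closely TLS follows on from SSL Version 3.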

Almost every product that implements TLS continues to refer to it as SSL,
usually with some weasel words added like "more properly known as TLS".

I know most people are turned off by mathematics, but before I can explain some of the cryptography terms, I need to introduce a few mathematical concepts. Many of these may be familiar, but I will restate them anyway as a reminder.

- Integer
- This is the mathematicians' posh name for a whole number; that is, a number with no fractional part.
- Composite number
- When two or more integers greater than 1 are multiplied together to produce another number, the result is called a composite number. Some examples are: 2×3=6, 12×5=60.
- Factors of a number
- The integers that can be multiplied together to produce a number are called its factors. For many numbers, several different factors can form the number in different ways. For example, 60 = 60×1 = 30×2 = 20×3 = 15×4 = 12×5 = 10×6 = 5×6×2 = 5×4×3 = 5×3×2×2. So the factors of 60 are: 60, 30, 20, 15, 12, 10, 6, 5, 4, 3, 2, 1.
- Prime number
- A number whose only factors are itself and 1 is called a prime number. By convention, 0 and 1 are neither prime nor composite.
- Prime factors
- The factors that can be multiplied together to produce a number, but which are not, themselves, composite. So the prime factors of 60 are just 5, 3, 2.
- Prime factorisation
- This is the process of discovering the prime factors of a number. If the process does not find any prime factors other than the number itself, the number is a prime number.
- Co-prime numbers
- Two numbers are co-prime if they do not share any factors (except 1). So 60 is co-prime with 77 because none of the prime factors of 60 (5, 3, and 2) are equal to those of 77 (11 and 7).
- Powers and exponents
- When the same number is multiplied by itself several times, the process is called raising that number to a power. For example 2×2×2×2 is called "two raised to the fourth power", or just "the fourth power of two". It is written in shorthand form as 2^{4}. The power to which the number is raised is called its exponent. So 4 is the exponent of 2 in this expression. A number raised to the power of one is just the number itself, and a number raised to the power of zero is always equal to one: so 2^{1} = 2, and 2^{0} = 1.
- Binary notation
- Any number can be represented as the sum of a list of powers of 2. For example 13 = 8 + 4 + 1 = 2^{3} + 2^{2} + 2^{0}. A binary number is a representation of a number by 1s and 0s, where a 1 represents a power of two that is actually present in the sum, and a zero represents a power of two that is absent, when the powers are listed in decreasing size. So, again, 13 = 1×2^{3} + 1×2^{2} + 0×2^{1} + 1×2^{0}. To form the binary notation, the power expressions and the plus signs are removed, leaving only the 0s and 1s. So this example is written as 1101. The ones and zeroes in such a number are called binary digits, an expression that is almost always shortened to bits.
- Hexadecimal notation
- For all but very small numbers, the binary representation of a number is very long-winded, so hexadecimal notation is used instead. Each number is represented as the sum of a list of powers of sixteen. For example 2439 = 9×16^{2} + 8×16^{1} + 7×16^{0}, and 2748 = 10×16^{2} + 11×16^{1} + 12×16^{0}. Just as in the binary notation, the power expressions and the plus signs are then removed. But since the two-digit multipliers 10, 11, 12, 13, 14, 15 would lead to ambiguous number forms, they are replaced by the letters A, B, C, D, E, F (in upper or lower case). So the hexadecimal representations of the two example numbers are 2439 = 987x, and 2748 = ABCx (where the x suffix denotes a hexadecimal number^{†}). There is an easy conversion between binary notation and hexadecimal notation, as four bits map into a single hexadecimal digit, so every eight-bit byte can be mapped by a pair of hexadecimal digits.

  ^{†}In this document, I am not using the C-language notation for hexadecimal (0xhhhh, where h is any hexadecimal digit). I find it extremely clumsy, and it gives the wrong impression that the hexadecimal notation can *only* be specified in this form. There are actually several different ways of annotating hexadecimal used in different programming languages, and the one used in C and its descendants is, in my opinion, one of the most obtuse. The strict mathematical notation is to use the base 16 as a subscript, as in hhhh_{16}, but this is somewhat more difficult to type and to read.
- Modular arithmetic
- Modular arithmetic is the arithmetic of a finite number of integers less than a certain fixed number, called the modulus. An expression in modular arithmetic is followed by the abbreviation mod, followed by the modulus itself. mod is an abbreviation for the Latin word *modulo*, which is the ablative case of *modulus*, meaning "with the modulus". An example of its use, with a modulus of 12, is: 7+8 ≡ 3 (mod 12). Notice that the usual equals sign is not used. The ≡ sign, with three bars, means "is congruent to", with the specific meaning of equality within modular arithmetic. In modular arithmetic, the modulus is congruent to zero. In the example, 7 and 8 sum to 15, but because this exceeds the modulus 12, the excess 12 is subtracted from 15 to leave 3. Probably the most familiar example of modular arithmetic is the 12-hour clock: when you add eight hours to seven o'clock, you get three o'clock, because the hours wrap around at twelve o'clock. Hours start at 12 o'clock, so the modulus 12 is also the zero.

  Modular arithmetic is very important in computing because integers are usually represented in a computer by a memory location of fixed size, called a word. In all but the most modern computers, the word size is usually 32 bits. Only integers up to 2^{32}−1 (4,294,967,295) can be represented in a 32-bit word. (In older computers, the word size could be eight or sixteen bits. In very modern computers, the word size is 64 bits, and numbers up to 2^{64}−1 (18,446,744,073,709,551,615) can be represented.) By convention, numbers in modular arithmetic are always positive. But, in a computer, numbers with a high-order bit set to one are frequently treated as negative, so the range of signed integers that can be represented in a 32-bit word is −2^{31} to 2^{31}−1 (−2,147,483,648 to 2,147,483,647) or, in a 16-bit word, −2^{15} to 2^{15}−1 (−32,768 to 32,767), which is the point of the comic above. Notice that the sign convention does not affect the binary (or hexadecimal) value of the number in computer memory: the bit pattern for −1 is *identical* to that for 4,294,967,295 and is FFFFFFFFx. When +1 is added to this number, the result is zero in both cases, because modular arithmetic is used:

  −1 + 1 ≡ 0 (mod 4294967296)

  4294967295 + 1 ≡ 0 (mod 4294967296)

- Primitive root
- In a set of integers using modular arithmetic, a number is a primitive root if every number co-prime to the modulus is congruent to some power of that root. That is, `r` is a primitive root if, for every `p` that is co-prime to `n`, there is an integer `i` such that `r^{i} ≡ p (mod n)`.
- Elliptic Curve
- An elliptic curve function is a relationship between `x` and `y` of the form `y^{2} = x^{3} + ax + b`. This is not the equation of an ellipse. The term arose from the study of elliptic integrals, whose name arose because they were originally used to calculate the arc length of an ellipse.
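Several of these definitions can be checked with a few lines of Python. The following is only a sketch for small numbers: the helper names `prime_factors` and `is_primitive_root` are invented for this example, and trial division is far too slow for the very large numbers used in real cryptography.

```python
from math import gcd


def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (with repetition) by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors


def is_primitive_root(r: int, n: int) -> bool:
    """True if every number co-prime to n is congruent to some power of r (mod n)."""
    targets = {p for p in range(1, n) if gcd(p, n) == 1}
    reached = {pow(r, i, n) for i in range(1, n)}
    return targets <= reached


print(prime_factors(60))                    # [2, 2, 3, 5]
print(gcd(60, 77) == 1)                     # True: 60 and 77 are co-prime
print((7 + 8) % 12)                         # 3, the clock example
print(format(13, "b"), format(2748, "X"))   # 1101 ABC
print((int("FFFFFFFF", 16) + 1) % 2**32)    # 0: a 32-bit word wraps around
print(is_primitive_root(3, 7))              # True
```

The three-argument form `pow(r, i, n)` performs the exponentiation and the reduction by the modulus in one step, which is the operation that the later cryptographic algorithms rely on.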

The SSL architecture depends heavily on three types of computer cryptography. These are:

- Digest cryptography
- In which a message is irreversibly scrambled and shortened. No key is required.
- Symmetric-key cryptography
- In which a single key is used to encrypt and decrypt a message.
- Asymmetric-key cryptography
- In which a pair of keys are used to encrypt and decrypt a message. The two keys are related in such a way that a message encrypted with one of them can only be decrypted with the other.

Digest and Asymmetric-key cryptography can be combined to produce digital signatures.

At first, it seems like the irreversible digest cryptography would be
fairly useless. But it is extremely powerful in validating the **integrity**
of a message: that is, whether a message has been correctly preserved
during processing.

The following examples use one of the simplest digest methods, known as MD5:

    MD5(Jack and Jill went up the hill.) = FDA1D3E3B885E20D8B53490CA945F4B0x
    MD5(Jack and Jill went up the hall.) = FF6624FC67385074B7D97233609CE839x
    MD5(Pay the bearer the sum of $1000.) = 833E3213A371D8E50CC5335C9197B37Ex
    MD5(Pay the bearer the sum of $9000.) = 677A105D4E03DE53D4307E689455B069x

Note that, whatever the length of the message being digested, the MD5 result is always exactly 32 hexadecimal digits, which is 16 bytes, or 128 bits. Furthermore, when a message is modified with only a tiny change – even just a single bit – its digest changes dramatically. This is because the internal calculations are actively encouraged to overflow the normal capacity of a 32-bit computer word, so that some data bits are deliberately lost. Because of this data loss, the original message is irretrievably lost, and cannot be recovered from the digest.
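Digests like these can be calculated with Python's standard `hashlib` module. This is a minimal sketch: the helper name `md5_hex` is invented here, and it assumes the message is encoded as UTF-8 with no trailing newline, so the output may differ from digests produced with a different encoding.

```python
import hashlib


def md5_hex(message: str) -> str:
    """Return the MD5 digest of a text message as 32 upper-case hex digits."""
    return hashlib.md5(message.encode("utf-8")).hexdigest().upper()


for text in ("Jack and Jill went up the hill.",
             "Jack and Jill went up the hall."):
    digest = md5_hex(text)
    # The digest length is fixed, whatever the length of the message.
    print(len(digest), digest, text)
```

Changing a single letter of the input gives a completely different digest, while the length stays at 32 hexadecimal digits.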

A message digest is also known as a hash. More modern digest algorithms are SHA-1, SHA-2, and the proposed SHA-3, where SHA means Secure Hash Algorithm. The MD5 and SHA-1 algorithms are nowadays regarded as inadequate for use in secure systems.

Symmetric-key cryptography is probably the most familiar form of encryption. An encryption algorithm applies an encryption key to a message containing plain text, and produces a message containing cipher text. Later, an opposite decryption algorithm applies the same key to the cipher text to recover the plain text. The algorithm typically rotates and manipulates the bits of the plain text in a predictable but difficult-to-reverse way, using the content of the encryption key, but none of the data is lost.

One of the first commonly used symmetric-key computer algorithms was the
Data Encryption Standard (DES), which was adopted in 1977. This requires
an eight-byte secret key, but only 7 bits in each byte are used, so it is
referred to as a 56-bit key. This implies that there are only 2^{56}
(about 72 quadrillion, or 72×10^{15}) different DES keys,
which in principle could be used to "brute-force" the decryption of
a cipher text. With modern computing power, this is not totally
infeasible, so increased key sizes and more complex algorithms have
been introduced. Examples of these are: Triple DES (3DES), which applies
the DES algorithm three times, with three different keys, so increasing
the nominal key size to 168 bits; and the Advanced Encryption Standard
(AES), which uses key sizes of 128, 192 or 256 bits. Newer encryption
algorithms include ARIA (Korea), Camellia (Japan), SEED (Korea), and GOST
(Russia).
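The effect of key size on a brute-force search can be estimated with simple arithmetic. The sketch below is purely illustrative: the trial rate of one billion keys per second is an arbitrary assumption, not a measurement of any real hardware, and the function names are invented for this example.

```python
def keyspace(bits: int) -> int:
    """Number of different keys for a key of the given length in bits."""
    return 2 ** bits


def years_to_search(bits: int, keys_per_second: float) -> float:
    """Time to try every key once, in years, at the given trial rate."""
    seconds = keyspace(bits) / keys_per_second
    return seconds / (365 * 24 * 3600)


RATE = 1e9  # assumption: one billion trial decryptions per second

for name, bits in [("DES", 56), ("3DES (nominal)", 168),
                   ("AES-128", 128), ("AES-256", 256)]:
    print(f"{name}: {keyspace(bits):.3e} keys, "
          f"about {years_to_search(bits, RATE):.3g} years to try them all")
```

Each additional key bit doubles the search time, which is why moving from 56 bits to 128 bits or more puts exhaustive search out of reach.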

Asymmetric-key cryptography is really the miracle technology that makes
SSL work. It is also known as public-key encryption. In principle, a pair
of digital keys are used, known as the public key and the private key. A
message encrypted with the private key can *only* be decrypted
with the public key, and a message encrypted with the public key can *only*
be decrypted with the private key. It seems that the public and private
keys are therefore interchangeable (so long as one of them is kept
secret).

In practice, the situation is somewhat different. The primary set of algorithms that make public-key encryption work are known as the Rivest, Shamir, and Adleman algorithms, also known as RSA. The mathematical principle is based on the difficulty of factorising very large numbers into their constituent prime factors.

The algorithm to choose a public key/private key pair is the following:

- Choose two large but different prime numbers, `p` and `q`.
- Calculate the product `n` of these two primes: `n = p×q`.
- Calculate the value `φ` equal to `(p−1)×(q−1)`. This is a count of the numbers less than `n` that do not share a common factor with `n`, and is known as the Euler totient function of `n`.
- Choose an integer `e` between `1` and `φ` that is co-prime with `φ`: that is, such that `e` and `φ` have no common factors.
- Determine the number `d` such that `d×e ≡ 1 (mod φ)`; that is, the number `d` such that the product `d×e` gives the remainder `1` when divided by `φ`.
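The steps above can be followed with toy numbers in Python. This is a sketch only: the function name `make_rsa_numbers` is invented for this example, the primes 61 and 53 are far too small to be secure, and the code does not check that `p` and `q` really are prime.

```python
from math import gcd


def make_rsa_numbers(p: int, q: int, e: int) -> tuple[int, int, int]:
    """Follow the key-generation steps and return (n, e, d)."""
    if p == q:
        raise ValueError("p and q must be different primes")
    n = p * q                      # the product of the two primes
    phi = (p - 1) * (q - 1)        # Euler totient of n
    if not 1 < e < phi or gcd(e, phi) != 1:
        raise ValueError("e must lie between 1 and phi and be co-prime with phi")
    d = pow(e, -1, phi)            # d×e ≡ 1 (mod phi); needs Python 3.8 or later
    return n, e, d


n, e, d = make_rsa_numbers(61, 53, 17)
print(n, e, d)                         # 3233 17 2753
print((d * e) % ((61 - 1) * (53 - 1)))  # 1
```

Real key generators choose `p` and `q` at random, each hundreds of digits long, so that `n` cannot be factorised in any practical time.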

Now the public key consists of the **two** numbers `n` and
`e`, and the private key consists of the **two** numbers `n`
and `d`. The number `n` is known as the modulus
for this key pair, `d` is known as the decryption exponent, or
the private exponent, and `e` is known as the
encryption exponent, or the public exponent. The size of the
key is usually counted as the number of bits of the modulus `n`.

In the original
RSA paper, the algorithm suggests choosing the private exponent `d`
first, and then calculating `e` from it, but modern practice
reverses this. In fact, the public exponent `e` is usually fixed
as 65537, which is 2^{16}+1, the largest known
Fermat prime, also called F4. This, of course, means that the
private and public keys are not interchangeable.

To perform encryption of a message, the message itself is regarded as a
binary integer, `m`. (In the end, all binary strings are
actually very large binary numbers.) The encrypted cipher text is then
equal to `c ≡ m^{e} (mod n)`,
which uses the public key numbers `n` and `e`.

The calculation of the exponents `m ^{e}` and

Symmetric-key cryptography is by far the most efficient way of encrypting large amounts of data. But it shows up a major problem. To use it, both partners in the conversation must know a single shared secret key. And how do you share a secret key with a partner that you have never communicated with before? This is known as the key exchange problem. An early solution to this problem was for a trusted courier to carry the key, physically locked in a secure container, from one location to another. This is highly secure, but expensive and inconvenient, and hardly practical for electronic commerce. Furthermore, the same key is used for encrypting all traffic, which makes it somewhat easier for an attacker to break the key.

The solution to the key exchange problem lies in asymmetric-key cryptography. Although it is computationally expensive, it only needs to be used briefly to exchange a key between partners. Once the key has been exchanged securely, it can be used in a much more efficient symmetric-key algorithm to encrypt the main data traffic, known as the bulk data transfer.

The RSA asymmetric key algorithm can be used to perform a key exchange.
One of the partners in the conversation (usually identified as the client) uses a random number generator
to create a secret key. The client then uses the public key of the other partner
(usually identified as the server) to encrypt the secret key.
By the properties of asymmetric encryption, only the server, which knows the corresponding private key, can decrypt the secret.
The random number created by the client thus becomes a "shared secret" that is known only by the client and the server.
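The exchange can be sketched in Python with a toy RSA key. Everything here is an assumption made for illustration: the key numbers (modulus 3233, public exponent 17, private exponent 2753) are far too small for real use, the function names are invented, and real implementations add padding to the secret before encrypting it.

```python
import secrets

# Toy server key pair (assumed values; n = 61×53, and 17×2753 ≡ 1 mod 3120).
N, E, D = 3233, 17, 2753


def client_make_secret() -> tuple[int, int]:
    """Client: pick a random secret and encrypt it with the server's public key."""
    secret = secrets.randbelow(N - 2) + 2   # random integer from 2 to N-1
    encrypted = pow(secret, E, N)           # c ≡ m^e (mod n)
    return secret, encrypted


def server_recover_secret(encrypted: int) -> int:
    """Server: decrypt with the private exponent, m ≡ c^d (mod n)."""
    return pow(encrypted, D, N)


client_secret, wire_value = client_make_secret()
server_secret = server_recover_secret(wire_value)
print(client_secret == server_secret)   # True: both sides now hold the same secret
```

Only `wire_value` crosses the network, so an eavesdropper who does not know the private exponent cannot recover the secret directly.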

As we shall see later, SSL uses a slightly more sophisticated way to obtain a shared secret, but the principle is essentially the same.

In the earliest version of SSL, RSA was the only key exchange mechanism that was used.

A problem with the simple key exchange using RSA is that the same key is used for both authentication and encryption. If a private key is ever compromised in the future, then historical messages encrypted with it can be decrypted, possibly years later, if a copy of the encrypted conversation was preserved at the time. The solution to this problem is known as forward secrecy.

A technique that can be used to solve the forward secrecy problem is the Ephemeral Diffie-Hellman key exchange.

The Diffie-Hellman key exchange proceeds in the following steps, in which two values are exchanged:

- Before exchanging the secret key, the client and server agree on a prime modulus `p` and a primitive root `g (mod p)`. In this context, `g` is called the generator.
- The client creates a private key `a`, calculates `A ≡ g^{a} (mod p)`, and sends it to the server.
- The server creates a private key `b`, calculates `B ≡ g^{b} (mod p)`, and sends it back to the client.
- The client calculates the shared secret key from `s ≡ B^{a} (mod p)`.
- The server calculates the shared secret key from `s ≡ A^{b} (mod p)`.
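The steps above can be tried out with small numbers. This sketch assumes a toy modulus of 23 with generator 5 (which is a primitive root of 23); the function names are invented for this example, and real exchanges use moduli that are thousands of bits long.

```python
import secrets

P = 23   # toy prime modulus (assumption for illustration)
G = 5    # generator: a primitive root mod 23


def make_private_key() -> int:
    """Pick a random private key from 1 to P-2."""
    return secrets.randbelow(P - 2) + 1


def public_value(private_key: int) -> int:
    """Value sent to the partner: g^key (mod p)."""
    return pow(G, private_key, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Secret derived from the partner's public value and my own private key."""
    return pow(their_public, my_private, P)


a = make_private_key()          # client's private key
b = make_private_key()          # server's private key
A = public_value(a)             # sent from client to server
B = public_value(b)             # sent from server to client
print(shared_secret(B, a) == shared_secret(A, b))   # True
```

Both sides arrive at the same value because `(g^b)^a` and `(g^a)^b` are the same number modulo `p`. If fresh private keys are created for every session and then discarded, a later compromise of a long-term key does not expose earlier sessions, which is the point of the ephemeral variant.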

A more complex form of public-key cryptography that can be used for key exchange is Elliptic Curve cryptography, which is described in A (Relatively Easy To Understand) Primer on Elliptic Curve Cryptography by Nick Sullivan. A more comprehensive but much more difficult description is at Standards for Efficient Cryptography 1 (SEC 1): Elliptic Curve Cryptography. A fairly accessible mathematical discussion is An Introduction to the Theory of Elliptic Curves by Joseph H. Silverman. I have also written a Java applet which visualises elliptic curves geometrically. The documentation with the applet also gives my simplified overview of the mathematical basis for elliptic curve cryptography.

An elliptic curve is generated by the cubic equation `y^{2} = x^{3} + ax + b`.
This curve has an interesting property that (with a few exceptions)
any straight line that intersects it in two points also intersects it in exactly one other point.
The three collinear points can be used to define a mathematical group addition property.
If the three points where the line meets the curve are `P`, `Q`, and `R`, then the sum of `P` and `Q` is defined as the reflection of `R` in the x-axis. This is the "addition" of two points to produce a third. This process can then be applied over and over to produce a fourth, fifth, sixth point, etc. Furthermore, the process can be kicked off from a single point on the curve by drawing a tangent at the first point to generate the second point. The number of iterations to get from a specified initial point to another identified point is, in general, extremely difficult to calculate, and this fact is used as the basis for elliptic-curve cryptography. The problem is called the elliptic curve discrete logarithm problem.

Although elliptic curves are usually defined using real numbers, for
cryptography the elliptic curves are defined over a finite set of
integers, using modular arithmetic with a prime modulus, `p`.
The curve is therefore defined as:

`y^{2} ≡ x^{3} + ax + b (mod p)`

Because only integer coordinates are used, the curves actually consist of a finite
set of disconnected points. This is only possible for certain values of `a`,
`b`, and `p`. In practice, rather than inventing
elliptic curves with arbitrary parameters, a number of standard curves are
used, such as those described at Recommended
Elliptic Curve Domain Parameters.
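Point addition over a finite set of integers can be sketched in a few lines of Python. The curve used here, `y^{2} ≡ x^{3} + 2x + 2 (mod 17)` with starting point (5, 1), is a small textbook-style example chosen for illustration and is not one of the standard curves; the function names are also invented. The sum of two points is taken as the reflection of the third point on the line through them, and `None` stands for the extra "point at infinity" that acts as the zero of the group.

```python
A, B, P = 2, 2, 17      # toy curve parameters a, b and prime modulus p
INFINITY = None         # identity element of the point group


def on_curve(point) -> bool:
    """Check that a point satisfies y^2 ≡ x^3 + ax + b (mod p)."""
    if point is INFINITY:
        return True
    x, y = point
    return (y * y - (x * x * x + A * x + B)) % P == 0


def add(p1, p2):
    """Add two points on the curve."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INFINITY                                  # vertical line
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P   # tangent
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P          # chord
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(k: int, point):
    """Add a point to itself k times, using repeated doubling."""
    result, addend = INFINITY, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


G = (5, 1)
print(add(G, G))         # (6, 3)
print(multiply(3, G))    # (10, 6)
```

Calculating `multiply(k, G)` is quick even for enormous values of `k`, whereas recovering `k` from the resulting point is the elliptic curve discrete logarithm problem.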

A combination of digest encryption and public-key encryption can be used to produce an unforgeable digital signature. The sender first creates a digest of the document to be signed – a piece of plain text – using one of the cryptographic digest algorithms. Then the digest (only) is encrypted with the sender's private key. The result is the digital signature. When the recipient receives the plain text and the signature, he or she must recreate the digest of the plain text, using the identical algorithm. They must then decrypt the signature using the public key of the sender, to recover the sender's version of the digest. If both versions of the digest match, the signature is validated. The fact that the public key successfully decrypts the signature proves that it must have been encrypted by the sender. The fact that the decrypted digest matches the receiver's recalculated digest proves that the plain text has not been altered since it was signed.

Note that encryption with the private key does not produce *secret*
cipher text, because anyone who knows the public key can decrypt it. But
it does assure anyone who does the decryption that the cipher text was
created by the owner of the private key.
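The signing and checking steps can be sketched with Python's `hashlib` and a toy RSA key. All the specifics are assumptions for illustration: the key numbers are far too small to be secure, the function names are invented, and the digest is reduced modulo `n` only so that it fits the toy modulus. Real signature schemes pad the digest instead.

```python
import hashlib

# Toy RSA key (assumed values): modulus, public exponent, private exponent.
N, E, D = 3233, 17, 2753


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of the message, squeezed to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the digest with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Pay the bearer the sum of $1000."
signature = sign(message)
print(verify(message, signature))    # True
# An altered message will almost certainly fail; with a modulus this small
# an accidental match is possible, which is one reason real keys are large.
print(verify(b"Pay the bearer the sum of $9000.", signature))
```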

For many years, software implementing strong encryption was regarded as a
munition, and the export of software or hardware containing it from the USA was
subject to extreme controls.
Weaker encryption, using shorter keys or weaker algorithms, was graciously permitted to be
exported from the USA, and so was known as "export grade" encryption.
This is of course a very US-centric view: anything that is exported from the USA is imported by the other country,
so every other country in the world should have called it "import grade".
In France, the implementation of even "export grade" encryption was subject to government scrutiny,
and a special licence was required to use software that used keys longer than 40 bits.
The French government removed this requirement in 1999.

This situation led to a couple of articles in the 1990s ridiculing the customs requirements of hand-carrying computers that contain encryption software. Read My Life as an International Arms Courier, by Matt Blaze, and My life as a Kiwi arms courier, by Peter Gutmann.

The US regulations were effectively removed in 2001, although it is still illegal to export encryption products from the USA to certain embargoed countries (the usual suspects). The current state of international cryptography law is maintained in the Crypto Law Survey.

Nevertheless, governments are deeply suspicious of citizens wishing to use encryption that prevents them being spied on, and as recently as January 2015 the British government proposed that encryption products should become unavailable to the general public.

It is already evident that for objects like the public and private keys,
which are pairs of binary numbers, some structure needs to be imposed
before they can be used in real-world computer systems. For instance, we
need to know how long the numbers `n`, `d`, and `e`
are, and in what order they should be stored. (They cannot be simply
concatenated as a continuous bit string, as it would then be impossible to
know where one number begins and the other ends.)

To represent abstract concepts such as numbers and ordered pairs in a way that can be easily handled in a computing system, a notation known as Abstract Syntax Notation One (ASN.1) was developed in the 1980s. It is described completely in John Larmouth's book ASN.1 Complete. The abstract notation itself does not suggest or require any particular binary representation: as its name implies, it is totally abstract. Instead, the binary representations of such abstract concepts are described in a number of encoding rules. The two that are relevant in this discussion of SSL are the basic encoding rules (BER), and the distinguished encoding rules (DER).

ASN.1 can be regarded as a programming language for describing the layout
of complex data structures. It contains some built-in data types, which
can be combined to produce more specific custom-defined types. The
built-in types are: BOOLEAN, INTEGER, ENUMERATED, REAL, BIT STRING, OCTET
STRING, NULL, OBJECT IDENTIFIER. Other extended data types such as
UTF8String, UTCTime, and GeneralizedTime are also defined. Data types can
be composed together using the connectors SEQUENCE, SEQUENCE OF, SET, SET
OF, or CHOICE. A SEQUENCE or SET is a collection of elements of different
types, whereas a SEQUENCE OF or SET OF is a collection of elements of the
same type. Just as in mathematics, the elements in a SET or SET OF do not
have any particular order, so they are not very useful for creating real
structures, and are rarely used. CHOICE is equivalent to a `union` in other
programming languages: only one of the elements actually appears
in the final structure.

An object identifier is a general-purpose way of identifying a type
within an international hierarchical directory of object types. Each object type is
identified by a sequence of integers, where each integer is notionally
associated with a node in a huge tree containing close to a million
entries. The three nodes at the base of the tree are associated with the
organisations that coordinate the type definitions, which are ISO (the
International Organization for Standardization) and ITU-T (the International
Telecommunication Union Telecommunication Standardization Sector). Each of these
organisations has its own hierarchy of definitions in the tree, and
there is also a hierarchy of joint definitions, for which responsibility
is shared between the two groups. The three base nodes are therefore known
as **itu-t** (0), **iso** (1), and **joint-iso-itu-t** (2).

When the ASN.1 architecture was first being defined, the ITU-T was known
as the CCITT (*Comité Consultatif International
Téléphonique et Télégraphique*, or International
Telegraph and Telephone Consultative Committee), so these nodes were
originally known as **ccitt** (0), **iso** (1), and **joint-iso-ccitt**
(2).

The graph at the right shows just a few of the entries in the object-identifier tree. The path from the root to any of the nodes in the tree is called an arc.

Notice the node labelled **internet** (1) below the 1.3.6 **dod**
arc. The notes on this entry say "OID `1.3.6.1` was hijacked by the Internet
community in IETF RFC 1065, by Marshall Rose and K. McCloghrie. The list of
subsequent nodes is based on IETF RFC 1155. The authoritative reference on
the rest is the Internet Assigned Numbers document, currently IETF RFC 1700".
So "the Internet" does not actually belong to the
Department of Defense! In fact, RFC1065 states:

As of this writing, the DoD has not indicated how it will manage its subtree of OBJECT IDENTIFIERs. This memo assumes that DoD will allocate a node to the Internet community, to be administered by the Internet Activities Board (IAB) as follows:

`internet OBJECT IDENTIFIER ::= { iso org(3) dod(6) 1 }`

The 1.3.6.1.4.1 arc leads to sections for private corporations, where they can register their own private object types.

OIDs can also be registered at a national level under arc 2.16.
See Operation of a country Registration Authority. The registration for X.509
certificates is at OIDs for X.509 Certificate Library Modules
(**not** at 1.3.6.1.4.1 ...).
See notes at http://oid-info.com/get/1.2.840.113527.
oid-info.com states that new registrations are made under the 2.16.840.1
arc.

The numbers registered within the USA country node 1.2.840 begin at 113527. This apparently random number was chosen by Jack Veenstra, chairman of the US registration authority committee, to avoid special status being granted to organisations lucky enough to get the lowest numbers. Meanwhile, in 1991, the standards organisations invalidated the use of 1.2.840, and require country registrars to use the arc 2.16 in preference. This is unfortunate, as many of the OIDs in active use had already been defined in the 1.2.840 arc.

Examples:

    2.16.840.1.113531 - Control Data Corporation
    2.16.840.1.113564 - Eastman Kodak Company
    2.16.840.1.113730 - Netscape Communications Corp.
    2.16.840.1.113735 - BMC Software, Inc

Before the artefacts described by ASN.1 can be used in a real computer system, they have to be efficiently encoded into computer-readable binary data. There are a number of different ways of doing this. BER (Basic Encoding Rules) is the simplest, but leaves some of the encodings ambiguous. DER (Distinguished Encoding Rules) is similar to BER, but removes the ambiguities. PER (Packed Encoding Rules) produces a more compact notation than DER. XER (XML Encoding Rules) specifies an encoding of ASN.1 into XML (Extensible Markup Language).

Only BER and DER are described here, as they are the encodings used by SSL. The formal specification of BER and DER is available as ITU-T Recommendation X.690.

When it comes to encoding a specification expressed in ASN.1, the approach is to use a TLV encoding (Type, Length, Value). The basic encoding rules (BER) describe precisely how the bits should be laid out in each of the type, length, and value components. The encoding combines compactness with extensibility.

The data type referred to in ASN.1 as an OCTET is meant to be a
representation of a sequence of eight bits, which is nowadays known as a
byte. Perhaps uniquely in this specification, the bits in an octet are
numbered 1 to 8 from right to left. (This is different from the
convention that I have always been used to in my career at IBM, where bits
are numbered 0-7 from left to right. This 0-7 convention [MSB0] is also
used in RFC1166. The 7-0 convention [LSB0] is typically used in little-endian
systems, where the bit number represents the power of two that it
encodes.)

Each part of the TLV (Type, Length, Value) has its own BER encoding.

The type segment is the start of the binary TLV encoding. It contains a tag number that represents the type's value. Its first or only byte (octet) is partitioned into three subfields of two, one, and five bits:

    ┌────┬───┬───────┐
    │ xx │ x │ xxxxx │
    └────┴───┴───────┘
      87   6   54321

The bits are conventionally numbered 1-8 from right to left.

Each type has an associated **class**, represented by the first two
bits (8-7). The classes are

    00 Universal
    01 Application
    10 Context-specific
    11 Private

The class is effectively a namespace in which the tag number has meaning.

- Universal
- the type is part of the ASN.1/BER architecture itself.
- Application
- the type is unique within a specific application.
- Context-specific
- the meaning of the type depends on the context in which it is used.
- Private
- the type is being used in a private specification.

Bit 6 is the **P/C** (primitive/constructed) flag. A value of 1 means
that the type is a constructed one, and that the later value field (V) is
built up out of a series of further TLV components. A value of 0 means
that the type is primitive, and does not contain any further TLV
components.

Bits 5-1 are the binary representation of the actual **tag number**,
except where they are equal to 11111, which is an indicator that the tag
number is continued in subsequent bytes. The subsequent bytes then contain
the tag number encoded seven bits at a time, with the high-order bit in
each byte (bit 8) signalling that the tag number is continued into further
subsequent bytes.
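As a rough illustration of the layout just described, the following Python sketch decodes the type octets at the start of a TLV. The names `parse_identifier` and `Identifier` are invented for this example and do not come from any library.

```python
from typing import NamedTuple

CLASS_NAMES = ("Universal", "Application", "Context-specific", "Private")


class Identifier(NamedTuple):
    tag_class: str      # one of CLASS_NAMES, taken from bits 8-7
    constructed: bool   # the P/C flag, bit 6
    tag_number: int     # bits 5-1, or the continued form
    size: int           # how many type octets were consumed


def parse_identifier(data: bytes) -> Identifier:
    """Decode the type (identifier) octets at the start of a TLV."""
    first = data[0]
    tag_class = CLASS_NAMES[first >> 6]      # bits 8-7
    constructed = bool(first & 0x20)         # bit 6
    tag_number = first & 0x1F                # bits 5-1
    size = 1
    if tag_number == 0x1F:                   # 11111: tag continues in later bytes
        tag_number = 0
        while True:
            octet = data[size]
            size += 1
            tag_number = (tag_number << 7) | (octet & 0x7F)
            if not octet & 0x80:             # bit 8 clear marks the final byte
                break
    return Identifier(tag_class, constructed, tag_number, size)


print(parse_identifier(b"\x30"))   # a constructed Universal tag, number 16
```

Hexadecimal 30 is the byte that begins almost every structure discussed later: Universal class, constructed, tag number 10x (SEQUENCE).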

The tag numbers in the Universal class, which form part of the ASN.1/BER architecture, are as follows (in hexadecimal):

| Hex | Name | Hex | Name | Hex | Name |
|---|---|---|---|---|---|
| 01 | BOOLEAN | 0B | EMBEDDED PDV | 15 | VideotexString |
| 02 | INTEGER | 0C | UTF8String | 16 | IA5String |
| 03 | BIT STRING | 0D | RELATIVE-OID | 17 | UTCTime |
| 04 | OCTET STRING | 0E | TIME | 18 | GeneralizedTime |
| 05 | NULL | 0F | [unused] | 19 | GraphicString |
| 06 | OBJECT IDENTIFIER | 10 | SEQUENCE or SEQUENCE OF | 1A | VisibleString |
| 07 | ObjectDescriptor | 11 | SET or SET OF | 1B | GeneralString |
| 08 | EXTERNAL | 12 | NumericString | 1C | UniversalString |
| 09 | REAL | 13 | PrintableString | 1D | CHARACTER STRING |
| 0A | ENUMERATED | 14 | T61String | 1E | BMPString |

The tag numbers in the table above are the **default** numbers for
types in the Universal class. It is also possible to override the default
number for a tag by prefixing the type with a specific number enclosed in
square brackets. When this is done, the default class for the tag becomes
Context-specific, unless the tag's class is also specified. The
following examples show various encodings for a field named `condition`,
which is of type NULL:

| ASN.1 specification | Hex | Explanation |
|---|---|---|
| condition NULL | 05 | The default encoding for NULL in the Universal class. |
| condition [7] NULL | 87 | Encoding changed to 7 in the Context-specific class. |
| condition [APPLICATION 7] NULL | 47 | Encoding changed to 7 in the Application class. |

It is necessary to change the encoding of a tag whenever an ambiguity may arise. This can happen when an element may be one of a number of possibilities in a CHOICE construction, or where there are one or more OPTIONAL fields, whose presence or absence must be signalled by a unique tag number.

As a further complication, if the keyword EXPLICIT follows the bracketed
tag number, it means that the new tag number (the one in brackets) must be
used in an external "wrapper" TLV structure that contains the
original tag number in an inner TLV structure.
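The difference between replacing a tag and wrapping it is easier to see in bytes. This is a minimal sketch, assuming (as the table above does) that a bare bracketed number replaces the original tag; the helper `tlv` is hypothetical and only handles one-byte types and short lengths.

```python
def tlv(identifier: int, value: bytes) -> bytes:
    """Build a TLV with a one-byte type and a one-byte (short form) length."""
    assert len(value) < 128, "this sketch only handles the short length form"
    return bytes([identifier, len(value)]) + value


UNIVERSAL_NULL = 0x05   # class 00, primitive, tag number 5
CONTEXT = 0x80          # class bits 10 (Context-specific)
CONSTRUCTED = 0x20      # the P/C flag

plain = tlv(UNIVERSAL_NULL, b"")                    # condition NULL
replaced = tlv(CONTEXT | 7, b"")                    # condition [7] NULL
wrapped = tlv(CONTEXT | CONSTRUCTED | 7, plain)     # condition [7] EXPLICIT NULL

print(plain.hex(), replaced.hex(), wrapped.hex())   # 0500 8700 a7020500
```

The EXPLICIT form costs two extra bytes, and its wrapper has the P/C flag set because its value is itself a TLV.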

The length segment of the binary TLV encoding represents the length of the subsequent value segment in bytes (octets). There are three forms defined for the length:

- Short form
- When the length is less than 128, the length can be encoded as a single binary value in bits 7 to 1, and the high order bit (bit 8) is set to zero.

      ┌───┬─────────┐
      │ 0 │ xxxxxxx │
      └───┴─────────┘
        8   7654321

- Long form
- For any length (but usually for those greater than 127) the length is
first encoded into a binary number that is right aligned and filled to
the left with zeroes to make a bit string that is a multiple of eight
bits. The bit string is then prefixed with a single byte in which the
high-order bit is set to one, and bits 7-1 are set to the number of
**bytes** in the bit string.

      ┌───┬─────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐     ┌──────────┐
      │ 1 │ nnnnnnn │ │ xxxxxxxx │ │ xxxxxxxx │ │ xxxxxxxx │ ... │ xxxxxxxx │
      └───┴─────────┘ └──────────┘ └──────────┘ └──────────┘     └──────────┘
        8   7654321     87654321     87654321     87654321         87654321
                      <------------------- nnnnnnn bytes ------------------->

  The value of `nnnnnnn` must lie between 1 and 126: that is, it cannot be all zeroes or all ones. This is not a significant constraint: it allows lengths up to 2^1008 − 1 to be specified.
- Indefinite form
- For constructed types only, the length may be set to 10000000. This is followed by an arbitrary number of TLV constructs, the end of which is signalled by two bytes of zeroes. The length is found by searching for the two zeroes.

      ┌───┬─────────┐ ┌──────────┐ ┌──────────┐     ┌──────────┐ ┌──────────┐ ┌──────────┐
      │ 1 │ 0000000 │ │ TLV item │ │ TLV item │ ... │ TLV item │ │ 00000000 │ │ 00000000 │
      └───┴─────────┘ └──────────┘ └──────────┘     └──────────┘ └──────────┘ └──────────┘
        8   7654321                                                87654321     87654321

Perhaps you need to be an old-time software developer to recognise just how exquisite the BER length encoding is. It is remarkably compact for short lengths (a single byte), yet is hugely extensible to numbers far in excess of anything that could possibly be required in everyday use.
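The short and long forms can be sketched in a few lines of Python. The function names here are hypothetical, and the decoder simply reports the indefinite form as `None` rather than searching for the terminating zeroes.

```python
from typing import Optional, Tuple


def encode_length(n: int) -> bytes:
    """Encode a length using the shortest definite form."""
    if n < 0:
        raise ValueError("a length cannot be negative")
    if n < 128:
        return bytes([n])                         # short form: 0xxxxxxx
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(body) > 126:
        raise ValueError("length needs more than 126 bytes")
    return bytes([0x80 | len(body)]) + body       # long form: 1nnnnnnn + bytes


def decode_length(data: bytes) -> Tuple[Optional[int], int]:
    """Return (length, bytes consumed); length is None for the indefinite form."""
    first = data[0]
    if first < 0x80:
        return first, 1                           # short form
    count = first & 0x7F
    if count == 0:
        return None, 1                            # indefinite form: 10000000
    if count == 127:
        raise ValueError("nnnnnnn must not be all ones")
    return int.from_bytes(data[1:1 + count], "big"), 1 + count


print(encode_length(5).hex(), encode_length(128).hex(), encode_length(300).hex())
# 05 8180 82012c
```

Note that this encoder always picks the minimal form, which plain BER does not require.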

The binary encoding of the value segment of a TLV construct depends on its implied type. Most of the encodings are obvious or trivial, except for OBJECT IDENTIFIER. Here are descriptions of some of them.

- BOOLEAN
- A boolean value is encoded as hexadecimal 00 for false, and anything else for true.
- INTEGER
- An integer value is encoded as a binary number, right aligned in a bit string that is a multiple of eight bits. Negative numbers are encoded in twos complement form so that the high-order bit is 1.
- ENUMERATED
- In this architecture, an enumerated value can be positive or negative, so it is encoded in the same way as an INTEGER.
- BIT STRING
- A generalised bit string can contain any number of bits, not
necessarily a multiple of eight bits. But TLV values must always contain
an integral number of bytes. To encode it, an arbitrary length bit string
is padded out *on the right* to a multiple of eight bits, and then prefixed *on the left* with a single byte that contains the number of padding bits that have been added. For example, a string of three one bits, 111, can be encoded as 00000101,11100000, or hexadecimal 05E0x, where the 05x shows that five padding bits have been added. The BER encoding does not specify what padding bits should be used.
- OBJECT IDENTIFIER
- The OBJECT IDENTIFIER type is an encoding
of the sequence of positive integers occurring on a specific arc within
the object-id tree. However, a more compact notation than the SEQUENCE
OF INTEGER TLV encoding is used. The binary representation of
each integer in the sequence is split into segments of seven bits each,
padded with zero bits on the left if necessary. Then a 1 bit is prefixed
to each seven-bit segment except the last, which is prefixed with a 0
bit. For example, 3166 is hexadecimal C5E, or binary 1100,0101,1110.
Padding on the left and splitting into seven-bit segments gives
0011000,1011110. Prefixing the non-last segment with 1 and the last
segment with 0 gives 10011000,01011110, which is 985Ex. This encoding
has the benefit that an integer less than 128 can be encoded in a single
byte with its usual representation.

As a further optimisation, the first two integers in each sequence are merged into one by calculating the value `40×i₁+i₂`, where `i₁` and `i₂` are the first two numbers in the sequence. This means that 40 is added to the second integer in each **iso** sequence, and 80 is added to the second integer in each **joint-iso-itu-t** sequence, and then the initial integer is suppressed in all three branches. Why is the multiplier 40? Probably because it is just less than one third of 127, so the three ranges can all have their first and second integers encoded into a single byte, so long as there are fewer than 40 branches at each of the nodes. In fact, the **0** and **1** nodes can never have more than 40 branches, or the encodings would overlap. (If the denouement of The Hitchhiker's Guide to the Galaxy had been more widely known when BER was being defined, the multiplier might well have been 42.)
- NULL
- The NULL type is used to specify an element whose presence is required, but whose value is not required. As it is empty, the V part of TLV is absent, and the L part is zero. NULL is therefore encoded as 0500x.
- UTCTime
- This is the first of the three time formats in ASN.1. This one encodes
a date and time incorporating a two digit year, even though it was
specified in 1982, when the imminent Y2K problem was already well
understood. It can be encoded as an ASCII character string in one of the
following formats:

      yymmddhhmmZ
      yymmddhhmmssZ
      yymmddhhmm+hhmm
      yymmddhhmm-hhmm
      yymmddhhmmss+hhmm
      yymmddhhmmss-hhmm

  where `yymmdd` is a date in year, month, day order, and `hhmmss` is a time in hour, minute, second format. The suffix Z indicates strict UTC time (equivalent to the Greenwich Mean Time time zone), and the suffix `±hhmm` specifies a time offset in hours and minutes to specify that the encoded time is in a different time zone. If the offset is present, the encoded time is a local time, and the offset is the signed amount by which the local time is ahead of UTC. So the UTC time is obtained by subtracting the offset from the specified local time. An offset with a minus sign is added to the local time specified, and an offset with a plus sign is subtracted.
- GeneralizedTime
- This is the second of the three time formats. It is nominally based on
the ISO standard 8601, which specifies a four digit year, and allows the
addition of fractional seconds. But unlike ISO8601, no delimiters are
allowed in the date/time representation. A single decimal fraction
delimiter is allowed, which is either a comma or a period (full stop).
The decimal fraction normally appears after the `ss` seconds component, but can appear after the `mm` minutes or `hh` hours instead. The time zone offset is also allowed to specify only `hh` hours, or to be omitted altogether, in which case the time is interpreted as a local time only. The following are all examples of valid encodings (from X.680):

      19851106210627.3
      19851106210627.3Z
      19851106210627.3-0500
      198511062106.456
      1985110621.14159

- TIME
- This is a fairly recent time format that is specifically meant to mirror ISO 8601. It has been subtyped into a wide variety of other useful time and date types. However, it has not been incorporated into any of the entities used by SSL, so it is not discussed further.
- Character string types
- There are a dozen different character string types. The only basic
difference between them is the range of characters (the alphabet, or
character set) that can be used to encode them. The common character
string types and their character sets are:
- NumericString
- A string composed only of the ASCII numeric digits, 0-9, plus space. (11 characters.)
- PrintableString
- A string composed only of ASCII letters, numbers, and punctuation:
A-Z, a-z, 0-9, ' ( ) + - / : = ? . , plus space. Notice that @ and
currency symbols are *not* included. (74 characters.)
- VisibleString
- A string composed only of the ASCII printing characters, plus space, and excluding the control characters. It is defined by the ISO_IR 6 character set: A-Z, a-z, 0-9, ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ ] \ ^ _ ` { } | ~ plus space. (95 characters.)
- IA5String
- A string composed of any of the US-ASCII characters, including
DEL, space, and the control characters. (IA5 means International Alphabet 5.) (128 characters.)
- UniversalString
- A string composed of 32-bit Unicode characters. (Potentially 4 billion characters, but many are not yet defined.)
- UTF8String
- A string encoded in the variable length character format UTF-8. It includes the same characters as UniversalString, but usually encoded more compactly. (Also potentially 4 billion characters.)

- There are several other named character string types, but they are obsolete and not widely used.
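To make the INTEGER, BIT STRING and OBJECT IDENTIFIER descriptions above concrete, here is a minimal Python sketch of their value encodings (the V part only, without type or length). All of the function names are hypothetical, and the bit string is passed as text such as `"111"` purely for readability.

```python
def encode_integer(n: int) -> bytes:
    """Twos complement, big-endian, in the fewest whole bytes."""
    magnitude = ~n if n < 0 else n
    return n.to_bytes(magnitude.bit_length() // 8 + 1, "big", signed=True)


def encode_bit_string(bits: str) -> bytes:
    """Pad on the right with zeroes, then prefix the count of padding bits."""
    if not bits:
        return b"\x00"
    pad = -len(bits) % 8
    padded = bits + "0" * pad
    return bytes([pad]) + int(padded, 2).to_bytes(len(padded) // 8, "big")


def base128(n: int) -> bytes:
    """Seven bits per byte; bit 8 is set on every byte except the last."""
    chunks = [n & 0x7F]
    n >>= 7
    while n:
        chunks.append(0x80 | (n & 0x7F))
        n >>= 7
    return bytes(reversed(chunks))


def encode_oid(dotted: str) -> bytes:
    """Merge the first two arcs as 40*i1 + i2, then encode each integer."""
    first, second, *rest = (int(part) for part in dotted.split("."))
    return base128(40 * first + second) + b"".join(base128(arc) for arc in rest)


print(encode_integer(255).hex())            # 00ff
print(encode_bit_string("111").hex())       # 05e0
print(base128(3166).hex())                  # 985e
print(encode_oid("1.2.840.113549").hex())   # 2a864886f70d
```

The last line shows why so many certificate-related OIDs begin with the bytes 2A 86 48 86 F7 0D.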

The Basic Encoding Rules described above allow a certain amount of flexibility in the binary representation of the abstract specification. The Distinguished Encoding Rules tighten up the ambiguities so that the encoding can only be done in one way. This leads to the following changes:

- In the length section of the TLV encoding, the shortest possible encoding must be used. For example, if the length is less than 128, the short form must be used. For longer lengths, the long form is used, but the minimum number of bytes must be used. The indefinite length form must not be used.
- In the boolean type, FF must be used for the **true** value. (BER allows any non-zero value.)
- In an INTEGER, the shortest possible encoding must be used. The sign bit, either 0 or 1, must not be extended into superfluous high-order bytes. So FFx encodes −1, and 00FFx encodes 255, but FFFFx and 0000FFx are both invalid. In Larmouth's book, this rule is expressed by the statement that the top nine bits must not be the same. (I am assuming that this restriction is a DER rule, and not a BER one. Larmouth is a bit ambiguous about this, and I have not gone back to the formal specification to check.)
- An ENUMERATED type is encoded in exactly the same way as an INTEGER, so the same rules apply: the top nine bits must not be the same.
- In a BIT STRING, the padding bits must always be zeroes.
- In a GeneralizedTime, the decimal separator must be a full stop (period).
- In a SET or SET OF construction, BER allows the members to be in any order, but DER requires the elements to be sorted. This contradicts the normal mathematical definition of a set, which says specifically that members are not ordered; but there is no other simple way to ensure that two identical sets are indeed identical when encoded. This difficulty hints that a SET is not really a very good primitive to use in ASN.1, and indeed it is never used in practice.
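Two of these rules are simple enough to check mechanically. The following is a sketch only, with hypothetical function names, operating on the value bytes of a BOOLEAN or INTEGER; it implements the "top nine bits" test exactly as stated above.

```python
def is_der_boolean(content: bytes) -> bool:
    """DER allows only 00 (false) and FF (true)."""
    return content in (b"\x00", b"\xff")


def is_minimal_integer(content: bytes) -> bool:
    """Reject encodings whose top nine bits are all zeroes or all ones."""
    if not content:
        return False                 # an INTEGER needs at least one byte
    if len(content) == 1:
        return True                  # a single byte has no superfluous bytes
    second_high_bit = bool(content[1] & 0x80)
    if content[0] == 0x00 and not second_high_bit:
        return False                 # nine leading zero bits
    if content[0] == 0xFF and second_high_bit:
        return False                 # nine leading one bits
    return True


for sample in ("ff", "00ff", "ffff", "0000ff"):
    print(sample, is_minimal_integer(bytes.fromhex(sample)))
```

This prints True for the first two samples and False for the last two, matching the examples in the INTEGER rule.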

SSL uses raw binary types such as encryption keys and signatures. It should not come as a surprise, having ploughed through the huge description of ASN.1 and its encodings above, that these binary types are enclosed in ASN.1 containers, which are encoded using the Distinguished Encoding Rules (DER).

As described earlier, the RSA algorithm specifies a public key and a private key. The ASN.1 specification for these is contained in RFC2437.

The public key is just a sequence of two integers.

    RSAPublicKey ::= SEQUENCE {
      modulus INTEGER, -- n
      publicExponent INTEGER -- e
    }
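To show how little there is to it, here is a sketch that DER-encodes an `RSAPublicKey` for a toy key. The values n = 3233 and e = 17 are illustrative assumptions (far too small for real use), and the helper functions are hypothetical and handle only non-negative integers and short-form lengths.

```python
def der_integer(n: int) -> bytes:
    """Type 02, short-form length, then the value (non-negative only)."""
    body = n.to_bytes(n.bit_length() // 8 + 1, "big")   # leaves room for a 0 sign bit
    assert len(body) < 128, "sketch only handles the short length form"
    return bytes([0x02, len(body)]) + body


def der_sequence(*items: bytes) -> bytes:
    """Type 30 (constructed SEQUENCE), short-form length, then the members."""
    body = b"".join(items)
    assert len(body) < 128, "sketch only handles the short length form"
    return bytes([0x30, len(body)]) + body


modulus = 3233          # toy value: 61 * 53
public_exponent = 17    # toy value

encoded = der_sequence(der_integer(modulus), der_integer(public_exponent))
print(encoded.hex())    # 300702020ca1020111
```

Reading the output from left to right: 30 07 is a SEQUENCE of seven bytes, 02 02 0CA1 is the modulus, and 02 01 11 is the public exponent.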

The private key contains much more structure than the public key.

    RSAPrivateKey ::= SEQUENCE {
      version Version,
      modulus INTEGER, -- n
      publicExponent INTEGER, -- e
      privateExponent INTEGER, -- d
      prime1 INTEGER, -- p
      prime2 INTEGER, -- q
      exponent1 INTEGER, -- d mod (p-1)
      exponent2 INTEGER, -- d mod (q-1)
      coefficient INTEGER -- (inverse of q) mod p
    }

    Version ::= INTEGER

Although in principle it only needs to contain the modulus and the
private exponent, in practice the private key structure contains
additional information to assist in the decryption calculation. These
requirements are contained in
Fast decipherment algorithm for RSA public-key cryptosystem, and
include `p` and `q`, the factors of `n`. Any
of these hints and intermediate results would compromise the key, but it
is safe to store them in the private key structure because the entire
structure has to be kept secret.
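The way the extra fields speed up decryption can be sketched with the same kind of toy numbers. The primes 61 and 53 and the exponent 17 are assumptions chosen for illustration, and the three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
p, q, e = 61, 53, 17                     # toy values, far too small for real use
n = p * q                                # modulus
d = pow(e, -1, (p - 1) * (q - 1))        # privateExponent

exponent1 = d % (p - 1)                  # d mod (p-1)
exponent2 = d % (q - 1)                  # d mod (q-1)
coefficient = pow(q, -1, p)              # (inverse of q) mod p


def decrypt_plain(c: int) -> int:
    """The textbook calculation: one exponentiation modulo n."""
    return pow(c, d, n)


def decrypt_fast(c: int) -> int:
    """Two smaller exponentiations modulo p and q, then recombine."""
    m1 = pow(c, exponent1, p)
    m2 = pow(c, exponent2, q)
    h = (coefficient * (m1 - m2)) % p
    return m2 + h * q


message = 65
ciphertext = pow(message, e, n)
print(decrypt_plain(ciphertext), decrypt_fast(ciphertext))   # 65 65
```

With realistic key sizes the two half-length exponentiations are considerably cheaper than the single full-length one, which is why the private key carries these precomputed values.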

A major construct that is used extensively in SSL is a certificate. This is a container for a public key, which is granted to a named subject, and digitally signed by a trusted named issuer. The issuer assigns a unique serial number to each certificate it issues.

An X.509 certificate is described in RFC5280.
It consists of a "TBS" certificate, a signature algorithm, and a
signature. The meaning of "TBS" is not specifically defined, but the
context implies a meaning of "to be signed". Obviously, the signed
part of the certificate cannot include the signature itself, so the
separation is a natural one:

    Certificate ::= SEQUENCE {
      tbsCertificate TBSCertificate,
      signatureAlgorithm AlgorithmIdentifier,
      signatureValue BIT STRING }

    TBSCertificate ::= SEQUENCE {
      version [0] EXPLICIT Version DEFAULT v1,
      serialNumber CertificateSerialNumber,
      signature AlgorithmIdentifier,
      issuer Name,
      validity Validity,
      subject Name,
      subjectPublicKeyInfo SubjectPublicKeyInfo,
      issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL,
        -- If present, version MUST be v2 or v3
      subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
        -- If present, version MUST be v2 or v3
      extensions [3] EXPLICIT Extensions OPTIONAL
        -- If present, version MUST be v3
      }

    Version ::= INTEGER { v1(0), v2(1), v3(2) }

    CertificateSerialNumber ::= INTEGER

    Validity ::= SEQUENCE {
      notBefore Time,
      notAfter Time }

    Time ::= CHOICE {
      utcTime UTCTime,
      generalTime GeneralizedTime }

    UniqueIdentifier ::= BIT STRING

    SubjectPublicKeyInfo ::= SEQUENCE {
      algorithm AlgorithmIdentifier,
      subjectPublicKey BIT STRING }

    Extensions ::= SEQUENCE SIZE (1..MAX) OF Extension

    Extension ::= SEQUENCE {
      extnID OBJECT IDENTIFIER,
      critical BOOLEAN DEFAULT FALSE,
      extnValue OCTET STRING
        -- contains the DER encoding of an ASN.1 value
        -- corresponding to the extension type identified
        -- by extnID
      }

I will not describe every single field in the certificate, but will highlight some of them.

The certificate contains two names, one of which is the `issuer`
and the other is the `subject`. Each of them is a
Distinguished Name (DN), which is a sequence of Relative Distinguished
Names (RDNs), each of which is a type/value pair:

    Name ::= CHOICE { -- only one possibility for now --
      rdnSequence RDNSequence }

    RDNSequence ::= SEQUENCE OF RelativeDistinguishedName

    RelativeDistinguishedName ::=
      SET SIZE (1..MAX) OF AttributeTypeAndValue

    AttributeTypeAndValue ::= SEQUENCE {
      type AttributeType,
      value AttributeValue }

    AttributeType ::= OBJECT IDENTIFIER

    AttributeValue ::= ANY -- DEFINED BY AttributeType

    DirectoryString ::= CHOICE {
      teletexString TeletexString (SIZE (1..MAX)),
      printableString PrintableString (SIZE (1..MAX)),
      universalString UniversalString (SIZE (1..MAX)),
      utf8String UTF8String (SIZE (1..MAX)),
      bmpString BMPString (SIZE (1..MAX)) }

The ASN.1 syntax does not specify the format of the `value`
in the RDN, except to hint that it should be a `DirectoryString`.
The supporting text in RFC5280 says that new certificates should only use
`PrintableString` or `UTF8String` encodings. In
real life, Distinguished Names are specified as a comma-separated list of
character RDNs, like CN=xxxx,O=xxxx,OU=xxxx, but within the certificate
the RDN types are actually DER-encoded OBJECT IDENTIFIERs. The preferred
textual form of a distinguished name is defined in RFC1779.
As an aside, distinguished names and RDNs are widely used in the LDAP
protocol, as every object in an LDAP directory is identified by a DN.

The certificate definition includes two optional `UniqueIdentifier`
fields, which were supposed to disambiguate reused names. They were never
used, and are now deprecated.

Certificates conforming to this specification contain time values. The
ASN.1 syntax implies that there is a choice of format, but RFC5280 states
that the time must be encoded with a two digit year (`UTCTime`)
until the end of 2049, and with a four digit year (`GeneralizedTime`)
from 2050. This does seem like a long time to retain an encoding that was
obsolete even when it was invented. It is to be hoped that a later version
of this RFC will bring that conversion date a little closer.
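The 2049/2050 rule can be sketched as a small Python function. The name `encode_validity_time` is hypothetical, the sketch always emits the `Z` (UTC) form with seconds, and it assumes dates from 1950 onwards, since a two digit year cannot represent anything earlier in this scheme.

```python
from datetime import datetime, timezone
from typing import Tuple


def encode_validity_time(moment: datetime) -> Tuple[str, str]:
    """Return the ASN.1 type name and the character string for a validity date."""
    utc = moment.astimezone(timezone.utc)
    if utc.year <= 2049:
        return "UTCTime", utc.strftime("%y%m%d%H%M%SZ")         # two digit year
    return "GeneralizedTime", utc.strftime("%Y%m%d%H%M%SZ")     # four digit year


print(encode_validity_time(datetime(2030, 1, 2, 3, 4, 5, tzinfo=timezone.utc)))
print(encode_validity_time(datetime(2050, 1, 1, 0, 0, 0, tzinfo=timezone.utc)))
```

The first call yields a `UTCTime` of `300102030405Z`, and the second a `GeneralizedTime` of `20500101000000Z`.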

The signature part of the certificate contains an algorithm identifier, which itself contains two parts:

    AlgorithmIdentifier ::= SEQUENCE {
      algorithm OBJECT IDENTIFIER,
      parameters ANY DEFINED BY algorithm OPTIONAL }

The algorithm referred to is the signature algorithm. Recall that a digital signature is a digest followed by an encryption, so the algorithm must identify both parts. There is a list of a few signature OIDs in RFC3279, but it is now effectively superseded. The OIDs were originally specified as references to section numbers in public-key cryptography standards (PKCS) published by RSA Data Security Inc, a software security company founded by the inventors of the RSA algorithms. It is now RSA Security LLC, a division of the EMC corporation. This means that the PKCS documents are now held on the EMC website, and the PKCS#1 documentation is at PKCS #1: RSA Cryptography Standard. From there we can obtain the following signature OIDs, which are in the deprecated 1.2.840 arc:

    pkcs-1 OBJECT IDENTIFIER ::= {iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) 1}

    md2WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 2 }
    md5WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 4 }
    sha1WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 5 }
    sha224WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 14 }
    sha256WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 11 }
    sha384WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 12 }
    sha512WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 13 }
    sha512-224WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 15 }
    sha512-256WithRSAEncryption OBJECT IDENTIFIER ::= { pkcs-1 16 }

When the OIDs for these algorithms are used in the `AlgorithmIdentifier`
sequence, the `parameters` element must be present, and it must
be equal to NULL. Note that the MD2 and MD5
digest algorithms are obsolete and no longer used. SHA-1 is still widely
used, even though it has been deprecated. SHA-256 is slowly coming into use.
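Putting the pieces together, the DER encoding of an `AlgorithmIdentifier` for sha256WithRSAEncryption is only fifteen bytes long, and can be taken apart by hand. This sketch uses a hypothetical `decode_oid` helper and fixed offsets that are valid only because every length in this particular example is in the short form.

```python
def decode_oid(content: bytes) -> str:
    """Turn the value bytes of an OBJECT IDENTIFIER back into dotted form."""
    numbers, value = [], 0
    for octet in content:
        value = (value << 7) | (octet & 0x7F)    # seven bits per byte
        if not octet & 0x80:                     # bit 8 clear ends this integer
            numbers.append(value)
            value = 0
    first = min(numbers[0] // 40, 2)             # undo the 40*i1 + i2 merge
    arcs = [first, numbers[0] - 40 * first] + numbers[1:]
    return ".".join(str(arc) for arc in arcs)


encoded = bytes.fromhex("300d06092a864886f70d01010b0500")

assert encoded[0] == 0x30                        # SEQUENCE
assert encoded[1] == len(encoded) - 2            # short-form length (13)
assert encoded[2] == 0x06                        # OBJECT IDENTIFIER
oid_length = encoded[3]
oid = decode_oid(encoded[4:4 + oid_length])
parameters = encoded[4 + oid_length:]

print(oid, parameters.hex())                     # 1.2.840.113549.1.1.11 0500
```

The trailing 0500 is the mandatory NULL `parameters` element.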

A certificate is constructed in such a way that it can be **trusted**.
The trust arises because of the digital signature that is included within the certificate.
Recall that a digital signature is created by using asymmetric encryption, with a private key,
of the digest of a piece of data – in this case, the rest of the certificate.
The signature is validated by decrypting it with the corresponding public key.
The digitally signed piece of data can be trusted
*as far as the owner of the private key can be trusted*.
To produce this level of trust, a number of public organisations called
certificate authorities have arisen.
A certificate authority has a single purpose: to create trusted digital signatures within certificates.
The certificate authority is identified by the `issuer` distinguished name in the certificate,
as described above.
The other distinguished name in the certificate, the `subject`, is the user of the certificate.
The trust relationship between the issuer and the subject is that
*the issuer asserts that the subject is who they say they are*,
and confirms the association between the subject and the distinguished name that represents it.
To cement this relationship, a fee is usually passed from the subject to the issuer.

The actual mechanics of the process are that the subject creates a file called a "certificate request".
This contains all the fields that the subject wants the certificate to contain,
but lacking the issuer's name and signature. The request is then sent to the issuer for signing.
When the issuer (the certificate authority) has confirmed the identity and validity of the subject's
distinguished name (and accepted the appropriate fee, if necessary),
it inserts the issuer's name and signature into the subject certificate to produce a signed certificate.
This binds the subject's distinguished name inextricably to the issuer's distinguished name,
along with the serial number and the range of dates for the validity.
If the subject attempts to change any of these attributes, by as much as a single bit,
the signature will fail to validate, and the certificate becomes useless.
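The effect of changing a single bit can be demonstrated with a deliberately simplified signing scheme. Everything in this sketch is an assumption made for illustration: the two primes are small Mersenne primes, the digest is reduced modulo n instead of being padded, and the function names are hypothetical. Real certificate signatures use much larger keys and a proper padding scheme, so this must not be used for anything but experiment. It needs Python 3.8 or later.

```python
import hashlib

P = 2**61 - 1                     # toy prime (a Mersenne prime)
Q = 2**89 - 1                     # toy prime (a Mersenne prime)
N = P * Q                         # toy modulus, far too small for real use
E = 65537                         # public exponent
D = pow(E, -1, (P - 1) * (Q - 1)) # private exponent


def digest_as_int(data: bytes) -> int:
    """SHA-256 digest, squeezed below N (a real scheme pads instead)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(to_be_signed: bytes) -> int:
    """The issuer encrypts the digest with its private key."""
    return pow(digest_as_int(to_be_signed), D, N)


def verify(to_be_signed: bytes, signature: int) -> bool:
    """Anyone can decrypt with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(to_be_signed)


tbs = b"subject=CN=www.example.org, serial=1234"
signature = sign(tbs)
tampered = bytes([tbs[0] ^ 0x01]) + tbs[1:]      # flip one bit

print(verify(tbs, signature), verify(tampered, signature))
```

The untouched data validates, while the copy with one flipped bit does not.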

Whenever the subject's signed certificate is presented to another user, such as a client using a web browser, the client must validate the signature in the subject certificate. To do this, the client must know the public key of the certificate authority, corresponding to the private key used to produce the signature. This public key is itself contained within another certificate, called the certificate authority certificate, or more simply just the signing certificate.

How can the certificate authority certificate be trusted? In some cases,
the trust in this certificate is obtained by having it signed by yet another certificate authority.
In fact, sometimes a whole chain of signing certificates is built, with each certificate in the chain depending
on the signature of a "higher authority".
However, this trust chain cannot go on for ever, and eventually a "top" certificate is reached,
which is called a self-signed certificate, in which the subject and issuer distinguished names are identical.
The certificate is still signed, but the signature really says nothing about the trustworthiness of the certificate,
because it is produced by the subject of the certificate, rather than an independent authority.
How can a self-signed certificate be trusted?
The answer to this question depends entirely on the context in which the trust is required.
In the most familiar case, when SSL is being used in a web browser, the browser itself provides the context.
Each browser is shipped with a list of trusted certificate authorities, which you get for free when you install
the browser software.
What this means in practice is that
*the browser manufacturer is trusting a certain set of certificate authorities
on your behalf*.
This may be regarded as a good thing or a bad thing: it means that you do not need to get into the minutiae
of the trustworthiness of certain certificate authorities, but it also means that you have lost the ultimate
control over the choice of those certificates.
In the early days of SSL deployment, there were only a few certificate authorities.
At the latest count there are over 300.
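The shape of this chain-walking logic can be sketched as follows. The `Cert` class and `build_chain` function are hypothetical, certificates are reduced to just their two distinguished names, and the sketch checks names only: a real validator would also verify every signature, validity period and revocation status along the way.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Cert:
    subject: str    # subject distinguished name
    issuer: str     # issuer distinguished name


def build_chain(leaf: Cert, known: Dict[str, Cert], trusted_roots: Set[str]) -> List[Cert]:
    """Follow issuer names upwards until a self-signed certificate is reached."""
    chain = [leaf]
    current = leaf
    while current.subject != current.issuer:          # not yet self-signed
        parent = known.get(current.issuer)
        if parent is None or parent in chain:
            raise ValueError("chain is broken at " + current.issuer)
        chain.append(parent)
        current = parent
    if current.subject not in trusted_roots:          # the browser's shipped list
        raise ValueError("root is not trusted: " + current.subject)
    return chain


root = Cert("CN=Example Root CA", "CN=Example Root CA")
intermediate = Cert("CN=Example Issuing CA", "CN=Example Root CA")
server = Cert("CN=www.example.org", "CN=Example Issuing CA")
known = {c.subject: c for c in (root, intermediate)}

chain = build_chain(server, known, trusted_roots={"CN=Example Root CA"})
print([c.subject for c in chain])
```

With an empty set of trusted roots the same call fails, which is exactly the position of a browser that has never heard of the root authority.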

Within a private network, it is quite feasible to become a private certificate authority, and to sign all your own certificates. To express your trust in your own certificate authority, you must install the self-signed certificate of your private certificate authority into the trusted certificate store of all the browsers that may use it. This presumes that all the client browsers can be reconfigured under your own control.

Certificates become expired when their `notAfter` date is reached.
Sometimes a certificate may need to be invalidated before the expiry date is reached, however.
The issuer cannot go to the subject's site and physically remove the certificate, so a less reliable process has to be used.
There are two such processes.

- Periodically, each certificate authority publishes a certificate revocation list (CRL), which contains the serial numbers of all the certificates that have been revoked, but not yet expired. It is a requirement that a process that validates a certificate should look in the most current revocation list to find out whether the certificate has been revoked. It is not expected that the validator should obtain the list from the certificate authority's site for every request. Instead, the CRL is cached locally, and only the local copy is referenced. Delays in publishing the CRL, and delays in downloading the cached copy, can mean that the validator may be unaware of revoked certificates for up to a week.
- A more dynamic process called the Online Certificate Status Protocol (OCSP) was invented to circumvent the delays implicit with CRLs. Using this protocol, the validator does indeed contact an external site called an OCSP responder, supplying the serial number of the certificate to be validated. The responder can be the certificate authority itself, or a suitably delegated response server. The responder replies with a signed response that indicates the current revocation state of the specified certificate. The main drawback with OCSP is that the client has to contact yet another server during the validation process, leading to performance and privacy concerns.

An enhancement to the original OCSP protocol is OCSP stapling. In this protocol it is the server, rather than the client, that contacts the OCSP responder. The server then attaches (or "staples") the OCSP response to the certificate to confirm that it is not revoked. The stapled response is timestamped and signed by the OCSP responder, so that the client can confirm that the OCSP response is trusted and is reasonably current.

However, none of these revocation systems is totally satisfactory, as discussed in Solving The SSL Certificate-Revocation Checking Shortfall. It is even claimed that certificate revocation doesn't work in practice, because many browsers do not even perform the revocation check (at least, in 2013).

Before you can embark on providing an SSL service, you have to create the artefacts needed to control the service. These are just:

- A private key.
- The corresponding public key, wrapped up in a certificate.

- A certificate request file.

- A private certificate authority certificate.

- A client certificate

Although the native encoding method for the SSL artefacts is DER (Distinguished Encoding Rules), the OpenSSL tool prefers to produce its output in a format known as PEM (Privacy Enhanced Mail). As its name implies, Privacy Enhanced Mail was originally designed as an infrastructure for encrypting email, but it never took off as a standard, and has been superseded by Pretty Good Privacy (PGP). All that now remains of the PEM infrastructure is the file format also known as PEM.

DER-encoded objects contain binary data which is generally not displayable or printable. PEM encoding uses base-64 notation to convert the binary DER objects into textual form, which can be displayed or printed, or quoted directly in emails.

If you are familiar with hexadecimal encoding, you will know that it
encodes every four bits in a binary string into one of the characters from
the 16-character alphabet `0123456789ABCDEF`. Base-64 encoding
is similar, but instead it encodes every six bits into one of the
characters from the 64-character alphabet
`ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/`.
For instance, `000000` encodes as `A`, `011010` encodes as `a`,
and `111111` encodes as `/`.
Unlike hexadecimal encoding, the case of the letters in base-64 encoding
is significant. If the string to be encoded is not a multiple of 24 bits
then it is padded out with zero bits, and any complete six-bit padding
units are encoded as equals signs (`=`). Therefore, a base-64
encoded message is always a multiple of four bytes. There is an online
hexadecimal to base-64 converter at tomeko.net.
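These rules are easy to try out with the `base64` module in the Python standard library. The three bytes in the first example were chosen so that their 24 bits split into the six-bit groups 000000, 011010, 111111 and 000000.

```python
import base64

# 00000001 10101111 11000000 regroups as 000000 011010 111111 000000
print(base64.b64encode(bytes([0x01, 0xAF, 0xC0])).decode("ascii"))   # Aa/A

# One, two and three bytes of all-one bits show the padding rule at work.
for count in (1, 2, 3):
    print(count, base64.b64encode(b"\xff" * count).decode("ascii"))
# 1 /w==
# 2 //8=
# 3 ////
```

In every case the output is a multiple of four characters long.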

To produce a PEM-encoded file, the base-64 data is split into 64-byte
lines, and delimited with lines containing ASCII strings, called
encapsulation boundaries. In the formal
PEM specification, these delimiters are `-----BEGIN PRIVACY-ENHANCED MESSAGE-----`
and `-----END PRIVACY-ENHANCED MESSAGE-----`, but in the format produced by
OpenSSL the delimiters used are `-----BEGIN `*type*`-----` and
`-----END `*type*`-----`, where *type* can be one of:

- `CERTIFICATE`
- `CERTIFICATE REQUEST`
- `ENCRYPTED PRIVATE KEY`
- `RSA PRIVATE KEY`
- `X509 CRL`

Each of the encapsulation boundaries must start and end with exactly five hyphens.

As most ASN.1 objects are a SEQUENCE with a long form length, they nearly always begin with hexadecimal 3082, for which the base-64 encoding begins with MII.
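The whole PEM wrapping process fits in a few lines of Python. The functions `pem_encode` and `pem_decode` are hypothetical helpers written for this sketch, and the "DER" data is a dummy: a SEQUENCE header with a long-form length of 256 followed by filler bytes, not a real certificate.

```python
import base64


def pem_encode(der: bytes, label: str) -> str:
    """Base-64 encode, split into 64-character lines, and add the boundaries."""
    body = base64.b64encode(der).decode("ascii")
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    return "\n".join(["-----BEGIN " + label + "-----"] + lines +
                     ["-----END " + label + "-----"]) + "\n"


def pem_decode(pem: str) -> bytes:
    """Drop the boundary lines and decode what is left."""
    lines = [line for line in pem.strip().splitlines()
             if not line.startswith("-----")]
    return base64.b64decode("".join(lines))


dummy_der = bytes.fromhex("30820100") + bytes(256)   # 30 82 01 00, then filler
pem = pem_encode(dummy_der, "CERTIFICATE")

print(pem.splitlines()[0])        # -----BEGIN CERTIFICATE-----
print(pem.splitlines()[1][:4])    # MIIB
```

The first three characters are the MII that comes from 3082; the fourth depends on the first byte of the length.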

The openssl tool is a set of command line functions for managing keys and certificates. The primary openssl command contains many subcommands to create SSL artefacts such as keys and certificates.

There are sample commands at: madboa.com. Some examples of the more useful subcommands are the following:- Create a self-signed CA certificate valid for 3653 days (10 years)
and save the corresponding private key.
req -verbose -x509 -nodes -days 3653 -newkey rsa:2048 -keyout PrivateCA.key -out PrivateCA.pem

- Convert the PEM-encoded CA certificate to DER-encoded.
This
`.crt`

file type was originally expected by Netscape to be DER-encoded, but popular usage now also accepts such files being PEM-encoded. (See DER vs. CRT vs. CER vs. PEM certificates and how to convert them.) In fact, it seems that Android actually requires PEM-encoded CA certificates.x509 -inform PEM -in PrivateCA.pem -outform DER -out PrivateCA.crt

- Create a certificate request for a server certificate that is valid for one year:

  `req -verbose -new -nodes -days 365 -newkey rsa:2048 -keyout ServerCert.key -out ServerReq.pem`

  This command will prompt you for the elements of the distinguished name of your server. You must specify the *hostname* for your server in the Common Name section of the distinguished name.

- Sign the request file with the private key for the CA that you saved earlier. This creates a server certificate signed by your private CA. You will be prompted to confirm (with a y) that you wish to sign and then commit the certificate. On completion, the new certificate will be saved in `/etc/ssl/newcerts`, as well as the file specified by `-out`.

  `ca -verbose -in ServerReq.pem -out ServerCert.pem -cert PrivateCA.pem -keyfile PrivateCA.key`

- Export the signed server certificate as a PKCS12 password-protected file. You will be prompted for an Export Password, which will be used to encrypt the file while it is being transported to another system.

  `pkcs12 -export -in ServerCert.pem -inkey ServerCert.key -out ServerCert.p12`

- To terminate OpenSSL subcommand mode, type the **exit** command.

`openssl dhparam` is used to generate a Diffie-Hellman prime and generator pair (`p` and `g`).
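To give a feel for what `p` and `g` are for, here is a toy Diffie-Hellman exchange in Python. The tiny values `P = 23` and `G = 5` and the function names are assumptions chosen for readability; real parameters produced by `openssl dhparam` are thousands of bits long.

```python
import secrets

# Toy public parameters: a prime p and a generator g. Illustration only.
P = 23
G = 5

def public_value(private: int) -> int:
    """The value each side sends openly: g^private mod p."""
    return pow(G, private, P)

def shared_secret(their_public: int, my_private: int) -> int:
    """What each side computes privately: their_public^my_private mod p."""
    return pow(their_public, my_private, P)

# Each side picks a private value in the range 2..p-2 and never sends it.
client_private = secrets.randbelow(P - 3) + 2
server_private = secrets.randbelow(P - 3) + 2

client_public = public_value(client_private)
server_public = public_value(server_private)

# Both sides arrive at the same number without it ever crossing the wire.
client_secret = shared_secret(server_public, client_private)
server_secret = shared_secret(client_public, server_private)
```

An eavesdropper sees only `P`, `G` and the two public values. Recovering either private value from them is the discrete logarithm problem, which is trivial for a prime this small and impractical for a well-chosen large one.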

Now I have explained encryption, digital signatures, ASN.1 specifications, DER encoding, private and public keys, certificates, and openssl tooling, it's finally possible to describe the SSL protocol itself (more properly known as TLS).

The SSL protocol was originally invented by the Netscape Corporation as a way of providing secure browsing in their web browser. The Netscape Corporation was absorbed into AOL, which no longer preserves Netscape's original SSL documentation, but it has been captured by the Wayback Machine.

As I explained earlier, symmetric-key encryption is considerably more efficient than public key encryption. So, for encrypting bulk data transfer with high performance, symmetric-key encryption is a must. To use it, however, both partners in the conversation must know a single shared secret key. How do you share a secret key with a partner that you have never communicated with before? This is known as the key exchange problem. An early solution to this problem was for a trusted courier to carry the key, physically locked in a secure container, from one location to another. This is highly secure, but expensive and inconvenient, and hardly practical for electronic commerce. Furthermore, the same key is used for encrypting all traffic, which gives an attacker more material to work with and so makes the key easier to break.

SSL, in brief, is a solution to the key exchange problem that is suitable for electronic communication.

The two partners in the conversation must be identified as the client and the server as their roles are different: the conversation is not symmetric. The conversation is initiated by the client, who provides a list of suggested encryption techniques. The server responds with a certificate containing the server's public key, and an encryption technique that is acceptable to the client. The client validates the server's certificate, and uses the public key within it to encrypt a random string called the pre-master-secret, which it sends to the server. The server uses its private key to decrypt the pre-master-secret. At this point, the key exchange problem is solved: the client and server can both use the pre-master-secret to generate the key required by the mutually chosen encryption technique.

Both the client and server now possess a mutually chosen encryption algorithm and a key to use with it. They are now in a position to exchange secret encrypted messages using fast symmetric-key encryption, using a shared key that has never appeared in plain text in the conversation. Furthermore, a different key is used for each conversation, limiting the opportunity for an attacker to break it.
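The exchange summarised above can be sketched in a few lines of Python using textbook RSA. Everything here is a simplification for illustration: the primes are absurdly small, there is no padding, and `derive_key` is a hypothetical stand-in (a plain SHA-256 hash) for the real key derivation, which is not described at this point in the text.

```python
import hashlib
import secrets

# Server's toy RSA key pair. Never use primes this small in practice.
P, Q = 61, 53
N = P * Q                            # public modulus, sent in the certificate
E = 17                               # public exponent, sent in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by the server

def client_encrypt(pre_master: int) -> int:
    """Client: encrypt the pre-master secret with the server's public key."""
    return pow(pre_master, E, N)

def server_decrypt(ciphertext: int) -> int:
    """Server: recover the pre-master secret with its private key."""
    return pow(ciphertext, D, N)

def derive_key(pre_master: int) -> bytes:
    """Hypothetical stand-in for key derivation: hash down to a 128-bit key."""
    return hashlib.sha256(pre_master.to_bytes(2, "big")).digest()[:16]

# Client picks a random pre-master secret and sends only its encrypted form.
pre_master = secrets.randbelow(N - 2) + 2
ciphertext = client_encrypt(pre_master)

# Each side derives the symmetric key independently.
client_key = derive_key(pre_master)
server_key = derive_key(server_decrypt(ciphertext))
```

Only `ciphertext` travels over the network, so an eavesdropper without the server's private exponent cannot compute the symmetric key, and a fresh `pre_master` for each conversation gives a fresh key each time.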

The summary above is only a brief overview. The actual conversation is, of course, more detailed.

As the primary purpose of SSL is to negotiate and then use a symmetric-key algorithm, you would expect the specification of such an algorithm to be at the centre of the protocol, and this is indeed the case.

The encryption algorithm is bundled together with a key length (which may
be implicit), a key exchange algorithm, and a message authentication code
algorithm (MAC), to form an entity called a cipher suite, which is
identified by a 16-bit number.
The cipher suite numbers are allocated somewhat arbitrarily, and are
assigned symbolic names. Only the names give any clue to the components.
Examples of *some* of the standardised cipher suite names are
shown below. The full list is in the
IANA TLS Cipher Suite Registry.

Hex value | Standardised cipher suite name |
---|---|
0000x | TLS_NULL_WITH_NULL_NULL |
0001x | TLS_RSA_WITH_NULL_MD5 |
0002x | TLS_RSA_WITH_NULL_SHA |
0003x | TLS_RSA_EXPORT_WITH_RC4_40_MD5 |
0004x | TLS_RSA_WITH_RC4_128_MD5 |
0005x | TLS_RSA_WITH_RC4_128_SHA |
0006x | TLS_RSA_EXPORT_WITH_RC2_CBC_40_MD5 |
0007x | TLS_RSA_WITH_IDEA_CBC_SHA |
0008x | TLS_RSA_EXPORT_WITH_DES40_CBC_SHA |
0009x | TLS_RSA_WITH_DES_CBC_SHA |
000Ax | TLS_RSA_WITH_3DES_EDE_CBC_SHA |
000Bx | TLS_DH_DSS_EXPORT_WITH_DES40_CBC_SHA |
000Cx | TLS_DH_DSS_WITH_DES_CBC_SHA |
000Dx | TLS_DH_DSS_WITH_3DES_EDE_CBC_SHA |
000Ex | TLS_DH_RSA_EXPORT_WITH_DES40_CBC_SHA |
000Fx | TLS_DH_RSA_WITH_DES_CBC_SHA |
0010x | TLS_DH_RSA_WITH_3DES_EDE_CBC_SHA |
C027x | TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 |
C028x | TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 |
C029x | TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 |
C02Ax | TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384 |
C02Bx | TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 |
C02Cx | TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 |
C02Dx | TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256 |
C02Ex | TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384 |
C02Fx | TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 |
C030x | TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 |
C031x | TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256 |
C032x | TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384 |

The **names** of these cipher suites are reasonably well,
but not ideally, structured. The part of the name between TLS_ and _WITH_
refers to the key exchange mechanism. It can be one of RSA, RSA_EXPORT,
DH_DSS, DH_DSS_EXPORT, ECDHE_RSA, ECDHE_ECDSA, etc. DH means
Diffie-Hellman. EC means Elliptic Curve. A trailing E, as in ECDHE, means
Ephemeral.

The EXPORT part of the name refers patronisingly to the deliberately weakened algorithms that were once the only ones permitted for export under US government export controls. No-one in their right mind would ever use them now, as they are probably completely compromised.

The fragment of the name between _WITH_ and the final underscore refers to the bulk encryption algorithm. It can be one of NULL, RC2_CBC_40, RC4_40, RC4_128, DES40_CBC, 3DES_EDE_CBC, etc. CBC means Cipher Block Chaining. GCM means Galois/Counter Mode. EDE means Encrypt/Decrypt/Encrypt.

The very last part of the name, following the final underscore, refers to the algorithm used to generate the Message Authentication Code, or MAC. As we shall see later, the MAC is just a keyed digest that is appended to each bulk data message. The MAC algorithm can be one of MD5, SHA, SHA256, SHA384, etc.
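The naming convention described above is regular enough to be split mechanically. The following Python sketch does so; `SuiteParts` and `parse_suite_name` are names invented for this example, and the function only handles names of the form shown in the table above (it assumes a single `_WITH_` and a MAC with no underscore in it).

```python
from typing import NamedTuple

class SuiteParts(NamedTuple):
    key_exchange: str   # between TLS_ and _WITH_
    cipher: str         # between _WITH_ and the final underscore
    mac: str            # after the final underscore

def parse_suite_name(name: str) -> SuiteParts:
    """Split a TLS_<kx>_WITH_<cipher>_<mac> name into its three parts."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError(f"not a TLS_..._WITH_... name: {name}")
    key_exchange, _, rest = name[len("TLS_"):].partition("_WITH_")
    cipher, _, mac = rest.rpartition("_")
    return SuiteParts(key_exchange, cipher, mac)

for example in ("TLS_RSA_EXPORT_WITH_RC4_40_MD5",
                "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"):
    print(parse_suite_name(example))
```

Names that do not follow this pattern are rejected with a `ValueError` rather than guessed at.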

However, rather disgracefully in my opinion, the **binary values**
of these cipher suite codes bear no relationship to the structure of their
names. There is no bit-valued structure that indicates RSA or DH, no
single bit that flags the exportable state, no bit structure for the
encryption algorithm, or for the MAC algorithm. The suite numbers seem to
have just been assigned sequentially, with no regard to their structure.
This makes them difficult to interpret programmatically.
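In practice this means that software can only interpret a cipher suite number by looking it up in a table. A minimal Python sketch, using a handful of entries copied from the table above (the names `CIPHER_SUITES`, `suite_name` and `differing_bits` are invented for this example):

```python
# A small excerpt of the registry; a real program would need the full list.
CIPHER_SUITES = {
    0x0001: "TLS_RSA_WITH_NULL_MD5",
    0x0002: "TLS_RSA_WITH_NULL_SHA",
    0x0004: "TLS_RSA_WITH_RC4_128_MD5",
    0x0005: "TLS_RSA_WITH_RC4_128_SHA",
    0xC02F: "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    0xC030: "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
}

def suite_name(code: int) -> str:
    """Look a 16-bit cipher suite number up in the table."""
    return CIPHER_SUITES.get(code, f"unknown cipher suite {code:04X}x")

def differing_bits(a: int, b: int) -> int:
    """Bit mask of the positions in which two suite numbers differ."""
    return a ^ b

# The same change of MAC (MD5 to SHA) flips different bits each time,
# so no bit of the number can be read as "the MAC field".
print(f"{differing_bits(0x0001, 0x0002):04X}x")
print(f"{differing_bits(0x0004, 0x0005):04X}x")
```

Swapping MD5 for SHA changes two bits in the first pair and one bit in the second, which is exactly the lack of structure complained about above.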

Now that we know that the entity that is to be negotiated between the client and server is a mutually acceptable cipher suite, we can describe how the negotiation is carried out in detail. The initial part of the conversation in which this happens is known as the handshake. The remainder of the conversation, in which the partners use the negotiated technique to exchange private data, is known as the bulk data payload transfer.

As well as negotiating encryption details, the handshake also authenticates the server. That is, the handshake confirms to the client that the server is actually who they claim to be. It is also possible for the server to request credentials from the client, so that the server can confirm that the client is who they claim to be. This feature is called client authentication, but it is not widely used.

A more detailed description of the handshake follows:

- The client sends a message known as **client hello**. It contains:
  - A protocol version number (0300x for SSLv3, 0301x for TLSv1.0, 0302x for TLSv1.1, 0303x for TLSv1.2, 0304x for TLSv1.3).
  - A 32-byte random number, which includes a timestamp in the first four bytes.
  - A continuation session-id (only if this is a reconnection of a previous session).
  - A list of suggested cipher suites that the client is prepared to use. They would typically be ordered from strongest to weakest, but the protocol does not require this.
- The server replies with a **server hello** message. It contains:
  - A protocol version number, which may be equal to the version proposed by the client, or lower if the server cannot support the client's version.
  - A 32-byte random number, which includes a timestamp in the first four bytes.
  - A session-id. This can be the same as the client's session-id, if the server recognises that it is valid and refers to a previous session; or a new one, if the client has not provided a valid session-id. The server can also return a null session-id, if it does not want to allow the session to be re-established later.
  - A cipher suite selected from the list provided by the client. The server is usually expected to choose the strongest cipher suite in the list, but the protocol does not require this. If the server is unable to choose a mutually acceptable suite, the handshake is abandoned with an error.
- The server then sends a **certificate** message. It contains:
  - The server certificate.
- For ephemeral Diffie-Hellman key exchanges (DHE and ECDHE), the server also sends a **server key exchange** message. It contains:
  - The server's temporary Diffie-Hellman parameters and public value, signed with the private key that matches the server certificate.
- The server finally sends a **server hello done** message to conclude its part of the handshake.
- The client must validate the server certificate. The check must include at least a check that the certificate is within its valid date range and should also check that the certificate is trusted. The trust mechanism may include a validation that the certificate is signed by a trusted certificate authority, and that the certificate is not on a Certificate Revocation List published by that certificate authority.
- The client establishes a *pre-master secret*. For an RSA key exchange, the client generates it at random and encrypts it using the public key from the server's certificate. (This means that only the server can decrypt it.) For a Diffie-Hellman exchange, nothing is encrypted: the client sends its own Diffie-Hellman public value, and each side computes the pre-master secret from its own private value and the other side's public value. The client then sends the **client key exchange** message. It contains:
  - The encrypted pre-master secret (for RSA) or the client's Diffie-Hellman public value.
- The client and server both calculate a master secret and the session keys from the pre-master secret and the two random numbers from the hello messages.
- The client sends a **change cipher spec** message to start using the negotiated encryption details.
- The client sends a **client finished** message, encrypted according to the negotiated specifications.
- The server sends a **change cipher spec** message to start using the negotiated encryption details.
- The server sends a **server finished** message, encrypted according to the negotiated specifications.

The handshake is then complete. The client and server then continue with the bulk data transfer, with each message encrypted with the negotiated symmetric key, and containing a message digest or MAC calculated with the negotiated digest algorithm.
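To make the key calculation step of the handshake a little more concrete, here is a Python sketch of how TLS 1.2 (RFC 5246) turns the pre-master secret and the two hello random numbers into a master secret and key material. It assumes the SHA-256 pseudo-random function that TLS 1.2 uses by default; earlier versions use a different construction. The function names are invented for this example, and the random values are generated locally rather than exchanged over a network.

```python
import hashlib
import hmac
import os
import struct
import time

def hello_random() -> bytes:
    """32-byte hello random: 4-byte big-endian timestamp + 28 random bytes."""
    return struct.pack(">I", int(time.time()) & 0xFFFFFFFF) + os.urandom(28)

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246 section 5: HMAC chained until enough output."""
    output = b""
    a = seed                                            # A(0)
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()            # A(i)
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]

def tls12_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_SHA256(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)

def master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """48-byte master secret derived from the pre-master secret."""
    return tls12_prf(pre_master, b"master secret", client_random + server_random, 48)

def key_block(master: bytes, client_random: bytes, server_random: bytes, length: int) -> bytes:
    """Key material that is then sliced into MAC keys, cipher keys and IVs."""
    return tls12_prf(master, b"key expansion", server_random + client_random, length)

client_random = hello_random()
server_random = hello_random()
# For RSA key exchange: 2-byte protocol version followed by 46 random bytes.
pre_master = b"\x03\x03" + os.urandom(46)

# Both sides hold the same three inputs, so both compute the same secrets.
client_master = master_secret(pre_master, client_random, server_random)
server_master = master_secret(pre_master, client_random, server_random)
```

Because the hello random numbers are mixed in, the master secret differs for every connection even if the same pre-master secret were somehow reused.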

The protocols and algorithms described above say what happens when everything works as expected. But just like every other piece of software, it is possible for mistakes to creep into the implementation. Over the last few years many of these mistakes, or bugs, have been reported in the computer press, and some of them have even got into the mainstream media. The bugs have been given some exotic names, such as Heartbleed, Poodle, Beast, and Logjam.

Although SSL and TLS have been specifically designed to avoid the so-called "man in the middle" attack, many of these bugs depend on exactly that. A "man in the middle" attack occurs when the attacker can intercept and modify the traffic somewhere between the client and the server. Superficially, the network protocol (TCP/IP) is an end-to-end protocol: it appears that what you send through the network pipe is exactly what is received at the other end. But this is not how the communication works: every message goes through a whole series of computers, and every one can potentially view and modify the messages as they flow. Now most of the SSL traffic is indeed encrypted, but during the handshake, when the encryption parameters are still being negotiated, some of the traffic is unencrypted and unsigned.

If you are using a wireless connection in a public space like an airport lounge or a coffee shop, the very first machine that you communicate with is the wireless router that connects you to the Internet. It is possible that such a machine has been compromised (or hacked) and may be operating as a "man in the middle".

Here's an analysis of some of the problems.

- Renegotiation Vulnerability
  - This was one of the first widely reported vulnerabilities, which was reported before such reports started to be given snappy names. It is a weakness in the SSL renegotiation protocol, which allows renegotiation of some of the attributes of the connection. (This is not the same, as I once thought, as session resumption using the session-id value in the hello messages. Renegotiation is described here.) It was ultimately fixed by adding new extensions to the TLS connection data, as described in RFC5746.
- BEAST
  - An abbreviation of "Browser Exploit Against SSL/TLS". This exploits the predictable Initialization Vectors (IVs) used in the Cipher Block Chaining process in SSL Version 3 and TLS Version 1.0, where the last cipher text block of one message is used as the IV for the next.
- POODLE
  - An abbreviation of "Padding Oracle On Downgraded Legacy Encryption". It exploits the ability of various releases of SSL and TLS to downgrade to an earlier, weaker version. If the connection is downgraded to SSL Version 3 (the "downgraded legacy" protocol), certain features of the padding algorithm are then exploited to expose a single byte of the plain text by inserting a guessed version of the bytes into the message. The guess will only be correct one time in every 256, but in theory, if the same plain text is sent through the system repeatedly, sensitive parts of the message (containing authentication details, for example) can be reconstructed one byte at a time.
- Logjam
  - This attack requires a "man in the middle" attacker who is able to modify the unencrypted unsigned messages in the handshake. In this case, it works by changing the list of cipher suites in the **client hello** message to use one of the low quality "export" cipher suites. Recall that these were introduced into SSL primarily so that they were easy for the US government to crack. In the particular case of Logjam, the client and server are persuaded to choose a weak Diffie-Hellman group with a small 512-bit prime. Once the client and server are persuaded to fall back to export ciphers, the attacker can use a pre-calculated set of solutions to the Diffie-Hellman discrete logarithm problem.
- FREAK
  - An abbreviation of "Factoring Attack on RSA-EXPORT Keys". This is similar to Logjam above, but for RSA keys.
- Heartbleed
  - This was a coding error in OpenSSL. Part of the TLS protocol is a "heartbeat" extension, which sends messages over the connection to keep it open during times of low traffic. The heartbeat message contains a length and some payload data. As part of the heartbeat, the partner is supposed to echo the data that was sent, using the length contained in the message. The Heartbleed bug in OpenSSL was a failure to check that the length contained in the message was actually the same as the length of data that had been received. The exploitation of the bug was to send a high value in the length field, such as 65535 or FFFFx, but only a small amount of data to be echoed. Because of the bug, the data that was echoed was in fact the copy of the data received, followed by the 65500 or so arbitrary bytes of computer memory that followed the data that was supposed to be echoed. So there was a "bleed" of data because of the "heartbeat", hence the bug's name. In the hysteria that followed the publication of this bug, the impression was given that the 65500 bytes that were leaked inevitably contained secure data such as user names and passwords. I was never convinced of this. Although the exposure of up to 65500 bytes of data is undeniably bad, the likelihood that a particular 65500 bytes (out of a typical memory size of 4 gigabytes or more) actually contains something worth fishing for, is extremely low.
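The missing length check behind Heartbleed is easy to simulate. The Python sketch below is not OpenSSL's code: it models the process memory as a single byte string, and the function names and sample data are invented for this example. It only shows the difference between trusting the length in the message and checking it against what was actually received.

```python
def heartbeat_vulnerable(memory: bytes, claimed_length: int) -> bytes:
    """Echo claimed_length bytes, trusting the sender's length field."""
    return memory[:claimed_length]

def heartbeat_fixed(memory: bytes, received_length: int, claimed_length: int):
    """Discard the request if it claims more data than actually arrived."""
    if claimed_length > received_length:
        return None          # silently discarded, no reply is sent
    return memory[:claimed_length]

# The received payload sits in memory next to unrelated data.
payload = b"bird"
neighbouring_memory = b"SECRET-KEY-MATERIAL"
memory = payload + neighbouring_memory

# The attacker sends 4 bytes of payload but claims a length of 10.
leaked = heartbeat_vulnerable(memory, claimed_length=10)
safe = heartbeat_fixed(memory, received_length=len(payload), claimed_length=10)
```

In the vulnerable version the reply runs past the end of the payload into whatever happens to follow it in memory; in the fixed version the oversized request gets no reply at all.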

Transport Layer Security, as the successor to SSL which is now effectively superseded, is an extremely popular and convenient method by which quite secure connections between computers can be established. It is widely used in secure financial transactions such as online banking and online purchasing, so it is the lynchpin of all e-commerce on the Internet. In spite of the few vulnerabilities described above, it is also extremely safe. Although it was originally seen as a very expensive consumer of computer resource, computer speeds have generally improved so much that the additional overhead is now seen as relatively unimportant. Many deliverers of non-sensitive information, such as Google and Wikipedia, are now using it as a matter of course, to avoid unwarranted snooping on browsing habits.

It is quite likely that within a few years it will be the default mode of connection on the Internet, and unencrypted connections will be a rarity.

Copyright © Peter Havercan, 2015. A plain person's guide to Secure Sockets Layer by Peter Havercan is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. First published 17 July 2015.