How to Implement Secure (Mostly) Stateless Tokens
Network Resonance
2483 E. Bayshore #212 Palo Alto CA 94303 USA ekr@networkresonance.com
A common protocol problem is to want to arrange to maintain client state with little or no local storage. The usual design pattern here is to provide the client with a token which is returned with subsequent interactions. In order to prevent tampering, forgery, and privacy issues, such tokens should be cryptographically protected. This draft describes one workable mechanism for constructing such tokens.
A common protocol problem is to want to arrange to maintain client state with little or no local storage. The usual design pattern here is to provide the client with a token which is returned with subsequent interactions. One such application is TLS tickets [RFC4507], which allow the server to offload the TLS session cache onto the client. Another application is Globally Routable User Agent URIs (GRUUs) [I-D.ietf-sip-gruu], which bind a second URI to a SIP AOR. GRUUs can be defined in a stateless mode, which requires no storage on the SIP registrar. Another application for this kind of technique is to build Web "shopping carts".

Because the state token is stored on a remote node, it is susceptible to inspection and/or tampering by untrusted third parties. Therefore, it becomes important to cryptographically secure tokens, typically by encrypting them to provide confidentiality and adding a message integrity check (MIC) to provide integrity for the token data and assurance that it was generated by the consuming node. Note that the remote node doesn't need to do anything with the token other than echo it back; therefore there is no need for it to be able to access the internals of the token.

Although the general techniques for constructing tokens of this type are well understood, there are some subtle issues involved and there is no single reference that describes acceptable constructions. Accordingly, each token-using application has had to design its own construction. The purpose of this document is to provide such a reference.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
This section provides a somewhat abstract overview of the techniques for constructing stateless tokens. In the next section we will describe precise example formats that implement these principles.
The reference architecture for this kind of system is shown in the figure below:
   Client                                           Server

   1. [initial contact] --------->                       \ State
                        <--------- 2. State token        / Creation

                        [time passes]

   3. State token       --------->                       \ State
                        <--------- 4. I remember you     / Recovery
In message 1, the client contacts the server. The server creates some state (e.g., creates a shopping cart structure), serializes that into a state token, and sends it to the client in message 2. From the client's perspective, this token is opaque, though it of course has some internal structure that is parseable by the server. Some time later, the client wants to talk to the server again. When it reconnects, it sends the state token back in message 3. The server decodes the state token and recovers the previously created state and can process whatever transaction the client is trying to perform.
We assume that the server has some static cryptographic keying material, consisting of:

   o  An encryption key, K_e.
   o  A MAC key, K_m.
These keys SHOULD be randomly generated and of sufficient length to match whatever cryptographic algorithms are in use. They are stored in semi-permanent storage and used for protecting all tokens. In order to construct a token, the server takes its state and packs it into a single string S. S should be constructed so that it can be unambiguously parsed by the server. The server then encrypts S and applies a MAC. For instance:

   EA = Encrypt(K_e, S)
   T  = EA || MAC(K_m, EA)
The encryption algorithm SHOULD be chosen to have the following properties:

   o  Given any (EA, S) pair, it is computationally infeasible to determine whether EA was derived from S. (Note that some information may be obtained from the length.)
   o  Multiple encryptions of the same value S produce different tokens.
   o  It is computationally infeasible to determine any information about the relationship between the S values embedded in two separate tokens.

AES in either CBC or counter (CTR) mode with randomly chosen IVs meets these requirements. The MAC algorithm is simply a standard MAC, such as HMAC [RFC2104].

The design described above works in a variety of situations but has the significant drawback that it does not permit the server to invalidate or supersede states. Consider the case where the token is being used to contain login state. One function such interfaces typically offer is a "logout" button. However, if the server is stateless, then there is no way to detect that a token which is presented corresponds to a session which has been invalidated.

A related issue is "state rollback". If a server gives a client a new state token (e.g., decrementing an account balance), the client can send an older token instead; this is effectively a replay attack.
Invalidating or superseding tokens requires the server to be able to store some state. The usual procedure is for the server to retain a list of invalid tokens. In the most primitive implementation, this requires the server to keep all invalid tokens since the beginning of time. Obviously this is inconvenient, so tokens are created with an expiration time, thus limiting the invalid list to unexpired tokens which are invalid (there is a parallel here to certificate revocation lists [RFC3280]).

Even with expiry, high invalidation rates can lead to large Invalid Token Lists (ITLs). This is especially true if state n+1 supersedes state n (this is done by invalidating state n and then issuing state n+1). A number of techniques can be used to minimize the size of the ITL. For instance, it may be stored in a Bloom filter, or each token may be assigned an index with the ITL stored as a bit field. These options are examined in more detail later in this document.
In this section we describe one acceptable token format that complies with the guidelines in the previous section. For reference, consider a state value as shown in Figure 1:
Figure 1: Example state value
This state value would allow the server to "remember" some registration information, e.g., for a web bulletin board. Note that the token construction is agnostic about the state value as long as it can be represented as a byte string. The token format is shown in Figure 2.
   +---------------------------------------------------------------+ <-+-+
   |                   Expiration Time (32 bits)                   |   | |
   +---------------------------------------------------------------+   | |
   |                   Sequence Number (32 bits)                   |   | |
   +---------------------------------------------------------------+   | |
   |                                                               |   | |
   |                           State (S)                           |   | |
   /                       (Variable Length)                       /   | |
   |                                                               |   | |
   |                         +-------------------------------------+   | |
   |                         |      Padding (Variable length)      |   | |
   +-------------------------+-------------------------------------+ <-+ |
   |                                                               |   | |
   |                              MAC                              |   | |
   |                      (Typically 160 bits)                     |   | |
   |                                                               |   | |
   +---------------------------------------------------------------+   | |
                                                                       | |
                                                       Encrypted Portion |
                                                                         |
                                                     Authenticated Portion

Figure 2: Example stateless token format
We assume that the state value that the server wishes to deliver is a byte string S. To construct a token, perform the following steps:

   1.  Construct TI = ET || SEQ || S, where ET is the expiry time in seconds since the UNIX epoch and SEQ is the token sequence number. Both are 32-bit integers, so TI can be unambiguously parsed.
   2.  Generate a random 128-bit initialization vector IV.
   3.  Set EA equal to the encryption of TI with key K_e and initialization vector IV:
          EA = Encrypt(K_e, IV, TI)
   4.  Compute the MAC over the Version, Protection Suite ID (PSID), and EA:
          H = MAC(K_m, Version || PSID || EA)
   5.  Pack the values into token T.

Because the producer and consumer of the token are the same machine, the Version and Protection Suite ID are strictly unnecessary: the consumer knows what kind of tokens it produces. However, in the interest of tokens being self-describing and of easing version/algorithm transition, they are included in the token. Similarly, there is no need to have standardized Version and PSID values. However, we recommend that Version be the integer 1 and that the following PSID values be used:

   Protection Suite            PSID   Encryption     MAC
   -------------------------   ----   ------------   ------------
   AES_128_CBC_WITH_SHA1       1      AES-128-CBC    HMAC-SHA1
   AES_256_CBC_WITH_SHA_256    2      AES-256-CBC    HMAC-SHA-256

Note that this follows the "Encrypt-then-Authenticate" (EtA) paradigm recommended by Krawczyk.

To verify token T, perform the following steps:

   1.  Compare MAC(K_m, Version || PSID || EA) to the MAC in the token. If they don't match, reject the token.
   2.  Decrypt EA using K_e and IV to recover TI:
          TI = Decrypt(K_e, IV, EA)
   3.  Break up TI into ET, SEQ, and S.
   4.  If the current time is after ET, the token is expired and should be rejected.
   5.  If applicable, verify that the token is not on the ITL.
   6.  Deliver the state S to the application.
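The construction and verification steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation. The Python standard library has no AES, so the sketch substitutes a stand-in stream cipher (HMAC-SHA-256 run in counter mode) for AES-CBC and therefore needs no padding. The PSID value 0xFF, the one-byte Version and PSID fields, the field order, and the function names are all hypothetical. The sketch also feeds the IV into the MAC, which goes slightly beyond the formula H = MAC(K_m, Version || PSID || EA) given above.

```python
import hashlib
import hmac
import os
import struct
import time

VERSION = 1
PSID = 0xFF      # hypothetical private-use suite ID for the stand-in cipher
MAC_LEN = 32     # HMAC-SHA-256 output length in bytes


def _keystream(k_e, iv, n):
    """Stand-in for AES-CTR (assumption): HMAC-SHA-256(k_e, iv || counter)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(k_e, iv + struct.pack(">Q", counter), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def make_token(k_e, k_m, state, expiry, seq):
    ti = struct.pack(">II", expiry, seq) + state      # TI = ET || SEQ || S
    iv = os.urandom(16)                               # fresh random 128-bit IV
    ea = _xor(ti, _keystream(k_e, iv, len(ti)))       # EA = Encrypt(K_e, IV, TI)
    header = bytes([VERSION, PSID])
    # Assumption: the IV is authenticated together with Version, PSID and EA.
    mac = hmac.new(k_m, header + iv + ea, hashlib.sha256).digest()
    return header + iv + ea + mac


def verify_token(k_e, k_m, token, now=None, itl=frozenset()):
    """Return (seq, state) for a valid token, otherwise raise ValueError."""
    if len(token) < 2 + 16 + 8 + MAC_LEN:
        raise ValueError("token too short")
    body, mac = token[:-MAC_LEN], token[-MAC_LEN:]
    expected = hmac.new(k_m, body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):        # check the MAC first
        raise ValueError("bad MAC")
    if body[0] != VERSION or body[1] != PSID:
        raise ValueError("unsupported version or suite")
    iv, ea = body[2:18], body[18:]
    ti = _xor(ea, _keystream(k_e, iv, len(ea)))       # TI = Decrypt(K_e, IV, EA)
    expiry, seq = struct.unpack(">II", ti[:8])
    if (time.time() if now is None else now) > expiry:
        raise ValueError("expired")
    if seq in itl:                                    # optional ITL check
        raise ValueError("invalidated")
    return seq, ti[8:]
```

Because the IV is random, two tokens built from the same state differ, and the MAC is checked before any decryption takes place.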
One concern that people sometimes have is that you may wish to periodically roll over keys. In general, this is not necessary since modern cryptosystems do not require rekeying with the traffic volumes relevant here. If this is a concern, then there are several easy options, including adding a key ID, placing the sequence number in the clear, or overloading the IV or protection suite ID. If only a few keys are used, trial verification can be used to determine which one is active.
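Trial verification with a small set of keys might look like the following sketch. It assumes HMAC-SHA-256 as the MAC; the function name and the idea of passing the authenticated bytes as a single `body` argument are illustrative choices, not part of the token format.

```python
import hashlib
import hmac


def find_mac_key(candidate_keys, body, mac):
    """Return the first key under which `mac` verifies for `body`, else None.

    `candidate_keys` is a short list of MAC keys (for example, the current
    key followed by the previous one); this layout is a hypothetical choice.
    """
    for key in candidate_keys:
        expected = hmac.new(key, body, hashlib.sha256).digest()
        if hmac.compare_digest(expected, mac):
            return key
    return None
```

The cost grows linearly with the number of keys tried, which is why this approach only suits the case where few keys are in use.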
In this section, we discuss three data structures for maintaining the ITL.
The most natural implementation is to simply store a list of all the invalid but unexpired tokens. This requires I*size(Token) bytes of storage, where I is the number of invalid tokens. If you instead store a list of 32-bit sequence numbers, then the required storage becomes 4I bytes. It's probably a good idea to keep this list sorted so that binary search can be used for looking up potential tokens. Note that if you have a very high invalidity rate, it is more efficient to maintain a valid token list.
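A sorted list of sequence numbers with binary-search lookup can be sketched as below. The class and method names are hypothetical. The `purge_below` method assumes that sequence numbers are issued in increasing order with a fixed token lifetime, so that every sequence number below some low-water mark has already expired; the text above does not require this.

```python
import bisect


class SortedITL:
    """Invalid token list kept as a sorted list of sequence numbers."""

    def __init__(self):
        self._seqs = []

    def invalidate(self, seq):
        i = bisect.bisect_left(self._seqs, seq)
        if i == len(self._seqs) or self._seqs[i] != seq:
            self._seqs.insert(i, seq)        # keep the list sorted, no duplicates

    def __contains__(self, seq):
        i = bisect.bisect_left(self._seqs, seq)   # O(log I) lookup
        return i < len(self._seqs) and self._seqs[i] == seq

    def purge_below(self, low_water):
        """Drop entries for tokens that have expired (assumed: seq < low_water)."""
        del self._seqs[:bisect.bisect_left(self._seqs, low_water)]

    def __len__(self):
        return len(self._seqs)
```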
In environments where a large fraction (> 1/32) of tokens will eventually be invalidated, a superior data structure is a bitmask vector: above that fraction, one bit per token costs less than one 32-bit sequence number per invalid token. What needs to be recorded here is:

   o  The sequence number of the earliest valid token (the low-water mark).
   o  A bit vector with one bit for every token from the low-water mark to the latest unexpired invalid token (the high-water mark).

For instance, if you have issued tokens 0-18 and tokens 1, 3, 5, 7, 11, 12, and 15 have been invalidated, but tokens 0-7 have also expired, you would need to store a low-water mark of 8 (the earliest valid token), plus an 8-bit bitmask vector as shown in Figure 3. Note that because the high-water mark is 15, there is no need to store bits corresponding to sequence numbers 16-18.
Figure 3: Bitmask vector invalidation list
The bitmask may also be stored compressed (e.g., run length encoded), though of course this provides slower access.
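The bitmask vector can be sketched as follows, using the worked example from this section. This is an illustrative sketch: the class name and methods are hypothetical, and a Python integer stands in for the bit vector, with bit i representing sequence number low_water + i.

```python
class BitmaskITL:
    """Invalid token list stored as a low-water mark plus a bit vector."""

    def __init__(self, low_water=0):
        self.low_water = low_water
        self._bits = 0          # bit i set => token (low_water + i) is invalid

    def invalidate(self, seq):
        if seq >= self.low_water:
            self._bits |= 1 << (seq - self.low_water)

    def is_invalid(self, seq):
        if seq < self.low_water:
            return True         # below the low-water mark: already expired
        return bool((self._bits >> (seq - self.low_water)) & 1)

    def advance(self, new_low_water):
        """Discard bits for tokens that expired below the new low-water mark."""
        if new_low_water > self.low_water:
            self._bits >>= new_low_water - self.low_water
            self.low_water = new_low_water

    def bit_length(self):
        """Number of bits needed: low-water mark through high-water mark."""
        return self._bits.bit_length()


itl = BitmaskITL()
for seq in (1, 3, 5, 7, 11, 12, 15):
    itl.invalidate(seq)
itl.advance(8)                  # tokens 0-7 have expired
```

After `advance(8)`, only the bits for sequence numbers 8 through 15 remain, with those for 11, 12, and 15 set; nothing is stored for 16-18.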
With lower invalidation rates, Bloom filters can be used to store the ITL. Bloom filters have the significant advantage that they are much smaller (about 10x smaller than bitmasks at 1% revocation rates). They have the drawback that they produce false positives (tokens which appear to be invalidated but are in fact valid). In settings where the state being stored is soft, this isn't a problem, but when it is hard state it can be. The Bloom filter can be tuned for arbitrary false positive rates, but a lower false positive rate requires a larger filter.

Another issue with simple Bloom filters is that they do not allow you to delete entries from the ITL when the token expires. The result is that the filter fills up with expired tokens and produces a monotonically increasing false positive rate. One approach here is to use counting Bloom filters. However, these can still overflow and produce false positives. A superior solution is simply to use multiple Bloom filters corresponding to different expiry periods, and then to delete a Bloom filter once the current time sweeps past the expiry period represented by that filter.
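The multiple-filter approach can be sketched as below. The filter size, the number of hash functions, the one-hour period, and the use of SHA-256 to derive bit positions are all arbitrary assumptions made for illustration; a real deployment would size the filter for its target false positive rate.

```python
import hashlib


class BloomFilter:
    def __init__(self, m_bits=8192, k=4):
        self.m, self.k = m_bits, k          # m_bits is assumed to be a multiple of 8
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        # Derive k bit positions by hashing the item with k different prefixes.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class RotatingBloomITL:
    """One Bloom filter per expiry period; whole filters are dropped on expiry."""

    def __init__(self, period=3600):
        self.period = period
        self.filters = {}                   # period index -> BloomFilter

    def invalidate(self, seq, expiry):
        idx = expiry // self.period
        self.filters.setdefault(idx, BloomFilter()).add(seq.to_bytes(4, "big"))

    def is_invalid(self, seq, expiry):
        f = self.filters.get(expiry // self.period)
        return f is not None and seq.to_bytes(4, "big") in f

    def expire(self, now):
        # Delete every filter whose entire expiry period lies in the past.
        for idx in [i for i in self.filters if (i + 1) * self.period <= now]:
            del self.filters[idx]
```

Since the verifier recovers both SEQ and ET from the token, it can select the right filter from ET and then test SEQ against it.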
In this section, we address a number of easy mistakes to make in designing mechanisms of this type.
One common error is to simply encrypt the token without using any authentication or message integrity. The result is that tokens are susceptible to a variety of forgery attacks. This is a particular problem if a stream cipher such as counter mode is used, because an attacker can make targeted changes to any bit in the token, but attacks are possible with CBC mode as well.

Consider a token format like the one presented in Figure 2 but without an integrity check. The attacker contacts the server and gets a state token T, containing his identity. The first block (128 bits) of ciphertext contains ET, SEQ, and the first 64 bits of S, which contain the username, which we call I. The attacker wants to pose as username I'. The attacker then generates a new IV' with the low-order 64 bits set to IV XOR I XOR I' and builds a new token T' = IV' || EA. Because of the properties of CBC, when T' is decrypted its first block will decrypt to ET || SEQ || I', thus allowing a user to pose as any other user. Such attacks are much more serious with CTR mode, where the whole plaintext may be tampered with. This is simply a special case of the general cryptographic rule that encryption cannot be counted on to provide integrity.
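The counter-mode variant of this attack can be demonstrated with the sketch below. It is not the CBC attack described above: it shows the simpler stream-cipher case, and it uses a stand-in keystream (HMAC-SHA-256 in counter mode) because the Python standard library has no AES. All function names and the 8-byte usernames are hypothetical.

```python
import hashlib
import hmac
import os
import struct


def keystream(key, iv, n):
    """Stand-in for AES-CTR (assumption): HMAC-SHA-256(key, iv || counter)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, iv + struct.pack(">Q", counter), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_unauthenticated(key, plaintext):
    """Encrypt-only token with no MAC: the mistake being illustrated."""
    iv = os.urandom(16)
    return iv + xor(plaintext, keystream(key, iv, len(plaintext)))


def decrypt_unauthenticated(key, token):
    iv, ct = token[:16], token[16:]
    return xor(ct, keystream(key, iv, len(ct)))


def forge(token, offset, old, new):
    """Attacker without the key: turn known plaintext `old` at `offset` into `new`."""
    if len(old) != len(new):
        raise ValueError("replacement must have the same length")
    pos = 16 + offset                       # skip the 16-byte IV
    patched = xor(token[pos:pos + len(old)], xor(old, new))
    return token[:pos] + patched + token[pos + len(old):]


key = os.urandom(32)
plaintext = struct.pack(">II", 2000, 7) + b"mallory!"    # ET || SEQ || I
token = encrypt_unauthenticated(key, plaintext)
forged = forge(token, 8, b"mallory!", b"alice!!!")       # I -> I'
recovered = decrypt_unauthenticated(key, forged)
```

The server decrypts the forged token to the same ET and SEQ with the username replaced, and without a MAC it has no way to notice the change.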
Encryption of tokens is not strictly necessary in order to provide integrity for the tokens. However, in most settings it is desirable to provide confidentiality. In the case of TLS tickets, that is because secret keying material is carried in the token. In the case of GRUUs one of the purposes of the construction is to provide privacy. Therefore, despite the possibility that confidentiality is not required, in general we recommend encrypting the data unless there is a clear requirement for it not to be.
In cases such as GRUU where privacy is a requirement, it is important for tokens to be unlinkable; at minimum, it must be infeasible for an attacker to determine whether two tokens issued by the same server correspond to the same or related underlying state information. Optimally, it should be infeasible for an attacker to determine whether two tokens were generated by the same server or different servers (obviously, this depends on them using the same token format and similar state value lengths).

This places a requirement on the algorithms used to encrypt the token: repeated encryptions of the same (or related) plaintexts must produce ciphertexts that cannot be distinguished from encryptions of different plaintexts. Specifically, block ciphers in ECB mode are not suitable here, and block ciphers in CBC or CTR mode require unique initialization vectors. In order to prevent attackers from determining the temporal relationship between two tokens, the IV should be generated with a cryptographically secure random or pseudorandom number generator. This is also the rationale for encrypting the expiry time and sequence number. Note, however, that this is a tradeoff: having these values in the clear would allow immediate rejection of invalidated or expired tokens. Instead, the server has to decrypt the token in order to check its status.
As described above, if one wishes to be able to invalidate tokens, one must keep state on the server. This is obvious in contexts where you wish to invalidate tokens directly (such as logout), but in cases where you simply wish one state to supersede another, the necessity for invalidating the old state is less obvious. However, the failure to do so subjects you to replay attacks. Implementations which replace state tokens therefore need to maintain invalid token lists.
In order to prevent a token from being used outside its intended context, it may be necessary to bind the token to a set of verifiable information associated with the intended context. For example, if a token belongs to a particular user, then including the username in the token allows the server to verify that the user authenticated during the current session is indeed the user that was issued the token. Another possibility is to bind the token to a particular key that the client possesses and that is included in the token (note that in this case the token must be encrypted). Other information may be bound within the token, which may further limit its use in other contexts. In some cases privacy or other considerations may prevent inserting this information into the token. In these cases the system should attempt to prevent the disclosure of the token to third parties through the use of encryption and other access control mechanisms.
The material in is based on the analysis in . The material in was contributed by Joe Salowey. Thanks to Joe Salowey for review comments.
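[RFC2119]  "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119.
[RFC2104]  "HMAC: Keyed-Hashing for Message Authentication", RFC 2104.
[RFC3280]  "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile", RFC 3280.
[RFC4507]  "Transport Layer Security (TLS) Session Resumption without Server-Side State", RFC 4507.
[I-D.ietf-sip-gruu]  "Obtaining and Using Globally Routable User Agent (UA) URIs (GRUU) in the Session Initiation Protocol (SIP)", Internet-Draft.
"Specification for the Advanced Encryption Standard (AES)".
"The Order of Encryption and Authentication for Protecting Communications (or: How Secure Is SSL?)".
"Client-Side Caching for TLS".
"Space/time trade-offs in hash coding with allowable errors".
"Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol".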