Cyclic redundancy check
Encyclopedia : C : CY : CYC : Cyclic redundancy check
A cyclic redundancy check (CRC) is a type of hash function used to produce a checksum – a small, fixed number of bits – against a block of data, such as a packet of network traffic or a block of a computer file. The checksum is used to detect errors after transmission or storage. A CRC is computed and appended before transmission or storage, and verified afterwards by recipient to confirm that no changes occurred on transit. CRCs are popular because they are simple to implement in binary hardware, are easy to analyze mathematically, and are particularly good at detecting common errors caused by noise in transmission channels.
Introduction
A CRC "checksum" is the remainder of a binary division with no bit carry (XOR used instead of subtraction), of the message bit stream, by a predefined (short) bit stream of length [n], which represents the coefficients of a polynomial. Before the division, [n] zeros are appended to the message stream.CRCs are based on division in the ring of polynomials over the finite field GF(2) (the integers modulo 2). In simpler terms, this is the set of polynomials where each coefficient is either zero or one (A.K.A binary or base 2 numbers), and arithmetic operations wrap around. For example:
- [(x^3 + x) + (x + 1) = x^3 + 2x + 1 \equiv x^3 + 1]
- [(x^2 + x)(x + 1) = x^3 + 2x^2 + x \equiv x^3 + x]
- [\frac = (x^2 + 1) - \frac]
- [(x^3 + x^2 + x) = (x^2 + 1)(x + 1) - 1]
Any string of bits can be interpreted as the coefficients of a polynomial of this sort, and to find the CRC, we divide by another fixed polynomial, after first multiplying by [x^] where [n] is the degree of the fixed polynomial. The coefficients of the remainder polynomial are the CRC.
In the above equations, [x^2+x+1] represents the original message bits 111, [x+1] acts as the key, and the remainder [1] (equivalently, [x^0]) is the CRC. The degree of the key is 1, so we first multiplied the message by [x^1] to get [x^3 + x^2 + x].
In general form:
- [M(x) \cdot x^ = Q(x) \cdot K(x) + R (x) ]
CRCs are often referred to as "checksums," but such designations are not strictly accurate since, technically, a checksum would be calculated through addition, and not through division as is the case with CRCs.
Closely related to CRCs are error-correcting codes, which allow the correct message to be reconstructed in the face of transmission errors. These codes are based on closely related mathematical principles.
Implementation
There are simple, efficient algorithms for computing the CRC of a block of data, such as those shown below.One common bitwise algorithm is based on a shift register which has a size in bits equal to the degree of the chosen polynomial. The main portion of the algorithm can be expressed in pseudocode as follows:
function crc(bit array bitString[1..len], int polynomial) else } return shiftRegister }Another common algorithm uses a lookup table indexed by multiple most-significant bits of the
shiftRegister to process multiple bits at once. A 256-entry lookup table is a particularly common choice.''
Another implementation uses a shift register, but instead of changing multiple bits in the shiftRegister based on the xor of one bit of the shiftRegister and one bit of the bitString, it is possible to xor (compute the parity of) all the bits of the shiftRegister selected by the polynomial and the bitString and add that single bit to the shiftRegister. With suitable adjustments to the polynomial, this also produces the same remainder. This algorithm is usually inefficient in software, but used in some hardware implementations, and is often used when describing the close relative to a CRC, the linear feedback shift register.
The specific CRC produced is defined by the polynomial used. To produce an n-bit CRC requires a degree-n polynomial, of the form [x^n + \ldots + 1 ]. This is naturally expressed as an (n + 1)-bit string, but the leading [x^n] term is normally implicit, leaving an n-bit string Thus, depending on the bit-order convention used, the CRC-16-IBM polynomial [x^ + x^ + x^2 + 1], will be represented as the hexadecimal number 0x8005 or as 0xA001.
One of the most commonly encountered CRC algorithms is known as CRC-32, used by (among others) Ethernet, FDDI, ZIP and other archive formats, and PNG image format. Its polynomial can be written 0x04C11DB7, or 0xEDB88320 by reversing the bit string derived from the polynomial as explained above. The W3C webpage on PNG includes an appendix with [a short and simple table-driven implementation] in C of CRC-32. Despite the popular acclaim, it is important to notice that the polynomial in question was arbitrarily chosen and generally exhibits very poor error detection properties (in terms of Hamming distance per given block size) compared to polynomials selected by algorithmic means
- Castagnoli's et al. work on algorithmic selection of CRC polynomials
- verification of Castagnoli's results by exhaustive search and some new good polynomials
.
Variations
There are several variations on CRCs
- The
shiftRegistercan be reversed, so its least-significant bit is tested and it is shifted to the right by 1 bit each step. This requires apolynomialwith its bits reversed, and produces a bit-reversed result. This variant is actually the one most commonly in use.
- The data bits may be read into the shift register either most significant first, or least significant bit first. When used in communication protocols, the CRC is usually calculated in the order the bits are sent over the physical layer, to preserve the CRC's burst error detection characteristics.
- To check the CRC, instead of calculating the CRC on the message and comparing it to the CRC, a CRC calculation may be run on the entire codeword. If the result is zero, the check passes. This works because the codeword is [M(x) \cdot x^ - R(x) = Q(x) \cdot K(x)], which is always divisible by [K(x)].
- The shift register may be initialized with ones instead of zeroes. Equivalently, the first [n] bits of the message may be inverted before feeding them into the algorithm. The reason to do this is because an unmodified CRC does not distinguish between two messages which differ only in the number of leading zeros. When this inversion is done, the CRC does distinguish between such messages.
- The CRC may be inverted before being appended to the message stream. When this is done, the CRC may be checked either by the straightforward method of computing the CRC on the message, inverting it, and comparing with the CRC in the message stream, or by calculating the CRC on the entire message. In the latter method, the result for a correct message will not be zero, but instead will be the result of dividing [\sum_^ x^] by [K(x)]. This result is called the check polynomial [C(x)], and its hexadecimal representation is sometimes called a magic number.
Reversed representations and reciprocal polynomials
Polynomial representations
All the well-known CRC polynomials of degree [n] have two common hexadecimal representations:
- A hexadecimal number with [n] bits, the least significant bit of which is always 1, the most significant bit representing the [x^] coefficient and the least significant bit representing the [x^] coefficient.
- A hexadecimal number with [n] bits, the most significant bit of which is always 1, the most significant bit representing the [x^] coefficient and the least significant bit representing the [x^] coefficient.
Using the normal representation in the right shift algorithm or the reversed one in the left shift algorithm gives incorrect results.
In both these representations the [x^] coefficient is omitted and understood to be 1.
There is a third representation in use, from a paper by P. Koopman and T. Chakravarty
- analysis of short CRC polynomials for embedded applications
- A hexadecimal number with [n] bits, the most significant bit of which is always 1, the most significant bit representing the [x^n] coefficient and the least significant bit representing the [x^1] coefficient. The [x^] coefficient is omitted and understood to be 1.
Reciprocal polynomials
A reciprocal polynomial is created by assigning the [x^] through [x^] coefficients of one polynomial to the [x^] through [x^] coefficients of a new polynomial. Example: The reverse of [x^ + x^ + x^ + 1] is [x^ + x^ + x^ + 1]. The most interesting property of reciprocal polynomials, when used in CRCs, is that they have the exact same error-detecting strength as the polynomials they are reciprocals of. The reciprocal of a polynomial generates the same codewords — that is, the bit strings consisting of a valid CRC appended to a message — as the original polynomial, only bit reversed. But the reciprocal polynomial is not the same as the original polynomial, and the CRCs generated using it are not the same (even modulo bit reversal) as those generated from the original polynomial.Another interesting property of reciprocal polynomials is related to their representation. Consider again the CRC-16-IBM polynomial [x^ + x^ + x^ + 1] and its reciprocal. These are its representations
- Normal original: 0x8005
- Reversed original: 0xA001
- Koopman original: 0xC002
- Normal reciprocal: 0x4003
- Reversed reciprocal: 0xC002
- Koopman reciprocal: 0xA001
Error detection strength
The error-detection ability of a CRC depends on the degree of its key polynomial and on the specific key polynomial used. The "error polynomial" [E(x)] is the exclusive-or of the received message codeword and the correct message codeword. An error will go undetected by a CRC algorithm if and only if the error polynomial is divisible by the CRC polynomial.
- Because a CRC is based on division, no polynomial can detect errors consisting of a string of zeros prepended to the data, or of missing leading zeros. However, see Variations.
- All single bit errors will be detected by any polynomial with at least two terms with non-zero coefficients. The error polynomial is [x^k], and [x^k] is divisible only by polynomials [x^i] where [i \le k].
- All two bit errors separated by a distance less than the order of the polynomial will be detected. The error polynomial in the two bit case is [E(x) = x^i + x^k = x^k \cdot (x^ + 1), \; i > k]. As noted above, the [x^k] term will not be divisible by the CRC polynomial, which leaves the [x^ + 1] term. By definition, the smallest value of [] such that a polynomial divides [x^ + 1] is the polynomial's order. The polynomials with the largest order are called primitive polynomials, and for polynomials of degree [n] with binary coefficients, have order [2^n - 1].
- All errors in an odd number of bits will be detected by a polynomial which is a multiple of [x+1]. This is equivalent to the polynomial having an even number of terms with non-zero coefficients.
- All burst errors of length [n] will be detected by any polynomial of degree [n] or greater which has a non-zero [x^0] term.
[M(x) \cdot x^ = Q(x) \cdot K'(x) + R(x)]
[M(x) \cdot x^n = Q(x) \cdot x \cdot K'(x^n) + x \cdot R(x)]
[M(x) \cdot x^n = Q(x) \cdot K(x^n) + x \cdot R(x)]
That is, the CRC of any message with the [K(x)] polynomial is the same as that of the same message with the [K'(x)] polynomial with a zero appended. It is just a waste of a bit.)
The combination of these factors means that good CRC polynomials are often primitive polynomials (which have the best 2-bit error detection) or polynomials whose factors are a primitive polynomial of degree [n-1] and [x+1] (which detects all odd numbers of bit errors, and has half the two-bit error detection ability of a primitive polynomial of degree [n]).
CRC polynomial specifications
The following table below omits "Initial value", "Reflected value" and "Final XOR value".- These hex values are important for some more complicated checksums (like most forms of CRC-32 and CRC-64). CRCs less than CRC-16 do not tend to use these values.
- Very often custom versions of checksums are created by changing these values, as it does not alter the mechanics of the checksum algorithm.
- The definition of CRC-12 is disputed, as there are 3 forms of CRC-12 in common use.
- Both forms of CRC-8 in use have notable weaknesses mathematically.
- It is assumed that at least 10 forms of CRC-16 and CRC-32 exist, but no form of CRC-16 or CRC-32 is mathematically optimal.
- CCITT CRCs differ from ITU CRCs (of the same size), as the same entitiy has standardized checksums more than once but in different eras.
CRCs in common use (in ITU-IEEE syntax)
| Name | Polynomial | Representations: Normal or Reverse (Normal of Reciprocal) |
|---|---|---|
| CRC-1 | [x + 1] (use: hardware, also known as parity bit) | 0x1 or 0x1 (0x1) |
| CRC-5-CCITT | [x^ + x^ + x + 1] (ITU G.704 standard) | 0x15 (0x??) |
| CRC-5-USB | [x^ + x^ + 1] (use: USB token packets) | 0x05 or 0x14 (0x9) |
| CRC-7 | [x^ + x^ + 1] (use: telecom systems) | 0x09 or 0x48 (0x11) |
| CRC-8-ATM | [x^8 + x^2 + x + 1] (use: ATM HEC) | 0x07 or 0xE0 (0xC1) |
| CRC-8-CCITT | [x^8 + x^7 + x^3 + x^2 + 1] (use: 1-Wire bus) | |
| CRC-8 | [x^8 + x^7 + x^6 + x^4 + x^2 +1] | 0xEA(0x??) |
| CRC-10 | x10 + x9 + x5 + x4 + x + 1 | 0x233 (0x????) |
| CRC-12 | [x^ + x^ + x^3 + x^2 + x + 1] (use: telecom systems) | 0x80F or 0xF01 (0xE03) |
| CRC-16-Fletcher | See Fletcher's checksum | Used in Adler-32 A & B CRCs |
| CRC-16-CCITT | x16 + x12 + x5 + 1 (X25, V.41) | 0x1021 or 0x8408 (0x0811) |
| CRC-16-IBM | x16 +x15 + x2 + 1 | 0x8005 or 0xA001 (0x4003) |
| CRC-16-BBS | x16 + x15 + x10 + x3 (Use: XMODEM protocol) | 0x8408 (0x????) |
| CRC-32-Adler | See Adler-32 | See Adler-32 |
| CRC-32-MPEG2 | See IEEE 802.3 | See IEEE 802.3 |
| CRC-32-IEEE 802.3 | [x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1] | 0x04C11DB7 or 0xEDB88320 (0xDB710641) |
| CRC-32C (Castagnoli) | [x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^9 + x^8 + x^6 + 1] | 0x1EDC6F41 or 0x82F63B78 (0x05EC76F1) |
| CRC-64-ISO | [x^ + x^4 + x^3 + x + 1] (use: ISO 3309) | 0x000000000000001B or 0xD800000000000000 (0xB000000000000001) |
| CRC-64-ECMA-182 | [x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^ + x^9 + x^7 + x^4 + x + 1] (as described in [ECMA-182] p.63) | 0x42F0E1EBA9EA3693 or 0xC96C5795D7870F42 (0x92D8AF2BAF0E1E85) |
| CRC-128 | IEEE-ITU Standard. Use superseded by hashes MD5 & SHA-1 | |
| CRC-160 | IEEE-ITU Standard. Use superseded by hashes MD5 & SHA-1 | |
CRCs and data integrity
While useful for error detection, CRCs cannot be safely relied upon to verify data integrity (that no changes whatsoever have occurred), since, because of the linear structure of CRC polynomials, it is extremely easy to intentionally change data without modifying its CRC, as demonstrated by [CRC and how to Reverse it]. Message authentication codes can be used to verify data integrity.When CRCs collide
As all hash functions, CRCs tend to have collision rates nearing 100% after a certain number of collision tests. Each additional bit added to a CRC (ie: CRC-20 vs CRC-21) cuts the number of collisions down by nearly 50%.- The theoretical collision rate for CRC64 is around one collision every 18×1018 CRCs.
- A properly designed CRC (using fewer bits) can be as effective as a poorly designed CRC using more bits due to the irreducable polynomial property of CRCs. In this case CRC-32 can perform nearly as well as CRC-40.
Designing CRC polynomials
The selection of generator polynomial is the most important part of implementing the CRC algorithm. The polynomial must be chosen to maximize the error detecting capabilities while minimizing overall collision probabilites. The most important attribute of the polynomial is its length (the number of the highest nonzero coefficient), because of its direct influence of the length of the computed checksum.
The most commonly used polynomial lengths are
- 9 bits (CRC-8)
- 17 bits (CRC-16)
- 33 bits (CRC-32)
- 65 bits (CRC-64)
- Irreducibility in this case means that the polynomial cannot be divided by any polynomial except itself and 1 with zero remainder.
- CRCs with more than one nonzero coefficients are able to detect all single bit errors in the input message.
- CRCs can be used to detect all double bit errors in the input message shorter than 2k, where k is the length of the longest irreducible part of the polynomial.
- If the CRC polynomial is divided by x + 1 then no polynomial with odd number of nonzero coefficients can be divided by it. Hence, it can be used to detect odd number of errors in the input message (like single bit parity function).
- CRC polynomials detect (single) burst errors shorter than the number of the position of the highest polynomial coefficient.
See also
General category Specific Technological ReferencesReferences
External links
- [Tutorial and C++ implementation] of CRC
- Cyclic redundancy check - a simple guide to what it means for your data, CD and DVD discs. http://www.softwarepatch.com/tips/cyclic-redundancy.html
- [The CRC Pitstop]
- Williams, R. (1993-09) [A Painless Guide to CRC Error Detection Algorithms]
- [Understanding Cyclic Redundancy Check]
- Black, R. (1994-02) [Fast CRC32 in Software] — Algorithm 4 is used in Linux and info-zip's zip and unzip.
- Barr, M. ([1999-11], [1999-12], [2000-01]) checksums, CRCs, and their source code. Embedded Systems Programming
- [CRC32: Generating a checksum for a file], C++ implementation by Brian Friesen
- Online [Tool to compute common CRCs (8/16/32/64) from strings]
- Online [CRC calculator]
- Online [CRC Tool: Generator of synthesizable CRC functions]
- [Online Char (ASCII), HEX, Binary, Base64, etc... Encoder/Decoder with MD2, MD4, MD5, SHA1+2, CRC, etc. hashing algorithms]
- [CRC16 to CRC64 collision research]
- [Reversing CRC – Theory and Practice.]
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
