An rdb file begins with “REDIS0003” encoded as a bytestring. [9 bytes]
For each database in the dump
An rdb file ends with the EOF opcode (0xff) [1 byte]
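As a minimal sketch of checking the 9-byte header, the function below assumes that the four characters after "REDIS" are an ASCII decimal version number, as "0003" suggests. The name parse_rdb_header is hypothetical, not part of Redis.

```python
def parse_rdb_header(data: bytes) -> int:
    """Return the format version found in the 9-byte rdb header."""
    if len(data) < 9 or data[:5] != b"REDIS":
        raise ValueError("not an rdb file: missing REDIS magic")
    # Assumption: the four bytes after the magic are ASCII decimal digits.
    return int(data[5:9].decode("ascii"))
```

For the header described above, parse_rdb_header(b"REDIS0003") gives 3.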
0x00 → String object
0x01 → List object
0x02 → Set object
0x03 → Sorted set object
0x04 → Hash object
0x09 → Zipmap
0x0a → Ziplist
0x0b → Intset
0x0c → Ziplist (as value-score pairs for a sorted set)
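The type bytes listed above can be kept in a lookup table. This is only a sketch; the names RDB_VALUE_TYPES and value_type_name are hypothetical.

```python
# Value-type bytes as listed above, mapped to descriptive names.
RDB_VALUE_TYPES = {
    0x00: "string",
    0x01: "list",
    0x02: "set",
    0x03: "sorted set",
    0x04: "hash",
    0x09: "zipmap",
    0x0A: "ziplist",
    0x0B: "intset",
    0x0C: "sorted set in ziplist",
}

def value_type_name(type_byte: int) -> str:
    """Translate a value-type byte into a name, rejecting unknown bytes."""
    try:
        return RDB_VALUE_TYPES[type_byte]
    except KeyError:
        raise ValueError(f"unknown value type 0x{type_byte:02x}") from None
```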
The two most-significant bits of the first byte hold the type of the length encoding.
The remaining bits hold the length (or, in the last case, the encoding type), marked below as X’s.
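A minimal sketch of a length decoder follows. The exact layout of the four cases is an assumption here, chosen to match the 1-5 byte range and the examples in this document: 00 is a 6-bit length, 01 a 14-bit length, 10 a 32-bit length in the next four bytes (assumed big-endian), and 11 a special encoding type. The name read_length is hypothetical.

```python
import struct

def read_length(buf: bytes, pos: int = 0):
    """Decode a length starting at buf[pos].

    Returns (value, is_encoded, next_pos). When is_encoded is True,
    value is an encoding type rather than a length.
    """
    first = buf[pos]
    kind = first >> 6            # two most-significant bits
    low6 = first & 0x3F          # remaining six bits
    if kind == 0:                # 00XXXXXX: 6-bit length
        return low6, False, pos + 1
    if kind == 1:                # 01XXXXXX XXXXXXXX: 14-bit length
        return (low6 << 8) | buf[pos + 1], False, pos + 2
    if kind == 2:                # 10------: length in the next 4 bytes
        (value,) = struct.unpack_from(">I", buf, pos + 1)  # assumed big-endian
        return value, False, pos + 5
    return low6, True, pos + 1   # 11XXXXXX: special encoding type
```

The is_encoded flag lets a caller tell a real length apart from an encoding type without inspecting the raw byte again.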
The first 1-5 bytes of a string object contain the length of the string object and whether or not it is an encoded string, as described in the length-encoding section above.
If the string is not an encoded type, it is simply a series of n bytes, where n is the length determined above.
If the string is an encoded type and the bottom two bits of the length are 00, 01, or 10, the string is loaded as an integer and converted to a string. Otherwise the string is LZF-decompressed and loaded as a string.
Example (the string “foo”):
[03] → (00000011) The string is normally encoded as a series of 3 characters
[66 6f 6f] → “foo”
Example (the string “-1”):
[c0] → (11000000) The string is an 8-bit signed integer
[ff] → “-1”
Example (the string “256”):
[c1] → (11000001) The string is a 16-bit signed integer
[00 01] → “256”
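The three examples above can be reproduced with a short sketch. It assumes that encoding types 0, 1, and 2 mean 8-, 16-, and 32-bit signed little-endian integers (the first two match the examples; the 32-bit case is an assumption), and it deliberately handles only single-byte lengths and skips LZF. The name read_string is hypothetical.

```python
import struct

# Encoding type -> struct format for a signed little-endian integer.
INT_FORMATS = {0: "<b", 1: "<h", 2: "<i"}

def read_string(buf: bytes, pos: int = 0):
    """Decode one string object; returns (text, next_pos)."""
    first = buf[pos]
    if first >> 6 == 3:                      # 11XXXXXX: encoded string
        enc = first & 0x3F
        if enc not in INT_FORMATS:
            raise NotImplementedError("LZF-compressed strings are not handled here")
        fmt = INT_FORMATS[enc]
        (number,) = struct.unpack_from(fmt, buf, pos + 1)
        return str(number), pos + 1 + struct.calcsize(fmt)
    if first >> 6 != 0:
        raise NotImplementedError("only single-byte lengths are handled here")
    length = first & 0x3F                    # 00XXXXXX: plain string of n bytes
    start = pos + 1
    return buf[start:start + length].decode("latin-1"), start + length
```

Returning the next position alongside the text makes it easy to decode several string objects in a row, which is how lists, sets, and hashes are laid out.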
The first 1-5 bytes of a list object contain the number of list members, as described in the length-encoding section above.
There is no special encoding type for lists. Their compact representation is the ziplist, but this is specified as the object type, not in the length encoding.
List objects are simply n strings, decoded as string objects, where n is the length from above.
The first 1-5 bytes of a set object contain the number of set members, as described in the length-encoding section above.
There is no special encoding type for sets. Their compact representation is the intset, but this is specified as the object type, not in the length encoding. The intset is reserved for sets in which all members are integers.
Set objects are simply n strings, decoded as string objects, where n is the length from above.
The first 1-5 bytes of a sorted set object contain the number of value-score pairs, as described in the length-encoding section above.
There is no special encoding type for sorted sets. Their compact representation is a ziplist of value-score pairs, but this is specified as the object type, not in the length encoding.
Sorted set objects are n value-score pairs, where n is the length from above.
Each value is decoded as a string object.
Every value is followed by a double specifying its score. This double should be decoded as specified in the double section below.
The first 1-5 bytes of a hash object contain the number of key-value pairs, as described in the length-encoding section above.
There is no special encoding type for hashes. Their compact representation is the zipmap, but this is specified as the object type, not in the length encoding.
The rest of the hash is decoded as 2n string objects, where n is the length from above.
The strings alternate: each key is immediately followed by the string representing its value.
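Once the 2n strings of a hash have been decoded, pairing them up is straightforward. A small sketch, with the hypothetical name pair_up, assuming the strings arrive as a flat list in file order:

```python
def pair_up(strings):
    """Turn [key1, value1, key2, value2, ...] into a dict."""
    if len(strings) % 2 != 0:
        raise ValueError("a hash needs an even number of strings")
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(strings[0::2], strings[1::2]))
```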
(Note: I was unable to determine whether the fields described below as little-endian are always little-endian or whether they are stored in host byte order.)
Ziplists are space-efficient special encodings for lists and sorted sets. The maximum number of members and the maximum size for the ziplist encoding are set in the conf file that the Redis server reads when starting.
The first 4 bytes store the number of bytes in the ziplist.
The next 4 bytes store the offset (in bytes) to the start of the last entry in the list.
The next 2 bytes store the number of members of the ziplist.
The remaining bytes, up to the terminating byte, store a sequence of members. If the ziplist is encoding a sorted set, the members should be parsed as value-score pairs.
The structure of every ziplist member is as follows:
Every ziplist terminates with a 0xff byte
Example (encoding the very silly list [1,1]):
[13 00 00 00] → little-endian 32-bit length (in bytes) of ziplist (19)
[0e 00 00 00] → little-endian 32-bit offset (in bytes) to the start of the last entry in the list (14)
[02 00] → little-endian 16-bit number of list entries
[00] → number of bytes of previous entry
[c0] → value encoding (11000000, which means a 16-bit signed integer follows)
[01 00] → 16-bit value
[04] → number of bytes of previous entry
[c0] → value encoding (11000000, which means a 16-bit signed integer follows)
[01 00] → 16-bit value
[ff] → end of ziplist
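The example above can be walked with a short parser sketch. The entry layout is partly an assumption: a one-byte previous-entry length, then an encoding byte where 0xc0, 0xd0, and 0xe0 mean 16-, 32-, and 64-bit signed little-endian integers (only 0xc0 appears in the example) and 00XXXXXX means a short string of that many bytes. Other forms are rejected rather than guessed. The name parse_ziplist is hypothetical.

```python
import struct

# Assumed integer encodings: int16, int32, int64 (little-endian).
INT_ENCODINGS = {0xC0: "<h", 0xD0: "<i", 0xE0: "<q"}

def parse_ziplist(buf: bytes):
    """Return the members of a ziplist as a list of ints and bytes."""
    total_bytes, tail_offset, count = struct.unpack_from("<IIH", buf, 0)
    if total_bytes != len(buf) or buf[-1] != 0xFF:
        raise ValueError("truncated or unterminated ziplist")
    pos, members = 10, []                    # entries start after the 10-byte header
    while buf[pos] != 0xFF:
        if buf[pos] == 0xFE:
            raise NotImplementedError("multi-byte previous-entry lengths")
        pos += 1                             # skip the previous-entry length
        enc = buf[pos]
        pos += 1
        if enc in INT_ENCODINGS:
            fmt = INT_ENCODINGS[enc]
            (value,) = struct.unpack_from(fmt, buf, pos)
            pos += struct.calcsize(fmt)
        elif enc >> 6 == 0:                  # assumed: short string of enc bytes
            value = buf[pos:pos + enc]
            pos += enc
        else:
            raise NotImplementedError(f"entry encoding 0x{enc:02x}")
        members.append(value)
    if len(members) != count:
        raise ValueError("member count does not match the header")
    return members
```

For a sorted set, the returned members could be paired as zip(members[0::2], members[1::2]) to recover the value-score pairs.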
(Note: I was unable to determine whether the fields described below as little-endian are always little-endian or whether they are stored in host byte order.)
The intset is used to encode sets consisting only of integers.
Intsets begin with a 32-bit number specifying the size, in bytes, of each member of the intset.
Second, there is a 32-bit number specifying the number of elements in the intset.
Finally, every member is a signed integer, the size of which is specified above.
Intsets end with a 0xff
Example (encoding the set [1,-1]):
[02 00 00 00] → little-endian 32-bit size, in bytes, of each member of the intset (2 bytes, i.e. 16 bits)
[02 00 00 00] → little-endian 32-bit number of elements in the intset (2)
[ff ff] → One little-endian 16-bit member (-1 [signed int16])
[01 00] → One little-endian 16-bit member (1)
[ff] → End of intset
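A minimal sketch of reading the example above. It assumes member sizes of 2, 4, or 8 bytes (only the 2-byte case appears in this document) and little-endian storage throughout. It reads exactly the declared number of members and does not examine any bytes after them. The name parse_intset is hypothetical.

```python
import struct

# Member size in bytes -> struct code for a signed integer of that size.
WIDTH_FORMATS = {2: "h", 4: "i", 8: "q"}

def parse_intset(buf: bytes):
    """Return the members of an intset as a list of ints, in stored order."""
    width, count = struct.unpack_from("<II", buf, 0)
    if width not in WIDTH_FORMATS:
        raise ValueError(f"unexpected member size {width}")
    # Members follow the 8-byte header, all with the same width.
    return list(struct.unpack_from(f"<{count}{WIDTH_FORMATS[width]}", buf, 8))
```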
Zipmaps are space-efficient special encodings for hashes. The maximum number of members and the maximum size for the zipmap encoding are set in the conf file that the Redis server reads when starting.
The first byte specifies the length of the zipmap. If the length of the zipmap is greater than or equal to 0xfe, disregard this length and traverse the entire zipmap until 0xff is encountered after a member.
Next is a series of key-value pairs encoded as follows:
The zipmap ends with a 0xff
Example (encoding the hash {bar: 1})
[01] → The length of the zipmap (1 key-value pair)
[03] → The length of the first key
[62 61 72] → “bar”
[01] → The length of the first value
[02] → Number of free bytes
[31] → “1”
[6e 65] → 2 free bytes (“ne” from a previous setting of bar to “one”. These can be thrown out)
[ff] → End of zipmap
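The example above suggests the pair layout: key length, key, value length, free-byte count, value, then the free bytes. A sketch that follows that layout is given below. It handles only single-byte lengths and refuses length bytes of 0xfd or more, on the assumption that longer strings use a wider form not covered here. The name parse_zipmap is hypothetical.

```python
def parse_zipmap(buf: bytes) -> dict:
    """Return the key-value pairs of a zipmap as a dict of strings."""
    def read_len(pos):
        n = buf[pos]
        if n >= 0xFD:
            raise NotImplementedError("multi-byte lengths are not handled here")
        return n, pos + 1

    declared = buf[0]                        # pair count, unless >= 0xfe
    pos, pairs = 1, {}
    while buf[pos] != 0xFF:
        key_len, pos = read_len(pos)
        key = buf[pos:pos + key_len]
        pos += key_len
        val_len, pos = read_len(pos)
        free = buf[pos]                      # unused bytes after the value
        pos += 1
        value = buf[pos:pos + val_len]
        pos += val_len + free                # skip the value and its free bytes
        pairs[key.decode("latin-1")] = value.decode("latin-1")
    if declared < 0xFE and declared != len(pairs):
        raise ValueError("pair count does not match the zipmap length byte")
    return pairs
```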
LZF is a fast compression algorithm, a cousin of the more popular LZW compression algorithm.
The first 1-5 byte(s) are a length-encoded length of the compressed string.
The second 1-5 byte(s) are a length-encoded length of the decompressed string.
The next n bytes are in compressed LZF format, where n is the first length above.
To decompress LZF, use the following loop:
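A minimal Python sketch of such a loop is given here, assuming the control-byte layout used in the example that follows: a control byte below 32 starts a literal run of that value plus one bytes, and anything else is a back-reference whose top three bits give the length and whose low five bits plus the next byte give the distance. The function name lzf_decompress and its expected_len parameter are hypothetical.

```python
def lzf_decompress(data: bytes, expected_len: int) -> bytes:
    """Decompress an LZF byte string and check the decompressed length."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        ctrl = data[pos]
        pos += 1
        if ctrl < 32:                        # 000LLLLL: copy L+1 bytes literally
            run = ctrl + 1
            out += data[pos:pos + run]
            pos += run
        else:                                # LLLDDDDD: back-reference
            length = ctrl >> 5
            if length == 7:                  # length 7 means an extra length byte
                length += data[pos]
                pos += 1
            distance = ((ctrl & 0x1F) << 8) | data[pos]
            pos += 1
            start = len(out) - distance - 1
            if start < 0:
                raise ValueError("back-reference before start of output")
            # Copy byte by byte so an overlapping reference repeats itself.
            for i in range(length + 2):
                out.append(out[start + i])
    if len(out) != expected_len:
        raise ValueError("decompressed length mismatch")
    return bytes(out)
```

The byte-by-byte copy matters: a back-reference may be longer than its distance, in which case it reads bytes that the same copy has just written.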
Example (encoded string “if i never if i never if i never”)
[12] → The compressed length is 18 bytes
[20] → The original length is 32 bytes
[0b] → (000 | 01011) Copy the next 11+1 bytes literally
[69 66 20 69 20 6e 65 76 65 72 20 69] → “if i never i”
[e0 0a] → (111 | 00000) (10) A back-reference. The length field is 7, so the next byte (10) is added to it: set length to 10+7.
[0a] → (10) Set distance to (00000 00001010) or 10. We move back distance+1 = 11 characters from the end of “if i never i”, which lands on ‘f’ (the 10th to last character if the final “i” is counted as the 0th). Now copy 10+7+2 = 19 characters. There are only 11 characters from ‘f’ to the end, so we copy those 11, then continue from the ‘f’ just written and copy 8 more. Our decoded string now reads “if i never if i never if i neve”
[00] → (000 | 00000) Copy the next 0+1 byte literally
[72] → (r) “if i never if i never if i never”