WoWInterface - Coding an efficient serializer

Any comments would also be appreciated, such as comparisons with the approaches of existing serializer libs like AceSerializer.

For additional info, here's a copy of the serialization specification created and used for my serializer code.

Code:

SerialLib is designed to produce serialized strings with the least overhead possible.



Most simple data types take up only one byte, using a specific type identifier.

Simple data is any data that can be defined as a constant.



Numbers are encoded using a number conversion system that converts between theoretical bases.

The original value is treated as a base-256 number in which each byte is a single digit.

The converter re-expresses it in something like a base-248 system, which leaves room to shift digit values so that unusable bytes are avoided.

The encoder is also designed for variable number length, using the same method systems use to encode strings.

This is so it can encode/decode several hundred bytes as easily as it can a single byte.

This is subject to system limits, of course.
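
To make the base-conversion idea concrete, here's a rough sketch in Lua. It is not SerialLib's actual code. The exact base (248), the choice of bytes 0-7 as the "unusable" ones, the big-endian digit order and the function names are all assumptions made for illustration, and the length handling described above is left out.

```lua
-- Hypothetical parameters: the spec only says "something like base-248".
local BASE = 248           -- usable byte values per digit (assumption)
local SHIFT = 256 - BASE   -- bytes 0..7 are treated as unusable (assumption)

-- Encode a non-negative integer as big-endian base-248 digits,
-- shifting every digit into the byte range 8..255.
local function encodeDigits(n)
    assert(n >= 0 and n == math.floor(n), "non-negative integer expected")
    local bytes = {}
    repeat
        local digit = n % BASE
        table.insert(bytes, 1, string.char(digit + SHIFT))
        n = (n - digit) / BASE
    until n == 0
    return table.concat(bytes)
end

-- Reverse the process: undo the shift and accumulate the digits.
local function decodeDigits(s)
    local n = 0
    for i = 1, #s do
        n = n * BASE + (s:byte(i) - SHIFT)
    end
    return n
end
```

With these assumptions, encodeDigits(248) yields the two bytes 9 and 8 (digits 1 and 0, each shifted by 8), and no output byte ever falls in the 0-7 range.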



Simple data and their identifiers

        +        +Zero

        -        -Zero

        B        Boolean True

        b        Boolean False

        I        +Infinity

        i        -Infinity

        N        NaN

        n        nil

        s        Empty String

        |        Empty Table (Reference Pointers are created for this object type)
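
For anyone wondering how a serializer might pick one of these identifiers, here's a minimal sketch. The function name simpleIdentifier is made up for this example, and it ignores the reference-pointer bookkeeping that empty tables need.

```lua
-- Return the one-byte identifier for a simple value, or nil if the value
-- is not simple data. Hypothetical helper, not part of SerialLib.
local function simpleIdentifier(value)
    if value == nil then return "n" end
    if value == true then return "B" end
    if value == false then return "b" end
    if value == "" then return "s" end
    if type(value) == "number" then
        if value ~= value then return "N" end        -- NaN never equals itself
        if value == math.huge then return "I" end
        if value == -math.huge then return "i" end
        if value == 0 then
            -- 1/+0 is +infinity and 1/-0 is -infinity, which is the only
            -- way to tell the two zeroes apart (they compare equal).
            return (1 / value > 0) and "+" or "-"
        end
    end
    if type(value) == "table" and next(value) == nil then return "|" end
    return nil
end
```

The zero check is the only subtle part: 0 == -0 is true in Lua, so the sign has to be recovered through division.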



Complex data and their identifiers

        C        Binary-Encoded Lua Function (uses string.dump() and processes string result, Reference Pointers are created for this object type)

        D        +Integer

        d        -Integer

        E        +Exponential (Floating-Point)

        e        -Exponential (Floating-Point)

        F        +Fractional (Floating-Point)

        f        -Fractional (Floating-Point)

        r        Reference Pointer (points to previously-encountered object data)

        S        String

Note: For exponential floating-point numbers, the encoder first tries to encode the value as an integer.

It falls back to the exponential encoding only if the value exceeds the point at which the internal data loses precision at the integer level.

This threshold is equal to ±2^53 and is a limit on the data type Lua uses (64-bit double-precision floating-point), not the serialization format.
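
Here's a small sketch of that threshold check. The helper name integerOrExponential is hypothetical, and it only covers finite, non-zero, integer-valued input; how fractional values are routed to F/f is not shown.

```lua
-- Above this magnitude a 64-bit double can no longer represent every integer.
local MAX_EXACT = 2^53

-- Choose between the integer (D/d) and exponential (E/e) identifiers.
-- Precondition (assumption): n is finite, non-zero and integer-valued.
local function integerOrExponential(n)
    local exact = math.abs(n) <= MAX_EXACT
    if n > 0 then
        return exact and "D" or "E"
    end
    return exact and "d" or "e"
end
```

The precision loss is easy to see directly: 2^53 + 1 == 2^53 evaluates to true, because the next representable double after 2^53 is 2^53 + 2.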



Structural data and their start/end identifiers

        {        }        Tables  (Reference Pointers are created for this object type)



Serialized data is created by stringing multiple pieces together.

Deserializers are expected to have a variable number of returns and to return exactly as many values as there are serialized pieces in the string.
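
One catch with variable returns is that nil pieces leave holes, so the piece count has to be tracked explicitly instead of relying on the # operator. A minimal sketch follows; it only understands four of the simple identifiers, and deserializeSimple is a made-up name.

```lua
-- unpack lives in the global table in Lua 5.1 (WoW) and in table.unpack later.
local unpack = table.unpack or unpack

-- Decode a string made only of the identifiers B, b, s and n and return
-- one value per piece. Toy example, not a full deserializer.
local function deserializeSimple(data)
    local results, count = {}, 0
    for i = 1, #data do
        local id = data:sub(i, i)
        count = count + 1
        if id == "B" then
            results[count] = true
        elseif id == "b" then
            results[count] = false
        elseif id == "s" then
            results[count] = ""
        elseif id == "n" then
            results[count] = nil   -- leaves a hole; count still advances
        else
            error("identifier not handled in this sketch: " .. id)
        end
    end
    -- Explicit bounds keep trailing and embedded nils in the return list.
    return unpack(results, 1, count)
end
```

Calling select("#", deserializeSimple("Bnsn")) gives 4 here, even though two of the four returned values are nil.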

Serializers are expected to encode tables as pairs of pieces representing key/value pairs, with numerical indices appearing in ascending order before other indices.

Serializers are also expected to write a nil identifier for any unsupported data. nil identifiers are forbidden in table data, and any key/value pair containing at least one should be omitted from the table.
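
The ordering and omission rules for tables could be sketched roughly like this. The function name orderedEntries, the set of types counted as unsupported, and the reading of "numerical indices" as every number-typed key are assumptions for the example.

```lua
-- Types treated as unsupported in this sketch (assumption; the spec only
-- says "unsupported data").
local UNSUPPORTED = { ["function"] = true, userdata = true, thread = true }

-- Collect the key/value pairs of t in the order a serializer would write
-- them: number keys ascending first, then every other key. Pairs whose key
-- or value would serialize to a nil identifier are dropped.
local function orderedEntries(t)
    local numeric, other = {}, {}
    for k, v in pairs(t) do
        if not (UNSUPPORTED[type(k)] or UNSUPPORTED[type(v)]) then
            if type(k) == "number" then
                numeric[#numeric + 1] = k
            else
                other[#other + 1] = k
            end
        end
    end
    table.sort(numeric)
    local entries = {}
    for _, k in ipairs(numeric) do entries[#entries + 1] = { k, t[k] } end
    for _, k in ipairs(other) do entries[#entries + 1] = { k, t[k] } end
    return entries
end
```

The order of the non-numeric keys is whatever pairs() happens to produce, since the spec doesn't ask for anything stricter.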

Note that the specification does list a format for Lua functions, but the WoW API blocks the recreation of functions from binary data. For now, it's recommended to write a nil identifier instead. Also note that negative zeroes do exist in Lua; see the IEEE 754-1985 specification.