The Typhoon wave file format

This HTML document was converted from fmt_typh.rtf, which was retrieved from https://tomasmulcahy.com/muki-pakesch-tx16w-archive/.

This specification describes the compression algorithm for Typhoon format waves. It does not cover the file format, which is AIFF-C. The documentation for AIFF-C is available at the site ftp.sgi.com in the directory /sgi/aiff-c.9.26.91.ps.Z (compressed PostScript file).

Documentation v1.3 (for DWVW v1.2)

DWVW is a lossless (or bit faithful) compression method for digital audio data. Lossless means that the exact original data will be preserved when compressing and decompressing.

The compression utilizes the fact that the delta between consecutive sample points is generally smaller than the full dynamic range. The previous sample point is subtracted from each sample point, and the difference is entropy encoded in a special format. The compression therefore works best on low-frequency sounds with little noise, where the difference between successive samples is small.

DWVW can be applied to samples of any bit resolution and with any number of channels. As opposed to the AIFF standard, sample bits are not "left justified"; instead, the necessary translation should be done when decompressing. Also, while AIFF interleaves multichannel sounds, DWVW does not, as interleaving would complicate compression and decompression. The channels follow one another with only a slight break in the bit run: the first delta for each channel should be put at an even 16-bit word position.
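As a rough illustration of the two layout rules above, the sketch below shows one way a decoder might left-justify a decoded sample and one way an encoder might pad a channel's bit run. The names `left_justify` and `pad_to_word` are hypothetical, bits are modeled as a string of '0' and '1' characters, the container is assumed to be the smallest whole number of bytes that holds the sample, and zero padding bits are an assumption since the text does not state the pad value.

```python
def left_justify(sample, bit_resolution):
    """Shift a decoded sample to the top of its byte-aligned container.

    Assumption: the container is the smallest whole number of bytes
    that can hold bit_resolution bits.
    """
    container_bits = ((bit_resolution + 7) // 8) * 8
    return sample << (container_bits - bit_resolution)


def pad_to_word(bits):
    """Pad a channel's bit string so the next channel starts on a
    16-bit word boundary. Assumption: padding bits are zero."""
    return bits + "0" * (-len(bits) % 16)
```

For a 12-bit sound, `left_justify(-2048, 12)` gives -32768, the lowest value of a 16-bit container.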

The encoding stores the delta points with only as many bits as are required (hence the name "variable word width"). Thus, the number of bits used by each delta has to be stored as well. Since this count varies very little, we apply a (simpler) delta encoding to this information.

To sum up, each compressed sample point consists of two values: the delta from the last sample and the difference in word width of this delta from the last delta (hereafter referred to as "the WWM", the word width modifier). Even though the word width modifier is stored first in each delta frame, we will describe the delta information first.

The delta is always stored as an absolute difference (i.e. unsigned) in a variable number of bits. An extra bit follows that tells the sign (if the delta isn't zero). The number of bits required for the delta (i.e. the word width) is decided by the position of the most significant high bit in the absolute value. One bit less than this is actually stored, since the first bit is always high. For instance, the delta 11 (binary 1011) has a required word width of four bits, but only the least significant three bits are stored. A zero delta will have a zero word width and consequently requires neither delta bits nor sign bit. A delta of one will require only a sign bit.
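The rule above can be sketched in a few lines of Python. The function name `delta_fields` is hypothetical, bits are modeled as a string of '0' and '1' characters, and the sign polarity (high for negative) is taken from the encoding examples later in this document rather than from the paragraph itself. The special case of the lowest possible delta is not handled here.

```python
def delta_fields(delta):
    """Return (word_width, stored_bits, sign_bit) for an ordinary delta.

    stored_bits omits the implicit leading high bit; sign_bit is empty
    for a zero delta. Assumption: a high sign bit means negative.
    """
    magnitude = abs(delta)
    width = magnitude.bit_length()   # position of the most significant high bit
    if width == 0:
        return 0, "", ""             # zero delta: no delta bits, no sign bit
    stored = format(magnitude, "b")[1:]  # drop the always-high first bit
    sign = "1" if delta < 0 else "0"
    return width, stored, sign
```

With the delta 11 from the text, `delta_fields(11)` returns a width of 4 and the three stored bits "011".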

One special case requires attention. In normal two's complement, the magnitude of the lowest negative number is one greater than the highest positive number. Treating zero as a positive value, this gives exactly as many negative as positive numbers. The delta encoding, on the other hand, does not consider zero to be of any sign and therefore does not include the one extra negative value. If this value is encountered in the delta stream, it is encoded as one greater than it actually is (putting it within the expressible range of values). To distinguish it from the next lowest value, one extra bit is inserted after the sign bit. The bit is high for the lowest value and low for the next lowest value. For example, a 16-bit two's complement number can be -32768. It would be encoded as negative 32767 with an extra high bit. The value -32767 would also be encoded as negative 32767, but with the extra bit low. Of course, only these two values require the extra bit.
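A minimal sketch of this special case follows. The function name `adjust_extreme` is hypothetical. It reads the paragraph literally, so only the two lowest negative deltas carry the extra bit; how an implementation treats the highest positive delta is not stated beyond that.

```python
def adjust_extreme(delta, bit_resolution):
    """Return (magnitude_to_store, extra_bit) for a delta.

    extra_bit is "1" for the lowest possible delta, "0" for the next
    lowest, and empty for every other value (literal reading of the text).
    """
    lowest = -(1 << (bit_resolution - 1))
    if delta == lowest:
        return -delta - 1, "1"   # stored as one greater than it actually is
    if delta == lowest + 1:
        return -delta, "0"       # same magnitude, extra bit low
    return abs(delta), ""
```

For 16-bit data, both -32768 and -32767 are stored with the magnitude 32767 and differ only in the extra bit.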

The WWM precedes the delta bits. It is encoded as a series of low bits (0) terminated by a high bit (1) (in most cases). The count of low bits gives the modifier amount. If the modifier isn't zero, an extra bit follows that tells the modifier sign. A high bit means a negative modifier.

Word width "wraps" at the used bit resolution (new-width = (original-width + modifier) modulo bit-resolution). This enables us to go from a small width to a large width by using a negative modifier. Because of this, a WWM will never need to be larger than the sound bit resolution divided by two (rounded downwards). If the modifier is the maximum, the terminating high bit would be superfluous, so in this case it isn't inserted. However, the sign bit is always included, even when the bit resolution is even and both signs of the maximum modifier yield the same width.
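The WWM rules from the last two paragraphs might be sketched as follows. The function name `encode_wwm` is hypothetical and bits are modeled as a string of '0' and '1' characters. Choosing the modifier with the smallest magnitude is an assumption that follows from the stated upper bound of half the bit resolution.

```python
def encode_wwm(current_width, new_width, bit_resolution):
    """Return (modifier, bits) taking current_width to new_width."""
    max_modifier = bit_resolution // 2
    modifier = new_width - current_width
    # Use the wrap-around to keep the magnitude within max_modifier.
    if modifier > max_modifier:
        modifier -= bit_resolution
    elif modifier < -max_modifier:
        modifier += bit_resolution

    bits = "0" * abs(modifier)            # count of low bits = amount
    if abs(modifier) != max_modifier:
        bits += "1"                       # terminating high bit
    if modifier != 0:
        bits += "1" if modifier < 0 else "0"  # sign, high is negative
    return modifier, bits
```

Going from width 1 to width 10 at 16 bits, `encode_wwm(1, 10, 16)` picks the modifier -7 instead of +9 and emits seven low bits, the terminator, and a high sign bit.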

For encoding, the current word width and sample value should initially be reset to zero for each channel (the first delta will thus be the sample value). A compressed channel always starts on an even 16-bit word boundary.
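The per-channel reset can be illustrated with a short sketch that turns a channel's samples into deltas. The function name `channel_deltas` is hypothetical. Wrapping each delta into the two's complement range of the bit resolution is an assumption; the text implies it by limiting the lowest delta to the lowest sample value but does not state it outright.

```python
def channel_deltas(samples, bit_resolution):
    """Return the delta stream for one channel.

    The previous sample starts at zero, so the first delta equals the
    first sample. Assumption: deltas wrap modulo 2**bit_resolution.
    """
    half = 1 << (bit_resolution - 1)
    full = 1 << bit_resolution
    previous = 0
    deltas = []
    for sample in samples:
        delta = sample - previous
        delta = (delta + half) % full - half  # wrap into [-half, half - 1]
        deltas.append(delta)
        previous = sample
    return deltas
```

A decoder that adds the deltas back up, wrapping at the same resolution, recovers the original samples.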

Notice that the highest possible compression is one bit per sample, i.e. a ratio of eight to one for 8-bit samples. This occurs when the source is a continuous series of zero samples.

DWVW sample delta bit frame:

0	WWM is the count of low bits (can be none)
1	terminating high bit (if not max WWM)
ms	WWM sign, high is negative (only on non-zero WWM)
delta	(word width - 1) sample delta bits (if |delta| > 1)
sb	delta sign, high is negative (only on non-zero delta)
xb	extra bit (only on lowest and next lowest possible delta value)
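
Read as a decoding recipe, the frame layout above might look like the following sketch. The function name `decode_frame` is hypothetical, the bit stream is modeled as a string of '0' and '1' characters, and the extra bit is read only for negative deltas of the largest storable magnitude, which is a literal reading of the text.

```python
def decode_frame(bits, pos, current_width, bit_resolution):
    """Decode one delta frame starting at bits[pos].

    Returns (delta, new_width, next_pos).
    """
    max_modifier = bit_resolution // 2
    count = 0
    while count < max_modifier and bits[pos] == "0":
        count += 1
        pos += 1
    if count < max_modifier:
        pos += 1                         # skip the terminating high bit
    modifier = count
    if count:
        if bits[pos] == "1":             # WWM sign, high is negative
            modifier = -count
        pos += 1
    width = (current_width + modifier) % bit_resolution

    delta = 0
    if width:
        magnitude = 1                    # implicit leading high bit
        for _ in range(width - 1):
            magnitude = (magnitude << 1) | int(bits[pos])
            pos += 1
        negative = bits[pos] == "1"      # delta sign, high is negative
        pos += 1
        if negative and magnitude == (1 << (bit_resolution - 1)) - 1:
            if bits[pos] == "1":         # extra bit: lowest possible value
                magnitude += 1
            pos += 1
        delta = -magnitude if negative else magnitude
    return delta, width, pos
```

For instance, `decode_frame("0000000111100110110", 0, 1, 16)` gives back a delta of 923 and a new width of 10, matching the first encoding example in this document.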

Some encoding examples (the examples all represent extreme situations with unusually poor compression):

Bit resolution	16
Delta	923 (bin 00000011 10011011)
Current width	1
New width	10
Modifier	-7 (mod 16 = 10)
Yields	0000000 1 1 110011011 0

Bit resolution	12
Delta	-2048 (bin 1000 00000000)
Current width	0
New width	11
Modifier	-1 (mod 12 = 11)
Yields	0 1 1 1111111111 1 1 (-2048 is encoded as 2047 with the sign bit and the extra bit both high)

Bit resolution	8
Delta	-12 (bin 11110100, negated 00001100)
Current width	0
New width	4
Modifier	+4
Yields	0000 0 100 1 (no terminating bit for WWM)
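
The three examples above can be reproduced with a short frame encoder. This is a sketch under stated assumptions, not a reference implementation: the function name `encode_frame` is hypothetical, bits are modeled as a string of '0' and '1' characters, the modifier with the smallest magnitude is always chosen, and only the two lowest negative deltas receive the extra bit.

```python
def encode_frame(delta, current_width, bit_resolution):
    """Return (bits, new_width) for one delta frame."""
    lowest = -(1 << (bit_resolution - 1))
    magnitude = abs(delta)
    extra = ""
    if delta == lowest:
        magnitude -= 1                   # stored as the next lowest value
        extra = "1"
    elif delta == lowest + 1:
        extra = "0"
    new_width = magnitude.bit_length()

    # Word width modifier, wrapped to at most half the bit resolution.
    max_modifier = bit_resolution // 2
    modifier = new_width - current_width
    if modifier > max_modifier:
        modifier -= bit_resolution
    elif modifier < -max_modifier:
        modifier += bit_resolution

    bits = "0" * abs(modifier)
    if abs(modifier) != max_modifier:
        bits += "1"                      # terminating high bit
    if modifier:
        bits += "1" if modifier < 0 else "0"
    if new_width:
        bits += format(magnitude, "b")[1:]   # drop the implicit high bit
        bits += "1" if delta < 0 else "0"    # delta sign
        bits += extra
    return bits, new_width


# The three examples, with the spaces removed from the "Yields" rows.
print(encode_frame(923, 1, 16)[0])
print(encode_frame(-2048, 0, 12)[0])
print(encode_frame(-12, 0, 8)[0])
```

A run of zero samples at zero width produces the single bit "1" per frame, which is the one-bit-per-sample best case mentioned earlier.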
