Richard Harvey
- Nov 23, 2021
- 3 min read

A zippy lecture

A vague memory from childhood is a British TV series called “Rainbow”. As with all good children’s TV, behind the inanity and stupefying dullness were some sharp observations. The show was clearly meant to model good and bad behaviour in children. Two of the characters, George and Bungle, played yawn-inducing goody-two-shoes but Zippy was a loud-mouthed know-it-all. Needless to say, the psychotic Zippy became the star of the show. All lecturers have a bit of zippy in them. And, unless you have completely avoided computers your whole life then you cannot have missed the zip program and its bretheren which attempt to reduce the file space occupied by data files. Thus this was a zippy lecture in both senses of the word.

The lecture did not dwell upon the strange history of data compression but I thought it might be interesting to do so. Notwithstanding the importance of David Huffman’s contribution in 1952 [1], I think it’s worth noting that it was really in the 1970s and 80s when compression became a crunch topic. The personal computer was able to generate large amounts of data but data storage was still very expensive. Indeed in the early 1970s, a 1MB hard drive, if you could get it, would have cost around $1M. Nowadays we might expect to pay 1 cent. Hence lossless compression was suddenly very valuable indeed and there was flurry of activity. Much of that activity was commercial and a variety of algorithms turned out to be covered by patents.

In principle patents are not very problematic as one just pays the license fee and life goes on. In practice they can be very challenging. Firstly, mathematical algorithms are not patentable so intellectual purists are bothered when compression systems, which are mathematical algorithms, are patented. Secondly, it is not at all easy to check if an algorithm is covered by patents — there are “submarine” patents which do not appear via patent searches but surface at awkward moments and cause problems.

One such issue arose with GIF, the Graphic Interchange Format used mainly for short animations. The GIF format relied upon a form of compression known as Lempel-Zev-Welch. The first two papers by two Israeli engineers, Abraham Lempel and Jacob Ziv, known as LZ77 and LZ78 were rather technical [2], [3] and not covered by patents but Terry Welch speeded up the algorithm and created the LZW algorithm which was patented [4]. Welch also publicised the algorithm with an important, and quite readable article in IEEE Computer Magazine which is a widely read magazine designed to appeal to non-specialist computer engineers.

So the situation was that GIF was described but, unknown to the designers of GIF, it was reliant on a patent. I'm not clear if they were just sloppy in their patent searching or if the patent was indeed concealed but, as soon as the GIF standard became widespread, the patent holders started to approach users for licence fees. There was considerable kerfuffle at this, and I should note that it was very strange that the key Computer article did not mention the existence of a patent filing at all. It is normally good practice to warn people if things are covered by patents and I am at a loss to explain why this was not done.

Out of this came several things — a deep dislike of the patent holders Unisys (there were numerous attempts to discredit them including several cyber attacks) and an alternative format known as PNG. The patents have finally expired and so the overall effect of the patent was to generate a lot of ill will and a public domain competitor to the system - both surely count as great strategic losses.

The Unisys patent battle did have some positive ramifications - one of the unsung parts of MPEG is MPEG-LA, the MPEG licensing authority. Headquartered in Denver, MPEG-LA is a company that administers the patent pools required for various aspects of the MPEG standards. Given that a compression standard may infringe thousands of patents, the ideal of “pooling” these and giving licensors access to the bundle is a highly practical way out of the Unisys dilemma and it patent pools are now the model for progress in this highly competitive area of compression.

References

1. D. A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes," in Proceedings of the IRE, vol. 40, no. 9, pp. 1098-1101, Sept. 1952, doi: 10.1109/JRPROC.1952.273898.

2. J. Ziv and A. Lempel, "A universal algorithm for sequential data compression," in IEEE Transactions on Information Theory, vol. 23, no. 3, pp. 337-343, May 1977, doi: 10.1109/TIT.1977.1055714.

3. J. Ziv and A. Lempel, "Compression of individual sequences via variable-rate coding," in IEEE Transactions on Information Theory, vol. 24, no. 5, pp. 530-536, September 1978, doi: 10.1109/TIT.1978.1055934.

4. Welch, "A Technique for High-Performance Data Compression," in Computer, vol. 17, no. 6, pp. 8-19, June 1984, doi: 10.1109/MC.1984.1659158.

A zippy lecture

Recent Posts