Brecht Declercq is Digitisation and Acquisition Manager for VIAA in Belgium
At first glance, there is a strict distinction between carriers of analogue and digital audiovisual information. In practice, this distinction is not always clear. There are even carriers of audiovisual information that can hardly be catalogued under either of those two names. The word ‘digitisation’ is therefore not always used correctly. Moreover, there is an important difference between digital information, file-based information, and information stored on mass storage systems. A closer look at the history of digital information storage shows that the world did not make the switch overnight. There were decisive inventions, but more often very long-term evolutions. Both inventions and evolutions, some of whose importance is almost forgotten, prove to be essential for understanding where we have arrived today, and how we can preserve any audiovisual information.
Strictly speaking, digitisation is the conversion of analogue into digital information. But at the Flemish Institute for Archiving (VIAA), where I have worked since 2013, we don't think that is enough. We not only want to make audiovisual information digital, we also want that digital signal to exist in the form of a file, and we want these files to be placed onto our mass storage infrastructure (servers, tapes) according to a traditional file system, so that we can manage the files and their content much faster and more easily: move, copy, check, analyse, describe, etc.
If we take all kinds of audiovisual media (records, cassettes, tapes, films, etc.) as a starting point, and what I described above as the final goal, then for the vast majority of those media digitisation is the biggest job and the one that attracts the most attention. That is why we very often use the term "to digitise" as a pars pro toto for the much wider range of activities that go with it, even if strictly speaking that is not always correct.
Some people make a fairly strict distinction between analogue audiovisual information, on cassettes, tapes or records of all kinds, and, opposed to that, the computer, with its files, its hard disk, and possibly some types of carriers that we associate with files: floppies in all kinds of colours and shapes, USB sticks, memory cards, ...
But in fact there is a whole range of information carriers that technically sit on the transition between these two seemingly different worlds. These carriers contain audiovisual information that is already digital, but does not yet exist in the form of a file. Or even one step further: the carriers contain audiovisual information that is digital and stored as a file, but that still cannot be managed easily and quickly, because it is not on a mass storage infrastructure. VIAA also wants to save this information. Since the information is already digital, we prefer the terms ‘digital migration’ or ‘digital transfer’ over ‘digitisation’.
The history of digital information, storing of audiovisual information and files
Digital information really means nothing more than "information expressed in figures". To write those numbers out, we obviously don't just use the Arabic numerals. We code each number as a combination of two states: 0 and 1, positive and negative, black and white, absent and present, yes and no, to be or not to be. Some of us might not be aware of it, but we have been storing information in this structured way for centuries. Consider the punch cards that have been around since the 18th century! A hole is 1, no hole is 0.
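To make the "a hole is 1, no hole is 0" idea concrete, here is a minimal sketch in Python. The row layout (eight positions, `O` for a hole, `.` for no hole) and the function names are my own assumptions for illustration; they do not reproduce any historical card format.

```python
def to_punch_row(number: int, width: int = 8) -> str:
    """Write a non-negative integer as one row of a hypothetical punch card.

    'O' stands for a hole (1), '.' for no hole (0). The layout is an
    illustrative assumption, not a historical card format.
    """
    if number < 0 or number >= 2 ** width:
        raise ValueError(f"{number} does not fit in {width} positions")
    bits = format(number, f"0{width}b")  # binary digits, padded with zeros
    return bits.replace("1", "O").replace("0", ".")


def from_punch_row(row: str) -> int:
    """Read the row back: every hole counts as 1, every blank as 0."""
    bits = row.replace("O", "1").replace(".", "0")
    return int(bits, 2)


row = to_punch_row(5)
print(row)                  # the pattern of holes for the number 5
print(from_punch_row(row))  # the number read back from the holes
```

Reading the row back returns the original number, which is the whole point of a digital code: as long as one can still tell a hole from a blank, nothing is lost.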
Up until the first decades of the twentieth century, the information that was stored digitally remained relatively simple: series of numbers, possibly combined into accounting tables and suchlike. But audiovisual information is actually much more complex. Although… already in the nineteenth century, information that could be converted into an audiovisual signal was stored in the form of a kind of punch card: cardboard books for street organs. They let the air through the organ pipes, or block it, depending on whether or not there is a hole in the cardboard. The piano rolls for pianolas work on the same principle. It is digital information convertible into audio, but it is very, very simple.
For decades, the digital storage of audiovisual information did not go beyond this simple principle, while analogue storage made it possible much sooner to store and play back very complex signals such as speech. First on flat carriers: think of the phonautograms of Edouard-Léon Scott de Martinville (1860) and the tin foil (1877) of Edison. Later on there were cylindrical carriers, the wax cylinders (1887, also from Edison), and at about the same time flat discs, the gramophone records (1887) from Berliner. But these techniques are all based on a kind of engraving with a needle, either from left to right (laterally) or up and down (vertically).
In 1899, Danish inventor Valdemar Poulsen read an article by the American engineer Oberlin Smith, suggesting that one could record sound on a wire or ribbon by applying metal to it and magnetising the particles with an electromagnet. Wire recording was born. It would be the basis for all later magnetic, tape-based information carriers. With that we had carriers for complex audiovisual information and we had a digitally coded, audiovisual storage medium (the street organ cardboard books!), but we did not yet have a combination of both.
Pulse code modulation, a way to record complex analogue sound using digital code, was invented in 1937. But it took until 1967 before this way of coding sound information was combined with magnetisation on a tape. The Japanese broadcaster NHK was the first to store a digital, audiovisual signal on a tape (notably a magnetic tape originally intended for analogue video!). The Japanese did it by magnetising that tape, but without giving the digital information the shape of a computer file. This would remain a way of working in use until the beginning of the next century (Digital Betacam, Digital Audio Tape, ...). Incidentally, magnetisation also became an interesting form of storage for non-audiovisual information. By the 1970s, some types of magnetic tape could already store both audiovisual and non-audiovisual information.
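For readers who want to see what "recording analogue sound using digital code" amounts to, the following Python sketch shows the principle of pulse code modulation: measure the signal at regular intervals and round each measurement to one of a fixed number of levels. The sample rate, bit depth, test tone and function names are assumptions chosen for illustration; they are not the parameters NHK used.

```python
import math


def pcm_encode(signal, sample_rate, duration, bits):
    """Sample `signal` (a function of time, values in [-1, 1]) and quantise it.

    Returns a list of integer codes between 0 and 2**bits - 1.
    """
    levels = 2 ** bits
    count = round(sample_rate * duration)
    codes = []
    for i in range(count):
        t = i / sample_rate                      # moment of this sample
        value = max(-1.0, min(1.0, signal(t)))   # clip to the allowed range
        codes.append(round((value + 1.0) / 2.0 * (levels - 1)))
    return codes


def pcm_decode(codes, bits):
    """Turn the integer codes back into approximate signal values."""
    levels = 2 ** bits
    return [code / (levels - 1) * 2.0 - 1.0 for code in codes]


def tone(t):
    """Hypothetical test signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


codes = pcm_encode(tone, sample_rate=8000, duration=0.01, bits=8)
decoded = pcm_decode(codes, bits=8)
print(len(codes), "samples, first codes:", codes[:5])
```

The decoded values are never exactly the original ones: each differs by at most half a quantisation step. More bits per sample make that step smaller, at the price of more data to store.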
The use of magnetisation to store sound digitally in the seventies and eighties proved that it could achieve a much higher quality of storage. Moreover, this method seemed less susceptible to the loss of signal that affects analogue magnetic tape. Nevertheless, its success remained limited to the professional environment.
At the end of the 1960s, Philips engineers found that the Japanese way of magnetising tapes had three major disadvantages. Firstly, at that point, storing digital information on a magnetic tape to be read out with a reading head was space-consuming: the way of coding was just not compact enough. Too much tape was needed for just a tiny bit of moving image or sound. The Philips people preferred to work optically, with a laser beam. By doing so, one can store more on a small surface and there is also less chance that the reading head will damage the carrier, because there is no physical contact!
Secondly, a tape isn’t considered very practical, because the information is stored in a linear way: you have to wind it to the place where the information you are looking for is stored (which is, by the way, still a disadvantage of LTO tapes, a storage medium very popular today in large audiovisual archives). On a plate or a disk the right place can be reached much faster, because the reading head can move freely in two directions. The combination of the disk and the laser technology led to the optical disks, first the LaserDisc, and shortly afterwards also the CD.
But there is a third disadvantage, one that even the LaserDisc does not really resolve in the first instance: the digital signal is not packaged as files. This means that the precise way in which the digital information is structured still differs between carrier types. One needs not only the right reading equipment, but also knowledge of the structure and the code. A traditional file system such as that of a computer still can't handle it right away. To solve this problem, we’ll have to tell a third story, in addition to storing digital information and storing complex audiovisual information: the story of storing files.
The word ‘file’ has been used in combination with digital information (punch cards) since the 1950s. But the real design of digital information as files as we know it today only came into existence in the 1970s, in parallel with the first floppy disks, on which, for the first time ever, one could store any file format. The main limitation was that the files had to be what we now consider very small, because the capacity barely reached 1 MB!
The storage of digital information, the storage of complex audiovisual information and the storage of files could therefore only come together on the condition that either the audio and video were very small, or the storage media had sufficient capacity. Both movements occurred more or less at the same time. The oldest standardised digital file format for moving images, H.120, dates from 1984. The resolution was 176 by 144 pixels and it ran at 30 frames per second. Playback required the computer to process the images at 2 Mbit per second, which was a huge accomplishment in those days!
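A little arithmetic shows why small images and larger carriers had to arrive together. The Python sketch below uses the figures from the text (176 by 144 pixels, 30 frames per second, 2 Mbit per second, a 1 MB floppy). The 24 bits per pixel for uncoded colour images and the decimal reading of "Mbit" and "MB" are my own assumptions, so the results are indicative only.

```python
def raw_bitrate(width, height, fps, bits_per_pixel):
    """Bits per second needed to store the images without any coding."""
    return width * height * fps * bits_per_pixel


def seconds_on_carrier(capacity_bytes, bitrate_bits_per_second):
    """How many seconds of signal fit on a carrier of the given capacity."""
    return capacity_bytes * 8 / bitrate_bits_per_second


# Figures from the text; 24 bits per pixel is an assumption.
uncoded = raw_bitrate(176, 144, 30, bits_per_pixel=24)
coded = 2_000_000                      # 2 Mbit per second, read as decimal
reduction = uncoded / coded            # how much the coding has to save
floppy_seconds = seconds_on_carrier(1_000_000, coded)  # 1 MB floppy

print(f"uncoded: {uncoded / 1e6:.2f} Mbit/s")
print(f"coding must reduce the data about {reduction:.1f} times")
print(f"a 1 MB floppy holds about {floppy_seconds:.0f} seconds")
```

Under these assumptions, even the coded signal fills a 1 MB floppy in a handful of seconds, which illustrates why moving images only became practical as files once carrier capacity had grown.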
But once that got off the ground, things went fast. The capacity of the storage media grew very quickly, and the capacity and computing power needed to play sharper, clearer and more beautifully coloured images also grew. How to code images in a nicer and also smaller way was established in standards for file formats, which also had to evolve very quickly. It is because of that lightning-fast evolution that, even though the signal is digital and file-based, we may still have difficulty recognising, transferring, reading and playing those files.
The challenges of digital transfer
The oldest audiovisual carriers with a digital signal of which VIAA has migrated the content are LaserDiscs. These disks contain a digitally coded, but non-file-based video signal. To save that signal as a modern file, one must first read it, but then also transcode it: the old code has to be converted into a more recent one (in our case JPEG2000) and packaged as a contemporary file (in our case MXF). At this point the first and most important challenge of the digital transfer becomes evident, and guess what, it is exactly the same as in digitisation sensu stricto: how do we avoid changing the signal, the essence, the images and the sounds?
A second challenge arises if the player allows the signal to be delivered both in an analogue and digital form, as is the case with the DV, DVCAM, and DVCPRO cassettes for example. With analogue playback (via the so-called SDI output), an automatic restoration is applied, which improves the image on the one hand, but also alters it and does not transmit certain metadata. The digital signal output (IEEE 1394 output, also known as FireWire or i.LINK output) delivers the signal as it is, but that includes the very common errors or dropouts, often resulting in very distorted images. The decision about which version(s) will be preserved and what the files should look like has led to heroic debates between ‘purists’ and ‘pragmatists’ (my epithets) in the audiovisual preservation community.
A third challenge is the fact that some carrier types that are able to store digital information are medium independent. Since the mid-eighties we have had digital, complex audiovisual files, which can also be stored on types of information carrier that can contain any file format. Floppies (with sufficient capacity), CD-ROMs, DVDs, USB sticks, XDCAM discs, etc. can store any kind of files, audiovisual or non-audiovisual. Which files are stored on these carriers and which ones we choose to preserve then becomes the big question.
To top it all off, the three challenges above can even arise in combination. After much thinking, my personal conclusion is that digital transfer is the same as digitising analogue information, and yet completely different. And it might come as a disappointment to some, but the one is certainly no simpler than the other.