IUHFC Project, Digitisation

With the selections made, the next step was to prepare the films for online delivery. To decide which format to offer them in (i.e. Flash, .WMV or QuickTime) we first had to agree upon the method of presentation. Would they, for instance, be streamed, downloaded or indeed both? Downloads would benefit the user in terms of accessibility, allowing the films to be viewed offline, but would also require faster connection speeds. Streaming, on the other hand, while limited in this respect and still dictated to some degree by connection speeds, could provide a more even platform shared by users. In the end, however, the deciding factor was copyright. These films are intended purely for educational use, not general distribution, and since streamed video is easier to retain control of than downloads, we opted for Flash, a streaming format supported by most internet browsers.

To convert the films we used Grab Networks’ Anystream software, an audio-visual transcoder. Whether from Hi- or Lo-Band U-Matics, Beta SP or S-VHS (determined in each instance by the selection process), all were transcoded at 800 kilobits per second (kbps). This bit-rate, in line with many online video delivery services (the BBC’s iPlayer, for instance, streams at approximately 750 kbps), was chosen because it provides a manageable file size (the higher the bit-rate, the larger the file and the slower the stream on a given connection) while retaining an acceptable degree of quality.
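As a rough illustration of how bit-rate drives file size, the short Python sketch below multiplies bit-rate by running time. It assumes a constant bit-rate and ignores container overhead; the ten-minute running time and the function name are hypothetical and chosen only for the example, not taken from the project.

```python
def stream_size_megabytes(bitrate_kbps, duration_seconds):
    """Approximate size of a constant-bit-rate file in megabytes (10**6 bytes)."""
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000


# Hypothetical ten-minute film, compared at half and double the chosen rate.
ten_minutes = 10 * 60
for rate in (400, 800, 1600):
    size = stream_size_megabytes(rate, ten_minutes)
    print(f"{rate} kbps for 10 minutes is roughly {size:.0f} MB")
```

Under these assumptions, doubling the bit-rate doubles the amount of data a user's connection has to carry for the same film, which is the trade-off against picture quality described above.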

Once the conversion to Flash had taken place, the resulting files were checked against their ‘parents’ for quality assurance, allowing, of course, for the effects of compression. Steps were then taken to eradicate video noise from the image borders by cropping frames where necessary. This ‘noise’ was a result of transcoding from tape, a by-product of the analogue-to-digital process.

You can find further information about this process, as well as the bit-rates employed, on each film’s page; simply select the ‘Technical Info’ drop-down.

Digitisation of Text

Although essentially only a matter of scanning the booklets, the digitisation of articles published in University Vision and those accompanying the IUHFC films required a similar decision-making process to that of the films.

There was some discussion, for instance, as to whether the documents should be scanned in black and white or colour. It may seem a somewhat superfluous question, since in all cases the documents consist of black text on ‘white’ pages, but once scanned the difference between the two is considerable. What, after all, is white? There are varying degrees thereof. A colour scan gives a realistic colour reproduction, whereas a black and white scan removes all trace of hue and renders as white whatever it considers white. As to which way to go, the arguments for either are equally valid. The colour scan would provide users with an exact reproduction of the original, while the black and white one would provide a contrast that is easier to read. Given that this was not an archive project, in which case a faithful reproduction would clearly be preferable, and since more importance was placed on the content of the document, the decision was made to go the black and white route(1).
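The idea of a scan deciding "what it considers white" can be sketched as a simple threshold. This is an assumption made for illustration only: the scanning software's actual method is not documented here, and the threshold of 128, the sample values and the function name are all hypothetical.

```python
def to_black_and_white(grey_pixels, threshold=128):
    """Map 8-bit grey levels (0 = black, 255 = white) to pure black or white."""
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in grey_pixels
    ]


# A tiny made-up scan: off-white paper (around 230) with dark text (around 40).
scan = [
    [232, 228, 41, 230],
    [229, 38, 35, 231],
]
print(to_black_and_white(scan))
```

Every shade of off-white paper above the threshold becomes the same pure white, which is why the result is easier to read but no longer a faithful record of the page.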

It was important, also, to consider the resolution employed when scanning. The reason for this was two-fold. First, there was the issue of file size. A document scanned at 300dpi, the standard for printed publications, will have a larger file size than one scanned at 200dpi. This impacts directly on the user in that a larger file will naturally take longer to download, something we were keen to avoid. The second consideration, and one made in tandem with the first, was how the document would look when printed. After all, it is all very well producing a small file with print that can be enlarged on-screen, but if the text is unclear when printed it is of little use to the user. For both of these reasons it was agreed that 200dpi provided a happy medium.
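To give a sense of scale, the sketch below counts the pixels captured at each resolution. The A4 page dimensions are an assumption used purely for illustration (the booklets themselves may be a different size), and the size of a compressed file will not track the pixel count exactly.

```python
def pixel_count(width_inches, height_inches, dpi):
    """Number of pixels captured when a page is scanned at the given resolution."""
    return round(width_inches * dpi) * round(height_inches * dpi)


# Assumed page size for the example: A4, roughly 8.27 x 11.69 inches.
A4 = (8.27, 11.69)
low = pixel_count(*A4, 200)
high = pixel_count(*A4, 300)
print(f"200dpi: {low:,} pixels")
print(f"300dpi: {high:,} pixels")
print(f"ratio: {high / low:.2f}")
```

Because resolution applies in both directions, the ratio is (300/200) squared, or 2.25, whatever the page size: a 300dpi scan holds well over twice the raw image data of a 200dpi one.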

There was then the question of ‘searchability’. The project team were keen to provide users with the ability to run a general search on the BUFVC site and, should any of the search terms be included within this documentation, be directed to the relevant page. To achieve this the documents were re-scanned using OCR software. In essence this involves the software extracting text from the scanned document and placing it in a rich text format file. Unfortunately, this process will often discard much of the page formatting, so it was necessary to reconstruct the document as printed(2). This accounts for the discrepancy in page numbers when comparing original and OCR versions (both are provided for download), since a greater amount of text can fit on an A4 sheet.
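Why extracted text matters for search can be shown with a minimal word index. This is a sketch only and not a description of how the BUFVC site search works; the page paths, sample text and function names are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of pages whose text contains it."""
    index = defaultdict(set)
    for page, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page)
    return index


def search(index, query):
    """Return the pages containing every word of the query, in sorted order."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    pages = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(pages)


# Hypothetical OCR output keyed by made-up page paths.
documents = {
    "/articles/example-a": "Notes on film in university teaching",
    "/articles/example-b": "Television and film production notes",
}
index = build_index(documents)
print(search(index, "film notes"))
print(search(index, "television"))
```

A scanned image offers nothing for such an index to read, whereas the text recovered by OCR can be matched word by word and linked back to the page it came from.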

As with the IUHFC films, information related to the software and processes used in the digitisation of text can be found accompanying the relevant documents, either embedded within the University Vision files, or, in the case of film booklets, on the corresponding webpage – simply select the ‘Booklet’ drop-down.

(1) Covers, however, were scanned separately in colour before being added to the final document.
(2) This process was only applied to the University Vision articles, although ideally we would like to apply it to the IUHFC film booklets at a later date.

Frazer Ash
Digital Transfer Manager, Learning on Screen