Using the Lifespan Collection as a case study, this document describes best-practice digitisation and preservation solutions and proposes a workflow for accomplishing these tasks.
The approach we have taken is outlined below under the following headings: Content and Organisation of the Collection, Audio Quality and Transfer, Storage and Access and, finally, Workflow. Each section documents the progress achieved to date and makes explicit the decision-making processes involved.
Content and Organisation of the Collection
The first step in our digitisation strategy was to comprehensively audit and organise the existing audio tape collection and paper records. This task is now complete and has involved the cleaning, ordering and re-housing of the audio tapes in a physically accessible format. The collection is now housed in clearly labelled box files and uses consistent colour-coding for ease of identification, as detailed below.
While undertaking this audit and re-location, we also attended to the documentation of the audio tape collection. This involved ensuring that any conflicts in the identification numbers of cassettes were finally and accurately resolved but, more significantly, resulted in the production of a comprehensive Excel database of holdings. This resource is electronically searchable by research stage, group and generation, and is consistent with the coding of paper records and additional research material. It has also allowed us to calculate that the Lifespan Collection holds 2165 tapes (with 39 currently designated ‘missing’). As the Collection included audio tapes of both 90 and 120 minutes, it was necessary to record the tape length in every case in order to accurately calculate the play time of the collection as a whole and in sections. The total was assessed as being 3,404.5 hours of recorded interviews.
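The play-time calculation described above can be illustrated with a minimal sketch. It assumes the holdings database has been exported as rows carrying a nominal tape length in minutes; the field names (tape_id, stage, length_min) and the sample rows are hypothetical and are not taken from the actual spreadsheet.

```python
from collections import defaultdict

# Hypothetical rows, as they might be exported from the holdings spreadsheet.
# Field names and values are illustrative only.
tapes = [
    {"tape_id": "A001", "stage": "Stage 1", "length_min": 90},
    {"tape_id": "A002", "stage": "Stage 1", "length_min": 120},
    {"tape_id": "B001", "stage": "Stage 2", "length_min": 90},
]


def play_time_hours(rows):
    """Total nominal play time of the given tapes, in hours."""
    return sum(row["length_min"] for row in rows) / 60


def play_time_by_section(rows, key="stage"):
    """Nominal play time in hours, grouped by one column (e.g. research stage)."""
    minutes = defaultdict(int)
    for row in rows:
        minutes[row[key]] += row["length_min"]
    return {section: total / 60 for section, total in minutes.items()}


print(play_time_hours(tapes))        # 5.0
print(play_time_by_section(tapes))   # {'Stage 1': 3.5, 'Stage 2': 1.5}
```

Summing nominal tape lengths gives the maximum possible play time; a tape that is only partly recorded would contribute less than its nominal length.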
With this audit, organisation and documentation of the collection complete, we could then consider the technical requirements of audio quality and transfer.
Audio Quality and Transfer
Initially, advice was sought on the different transfer options available to us and, in particular, the variables of sound quality, file size and potential cost. After listening to several test transfers kindly provided by Andrew Hallifax, it was decided that flat transfer would be the preferred method. Once this was decided, we contacted five contractors recommended by the British Library Sound Archive, seeking quotations for undertaking this work. From these five quotations, which included a variety of transfer rates, formats and services, we established that the full digitisation of the collection would cost around £70,000.
The International Association of Sound and Audiovisual Archives suggests a transfer rate of at least 24 bit at a sampling frequency of either 44.1 or 96 kHz; this view is shared by the British Library, which generally suggests a transfer rate of 96/24 in terms of audio quality alone. However, this rate would take up considerably more storage space and would therefore be more costly. The 96/24 rate and newer technologies such as Super Audio offer far better sound quality, but given that the collection offers greater scope for studies of semantic content than of, for instance, linguistic content, conventional CD quality (44.1/16 rate) seemed a more reasonable solution in terms of space management and cost.
An offer by the Bristol-based company ‘All you need is ears’ has been provisionally accepted on the basis of their realistic and detailed explanation of the process (including cheaper options), their offer to provide an engineer to train people at Royal Holloway in adequate storage, and their proposal to test transfer a sample of the collection for free. The transferred test samples arrived promptly as WAV files at both the 44.1/16 and 96/24 rates, as well as in MP3. The transferred samples are now on DVD, fully labelled with logos and content, and include some metadata. We have since kept ‘All you need is ears’ abreast of our progress, but the digital transfer of the audio collection still awaits designated funding. Lifespan Research Group is currently working on possible bids to JISC, The Wellcome Trust and the ESRC for this purpose.
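The space implications of the two rates can be estimated with a short calculation. The sketch below assumes uncompressed two-channel (stereo) PCM audio and ignores WAV header overhead; the channel count is an assumption, as it is not stated here, and the 3,404.5 hours figure is the collection total reported above.

```python
SECONDS_PER_HOUR = 3600
COLLECTION_HOURS = 3404.5  # total play time reported for the collection


def pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels=2):
    """Bytes needed for one hour of uncompressed PCM audio."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * SECONDS_PER_HOUR


def collection_size_tb(hours, sample_rate_hz, bit_depth, channels=2):
    """Approximate storage requirement in terabytes (1 TB = 10**12 bytes)."""
    return hours * pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels) / 1e12


size_cd_tb = collection_size_tb(COLLECTION_HOURS, 44_100, 16)  # 44.1/16 rate
size_hi_tb = collection_size_tb(COLLECTION_HOURS, 96_000, 24)  # 96/24 rate

print(f"44.1/16: {size_cd_tb:.2f} TB")  # 44.1/16: 2.16 TB
print(f"96/24:   {size_hi_tb:.2f} TB")  # 96/24:   7.06 TB
```

Under these assumptions the 96/24 rate needs a little over three times the space of the 44.1/16 rate, and the 44.1/16 estimate is in line with the figure of c. 2TB quoted under Storage and Access.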
Storage and Access
All companies offering digital transfer services provide MP3 files in addition to the WAV files, free of charge. MP3 files take up little storage space and can easily be uploaded to the internet. Because of their small size, these files would be easy to access and manipulate. WAV files would provide good-quality sources for preservation and back-up, and for future transfer or access opportunities.
Royal Holloway, University of London, offers a digital storage infrastructure (Storage Area Network), although as yet it does not offer enough storage space to accommodate the extensive Lifespan Collection (c. 2TB). Royal Holloway has also invested in software (Equella) for a Digital Object Repository (DOR), within which it is envisaged that research data generated by academic departments will be stored and, subsequently, accessed. The Lifespan Collection is involved in the JISC-funded Lifespan RADAR project, an initiative to explore the particular needs of research data collections within the DOR framework. Ideally, the College in which this pioneering research took place, and in which this unique data set was collected, should provide the home for its digital manifestation. From Royal Holloway’s perspective, the case in favour of the Lifespan Collection being lodged with the home institution is further strengthened when the requirements of the Research Excellence Framework and the research future of the College are taken into consideration. The Lifespan Research Group is therefore currently in discussion with the IT Department and the Library about future collaboration over the Lifespan Collection, but has also explored possibilities with Qualidata (University of Essex), Timescapes (Leeds University) and CRESC (University of Manchester).
While the audio recordings are currently of good sound quality, the cassettes are reaching the end of their life expectancy and need to be preserved as a matter of urgency. A mid-term solution is to rewind the tapes in order to keep them flexible. But this, again, would be a time-consuming, and therefore costly, endeavour.
In addition, we need to decide what external metadata we will link to each file. Work on this front is being undertaken and will draw on existing and complementary research data, which exists in the form of word-processed transcripts, interview schedules, training manuals, questionnaires and quantitative data. This data, alongside developed training materials, will be exploited as a referencing system and used for contextualising the original concepts and scales.
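One possible way of linking external metadata to each digitised tape is a small descriptive record stored alongside the audio files. The sketch below is illustrative only: the record layout, the field names, the file paths and the sample values are all hypothetical, although research stage, group and generation mirror the search fields of the holdings database described earlier.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TapeRecord:
    """Hypothetical metadata record linking one tape to its files and documents."""
    tape_id: str
    research_stage: str
    group: str
    generation: str
    tape_length_min: int
    audio_files: list = field(default_factory=list)        # e.g. WAV master, MP3 access copy
    related_documents: list = field(default_factory=list)  # e.g. transcript, interview schedule


# Illustrative values only; none of these identifiers come from the collection.
record = TapeRecord(
    tape_id="T0001",
    research_stage="Stage 1",
    group="Group A",
    generation="Generation 1",
    tape_length_min=90,
    audio_files=["T0001.wav", "T0001.mp3"],
    related_documents=["T0001_transcript.doc"],
)

# Serialise the record so it can be stored next to the audio files.
sidecar = json.dumps(asdict(record), indent=2)
print(sidecar)
```

Keeping the record in a plain, widely readable format such as JSON is a design choice made for this sketch; any repository software eventually adopted may impose its own metadata schema.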
Workflow
We are currently seeking funding sources to go ahead with an initial phase of transfer and digitisation with ‘All you need is ears’. This would consist of a 15-month programme, in which packages of around 80 tapes would be delivered ready for transfer once a month for the duration of the programme. As each section is completed, tapes would be returned to Lifespan Research Group separately from the finished digitised files, for reasons of security. This first phase of the process would aim to digitise around 1000 tapes, which, against a total holding of 2165 tapes, represents roughly 46% of the whole collection. Costings have been calculated for the temporary storage devices, the courier service and the insurance against loss needed to fulfil this programme. If this first digitisation phase goes ahead, it would provide the team not only with a sizeable section of the collection digitised, but also with valuable experience of the transfer process and the opportunity to evaluate and adjust practice to meet needs effectively.
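The monthly delivery schedule can be sketched as a simple batching step. The example below assumes a fixed package size of exactly 80 tapes and uses placeholder tape identifiers; the function names and identifiers are hypothetical, and in practice package sizes are described only as "around 80".

```python
def programme_capacity(months, package_size):
    """Maximum number of tapes that can be sent at one package per month."""
    return months * package_size


def monthly_packages(tape_ids, package_size=80):
    """Split an ordered list of tape identifiers into consecutive packages."""
    return [
        tape_ids[start:start + package_size]
        for start in range(0, len(tape_ids), package_size)
    ]


# Placeholder identifiers for a first phase of 1000 tapes.
tape_ids = [f"T{number:04d}" for number in range(1, 1001)]
packages = monthly_packages(tape_ids, package_size=80)

print(programme_capacity(15, 80))  # 1200
print(len(packages))               # 13
print(len(packages[-1]))           # 40
```

On these assumptions, 15 monthly packages of 80 give capacity for up to 1200 tapes, so a target of around 1000 tapes fits within the programme even if some packages are smaller than 80.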