Lifespan RADAR: final report

March 26th, 2010

Lifespan RADAR is coming to an end. While we still have a few tasks pending, mainly related to the dissemination of our efforts through presentations and peer-reviewed publications, we will be submitting our final report before the end of the month. Please download it here.

Thanks to all our collaborators:
Philip Johnson
Andrew Hallifax
Stephen Greaves and Jenny Grigg at All you need is ears
Libby Bishop at Timescapes
Mike Savage and Niamh Moore at CRESC

Metadata schema: research item ontology

March 26th, 2010

The RHUL Digital Object Repository was launched at the end of last calendar year with a collection of RHUL’s research publications and one of past exam papers. The need to host collections to store primary research data presented a new requirement for the repository.

Our approach was to reach a common understanding of the ontology of the Lifespan Collection (LC) and of the tools available to the DOR. The initial draft schema was still too oriented towards the specific Lifespan ontology to be re-usable as a base schema for other collections. We therefore began to think of the object as a generic research item with curation and preservation needs, and used existing knowledge, along with the research done as part of the Lifespan project, to think about how a new schema for a research item might work. In developing the schema it was necessary to balance conforming to standards and adopting best practice with the implementation constraints posed by our specific repository tool and user requirements. The resulting draft schema will therefore not be a pure implementation of any one approach but will adopt aspects drawn from research information management, research publication and preservation models to create a schema for a repository collection that is both interactive and accessible, and aware of the ethical and preservation issues around such a collection. We take an iterative approach to schema development, so this initial draft schema will be refined throughout the implementation cycles, starting with a small proof-of-concept implementation.

Once we reached a common understanding of the LC and the tools available to the DOR we could proceed to create a schema that was extensible and re-usable beyond the initial Lifespan RADAR use case. The DOR has a second proposed research data collection that can be used to test how re-usable the schema is. Additionally the college Open Access Publications Policy (OAPP) will require us from 1 September 2010 to store datasets alongside publications if required by research funders, and the experience of developing the LC will assist with this.

Alison Pope has prepared this document with more details.

Metadata schema: plan of action

December 22nd, 2009

Lifespan Research Group
17 December 2009

Last week we met with Alison Pope, Royal Holloway’s Business Analyst and our project partner, to discuss the construction of a metadata schema for the Lifespan Collection and the storage of the data set in the University’s Digital Object Repository (DOR). By bringing together Alison Pope’s specialist expertise in digital curation and preservation services with Lifespan Research Group’s understanding of the needs of the collection, a plan of action was devised. The multi-faceted nature of the collection means that we will approach the process of designing a schema by piloting a very small sample of the archive.

In the first instance, we had been concerned with how researchers from diverse disciplines and with widely varying research questions would be able to access our collection. Royal Holloway’s DOR works with Equella, a flexible platform that encompasses the possibility of creating tailor-made metadata schemas for each research collection. In order to create a schema suited to the needs of our collection, Alison Pope explained convincingly that the only way to build versatility at a macro level was to work closely and practically with individual examples, uncovering and solving difficulties as they arise.

She suggested that we choose a sample of six individuals from our collection of over 500 interviewees to explore their metadata needs. It was further decided that we should choose four individuals who are related to one another, in order to illustrate the connections within this large data set, and choose two others who are entirely unrelated. We should also make sure that our small sample of six interviewees includes examples from different sections of the collection (or phases of the research) because, whilst there is a great deal of continuity across the collection as a whole, there are some differences between, for example, the first set of interviewees approached and the last.

By testing this small section in practice, we hope to be able to identify the requirements of the collection as a whole. This task will be particularly interesting because it will shift our focus from thinking about the collection as an encompassing whole, toward working iteratively with a very small set of examples to explore the challenges on a case-by-case basis. We plan to meet Alison Pope again at the start of the year to gain feedback and develop the usability of our case sample.

Research, Access and Consent

December 4th, 2009

Study day @ Lifespan Research Group
18th November 2009

The Lifespan Research Group is funded by the JISC Repositories Start-up Project to explore the preservation and access of digitised research data for future research use. A significant part of this project involves the examination of the ethical and legal issues surrounding the access of the collection for new research purposes. One of the main challenges faced by the Lifespan Collection is the sensitive nature of the interviews, which relate personal experience, including mental health issues. Another interesting challenge, encountered by many other similar collections, is the retrospective nature of this attempt. The research was carried out at a time when archiving and secondary analysis were not yet common practice. Thus, it is important that a strategy is developed that respects both the original consent forms and ethical permissions and the possibility of future use. The purpose of the study day was therefore to collate relevant expertise in this area to inform future research procedures and to work towards an authoritative statement on the ethical position of re-use. This statement should also have relevance for other data collections facing similar issues.

The day began with an introduction by Professor Antonia Bifulco, Director of the Lifespan Research Group and Principal Investigator of the JISC funded Lifespan RADAR Project. Toni has had a leading role in conducting the research projects and therefore has an over-arching view on the collection and the research output. Leonie Hannan and Ananay Aguilar, both Research Assistants at Lifespan, introduced the physical content of the Lifespan Collection in more detail, reflecting on the possible ethical and legal issues that could arise from external access. With these considerations, the Study Day started in full.

Dr Philip Johnson contributed a detailed examination of the legal issues involved, breaking down relevant parts of copyright legislation, the Data Protection Act and the Human Rights Act. Dr Libby Bishop sought to move the debate from a position focusing solely on ethical treatment of subjects of research toward a more inclusive understanding of the research process. She argued that all participants have to be considered in a way that takes into account transparency and, ultimately, the social responsibility of research. In a similar vein, Dr Niamh Moore drew attention to the ethical issues concerning the collection and primary use of research data. Fundamental to their discussion was the dimension of time. The peer-reviewed output of research is designed to withstand time, but the often fragmentary and uncertain nature of the research process itself offers unforeseeable challenges, and remains more vulnerable to misinterpretation. On the other hand, as Libby pointed out, it should be the researcher’s responsibility to account for the process as well as the output. Both depend on each other and should be open to scrutiny in the same way.

Dr Graham Smith, Senior Lecturer in History and Co-Principal Investigator of Lifespan RADAR, chaired a session in which different researchers explored the potential of the Lifespan Collection in their fields. Smith used the Lifespan Collection in his oral-history-based research on food in England. Dr Helen Fisher, from the field of clinical psychology, used some of the quantitative data to compare it with measures taken as part of her previous research. Julia Feast and Margaret Grant from the British Association for Adoption and Fostering have used the methods developed from Lifespan’s research for their own research on a case study in trans-national adoption. The different uses given to Lifespan research’s output (qualitative data, quantitative data and method) illustrate not only the potential of the collection, but also the many factors that ought to be considered for a comprehensive ethical and legal strategy.

The Study Day showed that considerations of ethical and legal issues surrounding the secondary use of research data should attend not only to the subjects of research, but also to the researchers themselves. The variety of research themes and approaches offered by the collection further stressed the impossibility of designing a holistic strategy that accounts for the past and the future of this data. With this in mind, we discussed the potential for engaging with the community on the decision regarding future use. While Libby demonstrated how such a strategy can often lead to success, it may be costly or represent a research project in itself. A parallel strategy is to provide, on the one hand, levels of access for differently trained researchers and, on the other, sufficient contextual information for this potential audience.

With these themes in mind, we hope to produce in the near future a strategy for our own collection, together with recommendations for collections and archives with similar concerns.

Study day on 18 November at Lifespan

October 26th, 2009

Research, Access and Consent

The ethical and legal implications of archiving and secondary analysis in social-scientific research

We are hosting a study day to explore the ethical and legal issues in secondary research. The day will address themes surrounding the preservation and access management of sensitive life histories. The Lifespan Collection, with its valuable interviews and extracted data, offers a case in point as it will be the first of its kind at Royal Holloway, University of London. With specialists in the field, including experienced archivists, legal advisers and researchers, we will explore possible interpretations of the existing legislative and ethical frameworks affecting research.

Speakers to include:
Dr Libby Bishop (University of Leeds & UKDA, University of Essex)
Dr Niamh Moore (University of Manchester)
Dr Philip Johnson (Royal Holloway, University of London)
Dr Graham Smith (Royal Holloway, University of London)
Dr Helen Fisher (Institute of Psychiatry & King’s College, London)

The Study Day will be held on
Wednesday, 18 November from 10.00-16.30
at Royal Holloway’s central London campus: 11, Bedford Square

Refreshments and lunch will be provided.
Further information will be provided in due course.

To reserve a place contact: Lifespan@rhul.ac.uk
Places are limited and will be filled, free of charge, on a first come, first served basis

Lifespan Research Group
Royal Holloway, University of London
11 Bedford Square, London WC1B 3RF
(Corner of Gower St and Montague Place – nearest tubes Russell Sq and Tottenham Court Road)

JISC projects start-up meeting

July 27th, 2009

Information Environment and VRE programmes start-up meeting
JISC
7-8 July 2009

This event provided a lively opportunity to become acquainted with other JISC-funded projects and their people. Building on JISC’s collaborative nature, the programme managers offered the tools and atmosphere for successful networking amongst participants. The workshop for first time project managers focused on the importance of community building exercises in order to share solutions, build on similar experience and strengthen the overall structure for future projects. The drinks and dinner that followed further reinforced JISC’s aim.

Unfortunately, we couldn’t attend the second day of activities, but were happy to meet some of the participants and discuss similar experiences in developing a research data collection. We hope to cultivate these contacts and make use of the community’s experience.

Our digitisation strategy

July 7th, 2009

Using the Lifespan Collection as a case study, the intention of this document is to describe best-practice digitisation and preservation solutions and propose a workflow in order to accomplish these tasks.

The approach we have taken is outlined below under the following headings: Content and Organisation of the Collection, Audio Quality and Transfer, Storage and Access and, finally, Workflow. Each section documents the progress that has been achieved to date, making explicit the decision-making processes involved.

Content and Organisation of the Collection

The first step in our digitisation strategy was to comprehensively audit and organise the existing audio tape collection and paper records. This task is now complete and has involved the cleaning, ordering and re-housing of the audio tapes in a physically accessible format. The collection is now housed in clearly labelled box files and uses consistent colour-coding for ease of identification, as detailed below.

During the time that we were undertaking this audit and re-location, we also attended to the documentation of the audio tape collection. This involved ensuring that any conflicts in identification numbers of cassettes were finally and accurately resolved but, more significantly, resulted in the production of a comprehensive Excel database of holdings. This resource is electronically searchable by research stage, group and generation, and is consistent with the coding of paper records and additional research material. It has also allowed us to calculate that the Lifespan Collection holds 2165 tapes (with 39 currently designated ‘missing’). As the Collection included audio tapes of both 90 and 120 minutes, it was necessary to record the tape length in every case in order to accurately calculate the play time of the collection as a whole and in sections. The total was assessed as being 3,404.5 hours of recorded interviews.
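The play-time calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the tape counts are hypothetical (the actual split between 90- and 120-minute tapes is not given here), and it assumes each tape is counted at its nominal length.

```python
def total_play_hours(tape_counts):
    """Sum nominal play time in hours.

    tape_counts maps a tape length in minutes to the number of
    tapes of that length (hypothetical structure for illustration).
    """
    minutes = sum(length * count for length, count in tape_counts.items())
    return minutes / 60


# Illustrative counts only, not the real holdings of the collection.
example_counts = {90: 3, 120: 1}
example_hours = total_play_hours(example_counts)
print(f"{example_hours} hours")
```

The same function could be applied per research stage or group by passing in the counts for that section alone.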

With this audit, organisation and documentation of the collection complete, we could then consider the technical requirements of audio quality and transfer.

Audio Quality and Transfer

Initially, advice was sought on the different transfer options available to us and, in particular, the variables of sound quality, file size and potential cost. After listening to several test transfers provided kindly by Andrew Hallifax, it was decided that flat transfer would be the preferred method. Once this was decided, we contacted five contractors recommended by the British Library Sound Archive seeking quotations for undertaking this work. From these five quotations, which included a variety of transfer rates, formats and services, we found out that the full digitisation of the collection would cost around £70,000.

The International Association of Sound and Audiovisual Archives suggests a transfer resolution of at least 24 bit at a sampling frequency of either 44.1 or 96 kHz; this view is shared by the British Library, which generally suggests a transfer rate of 96/24 (96 kHz, 24 bit) in terms of audio quality alone. However, this rate would take up considerably more space and therefore result in a more costly endeavour. The 96/24 rate and other newer technologies like Super Audio offer far better sound quality, but considering the greater scope for studies involving semantic content (rather than, for instance, linguistic content), conventional CD quality (44.1/16 rate) seemed a more reasonable solution in terms of space management and costing.
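To give a rough sense of the space trade-off between the two rates, the sketch below estimates the size of the whole collection as uncompressed PCM audio, using the audited total of 3,404.5 hours. It assumes two-channel (stereo) WAV files with no compression and ignores file headers; the channel count is an assumption, not a figure from the quotations.

```python
def pcm_bytes(hours, sample_rate_hz, bit_depth, channels=2):
    """Approximate size in bytes of uncompressed PCM audio.

    channels=2 (stereo) is an assumption made for this estimate.
    """
    seconds = hours * 3600
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


TOTAL_HOURS = 3404.5  # total play time from the collection audit

cd_quality = pcm_bytes(TOTAL_HOURS, 44_100, 16)  # 44.1/16 rate
high_res = pcm_bytes(TOTAL_HOURS, 96_000, 24)    # 96/24 rate

print(f"44.1/16: {cd_quality / 1e12:.2f} TB")
print(f"96/24:   {high_res / 1e12:.2f} TB")
print(f"ratio:   {high_res / cd_quality:.2f}x")
```

Under these assumptions the 44.1/16 transfer comes to roughly 2.2 TB, in line with the c. 2TB figure given under Storage and Access, while the 96/24 transfer would need a little over three times as much space.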

An offer by the Bristol-based company ‘All you need is ears’ has been provisionally accepted on the basis of their realistic and detailed explanation of the process (including cheaper options), their offer of providing an engineer to train people at Royal Holloway for adequate storage, and their proposal to test transfer a sample of the collection for free. The transferred test samples arrived promptly in both 44.1/16 rate and 96/24 rate WAV files as well as in MP3. The transferred samples are now on DVD, fully labelled with logos and content, and include some metadata. We have since kept ‘All you need is ears’ abreast of our progress, but the digital transfer of the audio collection still awaits designated funding. Lifespan Research Group is currently working on possible bids to JISC, The Wellcome Trust and the ESRC for this purpose.

Storage and Access

All companies offering digital transfer services provide MP3 files in addition to the WAV files, free of charge. MP3 files are light in terms of storage space and offer the possibility of easy upload to the internet. Due to their size, these files would offer easy access and manipulability. WAV files would initially provide good quality sources for preservation and back-up, and future use for transfer or access opportunities.

Royal Holloway, University of London, offers a digital storage infrastructure (Storage Area Network), although as yet it does not offer enough storage space to accommodate the extensive Lifespan Collection (c. 2TB). Royal Holloway has also invested in software (Equella) for a Digital Object Repository, within which it is envisaged that research data generated by academic departments ought to be stored and, subsequently, accessed. The Lifespan Collection has engaged in the JISC-funded Lifespan RADAR project to provide an initiative to explore the particular needs of research data collections within the DOR framework. Ideally, the College in which this pioneering research took place, and this unique data set was collected, should provide the home for its digital manifestation. From Royal Holloway’s perspective, the case in favour of the Lifespan Collection being lodged with the home institution is further strengthened when the Research Excellence Framework’s requirements and the research future of the college are taken into consideration. The Lifespan Research Group is therefore currently in discussion with the IT Department and the Library about future collaboration over the Lifespan Collection, but has also explored possibilities with Qualidata (University of Essex), Timescapes (Leeds University) and CRESC (University of Manchester).

While the audio recordings are currently of a good sound quality, the cassettes are reaching the end of their life expectancy and need to be preserved as a matter of urgency. A mid-term solution, in order to keep the tapes flexible, is to rewind the tapes. But this, again, would be a time-consuming, and therefore costly, endeavour.

In addition, we need to decide what external metadata we will link to each file. Work on this front is being undertaken and will involve existing and complementary research data, which exists in the form of word-processed transcripts, interview schedules, training manuals, questionnaires and quantitative data. This data, alongside developed training materials, will be exploited as a referencing system and used for contextualising the original concepts and scales.

Workflow

We are currently seeking funding sources to go ahead with an initial phase of transfer and digitisation with ‘All you need is ears’. This would consist of a 15 month programme, in which packages of around 80 tapes would be delivered ready for transfer once a month, every month, for that period of time. As each section is completed, tapes would be returned to Lifespan Research Group, separately from finished digitised files – for reasons of security. This first phase of the process would aim to digitise around 1000 tapes, which represents 40% of the whole collection. Costings have been calculated accurately for the temporary storage devices, the courier service and the insurance against loss needed to fulfil this programme. If this first digitisation phase goes ahead, it would provide the team not only with a sizeable section of the collection digitised, but also with valuable experience of the transfer process and the opportunity to evaluate and adjust practice to effectively meet needs.

How mandates work in practice

June 2nd, 2009

Research in the open: How mandates work in practice
Repositories Support Project
29 May 2009

In order to inform the project on current mandates in relation to open access policies, we attended this one-day conference organised by the Repositories Support Project and the Research Information Network. The event provided an opportunity to discuss the ways mandates are working, the policies and processes to improve them, the ways mandates can become embedded within the research cycle, and their current and future impact.

Speakers included representatives from UK research funders such as the Wellcome Trust and ESRC, as well as from HEFCE itself. These were complemented by informative presentations from Charles Oppenheim of Loughborough University, who spoke about the recent ‘Houghton Report’ and the costs of implementing open access policies, and Julia Wallace, who presented the PEER project pioneering collaboration between publishers, repositories and researchers. Paul Ayris of University College London announced and spoke about his institution’s forthcoming policies. Publishers, repository managers, learned societies and academic researchers were also present, offering rich discussions from multiple perspectives.

Open access policies are being enforced by Research Councils and other funding institutions to ensure greater distribution of research output. While a few universities, like UCL, are at the forefront of open access implementation, others are still outlining their corporate strategies to keep pace with these policies. Being informed of the different strategies employed not only by other repositories but, most importantly, by funding bodies is useful to the Lifespan Collection’s future projects. In this sense, we aim to bring the Repositories Support Project’s experience into play within our institution.

The Lifespan Collection Seminar in Manchester

June 2nd, 2009

Data on three generations of London families, their psychosocial experience and mental health: Exploring its preservation and re-use
CRESC, Manchester University
27 May 2009

The Lifespan Collection was presented by Antonia Bifulco, Graham Smith, Ananay Aguilar and Leonie Hannan (Department of Health and Social Care, Royal Holloway, University of London). The collection comprises a data series collected during two MRC programme grants 1980-1990 examining psychosocial risk factors for psychological disorder in families in London. It consists of the life stories of over 500 individuals, spanning three generations (from 16-85 years old), with interviews captured on audio-tape, the narrative summarised on formal schedules and an electronic data set. Content includes, but is not limited to, childhood and adult experience of adversity, lifetime psychological disorder using clinical interviews, assessment of marital relationships, support, attachment style and self-esteem. The information is collected with standardised investigator-based interview tools, with training materials and manuals.

The aim of the presentation was to outline the collection and discuss its potential for preservation and external access. The Repository Start-up Project Lifespan RADAR funded by JISC was outlined in terms of its aims and objectives and past and future projects. Aiming to offer the collection for alternative research, the presentation included a case study in using the collection for research on food and eating practices.

Many thanks to the organisers at the ESRC Centre for Research on Socio-Cultural Change (CRESC) at the University of Manchester. The session was chaired by Prof Mike Savage, Director of CRESC.

Two events

June 2nd, 2009

Digital economies and the politics of circulation
Columbia University, New York
3-4 April 2009

On 3 and 4 April Ananay Aguilar attended this conference organised by Ana Maria Ochoa and Claudio Lomnitz at the Center for Ethnomusicology and the Center for the Study of Ethnicity and Race, respectively, at Columbia University in New York. As the title suggests, the conference offered an interdisciplinary site for discussing the politics surrounding emerging economies of production, preservation and circulation of digital material. Central to the discussions were themes related to the production of memory and the circulation of heritage material as sources for the construction and reformulation of local identities, the creation of alternative production and consumption models and the intervening legislations. Some case studies showed how local and global legislations, rather than keeping pace with emerging technologies and tools, are tied to traditional enterprises. The theme of intellectual property was used as a springboard to foreground the increasing incommensurability between the local, national and global politics of diversity, and governmentality.

Of particular interest to Lifespan’s Collection was Lawrence Liang’s (Alternative Law Forum, Bangalore, India) presentation. He raised the discussion in relation to the responsibility that archiving entails: responsibility to preserve, to make adequate use and to share. This is tied up with the creation of appropriate legislative tools: the responsibility of scholars is linked not only to creating meaning and inducing change, but to measuring the reach of single tools in reformulating these practices and originating others.

Archiving and Reusing Qualitative Data: Theory, Method and Ethics across Disciplines
CRESC, Manchester University
19-20 March 2009

Looking to embed the Lifespan RADAR project (and the longer-term development of the Lifespan Collection) firmly in the discourses of the cross-disciplinary field of archiving and qualitative data re-use, we attended the CRESC Conference at the University of Manchester. With theoretical, methodological and ethical issues at the heart of the discussion, the conference provided an expansive forum for participants to consider the shifting dialogues between users and archivists in the Web 2.0 era. From this standpoint, the notion of ‘Archives 2.0’ has emerged, which embraces the forward-looking criteria of Web 2.0 by working toward a fuller realisation of the potential of the web as a platform for archival collections and research resources. More specifically, it has brought to the table a range of services that are freely available and can be utilised by the smallest of institutions on the lowest of budgets. As might be expected, a very wide variety of papers were heard, from collection-specific case studies to theoretically-sourced thoughts on the construction of the finding-aid in the age of ‘Archives 2.0’.

Examples were provided of how Web 2.0 has been creatively used by recent digitisation projects, as a portal for newly digitised material and e-learning facilities. But this raised questions about the role of libraries and archives in the era of online digitised resources and the challenge of born-digital data. Derek Law, of the University of Strathclyde, accused the sector of neglecting the future in favour of safeguarding the past and challenged professionals to come to terms with the digital era.

Attending this conference and participating in its discussions has helped us to situate the Lifespan Collection project within the broader context of up-to-date archival standards and within an interdisciplinary theoretical framework. It also provided an excellent opportunity to get to know academics and archivists at other institutions working toward similar goals.