The digital curation lifecycle
Digital curation and data preservation are ongoing processes, requiring considerable thought and the investment of adequate time and resources. You must be aware of, and undertake, actions to promote curation and preservation throughout the data lifecycle.
The digital curation lifecycle comprises the following steps:
Conceptualise: conceive and plan the creation of digital objects, including data capture methods and storage options.
Create: produce digital objects and assign administrative, descriptive, structural and technical archival metadata.
Access and use: ensure that designated users can easily access digital objects on a day-to-day basis. Some digital objects may be publicly available, whilst others may be password protected.
Appraise and select: evaluate digital objects and select those requiring long-term curation and preservation. Adhere to documented guidance, policies and legal requirements.
Dispose: rid systems of digital objects not selected for long-term curation and preservation. Documented guidance, policies and legal requirements may require the secure destruction of these objects.
Ingest: transfer digital objects to an archive, trusted digital repository, data centre or similar, again adhering to documented guidance, policies and legal requirements.
Preservation action: undertake actions to ensure the long-term preservation of digital objects and to retain their authoritative nature.
Reappraise: return digital objects that fail validation procedures for further appraisal and reselection.
Store: keep the data in a secure manner as outlined by relevant standards.
Access and reuse: ensure that data are accessible to designated users for first-time use and reuse. Some material may be publicly available, whilst other data may be password protected.
Transform: create new digital objects from the original, for example, by migration into a different form.
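The steps above can be read as a small state machine. The following is a minimal sketch in Python, not a definitive model: the names `Stage`, `ALLOWED` and `DigitalObject` are hypothetical, and the permitted transitions are an assumption drawn from the step descriptions (appraisal leads to either ingest or disposal, a failed validation during preservation action leads to reappraisal, and a transformed object re-enters the lifecycle at creation).

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    CONCEPTUALISE = "conceptualise"
    CREATE = "create"
    ACCESS_AND_USE = "access and use"
    APPRAISE_AND_SELECT = "appraise and select"
    DISPOSE = "dispose"
    INGEST = "ingest"
    PRESERVATION_ACTION = "preservation action"
    REAPPRAISE = "reappraise"
    STORE = "store"
    ACCESS_AND_REUSE = "access and reuse"
    TRANSFORM = "transform"


# Assumed transitions between stages; adjust to local guidance and policy.
ALLOWED = {
    Stage.CONCEPTUALISE: {Stage.CREATE},
    Stage.CREATE: {Stage.ACCESS_AND_USE},
    Stage.ACCESS_AND_USE: {Stage.APPRAISE_AND_SELECT},
    Stage.APPRAISE_AND_SELECT: {Stage.INGEST, Stage.DISPOSE},
    Stage.DISPOSE: set(),  # terminal: the object leaves the system
    Stage.INGEST: {Stage.PRESERVATION_ACTION},
    Stage.PRESERVATION_ACTION: {Stage.STORE, Stage.REAPPRAISE},
    Stage.REAPPRAISE: {Stage.APPRAISE_AND_SELECT},
    Stage.STORE: {Stage.ACCESS_AND_REUSE},
    Stage.ACCESS_AND_REUSE: {Stage.TRANSFORM},
    Stage.TRANSFORM: {Stage.CREATE},  # assumed: the new object starts again
}


def empty_metadata() -> dict:
    # The four metadata categories named in the "Create" step.
    return {
        "administrative": {},
        "descriptive": {},
        "structural": {},
        "technical": {},
    }


@dataclass
class DigitalObject:
    identifier: str
    stage: Stage = Stage.CONCEPTUALISE
    metadata: dict = field(default_factory=empty_metadata)
    history: list = field(default_factory=list)

    def advance(self, next_stage: Stage) -> None:
        """Move to next_stage, or raise ValueError if the move is not allowed."""
        if next_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"{self.stage.value} -> {next_stage.value} is not allowed"
            )
        self.history.append(self.stage)
        self.stage = next_stage


if __name__ == "__main__":
    obj = DigitalObject("example-object")  # hypothetical identifier
    for step in (
        Stage.CREATE,
        Stage.ACCESS_AND_USE,
        Stage.APPRAISE_AND_SELECT,
        Stage.INGEST,
        Stage.PRESERVATION_ACTION,
        Stage.REAPPRAISE,  # the object failed validation
    ):
        obj.advance(step)
    print(obj.stage.value, [s.value for s in obj.history])
```

Keeping the transitions in a single table makes the policy easy to audit and change, and recording the history of stages gives a simple provenance trail for each object.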
DMPonline is a flexible web-based tool that helps users create personalised data management plans according to their context or research funder.
A service and infrastructure for computationally intensive learning analytics, and in particular for generating reports that mix static text with live data visualisations.
The DataCite Consortium provides a number of services to support efforts at increasing the ease and prevalence of data citation.
DataStage is a flexible data storage system that provides controlled access, secure backup, and the ability to transfer selected files to a more permanent archiving facility.
DataUp is a tool that assists researchers in reviewing, documenting, sharing and archiving their tabular data, especially Microsoft Excel spreadsheets.
DataVerse Network software allows organisations to host a storage and access system for research materials.
DMPTool is an online service that enables researchers to create the data management plans now required by many funding agencies, and to receive tailored institutional guidance along the way. The DMPTool provides data management plan templates for numerous U.S. funding agencies, giving step-by-step instructions and, where possible, alerting users to the services available at their own institutions.
EZID is a service provided by California Digital Library for creating and managing persistent, unique identifiers.
Figshare is an online, open access digital repository that enables users to upload and share research outputs of many different types, including qualitative datasets, media, presentations, posters, software code and figures. Figshare offers unlimited storage space for publicly shared content, as well as 20 GB of private storage per user.
The i2b2 (Informatics for Integrating Biology and the Bedside) Hive is a set of microservices aimed at genomics researchers, providing the functionality needed to construct a research workflow.
iRODS software creates virtual collections, allowing the user to interact with their stored data without needing to keep track of, or even have ultimate control over, the storage and computing facilities hosting the information.
Kepler is a scientific workflow modelling and management system that enables users, regardless of programming experience, to set up data analysis pipelines.
LabTrove is a blogging platform specifically designed for use in a research environment.
myExperiment is an online social networking service aimed at scientific researchers; the site fosters collaboration by allowing members to share scientific workflows, experiment plans, and other digital objects.
The PERICLES Extraction Tool (PET) captures information about the environment in which digital objects are created and modified, as an aid to recording provenance, contextual and preservation metadata.
RSpace is an electronic lab notebook (ELN) system provided by Research Space. It grew out of the eCAT electronic lab notebook, which was designed specifically for individual laboratories. RSpace extends that scope, ranging from an ELN for solo researchers to a full, enterprise-level offering for institutions.
Taverna is a management system designed to assemble, run, document and share scientific workflows.
WebCite is an on-demand web archiving service that takes snapshots of Internet-accessible digital objects at the request of users, storing the data on its own servers and assigning unique identifiers to those instances of the material.