e-Science Curation Study
Objectives of the study
The Digital Archiving Consultancy has been asked to carry out an audit of current “curation” of primary research data and to identify the requirements for future curation of this data, including data generated within the e-Science core programme in the
We are entering an era in which digital data resources are becoming a central pillar of scientific research. Data volumes are growing, in many cases massively, as is the complexity of the data itself.
This will be magnified by the spread of “Grid” infrastructure and technologies. The Grid will allow the efficient manipulation of vast amounts of information such as that contained in the human genome or the results from experiments in CERN's new Large Hadron Collider. It will also make it possible to mine data again and again by comparing existing data sets collected for one purpose with new and previously unrelated information, thereby generating new knowledge.
Implications for future “curation”
There will be significant implications for the future curation of primary research data if we wish to ensure that such data can continue to be accessed and re-used over time. Digital information is now enabling new methods of research, dissemination and collaboration in areas ranging from environmental science to genomics. Digital technology also offers us the ability to exploit data more deeply and more broadly, and build on existing data. In all these areas there is a considerable amount of research and development work in progress, developing new tools and technologies to support increasingly powerful and sophisticated use and re-use of digital data, enabling inter alia easy collaboration (including between disciplines) and recognition of source (see e-Science link below).
Requirements for data curation vary between disciplines but persistence of this information is increasingly important: not only for validation of research but because it contributes to dynamic knowledge bases or future research. Already in the
Many issues
However, there are many questions. For instance, how much and what should be kept? Who should keep it? How do we pay for it? Who pays? How do we keep it? Preserving and keeping digital data appropriately is not straightforward - for instance, data formats come, vary, and go within just a few years.
The Programme of work
The following summarises the work which is being undertaken:
- A desk review of Grid and non-Grid literature.
- Questionnaire surveys to relevant populations to establish current curation practice and requirements for the future. The populations being canvassed are:
  - Data generators
  - Policy makers and funders
  - Service providers (such as libraries, computer centres).
- Interviews with key individuals.
The
The DTI and the Research Councils committed an initial £118M to a government-industry programme on e-Science. The reason for this investment is that Grid technology is seen as the natural successor to the world-wide web and the
The e-Science core programme has been established to co-ordinate the research effort on e-Science in the
Links
For a good introduction to digital preservation issues, see the article by Jeff Rothenberg in Scientific American: http://www.kb.nl/kb/ict/dea/download/dig-info-paper.rothenberg.pdf
For further information about the JISC Continuing Access and Digital Preservation Strategy for 2002-5 see: http://www.jisc.ac.uk/dner/preservation/dpstrategy2002b.html
For further information about the e-Science programme see: http://www.escience-grid.org.uk/ and http://umbriel.dcs.gla.ac.uk/NeSC/general/
For further information about the JISC Committee for the Support of Research see: http://www.jisc.ac.uk/jcsr/index.html
For further information about the US National Science Foundation blue-ribbon committee see: http://www.cise.nsf.gov/evnt/reports/atkins_annc_020303.htm
For further information about The Digital Archiving Consultancy see: http://www.philiplord.com/index.html (Please note this site is undergoing major reconstruction!)
If you have any further questions, please contact Philip Lord (telephone 020-8607-9102) or