Chemistry International
Vol. 24, No. 6
November 2002

 

Scientific and Technical Information


Cometh A Digital Dark Age?

by Tony Davies

Read also Wendy Warr's Report on the 2002 ICSTI General Assembly

I was fortunate enough to represent IUPAC at the recent ICSTI (International Council for Scientific and Technical Information) seminar on the Digital Preservation of the Records of Science, hosted by UNESCO in Paris on 14 and 15 February. The topics covered were an eye-opener for an analytical spectroscopist. I had thought that over the years we had managed to supply our field with a range of widely implemented international data standards capable of guaranteed long-term digital archiving. I suppose I was rather proud of what we had achieved as a community of users, manufacturers, and industry. I now realize we are the lucky ones. The rest of the scientific world is currently running scared of what now appears to be the advent of the so-called "Digital Dark Ages." In this issue I will highlight the reasons for the meeting at UNESCO and what urgently needs to be achieved on a global scale.

There is a general worry in the international scientific community that the moves toward electronic production and presentation of scientific data will lead to serious deficits in the archiving of the records of science. The first meeting on this topic was organized in January 2000 by ICSTI. A progress review in 2001 established the urgent need for a second meeting, which was hosted by UNESCO this February.

The objectives of the February meeting were outlined as follows:

  • to ensure all the interests in digital preservation in science are aware of all current activities in the field
  • to evaluate the needs for coordination of the efforts
  • to create any necessary structures and work programs to ensure coordination of the activities

It was decided that future meetings should also deal with the following issues:

  • What are the varieties and future uses of scientific and technological information that must and will be archived?
  • What is the minimum amount of information (data fields) needed to locate and identify information and who is creating what kinds of standards related to location and basic identification?
  • What business and information models are appropriate and how should access to the digital archives be arranged?
  • Where are the common issues with the preservation of more general cultural archives and how can these be accommodated?

The Seminar
The seminar started with the usual welcoming speeches and an explanation of the interests of the sponsoring organizations. There then followed two days of specialist presentations from interested scientific organizations, international representative bodies, and renowned speakers from the scientific publishing industry.

For me, one of the most worrying revelations during the two-day meeting was the current acute fear amongst science historians, which was reported by William Anderson of CODATA. He used a phrase, which at the time was completely new to me, in revealing that there is imminent danger in the arrival of a new "Dark Age," wherein our scientific cultural heritage may be permanently lost through the exclusive use of electronic media. This Dark Age will become more severe when electronic laboratory notebooks finally become integrated into the normal working environment. The danger of this new era was highlighted by an example of the problems archivists are now struggling with.

Following the death of an eminent British scientist, his widow presented his archive material to the British Library for posterity. The problem, however, is that this national archive has effectively no infrastructure for handling two old personal computers and boxes of old-format disks!

Unfortunately, science archivists have well-established practices for handling paper legacies, but currently have terrible problems when presented with digital content.

The Debate on What to Archive?
A large amount of time was devoted to discussion on exactly what should be archived; however, no general agreement was reached. The data community (probably heavily influenced by the FDA 21 CFR Part 11 rules currently revolutionizing pharmaceutical IT) thought that all information needed to be stored, whereas the traditional archivists looked at the logistics and demanded that only selected content land in the electronic archives.


Although there is a legal requirement for publishers to deposit to their national archives all material printed in that particular country, there is no equivalent law requiring that electronic-only publications be archived. This international legal loophole urgently needs to be closed and will apparently be addressed during the Spanish presidency of the European Union. It will be interesting to see how the Council of Ministers deals with this thorny subject.

On an international level it was clear that the classic role of the librarian as archivist is outdated and being continually undermined by the digital presentation of scientific publications. An ever-increasing proportion of library budgets is being spent on digital-only subscriptions to peer-reviewed scientific journals. These electronic journals are maintained off-site and accessed through the Internet, often on a pay-per-view basis. The librarians cannot archive this material, as it never physically lands in the individual organizations. It was generally agreed that it is foolish to expect the publishers to take over the role of archivists and so another mechanism needs to be put in place.

A series of presentations dealt with individual limited-term projects that were or had been run in various countries funded by the Mellon Foundation, the EU, and by different national governments. What was strikingly clear was that the projects were not coordinated and any benefit would probably end with the funding.

Not Just a Problem for Scientists!
Having only just become aware of the phrase "Digital Dark Age," I was completely surprised when, browsing through one of the bookshops at Newark International Airport two weeks later, I discovered a brand new book, Dark Ages II: When Digital Data Die, by Bryan Bergeron, a teacher at Harvard Medical School and MIT (published by Prentice Hall PTR, Upper Saddle River, New Jersey 07458, USA, ISBN 0-13-066107-4, www.phptr.com). Much of this interestingly written book, which contains many anecdotes, directly addresses the problem of long-term data archiving. Written in clear, normal language, it is not a tacky techie tome for IT freaks. Instead, it has good advice for everyone from home computer users to managers of corporate networks. Bergeron attacks "Bloatware" succinctly and provides many useful links to more detailed information sources such as the US NARA (National Archives and Records Administration) Center for Electronic Records guidelines. The table below, extracted from the book, gives an idea of the level of the advice the book offers.

Expected Media Lifetimes under Ideal and Typical Conditions. (Extracted and adapted from Dark Ages II, Chapter 3, page 82.)

Storage medium       Ideal lifetime (years)   Typical lifetime (years)   Comments
CD-R                 5-100                    2-30                       Dye less stable than pits used in commercial CD-ROMs
CD-ROM               30-200                   5-50                       Uses pits on a metal surface to encode data; fragile surface
DVD                  100                      20                         Higher density susceptible to environmental changes
DVD-R                20-30                    10                         As with CD-R, less stable than commercial media
Hard disks           ?100                     10-20                      Lifetime is down to stability of the mechanical parts
Magnetic tape        30-100                   5-20                       Rewind periodically to release tension
WORM                 30-200                   5-50                       Formats not as standardized as for CD-ROMs and DVDs
Paper (buffered)     ?500                     50-500                     !
Photographic print   ?200                     ?100                       Assuming non-acid paper and stored out of light (non-Polaroids!)
Microfilm            500                      100-200                    Standard for archives
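
One way to put figures like these to work is as a rough refresh schedule. The sketch below is only an illustration and is not taken from the book: it takes the most pessimistic figure quoted in the table for each medium and suggests a year by which the content should be copied to fresh media. The function name, the dictionary, and the "copy after half the pessimistic lifetime" policy are all assumptions made for the example. Photographic prints are left out because their shorter figure is not given unambiguously.

```python
# Most pessimistic lifetime (years) quoted in the table for each medium.
PESSIMISTIC_YEARS = {
    "CD-R": 2,
    "CD-ROM": 5,
    "DVD": 20,
    "DVD-R": 10,
    "Hard disks": 10,
    "Magnetic tape": 5,
    "WORM": 5,
    "Paper (buffered)": 50,
    "Microfilm": 100,
}


def refresh_due_year(medium: str, written_year: int, safety_factor: float = 0.5) -> int:
    """Suggest the year by which content should be copied to fresh media.

    safety_factor is a hypothetical policy knob: 0.5 means "copy once half
    of the pessimistic lifetime has elapsed". At least one year is allowed.
    """
    lifetime = PESSIMISTIC_YEARS[medium]
    return written_year + max(1, int(lifetime * safety_factor))


if __name__ == "__main__":
    for medium in PESSIMISTIC_YEARS:
        print(f"{medium}: written 2002, refresh by {refresh_due_year(medium, 2002)}")
```

A real archive would of course set its own policy; the point is only that the shorter figure for each medium, not the ideal one, is the number to plan around.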

Meeting Outcome
One of the messages that came out of the meeting was the clear need for a more active advocacy effort to make scientists aware of the encroaching danger and, especially, of the heritage value of their work, which they should be careful to make available to archivists. As digital preservation will not be a cheap exercise, it was seen as important that the need be expressed at many levels in order to convince those who control the different budget sources of the vital nature of this work. ICSTI will take the lead in this area.

The different needs of the text archivists as opposed to the data archivists were clear to all by the end of the meeting. This was especially the case during discussions on metadata content. From my own recent experiences working with FDA 21 CFR Part 11 compliance systems, I can see that the questions of exactly what metadata is worthy of storage and how to obtain it are still critical in an industry well advanced in archiving digital content. Among those sciences just feeling their way into this field, there are those who cannot currently agree on what constitutes metadata!

I was surprised by the depth of thought given to this issue by many of the contributors to the seminar. There were a number of well-constructed arguments, such as those presenting the desire for a "technology watch" on current archival computing systems. This technology watch will need to be established in order to warn in advance of upcoming mitigation needs when computer hardware or software on which the archives rely is about to become outdated.
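
To make the "technology watch" idea concrete, here is a minimal sketch of what such a check might look like. It assumes an archive keeps a list of the hardware and software it depends on, each with an expected end-of-support year. The class, the function, the three-year lead time, and the sample entries are all hypothetical and were not presented at the seminar.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    name: str            # hardware or software the archive relies on
    end_of_support: int  # year support is expected to end (hypothetical data)


def technology_watch(dependencies, current_year, lead_years=3):
    """Return the names of dependencies expected to become outdated
    within lead_years of current_year (including those already outdated)."""
    return [
        d.name
        for d in dependencies
        if d.end_of_support - current_year <= lead_years
    ]


if __name__ == "__main__":
    holdings = [
        Dependency("old-format disk drive", 2003),
        Dependency("tape reader", 2010),
    ]
    # Only the disk drive falls inside the three-year warning window.
    print(technology_watch(holdings, current_year=2002))
```

The hard part in practice is not the comparison but keeping the end-of-support dates current, which is exactly the ongoing effort the seminar contributors were asking for.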

Developing countries reported that they need help, not only in the area of the preservation of science information, but also with more exposure, which they currently lack.

Conclusions
Okay, all I can say is worry! Basically, we should all be rather worried about the current status of born-digital scientific information. Fortunately, the current precarious state of our science legacy has been spotted and there are now international initiatives underway at a political level to secure the significant funding required for establishing the necessary infrastructure. We can only hope that they are successful. Maybe by talking about the problems with our colleagues we can raise awareness and support those striving to find appropriate solutions.

Tony Davies is secretary of the IUPAC Committee on Printed and Electronic Publications, and external professor at the University of Glamorgan, United Kingdom.

Reprinted from <www.spectroscopyeurope.com>

Page last modified 31 October 2002.
Copyright © 1997-2002 International Union of Pure and Applied Chemistry.
