
science with digital data

Since this comes up with some regularity, let me offer some avuncular
comments from a field which went essentially all-digital some years ago.
(Supercomputer guru Larry Smarr said almost 20 years ago that it was the
fate of astronomy to become the first completely digital science - and
it worked out pretty close). Replicability of data has become very
important; reuse of archival data now produces perhaps more
publications than the original observers do, thanks in great measure
to the ease of piping the bits wherever needed without, say, anyone
having to give up a unique photographic plate. Migrating data
media forward has to be considered a cost of doing business. One
thing that we did right, back in about 1978, was bang heads together
worldwide and agree on a common format for multidimensional data,
worked out between representatives of the global community (ISTR
that there were folks from the US, UK, continental Europe, Australia,
and Japan). This has been extended since, by consensus, in interesting
but studiously backwards-compatible ways (the interested can look up
FITS, the Flexible Image Transport System). The key was doing this so
early in the proliferation of formats that the costs of writing routines
to read/write these FITS files were modest, and have vanished into the
software effort since then. Furthermore, there was no huge investment
needed to do this (unlike, say, scanning and indexing the whole
astrophysical journal literature, for which having a sugar daddy like
NASA was indispensable). Whenever I see what geophysicists (or
worse, people working in medical imaging) still have to go through trying
to hack proprietary data formats, I think more and more highly of
the people who got FITS established.
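
For anyone who has not looked inside a FITS file, here is a rough
sketch in Python of why the read/write routines were cheap to write:
the header is just plain ASCII, in 80-character records ("cards")
packed into 2880-byte blocks and terminated by an END card. This is
only an illustration under simplifying assumptions: it handles
fixed-format logical and integer values only, the function names
(make_card, build_header, parse_header) are made up for this example,
and real work should go through an established FITS library.

```python
BLOCK = 2880  # FITS files are organized in 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters


def make_card(keyword, value):
    """Format one fixed-format card (simplified: bool or int values only)."""
    if isinstance(value, bool):          # check bool first: bool is an int subclass
        text = "T" if value else "F"
    else:
        text = str(int(value))
    # Keyword in columns 1-8, "= " in 9-10, value right-justified in 11-30.
    return f"{keyword:<8}= {text:>20}".ljust(CARD)


def build_header(cards):
    """Join cards, append END, and pad with spaces to a whole block."""
    raw = "".join(cards) + "END".ljust(CARD)
    raw += " " * (-len(raw) % BLOCK)
    return raw.encode("ascii")


def parse_header(data):
    """Read keyword/value pairs from header bytes up to the END card."""
    header = {}
    for start in range(0, len(data), CARD):
        card = data[start:start + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":           # no value indicator: skip (e.g. COMMENT)
            continue
        value = card[10:].split("/")[0].strip()  # drop any trailing comment
        header[keyword] = (value == "T") if value in ("T", "F") else int(value)
    return header


# A minimal header describing a 512 x 512 image of 16-bit integers.
data = build_header([
    make_card("SIMPLE", True),
    make_card("BITPIX", 16),
    make_card("NAXIS", 2),
    make_card("NAXIS1", 512),
    make_card("NAXIS2", 512),
])
header = parse_header(data)
```

The pixel data would follow the header, likewise padded to a multiple
of 2880 bytes, which this sketch leaves out.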

Mind you, upkeep of data archives is a genuine cost, and there has to
be sensible planning to move them forward onto new media. (It was
a big enough project to move my own lifetime data collection from 9-track
tapes to 4mm/8mm tapes, and I'm still copying those to CDs.) It's
not as if you can guarantee that any particular kind of scan will
reproduce everything you could learn by inspecting a fossil - but
how would it be to have a really good volume scan of the Berlin 
Spinosaurus, for example?

Bill Keel
Astronomy, University of Alabama