John Chamberlain
Developer Diary
 Developer Diary · You Heard It Here First · 20 November 2003
The Dawn of a New Era for Scientific Data
A fundamental change is occurring in the way scientific data is created, published and analyzed. A number of different groups are feverishly working on semantic protocols that will allow scientific data to be exchanged seamlessly. This will increase the amount and quality of data researchers can access, as well as the speed with which they can get it. Many of the ad hoc methods scientists must currently use will go away, replaced by integrated tools like the OPeNDAP Data Connector (ODC). To get a sense of the impact, compare the way data flows in a typical contemporary lab with the vision being worked on:

   What Happens Now                                  
   An instrument records the data
   A technician downloads from the instrument, and ...
   ...preprocesses the raw data file
   ...cleans up and adds any special values, codes
   ...saves the data in a file with an ad hoc format
   One or more other scientists/technicians process the data into one or more output products
      (the output product might be in a standard format like HDF)
   The user finds out about the data in any of a myriad of non-standard ways
   The user has to figure out somehow which data file they want
   The user downloads (or orders) the data file
   The user possibly has to parse or process the file somehow
   The user opens the file in their analysis program
   The user repeats the above steps for different analysis programs
   The user can only output whatever the analysis programs support

   The Future
   An instrument streams data to a database
   The data is automatically catalogued by the site
   The site catalogue is automatically published to various global catalogues
   The user uses an integrated client (like the ODC) to search for data
   The source database is published by various streaming servers
   The integrated client can browse, query and preview any data from any server
   The integrated client can exchange data live with any analysis tool
   The integrated client can re-publish data and output it to any generic format
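
To make the catalogue steps of "The Future" a little more concrete, here is a minimal sketch in Python. It is a toy model only, not how the ODC or any real catalogue service works: the class names (SiteCatalogue, GlobalCatalogue), the dataset names and the keyword-matching search are all hypothetical, made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    """One catalogue entry: a dataset name plus free-form keywords."""
    name: str
    keywords: set[str]


@dataclass
class SiteCatalogue:
    """Hypothetical per-site catalogue that indexes data as it arrives."""
    site: str
    records: list[Record] = field(default_factory=list)

    def ingest(self, name: str, keywords: set[str]) -> Record:
        # Cataloguing happens as a side effect of ingesting the stream.
        record = Record(name, keywords)
        self.records.append(record)
        return record


@dataclass
class GlobalCatalogue:
    """Hypothetical global catalogue that aggregates site catalogues."""
    sites: list[SiteCatalogue] = field(default_factory=list)

    def publish(self, site: SiteCatalogue) -> None:
        # Keep a reference, so later ingests at the site stay visible here.
        self.sites.append(site)

    def search(self, keyword: str) -> list[tuple[str, str]]:
        # Return (site, dataset) pairs whose keywords include the query.
        return [
            (site.site, record.name)
            for site in self.sites
            for record in site.records
            if keyword in record.keywords
        ]


# An instrument's site ingests two made-up datasets and is published once.
buoy = SiteCatalogue("buoy-station")
buoy.ingest("sst_2003_11", {"temperature", "ocean"})
buoy.ingest("wind_2003_11", {"wind"})

catalogue = GlobalCatalogue()
catalogue.publish(buoy)

# An integrated client would search the global catalogue, not each site.
hits = catalogue.search("temperature")
```

The point of the sketch is the shape of the flow: the user asks one catalogue one question and never has to figure out which file on which site to download.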

Integrated clients like the OPeNDAP Data Connector are middleware that acts as a lingua franca for scientists, allowing them to collect, analyze and publish data with greater ease than was ever possible before. In a sense they are a kind of Napster for scientists. Ultimately the result will be an explosion in the quantity of scientific discoveries.

Revised 20 November 2003