How to Manage Your Raw Data So that it is Audit Proof

Casey K. Chan

In peer-reviewed publications and graduate theses, usually only the final set of data is published. In some peer-reviewed publications, raw data may be included as supplemental data. It is essential to establish a trail showing how the final set of data is derived from the raw data, for the following reasons:
  1. alternate consolidations and alternate displays of the data can be generated quickly if the original raw data are well organized
  2. should a publisher or funding agency request the raw data, they can easily be made available
  3. in the case of a dispute over the authenticity of the research data, an audit-proof set of raw data will be your best defense.
The best way to accomplish this is to systematically set up a series of linked spreadsheets. A preferred approach is to create a "protocol" page that describes the details of the experiment and how the data are collected. The raw data are recorded in a separate spreadsheet, and subsequent data derivations are recorded in another spreadsheet, with the deriving formulae linked to the raw data. The linking formulae provide full documentation of how the final derived data are related to the original raw data.
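
For readers who keep their data in scripts rather than in Excel, the same protocol / raw data / derived data layout can be sketched in Python. This is only a minimal illustration, not the contents of the Excel file: the sample names, the measurements, and the names protocol, raw, DERIVATIONS and derive are all made up for the example. The point it tries to show is that each derived value is stored next to a formula string naming the raw data it came from.

```python
from statistics import mean, stdev

# Hypothetical "protocol" page: what was done and how the data were collected.
protocol = {
    "title": "Example experiment (made up for illustration)",
    "method": "Three repeat measurements per sample",
}

# Hypothetical raw-data sheet: recorded once, never edited afterwards.
raw = {
    "sample_A": [10.0, 12.0, 14.0],
    "sample_B": [20.0, 22.0, 27.0],
}

# The derivations applied to every raw column (assumed for this sketch).
DERIVATIONS = {
    "mean": mean,
    "stdev": stdev,
}


def derive(raw_sheet):
    """Build the derived sheet; each row records its formula and its value."""
    derived = []
    for sample, values in raw_sheet.items():
        for name, func in DERIVATIONS.items():
            derived.append({
                "cell": f"{sample}.{name}",
                # The formula string is the audit trail back to the raw data.
                "formula": f"{name}(raw[{sample!r}])",
                "value": func(values),
            })
    return derived


if __name__ == "__main__":
    for row in derive(raw):
        print(row["cell"], "=", row["formula"], "->", row["value"])
```

Because the derived rows are always recomputed from raw, an alternate consolidation only needs a new entry in DERIVATIONS, and the raw values themselves are never touched.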

You can download the original Excel file here: goo.gl/qeDN1. By default it is displayed in Google Docs format (ugly), but it can be downloaded as an Excel file.