
Research Data Management: Format & Organization

Sustainable formats

Formats more likely to be accessible in the future are:

  • Non-proprietary
  • Based on an open, documented standard
  • In common use by the research community
  • Stored in a standard representation (ASCII, Unicode)
  • Unencrypted
  • Uncompressed

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.
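As one small illustration of migrating toward a standard representation, the sketch below writes a UTF-8 (Unicode) copy of a text file next to the original and leaves the original untouched. The function name, the "_utf8" suffix, and the idea that you already know the source encoding are all assumptions made for this example; converting proprietary binary formats needs format-specific tools and is not covered here.

```python
from pathlib import Path


def write_utf8_copy(source: Path, source_encoding: str) -> Path:
    """Write a UTF-8 copy of a text file beside the original.

    The original file is only read, never modified, so both versions are kept.
    `source_encoding` is the encoding the original was saved in (for example
    "latin-1"); it must be supplied by the caller and is not detected here.
    """
    # Hypothetical naming choice: add "_utf8" before the extension.
    target = source.with_name(f"{source.stem}_utf8{source.suffix}")
    text = source.read_text(encoding=source_encoding)
    target.write_text(text, encoding="utf-8")
    return target
```

For example, write_utf8_copy(Path("notes.txt"), "latin-1") would create notes_utf8.txt in the same folder.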

Examples of preferred format choices:

  • PDF/A, not Word
  • ASCII, not Excel
  • MPEG-4, not Quicktime
  • TIFF or JPEG2000, not GIF or JPG

Naming conventions

Directory structure naming 

When organizing files, the top-level folder should include the project title, a unique identifier, and the date (year). The substructure should follow a clear, documented naming convention: for example, one folder for each run of an experiment, each version of a dataset, and/or each person in the group.
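A minimal sketch of one way to apply this layout is shown below. The underscore separator, the lowercase hyphenated title, the "run-001" subfolder pattern, and all function names are assumptions chosen for illustration, not a prescribed convention; whatever pattern you adopt should be written down alongside the data.

```python
import re
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def top_level_name(title: str, identifier: str, year: int) -> str:
    """Join project title, unique identifier, and year into one folder name."""
    return f"{slugify(title)}_{identifier}_{year}"


def make_project_tree(root: Path, title: str, identifier: str,
                      year: int, runs: int) -> Path:
    """Create the top-level folder and one zero-padded subfolder per run."""
    project = root / top_level_name(title, identifier, year)
    project.mkdir(parents=True, exist_ok=True)
    for n in range(1, runs + 1):
        # Zero-padding keeps the folders in order when sorted by name.
        (project / f"run-{n:03d}").mkdir(exist_ok=True)
    return project
```

With the hypothetical values "Soil Moisture Survey", "SMS01", and 2024, the top-level folder would be named soil-moisture-survey_SMS01_2024.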

File naming

  • Reserve the 3-letter file extension for application-specific codes such as .wrl, .mov, and .tif.
  • Identify the activity or project in the file name
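The two points above could be sketched as follows. The helper name, the project_activity_name ordering, and the choice to replace spaces and extra periods with hyphens are assumptions for illustration only; the one fixed idea is that the extension is left to identify the file format.

```python
from pathlib import Path


def project_file_name(project: str, activity: str, original: str) -> str:
    """Prefix a file name with project and activity, keeping its extension.

    Extra periods and spaces in the base name are replaced with hyphens so
    that the only period left is the one before the format extension.
    """
    path = Path(original)
    stem = path.stem.replace(".", "-").replace(" ", "-")
    return f"{project}_{activity}_{stem}{path.suffix.lower()}"
```

For example, with the hypothetical project "soilsurvey" and activity "calibration", the file "Scan 01.TIF" would become soilsurvey_calibration_Scan-01.tif.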

Use free tools to help you:

http://www.bulkrenameutility.co.uk

http://renamer4mac.com

File naming conventions for specific disciplines

DOE's Atmospheric Radiation Measurement (ARM) program 

This guide was developed by the CUNY Office of Library Services and is based on (and, in some cases, pulls from) guides created at the libraries at the CUNY Graduate Center, New York University, Massachusetts Institute of Technology, University of Massachusetts, University of Michigan, and Stanford University.