
Working with research data

It makes sense to organise your data appropriately from the outset. Organising your files in a logical and consistent way will make it easier for you to find, use and reuse your data in the short and longer term.

 


Organising files and describing data

Adopt a logical, consistent approach to organising and naming your files and folders so that data can be located quickly and easily. Agree on this approach at the beginning of your research project.

Decide (with your colleagues) on a file naming convention at the start of your project. Useful file names are consistent, meaningful to you and your colleagues and allow you to find the file easily.

It is useful if you and your colleagues agree on the following elements of a file name:

  • Vocabulary – choose a standard vocabulary for file names, so that everyone uses a common language.
  • Punctuation – decide on conventions for if and when to use punctuation symbols, capitals, hyphens and spaces.
  • Dates – agree on a logical use of dates so that they display chronologically, for example YYYY-MM-DD.
  • Order – confirm which element should go first, so that files on the same theme are listed together and can therefore be found easily.
  • Numbers – specify the number of digits that will be used in numbering so that files are listed numerically, e.g. 001, 002, 003.
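
As an illustration, the agreed elements can be combined by a small script so that every file name follows the same pattern. The Python sketch below is only one possible approach: the function name build_file_name, the order of elements (project, topic, date, number), the separators and the three-digit default are all assumptions made for this example, not a prescribed convention.

```python
import re
from datetime import date


def build_file_name(project: str, topic: str, day: date, number: int,
                    digits: int = 3, ext: str = "csv") -> str:
    """Assemble a file name from agreed elements in a fixed order.

    Hypothetical convention: project_topic_YYYY-MM-DD_NNN.ext
    """
    def clean(text: str) -> str:
        # Lower-case the text and replace spaces and punctuation with hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # isoformat() gives YYYY-MM-DD, which sorts chronologically as plain text.
    # The number is zero-padded to a fixed width so that it sorts numerically.
    return f"{clean(project)}_{clean(topic)}_{day.isoformat()}_{number:0{digits}d}.{ext}"


print(build_file_name("Soil Survey", "pH readings", date(2015, 8, 20), 7))
# -> soil-survey_ph-readings_2015-08-20_007.csv
```

Because the date is written year first and the number has a fixed width, an ordinary alphabetical listing of these names is also chronological and numerical.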

 

Keeping track of which versions of documents are the most recent can be difficult if you do not employ some kind of versioning convention.

  • Using a numbering system can be helpful. For example, v01 would be the first version and v02 the second. Minor changes can be indicated by increasing the figure after the underscore: v01_01 indicates that a minor change has been made to the first version, and v03_01 that a minor change has been made to the third version.
  • When draft documents are sent out for amendments, the returned files should carry additional information identifying the individual who made the amendments. For example, a file named datav01_20150820_CB indicates that a colleague (CB) amended the first version on 20 August 2015. The lead author would then add those amendments to version v01 and rename the file following the revision numbering system.
  • Include a 'version control table' for each important document, noting changes and their dates alongside the appropriate version number of the document. If helpful, you can include the file names themselves along with (or instead of) the version number.
  • Agree who will mark documents as 'final'.
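
The numbering system described in the list above can be sketched in Python as follows. This is a minimal illustration, assuming two-digit labels of the form v01 and v01_01. The names Version, parse_version, minor_change and major_change are invented for this example. Resetting the minor figure when the main version increases is also an assumption.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int = 0

    def label(self) -> str:
        # v01 for a main version, v01_01 once a minor change has been made.
        base = f"v{self.major:02d}"
        return base if self.minor == 0 else f"{base}_{self.minor:02d}"


_LABEL = re.compile(r"^v(\d+)(?:_(\d+))?$")


def parse_version(label: str) -> Version:
    """Read a label such as 'v03_01' back into its two numbers."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a version label: {label!r}")
    return Version(int(match.group(1)), int(match.group(2) or 0))


def minor_change(version: Version) -> Version:
    # A minor change increases only the figure after the underscore.
    return Version(version.major, version.minor + 1)


def major_change(version: Version) -> Version:
    # A new main version increases the first figure and resets the second.
    return Version(version.major + 1, 0)


print(minor_change(parse_version("v01")).label())     # -> v01_01
print(major_change(parse_version("v03_01")).label())  # -> v04
```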

Save your research data in file formats that will keep it accessible and reusable in both the short and longer term. When you plan your research, consider which file formats you will use to store your data. The software you use will probably dictate this, but in some cases you may have a choice.

Factors you should consider:

  • What software and formats have you and your colleagues used in the past?
  • What formats will be easy to share with colleagues? It is useful to check that all colleagues use the same software.
  • What formats are at risk of obsolescence? Formats such as those produced by Microsoft Office are likely to last a reasonably long time but as they are proprietary they may not necessarily exist forever or remain easily readable.

Files can very quickly become disorganised and unmanageable if they are not organised in a consistent and logical way.

  • Check for established approaches in your team or department which you can adopt.
  • Name folders after the areas of work to which they relate and not after individual researchers or students. This avoids confusion in shared workspaces if a member of staff leaves and makes the file system easier for new staff or subsequent projects to navigate.
  • When developing a naming scheme for your folders it’s important that once you’ve decided on a method, you stick to it. If you can, try to agree on a naming scheme from the outset of your research project.
  • Structure folders hierarchically. Start with a limited number of folders for the broader topics, and then create more specific folders within these.
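
A hierarchical structure of this kind can be set up once, by script, so that every project starts from the same layout. The Python sketch below is an illustration only: the folder names in LAYOUT and the function name create_folders are hypothetical, and your team or department may already have its own scheme.

```python
from pathlib import Path

# Hypothetical layout: a few broad topics, each with more specific folders.
# Folders are named after areas of work, not after individual researchers.
LAYOUT = {
    "data": ["raw", "processed"],
    "documentation": ["methods", "consent-forms"],
    "outputs": ["figures", "reports"],
}


def create_folders(root: Path, layout: dict[str, list[str]]) -> list[Path]:
    """Create each broad folder and its sub-folders under root."""
    created = []
    for broad_topic, specific_topics in layout.items():
        for specific in specific_topics:
            folder = root / broad_topic / specific
            # parents=True also creates the broad folder when it is missing;
            # exist_ok=True makes the script safe to run more than once.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling create_folders(Path("my-project"), LAYOUT) would create my-project/data/raw, my-project/data/processed and so on, where "my-project" is a placeholder for your own project folder.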

Providing clear data description, annotation, contextual information and documentation about your research will ensure your research data can be shared and understood by all users. It is good practice to begin to document your data at the very beginning of your research project and continue to add information as the project progresses. Include procedures for documentation in your data management planning.

There are a number of ways you can add documentation to your data:

  • Embedded documentation – information about a file or dataset can be included within the data or document itself. For digital data sets, this means that the documentation can sit in separate files (for example text files) or be integrated into the data file(s), as a header or at specified locations in the file. Examples of embedded documentation include: code, field and label descriptions, descriptive headers or summaries, transcripts, and information recorded in the Document Properties function of a file (Microsoft).
  • Supporting documentation – this is information in separate files that accompanies data in order to provide context, explanation, or instructions on confidentiality and data use or reuse. Examples of supporting documentation include: working papers or laboratory books, questionnaires or interview guides, final project reports and publications.
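
As one possible illustration of embedded documentation, a descriptive header can be written at the top of a data file and read back separately from the data. The Python sketch below assumes a CSV file whose header lines start with "#". That marker is a convention chosen for this example and not every program will skip such lines. The function names and the description keys in the usage note are hypothetical.

```python
import csv
from pathlib import Path


def write_with_header(path: Path, description: dict[str, str],
                      fieldnames: list[str], rows: list[dict]) -> None:
    """Write descriptive '# key: value' lines, then the data as CSV."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        for key, value in description.items():
            handle.write(f"# {key}: {value}\n")
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_with_header(path: Path) -> tuple[dict[str, str], list[dict]]:
    """Return the embedded description and the data rows separately."""
    description = {}
    data_lines = []
    with path.open(newline="", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("#"):
                # Split on the first colon only, so values may contain colons.
                key, _, value = line[1:].partition(":")
                description[key.strip()] = value.strip()
            else:
                data_lines.append(line)
    return description, list(csv.DictReader(data_lines))
```

For example, write_with_header(Path("readings.csv"), {"collected": "2015-08-20", "units": "pH"}, ["site", "ph"], [{"site": "A", "ph": 6.5}]) stores the description in the same file as the data. All values are read back as text.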