Planning Data Management


The data associated with a research project are most effectively managed when you plan their management as part of the overall development of the project. A crucial step in that planning process is developing a holistic, realistic data management plan (DMP). A DMP outlines the steps you will take and the practices in which you will engage, on an ongoing basis, to successfully manage your data as you conduct your research. A good DMP discusses all aspects of organizing, documenting, transforming, and possibly sharing research data.

Once you have completed a first draft of your DMP, it is a good idea to turn to local data librarians, the staff of a repository such as QDR, or other relevant resource people, to finalize specifics you might be uncertain about (e.g., DOIs; access controls; long-term storage policies; licensing copyrighted items).

Carefully considering how research data will be handled during each phase of a research project, and documenting those decisions, are helpful in several ways. Managing data and generating documentation are much easier to do if you have a clear plan for each from the beginning of a project, and if you follow that plan as you collect information and generate data, rather than seeking to carry out the relevant data-management steps retroactively. Planning data management helps you to organize your workflow and operate efficiently, facilitates analysis and writing, and makes the eventual sharing of data easier.

Planning data management also helps you to estimate the costs associated with managing your data, and with depositing your data in a digital repository. This allows you to include realistic cost estimates in grant proposals that you develop to support your research. In particular, you are advised to check the fees that data repositories charge for curating and preserving data. For instance, in July 2018, QDR started transitioning to a model in which depositors bear some of the costs of curation and preservation. Discounts are available, and QDR provides more information about its fees.

Your Data in the Data Lifecycle

You should begin to plan how you will manage the data that you will collect or generate in association with a particular research project when you first start to design the project. When doing so, you should keep the Data Lifecycle (pictured below) in mind, and consider the data management tasks that need to be carried out at various points in the cycle.

  • Generating data (collecting, organizing, and reducing information) involves naming files according to a convention and organizing them in logical directories;
  • Processing data involves transcribing, translating, anonymizing, checking and cleaning, and backing up;
  • Preserving data involves migrating them to recommended long-term preservation formats, transferring them to suitable media, writing documentation, creating metadata, and archiving them;
  • Giving access to data involves sharing them and establishing appropriate access controls.
Figure: Data Lifecycle. Based on: UK Data Service.
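
The advice above on naming files according to a convention and organizing them in logical directories can be made concrete with a small script. The following is only a minimal sketch in Python: the folder names, the order of the name components, and the helper names (`STAGE_DIRS`, `slugify`, `build_path`) are hypothetical choices for illustration, not a convention prescribed by QDR or the UK Data Service. You should adapt them to whatever convention you record in your DMP.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Hypothetical folder names, loosely following the lifecycle stages.
STAGE_DIRS = {"raw": "01_raw", "processed": "02_processed", "archive": "03_archive"}


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_path(project: str, stage: str, kind: str, item: str,
               collected: date, version: int, ext: str) -> PurePosixPath:
    """Build a relative path of the form <stage dir>/<kind>/<file name>."""
    name = "_".join([
        slugify(project),
        slugify(kind),
        slugify(item),
        collected.isoformat(),  # ISO dates (YYYY-MM-DD) sort chronologically
        f"v{version:02d}",      # zero-padded version number
    ])
    return PurePosixPath(STAGE_DIRS[stage]) / slugify(kind) / f"{name}.{ext}"


# Illustrative values only.
example = build_path("Land Reform", "raw", "Interview", "P07",
                     date(2018, 7, 3), 1, "txt")
print(example)  # 01_raw/interview/land-reform_interview_p07_2018-07-03_v01.txt
```

Generating names from one function, rather than typing them by hand, helps keep every file consistent with the convention and makes it easier to document that convention later.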

Effectively managing data may be costly in time, money, or both. Increasingly, funding agencies allow you to include data management expenses in the project budget that forms part of your grant proposal. You should ask the agencies to which you plan to apply what types of data management expenses are allowable, what percentage of an overall budget those expenses can represent, and what terminology you should use to request the relevant funds. You can also ask other scholars who have carried out the data management tasks that you foresee undertaking (e.g., transcribing, translating) in similar contexts for help in estimating how much each relevant task might cost. The UK Data Archive provides a useful costing tool (pdf) to estimate the costs of data management for a research project.
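
Once you have per-task estimates, adding them up and expressing the total as a share of the overall budget is simple arithmetic. The sketch below shows one way to do this in Python. All figures are placeholders invented for illustration, and the names (`Task`, `task_cost`, `summarize`) are hypothetical. It is not a substitute for the UK Data Archive costing tool or for a funder's own rules.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours: float             # estimated staff time
    hourly_rate: float       # cost per hour, in the budget's currency
    fixed_cost: float = 0.0  # e.g., a repository fee (placeholder value below)


def task_cost(task: Task) -> float:
    """Staff time plus any fixed cost for one task."""
    return task.hours * task.hourly_rate + task.fixed_cost


def summarize(tasks, total_budget):
    """Return the total data management cost and its percentage of the budget."""
    total = sum(task_cost(t) for t in tasks)
    return total, 100 * total / total_budget


# Placeholder figures only; replace them with your own estimates.
tasks = [
    Task("transcribing", hours=40, hourly_rate=25.0),
    Task("translating", hours=20, hourly_rate=30.0),
    Task("repository deposit", hours=5, hourly_rate=25.0, fixed_cost=300.0),
]
total, share = summarize(tasks, total_budget=50_000)
print(f"Data management: {total:.2f} ({share:.2f}% of budget)")
```

Keeping the estimates in a structured form like this makes it easy to revise individual tasks as you learn more, and to check the total against any percentage cap a funder imposes.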

Creating a Data Management Plan (DMP)

A Data Management Plan (DMP) is a formal document discussing how you will handle research data while you are collecting, generating, and analyzing them as part of a research project, and once the project has concluded. Developing your DMP should be one aspect of designing your overall research project. QDR has assembled a Data Management Planning Checklist to help you identify the data management tasks to include in your DMP. You should review your DMP periodically as you carry out your project and consider whether any adjustments are necessary.

Funding agencies are increasingly requiring that researchers create a DMP as a condition for awarding funds, and academic journals are increasingly calling on researchers to provide detailed information about how they generated and managed their data. However, QDR strongly recommends that you create a DMP for your own use, even if you are not formally required to do so by a funder or by the venue in which you hope to publish your work.

Frequently, funding agencies require DMPs to adhere to a specific structure or answer a particular set of questions. The DMP Tool, developed by the California Digital Library, includes a large catalog of institutional templates and guides researchers in producing a comprehensive DMP that conforms to funder requirements. A similar tool, focused on UK funders, is DMPonline, developed by the Digital Curation Centre. DMPs can also be published through the DMP Tool, in some journals such as Research Ideas and Outcomes (RIO), and on QDR as documentation that gives context to a published data project.

If you would like to mention QDR in your DMP as the repository in which you plan to deposit your data, please contact us for a brief conversation about your potential deposit. If we agree that your data are suitable for deposit with QDR, we offer template language that you can adapt and include in your DMP.

 

Further Reading

Some additional resources on data management planning and creating DMPs can be found here: