DMAP question guidance:
Implementation section

Guidance for answering questions within the Implementation section of your data management and access plan (DMAP).

How do I use this guidance?

There are notes for all the questions in the Implementation section of the data management and access plan (DMAP). You can also refer to:

 

Or, you can prioritize the questions you answer according to:

 

Data summary

Provide a concise summary of the objectives and reasons for data collection or generation within the project. This should align with the project’s goals and objectives and could mention specific research questions or hypotheses that the data will address.

 

Things to consider:

 

  • What are the primary objectives of your project that require data collection, usage or generation?
  • How will the data contribute to achieving the overall goals of your project?

Specify the types of data you will collect, generate or use, such as experimental measurements, survey responses, observational data or images, and specify which of these data (if any) come from a third party.

 

Things to consider:

 

  • What types of data will be collected, generated or used? It may be useful to organize them into those three categories.
  • Are any of these data types from a third party?
  • If they are from a third party – which policies noted in the relevant policies section are applicable?

For each data type, specify the format (e.g. CSV for tabular data, TIFF for images, JSON, GeoJSON) used for storage and for sharing (if different from storage). Note whether each format is proprietary and, if so, provide a justification.

 

Things to consider:

 

  • What formats will be used for storing each type of data and why? It can be useful to think about factors such as standardization and compatibility with analysis tools.
  • What formats will be used for sharing each type of data (if different from storage)?
  • Are there any standard formats in your field that you are required to use?
  • Are any of your formats proprietary? If so, why have you chosen to use a proprietary format?

For the types of data listed as already existing (i.e. data being reused), note the source, explain how that data will be integrated with the new data, and note which permissions are needed to use it.

 

Things to consider:

 

  • Will any existing data be reused in your project? If so, what are the sources?
  • How will the existing data be integrated with the data you collect or generate?
  • Do you have the necessary permissions to use any existing third-party data? How will you adhere to the intellectual property rights?

For data collected, or derived from collection, describe where the data will come from for each type (e.g. sensor readings, interviews, simulations, field observations or a mix) and a brief overview of the methodology.

 

Things to consider:

 

  • What are the primary sources of your data (e.g. surveys, sensors, experiments)?
  • How will the data be collected or generated?
  • What conditions or settings are involved in the data collection process?

Provide an estimate of the total volume of data you expect to generate or collect in gigabytes or terabytes and give a brief description of the infrastructure in place to store it.

 

Things to consider:

 

  • What is the estimated volume of data (in gigabytes/terabytes) you expect to generate or collect?
  • How will you manage and store this volume of data?
  • Are there any potential challenges related to data volume that you need to address?

Referring to your answers in your DMAP for Direction Setting question 8 will help you with answering this Implementation question.
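
One way to arrive at a volume estimate is to multiply the expected number of files by an average file size for each data type, then allow for backup copies. The sketch below illustrates that arithmetic only; every data type name, file count, file size and the number of backup copies is a hypothetical placeholder to be replaced with your project's own figures.

```python
# Hypothetical inputs: replace every figure with your project's own estimates.
data_types = {
    # name: (expected number of files, average file size in megabytes)
    "survey_responses_csv": (200, 5),
    "field_images_tiff": (10_000, 25),
    "sensor_readings_csv": (365, 50),
}

BACKUP_COPIES = 2  # assumed number of extra copies kept for backup


def estimate_volume_gb(types, backup_copies=0):
    """Return the estimated total volume in gigabytes (1 GB = 1000 MB)."""
    total_mb = sum(count * size_mb for count, size_mb in types.values())
    return total_mb * (1 + backup_copies) / 1000


primary_gb = estimate_volume_gb(data_types)
with_backups_gb = estimate_volume_gb(data_types, BACKUP_COPIES)
print(f"Primary data: {primary_gb:.2f} GB")
print(f"Including backups: {with_backups_gb:.2f} GB")
```

With these placeholder figures the images dominate the total, which is the kind of observation worth noting when you describe storage infrastructure and potential challenges.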

Identify which of the project data contain personal or sensitive information and specify what the information is which makes it personal, sensitive or combined.

 

Things to consider:

 

  • Which data is personal or sensitive? Definitions for both are provided in the recipe for this template.
  • Why is that data personal or sensitive? Does it contain, for example, farmer names and addresses, phone numbers, ethnicity or political beliefs?

Referring to your answers in your DMAP for Direction Setting question 3 will help you with answering this Implementation question.

Identify and explain who might benefit from the data and the potential for reuse beyond the project lifecycle.

 

Things to consider:

 

  • Who are the beneficiaries of your data and why? Consider other researchers, policymakers, the public etc.
  • How might the data be reused beyond the initial project? Think about other contexts or future research.
  • What is the potential impact of the data on your field or on broader scientific or societal issues?

Referring to your answers in your DMAP for Direction Setting questions 2, 4 and 6 will help you with answering this Implementation question.

Making data 'Findable'

Outline how data will be made discoverable to stakeholders during the project, and internally afterwards. If you plan to publish the data openly after the project, outline how it will be made discoverable then, too.

 

Things to consider:

 

  • How will your data be made discoverable to users; how will they be able to search and find it?
  • What tooling (e.g. repositories, or catalogues) will you use to list your data? For information around repositories and catalogues see the ‘Resources’ section in the recipe.
  • What recognised metadata standards will you employ to enhance visibility?

Describe the use of persistent identifiers for your data types. These unique IDs ensure datasets can be reliably found over time and help avoid confusion and duplication.

 

Things to consider:

 

  • What persistent identifiers will you assign to your datasets (e.g. DOIs, GUIDs)?
  • How will you ensure that each dataset has a unique identifier?
  • How will you maintain and manage these identifiers over time?
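
As a minimal illustration of the GUID option mentioned above, the sketch below assigns a randomly generated UUID to each dataset and refuses to register the same dataset twice. The dataset names and the in-memory registry are hypothetical. Note that a DOI cannot be generated locally in this way: it has to be issued through a repository or registration service, and a GUID only remains persistent if the registry that records it is itself maintained.

```python
import uuid


def assign_identifiers(dataset_names):
    """Map each dataset name to a newly generated GUID (UUID version 4)."""
    registry = {}
    for name in dataset_names:
        if name in registry:
            raise ValueError(f"Dataset listed twice: {name}")
        registry[name] = str(uuid.uuid4())
    return registry


# Hypothetical dataset names, for illustration only.
registry = assign_identifiers(["household_survey_2024", "soil_samples_2024"])
for name, identifier in registry.items():
    print(name, identifier)
```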

Describe the practice of naming conventions for datasets within this project.

Things to consider:

 

  • What naming conventions will you use for your datasets?
  • How will you ensure that dataset names are both consistent and descriptive?
  • Are there standard naming practices in your field that you need to follow?
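
A naming convention is easier to keep consistent when names are built and checked by a small script rather than typed by hand. The sketch below assumes one possible pattern, `<project>_<data type>_<YYYYMMDD>_v<version>`, which is an illustration and not a convention prescribed by this guidance; the project and data type names are hypothetical.

```python
import re
from datetime import date

# Assumed pattern: <project>_<datatype>_<YYYYMMDD>_v<version>, lowercase, no spaces.
NAME_PATTERN = re.compile(
    r"[a-z0-9]+(-[a-z0-9]+)*_[a-z0-9]+(-[a-z0-9]+)*_\d{8}_v\d+"
)


def slug(text):
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_dataset_name(project, data_type, collected_on, version):
    """Build a dataset name that follows the assumed pattern."""
    return f"{slug(project)}_{slug(data_type)}_{collected_on:%Y%m%d}_v{version}"


def is_valid_name(name):
    """Check whether an existing name follows the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


name = build_dataset_name("Maize Trial", "Soil Samples", date(2024, 3, 15), 2)
print(name)                                 # maize-trial_soil-samples_20240315_v2
print(is_valid_name(name))                  # True
print(is_valid_name("Final data (new)"))    # False
```

If your field has standard naming practices, the pattern and the `slug` rules would need to be adapted to match them.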

Specify the metadata that will be captured to describe the data.

 

Things to consider:

 

  • How will you ensure that project metadata is comprehensive and includes all necessary information?
  • Here is a checklist of recommended metadata to capture:
    – A globally unique Persistent Identifier (PID), e.g. a DOI (provided by public data catalog or CG data space)
    – A title
    – Related people, i.e. the creator of the dataset and others who contributed to the dataset
    – Date on which the dataset was completed
    – A description of how the data were created (contextual information)
    – Target group for the dataset deposited (i.e. scientific disciplines)
    – Keywords that describe your data (use controlled vocabularies if available for your field)
    – A license that clearly states the extent to which the data is accessible (a list to choose from is given in public data catalog or CG data space)
    – Temporal coverage: the period of time to which the data relate
    – Spatial coverage: Geographical location of the research area or site
    – Related datasets, resources like publications, websites etc. (digital or analogue)
    – File formats used in the dataset
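
The checklist above can be turned into a simple completeness check that is run before a dataset is deposited. The sketch below is one possible way to do this; the field names are hypothetical labels chosen to mirror the checklist, not a recognised metadata schema, and the record values are placeholders.

```python
# Hypothetical field names mirroring the checklist; not a formal metadata standard.
REQUIRED_FIELDS = [
    "identifier", "title", "creators", "date_completed", "description",
    "target_group", "keywords", "license", "temporal_coverage",
    "spatial_coverage", "related_resources", "file_formats",
]


def missing_fields(record):
    """Return the checklist fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder record: the identifier and license have not been filled in yet.
record = {
    "title": "Example household survey",
    "creators": ["A. Researcher"],
    "date_completed": "2024-03-15",
    "description": "Responses collected through structured interviews.",
    "target_group": ["agricultural economics"],
    "keywords": ["household survey", "maize"],
    "license": "",
    "temporal_coverage": "2023-01-01/2023-12-31",
    "spatial_coverage": "Example district",
    "related_resources": ["Project website"],
    "file_formats": ["CSV"],
}

print(missing_fields(record))  # ['identifier', 'license']
```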

Describe what supporting documentation you will provide about the data, and in what detail and format it will be provided.

 

Things to consider:

 

  • What documentation will you provide to accompany your data? Think about explaining methodologies, schemas and any relevant context.
  • How will you ensure that users can understand and use your data effectively? Consider terms of use, guidance, and how your data should be cited by others.
  • How will you document changes and versions of your datasets?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 4, 5, 6, 7, 9 and 10 will help you with answering this Implementation question.

Making data 'Accessible'

Outline how project data will be made accessible, both during the project and afterwards. If data is not to be made open, explain why. Aim to make data openly accessible unless there are specific reasons to restrict access, such as personal or sensitive information.

Things to consider:

 

  • How will data be accessed during the project and after the project? Will the data be openly accessible to all users, or will access be restricted?
  • How will users access the data once they’ve found it (e.g. through a specific repository or platform, download, email, API)?
  • What procedures will be in place to manage access to restricted data? Think about the process of managing access requests.

Referring to your answers in your DMAP for Direction Setting questions 2 and 6 will help you with answering this Implementation question.

State whether any embargo periods apply to project data and, if so, provide the details and a rationale explaining why they are necessary.

Things to consider:

 

  • Is there an embargo period before the data can be publicly accessed?
  • Why is an embargo period necessary for your data?
  • How long will the embargo period last, and when will the data be released?

Making data 'Interoperable'

Indicate which domain-specific data and metadata vocabularies, standards or methodologies you will follow to make your data interoperable.

 

  • Visit the vocabulary page for a list of domain-specific vocabularies. Note here the ones you will use in your investment. If a vocabulary you’re using doesn’t appear on the list, add it here as well.

Indicate which domain-agnostic standard vocabularies will be used for all data types present in your dataset to allow interdisciplinary interoperability.

 

  • Visit the vocabulary page for a list of domain-agnostic vocabularies. Note here the ones you will use in your investment. If a vocabulary you’re using doesn’t appear on the list, add it here as well.

Outline what software and tools are required to support the interoperability of the project data.

Things to consider:

 

  • What software or tools are required to use or analyze your data?
  • Will you provide documentation on how to use the software and tools and provide data in an interoperable way?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 3, 4, 5, 6, 7 and 8 will help you with answering this Implementation question.

Where you use project-specific or local vocabularies, state whether you will provide mappings to more commonly used vocabularies and, if so, outline your approach.

 

Things to consider:

 

  • Has this already been decided? If not, how will you decide what mappings to use?
  • What vocabularies are already in use within the investment, and is it necessary to create new ones?
  • Will you consult the Data Office on whether they can offer guidance?

Making data 'Reusable'

Specify the licenses under which data will be made available (e.g. Creative Commons, GPL) and what conditions or restrictions they place on data reuse. Also specify where you will make guidance on the licenses available.

Things to consider:

 

  • What licenses will you use to clarify the terms of data reuse? (For more guidance on choosing the correct license, see the ‘Resources’ section in the recipe.)
  • Are there any conditions or restrictions on how the data can be reused?
  • Where can users find the full text of the license?

Indicate how long datasets will remain reusable.

 

Things to consider:

 

  • How long will project data be available for reuse? Some datasets may remain usable longer than others.
  • Will the data be updated over time, and if so, how?
  • How will you ensure that the data remains available for long-term reuse? (Refer to the question on long-term preservation under ‘Data lifecycle technical infrastructure’.)

Outline the actions that will be taken to ensure data quality.

Things to consider:

 

  • What quality control measures will you implement to ensure data quality (e.g. peer review)?
  • How will you validate and error check the data before sharing?
  • Will you document the quality assurance process? If so, how, and where will it be documented?

Referring to your answers in your DMAP for Direction Setting questions 4 and 5 will help you with answering this Implementation question.
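
Automated checks are one way to validate and error-check data before sharing, alongside measures such as peer review. The sketch below uses only the Python standard library to flag missing values, non-numeric or negative measurements, and duplicate identifiers in tabular data. The column names (`plot_id`, `yield_kg_ha`, `harvest_date`) and the sample rows are hypothetical; the rules themselves would need to reflect what is valid for your own data.

```python
import csv
import io

# Hypothetical sample; in practice, read the real file instead.
SAMPLE = """plot_id,yield_kg_ha,harvest_date
P001,3200,2024-06-01
P002,,2024-06-03
P001,2900,2024-06-04
P003,-50,2024-06-05
"""


def check_rows(csv_text):
    """Return a list of (line_number, problem) tuples found in the data."""
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        plot_id = row["plot_id"]
        if plot_id in seen_ids:
            problems.append((line_number, f"duplicate plot_id {plot_id}"))
        seen_ids.add(plot_id)

        value = row["yield_kg_ha"]
        if value == "":
            problems.append((line_number, "missing yield_kg_ha"))
            continue
        try:
            number = float(value)
        except ValueError:
            problems.append((line_number, "non-numeric yield_kg_ha"))
            continue
        if number < 0:
            problems.append((line_number, "negative yield_kg_ha"))
    return problems


problems = check_rows(SAMPLE)
for line_number, problem in problems:
    print(f"line {line_number}: {problem}")
```

Saving the printed report alongside the dataset is one simple way to document that the quality assurance step was carried out.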

Data lifecycle technical infrastructure

Describe the end to end infrastructure in place to support data management (e.g. institutional repositories, cloud storage, data analysis tools). It might help to refer to the data lifecycle diagram in the recipe and address each step in turn.

 

Things to consider:

 

  • For each step in the data lifecycle, what infrastructure is being used?
  • Which tools are recommended by CGIAR and could more be used?
  • Is each tool fit for purpose?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 3, 5 and 8 will help you with answering this Implementation question.

Outline measures in place for data protection throughout the data lifecycle.

 

Things to consider:

 

  • What measures will you implement to protect your data?
  • How will personal and/or sensitive data be handled to ensure it is collected and stored correctly?
  • What access control measures will you use to restrict access to authorized personnel and collaborators?

Referring to your answers in your DMAP for Direction Setting question 3 will help you with answering this Implementation question.

Describe the planned use of access control for project data to manage different levels of sensitivity.

Things to consider:

 

  • What are the different levels of sensitivity in your data and what levels of access are there to each?
  • How will you manage the different access levels to your data?
  • How will you monitor access to your data?

Referring to your answers in your DMAP for Direction Setting question 4 will help you with answering this Implementation question.

Detail the backup and recovery procedures in place.

 

Things to consider:

 

  • Where will data be backed up to and what is the retrieval process in the case of an incident?
  • Which data will be backed up and how long will it be backed up for?
  • Who is responsible for completing the backup procedure, and how often will they do it? Is this a manual or an automated process, and does it need verifying?
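
Whether a backup is manual or automated, it is worth verifying that the copy matches the original. The sketch below shows one common approach, comparing SHA-256 checksums after copying a file. It is an illustration under simple assumptions (a single local file copied to a local folder); the file name and folder layout are hypothetical, and institutional or cloud backup services usually provide their own verification mechanisms.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy a file into backup_dir and confirm the copy is identical."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} failed verification")
    return target


# Demonstration in a temporary directory with a hypothetical file.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "survey.csv"
    original.write_text("id,value\n1,42\n")
    copy = backup_and_verify(original, Path(workdir) / "backup")
    verified = sha256_of(original) == sha256_of(copy)

print("Backup verified:", verified)
```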

Describe the strategies and infrastructure in place to preserve data beyond the project lifecycle.

Things to consider:

 

  • How will you ensure that data remains accessible in the long term? Specify the retention period.
  • What strategies are in place for the long-term preservation of data? How will you go about ensuring its tenure?
  • Who will be responsible for maintaining access to the data after the project ends, including reviewing and updating retention policies?

Ethics

Outline any ethical issues that have been identified and need managing throughout the duration of the investment.

Things to consider:

 

  • What ethical issues have been identified? If you’re unsure what ethical issues might be present, see the ‘Resources’ section of the recipe, which discusses the use of a data ethics canvas.
  • Have you obtained approval from relevant ethics committees if applicable?

Referring to your answers in your DMAP for Direction Setting question 10 will help you with answering this Implementation question.

Describe how informed consent will be obtained if applicable to this project.

 

Things to consider:

 

  • How will you ensure participants are fully informed about the study and data use?
  • What methods will you use to obtain and document informed consent?
  • How will you handle situations where participants withdraw consent?

Outline the approach to anonymizing data where applicable.

Things to consider:

 

  • Which data needs to be anonymized, and where will this be recorded?
  • What process will ensure that the specified data is anonymized?

Referring to your answers in your DMAP for Direction Setting question 3 will help you with answering this Implementation question.
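
As one possible starting point, the sketch below removes direct identifiers from a record and replaces them with a keyed token so that records belonging to the same person can still be linked. This is pseudonymization rather than full anonymization: remaining fields such as location or ethnicity may still allow re-identification when combined, so further steps may be needed. The column names, the sample record and the key are hypothetical, and the key is assumed to be stored securely and separately from the data.

```python
import hashlib
import hmac

# Assumption: in practice the key is kept securely, separate from the data.
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Hypothetical column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "phone", "address"}


def pseudonymize(record, id_field="name"):
    """Drop direct identifiers and add a keyed token derived from id_field."""
    token = hmac.new(
        SECRET_KEY, record[id_field].encode("utf-8"), hashlib.sha256
    ).hexdigest()[:16]
    cleaned = {
        key: value for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS
    }
    cleaned["participant_token"] = token
    return cleaned


# Hypothetical record, for illustration only.
row = {"name": "Example Farmer", "phone": "000-0000", "crop": "maize", "yield_kg_ha": 3200}
safe_row = pseudonymize(row)
print(sorted(safe_row))  # ['crop', 'participant_token', 'yield_kg_ha']
```

Recording which fields were removed, and where the key is held, addresses the question of where the anonymization decision is documented.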

Outline your approach to managing risk for your sensitive and personal data. More guidance on this can be found in the ‘Resources’ section of the recipe.

Things to consider:

 

  • Data minimization: is any extra data being collected that is unlikely to be used? If there is no need to collect the data, it should not be collected.
  • What measures will you take to protect confidential data? For instance, anonymization techniques can retain the utility of the data while reducing the risk.
  • What will your retention policy be for personal/sensitive data?

Data sharing

Briefly outline your data sharing agreement (if applicable) and the parties to it, along with any restrictions, constraints or challenges observed.

 

Things to consider:

 

  • Will you need to put one or more Data Sharing Agreements in place?
  • If so, with whom and regarding which data?
  • What challenges, if any, are there regarding the sharing agreement or the data shared, and how will those be managed?

Referring to your answers in your DMAP for Direction Setting questions 4, 6 and 7 will help you with answering this Implementation question.

Compliance and support

Describe how you will monitor compliance with the Data Management and Access Plan throughout the investment.

 

Things to consider:

 

  • How will you monitor compliance with your data management plan?
  • What procedures will you implement for regular audits?
  • How will you report and address compliance issues?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 4, 5, 7 and 10 will help you with answering this Implementation question.

Describe how you will track the project’s FAIR potential maturity level throughout the investment, using the FAIR Potential tool.

Things to consider:

 

  • What will be the process, triggers or cadence for reassessing the project’s FAIR maturity level using the FAIR Potential tool?
  • Will you keep a record of each assessment? If so, where will the records be kept, and alongside what documentation?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 3, 5, 7, and 9 will help you with answering this Implementation question.

Describe how you will monitor the FAIRness of the project’s data assets through the use of the FAIR Data Assessment Tool.

Things to consider:

 

  • What will be the process for monitoring and assessing the project’s data assets using the FAIR Data Assessment Tool?
  • What will you do with the information received from the tool (e.g. record it, act on it)?
  • Will you create any documentation to go alongside the assessments, and where will this be kept?

Outline any skill gaps within the team related to the data lifecycle and how these will be addressed.

Things to consider:

 

  • Are there any gaps in skills that can’t be covered by the current project team?
  • How will these gaps be addressed?
  • How will you support any necessary learning and skill development throughout the project?

Referring to your answers in your DMAP for Direction Setting questions 1, 2, 3, 4, 5, 7, 9 and 10 will help you with answering this Implementation question.

Outline the mechanisms for supporting the use of the Data Management and Access Plan, including training for those who need it.

Things to consider:

 

  • What mechanisms will you implement to enforce the data management plan?
  • What are the consequences for non-compliance?

Technical support

Below are some additional resources that may be useful when answering technical questions regarding metadata and standards
