Federal agencies now require more formal data management plans than in the past. These agencies include:
- Office of Science and Technology Policy - The White House
- National Science Foundation (NSF) Data Management Plan Requirements
- NASA Data Management Plan Guidance
- National Institutes of Health - NIH
This page is designed to assist PIs in developing data management plans to support successful proposals. We welcome comments and feedback to help us ensure that the information here is up to date.
Information provided here includes:
- available data repositories (predominantly those on campus),
- contacts who can provide help to a PI to identify project-specific needs for data storage,
- frequently asked questions about data storage, and
- advice on how to support someone who is helping to develop the data management plan and/or responsible for the data management during the life of the project.
In many cases, it also helps to get an example data management plan (from a successfully funded proposal!).
California Digital Library's DMP Tool
Additional online resources are available, such as the California Digital Library's DMP Tool, which lets you select the specific solicitation and then walks you through each section of the solicitation's requirements: https://cdlib.org/services/uc3/dmptool/
eScholarship, University of California, provides a "Primer on Data Management: What you always wanted to know".
Penn State's resources for data management, data sharing and archiving
- DataCommons@PSU
- Office of the Vice President for Research at Penn State
- ScholarSphere
- University Libraries Research Data Management Services
- Through its Research Data Management Team, the University Libraries offers researchers direct support with data management in the form of consultations, workshops, information sessions, guidance on tools and repositories, and help writing data management plans. Researchers can request assistance by emailing the Research Data Management Team or contacting the STEM Data Management Librarian, Briana Ezray.
In general, the Data Commons is designed for large scientific databases and final archiving. It includes an extensive search capability. ScholarSphere is designed for more variable data (and may include visual archives) but is not optimized for large, searchable databases. Consult these links for more information on each option to determine the optimal location for storage and management of your database.
Questionnaire to guide development of a Data Management Plan
The following questions were provided by Penn State's "DataCommons@PSU". They are intended to assist in the development of data plans for management of final (archival) databases. (PDF copy of Data Plan Questionnaire)
- Project Timeline -- What is the proposed length of the project and expected completion date?
- Data Type(s) -- What data types will be produced by this project? For example, tabular, spatial, large scale databases, images, videos, etc.
- Software Platform -- What software platform(s) will you and your project partners be using to enter, create, and analyze the data?
- Data Format(s) -- What will be the resulting data format(s)? For example, xls, shp, pdf, tiff, jpg, netcdf, etc.
- Data Standards -- Do you have any scientific or research community standards by which you must create your data or metadata? Have you determined standard platforms and formats for all project members who will be creating data? Who will be creating the metadata? Do you need support for this?
- Data Users -- Who will be the primary users of this data during the project life cycle and where will they be located—at PSU or will they be remote users? Will these users require login access?
- Data Input -- Who will be responsible for the data? For example, this person would be tracking the latest version of the database, data set, etc. Who will be inputting data? Will they be located at PSU or be entering data remotely?
- Expected Storage Needs -- How much data do you expect to generate during the life cycle of this project? What do you expect to be your yearly storage needs?
- Additional Data & Models -- What additional data might you need for this project? Are you using any models or analysis techniques that could be documented? Will users be accessing these results or running analyses remotely?
- Expected Access Needs -- What access needs will be required at the completion of this project?
*Note that investigators can request a DOI in advance of publication (likely at review stage) and include the dataset link in their publication(s). This is a powerful way to disseminate your data.
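As a worked example for the Expected Storage Needs question, the sketch below turns a few rough inputs into yearly and total storage figures. The function name, its parameters, and the sample numbers are all hypothetical and are not part of the DataCommons@PSU questionnaire; substitute your own project estimates.

```python
def estimate_storage_gb(files_per_year, avg_file_size_mb, project_years, copies=1):
    """Return (yearly_gb, total_gb) for a project.

    All inputs are illustrative assumptions, not repository requirements.
    `copies` is how many copies of each file are kept
    (for example, 2 = the original plus one backup).
    """
    if min(files_per_year, avg_file_size_mb, project_years, copies) < 0:
        raise ValueError("inputs must be non-negative")
    # Decimal units: 1 GB = 1000 MB.
    yearly_gb = files_per_year * avg_file_size_mb * copies / 1000
    total_gb = yearly_gb * project_years
    return yearly_gb, total_gb


# Hypothetical project: 2000 files per year at about 50 MB each,
# kept in two copies, over a three-year project.
yearly, total = estimate_storage_gb(
    files_per_year=2000, avg_file_size_mb=50, project_years=3, copies=2
)
print(f"Yearly: {yearly:.0f} GB, total: {total:.0f} GB")
```

An estimate like this is only a starting point for the conversation with a repository contact; data types with very uneven file sizes (video, model output) may need a separate line for each type.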
Questionnaire for generating metadata for the DataCommons@PSU
This questionnaire is designed to assist in the creation of metadata for your data set. It should capture the basics of "Who, What, When, Where, How" for the data. (PDF copy of Data Plan Questionnaire)
- Primary Contact -- Please provide contact information for the primary data contact if possible. This would be the person who knows the most about the data.
- Title of Data Set -- Please list your preferred title for your data set.
- Date of final version of data set -- Please list the date of the final version of the data set. For example, if the last update to the data was in 2012, you would list 2012. For a data set that spans multiple years, you can list this as a range of dates, for example 2002-2012. If this is data that is updated, you can list the month and year of the latest update.
- Abstract of purpose and method -- Please describe the data as well as the purpose of the data. You can use the description or abstract from your project or research proposal for this.
- Researcher(s) -- Please list the researchers on the project. You can include each researcher's role (for example, Principal Investigator, Co-Principal Investigator, Senior Personnel) if you would like. Please also include your preferred top level affiliation. For example, if you are with a center such as the Center for Informatics, or an institute, college or department, you can list that as your top level affiliation. You may also include your collaborators at other institutions.
- Data Access -- If your data will be downloadable via FTP, just include the word FTP in this section. If you have other needs such as data that can be geoenabled (visualized in a GIS), please let us know.
- Additional Information -- Is there any additional information about your data? A website you want us to link to?
- References -- Include the titles and journal information for any publications related to this data that you would like us to list.
- DOI -- If you would like a DOI for your data or if you have a DOI for a publication and would like this included in the metadata, please let us know.
- Supplementary Materials -- Do you have any additional material you would like to include with this data? For example, a final report, presentation, or data dictionary.
- Additional Points of Access -- Are there any existing or required access points for your data such as a specialized data repository for your field?
- Metadata* -- Descriptive information to aid in designing the data management structure.
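The answers to this questionnaire can be collected in a simple structured record before they are sent to the repository. The sketch below is one possible way to do that in Python. The class name, the field names, and the choice of which fields count as "required" are assumptions made for illustration; they mirror the questions above and are not an official DataCommons@PSU schema. The sample values are placeholders.

```python
import json
from dataclasses import dataclass, field, asdict


def format_final_date(start_year, end_year=None):
    """Format the date of the final version as one year or a range of years."""
    if end_year is None or end_year == start_year:
        return str(start_year)
    if end_year < start_year:
        raise ValueError("end_year must not be earlier than start_year")
    return f"{start_year}-{end_year}"


@dataclass
class DatasetMetadata:
    # Hypothetical field names that follow the questionnaire items.
    primary_contact: str
    title: str
    final_version_date: str
    abstract: str
    researchers: list = field(default_factory=list)
    data_access: str = ""
    additional_information: str = ""
    references: list = field(default_factory=list)
    doi: str = ""
    supplementary_materials: list = field(default_factory=list)
    additional_access_points: list = field(default_factory=list)

    def missing_required(self):
        """Return the names of core fields left blank (an assumed core set)."""
        required = ("primary_contact", "title", "final_version_date", "abstract")
        return [name for name in required if not getattr(self, name).strip()]


# Placeholder values only; replace with your own answers.
record = DatasetMetadata(
    primary_contact="Jane Doe (placeholder)",
    title="Example data set title",
    final_version_date=format_final_date(2002, 2012),
    abstract="Placeholder abstract describing the data and its purpose.",
    researchers=["Jane Doe, Principal Investigator"],
    data_access="FTP",
)
print(json.dumps(asdict(record), indent=2))
```

Calling `record.missing_required()` before sending the record gives a quick check that the basic "Who, What, When" answers are filled in.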
Please email Bernd J. Haupt (bjhaupt@psu.edu) if you have suggestions for updates or additions to this information.