Readying Your Data Deposit
Describing Your Data
Describe your dataset so that it is discoverable in the repository. The Data Deposit Form will guide you through what you need to include. At a minimum, you will need to provide a title, an author list, and a description.
Tips for metadata:
Title: For data underlying a publication, use "Data from: Title of Publication (Article, Monograph, Report, etc.)". Example: Data and scripts from: Clustering and assembly dynamics of a one-dimensional microphase former
For data associated with a larger study or project, use "Name of Study/Project, Data Details (Time Range, Location, Other Descriptive Information)". Example: IPHEx-Southern Appalachian Mountains -- Rainfall Data 2008-2014
Creator: Include the individuals who were involved in creating or authoring the data. These individuals will be listed in the data citation. A separate contributor field is available for listing individuals who contributed to the dataset but should not be included as creators or authors.
Description: For data underlying a publication, including the article abstract is sufficient. Other information may include study aims, methodology details, and other contextual details such as programs, software, or equipment used. Learn more about our metadata fields for research data.
Formatting Your Data
Save your files in a preservation-friendly format. Proprietary formats can cause problems with long-term preservation as software platforms change and previous versions become obsolete. If you wish to include a proprietary format, you may do so, but full preservation can only be assured for sustainable file formats. If you aren't sure whether there is a preservation-friendly format for your files, or you are unable to uncouple your data from proprietary software, contact email@example.com.
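As a rough pre-deposit check, the short Python sketch below lists files whose extensions suggest a proprietary format. The extension mapping and the function name flag_proprietary_files are illustrative assumptions, not the repository's official list of sustainable formats; adjust them to your own data.

```python
from pathlib import Path

# Illustrative mapping only (an assumption, not an official repository list):
# proprietary extension -> a commonly used open alternative.
SUGGESTED_ALTERNATIVES = {
    ".xlsx": ".csv",
    ".xls": ".csv",
    ".docx": ".pdf or .txt",
    ".doc": ".pdf or .txt",
    ".sav": ".csv",
}


def flag_proprietary_files(deposit_dir):
    """Return (relative path, suggested alternative) pairs for flagged files."""
    root = Path(deposit_dir)
    flagged = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Compare extensions case-insensitively (.XLSX and .xlsx match).
            suggestion = SUGGESTED_ALTERNATIVES.get(path.suffix.lower())
            if suggestion is not None:
                flagged.append((path.relative_to(root).as_posix(), suggestion))
    return flagged
```

Calling flag_proprietary_files("my_deposit") on a hypothetical folder returns an empty list when nothing matches the mapping. A match is only a prompt to consider converting the file, not proof that the file cannot be deposited.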
Documenting Your Data
Provide appropriate documentation that clearly describes your data so that others can interpret and use your files. The documentation may include README files, data dictionaries, codebooks, instruments, user manuals, and/or fully commented code. Keep in mind that someone else will need to understand how to use your data, and they will not know the nuances of your file names, labels, data values, etc. without guidance. Need help getting started creating documentation? See this Cornell guide for writing README files.
Pro Tip: To enhance reproducibility, be sure to include the name of the software (and version) you used to collect and analyze your data within your documentation.
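If your analysis was done in Python, one way to capture software names and versions is to generate them rather than type them by hand. The sketch below is a minimal example using only the standard library; the function name describe_environment and the package names you pass in are assumptions for illustration.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return text listing the Python version, OS, and package versions."""
    lines = [
        f"Python {platform.python_version()}",
        f"Operating system: {platform.system()} {platform.release()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            version = "not installed"
        lines.append(f"{name} {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Replace these names with the packages your analysis actually used.
    print(describe_environment(["numpy", "pandas"]))
```

The printed text can be pasted into the software section of your README. Software used outside Python (instruments, statistical packages, and so on) still needs to be recorded by hand.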
Organizing Your Data
Organize your files in a logical order, but do not use nested folders (folders within folders). For example, place all code in a folder called "code", with no additional subfolders inside that "code" folder. Regardless of the structure, we ask that you include a README file (see above) that describes what you are depositing.
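The layout guidance above can be checked before you submit. The following Python sketch reports folders nested inside other folders and a missing README; it assumes the README sits at the top level of the deposit and has a name beginning with "README", which is a convention chosen for this example rather than a stated repository requirement.

```python
from pathlib import Path


def check_deposit_layout(deposit_dir):
    """Return a list of layout problems found in a deposit folder."""
    root = Path(deposit_dir)
    problems = []
    # A folder directly under the deposit root is fine; a folder inside
    # another folder counts as nested.
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if path.is_dir() and len(relative.parts) > 1:
            problems.append(f"nested folder: {relative.as_posix()}")
    # Assumption: the README lives at the top level of the deposit.
    has_readme = any(
        p.is_file() and p.name.lower().startswith("readme")
        for p in root.iterdir()
    )
    if not has_readme:
        problems.append("no README file at the top level")
    return problems
```

An empty list means the folder passed both checks. For a deposit containing code/utils/ and no README, the function reports the nested folder and the missing README as two separate entries.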
Licensing Your Data
Decide on the appropriate license for your data. Datasets submitted to the DDR are assigned a CC0 public domain dedication by default. The DDR strongly recommends the use of a CC0 waiver to encourage the broadest reuse of the data, and expects users of data within the DDR to follow growing scholarly best practice by properly attributing and citing data producers. If CC0 is not appropriate for your data, you can elect to apply an alternative Creative Commons license during the submission process.
Leave Files Uncompressed
Compressed files cannot receive the same level of preservation as uncompressed files. There may be situations where providing a zipped file is preferable in order to maintain a more complex file hierarchy or reduce the size of your deposit. We can accommodate these deposits but suggest only compressing files when necessary.
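To see which files in a deposit folder are compressed, a small Python sketch like the one below can help. The set of extensions is an assumption covering common archive types and is not exhaustive; the function name find_compressed_files is likewise hypothetical.

```python
from pathlib import Path

# Common archive and compression extensions (illustrative, not exhaustive).
COMPRESSED_SUFFIXES = {".zip", ".gz", ".bz2", ".xz", ".7z", ".rar", ".tgz"}


def find_compressed_files(deposit_dir):
    """Return relative paths of files that look compressed, by extension."""
    root = Path(deposit_dir)
    return [
        path.relative_to(root).as_posix()
        for path in sorted(root.rglob("*"))
        # Only the final extension is checked, so data.tar.gz matches on .gz.
        if path.is_file() and path.suffix.lower() in COMPRESSED_SUFFIXES
    ]
```

Each file it lists is a candidate for unpacking before deposit, unless compression is needed to keep a complex hierarchy or reduce the deposit size.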
If you have any questions about how to prepare your data for the deposit, we are happy to help! Contact us at firstname.lastname@example.org with your question or to set up an appointment.