Written by the PRC Data and Methods Core
Unless you’ve been living under a rock, you’ve probably heard that there is increasing emphasis on data transparency and collaboration in the academic and scientific community. More funders and journals now require data sharing. Sharing data also has benefits of its own, such as a DOI for your data and other products that you can cite in grant applications. This article is meant to ease your fears a bit by outlining some key points to consider and discuss with your team.
What is considered “data”?
Data sharing is making research data available to others for reuse or further analysis. It involves sharing datasets, responses, observations, or other forms of data collected during the research process. Data sharing promotes transparency and scientific collaboration, and it allows other researchers to validate or build upon existing findings, thereby advancing knowledge in the field (1, 2).
- Data/dataset; this includes, but is not limited to:
  - Survey data
  - Textual data – transcribed interviews, focus group discussions
  - Observational data
  - Biometric data
- Tools or set of instructions for analyzing the data (to promote transparency, facilitate collaboration, and enable others to build upon existing work):
  - Analytical code / analysis script
  - Codebook
When do I share my data?
Ideally, talk with your team while you are still in the planning phase of your research. Data that is well documented, organized, and planned for sharing from the start (proper consents, cleaning procedures, etc.) will always be the easiest to share.
If that’s not possible, you can share your data at whatever point makes sense for your project. For example, you may be required to share the dataset used in the analysis for a publication. In this case, you would likely want your dataset to be available when the publication comes out. For those who wish to publish their entire project’s data (not just a subset for analysis), some repositories track versions as you make edits, meaning you can share at any time and continue to update.
Where do I start?
Start by thinking through the following with your team and possibly a data sharing expert (Becker offers consultations):
- Ethical Considerations:
  - Audience (who will the data be shared with?)
  - Privacy concerns and confidentiality
  - Informed consent from participants
- Legal and Institutional Policies:
  - Copyright and intellectual property rights
  - Institutional guidelines and data sharing policies
- Data Sensitivity:
  - Handling sensitive or proprietary data
  - Anonymization and data de-identification techniques
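As one illustration of the de-identification item above, here is a minimal Python sketch that drops direct identifiers and replaces a participant ID with a salted hash. The column names (`participant_id`, `name`, `email`, `phone`) and the helper functions are hypothetical, not part of any repository's requirements. Treat this as pseudonymization only: indirect identifiers (for example, rare combinations of age and location) still need a separate review with your team.

```python
import hashlib
import secrets

# Hypothetical column names; replace with the direct identifiers in your own data.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize_id(participant_id: str, salt: str) -> str:
    """Return a stable 12-character code derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + participant_id).encode("utf-8")).hexdigest()
    return digest[:12]


def deidentify(rows, salt, id_field="participant_id"):
    """Drop direct identifier columns and replace the ID field with a pseudonym."""
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        kept[id_field] = pseudonymize_id(str(row[id_field]), salt)
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    # Assumption: the salt is stored securely and never shared with the data.
    salt = secrets.token_hex(16)
    rows = [
        {"participant_id": "P001", "name": "Ada", "email": "ada@example.org",
         "phone": "555-0100", "age_group": "30-39"},
        {"participant_id": "P002", "name": "Ben", "email": "ben@example.org",
         "phone": "555-0101", "age_group": "40-49"},
    ]
    for row in deidentify(rows, salt):
        print(row)
```

Using the same salt for every file in a project keeps the pseudonymous IDs consistent across datasets, so records can still be linked without exposing the original IDs.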
![](https://prcstl.wustl.edu/files/2025/01/image.png)
Figure Source: Data Management in Large-Scale Education Research (Figure 16.3 A series of decisions to make when deciding where to share data.)
Where do I share my data?
Deciding where to share your data has important implications. Some fields have discipline-specific repositories that serve as the standard; public health research does not have a single one. Below we offer some general-purpose data repositories that colleagues at PRC have used before. Becker also offers more guidance on choosing a repository. Remember to read and understand all repository requirements and policies.
| Repository Name | Notes | Examples of shared data/projects* |
| --- | --- | --- |
| digitalcommonsDATA@becker (This is for datasets, data documentation, and supporting materials) | A dataset could be made “restricted” to meet requirements while supporting docs can be “open access”. They would have different DOIs, but could be connected to one another. The major benefit is having one-on-one help dedicated to helping you get your data up. WashU is pushing to expand the collection; the larger it gets, the more useful it will be. | Ross & Renee (PRC): Qual study: https://digitalcommonsdata.wustl.edu/datasets/t2kc5m67ys/1, data supporting materials: https://digitalcommons.wustl.edu/data/5/ Quan study: https://digitalcommonsdata.wustl.edu/datasets/n8r43gf2dm/1, data supporting materials: https://digitalcommonsdata.wustl.edu/datasets/6z94cbyt2r/1 |
| ICPSR | You can search pubs that have used specific datasets. ICPSR QDS (Qualitative) toolkit, developed in part here at WashU: https://www.icpsr.umich.edu/web/ICPSR/series/1780 https://qdstoolkit.org/publications/#datasets | Vetta Sanders Thompson (WashU): https://www.icpsr.umich.edu/web/ICPSR/studies/38493/summary LHD profile: https://www.icpsr.umich.edu/web/ICPSR/studies/37144 Qual CRC study: https://www.icpsr.umich.edu/web/ICPSR/studies/38312/summary |
| OSF | WashU has institutional membership, learn more here | Rebekah (PRC): https://osf.io/ap3tk/ Example project template: https://osf.io/59gte/ |
*Please let us know if your project has an example to share!
How do I prepare my data to share?
- Ensure data quality and completeness.
- Follow metadata standards and documentation practices.
- Use data formats that support long-term usability.
- Conduct a disclosure risk assessment (this is on YOU, not the repository!).
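To make the completeness and documentation steps above more concrete, here is a minimal Python sketch that reads a CSV file, counts missing values per variable, and drafts a starter codebook. The function name `summarize_columns`, the sample columns, and the codebook layout are all assumptions for illustration; a repository or metadata standard may expect a different structure, and the `description` field still has to be filled in by hand.

```python
import csv
import io
import sys


def summarize_columns(rows):
    """Build one codebook-style entry per column from a list of CSV row dicts."""
    if not rows:
        return []
    summary = []
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        # Treat empty strings and absent fields (None) as missing.
        present = [v for v in values if v is not None and v.strip() != ""]
        distinct = sorted(set(present))
        summary.append({
            "variable": column,
            "n_missing": len(values) - len(present),
            "n_distinct": len(distinct),
            "example_values": "; ".join(distinct[:5]),
            "description": "",  # to be written by the research team
        })
    return summary


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real study file.
    sample = io.StringIO(
        "participant_id,age_group,county\n"
        "a1,30-39,A\n"
        "b2,,B\n"
        "c3,30-39,\n"
    )
    rows = list(csv.DictReader(sample))
    codebook = summarize_columns(rows)
    writer = csv.DictWriter(sys.stdout, fieldnames=list(codebook[0].keys()))
    writer.writeheader()
    writer.writerows(codebook)
```

Writing the draft codebook as CSV keeps it in a plain, non-proprietary format that can be shared alongside the dataset.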
Where do I go if I need some help?
WashU contact info: https://becker.wustl.edu/services/data-management-and-sharing/data-management-and-sharing-consultation-request/
Data Management and Sharing at WashU (DMS@WashU): https://beckerdms.wustl.edu/
Crystal Lewis’s book, Data Management in Large-Scale Education Research, is an excellent resource. Chapter 16 specifically covers data sharing and is freely available online. Crystal is a former data manager at an academic research center.
PRC folks who have experience sharing their data!
Please contact us if you would like us to link to your shared data on our PRC Website!
References:
2. NIH: National Institutes of Health (2023). NIH Data Sharing Policies.