At the beginning of the project, an anonymous online questionnaire containing more than 40 questions on demographics, occupational capacity and reproducibility was circulated to collect feedback from the project participants on the state of reproducibility in their day-to-day experience.
The questionnaire asked about their level of working experience, familiarity with source control systems, the backup and restore procedures they use to safeguard their work, their coding experience, the operating systems they use, their familiarity with virtual machines and with existing reproducibility frameworks, and their involvement in reproducing other papers or in being asked for help with reproducing their own work.
We received 39 completed responses over a period of a couple of months. A quick overview of the demographics revealed that 80% of the respondents were male and 18% female. The participants' ages were roughly split across the 25-30, 31-35, 35-40 and 41-45 age groups. One third of the respondents had been doing science for 4-8 years, another third for 9-15 years and the final third for more than 15 years. Almost half of them were research fellows and close to 20% were professors.
Version control usage
56% of the respondents were actively using a source code repository to store their work. The most popular version control software was git, followed by svn. As for safeguarding their work, 62% responded that they keep personal backups on external hard disks, 46% said they use free online services to store their backups, and another 41% said that they trust the backups already offered by their institution.
Survey respondents attempting to reproduce other papers
In the part of the questionnaire on reproducibility, more than 80% responded that they had recreated work described in another paper. 62% did so because the methods used there were interesting and they wanted to know more about them, 60% because they wanted to extend the work presented there and produce further results, and 33% because the results presented there seemed suspicious and they wanted to verify their validity.
These reproducibility attempts involved 1-5 papers for 70% of the respondents, 6-15 papers for 20%, and more than 15 papers for the remaining 10%. For each reproduced paper, 73% claimed that they would do anything in their power to recreate another paper if it was important to their work, whereas 15% claimed that they would spend up to 2-3 days trying to recreate it. 10% claimed that they were willing to spend up to 1 week trying to get successful results.
97% would first ask colleagues for assistance in their attempts to recreate other papers, while 60% would contact the original author(s) directly to ask for clarifications. Of the participants who actually contacted the original authors, 68% mentioned that the author was helpful, 21% claimed that the author's reply was ambiguous and not clear enough, while 5% said that they never heard back from the original author.
Survey respondents asked for assistance in reproducing their own papers
51% of the participants claimed that they had been contacted for help in recreating their own work: 43% of them only once, and 24% of them more than 10 separate times. 20% of those contacted refused to reply. Of those who replied, 47% spent only a few hours assisting, while another 43% had communications that lasted several weeks until the issue was resolved. Of the participants who did not respond to the requests for help, 60% mentioned that the requests were very trivial, 40% mentioned that they lacked free time at the time of the request, while other respondents mentioned that the requests were unclear or insulting in nature, or that the work was partly covered by copyright or was confidential.
37% of the respondents said that they were willing to spend up to a couple of extra days to make sure that their work is reproducible, 24% that they were willing to spend up to a few hours only, while 26% would allocate up to one week. The majority of the respondents (78%) said that they would like to have a reproducibility workflow embedded in their work, and only 10% claimed otherwise. Smaller percentages said that it depends on the nature or the complexity of the paper.
Of the people who wanted to make their work reproducible, 54% attempted to provide a reproducibility workflow of their own. The remaining 46% tried to find an already established reproducibility workflow online. 56% of those who looked online were not able to find a clear and easy-to-follow reproducibility workflow. Of the 44% who actively searched online for solutions, 46% ended up just being a little more descriptive in the explanations provided in their paper, 25% ended up asking colleagues what they use to provide reproducibility in their papers, while 21% gave up, unable to find anything actionable.
From the reproducibility survey we concluded that reproducibility is a desirable trait in a scientific paper, although the process for achieving it is not clearly defined. Authors seem to realize the importance of reproducibility, but the difficulty of finding easy ways to achieve it and the limited amount of time they devote to it often lead them to drop that feature from their papers.