heRcules
A REPOSITORY WITH SCRIPTS FOR LEARNING DATA ANALYSIS IN R
Abstract
Data analysis is a crucial step in scientific projects and is central to validating and interpreting study results. Before data collection begins, researchers must plan their experiments and analyses systematically to minimize biases that could compromise the validity of the results. This document reports the creation of "heRcules", a public repository of template scripts in the R language for scientific data analysis, with a particular focus on the Biological and Health Sciences. The repository provides ready-to-use scripts for essential tasks such as experimental planning, data analysis, result visualization, and hypothesis testing. The initial model, described in this document, includes scripts for sample size calculation, statistical power calculation, spreadsheet import, creation of vectors and data frames, descriptive statistics, file export, graph creation (using both base R and ggplot2), outlier tests, normality tests, and notebook creation with R Markdown. The repository is hosted on GitHub (https://github.com/drhrf/heRcules.git), so the resources are freely available to the scientific community and open to collaboration. Beyond supporting individual researchers, the repository aims to promote transparency and reproducibility in scientific research by providing a foundation for rigorous data analyses, such as those exemplified in the current model.
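To make the listed tasks concrete, the following is a minimal sketch of how a few of them (sample size calculation, data frame creation, descriptive statistics, outlier and normality checks, file export) might look in base R. It is not taken from the heRcules repository: the effect size, the simulated data, the object names, and the output file name are all assumptions chosen for illustration.

```r
# Hypothetical illustration only; not the scripts from the heRcules repository.
set.seed(42)  # make the simulated data reproducible

# 1. Sample size for a two-sample t-test
#    (assumed: difference of 1 SD, alpha = 0.05, power = 80%)
plan <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.80)
n_per_group <- ceiling(plan$n)

# 2. Simulated data frame standing in for an imported spreadsheet
dat <- data.frame(
  group = rep(c("control", "treated"), each = n_per_group),
  value = c(rnorm(n_per_group, mean = 10, sd = 1),
            rnorm(n_per_group, mean = 11, sd = 1))
)

# 3. Descriptive statistics (mean and standard deviation) per group
desc <- aggregate(value ~ group, data = dat,
                  FUN = function(x) c(mean = mean(x), sd = sd(x)))

# 4. Outlier screen on the pooled values (boxplot rule, 1.5 x IQR)
#    and Shapiro-Wilk normality test p-value per group
outliers  <- boxplot.stats(dat$value)$out
normality <- tapply(dat$value, dat$group,
                    function(x) shapiro.test(x)$p.value)

# 5. Export the data to a CSV file in a temporary directory
out_file <- file.path(tempdir(), "hercules_example.csv")
write.csv(dat, out_file, row.names = FALSE)
```

Printing `plan`, `desc`, and `normality` shows the planning result, the group summaries, and the normality p-values. The sketch uses only functions shipped with R so that it runs without installing packages; the repository itself may rely on other packages for these steps.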
Edutec publishes under the Creative Commons Attribution 4.0 International license (CC BY 4.0). It believes in the importance of the open access movement in scientific journals, such as the Open Archives Initiative.
Funding data
- Conselho Nacional de Desenvolvimento Científico e Tecnológico, grant number 152071/2020-2












