R is an open-source software environment and language for data analysis and statistical computing. A great deal has already been said and written about R’s wide variety of graphical and statistical techniques. Although R is steadily growing in popularity in the pharmaceutical industry, it is still rarely used in the preliminary stages of research: importing data from different sources, tidying it, calculating new variables in datasets, and making the other amendments typically handled in SAS data steps.
This blog post explores how R can be useful as a tool for programming datasets and compares R and SAS in terms of performance and ease of use. It also looks at the reliability of the packages (from the CRAN repository, Bioconductor and other sources) that one can use when creating and modifying datasets in R.