The statistical programming language R is often underrated within the pharmaceutical industry. The default is frequently to pay for expensive software when R could be a viable option. R is freely available and runs on almost all operating systems, including Unix, macOS, and Microsoft Windows.
R is an object-oriented language with similarities to C++ and FORTRAN; because of this structure, R can easily manipulate statistical results and graphics. R has a set of user-developed packages, each containing a group of functions. Thousands of well-tested functions are available that build on cutting-edge statistical research, and many journals suggest or require that R packages be made available for publication. The packages and the integration with academia mean that improvements and experimental procedures are often implemented quickly.
The lack of a separate macro language, together with the code structure needed for large datasets and complex data processing, can steepen the learning curve for R; however, R remains one of the most usable graphics tools. For exploratory graphics, few software packages match R with ggplot2 or lattice graphics for ease of use and customisation. In addition, the integration between R and other open-source tools such as LaTeX allows for easy and highly customisable reports. These advantages suggest that R should be considered more often when analysing and summarising clinical data.
A major benefit of using R is the integration with other software. Figure 1 shows an example of how R can be used as part of an integrated process for producing output and executive summaries.
Figure 1: R software integration
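As a small illustration of the reporting step in such a workflow, the sketch below summarises a dataset in R and writes the result as a LaTeX table that a report could include. It is a minimal example under stated assumptions, not the process shown in Figure 1: the built-in ToothGrowth data stands in for clinical data, and the column headings and table layout are illustrative choices.

```r
# Summarise mean tooth length by supplement and dose.
# ToothGrowth is a built-in dataset used here only as a stand-in.
summ <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)

# Build one LaTeX table row per summary row; "&" separates columns
# and "\\\\" in R source becomes the LaTeX row terminator "\\".
rows <- sprintf("%s & %.1f & %.2f \\\\",
                as.character(summ$supp), summ$dose, summ$len)

latex_table <- c(
  "\\begin{tabular}{lrr}",
  "Supplement & Dose & Mean length \\\\",
  "\\hline",
  rows,
  "\\end{tabular}"
)

# Print to the console; writeLines(latex_table, "summary.tex") would
# instead write a file (hypothetical file name) for \input{} in a report.
writeLines(latex_table)
```

Because the table is regenerated from the data each time the script runs, the report stays in step with the analysis.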
Figure 2: Graphics produced using lattice (above) and ggplot2 (below) packages
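To give a feel for how little code an exploratory plot requires, the sketch below draws the same panelled scatter plot with both packages. It is not the code behind Figure 2: the built-in ToothGrowth dataset and the axis labels are assumptions made for illustration, and the ggplot2 part only runs if that package is installed.

```r
library(lattice)  # lattice ships with standard R distributions

# lattice: the formula len ~ dose | supp requests one panel per supplement
p_lattice <- xyplot(len ~ dose | supp, data = ToothGrowth,
                    xlab = "Dose", ylab = "Tooth length")
print(p_lattice)

# ggplot2: the same display built up from layers.
# The block is skipped when ggplot2 is not installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p_ggplot <- ggplot2::ggplot(ToothGrowth,
                              ggplot2::aes(x = dose, y = len)) +
    ggplot2::geom_point() +
    ggplot2::facet_wrap(~ supp) +
    ggplot2::labs(x = "Dose", y = "Tooth length")
  print(p_ggplot)
}
```

Both plot objects can be stored, modified and printed later, which is convenient when the same figure is needed in several outputs.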
Cutting Edge Statistics at a Price:
As R is open source, there is a set of user-developed packages containing groups of functions, each one targeting an area of statistics, graphics or programming. These packages range from doBy, a package for performing grouped analysis and processing, through lars, a package that performs least angle regression and produces the lasso plots commonly used in genomics, to R2WinBUGS, a package for calling WinBUGS from within R.
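As an illustration of the grouped analysis mentioned above, the sketch below computes group means with base R and, if doBy happens to be installed, repeats the summary with its summaryBy function. The ToothGrowth dataset and the choice of grouping variables are assumptions made for the example.

```r
# Base R: mean tooth length for each supplement/dose combination
grouped <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(grouped)

# doBy: the same grouping, returning mean and standard deviation together.
# The block is skipped when doBy is not installed.
if (requireNamespace("doBy", quietly = TRUE)) {
  grouped_doby <- doBy::summaryBy(len ~ supp + dose, data = ToothGrowth,
                                  FUN = c(mean, sd))
  print(grouped_doby)
}
```

The base R version covers simple cases; a package such as doBy becomes more attractive as the number of summary statistics and grouping variables grows.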
There are also thousands of well-tested functions available from the central repository (CRAN) that build on cutting-edge statistical research; many journals suggest or require that R packages be made available prior to publication. Regular use within academia means that improvements and experimental procedures are often implemented quickly.
However, the package structure and live-updating nature of R, which is one of its major strengths, is also one of its major concerns. These concerns can often be alleviated through a well-maintained validation process and careful version control. A full discussion of the regulatory compliance and validation issues is available from www.r-project.org/doc/R-FDA.pdf.
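One simple building block for version control is to record the R and package versions used for an analysis and to check them again before the analysis is rerun. The sketch below is a minimal illustration only, not a validated procedure: the package list is an arbitrary choice, and check_versions is a hypothetical helper written for this example.

```r
# Packages the analysis depends on (illustrative; all three ship with R)
pkgs <- c("stats", "graphics", "utils")

# Look up the installed version of each package as a character string
installed_versions <- function(packages) {
  vapply(packages,
         function(p) as.character(packageVersion(p)),
         character(1), USE.NAMES = FALSE)
}

# Manifest to store alongside the analysis output
manifest <- data.frame(package = pkgs,
                       version = installed_versions(pkgs),
                       stringsAsFactors = FALSE)

# Hypothetical helper: TRUE only if every installed version
# still matches the version recorded in the manifest
check_versions <- function(manifest) {
  all(installed_versions(manifest$package) == manifest$version)
}

cat(R.version.string, "\n")
print(manifest)
stopifnot(check_versions(manifest))
```

In practice the manifest would be saved when the analysis is validated and compared against on each later run, so that an unnoticed package update stops the script rather than silently changing results.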
R should form an integral part of a suite of analysis software as it has the following strengths:
•Quick release of cutting edge methods
•Compatibility with a range of other software
•High quality graphics
The strength of R in experimental procedures and as an investigative tool is undeniable and the increasing compatibility with other analysis software is rapidly making R an attractive alternative.