Introduction

R is a free and open-source scripting language developed in 1993 by Ross Ihaka and Robert Gentleman (also known as “R & R”) of the Statistics Department of the University of Auckland. It is an alternative implementation of the S programming language, which was widely used in the 1980s for statistical computing. The R environment is designed to perform complex statistical analysis and to display the results graphically. R itself is written in C, Fortran, and R. Most R packages are written in R, but computationally heavy chunks are written in C, C++, and Fortran. R also integrates with Python, C, C++, .NET, and Fortran.

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
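As a small illustration of the point about mathematical annotation, the sketch below draws a curve and labels it with a typeset formula. It assumes only base R graphics; the output file name comes from tempfile() and is purely illustrative.

```r
# Write the plot to a temporary PDF so the sketch also runs non-interactively.
plot_file <- tempfile(fileext = ".pdf")  # illustrative output location
pdf(plot_file)

x <- seq(0.1, 10, length.out = 200)
y <- sin(x) / x

# expression() lets titles and axis labels contain typeset mathematics.
plot(x, y, type = "l",
     main = expression(y == frac(sin(x), x)),
     xlab = expression(x),
     ylab = expression(frac(sin(x), x)))

dev.off()  # close the device so the file is written to disk
```

In an interactive session the pdf() and dev.off() calls can be dropped, and the plot appears in the default graphics window.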

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and macOS.

The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes
  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
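A few of these facilities can be sketched in a handful of lines. The example below is only an illustration: the matrix values and the function name factorial_rec are made up for the purpose.

```r
# A 2 x 2 matrix, filled column by column.
m <- matrix(c(2, 1, 1, 3), nrow = 2)

m_squared <- m %*% m   # matrix product
m_inverse <- solve(m)  # matrix inverse
m_doubled <- m * 2     # element-wise arithmetic

# A user-defined recursive function built on a conditional.
# (factorial_rec is a hypothetical name; base R already offers factorial().)
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop that collects the first five factorials.
results <- numeric(5)
for (i in 1:5) {
  results[i] <- factorial_rec(i)
}
print(results)
```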

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.
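To make this concrete, here is a minimal sketch of adding a function of one's own and then inspecting a built-in one. The name coef_var and the sample values are assumptions made for the illustration.

```r
# Define a new function: the coefficient of variation of a numeric vector.
# (coef_var is a hypothetical name chosen for this illustration.)
coef_var <- function(x) {
  sd(x) / mean(x)
}

values <- c(4, 8, 15, 16, 23, 42)
print(coef_var(values))

# Many built-in functions are ordinary R code; printing the function
# object shows its source.
print(sd)
```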

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

Why R
  1. R is a leading tool for machine learning, statistics, and data analysis. Objects, functions, and packages are easy to create in R.
  2. It’s a platform-independent language, so the same code can be used on all major operating systems.
  3. It’s free and open source, so anyone can install it in any organization without purchasing a license.
  4. R is not only a statistics package; it also integrates with other languages (C, C++), so you can easily interact with many data sources and statistical packages.
  5. R has a vast community of users, and it’s growing day by day.
  6. R is currently one of the most requested programming languages in the data science job market.

Statistical features of R

Basic statistics: The most common basic statistics are the mean, median, and mode, known collectively as “measures of central tendency.” R makes them easy to compute.
Static graphics: R is rich in facilities for creating static graphics. It supports many plot types, including graphic maps, mosaic plots, and biplots.
Probability distributions: Probability distributions play a vital role in statistics, and R handles many of them directly, such as the binomial, normal, and chi-squared distributions.
Data analysis: R provides a large, coherent and integrated collection of tools for data analysis.
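The features above can be tried with a short sketch. The data are invented for illustration, and stat_mode is a hypothetical helper: base R has no built-in function for the statistical mode, so one is defined here.

```r
scores <- c(2, 3, 3, 5, 7, 3, 9)

mean_score   <- mean(scores)
median_score <- median(scores)

# Base R's mode() reports an object's storage type, not the most frequent
# value, so a small helper is defined. (stat_mode is a hypothetical name.)
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}
mode_score <- stat_mode(scores)

# Distribution functions follow a d/p/q/r naming scheme:
# density, cumulative probability, quantile, random draws.
p_binom <- dbinom(3, size = 10, prob = 0.5)  # P(X = 3) for Binomial(10, 0.5)
p_norm  <- pnorm(1.96)                       # P(Z <= 1.96), standard normal
q_chisq <- qchisq(0.95, df = 1)              # 95th percentile, chi-squared, 1 df

print(c(mean = mean_score, median = median_score, mode = mode_score))
```

When several values tie for the highest count, this helper returns the smallest of them, which is a simplification.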


Programming Features of R

R packages: One of R's major features is its wide range of libraries. CRAN (the Comprehensive R Archive Network) is a repository holding more than 10,000 packages.
Distributed computing: Distributed computing is a model in which components of a software system are shared among multiple computers to improve efficiency and performance. Two packages for distributed programming in R, ddR and multidplyr, were released in November 2015.
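Working with packages usually comes down to a few calls, sketched below. The package names are only examples, and the install step is left commented out because it needs network access to a CRAN mirror.

```r
# Packages shipped with R can be attached directly.
library(stats)

# requireNamespace() reports whether a package is installed
# without attaching it to the search path.
has_parallel <- requireNamespace("parallel", quietly = TRUE)
print(has_parallel)

# Installing from CRAN needs network access, so the call is commented out.
# "ggplot2" is only an example package name.
# install.packages("ggplot2")
```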

Programming in R

Because R's syntax resembles that of other widely used languages, it is relatively easy to learn and to code in. Programs can be written in any of the widely used IDEs, such as RStudio, Rattle, and Tinn-R. After writing the program, save the file with the extension .r. To run the program, use the following command on the command line:

 Rscript file_name.r

Example:
# R program to print Welcome to R

# Below line will print "Welcome to R!"
cat("Welcome to R!\n")

Advantages of R
  • R is among the most comprehensive statistical analysis packages, and new techniques and concepts often appear first in R.
  • R is open source, so you can install and run it anywhere without purchasing a license.
  • R is cross-platform: it runs on GNU/Linux, Windows, and macOS.
  • In R, everyone is welcome to provide new packages, bug fixes, and code enhancements.

Disadvantages of R
  • The quality of some R packages is less than perfect.
  • R commands pay little attention to memory management, so an R program may consume all available memory.
  • There is no vendor to turn to if something doesn’t work; support comes from the community.
  • R can be slower than other languages such as Python and MATLAB, particularly for code that relies on explicit loops.

Applications of R
  • R is used for data science. It offers a broad variety of libraries related to statistics and provides an environment for statistical computing and design.
  • Many quantitative analysts use R as their programming tool, since it helps with importing and cleaning data.
  • R is widely used by data analysts and research programmers, and it serves as a fundamental tool in finance.
  • Companies such as Google, Facebook, Bing, Twitter, Accenture, and Wipro use R.

Download and Install

To use the R language, you need the R environment installed on your machine and, optionally, an IDE (integrated development environment) in which to write and run code. R can also be run from CMD on Windows or from a terminal on Linux.

 RStudio is a powerful IDE built specifically for the R language.

Download R from here: http://lib.stat.cmu.edu/R/CRAN/
Download RStudio from here: https://posit.co/download/rstudio-desktop/
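
Once both are installed, a quick check from the R console (or the RStudio console) confirms that R is working. This is only a minimal sketch; the exact output depends on the installed version and platform.

```r
# Report the installed R version and the platform it was built for.
print(R.version.string)
print(R.version$platform)
```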
