R

The R Project
(https://www.r-project.org/about.html)

What is R?

Introduction to R

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.

The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
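As a rough illustration of the facilities listed above, the sketch below uses only base R. The matrix values and the function name `fact` are made up for this example and do not come from the R Project page.

```r
# A 2 x 2 matrix, filled column by column
m <- matrix(c(2, 0, 1, 3), nrow = 2)

# Operators act on whole arrays: element-wise and matrix products
doubled <- m * 2
product <- m %*% diag(2)   # multiplying by the identity matrix returns m

# A user-defined recursive function with a conditional
fact <- function(n) {
  if (n <= 1) 1 else n * fact(n - 1)
}

# A loop that accumulates results
squares <- numeric(0)
for (i in 1:3) {
  squares <- c(squares, i^2)
}

print(doubled)
print(fact(5))
print(squares)
```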

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.
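A minimal sketch of what this looks like in practice, assuming a standard R installation with the default packages loaded. The function name `cv` is hypothetical and chosen only for illustration.

```r
# Add functionality by defining a new function:
# the coefficient of variation (hypothetical name `cv`)
cv <- function(x) sd(x) / mean(x)

print(cv(c(2, 4, 6)))

# Functions written in R can be inspected by printing them;
# sd() is implemented in R, so its source is visible
print(sd)

# Primitives such as sum() are implemented in C instead
print(is.primitive(sum))
```

Printing a function this way is a quick check of how a result is computed before relying on it.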

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

Getting Started

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. To download R, please choose your preferred CRAN mirror.


R Tutorial, tutorialspoint
(http://www.tutorialspoint.com/r/)

R is a programming language and software environment for statistical analysis, graphics representation and reporting. R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team.

R is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems like Linux, Windows and Mac.

This programming language was named R, based on the first letter of the first names of its two authors (Robert Gentleman and Ross Ihaka), and partly as a play on the name of the Bell Labs language S.


R Programming, Johns Hopkins University (Coursera)
(https://www.coursera.org/learn/r-programming)

About this course: In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.

Course 2 of 10 in the Data Science Specialization.

Syllabus
WEEK 1
Week 1: Background, Getting Started, and Nuts & Bolts
This week covers the basics to get you started up with R. The Background Materials lesson contains information about course mechanics and some videos on installing R. The Week 1 videos cover the history of R and S, go over the basic data types in R, and descri… 
28 videos, 9 readings
Graded: Week 1 Quiz
WEEK 2
Week 2: Programming with R
Welcome to Week 2 of R Programming. This week, we take the gloves off, and the lectures cover key topics like control structures and functions. We also introduce the first programming assignment for the course, which is due at the end of the week. 
13 videos, 3 readings
Graded: Week 2 Quiz
Graded: Programming Assignment 1: Quiz
WEEK 3
Week 3: Loop Functions and Debugging
We have now entered the third week of R Programming, which also marks the halfway point. The lectures this week cover loop functions and the debugging tools in R. These aspects of R make R useful for both interactive work and writing longer code, and so they a… 
8 videos, 2 readings
Graded: Week 3 Quiz
Graded: Programming Assignment 2: Lexical Scoping
WEEK 4
Week 4: Simulation & Profiling
This week covers how to simulate data in R, which serves as the basis for doing simulation studies. We also cover the profiler in R which lets you collect detailed information on how your R functions are running and to identify bottlenecks that can be addresse… 
6 videos, 4 readings
Graded: Week 4 Quiz
Graded: Programming Assignment 3: Quiz

Peng, R. (2016) R Programming for Data Science, lulu.com


Data science has taken the world by storm. Every field of study and area of business has been affected as people increasingly realize the value of the incredible quantities of data being generated. But to extract value from those data, one needs to be trained in the proper data science skills. The R programming language has become the de facto programming language for data science. Its flexibility, power, sophistication, and expressiveness have made it an invaluable tool for data scientists around the world. This book is about the fundamentals of R programming. You will get started with the basics of the language, learn how to manipulate datasets, how to write functions, and how to debug and optimize code. With the fundamentals provided in this book, you will have a solid foundation on which to build your data science toolbox.

 Table of Contents

  • 1. Stay in Touch!
  • 2. Preface
  • 3. History and Overview of R
    • 3.1 What is R?
    • 3.2 What is S?
    • 3.3 The S Philosophy
    • 3.4 Back to R
    • 3.5 Basic Features of R
    • 3.6 Free Software
    • 3.7 Design of the R System
    • 3.8 Limitations of R
    • 3.9 R Resources
  • 4. Getting Started with R
    • 4.1 Installation
    • 4.2 Getting started with the R interface
  • 5. R Nuts and Bolts
    • 5.1 Entering Input
    • 5.2 Evaluation
    • 5.3 R Objects
    • 5.4 Numbers
    • 5.5 Attributes
    • 5.6 Creating Vectors
    • 5.7 Mixing Objects
    • 5.8 Explicit Coercion
    • 5.9 Matrices
    • 5.10 Lists
    • 5.11 Factors
    • 5.12 Missing Values
    • 5.13 Data Frames
    • 5.14 Names
    • 5.15 Summary
  • 6. Getting Data In and Out of R
    • 6.1 Reading and Writing Data
    • 6.2 Reading Data Files with read.table()
    • 6.3 Reading in Larger Datasets with read.table
    • 6.4 Calculating Memory Requirements for R Objects
  • 7. Using the readr Package
  • 8. Using Textual and Binary Formats for Storing Data
    • 8.1 Using dput() and dump()
    • 8.2 Binary Formats
  • 9. Interfaces to the Outside World
    • 9.1 File Connections
    • 9.2 Reading Lines of a Text File
    • 9.3 Reading From a URL Connection
  • 10. Subsetting R Objects
    • 10.1 Subsetting a Vector
    • 10.2 Subsetting a Matrix
    • 10.3 Subsetting Lists
    • 10.4 Subsetting Nested Elements of a List
    • 10.5 Extracting Multiple Elements of a List
    • 10.6 Partial Matching
    • 10.7 Removing NA Values
  • 11. Vectorized Operations
    • 11.1 Vectorized Matrix Operations
  • 12. Dates and Times
    • 12.1 Dates in R
    • 12.2 Times in R
    • 12.3 Operations on Dates and Times
    • 12.4 Summary
  • 13. Managing Data Frames with the dplyr package
    • 13.1 Data Frames
    • 13.2 The dplyr Package
    • 13.3 dplyr Grammar
    • 13.4 Installing the dplyr package
    • 13.5 select()
    • 13.6 filter()
    • 13.7 arrange()
    • 13.8 rename()
    • 13.9 mutate()
    • 13.10 group_by()
    • 13.11 %>%
    • 13.12 Summary
  • 14. Control Structures
    • 14.1 if-else
    • 14.2 for Loops
    • 14.3 Nested for loops
    • 14.4 while Loops
    • 14.5 repeat Loops
    • 14.6 next, break
    • 14.7 Summary
  • 15. Functions
    • 15.1 Functions in R
    • 15.2 Your First Function
    • 15.3 Argument Matching
    • 15.4 Lazy Evaluation
    • 15.5 The ... Argument
    • 15.6 Arguments Coming After the ... Argument
    • 15.7 Summary
  • 16. Scoping Rules of R
    • 16.1 A Diversion on Binding Values to Symbol
    • 16.2 Scoping Rules
    • 16.3 Lexical Scoping: Why Does It Matter?
    • 16.4 Lexical vs. Dynamic Scoping
    • 16.5 Application: Optimization
    • 16.6 Plotting the Likelihood
    • 16.7 Summary
  • 17. Coding Standards for R
  • 18. Loop Functions
    • 18.1 Looping on the Command Line
    • 18.2 lapply()
    • 18.3 sapply()
    • 18.4 split()
    • 18.5 Splitting a Data Frame
    • 18.6 tapply
    • 18.7 apply()
    • 18.8 Col/Row Sums and Means
    • 18.9 Other Ways to Apply
    • 18.10 mapply()
    • 18.11 Vectorizing a Function
    • 18.12 Summary
  • 19. Regular Expressions
    • 19.1 Before You Begin
    • 19.2 Primary R Functions
    • 19.3 grep()
    • 19.4 grepl()
    • 19.5 regexpr()
    • 19.6 sub() and gsub()
    • 19.7 regexec()
    • 19.8 Summary
  • 20. Debugging
    • 20.1 Something’s Wrong!
    • 20.2 Figuring Out What’s Wrong
    • 20.3 Debugging Tools in R
    • 20.4 Using traceback()
    • 20.5 Using debug()
    • 20.6 Using recover()
    • 20.7 Summary
  • 21. Profiling R Code
    • 21.1 Using system.time()
    • 21.2 Timing Longer Expressions
    • 21.3 The R Profiler
    • 21.4 Using summaryRprof()
    • 21.5 Summary
  • 22. Simulation
    • 22.1 Generating Random Numbers
    • 22.2 Setting the random number seed
    • 22.3 Simulating a Linear Model
    • 22.4 Random Sampling
    • 22.5 Summary
  • 23. Data Analysis Case Study: Changes in Fine Particle Air Pollution in the U.S.
    • 23.1 Synopsis
    • 23.2 Loading and Processing the Raw Data
    • 23.3 Results
  • 24. Parallel Computation
    • 24.1 Hidden Parallelism
    • 24.2 Embarrassing Parallelism
    • 24.3 The Parallel Package
    • 24.4 Example: Bootstrapping a Statistic
    • 24.5 Building a Socket Cluster
    • 24.6 Summary

Try R – Code School, O’Reilly
(http://tryr.codeschool.com/)

R is a tool for statistics and data modeling. The R programming language is elegant, versatile, and has a highly expressive syntax designed around working with data. R is more than that, though — it also includes extremely powerful graphics capabilities. If you want to easily manipulate your data and present it in compelling ways, R is the tool for you.

Table of Contents

  1. R Syntax: A gentle introduction to R expressions, variables, and functions
  2. Vectors: Grouping values into vectors, then doing arithmetic and graphs with them
  3. Matrices: Creating and graphing two-dimensional data sets
  4. Summary Statistics: Calculating and plotting some basic statistics: mean, median, and standard deviation
  5. Factors: Creating and plotting categorized data
  6. Data Frames: Organizing values into data frames, loading frames from files and merging them
  7. Working With Real-World Data: Testing for correlation between data sets, linear models and installing additional packages

Introduction to R for Data Science
(https://www.edx.org/course/introduction-r-data-science-microsoft-dat204x-2#!)

Learn the R statistical programming language, the lingua franca of data science, in this hands-on course.

R is rapidly becoming the leading language in data science and statistics. Today, R is the tool of choice for data science professionals in every industry and field. Whether you are a full-time number cruncher or just an occasional data analyst, R will suit your needs.

This introduction to R programming course will help you master the basics of R. In seven sections, you will cover its basic syntax, making you ready to undertake your own first data analysis using R. Starting from variables and basic operations, you will eventually learn how to handle data structures such as vectors, matrices, data frames and lists. In the final section, you will dive deeper into the graphical capabilities of R, and create your own stunning data visualizations. No prior knowledge in programming or data science is required.

What makes this course unique is that you will continuously practice your newly acquired skills through interactive in-browser coding challenges using the DataCamp platform. Instead of passively watching videos, you will solve real data problems while receiving instant and personalized feedback that guides you to the correct solution.

What you’ll learn

  • Introductory R language fundamentals and basic syntax
  • What R is and how it’s used to perform data analysis
  • Become familiar with the major R data structures
  • Create your own visualizations using R

Course Syllabus

  • Part 1: Introduction
  • Part 2: Getting your data into R
  • Part 3: Easy ways to do basic data analysis
  • Part 4: Painless data visualization
  • Part 5: Syntax quirks you’ll want to know
  • Part 6: Useful resources

Relatively high-profile users of R include:

    Facebook: Used by some within the company for tasks such as analyzing user behavior.

    Google: There are more than 500 R users at Google, according to David Smith at Revolution Analytics, doing tasks such as making online advertising more effective.

    National Weather Service: Flood forecasts.

    Orbitz: Statistical analysis to suggest best hotels to promote to its users.

    Trulia: Statistical modeling.

    Source: Revolution Analytics


    Programming in R
    (http://manuals.bioinformatics.ucr.edu/home/programming-in-r)

    One of the main attractions of using the R (http://cran.at.r-project.org) environment is the ease with which users can write their own programs and custom functions. The R programming syntax is extremely easy to learn, even for users with no previous programming experience. Once the basic R programming control structures are understood, users can use the R language as a powerful environment to perform complex custom analyses of almost any type of data.

    Contents

    • 1 Introduction
    • 2 R Basics
    • 3 Code Editors for R
    • 4 Integrating R with Vim and Tmux
    • 5 Finding Help
    • 6 Control Structures
      • 6.1 Conditional Executions
        • 6.1.1 Comparison Operators
        • 6.1.2 Logical Operators
        • 6.1.3 If Statements
        • 6.1.4 Ifelse Statements
      • 6.2 Loops
        • 6.2.1 For Loop
        • 6.2.2 While Loop
        • 6.2.3 Apply Loop Family
          • 6.2.3.1 For Two-Dimensional Data Sets: apply
          • 6.2.3.2 For Ragged Arrays: tapply
          • 6.2.3.3 For Vectors and Lists: lapply and sapply
        • 6.2.4 Other Loops
        • 6.2.5 Improving Speed Performance of Loops
    • 7 Functions
    • 8 Useful Utilities
      • 8.1 Debugging Utilities
      • 8.2 Regular Expressions
      • 8.3 Interpreting Character String as Expression
      • 8.4 Time, Date and Sleep
      • 8.5 Calling External Software with System Command
      • 8.6 Miscellaneous Utilities
    • 9 Running R Programs
    • 10 Object-Oriented Programming (OOP)
      • 10.1 Define S4 Classes
      • 10.2 Assign Generics and Methods
    • 11 Building R Packages
    • 12 Reproducible Research by Integrating R with Latex or Markdown
    • 13 R Programming Exercises
      • 13.1 Exercise Slides
      • 13.2 Sample Scripts
        • 13.2.1 Batch Operations on Many Files
        • 13.2.2 Large-scale Array Analysis
        • 13.2.3 Graphical Procedures: Feature Map Example
        • 13.2.4 Sequence Analysis Utilities
        • 13.2.5 Pattern Matching and Positional Parsing of Sequences
        • 13.2.6 Identify Over-Represented Strings in Sequence Sets
        • 13.2.7 Translate DNA into Protein
        • 13.2.8 Subsetting of Structure Definition Files (SDF)
        • 13.2.9 Managing Latex BibTeX Databases
        • 13.2.10 Loan Payments and Amortization Tables
        • 13.2.11 Course Assignment: GC Content, Reverse & Complement

    Introduction to R, DataCamp
    (https://www.datacamp.com/courses/free-introduction-to-r)

    In this introduction to R, you will master the basics of this beautiful open source language, including factors, lists and data frames. With the knowledge gained in this course, you will be ready to undertake your very own first data analysis. With over 2 million users worldwide, R is rapidly becoming the leading programming language in statistics and data science. Every year, the number of R users grows by 40% and an increasing number of organizations are using it in their day-to-day activities. Leverage the power of R by completing this free R online course today!

    1 Intro to basics

    In this chapter, you will take your first steps with R. You will learn how to use the console as a calculator and how to assign variables. You will also get to know the basic data types in R. Let’s get started!

    2 Vectors

    In this free R course, we’ll take you on a trip to Vegas, where you will learn how to analyze your gambling results using vectors in R! After completing this chapter, you will be able to create vectors in R, name them, select elements from them and compare different vectors.

    3 Matrices

    In this chapter you will learn how to work with matrices in R. By the end of the chapter, you will be able to create matrices and to understand how you can do basic computations with them. You will analyze the box office numbers of Star Wars to illustrate the use of matrices in R. May the force be with you!

    4 Factors

    Very often, data falls into a limited number of categories. For example, humans are either male or female. In R, categorical data is stored in factors. Given the importance of these factors in data analysis, you should start learning how to create, subset and compare them now!
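A small sketch of creating, subsetting and comparing a factor, using base R only. The survey answers and the variable names `size` and `size_f` are invented for this example and are not taken from the course.

```r
# Hypothetical answers that fall into a limited set of categories
size <- c("small", "large", "medium", "small", "large")

# Store them as an ordered factor so that comparisons are meaningful
size_f <- factor(size, levels = c("small", "medium", "large"), ordered = TRUE)

print(levels(size_f))             # the categories
print(table(size_f))              # counts per category
print(size_f[size_f > "small"])   # subset using the ordering of the levels
```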

    5 Data frames

    Most data sets you will be working with will be stored as data frames. By the end of this chapter focused on R basics, you will be able to create a data frame, select interesting parts of a data frame and order a data frame according to certain variables.
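The three operations named above (create, select, order) might look like this in base R. The data and the names `scores`, `high` and `ranked` are made up for illustration.

```r
# Create a small, made-up data frame
scores <- data.frame(
  name  = c("Ann", "Bob", "Cy"),
  score = c(82, 95, 71),
  stringsAsFactors = FALSE
)

# Select parts: a single column, and the rows that meet a condition
print(scores$score)
high <- scores[scores$score > 80, ]

# Order the rows by a variable (highest score first)
ranked <- scores[order(scores$score, decreasing = TRUE), ]
print(ranked)
```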

    6 Lists

    Lists, as opposed to vectors, can hold components of different types, just like your to-do list at home or at work. This intro to R chapter will teach you how to create, name and subset these lists.
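The difference between vectors and lists can be seen in a short sketch. The to-do entries and the names `v` and `todo` are hypothetical.

```r
# c() coerces all elements to a single type; here everything becomes character
v <- c(1, "two", TRUE)
print(class(v))

# A list keeps each component's own type
todo <- list(task = "write report", hours = 2.5, done = FALSE)
print(sapply(todo, class))

# Subsetting by name: $ and [[ ]] return the component itself,
# while [ ] returns a shorter list
print(todo$hours)
print(todo[["task"]])
print(todo["done"])
```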


    Image result for DataCampIntermediate R
    (https://www.datacamp.com/courses/intermediate-r)

    The intermediate R course is the logical next stop on your journey in the R programming language. In this R training you will learn about conditional statements, loops and functions to power your own R scripts. Next, you can make your R code more efficient and readable using the apply functions. Finally, the utilities chapter gets you up to speed with regular expressions in the R programming language, data structure manipulations and times and dates. This R tutorial will allow you to learn R and take the next step in advancing your overall knowledge and capabilities while programming in R.

    1 Conditionals and Control Flow

    To be TRUE or not be TRUE, that’s the question. In this chapter you’ll learn about relational operators to see how R objects compare and logical operators to combine logicals. Next, you’ll use this knowledge to build conditional statements.

    2 Loops

    Loops can come in handy on numerous occasions. While loops are like repeated if statements; the for loop is designed to iterate over all elements in a sequence. Learn all about them in this chapter.
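A minimal sketch of the two loop forms described above, with made-up values:

```r
# while: keep repeating as long as the condition holds
n <- 1
while (n < 100) {
  n <- n * 2
}
print(n)

# for: visit each element of a sequence in turn
total <- 0
for (x in c(3, 5, 7)) {
  total <- total + x
}
print(total)
```

The `while` loop does not know in advance how many times it will run, whereas the `for` loop runs exactly once per element.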

    3 Functions

    Functions are an extremely important concept in almost every programming language; R is no different. After learning what a function is and how you can use one, you’ll take full control by writing your own functions.

    4 The apply family

    Whenever you’re using a for loop, you might want to revise your code and see whether you can use the lapply function instead. Learn all about this intuitive way of applying a function over a list or a vector, and its variants sapply and vapply.
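As a sketch of that revision, the example below computes the same result four ways. The input list `words` and the result names are invented for illustration.

```r
words <- list(a = "apple", b = "kiwi", c = "banana")

# A for loop that collects results by hand
lens_loop <- integer(length(words))
for (i in seq_along(words)) {
  lens_loop[i] <- nchar(words[[i]])
}

lens_list <- lapply(words, nchar)               # always returns a list
lens_vec  <- sapply(words, nchar)               # simplifies to a named vector when it can
lens_safe <- vapply(words, nchar, integer(1))   # like sapply, but the result type is checked

print(lens_vec)
```

`vapply` is often the safer choice inside functions, because it stops with an error if a result does not match the declared type and length.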

    5 Utilities

    Mastering R programming is not only about understanding its programming concepts. A solid knowledge of a wide range of R functions is also useful. This chapter introduces you to a bunch of useful functions for data structure manipulation, regular expressions and working with times and dates.


    LeaRning Path on R – Step by Step Guide to Learn Data Science on R, Analytics Vidhya
    (https://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/learning-path-r-data-science/)

    One of the common problems people face in learning R is the lack of a structured path. They don’t know where to start, how to proceed, or which track to choose. Although there is an abundance of good free resources available on the Internet, this can be overwhelming as well as confusing.

    To create this R learning path, Analytics Vidhya and DataCamp sat together and selected a comprehensive set of resources to help you learn R from scratch. This learning path is a great introduction for anyone new to data science or R, and if you are a more experienced R user you will be updated on some of the latest advancements.

    This will help you learn R quickly and efficiently. Time to have fun while lea-R-ning!