R is a free, open-source programming language for statistical computing, data analysis, and graphics. It is used by a growing number of managers and data analysts in corporations and academia. R has also found followers among statisticians, engineers, and scientists without computer programming skills, who find it easy to use. Its popularity is due to the increasing use of data mining for goals such as setting ad prices, finding new drugs more quickly, or fine-tuning financial models. R has a wide variety of packages for data mining.
I. Introduction and preliminaries
1. Overview
- Making R more friendly, R and available GUIs
- RStudio
- Related software and documentation
- R and statistics
- Using R interactively
- An introductory session
- Getting help with functions and features
- R commands, case sensitivity, etc.
- Recall and correction of previous commands
- Executing commands from or diverting output to a file
- Data permanency and removing objects
- Good programming practice: Self-contained scripts, good readability e.g. structured scripts, documentation, markdown
- Installing packages; CRAN and Bioconductor
2. Reading data
- Txt files (read.delim)
- CSV files
3. Simple manipulations; numbers and vectors + arrays
- Vectors and assignment
- Vector arithmetic
- Generating regular sequences
- Logical vectors
- Missing values
- Character vectors
- Index vectors; selecting and modifying subsets of a data set
- Arrays
- Array indexing. Subsections of an array
- Index matrices
- The array() function + simple operations on arrays e.g. multiplication, transposition
- Other types of objects
4. Lists and data frames
- Lists
- Constructing and modifying lists
- Concatenating lists
- Data frames
- Making data frames
- Working with data frames
- Attaching arbitrary lists
- Managing the search path
5. Data manipulation
- Selecting, subsetting observations and variables
- Filtering, grouping
- Recoding, transformations
- Aggregation, combining data sets
- Forming partitioned matrices, cbind() and rbind()
- The concatenation function, c(), with arrays
- Character manipulation, stringr package
- Short introduction to grep and regexpr
6. More on Reading data
- XLS, XLSX files
- readr and readxl packages
- SPSS, SAS, Stata, and other data formats
- Exporting data to txt, csv and other formats
7. Grouping, loops and conditional execution
- Grouped expressions
- Control statements
- Conditional execution: if statements
- Repetitive execution: for loops, repeat and while
- Introduction to apply, lapply, sapply, tapply
8. Functions
- Creating functions
- Optional arguments and default values
- Variable number of arguments
- Scope and its consequences
9. Simple graphics in R
- Creating a Graph
- Density Plots
- Dot Plots
- Bar Plots
- Line Charts
- Pie Charts
- Boxplots
- Scatter Plots
- Combining Plots
II. Statistical analysis in R
1. Probability distributions
- R as a set of statistical tables
- Examining the distribution of a set of data
2. Testing of Hypotheses
- Tests about a Population Mean
- Likelihood Ratio Test
- One- and two-sample tests
- Chi-Square Goodness-of-Fit Test
- Kolmogorov-Smirnov One-Sample Statistic
- Wilcoxon Signed-Rank Test
- Two-Sample Test
- Wilcoxon Rank Sum Test
- Mann-Whitney Test
- Kolmogorov-Smirnov Test
3. Multiple Testing of Hypotheses
- Type I Error and FDR
- ROC curves and AUC
- Multiple Testing Procedures (BH, Bonferroni etc.)
4. Linear regression models
- Generic functions for extracting model information
- Updating fitted models
- Generalized linear models
- Families
- The glm() function
- Classification
- Logistic Regression
- Linear Discriminant Analysis
- Unsupervised learning
- Principal Components Analysis
- Clustering Methods (k-means, hierarchical clustering, k-medoids)
5. Survival analysis (survival package)
- Survival objects in R
- Kaplan-Meier estimate, log-rank test, parametric regression
- Confidence bands
- Censored (interval censored) data analysis
- Cox PH models, constant covariates
- Cox PH models, time-dependent covariates
- Simulation: Model comparison (Comparing regression models)
6. Analysis of Variance
- One-Way ANOVA
- Two-Way Classification of ANOVA
- MANOVA
III. Worked problems in bioinformatics
- Short introduction to limma package
- Microarray data analysis workflow
- Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
- Data processing (QC, normalisation, differential expression)
- Volcano plot
- Clustering examples + heatmaps
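To give a flavour of the multiple-testing topic listed in Part II (Bonferroni and BH procedures), here is a minimal sketch in base R. It is an illustration only, not course material: the p-values are made up for the example, and the variable names are hypothetical. It relies on p.adjust() from R's stats package and on sapply() from the apply family covered in Part I.

```r
# Hypothetical raw p-values from five tests (made up for illustration).
p_raw <- c(0.001, 0.012, 0.020, 0.045, 0.200)

# Bonferroni controls the family-wise error rate;
# BH (Benjamini-Hochberg) controls the false discovery rate.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bh   <- p.adjust(p_raw, method = "BH")

# Collect raw and adjusted values side by side in a data frame.
results <- data.frame(p_raw, p_bonf, p_bh)
print(results)

# Count how many tests stay below the 0.05 threshold in each column.
n_sig <- sapply(results, function(p) sum(p < 0.05))
print(n_sig)
```

Comparing the three counts shows the usual pattern: Bonferroni is the most conservative, and BH sits between it and the uncorrected p-values.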