Course Code: bsprsa
Duration: 21 hours
Course Outline:

Day One

Introduction to R & RStudio

  • A first R program
  • RStudio
  • Other editors
  • Getting Help in R

Importing/Exporting Data

  • Flat files – txt, csv
  • Spreadsheet files – xls, xlsx
  • Data in SPSS, SAS and other formats
  • Accessing data from SQL data sources
  • SQL database connectivity and operations

Organising Data

  • Data types and classes
  • Data storage in R – RData format
  • Object structure
  • Numbers and vectors
  • Matrices and tables
  • Factors
  • Lists
  • Data Frames
  • Date and time

Tabular Representation

  • Overview of packages for data tables – dplyr, tidyr, data.table
  • Indexes and subscripts
  • Selecting, subsetting observations and variables
  • Filtering, grouping
  • Recoding, transformations
  • Reshaping data
  • Merging data
  • Character manipulation, stringr package
  • Regular expressions

Day Two

R and Statistics

  • Probability and Normal Distribution
  • Random numbers
  • Descriptive Statistics
  • Standardization and Normalization
  • t-distribution
  • Chi-Square Distribution
  • Confidence Intervals
  • Hypothesis Testing, parametric vs. non-parametric tests
  • t-tests
  • F-tests
  • ANOVA

Linear Regression

  • Correlation coefficient and interpretation
  • Wilkinson-Rogers formula notation
  • Simple and multiple linear regression
  • Estimation methods – Least squares
  • Model validation – tests for violation of assumptions
  • Logistic regression

Graphical Procedures

  • Plots for 1, 2 and more variables
  • QQ-Plots
  • Exporting plots to png, pdf and jpeg files
  • ggplot2

Project Organisation

  • Data & other Artefacts
  • Folder Structure
  • Versioning

Day Three

  • ANOVA revisited
  • Normality Tests
  • Mixed-Effects Models & Nested Analysis
  • Variance Component Analysis
    • R-Package VCA
    • Visualization of Variability
    • Outlier Detection
      • R-Package STB
    • VCA-Models
      • ANOVA, MINQUE
      • REML, ML
    • VCA Inference
    • Confidence intervals for Variance Components