Luxembourg Institute of Socio-Economic Research (LISER)
14-15 June 2018

Workshop on "Introduction to R and Statistical Learning"

R is one of the most widely used languages for statistics and data science. It is freely available and offers many tools for basic and advanced statistical analysis. The short course “Introduction to R and Statistical Learning” introduces R to PhD students and researchers with no previous experience who want to learn how to use R and statistical learning methods to conduct statistical analysis.

The first part of the course will provide a general introduction to basic operations and functions for data structures, data manipulation and plotting, together with some basics on linear and generalized linear models. The second part of the course will briefly introduce predictive models for statistical learning, focusing on the lasso estimator, classification and regression trees and random forests.

The two-day course will be taught by Anna Gottard, Professor of Statistics at the “Department of Statistics, Computer Science, Applications Giuseppe Parenti” of the University of Florence. It will comprise both lectures and practical sessions.

Participants wishing to bring their own computer should ensure that R and RStudio are installed before attending.

Target audience

  • No experience with R is required
  • Knowledge of basic statistics and regression models would be helpful, particularly for the second part of the course

Draft Programme

Day 1

9:30 - 13:00 14:30 - 18:30
  • Introduction to R and RStudio
  • Objects and packages
  • Basic syntax (operations, vectors, matrices, lists, etc.)
  • Data: importing, exporting and generating data
  • Data manipulation: summaries, tables, some plots
  • Functions, loops, if statements
  • Linear model with lm()
  • Generalized linear models with glm()

Day 2

9:30 - 13:00 14:30 - 18:30
  • Introduction to Statistical Learning
  • Cross validation
  • Ridge and lasso estimators. Package glmnet
  • The CART algorithm for regression and classification trees. Package rpart
  • Random forests. Package randomForest

The lectures will be given by Anna Gottard, Associate Professor in Statistics in the Department of Statistics, Computer Science, Applications (DiSIA) at the University of Florence. She graduated in Statistics from the University of Rome "La Sapienza" and received her Ph.D. in Applied Statistics from the University of Florence. Her research interests include graphical models, statistical learning, high-dimensional statistics and computational statistics. Her current research focuses on variable selection in regression trees and random forests, and on non-Gaussian and high-dimensional graphical models. Her academic work has been published in highly regarded field journals such as the Journal of the Royal Statistical Society A, Computational Statistics and Data Analysis, and the Scandinavian Journal of Statistics.

Contact person: