Analyzing Large Numbers of Variables: Microarrays, EEGs, MRIs

Dates: Begins June 19, 2015.

Aim of the Course: This four-week course will provide a practical guide to multivariate analysis, with the emphasis on the use of the bootstrap, decision trees, and permutation tests in conjunction with parametric techniques to analyze large arrays of data similar to those collected during DNA, RNA, and protein sequencing, EEG, MRI, fMRI, PET, telemetry and other forms of image analysis.

Who Should Take This Course: Geneticists, protein chemists, neurophysiologists, epidemiologists, geographers, astronomers, agronomists, other image analysts, and statisticians.

Instructor: Dr. Phillip Good, former Calloway Professor of Computer Science at the University of Georgia (Fort Valley) and graduate of the program in mathematical statistics at UC Berkeley, is the author of Analyzing the Large Number of Variables in Biomedical and Satellite Imagery (Wiley, 2011), Common Errors in Statistics (and How to Avoid Them) (Wiley, 4th ed., 2012, with James Hardin), Permutation, Parametric, and Bootstrap Tests of Hypotheses (Springer, 3rd ed., 2005), Manager's Guide to Design and Conduct of Clinical Trials (Wiley, 2nd ed., 2006), and Applying Statistics in the Courtroom (CRC, 2001). He has given tutorials at the Joint Statistical Meetings (U.S.) and the Deming Conference, lectured in Belgium, Bulgaria, France, Holland, Ireland, Slovenia, and Spain, and was twice a traveling lecturer for the American Statistical Association. This is his 8th year of providing on-line interactive courses.

Prerequisite: You should have familiarity with basic parametric statistical concepts. Resampling methodology will be introduced as needed. The relevant biomedical background is also provided as needed.

Organization of the Course: The course takes place over the Internet. During each course week, you participate at times of your own choosing; there are no set times when you must be online. Course participants will be given an alias and access to a private bulletin board that serves as a forum for discussion of ideas, problem solving, and interaction with the instructor.

The course is scheduled to take place over four weeks, and should require no more than ten hours per week. At the beginning of each week, participants receive the relevant material, in addition to answers to exercises from the previous session. During the week, participants are expected to go over the course materials and work through exercises. Discussion among participants is encouraged. The instructor will provide answers and comments.

Textbook: Analyzing the Large Number of Variables in Biomedical and Satellite Imagery.  Wiley, 2011.  Be sure to order before the course begins.  This text will prove invaluable after the course ends.

Course Program: The course is structured as follows:

SESSION 1: Basic Concepts

  • Analyzing Very Large Arrays of Data--Problems and Solutions
  • Advantages of and Necessity for Multivariate Analysis
  • Hotelling's T²
  • Using Permutation Tests to Establish Statistical Significance
  • Combining Independent Tests
  • Software

SESSION 2: Testing Multiple Hypotheses

  • Reducing the Number of Variables
  • Controlling the Over-All Error Rate--Examples
  • Controlling the False Discovery Rate--Time-Course Data
  • Gene Set Enrichment Analysis
  • Software

SESSION 3: Applying the Bootstrap and Permutation Tests

  • Pre-Post Comparisons
  • The Bootstrap
  • Determining Sample Size
  • Validating a Cluster Analysis
  • Bootstrap or Permutation Test?

SESSION 4: Classifying Subjects on the Basis of Biomedical Data

  • Classification Methods
  • Decision Trees
  • Misclassification Costs
  • Ensemble Methods
  • Validation
  • Software for Use with Large Numbers of Variables

Cost: The full cost of this four-week interactive on-line course is only $345. A $35 discount is available to students, faculty, and research workers at academic institutions: send an email from your institutional email account to