Short Course Listings

Course Information
Course Title: Getting the Most out of your Data: Introduction to Multivariate Data Analysis
Categories: 1 - Life Sciences
2 - Chemometrics
3 - Data Analysis
4 - Polymerase Chain Reaction
5 - Proteomics
6 - Spectroscopy
Instructor(s): Mikael Kubista / Jose Andrade Course Number: 115
Affiliation: University of A Coruna
Course Date: 03/08/2016 - Tuesday Course Length: 1 1/2 Day Course
Start Time: 08:30 AM End Time: 05:00 PM
Course Date 2: 03/09/2016 - Wednesday    
Start Time: 08:30 AM End Time: 12:30 PM
Fee: $800 ($1125 after 2/12/16) Textbook Fee:

Course Description
Most tasks scientists must address are multivariate in nature, e.g. quality control in the chemical industry, PAT in the pharmaceutical industry, environmental monitoring, spectral analysis, genomic/proteomic studies and data mining. A sound understanding of the principles underlying some fundamental, widely applied chemometric tools is therefore required. Such knowledge is often not covered by undergraduate training, and self-learning is hard. This course presents intuitive explanations of the fundamentals of the most common tools for analysing multivariate data. Mathematical theory is explained but kept to a minimum; the focus is on understanding the principles and interpreting results. Participants will be trained on several practical real-life examples and will be provided with a free, time-limited license for dedicated software for multivariate analysis.

Target Audience
Analytical chemists in general, quality control technicians, and scientists in the life sciences and the pharma and biotech industries. The course also suits researchers and engineers working with current industrial PAT (process analytical technology) trends related to multivariate data analysis and knowledge extraction / data mining.

Course Outline
1. Why do we need multivariate chemometrics? Limitations of classical tests.
a. Introducing the issues, FDA-PAT guideline, EU (CEN) guidelines/technical specifications
b. Examples illustrating issues when analysing data with classical univariate tests
2. How to organize data. The importance of data pretreatment for multivariate analysis
a. Organizing the data; visual presentation, quality control
b. Different data pretreatments; normalization, scaling (columns vs rows; autoscale vs. mean center)
c. Examples
d. Classification of chemometric techniques
3. Finding groups of samples: hierarchical clustering and heatmaps
a. Basics
b. Examples
4. Data mining and pattern recognition
a. Basics, mainly based on illustrations
b. Examples
c. Dynamic PCA
5. Introduction of three supervised classification techniques
a. The ideas and theories behind the methods.
b. Illustration of the underlying principles
6. A step forward: supervised classification
a. Support vector machines
i. Introduction to the basic idea
ii. Examples
b. Discriminant PLS
i. Introduction to the basic idea
ii. Examples
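
As a rough illustration of the pretreatment distinction named in item 2b (mean centering vs. autoscaling of columns), the following minimal sketch may help. It is not taken from the course materials or the course software; the function names and the small data set are hypothetical, and only the Python standard library is assumed.

```python
from statistics import mean, stdev

def mean_center(rows):
    """Subtract each column's mean so every variable is centred on zero."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def autoscale(rows):
    """Mean-centre each column, then divide by its sample standard deviation.

    Assumes no column is constant (a zero standard deviation would
    cause a division by zero).
    """
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

# Hypothetical data: 3 samples (rows) x 2 variables (columns)
# measured on very different scales.
data = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]

centered = mean_center(data)  # columns keep their original spread
scaled = autoscale(data)      # every column ends up with unit variance

for c_row, s_row in zip(centered, scaled):
    print(c_row, s_row)
```

After mean centering, the second variable still dominates because of its larger spread; after autoscaling, both variables carry equal weight, which is the usual motivation for autoscaling when variables are measured in different units.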

Examples are usually analysed together as a group after each method or principle is introduced. All participants receive a free, time-limited license for dedicated software for multivariate analysis. Students are also encouraged to bring their own data sets and will be supported in analysing them.

Course Instructor's Biography
Prof. Mikael Kubista was among the pioneers in developing and applying chemometric methods. He introduced Procrustes rotation for calibration of samples and demonstrated that spectra of unknowns could be determined despite extensive spectral overlap by multivariate spectroscopic analysis. He co-founded MultiD Analyses, which today is a leading company developing software for multivariate and multidimensional analysis with products such as DATAN and GenEx. During the last decade Dr. Kubista introduced multivariate methods for expression profiling, and he founded the TATAA Biocenters as a leading provider of real-time PCR expression services. The TATAA Biocenters are leading organizers of hands-on training in biostatistics, educating over 300 scientists across the world annually. Dr. Kubista is a member of the CEN and ISO technical committees drafting the forthcoming technical specifications and guidelines for Molecular Diagnostics Investigations – Specifications for the Preexaminations Processes for DNA, RNA, and protein analyses in blood, fresh frozen tissues and preserved tissues.

Dr. Jose Andrade has held a Chair at the University of A Coruña (Spain) since 2011. He has worked in a refinery as a quality control manager and currently works on multivariate data analysis in the environmental and petrochemical fields. His interests include FTIR and Atomic Spectrometry (ETAAS), and he recently edited a book (RSC, United Kingdom) introducing the principles of multivariate calibration. He has also been involved in environmental studies dealing with oil pollution in the sea and airborne pollution control.