Northeastern University

College of Science
College of Computer and Information Science
360 Huntington Ave.
Boston, Massachusetts 02115

May Institute

Computation and statistics for mass spectrometry and proteomics.

May 1-12, 2017, Northeastern University, Boston MA

Organizers: Meena Choi and Olga Vitek

The May Institute is an annual course that focuses on computational and statistical aspects of quantitative mass spectrometry-based proteomics. The course combines keynote presentations, theoretical introductory lectures, practical training, and informal personal discussions.

The 2016 May Institute featured instructors who are leading experts in the field and have contributed numerous experimental and computational methods and software. The target audience included both beginners and experienced scientists who would like to strengthen their computational and statistical expertise, as well as computational scientists and statisticians interested in quantitative proteomics. Participants had many opportunities to ask questions and to present their research.


The course starts at 9:00AM on Monday and finishes at 4:00PM on Friday.

Monday 05/02/16: Introduction to targeted quantitative proteomics, Skyline and R

  • 8:50AM    Opening Remarks
  • 9:00AM    Keynote: Biological relevance of targeted proteomics experiments – Ruedi Aebersold
  • 10:30AM   Refreshments
  • 11:00AM   Lecture: Introduction to Selected Reaction Monitoring (SRM) – Ariel Bensimon
  • 12:00PM   Lecture: Introduction to Skyline – Brendan MacLean
  • 12:30PM   Lunch Break
  • 1:30PM     Lecture: From discovery to targeted proteomics – Ariel Bensimon and Brendan MacLean
  • 2:30PM     Hands-on: Setting up targeted assays in Skyline – Brendan MacLean
  • 4:00PM     Refreshments
  • 4:30PM     Hands-on: Introduction to data manipulation and visualization in R – Meena Choi and Erik Verschueren
  • 5:30PM     Science slam
  • 6:30PM     Adjourn

Tuesday 05/03/16: Design and analysis of targeted proteomic experiments

  • 8:00AM    Skyjam/Rjam
  • 9:00AM    Hands-on: Scheduling targeted acquisition – Ariel Bensimon
  • 9:30AM     Hands-on: iRT retention time prediction – Brendan MacLean
  • 10:30AM   Refreshments
  • 11:00AM    Lecture: Basics of statistical inference and experimental design – Olga Vitek
  • 12:30PM    Lunch Break
  • 1:30PM      Lecture: Quantitative targeted proteomics – Sue Abbatiello
  • 2:30PM     Hands-on: Effective data processing and analysis with Skyline – Brendan MacLean
  • 4:00PM     Refreshments
  • 4:30PM     Hands-on: Group comparisons in Skyline – Brendan MacLean
  • 5:30PM     Hands-on: Basics of statistical inference and experimental design in R – Meena Choi and Erik Verschueren
  • 6:30PM     Adjourn

Wednesday 05/04/16: Statistical methods, alternative workflows, and systems biology

  • 8:00AM    Skyjam/Rjam
  • 9:00AM    Lecture: Statistical methods for detecting differentially abundant proteins – Olga Vitek
  • 10:30AM   Refreshments
  • 11:00AM   Hands-on: MS1 filtering – Brendan MacLean
  • 12:30PM   Lunch Break
  • 1:30PM      Lecture: Systems biology – Christine Vogel
  • 3:00PM     Hands-on: Systems biology: a case study – Christine Vogel
  • 4:00PM     Refreshments
  • 4:30PM     Hands-on: Panorama and statistical process control – Brendan MacLean
  • 5:30PM     Hands-on: Statistical analysis of proteomic experiments with R and MSstats – Meena Choi and Erik Verschueren
  • 6:30PM     Adjourn

Thursday 05/05/16: Targeted proteomics on a large scale

  • 8:00AM    Skyjam/Rjam
  • 9:00AM    Lecture: Parallel Reaction Monitoring (PRM) assays – Jacob Jaffe
  • 10:30AM  Refreshments
  • 11:00AM   Lecture: Design and analysis of data-independent acquisition (DIA) experiments with Skyline – Jarrett Egertson
  • 12:00PM   Lunch Break
  • 1:00PM     Hands-on: Design and analysis of data-independent acquisition (DIA) experiments with Skyline – Jarrett Egertson
  • 2:30PM     Lecture: Untargeted interpretation of DIA experiments with DIA-Umpire and Skyline – Alexey Nesvizhskii
  • 3:30PM     Refreshments
  • 4:00PM     Hands-on: Untargeted interpretation of DIA experiments with DIA-Umpire and Skyline – Alexey Nesvizhskii
  • 5:30PM     Lecture: Statistical analysis of proteomic experiments at the Broad – D. R. Mani
  • 6:30PM     Adjourn
  • 7:30PM     Course dinner, Gaslight Brasserie

Friday 05/06/16: Proteomics in action

  • 8:00AM   Skyjam/Rjam
  • 9:00AM   Keynote: Quantitative proteomics for clinical applications: from discovery to targeted analysis – Steven Carr
  • 10:30AM  Refreshments
  • 11:00AM   Keynote: Proteogenomics – Alexey Nesvizhskii
  • 12:30PM   Lunch Break
  • 1:30PM     Lecture: Multivariate analysis for discovery of biomarkers of disease – Olga Vitek
  • 2:00PM    Hands-on: Multivariate analysis in R – Meena Choi
  • 2:30PM    Jeopardy quiz – Brendan MacLean and Olga Vitek
  • 3:30PM    Final discussion, course certificate and feedback – Brendan MacLean and Olga Vitek
  • 4:00PM    Adjourn


Susan Abbatiello

Susan received a Ph.D. in analytical chemistry at the University of Florida, and completed a postdoc at the University of Pittsburgh’s Hillman Cancer Center. Susan is now a product specialist at ThermoFisher, and a research scientist in the Proteomics Platform at the Broad Institute of MIT and Harvard. Susan’s research focuses on the development of tests to measure potential protein biomarkers in the blood for diseases such as cancer. Susan has co-chaired a committee that is part of the National Cancer Institute’s Clinical Proteomic Technology Assessment for Cancer (CPTAC).


Ruedi Aebersold

Ruedi is a professor at ETH Zurich. His research has focused on the development of new technologies for quantitative proteomics and on applying them to challenging questions of contemporary life science research. In this area, the group has a worldwide standing and has pioneered a number of concepts and technologies that have transformed proteomics. These include the introduction of relative and absolute proteome quantification, the development of open source computational tools for the objective, statistically supported analysis of large proteomic datasets, the development of a method for the determination of the spatial organization of protein complexes, and the development of targeted proteomic techniques such as Selected Reaction Monitoring and SWATH-MS.


Ariel Bensimon

Ariel is a PhD candidate in the lab of Ruedi Aebersold at ETH Zurich, Switzerland. He holds an M.Sc. in Biology from the Adi Lautman Interdisciplinary Program for Outstanding Students at Tel Aviv University, Israel. Ariel employs targeted proteomics to identify novel components of the DNA damage response, to quantify the detected changes, and to use this information to formulate and refine models of the DNA damage signaling process.



Steven Carr

Steven Carr is director of the Proteomics Platform at the Broad Institute of MIT and Harvard. He is internationally recognized as a leader in the development of novel proteomics methods and in their application in biology and medicine. Prior to joining Broad, he held positions of Computational and Structural Sciences at GlaxoSmithKline Pharmaceuticals and Senior Director of Protein Science and Technology at Millennium Pharmaceuticals in Cambridge, Massachusetts. Steven is the recipient of the 2011 Discovery Award in Proteomics from the Human Proteome Organization, the 2011 Thought Leader Award in Proteomics from the Agilent Foundation, and the 2014 Wallace H. Coulter Lectureship Award.

Meena Choi

Meena is a post-doctoral associate in the lab of Olga Vitek at Northeastern University (starting mid-Spring 2016). She holds a B.S. in Biology from the Korea Institute for Science and Technology, and a PhD in Statistics from Purdue University. Meena’s work focuses on statistical methods for quantitative proteomics. She is the lead developer and maintainer of MSstats.



Jarrett Egertson

Jarrett is a postdoctoral researcher at the University of Washington Department of Genome Sciences. He works in the MacCoss Lab and primarily focuses on developing new data acquisition methods and software in support of these methods. Jarrett earned his undergraduate degree (B.S. in Molecular, Cell, and Developmental Biology) from UCLA in 2008. While earning his undergraduate degree, Jarrett researched at the Spielberg Family Center for Applied Proteomics at the Cedars-Sinai Medical Center.

Jacob Jaffe

Jake is the Assistant Director of the Proteomics Platform at the Broad Institute. He obtained his B.A. in Biochemistry from the University of Pennsylvania and his Ph.D. from Harvard University, where he studied with George Church and Howard Berg. Dr. Jaffe has pioneered approaches to diverse problems in modern proteomics, including large-scale mapping of proteomic data onto genomes (thus allowing their de novo annotation from proteomic evidence), pattern recognition for quantitative proteomics, determination and quantification of epigenetic marks on histone proteins, and high-throughput targeted phosphoproteomics.


Brendan MacLean

Brendan worked at Microsoft for 8 years in the 1990s, where he was a lead developer and development manager for the Visual C++/Developer Studio Project. Since leaving Microsoft, Brendan has been the Vice President of Engineering for Westside Corporation, Director of Engineering for BEA Systems, Inc., Sr. Software Engineer at the Fred Hutchinson Cancer Research Center, and a founding partner of LabKey Software. Since August 2008, he has worked as a Sr. Software Engineer within the MacCoss lab and has been responsible for all aspects of design, development and support in creating the Skyline Targeted Proteomics Environment and its growing worldwide user community.

D. R. Mani

D. R. Mani is a Principal Computational Biologist at the Broad Institute of MIT and Harvard. Mani leads the high-throughput and large-scale analysis of proteomic data derived from a variety of sources, including liquid chromatography-based (LC) and multiple reaction monitoring (MRM) mass spectrometry (MS), with the goal of better understanding and exploring the molecular basis of cancer diagnosis, detection and biomarker discovery.


Alexey Nesvizhskii

Alexey holds a PhD in physics from the University of Washington, and completed a post-doc at the Institute for Systems Biology in Seattle. He is now Associate Professor at the University of Michigan in Ann Arbor. Alexey and his colleagues contributed tools such as the Trans-Proteomic Pipeline, PeptideAtlas, SAINT, Qspec, and the CRAPome, which are used by hundreds of laboratories worldwide. Alexey serves on the Scientific Advisory Board for the Swiss Institute of Bioinformatics and on the Board of Directors for the US Human Proteome Organization.


Erik Verschueren

Erik currently works in the Discovery Proteomics group at Genentech as a Bioinformatics Scientist. He previously worked at the CRG in Barcelona, Spain and the Krogan lab at University of California San Francisco. Erik’s research focuses on protein interaction networks and post-translational regulation patterns. He also develops computational methods in R for the analysis of high-throughput Affinity Purification Mass Spectrometry datasets, quantitation of differential post-translational modifications and integration of multiple ‘omics datasets into network models.


Olga Vitek

Olga is a Sy and Laurie Sternberg Associate Professor in the College of Science and the College of Computer and Information Science at Northeastern University. Olga holds a PhD in Statistics from Purdue University. Her group develops statistical methods and algorithms for quantitative proteomics. The methods optimize the experimental design, and ensure accurate and objective interpretation of the resulting large and complex datasets. The tools developed by her group include MSstats, an open-source software for statistical analysis of quantitative shotgun, targeted and data-independent proteomic experiments.


Christine Vogel

Christine is an Assistant Professor of Biology at New York University. She is a systems biologist who uses statistical and computational tools, large-scale quantitative mass spectrometry, and molecular biology techniques to study the dynamics of the cellular proteome. Her research revolves around proteins, their properties, evolution, and expression patterns.




Instructors and administrative support


Nat Brace

Nat worked full-time for Microsoft from 1991 through 2000 where he led a team of system engineers who were helping organizations plan for and integrate Microsoft’s advanced server solutions. He continued at Microsoft as a consultant from 2000 to 2011, with several internal teams as a technical project owner, customer outreach lead and marketing manager. More recently, Nat served as the lead project manager at a social innovation start-up, delivering a collaboration platform for use by educators, researchers and sponsors participating in a global educational reform initiative. As project manager for the Skyline team, Nat is responsible for outreach programs, like webinars, courses and user meetings, foreign language translation, and instrument vendor interactions.


Nicholas Shulman

Nick worked from 1995 to 2000 at Microsoft on the Microsoft Access team, leaving to join Westside Corporation with Brendan to create browser-based database design tools. After Westside was acquired by BEA Systems, Nick created a new graphical JSP designer for Weblogic Workshop, an award-winning Integrated Development Environment for enterprise Java applications. At LabKey Corporation, Nick created the flow cytometry module and the graphical query designer. Since March 2009, he has worked in the MacCoss lab on Skyline and Topograph, a quantitative analysis tool for protein turnover experiments.


Tsung-Heng Tsai

Tsung-Heng is a postdoctoral research associate in the lab of Olga Vitek at Northeastern University. He is a co-developer of MSstats, an open-source software package for statistical analysis of quantitative proteomic experiments. Tsung-Heng holds a PhD in Electrical Engineering from Virginia Tech. His current research focuses on developing statistical and computational methods for mass spectrometry-based proteomics.



Shan Li

Administrative staff


  • Skyline is Windows-based software for building SRM methods and analysing the resulting MS data. It employs cutting-edge technologies for creating and iteratively refining targeted methods for large-scale proteomics studies. Mac users need a virtual machine to run Skyline (e.g. VMware Fusion, which offers a 30-day free trial version).
  • MSstats is an R-based statistical tool which detects differentially abundant proteins, summarises and visualises protein-level inferences, and can be used to design future SRM experiments. The statistical framework behind MSstats is based on linear mixed-effect models, where model-based inferences attain high sensitivity and specificity in protein significance results.
  • R is a freely available language and environment for statistical computing and graphics, which provides a wide variety of statistical and graphical techniques.


Prior exposure to R is not required. However, we highly recommend that you familiarize yourself with R prior to the course. The best place to start is here or here. More advanced examples are available here or here. A useful full-scale course is here.

Location: Rooms 315 and 320, Shillman Hall
Northeastern University, 360 Huntington Ave, Boston, MA 02115.


Accommodations: Participants should directly contact the hotels for accommodation arrangements.

Hotel and Guesthouse near NEU:

We also suggest the following low-budget hotels:

Suggestions for finding accommodation:

  • Two subway lines go straight to NEU: the Orange Line (Ruggles Station) and the Green Line E (Northeastern University Station). It is a good idea to find accommodation close to these subway lines.
  • Participants could also find accommodation near the Red Line, the Blue Line, or other Green Line branches, as long as it is easy to transfer to the Orange Line or Green Line E.

