Bioinformatics

Proteomics identification pipeline

After an LC-MS analysis, a large quantity of data must be stored and analysed with different bioinformatics tools. The general pipeline of the facility consists of the tools described below.

Search engines

The goal of each search engine is to match the spectra obtained by LC-MS/MS against a protein sequence database that has been converted into a list of theoretical ion masses. The main problem is that no single search engine can cover all of the spectra. Combining several engines gives better coverage of the expected information. That is why we systematically use Mascot, Sequest and X! Tandem in parallel.

  Search engine   Provider
  Mascot 2.3      Matrix Science
  Sequest         Thermo Finnigan
  X! Tandem       The Global Proteome Machine Organization
  OMSSA *         NCBI
  Andromeda       MaxQuant

* An analysis with OMSSA can be done upon request.
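The matching principle described above can be sketched in a few lines of Python. This is only a minimal illustration, not the scoring used by Mascot, Sequest or X! Tandem: it computes singly charged b- and y-ion m/z values from standard monoisotopic residue masses, counts how many fall within a tolerance of the observed peaks, and shows how merging the spectra identified by several engines increases coverage. The function names, the 0.02 tolerance and the example values are assumptions made for the illustration.

```python
# Monoisotopic residue masses in daltons (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O
PROTON = 1.007276   # mass of a proton


def theoretical_ions(peptide):
    """Return (b_ions, y_ions): singly charged fragment m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        # b ion: N-terminal fragment of i residues plus one proton.
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ion: C-terminal fragment plus water plus one proton.
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions


def count_matches(observed_mz, peptide, tolerance=0.02):
    """Count theoretical ions that have an observed peak within the tolerance."""
    b_ions, y_ions = theoretical_ions(peptide)
    return sum(
        any(abs(obs - theo) <= tolerance for obs in observed_mz)
        for theo in b_ions + y_ions
    )


def combined_coverage(spectra_by_engine):
    """Return the set of spectrum IDs identified by at least one engine."""
    identified = set()
    for spectra in spectra_by_engine.values():
        identified |= set(spectra)
    return identified


if __name__ == "__main__":
    # Hypothetical peak list and spectrum IDs, for illustration only.
    print(count_matches([58.03, 90.05, 500.0], "GA"))
    print(sorted(combined_coverage({"Mascot": {1, 2}, "Sequest": {2, 3}})))
```

Real search engines also handle higher charge states, modifications and statistical scoring, which this sketch leaves out on purpose.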

Proteome Discoverer 2.2

Proteome Discoverer is a suite of tools for proteomic data analysis. It follows different workflows that are created and maintained by the user. It parses raw data and launches different search engines, such as Mascot or Sequest, for each analysis.

Scaffold 4

The Proteome Discoverer output (merging Mascot and Sequest results) is then loaded into Scaffold. This software validates proteomics data by running an independent implementation of PeptideProphet™ (which assesses the peptide assignments made to MS/MS spectra by database search engines) and ProteinProphet® (which assesses the protein identifications made on the basis of the peptides assigned to MS/MS spectra). These two open-source programs are Bayesian statistical algorithms from the Institute for Systems Biology. This validation keeps the percentage of false-positive identifications in datasets very low.
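The Bayesian idea behind this validation can be illustrated with a short Python sketch. It is not Scaffold's or PeptideProphet's implementation: the real algorithms learn the score distributions of correct and incorrect assignments from the data, whereas here both are assumed to be fixed normal distributions with made-up parameters. The function names and default values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_correct(score, prior_correct=0.5,
                      correct=(4.0, 1.0), incorrect=(0.0, 1.0)):
    """Bayes' rule: probability that an assignment with this score is correct.

    `correct` and `incorrect` are (mean, sd) pairs. They are illustrative
    assumptions, not values fitted to real data.
    """
    p_pos = normal_pdf(score, *correct) * prior_correct
    p_neg = normal_pdf(score, *incorrect) * (1.0 - prior_correct)
    total = p_pos + p_neg
    if total == 0.0:
        # Both densities underflowed: fall back to the prior.
        return prior_correct
    return p_pos / total


def expected_false_positive_rate(probabilities, threshold):
    """Expected fraction of wrong assignments among those at or above threshold."""
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


if __name__ == "__main__":
    # Hypothetical discriminant scores, for illustration only.
    probabilities = [posterior_correct(s) for s in (0.5, 2.0, 3.5, 5.0)]
    print(probabilities)
    print(expected_false_positive_rate(probabilities, threshold=0.95))
```

Accepting only assignments above a high probability threshold is what keeps the expected share of false positives small.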

Users can view the reports with the free viewer, which gives a user-friendly, high-level overview of the proteomics results. A brief explanation is given to each new user.

Moreover, Scaffold provides all the relevant information for the submission of papers containing proteomics data.

MaxQuant 1.6.2.10

MaxQuant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets.

Proteomics quantitative pipeline

Stable Isotope Labelling with Amino acids in Cell culture (SILAC)

The SILAC approach is based on the in vivo incorporation of a label into proteins for MS-based quantitative proteomics: cells metabolically incorporate a ‘light’ or ‘heavy’ form of a given amino acid into their proteins. SILAC data are analysed with the MaxQuant software, using the Andromeda search engine, and with the Perseus software for the statistical analyses. Finally, to meet the needs of the proteomics core facility (PCF), home-made programs are used to make the results easier to read.
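The quantitative principle behind SILAC can be sketched as follows. This is a simplified illustration, not the MaxQuant algorithm: it assumes one summed 'light' and one summed 'heavy' intensity per protein, computes log2 heavy/light ratios and centres them on their median. The protein names, intensities and function names are hypothetical.

```python
import math
from statistics import median


def log2_ratios(intensities):
    """Map protein -> log2(heavy / light).

    `intensities` maps a protein name to a (light, heavy) pair. Proteins
    with a zero or negative intensity are skipped, since no ratio is defined.
    """
    return {
        protein: math.log2(heavy / light)
        for protein, (light, heavy) in intensities.items()
        if light > 0 and heavy > 0
    }


def median_normalise(ratios):
    """Shift log2 ratios so that their median is zero."""
    if not ratios:
        return {}
    centre = median(ratios.values())
    return {protein: ratio - centre for protein, ratio in ratios.items()}


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    example = {
        "protein_A": (100.0, 400.0),
        "protein_B": (100.0, 200.0),
        "protein_C": (100.0, 100.0),
    }
    print(median_normalise(log2_ratios(example)))
```

Centring on the median assumes that most proteins do not change between the two conditions, which should be checked for each experiment.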

We will support you at each step of the optimization of this technique, depending on your needs.

Dimethyl Labelling

Label-free quantification

Tandem Mass Tag (TMT)

TMT belongs to a family of reagents referred to as isobaric mass tags.

Proteomics databases

The proteomics core facility places a catalog of commonly used databases at your disposal. New databases are added to the collection according to the needs of the different projects.

General databases

  Database                          Content
  UniProt – Swiss-Prot              Manually curated annotations
  UniProt – Swiss-Prot and TrEMBL   Manual and automatic annotations
  IPI (different organisms)         IPI is no longer maintained but can still be used

Dedicated databases

  Organism               Database
  M. smegmatis MC2-155   SmegmaList
  M. tuberculosis H37Rv  TubercuList
  P. falciparum          PlasmoDB
  S. cerevisiae          SGD
  C. elegans             Wormpep
  T. aestivum            UniProt

These databases are updated every 6 months or upon user request.
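As an illustration of the 6-month update rule, the following Python sketch lists the databases that are due for a refresh. It is not the facility's actual tooling: the function names, the database names used as keys and the dates are hypothetical.

```python
from datetime import date


def months_between(earlier, later):
    """Number of whole months elapsed between two dates."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        # The last month is not complete yet.
        months -= 1
    return months


def databases_due(last_updated, today, max_age_months=6):
    """Return the sorted names of databases last updated at least max_age_months ago."""
    return sorted(
        name
        for name, updated in last_updated.items()
        if months_between(updated, today) >= max_age_months
    )


if __name__ == "__main__":
    # Hypothetical update dates, for illustration only.
    last_updated = {
        "SGD": date(2024, 1, 15),
        "PlasmoDB": date(2024, 5, 1),
    }
    print(databases_due(last_updated, today=date(2024, 7, 15)))
```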

Home-made programs

Home-made programs in R, Python, Ruby or Perl are developed and used to meet the particular needs of the PCF or the users.