Copyright Notice

pepXML2Excel is Copyright (C) 2007 Magnus Palmblad

pepXML2Excel is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

About pepXML2Excel

pepXML2Excel is an AWK program that transposes the peptide list from the TPP PepXML Viewer to one protein per line. For every protein with at least three distinct peptides with ionscore > identityscore, it calculates the average and standard deviation of the peptide ratios, together with a t-test and Bonferroni correction. In its default form, the program assumes Mascot and A. thaliana, but anyone with a basic knowledge of AWK should be able to modify it to suit another search engine or species (the default version will still work for any species). This program was used to generate the data in Bindschedler et al., Some. Journal., XX(XX), XXXX-XXXX (2007).
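
The per-protein averaging described above can be sketched in a few lines of AWK. This is a minimal illustration, not the code of pepXML2Excel itself: the function name protein_stats and the simplified tab-separated input (protein, peptide, ratio) are assumptions, each distinct peptide is counted once, and the t-test and Bonferroni correction are left out.

```sh
#!/bin/sh
# Hypothetical helper, not part of pepXML2Excel.
# Input on stdin: protein<TAB>peptide<TAB>ratio (assumed, simplified layout).
# Output: protein<TAB>distinct peptides<TAB>mean<TAB>sample standard deviation.
protein_stats() {
    awk -F'\t' '
    {
        key = $1 SUBSEP $2
        if (key in seen) next               # use each distinct peptide once
        seen[key] = 1
        if (!($1 in n)) order[++np] = $1    # remember first-seen protein order
        n[$1]++
        sum[$1]   += $3
        sumsq[$1] += $3 * $3
    }
    END {
        for (i = 1; i <= np; i++) {
            p = order[i]
            if (n[p] < 3) continue          # require three distinct peptides
            mean = sum[p] / n[p]
            var  = (sumsq[p] - n[p] * mean * mean) / (n[p] - 1)
            if (var < 0) var = 0            # guard against rounding error
            printf "%s\t%d\t%.3f\t%.3f\n", p, n[p], mean, sqrt(var)
        }
    }'
}
```

Proteins with fewer than three distinct peptides produce no output line, which mirrors the threshold stated above.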

Download pepXML2Excel by clicking here. (Right-click and choose "Save Link As...".)

Using pepXML2Excel

Usage: ./pepXML2Excel.awk <interact spreadsheet>

where <interact spreadsheet> is the spreadsheet exported from the TPP PepXML Viewer.

  1. In the TPP PepXML Viewer, display columns index, probability, spectrum, ionscore, identityscore, homologyscore, ions, peptide, protein and XPRESS in this order. This is the default, but check it, because it is important that the column order is correct.
  2. Apply Filtering Options:require ionscore > identityscore in PepXML Viewer.
  3. Sort pepXML according to protein (Summary:Sorting:protein) in PepXML Viewer.
  4. Save as spreadsheet ("Other Actions:Export Spreadsheet") in PepXML Viewer.
  5. Run the AWK script in a Cygwin shell with ./pepXML2Excel.awk interact.xls > interact.txt (assuming the spreadsheet was saved as "interact.xls" from PepXML Viewer).
  6. Open interact.txt in Excel as a tab delimited text file (Excel defaults).
  7. (Optional) Apply conditional formatting to column B with Condition 1 as "Formula Is", "=AND(G2>0.95,D2<$D$2)" and format bold red, and Condition 2 as "Formula Is", "=AND(G2>0.95,D2>$D$2)" and format bold green. This highlights in green the protein abundances that are significantly higher than the control/average (according to the Bonferroni-corrected t-test) and in red those that are significantly lower; the colors are easily changed. Move the control protein to the second row (first in the list) if needed.
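
The rule in step 7 can also be checked outside Excel. The sketch below applies the same two conditions to the tab-delimited output, reading columns D and G as fields 4 and 7 and treating row 2 as the control, exactly as the formulas do. The function name classify_rows and the text labels are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper, not part of pepXML2Excel.
# Input on stdin: the tab-delimited output with a header in row 1.
# Output: spreadsheet row number<TAB>label.
classify_rows() {
    awk -F'\t' '
    NR == 1 { next }                                  # skip the header row
    NR == 2 { control = $4 + 0; print NR "\tcontrol"; next }
    {
        label = "unchanged"
        if ($7 + 0 > 0.95 && $4 + 0 < control) label = "lower"    # bold red
        if ($7 + 0 > 0.95 && $4 + 0 > control) label = "higher"   # bold green
        print NR "\t" label
    }'
}
```

It could be run as classify_rows < interact.txt once the control protein is in the second row.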

To include AGI IDs ("At numbers"), or any other alternative protein accession numbers, a mapping between the accession numbers in the search database and the AGI IDs (or equivalent) must be present in the same directory as the interact files, for instance UniProt2At_map.txt. For simplicity, the name of this file is incorporated in the AWK program itself (change this to use another mapping). If this file is omitted, these fields will be empty in the resulting Excel spreadsheet.
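
A lookup of this kind might look as follows in AWK. This is a sketch under assumptions, not the code in pepXML2Excel: the function name map_accessions is hypothetical, and the mapping file is assumed to hold one tab-separated pair (accession, alternative ID) per line, which may differ from the real layout of UniProt2At_map.txt.

```sh
#!/bin/sh
# Hypothetical helper, not part of pepXML2Excel.
# $1: mapping file, assumed to contain accession<TAB>alternative ID per line.
# Input on stdin: one accession per line.
# Output: accession<TAB>alternative ID (empty when no mapping is found).
map_accessions() {
    awk -F'\t' -v OFS='\t' -v mapfile="$1" '
    BEGIN {
        # Load the mapping; a missing file simply leaves the table empty.
        while ((getline line < mapfile) > 0) {
            split(line, f, "\t")
            alt[f[1]] = f[2]
        }
    }
    { print $1, (($1 in alt) ? alt[$1] : "") }'
}
```

When the mapping file is absent, every second field comes out empty, which matches the behavior described above for an omitted file.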