Have a question? Find a bug? Submit a support ticket.

A Brief Description of Proteome Cluster

Proteome Cluster is developed and maintained by Bioproximity, LLC. Proteome Cluster is a web-based, high-performance computing solution for searching, storing, analyzing and viewing tandem mass spectrometry data. Searches are conducted on message passing interface (MPI) Beowulf compute clusters hosted on Amazon Web Services. Multiple clusters can be launched concurrently, allowing many searches to be conducted at the same time. Each cluster node is equipped with two eight-core CPUs (16 cores per node). Each cluster can be configured with many nodes, allowing even very large MudPIT-style experiments to be analyzed in minutes. Upon completion of each search, the output data is saved and the cluster is shut down.

Proteome Cluster uses numerous open-source algorithms including ProteoWizard (1), OpenMS (2), OMSSA (3), X!Tandem (4), X!Hunter (5), DirecTag (6) and QuaMeter (7). Use of open-source, freely available solutions allows end-users to easily reproduce, modify or extend the data analyses conducted on Proteome Cluster. All data generated is available for download, often in multiple formats. This includes the critical parameter settings that are necessary to reproduce and understand the analyses.

While other open-source search algorithms exist, we use the OMSSA and X!-based algorithms because, in contrast to virtually all other algorithms, they return expectation value-based scores for every peptide assignment. Expectation value-based scores allow the user to evaluate the likelihood that a match is a random assignment. An excellent discussion of the differences between expectation-based scoring and other scoring types has been published (8). An added benefit of expectation value-based scoring is that protein expectation values may be derived from the peptide scores. This removes the need for the so-called "two-peptide" rule (9).
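As a rough illustration of how expectation values can be used downstream, the sketch below filters peptide matches by expectation value. It relies on the usual reading of an expectation value as the number of matches expected to score at least as well by chance alone, so smaller is better. The PeptideMatch record, the filter_by_expectation function, the example data and the 0.01 cutoff are all assumptions made for this example; they are not part of Proteome Cluster, OMSSA or X!Tandem, and the sketch does not reproduce how those tools compute their scores.

```python
from typing import List, NamedTuple


class PeptideMatch(NamedTuple):
    """Hypothetical record for one peptide-spectrum match."""
    spectrum_id: str
    peptide: str
    expectation: float  # expected number of random matches scoring at least as well


def filter_by_expectation(matches: List[PeptideMatch],
                          max_expectation: float = 0.01) -> List[PeptideMatch]:
    """Keep matches whose expectation value is at or below the cutoff.

    The default cutoff of 0.01 is an illustrative assumption, not a
    recommended setting.
    """
    return [m for m in matches if m.expectation <= max_expectation]


# Made-up example data.
matches = [
    PeptideMatch("scan_1", "PEPTIDEK", 1e-5),
    PeptideMatch("scan_2", "SAMPLER", 0.5),
    PeptideMatch("scan_3", "ELVISK", 0.003),
]

accepted = filter_by_expectation(matches)
print([m.spectrum_id for m in accepted])  # prints ['scan_1', 'scan_3']
```

Because every assignment carries its own expectation value, a cutoff like this can be applied to each match individually.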

How to Use Proteome Cluster

Proteome Cluster consists of five main sections navigable by the menu in the upper-right corner of the page.

The Upload page allows you to upload raw mass spectrometry data in mz5, mzML or mzXML formats or processed peak list data in the MGF format.
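For readers unfamiliar with MGF, it is a plain-text peak list format in which each spectrum sits between BEGIN IONS and END IONS lines, with KEY=value header lines followed by m/z and intensity pairs. The sketch below is a minimal reader that can be used to sanity-check a file before uploading it. The function name read_mgf and the example spectrum are assumptions for illustration; this is not how Proteome Cluster itself reads uploads, and it assumes every peak line carries both an m/z and an intensity value.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_mgf(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per BEGIN IONS ... END IONS block.

    Each dict has "params" (header KEY=value pairs) and "peaks"
    (a list of (m/z, intensity) tuples).
    """
    spectrum = None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            params: Dict[str, str] = {}
            peaks: List[Tuple[float, float]] = []
            spectrum = {"params": params, "peaks": peaks}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                spectrum["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z, intensity, optionally further columns
                fields = line.split()
                spectrum["peaks"].append((float(fields[0]), float(fields[1])))


# Made-up example spectrum.
EXAMPLE = """BEGIN IONS
TITLE=example scan 1
PEPMASS=500.25 12345.0
CHARGE=2+
100.1 10.0
200.2 20.5
END IONS
"""

spectra = list(read_mgf(EXAMPLE.splitlines()))
print(len(spectra), spectra[0]["params"]["TITLE"], len(spectra[0]["peaks"]))
```

To check a file on disk, pass an open file handle instead of the example string, for instance `read_mgf(open("peaks.mgf"))`, where peaks.mgf is a placeholder file name.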

The Runs page lists uploaded data files and is the page from which searches may be launched.

The Searches pages consist of the Search Queue page, the Search Parameters page and the Search Workflows page. The Search Queue page lists ongoing and completed searches and links to the results. The Search Parameters page lists saved tandem MS search parameter settings and allows you to create new settings or edit existing ones. The Search Workflows page lists sets of search parameters saved as workflows. When launching a search cluster, a workflow may be selected instead of a single set of search parameters. The cluster will then search the data with every set of search parameters defined in the workflow.
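The relationship between runs, search parameters and workflows can be pictured as a simple expansion: each selected run is searched once per parameter set in the workflow. The sketch below models that idea only. The SearchParameters and Workflow classes, the expand_jobs function and all names and values in the example are hypothetical and do not correspond to any Proteome Cluster API.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class SearchParameters:
    """Hypothetical saved parameter set."""
    name: str
    engine: str  # for example "OMSSA" or "X!Tandem"


@dataclass(frozen=True)
class Workflow:
    """Hypothetical workflow: a named collection of parameter sets."""
    name: str
    parameter_sets: Tuple[SearchParameters, ...]


def expand_jobs(runs: List[str],
                workflow: Workflow) -> List[Tuple[str, SearchParameters]]:
    """Pair every run with every parameter set in the workflow."""
    return list(product(runs, workflow.parameter_sets))


# Made-up example: two runs searched with a two-step workflow.
workflow = Workflow(
    name="example-workflow",
    parameter_sets=(
        SearchParameters("tryptic-omssa", "OMSSA"),
        SearchParameters("tryptic-tandem", "X!Tandem"),
    ),
)
jobs = expand_jobs(["run_A.mzML", "run_B.mzML"], workflow)
for run, params in jobs:
    print(run, params.name)
```

With two runs and two parameter sets, the expansion yields four searches.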

The Projects page lists all projects. Projects centralize a defined set of runs, searches, search parameters and output data and allow experimental methods to be described and attachments to be added. Projects may be shared with other users.

The Support page brings you to this Wiki.


  1. Sturm, M., Bertsch, A., Gröpl, C., Hildebrandt, A., Hussong, R., Lange, E., Pfeifer, N., Schulz-Trieglaff, O., Zerck, A., Reinert, K., Kohlbacher, O., 2008. OpenMS – an open-source software framework for mass spectrometry. BMC Bioinformatics 9, 163. URL http://dx.doi.org/10.1186/1471-2105-9-163
  2. Tautenhahn, R., Böttcher, C., Neumann, S., Nov. 2008. Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics 9 (1), 504. URL http://dx.doi.org/10.1186/1471-2105-9-504
  3. Geer, L. Y., Markey, S. P., Kowalak, J. A., Wagner, L., Xu, M., Maynard, D. M., Yang, X., Shi, W., Bryant, S. H., Jul. 2004. Open mass spectrometry search algorithm. J. Proteome Res. 3 (5), 958-964. URL http://dx.doi.org/10.1021/pr0499491
  4. Craig, R., Beavis, R. C., Jun. 2004. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20 (9), 1466-1467. URL http://dx.doi.org/10.1093/bioinformatics/bth092
  5. Craig, R., Cortens, J. C., Fenyo, D., Beavis, R. C., Aug. 2006. Using annotated peptide mass spectrum libraries for protein identification. J. Proteome Res. 5 (8), 1843-1849. URL http://dx.doi.org/10.1021/pr0602085
  6. Tabb, D. L., Ma, Z.-Q., Martin, D. B., Ham, A.-J. L., Chambers, M. C., Jul. 2008. DirecTag: Accurate sequence tags from peptide MS/MS through statistical scoring. J. Proteome Res. 7 (9), 3838-3846. URL http://dx.doi.org/10.1021/pr800154p
  7. Ma, Z.-Q., Polzin, K. O., Dasari, S., Chambers, M. C., Schilling, B., Gibson, B. W., Tran, B. Q., Vega-Montoto, L., Liebler, D. C., Tabb, D. L., Jul. 2012. QuaMeter: Multivendor performance metrics for LC-MS/MS proteomics instrumentation. Analytical Chemistry 84 (14), 5845-5850. URL http://dx.doi.org/10.1021/ac300629p
  8. Gupta, N., Bandeira, N., Keich, U., Pevzner, P. A., Jul. 2011. Target-Decoy approach and false discovery rate: When things may go wrong. Journal of The American Society for Mass Spectrometry 22 (7), 1111-1120. URL http://dx.doi.org/10.1007/s13361-011-0139-3
  9. Gupta, N., Pevzner, P. A., Jul. 2009. False discovery rates of protein identifications: A strike against the Two-Peptide rule. J. Proteome Res. 8 (9), 4173-4181. URL http://dx.doi.org/10.1021/pr9004794