Difference between revisions of "Proteomics data formats"
From BioAssist
Edit summary: m (added format diagram). The revision adds [[Image:PeakList_format_and_software.png|thumb|500px]] above the "Peak list level" heading.
Revision as of 18:26, 8 March 2010
Data formats
Raw data level
- Thermo Xcalibur or Waters MassLynx .RAW files
The two formats are different and can be told apart on disk: an Xcalibur .RAW is a single file, whereas a MassLynx .RAW is a directory. These are the original data files written by Xcalibur or MassLynx, respectively, during data acquisition. They can be accessed programmatically on the Microsoft Windows platform through an OLE DLL. Normally this only works from a C++ implementation; however, the PeakML library (due to be released as open source; currently available on request from r.a.scheltema@rug.nl) provides a 1-to-1 mapping for accessing the data from Java implementations. It is advisable to use these original formats, because they contain a large amount of information that is not mapped to an open file format like mzML.
- Agilent .wiff files
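The file-versus-directory rule for .RAW data can be checked in a few lines. The following is a minimal sketch in Python, not part of PeakML or of any vendor library; the function name classify_raw and its return labels are hypothetical and chosen only for illustration. It looks only at the path type and never opens the data itself.

```python
from pathlib import Path


def classify_raw(path):
    """Guess the vendor of a .RAW entry from its type on disk.

    Hypothetical helper: applies only the rule stated above
    (Xcalibur = file, MassLynx = directory) and does not read the contents.
    """
    p = Path(path)
    if p.suffix.lower() != ".raw":
        return "not .RAW"
    if p.is_dir():
        return "Waters MassLynx"   # MassLynx stores a run as a directory
    if p.is_file():
        return "Thermo Xcalibur"   # Xcalibur stores a run as a single file
    return "missing"               # path does not exist
```

Calling classify_raw on each entry of an acquisition folder gives a quick first sort before a vendor-specific reader is chosen. The check is a heuristic based on the naming convention alone, so a reader should still validate the contents.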
Peak list level
- .mgf
- .dta
- .pkl
- mzML[1] (see also mzsquash, a compression tool for mzML files, or Fast Infoset)
- mzXML[1][2]
Peptide identification level
- pepXML[2]
- protXML[2]
- mzIdentML[1]
- PeakML
Developed by Richard Scheltema in response to the need to store intermediate data (extracted mass traces, matched sets of these, parameters, etc.) in order to create a modular pipeline setup.
- NetCDF (obsolete)
NetCDF was developed as a general-purpose format and as such is a very poor fit for mass spec data, so much useful information about your mass spec run is lost. Do not use it.
- analysisXML
- prideXML?
- Mascot .dat
- Mascot HTML
- Mascot CSV
- Mascot pepXML
- TandemXML
- OMSSA .omx
- OMSSA CSV
- InsPecT CSV