SIRIUS is Java-based software for analyzing metabolites from tandem mass spectrometry data. It combines the analysis of isotope patterns in MS spectra with the analysis of fragmentation patterns in MS/MS spectra, and uses CSI:FingerID as a web service to search molecular structure databases. It also integrates CANOPUS for de novo compound class prediction. To get started quickly, see the quick-start guide or our book chapter on De Novo Molecular Formula Annotation and Structure Elucidation Using SIRIUS 4 (Preprint).

SIRIUS requires high mass accuracy data. The mass deviation of your MS and MS/MS spectra should be within 20 ppm. Mass spectrometry instruments such as TOF, Orbitrap and FT-ICR usually provide high mass accuracy data, as do coupled instruments like Q-TOF, IT-TOF or IT-Orbitrap. Spectra measured with a quadrupole or linear trap do not provide the high mass accuracy that our method requires. See “Mass deviations” for what “mass accuracy” means in detail for SIRIUS.
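To make the 20 ppm criterion concrete, here is a minimal sketch of how a relative mass deviation in ppm is computed. The function name and the example m/z values are assumptions made for this illustration; they are not part of SIRIUS.

```python
def mass_deviation_ppm(measured_mz: float, theoretical_mz: float) -> float:
    """Relative deviation of a measured m/z from its theoretical value, in ppm."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical example values: a peak measured 0.002 m/z units above
# a theoretical m/z of 200.0 deviates by roughly 10 ppm.
deviation = mass_deviation_ppm(200.002, 200.0)
within_tolerance = abs(deviation) <= 20.0  # the 20 ppm limit stated above
print(f"{deviation:.1f} ppm, within tolerance: {within_tolerance}")
```

Because the deviation is relative, the same absolute error of 0.002 corresponds to a larger ppm value at lower m/z and a smaller one at higher m/z.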

SIRIUS expects MS and MS/MS spectra as input. It is possible to omit the MS data, but this makes the analysis more time-consuming and may give you worse results. In this case, you should consider limiting the candidate molecular formulas to those found in PubChem.

SIRIUS expects processed peak lists (centroided spectra). It does not contain routines for peak picking from profile spectra. This is a deliberate design decision: we want you to use the best peak picking software out there or, alternatively, your favorite software. There are several tools specialized for this task, such as OpenMS, MZmine or XCMS. See our video tutorials on how to preprocess your data for SIRIUS with OpenMS or MZmine. However, since version 4.4.0 SIRIUS contains a zero-parameter preprocessing tool that directly imports LC-MS runs in .mzML (or .mzXML) format to help you get started quickly. See this video tutorial on how to use MSconvert/ProteoWizard to convert your vendor formats to mzML for SIRIUS.

SIRIUS will identify the molecular formula of the measured precursor ion, and will also annotate the spectrum by providing a molecular formula for each fragment peak. Peaks that receive no annotation are assumed to be noise peaks. Furthermore, a fragmentation tree is predicted; this tree contains the predicted fragmentation reactions leading to the fragment peaks.
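The idea of annotating peaks with molecular formulas, and treating unannotated peaks as noise, can be illustrated with a deliberately simplified sketch. This is naive mass matching against a hand-picked candidate list, not the fragmentation-tree computation that SIRIUS performs. All function names, the candidate ions, the peak values and the 10 ppm tolerance are assumptions made for illustration.

```python
# Monoisotopic masses of the most abundant isotopes, in Da (rounded).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}
ELECTRON_MASS = 0.00054858


def cation_mz(atom_counts: dict) -> float:
    """m/z of a singly charged cation with the given atom counts."""
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in atom_counts.items())
    return neutral - ELECTRON_MASS  # one electron removed, charge 1


def annotate_peaks(peaks_mz, candidates, tolerance_ppm=10.0):
    """Pair each peak with the closest candidate within the tolerance, else None."""
    annotated = []
    for mz in peaks_mz:
        best_name, best_error = None, tolerance_ppm
        for name, atom_counts in candidates.items():
            theoretical = cation_mz(atom_counts)
            error_ppm = abs(mz - theoretical) / theoretical * 1e6
            if error_ppm <= best_error:
                best_name, best_error = name, error_ppm
        annotated.append((mz, best_name))
    return annotated


# Hypothetical candidate fragment ions and hypothetical measured peaks.
candidates = {"C7H7+": {"C": 7, "H": 7}, "C6H5+": {"C": 6, "H": 5}}
peaks = [91.0543, 77.0385, 85.5000]

for mz, name in annotate_peaks(peaks, candidates):
    # Peaks without a matching formula are the ones treated as noise.
    print(mz, name if name is not None else "unannotated (treated as noise)")
```

In this toy input, the first two peaks fall within the tolerance of a candidate ion, while the third matches nothing and stays unannotated.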

ZODIAC improves the ranking of the formula candidates provided by SIRIUS. It re-ranks the candidates by considering joint fragments and losses between fragmentation trees of different compounds in a data set.

CSI:FingerID identifies the structure of a compound by searching in a molecular structure database. Here and in the following, “structure” refers to the identity and connectivity (with bond multiplicities) of the atoms, but not to stereochemistry. Elucidation of stereochemistry is currently beyond the power of automated search engines.

The COSMIC confidence score assigns a confidence to CSI:FingerID structure identifications. The idea is similar to False Discovery Rates: it allows you to run CSI:FingerID in high throughput on thousands of compounds and select the most confident identifications. The workflow of generating a structure database, searching with CSI:FingerID and ranking hits by confidence score is termed the COSMIC workflow. Make your data interpretation workflow easier by first identifying the most confident compounds in your sample, then using them to generate knowledge or hypotheses.
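The selection step described above can be sketched as a simple filter-and-sort over scored hits. This is only an illustration of the ranking idea: the record layout, the placeholder identifiers, the assumption that scores lie in [0, 1], and the threshold value are all hypothetical and do not reflect the SIRIUS output format or a recommended cutoff.

```python
from typing import NamedTuple


class StructureHit(NamedTuple):
    compound_id: str   # hypothetical identifier of the query compound
    structure: str     # placeholder label for the top structure candidate
    confidence: float  # confidence score, assumed here to lie in [0, 1]


def most_confident(hits, min_confidence):
    """Return hits at or above the threshold, sorted by descending confidence."""
    kept = [hit for hit in hits if hit.confidence >= min_confidence]
    return sorted(kept, key=lambda hit: hit.confidence, reverse=True)


# Made-up example data, for illustration only.
hits = [
    StructureHit("compound_1", "structure_A", 0.35),
    StructureHit("compound_2", "structure_B", 0.92),
    StructureHit("compound_3", "structure_C", 0.71),
]

selected = most_confident(hits, min_confidence=0.7)
for hit in selected:
    print(hit.compound_id, hit.structure, hit.confidence)
```

Starting the interpretation from the top of such a ranked list matches the workflow described above: examine the most confident identifications first, then work downward as far as the data warrant.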

CANOPUS predicts compound classes from the molecular fingerprint predicted by CSI:FingerID without any database search involved. Hence, it provides structural information for compounds for which neither spectral nor structural reference data are available.

The SIRIUS software can also be used within an analysis pipeline. For example, you can identify the molecular formula of the ion and fragment peaks, and use this information as input for other tools such as FingerID or MAGMa to identify the structure of the measured compound. For this purpose, you can either use the command line interface or the SIRIUS libraries directly. See boecker-lab/sirius-libs for the sources. The pre-built jars are available via our Maven repository. See “Developer information” for details.

Since version 3.1, our software ships with a Graphical User Interface (GUI). The GUI version also includes the command-line tool. A slim version without the GUI is available as a separate download. Since version 4.4.0, the GUI and CLI share the same persistence layer, so all results and intermediate steps can be exported and imported between GUI and CLI.

Literature

The scientific development behind SIRIUS, ZODIAC, CSI:FingerID and CANOPUS required numerous man-years of PhD students, postdocs and principal investigators; an educated guess would be roughly 35 man-years. This estimate does not include building the shiny Graphical User Interface that was introduced in version 3.1. But it is not the user interface or software development that does the work here; it is our scientific research that made SIRIUS, ZODIAC, CSI:FingerID and CANOPUS possible. It is understood that the work of 15 years cannot be described in a single paper.

Please cite all papers that you feel are relevant for your work. Please do not cite this manual or the SIRIUS or CSI:FingerID website, but rather our scientific papers.