Using a custom controlled vocabulary for rMZQC

This vignette shows R users how to set up a custom controlled vocabulary (CV) before creating an mzQC document.

Warning: settle on a CV before instantiating any mzQC objects. This ensures that all CV terms are consistent (and checked for existence) and that the CV meta information within the mzQC document is accurate.

Target Audience: R users

Create a minimal mzQC document with a custom CV

Let’s first consider what happens by default:

library(rmzqc)

print(getCVInfo())
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCcontrolledVocabulary"
## Field "name":
## [1] "Proteomics Standards Initiative Mass Spectrometry Ontology"
## Field "uri":
## [1] "https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.192/psi-ms.obo"
## Field "version":
## [1] "4.1.192"
## test if the default CV is usable
toQCMetric(id = "MS:4000059", value = 13405) ## number of MS1 scans
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCqualityMetric"
## Field "accession":
## [1] "MS:4000059"
## Field "name":
## [1] "number of MS1 spectra"
## Field "description":
## [1] "\"The number of MS1 events in the run.\" [PSI:MS]"
## Field "value":
## [1] 13405
## Field "unit":
## list()
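
Since CV terms are checked for existence, it can be handy to look up an accession yourself before building a metric. The following is only a minimal sketch: `has_cv_term()` is a hypothetical helper, not part of rmzqc, and it assumes that the table returned by `getCVSingleton()$getCV()` stores accessions in a column named `id` (an assumption this vignette does not confirm).

```r
# Hypothetical helper (not part of rmzqc): TRUE if `accession` occurs in the
# `id_col` column of a CV table. The column name "id" is an assumption.
has_cv_term <- function(cv_table, accession, id_col = "id") {
  if (!id_col %in% names(cv_table)) {
    stop("CV table has no column named '", id_col, "'")
  }
  accession %in% as.character(cv_table[[id_col]])
}

# Usage with rmzqc; skipped when the package is not installed.
if (requireNamespace("rmzqc", quietly = TRUE)) {
  library(rmzqc)
  cv_table <- getCVSingleton()$getCV()
  # Only create the metric if the assumed column exists and the term is known
  if ("id" %in% names(cv_table) && has_cv_term(cv_table, "MS:4000059")) {
    print(toQCMetric(id = "MS:4000059", value = 13405))
  }
}
```

If your CV table uses a different column for accessions, pass its name via `id_col`.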

However, if you run this code without an internet connection, rmzqc falls back to the PSI-MS CV that ships with the package, which may not contain the latest CV terms.

## With internet:
myCV = getCVSingleton()
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
myCV$setData(getCVDictionary("latest")) ## this is done internally by default when you load the package
## Downloading obo from 'https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.192/psi-ms.obo' ...
cat("Number of entries in latest CV: ", nrow(getCVSingleton()$getCV()), "\n")
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Number of entries in latest CV:  6806
print(getCVInfo())
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCcontrolledVocabulary"
## Field "name":
## [1] "Proteomics Standards Initiative Mass Spectrometry Ontology"
## Field "uri":
## [1] "https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.192/psi-ms.obo"
## Field "version":
## [1] "4.1.192"
## simulate missing internet connection by invoking the function manually
myCV$setData(getCVDictionary("local"))
cat("Number of entries in local CV: ", nrow(getCVSingleton()$getCV()), "\n")
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Number of entries in local CV:  6700
print(getCVInfo())
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCcontrolledVocabulary"
## Field "name":
## [1] "Proteomics Standards Initiative Mass Spectrometry Ontology"
## Field "uri":
## [1] "https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.129/psi-ms.obo"
## Field "version":
## [1] "4.1.129"
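
The package performs this fallback for you, but you can also make it explicit in your own scripts, for example to log which source was used. The sketch below is an illustration only: `load_cv_with_fallback()` is a hypothetical helper, not part of rmzqc, and it assumes that `getCVDictionary("latest")` signals an R error when the download fails (not confirmed by this vignette).

```r
# Hypothetical helper (not part of rmzqc): call `primary()`; if it signals an
# error, report it and call `fallback()` instead.
load_cv_with_fallback <- function(primary, fallback) {
  tryCatch(
    list(data = primary(), source = "primary"),
    error = function(e) {
      message("Primary CV source failed: ", conditionMessage(e))
      list(data = fallback(), source = "fallback")
    }
  )
}

# Usage with rmzqc; skipped when the package is not installed.
if (requireNamespace("rmzqc", quietly = TRUE)) {
  library(rmzqc)
  res <- load_cv_with_fallback(
    primary  = function() getCVDictionary("latest"),  # needs internet
    fallback = function() getCVDictionary("local")    # CV shipped with rmzqc
  )
  getCVSingleton()$setData(res$data)
  cat("CV loaded from the", res$source, "source\n")
  print(getCVInfo())
}
```

Passing the loaders as functions keeps the helper independent of rmzqc, so the same pattern works for a custom .obo file as the primary source.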

The PSI-MS CV shipped with the package might still not suit you: you may want to use the latest unpublished CV, which you downloaded somewhere or handcrafted for testing. In that case, simply use a custom .obo file:

myOBO = system.file("./cv/psi-ms.obo", package="rmzqc") ## we use a local file here, but you can point to any .obo you have (even a URI)
myCV$setData(getCVDictionary("custom", myOBO))
cat("Number of entries in custom CV: ", nrow(getCVSingleton()$getCV()), "\n")
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Number of entries in custom CV:  6700
print(getCVInfo())
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCcontrolledVocabulary"
## Field "name":
## [1] "Proteomics Standards Initiative Mass Spectrometry Ontology"
## Field "uri":
## [1] "/tmp/RtmpQvktLf/Rinstbfa4ad905dd/rmzqc/./cv/psi-ms.obo"
## Field "version":
## [1] "4.1.129"
## you may want to change the CV-entries, or URI or version manually, before creating an mzQC file:
newCV = list(CV = myCV$getData()$CV, 
             URI = "https://myURI.com",
             version = "9.9.2")
myCV$setData(newCV)
print(getCVInfo())
## Superclass Singleton has cloneable=FALSE, but subclass CV_ has cloneable=TRUE. A subclass cannot be cloneable when its superclass is not cloneable, so cloning will be disabled for CV_.
## Reference class object of class "MzQCcontrolledVocabulary"
## Field "name":
## [1] "Proteomics Standards Initiative Mass Spectrometry Ontology"
## Field "uri":
## [1] "https://myURI.com"
## Field "version":
## [1] "9.9.2"
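
When you assemble the CV data by hand as above, a quick sanity check before calling `setData()` can catch a missing or empty element early. This is a minimal sketch under stated assumptions: `check_cv_data()` is a hypothetical helper, not part of rmzqc, and it only assumes the list layout shown above (elements `CV`, `URI` and `version`, with `CV` being a table that `nrow()` can count).

```r
# Hypothetical helper (not part of rmzqc): stop with an informative message if
# the list does not look like the CV data used above; otherwise return it.
check_cv_data <- function(x) {
  required <- c("CV", "URI", "version")
  absent <- setdiff(required, names(x))
  if (length(absent) > 0) {
    stop("CV data is missing element(s): ", paste(absent, collapse = ", "))
  }
  if (is.null(nrow(x$CV)) || nrow(x$CV) == 0) {
    stop("CV table has no entries")
  }
  for (field in c("URI", "version")) {
    v <- x[[field]]
    if (!is.character(v) || length(v) != 1 || is.na(v) || !nzchar(v)) {
      stop("'", field, "' must be a single non-empty string")
    }
  }
  invisible(x)
}

# Usage with rmzqc; skipped when the package is not installed.
if (requireNamespace("rmzqc", quietly = TRUE)) {
  library(rmzqc)
  myCV <- getCVSingleton()
  newCV <- list(CV = myCV$getData()$CV,
                URI = "https://myURI.com",
                version = "9.9.2")
  myCV$setData(check_cv_data(newCV))  # only reached if the check passes
  print(getCVInfo())
}
```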