DbNPQueries

From BioAssist

This webpage is no longer maintained and is kept only for archival purposes. Please refer to https://trac.nbic.nl/gscf and http://dbnp.org for up-to-date information on this project.


Queries that relate only to metadata

Who else is doing similar studies? (possible collaborations: Who has an interesting cohort/intervention study? or who can corroborate my data?) This question can be reformulated technically as:

 •	Find all studies which have event X (e.g. treatment 'paracetamol')
 -	Covered only by the full text search
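
As an illustration only, the event query above could be sketched as follows. The `Study` record and the `studies_with_event` function are hypothetical names, not part of the actual application, and a case-insensitive substring match stands in for the real full text search.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Study:
    """Hypothetical, simplified study record; not the actual data model."""
    title: str
    events: List[str] = field(default_factory=list)  # free-text event descriptions


def studies_with_event(studies: List[Study], term: str) -> List[Study]:
    """Return the studies with at least one event whose text contains `term`."""
    needle = term.lower()  # case-insensitive substring match, standing in for full text search
    return [s for s in studies if any(needle in e.lower() for e in s.events)]


# Made-up example data
studies = [
    Study("Study A", ["treatment: paracetamol", "fasting"]),
    Study("Study B", ["high-fat diet"]),
]

print([s.title for s in studies_with_event(studies, "paracetamol")])
```

With the made-up data above, only "Study A" is returned.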

Which other measurements were performed on a certain sample/experiment? This question can be reformulated technically as:

 •	List all assays which are performed on study X
 +	Covered in study overview

(see also http://www.ebi.ac.uk/bioinvindex/browse_studies.seam?searchPattern=&cid=1755)

Were all data obtained on the same date?

 +	Covered in study overview - timeline

Was the same strain used?

 +	Covered in study overview - subjects

How was the animal killed?

 +	Covered in study overview - samples (if present in metadata template)

How was the sample isolated/quenched?

 +	Covered in study overview - protocols

How do different samples relate to each other (time, treatment)?

 +	Covered in study overview - timeline

What was the status of the animal before the intervention?

Which machine/platform/assay was used?

What do we know about the background of the animal?

What did the intervention contain?

These questions can be resolved by:

 •	Add a second sheet to the output that includes this type of information of the selected samples


Queries that relate to both metadata and clean data

1. Give me all the data of a specific compound (metabolite/transcript) of (a sub-selection of) all studies. The selection may be based on treatment, species or organ. This should also make inter-species comparisons possible.

'All the data' is interpreted as all metadata.

This question can be reformulated technically as either:

 •	Find all studies which
have a clinical data assay which contains a feature that refers to a certain compound
[and have a fulltext match with the specified treatment in Events]
[and have species matching the given species]
[and have a Sample source matching the given organ]
and return the study list
 -	Should be covered by the first step of the advanced query

or:

 •	Find all studies which
have a metabolomics assay which contains a feature that refers to a certain metabolite
[and have a fulltext match with the specified treatment in Events]
[and have species matching the given species]
[and have a Sample source matching the given organ]
and return the study list
 -	Should be covered by the first step of the advanced query

2. Same as 1, but for a short list of compounds

See 1; 'contains a feature' becomes 'contains features'.
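
A minimal sketch of queries 1 and 2 over an in-memory list of studies is given below. The `Assay` and `Study` records and the `find_studies` function are hypothetical and only mirror the criteria listed above. The bracketed criteria are modelled as optional arguments, and the sketch assumes that for a list of compounds a single assay must contain all of them; the requirement could equally be read as 'any of them'.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Assay:
    module: str          # e.g. "clinical data" or "metabolomics"
    features: Set[str]   # names of the compounds measured in this assay


@dataclass
class Study:
    title: str
    species: str
    events: List[str]          # free-text event descriptions
    sample_sources: Set[str]   # e.g. the organs the samples were taken from
    assays: List[Assay]


def find_studies(studies, module, compounds, treatment=None, species=None, organ=None):
    """Return the studies matching the mandatory and any given optional criteria."""
    wanted = set(compounds)  # pass a list of names, not a bare string
    matches = []
    for study in studies:
        # Mandatory: one assay of the given module contains all requested compounds
        # (assumption: "all", not "any").
        if not any(a.module == module and wanted <= a.features for a in study.assays):
            continue
        # Optional criteria are only checked when a value is supplied.
        if treatment is not None and not any(treatment.lower() in e.lower() for e in study.events):
            continue
        if species is not None and study.species != species:
            continue
        if organ is not None and organ not in study.sample_sources:
            continue
        matches.append(study)
    return matches


# Made-up example data
studies = [
    Study("Study A", "Mus musculus", ["treatment: paracetamol"], {"liver"},
          [Assay("metabolomics", {"glutathione", "taurine"})]),
    Study("Study B", "Homo sapiens", ["placebo"], {"plasma"},
          [Assay("metabolomics", {"glutathione"})]),
    Study("Study C", "Mus musculus", ["treatment: paracetamol"], {"liver"},
          [Assay("clinical data", {"ALT"})]),
]

print([s.title for s in find_studies(studies, "metabolomics", ["glutathione"])])
print([s.title for s in find_studies(studies, "metabolomics", ["glutathione", "taurine"])])
```

Leaving an optional argument at `None` skips that criterion, which matches the square brackets in the reformulations above.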

3. Give me all the data of a specific compound (metabolite/transcript) which has a certain threshold value of (a sub-selection of) all studies. The selection may be based on treatment, species or organ. This should also make inter-species comparisons possible.

'All the data' is interpreted as all metadata.

This question can be reformulated technically as either:

 •	Find all studies which
have a clinical data assay which contains a feature that refers to a certain compound and whose value meets the given threshold criterion
[and have a fulltext match with the specified treatment in Events]
[and have species matching the given species]
[and have a Sample source matching the given organ]
and return the study list
 -	Should be covered by the first step of the advanced query

or:

 •	Find all studies which
have a metabolomics assay which contains a feature that refers to a certain metabolite and whose value (concentration) meets the given threshold criterion
[and have a fulltext match with the specified treatment in Events]
[and have species matching the given species]
[and have a Sample source matching the given organ]
and return the study list
 -	Should be covered by the first step of the advanced query

or:

 •	Find all studies which
have a transcriptomics assay which contains a feature that refers to a certain gene and whose value (fold change) meets the given threshold criterion
[and have a fulltext match with the specified treatment in Events]
[and have species matching the given species]
[and have a Sample source matching the given organ]
and return the study list
 -	Should be covered by the first step of the advanced query
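
The threshold criterion in query 3 could be sketched as below. All names are hypothetical, and the sketch assumes that a study matches as soon as at least one sample value of the feature meets the criterion; other readings (for example all samples, or a group mean) are equally possible. The optional treatment, species and organ criteria are left out for brevity.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Comparison operators that a threshold criterion may use
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass
class Assay:
    module: str                      # e.g. "metabolomics" or "transcriptomics"
    values: Dict[str, List[float]]   # feature name -> measured values across samples


@dataclass
class Study:
    title: str
    assays: List[Assay]


def studies_meeting_threshold(studies, module, feature, op, threshold):
    """Return the studies in which any value of `feature` satisfies `op threshold`."""
    compare = OPS[op]
    hits = []
    for study in studies:
        for assay in study.assays:
            if assay.module != module:
                continue
            # Assumption: one matching sample value is enough for the study to match.
            if any(compare(v, threshold) for v in assay.values.get(feature, [])):
                hits.append(study)
                break  # do not list the same study twice
    return hits


# Made-up example data
studies = [
    Study("Study A", [Assay("metabolomics", {"glucose": [4.8, 5.1, 7.9]})]),
    Study("Study B", [Assay("metabolomics", {"glucose": [4.2, 4.9]})]),
    Study("Study C", [Assay("transcriptomics", {"Cyp2e1": [2.4, 1.1]})]),
]

print([s.title for s in studies_meeting_threshold(studies, "metabolomics", "glucose", ">", 7.0)])
```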

July 1 barrier

4. Give me all compounds significantly regulated under a certain condition: give me the group of compounds that is always changed under a certain condition (treatment). For example, paracetamol is often used as a treatment in the database; I want the list of compounds that are significantly (p < x) changed in all those studies.

This question can be reformulated technically as:

 •	Find all studies which
have a fulltext match with the specified condition in Events
and return the P-values of all significantly regulated transcripts from any transcriptomics assays
and return the P-values of all significantly regulated metabolites from any metabolomics assays
and return the P-values of all significantly regulated compounds from any clinical data assays
 -	Coverage?

--> This implies having some kind of P-value / results cache on the data side --> for the clinical data module we should try to implement that, or at least enable uploading pre-calculated P-values
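
Assuming such a cache of pre-calculated P-values exists, the final step of query 4 could be sketched as an intersection over studies. The nested dictionary layout and the function name `significant_in_all` are assumptions made for illustration; selecting the studies by condition is taken to have happened beforehand.

```python
from typing import Dict, Set


def significant_in_all(pvalues_by_study: Dict[str, Dict[str, float]], alpha: float) -> Set[str]:
    """Return the compounds with p < alpha in every study of the cache.

    `pvalues_by_study` maps study title -> {compound name: pre-calculated P-value}.
    A compound that was not measured in a study counts as not significant there.
    """
    per_study = [
        {compound for compound, p in pvalues.items() if p < alpha}
        for pvalues in pvalues_by_study.values()
    ]
    # With no studies selected there is nothing to intersect.
    return set.intersection(*per_study) if per_study else set()


# Made-up cache for studies that matched the condition (e.g. treatment 'paracetamol')
cache = {
    "Study A": {"glutathione": 0.001, "ALT": 0.03, "glucose": 0.40},
    "Study B": {"glutathione": 0.004, "ALT": 0.20},
}

print(sorted(significant_in_all(cache, alpha=0.05)))
```

With the made-up values above, only glutathione is below 0.05 in both studies.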

5. Give me the compounds that are unchanged in all (or a selection of) studies

6. Are the transcripts/proteins that relate to an affected metabolite (or any other –omics combination) changed as well? (group regulated transcripts/proteins/metabolites by function, or connect them on the basis of gene expression)

6. Inter-omics comparisons: e.g. give me all affected clinical markers and metabolites.

7. Can changes found in one organ also be found in another organ, in the same or any other experiment with the same background?

9. How much overlap is there between data of similar experiments (e.g. interlab comparisons)?

11. What type of normalization was used?

13. GEO or PathVisio link: which pathways are changed/unchanged?

14. Would we like to store analysis files (T-profiler)?

15. Select data (e.g. of the same cell type) based on treatment (e.g. similar compound class) and reliability. QC information should be available to give an idea of reliability (QC correlation).

16. Give all experiments where the level of a compound is higher than x (a defined value)

17. Which probe sets have the smallest variation over all experiments?

18. Give all datasets for which a certain compound has a defined level (higher or lower than x) and a certain event occurred