Experiment Package
This section explains the structure of a zipped Experiment Package. Please note that each and every file listed below can be created and modified directly from your Home Page, with no need to adjust them manually outside PathLay. Each Experiment Package is defined by up to 8 different text files, each with its own extension:
| File Extension | Content |
|---|---|
| .conf | Holds the main configuration of the Experiment Package |
| .mrna | Holds the Transcriptomic dataset |
| .prot | Holds the Proteomic dataset |
| .mirna | Holds the miRNomic dataset |
| .meth | Holds the Methylomic dataset |
| .chroma | Holds the Chromatin Status dataset |
| .ont | Holds the Gene Ontologies entries of interest of the Experiment Package |
| .meta | Holds the Metabolomic dataset |

Every experiment created from your Home Page is automatically assigned a name such as “exp1”, “exp2”, “exp3” and so on. If you want to configure an experiment manually from scratch, name your files according to this convention (e.g. exp1.conf, exp1.mrna, exp1.meta).
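
As a rough illustration of the naming convention, the Python sketch below builds the file names expected for one experiment and reports which of them exist in a folder. The function names and the `folder` parameter are hypothetical and not part of PathLay; only the extensions and the `exp1`-style prefix come from this section.

```python
from pathlib import Path

# Extensions listed in the table above; the order here is arbitrary.
EXTENSIONS = ("conf", "mrna", "prot", "mirna", "meth", "chroma", "ont", "meta")


def package_files(exp_id: str, folder: str = ".") -> dict:
    """Map each extension to the path the naming convention gives it."""
    return {ext: Path(folder) / f"{exp_id}.{ext}" for ext in EXTENSIONS}


def present_files(exp_id: str, folder: str = ".") -> list:
    """Return the sorted names of the package files that exist in `folder`."""
    paths = package_files(exp_id, folder).values()
    return sorted(p.name for p in paths if p.is_file())
```

Calling `package_files("exp1")["conf"]` gives the path `exp1.conf` in the current folder, and `present_files("exp1", some_folder)` lists only the files that have actually been created there.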
Configuration File
The configuration file for the Experiment Package is structured as a series of tag and value pairs, one pair per line, with the tag and the value separated by an “=” character.
| Tag | Description | Content |
|---|---|---|
| expname | Holds the name of the Experiment Package | String of text |
| comments | Holds brief comments for the Experiment Package | String of text |
| organism | Holds the organism ID related to the experiment | hsa; mmu |
| geneIdType | Holds the ID type for the Transcriptomic dataset | entrez; ensembl; symbol |
| gene_id_column | Holds the number representing the column with the IDs in the Transcriptomic dataset | Integer Number |
| gene_dev_column | Holds the number representing the column with the differential expression values in the Transcriptomic dataset | Integer Number |
| gene_pvalue_column | Holds the number representing the column with the p-values in the Transcriptomic dataset | Integer Number |
| urnaIdType | Holds the ID type for the miRNomic dataset | mirbase |
| urna_id_column | Holds the number representing the column with the IDs in the miRNomic dataset | Integer Number |
| urna_dev_column | Holds the number representing the column with the differential expression values in the miRNomic dataset | Integer Number |
| urna_pvalue_column | Holds the number representing the column with the p-values in the miRNomic dataset | Integer Number |
| metaIdType | Holds the ID type for the Metabolomic dataset | keggcompound; name |
| meta_id_column | Holds the number representing the column with the IDs in the Metabolomic dataset | Integer Number |
| meta_dev_column | Holds the number representing the column with the expression values in the Metabolomic dataset | Integer Number |
| meta_pvalue_column | Holds the number representing the column with the p-values in the Metabolomic dataset | Integer Number |
| methIdType | Holds the ID type for the Methylomic dataset | entrez; ensembl; symbol |
| meth_id_column | Holds the number representing the column with the IDs in the Methylomic dataset | Integer Number |
| meth_dev_column | Holds the number representing the column with the differential expression values in the Methylomic dataset | Integer Number |
| meth_pvalue_column | Holds the number representing the column with the p-values in the Methylomic dataset | Integer Number |
| protIdType | Holds the ID type for the Proteomic dataset | entry; entrez; symbol |
| prot_id_column | Holds the number representing the column with the IDs in the Proteomic dataset | Integer Number |
| prot_dev_column | Holds the number representing the column with the differential expression values in the Proteomic dataset | Integer Number |
| prot_pvalue_column | Holds the number representing the column with the p-values in the Proteomic dataset | Integer Number |
| chromaIdType | Holds the ID type for the Chromatin Status dataset | entrez; ensembl; symbol |
| chroma_id_column | Holds the number representing the column with the IDs in the Chromatin Status dataset | Integer Number |
| chroma_dev_column | Holds the number representing the column with the differential expression values in the Chromatin Status dataset | Integer Number |
| chroma_pvalue_column | Holds the number representing the column with the p-values in the Chromatin Status dataset | Integer Number |

An example of a fully configured .conf file is displayed below:
expname=LTED vs MCF7+
comments=ANOVA adj.p + Tukey
organism=hsa
geneIdType=entrez
gene_id_column=8
gene_dev_column=6
gene_pvalue_column=5
protIdType=entry
prot_id_column=1
prot_dev_column=4
prot_pvalue_column=3
urnaIdType=mirbase
urna_id_column=1
urna_dev_column=6
urna_pvalue_column=5
methIdType=entrez
meth_id_column=8
meth_dev_column=6
meth_pvalue_column=5
metaIdType=keggcompound
meta_id_column=8
meta_dev_column=6
meta_pvalue_column=5
chromaIdType=entrez
chroma_id_column=8
chroma_dev_column=6
chroma_pvalue_column=5
The Experiment Package is named “LTED vs MCF7+” and relates to an experiment involving Homo sapiens cells. All the -omics datasets currently supported by PathLay are associated with it. Its configuration reads as follows:
The Transcriptomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The miRNomic dataset has its miRBase IDs stored in the 1st column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Metabolomic dataset has its KEGG Compound IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Methylomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Chromatin Status dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Proteomic dataset has its UniProt Entry IDs stored in the 1st column, Effect Size values stored in the 4th column and p-values stored in the 3rd column.
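
The same reading of the file can be sketched in Python. This is a minimal illustration of the tag=value format described above, not PathLay's own parser: the function names are hypothetical, and skipping blank lines and rejecting lines without “=” are assumptions.

```python
def parse_conf(text: str) -> dict:
    """Parse 'tag=value' lines into a dict; blank lines are skipped."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only, so a value may itself contain "=".
        tag, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a tag=value pair: {line!r}")
        conf[tag.strip()] = value.strip()
    return conf


def columns_for(conf: dict, prefix: str) -> dict:
    """Collect the *_column tags of one dataset prefix (e.g. 'gene') as integers."""
    cols = {}
    for role in ("id", "dev", "pvalue"):
        key = f"{prefix}_{role}_column"
        if key in conf:
            cols[role] = int(conf[key])
    return cols


# A shortened copy of the example .conf file shown above.
example = """expname=LTED vs MCF7+
organism=hsa
geneIdType=entrez
gene_id_column=8
gene_dev_column=6
gene_pvalue_column=5
"""

conf = parse_conf(example)
print(conf["expname"])            # LTED vs MCF7+
print(columns_for(conf, "gene"))  # {'id': 8, 'dev': 6, 'pvalue': 5}
```

Datasets that are not configured simply yield an empty dictionary, so `columns_for(conf, "prot")` returns `{}` for the shortened example.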
Note
Experiment Packages can be created and configured from your Home Page with a more intuitive approach. Once the “Save” button is clicked, the .conf file is automatically generated and saved in your home folder.
Dataset Files
Each dataset file (a file with a .mrna, .prot, .mirna, .meta, .meth or .chroma extension) is a tab-separated file. It can have any number of columns, but it needs at least one, two or three of them, depending on the Analysis configuration you choose for PathLay (see the Configuration Page section). The only mandatory column is the one containing the IDs.
Note
Datasets can be copied and pasted into their related text areas in your Home Page. Once the “Save” button is clicked, the dataset file is created in your home folder and named after the experiment.
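
To make the dataset layout concrete, here is a hedged Python sketch that reads a tab-separated dataset using column numbers like those stored in the .conf file. It treats column numbers as 1-based, in line with the example above where `gene_id_column=8` refers to the 8th column. The function names, the `has_header` switch and the sample rows (made-up values) are assumptions for illustration and do not describe how PathLay reads files internally.

```python
import csv
import io


def _cell(row, col):
    """Return the float in 1-based column `col`, or None if absent or empty."""
    if col is None or col > len(row) or not row[col - 1].strip():
        return None
    return float(row[col - 1])


def read_dataset(handle, id_col, dev_col=None, pvalue_col=None, has_header=False):
    """Yield (id, dev, pvalue) tuples from a tab-separated dataset."""
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # assumption: at most one header row
    for row in reader:
        # The ID is the only mandatory field, so rows without one are skipped.
        if len(row) < id_col or not row[id_col - 1].strip():
            continue
        yield row[id_col - 1].strip(), _cell(row, dev_col), _cell(row, pvalue_col)


# Made-up sample: IDs in column 1, p-values in column 3, effect sizes in column 4.
sample = "Q12778\t0.8\t0.003\t1.7\nP04637\t0.5\t\t-0.9\n"
rows = list(read_dataset(io.StringIO(sample), id_col=1, dev_col=4, pvalue_col=3))
print(rows)  # [('Q12778', 1.7, 0.003), ('P04637', -0.9, None)]
```

Leaving `dev_col` and `pvalue_col` unset covers the ID-only case. Non-numeric cells such as `NA` would raise a `ValueError` in this sketch and would need their own handling.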
Supported IDs
PathLay supports a variety of ID types as input, which are converted to the default ID type during analysis. The supported ID types are summarized in the first table below, and some examples are provided in the second.
| Dataset Type | Supported Input IDs | ID Type for Analysis |
|---|---|---|
| Gene | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |
| Protein | Entrez Gene ID; UniProtKB Entry; Protein Symbol | Entrez Gene ID |
| miRNA | miRBase ID | miRBase ID |
| Metabolite | KEGG Compounds; Compound Name | KEGG Compounds |
| Methylation | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |
| Chromatin Status | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |

| ID Type | Example |
|---|---|
| Entrez Gene ID | 2308 |
| Gene Symbol | FOXO1 |
| Ensembl | ENSG00000150907 |
| UniProtKB Entry | Q12778 |
| Protein Symbol | FOXO1 |
| miRBase ID | hsa-miR-183-5p |
| KEGG Compounds | C03758 |
| Compound Name | Dopamine |
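
Before pasting a dataset, it can help to check that its IDs look like the type declared in the .conf file. The Python sketch below is a loose heuristic based on the general shape of each identifier and on the examples above. It is not PathLay's validation or conversion logic, the `looks_like` helper is hypothetical, and the miRBase pattern in particular is deliberately permissive. The keys reuse the IdType values from the configuration table.

```python
import re

# Heuristic shapes only; symbols and compound names have no fixed pattern.
PATTERNS = {
    "entrez": re.compile(r"\d+"),
    "ensembl": re.compile(r"ENS[A-Z]*G\d{11}"),
    # UniProtKB accession format as published by UniProt.
    "entry": re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
    "mirbase": re.compile(r"[a-z]{3}-(?:miR|mir|let)-[\w.-]+"),
    "keggcompound": re.compile(r"C\d{5}"),
}


def looks_like(id_type: str, value: str) -> bool:
    """Return True if `value` has the general shape of `id_type`."""
    value = value.strip()
    pattern = PATTERNS.get(id_type)
    if pattern is None:
        # e.g. "symbol" or "name": accept any non-empty string.
        return bool(value)
    return pattern.fullmatch(value) is not None


print(looks_like("entrez", "2308"))    # True
print(looks_like("entrez", "FOXO1"))   # False
print(looks_like("entry", "Q12778"))   # True
```

A passing check only means the string has a plausible shape; whether the ID exists and can be converted is still decided by PathLay during analysis.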