Experiment Package

In the following section, the structure of a zipped Experiment Package will be explained. Please note that each and every file listed below can be created and modified directly from your Home Page, with no need to adjust them manually outside PathLay. Each Experiment Package is defined by up to 8 text files, each with a different extension:

Experiment Package Content
File Extension	Content
.conf	Holds the main configuration of the Experiment Package
.mrna	Holds the Transcriptomic dataset
.prot	Holds the Proteomic dataset
.mirna	Holds the miRNomic dataset
.meth	Holds the Methylomic dataset
.chroma	Holds the Chromatin Status dataset
.ont	Holds the Gene Ontologies entries of interest of the Experiment Package
.meta	Holds the Metabolomic dataset

Every experiment created in your Home Page is automatically assigned a name such as “exp1”, “exp2”, “exp3” and so on. If you want to manually configure an experiment from scratch, you have to name your files following this convention (e.g. exp1.conf, exp1.mrna, exp1.meta, etc.).
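The naming convention can be illustrated with a short Python sketch. The extensions come from the Experiment Package Content table above; the helper name expected_files is hypothetical and is not part of PathLay.

```python
# Extensions and their content, as listed in the Experiment Package Content table.
PACKAGE_EXTENSIONS = {
    ".conf": "main configuration",
    ".mrna": "Transcriptomic dataset",
    ".prot": "Proteomic dataset",
    ".mirna": "miRNomic dataset",
    ".meth": "Methylomic dataset",
    ".chroma": "Chromatin Status dataset",
    ".ont": "Gene Ontologies entries of interest",
    ".meta": "Metabolomic dataset",
}


def expected_files(exp_name):
    """Return every file name an Experiment Package called exp_name may contain.

    Hypothetical helper for illustration only.
    """
    return [exp_name + extension for extension in PACKAGE_EXTENSIONS]


for file_name in expected_files("exp1"):
    print(file_name)
```

A package does not need to contain all of these files: only the datasets that are actually part of the experiment have to be present.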

Configuration File

The configuration file for the Experiment Package is structured as a series of tag and value pairs, separated by a “=” character, with each line of the file holding one pair.
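As a minimal sketch of how such a file could be read outside PathLay, the Python function below turns the tag and value pairs into a dictionary. The function name parse_conf is hypothetical, and splitting each line on the first “=” only (so that a value may itself contain “=”) is an assumption, not documented PathLay behaviour.

```python
def parse_conf(text):
    """Parse 'tag=value' lines into a dict (hypothetical helper, not PathLay code)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: only the first "=" separates the tag from its value.
        tag, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a tag=value pair: {line!r}")
        settings[tag.strip()] = value.strip()
    return settings


example = "expname=LTED vs MCF7+\norganism=hsa\ngeneIdType=entrez\ngene_id_column=8\n"
conf = parse_conf(example)
print(conf["expname"])
print(int(conf["gene_id_column"]))
```

All values are kept as strings; column tags can be converted with int() where a number is needed.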

Configuration File Tags
Tag	Description	Content
expname	Holds the name of the Experiment Package	String of text
comments	Holds brief comments for the Experiment Package	String of text
organism	Holds the organism ID related to the experiment	hsa; mmu
geneIdType	Holds the ID type for the Transcriptomic dataset	entrez; ensembl; symbol;
gene_id_column	Holds the number representing the column with the IDs in the Transcriptomic dataset	Integer Number
gene_dev_column	Holds the number representing the column with the differential expression values in the Transcriptomic dataset	Integer Number
gene_pvalue_column	Holds the number representing the column with the p-values in the Transcriptomic dataset	Integer Number
urnaIdType	Holds the ID type for the miRNomic dataset	mirbase
urna_id_column	Holds the number representing the column with the IDs in the miRNomic dataset	Integer Number
urna_dev_column	Holds the number representing the column with the differential expression values in the miRNomic dataset	Integer Number
urna_pvalue_column	Holds the number representing the column with the p-values in the miRNomic dataset	Integer Number
metaIdType	Holds the ID type for the Metabolomic dataset	keggcompound; name;
meta_id_column	Holds the number representing the column with the IDs in the Metabolomic dataset	Integer Number
meta_dev_column	Holds the number representing the column with the expression values in the Metabolomic dataset	Integer Number
meta_pvalue_column	Holds the number representing the column with the p-values in the Metabolomic dataset	Integer Number
methIdType	Holds the ID type for the Methylomic dataset	entrez; ensembl; symbol;
meth_id_column	Holds the number representing the column with the IDs in the Methylomic dataset	Integer Number
meth_dev_column	Holds the number representing the column with the differential expression values in the Methylomic dataset	Integer Number
meth_pvalue_column	Holds the number representing the column with the p-values in the Methylomic dataset	Integer Number
protIdType	Holds the ID type for the Proteomic dataset	entry; entrez; symbol;
prot_id_column	Holds the number representing the column with the IDs in the Proteomic dataset	Integer Number
prot_dev_column	Holds the number representing the column with the differential expression values in the Proteomic dataset	Integer Number
prot_pvalue_column	Holds the number representing the column with the p-values in the Proteomic dataset	Integer Number
chromaIdType	Holds the ID type for the Chromatin Status dataset	entrez; ensembl; symbol;
chroma_id_column	Holds the number representing the column with the IDs in the Chromatin Status dataset	Integer Number
chroma_dev_column	Holds the number representing the column with the differential expression values in the Chromatin Status dataset	Integer Number
chroma_pvalue_column	Holds the number representing the column with the p-values in the Chromatin Status dataset	Integer Number

An example of a fully configured .conf file is displayed below:

Structure of a .conf file of an Experiment Package with Transcriptomic, Proteomic, miRNomic, Methylomic, Metabolomic and Chromatin Status datasets associated

expname=LTED vs MCF7+
comments=ANOVA adj.p + Tukey
organism=hsa
geneIdType=entrez
gene_id_column=8
gene_dev_column=6
gene_pvalue_column=5
protIdType=entry
prot_id_column=1
prot_dev_column=4
prot_pvalue_column=3
urnaIdType=mirbase
urna_id_column=1
urna_dev_column=6
urna_pvalue_column=5
methIdType=entrez
meth_id_column=8
meth_dev_column=6
meth_pvalue_column=5
metaIdType=keggcompound
meta_id_column=8
meta_dev_column=6
meta_pvalue_column=5
chromaIdType=entrez
chroma_id_column=8
chroma_dev_column=6
chroma_pvalue_column=5

The Experiment Package is named “LTED vs MCF7+” and is related to an experiment involving Homo sapiens cells. It has all the -omics datasets currently supported by PathLay associated with it. We can read its configuration as follows:

The Transcriptomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The miRNomic dataset has its miRBase IDs stored in the 1st column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Metabolomic dataset has its KEGG Compound IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Methylomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Chromatin Status dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.
The Proteomic dataset has its UniProt Entry IDs stored in the 1st column, Effect Size values stored in the 4th column and p-values stored in the 3rd column.
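The same reading can be sketched in Python by grouping the tags by their prefix (gene, prot, urna, meth, meta, chroma). The function dataset_columns is a hypothetical helper, and treating a missing IdType tag as "dataset not configured" is an assumption made for this illustration.

```python
# Tag prefixes as they appear in the Configuration File Tags table.
DATASET_PREFIXES = {
    "gene": "Transcriptomic",
    "prot": "Proteomic",
    "urna": "miRNomic",
    "meth": "Methylomic",
    "meta": "Metabolomic",
    "chroma": "Chromatin Status",
}


def dataset_columns(conf):
    """Group the <prefix>IdType and <prefix>_*_column tags by dataset."""
    summary = {}
    for prefix in DATASET_PREFIXES:
        id_type = conf.get(prefix + "IdType")
        if id_type is None:
            continue  # assumption: no IdType tag means the dataset is absent
        entry = {"id_type": id_type}
        for field in ("id", "dev", "pvalue"):
            value = conf.get(f"{prefix}_{field}_column")
            if value is not None:
                entry[field] = int(value)  # column numbers count from 1
        summary[prefix] = entry
    return summary


# A subset of the example .conf file, already split into tags and values.
conf = {
    "protIdType": "entry",
    "prot_id_column": "1",
    "prot_dev_column": "4",
    "prot_pvalue_column": "3",
    "urnaIdType": "mirbase",
    "urna_id_column": "1",
    "urna_dev_column": "6",
    "urna_pvalue_column": "5",
}
summary = dataset_columns(conf)
for prefix, entry in summary.items():
    print(DATASET_PREFIXES[prefix], entry)
```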

Note

Experiment Packages can be created and configured in your Home Page with a more intuitive approach. Once the “Save” button is clicked the .conf file will be automatically generated and saved in your home folder.

Datasets Files

Each dataset file (i.e. a file with a .mrna, .prot, .mirna, .meta, .meth or .chroma extension) is a tab-separated file that can have any number of columns. How many of them are actually needed (one, two or three) depends on the Analysis configuration you choose for PathLay (see more in the Configuration Page section); the only mandatory field is the one containing the IDs.
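A minimal Python sketch of reading such a file is shown below. The function read_dataset is hypothetical, the sample rows are invented for illustration, and the absence of a header line is an assumption. Column numbers count from 1, matching the way the .conf example above is read (gene_id_column=8 means the 8th column).

```python
import csv
import io


def read_dataset(handle, id_column, dev_column=None, pvalue_column=None):
    """Read a tab-separated dataset; only the ID column is mandatory.

    Hypothetical helper: values are returned as strings, and a column that is
    not configured or not present in a row is reported as None.
    """
    if id_column < 1:
        raise ValueError("column numbers count from 1")

    def pick(fields, column):
        if column is None or column > len(fields):
            return None
        return fields[column - 1]  # convert the 1-based number to an index

    records = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip empty lines
        records.append({
            "id": pick(fields, id_column),
            "dev": pick(fields, dev_column),
            "pvalue": pick(fields, pvalue_column),
        })
    return records


# Invented sample content: ID, effect size, p-value (assumed to have no header).
sample = io.StringIO("2308\t1.5\t0.01\n2309\t-0.8\t0.20\n")
rows = read_dataset(sample, id_column=1, dev_column=2, pvalue_column=3)
print(rows[0])

# An ID-only dataset is read the same way, leaving the other fields empty.
ids_only = read_dataset(io.StringIO("2308\n"), id_column=1)
print(ids_only[0])
```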

Note

Datasets can be copied and pasted in their related text areas in your Home Page. Once the “Save” button is clicked the dataset file will be created in your home folder and named after the experiment.

Supported IDs

PathLay supports a variety of ID types as input, which are converted to the default ID type during analysis. All the ID types are summarized in the table “Compatible IDs to use in your Datasets” and some examples are provided in the table “Examples of ID Types”.

Compatible IDs to use in your Datasets
Dataset Type	Supported Input IDs	ID Type for Analysis
Gene	Entrez Gene ID; Ensembl; Gene Symbol;	Entrez Gene ID
Protein	Entrez Gene ID; UniProtKB Entry; Protein Symbol;	Entrez Gene ID
miRNA	miRBase ID	miRBase ID
Metabolite	KEGG Compounds; Compound Name;	KEGG Compounds
Methylation	Entrez Gene ID; Ensembl; Gene Symbol;	Entrez Gene ID
Chromatin Status	Entrez Gene ID; Ensembl; Gene Symbol;	Entrez Gene ID

Examples of ID Types
ID Type	Example
Entrez Gene ID	2308
Gene Symbol	FOXO1
Ensembl	ENSG00000150907
UniProtKB Entry	Q12778
Protein Symbol	FOXO1
miRBase ID	hsa-miR-183-5p
KEGG Compounds	C03758
Compound Name	Dopamine
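To spot obvious mix-ups before pasting a dataset, the shape of each ID can be checked with a few regular expressions. The Python sketch below is a rough heuristic based on the examples in the table above and on the usual formats of these identifiers; it is not how PathLay itself recognises or converts IDs, and the function guess_id_type is hypothetical. Gene Symbols, Protein Symbols and Compound Names have no fixed shape, so they all fall into a single catch-all group.

```python
import re

# Heuristic patterns (assumptions for illustration), checked in this order.
ID_PATTERNS = [
    ("Entrez Gene ID", re.compile(r"\d+")),
    ("Ensembl", re.compile(r"ENS[A-Z]*G\d{11}")),
    ("KEGG Compounds", re.compile(r"C\d{5}")),
    ("miRBase ID", re.compile(r"[a-z]{3,4}-(?:miR|mir|let)-\S+")),
    # Accession number format published by UniProt.
    ("UniProtKB Entry", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    )),
]


def guess_id_type(identifier):
    """Return the first ID type whose pattern matches the whole identifier."""
    for id_type, pattern in ID_PATTERNS:
        if pattern.fullmatch(identifier):
            return id_type
    return "Symbol or Name"


for example in ("2308", "ENSG00000150907", "Q12778", "hsa-miR-183-5p", "C03758", "FOXO1"):
    print(example, "->", guess_id_type(example))
```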