Experiment Package

This section explains the structure of a zipped Experiment Package. Please note that each and every file listed below can be created and modified directly from your Home Page, with no need to adjust it manually outside PathLay. Each Experiment Package is defined by up to 8 different text files with different extensions:

Experiment Package Content

| File Extension | Content |
|---|---|
| .conf | Holds the main configuration of the Experiment Package |
| .mrna | Holds the Transcriptomic dataset |
| .prot | Holds the Proteomic dataset |
| .mirna | Holds the miRNomic dataset |
| .meth | Holds the Methylomic dataset |
| .chroma | Holds the Chromatin Status dataset |
| .ont | Holds the Gene Ontologies entries of interest of the Experiment Package |
| .meta | Holds the Metabolomic dataset |

Every experiment created in your Home Page is automatically assigned a name such as “exp1”, “exp2”, “exp3” and so on. If you want to configure an experiment manually from scratch, you have to name your files following this convention (e.g. exp1.conf, exp1.mrna, exp1.meta).
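The naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper name `package_filenames` is hypothetical and not part of PathLay, and the extension list is simply copied from the Experiment Package Content table.

```python
# Extensions copied from the Experiment Package Content table.
PACKAGE_EXTENSIONS = ("conf", "mrna", "prot", "mirna", "meth", "chroma", "ont", "meta")


def package_filenames(exp_number: int) -> list[str]:
    """Return the file names a package called exp<N> may contain."""
    if exp_number < 1:
        # Assumption: numbering starts at 1, as in "exp1".
        raise ValueError("experiment numbers are assumed to start at 1")
    return [f"exp{exp_number}.{ext}" for ext in PACKAGE_EXTENSIONS]


print(package_filenames(1))
```

Not every file has to exist: a package only needs the files for the datasets it actually contains.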

Configuration File

The configuration file for the Experiment Package is structured as a series of tag and value pairs, one pair per line, with the tag and the value separated by an “=” character.
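As a rough illustration of this layout, the sketch below reads tag and value pairs into a dictionary. It is not PathLay's own parser: the function name `parse_conf` is hypothetical, and skipping blank lines and splitting on the first “=” only are assumptions made for the example.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse tag=value lines into a dict, splitting on the first '=' only."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no meaning
        tag, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line without '=': {line!r}")
        config[tag.strip()] = value.strip()
    return config


sample = "expname=LTED vs MCF7+\norganism=hsa\ngene_id_column=8\n"
config = parse_conf(sample)
print(config["expname"])
```

Splitting on the first “=” keeps values that themselves contain an “=” (for instance in a comment) intact.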

Configuration File Tags

| Tag | Description | Content |
|---|---|---|
| expname | Holds the name of the Experiment Package | String of text |
| comments | Holds brief comments for the Experiment Package | String of text |
| organism | Holds the organism ID related to the experiment | hsa; mmu |
| geneIdType | Holds the ID type for the Transcriptomic dataset | entrez; ensembl; symbol |
| gene_id_column | Holds the number representing the column with the IDs in the Transcriptomic dataset | Integer Number |
| gene_dev_column | Holds the number representing the column with the differential expression values in the Transcriptomic dataset | Integer Number |
| gene_pvalue_column | Holds the number representing the column with the p-values in the Transcriptomic dataset | Integer Number |
| urnaIdType | Holds the ID type for the miRNomic dataset | mirbase |
| urna_id_column | Holds the number representing the column with the IDs in the miRNomic dataset | Integer Number |
| urna_dev_column | Holds the number representing the column with the differential expression values in the miRNomic dataset | Integer Number |
| urna_pvalue_column | Holds the number representing the column with the p-values in the miRNomic dataset | Integer Number |
| metaIdType | Holds the ID type for the Metabolomic dataset | keggcompound; name |
| meta_id_column | Holds the number representing the column with the IDs in the Metabolomic dataset | Integer Number |
| meta_dev_column | Holds the number representing the column with the expression values in the Metabolomic dataset | Integer Number |
| meta_pvalue_column | Holds the number representing the column with the p-values in the Metabolomic dataset | Integer Number |
| methIdType | Holds the ID type for the Methylomic dataset | entrez; ensembl; symbol |
| meth_id_column | Holds the number representing the column with the IDs in the Methylomic dataset | Integer Number |
| meth_dev_column | Holds the number representing the column with the differential expression values in the Methylomic dataset | Integer Number |
| meth_pvalue_column | Holds the number representing the column with the p-values in the Methylomic dataset | Integer Number |
| protIdType | Holds the ID type for the Proteomic dataset | entry; entrez; symbol |
| prot_id_column | Holds the number representing the column with the IDs in the Proteomic dataset | Integer Number |
| prot_dev_column | Holds the number representing the column with the differential expression values in the Proteomic dataset | Integer Number |
| prot_pvalue_column | Holds the number representing the column with the p-values in the Proteomic dataset | Integer Number |
| chromaIdType | Holds the ID type for the Chromatin Status dataset | entrez; ensembl; symbol |
| chroma_id_column | Holds the number representing the column with the IDs in the Chromatin Status dataset | Integer Number |
| chroma_dev_column | Holds the number representing the column with the differential expression values in the Chromatin Status dataset | Integer Number |
| chroma_pvalue_column | Holds the number representing the column with the p-values in the Chromatin Status dataset | Integer Number |

An example of a fully configured .conf file is displayed below:

Structure of a .conf file of an Experiment Package with Transcriptomic, Proteomic, miRNomic, Methylomic, Metabolomic and Chromatin Status datasets associated
expname=LTED vs MCF7+
comments=ANOVA adj.p + Tukey
organism=hsa
geneIdType=entrez
gene_id_column=8
gene_dev_column=6
gene_pvalue_column=5
protIdType=entry
prot_id_column=1
prot_dev_column=4
prot_pvalue_column=3
urnaIdType=mirbase
urna_id_column=1
urna_dev_column=6
urna_pvalue_column=5
methIdType=entrez
meth_id_column=8
meth_dev_column=6
meth_pvalue_column=5
metaIdType=keggcompound
meta_id_column=8
meta_dev_column=6
meta_pvalue_column=5
chromaIdType=entrez
chroma_id_column=8
chroma_dev_column=6
chroma_pvalue_column=5

The Experiment Package is named “LTED vs MCF7+” and relates to an experiment involving Homo sapiens cells. It has every -omics dataset currently supported by PathLay associated with it. Its configuration reads as follows:

  • The Transcriptomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.

  • The miRNomic dataset has its miRBase IDs stored in the 1st column, Effect Size values stored in the 6th column and p-values stored in the 5th column.

  • The Metabolomic dataset has its KEGG Compound IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.

  • The Methylomic dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.

  • The Chromatin Status dataset has its Entrez IDs stored in the 8th column, Effect Size values stored in the 6th column and p-values stored in the 5th column.

  • The Proteomic dataset has its UniProt Entry IDs stored in the 1st column, Effect Size values stored in the 4th column and p-values stored in the 3rd column.
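The reading above can be reproduced programmatically by grouping the tags that share a dataset prefix. The sketch below is only one possible way to do it. `DatasetColumns` and `dataset_columns` are hypothetical names, and treating the dev and p-value tags as optional is an assumption of this example.

```python
from typing import NamedTuple, Optional


class DatasetColumns(NamedTuple):
    id_type: str
    id_column: int
    dev_column: Optional[int]
    pvalue_column: Optional[int]


# Tag prefixes as they appear in the Configuration File Tags table.
PREFIXES = ("gene", "prot", "urna", "meth", "meta", "chroma")


def _column(config: dict[str, str], tag: str) -> Optional[int]:
    """Return the tag's value as an integer, or None if the tag is absent."""
    value = config.get(tag)
    return int(value) if value is not None else None


def dataset_columns(config: dict[str, str]) -> dict[str, DatasetColumns]:
    """Group the ID type and column tags of every configured dataset."""
    result = {}
    for prefix in PREFIXES:
        id_type = config.get(f"{prefix}IdType")
        id_column = _column(config, f"{prefix}_id_column")
        if id_type is None or id_column is None:
            continue  # dataset not configured in this package
        result[prefix] = DatasetColumns(
            id_type,
            id_column,
            _column(config, f"{prefix}_dev_column"),
            _column(config, f"{prefix}_pvalue_column"),
        )
    return result


# Proteomic tags taken from the example .conf file.
config = {
    "protIdType": "entry",
    "prot_id_column": "1",
    "prot_dev_column": "4",
    "prot_pvalue_column": "3",
}
columns = dataset_columns(config)
print(columns["prot"])
```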

Note

Experiment Packages can be created and configured in your Home Page with a more intuitive approach. Once the “Save” button is clicked, the .conf file is automatically generated and saved in your home folder.

Datasets Files

Each dataset file (i.e. a file with a .mrna, .prot, .mirna, .meta, .meth or .chroma extension) is a tab-separated file. It can have any number of columns, but it needs at least one, two or three of them, depending on the Analysis configuration you choose for PathLay (see the Configuration Page section). The only mandatory field is the one containing the IDs.
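To make the layout concrete, here is a minimal sketch that pulls the ID, effect size and p-value fields out of a tab-separated dataset. Several things are assumptions here rather than documented PathLay behaviour: the function name `read_dataset` is hypothetical, column numbers are read as 1-based (matching the “8th column” wording of the example configuration), the file is assumed to have no header row, and the numeric values in the sample line are invented for illustration.

```python
import csv
import io


def read_dataset(handle, id_column, dev_column=None, pvalue_column=None):
    """Yield (id, dev, pvalue) string tuples from a tab-separated dataset.

    Column numbers are 1-based. dev and pvalue are None when their
    column is not configured, since only the ID column is mandatory.
    """
    def pick(row, column):
        return row[column - 1] if column is not None else None

    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip empty lines
        yield pick(row, id_column), pick(row, dev_column), pick(row, pvalue_column)


# One made-up proteomic line: ID in column 1, p-value in 3, effect size in 4.
sample = io.StringIO("Q12778\tx\t0.01\t1.5\n")
rows = list(read_dataset(sample, id_column=1, dev_column=4, pvalue_column=3))
print(rows)
```

Values are returned as strings so that the caller decides how to convert or validate them.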

Note

Datasets can be copied and pasted into their related text areas in your Home Page. Once the “Save” button is clicked, the dataset file is created in your home folder and named after the experiment.

Supported IDs

PathLay supports a variety of ID types as input, which are converted to the default ID type during analysis. All the ID types are summarized in the table “Compatible IDs to use in your Datasets”, and some examples are provided in the table “Examples of ID Types”.

Compatible IDs to use in your Datasets

| Dataset Type | Supported Input IDs | ID Type for Analysis |
|---|---|---|
| Gene | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |
| Protein | Entrez Gene ID; UniProtKB Entry; Protein Symbol | Entrez Gene ID |
| miRNA | miRBase ID | miRBase ID |
| Metabolite | KEGG Compounds; Compound Name | KEGG Compounds |
| Methylation | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |
| Chromatin Status | Entrez Gene ID; Ensembl; Gene Symbol | Entrez Gene ID |

Examples of ID Types

| ID Type | Example |
|---|---|
| Entrez Gene ID | 2308 |
| Gene Symbol | FOXO1 |
| Ensembl | ENSG00000150907 |
| UniProtKB Entry | Q12778 |
| Protein Symbol | FOXO1 |
| miRBase ID | hsa-miR-183-5p |
| KEGG Compounds | C03758 |
| Compound Name | Dopamine |
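Before pasting a dataset, it can be handy to check that the IDs at least look like the declared type. The sketch below uses rough regular-expression heuristics based on the usual shape of each identifier. They are assumptions of this example, not the validation PathLay performs, and the names `ID_PATTERNS` and `looks_like` are hypothetical. Gene Symbol, Protein Symbol and Compound Name are left out because they are free-form text.

```python
import re

# Heuristic shapes only; a match does not prove that the ID exists.
ID_PATTERNS = {
    "Entrez Gene ID": re.compile(r"\d+"),
    "Ensembl": re.compile(r"ENS[A-Z]*G\d{11}"),
    "UniProtKB Entry": re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
    ),
    "miRBase ID": re.compile(r"[a-z]{3,4}-(?:miR|mir|let)-\S+"),
    "KEGG Compounds": re.compile(r"C\d{5}"),
}


def looks_like(id_type: str, value: str) -> bool:
    """Return True if the whole value matches the heuristic for id_type."""
    return ID_PATTERNS[id_type].fullmatch(value) is not None


print(looks_like("Ensembl", "ENSG00000150907"))
print(looks_like("KEGG Compounds", "Dopamine"))
```

A check like this only catches obvious mix-ups, such as a column of compound names declared as KEGG Compounds.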