Pipeline configuration
The pipeline is controlled by a central configuration file, config_pxn.sh. This file records where the pipeline inputs are located, which resources will be allocated, and other PDxN-specific parameters. Every script in the pipeline depends on the contents of this file to run properly, so the first thing you need to do is modify the variables in the config file.
DO NOT CHANGE THE VARIABLE NAMES. Modify only the file paths or other values (the right-hand side of the = sign).
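The sketch below illustrates why the variable names must stay intact. It assumes the pipeline scripts load the config by sourcing it, which is a common shell pattern but is not confirmed here. The temporary file stands in for config_pxn.sh, and the `missing` check is a hypothetical helper, not part of PDxN.

```shell
# Hypothetical stand-in for config_pxn.sh, written to a temporary file.
tmp_config="$(mktemp)"
cat > "$tmp_config" <<'EOF'
GSNAMEBASE=genedex
DSNAME=gtextoil_iBrain
OUTDIR_NAME=test_run
EOF

# Load the assignments into the current shell (assumed loading mechanism).
. "$tmp_config"

# Check that each expected name is still defined; a renamed variable
# would be reported here as missing.
missing=""
for name in GSNAMEBASE DSNAME OUTDIR_NAME; do
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    missing="$missing $name"
  fi
done

if [ -n "$missing" ]; then
  echo "Missing config variables:$missing" >&2
else
  echo "All expected variables are set"
fi

rm -f "$tmp_config"
```

If one of the names were renamed in the file, for example DSNAME to DATASET, the loop would report DSNAME as missing, and any script expecting DSNAME would see an empty value.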
Modifications
The following table outlines the customizable parameters of the config file:
| Variable | Description | Example | Available Options |
|---|---|---|---|
| GSNAMEBASE | Base name of the gene set | genedex | genedex, MSigDBv7, MSigDBv6 |
| DSNAME | Name of the background reference dataset | gtextoil_iBrain | gtextoil, gtextoil_iBrain, HGU133plus2 |
| OUTDIR_NAME | Name of the directory in which to store outputs | test_run | Custom string |
| COLLECTION | Optional, name of the collection used to augment the custom gene set (augment mode) | "test" | Custom string |
| NWRK_TYPE | Optional, type of network to calculate (augment mode) | "unipartite" | unipartite, bipartite, uni_bipartite |
| CORES | Number of cores (for part 1) | 10 | Integer |
| CORES_P2 | Number of cores (for part 2) | 25 | Integer |
| MAX_GENES | Largest pathway size allowed | 500 | Integer |
| MIN_GENES | Smallest pathway size allowed | 20 | Integer |
| MAX_JACQ | Maximum Jaccard index between two pathways | 85 | Integer |
| PCOR_OPTION | Choice of partial correlation | 0 | 0 (yes) or 1 (no) |
| MIN_SAMPLES | Minimum number of samples per tissue | 10 | Integer |
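As a rough illustration, the assignments for these parameters might look like the fragment below, filled in with the values from the Example column. The real config_pxn.sh also holds input file paths that are not listed in the table and are not shown here. The final size check and the `config_ok` variable are hypothetical additions for illustration, not part of the pipeline.

```shell
# Sketch of the customizable assignments, using the Example column values.
# Only the right-hand sides are meant to be edited.
GSNAMEBASE=genedex
DSNAME=gtextoil_iBrain
OUTDIR_NAME=test_run
COLLECTION="test"          # optional, augment mode only
NWRK_TYPE="unipartite"     # optional, augment mode only
CORES=10
CORES_P2=25
MAX_GENES=500
MIN_GENES=20
MAX_JACQ=85
PCOR_OPTION=0
MIN_SAMPLES=10

# Hypothetical sanity check: the pathway size limits should form a valid range.
config_ok=yes
if [ "$MIN_GENES" -ge "$MAX_GENES" ]; then
  echo "MIN_GENES must be smaller than MAX_GENES" >&2
  config_ok=no
fi
```

Note that shell assignments take no spaces around the = sign; writing `CORES = 10` would be interpreted as running a command named CORES.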