Configuration

The first step in this process is to define some constants for future use.

But first, let me load the various R libraries this analysis requires. That is handled by my_functions.R in the R/ directory.

source("R/my_functions.R")

## Warning: replacing previous import by 'intervals::coerce' when loading 'genomeIntervals'
## Warning: replacing previous import by 'intervals::initialize' when loading 'genomeIntervals'

A philosophical question

Is it better to keep experimental design information in a text file and load it on demand, or to create the relevant data structure directly in the code?

In the following lines, I create the design in code, then write it to a file. It is just as valid to go the other way, of course.

## Sample IDs, conditions, and batches for the six wild-type samples
HPGL_IDs = c("HPGL0199","HPGL0200","HPGL0205","HPGL0206","HPGL0211","HPGL0212")
conditions = c("wt_rfe","wt_dfe","wt_rfe","wt_dfe","wt_rfe","wt_dfe")
batches = c("A","A","B","B","C","C")

wt_data_design = data.frame(
    HPGL_ID = HPGL_IDs,
    condition = conditions,
    batch = batches)
## Replace the plain vectors with factor versions taken from the design
conditions = as.factor(wt_data_design$condition)
batches = as.factor(wt_data_design$batch)
## Write the design to a file (row names are included by default)
write.csv(wt_data_design, file="txt/wt_data_design.csv")
## The annotation we use is actually for mexicana because that is
## the best we have at the moment
gff_file = "reference/Lmexicana_TriTrypDB-4.2.gff.gz"
pvalue_cutoff = 0.05

## The following will likely not be used in the immediate future,
## but illustrates the opposite method of dealing with experimental design
all_data_design = read.csv(file="txt/all_data_design.csv")

wt_table = xtable(wt_data_design)
print(wt_table, type="html")

## <!-- html table generated in R 3.0.2 by xtable 1.7-3 package -->
## <!-- Mon May  5 10:19:20 2014 -->
## <TABLE border=1>
## <TR> <TH>  </TH> <TH> HPGL_ID </TH> <TH> condition </TH> <TH> batch </TH>  </TR>
##   <TR> <TD align="right"> 1 </TD> <TD> HPGL0199 </TD> <TD> wt_rfe </TD> <TD> A </TD> </TR>
##   <TR> <TD align="right"> 2 </TD> <TD> HPGL0200 </TD> <TD> wt_dfe </TD> <TD> A </TD> </TR>
##   <TR> <TD align="right"> 3 </TD> <TD> HPGL0205 </TD> <TD> wt_rfe </TD> <TD> B </TD> </TR>
##   <TR> <TD align="right"> 4 </TD> <TD> HPGL0206 </TD> <TD> wt_dfe </TD> <TD> B </TD> </TR>
##   <TR> <TD align="right"> 5 </TD> <TD> HPGL0211 </TD> <TD> wt_rfe </TD> <TD> C </TD> </TR>
##   <TR> <TD align="right"> 6 </TD> <TD> HPGL0212 </TD> <TD> wt_dfe </TD> <TD> C </TD> </TR>
##    </TABLE>

## In contrast, the full design is:
all_table = xtable(all_data_design)


chosen_colors = hash(keys = c("replete","deplete"),
    values = c("black","gray"))

print(all_table, type="html")

	HPGL_ID	condition	batch
1	HPGL0199	wt_rfe	A
2	HPGL0200	wt_dfe	A
3	HPGL0201	LFR_rfe	A
4	HPGL0202	LFR_dfe	A
5	HPGL0203	LIT_rfe	A
6	HPGL0204	LIT_dfe	A
7	HPGL0205	wt_rfe	B
8	HPGL0206	wt_dfe	B
9	HPGL0207	LFR_rfe	B
10	HPGL0208	LFR_dfe	B
11	HPGL0209	LIT_rfe	B
12	HPGL0210	LIT_dfe	B
13	HPGL0211	wt_rfe	C
14	HPGL0212	wt_dfe	C
15	HPGL0213	LFR_rfe	C
16	HPGL0214	LFR_dfe	C
17	HPGL0215	LIT_rfe	C
18	HPGL0216	LIT_dfe	C
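
As a sketch of going "the other way", the wild-type design can also be recovered from the full design after a round trip through a CSV file. The example below rebuilds the 18-sample table shown above instead of reading txt/all_data_design.csv, writes it to a temporary file, and assumes that the wild-type samples are exactly those whose condition starts with wt_. The names all_design, design_file, reloaded, and wt_design are made up for this illustration, and row.names=FALSE is a choice made here, not something the code above does.

```r
## Hypothetical stand-in for the full design (same 18 rows as the table above).
all_design <- data.frame(
    HPGL_ID = sprintf("HPGL%04d", 199:216),
    condition = rep(c("wt_rfe", "wt_dfe", "LFR_rfe", "LFR_dfe", "LIT_rfe", "LIT_dfe"),
                    times = 3),
    batch = rep(c("A", "B", "C"), each = 6),
    stringsAsFactors = FALSE)

## Write to a temporary file so the sketch does not touch txt/.
## row.names=FALSE keeps the row numbers out of the file.
design_file <- tempfile(fileext = ".csv")
write.csv(all_design, file = design_file, row.names = FALSE)

## Load the design on demand.
reloaded <- read.csv(design_file, stringsAsFactors = FALSE)

## Assumption: wild-type samples are those whose condition starts with "wt_".
wt_design <- reloaded[grepl("^wt_", reloaded$condition), ]
rownames(wt_design) <- NULL

## Factor versions, analogous to conditions and batches above.
wt_conditions <- as.factor(wt_design$condition)
wt_batches <- as.factor(wt_design$batch)

print(wt_design)
```

Without row.names=FALSE, write.csv() also writes the row names, and read.csv() then returns them as an extra column named X, which has to be dropped before the two versions of the design can be compared.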

Save data