The first step in this process is to define some constants for future use.
But first, let me load the various libraries this analysis requires. That is handled by my_functions.R in the R/ directory.
source("R/my_functions.R")
## Warning: replacing previous import by 'intervals::coerce' when loading 'genomeIntervals'
## Warning: replacing previous import by 'intervals::initialize' when loading 'genomeIntervals'
Is it better to keep the experimental design in a text file and load it on demand, or to create the relevant data structure in code?
In the following lines, I create it in code and then write it to a file. It is just as valid to go the other way, of course.
HPGL_IDs = c("HPGL0199","HPGL0200","HPGL0205","HPGL0206","HPGL0211","HPGL0212")
conditions = c("wt_rfe","wt_dfe","wt_rfe","wt_dfe","wt_rfe","wt_dfe")
batches = c( "A", "A", "B", "B", "C", "C")
wt_data_design = data.frame(
HPGL_ID = HPGL_IDs,
condition = conditions,
batch = batches)
conditions = as.factor(wt_data_design$condition)
batches = as.factor(wt_data_design$batch)
write.csv(wt_data_design, file="txt/wt_data_design.csv")
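To make the "either direction is valid" point concrete, here is a minimal sketch of a round trip through CSV. It assumes a temporary file rather than the txt/ directory, and the names `design`, `design_path`, and `design_from_file` are hypothetical. One detail worth knowing: with its default settings, `write.csv` also writes the row numbers, which `read.csv` then returns as an extra column named `X`; passing `row.names = FALSE` avoids that.

```r
# Sketch only: round-trip a design table through a CSV file.
# The table mirrors wt_data_design; the temporary path is an assumption.
design <- data.frame(
  HPGL_ID = c("HPGL0199", "HPGL0200", "HPGL0205",
              "HPGL0206", "HPGL0211", "HPGL0212"),
  condition = c("wt_rfe", "wt_dfe", "wt_rfe", "wt_dfe", "wt_rfe", "wt_dfe"),
  batch = c("A", "A", "B", "B", "C", "C"),
  stringsAsFactors = FALSE
)

design_path <- tempfile(fileext = ".csv")

# row.names = FALSE keeps the row numbers out of the file, so no extra
# column appears when the file is read back.
write.csv(design, file = design_path, row.names = FALSE)
design_from_file <- read.csv(design_path, stringsAsFactors = FALSE)

# TRUE when both directions give the same columns and values.
round_trip_ok <- isTRUE(all.equal(design, design_from_file))
```

If the file is the authoritative copy, the factors used later can be rebuilt from `design_from_file` in the same way as from the in-memory table.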
## The annotation we use is actually for mexicana because that is
## the best we have at the moment
gff_file = "reference/Lmexicana_TriTrypDB-4.2.gff.gz"
pvalue_cutoff = 0.05
## The following will likely not be used for the immediate future,
## but illustrates the opposite method of dealing with experimental design
all_data_design = read.csv(file="txt/all_data_design.csv")
wt_table = xtable(wt_data_design)
print(wt_table, type="html")
## <!-- html table generated in R 3.0.2 by xtable 1.7-3 package -->
## <!-- Mon May 5 10:19:20 2014 -->
## <TABLE border=1>
## <TR> <TH> </TH> <TH> HPGL_ID </TH> <TH> condition </TH> <TH> batch </TH> </TR>
## <TR> <TD align="right"> 1 </TD> <TD> HPGL0199 </TD> <TD> wt_rfe </TD> <TD> A </TD> </TR>
## <TR> <TD align="right"> 2 </TD> <TD> HPGL0200 </TD> <TD> wt_dfe </TD> <TD> A </TD> </TR>
## <TR> <TD align="right"> 3 </TD> <TD> HPGL0205 </TD> <TD> wt_rfe </TD> <TD> B </TD> </TR>
## <TR> <TD align="right"> 4 </TD> <TD> HPGL0206 </TD> <TD> wt_dfe </TD> <TD> B </TD> </TR>
## <TR> <TD align="right"> 5 </TD> <TD> HPGL0211 </TD> <TD> wt_rfe </TD> <TD> C </TD> </TR>
## <TR> <TD align="right"> 6 </TD> <TD> HPGL0212 </TD> <TD> wt_dfe </TD> <TD> C </TD> </TR>
## </TABLE>
## In contrast, the full design is:
all_table = xtable(all_data_design)
chosen_colors = hash(keys = c("replete","deplete"),
values = c("black","gray"))
print(all_table, type="html")
| | HPGL_ID | condition | batch |
|---|---|---|---|
1 | HPGL0199 | wt_rfe | A |
2 | HPGL0200 | wt_dfe | A |
3 | HPGL0201 | LFR_rfe | A |
4 | HPGL0202 | LFR_dfe | A |
5 | HPGL0203 | LIT_rfe | A |
6 | HPGL0204 | LIT_dfe | A |
7 | HPGL0205 | wt_rfe | B |
8 | HPGL0206 | wt_dfe | B |
9 | HPGL0207 | LFR_rfe | B |
10 | HPGL0208 | LFR_dfe | B |
11 | HPGL0209 | LIT_rfe | B |
12 | HPGL0210 | LIT_dfe | B |
13 | HPGL0211 | wt_rfe | C |
14 | HPGL0212 | wt_dfe | C |
15 | HPGL0213 | LFR_rfe | C |
16 | HPGL0214 | LFR_dfe | C |
17 | HPGL0215 | LIT_rfe | C |
18 | HPGL0216 | LIT_dfe | C |
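Going the other way, the wild-type design could be derived from the full design instead of being typed out by hand. The following is a sketch under stated assumptions: `full_design` is rebuilt here from the values in the table so that the example is self-contained (in the analysis it would come from `read.csv`), and the names `full_design` and `wt_subset` are hypothetical.

```r
# Sketch only: rebuild the 18-row design from the table, then filter it.
strains <- c("wt", "LFR", "LIT")
media <- c("rfe", "dfe")

full_design <- data.frame(
  HPGL_ID = sprintf("HPGL%04d", 199:216),
  # Six conditions per batch: wt_rfe, wt_dfe, LFR_rfe, LFR_dfe, LIT_rfe, LIT_dfe
  condition = rep(paste(rep(strains, each = 2), media, sep = "_"), times = 3),
  batch = rep(c("A", "B", "C"), each = 6),
  stringsAsFactors = FALSE
)

# Keep only the samples whose condition starts with "wt_".
wt_subset <- full_design[grepl("^wt_", full_design$condition), ]
rownames(wt_subset) <- NULL  # renumber the rows 1..6

# Convert to factors, as is done for the hand-built design.
wt_subset$condition <- factor(wt_subset$condition)
wt_subset$batch <- factor(wt_subset$batch)
```

Filtering on the condition prefix keeps the subset in sync with the full design if samples are added later, at the cost of relying on the naming convention.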
save(list = ls(all.names=TRUE), file="RData")