I'm currently using R to do a two-sample t-test.
I see lots of script examples on the internet like the one below:
dataset.1 <- c(498, 460, 468, 458, 530, 482, 528, 598, 456)
dataset.2 <- c(596, 422, 524, 454, 538, 552, 478, 564, 556)
t.test(dataset.1, dataset.2, paired = TRUE, conf.level = 0.9)
OK, this works well for me. But my problem is that I have a huge data input like the one below:
GENE CANCER1 CANCER2 CANCER3 NORMAL1 NORMAL2 NORMAL3
gene1 123 232 322 898 988 899
gene7000 233 434 434 897 676 654
How can I load this data (path + xxx.txt) into the script?
More importantly, how can I select specific columns in my script?
Say I now want to compare the CANCER values against
data2=c(897,676,654), the NORMAL values, for gene7000?
It should be pretty simple. You can pass any arguments you want to your
R script on the command line: file names, the name of a vector, the number of a column, etc. To get the arguments from within
R, do something like this:
arguments <- commandArgs(trailingOnly=TRUE)
See ?commandArgs for more info.
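As a rough sketch of how those arguments might be used, the snippet below assumes the script takes the data file path first and the gene name second. The script name ttest_gene.R, the helper parse_args, and the argument order are all hypothetical and only for illustration.

```r
# Hypothetical script, saved as e.g. ttest_gene.R and run from a shell as:
#   Rscript ttest_gene.R path/to/xxx.txt gene7000
# The argument order (file first, gene second) is an assumption.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 2) {
    stop("usage: Rscript ttest_gene.R <data file> <gene name>")
  }
  # Return the arguments under descriptive names
  list(file = args[1], gene = args[2])
}
```

Inside the script you would then call parse_args() with no arguments and use the returned file and gene values in the rest of the code.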
The R Data Import/Export manual, which comes with the R installation and is also available here, has a lot of information on different ways to get your data into R. Which one is best depends on what your data looks like and how large it is. It may be as simple as using the
read.table function; for a large dataset, using a database may be better.
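For a whitespace-separated file like the one in the question, a minimal read.table sketch could look like this. It writes the two example rows to a temporary file only so that the snippet runs on its own, and it assumes the gene names are unique so they can serve as row names.

```r
# Write the two example rows from the question to a temporary file so the
# sketch is self-contained; in practice, use the path to your own file.
input_file <- tempfile(fileext = ".txt")
writeLines(c(
  "GENE CANCER1 CANCER2 CANCER3 NORMAL1 NORMAL2 NORMAL3",
  "gene1 123 232 322 898 988 899",
  "gene7000 233 434 434 897 676 654"
), input_file)

# header = TRUE takes the column names from the first line;
# row.names = 1 uses the GENE column as row names (assumes unique gene names).
mydata <- read.table(input_file, header = TRUE, row.names = 1)

# Inspect the structure of the resulting data frame
str(mydata)
```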
If you use read.table or similar, your data will be in a data frame, and you can run the t-test using code similar to this (assuming your data frame is named mydata):
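A minimal sketch of such a call, assuming the column names from the question and that the gene names are stored as row names. The data frame is rebuilt by hand here only so that the snippet runs on its own.

```r
# Stand-in for the data frame returned by read.table (values from the question)
mydata <- data.frame(
  CANCER1 = c(123, 233), CANCER2 = c(232, 434), CANCER3 = c(322, 434),
  NORMAL1 = c(898, 897), NORMAL2 = c(988, 676), NORMAL3 = c(899, 654),
  row.names = c("gene1", "gene7000")
)

# Pick one row (gene) and the columns of each group; unlist() turns the
# one-row data frame into a plain numeric vector, which t.test expects.
cancer <- unlist(mydata["gene7000", c("CANCER1", "CANCER2", "CANCER3")])
normal <- unlist(mydata["gene7000", c("NORMAL1", "NORMAL2", "NORMAL3")])

# Unpaired (Welch) two-sample t-test with a 90% confidence interval
result <- t.test(cancer, normal, conf.level = 0.9)
print(result)
```

This runs R's default unpaired Welch test. Add paired = TRUE, as in the example from the question, only if each cancer sample is matched to a specific normal sample.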
See help('[[') for more details on extracting portions of a data object.
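To illustrate the extraction operators that help page describes, here is a small sketch using the same assumed column names. It shows selecting a column by name or by position, and a single value by row and column.

```r
# Stand-in data frame with the values from the question
mydata <- data.frame(
  CANCER1 = c(123, 233), CANCER2 = c(232, 434), CANCER3 = c(322, 434),
  NORMAL1 = c(898, 897), NORMAL2 = c(988, 676), NORMAL3 = c(899, 654),
  row.names = c("gene1", "gene7000")
)

col_vec <- mydata[["NORMAL1"]]            # [[ ]] returns the column as a vector
col_df  <- mydata["NORMAL1"]              # [ ] returns a one-column data frame
by_pos  <- mydata[[4]]                    # columns can also be picked by number
single  <- mydata["gene7000", "NORMAL1"]  # one value, by row name and column name
```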