statistics - upload of large data when using R to do t test

Question:

I'm currently using R to do a two-sample t test.

I see lots of script examples on the internet like the one below:

#!/usr/bin/env Rscript

dataset.1 <- c(498, 460, 468, 458, 530, 482, 528, 598, 456)

dataset.2 <- c(596, 422, 524, 454, 538, 552, 478, 564, 556)

t.test(dataset.1, dataset.2, paired = TRUE, conf.level = 0.9)

OK, this works well for me. But my problem is that I have a huge input data set like the one below:

GENE CANCER1 CANCER2 CANCER3 NORMAL1 NORMAL2 NORMAL3

gene1 123 232 322 898 988 899

.....

.....

gene7000 233 434 434 897 676 654

Then how can I load this data (path+xxx.txt) into the script?

More importantly, how can I select specific columns in my script?

Say I want to compare data1=c(233,434,434) and data2=c(897,676,654) for gene7000?

Thanks

Answer:

It should be pretty simple. You can pass any arguments you want to your R script on the command line: file names, the name of a vector, the number of a column, etc. To get the arguments from within R, do something like this:

arguments <- commandArgs(trailingOnly=TRUE)

Look at ?commandArgs for more info.
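
As a minimal sketch of how those arguments might be used, the snippet below turns them into a file path and a gene name. The helper name parse_args, the script name ttest.R and the two-argument layout are assumptions made for illustration; they are not part of the question or of R itself.

```r
#!/usr/bin/env Rscript

# Hypothetical invocation: Rscript ttest.R path/to/xxx.txt gene7000
# parse_args() is an illustrative helper, not a base R function.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) != 2) {
    stop("usage: Rscript ttest.R <input file> <gene name>")
  }
  list(path = args[[1]], gene = args[[2]])
}

# Explicit arguments are passed here so the example runs on its own;
# in a real script, call parse_args() with no arguments instead.
opts <- parse_args(c("xxx.txt", "gene7000"))
print(opts$path)
print(opts$gene)
```

Checking the number of arguments up front gives a readable usage message instead of a confusing error later in the script.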

Answer:

The R Data Import/Export manual, which comes with the R installation and is also available online, has a lot of information on different ways to get your data into R. Which one is best depends on what your data looks like and how large it is. It may be as simple as using the read.table function; for a large dataset, using a database may be better.

If you use read.table or similar then your data will be in a data frame and you can run the t test using code similar to this (assuming your data frame is named mydata):

t.test(mydata$CANCER1, mydata$NORMAL1)

Run help('[[') for more details on extracting portions of a data object.
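
Note that the t.test call above compares two columns across all genes. To compare the cancer and normal values of a single gene, as the question asks, one possible sketch is shown below. It embeds the two data rows quoted in the question so that it runs on its own. With a real file you would pass the path to read.table instead of text =. The names cancer_cols, normal_cols and p_values are illustrative, and the default unpaired Welch test is an assumption about the experimental design.

```r
# Two rows from the question, embedded so the example is self-contained.
input <- "GENE CANCER1 CANCER2 CANCER3 NORMAL1 NORMAL2 NORMAL3
gene1 123 232 322 898 988 899
gene7000 233 434 434 897 676 654"

# With a real file: mydata <- read.table("path/to/xxx.txt", header = TRUE)
mydata <- read.table(text = input, header = TRUE, stringsAsFactors = FALSE)

cancer_cols <- c("CANCER1", "CANCER2", "CANCER3")
normal_cols <- c("NORMAL1", "NORMAL2", "NORMAL3")

# Select the row for one gene, then pull out the two groups as plain vectors.
gene_row <- mydata[mydata$GENE == "gene7000", ]
data1 <- unlist(gene_row[, cancer_cols], use.names = FALSE)
data2 <- unlist(gene_row[, normal_cols], use.names = FALSE)

# Unpaired Welch t test (the t.test default); add paired = TRUE only if
# the cancer and normal samples are matched.
result <- t.test(data1, data2)
print(result$p.value)

# The same test for every gene: one p-value per row.
p_values <- apply(mydata[, c(cancer_cols, normal_cols)], 1, function(x) {
  t.test(x[cancer_cols], x[normal_cols])$p.value
})
names(p_values) <- mydata$GENE
print(p_values)
```

With around 7000 genes, the per-row apply call yields 7000 p-values, so some multiple-testing correction (for example p.adjust) is probably worth considering before interpreting them.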
