
r - Match two matrices by columns

Question:

I have two matrices that look like this:

A

ColumnA ColumnB ColumnC ColumnD
A       D       N       F
DF      N       A       S
P       F       K       l
qw      AS      O       W
        n       H       Q
                D       E

B

ColumnA ColumnB ColumnC ColumnD
A       DH      K       FS
np      N       A       S
AS      Q       O       lm
P       n       N       WE
        AS      PV      Q
                NQ      E

I would like a third matrix C containing the common elements column by column between the two matrices.

I tried to do this in R, but it seems impossible because the two matrices are too large: ~5000 rows and 1500 columns. The two matrices have the same number of columns and the same column names.

Can anyone help me please?

Best

Desired output:

C

ColumnA ColumnB ColumnC ColumnD
A       N       N       S
P       AS      A       Q
        n       K       E
                O

Answer:

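You could try the following, which intersects the two data frames column by column:

library(stringi)
# Here `A` and `B` are data.frames with identical column names
# Map() pairs up the columns; stri_list2matrix() pads shorter results with ''
m1 <- stri_list2matrix(Map(intersect, A, B), fill='')
C <- setNames(as.data.frame(m1, stringsAsFactors=FALSE), colnames(A))
C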
#   ColumnA ColumnB ColumnC ColumnD
# 1       A       N       N       S
# 2       P      AS       A       Q
# 3               n       K       E
# 4                       O        

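Or, using base R only:

# Stack the two data frames and keep values that occur more than once per column
# (note: a value repeated within A or B alone would also be kept)
lst <- lapply(rbind(A,B), function(x) x[duplicated(x) & x!=''])
# Pad each column with NA up to the longest length, then blank out the NAs
m2 <- sapply(lst, `length<-`, max(sapply(lst, length)))
m2[is.na(m2)] <- ''
as.data.frame(m2, stringsAsFactors=FALSE)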
#  ColumnA ColumnB ColumnC ColumnD
#1       A       N       K       S
#2       P       n       A       Q
#3              AS       O       E
#4                       N        
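
For anyone who wants to run the above end to end, here is a minimal base R sketch that rebuilds the sample data from the question and intersects it column by column. It assumes empty cells are stored as "" and that the blanks sit at the left of the short rows; `common_by_column` is a hypothetical helper name, not a function from any package.

```r
# Sample data from the question; empty cells are stored as "" (an assumption).
A <- data.frame(
  ColumnA = c("A", "DF", "P", "qw", "", ""),
  ColumnB = c("D", "N", "F", "AS", "n", ""),
  ColumnC = c("N", "A", "K", "O", "H", "D"),
  ColumnD = c("F", "S", "l", "W", "Q", "E"),
  stringsAsFactors = FALSE
)
B <- data.frame(
  ColumnA = c("A", "np", "AS", "P", "", ""),
  ColumnB = c("DH", "N", "Q", "n", "AS", ""),
  ColumnC = c("K", "A", "O", "N", "PV", "NQ"),
  ColumnD = c("FS", "S", "lm", "WE", "Q", "E"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: common elements per column, padded to equal length.
common_by_column <- function(x, y, fill = "") {
  stopifnot(identical(names(x), names(y)))
  # Intersect matching columns, then drop empty and missing entries.
  common <- Map(function(a, b) {
    v <- intersect(a, b)
    v[!is.na(v) & v != ""]
  }, x, y)
  # Pad every column to the length of the longest one.
  n <- max(lengths(common))
  padded <- lapply(common, function(v) c(v, rep(fill, n - length(v))))
  data.frame(padded, check.names = FALSE, stringsAsFactors = FALSE)
}

C <- common_by_column(A, B)
print(C)
```

Unlike the `duplicated()` variant, this only reports values present in both inputs, so a value repeated inside A alone is not picked up. Each column is handled independently, so the work grows with the number of columns rather than with all pairs of cells.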
Answer:

Do you know how to use SQLite?

In SQLite you could try something like the following for each column (shown here for ColumnA):

SELECT DISTINCT ColumnA
FROM A
WHERE ColumnA IN (SELECT DISTINCT ColumnA FROM B)

It shouldn't be too much hassle to create a .db file.

Note: on many Linux distributions sqlite3 is already installed or available from the package manager.
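
If you would rather stay in R, the same idea can be sketched with an in-memory SQLite database. This assumes the DBI and RSQLite packages are installed and uses a two-column toy version of the data with "" for empty cells; the query is run once per column, and the order of the returned values is not guaranteed.

```r
library(DBI)  # assumes the DBI and RSQLite packages are installed

# Two-column toy data; "" marks an empty cell (an assumption).
A <- data.frame(ColumnA = c("A", "DF", "P", ""),
                ColumnB = c("D", "N", "F", "AS"),
                stringsAsFactors = FALSE)
B <- data.frame(ColumnA = c("A", "np", "P", ""),
                ColumnB = c("DH", "N", "AS", "n"),
                stringsAsFactors = FALSE)

# An in-memory database avoids creating a .db file on disk.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "A", A)
dbWriteTable(con, "B", B)

# Run one query per column, skipping empty strings.
common <- lapply(names(A), function(col) {
  id <- as.character(dbQuoteIdentifier(con, col))
  sql <- sprintf(
    "SELECT DISTINCT %s FROM A WHERE %s <> '' AND %s IN (SELECT %s FROM B)",
    id, id, id, id
  )
  dbGetQuery(con, sql)[[1]]
})
names(common) <- names(A)

dbDisconnect(con)
print(common)
```

The result is a named list with one character vector per column, which can be padded into a data frame the same way as in the R answers above.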
