python - Deleting multiple columns in Pandas

Question:

I have some data, and when I import it I get the following unneeded columns. I'm looking for an easy way to delete all of them:

'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
'Unnamed: 60'

The columns are 0-indexed, so I tried something like:

df.drop(df.columns[[22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but that struck me as bad Pandas practice, hence I ask the question here.

I've seen some similar examples (Dropping multiple columns), but they don't answer my question.

Answer:

I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to just select the columns of interest and assign back to the df:

df = df[cols_of_interest]

Where cols_of_interest is a list of the columns you care about.
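
As a minimal sketch of that selection, assuming a small made-up frame (the column names and values below are hypothetical and only stand in for the asker's data):

```python
import pandas as pd

# Hypothetical frame: two real columns plus two junk "Unnamed" columns.
df = pd.DataFrame({
    "a": [1, 2],
    "foo": [3, 4],
    "Unnamed: 24": [None, None],
    "Unnamed: 25": [None, None],
})

# Assumed names of the columns worth keeping, for illustration only.
cols_of_interest = ["a", "foo"]

# Indexing with a list of labels returns a new frame with just those columns.
df = df[cols_of_interest]
print(list(df.columns))
```

This is handy when the list of wanted columns is short and known in advance; if it is the unwanted columns that are easy to describe, one of the drop-based approaches is likely less typing.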

Or you can slice the columns and pass this to drop:

df.drop(df.loc[:,'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)

The call to head just selects 0 rows, as we're only interested in the column names rather than the data.
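
A small sketch of the slice-then-drop idea with the label-based `.loc` indexer, on a hypothetical frame (the names and values are assumptions). Label slices in pandas include both endpoints, so the last named column is dropped too:

```python
import pandas as pd

# Hypothetical frame; the "Unnamed" columns sit next to each other.
df = pd.DataFrame({
    "a": [1],
    "Unnamed: 24": [None],
    "Unnamed: 25": [None],
    "Unnamed: 26": [None],
    "foo": [2],
})

# Label slicing is inclusive at both ends, so this picks up all three
# "Unnamed" columns. Only the column labels are needed, not the data.
to_drop = df.loc[:, "Unnamed: 24":"Unnamed: 26"].columns

# drop returns a new frame; the original df is left unchanged.
trimmed = df.drop(to_drop, axis=1)
print(list(trimmed.columns))
```

Note that this relies on the unwanted columns being contiguous; columns on either side of the slice, such as `foo` here, are kept.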

Update:

Another, simpler method is to use the boolean mask from str.contains and invert it to select the columns to keep:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df

Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []

In [4]:
~df.columns.str.contains('Unnamed:')

Out[4]:
array([ True, False, False,  True], dtype=bool)

In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]

Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []
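
The same mask can also be passed straight to `.loc`, which may be worth considering when the frame holds data. This is only a sketch on a hypothetical one-row frame, not part of the original answer:

```python
import pandas as pd

# Hypothetical frame with one row of data.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["a", "Unnamed: 1", "Unnamed: 2", "foo"],
)

# True for every column whose name does NOT contain "Unnamed:".
keep = ~df.columns.str.contains("Unnamed:")

# Boolean selection along the column axis keeps only the True positions.
cleaned = df.loc[:, keep]
print(list(cleaned.columns))
```

Because the mask selects by position, it does not need to look columns up by label, which can matter when several columns share a name.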
Answer:

By far the simplest approach is:

yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
Answer:

This is probably a good way to do what you want. It will delete all columns that contain 'Unnamed' in their header.

for col in df.columns:
    if 'Unnamed' in col:
        del df[col]
Answer:

The below worked for me:

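for col in list(df.columns):  # iterate over a copy of the column labels
    if 'Unnamed' in col:
        print(col)  # show which column is being dropped
        try:
            df.drop(col, axis=1, inplace=True)
        except KeyError:
            # label already gone, e.g. a duplicate name removed earlier
            pass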
Answer:

You can do this in one line and one go:

df.drop([col for col in df.columns if "Unnamed" in col], axis=1, inplace=True)

This calls drop once for all matching columns, rather than once per column as in the loop-based solutions above, so it involves less moving around/copying of the object.
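
For completeness, a sketch of the same one-pass idea without `inplace=True`, using the `columns=` keyword of `DataFrame.drop` and assigning the result back. The frame below is hypothetical:

```python
import pandas as pd

# Hypothetical frame with two junk columns.
df = pd.DataFrame({
    "a": [1],
    "Unnamed: 24": [None],
    "Unnamed: 25": [None],
    "foo": [2],
})

# Collect the unwanted labels first; str() guards against non-string labels.
unnamed = [col for col in df.columns if "Unnamed" in str(col)]

# A single drop call removes them all and returns a new frame.
df = df.drop(columns=unnamed)
print(list(df.columns))
```

Building the list first also makes it easy to inspect what is about to be removed before committing to it.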
