
numpy - Extracting data from multiple files with python

Question:

I'm trying to extract data from a directory with 12 .txt files. Each file contains 3 columns of data (X, Y, Z) that I want to extract. I want to collect all the data in one DataFrame (InfoDF), but so far I have only succeeded in creating a df with all of the X, Y and Z data in the same column. This is my code:

import pandas as pd
import numpy as np
import os
import fnmatch

path = os.getcwd()
file_list = os.listdir(path)

InfoDF = pd.DataFrame()

for file in file_list:
    try:
        if fnmatch.fnmatch(file, '*.txt'):
            filedata = open(file, 'r')
            df = pd.read_table(filedata, delim_whitespace=True, names={'X','Y','Z'})
    except Exception as e:
        print(e)

What am I doing wrong?

Answer:
df = pd.read_table(filedata, delim_whitespace=True, names={'X','Y','Z'})

This line replaces df at each iteration of the loop, which is why you only have the last file's data at the end of your program.

What you can do is save all your DataFrames in a list and concatenate them at the end:

df_list = []
for file in file_list:
    if fnmatch.fnmatch(file, '*.txt'):
        # names is a list, not a set, so the column order is guaranteed
        df_list.append(pd.read_table(file, sep=r'\s+', names=['X', 'Y', 'Z']))
df = pd.concat(df_list)

Alternatively, you can write it as a single expression:

df = pd.concat([pd.read_table(file, sep=r'\s+', names=['X', 'Y', 'Z']) for file in file_list if fnmatch.fnmatch(file, '*.txt')])
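
To make the collect-in-a-list-then-concat idea concrete, here is a minimal self-contained sketch. It assumes whitespace-separated files with no header row; the function name read_xyz_files, the extra "source" column, and the two demo files are made up for illustration and are not part of the original question.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_xyz_files(directory):
    """Read every *.txt file in `directory` into one DataFrame with columns X, Y, Z."""
    frames = []
    for txt_path in sorted(Path(directory).glob("*.txt")):
        frame = pd.read_csv(txt_path, sep=r"\s+", names=["X", "Y", "Z"])
        # Hypothetical extra column: remember which file each row came from.
        frame["source"] = txt_path.name
        frames.append(frame)
    # pd.concat raises ValueError on an empty list, i.e. when no *.txt file is found.
    return pd.concat(frames, ignore_index=True)


# Tiny demo with two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("1 2 3\n4 5 6\n")
    Path(tmp, "b.txt").write_text("7 8 9\n")
    combined = read_xyz_files(tmp)

print(combined)
```

Each file is read into its own DataFrame and nothing is overwritten; the single concat at the end is what stacks the rows into one table.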
Answer:

I think you need glob to select all the files, then build a list of DataFrames, dfs, in a list comprehension and use concat:

import glob
import pandas as pd

files = glob.glob('*.txt')
dfs = [pd.read_csv(fp, sep=r'\s+', names=['X', 'Y', 'Z']) for fp in files]

df = pd.concat(dfs, ignore_index=True)
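
A small illustration of what ignore_index=True changes in concat, using two made-up DataFrames in place of the files (the names first, second, kept and renumbered are only for this sketch):

```python
import pandas as pd

# Two stand-ins for per-file DataFrames.
first = pd.DataFrame({"X": [1, 2], "Y": [3, 4], "Z": [5, 6]})
second = pd.DataFrame({"X": [7], "Y": [8], "Z": [9]})

# Without ignore_index each piece keeps its own row labels, so labels repeat.
kept = pd.concat([first, second])

# With ignore_index=True the result gets a fresh 0..n-1 index.
renumbered = pd.concat([first, second], ignore_index=True)

print(list(kept.index))        # [0, 1, 0]
print(list(renumbered.index))  # [0, 1, 2]
```

With 12 files the repeated labels would make lookups such as df.loc[0] return one row per file, which is usually not what you want.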
Answer:
  • As camilleri mentions above, you are overwriting df in your loop.
  • Also, there is no point in catching a general exception.

Solution: Create an empty DataFrame InfoDF before the loop, then use concat to add each smaller df to it. Note that append returns a new DataFrame rather than changing InfoDF in place, and DataFrame.append was removed in pandas 2.0, so concat with reassignment is the safer choice.

import pandas as pd
import numpy as np
import os
import fnmatch

path = os.getcwd()

file_list = os.listdir(path)

InfoDF = pd.DataFrame(columns=['X', 'Y', 'Z'])  # create empty dataframe
for file in file_list:
    if fnmatch.fnmatch(file, '*.txt'):
        df = pd.read_table(file, sep=r'\s+', names=['X', 'Y', 'Z'])
        # concat returns a new DataFrame, so the result must be assigned back
        InfoDF = pd.concat([InfoDF, df], ignore_index=True)
print(InfoDF)
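
On the point about not catching a general exception: if some error handling is wanted, one option is to catch only the failure you actually expect. The sketch below is just one way to do that, assuming the expected problem is a path that does not exist; the function name read_existing and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_existing(paths):
    """Concatenate the readable files in `paths`; return (DataFrame, skipped paths)."""
    frames, skipped = [], []
    for path in paths:
        try:
            frames.append(pd.read_csv(path, sep=r"\s+", names=["X", "Y", "Z"]))
        except FileNotFoundError:
            # Only the expected failure is handled; any other error still raises.
            skipped.append(str(path))
    if frames:
        combined = pd.concat(frames, ignore_index=True)
    else:
        combined = pd.DataFrame(columns=["X", "Y", "Z"])
    return combined, skipped


# Demo: one made-up file that exists and one that was never created.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp, "good.txt")
    good.write_text("1 2 3\n")
    missing = Path(tmp, "missing.txt")
    InfoDF, skipped = read_existing([good, missing])

print(InfoDF)
print(skipped)
```

A parsing bug or a typo in the code then surfaces as a normal traceback instead of being printed and silently ignored.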