
python - pandas read_fwf special character not loading correctly

Problem description:

I have the following data in test.txt:

étoufee
placing

and the following code:

import pandas as pd
import numpy as np

widths = [4,3]
names = ["part1", "part2"]

df = pd.read_fwf('test.txt',widths = widths, names = names, encoding = 'utf8')
print df

and the output is:

 part1 part2
0 éto ufe
1 plac ing

Notice the first row: the special character is causing read_fwf to count the field widths incorrectly, so we lose data (part1 should be "étou" and part2 should be "fee"). I've tried setting encoding = 'utf-8', but that didn't work. Are there any other options?


For those who might look at this in the future, here's the updated code:

# encoding=utf8
import pandas as pd
import numpy as np
from io import StringIO
import sys, locale
import codecs

# Decode the file to text first, then hand the decoded text to read_fwf
with codecs.open('test.txt','r',encoding='utf8') as f:
    text = f.read()

widths = [4,3]
names = ["part1", "part2"]

df = pd.read_fwf(StringIO(text),widths = widths, names = names, encoding = 'utf8')
print(df)
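
A possible explanation for why decoding first helps (an assumption, not something confirmed in this thread): if the widths are applied to undecoded UTF-8 bytes, "é" occupies two of the four positions in part1. The sketch below reproduces both outputs with plain slicing. slice_by_widths is a hypothetical helper, and the sketch assumes Python 3 and that "é" is stored as the single code point U+00E9.

```python
# Assumption: "é" is the single code point U+00E9, which UTF-8 encodes
# as two bytes (0xC3 0xA9).
line = "\u00e9toufee"
widths = [4, 3]


def slice_by_widths(seq, widths):
    """Cut seq (str or bytes) into consecutive pieces of the given widths."""
    pieces, start = [], 0
    for w in widths:
        pieces.append(seq[start:start + w])
        start += w
    return pieces


# Slicing the decoded text counts characters.
by_chars = slice_by_widths(line, widths)

# Slicing the raw UTF-8 bytes counts bytes, so "é" uses two of the first four.
raw = line.encode("utf-8")
by_bytes = [piece.decode("utf-8") for piece in slice_by_widths(raw, widths)]

print(len(line), len(raw))  # 7 8
print(by_chars)             # ['étou', 'fee']
print(by_bytes)             # ['éto', 'ufe']
```

The byte-based result matches the broken output in the question, and the character-based result matches the expected one.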

Community answer:

NOT AN ANSWER
just possibly helpful

txt = """étoufee
placing"""

import pandas as pd
import numpy as np
from io import StringIO

widths = [4,3]
names = ["part1", "part2"]

df = pd.read_fwf(StringIO(txt),widths = widths, names = names, encoding = 'utf8')
print(df)

  part1 part2
0  étou   fee
1  plac   ing

import sys, locale
print(sys.version)
print(pd.__version__)
print(sys.getfilesystemencoding())
print(sys.getdefaultencoding())
print(locale.getlocale())

3.5.2 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
0.19.0
utf-8
utf-8
('en_US', 'UTF-8')
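
To check whether a particular file would be affected, one could compare each line's character count with its UTF-8 byte count. This is a minimal Python 3 sketch: width_report is a hypothetical helper, not part of pandas, and the demo writes the question's sample data to a temporary file rather than touching a real test.txt.

```python
import os
import tempfile


def width_report(path, encoding="utf-8"):
    """Return (line_number, n_chars, n_bytes) for lines whose counts differ."""
    mismatches = []
    with open(path, "r", encoding=encoding) as f:
        for number, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            n_chars = len(line)
            n_bytes = len(line.encode(encoding))
            if n_chars != n_bytes:
                mismatches.append((number, n_chars, n_bytes))
    return mismatches


# Demo on a throwaway copy of the sample data from the question.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "test.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("\u00e9toufee\nplacing\n")

report = width_report(sample_path)
print(report)  # [(1, 7, 8)]
```

Any line listed in the report has multi-byte characters, so fixed widths would drift there if they were counted in bytes.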