
Python: search for strings listed in one file from another text file?

Question:

I want to look up the strings listed in list.txt (one string per line) in another text file. If a string is found, print 'string,one_sentence'; if it is not found, print 'string,another_sentence'. I'm using the following code, but it only finds the last string from list.txt. What could be the reason?

data = open('c:/tmp/textfile.TXT').read()

for x in open('c:/tmp/list.txt').readlines():
    if x in data:
        print(x, ',one_sentence')
    else:
        print(x, ',another_sentence')

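Answer:

When you read a file with readlines(), each element of the resulting list keeps its trailing newline character. That is most likely why you get fewer matches than you expected: 'foo\n' is only found where the text file has a line break directly after 'foo'. The last line of list.txt probably has no trailing newline, which would explain why only that string matches.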

Instead of writing

for x in list:

write

for x in (s.strip() for s in list):

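Here list stands for the list returned by readlines() (as a real variable name it would shadow the built-in list, so pick another name in your own code). strip() removes leading and trailing whitespace from each string, including the trailing newline character.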
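To make the effect concrete, here is a minimal sketch that reproduces the symptom without touching the disk. The io.StringIO object and the sample strings are made-up stand-ins for list.txt and textfile.TXT, not the asker's real data.

```python
import io

# Hypothetical stand-ins for list.txt and the contents of textfile.TXT.
list_file = io.StringIO("apple\nbanana\ncherry")  # no newline after the last line
data = "apple pie, banana bread, cherry tart"

raw = list_file.readlines()
# Every element except the last still ends with '\n'.

# Searching with the raw lines only finds the last one.
raw_hits = [x for x in raw if x in data]

# Stripping first finds all three.
stripped_hits = [x.strip() for x in raw if x.strip() in data]

print(raw_hits)
print(stripped_hits)
```

Only the last entry matches in the raw case because it is the only one without a trailing newline, which is the behaviour described in the question.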

In order to consolidate your program, you could do something like this:

import sys

with open('c:/tmp/textfile.TXT') as f:
    haystack = f.read()

if not haystack:
    sys.exit("Could not read haystack data :-(")

with open('c:/tmp/list.txt') as f:
    # Strip each line lazily; the needle file is never loaded as a whole.
    for needle in (line.strip() for line in f):
        if needle in haystack:
            print(needle, ',one_sentence')
        else:
            print(needle, ',another_sentence')

I did not want to change too much. The most important difference is that I am using a context manager here via the with statement. It takes care of proper file handling (mainly closing) for you. Also, the 'needle' lines are stripped on the fly using a generator expression. This approach reads and processes the needle file line by line instead of loading the whole file into memory at once. Of course, this only makes a difference for large files.
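If you need the output to be exactly 'string,one_sentence' with no space before the comma, one possible variation is sketched below. The function name classify_needles and the sample data are hypothetical, and skipping blank lines is an extra assumption on my part: an empty string is a substring of every string, so a blank line in list.txt would otherwise always count as found.

```python
import io


def classify_needles(needle_lines, haystack):
    """Yield 'needle,one_sentence' or 'needle,another_sentence' per non-blank line."""
    for line in needle_lines:
        needle = line.strip()
        if not needle:
            # '' is contained in every string, so skip blank lines.
            continue
        verdict = "one_sentence" if needle in haystack else "another_sentence"
        yield f"{needle},{verdict}"


# Hypothetical sample data standing in for the two files.
needles = io.StringIO("foo\n\nbar\nbaz")
haystack = "some text with foo and baz in it"

results = list(classify_needles(needles, haystack))
for row in results:
    print(row)
```

Because the function accepts any iterable of lines, you can pass an open file object directly, for example the f from with open('c:/tmp/list.txt') as f.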

Answer:

readlines() keeps the newline character at the end of each string read from your list file. Call strip() on those strings to remove it, along with any other leading or trailing whitespace.
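If leading or trailing spaces are a meaningful part of your search strings, strip() may remove too much. A small sketch of the difference, using a made-up sample line:

```python
# Hypothetical line as it might come back from readlines().
line = "  padded needle \n"

only_newline_removed = line.rstrip("\n")  # keeps the surrounding spaces
all_whitespace_removed = line.strip()     # removes spaces and the newline

print(repr(only_newline_removed))
print(repr(all_whitespace_removed))
```

Which one you want depends on whether the spaces in list.txt are intentional.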
