Python BeautifulSoup is unnecessarily slow

Question:

While this code works pretty fast:

for olay in soup("li", {"class":"textb"}):
    tanim = olay("strong")
    try:
        print tanim[0]
    except IndexError:
        pass

Getting the string property like this makes the code considerably slower:

for olay in soup("li", {"class":"textb"}):
    tanim = olay("strong")
    try:
        print tanim[0].string
    except IndexError:
        pass

My question is: am I doing something I shouldn't by reading the string property like that? Should I have used something else to get the plain-text version of an object?

Update:

This also runs pretty fast, so I guess the slowness is unique to the string property?

for olay in soup("li", {"class":"textb"}):
    tanim = olay("strong")
    try:
        print tanim[0].text
    except IndexError:
        pass
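
One way to test that guess is to time each accessor on its own, so that the cost of printing and of the soup(...) search is left out of the measurement. The following is only a minimal sketch and not part of the original post: time_accessor and FakeTag are hypothetical names, and FakeTag is a stand-in so the sketch runs without BeautifulSoup installed. With real data, the list would hold the strong tags collected from soup.

```python
import timeit


def time_accessor(tags, accessor, repeat=100):
    """Return the seconds taken to apply `accessor` to every tag, `repeat` times."""
    def run():
        for tag in tags:
            accessor(tag)
    return timeit.timeit(run, number=repeat)


class FakeTag(object):
    """Hypothetical stand-in for a parsed tag; replace with real tags from soup."""
    string = "plain text"
    text = "plain text"


# With a real document, collect the first <strong> of each matching <li> here instead.
tags = [FakeTag() for _ in range(1000)]

timings = {
    "str()": time_accessor(tags, str),
    ".string": time_accessor(tags, lambda tag: tag.string),
    ".text": time_accessor(tags, lambda tag: tag.text),
}
for name, seconds in sorted(timings.items()):
    print("%-8s %.4f s" % (name, seconds))
```

If one accessor is consistently much slower here, the slowdown is in the accessor itself. If the three are close, the difference probably comes from somewhere else, such as printing.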

Answer:

If you just want to print the string representation of tanim[0], you can simply do print str(tanim[0]). Also, run dir(tanim[0]) to see whether it has a property called string at all.

for olay in soup("li", {"class":"textb"}):
    tanim = olay("strong")
    try:
        print str(tanim[0])
    except IndexError:
        pass
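
Note that str(), .string and the text accessor do not return the same thing, so they are not interchangeable. Here is a small sketch of the difference, assuming the newer bs4 package and Python 3 (the post itself appears to use the older BeautifulSoup API under Python 2). The sample HTML is made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the original post does not show its HTML.
html = '<ul><li class="textb"><strong>Hello <em>world</em></strong></li></ul>'
soup = BeautifulSoup(html, "html.parser")

for olay in soup("li", {"class": "textb"}):
    tanim = olay("strong")
    if tanim:                      # an empty result list is skipped, no IndexError
        first = tanim[0]
        print(str(first))          # the markup, tags included
        print(first.string)        # None here: <strong> has more than one child
        print(first.get_text())    # the text of all descendants joined together
```

So str() is the right call only if the markup itself is wanted. For plain text, get_text() (or .text) is the closer match.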

For everyone to provide a better answer, you could also post the target HTML or the URI and mention which bit you are trying to extract out of it.
