
python - How do I match a tag containing only the stated class, not any others, using BeautifulSoup?

Question:

Is there a way to use BeautifulSoup to match a tag with only the indicated class attribute, not the indicated class attribute and others? For example, in this simple HTML:

<html>
<head>
<title>Title here</title>
</head>
<body>
<div class="one two">
some content here
</div>
<div class="two">
more content here
</div>
</body>
</html>

is it possible to match only the div with class="two", but not match the div with class="one two"? Unless I'm missing something, that section of the documentation doesn't give me any ideas. This is the code I'm using currently:

from bs4 import BeautifulSoup

html = '''
<html>
<head>
<title>Title here</title>
</head>
<body>
<div class="one two">
should not be matched
</div>
<div class="two">
this should be matched
</div>
</body>
</html>
'''

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, "html.parser")

# Returns the first div whose class list contains "two"
div_two = soup.find("div", "two")

print(div_two.contents[0].strip())

I'm trying to get this to print "this should be matched" instead of "should not be matched".

EDIT: In this simple example, I know that the only options for classes are "one two" or "two", but in production code, I'll only know that what I want to match will have class "two"; other tags could have a large number of other classes in addition to "two", which may not be known.

On a related note, it's also helpful to read the documentation for version 4, not version 3 as I previously linked.

Answer:

Try:

divs = soup.find_all('div', class_='two')  # class is a reserved word in Python, so bs4 uses class_

for div in divs:
    if div['class'] == ['two']:
        pass # handle class="two"
    else:
        pass # handle other cases, including but not limited to "one two"
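
The same check can also be folded into the search itself by passing a function to find_all. This is a minimal sketch, assuming Beautiful Soup 4 with the built-in "html.parser"; the helper name has_only_class and the shortened HTML are hypothetical and only for illustration.

```python
from bs4 import BeautifulSoup

html = """
<div class="one two">should not be matched</div>
<div class="two">this should be matched</div>
"""

soup = BeautifulSoup(html, "html.parser")


def has_only_class(tag, name="div", cls="two"):
    # bs4 parses the multi-valued class attribute into a list,
    # so an exact list comparison rejects class="one two".
    return tag.name == name and tag.get("class") == [cls]


# find_all calls the function once per tag and keeps the tags it accepts
matches = soup.find_all(has_only_class)
texts = [m.get_text(strip=True) for m in matches]
print(texts)
```

Comparing against the list ['two'] rather than testing membership is what excludes tags that carry additional, possibly unknown, classes.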

Answer:

I hope the code below helps, though I haven't tested it.

soup.find("div", { "class" : "two" })
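
Since this suggestion was posted untested, a quick check may be useful. The sketch below assumes Beautiful Soup 4 with the built-in "html.parser" and a shortened, hypothetical version of the question's HTML. It suggests that the dictionary form matches any tag whose class list contains "two", so it still returns the "one two" div first, and an exact comparison on the class list is needed afterwards.

```python
from bs4 import BeautifulSoup

html = """
<div class="one two">should not be matched</div>
<div class="two">this should be matched</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The dictionary form matches on membership in the class list
first = soup.find("div", {"class": "two"})
print(first["class"])

# Filtering on the full class list keeps only class="two"
exact = [
    div
    for div in soup.find_all("div", {"class": "two"})
    if div["class"] == ["two"]
]
print([div.get_text(strip=True) for div in exact])
```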