Python Language => HTML Parsing

Locate a text after an element in BeautifulSoup

Imagine you have the following HTML:

<div>
    <label>Name:</label>
    John Smith
</div>

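And you need to locate the text "John Smith" that follows the label element.

In this case, you can locate the label element by its text and then read its .next_sibling property, which returns the node immediately after the label (here, the text node containing the name plus surrounding whitespace).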

from bs4 import BeautifulSoup

data = """
<div>
    <label>Name:</label>
    John Smith
</div>
"""

soup = BeautifulSoup(data, "html.parser")

# string= is the current name of the older text= argument (Beautiful Soup 4.4.0+)
label = soup.find("label", string="Name:")
print(label.next_sibling.strip())

Prints John Smith.
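
The immediate .next_sibling is not always the text you want: it can be a whitespace-only text node or another tag. The following is a minimal sketch, not part of the original example, that walks .next_siblings until it finds non-empty text. The helper name text_after and the extra "Age:" label with its span are assumptions made for illustration.

```python
from bs4 import BeautifulSoup, NavigableString

data = """
<div>
    <label>Name:</label>
    John Smith
    <label>Age:</label>
    <span>42</span>
</div>
"""


def text_after(label):
    """Return the first non-empty text that follows `label` among its siblings.

    Hypothetical helper: text nodes are used directly, tags are
    flattened with get_text(). Returns None if nothing is found.
    """
    for sibling in label.next_siblings:
        if isinstance(sibling, NavigableString):
            text = sibling.strip()
        else:
            text = sibling.get_text().strip()
        if text:
            return text
    return None


soup = BeautifulSoup(data, "html.parser")

# Map each label's text to the value that follows it.
result = {label.get_text(): text_after(label) for label in soup.find_all("label")}
print(result)
```

For the "Age:" label the first sibling is only whitespace, so the loop skips it and falls through to the span.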

Using CSS selectors in BeautifulSoup

BeautifulSoup has limited support for CSS selectors, but covers the most commonly used ones. Use the select() method to find multiple elements and select_one() to find a single element.

Basic example:

from bs4 import BeautifulSoup

data = """
<ul>
    <li class="item">item1</li>
    <li class="item">item2</li>
    <li class="item">item3</li>
</ul>
"""

soup = BeautifulSoup(data, "html.parser")

for item in soup.select("li.item"):
    print(item.get_text())

Prints:

item1
item2
item3
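
As a small sketch of the difference between the two methods, the example below uses select_one() for single matches and select() with a child combinator and an attribute selector. The markup (the menu id, the active class and the links) is an assumption made up for illustration.

```python
from bs4 import BeautifulSoup

data = """
<ul id="menu">
    <li class="item"><a href="/one">item1</a></li>
    <li class="item active"><a href="/two">item2</a></li>
    <li class="item"><a href="/three">item3</a></li>
</ul>
"""

soup = BeautifulSoup(data, "html.parser")

# select_one() returns the first match, or None when nothing matches.
first = soup.select_one("li.item")
active = soup.select_one("#menu > li.active")
missing = soup.select_one("li.missing")

# select() always returns a list, possibly empty.
links = [a["href"] for a in soup.select("li.item > a[href]")]

print(first.get_text(), active.get_text(), missing, links)
```

Because select_one() can return None, check the result before calling methods on it when the element may be absent.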

PyQuery

pyquery is a jQuery-like library for Python. It has very good support for CSS selectors.

from pyquery import PyQuery

html = """
<h1>Sales</h1>
<table id="table">
<tr>
    <td>Lorem</td>
    <td>46</td>
</tr>
<tr>
    <td>Ipsum</td>
    <td>12</td>
</tr>
<tr>
    <td>Dolor</td>
    <td>27</td>
</tr>
<tr>
    <td>Sit</td>
    <td>90</td>
</tr>
</table>
"""

doc = PyQuery(html)

title = doc('h1').text()

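print(title)

table_data = []

# Each row is a raw lxml element, so wrap it in PyQuery to query its cells
rows = doc('#table > tr')
for row in rows:
    name = PyQuery(row).find('td').eq(0).text()
    value = PyQuery(row).find('td').eq(1).text()

    # Collect the pair as well as printing it
    table_data.append((name, value))
    print("%s\t  %s" % (name, value))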

Modified text is an extract of the original Stack Overflow Documentation, licensed under CC BY-SA 3.0. Not affiliated with Stack Overflow.
