python 爬虫利器优美的BeautifulSoup

news/2024/7/19 11:20:45 标签: 爬虫, python

    近期在研究py的网络编程,编写爬虫也是顺利成章的,开始在纠结与用正则表达式来匹配,到后来发现了Beautifulsoup,用他可以非常完美的帮我完成了这些任务:

   

    Beautiful Soup 是用Python写的一个HTML/XML的解析器,它可以很好的处理不规范标记并生成剖析树(parse tree)。 它提供简单又常用的导航(navigating),搜索以及修改剖析树的操作。它可以大大节省你的编程时间。

   简单使用说明:

python" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background-image:none;border:0px;line-height:1.1em;vertical-align:baseline;">
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>>  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">from  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">bs4  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">import  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">BeautifulSoup
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> html_doc  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">=  python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">"""
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <html><head><title>The Dormouse's story</title></head>
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">...  
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <p class="title"><b>The Dormouse's story</b></p>
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">...  
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <p class="story">Once upon a time there were three little sisters; and their names were
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... and they lived at the bottom of a well.</p>
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">...  
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... <p class="story">...</p>
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">... """
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">=  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">BeautifulSoup(html_doc)
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.head()
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[<title>The Dormouse's story< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">title>]
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.title
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;"><title>The Dormouse's story< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">title>
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.title.string
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">u python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"The Dormouse's story"
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.body.b
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;"><b>The Dormouse's story< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">b>
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.body.a
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;"><a  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">class python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"sister"  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">href python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"http://example.com/elsie"  python functions" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(255,20,147);">id python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"link1" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>Elsie< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">a>
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.get_text()
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">u python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"... The Dormouse's story\n...  \n... The Dormouse's story\n...  \n... Once upon a time there were three little sisters; and their names were\n... Elsie,\n... Lacie and\n... Tillie;\n... and they lived at the bottom of a well.\n...  \n... ...\n... "
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>> soup.find_all( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'a' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">)
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[<a  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">class python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"sister"  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">href python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"http://example.com/elsie"  python functions" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(255,20,147);">id python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"link1" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>Elsie< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">a>, <a  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">class python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"sister"  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">href python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"http://example.com/lacie"  python functions" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(255,20,147);">id python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"link2" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>Lacie< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">a>, <a  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">class python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"sister"  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">href python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"http://example.com/tillie"  python functions" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(255,20,147);">id python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"link3" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>Tillie< python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">a>]
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">>>>  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">for  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">key  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">in  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">soup.find_all( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'a' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">):
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">...      python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">print  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">key.get( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'class' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">),key.get( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'href' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">)
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">... 
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[ python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'sister' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">] http: python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">example.com python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">elsie
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[ python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'sister' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">] http: python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">example.com python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">lacie
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[ python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">'sister' python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">] http: python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">example.com python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">/ python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">tillie

###通过里面的方法,可以很快调出里面的元素和结果:

简单说明:

soup.body:表示显示body标签下面的内容,也可以用.来叠加标签:

soup.title.string:表示现在titile的文本内容

 soup.get_text():表示显示所有文本内容:

soup.find_all():方式可以随意组合,也可以通过任意标签,包括class,id 等方式:

举例说明:以我常常看的直播表新闻为例;

1、首先看看我们要获得的内容:

wKiom1YXFvDTi66tAAT1iE-LNKQ789.jpg

我要获取的是上面那一栏热点新闻:如世预赛国足不敌卡塔而

2、源代码查看:

1
2
< div  class = "fb_bbs" >< a  href = "http://news.zhibo8.cc/zuqiu/"  style = "padding: 0 5px 0 0;"  target = "_blank"  title = "足球新闻" >< img  src = "/css/images/football.png" /></ a >< span >< a  href = "http://news.zhibo8.cc/zuqiu/"  target = "_blank" >< font  color = "red" > 世预赛:国足0-1不敌卡塔
尔</ font ></ a >|< a  href = "http://news.zhibo8.cc/zuqiu/2015-10-09/5616a910d74ac.htm"  target = "_blank" >国足“刷卡”耻辱:11年不胜</ a >|< a  hf = "http://news.zhibo8.cc/zuqiu/2015-10-09/5616b22cbd134.htm"  target = "_blank" >切尔西签下阿梅利亚</ a >|< a  href = "http://news.zhibo8.cc/zuqiu/2015-10-09/5616daa45ee48.htm"  target = "_blank" >惊人!莱万5场14球</ a >|< a  href = "http://tu.zhibo8.cc/zuqiu/"  target = "_blank" >图-FIFA16中国球员</ a ></ span ></ div >

###从源码看到,这个是一个div 标签包裹的一个class=“fb_bbs”的版块,当然我们要确保这个是唯一的。

3、用BeautifulSoup来分析出结果代码如下:

python" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background-image:none;border:0px;line-height:1.1em;vertical-align:baseline;">
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);">#coding=utf-8
python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">import  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">urllib,urllib2
python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">from  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">bs4  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">import  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">BeautifulSoup
python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">try python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">:
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">     python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">html  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">=  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">urllib2.urlopen( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"http://www.zhibo8.cc" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">)
python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">except  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">urllib2.HTTPError as err:
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">     python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">print  python functions" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(255,20,147);">str python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">(err)
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">soup  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">=  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">BeautifulSoup(html)
python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">for  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">i  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">in  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">soup.find_all( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"div" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">,attrs python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">= python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">{ python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"class" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">: python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"fb_bbs" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">}):
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">     python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">result  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">=  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">i.get_text().split( python string" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#0000FF;">"|" python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">)
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">     python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">for  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">term  python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">in  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">result:
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">         python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">print  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">term
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">  
python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">4 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">、执行效果:
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">  
python spaces" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;">  python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">[root@master network] python comments" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,130,0);"># python url.py 
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">世预赛:国足 python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">0 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">1 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">不敌卡塔尔
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">国足“刷卡”耻辱: python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">11 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">年不胜
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">切尔西签下阿梅利亚
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">惊人!莱万 python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">5 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">场 python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">14 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">球
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">图 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">FIFA16中国球员
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">利物浦官方宣布克洛普上任
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">档案:克洛普的安菲尔德之旅
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">欧预赛 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">德国爆冷 python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">0 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">1 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">爱尔兰
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">葡萄牙 python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">1 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python value" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:rgb(0,153,0);">0 python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">胜丹麦
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">图 python keyword" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;font-weight:bold;color:rgb(0,102,153);">- python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">穆帅难罢手
 
python plain" style="font-family:Consolas, 'Bitstream Vera Sans Mono', 'Courier New', Courier, monospace;font-size:1em;background:none;border:0px;line-height:1.1em;vertical-align:baseline;color:#000000;">到此任务差不多完成,代码量比re模块少了很多,而且简洁唯美,用py做爬虫确实是个利器;









本文转自 小罗ge11 51CTO博客,原文链接:http://blog.51cto.com/xiaoluoge/1701086,如需转载请自行联系原作者

http://www.niftyadmin.cn/n/1042513.html

相关文章

html中文字开头空行,html 中的空格和空行

html 中的空格和空行要用特殊的格式显示&#xff0c;否则空格和空行不会显示出来。一、在web开发经常会遇到如: 这样的字符。它其实是Html将一些特殊字符(Html语法字符)的一种表达方式。下面是几个常用字符&#xff1a;空格& &< <> >"…

java c python 性能测试_C/Java/Python/Objective-C在OS X上的性能实验

前几天看到一篇介绍python的文章&#xff1a;如何让python变得更快——http://www.codeproject.com/Articles/522217/Howplustoplusmakepluspythonplusfaster&#xff0c;这篇文章勾起了我的好奇心&#xff0c;同样的算法多种编程语言在Mac的OS X上跑会是个什么情况呢&#xff…

js获取网页屏幕宽高

document.body.clientWidth > BODY对象宽度document.body.clientHeight > BODY对象高度document.documentElement.clientWidth > 可见区域宽度document.documentElement.clientHeight > 可见区域高度 网页可见区域宽&#xff1a; document.body.clientWidth网页可…

java保证同一个线程单独运行_JAVA 多线程和并发基础面试问答

多线程和并发问题是 Java 技术面试中面试官比较喜欢问的问题之一。在这里&#xff0c;从面试的角度列出了大部分重要的问题&#xff0c;但是你仍然应该牢固的掌握Java多线程基础知识来对应日后碰到的问题。(校对注&#xff1a;非常赞同这个观点)Java 多线程面试问题1.进程和线程…

从0开始学Unity第一天---Unity的下载与安装

这篇文章主要介绍Unity引擎的下载与安装。Unity有两种下载方式&#xff1a; 1. 下载器下载 打开Unity官网 &#xff08;Ps:有时会被墙&#xff09;然后你可以找到首页大大的按钮 > 下载Unity 这里下载的程序只是一个下载器&#xff0c;所以你还要使用下载器下载需要的模块。…

java后台解决ajax跨域_ajax前台后台跨域请求处理方式

最近一直在搞公众号前台开发&#xff0c;遇到了ajax跨域请求的问题&#xff0c;像地区的省-市-县三级联动、汽车品牌-车系-车款的三级联动查询等都需要调用外部接口(其他工程项目的接口)完成。下面就分享一下个人解决跨域请求的方案&#xff0c;当然是在后台程序猿大哥的帮助下…

详解 WebAPI 签名机制

首先&#xff0c;写这篇文章的原因是因为最近某一个项目中的接口被人为调用了&#xff0c;导致了数据库数据被串改。虽然是内部人无意点的&#xff0c;但还是引起了我的担忧&#xff0c;所有整理了下关于WebAPI的相关签名机制。 一、我们在开发接口时&#xff0c;有时候嫌麻烦就…

php输出音频文件_PHP语言生成指定频率和声音的音频文件(代码教程)

本文主要向大家介绍了PHP语言生成指定频率和声音的音频文件(代码教程)&#xff0c;通过具体的内容向大家展示&#xff0c;希望对大家学习php语言有所帮助。找了好久没有找到,php生成音频的代码,自己写了份&#xff0c;代码如下:> 24 & 255;$wav[6] $headerLen >>…