Python crawler source code for scraping Baidu Baike entries

Size: 20 KB

File type: .zip

Published: 2021-01-07
Language: Python
Tags: python, crawler code

Resource description

Demo source code, written in Python, that scrapes entry information from Baidu Baike. For details, see the blog post: http://blog.csdn.net/tianmaxingkong_/article/details/52959784

Code snippet and file information

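# coding=utf-8
import urllib2


class HtmlDownloader(object):
    """Downloads the HTML of a page (Python 2; uses urllib2)."""

    def download(self, url):
        # Nothing to fetch without a URL.
        if url is None:
            return None

        response = urllib2.urlopen(url)

        # Treat any status other than 200 as a failed download.
        if response.getcode() != 200:
            return None

        # Decode the raw bytes as UTF-8 text.
        return response.read().decode('utf-8')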
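
The snippet above depends on urllib2, which exists only in Python 2. As a minimal sketch, assuming Python 3, the same downloader could be written with urllib.request. The timeout parameter is an addition for illustration and is not part of the original code.

```python
from urllib.request import urlopen


class HtmlDownloader(object):
    """Python 3 sketch of the downloader; not the archive's original code."""

    def download(self, url, timeout=10):
        # Nothing to fetch without a URL.
        if url is None:
            return None

        # `timeout` (seconds) is an assumed extra parameter so that a
        # stalled connection does not block the crawler indefinitely.
        with urlopen(url, timeout=timeout) as response:
            # Treat any status other than 200 as a failed download.
            if response.getcode() != 200:
                return None
            # Decode the raw bytes as UTF-8 text.
            return response.read().decode("utf-8")
```

Note that urlopen raises urllib.error.HTTPError for error statuses such as 404, so a caller would likely wrap download() in a try/except block.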

Attribute       Size  Date       Time   Name
---------  ---------  ---------- -----  ----
Directory          0  2016-10-28 10:27  test_spider\
File            2297  2016-10-28 10:20  test_spider\spider_main.py
File             822  2016-10-28 10:27  test_spider\html_outputer.py
File            1831  2016-10-28 10:14  test_spider\html_parser.pyc
File             783  2016-10-28 10:14  test_spider\html_downloader.pyc
File            1618  2016-10-28 10:14  test_spider\url_manager.pyc
File           29684  2016-10-28 10:27  test_spider\output.html
File            1598  2016-10-28 10:27  test_spider\html_outputer.pyc
File             285  2016-10-28 09:42  test_spider\html_downloader.py
File             656  2016-10-28 09:39  test_spider\url_manager.py
File            1685  2016-10-28 10:06  test_spider\html_parser.py
File               0  2016-10-28 08:19  test_spider\__init__.py
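
The listing includes url_manager.py, but its contents are not shown on this page. As a rough illustration only, a module with that name in a small crawler often tracks which URLs are still pending and which have already been visited. The class and method names below are hypothetical and may not match the archive.

```python
class UrlManager(object):
    """Hypothetical sketch: tracks pending and already-visited URLs."""

    def __init__(self):
        self.new_urls = set()   # URLs waiting to be crawled
        self.old_urls = set()   # URLs already handed out

    def add_new_url(self, url):
        # Ignore empty input and anything seen before.
        if url is None:
            return
        if url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def add_new_urls(self, urls):
        # Accept None or an empty collection without failing.
        for url in urls or []:
            self.add_new_url(url)

    def has_new_url(self):
        return len(self.new_urls) > 0

    def get_new_url(self):
        # Move one pending URL to the visited set and return it.
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url
```

Sets are used here so that duplicate links found on different pages are crawled only once.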
