Python crawler: scraping novels from 136书屋 (beautifulsoup4.py)

Size: 2KB

File type: .py

Published: 2021-01-04
Language: Python
Tags: bs

Resource description

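beautifulsoup4.py is a Python crawler that scrapes novels from 136书屋. It uses the beautifulsoup4 package to parse HTML and XML, and urllib to open and work with URLs. Install beautifulsoup4 before running it; urllib is part of the Python standard library and needs no separate installation. This example was written for Python 2.7.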

Code snippet and file information

#coding=utf-8

from urllib import URLopener  # Python 2.7 standard library
from bs4 import BeautifulSoup as BS
import os
import sys

if __name__ == '__main__':
    # Local folder used for the downloaded books
    Bfolder = r"D:\LILUO\6.MyTools\12.beautifulsoup4\books"

    url = "http://www.136book.com/"
    html = URLopener().open(url)
    soup = BS(html.read(), "html.parser")

    # Map each on-site link that carries a title attribute: book URL -> title
    a = soup.find_all(name='a')
    BookDict = {}
    for each in a:
        href = each.get('href')
        # href is None for anchors without that attribute, so check it first
        if href and "http://www.136book.com/" in href:
            if each.get('title'):
                BookDict[href] = each.get('title')
    html.close()

    # Visit every collected book page and list its links
    for burl in BookDict:
        #burl = "http://www.136book.com/zetianji/"
        bhtml = URLopener().open(burl)
        bsoup = BS(bhtml.read(), "html.parser")
        ba = bsoup.find_all(name='a')
        path = Bfol
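
The link-collection loop in the snippet can be tried offline without touching the site. The following is only a rough sketch under stated assumptions: it targets Python 3 rather than the Python 2.7 of the original file, and it swaps in the standard library's html.parser.HTMLParser in place of BeautifulSoup's find_all so that it runs with no third-party packages. The names LinkCollector, collect_book_links, and SAMPLE_HTML, as well as the sample titles, are hypothetical and not part of the original file.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href -> title for <a> tags that point into a given site."""

    def __init__(self, site_prefix):
        super().__init__()
        self.site_prefix = site_prefix
        self.books = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        title = attr_map.get("title")
        # Skip anchors without an href, off-site links, and links with no title.
        if href and self.site_prefix in href and title:
            self.books[href] = title


def collect_book_links(html_text, site_prefix):
    """Return a dict of on-site link URL -> title found in html_text."""
    parser = LinkCollector(site_prefix)
    parser.feed(html_text)
    parser.close()
    return parser.books


# Made-up sample markup standing in for the real front page (assumption).
SAMPLE_HTML = """
<a href="http://www.136book.com/zetianji/" title="Ze Tian Ji">Ze Tian Ji</a>
<a href="http://www.136book.com/about/">no title</a>
<a href="http://example.com/" title="elsewhere">off-site</a>
<a name="top">anchor without href</a>
"""

books = collect_book_links(SAMPLE_HTML, "http://www.136book.com/")
print(books)
```

Only the first sample link is kept: the second has no title, the third is off-site, and the fourth has no href at all, which is the case that would make a bare `in each.get('href')` test fail.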
