Jandan.net Image Scraper

Size: 1.82KB

File type: .py

Published: 2021-03-01
Language: Python
Tags: scraper, images

Resource description

A Python script that scrapes images from Jandan.net (煎蛋网).

Code snippet and file information

import urllib.request
import os
import base64

def url_open(url):
    # Send a browser-like User-Agent header with the request
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
    req = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(req)
    html = response.read()
    return html

    
def get_page(url):
    html = url_open(url).decode('utf-8')
    # Skip 23 characters past the start of 'current-comment-page' to reach the page number
    a = html.find('current-comment-page') + 23
    b = html.find(']', a)
    return html[a:b]


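def find_imgs(url):
    html = url_open(url).decode('utf-8')
    img_addrs = []
    a = html.find('img src=')

    while a != -1:
        # Look for '.jpg' starting at a and stopping at a+255, the maximum link length allowed
        b = html.find('.jpg', a, a + 255)
        if b != -1:
            # +9 skips 'img src=' and the opening quote; +4 keeps the '.jpg' suffix
            img_addrs.append('http:' + html[a+9:b+4])
        else:
            # The published snippet is truncated at this point. The rest of this
            # function is an assumed completion, not the original code.
            b = a + 9
        a = html.find('img src=', b)

    return img_addrs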
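The string-offset parsing in get_page and find_imgs can be tried without any network access. The sketch below is not part of the original file: it swaps the `str.find` offsets for regular expressions and runs them against a small, hypothetical HTML sample (the real Jandan.net markup is not shown in the source). The names `SAMPLE_HTML`, `parse_page_number` and `parse_jpg_addrs` are made up for illustration. In the sample, `'current-comment-page'` is 20 characters long and is followed by `">[`, which is where the offset of 23 in get_page would come from under this assumed layout.

```python
import re

# Hypothetical sample markup, assumed for illustration only.
SAMPLE_HTML = (
    '<span class="current-comment-page">[1293]</span>'
    '<p><img src="//example.com/a/one.jpg" /></p>'
    '<p><img src="//example.com/b/two.jpg" /></p>'
    '<p><img src="//example.com/c/three.png" /></p>'
)


def parse_page_number(html):
    """Return the digits inside 'current-comment-page">[N]', or None if absent."""
    match = re.search(r'current-comment-page">\[(\d+)\]', html)
    return match.group(1) if match else None


def parse_jpg_addrs(html):
    """Return 'http:'-prefixed URLs for every .jpg found in an img src attribute."""
    # [^"]+? cannot cross the closing quote, so non-.jpg images are skipped.
    return ['http:' + src for src in re.findall(r'img src="([^"]+?\.jpg)"', html)]


if __name__ == '__main__':
    print(parse_page_number(SAMPLE_HTML))
    print(parse_jpg_addrs(SAMPLE_HTML))
```

Unlike the fixed 255-character search window, the regular expression stops at the closing quote of each `src` attribute, so a missing `.jpg` in one tag cannot pull in text from the next tag.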
