GrabClass.py: scraping the Wuhan University of Technology class timetable

Size: 2 KB

File type: .py

Published: 2021-05-13
Language: Python
Tags: web crawler

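Resource description

A supplement to the previous crawler code, used mainly to scrape class timetables from Wuhan University of Technology.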

Code snippet and file information

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# author: universtar
# time: 18/4/12


from urllib import request
from urllib import parse
from bs4 import BeautifulSoup
import time
import re

# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.98 Safari/537.36'
}
# Target URL
url = 'http://sso.jwc.whut.edu.cn/Certification//login.do'

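# Fetch the HTML returned after submitting the login form
def get_html(url, userName, password):
    # Form fields required to log in to the academic affairs system
    data = {
        'systemId': '',
        'xmlmsg': '',
        'userName': userName,
        'password': password,
        'type': 'xs'
    }
    # URL-encode the form fields and convert them to bytes
    data = parse.urlencode(data).encode('utf-8')
    # Build the request; supplying data makes it a POST
    req = request.Request(url=url, headers=headers, data=data)
    response = request.urlopen(req)
    # Read the raw HTML of the response
    html = response.read()
    return html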

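# Parse the returned HTML for the wanted information
def get_info(htmlresponse):
    # Build the soup object
    soup = BeautifulSoup(htmlresponse, 'html.parser', from_encoding='utf-8')
    # Select the div elements that hold the wanted information
    infos = soup.find_all('div', style="margin-top: 2px; font-size: 10px")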
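
The login step in get_html can be checked without touching the network. The following is a minimal sketch, not the author's code: it builds the same form-encoded request but does not send it, so the encoded body can be inspected. The helper name build_login_request, the sample credentials, and the shortened User-Agent are hypothetical; the form fields and the URL are taken from the listing above.

```python
from urllib import parse, request

# URL as it appears in the listing above
LOGIN_URL = "http://sso.jwc.whut.edu.cn/Certification//login.do"


def build_login_request(url, user_name, password):
    """Build (but do not send) the form-encoded login request.

    Hypothetical helper for illustration only.
    """
    form = {
        "systemId": "",
        "xmlmsg": "",
        "userName": user_name,
        "password": password,
        "type": "xs",
    }
    # urlencode percent-encodes reserved characters and joins pairs with '&'
    body = parse.urlencode(form).encode("utf-8")
    # Shortened User-Agent, an assumption for brevity
    headers = {"User-Agent": "Mozilla/5.0"}
    # Passing data makes urllib treat the request as a POST
    return request.Request(url=url, headers=headers, data=body)


# Made-up credentials, chosen to show how '@' and spaces are encoded
req = build_login_request(LOGIN_URL, "student01", "p@ss word")
print(req.get_method())
print(req.data.decode("utf-8"))
```

Passing the returned object to request.urlopen would submit the form as get_html does. Keeping construction separate from sending makes the encoding easy to verify first.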
