Resource description
Scrapes student grades from the Zhengfang (正方) academic administration system.
Code snippet and file information
# -*- coding:gb2312 -*-
# Python 2 script (urllib2, cookielib, raw_input, print statement).
import urllib, urllib2, cookielib
import re, os, string
from bs4 import BeautifulSoup
# from PIL import Image
import sys
reload(sys)
sys.setdefaultencoding('gb2312')

baseUrl = 'http://222.24.19.201/'
codeUrl = 'CheckCode.aspx'
loginUrl = 'default2.aspx'
scoreUrl = 'xscjcx.aspx'

def downImg(url, name):
    '''
    Download the CAPTCHA image.
    :param url: URL of the CAPTCHA endpoint
    :param name: file name to save the CAPTCHA under
    :return:
    '''
    try:
        req = urllib2.Request(url)
        req = urllib2.urlopen(req)
        content = req.read()
        file = open(os.getcwd() + '/' + name, 'w+b')
        file.write(content)
        file.close()
    except Exception as e:
        print 'Error :', e

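def setCookie():
    '''
    Create the cookie jar and install a cookie-aware opener.
    :return: the cookie jar
    '''
    cookie = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
    urllib2.install_opener(opener)
    # The first request makes the server issue a session cookie.
    opener.open(baseUrl)
    return cookie
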
def login(username, password, cookie):
    '''
    Log in to the academic administration system.
    :param username: user name
    :param password: password
    :param cookie: cookie jar returned by setCookie
    :return: dict holding the student's name and the session_id
    '''
    request = urllib2.Request(baseUrl)
    text = urllib2.urlopen(request).read()
    downImg(baseUrl + codeUrl, 'code.png')
    # image = Image.open('code.png')
    # print image_to_string(image)
    code = raw_input('Enter the CAPTCHA: ')
    soup = BeautifulSoup(text, 'html.parser')
    # Takes the first <input> on the page to be the __VIEWSTATE hidden field.
    _VIEWSTATE = soup.find_all('input')[0].get('value')
    headers = {
        'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko/20100101 Firefox/14.0.1',
        'Referer' : baseUrl
    }
    postData = {
        '__VIEWSTATE' : _VIEWSTATE,
        'txtUserName' : username,
        'TextBox2' : password,
        'txtSecretCode' : code,
        'RadioButtonList1' : '学生',  # form value for the "student" role; must stay in Chinese
        'Button1' : '',
        'lbLanguage' : '',
        'hidPdrs' : '',
        'hidsc' : ''
    }
    postData = urllib.urlencode(postData)
    request = urllib2.Request(baseUrl + loginUrl, postData, headers)
    response = urllib2.urlopen(request)
    text = response.read()
    soup = BeautifulSoup(text, 'html.parser')
    # '验证码不正确' is the server's "incorrect CAPTCHA" message.
    if re.search('验证码不正确', text):
        print 'Incorrect CAPTCHA'
        exit(1)
    # The pattern is empty in the source, so this branch matches any response.
    elif re.search('', text):
        result = {}
        name = soup.find(id = 'xhxm').string
        name = name.decode('gb2312').encode('gb2312')
        # Strip the trailing '同学' ("student") suffix from the displayed name.
        name = string.replace(name, '同学', '')
        result['name'] = name
        # Reads the session id through the jar's private _cookies mapping.
        session_id = cookie._cookies['222.24.19.201']['/']['ASP.NET_SessionId'].value
        result['session_id'] = session_id
        return result
    else:
        print 'Login failed'
        exit(1)

def getScore(username, name, session_id, ddlXN, ddlXQ):
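
The login step above takes the first input element on the page to be the hidden __VIEWSTATE field and URL-encodes the form with the interpreter's default encoding. As a minimal sketch, and not the author's method, the same form body could be built in Python 3 with only the standard library by looking the hidden field up by name and encoding the form explicitly as gb2312. The names HiddenFieldParser, extract_hidden_fields and build_login_body are hypothetical helpers introduced for illustration. The form field names are copied from the snippet, and the sketch assumes the login page exposes __VIEWSTATE as a hidden input, as ASP.NET WebForms pages normally do.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != 'input':
            return
        attr = dict(attrs)
        if (attr.get('type') or '').lower() == 'hidden' and attr.get('name'):
            self.fields[attr['name']] = attr.get('value') or ''


def extract_hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


def build_login_body(html, username, password, code):
    """Build the POST body for the login form (hypothetical helper).

    Field names are taken from the original snippet; the body is
    percent-encoded as gb2312 because the script declares that encoding.
    """
    hidden = extract_hidden_fields(html)
    form = {
        '__VIEWSTATE': hidden['__VIEWSTATE'],  # KeyError if the field is absent
        'txtUserName': username,
        'TextBox2': password,
        'txtSecretCode': code,
        'RadioButtonList1': '学生',  # "student" role, as in the original form
        'Button1': '',
        'lbLanguage': '',
        'hidPdrs': '',
        'hidsc': '',
    }
    return urlencode(form, encoding='gb2312').encode('ascii')
```

The returned bytes can be passed as the data argument of urllib.request.Request. Looking the field up by name raises a KeyError if the page layout changes, instead of silently posting the value of whichever input happens to come first.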