Resource overview
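This is the most optimized version of the scraper. It has a GUI and lets you enter a date and a route again and again, scraping each time. You can query ticket prices for a single day of your choice or for the next 90 days. Given an origin, a destination, and a date, it automatically scrapes the fare, flight, and departure time of every ticket for that day and saves the results to an Excel file whose name you supply. It is also a good example for learning web scraping. This is the final version of the Python scraper; it is interactive and convenient to use. If you find it useful, please leave a good rating. The scraped output is stored in the 文档 folder.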
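The "next 90 days" query mentioned above amounts to running the single-day query once per date. A minimal sketch of generating those dates follows; the helper name `upcoming_dates` is hypothetical, and whether the original tool counts the start day itself among the 90 is an assumption.

```python
from datetime import date, timedelta


def upcoming_dates(start, days=90):
    """Return `days` consecutive dates as YYYY-MM-DD strings.

    Assumption: the range begins on `start` itself rather than the day after.
    """
    return [
        (start + timedelta(days=offset)).strftime("%Y-%m-%d")
        for offset in range(days)
    ]


# Example: the 90 query dates beginning on 1 January 2024.
dates = upcoming_dates(date(2024, 1, 1))
print(dates[0], dates[-1], len(dates))
```

Each string is in the same `%Y-%m-%d` form that the script uses for its `date` query parameter.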
Code snippet and file information
#!/usr/bin/env python
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
from datetime import datetime
from dateutil.parser import parse
from bs4 import BeautifulSoup
import re
import win32com.client as win32
import win32gui
import requests
from tkinter import *
import sys
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Connection': 'keep-alive',
    'Accept-Encoding': 'gzip, deflate, br',
}
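# Maps the city name typed by the user to the code used in the search URL.
city = {
    '上海': 'sha', '北京': 'bjs', '广州': 'can', '深圳': 'szx',
    '海口': 'hak', '三亚': 'syx', '杭州': 'hgh', '武汉': 'wuh',
    '成都': 'ctu', '西安': 'sia', '重庆': 'ckg', '青岛': 'tao',
    '长沙': 'csx', '南京': 'nkg', '厦门': 'xmn', '昆明': 'kmg',
    '大连': 'dlc', '天津': 'tsn', '郑州': 'cgo', '济南': 'tna',
    '福州': 'foc',
}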
all_price = []
result = []
go_On = True
def oneDay_ticket():
    # Scrape every flight for one day and write the rows to a new Excel workbook.
    try:
        global soup
        adrress_start = input('Please input a StartAdrress:')
        adrres_end = input('Please input a EndAdrress:')
        GoDate = input('Please input a Date:')
        name = input('Please input filename:')
        dest_filename = '%s.xlsx' % name
        app = 'Excel'
        xl = win32.gencache.EnsureDispatch('%s.Application' % app)
        wb = xl.Workbooks.Add()
        sh = wb.ActiveSheet
        sh.Cells.NumberFormatLocal = "@"  # treat every cell as text
        xl.Visible = True
        # Row 1 holds origin, destination, date, and the price column title.
        sh.Cells(1, 1).Value = adrress_start
        sh.Cells(1, 2).Value = adrres_end
        sh.Cells(1, 4).Value = '价格(元)'  # "Price (yuan)"
        GoDate = str(GoDate)
        if GoDate != '':
            # Normalize whatever the user typed to YYYY-MM-DD.
            GoDate = parse(GoDate)
            GoDate = GoDate.strftime('%Y-%m-%d')
            sh.Cells(1, 3).Value = GoDate
        else:
            # No date entered: default to today.
            GoDate = datetime.today()
            GoDate = GoDate.strftime('%Y-%m-%d')
            sh.Cells(1, 3).Value = GoDate
        # ~ GoDate.asctime()
        id_list = re.compile(r'[a-zA-Z0-9\._+-]*')
        # city.get() returns None for a city that is not in the table.
        adrress_start = city.get(adrress_start)
        adrres_end = city.get(adrres_end)
        link = '{}-{}?date={}'
        links = link.format(adrress_start, adrres_end, GoDate)
        url = 'https://flights.ctrip.com/itinerary/oneway/' + links
        expression = re.compile(r'^\/itinerary\/oneway\/[a-z]+[\-]+[a-z]+\?date\=[0-9]+[-]+[0-9]+[-]+[0-9]+')
        # Note: PhantomJS support was removed in Selenium 4, so this line needs an older Selenium release.
        driver = webdriver.PhantomJS(executable_path='D:/Python/Python_work/phantomjs-2.1.1-windows/bin/phantomjs')
        driver.get(url)
        time.sleep(3)  # fixed wait for the page to render
        sourcePage = driver.page_source
        soup = BeautifulSoup(sourcePage, "lxml")
        num = 0
        # inb_find() and price() are not part of this excerpt; they appear to
        # fill airport_name, start_time, end_time, and all_price.
        inb_find()
        price()
        # ~ result_ls()
        # One worksheet row per flight, starting at row 2.
        for (name, start_t, end_t, a_price) in zip(airport_name, start_time, end_time, all_price):
            sh.Cells(num + 2, 1).Value = name
            sh.Cells(num + 2, 2).Value = start_t
            sh.Cells(num + 2, 3).Value = end_t
            sh.Cells(num + 2, 4).Value = a_price
            num += 1
        sh.Cells.Replace("¥", "")  # strip the currency sign from the prices
        sh.Cells.Columns.AutoFit()
        sh.SaveAs(dest_filename)
    finally:
        # wb and xl are unbound if an error occurs before Excel has started.
        wb.Close(False)
        xl.Application.Quit()
#
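The script above builds its search URL with `city.get(...)`, which returns `None` for a city missing from the table, so an unknown city name ends up in the URL as the literal text `None`. A minimal sketch of the same URL construction with an explicit check is shown below. The function name `build_search_url`, the constants, and the use of `datetime.strptime` in place of `dateutil.parser.parse` are assumptions made to keep the example self-contained; the URL pattern is taken from the script and has not been verified against the live site.

```python
from datetime import datetime

# Subset of the city-code table from the script above.
CITY_CODES = {"上海": "sha", "北京": "bjs", "广州": "can", "深圳": "szx"}
BASE_URL = "https://flights.ctrip.com/itinerary/oneway/"


def build_search_url(origin, destination, go_date=""):
    """Build the one-way search URL, rejecting cities that are not in the table.

    `go_date` is expected as YYYY-MM-DD (assumption); an empty string means today.
    """
    try:
        origin_code = CITY_CODES[origin]
        destination_code = CITY_CODES[destination]
    except KeyError as exc:
        raise ValueError(f"unsupported city: {exc.args[0]}") from None

    day = datetime.strptime(go_date, "%Y-%m-%d") if go_date else datetime.today()
    return f"{BASE_URL}{origin_code}-{destination_code}?date={day:%Y-%m-%d}"


print(build_search_url("上海", "北京", "2019-03-01"))
```

Failing early with a `ValueError` keeps a mistyped city from reaching the browser step, where the only symptom would be an empty result sheet.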