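Resource overview
Scrapes Ctrip flight listings for a single day. You enter the departure city, the destination, and the date, and the script collects the price, flight, and departure time of every flight on that day and saves the results to an Excel workbook automatically. It also works as a hands-on example for learning web scraping.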
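The input-to-URL step described above can be sketched in isolation. This is a minimal sketch, not the script's own code: the helper names `normalize_date` and `build_search_url` are hypothetical, the city table is a four-entry subset of the one in the snippet, and the date parsing uses `datetime.strptime` with a few assumed formats in place of `dateutil.parser.parse`, so that it runs on the standard library alone. The regular expression is the one the snippet compiles as `expression`.

```python
import re
from datetime import datetime

# Subset of the city-code table from the snippet.
CITY_CODES = {'上海': 'sha', '北京': 'bjs', '广州': 'can', '深圳': 'szx'}

BASE_URL = 'https://flights.ctrip.com/itinerary/oneway/'

# Same pattern the snippet compiles as `expression`.
PATH_PATTERN = re.compile(
    r'^\/itinerary\/oneway\/[a-z]+[\-]+[a-z]+\?date\=[0-9]+[-]+[0-9]+[-]+[0-9]+')


def normalize_date(text, today=None):
    """Return the date as YYYY-MM-DD; use today's date for empty input."""
    text = text.strip()
    if not text:
        return (today or datetime.today()).strftime('%Y-%m-%d')
    # Assumed input formats; the original script accepts whatever dateutil parses.
    for fmt in ('%Y-%m-%d', '%Y/%m/%d', '%Y%m%d'):
        try:
            return datetime.strptime(text, fmt).strftime('%Y-%m-%d')
        except ValueError:
            continue
    raise ValueError('unrecognized date: %r' % text)


def build_search_url(start_city, end_city, date_text, today=None):
    """Build the one-way search URL from two city names and a date string."""
    try:
        start_code = CITY_CODES[start_city]
        end_code = CITY_CODES[end_city]
    except KeyError as exc:
        # The original uses city.get(), which silently yields None here.
        raise ValueError('unknown city: %s' % exc.args[0])
    return '{}{}-{}?date={}'.format(
        BASE_URL, start_code, end_code, normalize_date(date_text, today))
```

Raising on an unknown city is a design choice of this sketch; it avoids requesting a URL that contains the literal text `None`.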
Code snippet and file information
#!/usr/bin/env python
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
from datetime import datetime
from dateutil.parser import parse
from bs4 import BeautifulSoup
import re
from lxml import etree
import win32com.client as win32
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Connection': 'keep-alive',
    'Accept-Encoding': 'gzip, deflate, br',
}
city = {
    '上海': 'sha', '北京': 'bjs', '广州': 'can', '深圳': 'szx',
    '海口': 'hak', '三亚': 'syx', '杭州': 'hgh', '武汉': 'wuh',
    '成都': 'ctu', '西安': 'sia', '重庆': 'ckg', '青岛': 'tao',
    '长沙': 'csx', '南京': 'nkg', '厦门': 'xmn', '昆明': 'kmg',
    '大连': 'dlc', '天津': 'tsn', '郑州': 'cgo', '济南': 'tna',
    '福州': 'foc',
}
all_price = []
result = []
adrress_start = input('Please input a StartAdrress:')
adrres_end = input('Please input a EndAdrress:')
GoDate = input('Please input a Date:')
dest_filename = '机票2.xlsx'
app = 'Excel'
xl = win32.gencache.EnsureDispatch('%s.Application' % app)
wb = xl.Workbooks.Add()
sh = wb.ActiveSheet
sh.Cells.NumberFormatLocal = "@"
xl.Visible = True
sh.Cells(1, 1).Value = adrress_start
sh.Cells(1, 2).Value = adrres_end
GoDate = str(GoDate)
if GoDate != '':
    GoDate = parse(GoDate)
    GoDate = GoDate.strftime('%Y-%m-%d')
    sh.Cells(1, 3).Value = GoDate
else:
    GoDate = datetime.today()
    GoDate = GoDate.strftime('%Y-%m-%d')
    sh.Cells(1, 3).Value = GoDate
# ~ GoDate.asctime()
id_list = re.compile(r'[a-zA-Z0-9\._+-]*')
adrress_start = city.get(adrress_start)
adrres_end = city.get(adrres_end)
link = '{}-{}?date={}'
links = link.format(adrress_start, adrres_end, GoDate)
url = 'https://flights.ctrip.com/itinerary/oneway/' + links
expression = re.compile(r'^\/itinerary\/oneway\/[a-z]+[\-]+[a-z]+\?date\=[0-9]+[-]+[0-9]+[-]+[0-9]+')
# Note: webdriver.PhantomJS exists only in Selenium 3.x and earlier; it was removed in Selenium 4.
driver = webdriver.PhantomJS(executable_path='F:/书籍/Python/Python_work/phantomjs-2.1.1-windows/bin/phantomjs')
driver.get(url)
def main():
    num = 0
    inb_find()
    price()
    result_ls()
    for (name, start_t, end_t, a_price) in zip(airport_name, start_time, end_time, all_price):
        sh.Cells(num + 2, 1).Value = name
        sh.Cells(num +
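
The snippet breaks off inside the loop that writes one worksheet row per flight. A minimal sketch of that write step, under stated assumptions, is given here: `layout_rows` and `write_cells` are hypothetical helpers, the column order (name, departure time, arrival time, price) simply follows the order of the `zip` call, and the starting row of 2 is taken from the `num + 2` expression. Separating the layout from the COM call lets the cell coordinates be checked without Excel installed.

```python
def layout_rows(names, start_times, end_times, prices, first_row=2):
    """Yield (row, column, value) triples, one worksheet row per flight.

    Columns are 1-based, matching the Excel COM Cells(row, column) call.
    """
    flights = zip(names, start_times, end_times, prices)
    for offset, flight in enumerate(flights):
        for column, value in enumerate(flight, start=1):
            yield first_row + offset, column, value


def write_cells(sheet, cells):
    """Write each triple to any object exposing Cells(row, column).Value,
    such as the `sh` worksheet created through win32com in the snippet."""
    for row, column, value in cells:
        sheet.Cells(row, column).Value = value
```

With the worksheet from the snippet, the call would look like `write_cells(sh, layout_rows(airport_name, start_time, end_time, all_price))`. Because `zip` stops at the shortest list, a flight whose price failed to parse would shorten the output rather than raise an error.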