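Resource overview
A Python crawler that scrapes detailed company information and saves it to a MySQL database. It also demonstrates the use of proxy IPs.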
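The "save to MySQL" step is not visible in the preview below, so here is only a minimal sketch of how one record might be written. The field names are taken from the comment in the snippet. The table name `company_info` and the helpers `build_insert` and `save_company` are hypothetical and not part of the original file.

```python
# Field names follow the comment in the original snippet.
COMPANY_FIELDS = ("company", "domain", "legal_person", "address", "email", "phone")


def build_insert(info, table="company_info"):
    """Return (sql, params) for a parameterized INSERT of one company record.

    `table` is a hypothetical name; the original schema is not shown.
    """
    columns = ", ".join(COMPANY_FIELDS)
    placeholders = ", ".join(["%s"] * len(COMPANY_FIELDS))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    # Fields missing from the dict become None, which the driver sends as NULL.
    params = tuple(info.get(field) for field in COMPANY_FIELDS)
    return sql, params


def save_company(conn, info):
    """Write one record through an already-open DB-API connection (e.g. pymysql)."""
    sql, params = build_insert(info)
    with conn.cursor() as cursor:
        cursor.execute(sql, params)
    conn.commit()
```

Passing the values as parameters instead of formatting them into the SQL string leaves escaping to the driver, which matters when the scraped text contains quotes.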
Code snippet and file information
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Look up detailed company information on 企某查 by the company's full name
from bs4 import BeautifulSoup
import urllib.request
# import urllib2.request
import re
import pymysql
import random
import time
import requests
# Dictionary that holds one company's information, similar to a C struct
# Fields in the dictionary: company, domain, legal_person, address, email, phone
gCompanyInfo = dict()
proxy_list = [
    {'http': 'http://163.125.71.159:9999'},
    {'http': 'http://180.118.242.17:61234'},
    {'http': 'http://218.1.236.197:50835'},
    {'http': 'http://219.135.168.15:53281'},
    {'http': 'http://175.148.75.226:1133'},
    {'http': 'http://221.238.151.98:45455'},
    {'http': 'http://110.72.18.208:8123'},
    {'http': 'http://122.246.53.146:8010'},
    {'http': 'http://180.118.240.49:808'},
    {'http': 'http://163.125.19.53:8888'},
    {'http': 'http://180.118.240.7:808'},
    {'http': 'http://112.95.206.225:9999'},
    {'http': 'http://171.13.36.155:808'},
    {'http': 'http://180.122.148.207:40021'},
    {'http': 'http://222.85.39.150:808'},
    {'http': 'http://182.88.186.245:8123'},
    {'http': 'http://113.128.27.236:39094'},
    {'http': 'http://121.31.102.218:8123'},
    {'http': 'http://115.46.69.90:8123'},
    {'http': 'http://60.175.213.163:38513'},
    {'http': 'http://182.202.221.202:61234'},
    {'http': 'http://121.205.254.46:29568'},
    {'http': 'http://110.73.42.237:8123'},
    {'http': 'http://115.46.78.105:8123'},
    {'http': 'http://171.13.36.167:808'},
    {'http': 'http://49.81.17.102:8888'},
    {'http': 'http://110.73.8.236:8123'},
    {'http': 'http://171.13.36.135:48755'},
    {'http': 'http://222.85.39.72:23115'},
    {'http': 'http://121.31.176.190:8123'},
    {'http': 'http://180.118.77.9:9999'},
    {'http': 'http://59.62.165.31:9999'},
    {'http': 'http://59.62.166.99:53281'},
    {'http': 'http://113.128.9.58:9999'},
    {'http': 'http://116.209.56.179:9999'},
    {'http': 'http://121.61.1.48:9999'},
    {'http': 'http://110.52.235.127:9999'},
    {'http': 'http://59.62.166.135:9999'},
    {'http': 'http://115.151.0.253:9999'},
    {'http': 'http://171.41.81.185:9999'},
    {'http': 'http://121.61.3.209:9999'},
    {'http': 'http://110.52.235.66:9999'},
    {'http': 'http://112.85.168.189:9999'},
    {'http': 'http://115.151.3.121:808'},
    {'http': 'http://110.52.235.212:9999'},
    {'http': 'http://121.61.1.234:9999'},
    {'http': 'http://222.44.30.20:8080'},
    {'http': 'http://58.251.49.4:43007'},
    {'http': 'http://113.13.160.100:9999'},
    {'http': 'http://121.61.3.13:9999'},
    {'http': 'http://119.101.114.73:9999'},
    {'http': 'http://119.101.115.7:9999'},
    {'http': 'http://119.101.113.107:9999'},
    {'http': 'http://59.37.33.62:54474'},
    {'http': 'http://119.101.115.137:9999'},
    {'http': 'http://119.101.118.3:9999'},
    {'http': 'http://121.61.0.192:9999'},
    {'http': 'http://119.101.112.39:9999'},
    {'http': 'http://119.101.112.126:9999'},
    {'http': 'http://119.101.116.202:9999'},
    # The source preview is truncated at this point.
]
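The preview ends before showing how the proxy pool is consumed. One plausible approach, sketched here under assumptions, is to pick a random entry per request and retry with another entry on failure. The helpers `open_via_proxy` and `fetch_with_rotation` are hypothetical names, and the sketch uses only the standard-library `urllib.request` rather than whatever the original file actually does.

```python
import random
import urllib.request


def open_via_proxy(url, proxy, timeout=10):
    """Fetch `url` through one proxy mapping such as {'http': 'http://host:port'}."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxy))
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def fetch_with_rotation(url, proxies, attempts=3, fetch=open_via_proxy, rng=random):
    """Try up to `attempts` randomly chosen proxies and return the first body."""
    last_error = None
    for _ in range(attempts):
        proxy = rng.choice(proxies)
        try:
            return fetch(url, proxy)
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            # Free proxies fail often, so remember the error and try another one.
            last_error = exc
    raise RuntimeError(f"all {attempts} proxy attempts failed") from last_error
```

A call would look like `fetch_with_rotation(url, proxy_list)`. Because every entry in `proxy_list` has only an `'http'` key, only `http://` URLs would go through a proxy; an `https://` request would need `'https'` entries as well.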