• Size: 45.42MB
    File type: .zip
    Published: 2023-07-30
  • Language: Python

Resource description

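Scraping, mining, and analysis of job requirements posted on Zhilian and 51job. Data collection ended on 2018-12-28, and the dataset contains 150,000 records from the two platforms (Zhilian and 51job). The project is meant to give me, as someone about to look for a job, some direction. For the detailed workflow, see the PPT on the right.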

Code snippet and file information

# coding=utf-8
import json
import time

import requests

import list_headers  # project-local module that supplies the request headers

# Output file: one JSON response per line
data_demo = open('data_json/data_all_json.txt', 'w', encoding='utf-8')
# Nationwide; start offsets per page: 0, 90, 180, 270, ...
area = ['489']
# All education levels, all internet-industry categories
for are in area:
    # Original note: 10,800 records (1200 pages x 90 per page requests up to 108,000)
    for page in range(1200):
        data = {
            # 'start': '90',
            'start': 90 * page,
            'pageSize': '90',
            'cityId': are,
            'industry': '10100',
            'workExperience': '-1',
            'education': '-1',
            'companyType': '-1',
            'employmentType': '-1',
            'jobWelfareTag': '-1',
            'kt': '3',
        }
        # Fetch the JSON response
        url = 'https://fe-api.zhaopin.com/c/i/sou'
        time.sleep(1)  # pause one second between requests
        html = requests.get(url, params=data, headers=list_headers.headers())
        print(page)
        data_demo.write(json.dumps(html.json()))
        data_demo.write('\n')
data_demo.close()
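
The script above writes one JSON response per line to data_json/data_all_json.txt. A minimal sketch of reading such a file back might look like the following. It assumes only that each non-empty line is a standalone JSON document. The helper name `iter_json_lines` is hypothetical and not part of the archive, and because the layout of each response is not shown here, the sketch does not assume any field names.

```python
import json
from typing import Any, Iterator


def iter_json_lines(path: str) -> Iterator[Any]:
    """Yield one parsed JSON document per non-empty line of `path`.

    Hypothetical helper: it assumes the one-document-per-line layout
    produced by the collection script, and nothing about the fields.
    """
    with open(path, encoding='utf-8') as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # An interrupted run can leave a truncated last line;
                # report it and keep going instead of aborting the read.
                print(f'skipping line {line_no}: {exc}')
```

Reading lazily with a generator keeps memory use flat, which may matter if every page of responses ends up in a single file.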


 Attr            Size     Date    Time   Name
----------- ---------  ---------- -----  ----
     Dir            0  2019-01-11 07:05  zhilian-51job-analysis-master\
     File        1203  2019-01-11 07:05  zhilian-51job-analysis-master\.gitignore
     Dir            0  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\
     File         993  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\get_all_json.py
     File        1085  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\get_page_list.py
     File         779  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\get_page_responsibility.py
     File         761  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\get_require_list.py
     File        3933  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\get_tag.py
     File     2853245  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\job_list.txt
     File     2875795  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\job_list_hlw.txt
     File        3362  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\list_headers.py
     File        2588  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\single_page_header.py
     File        3446  2019-01-11 07:05  zhilian-51job-analysis-master\51job_采集\数据清洗.py
     File       11357  2019-01-11 07:05  zhilian-51job-analysis-master\LICENSE
     File        2715  2019-01-11 07:05  zhilian-51job-analysis-master\README.md
     File       11397  2019-01-11 07:05  zhilian-51job-analysis-master\Text-Rank+TF-IDF.ipynb
     Dir            0  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\
     File       53287  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\LDA4.png
     File       30720  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\cl.png
     File       93016  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\dc.png
     File       38432  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\edu.png
     File       44332  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\exp.png
     File       80013  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\money.png
     File      119790  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\salary.png
     File       66210  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\type.png
     File       79740  2019-01-11 07:05  zhilian-51job-analysis-master\analysis_result\主题挖掘.png
     Dir            0  2019-01-11 07:05  zhilian-51job-analysis-master\excel_data\
     File    23320315  2019-01-11 07:05  zhilian-51job-analysis-master\excel_data\51job.xlsx
     File    12066032  2019-01-11 07:05  zhilian-51job-analysis-master\excel_data\zhilian.xlsx
     File     2388783  2019-01-11 07:05  zhilian-51job-analysis-master\index.html
     Dir            0  2019-01-11 07:05  zhilian-51job-analysis-master\pic\
............ (87 more file entries omitted)
