Size: 1.91 MB | File type: .zip | Coins: 1 | Downloads: 0 | Published: 2023-11-18
- Language: Python
Resource overview
Goose3 - an article extractor written in Python
Code snippet and file information
# -*- coding: utf-8 -*-
"""\
This is a python port of "Goose" originally licensed to Gravity.com
under one or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.

Python port was written by Xavier Grangier for Recrutae

Gravity.com licenses this file
to you under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
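import os
# Note: the imp module is deprecated and was removed in Python 3.12.
from imp import load_source
from setuptools import (setup, find_packages)


def read_file(filepath):
    ''' read the file '''
    with open(filepath, 'r') as filepointer:
        res = filepointer.read()
    return res


# Load goose3/version.py as a module so that __version__ can be read below.
version = load_source("version", os.path.join("goose3", "version.py"))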
CLASSIFIERS = [
    'Development Status :: 4 - Beta',
    'Environment :: Other Environment',
    'Intended Audience :: Developers',
    'License :: OSI Approved :: Apache Software License',
    'Operating System :: MacOS :: MacOS X',
    'Operating System :: POSIX',
    'Operating System :: Microsoft :: Windows',
    'Programming Language :: Python',
    'Programming Language :: Python :: 3',
    'Programming Language :: Python :: 3.4',
    'Programming Language :: Python :: 3.5',
    'Programming Language :: Python :: 3.6',
    'Topic :: Internet',
    'Topic :: Utilities',
    'Topic :: Software Development :: Libraries :: Python Modules']
description = "Html Content / Article Extractor, web scrapping for Python3"

dependencies = read_file('./requirements/python').splitlines()
test_dependencies = read_file('./requirements/python-dev').splitlines()

# read long description
try:
    long_description = read_file('README.rst')
except Exception:
    long_description = description
setup(
    name='goose3',
    version=version.__version__,
    description=description,
    long_description=long_description,
    keywords='scrapping, extractor, web scrapping',
    classifiers=CLASSIFIERS,
    maintainer='Mahmoud Lababidi',
    maintainer_email='lababidi+py@gmail.com',
    url='https://github.com/goose3/goose3',
    license='Apache',
    packages=find_packages(exclude=['tests']),
    package_data={'goose3': ['resources/images/*.txt', 'resources/text/*.txt',
                             'requirements/python']},
    include_package_data=True,
    zip_safe=False,
    install_requires=dependencies,
    # Kept as in the source; the keyword documented by setuptools is tests_require.
    test_requires=test_dependencies,
    test_suite="tests"
)
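
The setup script above reads the package version through `imp.load_source`, and the `imp` module is no longer available from Python 3.12 onward. The following is a minimal sketch, not part of the archive, of how the same lookup could be done with `importlib.util` from the standard library. The helper names `load_source` and `read_version` are illustrative assumptions. Unlike `imp.load_source`, this sketch does not register the loaded module in `sys.modules`.

```python
import importlib.util
import os


def load_source(module_name, filepath):
    """Load a Python source file as a module (hypothetical stand-in for imp.load_source)."""
    spec = importlib.util.spec_from_file_location(module_name, filepath)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load {!r} from {!r}".format(module_name, filepath))
    module = importlib.util.module_from_spec(spec)
    # Execute the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module


def read_version(package_dir="goose3"):
    """Return __version__ from <package_dir>/version.py, the path used in the setup script."""
    version_file = os.path.join(package_dir, "version.py")
    return load_source("version", version_file).__version__
```

With helpers like these, the `version=` argument of `setup()` could be given `read_version()` instead of `version.__version__`, assuming the script is run from the project root where the `goose3` directory lives.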
Attribute     Size  Date        Time   Name
---------  -------  ----------  -----  ----
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\
File          1182  2018-10-20  16:27  goose3-goose3-8de054f\.gitignore
File         12390  2018-10-20  16:27  goose3-goose3-8de054f\.pylintrc
File           287  2018-10-20  16:27  goose3-goose3-8de054f\.travis.yml
File          4957  2018-10-20  16:27  goose3-goose3-8de054f\CHANGELOG.md
File          5573  2018-10-20  16:27  goose3-goose3-8de054f\CONTRIBUTING.md
File         10850  2018-10-20  16:27  goose3-goose3-8de054f\LICENSE.txt
File          9166  2018-10-20  16:27  goose3-goose3-8de054f\README.rst
File           153  2018-10-20  16:27  goose3-goose3-8de054f\THANKS
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\
File           608  2018-10-20  16:27  goose3-goose3-8de054f\docs\Makefile
File           808  2018-10-20  16:27  goose3-goose3-8de054f\docs\make.bat
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\
File           520  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\__init__.py
File          3873  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\breadcrumbs.html
File          1977  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\footer.html
File          7535  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\layout.html
File          1530  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\search.html
File           365  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\searchbox.html
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\css\
File          3358  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\css\badge_only.css
File        116310  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\css\theme.css
Dir              0  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\
File        134808  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\FontAwesome.otf
File        165742  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\fontawesome-webfont.eot
File        444379  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\fontawesome-webfont.svg
File        165548  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\fontawesome-webfont.ttf
File         98024  2018-10-20  16:27  goose3-goose3-8de054f\docs\source\_themes\custom_theme\static\fonts\fontawesome-webfont.woff
............ (282 more file entries omitted here)