- Size: 5 KB
- File type: .zip
- Coins: 1
- Downloads: 0
- Published: 2021-01-09
- Language: Python
Resource description
A baseline for Baidu's 2019 entity linking competition (CCKS 2019).
Code snippet and file information
#! -*- coding: utf-8 -*-
# A baseline for Baidu's 2019 entity linking competition (ccks2019, https://biendata.com/competition/ccks_2019_el/)
import json
from tqdm import tqdm
import os
import numpy as np
from random import choice
from itertools import groupby

mode = 0
min_count = 2
char_size = 128

# Map each knowledge-base subject_id to its aliases and a flattened description.
id2kb = {}
with open('../ccks2019_el/kb_data') as f:
    for l in tqdm(f):
        _ = json.loads(l)
        subject_id = _['subject_id']
        # Canonical name plus aliases, deduplicated, then lowercased.
        subject_alias = list(set([_['subject']] + _.get('alias', [])))
        subject_alias = [alias.lower() for alias in subject_alias]
        # One "predicate:object" line per attribute, lowercased.
        subject_desc = '\n'.join(u'%s:%s' % (i['predicate'], i['object']) for i in _['data'])
        subject_desc = subject_desc.lower()
        if subject_desc:
            id2kb[subject_id] = {'subject_alias': subject_alias, 'subject_desc': subje
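The knowledge-base loading step in the preview above can be tried without the competition files. The following is a minimal sketch, not the author's code: `load_kb` is a hypothetical helper name, the two sample records are invented for illustration, and aliases are lowercased before deduplication (the preview deduplicates first), so the alias list here contains no case-insensitive duplicates. It assumes only the record fields visible in the preview (`subject_id`, `subject`, `alias`, `data`, `predicate`, `object`).

```python
import io
import json


def load_kb(lines):
    """Build a subject_id -> {'subject_alias', 'subject_desc'} map from JSON lines."""
    id2kb = {}
    for line in lines:
        record = json.loads(line)
        # Merge the canonical name with any aliases, lowercased and deduplicated.
        names = [record['subject']] + record.get('alias', [])
        aliases = sorted({name.lower() for name in names})
        # Flatten the predicate/object pairs into one lowercase description string.
        desc = '\n'.join(
            '%s:%s' % (item['predicate'], item['object']) for item in record['data']
        ).lower()
        # Entities with an empty description are skipped, as in the preview.
        if desc:
            id2kb[record['subject_id']] = {
                'subject_alias': aliases,
                'subject_desc': desc,
            }
    return id2kb


# Hypothetical records for illustration only; not taken from the competition data.
sample = io.StringIO(
    json.dumps({'subject_id': '1', 'subject': 'Python', 'alias': ['python', 'Py'],
                'data': [{'predicate': 'Type', 'object': 'Language'}]}) + '\n' +
    json.dumps({'subject_id': '2', 'subject': 'Empty', 'data': []}) + '\n'
)
kb = load_kb(sample)
```

Passing any iterable of JSON lines (an open file handle works the same way as the `io.StringIO` object here) keeps the function independent of the `../ccks2019_el/kb_data` path used in the preview.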
Attribute   Size       Date        Time   Name
----------- ---------  ----------  -----  ----
Directory   0          2019-05-21  02:01  el-2019-ba
File        2516       2019-05-21  02:01  el-2019-ba
File        9894       2019-05-21  02:01  el-2019-ba