-
大小: 6.02MB文件类型: .zip金币: 2下载: 0 次发布日期: 2023-08-12
- 语言: Python
- 标签: chatbot tensorflow python
资源简介
python3, tensorflow >= 1.3
简单的英文聊天机器人基于深度学习seq2seq, 可以直接跑,结果不是很准确

代码片段和文件信息
#! /usr/bin/python
# -*- coding: utf8 -*-
“““Sequence to Sequence Learning for Twitter/Cornell Chatbot.
References
----------
http://suriyadeepan.github.io/2016-12-31-practical-seq2seq/
“““
import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import *
import tensorflow as tf
import numpy as np
import time
###============= prepare data
from data.twitter import data
metadata idx_q idx_a = data.load_data(PATH=‘data/twitter/‘) # Twitter
# from data.cornell_corpus import data
# metadata idx_q idx_a = data.load_data(PATH=‘data/cornell_corpus/‘) # Cornell Moive
(trainX trainY) (testX testY) (validX validY) = data.split_dataset(idx_q idx_a)
trainX = trainX.tolist()
trainY = trainY.tolist()
testX = testX.tolist()
testY = testY.tolist()
validX = validX.tolist()
validY = validY.tolist()
trainX = tl.prepro.remove_pad_sequences(trainX)
trainY = tl.prepro.remove_pad_sequences(trainY)
testX = tl.prepro.remove_pad_sequences(testX)
testY = tl.prepro.remove_pad_sequences(testY)
validX = tl.prepro.remove_pad_sequences(validX)
validY = tl.prepro.remove_pad_sequences(validY)
###============= parameters
xseq_len = len(trainX)#.shape[-1]
yseq_len = len(trainY)#.shape[-1]
assert xseq_len == yseq_len
batch_size = 32
n_step = int(xseq_len/batch_size)
xvocab_size = len(metadata[‘idx2w‘]) # 8002 (0~8001)
emb_dim = 1024
w2idx = metadata[‘w2idx‘] # dict word 2 index
idx2w = metadata[‘idx2w‘] # list index 2 word
unk_id = w2idx[‘unk‘] # 1
pad_id = w2idx[‘_‘] # 0
start_id = xvocab_size # 8002
end_id = xvocab_size+1 # 8003
w2idx.update({‘start_id‘: start_id})
w2idx.update({‘end_id‘: end_id})
idx2w = idx2w + [‘start_id‘ ‘end_id‘]
xvocab_size = yvocab_size = xvocab_size + 2
“““ A data for Seq2Seq should look like this:
input_seqs : [‘how‘ ‘are‘ ‘you‘ ‘]
decode_seqs : [‘‘ ‘I‘ ‘am‘ ‘fine‘ ‘]
target_seqs : [‘I‘ ‘am‘ ‘fine‘ ‘‘ ‘]
target_mask : [1 1 1 1 0]
“““
print(“encode_seqs“ [idx2w[id] for id in trainX[10]])
target_seqs = tl.prepro.sequences_add_end_id([trainY[10]] end_id=end_id)[0]
# target_seqs = tl.prepro.remove_pad_sequences([target_seqs] pad_id=pad_id)[0]
print(“target_seqs“ [idx2w[id] for id in target_seqs])
decode_seqs = tl.prepro.sequences_add_start_id([trainY[10]] start_id=start_id remove_last=False)[0]
# decode_seqs = tl.prepro.remove_pad_sequences([decode_seqs] pad_id=pad_id)[0]
print(“decode_seqs“ [idx2w[id] for id in decode_seqs])
target_mask = tl.prepro.sequences_get_mask([target_seqs])[0]
print(“target_mask“ target_mask)
print(len(target_seqs) len(decode_seqs) len(target_mask))
###============= model
def model(encode_seqs decode_seqs is_train=True reuse=False):
with tf.variable_scope(“model“ reuse=reuse):
# for chatbot you can use the same embedding layer
# for translation you may want to use 2 seperated embedding layers
with tf.variable_scope(“embedding“) as vs:
属性 大小 日期 时间 名称
----------- --------- ---------- ----- ----
目录 0 2017-11-05 23:01 seq2seq-chatbot-master\
文件 156 2017-11-05 23:01 seq2seq-chatbot-master\.gitignore
文件 1247 2017-11-05 23:01 seq2seq-chatbot-master\README.md
目录 0 2017-11-05 23:01 seq2seq-chatbot-master\data\
文件 91 2017-11-05 23:01 seq2seq-chatbot-master\data\__init__.py
目录 0 2017-11-05 23:01 seq2seq-chatbot-master\data\cornell_corpus\
文件 11453 2017-11-05 23:01 seq2seq-chatbot-master\data\cornell_corpus\data.py
目录 0 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\
文件 7459 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\data.py
文件 10433840 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\idx_a.npy
文件 10433840 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\idx_q.npy
文件 2877112 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\me
文件 119 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\pull
文件 101 2017-11-05 23:01 seq2seq-chatbot-master\data\twitter\pull_raw_data
文件 9656 2017-11-05 23:01 seq2seq-chatbot-master\main_simple_seq2seq.py
相关资源
- 二级考试python试题12套(包括选择题和
- pywin32_python3.6_64位
- python+ selenium教程
- PycURL(Windows7/Win32)Python2.7安装包 P
- 英文原版-Scientific Computing with Python
- 7.图像风格迁移 基于深度学习 pyt
- 基于Python的学生管理系统
- A Byte of Python(简明Python教程)(第
- Python实例174946
- Python 人脸识别
- Python 人事管理系统
- 基于python-flask的个人博客系统
- 计算机视觉应用开发流程
- python 调用sftp断点续传文件
- python socket游戏
- 基于Python爬虫爬取天气预报信息
- python函数编程和讲解
- Python开发的个人博客
- 基于python的三层神经网络模型搭建
- python实现自动操作windows应用
- python人脸识别(opencv)
- python 绘图(方形、线条、圆形)
- python疫情卡UN管控
- python 连连看小游戏源码
- 基于PyQt5的视频播放器设计
- 一个简单的python爬虫
- csv文件行列转换python实现代码
- Python操作Mysql教程手册
- DeepLabV3-Tensorflow-master
- Python Machine Learning Case Studies
评论
共有 条评论