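Resource overview
Machine translation implemented with an LSTM. The package includes a tutorial and exercises, which makes it well suited for learning.
Code snippet and file information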
import os
import pickle
import copy
import numpy as np


# Special tokens and their reserved ids. The token names were stripped from
# this page during extraction; the conventional names below are an assumption.
CODES = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3}


def load_data(path):
    """
    Load dataset from file
    """
    input_file = os.path.join(path)
    with open(input_file, 'r', encoding='utf-8') as f:
        return f.read()


def preprocess_and_save_data(source_path, target_path, text_to_ids):
    """
    Preprocess text data and save it to file
    """
    # Preprocess
    source_text = load_data(source_path)
    target_text = load_data(target_path)

    source_text = source_text.lower()
    target_text = target_text.lower()

    source_vocab_to_int, source_int_to_vocab = create_lookup_tables(source_text)
    target_vocab_to_int, target_int_to_vocab = create_lookup_tables(target_text)

    source_text, target_text = text_to_ids(
        source_text, target_text, source_vocab_to_int, target_vocab_to_int)

    # Save data
    with open('preprocess.p', 'wb') as out_file:
        pickle.dump((
            (source_text, target_text),
            (source_vocab_to_int, target_vocab_to_int),
            (source_int_to_vocab, target_int_to_vocab)), out_file)


def load_preprocess():
    """
    Load the preprocessed training data from preprocess.p
    """
    with open('preprocess.p', mode='rb') as in_file:
        return pickle.load(in_file)


def create_lookup_tables(text):
    """
    Create lookup tables for vocabulary
    """
    vocab = set(text.split())
    vocab_to_int = copy.copy(CODES)

    # Number the words after the ids reserved for the special tokens
    for v_i, v in enumerate(vocab, len(CODES)):
        vocab_to_int[v] = v_i

    int_to_vocab = {v_i: v for v, v_i in vocab_to_int.items()}

    return vocab_to_int, int_to_vocab


def save_params(params):
    """
    Save parameters to file
    """
    with open('params.p', 'wb') as out_file:
        pickle.dump(params, out_file)


def load_params():
    """
    Load parameters from file
    """
    with open('params.p', mode='rb') as in_file:
        return pickle.load(in_file)


def batch_data(source, target, batch_size):
    """
    Batch source and target together
    """
    # Only full batches are yielded; a trailing partial batch is dropped
    for batch_i in range(0, len(source) // batch_size):
        start_i = batch_i * batch_size
        source_batch = source[start_i:start_i + batch_size]
        target_batch = target[start_i:start_i + batch_size]
        yield (np.array(pad_sentence_batch(source_batch)),
               np.array(pad_sentence_batch(target_batch)))


def pad_sentence_batch(sentence_batch):
    """
    Pad each sentence with the <PAD> id up to the longest sentence in the batch
    """
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [CODES['<PAD>']] * (max_sentence - len(sentence))
            for sentence in sentence_batch]
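
The listing above passes a text_to_ids callback into preprocess_and_save_data but does not define it. The sketch below shows one way such a callback might look. It is not taken from the original package: the function body, the EOS_ID and UNK_ID constants, and the tiny vocabularies are all assumptions made for illustration, including the special-token names.

```python
# Hypothetical sketch of a text_to_ids callback; not part of the original code.
EOS_ID = 1  # assumed id of the end-of-sentence token
UNK_ID = 2  # assumed id of the unknown-word token


def text_to_ids(source_text, target_text, source_vocab_to_int, target_vocab_to_int):
    """Convert newline-separated sentences into lists of word ids."""
    # Words missing from the vocabulary fall back to the unknown-word id.
    source_ids = [
        [source_vocab_to_int.get(word, UNK_ID) for word in line.split()]
        for line in source_text.split('\n')
    ]
    # Each target sentence gets the end-of-sentence id appended so that a
    # decoder can learn where a translation stops.
    target_ids = [
        [target_vocab_to_int.get(word, UNK_ID) for word in line.split()] + [EOS_ID]
        for line in target_text.split('\n')
    ]
    return source_ids, target_ids


# Toy vocabularies (assumed); ids 0-3 are reserved for the special tokens.
source_vocab = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3, 'hello': 4, 'world': 5}
target_vocab = {'<PAD>': 0, '<EOS>': 1, '<UNK>': 2, '<GO>': 3, 'bonjour': 4, 'monde': 5}

source_ids, target_ids = text_to_ids(
    'hello world\nhello there', 'bonjour monde\nbonjour',
    source_vocab, target_vocab)

print(source_ids)  # [[4, 5], [4, 2]]
print(target_ids)  # [[4, 5, 1], [4, 1]]
```

The resulting lists have the shape that batch_data and pad_sentence_batch expect: one list of integer ids per sentence, with lengths that may differ before padding.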
Attribute       Size  Date        Time   Name
---------  ---------  ----------  -----  ----
File             279  2018-05-19  14:21  dlnd_language_translation\.git\config
File              73  2018-05-19  14:20  dlnd_language_translation\.git\desc
File              23  2018-05-19  14:21  dlnd_language_translation\.git\HEAD
File             478  2018-05-19  14:20  dlnd_language_translation\.git\hooks\applypatch-msg.sample
File             896  2018-05-19  14:20  dlnd_language_translation\.git\hooks\commit-msg.sample
File             189  2018-05-19  14:20  dlnd_language_translation\.git\hooks\post-update.sample
File             424  2018-05-19  14:20  dlnd_language_translation\.git\hooks\pre-applypatch.sample
File            1642  2018-05-19  14:20  dlnd_language_translation\.git\hooks\pre-commit.sample
File            1348  2018-05-19  14:20  dlnd_language_translation\.git\hooks\pre-push.sample
File            4898  2018-05-19  14:20  dlnd_language_translation\.git\hooks\pre-reba
File             544  2018-05-19  14:20  dlnd_language_translation\.git\hooks\pre-receive.sample
File            1239  2018-05-19  14:20  dlnd_language_translation\.git\hooks\prepare-commit-msg.sample
File            3610  2018-05-19  14:20  dlnd_language_translation\.git\hooks\update.sample
File             963  2018-05-19  14:21  dlnd_language_translation\.git\index
File             240  2018-05-19  14:20  dlnd_language_translation\.git\info\exclude
File             194  2018-05-19  14:21  dlnd_language_translation\.git\logs\HEAD
File             194  2018-05-19  14:21  dlnd_language_translation\.git\logs\refs\heads\master
File             194  2018-05-19  14:21  dlnd_language_translation\.git\logs\refs\remotes\origin\HEAD
File             118  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             648  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             277  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             183  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             185  2018-05-19  14:20  dlnd_language_translation\.git\ob
File         1935397  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             802  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             986  2018-05-19  14:21  dlnd_language_translation\.git\ob
File             246  2018-05-19  14:21  dlnd_language_translation\.git\ob
File           19417  2018-05-19  14:21  dlnd_language_translation\.git\ob
File              54  2018-05-19  14:20  dlnd_language_translation\.git\ob
File             105  2018-05-19  14:21  dlnd_language_translation\.git\ob
............ 73 more file entries omitted here