Abstract: How to use high-level frameworks such as TFLearn and Keras to learn to automatically generate Shakespeare's plays or Nietzsche's philosophical writings.
In the previous section we looked at Tensorflow's high-level API wrappers, which let us build a DNN classifier for the MNIST handwritten-digit problem in just a few simple steps.
Although Tensorflow keeps pushing the Estimator API forward, that is not the whole toolbox. Beyond Tensorflow's official APIs, we also have powerful tools such as TFLearn and Keras.
In this section we take a tour of the arsenal and see what powerful functionality has been wrapped up for us by TFLearn, a high-level framework built specifically for Tensorflow, and by Keras, which runs on several backends including Tensorflow and CNTK.
Earlier we briefly introduced RNNs, which are powerful for handling sequence data. An important advantage of RNNs over other networks is that, after learning from sequence data, they can generate new sequences on their own.
For example, after studying the Three Hundred Tang Poems such a model can write poetry, and after studying the Linux Kernel source it can write C code (although the result basically won't compile).
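To make "self-generation" concrete: a character-level model is trained to predict the next character given the previous few, and generation is just repeatedly sampling from that prediction and sliding the context window forward. Below is a minimal sketch of that loop; predict_next_probs is a made-up placeholder standing in for whatever trained model you have, not a TFLearn or Keras API:

import random

def generate(predict_next_probs, vocab, seed, length=100):
    """Sketch of character-level generation: repeatedly predict a distribution
    over the next character, sample one, and slide the window forward."""
    out = seed
    window = seed
    for _ in range(length):
        probs = predict_next_probs(window)           # list of P(next char) over vocab
        next_char = random.choices(vocab, weights=probs, k=1)[0]
        out += next_char
        window = (window + next_char)[-len(seed):]   # keep a fixed-size context
    return out

# Toy stand-in model: uniform over a tiny vocabulary, just to show the loop runs.
vocab = list("abcdefgh ")
print(generate(lambda w: [1.0 / len(vocab)] * len(vocab), vocab, seed="abc", length=20))

The real examples below do exactly this, except the "predict the next character" part is a trained LSTM network.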
Let's start with an example that automatically writes Shakespearean plays.
Before looking at the code, a few words of caution. Deep learning is fairly demanding about data volume; for generation tasks like this, you generally need on the order of millions to tens of millions of training samples to get good results. Training on just a couple of short novels will surely produce gibberish. Even a human can't learn to write poetry from only a few poems.
The other point is that once the data volume goes up, the demands on training time and compute climb dramatically as well.
For example, training on Shakespeare's plays: the data isn't huge, just over 160,000 lines, but training on a CPU is not something you finish in an hour or two; think in terms of days.
For the image and video examples later on, training on a CPU for months would not be surprising.
So how complex is the code for this example that takes about a day to train? The answer: the core code is only a dozen or so lines, and even with the data handling and test code it comes to only about 50 lines in total.
from __future__ import absolute_import, division, print_function
import os
import pickle
from six.moves import urllib

import tflearn
from tflearn.data_utils import *

path = "shakespeare_input.txt"
char_idx_file = 'char_idx.pickle'

if not os.path.isfile(path):
    urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/shakespeare_input.txt", path)

maxlen = 25

char_idx = None
if os.path.isfile(char_idx_file):
    print('Loading previous char_idx')
    char_idx = pickle.load(open(char_idx_file, 'rb'))

X, Y, char_idx = \
    textfile_to_semi_redundant_sequences(path, seq_maxlen=maxlen, redun_step=3,
                                         pre_defined_char_idx=char_idx)

pickle.dump(char_idx, open(char_idx_file, 'wb'))

g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')

for i in range(50):
    seed = random_sequence_from_textfile(path, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='shakespeare')
    print("-- TESTING...")
    print("-- Test with temperature of 1.0 --")
    print(m.generate(600, temperature=1.0, seq_seed=seed))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(600, temperature=0.5, seq_seed=seed))
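A quick note on the data preparation. Roughly speaking, textfile_to_semi_redundant_sequences cuts the text into overlapping windows of maxlen characters (stepping forward by redun_step) and one-hot encodes them, with the character that follows each window as its label. A tiny illustration of the idea, independent of TFLearn (this is a sketch of the concept, not TFLearn's actual implementation):

# Rough illustration of "semi-redundant sequences": overlapping windows of
# maxlen characters, stepping by redun_step, where the label for each window
# is the character that immediately follows it.
def semi_redundant_sequences(text, maxlen=25, redun_step=3):
    windows, next_chars = [], []
    for i in range(0, len(text) - maxlen, redun_step):
        windows.append(text[i:i + maxlen])
        next_chars.append(text[i + maxlen])
    return windows, next_chars

windows, next_chars = semi_redundant_sequences("To be, or not to be: that is the question")
print(repr(windows[0]), '->', repr(next_chars[0]))  # 'To be, or not to be: that' -> ' '
print(repr(windows[1]), '->', repr(next_chars[1]))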
The example above requires the TFLearn framework, which you can install with:

pip install tflearn
TFLearn is a high-level API framework developed specifically for Tensorflow.
The main benefit of using the TFLearn API is better readability. Take the core code from just now:
g = tflearn.input_data([None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_shakespeare')
It reads straight through: the input data, three LSTM layers interleaved with three dropout layers, and finally a fully connected layer with a softmax activation.
Now let's look at the structure of a network that predicts the survival probability of Titanic passengers:
# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)
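In that snippet, data and labels come from TFLearn's bundled Titanic dataset. If I remember the TFLearn quickstart correctly, the loading and preprocessing look roughly like the sketch below; treat the exact column indices and file name as assumptions and double-check them against the official tutorial:

# Roughly how data/labels in the snippet above are prepared in the TFLearn
# quickstart (column indices and helper names are from memory).
import numpy as np
from tflearn.datasets import titanic
from tflearn.data_utils import load_csv

titanic.download_dataset('titanic_dataset.csv')
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)

def preprocess(passengers, columns_to_delete):
    # Drop unused columns (name and ticket) and encode sex as 0/1.
    for column in sorted(columns_to_delete, reverse=True):
        [passenger.pop(column) for passenger in passengers]
    for passenger in passengers:
        passenger[1] = 1. if passenger[1] == 'female' else 0.
    return np.array(passengers, dtype=np.float32)

data = preprocess(data, columns_to_delete=[1, 6])  # leaves 6 features per passenger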
Your Shakespeare model is probably still training at this point. Since we're waiting anyway, let's look at the generation process with a simpler example.
Again we take an official TFLearn example: read a list of major US city names and generate some new city names.
Take the cities beginning with Z as a sample of the data:
Zachary
Zafra
Zag
Zahl
Zaleski
Zalma
Zama
Zanesfield
Zanesville
Zap
Zapata
Zarah
Zavalla
Zearing
Zebina
Zebulon
Zeeland
Zeigler
Zela
Zelienople
Zell
Zellwood
Zemple
Zena
Zenda
Zenith
Zephyr
Zephyr Cove
Zephyrhills
Zia Pueblo
Zillah
Zilwaukee
Zim
Zimmerman
Zinc
Zion
Zionsville
Zita
Zoar
Zolfo Springs
Zona
Zumbro Falls
Zumbrota
Zuni
Zurich
Zwingle
Zwolle
There are 20,580 cities in total. This trains much faster: on a CPU alone, one epoch takes roughly 5 to 6 minutes.
The code is below, and it's essentially the same as the Shakespeare example above:
from __future__ import absolute_import, division, print_function
import os
from six import moves
import ssl

import tflearn
from tflearn.data_utils import *

path = "US_Cities.txt"
if not os.path.isfile(path):
    context = ssl._create_unverified_context()
    moves.urllib.request.urlretrieve("https://raw.githubusercontent.com/tflearn/tflearn.github.io/master/resources/US_Cities.txt", path, context=context)

maxlen = 20

string_utf8 = open(path, "r").read().decode('utf-8')
X, Y, char_idx = \
    string_to_semi_redundant_sequences(string_utf8, seq_maxlen=maxlen, redun_step=3)

g = tflearn.input_data(shape=[None, maxlen, len(char_idx)])
g = tflearn.lstm(g, 512, return_seq=True)
g = tflearn.dropout(g, 0.5)
g = tflearn.lstm(g, 512)
g = tflearn.dropout(g, 0.5)
g = tflearn.fully_connected(g, len(char_idx), activation='softmax')
g = tflearn.regression(g, optimizer='adam', loss='categorical_crossentropy',
                       learning_rate=0.001)

m = tflearn.SequenceGenerator(g, dictionary=char_idx,
                              seq_maxlen=maxlen,
                              clip_gradients=5.0,
                              checkpoint_path='model_us_cities')

for i in range(40):
    seed = random_sequence_from_string(string_utf8, maxlen)
    m.fit(X, Y, validation_set=0.1, batch_size=128,
          n_epoch=1, run_id='us_cities')
    print("-- TESTING...")
    print("-- Test with temperature of 1.2 --")
    print(m.generate(30, temperature=1.2, seq_seed=seed).encode('utf-8'))
    print("-- Test with temperature of 1.0 --")
    print(m.generate(30, temperature=1.0, seq_seed=seed).encode('utf-8'))
    print("-- Test with temperature of 0.5 --")
    print(m.generate(30, temperature=0.5, seq_seed=seed).encode('utf-8'))
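The temperature argument controls how adventurous the sampling is: the predicted probabilities are sharpened (temperature below 1, safer and more repetitive output) or flattened (temperature above 1, more surprising and more error-prone output) before each character is drawn. A minimal sketch of that rescaling, which is the same idea as the sample helper in the Keras example later on:

import numpy as np

def apply_temperature(probs, temperature=1.0):
    # Rescale a probability distribution: take logs, divide by temperature,
    # then re-normalize with a softmax.
    probs = np.asarray(probs, dtype='float64')
    logits = np.log(probs) / temperature
    exp = np.exp(logits)
    return exp / np.sum(exp)

p = [0.6, 0.3, 0.1]
print(apply_temperature(p, 0.5))  # sharper: the most likely character dominates
print(apply_temperature(p, 1.2))  # flatter: rarer characters get picked more often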
Here are the city names generated after the first epoch of training:
t and Shoot Cuthbertd Lettfrecv El Ceoneel Sutd Sa
After the second epoch:
stle Finchford Finch Dasthond madloogd Wlaycoyarfw
After the third epoch:
averal Cape Carteret Acbiropa Heowar Sor Dittoy Do
After the tenth epoch:
hoenchen Schofield Stcojos Schabell StcaKnerum Cri
Keras is an API that works across multiple backends, such as Tensorflow and Microsoft's CNTK.
It can be installed with:

pip install keras

Once Tensorflow is installed, Keras will pick Tensorflow as its backend.
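If you want to confirm (or override) which backend Keras picked up: the multi-backend Keras of this era reads a "backend" field from ~/.keras/keras.json and also honors the KERAS_BACKEND environment variable, so something like the following sketch should work:

import os
# Optionally force a backend before importing keras (the chosen backend must be
# installed); without this, keras falls back to the "backend" field in
# ~/.keras/keras.json.
os.environ['KERAS_BACKEND'] = 'tensorflow'

from keras import backend as K
print(K.backend())  # e.g. 'tensorflow'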
Let's also look at a text-generation example in Keras. The official example generates sentences in the style of Nietzsche.
The core is just six statements:
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
The full code is below; feel free to run it and play with it. If Nietzsche doesn't interest you, you can swap in other text. Just note, as the script's docstring says, that whatever text you use should stay above roughly 100,000 characters, and ideally above 1,000,000.
'''Example script to generate text from Nietzsche's writings.

At least 20 epochs are required before the generated text
starts sounding coherent.

It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.

If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.
'''
from __future__ import print_function
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys
import io

path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)


def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])
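One practical note: the full 60 epochs take a long time, so you may want to save the trained model and generate text later without retraining. In standard Keras that is just model.save and load_model (requires h5py; the file name below is only an example):

# After (or during) training, persist the model so you can generate later
# without retraining. 'nietzsche_lstm.h5' is an arbitrary example file name.
model.save('nietzsche_lstm.h5')

# In a later session (chars / char_indices / indices_char must be rebuilt the
# same way as in the training script before calling sample()):
from keras.models import load_model
model = load_model('nietzsche_lstm.h5')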
Author of this article: lusing
This article is original content from the Yunqi Community and may not be reproduced without permission.