TensorFlow Notes 3: CRF functions: tf.contrib.crf.crf_log_likelihood()

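While reading through some training code I ran into tf.contrib.crf.crf_log_likelihood, so I wanted to get a basic understanding of it.

Purpose: compute the loss with a CRF. The training criterion is maximum likelihood estimation: the negative log-likelihood of the gold tag sequences is what gets minimized.

Usage: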

tf.contrib.crf.crf_log_likelihood(inputs, tag_indices, sequence_lengths, transition_params=None)

See the guide: CRF (contrib)

Computes the log-likelihood of tag sequences in a CRF.

Args:

inputs: A [batch_size, max_seq_len, num_tags] tensor of unary potentials to use as input to the CRF layer.

tag_indices: A [batch_size, max_seq_len] matrix of tag indices for which we compute the log-likelihood.

sequence_lengths: A [batch_size] vector of true sequence lengths.

transition_params: A [num_tags, num_tags] transition matrix, if available.

Returns:

log_likelihood: A scalar containing the log-likelihood of the given sequence of tag indices.

transition_params: A [num_tags, num_tags] transition matrix. This is either provided by the caller or created in this function.

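Function walkthrough:

1. tf.contrib.crf.crf_log_likelihood

crf_log_likelihood(inputs, tag_indices, sequence_lengths, transition_params=None)

Computes the log-likelihood of tag sequences in a conditional random field.

Parameters:

inputs: a tensor of shape [batch_size, max_seq_len, num_tags]. Typically this is the output of a BiLSTM, projected and reshaped to the required shape, and used as the input to the CRF layer.

tag_indices: a matrix of shape [batch_size, max_seq_len]; these are the gold (true) tags.

sequence_lengths: a vector of shape [batch_size] giving the true length of each sequence.

transition_params: the transition matrix, of shape [num_tags, num_tags].

Returns:

log_likelihood: the log-likelihood of the gold tag sequences. The docstring quoted above calls it a scalar, but in practice it holds one value per sequence (shape [batch_size]), which is why the example at the end of this note applies tf.reduce_mean to it.

transition_params: the transition matrix, of shape [num_tags, num_tags].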
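To make "log-likelihood of a tag sequence" concrete, here is a minimal sketch for a single sequence, written in plain Python so that it runs without TensorFlow. It assumes the usual linear-chain CRF definition: the score of a path is the sum of its unary scores and transition scores, and the log-likelihood is that score minus the log of the summed exponentiated scores of all paths. The function names (logsumexp, path_score, log_partition, log_likelihood_sketch) and the toy numbers are made up for illustration and are not part of tf.contrib.crf.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(unary, transitions, tags):
    """Unary scores along the path plus transition scores between neighbours."""
    score = sum(unary[t][tag] for t, tag in enumerate(tags))
    score += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return score


def log_partition(unary, transitions):
    """Forward algorithm: log of the summed exp(path_score) over all tag paths."""
    num_tags = len(unary[0])
    alpha = list(unary[0])  # alpha[j]: log-sum of all partial paths ending in tag j
    for t in range(1, len(unary)):
        alpha = [
            unary[t][j]
            + logsumexp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            for j in range(num_tags)
        ]
    return logsumexp(alpha)


def log_likelihood_sketch(unary, transitions, tags):
    """Log-probability of one tag path under the CRF."""
    return path_score(unary, transitions, tags) - log_partition(unary, transitions)


# Toy data (hypothetical): seq_len = 3, num_tags = 2.
unary = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]  # [seq_len, num_tags]
transitions = [[0.1, -0.4], [0.6, 0.2]]       # [num_tags, num_tags]
gold_tags = [0, 1, 1]

ll = log_likelihood_sketch(unary, transitions, gold_tags)
print("log-likelihood:", ll)

# Sanity check: the probabilities of all 2**3 paths should sum to 1.
all_paths = list(itertools.product(range(2), repeat=3))
total_probability = sum(
    math.exp(log_likelihood_sketch(unary, transitions, p)) for p in all_paths
)
print("sum of path probabilities:", total_probability)
```

Because the log-likelihood is a log-probability it is never positive, so the training loss is its negative.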

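2. tf.contrib.crf.viterbi_decode

viterbi_decode(score, transition_params)
Put simply, it returns the best-scoring tag sequence. It is meant for test time only and decodes outside the TensorFlow graph, working on one sequence at a time.

Parameters:

score: a matrix of unary potentials, of shape [seq_len, num_tags].

transition_params: the transition matrix, of shape [num_tags, num_tags].

Returns:

viterbi: a list of length [seq_len] holding the tag indices of the highest-scoring sequence.

viterbi_score: a float containing the score of the Viterbi sequence.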
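The decoding step can be sketched in plain Python as follows. This is an illustrative implementation of the standard Viterbi algorithm under the input and output shapes described above, not the library's source; the name viterbi_decode_sketch and the toy numbers are assumptions.

```python
def viterbi_decode_sketch(score, transition_params):
    """Return (best_path, best_score) for a single sequence.

    score: seq_len rows, each holding num_tags unary scores.
    transition_params: num_tags x num_tags; entry [i][j] scores moving
        from tag i to tag j.
    """
    num_tags = len(score[0])
    best = list(score[0])  # best[j]: best score of any path ending in tag j
    backpointers = []
    for t in range(1, len(score)):
        new_best, pointers = [], []
        for j in range(num_tags):
            candidates = [best[i] + transition_params[i][j] for i in range(num_tags)]
            i_best = max(range(num_tags), key=candidates.__getitem__)
            pointers.append(i_best)
            new_best.append(candidates[i_best] + score[t][j])
        best = new_best
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    last = max(range(num_tags), key=best.__getitem__)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy data (hypothetical): seq_len = 3, num_tags = 2.
score = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]
transition_params = [[0.1, -0.4], [0.6, 0.2]]

viterbi, viterbi_score = viterbi_decode_sketch(score, transition_params)
print(viterbi, viterbi_score)
```

Dynamic programming keeps the cost at seq_len * num_tags^2 operations instead of enumerating all num_tags^seq_len paths.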

3. tf.contrib.crf.crf_decode

crf_decode(potentials, transition_params, sequence_length)
Decodes inside the TensorFlow graph.

Parameters:

potentials: a tensor of shape [batch_size, max_seq_len, num_tags].

transition_params: the transition matrix, of shape [num_tags, num_tags].

sequence_length: a vector of shape [batch_size] giving the length of each sequence in the batch.

Returns:

decode_tags: a tensor of shape [batch_size, max_seq_len] and type tf.int32 holding the best tag sequence for each example.

best_score: a tensor of shape [batch_size] holding the score of each decoded sequence.

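Reposted from Zhihu:

If what you need to predict is a sequence, you can build the loss from crf_log_likelihood (its negative is what you minimize).

crf_log_likelihood(inputs, tag_indices, sequence_lengths, transition_params=None)

Inputs:

inputs: the unary potentials, i.e. the predicted score of each tag at each position (unnormalized scores rather than true probabilities). How you compute them depends on your setup; a CNN, an RNN, and so on all work.

tag_indices: the gold tag sequences.

sequence_lengths: the true length of each sample. Sequences are padded so that they line up, and the true lengths are passed here so that the padding is ignored.

transition_params: the transition scores. Optional; if omitted, the function creates the matrix itself.

Outputs:

log_likelihood

transition_params: the transition scores. If none were passed in, the function creates the matrix and returns it.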

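Author: Zhihu user
Link: https://www.zhihu.com/question/57666556/answer/326803900
Source: Zhihu
Copyright belongs to the author. For commercial reposting, contact the author for permission; for non-commercial reposting, credit the source.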
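The padding described in the quoted answer can be sketched like this. The helper pad_batch and its pad_value argument are hypothetical names introduced for illustration; the sketch only shows how variable-length tag sequences become a [batch_size, max_seq_len] matrix plus a [batch_size] vector of true lengths.

```python
def pad_batch(tag_sequences, pad_value=0):
    """Pad variable-length tag sequences to a common length.

    Returns (padded, sequence_lengths), matching the shapes of the
    tag_indices and sequence_lengths arguments.
    """
    sequence_lengths = [len(seq) for seq in tag_sequences]
    max_seq_len = max(sequence_lengths)
    padded = [
        list(seq) + [pad_value] * (max_seq_len - len(seq))
        for seq in tag_sequences
    ]
    return padded, sequence_lengths


batch = [[2, 0, 1], [4], [3, 3]]
tag_indices, sequence_lengths = pad_batch(batch)
print(tag_indices)       # [[2, 0, 1], [4, 0, 0], [3, 3, 0]]
print(sequence_lengths)  # [3, 1, 2]
```

Note that the padding value 0 is also a valid tag index, so the padded matrix alone cannot tell real tags from padding; that is the reason the true lengths are passed separately.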

The official example code, showing how to train and decode with a CRF:

# coding=utf8
import numpy as np
import tensorflow as tf

# Data settings.
num_examples = 10
num_words = 20
num_features = 100
num_tags = 5  # 5 tags

# Random features. x shape = [10, 20, 100]
x = np.random.rand(num_examples, num_words, num_features).astype(np.float32)

# Random tag indices representing the gold sequence. y shape = [10, 20]
y = np.random.randint(num_tags, size=[num_examples, num_words]).astype(np.int32)

# True sequence lengths: [19, 19, 19, 19, 19, 19, 19, 19, 19, 19]
sequence_lengths = np.full(num_examples, num_words - 1, dtype=np.int32)

# Train and evaluate the model.
with tf.Graph().as_default():
    with tf.Session() as session:
        # Add the data to the TensorFlow graph.
        x_t = tf.constant(x)  # observation sequences
        y_t = tf.constant(y)  # tag sequences
        sequence_lengths_t = tf.constant(sequence_lengths)

        # Compute unary scores from a linear layer.
        # weights shape = [100, 5]
        weights = tf.get_variable("weights", [num_features, num_tags])
        # matricized_x_t shape = [200, 100]
        matricized_x_t = tf.reshape(x_t, [-1, num_features])
        # [200, 100] x [100, 5] gives [200, 5]
        matricized_unary_scores = tf.matmul(matricized_x_t, weights)
        # unary_scores shape = [10, 20, 5]
        unary_scores = tf.reshape(matricized_unary_scores,
                                  [num_examples, num_words, num_tags])

        # Compute the log-likelihood of the gold sequences and keep the
        # transition params for inference at test time.
        # Input shapes: [10, 20, 5], [10, 20], [10]
        log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
            unary_scores, y_t, sequence_lengths_t)

        # Decode the best tag sequence and its score inside the graph.
        viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(
            unary_scores, transition_params, sequence_lengths_t)

        # Add a training op to tune the parameters.
        loss = tf.reduce_mean(-log_likelihood)
        # Gradient descent optimizer with learning_rate = 0.01
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

        session.run(tf.global_variables_initializer())

        # The mask filters out padded positions. Comparing arrays is
        # element-wise, e.g. np.array([[0, 1, 2]]) < np.array([[3, 4, 5]])
        # gives array([[True, True, True]]). Here broadcasting a [1, 20] row
        # of positions against a [10, 1] column of lengths yields a [10, 20]
        # mask that is True wherever position < sequence length.
        mask = (np.expand_dims(np.arange(num_words), axis=0) <
                np.expand_dims(sequence_lengths, axis=1))
        # Total number of real (non-padded) labels: 10 * 19 = 190
        total_labels = np.sum(sequence_lengths)
        print("mask:", mask)
        print("total_labels:", total_labels)

        # Train for a fixed number of iterations.
        for i in range(1000):
            tf_viterbi_sequence, _ = session.run([viterbi_sequence, train_op])
            if i % 100 == 0:
                # Multiplying boolean arrays acts as a logical AND, so only
                # positions that are both correct and inside the true
                # sequence length are counted.
                correct_labels = np.sum((y == tf_viterbi_sequence) * mask)
                accuracy = 100.0 * correct_labels / float(total_labels)
                print("Accuracy: %.2f%%" % accuracy)
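In the example above every sequence has the same length, so the effect of the mask is easy to miss. The following plain-Python sketch repeats the accuracy computation on a tiny hand-made batch with different lengths; the function name masked_accuracy and the data are made up for illustration.

```python
def masked_accuracy(gold, predicted, sequence_lengths):
    """Percentage of correctly predicted tags, ignoring padded positions."""
    correct = 0
    total = 0
    for gold_seq, pred_seq, length in zip(gold, predicted, sequence_lengths):
        for position in range(length):  # positions >= length are padding
            correct += int(gold_seq[position] == pred_seq[position])
            total += 1
    return 100.0 * correct / total


# Two sequences padded to length 4; the first one has only 2 real tags.
gold = [[1, 2, 0, 0], [3, 3, 1, 2]]
pred = [[1, 0, 0, 0], [3, 3, 1, 0]]
lengths = [2, 4]

accuracy = masked_accuracy(gold, pred, lengths)
print("Accuracy: %.2f%%" % accuracy)  # Accuracy: 66.67%
```

Here 4 of the 6 real tags are correct. Without the mask, the two padded positions of the first sequence would also count as matches (6 of 8), which would inflate the accuracy.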