There is plenty of theory online about ReLU, LReLU, and the like, but most of it stays theoretical; write-ups on how to actually apply them are comparatively rare.
In the Convolutional Neural Network (CNN) tutorial https://tensorflow.google.cn/tutorials/images/cnn?hl=en, the activation function used is relu.
While working through it, I saw blog posts suggesting that when ReLU performs poorly you should try LReLU, but there is no particularly detailed guide online on how to do that, so I went looking on the official site.
First, the regular relu, which can be used directly.
Use the official example "Create the convolutional base" as-is:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
Clearly, relu can be used directly here as a string.
Now let's find the official documentation for activation='relu'.
On the activations page
(Module: tf.compat.v2.keras.activations) https://tensorflow.google.cn/api_docs/python/tf/compat/v2/keras/activations?hl=en
we can see the following:
Built-in activation functions.

Functions
elu(...): Exponential linear unit.
exponential(...): Exponential activation function.
hard_sigmoid(...): Hard sigmoid activation function.
linear(...): Linear activation function.
relu(...): Rectified Linear Unit.
selu(...): Scaled Exponential Linear Unit (SELU).
sigmoid(...): Sigmoid.
softmax(...): The softmax activation function transforms the outputs so that all values are in range (0, 1) and sum to 1.
softplus(...): Softplus activation function.
softsign(...): Softsign activation function.
tanh(...): Hyperbolic Tangent (tanh) activation function.
Clearly, the built-in functions include relu but not LReLU, so using activation='lrelu' directly raises an error!
The error is: ValueError: Unknown activation function:lrelu
Note: activation='relu' is equivalent to passing the function itself, activation=tf.keras.activations.relu (not the result of calling it).
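As a quick check (a sketch of my own, assuming TF 2.x), tf.keras.activations.get is the lookup behind the string shortcut, and it is also where the unknown-name error comes from:

import tensorflow as tf

# The string 'relu' resolves to the built-in relu function.
fn = tf.keras.activations.get('relu')
print(fn.__name__)          # relu

# An unknown name fails at lookup time with the same ValueError.
try:
    tf.keras.activations.get('lrelu')
except ValueError as err:
    print(err)              # Unknown activation function:lrelu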
Official page for tf.keras.activations.relu: https://tensorflow.google.cn/api_docs/python/tf/keras/activations/relu
tf.keras.activations.relu(x, alpha=0.0, max_value=None, threshold=0)

Arguments:
x: A tensor or variable.
alpha: A scalar, slope of the negative section (default = 0.).
max_value: float. Saturation threshold.
threshold: float. Threshold value for thresholded activation.

With default values, this returns the element-wise max(x, 0).
Otherwise, it follows:
f(x) = max_value for x >= max_value,
f(x) = x for threshold <= x < max_value,
f(x) = alpha * (x - threshold) otherwise.
In fact, I wonder: if the slope of the negative section, alpha, is set to something other than 0, doesn't this essentially become a LeakyReLU?
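A quick numerical check of the piecewise rule above (a sketch of my own, not from the docs), including an alpha-only case that already gives the leaky shape:

import tensorflow as tf

x = tf.constant([-3.0, -1.0, 0.5, 2.0, 7.0])

# Defaults: element-wise max(x, 0).
print(tf.keras.activations.relu(x).numpy())
# [0.  0.  0.5 2.  7. ]

# alpha=0.1, threshold=1.0, max_value=5.0 follows the three-branch rule above.
print(tf.keras.activations.relu(x, alpha=0.1, max_value=5.0, threshold=1.0).numpy())
# [-0.4  -0.2  -0.05  2.    5.  ]

# With only alpha set, negative inputs are scaled instead of clipped to zero,
# which is exactly the LeakyReLU shape.
print(tf.keras.activations.relu(x, alpha=0.01).numpy())
# [-0.03 -0.01  0.5   2.    7.  ]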
Next, let's compare this with LeakyReLU.
Official page for tf.keras.layers.LeakyReLU: https://tensorflow.google.cn/api_docs/python/tf/keras/layers/LeakyReLU?hl=en
class tf.keras.layers.LeakyReLU
The first thing to be clear about is that LeakyReLU is a class, not a function!
The class inherits from Layer (when I first realized it was a class, I assumed it inherited from layers; the source code is attached at the end of this post).
Arguments:
alpha: Float >= 0. Negative slope coefficient.

The __init__ method:
__init__(alpha=0.3, **kwargs)
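A minimal sketch (my own, assuming TF 2.x) of what instantiating the class gives you; note that the default alpha is 0.3:

import tensorflow as tf

leaky = tf.keras.layers.LeakyReLU()              # default alpha=0.3
x = tf.constant([-2.0, -1.0, 0.0, 3.0])
print(leaky(x).numpy())                          # [-0.6 -0.3  0.   3. ]

# It is a Layer subclass, so it can be added to a model like any other layer.
print(isinstance(leaky, tf.keras.layers.Layer))  # True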
In the Deep Convolutional Generative Adversarial Network (DCGAN) tutorial,
it is applied like this:
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)  # Note: the batch size is not fixed (None)

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
This can be understood as instantiating the layer directly and adding it to the model.
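A short usage sketch for the generator above, along the lines of the DCGAN tutorial (variable names are mine; it assumes tf and the definition above):

generator = make_generator_model()

# Push one batch of 100-dimensional noise through the untrained generator.
noise = tf.random.normal([1, 100])
generated_image = generator(noise, training=False)
print(generated_image.shape)   # (1, 28, 28, 1)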
Here is how it is applied in another official tutorial,
Pix2Pix https://tensorflow.google.cn/tutorials/generative/pix2pix
The usage is:
def downsample(filters, size, apply_batchnorm=True):
    initializer = tf.random_normal_initializer(0., 0.02)

    result = tf.keras.Sequential()
    result.add(
        tf.keras.layers.Conv2D(filters, size, strides=2, padding='same',
                               kernel_initializer=initializer, use_bias=False))

    if apply_batchnorm:
        result.add(tf.keras.layers.BatchNormalization())

    result.add(tf.keras.layers.LeakyReLU())

    return result
As the tf.keras.layers.LeakyReLU() line shows, here too the layer is instantiated directly and added to the model.
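And a quick usage sketch for downsample (my own check, assuming tf is imported and the definition above is in scope): each call halves the spatial resolution thanks to strides=2.

down = downsample(64, 4)
print(down(tf.zeros([1, 256, 256, 3])).shape)   # (1, 128, 128, 64)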
Of course, you can also instantiate it yourself and add it explicitly.
Let's modify the official example accordingly.
First, the imports:
from tensorflow.keras import layers, models
from tensorflow.keras.layers import LeakyReLU
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), input_shape=(28, 28, 3)))
model.add(LeakyReLU(alpha=0.01))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3)))
model.add(LeakyReLU(alpha=0.01))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3)))
model.add(LeakyReLU(alpha=0.01))
model.add(layers.MaxPooling2D((2, 2)))
model.summary()
Trying it out, it runs:
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 26, 26, 32) 896 _________________________________________________________________ leaky_re_lu (LeakyReLU) (None, 26, 26, 32) 0 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0 _________________________________________________________________ conv2d_1 (Conv2D) (None, 11, 11, 64) 18496 _________________________________________________________________ leaky_re_lu_1 (LeakyReLU) (None, 11, 11, 64) 0 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 3, 3, 64) 36928 _________________________________________________________________ leaky_re_lu_2 (LeakyReLU) (None, 3, 3, 64) 0 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 1, 1, 64) 0 ================================================================= Total params: 56,320 Trainable params: 56,320 Non-trainable params: 0 _________________________________________________________________
Now let's try cutting down the number of lines by passing the LeakyReLU instance as the activation argument, and see whether that also works:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation=LeakyReLU(alpha=0.01), input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation=LeakyReLU(alpha=0.01)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation=LeakyReLU(alpha=0.01)))
model.add(layers.MaxPooling2D((2, 2)))
model.summary()
Running it:
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 30, 30, 32) 896 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 15, 15, 32) 0 _________________________________________________________________ conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 2, 2, 64) 0 ================================================================= Total params: 56,320 Trainable params: 56,320 Non-trainable params: 0 _________________________________________________________________
Clearly, this also runs; the only difference is in the summary output.
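The LeakyReLU rows disappear because the activation object is now stored on each Conv2D layer rather than added as a separate layer. A small check of the model built just above (my own sketch):

# The activation is attached to the convolution layer itself.
first_conv = model.layers[0]
print(type(first_conv.activation).__name__)   # LeakyReLU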
Although the approach in section 3.3.2 was verified successfully here, an unexpected error appeared during an actual training run:
AttributeError: 'LeakyReLU' object has no attribute '__name__'
The approach from section 3.3.1 did not raise this error and behaved exactly as before the modification, so I will stick with the approach from section 3.3.1.
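If you do want to keep the activation inline as in section 3.3.2, one possible workaround (my own suggestion, not from the original text; the error comes from code that expects the activation to have a __name__) is to pass a named function wrapping tf.nn.leaky_relu instead of a layer instance:

import tensorflow as tf
from tensorflow.keras import layers, models

def lrelu_01(x):
    # A plain Python function has a __name__ attribute, unlike a LeakyReLU layer instance.
    return tf.nn.leaky_relu(x, alpha=0.01)

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation=lrelu_01, input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.summary()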
To show how valuable it is to read the underlying source code, it is listed here as a separate section.
tensorflow/tensorflow/python/keras/layers/advanced_activations.py
In this file we can see the following code, which confirms that class LeakyReLU does indeed inherit from class Layer:
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Layers that act as activation functions.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.python.keras import backend as K
from tensorflow.python.keras import constraints
from tensorflow.python.keras import initializers
from tensorflow.python.keras import regularizers
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec
from tensorflow.python.keras.utils import tf_utils
from tensorflow.python.ops import math_ops
from tensorflow.python.util.tf_export import keras_export


@keras_export('keras.layers.LeakyReLU')
class LeakyReLU(Layer):
  """Leaky version of a Rectified Linear Unit.

  It allows a small gradient when the unit is not active:
  `f(x) = alpha * x for x < 0`,
  `f(x) = x for x >= 0`.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    alpha: Float >= 0. Negative slope coefficient.
  """

  def __init__(self, alpha=0.3, **kwargs):
    super(LeakyReLU, self).__init__(**kwargs)
    self.supports_masking = True
    self.alpha = K.cast_to_floatx(alpha)

  def call(self, inputs):
    return K.relu(inputs, alpha=self.alpha)

  def get_config(self):
    config = {'alpha': float(self.alpha)}
    base_config = super(LeakyReLU, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape


@keras_export('keras.layers.PReLU')
class PReLU(Layer):
  """Parametric Rectified Linear Unit.

  It follows:
  `f(x) = alpha * x for x < 0`,
  `f(x) = x for x >= 0`,
  where `alpha` is a learned array with the same shape as x.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    alpha_initializer: Initializer function for the weights.
    alpha_regularizer: Regularizer for the weights.
    alpha_constraint: Constraint for the weights.
    shared_axes: The axes along which to share learnable
      parameters for the activation function.
      For example, if the incoming feature maps
      are from a 2D convolution
      with output shape `(batch, height, width, channels)`,
      and you wish to share parameters across space
      so that each filter only has one set of parameters,
      set `shared_axes=[1, 2]`.
  """

  def __init__(self,
               alpha_initializer='zeros',
               alpha_regularizer=None,
               alpha_constraint=None,
               shared_axes=None,
               **kwargs):
    super(PReLU, self).__init__(**kwargs)
    self.supports_masking = True
    self.alpha_initializer = initializers.get(alpha_initializer)
    self.alpha_regularizer = regularizers.get(alpha_regularizer)
    self.alpha_constraint = constraints.get(alpha_constraint)
    if shared_axes is None:
      self.shared_axes = None
    elif not isinstance(shared_axes, (list, tuple)):
      self.shared_axes = [shared_axes]
    else:
      self.shared_axes = list(shared_axes)

  @tf_utils.shape_type_conversion
  def build(self, input_shape):
    param_shape = list(input_shape[1:])
    if self.shared_axes is not None:
      for i in self.shared_axes:
        param_shape[i - 1] = 1
    self.alpha = self.add_weight(
        shape=param_shape,
        name='alpha',
        initializer=self.alpha_initializer,
        regularizer=self.alpha_regularizer,
        constraint=self.alpha_constraint)
    # Set input spec
    axes = {}
    if self.shared_axes:
      for i in range(1, len(input_shape)):
        if i not in self.shared_axes:
          axes[i] = input_shape[i]
    self.input_spec = InputSpec(ndim=len(input_shape), axes=axes)
    self.built = True

  def call(self, inputs):
    pos = K.relu(inputs)
    neg = -self.alpha * K.relu(-inputs)
    return pos + neg

  def get_config(self):
    config = {
        'alpha_initializer': initializers.serialize(self.alpha_initializer),
        'alpha_regularizer': regularizers.serialize(self.alpha_regularizer),
        'alpha_constraint': constraints.serialize(self.alpha_constraint),
        'shared_axes': self.shared_axes
    }
    base_config = super(PReLU, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape


@keras_export('keras.layers.ELU')
class ELU(Layer):
  """Exponential Linear Unit.

  It follows:
  `f(x) = alpha * (exp(x) - 1.) for x < 0`,
  `f(x) = x for x >= 0`.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    alpha: Scale for the negative factor.
  """

  def __init__(self, alpha=1.0, **kwargs):
    super(ELU, self).__init__(**kwargs)
    self.supports_masking = True
    self.alpha = K.cast_to_floatx(alpha)

  def call(self, inputs):
    return K.elu(inputs, self.alpha)

  def get_config(self):
    config = {'alpha': float(self.alpha)}
    base_config = super(ELU, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape


@keras_export('keras.layers.ThresholdedReLU')
class ThresholdedReLU(Layer):
  """Thresholded Rectified Linear Unit.

  It follows:
  `f(x) = x for x > theta`,
  `f(x) = 0 otherwise`.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    theta: Float >= 0. Threshold location of activation.
  """

  def __init__(self, theta=1.0, **kwargs):
    super(ThresholdedReLU, self).__init__(**kwargs)
    self.supports_masking = True
    self.theta = K.cast_to_floatx(theta)

  def call(self, inputs):
    return inputs * math_ops.cast(
        math_ops.greater(inputs, self.theta), K.floatx())

  def get_config(self):
    config = {'theta': float(self.theta)}
    base_config = super(ThresholdedReLU, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape


@keras_export('keras.layers.Softmax')
class Softmax(Layer):
  """Softmax activation function.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    axis: Integer, axis along which the softmax normalization is applied.
  """

  def __init__(self, axis=-1, **kwargs):
    super(Softmax, self).__init__(**kwargs)
    self.supports_masking = True
    self.axis = axis

  def call(self, inputs):
    return K.softmax(inputs, axis=self.axis)

  def get_config(self):
    config = {'axis': self.axis}
    base_config = super(Softmax, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape


@keras_export('keras.layers.ReLU')
class ReLU(Layer):
  """Rectified Linear Unit activation function.

  With default values, it returns element-wise `max(x, 0)`.

  Otherwise, it follows:
  `f(x) = max_value` for `x >= max_value`,
  `f(x) = x` for `threshold <= x < max_value`,
  `f(x) = negative_slope * (x - threshold)` otherwise.

  Input shape:
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.

  Output shape:
    Same shape as the input.

  Arguments:
    max_value: Float >= 0. Maximum activation value.
    negative_slope: Float >= 0. Negative slope coefficient.
    threshold: Float. Threshold value for thresholded activation.
  """

  def __init__(self, max_value=None, negative_slope=0, threshold=0, **kwargs):
    super(ReLU, self).__init__(**kwargs)
    if max_value is not None and max_value < 0.:
      raise ValueError('max_value of Relu layer '
                       'cannot be negative value: ' + str(max_value))
    if negative_slope < 0.:
      raise ValueError('negative_slope of Relu layer '
                       'cannot be negative value: ' + str(negative_slope))

    self.support_masking = True
    if max_value is not None:
      max_value = K.cast_to_floatx(max_value)
    self.max_value = max_value
    self.negative_slope = K.cast_to_floatx(negative_slope)
    self.threshold = K.cast_to_floatx(threshold)

  def call(self, inputs):
    # alpha is used for leaky relu slope in activations instead of
    # negative_slope.
    return K.relu(inputs,
                  alpha=self.negative_slope,
                  max_value=self.max_value,
                  threshold=self.threshold)

  def get_config(self):
    config = {
        'max_value': self.max_value,
        'negative_slope': self.negative_slope,
        'threshold': self.threshold
    }
    base_config = super(ReLU, self).get_config()
    return dict(list(base_config.items()) + list(config.items()))

  @tf_utils.shape_type_conversion
  def compute_output_shape(self, input_shape):
    return input_shape