Activation Functions Roundup: Formulas, Derivations, and NumPy Implementations in One Long Reference

Published by AI浩 on 2022/01/15 05:26:33

1. Implementing Activation Functions

1.1 sigmoid

1.1.1 Function

Function: $f(x)=\frac{1}{1+e^{-x}}$

1.1.2 Derivative

Derivation:

By the quotient rule: $\left ( \frac{u}{v} \right ){}'=\frac{{u}'v-u{v}'}{v^{2}}$

$\begin{aligned} f(x)^{\prime} &=\left(\frac{1}{1+e^{-x}}\right)^{\prime} \\ &=\frac{1^{\prime} \times\left(1+e^{-x}\right)-1 \times\left(1+e^{-x}\right)^{\prime}}{\left(1+e^{-x}\right)^{2}} \\ &=\frac{e^{-x}}{\left(1+e^{-x}\right)^{2}} \\ &=\frac{1+e^{-x}-1}{\left(1+e^{-x}\right)^{2}} \\ &=\left(\frac{1}{1+e^{-x}}\right)\left(1-\frac{1}{1+e^{-x}}\right) \\ &=\quad f(x)(1-f(x)) \end{aligned}$

1.1.3 Code Implementation

import numpy as np

class Sigmoid():
    def __call__(self, x):
        return 1 / (1 + np.exp(-x))

    def gradient(self, x):
        return self.__call__(x) * (1 - self.__call__(x))
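
One way to sanity-check a closed-form derivative such as $f(x)(1-f(x))$ is to compare it with a central finite difference. The sketch below does that for sigmoid; the helper `numerical_gradient` and the step size `eps` are illustrative assumptions, not part of the original code.

```python
import numpy as np


def sigmoid(x):
    """Same formula as Sigmoid.__call__ above."""
    return 1 / (1 + np.exp(-x))


def sigmoid_grad(x):
    """Closed form from the derivation: f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1 - s)


def numerical_gradient(f, x, eps=1e-5):
    """Central finite difference (hypothetical helper, added for illustration)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


x = np.linspace(-5.0, 5.0, 11)
max_error = np.max(np.abs(sigmoid_grad(x) - numerical_gradient(sigmoid, x)))
print(max_error)  # should be very small if the derivation is right
```

The same helper can be pointed at any of the other activation functions in this article, as long as the test points avoid kinks such as $x=0$ for ReLU.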

1.2 softmax

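1.2.1 Function

Softmax is used for multi-class classification. It maps the outputs of several neurons into the interval (0, 1) so that they sum to 1 and can be read as class probabilities.

Suppose we have an array $V$, where $V_{i}$ denotes its $i$-th element. The softmax value of that element is:

$S_{i}=\frac{e^{V_{i}}}{\sum _{j}e^{V_{j}}}$

For example, with three inputs $z_{1}, z_{2}, z_{3}$:

$\begin{aligned} y_{1}&=\frac{e^{z_{1}}}{e^{z_{1}}+e^{z_{2}}+e^{z_{3}}}\\ y_{2}&=\frac{e^{z_{2}}}{e^{z_{1}}+e^{z_{2}}+e^{z_{3}}}\\ y_{3}&=\frac{e^{z_{3}}}{e^{z_{1}}+e^{z_{2}}+e^{z_{3}}} \end{aligned} \tag{1}$

Gradient descent needs a loss function. Cross-entropy is the usual choice, and it has the form:

$Loss = -\sum_{i}{y_{i}\ln a_{i}} \tag{2}$

where $y_{i}$ is the target value for class $i$ and $a_{i}$ is the corresponding softmax output.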
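
As a small worked example of equations (1) and (2), the sketch below pushes three logits through softmax and evaluates the cross-entropy against a one-hot label. The input values, the function names, and the small `eps` guard against `log(0)` are assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Softmax along the last axis, shifted by the max for numerical stability."""
    e_z = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e_z / np.sum(e_z, axis=-1, keepdims=True)


def cross_entropy(y, a, eps=1e-12):
    """Loss = -sum_i y_i * ln(a_i); eps is an added guard, not part of equation (2)."""
    return -np.sum(y * np.log(a + eps), axis=-1)


z = np.array([1.0, 2.0, 3.0])   # illustrative inputs z_1, z_2, z_3
y = np.array([0.0, 0.0, 1.0])   # one-hot label: the third class is the true one
a = softmax(z)
loss = cross_entropy(y, a)
print(a, a.sum(), loss)
```

With a one-hot label only the term for the true class survives, so the loss is simply the negative log of the probability assigned to that class.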

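1.2.2 Derivative

The derivative splits into two cases, depending on whether $S_{i}$ is differentiated with respect to its own input $V_{i}$ or a different input $V_{j}$.

Case 1, $j=i$:

$\begin{aligned} \frac{\partial S_{i}}{\partial V_{i}} &=\left(\frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}}\right)^{\prime} \\ &=\frac{e^{V_{i}} \times \sum_{k} e^{V_{k}}-e^{V_{i}} \times e^{V_{i}}}{\left(\sum_{k} e^{V_{k}}\right)^{2}} \\ &=\frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}}-\frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}} \times \frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}} \\ &=\frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}}\left(1-\frac{e^{V_{i}}}{\sum_{k} e^{V_{k}}}\right) \\ &=S_{i}(1-S_{i}) \end{aligned}$

Case 2, $j \neq i$: the numerator $e^{V_{i}}$ does not depend on $V_{j}$, so only the denominator contributes:

$\frac{\partial S_{i}}{\partial V_{j}}=\frac{0 \times \sum_{k} e^{V_{k}}-e^{V_{i}} \times e^{V_{j}}}{\left(\sum_{k} e^{V_{k}}\right)^{2}}=-S_{i} S_{j}$

1.2.3 Code Implementation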

import numpy as np
class Softmax():
    def __call__(self, x):
        e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
        return e_x / np.sum(e_x, axis=-1, keepdims=True)

    def gradient(self, x):
        # Element-wise p * (1 - p): only the j = i (diagonal) terms of the Jacobian.
        p = self.__call__(x)
        return p * (1 - p)
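
The `gradient` method above returns only the diagonal terms $S_{i}(1-S_{i})$. As a rough sketch, the full Jacobian for a single 1-D input can be assembled as $\operatorname{diag}(S)-SS^{T}$; the function name `softmax_jacobian` and the sample values are assumptions made for illustration.

```python
import numpy as np


def softmax(v):
    """Softmax of a 1-D array, shifted by the max for numerical stability."""
    e_v = np.exp(v - np.max(v))
    return e_v / np.sum(e_v)


def softmax_jacobian(v):
    """J[i, j] = dS_i/dV_j = S_i * (delta_ij - S_j) for a 1-D input v."""
    s = softmax(v)
    return np.diag(s) - np.outer(s, s)


v = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0])   # one-hot label, illustrative
s = softmax(v)
J = softmax_jacobian(v)

# The diagonal matches the element-wise p * (1 - p) form.
print(np.allclose(np.diag(J), s * (1 - s)))
# Each row sums to zero because the outputs always sum to 1.
print(np.allclose(J.sum(axis=1), 0.0))

# Chaining with the cross-entropy gradient dLoss/dS = -y / S gives S - y.
grad = J.T @ (-y / s)
print(np.allclose(grad, s - y))
```

The last check is why softmax and cross-entropy are often differentiated together: the combined gradient with respect to the inputs collapses to the difference between the prediction and the label.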

1.3 tanh

1.3.1 Function

$\tanh(x)=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$

1.3.2 Derivative

Derivation:

$\begin{aligned} \tanh (x)^{\prime} &=\left(\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\right)^{\prime} \\ &=\frac{\left(e^{x}-e^{-x}\right)^{\prime}\left(e^{x}+e^{-x}\right)-\left(e^{x}-e^{-x}\right)\left(e^{x}+e^{-x}\right)^{\prime}}{\left(e^{x}+e^{-x}\right)^{2}} \\ &=\frac{\left(e^{x}+e^{-x}\right)^{2}-\left(e^{x}-e^{-x}\right)^{2}}{\left(e^{x}+e^{-x}\right)^{2}} \\ &=1-\left(\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}\right)^{2} \\ &=1-\tanh (x)^{2} \end{aligned}$

1.3.3 Code Implementation

import numpy as np
class TanH():
    def __call__(self, x):
        return 2 / (1 + np.exp(-2*x)) - 1

    def gradient(self, x):
        return 1 - np.power(self.__call__(x), 2)
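
The class above computes tanh as $2\sigma(2x)-1$ rather than from the exponential definition. Multiplying the numerator and denominator of $\frac{1-e^{-2x}}{1+e^{-2x}}$ by $e^{x}$ shows the two forms are equal; the short check below confirms it numerically against `np.tanh`. The sample grid is an arbitrary choice.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 13)

# Definition: (e^x - e^-x) / (e^x + e^-x)
by_definition = (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))
# Rescaled sigmoid form used in TanH.__call__
by_sigmoid_form = 2 / (1 + np.exp(-2 * x)) - 1

print(np.allclose(by_definition, by_sigmoid_form))
print(np.allclose(by_sigmoid_form, np.tanh(x)))
```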

1.4 relu

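1.4.1 Function

$f(x)=\max (0, x)$

1.4.2 Derivative

$f^{\prime}(x)=\left\{\begin{array}{cc} 1 & \text { if } x>0 \\ 0 & \text { if } x \leq 0 \end{array}\right.$

The derivative is undefined at $x=0$; by convention the value 0 is used there.

1.4.3 Code Implementation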

import numpy as np
class ReLU():
    def __call__(self, x):
        return np.where(x >= 0, x, 0)

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is 0, matching the derivative formula.
        return np.where(x > 0, 1, 0)

1.5 leakyrelu

1.5.1 Function

$f(x)=\max (a x, x)$, where $0<a<1$

1.5.2 Derivative

$f^{\prime}(x)=\left\{\begin{array}{cl} 1 & \text { if } x>0 \\ a & \text { if } x \leq 0 \end{array}\right.$

1.5.3 Code Implementation

import numpy as np
class LeakyReLU():
    def __init__(self, alpha=0.2):
        self.alpha = alpha

    def __call__(self, x):
        return np.where(x >= 0, x, self.alpha * x)

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is alpha, matching the derivative formula.
        return np.where(x > 0, 1, self.alpha)

1.6 ELU

1.6.1 Function

$f(x)=\left\{\begin{array}{ll} x, & \text { if } x \geq 0 \\ a\left(e^{x}-1\right), & \text { if } x<0 \end{array}\right.$

1.6.2 Derivative

For $x \geq 0$, the derivative is 1.

For $x<0$, the derivative is derived as follows:

$\begin{aligned} f(x)^{\prime} &=\left(a\left(e^{x}-1\right)\right)^{\prime} \\ &=a e^{x} \\ &=a\left(e^{x}-1\right)+a \\ &=f(x)+a \end{aligned}$

So the complete derivative is:

$f^{\prime}(x)=\left\{\begin{array}{cll} 1 & \text { if } & x \geq 0 \\ f(x)+a=ae^{x} & \text { if } & x<0 \end{array}\right.$

1.6.3 Code Implementation

import numpy as np
class ELU():
    def __init__(self, alpha=0.1):
        self.alpha = alpha 

    def __call__(self, x):
        return np.where(x >= 0.0, x, self.alpha * (np.exp(x) - 1))

    def gradient(self, x):
        return np.where(x >= 0.0, 1, self.__call__(x) + self.alpha)
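
One consequence of the piecewise derivative is that the slope jumps at $x=0$ unless $a=1$: the left-hand limit is $a e^{0}=a$, while the right-hand value is 1. The sketch below evaluates the gradient just left of zero and at zero for the default `alpha=0.1` and for `alpha=1.0`; the function name and test points are assumptions made for illustration.

```python
import numpy as np


def elu_gradient(x, alpha):
    """1 for x >= 0 and alpha * exp(x) for x < 0, as in the derivation."""
    return np.where(x >= 0.0, 1.0, alpha * np.exp(x))


x = np.array([-1e-6, 0.0])   # just left of zero, and zero itself

grad_default = elu_gradient(x, alpha=0.1)   # default alpha of the ELU class
grad_alpha_one = elu_gradient(x, alpha=1.0)

print(grad_default)    # slope jumps from about alpha to 1 at the origin
print(grad_alpha_one)  # slope is continuous at the origin when alpha = 1
```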

1.7 selu

1.7.1 Function

$\operatorname{selu}(x)=\lambda \begin{cases}x & \text { if } x>0 \\ \alpha e^{x}-\alpha & \text { if } x \leqslant 0\end{cases}$

1.7.2 Derivative

$\operatorname{selu}^{\prime}(x)=\lambda \begin{cases}1 & \text { if } x>0 \\ \alpha e^{x} & \text { if } x \leqslant 0\end{cases}$

1.7.3 Code Implementation

import numpy as np
class SELU():
    # Reference : https://arxiv.org/abs/1706.02515,
    # https://github.com/bioinf-jku/SNNs/blob/master/SelfNormalizingNetworks_MLP_MNIST.ipynb
    def __init__(self):
        self.alpha = 1.6732632423543772848170429916717
        self.scale = 1.0507009873554804934193349852946 

    def __call__(self, x):
        return self.scale * np.where(x >= 0.0, x, self.alpha*(np.exp(x)-1))

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is scale * alpha, matching the derivative formula.
        return self.scale * np.where(x > 0.0, 1, self.alpha * np.exp(x))
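
The specific values of `alpha` and `scale` come from the self-normalizing networks paper referenced in the class: they are chosen so that zero-mean, unit-variance Gaussian inputs keep roughly zero mean and unit variance after the activation. The sketch below checks this empirically; the sample size, the seed, and the module-level names `ALPHA` and `SCALE` are arbitrary assumptions.

```python
import numpy as np

ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805


def selu(x):
    """Same formula as SELU.__call__ above."""
    return SCALE * np.where(x >= 0.0, x, ALPHA * (np.exp(x) - 1))


rng = np.random.default_rng(0)        # fixed seed, chosen arbitrarily
z = rng.standard_normal(1_000_000)    # zero-mean, unit-variance inputs
out = selu(z)

print(out.mean(), out.var())          # both should stay close to 0 and 1
```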

1.8 softplus

1.8.1 Function

$\operatorname{Softplus}(x)=\log \left(1+e^{x}\right)$

1.8.2 Derivative

Here $\log$ is the natural logarithm (base $e$), so $\ln e=1$:

$f^{\prime}(x)=\frac{e^{x}}{(1+e^{x})\ln e}=\frac{1}{1+e^{-x}}=\sigma(x)$

1.8.3 Code Implementation

import numpy as np
class SoftPlus():
    def __call__(self, x):
        return np.log(1 + np.exp(x))

    def gradient(self, x):
        return 1 / (1 + np.exp(-x))
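
The direct form `np.log(1 + np.exp(x))` overflows inside `np.exp` for large positive inputs. A possible alternative, shown here only as a sketch, is NumPy's `np.logaddexp`, since $\log(1+e^{x})=\log(e^{0}+e^{x})$. The function name `softplus_stable` is an assumption made for illustration.

```python
import numpy as np


def softplus_stable(x):
    """log(1 + e^x) computed as logaddexp(0, x) = log(e^0 + e^x)."""
    return np.logaddexp(0.0, x)


x = np.array([-2.0, 0.0, 2.0])
naive = np.log(1 + np.exp(x))   # expression used in SoftPlus.__call__
print(np.allclose(softplus_stable(x), naive))

# For large inputs softplus(x) is essentially x; np.exp(1000.0) alone would overflow.
large = softplus_stable(np.array([1000.0]))
print(large)
```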

1.9 Swish

1.9.1 Function

$f(x)=x \cdot \operatorname{sigmoid}(\beta x)$

1.9.2 Derivative

$\begin{aligned} f^{\prime}(x) &=\sigma(\beta x)+\beta x \cdot \sigma(\beta x)(1-\sigma(\beta x)) \\ &=\sigma(\beta x)+\beta x \cdot \sigma(\beta x)-\beta x \cdot \sigma(\beta x)^{2} \\ &=\beta x \cdot \sigma(\beta x)+\sigma(\beta x)(1-\beta x \cdot \sigma(\beta x)) \\ &=\beta f(x)+\sigma(\beta x)(1-\beta f(x)) \end{aligned}$

1.9.3 Code Implementation

import numpy as np


class Swish(object):
    def __init__(self, b):
        self.b = b

    def __call__(self, x):
        return x * (np.exp(self.b * x) / (np.exp(self.b * x) + 1))

    def gradient(self, x):
        # f'(x) = b * f(x) + sigmoid(b * x) * (1 - b * f(x)), with f(x) = x * sigmoid(b * x)
        sig = 1 / (1 + np.exp(-self.b * x))
        fx = x * sig
        return self.b * fx + sig * (1 - self.b * fx)
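
Two limiting cases may help build intuition for the parameter $\beta$: with $\beta=0$ the sigmoid factor is the constant 0.5, so Swish becomes the linear function $x/2$, and as $\beta$ grows the sigmoid factor approaches a step function, so Swish approaches ReLU. The sketch below illustrates both; the helper `swish` and the value 50 standing in for "large" are assumptions.

```python
import numpy as np


def swish(x, b):
    """x * sigmoid(b * x), written with a single exponential."""
    return x / (1 + np.exp(-b * x))


x = np.linspace(-4.0, 4.0, 9)

# b = 0: sigmoid(0) = 0.5, so Swish reduces to x / 2.
is_linear = np.allclose(swish(x, 0.0), x / 2)
print(is_linear)

# Large b: the gap to ReLU, max(x, 0), becomes negligible on this grid.
gap_to_relu = np.max(np.abs(swish(x, 50.0) - np.maximum(x, 0.0)))
print(gap_to_relu)
```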

1.10 Mish

1.10.1 Function

$f(x)=x \cdot \tanh \left(\ln \left(1+e^{x}\right)\right)$

1.10.2 Derivative

$\begin{aligned} f^{\prime}(x)&=\operatorname{sech}^{2}(\operatorname{softplus}(x)) \, x \operatorname{sigmoid}(x)+\frac{f(x)}{x} \\ &=\Delta(x) \operatorname{swish}(x)+\frac{f(x)}{x} \end{aligned}$

where $\operatorname{softplus}(x)=\ln \left(1+e^{x}\right)$, $\operatorname{sigmoid}(x)=1 /\left(1+e^{-x}\right)$, $\operatorname{swish}(x)=x \operatorname{sigmoid}(x)$, and $\Delta(x)=\operatorname{sech}^{2}(\operatorname{softplus}(x))$.

1.10.3 Code Implementation

import numpy as np


def sech(x):
    """Hyperbolic secant."""
    return 2 / (np.exp(x) + np.exp(-x))


def sigmoid(x):
    """Sigmoid function."""
    return 1 / (1 + np.exp(-x))


def soft_plus(x):
    """Softplus function."""
    return np.log(1 + np.exp(x))


def tan_h(x):
    """Hyperbolic tangent."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))


class Mish:

    def __call__(self, x):
        return x * tan_h(soft_plus(x))

    def gradient(self, x):
        return sech(soft_plus(x)) * sech(soft_plus(x)) * x * sigmoid(x) + tan_h(soft_plus(x))
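
The derivative formula contains the term $f(x)/x$, which cannot be evaluated directly at $x=0$. Because $f(x)/x=\tanh(\operatorname{softplus}(x))$, the `gradient` method uses that expression instead and stays finite at the origin. The sketch below checks the value at zero against a finite-difference estimate; it uses `np.tanh` and `np.logaddexp` in place of the hand-written helpers, which is a substitution made here for brevity.

```python
import numpy as np


def mish(x):
    """x * tanh(softplus(x))."""
    return x * np.tanh(np.logaddexp(0.0, x))


def mish_gradient(x):
    sp = np.logaddexp(0.0, x)            # softplus(x)
    sech2 = 1 - np.tanh(sp) ** 2         # sech^2(u) = 1 - tanh^2(u)
    sig = 1 / (1 + np.exp(-x))
    return sech2 * x * sig + np.tanh(sp)  # tanh(sp) replaces f(x) / x


x0 = np.array([0.0])
analytic = mish_gradient(x0)
eps = 1e-5
numeric = (mish(x0 + eps) - mish(x0 - eps)) / (2 * eps)
print(analytic, numeric)   # both finite and in agreement at x = 0
```

At the origin the first term vanishes, so the gradient equals $\tanh(\ln 2)$.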

1.11 SiLU

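1.11.1 Function

$f(x)=x \times \operatorname{sigmoid}(x)$

1.11.2 Derivative

Derivation:

$\begin{aligned} f(x)^{\prime}&=(x \cdot \operatorname{sigmoid}(x))^{\prime}\\ &=\operatorname{sigmoid}(x)+x \cdot \operatorname{sigmoid}(x)(1-\operatorname{sigmoid}(x))\\ &=\operatorname{sigmoid}(x)+x \cdot \operatorname{sigmoid}(x)-x \cdot \operatorname{sigmoid}^{2}(x)\\ &=f(x)+\operatorname{sigmoid}(x)(1-f(x)) \end{aligned}$

1.11.3 Code Implementation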

import numpy as np


def sigmoid(x):
    """Sigmoid function."""
    return 1 / (1 + np.exp(-x))


class SILU(object):

    def __call__(self, x):
        return x * sigmoid(x)

    def gradient(self, x):
        return self.__call__(x) + sigmoid(x) * (1 - self.__call__(x))
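
SiLU has the same form as Swish with $\beta=1$, and the expanded and compact forms of its derivative are algebraically identical. The sketch below checks both statements numerically; the function names and sample grid are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def silu(x):
    return x * sigmoid(x)


def swish(x, b):
    return x * sigmoid(b * x)


x = np.linspace(-3.0, 3.0, 7)

# SiLU coincides with Swish when b = 1.
same_function = np.allclose(silu(x), swish(x, 1.0))
print(same_function)

# Expanded form of the derivative versus the compact form f(x) + sigmoid(x) * (1 - f(x)).
expanded = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
compact = silu(x) + sigmoid(x) * (1 - silu(x))
print(np.allclose(expanded, compact))
```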

1.12 Complete Code

Create a file named activation_function.py and copy the code below into it. That completes the activation functions.

import numpy as np


# Collection of activation functions
# Reference: https://en.wikipedia.org/wiki/Activation_function

class Sigmoid():
    def __call__(self, x):
        return 1 / (1 + np.exp(-x))

    def gradient(self, x):
        return self.__call__(x) * (1 - self.__call__(x))


class Softmax():
    def __call__(self, x):
        e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
        return e_x / np.sum(e_x, axis=-1, keepdims=True)

    def gradient(self, x):
        p = self.__call__(x)
        return p * (1 - p)


class TanH():
    def __call__(self, x):
        return 2 / (1 + np.exp(-2 * x)) - 1

    def gradient(self, x):
        return 1 - np.power(self.__call__(x), 2)


class ReLU():
    def __call__(self, x):
        return np.where(x >= 0, x, 0)

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is 0, matching the derivative formula.
        return np.where(x > 0, 1, 0)


class LeakyReLU():
    def __init__(self, alpha=0.2):
        self.alpha = alpha

    def __call__(self, x):
        return np.where(x >= 0, x, self.alpha * x)

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is alpha, matching the derivative formula.
        return np.where(x > 0, 1, self.alpha)


class ELU(object):
    def __init__(self, alpha=0.1):
        self.alpha = alpha

    def __call__(self, x):
        return np.where(x >= 0.0, x, self.alpha * (np.exp(x) - 1))

    def gradient(self, x):
        return np.where(x >= 0.0, 1, self.__call__(x) + self.alpha)


class SELU():
    # Reference : https://arxiv.org/abs/1706.02515,
    # https://github.com/bioinf-jku/SNNs/blob/master/SelfNormalizingNetworks_MLP_MNIST.ipynb
    def __init__(self):
        self.alpha = 1.6732632423543772848170429916717
        self.scale = 1.0507009873554804934193349852946

    def __call__(self, x):
        return self.scale * np.where(x >= 0.0, x, self.alpha * (np.exp(x) - 1))

    def gradient(self, x):
        # Strict inequality so the gradient at x = 0 is scale * alpha, matching the derivative formula.
        return self.scale * np.where(x > 0.0, 1, self.alpha * np.exp(x))


class SoftPlus(object):
    def __call__(self, x):
        return np.log(1 + np.exp(x))

    def gradient(self, x):
        return 1 / (1 + np.exp(-x))


class Swish(object):
    def __init__(self, b):
        self.b = b

    def __call__(self, x):
        return x * (np.exp(self.b * x) / (np.exp(self.b * x) + 1))

    def gradient(self, x):
        # f'(x) = b * f(x) + sigmoid(b * x) * (1 - b * f(x)), with f(x) = x * sigmoid(b * x)
        sig = 1 / (1 + np.exp(-self.b * x))
        fx = x * sig
        return self.b * fx + sig * (1 - self.b * fx)


def sech(x):
    """Hyperbolic secant."""
    return 2 / (np.exp(x) + np.exp(-x))


def sigmoid(x):
    """Sigmoid function."""
    return 1 / (1 + np.exp(-x))


def soft_plus(x):
    """Softplus function."""
    return np.log(1 + np.exp(x))


def tan_h(x):
    """Hyperbolic tangent."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))


class Mish:

    def __call__(self, x):
        return x * tan_h(soft_plus(x))

    def gradient(self, x):
        return sech(soft_plus(x)) * sech(soft_plus(x)) * x * sigmoid(x) + tan_h(soft_plus(x))

class SILU(object):

    def __call__(self, x):
        return x * sigmoid(x)

    def gradient(self, x):
        return self.__call__(x) + sigmoid(x) * (1 - self.__call__(x))

Reference formulas
$(C)^{\prime}=0$
$\left(a^{x}\right)^{\prime}=a^{x} \ln a$
$\left(x^{\mu}\right)^{\prime}=\mu x^{\mu-1}$
$\left(e^{x}\right)^{\prime}=e^{x}$
$(\sin x)^{\prime}=\cos x$
$\left(\log _{a} x\right)^{\prime}=\frac{1}{x \ln a}$
$(\cos x)^{\prime}=-\sin x$
$(\ln x)^{\prime}=\frac{1}{x}$
$(\tan x)^{\prime}=\sec ^{2} x$
$(\arcsin x)^{\prime}=\frac{1}{\sqrt{1-x^{2}}}$
$(\cot x)^{\prime}=-\csc ^{2} x$
$(\arccos x)^{\prime}=-\frac{1}{\sqrt{1-x^{2}}}$
$(\sec x)^{\prime}=\sec x \cdot \tan x$
$(\arctan x)^{\prime}=\frac{1}{1+x^{2}}$
$(\csc x)^{\prime}=-\csc x \cdot \cot x$
$(\operatorname{arccot} x)^{\prime}=-\frac{1}{1+x^{2}}$

Hyperbolic sine: $\sinh x=\frac{e^{x}-e^{-x}}{2}$
Hyperbolic cosine: $\cosh x=\frac{e^{x}+e^{-x}}{2}$
Hyperbolic tangent: $\tanh x=\frac{\sinh x}{\cosh x}=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$
Hyperbolic cotangent: $\operatorname{coth} x=\frac{1}{\tanh x}=\frac{e^{x}+e^{-x}}{e^{x}-e^{-x}}$
Hyperbolic secant: $\operatorname{sech} x=\frac{1}{\cosh x}=\frac{2}{e^{x}+e^{-x}}$
Hyperbolic cosecant: $\operatorname{csch} x=\frac{1}{\sinh x}=\frac{2}{e^{x}-e^{-x}}$

