Deep Learning Chapter 01: Probability Theory for Machine Learning

Posted by 北山啦 on 2022/05/09 18:06:55

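Long time no see, everyone. This is 北山啦. Machine learning relies on a good deal of mathematics, and since I am setting out on the deep learning road again, I have gathered here the relevant material from postgraduate-entrance-exam mathematics together with the probability background that comes up most often in machine learning. I hope it is useful.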

Probability Theory and Mathematical Statistics

Random Events and Probability

1. Relations and operations on events

(1) Sub-event: $A \subset B$; if $A$ occurs, then $B$ occurs.

(2) Equal events: $A = B$, that is, $A \subset B$ and $B \subset A$.

(3) Union (sum) event: $A \cup B$ (or $A + B$); at least one of $A$ and $B$ occurs.

(4) Difference event: $A - B$; $A$ occurs but $B$ does not.

(5) Intersection (product) event: $A \cap B$ (or $AB$); $A$ and $B$ occur together.

(6) Mutually exclusive (incompatible) events: $A \cap B = \varnothing$.

(7) Complementary (opposite) events:
$A \cap B = \varnothing,\ A \cup B = \Omega,\ A = \bar{B},\ B = \bar{A}$
2. Laws of operation
(1) Commutative laws: $A \cup B = B \cup A,\ A \cap B = B \cap A$
(2) Associative laws: $(A \cup B) \cup C = A \cup (B \cup C),\ (A \cap B) \cap C = A \cap (B \cap C)$
(3) Distributive laws: $(A \cup B) \cap C = (A \cap C) \cup (B \cap C),\ (A \cap B) \cup C = (A \cup C) \cap (B \cup C)$
3. De Morgan's laws

$\overline{A \cup B} = \bar{A} \cap \bar{B},\quad \overline{A \cap B} = \bar{A} \cup \bar{B}$
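
The set identities above can be spot-checked with Python's built-in `set` type. This is only a minimal sketch on a made-up sample space (`omega`, `a`, `b`, `c` are illustrative choices, not from the original text); it verifies particular instances and is not a proof.

```python
# Hypothetical sample space and events, chosen only for illustration.
omega = set(range(1, 11))
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
c = {4, 6, 8, 10}

def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event

# De Morgan's laws
de_morgan_union = complement(a | b) == complement(a) & complement(b)
de_morgan_intersection = complement(a & b) == complement(a) | complement(b)

# Distributive laws
distributive_1 = (a | b) & c == (a & c) | (b & c)
distributive_2 = (a & b) | c == (a | c) & (b | c)

print(de_morgan_union, de_morgan_intersection, distributive_1, distributive_2)
```
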
4. Complete group of events (partition)

$A_1, A_2, \cdots, A_n$ are pairwise mutually exclusive and their union is the certain event, that is, $A_i \cap A_j = \varnothing\ (i \ne j)$ and $\bigcup_{i=1}^{n} A_i = \Omega$.

5. Basic probability formulas
(1) Conditional probability:
$P(B|A) = \frac{P(AB)}{P(A)}$ (with $P(A) > 0$), the probability that $B$ occurs given that $A$ has occurred.
(2) Law of total probability:
$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$, where $B_iB_j = \varnothing\ (i \ne j)$ and $\bigcup_{i=1}^{n} B_i = \Omega$
(3) Bayes' formula:

$P(B_j|A) = \frac{P(A|B_j)P(B_j)}{\sum_{i=1}^{n} P(A|B_i)P(B_i)},\ j = 1, 2, \cdots, n$
Note: in the formulas above, the number of events $B_i$ may be countably infinite.
(4) Multiplication formula:
$P(A_1A_2) = P(A_1)P(A_2|A_1) = P(A_2)P(A_1|A_2)$
$P(A_1A_2\cdots A_n) = P(A_1)P(A_2|A_1)P(A_3|A_1A_2)\cdots P(A_n|A_1A_2\cdots A_{n-1})$
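
As a rough illustration of the law of total probability and Bayes' formula, here is a small sketch. The function names and the numbers (a two-event partition with a 1% prior) are invented for the example and do not come from the original text.

```python
def total_probability(priors, likelihoods):
    """P(A) = sum_i P(A|B_i) P(B_i) for a partition B_1..B_n."""
    return sum(l * p for p, l in zip(priors, likelihoods))

def bayes_posterior(priors, likelihoods):
    """P(B_j|A) for every j, via Bayes' formula."""
    evidence = total_probability(priors, likelihoods)
    return [l * p / evidence for p, l in zip(priors, likelihoods)]

# Made-up numbers: B_1 = "has the condition", B_2 = "does not".
priors = [0.01, 0.99]        # P(B_1), P(B_2)
likelihoods = [0.95, 0.05]   # P(A|B_1), P(A|B_2), with A = "test is positive"

evidence = total_probability(priors, likelihoods)
posterior = bayes_posterior(priors, likelihoods)
print(evidence, posterior)
```

With these assumed numbers the posterior $P(B_1|A)$ stays well below the likelihood $P(A|B_1)$, because the prior $P(B_1)$ is small.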

6. Independence of events
(1) $A$ and $B$ are independent $\Leftrightarrow P(AB) = P(A)P(B)$
(2) $A$, $B$, $C$ are pairwise independent
$\Leftrightarrow P(AB) = P(A)P(B)$; $P(BC) = P(B)P(C)$; $P(AC) = P(A)P(C)$
(3) $A$, $B$, $C$ are mutually independent
$\Leftrightarrow P(AB) = P(A)P(B)$; $P(BC) = P(B)P(C)$;
$P(AC) = P(A)P(C)$; $P(ABC) = P(A)P(B)P(C)$
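
Pairwise independence does not imply mutual independence. The classic two-coin example can be checked by brute-force enumeration; the sketch below uses exact fractions, and the events `A`, `B`, `C` are an illustrative choice rather than something taken from the original text.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin tosses, each outcome has probability 1/4.
outcomes = list(product([0, 1], repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] == 1      # first toss is heads
B = lambda w: w[1] == 1      # second toss is heads
C = lambda w: w[0] != w[1]   # the two tosses differ

def both(e1, e2):
    """Intersection of two events."""
    return lambda w: e1(w) and e2(w)

pairwise = (
    prob(both(A, B)) == prob(A) * prob(B)
    and prob(both(B, C)) == prob(B) * prob(C)
    and prob(both(A, C)) == prob(A) * prob(C)
)
mutual = pairwise and prob(both(both(A, B), C)) == prob(A) * prob(B) * prob(C)

print(pairwise, mutual)
```

Here $P(ABC) = 0$ while $P(A)P(B)P(C) = 1/8$, so the three events are pairwise independent but not mutually independent.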

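7. Independent repeated trials

Repeat an experiment independently $n$ times. If event $A$ occurs with probability $p$ in each trial, then the probability that $A$ occurs exactly $k$ times in the $n$ trials (writing $X$ for the number of occurrences) is:
$P(X = k) = C_n^k p^k (1-p)^{n-k}$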
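
The formula for $k$ occurrences in $n$ independent trials translates almost directly into code. A minimal sketch, assuming Python 3.8+ for `math.comb`; the values `n = 10` and `p = 0.3` are arbitrary.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k occurrences in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative values: 10 trials, per-trial probability 0.3.
n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(sum(pmf))   # should be very close to 1
print(pmf[3])     # P(X = 3)
```
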
8. Important formulas and conclusions
(1) $P(\bar{A}) = 1 - P(A)$
(2) $P(A \cup B) = P(A) + P(B) - P(AB)$
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(AB) - P(BC) - P(AC) + P(ABC)$
(3) $P(A - B) = P(A) - P(AB)$
(4) $P(A\bar{B}) = P(A) - P(AB),\ P(A) = P(AB) + P(A\bar{B}),$
$P(A \cup B) = P(A) + P(\bar{A}B) = P(AB) + P(A\bar{B}) + P(\bar{A}B)$
(5) The conditional probability $P(\cdot|B)$ satisfies all the properties of a probability,
for example: $P(\bar{A}_1|B) = 1 - P(A_1|B)$
$P(A_1 \cup A_2|B) = P(A_1|B) + P(A_2|B) - P(A_1A_2|B)$
$P(A_1A_2|B) = P(A_1|B)P(A_2|A_1B)$
(6) If $A_1, A_2, \cdots, A_n$ are mutually independent, then $P\left(\bigcap_{i=1}^{n} A_i\right) = \prod_{i=1}^{n} P(A_i),$
$P\left(\bigcup_{i=1}^{n} A_i\right) = 1 - \prod_{i=1}^{n} (1 - P(A_i))$
(7) Relations between mutual exclusivity, complementarity and independence:
$A$ and $B$ complementary $\Rightarrow$ $A$ and $B$ mutually exclusive, but the converse does not hold. If $A$ and $B$ are mutually exclusive (or complementary) and both have nonzero probability, then $A$ and $B$ are not independent.
(8) If $A_1, A_2, \cdots, A_m, B_1, B_2, \cdots, B_n$ are mutually independent, then $f(A_1, A_2, \cdots, A_m)$ and $g(B_1, B_2, \cdots, B_n)$ are also independent, where $f(\cdot)$ and $g(\cdot)$ denote the events obtained by applying arbitrary event operations to the respective events. In addition, an event with probability 1 (or 0) is independent of every event.

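Random Variables and Their Probability Distributions

1. Random variables and probability distributions

A random variable is a variable whose value is random; strictly speaking, it is a real-valued function defined on the sample space. "Probability distribution" usually refers to the distribution function or to the distribution law (probability mass function).

2. Distribution function: definition and properties

Definition: $F(x) = P(X \leq x),\ -\infty < x < +\infty$

Properties: (1) $0 \leq F(x) \leq 1$

(2) $F(x)$ is monotonically non-decreasing

(3) Right-continuous: $F(x + 0) = F(x)$

(4) $F(-\infty) = 0,\ F(+\infty) = 1$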

3. Probability distribution of a discrete random variable

$P(X = x_i) = p_i,\ i = 1, 2, \cdots, n, \cdots \quad\quad p_i \geq 0,\ \sum_{i=1}^{\infty} p_i = 1$

4. Probability density of a continuous random variable

The probability density $f(x)$ is non-negative and integrable, and:

(1) $f(x) \geq 0$

(2) $\int_{-\infty}^{+\infty} f(x)\,dx = 1$

(3) If $x$ is a point of continuity of $f(x)$, then:

$f(x) = F'(x)$, and the distribution function is $F(x) = \int_{-\infty}^{x} f(t)\,dt$

5. Common distributions

(1) 0-1 (Bernoulli) distribution: $P(X = k) = p^k (1-p)^{1-k},\ k = 0, 1$

(2) Binomial distribution $B(n,p)$: $P(X = k) = C_n^k p^k (1-p)^{n-k},\ k = 0, 1, \cdots, n$

(3) Poisson distribution $P(\lambda)$: $P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda},\ \lambda > 0,\ k = 0, 1, 2, \cdots$

(4) Uniform distribution $U(a,b)$: $f(x) = \begin{cases} \frac{1}{b-a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$

(5) Normal distribution $N(\mu,\sigma^2)$: $\varphi(x) = \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}},\ \sigma > 0,\ -\infty < x < +\infty$

(6) Exponential distribution $E(\lambda)$: $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}\quad (\lambda > 0)$

(7) Geometric distribution $G(p)$: $P(X = k) = (1-p)^{k-1} p,\ 0 < p < 1,\ k = 1, 2, \cdots$

(8) Hypergeometric distribution $H(N,M,n)$: $P(X = k) = \frac{C_M^k C_{N-M}^{n-k}}{C_N^n},\ k = 0, 1, \cdots, \min(n, M)$
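
A quick sanity check on the discrete distributions listed above is that each probability mass function sums to 1. The sketch below does this for the hypergeometric (exactly, with fractions), Poisson and geometric cases (numerically, with truncated sums); the parameter values and the truncation points are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import comb, exp, factorial

def hypergeometric_pmf(k, N, M, n):
    """P(X = k): draw n of N items without replacement, M of the N are marked."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

def poisson_pmf(k, lam):
    """P(X = k) for the Poisson distribution with parameter lam."""
    return lam**k / factorial(k) * exp(-lam)

def geometric_pmf(k, p):
    """P(X = k): first success on trial k, k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

# Illustrative parameters.
N, M, n = 20, 7, 5
hyper_total = sum(hypergeometric_pmf(k, N, M, n) for k in range(min(n, M) + 1))
poisson_total = sum(poisson_pmf(k, 2.5) for k in range(60))         # truncated sum
geometric_total = sum(geometric_pmf(k, 0.2) for k in range(1, 400))  # truncated sum

print(hyper_total, poisson_total, geometric_total)
```
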

6. Probability distribution of a function of a random variable

(1) Discrete case: $P(X = x_i) = p_i,\ Y = g(X)$

Then: $P(Y = y_j) = \sum_{g(x_i) = y_j} P(X = x_i)$

(2) Continuous case: $X \sim f_X(x),\ Y = g(X)$

Then: $F_Y(y) = P(Y \leq y) = P(g(X) \leq y) = \int_{g(x) \leq y} f_X(x)\,dx,\quad f_Y(y) = F'_Y(y)$
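
For the discrete case, the rule "add up the probabilities of all $x_i$ that map to the same $y_j$" can be sketched as follows. The helper name `distribution_of_function` and the example ($X$ uniform on five points, $Y = X^2$) are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction

def distribution_of_function(pmf_x, g):
    """Given P(X = x) as a dict and a function g, return P(Y = y) for Y = g(X)."""
    pmf_y = defaultdict(Fraction)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p   # all x_i with g(x_i) = y_j contribute to P(Y = y_j)
    return dict(pmf_y)

# Illustrative example: X uniform on {-2, -1, 0, 1, 2}, Y = X^2.
pmf_x = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
pmf_y = distribution_of_function(pmf_x, lambda x: x * x)
print(pmf_y)
```
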

7. Important formulas and conclusions

(1) $X \sim N(0,1) \Rightarrow \varphi(0) = \frac{1}{\sqrt{2\pi}},\ \Phi(0) = \frac{1}{2},\ \Phi(-a) = P(X \leq -a) = 1 - \Phi(a)$

(2) $X \sim N(\mu,\sigma^2) \Rightarrow \frac{X-\mu}{\sigma} \sim N(0,1),\ P(X \leq a) = \Phi\left(\frac{a-\mu}{\sigma}\right)$

(3) $X \sim E(\lambda) \Rightarrow P(X > s+t \mid X > s) = P(X > t)$

(4) $X \sim G(p) \Rightarrow P(X = m+k \mid X > m) = P(X = k)$

(5) The distribution function of a discrete random variable is a step function with jump discontinuities; the distribution function of a continuous random variable is continuous, but not necessarily differentiable everywhere.

(6) There exist random variables that are neither discrete nor continuous.
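
Several of these facts can be checked numerically. The sketch below expresses $\Phi$ through the error function from Python's `math` module and verifies the symmetry of $\Phi$, the standardization rule, and the memoryless property of the exponential distribution; the helper names and numeric values are illustrative assumptions.

```python
from math import erf, exp, sqrt

def std_normal_cdf(x):
    """Phi(x) for the standard normal, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def normal_cdf(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma^2), obtained by standardizing."""
    return std_normal_cdf((a - mu) / sigma)

def exp_survival(t, lam):
    """P(X > t) for X ~ E(lam), t >= 0."""
    return exp(-lam * t)

# Symmetry: Phi(0) = 1/2 and Phi(-a) + Phi(a) = 1.
symmetry_sum = std_normal_cdf(-1.2) + std_normal_cdf(1.2)
print(std_normal_cdf(0.0), symmetry_sum)

# Memoryless property with illustrative values lam = 0.5, s = 2, t = 3.
lam, s, t = 0.5, 2.0, 3.0
conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
print(conditional, exp_survival(t, lam))
```
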

Multidimensional Random Variables and Their Distributions

1. Two-dimensional random variables and their joint distribution

A random vector $(X,Y)$ formed from two random variables has the joint distribution function $F(x,y) = P(X \leq x, Y \leq y)$

2. Distribution of a two-dimensional discrete random variable

(1) Joint distribution law: $P\{X = x_i, Y = y_j\} = p_{ij};\ i, j = 1, 2, \cdots$

(2) Marginal distribution laws: $p_{i\cdot} = \sum_{j=1}^{\infty} p_{ij},\ i = 1, 2, \cdots$; $p_{\cdot j} = \sum_{i=1}^{\infty} p_{ij},\ j = 1, 2, \cdots$

(3) Conditional distribution laws: $P\{X = x_i \mid Y = y_j\} = \frac{p_{ij}}{p_{\cdot j}}$
$P\{Y = y_j \mid X = x_i\} = \frac{p_{ij}}{p_{i\cdot}}$
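
Marginal and conditional distribution laws are just row sums, column sums and ratios of a joint table. A small sketch with a hypothetical 2-by-3 joint table (the numbers are made up) might look like this:

```python
from fractions import Fraction

# Hypothetical joint table: joint[i][j] = P(X = x_i, Y = y_j).
joint = [
    [Fraction(1, 8), Fraction(1, 8), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 4)],
]

# Marginals: sum over the other index.
p_x = [sum(row) for row in joint]         # p_{i.}
p_y = [sum(col) for col in zip(*joint)]   # p_{.j}

def conditional_x_given_y(j):
    """P(X = x_i | Y = y_j) for all i; requires p_{.j} > 0."""
    return [joint[i][j] / p_y[j] for i in range(len(joint))]

print(p_x, p_y, conditional_x_given_y(0))
```
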

3. Density of a two-dimensional continuous random variable

(1) Joint probability density $f(x,y)$:

  1. $f(x,y) \geq 0$

  2. $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y)\,dx\,dy = 1$

(2) Distribution function: $F(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u,v)\,dv\,du$

(3) Marginal probability densities: $f_X(x) = \int_{-\infty}^{+\infty} f(x,y)\,dy$, $f_Y(y) = \int_{-\infty}^{+\infty} f(x,y)\,dx$

(4) Conditional probability densities: $f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}$, $f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}$

4. Common joint distributions of two-dimensional random variables

(1) Two-dimensional uniform distribution: $(X,Y) \sim U(D)$, $f(x,y) = \begin{cases} \frac{1}{S(D)}, & (x,y) \in D \\ 0, & \text{otherwise} \end{cases}$, where $S(D)$ is the area of the region $D$.

(2) Two-dimensional normal distribution: $(X,Y) \sim N(\mu_1,\mu_2,\sigma_1^2,\sigma_2^2,\rho)$,

$f(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ \frac{-1}{2(1-\rho^2)} \left[ \frac{(x-\mu_1)^2}{\sigma_1^2} - 2\rho\frac{(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} \right] \right\}$

5. Independence and correlation of random variables

$X$ and $Y$ are independent $\Leftrightarrow F(x,y) = F_X(x)F_Y(y)$:

$\Leftrightarrow p_{ij} = p_{i\cdot} \cdot p_{\cdot j}$ (discrete case)
$\Leftrightarrow f(x,y) = f_X(x)f_Y(y)$ (continuous case)

Correlation of $X$ and $Y$:

When the correlation coefficient $\rho_{XY} = 0$, $X$ and $Y$ are said to be uncorrelated;
otherwise $X$ and $Y$ are said to be correlated.

6. Probability distribution of simple functions of two random variables

Discrete case: $P(X = x_i, Y = y_j) = p_{ij},\ Z = g(X,Y)$. Then:

$P(Z = z_k) = P\{g(X,Y) = z_k\} = \sum_{g(x_i,y_j) = z_k} P(X = x_i, Y = y_j)$

Continuous case: $(X,Y) \sim f(x,y),\ Z = g(X,Y)$.
Then:

$F_Z(z) = P\{g(X,Y) \leq z\} = \iint_{g(x,y) \leq z} f(x,y)\,dx\,dy,\quad f_Z(z) = F'_Z(z)$

7. Important formulas and conclusions

(1) Marginal density formulas: $f_X(x) = \int_{-\infty}^{+\infty} f(x,y)\,dy,$
$f_Y(y) = \int_{-\infty}^{+\infty} f(x,y)\,dx$

(2) $P\{(X,Y) \in D\} = \iint_{D} f(x,y)\,dx\,dy$

(3) If $(X,Y)$ follows the two-dimensional normal distribution $N(\mu_1,\mu_2,\sigma_1^2,\sigma_2^2,\rho)$,
then:

  1. $X \sim N(\mu_1,\sigma_1^2),\ Y \sim N(\mu_2,\sigma_2^2).$

  2. $X$ and $Y$ are independent $\Leftrightarrow \rho = 0$, that is, $X$ and $Y$ are uncorrelated.

  3. $C_1X + C_2Y \sim N(C_1\mu_1 + C_2\mu_2,\ C_1^2\sigma_1^2 + C_2^2\sigma_2^2 + 2C_1C_2\sigma_1\sigma_2\rho)$

  4. The conditional distribution of $X$ given $Y = y$ is: $N\left(\mu_1 + \rho\frac{\sigma_1}{\sigma_2}(y - \mu_2),\ \sigma_1^2(1-\rho^2)\right)$

  5. The conditional distribution of $Y$ given $X = x$ is: $N\left(\mu_2 + \rho\frac{\sigma_2}{\sigma_1}(x - \mu_1),\ \sigma_2^2(1-\rho^2)\right)$

(4) If $X$ and $Y$ are independent and follow $N(\mu_1,\sigma_1^2)$ and $N(\mu_2,\sigma_2^2)$ respectively,
then: $(X,Y) \sim N(\mu_1,\mu_2,\sigma_1^2,\sigma_2^2,0),$

$C_1X + C_2Y \sim N(C_1\mu_1 + C_2\mu_2,\ C_1^2\sigma_1^2 + C_2^2\sigma_2^2).$

(5) If $X$ and $Y$ are independent and $f(x)$, $g(x)$ are continuous functions, then $f(X)$ and $g(Y)$ are also independent.
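
The parameter formulas for a two-dimensional normal distribution (the linear combination $C_1X + C_2Y$ and the conditional distribution of $X$ given $Y = y$) are easy to wrap in small helpers. This is a sketch only: the function names are invented here, variances are passed as `var1 = sigma_1^2` and `var2 = sigma_2^2`, and the numbers are arbitrary.

```python
from math import sqrt

def linear_combination_params(c1, c2, mu1, mu2, var1, var2, rho):
    """Mean and variance of C1*X + C2*Y for (X, Y) two-dimensional normal."""
    mean = c1 * mu1 + c2 * mu2
    var = c1**2 * var1 + c2**2 * var2 + 2 * c1 * c2 * sqrt(var1 * var2) * rho
    return mean, var

def conditional_x_given_y_params(y, mu1, mu2, var1, var2, rho):
    """Mean and variance of the conditional distribution of X given Y = y."""
    mean = mu1 + rho * sqrt(var1 / var2) * (y - mu2)
    var = var1 * (1 - rho**2)
    return mean, var

# Illustrative parameters: mu1 = 1, mu2 = 2, sigma1^2 = 4, sigma2^2 = 9, rho = 0.5.
diff_mean, diff_var = linear_combination_params(1, -1, 1.0, 2.0, 4.0, 9.0, 0.5)
cond_mean, cond_var = conditional_x_given_y_params(5.0, 1.0, 2.0, 4.0, 9.0, 0.5)
print(diff_mean, diff_var, cond_mean, cond_var)
```

Setting `rho = 0` in `linear_combination_params` recovers the independent case, where the variance is simply $C_1^2\sigma_1^2 + C_2^2\sigma_2^2$.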

Numerical Characteristics of Random Variables

1. Mathematical expectation

Discrete case: $P\{X = x_i\} = p_i,\ E(X) = \sum_{i} x_i p_i$

Continuous case: $X \sim f(x),\ E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx$

Properties:

(1) $E(C) = C,\ E[E(X)] = E(X)$

(2) $E(C_1X + C_2Y) = C_1E(X) + C_2E(Y)$

(3) If $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$

(4) $[E(XY)]^2 \leq E(X^2)E(Y^2)$

2. Variance: $D(X) = E\left\{[X - E(X)]^2\right\} = E(X^2) - [E(X)]^2$

3. Standard deviation: $\sqrt{D(X)}$

4. Variance, discrete case: $D(X) = \sum_{i} [x_i - E(X)]^2 p_i$

5. Variance, continuous case: $D(X) = \int_{-\infty}^{+\infty} [x - E(X)]^2 f(x)\,dx$

Properties:

(1) $D(C) = 0,\ D[E(X)] = 0,\ D[D(X)] = 0$

(2) If $X$ and $Y$ are independent, then $D(X \pm Y) = D(X) + D(Y)$

(3) $D(C_1X + C_2) = C_1^2 D(X)$

(4) In general, $D(X \pm Y) = D(X) + D(Y) \pm 2\mathrm{Cov}(X,Y) = D(X) + D(Y) \pm 2\rho\sqrt{D(X)}\sqrt{D(Y)}$

(5) $D(X) < E\left[(X - C)^2\right]$ for $C \neq E(X)$

(6) $D(X) = 0 \Leftrightarrow P\{X = C\} = 1$ for some constant $C$
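
For a discrete distribution, the definitions of expectation and variance and two of the properties above can be checked exactly with fractions. The fair die and the constants $C_1 = 3$, $C_2 = 5$ below are illustrative choices, and the helper names are made up for this sketch.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """D(X) = E[(X - E(X))^2]."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)

# Illustrative example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean = expectation(die)
var = variance(die)
shortcut = expectation(die, lambda x: x * x) - mean**2   # E(X^2) - [E(X)]^2

# D(C1*X + C2) with C1 = 3, C2 = 5 should equal C1^2 * D(X).
scaled = {3 * x + 5: p for x, p in die.items()}
print(mean, var, shortcut, variance(scaled))
```
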

6. Expectation of a function of random variables

(1) For a function $Y = g(X)$:

$X$ discrete: $P\{X = x_i\} = p_i,\ E(Y) = \sum_{i} g(x_i)p_i$

$X$ continuous: $X \sim f(x),\ E(Y) = \int_{-\infty}^{+\infty} g(x)f(x)\,dx$

(2) For $Z = g(X,Y)$: in the discrete case, $(X,Y) \sim P\{X = x_i, Y = y_j\} = p_{ij}$ and $E(Z) = \sum_{i}\sum_{j} g(x_i,y_j)p_{ij}$; in the continuous case, $(X,Y) \sim f(x,y)$ and $E(Z) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} g(x,y)f(x,y)\,dx\,dy$

7. Covariance

$\mathrm{Cov}(X,Y) = E\left[(X - E(X))(Y - E(Y))\right]$

8. Correlation coefficient

$\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{D(X)}\sqrt{D(Y)}}$

Moments: the $k$-th raw (origin) moment is $E(X^k)$;
the $k$-th central moment is $E\left\{[X - E(X)]^k\right\}$

Properties:

(1) $\mathrm{Cov}(X,Y) = \mathrm{Cov}(Y,X)$

(2) $\mathrm{Cov}(aX,bY) = ab\,\mathrm{Cov}(X,Y)$

(3) $\mathrm{Cov}(X_1 + X_2,Y) = \mathrm{Cov}(X_1,Y) + \mathrm{Cov}(X_2,Y)$

(4) $\left|\rho(X,Y)\right| \leq 1$

(5) $\rho(X,Y) = 1 \Leftrightarrow P(Y = aX + b) = 1$ for some $a > 0$

$\rho(X,Y) = -1 \Leftrightarrow P(Y = aX + b) = 1$ for some $a < 0$

9. Important formulas and conclusions

(1) $D(X) = E(X^2) - E^2(X)$

(2) $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y)$

(3) $\left|\rho(X,Y)\right| \leq 1$; $\rho(X,Y) = 1 \Leftrightarrow P(Y = aX + b) = 1$ for some $a > 0$

$\rho(X,Y) = -1 \Leftrightarrow P(Y = aX + b) = 1$ for some $a < 0$

(4) The following five conditions are equivalent:

$\rho(X,Y) = 0 \Leftrightarrow \mathrm{Cov}(X,Y) = 0 \Leftrightarrow E(XY) = E(X)E(Y) \Leftrightarrow D(X + Y) = D(X) + D(Y) \Leftrightarrow D(X - Y) = D(X) + D(Y)$

Note: independence of $X$ and $Y$ is a sufficient condition for each of the five conditions above, but not a necessary one.
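
A standard counterexample shows that zero covariance does not imply independence: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. The sketch below (an illustrative example, not from the original text) computes the covariance exactly and compares one joint probability with the product of its marginals.

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2, so Y is completely determined by X.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0_y0 = joint.get((0, 0), Fraction(0))
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)

print(cov, p_x0_y0, p_x0 * p_y0)
```

The covariance is exactly 0, yet $P(X=0, Y=0) = 1/3$ differs from $P(X=0)P(Y=0) = 1/9$, so $X$ and $Y$ are uncorrelated but not independent.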

Basic Concepts of Mathematical Statistics

1. Basic concepts

Population: the totality of the objects under study; it is a random variable, denoted $X$.

Individual: each basic element that makes up the population.

Simple random sample: $n$ random variables $X_1, X_2, \cdots, X_n$ drawn from the population $X$ that are mutually independent and have the same distribution as the population are called a simple random sample of size $n$, or simply a sample.

Statistic: let $X_1, X_2, \cdots, X_n$ be a sample from the population $X$ and let $g(X_1, X_2, \cdots, X_n)$ be a continuous function of the sample. If $g(\cdot)$ contains no unknown parameters, then $g(X_1, X_2, \cdots, X_n)$ is called a statistic.

Sample mean: $\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2$

Sample moments: the sample $k$-th raw moment: $A_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k,\ k = 1, 2, \cdots$

The sample $k$-th central moment: $B_k = \frac{1}{n}\sum_{i=1}^{n} (X_i - \overline{X})^k,\ k = 1, 2, \cdots$
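
Python's standard `statistics` module computes the sample mean and the sample variance with the $n - 1$ divisor, which makes the difference between $S^2$ and the second central moment $B_2$ easy to see. The observations below are made up, and the two moment helpers are hypothetical names for this sketch.

```python
import statistics

def sample_raw_moment(xs, k):
    """A_k = (1/n) * sum(x_i^k)."""
    return sum(x**k for x in xs) / len(xs)

def sample_central_moment(xs, k):
    """B_k = (1/n) * sum((x_i - mean)^k)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)

# Made-up observations.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
mean = statistics.mean(xs)
s2 = statistics.variance(xs)        # sample variance, divides by n - 1
b2 = sample_central_moment(xs, 2)   # second central moment, divides by n

print(mean, s2, b2)
```

The two quantities are related by $S^2 = \frac{n}{n-1}B_2$.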

2. Distributions

$\chi^2$ distribution: $\chi^2 = X_1^2 + X_2^2 + \cdots + X_n^2 \sim \chi^2(n)$, where $X_1, X_2, \cdots, X_n$ are mutually independent and all follow $N(0,1)$.

$t$ distribution: $T = \frac{X}{\sqrt{Y/n}} \sim t(n)$, where $X \sim N(0,1)$, $Y \sim \chi^2(n)$, and $X$, $Y$ are independent.

$F$ distribution: $F = \frac{X/n_1}{Y/n_2} \sim F(n_1,n_2)$, where $X \sim \chi^2(n_1)$, $Y \sim \chi^2(n_2)$, and $X$, $Y$ are independent.

Quantile: if $P(X \leq x_\alpha) = \alpha$, then $x_\alpha$ is called the $\alpha$ quantile of $X$.

3. Common sampling distributions for a normal population

(1) Let $X_1, X_2, \cdots, X_n$ be a sample from the normal population $N(\mu,\sigma^2)$, with

$\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,\ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2.$ Then:

  1. $\overline{X} \sim N\left(\mu,\frac{\sigma^2}{n}\right)$, or equivalently $\frac{\overline{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

  2. $\frac{(n-1)S^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n} (X_i - \overline{X})^2 \sim \chi^2(n-1)$

  3. $\frac{1}{\sigma^2}\sum_{i=1}^{n} (X_i - \mu)^2 \sim \chi^2(n)$

  4. $\frac{\overline{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$

4. Important formulas and conclusions

(1) For $\chi^2 \sim \chi^2(n)$: $E(\chi^2(n)) = n,\ D(\chi^2(n)) = 2n$;

(2) For $T \sim t(n)$: $E(T) = 0\ (n > 1),\ D(T) = \frac{n}{n-2}\ (n > 2)$

(3) For $F \sim F(m,n)$: $\frac{1}{F} \sim F(n,m),\ F_{\alpha/2}(m,n) = \frac{1}{F_{1-\alpha/2}(n,m)}$;

(4) For any population $X$ (with finite variance): $E(\overline{X}) = E(X),\ E(S^2) = D(X),\ D(\overline{X}) = \frac{D(X)}{n}$
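
The three identities $E(\overline{X}) = E(X)$, $E(S^2) = D(X)$ and $D(\overline{X}) = D(X)/n$ can be verified exactly for a small discrete population by enumerating every possible sample. The population distribution and the sample size $n = 3$ below are arbitrary choices for this sketch.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population distribution {value: probability}.
population = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}
n = 3  # sample size

pop_mean = sum(x * p for x, p in population.items())
pop_var = sum((x - pop_mean) ** 2 * p for x, p in population.items())

e_mean = e_mean_sq = e_s2 = Fraction(0)
for sample in product(population, repeat=n):
    prob = Fraction(1)
    for x in sample:
        prob *= population[x]   # independent draws: probabilities multiply
    xbar = Fraction(sum(sample), n)
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    e_mean += prob * xbar
    e_mean_sq += prob * xbar**2
    e_s2 += prob * s2

var_of_mean = e_mean_sq - e_mean**2
print(e_mean == pop_mean, e_s2 == pop_var, var_of_mean == pop_var / n)
```

Because the arithmetic uses `Fraction`, the equalities hold exactly rather than up to rounding error.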
