# Serence

## Machine Learning - Logistic Regression

Logistic regression, also called logistic regression analysis, is a generalized linear model commonly used in data mining, automated disease diagnosis, economic forecasting, and similar fields.

# Main Content

## Prediction Function: Binary Classification

$g(z) = \frac{1}{1 + e^{-z}}$

$h_\theta(x) = \theta^T x$

$h_\theta(x) = g(z) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$

$g(\theta_1 x + \theta_2) = \frac{1}{1 + e^{-(\theta_1 x + \theta_2)}}$

Here $h_\theta(x)$ is the probability that $y = 1$ given the input $x$ and the parameters $\theta$. In probability notation:

$h_\theta(x) = P(y = 1 \mid x; \theta)$
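The hypothesis above can be sketched in a few lines of NumPy; the values of `theta` and `x` below are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): the estimate of P(y=1 | x; theta)."""
    return sigmoid(np.dot(theta, x))

# g(0) = 0.5, and g saturates toward 0 and 1 at the extremes.
print(sigmoid(0.0))            # 0.5
theta = np.array([2.0, -1.0])  # hypothetical parameters
x = np.array([1.0, 0.5])       # hypothetical input (first entry is the bias term)
print(hypothesis(theta, x))    # g(2 - 0.5) = g(1.5)
```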

## Decision Boundary

Since $g(z) \ge 0.5$ exactly when $z \ge 0$, the model predicts $y = 1$ when $\theta^T x \ge 0$ and $y = 0$ otherwise. The decision boundary is therefore the set of points where

$\theta^T x = 0$
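Thresholding $h_\theta(x)$ at $0.5$ is equivalent to checking the sign of $\theta^T x$; a minimal sketch with a made-up `theta`:

```python
import numpy as np

def predict(theta, x):
    """Predict y=1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0."""
    return 1 if np.dot(theta, x) >= 0 else 0

theta = np.array([-3.0, 1.0, 1.0])  # hypothetical: boundary is x1 + x2 = 3
print(predict(theta, np.array([1.0, 2.0, 2.0])))  # x1 + x2 = 4 >= 3 -> 1
print(predict(theta, np.array([1.0, 1.0, 1.0])))  # x1 + x2 = 2 <  3 -> 0
```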

## Cost Function

$\mathrm{cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$

$\mathrm{cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$

$J(\theta) = -\frac{1}{m} \sum_{i=1}^m \left[ y^{(i)} \log\left( h_\theta(x^{(i)}) \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right]$
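The cross-entropy cost $J(\theta)$ is straightforward to vectorize; the tiny data set below is made up, chosen so the expected value at $\theta = 0$ is easy to check by hand:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Cross-entropy cost J(theta), averaged over the m training examples.

    X has shape (m, n) with a leading column of ones; y holds 0/1 labels.
    """
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# With theta = 0 we get h = 0.5 for every example, so
# J(0) = -log(0.5) = log(2), regardless of the labels.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0])
print(cost(np.zeros(2), X, y))  # ~0.6931
```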

## Gradient Descent


$\theta_1 = \theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta)$

$\theta_0 = \theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta)$

$\theta_1 = \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$

$\theta_0 = \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
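The two updates above collapse into one vectorized step when the first column of $X$ is all ones. A minimal batch gradient descent sketch on a made-up, linearly separable 1-D data set (the learning rate and iteration count are arbitrary choices, not tuned values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    """Batch gradient descent: theta -= alpha/m * X^T (h - y).

    The same vectorized update covers theta_0 because X's first
    column is all ones.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m
        theta -= alpha * grad
    return theta

# Hypothetical data: points with x >= 2.5 are labelled 1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
print(sigmoid(X @ theta))  # larger x gives larger predicted probability
```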

The gradient above relies on the derivative of the sigmoid:

$$\begin{aligned}
\sigma'(x) &= \left( \frac{1}{1+e^{-x}} \right)' = \frac{-\left( 1+e^{-x} \right)'}{\left( 1+e^{-x} \right)^2} = \frac{-(-1)\, e^{-x}}{\left( 1+e^{-x} \right)^2} = \frac{e^{-x}}{\left( 1+e^{-x} \right)^2} \\
&= \left( \frac{1}{1+e^{-x}} \right) \left( \frac{e^{-x}}{1+e^{-x}} \right) = \sigma(x) \left( \frac{1+e^{-x}}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right) \\
&= \sigma(x) \left( 1 - \sigma(x) \right)
\end{aligned}$$
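The identity $\sigma'(x) = \sigma(x)(1 - \sigma(x))$ can be checked numerically against a central finite difference:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Closed form from the derivation: sigma'(x) = sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Compare with (sigma(x+eps) - sigma(x-eps)) / (2*eps) at a few points.
eps = 1e-6
for x in (-2.0, 0.0, 1.5):
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    print(x, sigmoid_grad(x), numeric)
```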

## Regularization

### Regularized Linear Regression

$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^n \theta_j^2 \right]$

$\theta_j = \theta_j - \alpha \left[ \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j \right]$

### Regularized Logistic Regression

$J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^m y^{(i)} \log\left( h_\theta(x^{(i)}) \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right] + \frac{\lambda}{2m} \sum_{j=1}^n \theta_j^2$

$\theta_j = \theta_j \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
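One regularized gradient step can be sketched as follows. Note the penalty sum runs over $j = 1, \dots, n$, so the intercept $\theta_0$ is not shrunk; the data, `alpha`, and `lam` below are made-up values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_step(theta, X, y, alpha, lam):
    """One gradient step with L2 regularization (theta_0 is not penalized)."""
    m = X.shape[0]
    grad = X.T @ (sigmoid(X @ theta) - y) / m
    reg = (lam / m) * theta
    reg[0] = 0.0  # convention: do not shrink the intercept
    return theta - alpha * (grad + reg)

# Starting from theta = 0, the penalty gradient lambda/m * theta is zero,
# so the first step equals the unregularized one.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 1.0])
theta = regularized_step(np.zeros(2), X, y, alpha=0.1, lam=1.0)
print(theta)
```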

## Norms

The L1 norm (the L1 regularization term) is the sum of the absolute values of the vector's elements:

$\|\theta\|_1 = |\theta_1| + |\theta_2|$

The L2 norm (the L2 regularization term) is the square root of the sum of the squares of the elements:

$\|\theta\|_2 = \sqrt{\theta_1^2 + \theta_2^2}$

Used as a regularizer, the L1 norm makes the model parameters sparse, i.e. it pushes as many elements of the parameter vector to zero as possible. The L2 norm instead keeps the parameters small but nonzero, so that every feature still contributes something to the prediction.
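Both norms are easy to compute directly or via `np.linalg.norm`; the vector below is a made-up example chosen so the results are exact:

```python
import numpy as np

theta = np.array([3.0, -4.0])

l1 = np.sum(np.abs(theta))        # |3| + |-4| = 7
l2 = np.sqrt(np.sum(theta ** 2))  # sqrt(9 + 16) = 5

print(l1, l2)  # 7.0 5.0
# np.linalg.norm computes the same quantities for ord=1 and ord=2.
print(np.linalg.norm(theta, 1), np.linalg.norm(theta, 2))
```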
