广西科学  2018, Vol. 25 Issue (4): 428-432   PDF    
带非欧氏范数的双稳定束方法
欧小梅, 唐春明     
广西大学数学与信息科学学院, 广西南宁 530004
摘要: 针对一类非光滑凸优化问题, 提出一个带非欧氏范数的双稳定束方法.通过利用邻近函数代替传统的欧氏距离, 形成更具广泛性的双稳定子问题, 进而在计算上可充分利用可行集的几何结构, 加快收敛速度、减少计算量.分析论证了算法的全局收敛性, 当下降步有限时, 最后一个稳定中心即为问题的最优解; 当下降步无限时, 稳定中心点列任意的聚点均为问题的最优解.该方法将传统邻近束方法和水平束方法的稳定性有机融合, 从而具备更优越的理论性质和更稳定的数值效果.
关键词: 双稳定束方法     邻近函数     非光滑优化     全局收敛    
A Doubly Stabilized Bundle Method with Non-Euclidean Norm
OU Xiaomei , TANG Chunming     
College of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, 530004, China
Abstract: In this paper, a doubly stabilized bundle method with non-Euclidean norm was proposed to solve a class of non-smooth convex optimization problems.By using the proximity function to replace the traditional Euclidean distance, a more general doubly stabilized sub-problem was formed, and the geometric structure of the feasible set could be fully utilized in the calculation to speed up the convergence rate and reduce the computational cost.The global convergence of the algorithm was analyzed and demonstrated.When the number of descent steps was finite, the last stable center was the optimal solution of the problem.When the number of descent steps was infinite, each accumulation point of the sequence of stable centers was the optimal solution of the problem.This method well combined the stability of traditional proximal bundle method and level bundle method, so that it had more superior theoretical properties and more stable numerical effects.
Key words: oubly stabilized bundle method     proximal function     non-smooth optimization     global convergence    
0 引言

考虑求解如下非光滑凸优化问题

$ {f^ * } = \inf \left\{ {f\left( x \right):x \in X} \right\}, $ (0.1)

其中, f:RnR为非光滑凸函数, X$ \subseteq $Rn为非空闭凸集.

非光滑优化是最优化研究[1-3]的重要分支, 其在数据挖掘[4]、图像恢复[5]及机器学习[6]等领域有着广泛应用.束方法是求解非光滑优化最有效的方法之一[7-9].传统束方法包括邻近束方法[10]、水平束方法[11]和信赖域束方法[12]等.最近, De Oliveira和Solodov[13]结合邻近束方法与水平束方法的稳定性, 提出求解问题(0.1)的一个双稳定束方法, 其通过求解如下子问题产生新迭代点xk+1

$ \begin{array}{l} \min \\ \left\{ {\mathop {{f_k}}\limits^ \vee \left( x \right) + \frac{1}{{2{\tau _k}}}{{\left\| {x - {{\hat x}_k}} \right\|}^2}:\mathop {{f_k}}\limits^ \vee \left( x \right) \le {l_k},x \in X} \right\}, \end{array} $ (0.2)

其中, $\mathop {{f_k}}\limits^ \vee \left( x \right)$是在第k次迭代时产生的f(x)的割平面模型, 且满足$\mathop {{f_k}}\limits^ \vee \left( x \right) \le f\left( x \right)$, ${\hat x_k}$为当前稳定中心, τk > 0为邻近参数, lkR为水平参数.

注意到子问题(0.2)的邻近项采用的是经典的欧氏距离, 为充分利用可行集的几何结构, 加快收敛速度, 提升数值效果, 许多方法引入了非欧氏距离.沈洁等[14]在文献[13]中引入矩阵范数, 应用对偶思想进行求解, 分析算法的全局收敛性.Kiwiel等在邻近点算法中引入Bregman距离替代邻近项, 获得算法的全局收敛性[15]; 在邻近束方法中引入Bregman距离替代邻近项, 允许求解子问题时出现近似解, 获得算法的全局收敛性[16].

本文基于邻近函数, 引入非欧氏距离, 对子问题(0.2)进行改进, 提出一个带非欧氏范数的双稳定束方法.该方法充分吸收了传统邻近束方法和水平束方法的优势, 在产生新迭代点的二次规划子问题中引入邻近函数代替传统的欧氏距离, 使得在计算上可充分利用可行集的几何结构, 获得优良的数值效果.

本文符号说明.凸函数fx处的次微∂f(x)={g:f(y)≥f(x)+〈g, y-x〉, ∀y}, ε-次微分记为εf(x)={g:f(y)≥f(x)+〈g, y-x〉-ε, ∀y}.NX(x)表示X在点x处的法锥, 即NX(x)={y:〈y, z-x〉≤0, ∀zX}.iX(x)为指示函数, 即:若xX, 则iX(x)=0;否则iX(x)=+∞.

1 算法与性质

引入邻近函数[17]

$ D\left( {x,y} \right) = h\left( x \right) - h\left( y \right) - \left\langle {\nabla h\left( y \right),x - y} \right\rangle . $

为叙述简便, 假设h:XR为一阶连续可微的强凸函数, 满足

$ \begin{array}{l} h\left( y \right) \ge h\left( x \right) + \left\langle {\nabla h\left( y \right),y - x} \right\rangle + \frac{{{\sigma _h}}}{2}\\ {\left\| {y - x} \right\|^2},\forall x,y \in X, \end{array} $ (1.1)

其中σh > 0为函数h的凸性参数.

将子问题(0.2)推广如下:

$ \min \left\{ {\mathop {{f_k}}\limits^ \vee \left( x \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_k}} \right):\mathop {{f_k}}\limits^ \vee \left( x \right) \le {l_k},x \in X} \right\}, $ (1.2)

其中$\mathop {{f_k}}\limits^ \vee \left( x \right)=\mathop {\max }\limits_{j \in {B_k}} \left\{ {{{\bar f}_j}\left( x \right):=f\left( {{x_j}} \right) + {g_j}, x - {x_j}} \right\}$, 指标集Bk$ \subseteq ${1, …, k}, xj为迭代点, gj∂f(xj).当$h\left( \cdot \right)=\frac{1}{2}\left\| \cdot \right\|$时, 子问题(1.2)变为子问题(0.2).

通过引入一个变量t, 问题(1.2)可写成:

$ \begin{array}{l} \mathop {\min }\limits_{\left( {x,t} \right) \in {R^{n + 1}}} \\ \left\{ {t + \frac{1}{{{\tau ^k}}}D\left( {x,{{\hat x}_k}} \right):\mathop {{f_k}}\limits^ \vee \left( x \right) \le t,t \le {l_k},x \in X} \right\}, \end{array} $ (1.3)

易知问题(1.2)与问题(1.3)关于x部分的解是等价的.

以下引理给出子问题(1.2)的性质.

引理1.1  若Xk={xX:$\mathop {{f_k}}\limits^ \vee \left( x \right)$lk}≠Ø, 则子问题(1.2)有唯一解xk+1.此外, 若X为多面体或者riX∩{xRn: $\mathop {{f_x}}\limits^ \vee \left( x \right)$lk}≠Ø, 则存在sk+1$\mathop {{f_k}}\limits^ \vee $(xk+1), pk+1NX(xk+1)=∂iX(xk+1)和拉格朗日乘子μk≥1, λk≥0使得

$ \nabla h\left( {{x_{k + 1}}} \right) = \nabla h\left( {{{\hat x}_k}} \right) - {\tau _k}{\mu _k}{{\hat g}_k}, $ (1.4)
$ {{\hat g}_k} = {s_{k + 1}} + \frac{1}{{{\mu _k}}}{p_{k + 1}},{\mu _k} = {\lambda _k} + 1, $ (1.5)
$ {\mu _k}\left( {\mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) - {t_{k + 1}}} \right) = 0,{\lambda _k}\left( {\mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) - {l_k}} \right) = 0 $

成立.此外, 对于任意的xX, 定义聚集线性化函数

$ \bar f_k^a\left( x \right): = \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) + \left\langle {{{\hat g}_k},x - {x_{k + 1}}} \right\rangle , $ (1.6)

$ \bar f_k^a\left( x \right) \le \mathop {{f_k}}\limits^ \vee \left( x \right) \le f\left( x \right),x \in X. $ (1.7)

证明: 由于问题(1.2)的目标函数是强凸函数且可行集非空, 故存在唯一最优解xk+1.根据问题(1.3)的最优性条件, 存在最优解(xk+1, tk+1)及拉格朗日乘子μk≥0和λk≥0使得

$ \begin{array}{l} 0 \in \frac{1}{{{\tau _k}}}\left( {\nabla h\left( {{x_{k + 1}}} \right) - \nabla h\left( {{{\hat x}_k}} \right)} \right) + {\mu _k}\partial \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) + \\ {N_X}\left( {{x_{k + 1}}} \right), \end{array} $ (1.8)
$ 0 = 1 - {\mu _k} + {\lambda _k}, $
$ {\mu _k}\left( {\mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) - {t_{k + 1}}} \right) = 0,\;\;\;\;{\lambda _k}\left( {{t_{k + 1}} - {l_k}} \right) = 0. $

μk=1+λk≥1, 可知$\mathop {{f_k}}\limits^ \vee $(xk+1)=tk+1.由公式(1.8)可知, 存在sk+1$\mathop {{f_k}}\limits^ \vee $(xk+1), pk+1NX(xk+1)使得

$ \begin{array}{l} \nabla h\left( {{x_{k + 1}}} \right) - \nabla h\left( {{{\hat x}_k}} \right) - {\tau _k}\left( {{\mu _k}{s_{k + 1}} + {p_{k + 1}}} \right) = \\ \nabla h\left( {{{\hat x}_k}} \right) - {\tau _k}{\mu _k}{{\hat g}_k}. \end{array} $

对于任意的xX, 结合公式(1.5), (1.6)及pk+1NX(xk+1), sk+1 $\mathop {{f_k}}\limits^ \vee $(xk+1), 有

$ \begin{array}{l} \bar f_k^a\left( x \right) = \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) + < {s_{k + 1}},x - {x_{k + 1}} > + \frac{1}{{{\mu _k}}} < {p_{k + 1}},\\ x - {x_{k + 1}} > \le \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) + < {s_{k + 1}},x - {x_{k + 1}} > \le \\ \mathop {{f_k}}\limits^ \vee \left( x \right) \le f\left( x \right). \end{array} $

基于文献[13]中的算法1及子问题(1.3), 下面给出带非欧氏范数的双稳定束方法.

算法1 (带非欧氏范数的双稳定束方法)

步骤0  (初始化)取参数β, ml ∈(0, 1), 终止参数TolΔ, Tole, Tolg > 0.取x1X, 令${\hat x_1}$=x1.若f*的一个下界f1low可获得, 令v1l=(1-ml)(f(${\hat x_1}$)-f1low); 否则, 令f1low=-∞, 并取v1l > 0.选取τmin > 0, τ1τmin, 令k:=1.

步骤1  (第一终止测试)令Δk=f(${\hat x_k}$)-fklow.若ΔkTolΔ, 终止算法.

步骤2  (可行性检测)计算水平参数lk=f(${\hat x_k}$)-vkl.若水平集Xk为空, 令fklow=lk, vkl=(1-ml)(f(${\hat x_k}$)-fklow), 返回步骤1.

步骤3  (寻找新迭代点)解问题(1.3)得(xk+1, tk+1)及相应于水平参数tlk的乘子λk, 且有μk=λk+1, vkτ=f(${\hat x_k}$)-tk+1, ${{\hat g}_k} = \frac{1}{{{\tau _k}{\mu _k}}}\left( {\nabla h\left( {{{\hat x}_k}} \right) - \nabla h\left( {{x_{k + 1}}} \right)} \right)$, ${{\hat e}_k}{\rm{ = }}v_k^\tau - \left\langle {{{\hat g}_k}, {{\hat x}_k} - {x_{k + 1}}} \right\rangle $.

步骤4  (第二终止测试)若${{\hat e}_k}$Tole$\left\| {{{\hat g}_k}} \right\| \le To{l_g}$, 终止算法.

步骤5  (下降测试)取fk+1low∈[fklow, f*], 若

$ f\left( {{x_{k + 1}}} \right) \le f\left( {{{\hat x}_k}} \right) - \beta v_k^\tau $ (1.9)

成立, 转至步骤5.1(下降步); 否则转至步骤5.2(无效步).

步骤5.1  (下降步)令${{\hat x}_{k + 1}}$=xk+1, τk+1=τkμkvk+1l=min{vkl, (1-ml)(f(${{\hat x}_{k + 1}}$)-fk+1low)}.

选取模型${{\overset{\vee }{\mathop{f}}\,}_{k+1}}$满足${{\overset{\vee }{\mathop{f}}\,}_{k+1}}$(·)≤f(·).

步骤5.2  (无效步)令${{\hat x}_{k + 1}} = {{\hat x}_k}$, 取τk+1∈[τmin, τk].若μk > 1, 令vk+1l=ml vkl; 否则令vk+1l=vkl.

选取模型${{\overset{\vee }{\mathop{f}}\,}_{k+1}}$满足$\max \left\{ {{{\bar f}_{k + 1}}\left( \cdot \right), \bar f_k^a\left( \cdot \right)} \right\} \le {{\overset{\vee }{\mathop{f}}\,}_{k+1}}\left( \cdot \right) \le f\left( \cdot \right)$.

步骤6  (循环)令k=:k+1, 返回步骤1.

以下引理给出算法的近似最优性条件, 其证明类似文献[13].

引理1.2  聚集线性化误差${{\hat e}_k}$, 满足

$ {{\hat e}_k} \ge 0,{{\hat e}_k} + < {{\hat g}_k},{{\hat x}_k} - {x_{k + 1}} > = v_k^\tau \ge v_k^l. $ (1.10)

μk > 1, 则vkτ=vkl.此外, 对于任意的xX, 有

$ f\left( {{{\hat x}_k}} \right) + < {{\hat g}_k},x - {{\hat x}_k} > - {{\hat e}_k} \le f\left( x \right). $ (1.11)
2 收敛性分析

根据算法1的步骤可知在迭代过程中必定会出现以下3种情况之一:水平集Xk=Ø出现无限次; 水平集Xk=Ø出现有限次时, 算法1产生有限多下降步; 水平集Xk=Ø出现有限次时, 算法1产生无限多下降步.

首先讨论水平集Xk=Ø出现无限次的情形.

定理2.1[13]  假设Xk=Ø出现无限次, 则Δk→0, {f(${{\hat x}_k}$)}→f*, 且序列{${{\hat x}_k}$}的每个聚点(若存在)都是问题(0.1)的解; 或者当{${{\hat x}_k}$}有限时, 最后一个迭代点${{\hat x}_k}$是问题(0.1)的解.

下面讨论Xk=Ø出现有限次的情形.不失一般性, 假设XkØ, ∀k≥1.

首先讨论算法1产生有限多下降步的情形.

对于xX, 定义函数

$ {{\bar \phi }_k}\left( x \right): = \bar f_k^a\left( x \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_k}} \right), $ (2.1)
$ \mathop {{\phi _k}}\limits^ \vee \left( x \right): = \mathop {{f_k}}\limits^ \vee \left( x \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_k}} \right). $ (2.2)

引理2.3  若μk=1, 则

$ \begin{array}{l} D\left( {x,{x_{k + 1}}} \right) + D\left( {{x_{k + 1}},{{\hat x}_k}} \right) - D\left( {x,{{\hat x}_k}} \right) = < \\ \nabla h\left( {{{\hat x}_k}} \right) - \nabla h\left( {{x_{k + 1}}} \right),x - {x_{k + 1}} > = {\tau _k} < {{\hat g}_k},x - \\ {x_{k + 1}} > , \end{array} $ (2.3)
$ {{\bar \phi }_k}\left( x \right) = {{\bar \phi }_k}\left( {{x_{k + 1}}} \right) + \frac{1}{{{\tau _k}}}D\left( {x,{x_{k + 1}}} \right). $ (2.4)

证明: 由公式(1.4)有

$ \begin{array}{l} D\left( {x,{x_{k + 1}}} \right) + D\left( {{x_{k + 1}},{{\hat x}_k}} \right) - D\left( {x,{{\hat x}_k}} \right) = < \\ \nabla h\left( {{{\hat x}_k}} \right) - \nabla h\left( {{x_{k + 1}}} \right),x - {x_{k + 1}} > = {\tau _k} < {{\hat g}_k},x - \\ {x_{k + 1}} > . \end{array} $

根据公式(1.6), (2.1)与(2.3)有

$ \begin{array}{l} {{\bar \phi }_k}\left( x \right) = \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) + < {{\hat g}_k},x - {x_{k + 1}} > + \frac{1}{{{\tau _k}}}D\left( {x,} \right.\\ \left. {{x_k}} \right) = \bar f_k^a\left( {{x_{k + 1}}} \right) + < {{\hat g}_k},x - {x_{k + 1}} > + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_k}} \right) = \\ {{\bar \phi }_k}\left( {{x_{k + 1}}} \right) - \frac{1}{{{\tau _k}}}D\left( {{x_{k + 1}},{{\hat x}_k}} \right) + \frac{1}{{{\tau _k}}}\left( {D\left( {x,{x_{k + 1}}} \right) + D\left( {{x_{k + 1}},} \right.} \right.\\ \left. {\left. {{{\hat x}_k}} \right) - D\left( {x,{{\hat x}_k}} \right)} \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_k}} \right) = {{\bar \phi }_k}\left( {{x_{k + 1}}} \right) + \frac{1}{{{\tau _k}}}D\left( {x,} \right.\\ \left. {{x_{k + 1}}} \right). \end{array} $

引理2.4  假设存在k1≥1, 对任意kk1, 下降测试公式(1.9)不成立, 即产生的都是无效步.若μk=1, 则

$ f\left( {{x_{k + 1}}} \right) - \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) \to 0, $ (2.5)

此外, 当{τk}→τ > 0时, 有

$ \begin{array}{l} {x_{k + 1}} \to {x_\infty }: = {\rm{Arg}}\min \left\{ {f\left( x \right) + \frac{1}{{{\tau _\infty }}}D\left( {x,{{\hat x}_{{k_1}}}} \right),} \right.\\ \left. {x \in X} \right\}. \end{array} $ (2.6)

证明: 对任意的kk1, 由公式(1.7), (2.1), (2.2)和(2.4)可得

$ \begin{array}{l} f\left( {{{\hat x}_{{k_1}}}} \right) \ge {{\overset{\vee }{\mathop{f}}\,}_{k+1}} \left( {{\hat x_{{k_1}}}} \right) = \overset{\vee }{\mathop{\phi }}\,{{}_{k+1}}\left( {{{\hat x}_{{k_1}}}} \right) \ge \overset{\vee }{\mathop{\phi }}\,{{}_{k+1}}\left( {{x_{k + 2}}} \right) = \\ {{\bar \phi }_{k + 1}}\left( {{x_{k + 2}}} \right) \ge {{\bar \phi }_k}\left( {{x_{k + 2}}} \right) \ge {{\bar \phi }_k}\left( {{x_{k + 1}}} \right) + \\ \frac{1}{{{\tau _k}}}D\left( {{x_{k + 2}},{x_{k + 1}}} \right) \ge {{\bar \phi }_k}\left( {{x_{k + 1}}} \right). \end{array} $

因为${{\hat x}_{{k_1}}}$是固定的, 由单调有界定理可知序列{ϕk(xk+1)}收敛, 再结合τminτk+1τk可得{D(xk+2, xk+1)}→0.进而由公式(1.1)有xk+2-xk+1→0.在公式(2.4)中固定x, 结合{ϕk(xk+1)}收敛, 可知{xk+1}有界.由算法1步骤5.2, 有

f(xk+2)-f(xk+1)≥${{\overset{\vee }{\mathop{f}}\,}_{k+1}}$(xk+2)-f(xk+1)≥ < gk+1, xk+2-xk+1 > , 其中gk+1∂ f(xk+1).结合{xk+1}有界和gk+1的有界性有${{\overset{\vee }{\mathop{f}}\,}_{k+1}}$(xk+2)-f(xk+1)→0, 即$\mathop {{f_x}}\limits^ \vee $(xk+1)-f(xk+1)→0.

x为{xk+1}的任一聚点, 则存在无限指标集K, 使得xk+1x, kK.由公式(1.7), (2.1)和(2.4)有

$ \begin{array}{l} f\left( x \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_{{k_1}}}} \right) \ge \bar f_k^a\left( {{x_{k + 1}}} \right) + \frac{1}{{{\tau _k}}}D\left( {x,{{\hat x}_{{k_1}}}} \right) = \\ {{\bar \phi }_k}\left( x \right) \ge {{\bar \phi }_k}\left( {{x_{k + 1}}} \right) = \bar f_k^a\left( {{x_{k + 1}}} \right) + \frac{1}{{{\tau _k}}}D\left( {{x_{k + 1}},{{\hat x}_{{k_1}}}} \right) = \\ f\left( {{x_{k + 1}}} \right) + \left( {\mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) - f\left( {{x_{k + 1}}} \right)} \right) + \frac{1}{{{\tau _k}}}\left( {h\left( {{x_{k + 1}}} \right) - } \right.\\ \left. {h\left( {{{\hat x}_{{k_1}}}} \right) - < \nabla h\left( {{{\hat x}_{{k_1}}}} \right),{x_{k + 1}} - {{\hat x}_{{k_1}}} > } \right),x \in X. \end{array} $

上式对kK, k→∞取极限有

$ \begin{array}{l} f\left( x \right) + \frac{1}{{{\tau _\infty }}}D\left( {x,{{\hat x}_{{k_1}}}} \right) \ge f\left( {\bar x} \right) + \frac{1}{{{\tau _\infty }}}D\left( {\bar x,{{\hat x}_{{k_1}}}} \right),\\ x \in X. \end{array} $

由问题(2.6)解的唯一性知x=x, 因此结合{xk+1}的有界性及的任意性可得{xk+1}→x.

定理2.2  设存在k1≥1, 当kk1时, 下降测试公式(1.9)不成立, 则集合K:={k:μk > 1}为无限集, 且${{\hat x}_{{k_1}}}$为问题(0.1)的最优解.

证明: 根据算法1知序列{vkl}是非增的, 由步骤5.1有vk+1l=mlvkl, ∀kK.

反证法, 假设K是有限集.由引理1.2, 存在k2k1使得

$ {\mu _k} = 1,{\lambda _k} = 0,v_k^l = v_{{k_2}}^l > 0,\forall k \ge {k_2}. $

结合引理2.4, 得{f(xk+1)}→f(x).若下降测试公式(1.9)不成立, 则有

$ \begin{array}{l} f\left( {{x_{k + 1}}} \right) - \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) > \mathop {{f_k}}\limits^ \vee \left( {{{\hat x}_{{k_1}}}} \right) - \beta v_k^\tau - \\ \mathop {{f_k}}\limits^ \vee \left( {{x_{k + 1}}} \right) = \left( {1 - \beta } \right)v_k^\tau ,\forall k \ge {k_1}. \end{array} $

β∈(0, 1), 结合公式(2.5), 可知vkτ→0, 这与vkτvkl=vlk2 > 0, ∀kk2矛盾.因此, K为无限集.

K为无限集, 根据算法1步骤5.1有vk+1l=mlvkl, ∀ kK, 因为ml∈(0, 1)且{vkl}单调非增, 所以vkl→0, k→∞.由引理1.2知, vkτ=vkl, kK.因此{vkτ}→0, kK.由h的强凸性, 根据文献[18]定理2.1.9有

$ \begin{array}{l} < \nabla h\left( {{{\hat x}_{{k_1}}}} \right) - \nabla h\left( {{x_{k + 1}}} \right),{{\hat x}_{{k_1}}} - {x_{k + 1}} > \ge \\ {\sigma _h}{\left\| {{{\hat x}_{{k_1}}} - {x_{k + 1}}} \right\|^2}, \end{array} $ (2.7)

结合公式(1.4), (1.10)和(2.7), 有

$ \begin{array}{l} v_k^\tau = {{\hat e}_k} + < {{\hat g}_k},{{\hat x}_{{k_1}}} - {x_{k + 1}} > = {{\hat e}_k} + \frac{1}{{{\tau _k}{\mu _k}}} < \\ \nabla h\left( {{{\hat x}_{{k_1}}}} \right) - \nabla h\left( {{x_{k + 1}}} \right),{{\hat x}_{{k_1}}} - {x_{k + 1}} > \ge {{\hat e}_k} + \frac{{{\sigma _h }}}{{{\tau _k}{\mu _k}}}\\ {\left\| {{{\hat x}_{{k_1}}} - {x_{k + 1}}} \right\|^2} \ge 0,k \in K. \end{array} $

所以${{\hat e}_k}$→0, ${{\hat x}_{{k_1}}}$-xk+1→0, kK.由公式(1.4), ∇h(x)连续可知${{{\hat g}_k}}$→0, kK, 结合公式(1.11)可知${{\hat x}_{{k_1}}}$是问题(0.1)的解.

接下来讨论算法1产生无限多下降步的情形.

定理2.3  假设算法1产生无限多下降步, 则{f(${{\hat x}_k}$)}→f*且序列{${{\hat x}_k}$}的任意一个聚点都是问题(0.1)的解.

证明: 记{${{\hat x}_{k\left( j \right)}}$}为{${{\hat x}_k}$}的一个子列, k(j)表示第j次下降步的指标, 定义i(j)=k(j+1)-1, ∀j≥1.由公式(1.4), (1.11)和引理1.2有

$ \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right) = \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}{{\hat g}_{i\left( j \right)}}, $ (2.8)
$ {{\hat g}_{i\left( j \right)}} \in {\partial _{{{\hat e}_{k\left( j \right)}}}}\left( {f + {i_X}} \right)\left( {{{\hat x}_{k\left( j \right)}}} \right), $ (2.9)
$ \begin{array}{l} {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}} \ge {\tau _{{m }{in}}},\\ f\left( {{{\hat x}_{k\left( j \right)}}} \right) - f\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right) \ge \beta v_{i\left( j \right)}^\tau . \end{array} $ (2.10)

由算法1知{f(${{\hat x}_{k\left( j \right)}}$)}是下降序列.若{f(${{\hat x}_{k\left( j \right)}}$)} →-∞, 证毕, 以下不妨假设{f(${{\hat x}_{k\left( j \right)}}$)}有界.

τi(j)μi(j)τmin > 0, ∀j≥1, 有

$ \sum\limits_{j = 1}^\infty {{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}} = + \infty . $ (2.11)

结合β∈(0, 1)和{f(${{\hat x}_{k\left( j \right)}}$)}是单调有界序列, 由公式(2.10)可得到vi(j)τ→0, j→∞.再由公式(1.10), (2.7)和(2.8)有

$ \begin{array}{l} v_{i\left( j \right)}^\tau = {{\hat e}_{i\left( j \right)}} + < {{\hat g}_{i\left( j \right)}},{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}} > = {{\hat e}_{i\left( j \right)}} + \\ \frac{1}{{{\tau _{i\left( j \right)}}{u_{i\left( j \right)}}}} < \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right),{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}} > \ge \\ {{\hat e}_{i\left( j \right)}} + \frac{{{\sigma _h}}}{{{\tau _{i\left( j \right)}}{u_{i\left( j \right)}}}}{\left\| {{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}}} \right\|^2} \ge 0, \end{array} $

所以

$ {{\hat e}_{i\left( j \right)}} \to 0,{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}} \to 0,j \to \infty . $ (2.12)

因为下降序列{f(${{\hat x}_{k\left( j \right)}}$)}有界, 不妨假设它收敛到f.要证{f(${{\hat x}_{k\left( j \right)}}$)}→f*, 因为ff*, 需证ff*.反证法, 假设f > f*, 则存在ζ > 0和zX, 使得f(z)+ζ < f(${{\hat x}_{k\left( j \right)}}$).结合公式(1.4), (2.3), (2.8)和(2.9)有

$ \begin{array}{l} D\left( {z,{{\hat x}_{k\left( {j + 1} \right)}}} \right) = D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) - D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) + < \\ \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right),z - {{\hat x}_{k\left( {j + 1} \right)}} > = D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) - \\ D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) + {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}} < {{\hat g}_{i\left( j \right)}},z - {{\hat x}_{k\left( {j + 1} \right)}} > \le \\ D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) - D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) + {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}\left( {f\left( z \right) - } \right.\\ \left. {f\left( {{{\hat x}_{k\left( j \right)}}} \right) + {{\hat e}_{i\left( j \right)}}} \right) + {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}} < {{\hat g}_{i\left( j \right)}},{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}} > = \\ D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) - D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) + {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}\left( {f\left( z \right) - } \right.\\ \left. {f\left( {{{\hat x}_{k\left( j \right)}}} \right) + {{\hat e}_{i\left( j \right)}}} \right) + < \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right),{{\hat x}_{k\left( j \right)}} - \\ {{\hat x}_{k\left( {j + 1} \right)}} > \le D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) + {\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}\left( {\frac{1}{{{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}}} < } \right.\\ \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right),{{\hat x}_{k\left( j \right)}} - {{\hat x}_{k\left( {j + 1} \right)}} > - \\ \left. {\frac{1}{{{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}}}D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) - \zeta + {{\hat e}_{i\left( j \right)}}} \right). \end{array} $

由公式(2.12), 可得

$ \begin{array}{l} \frac{1}{{{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}}} < \nabla h\left( {{{\hat x}_{k\left( j \right)}}} \right) - \nabla h\left( {{{\hat x}_{k\left( {j + 1} \right)}}} \right),{{\hat x}_{k\left( j \right)}} - \\ {{\hat x}_{k\left( {j + 1} \right)}} > - \frac{1}{{{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}}}}D\left( {{{\hat x}_{k\left( {j + 1} \right)}},{{\hat x}_{k\left( j \right)}}} \right) + {{\hat e}_{i\left( j \right)}} \to 0. \end{array} $

因此存在正整数q使得

$ \begin{array}{l} D\left( {z,{{\hat x}_{k\left( {j + 1} \right)}}} \right) \le D\left( {z,{{\hat x}_{k\left( j \right)}}} \right) - \frac{\zeta }{2}{\tau _{i\left( j \right)}}{\mu _{i\left( j \right)}},\\ \forall j \ge q. \end{array} $

对上式求和有0≤D(z, ${{\hat x}_{k\left( {j{\rm{ + }}1} \right)}}$)≤D(z, ${{\hat x}_{k\left( q \right)}}$)-$\frac{\zeta }{2}\sum\limits_{p = q}^j {{\tau _{i\left( p \right)}}{\mu _{i\left( p \right)}}} $, ∀jq.令j→+∞, 有

$ \sum\limits_{p = q}^{ + \infty } {{\tau _{i\left( p \right)}}{\mu _{i\left( p \right)}}} \le \frac{{2D\left( {z,{{\hat x}_{k\left( q \right)}}} \right)}}{\zeta } < + \infty , $

与公式(2.11)矛盾, 所以{f(${{\hat x}_{k\left( j \right)}}$)}→f*, 即{f(${{\hat x}_k}$)}→f*.

x*是{${{\hat x}_k}$}的任一聚点, 由X是非空闭凸集可知x*X, 因此x*是问题(0.1)的解.

参考文献
[1]
袁功林, 韦增欣. 一个新的BFGS信赖域算法[J]. 广西科学, 2004, 11(3): 195-196.
YUAN G L, WEI Z X. A new BFGS trust region algorithm[J]. Guangxi Sciences, 2004, 11(3): 195-196.
[2]
张秀军, 徐安农. 一种新的非线性共轭梯度法的全局收敛性[J]. 广西科学, 2005, 12(4): 282-283.
ZHANG X J, XU A N. Global convergence properties of a new class of nonlinear conjugate gradient methods[J]. Guangxi Sciences, 2005, 12(4): 282-283.
[3]
韦增欣, 谢品杰, 顾能柱. 一类拟牛顿算法的收敛性[J]. 广西科学, 2006, 13(4): 282-287.
WEI Z X, XIE P J, GU N Z. Convergence properties of a class of quasi-Newton algorithm[J]. Guangxi Sciences, 2006, 13(4): 282-287. DOI:10.3969/j.issn.1005-9164.2006.04.013
[4]
BAGIROV A M, UGON J, MIRZAYEVA H G. Nonsmooth optimization algorithm for solving clusterwise linear regression problems[J]. Journal of Optimization Theory and Applications, 2015, 164(3): 755-780. DOI:10.1007/s10957-014-0566-y
[5]
JUNG M, KANG M. Efficient nonsmooth nonconvex optimization for image restoration and segmentation[J]. Journal of Scientific Computing, 2015, 62(2): 336-370. DOI:10.1007/s10915-014-9860-y
[6]
YU J, VISHWANATHAN S V N, GVNTER S, et al. A quasi-newton approach to nonsmooth convex pptimization problems in machine learning[J]. Journal of Machine Learning Research, 2010, 11(2): 1145-1200.
[7]
MÄKELÄ M. Survey of bundle methods for nonsmooth optimization[J]. Optimization Methods and Software, 2002, 17(1): 1-29. DOI:10.1080/10556780290027828
[8]
唐春明, 简金宝. 非光滑优化的强次可行方向邻近点束求解方法[J]. 广西科学, 2014, 21(3): 283-286.
TANG C M, JIAN J B. A Proximal bundle method of strongly sub-feasible directions for nonsmooth optimization[J]. Guangxi Sciences, 2014, 21(3): 283-286.
[9]
唐春明, 律金曼. 基于非精确数据的非光滑优化强次可行方向法[J]. 广西科学, 2016, 23(5): 404-408.
TANG C M, LU J M. Strongly sub-feasible direction method with inexact data for nonsmooth optimization[J]. Guangxi Sciences, 2016, 23(5): 404-408.
[10]
KIWIEL K C. A proximal bundle method with approximate subgradient linearizations[J]. Siam Journal on Optimization, 2006, 16(4): 1007-1023. DOI:10.1137/040603929
[11]
LAN G. Bundle-level type methods uniformly optimal for smooth and nonsmooth convex optimization[J]. Mathematical Programming:Ser A, 2015(149): 1-45.
[12]
KIWIEL K C. An ellipsoid trust region bundle method for nonsmooth convex minimization[J]. Siam Journal on Control and Optimization, 1989, 27(4): 737-757. DOI:10.1137/0327039
[13]
DE OLIVEIRA W, SOLODOV M. A doubly stabilized bundle method for nonsmooth convex optimization[J]. Mathematical Programming, 2016, 156(1/2): 125-159.
[14]
沈洁, 李娜, 田佳茜. 双稳定束方法以及收敛性分析[J]. 沈阳师范大学学报:自然科学版, 2015, 33(2): 177-180.
SHEN J, LI N, TIAN J Q. Doubly stabilized bundle method and its convergence[J]. Journal of Shenyang Normal University:Natural Science Edition, 2015, 33(2): 177-180.
[15]
KIWIEL K C. Proximal minimization methods with generalized Bregman functions[J]. SIAM Journal on Control and Optimization, 1997, 35(4): 1142-1168. DOI:10.1137/S0363012995281742
[16]
KIWIEL K C. A bundle Bregman proximal method for convex nondifferentiable minimization[J]. Mathematical Programming, 1999, 85(2): 241-258. DOI:10.1007/s101070050056
[17]
CENSOR Y, LENT A. An iterative row-action method for interval convex programming[J]. Journal of Optimization Theory and Applications, 1981, 34(3): 321-353. DOI:10.1007/BF00934676
[18]
NESTEROV Y. Introductory lectures on convex optimization:A basic course[M]. New York: Springer US, 2004.