4. mirror descent and relative smoothness

Convex non-smooth and the convergence of mirror descent with relative smoothness

2 min readNov 29, 2020

back to 《A Deeper Learner》main page

Relative smoothness

In the last lecture, we find smoothness required for convergence guarantee. How to generalize smoothness? Ans: Bregman divergence.

Bregman divergence

D_h(y,x)\equiv h(y) — h(x) — \langle\nabla h(x), y-x\rangle — The Bregman divergence.

Notice that Bregman divergence is not symmetric in terms of x and y, but can be considered an asymmetric “x to y” connotation.

D_h(y,x) &= \frac{1}{2}\lVert y\rVert² — \frac{1}{2}\lVert x\rVert² — \langle \nabla \frac{1}{2}\lVert x\rVert², y-x\rangl — Bregman divergence returns to the (second term) of the L-smooth definition.

We can find the second term in the L-smooth requirement as a special case of the Bregman divergence, where h is the half L-scaled-squared Euclidean norm. Thus, by replacing the second term in the L-smooth definition and gradient descent algorithm, we find the relative smoothness and a new type of “mirror” descent algorithm.

f(y)\leq f(x)+\langle\nabla f(x),y-x\rangle+LD_h(y,x) — *The necessary* and *sufficient* condition of relative L smooth condition.

Mirror descent algorithm, expressed with the Bregman divergence.

Mirror descent convergence proof

The proof follows that of gradient descent and L-smoothness. However, with Bregman divergence, simple concepts (e.g. Pythagorean theorem) now require detailed proofs.

4. mirror descent and relative smoothness

Convex non-smooth and the convergence of mirror descent with relative smoothness

Relative smoothness

Bregman divergence

Mirror descent convergence proof

Bregman Proximal Equality

Three-point Equality

Sign up to discover human stories that deepen your understanding of the world.

Free

Membership

Written by Jimmy Yeh

No responses yet