The Krein-Milman Theorem

by Nirakar Neo

1. The Krein-Milman theorem in Locally Convex Spaces

My project work this semester focuses to understand the paper the Krein-Milman Theorem in Operator Convexity by Corran Webster and Soren Winkler, which appeared in the Transactions of the AMS [Vol 351, #1, Jan 99, 307-322]. But before reading the paper, it is imperative to understand the (usual) Krein-Milman theorem which is proved in the context of locally convex spaces. My understanding of this part follows the book A Course in Functional Analysis by J B Conway. To begin with we shall collect the preliminaries that we shall need to understand the Krein-Milman theorem.

1.1. Convexity

Let {\mathbb{K}} denote the real({\mathbb{R}}) or the complex({\mathbb{C}}) number fields. Let {X} be a vector space over {\mathbb{K}}. A subset of a vector space is called convex if for any two points in the subset, the line segment joining them lies completely in the subset. We make this idea more precise.

If {a,b\in X}, the line segment from {a} to {b} is given by the set

\displaystyle [a,b]=\{(1-t)a+tb :0\leq t\leq 1\}.

By an open line segmentwe shall mean that the end points should be excluded, that is

\displaystyle (a,b)=\{(1-t)a+tb :0<t<1\}.

We shall call a line segment proper if {a\neq b}. A subset {V\subset X} is called convexif for any {a,b\in V}, the line segment {[a,b]\subset V}. It is easy to see that intersection of arbitrary number of convex subsets is again convex. We characterize a convex set as –

Proposition 1.1 : {V} is convex if and only if whenever {x_1,...,x_n\in V}, and {t_1,...,t_n\in [0,1]} with {\sum_i t_i=1}, then {\sum_i t_ix_i\in V}.

Proof: If the latter condition holds, then it is easy to see that {V} is convex. For the converse part, we use induction on the number of points, {n}. For {n=1}, this is trivially true, while the case {n=2} follows from the definition. Moreover note that if {t_1=1}, the theorem holds. So we shall assume {t_1\neq 1}. Suppose the theorem holds for {k=n>2}. Then, if there are {n+1} points, say {x_1,...,x_{n+1}}, and {t_1,...,t_{n+1}}, with {t_1\neq 1} and {\sum_{i=1}^{n+1}t_i=1} then we can write

\displaystyle t_1x_1+\sum_{i=2}^{n+1}t_ix_i=t_1x_1+(1-t_1)\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i.

Now {\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i\in V} by the induction hypothesis, and by the convexity of {V}, we get that {\sum_{i=1}^{n+1}t_ix_i\in V}. Thus the proof is complete. \Box

If {V\subset X}, the convex hull of {V}, denoted by {\text{co}\;(V)}, is the intersection of all convex sets containing {V}. Clearly this definition is meaningful since {X} itself a convex set containing {V}, and that convexity is preserved under intersection. {\text{co}\;(A)} is also convex. Using proposition 1.1, we have the following characterization of the convex hull of {V}.

The convex hull of {V} is the set of all convex combinations of elements of {V}, that is

\displaystyle co(A)=\left\lbrace t_1x_1+...+t_nx_n: x_i\in V,\; 0\leq t_i\leq 1,\; \sum_{i=1}^nt_i=1,\; n\; \text{is arbitrary}\right\rbrace

As an example, the convex hull of three points {x_1,x_2,x_3\in\mathbb{R}^2} is the closed triangular region formed by those points.

Definition 1. Let {V} be a convex subset of a vector space {X}. A point {a\in V} is called an extreme point of {V}, if there does not exist any proper open line segment containing {a} and which lies completely in {V}. We denote the set of extreme points of {V} by {\text{ext}\;(V)}.

Convex sets may or may not have extreme points. For example the set {\{(x,y)\in\mathbb{R}^2:x^2+y^2<1\}}, does not have any extreme point, whereas the extreme points of the set {\{(x,y)\in\mathbb{R}^2:x^2+y^2\leq 1\}} are all the boundary points. The extreme points of a closed square region in {\mathbb{R}^2} are its four corners.

We have following equivalent characterizations.

Proposition 1.3 : Let {V} is convex subset of a vector space {X}, and let {a\in V}. Then the following are equivalent –

  • (a) {a\in\text{ext}\;V }.
  • (b) If {x_1,x_2\in V} and {0<t<1} and {a=tx_1+(1-t)x_2}, then either {x_1\notin V} or {x_2\notin V} or {x_1=x_2=a}.
  • (c) If {x_1,...,x_n\in V} and {a\in\text{co}\;\{x_1,...,x_n\}}, then {a=x_k} for some {k}.
  • (d) {V-\{a\}} is a convex set.

Proof:The equivalence of {(a)} and {(b)} is straightforward from the definition of extreme point and convexity of {V}.

{(a)\Rightarrow (c)} : Let {a\in\text{co}\;\{x_1,...,x_n\}}. Then by proposition 1.2, we get that {a=t_1x_1+...+t_nx_n}, where {0\leq t_i\leq 1}, and {\sum t_i=1}. If {t_i=1} for some {i}, we get that {a=x_i}. So suppose that {0<t_i<1} for each {i}. Then we can write

\displaystyle a=t_1x_1+(1-t_1)\sum_{i=2}^{n}\frac{t_i}{1-t_1}x_i

This happens only if {(1-t_1)x_1=t_2x_2+...+t_nx_n} which means {a=x_1}.

{(c)\Rightarrow (a)} : Suppose {a\notin\text{ext}\;V}. Then there are points {x,y\in V} such that {a=tx+(1-t)y}, where {0<t<1}, and {x\neq y}. So {a\in\text{co}\;\{x,y\}}. Then {(c)} dictates that {a=x} or {a=y}. But then {x=y}.

{(a)\Rightarrow (d)} : We take two points {x,y\in V-\{a\}}, then {\{tx+(1-t)y:0\leq t\leq 1\}\subset V-\{a\}}, since {a\neq tx+(1-t)y} for any {t\in (0,1)}. Thus {V-\{a\}} is convex.

{(d)\Rightarrow (a)} : Suppose that {a} is not an extreme point. Then there exists {t\in (0,1)} and {x,y\in V} with {x\neq y}, and that {a=tx+(1-t)y}. But this means that {[x,y]} does not completely in {V-\{a\}}, which contradicts its convexity. \Box

We shall return to convexity once again after we have introduced the concept of a topological vector space.

1.2. Locally Convex Spaces

For doing analysis on vector spaces we need to have a topological structure on the space. Usually we deal with normed spaces, which already have a topology determined by the metric induced by the norm. We generalise to topological vector spaces which contain normed spaces as a subclass. We note, however, that the topological structure should be compatible with operations on the vector space. Thus we have the following definition.

Definition 2 : A topological vector space (TVS) is a vector space with a topology such that relative to this topology the maps defined by {(x,y)\mapsto (x+y)} and {(\alpha,x)\mapsto\alpha x} are continuous for all {x,y\in X}, and {\alpha\in\mathbb{K}}.

Note that {X\times X} inherits the product topology. Addition is continuous means that if {x_1,x_2\in X} and if {V} is a neighborhood of {x_1+x_2}, there should exist neighborhoods {V_1} of {x_1} and {V_2} of {x_2} such that {V_1+V_2\subset V}. Similarly, if {x\in X}, and {\alpha\in\mathbb{K}}, and {V} is a neighborhood of {\alpha x} then for some {r>0}, and for some neighborhood {W} of {x}, we have {\beta W\subset V} whenever {|\beta-\alpha|<r}. We also note that translation by a vector and multiplication by a scalar are homeomorphisms on {X}.

Every normed space is a TVS, since the norm satisfies {\|x+y\|\leq \|x\|+\|y\|} and {\|\alpha x\|=|\alpha\|x\|}. Now that we have a topology on the vector space, we can talk about open and closed sets. Let {X} be a TVS. We define the closed convex hull of {V}, which is defined as the intersection of all closed convex subsets of {X} containing {V}, and is denoted by {\overline{\text{co}}\;(V)}. Then {\overline{\text{co}}\;(V)} is closed and convex. One may ask – whether {\overline{\text{co}\;(V)}=\overline{\text{co}}\;(V)}? This proposition answers that in affirmative.

Proposition 1.4 Let {X} be a TVS and let {V} be a convex subset of {X}. Then

  • (a) {\overline{V}} is convex, and
  • (b) if {a\in \text{int}\;V} and {b\in\overline{V}}, then

    \displaystyle [a,b)=\{tb+(1-t)a:0\leq t<1\}\subset\text{int}\;V.

Proof:

  • (a) Let {a,b\in\overline{V}}. Choose an element {x\in V}, and let {t\in [0,1]}. Let {\{a_i\}} be a net in {V} converging to {a}. Then since addition and scalar multiplication is continuous, we get

    \displaystyle (1-t)x+ta_i\rightarrow (1-t)x+ta,

    that is {[x,a]\subset\overline{V}}. Now let {\{b_i\}} be a net in {V} converging to {b}, then {[b_i,a]\subset\overline{V}} for each {i}. Since {\overline{V}} is closed, we have

    \displaystyle (1-t)b_i+ta\rightarrow (1-t)b+ta\in\overline{V}.

    Thus {[b,a]\subset\overline{V}}. Hence {\overline{V}} is closed.

  • (b) For this part, we fix {t\in (0,1)}, and {c=tb+(1-t)a}, where {a\in \text{int}\;V} and {b\in\overline{V}}. Since translation is a homeomorphism, there is an open set {W\subset X}, such that {a+W\subset V}. Then for any {d\in V}

    \displaystyle V\supset td+(1-t)(a+W)=t(d-b)+tb+(1-t)(a+W)=[t(d-b)+(1-t)W]+c.

    It only remains to show that there is an element {d\in V}, such that {0\in [t(d-b)+(1-t)W]} for then this set becomes a neighborhood of 0, and hence its translation to {c} will mean that {c\in\text{int}\; V}. Now, {0\in [t(d-b)+(1-t)W]} means {d\in b-t^{-1}(1-t)W}. But {0\in -t^{-1}(1-t)W}, and this set is open. Since {b\in\overline{V}}, we can always find {d\in V}.

\Box

Corollary – If {V\subset X}, then {\overline{\text{co}\;(V)}=\overline{\text{co}}\;(V)}.

Proof: Since {\text{co}\;V} is convex, {\overline{\text{co}\;V}} is convex too. Since {\overline{\text{co}}\;V} is the intersection of all closed convex sets containing {V}, we have {\overline{\text{co}}\;V\subset\overline{\text{co}\;V}}. Now let {x\in\overline{\text{co}\;V}}, and let {W} be any closed convex set containing {V}. Then clearly, {W\supset\text{co}\;V}. And since {W} is closed, we also have {W\supset\overline{\text{co}\;V}}. Thus {x\in W}, and hence {x\in\overline{\text{co}}\;V}. \Box

Definition 3. Let {X} be a vector space over {\mathbb{K}}. A seminorm is a function {p:X\rightarrow [0,\infty)} satisfying –

  • (a) {p(x+y)\leq p(x)+p(y)} and
  • (b) {p(\alpha x)=|\alpha|p(x)}

for all {\alpha\in\mathbb{K}}, and {x,y\in X}.

Note that, from (b), we have {p(0x)=0p(x)=0}, and thus {p(0)=0}. It may happen that {p} is zero for elements which are not zero. A seminorm is called a norm if {p(x)=0} if and only if {x=0}. Thus, every norm is a seminorm. Let {X} be the set of all continuous functions {f:\mathbb{R}\rightarrow\mathbb{R}}. Fix {x\in\mathbb{R}}. Define {p:X\rightarrow [0,\infty)}, by {p(f)=|f(x)|}. Then

\displaystyle p(f_1+f_2)=|(f_1+f_2)(x)|\leq |f_1(x)|+|f_2(x)|=p(f_1)+p(f_2).

Also {p(\alpha f)=|\alpha f(x)|=|\alpha|p(f)}. Thus {p} is a seminorm. But, {p(f)=f(x)=0} does not mean that {f=0}.

Let {X} be a vector space and {\mathcal{P}} be a family of seminorms on {X}. Let {T} be the topology on {X} that has as subbase, the sets of the form

\displaystyle \{x:p(x-x_0)<\epsilon\},

where {p\in\mathcal{P}, x_0\in X} and {\epsilon>0}. Since the basis determined by a subbase is the collection of all finite intersection of the elements of the subbase, we see that a subset {U} is open if and only if for every {x_0\in U} there are {p_1,...,p_n\in\mathcal{P}} and {\epsilon_1,...,\epsilon_n>0} such that

\displaystyle \bigcap_{j=1}^n\left\lbrace x\in X:p_j(x-x_0)<\epsilon_j\right\rbrace\subset U.

Then

Proposition 1.6 – {X} with the above mentioned topology is a TVS.

Proof: We need to show that the maps {(x,y)\mapsto (x+y)} and {(\alpha,x)\mapsto \alpha x} are continuous for all {x,y\in X}, and {\alpha\in\mathbb{K}}.

To show the continuity of addition, let {x,y\in Z}, and let {V} be a neighborhood of {x+y}. Then we need to find neighborhoods {U_1} of {x} and {U_2} of {y} such that {U_1+U_2\subset V}. Since {V} is open there exist {p_1,...,p_n\in\mathcal{P}} and {\epsilon_1,...,\epsilon_n>0}, such that the basis element

\displaystyle B=\bigcap_{j=1}^n\left\lbrace x\in X:p_j(x-x_0)<\epsilon_j\right\rbrace\subset V.

Let {\epsilon=\min\{\epsilon_1,...,\epsilon_n\}}, and define

\displaystyle U_1=\bigcap_{j=1}^n\left\lbrace x\in X:p_j(z-x)<\frac{\epsilon}{2}\right\rbrace

and

\displaystyle U_2=\bigcap_{j=1}^n\left\lbrace x\in X:p_j(z-y)<\frac{\epsilon}{2}\right\rbrace

as basis neighborhoods of {x} and {y} respectively. Let {z_1\in U_1} and {z_2\in U_2}, then

\displaystyle p_j(z_1+z_2-(x+y))\leq p_j(z_1-x)+p_j(z_2-y)<\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon\leq\epsilon_j

for each {j=1,...,n}. Thus {z_1+z_2\in B\subset V}. Thus {U_1+U_2\subset V}. The continuity of multiplication is handled in a similar manner. \Box

Definition 4 – A locally convex space (LCS) is a topological vector space whose topology is defined by a family of seminorms {\mathcal{P}} such that

\displaystyle \bigcap_{p\in\mathcal{P}}\left\lbrace x:p(x)=0 \right\rbrace=\{0\}.

The condition imposed above ensures that the topology so defined is Hausdorff. To show that this is Hausdorff, let {x,y\in X}, with {x\neq y}. Then there is a {p\in\mathcal{P}} for which {p(x-y)\neq 0}. For if {p(x-y)=0} for all {p\in\mathcal{P}}, then {x-y\in \bigcap_{p\in\mathcal{P}}\left\lbrace x:p(x)=0 \right\rbrace}, which contradicts that only 0 belongs to that set. Let {p(x-y)>\epsilon>0}, and

\displaystyle \begin{array}{rcl} U &=& \{z:p(x-z)<\frac{1}{2}\epsilon\} \\ V &=& \{z:p(y-z)<\frac{1}{2}\epsilon\} \end{array}

Then {U\cap V=\emptyset}. For if {z\in U\cap V}, then

\displaystyle p(x-y)\leq p(x-z)+p(y-z)<\epsilon,

which again yields a contradiction.

There is another characterization of a LCS, which explains why the name locally convex is given. A set {V\subset X} is called balanced if {\alpha x\in V}, whenever {x\in V}, and {|\alpha|\leq 1}.

Theorem 1 Let {X} be a TVS, and let {\mathcal{U}} be the collection of all open convex balanced subsets of {X}. Then {X} is locally convex if and only if {\;\mathcal{U}} is a basis for the neighborhood system at 0.

1.3. The Krein-Milman Theorem

Roughly speaking, the Krein-Milman theorem is a generalization of the fact that in {\mathbb{R}^n}, one can recover the shape of a convex polygon once its extreme points are known.

We shall be using the following crucial separation theorem.

Theorem 2 Let {X} be a complex LCS and let {A} and {B} be disjoint closed convex subsets of {X}. If {B} is compact, then there is continuous linear functional {f:X\rightarrow\mathbb{C}}, an {\alpha\in\mathbb{R}}, and an {\epsilon>0} such that for {a\in A}, and {b\in B} we have

\displaystyle \text{Re}\;f(a)\leq\alpha<\alpha+\epsilon\leq\text{Re}\; f(b).

Theorem 3 If {K} is a non-empty compact convex subset of a locally convex space {X}, then {\text{ext}\;K\neq\emptyset} and {K=\overline{\text{co}}(\text{ext}\;K)}.

Proof: We shall prove the theorem in steps. First note that if {K=\{x_0\}}, a singleton set, then the theorem trivially holds. Therefore, we’ll assume that {K} contains at least two points.

Step 1. According to Proposition 1.3, {a\in K} is an extreme point if and only if {K-\{a\}} is convex. Every locally convex space is Hausdorff, and since {K} is a compact subset of {X}, we get that {K} is closed. Hence {K-\{a\}} is open relative to {K}. Thus, {a\in K} is an extreme point if and only if {K-\{a\}} is relatively open convex subset of {K}. In this first step of the proof, we shall establish the existence of a maximal relatively open convex proper subset of {K} (using Zorn’s lemma).

Let {\mathcal{U}} denote the collection of all relatively open convex proper subsets of {K}. This collection is not empty. For, let {x_0\in K}, then {K-\{x_0\}} is open and non-empty (since we assumed that {K} contains at least two points); let {x_1\in K-\{x_0\}}. Since {X} is locally convex, there is an open convex subset {V\subset K-\{x_0\}} containing {x_1}. Then {V\cap K} is a relatively open convex proper subset of {K}. We introduce a partial order relation on {\mathcal{U}} by inclusion, and let {\mathcal{U}_0} be a totally ordered subset of {\mathcal{U}}. Put

\displaystyle U_0=\bigcup\left\lbrace U:U\in\mathcal{U}_0\right\rbrace

Clearly, {U_0} is open and since {\mathcal{U}_0} is totally ordered, {U_0} is convex. We only have to show that {U_0} is proper. If {U_0=K}, then {\mathcal{U}_0} is an open cover for {K}, and by compactness of {K}, we have that {U=K} for some {U\in\mathcal{U}_0}, which contradicts that {U\in\mathcal{U}}. Thus this totally ordered set has a maximal element, and hence by Zorn’s lemma, {\mathcal{U}} has a maximal element {U}. This completes our first step.

Step 2. In this step we shall prove that {K-U} consists of a single point, thereby proving that ext {K} is not empty. If {x\in K}, and {0\leq t\leq 1}, let {T_{x,t}:K\rightarrow K} be defined by

\displaystyle T_{x,t}(y)=ty+(1-t)x.

Clearly, {T_{x,t}} is continuous and it preserves convex combinations, that is,

\displaystyle T_{x,t}\left(\sum_{i=1}^n\lambda_iy_i\right)=\sum_{i=1}^n\lambda_iT_{x,t}(y_i),

where {y_i\in K} and {\lambda_i\geq 0} for {i=1,...,n} and {\sum_{i=1}^ny_i=1}. If we take {x\in U}, and {0\leq t<1}, then by convexity of {U}, we have {T_{x,t}(U)\subset U}, which implies {U\subset T_{x,t}^{-1}(U)}. Since pull back of an open set by a continuous function is open, we get that {T_{x,t}^{-1}(U)} is open in {K}, and it is convex too. If {y\in\overline{U}-U,} then by proposition 1.4 we have

\displaystyle T_{x,t}(y)\in [x,y)\subset U.

Thus {\overline{U}\subset T^{-1}_{x,t}}, and hence by maximality of {U}, we get {T_{x,t}^{-1}(U)=K} which translates into {T_{x,t}(K)\subset U}.

Now, let {V} be any open convex subset of {K}. Let {y,z\in V\cup U}. If both points lie in {V}, their convex combinations also lie in {V}, and hence in {U\cup V}. Suppose {y\in U} and {z\in V}, then from what we just proved above, all the linear combinations {(1-t)y+tz} (where {0\leq t<1}) lies in {U}. Thus {[y,z]\subset U\cup V} which proves that {U\cup V} is an open convex subset of {K}. But by maximality of {U}, we must have either {U\cup V=U} or {U\cup V=K}. Using this we complete our claim that {K-U} is singleton.

Let {x,y\in K-U}, and {x\neq y}. Let {V_x} and {V_y} be disjoint open convex subsets of {K} such that {x\in V_x} and {y\in V_y}. Since {x\notin U}, we have {U\cup V_x=K} (from the previous paragraph). This implies that {y\in V_y\subset K=U\cup V_x}. But this is contradiction since {y} does not lie in {U} as well as {V_x}. Thus {K-U} is singleton, and hence ext {K} is not empty.

Step 3. In this last step we prove that {K} is the closed convex hull of ext {K}, the last claim of the theorem. We first prove that if {V} is an open convex subset of {X}, and ext {K\subset V}, then {K\subset V}. If this does not hold, there will be an open convex subset of {V\subset X} such that ext {K\subset V} but {V\cap K\neq K}. Then {V\cap K\in\mathcal{U}}, and is contained in a maximal element {U}. Since {K-U=\{a\}} for some {a\in\text{ext}\;K}, this contradicts that ext {K\subset V}.

Finally, let {E=\overline{\text{co}}(\text{ext}\;K)}. Suppose {E\neq K}, and therefore let {x\in K-E}. Then {E} and {\{x\}} are disjoint closed convex sets with {\{x\}} being compact. Then by theorem 1.8, there exists a continuous linear functional {f:X\rightarrow\mathbb{C}}, an {\alpha\in\mathbb{R}}, an {\epsilon>0} such that {E\subset\{z\in X:\text{Re}\;f(z)\leq\alpha\}}, and {x\in \{z\in X:\text{Re}\;f(z)\leq\alpha+\epsilon\}}. Let {V=\{z\in X:\text{Re}\;f(z)<\alpha+\frac{\epsilon}{2}\}}. Then {V} is open, convex and {E\subset V} which implies {\text{ext}\;K\subset V}. But as we proved in the previous paragraph, we should have {K\subset V}. But this contradicts that {\text{Re}\;f(x)\geq\alpha+\epsilon}.

Hence {K=E=\overline{\text{co}}(\text{ext}\;K)}, and this completes the proof. \Box