# Nirakar Neo's Blog

Mathematics and Trivia

## The Birthday Problem

Today I wish to discuss a simple question is probability, which goes by the name birthday problem or birthday paradox. It’s hardly a paradox, only because it does not seem apt with naive thinking. But so are the nature of things in mathematics.

Suppose you are in someone’s birthday party and the cake has been cut and the party is in full swing. The mood is there but you are alone. Thinking. And wishing perhaps that it was your birthday party, with you being the center of attention. This suddenly strikes the following question in mind.

Assuming the room is closed and there are ${n}$ people in the room and that it is a usual year with 365 days, what is the probability that there are at least two persons who share a common birthday (and hence perhaps throwing a common and grander party).

Of course, if there are 366 people in the room (hypothetically) then there is always a pair who share the same birthday. This follows from a similar situation, that if there are two boxes and one has to put three letters, then one box must contain (at least) two letters. Can you see how it works in the birthday case?

But, what if there are fewer people? In fact you can ask, what is the smallest number of people in the room for which there is 50% chance that two person share the same birthday?

The answer is 23. (Does this surprise you?)

Written by Nirakar Neo

June 23, 2012 at 8:00 am

Posted in Mathematics

## Matrix Convex Sets

In the previous post, we introduced some objects in operator algebra. Those will occur in the examples given in this post. The purpose of this post is to define matrix convex sets.

We begin by defining what we mean by a matrix convex set. There are two definitions, and we shall show that both are equivalent. Then we shall present some examples of matrix convex sets.

Definition 1. A matrix convex set in a vector space ${V}$ is a collection ${K=(K_n)}$, where for each ${n\in\mathbb{N}}$ ${K_n\subset M_n(V)}$ is a non-empty set such that

$\displaystyle \sum_{i=1}^{k}\gamma_i^*v_i\gamma_i\in K_n,$

whenever ${v_1\in K_{n_1}, ..., v_k\in K_{n_k}}$, and ${\gamma_1\in \mathbb{M}_{n_1,n},...,\gamma_k\in \mathbb{M}_{n_k,n}}$ satisfying ${\sum_{i=1}^k\gamma_i^*\gamma_i=\mathbb{I}_n}$.

The above definition seems like a natural generalisation of convexity. But the next one is usually easier to use in specific cases.

Definition 2. A matrix convex set in a vector space ${V}$ is a collection ${K=(K_n)}$, where for each ${n\in\mathbb{N}}$ ${K_n\subset M_n(V)}$ is a non-empty set such that

1. For any ${\gamma\in \mathbb{M}_{r,n}}$, with ${\gamma^*\gamma=\mathbb{I}_n}$, we have ${\gamma ^* K_r \gamma\subset K_n}$.
2. For any ${m,n\in\mathbb{N}}$, ${K_m\oplus K_n \subset K_{m+n}}$.

We now show the equivalence of both definitions.

Definition 1 ${\Rightarrow }$ Definition 2. The first part of definition 2 follows trivially (using ${k=1}$). To show the second part, let ${v\in K_m}$ and ${w\in K_n}$; then we have to show that

$\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix}\in K_{m+n}$

But

$\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix} = \gamma_1^*v\gamma_1+\gamma_2^*w\gamma_2,$

where

$\displaystyle \gamma_1=[\mathbb{I}_m \quad 0_{m,n}] \quad\quad \text{and}\quad \quad \gamma_2=[0_{n,m} \quad \mathbb{I}_n],$

and clearly, ${\gamma_1^*+\gamma_1+\gamma_2^*\gamma_2=\mathbb{I}_{m,n}}$. Thus (2) is also satisfied.

Definition 2 ${\Rightarrow }$ Definition 1. Let ${v_i\in K_{n_i}}$, ${\gamma_i\in \mathbb{M}_{n_i,n}}$ for ${i=1,..,k}$ satisfying ${\sum_{i=1}^k\gamma^*_i\gamma_i\in K_n}$. Then by extension of (2) in definition 2, we have

$\displaystyle v = \begin{pmatrix} v_1 & & & \\ & v_2 & & \\ & & \ddots & \\ & & & v_k \end{pmatrix} \in K_{n_1+n_2+...+n_k}$

Now let ${\gamma=\begin{pmatrix} \gamma_1 & \hdots & \gamma_k \end{pmatrix}^T\in \mathbb{C}_{n_1+n_2+...+n_k,n}}$. Then ${\gamma^*\gamma=\sum\gamma_i^*\gamma_i=\mathbb{I}_n}$ and by definition 1 we have ${\gamma^*v\gamma\in K_n}$ which implies ${\sum_1^k\gamma_i^*v_i\gamma_i\in K_n}$.

An important consequence is that each ${K_n}$ is a convex set of ${M_n(V)}$. Let ${K=(K_n)}$ be a matrix convex set in ${V}$. Let ${v_1,v_2\in K_n}$. We need to show that ${\lambda v_1+(1-\lambda)v_2\in K_n}$ for ${\lambda\in [0,1]}$. Set ${\gamma_1=\sqrt{\lambda}\;\mathbb{I}_n\in\mathbb{M}_n}$, and ${\gamma_2=\sqrt{1-\lambda}\;\mathbb{I}_n\in\mathbb{M}_n}$. Then

$\displaystyle \gamma_1^*\gamma_1+\gamma_2^*\gamma_2=(\sqrt{\lambda}\;\mathbb{I}_n)(\sqrt{\lambda}\;\mathbb{I}_n)+(\sqrt{1-\lambda}\;\mathbb{I}_n)(\sqrt{1-\lambda}\;\mathbb{I}_n) = (\lambda+1-\lambda)\;\mathbb{I}_n=\mathbb{I}_n.$

Then ${\gamma_1^*v_1\gamma_1+\gamma_2^*v_2\gamma_2\in K_n}$, which is nothing but ${\lambda v_1+(1-\lambda)v_2\in K_n}$.

We present some examples of matrix convex sets. We start with the simplest vector space.

Example 2. Let ${a,b\in [-\infty, \infty ]}$. On the vector space ${\mathbb{C}}$, consider the collection

$\displaystyle [a\mathbb{I},b\mathbb{I}]=([a\mathbb{I}_n,b\mathbb{I}_n]),$

where ${[a\mathbb{I}_n,b\mathbb{I}_n]=\{\alpha\in\mathbb{M}_n: a\mathbb{I}_n\leq \alpha\leq b\mathbb{I}_n\}}$. This collection defines a matrix convex set in ${\mathbb{C}}$. We shall now verify that this indeed satisfies the conditions mentioned above (using definition 2).

Let ${\alpha\in \mathbb{M}_{r,n}}$, with ${\alpha^*\alpha=\mathbb{I}_n}$. Let ${v\in K_r}$, then ${a\mathbb{I}_r\leq v\leq b\mathbb{I}_r}$. This implies

$\displaystyle a(\alpha^*\mathbb{I}_r\alpha) \leq \alpha^*v\alpha \leq b( \alpha^*\mathbb{I}_r\alpha)$

which means ${a\mathbb{I}_n\leq v \leq b\mathbb{I}_n}$. Thus ${\alpha^*v\alpha\in K_n}$. Finally let ${v\in K_m}$ and ${w\in K_n}$. Then we show that

$\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix} \in K_{m+n}$

But, for ${z_1\in \mathbb{C}^m,z_2\in \mathbb{C}^n}$, we have

$\displaystyle \left\langle \begin{pmatrix} v- a\mathbb{I}_n & 0 \\ 0 & w-a\mathbb{I}_n \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix},\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right\rangle= \langle (v-a\mathbb{I}_m)z_1,z_1 \rangle+\langle (w-a\mathbb{I}_n)z_2,z_2 \rangle\geq 0$

And similarly we can show the other parts. This completes our verification. Conversely, we can show that any matrix convex set ${K=(K_n)}$ in ${\mathbb{C}}$, where ${K_1}$ is a closed convex subset of ${\mathbb{R}}$ is a closed matrix interval.

Theorem 1 Suppose that ${K=(K_n)}$ is a matrix convex set in ${\mathbb{C}}$ where ${K_1}$ is a bounded closed subset of ${\mathbb{R}}$. Then ${K}$ must be a closed matrix interval.

Proof: Since ${K_1}$ is a bounded closed and convex it must be a closed interval in ${\mathbb{R}}$, say ${K_1=[a,b]}$. Let ${\gamma\in K_n}$. Then we have to show that ${\alpha\mathbb{I}_n\leq \gamma\leq \beta\mathbb{I}_n}$. Let us first show that ${\gamma-\alpha\mathbb{I}_n\geq 0}$. Let ${\xi\in\mathbb{C}^n}$, then

$\displaystyle \begin{array}{rcl} && \langle (\gamma-\alpha\mathbb{I}_n)\xi,xi \rangle \geq 0 \\ &\Leftrightarrow& \langle \gamma\xi,\xi\rangle-\langle\alpha\mathbb{I}_n\xi,\xi\rangle \geq 0 \\ &\Leftrightarrow& \xi^*\gamma\xi \geq \alpha \xi^*\xi=\alpha \end{array}$

But from property 1, we see that ${\xi^*\gamma\xi\in K_1}$, and thus the last statement is true; hence ${\gamma\geq \alpha\mathbb{I}_n}$. Similarly, we can show ${\gamma\leq \beta\mathbb{I}_n}$. Thus ${\alpha\mathbb{I}_n\leq \gamma\leq \beta\mathbb{I}_n}$.

Conversely, let ${\gamma\in [a\mathbb{I}_n,\beta\mathbb{I}_n]}$, then ${\gamma}$ is self-adjoint and hence we can write

$\displaystyle \gamma=u^*D u=\begin{pmatrix} u_1^* & \hdots & u_n^* \end{pmatrix}\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix},$

where ${\gamma_i\in [\alpha,\beta]}$. Then by property 2, we see that ${D\in K_n}$, and by property 1, ${u^*Du\in K_n}$, that is ${\gamma\in K_n}$. $\Box$

Example 3. Let ${\mathcal{M}}$ be an operator space. Then the collection of unit balls, ${B=(B_n)}$ where

$\displaystyle B_n=\{x\in M_n(\mathcal{M}): \|x\|\leq 1\}$

is a matrix convex set. It is almost trivial to verify the conditions of matrix convexity using the abstract definition of operator space (given in this post).

Example 4. Let ${\mathcal{R}}$ be an operator system. The collection of positive cones, ${P=(P_n)}$ where

$\displaystyle P_n=\{x\in M_n(\mathcal{R}):x\geq 0\}$

is a matrix convex set. For let ${\gamma\in \mathbb{M}_{r,n}}$ with ${\gamma^*\gamma=\mathbb{I}_n}$. Let ${v\in P_r}$. Then for ${\xi\in\mathbb{C}^n}$ we have

$\displaystyle \langle(\gamma^*v\gamma)\xi,\xi\rangle=\xi^*\gamma^*v\gamma\xi=(\gamma\xi)^*v(\gamma\xi)\geq 0$

since ${v\geq 0}$. Similarly, if ${v\in P_m}$ and ${w\in P_n}$, then it is fairly obvious that ${v\oplus w\in P_{m+n}}$.

Example 5. Again let ${\mathcal{R}}$ be an operator system. Consider the collection of matrix states ${CS(\mathcal{R})=(CS_n(\mathcal{R}))}$ where

$\displaystyle CS_n(\mathcal{R})=\{\varphi:\mathcal{R}\rightarrow \mathbb{M}_n \; | \; \varphi\; \text{is completely positive and} \; \varphi(1)=\mathbb{I}_n\}.$

We say ${\varphi:\mathcal{R}\rightarrow \mathbb{M}_n}$ is completely positive if canonically amplified maps ${\varphi_r:M_r(\mathcal{R})\rightarrow M_r(\mathbb{M}_n)}$ are positive for all ${r\in\mathbb{N}}$. Then ${CS(\mathcal{R})}$ is a matrix convex set in ${\mathcal{R}^*}$. Here we have identified an ${\varphi\in CS_n(\mathcal{R})}$ as an element in ${M_n(\mathcal{R^*})}$. For verification, let ${\varphi\in CS_r(\mathcal{R})}$ and ${\gamma\in\mathbb{M}_{r,n}}$. Then we have to show that ${\gamma^*\varphi\gamma}$ is completely positive and ${\gamma^*\varphi\gamma(1)=\mathbb{I}_n}$. By definition, ${\gamma^*\varphi\gamma(x)= \gamma^*\varphi(x)\gamma}$. Then

$\displaystyle (\gamma^*\varphi\gamma)(1)=\gamma^*\varphi(1)\gamma=\gamma^*\mathbb{I}_r\gamma=\mathbb{I}_n.$

To show complete positivity, let ${[x_{ij}]\geq 0}$. Then

$\displaystyle (\gamma^*\varphi\gamma)[x_{ij}]=[\gamma^*\varphi(x_{ij})\gamma]=\Gamma^*[\varphi(x_{ij})]\Gamma\geq 0,$

where

$\displaystyle \Gamma =\begin{pmatrix} \gamma & \hdots & \gamma \end{pmatrix}^T.$

Thus ${\gamma^*\varphi\gamma}$ is completely positive since ${\varphi}$ is completely positive. Likewise, the statement for ${\varphi\oplus\psi}$ is analogously proved. ${CS(\mathcal{R})}$ is considered as the matricial version of the state space.

Example 2 and example 5 are the ones which we shall focus upon the next sections.

Written by Nirakar Neo

May 24, 2012 at 3:55 pm

Posted in Mathematics

Tagged with

## Some Objects of Operator Algebra

with one comment

This post is a continuation of this previous post. The aim of this post is to introduce some common objects that are encountered in operator algebra. All of the objects will be needed to understand the paper cited in the previous blog. I am bit terse here and examples are almost absent.

All the vector spaces considered here are assumed to be complex. Let ${V}$ be a vector space. ${M_{m,n}(V)}$ is the vector space of all ${m\times n}$ matrices with entries from ${V}$, under natural addition and scalar multiplication. In particular, if ${m=n}$, we write ${M_{n,n}(V)}$ as ${M_n(V)}$. If ${V=\mathbb{C}}$, then we denote ${M_{m,n}(\mathbb{C})}$ by ${\mathbb{M}_{m,n}}$, and ${M_{n,n}(\mathbb{C})}$ by ${\mathbb{M}_n}$. ${\mathbb{I}_n}$ is the identity matrix in ${\mathbb{M}_n}$.

We shall be using some mathematical objects which we’ll describe here. The simplest to start with are the operator spaces. Let ${\mathcal{H}}$ be any Hilbert space. An operator space ${\mathcal{M}}$ is a closed linear subspace of ${\mathfrak{B}(\mathcal{H})}$ Using ${\mathcal{M}}$ we construct matricial spaces ${M_n(\mathcal{M})}$. The natural inclusion

$\displaystyle M_n(\mathcal{M})\hookrightarrow M_n(\mathfrak{B}(\mathcal{H}))\simeq \mathfrak{B}(\mathcal{H}^n) \ \ \ \ \ (1)$

provides a norm on ${M_n(\mathcal{M})}$ using the operator norm on ${\mathfrak{B(\mathcal{H}^n)}}$. This description is called a concrete definition of an operator space. Ruan characterised operator spaces in the following abstract way.

We say ${\mathcal{M}=(V,\{\|.\|_n\})}$ is an operator space if ${(M_n(V),\|.\|_n)}$ is a normed linear space, with the norms satisfying

1. ${\|\alpha v\beta\| \leq \|\alpha\|\|v\|\|\beta\|}$, where ${\alpha,\beta\in\mathbb{M}_m}$ and ${v\in M_m(V)}$, and we define ${\alpha v \beta=[\Sigma_{j,k}\alpha_{ij}v_{jk}\beta_{kl}]\in M_m(V)}$.
2. ${\|v\oplus w\|_{m+n}=\text{max}\{\|v\|_m,\|v\|_n\}}$, where ${v\in M_m(V), w\in M_n(V)}$.

Operator spaces are usually considered as noncommutative analogues of Banach spaces. A special example of operator space is an operator system. Concretely, an operator system ${\mathcal{R}}$ is a closed subspace of ${\mathfrak{B}(\mathcal{H})}$, which is also closed under adjoint operation and contains the identity operator. Again we can form the different matrix levels ${M_n(\mathcal{R})}$. The important thing here is that the inclusion (1) defines an ordering on ${M_n(\mathcal{R})}$ which is induced by the usual ordering on ${\mathfrak{B}(\mathcal{H}^n)}$.

The commutative analogue of an operator system is a function system, which is a closed subspace of ${C(X)}$, where ${X}$ is a compact, and the subspace is closed under conjugation and contains the constant function, 1 (${1(x)=1}$ for all ${x\in X}$). Consider the following important example which we shall refer later. We recall that affine functions are those which preserve convex combinations.

Example 1. Let ${X}$ be a locally convex space, and let ${K}$ be a compact convex subset of ${X}$. Define

$\displaystyle A(K)=\{f:K\rightarrow\mathbb{C} \;|\; f \; \text{is continuous and affine}\}$

Then ${A(K)}$ is a function system. Clearly, ${A(K)\subset C(K)}$. It is closed since uniform limit of a sequence of continuous functions is again continuous, and the limit function is easily seen to be affine. Moreover it is closed under conjugation, and the constant function (${1(x)=1}$ for all ${x\in K}$) is trivially contained in ${A(K)}$. The positive elements in ${A(K)}$ are precisely those functions which map each ${x\in K}$ into ${[0,\infty)}$. In particular, 1 is a positive element.

Function systems are abstractly characterised as complete order unit spaces. An order unit space is a real-partially ordered vector space – ${A}$ – having an element ${e}$, which we call the order unit, satisfying

1. (Order Unit property) For each ${a\in A}$, there is an ${r\in \mathbb{R}}$ such that ${a\leq re}$.
2. (Archimedean property) For each ${a\in A}$, if ${a\leq re}$ for all ${r\in (0,\infty)}$, then ${a\leq 0}$.

On an order unit space ${(A,e)}$, we define a norm as

$\displaystyle \|a\|=\inf\{r\in \mathbb{R}:-re\leq a\leq re\}.$

Then ${A}$ becomes a normed linear space. It is easy to see that a function system satisfies all these properties.

Parallel to this characterisation, operator systems have been characterised by Choi-Effros in terms of matrix ordered spaces. Let ${V}$ be ${^*}$-vector space. Then ${M_n(V)}$ is also ${^*}$-vector space by ${[v_{ij}]^*=[v_{ij}^*]}$. We say ${V}$ is a matrix ordered space if there are proper cones ${M_n(V)^+\subset M_n(V)_{sa}}$ for all ${n\in \mathbb{N}}$, and such that

$\displaystyle \alpha^*M_{m}(V)^+\alpha\subset M_n(V)^+$

for all ${\alpha\in \mathbb{M}_{m,n}}$. Then we have the following definition.

A matrix ordered space ${(A,M_n(A)^+)}$ is an operator system if

1. ${A^+}$ is proper, that is, if ${a,-a\in A^+}$, then ${a=0}$
2. ${M_n(A)^+}$ is Archimedean for all ${n\mathbb{N}}$
3. ${A}$ has an order unit.

Given an operator system ${\mathcal{R}}$, we define the state space of ${\mathcal{R}}$ as

$\displaystyle S(\mathcal{R})=\{\varphi:\mathcal{R}\rightarrow \mathbb{C} \;|\; \|\varphi\|=1=\varphi(1)\}.$

Let ${V,W}$ are vector spaces and ${f:V\rightarrow W}$ is a linear map, then the canonical amplification of ${f}$ is the sequence of maps ${f_n:M_n(V)\rightarrow M_n(W)}$ defined by ${[v_{ij}]\mapsto [f(v_{ij})].}$

In the next post, I’ll start by defining matrix convex sets.

Written by Nirakar Neo

March 26, 2012 at 9:46 pm

Posted in Mathematics

## Shortest Proof of Irrationality of sqrt(N), where N is not a perfect square

This proof is really the shortest proof I have ever seen for this theorem, and it’s really witty! I stumbled this proof while browsing the American Mathematical Monthly periodicals, to be specific on page 524 in the June-July 2008 issue. I reproduce the proof here.

Theorem  Let ${N}$ be a positive integer. If ${N}$ is a not perfect square, then ${\sqrt{N}}$ is irrational.

Proof: Before embarking on the proof, recall that the standard proof uses the method of contradiction. Here we shall prove the contrapositive statement to prove the theorem. That is, we shall prove that if ${\sqrt{N}}$ is rational then ${N}$ is a perfect square.

Thus assume that ${\sqrt{N}}$ is rational. This means we can write

$\displaystyle \sqrt{N}=\frac{p}{q},$

where ${p,q}$ are integers, ${q\neq 0}$, and that it is in the lowest form. This implies ${p^2=Nq^2}$, which we can rearrange so as to write

$\displaystyle \frac{p}{q}=\frac{Nq}{p}.$

Since we assumed that ${p/q}$ is in lowest form, we see that both ${Nq}$ and ${p}$ must be a integral multiple of ${p}$ and ${q}$ respectively. Taking that constant multiple to be ${c}$, we may write ${Nq=cp}$ and ${p=qc}$. The latter equation says

$\displaystyle \frac{p}{q}=c,$

but since ${p/q}$ is nothing but ${\sqrt{N}}$, this implies that ${\sqrt{N}=c}$, which simply means that ${N}$ is a perfect square. This proves the contrapositive statement, and therefore the theorem follows. $\Box$

Isn’t it beautiful!

Written by Nirakar Neo

March 21, 2012 at 11:01 pm

Posted in Mathematics

## The Krein-Milman Theorem

with one comment

1. The Krein-Milman theorem in Locally Convex Spaces

My project work this semester focuses to understand the paper the Krein-Milman Theorem in Operator Convexity by Corran Webster and Soren Winkler, which appeared in the Transactions of the AMS [Vol 351, #1, Jan 99, 307-322]. But before reading the paper, it is imperative to understand the (usual) Krein-Milman theorem which is proved in the context of locally convex spaces. My understanding of this part follows the book A Course in Functional Analysis by J B Conway. To begin with we shall collect the preliminaries that we shall need to understand the Krein-Milman theorem.

1.1. Convexity

Let ${\mathbb{K}}$ denote the real(${\mathbb{R}}$) or the complex(${\mathbb{C}}$) number fields. Let ${X}$ be a vector space over ${\mathbb{K}}$. A subset of a vector space is called convex if for any two points in the subset, the line segment joining them lies completely in the subset. We make this idea more precise.

If ${a,b\in X}$, the line segment from ${a}$ to ${b}$ is given by the set

$\displaystyle [a,b]=\{(1-t)a+tb :0\leq t\leq 1\}.$

By an open line segmentwe shall mean that the end points should be excluded, that is

$\displaystyle (a,b)=\{(1-t)a+tb :0

We shall call a line segment proper if ${a\neq b}$. A subset ${V\subset X}$ is called convexif for any ${a,b\in V}$, the line segment ${[a,b]\subset V}$. It is easy to see that intersection of arbitrary number of convex subsets is again convex. We characterize a convex set as -

Proposition 1.1 : ${V}$ is convex if and only if whenever ${x_1,...,x_n\in V}$, and ${t_1,...,t_n\in [0,1]}$ with ${\sum_i t_i=1}$, then ${\sum_i t_ix_i\in V}$.

Proof: If the latter condition holds, then it is easy to see that ${V}$ is convex. For the converse part, we use induction on the number of points, ${n}$. For ${n=1}$, this is trivially true, while the case ${n=2}$ follows from the definition. Moreover note that if ${t_1=1}$, the theorem holds. So we shall assume ${t_1\neq 1}$. Suppose the theorem holds for ${k=n>2}$. Then, if there are ${n+1}$ points, say ${x_1,...,x_{n+1}}$, and ${t_1,...,t_{n+1}}$, with ${t_1\neq 1}$ and ${\sum_{i=1}^{n+1}t_i=1}$ then we can write

$\displaystyle t_1x_1+\sum_{i=2}^{n+1}t_ix_i=t_1x_1+(1-t_1)\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i.$

Now ${\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i\in V}$ by the induction hypothesis, and by the convexity of ${V}$, we get that ${\sum_{i=1}^{n+1}t_ix_i\in V}$. Thus the proof is complete. $\Box$

If ${V\subset X}$, the convex hull of ${V}$, denoted by ${\text{co}\;(V)}$, is the intersection of all convex sets containing ${V}$. Clearly this definition is meaningful since ${X}$ itself a convex set containing ${V}$, and that convexity is preserved under intersection. ${\text{co}\;(A)}$ is also convex. Using proposition 1.1, we have the following characterization of the convex hull of ${V}$.

The convex hull of ${V}$ is the set of all convex combinations of elements of ${V}$, that is

$\displaystyle co(A)=\left\lbrace t_1x_1+...+t_nx_n: x_i\in V,\; 0\leq t_i\leq 1,\; \sum_{i=1}^nt_i=1,\; n\; \text{is arbitrary}\right\rbrace$

As an example, the convex hull of three points ${x_1,x_2,x_3\in\mathbb{R}^2}$ is the closed triangular region formed by those points.

Definition 1. Let ${V}$ be a convex subset of a vector space ${X}$. A point ${a\in V}$ is called an extreme point of ${V}$, if there does not exist any proper open line segment containing ${a}$ and which lies completely in ${V}$. We denote the set of extreme points of ${V}$ by ${\text{ext}\;(V)}$.

Convex sets may or may not have extreme points. For example the set ${\{(x,y)\in\mathbb{R}^2:x^2+y^2<1\}}$, does not have any extreme point, whereas the extreme points of the set ${\{(x,y)\in\mathbb{R}^2:x^2+y^2\leq 1\}}$ are all the boundary points. The extreme points of a closed square region in ${\mathbb{R}^2}$ are its four corners.

We have following equivalent characterizations.

Proposition 1.3 : Let ${V}$ is convex subset of a vector space ${X}$, and let ${a\in V}$. Then the following are equivalent -

• (a) ${a\in\text{ext}\;V }$.
• (b) If ${x_1,x_2\in V}$ and ${0 and ${a=tx_1+(1-t)x_2}$, then either ${x_1\notin V}$ or ${x_2\notin V}$ or ${x_1=x_2=a}$.
• (c) If ${x_1,...,x_n\in V}$ and ${a\in\text{co}\;\{x_1,...,x_n\}}$, then ${a=x_k}$ for some ${k}$.
• (d) ${V-\{a\}}$ is a convex set.

Proof:The equivalence of ${(a)}$ and ${(b)}$ is straightforward from the definition of extreme point and convexity of ${V}$.

${(a)\Rightarrow (c)}$ : Let ${a\in\text{co}\;\{x_1,...,x_n\}}$. Then by proposition 1.2, we get that ${a=t_1x_1+...+t_nx_n}$, where ${0\leq t_i\leq 1}$, and ${\sum t_i=1}$. If ${t_i=1}$ for some ${i}$, we get that ${a=x_i}$. So suppose that ${0 for each ${i}$. Then we can write

$\displaystyle a=t_1x_1+(1-t_1)\sum_{i=2}^{n}\frac{t_i}{1-t_1}x_i$

This happens only if ${(1-t_1)x_1=t_2x_2+...+t_nx_n}$ which means ${a=x_1}$.

${(c)\Rightarrow (a)}$ : Suppose ${a\notin\text{ext}\;V}$. Then there are points ${x,y\in V}$ such that ${a=tx+(1-t)y}$, where ${0, and ${x\neq y}$. So ${a\in\text{co}\;\{x,y\}}$. Then ${(c)}$ dictates that ${a=x}$ or ${a=y}$. But then ${x=y}$.

${(a)\Rightarrow (d)}$ : We take two points ${x,y\in V-\{a\}}$, then ${\{tx+(1-t)y:0\leq t\leq 1\}\subset V-\{a\}}$, since ${a\neq tx+(1-t)y}$ for any ${t\in (0,1)}$. Thus ${V-\{a\}}$ is convex.

${(d)\Rightarrow (a)}$ : Suppose that ${a}$ is not an extreme point. Then there exists ${t\in (0,1)}$ and ${x,y\in V}$ with ${x\neq y}$, and that ${a=tx+(1-t)y}$. But this means that ${[x,y]}$ does not completely in ${V-\{a\}}$, which contradicts its convexity. $\Box$

We shall return to convexity once again after we have introduced the concept of a topological vector space.

Written by Nirakar Neo

January 13, 2012 at 7:53 pm

Posted in Mathematics

## A Day @ISI-Kol

Interviews always make me nervous. And that nervousness starts precisely 30 minutes before my call. Dunno why (about the timing)! Anyway, the interview was for the NBHM MA/MSc Scholarship. Well I already get a handsome scholarship from the Govt Of India (INSPIRE) but I wanted just to get a feel, whether I am still grounded after having plenty of courses. The interview was scheduled to be at 4 pm on 18th November and the venue was ISI, Kolkata. I reached Kolkata in the morning and stayed at my friend’s room – thanks to him. I was also accompanied by my friend and classmate who was also called for the interview.

ISI Kolkata is a big place, and has numerous departments. From studies in social sciences to research advances in mathematics and theoretical statistics. No wonder why this institute has been conferred the title of institute of national importance. Coming back to point, we waited for almost the whole day, till our turn at around 5:30 pm. Till then we met students from Vivekananda University, and guys from IMA; and interacted with them about there courses etc. We took lunch at the ISI Canteen, where the food rate is incredibly subsidized – 50 “paise” for 5 chapatis and curry. We waited, waited and waited. Not to mention I had cold, and a bad headache.

Finally I was called, and already my nervousness has reached a saturated point. But it declined appreciably once I entered the lecture room. There were four professors, and one among them was Prof SC Bagchi, whom I had already acquaintance with; the rest looked familiar but I didn’t know the names. We exchanged smiles, and I eased a lot. The interview started with the obvious question – “what courses have you studied?”. Needless to say I would have easily spent 10 minutes just recounting the name of the courses. I named a few, and mentioned along that there were many in which I wasn’t comfortable with. Maintaining truthfulness is necessary to avoid questions from an area which one doesn’t know well. The first question was asked by Prof Bagchi, who asked me to prove that the closed interval [0,1] is a connected set in the real line. I said that the question boils down to proving that the only subsets of the real line which are both open and closed are the empty set and the whole real line. He asked me to prove this. Now this requires a little thinking. And I’ll leave it to you to figure it out or take the help of a book [Topology-Munkres]. I did succeed only after considerable hints from him.

The next question was an easy one – to give an example of a non-holomorphic function. I shot back saying $f(z)=\overline{z}$;  coz it didn’t satisfy the Cauchy Riemann equations. Immediately the next question was when we can say that the function is analytic – which is again simple – that in addition to satisfying the Cauchy-Riemann equations, it should also have continuous partial derivatives. Then I was asked that given a point in a domain where a function is analytic (=holomorphic), what can I say about the radius of convergence. I said that we can get the radius of convergence by the formula $R^{-1}=\limsup |a_n|^{1/n}$. They clarified that what I can say readily by just looking at the domain, and hinted Cauchy’s theorem. I told that by the theorem, it will be the radius of the largest circle that lies completely in the domain. They asked me whether I can put this in topological terms. I thought a bit, and then it dawned. “Yeah, distance from the boundary”. This was exactly they wanted me to tell.  Another guy from the committee asked me whether I was familiar with the notion of analytic continuation. Yes was the reply, and then I explained the rudimentary idea of how one can analytically continue a holomorphic function which agrees with another holomorphic function on a common domain. I was then asked to write a function which is the analytic continuation of $\sum_{n=0}^{\infty} z^n$, which of course if $\frac{1}{1-z}$ but didn’t hit me then.

Finally the linear algebra part. I was asked how can assign a meaning to $e^A$, where $A$ is a $m\times n$ matrix. I didn’t know so told them flatly. They then modified the question that suppose the matrix given is a symmetric matrix, and hinted some diagonalisation theorem. I got it immediately, so the point is that if $A$ is a symmetric matrix, it is similar to a diagonal matrix via real orthogonal matrix. This means $A=ODO^{-1}$, where $\latex D$ is diagonal. Then the computation is easy and the expression $e^A$ is also well-defined via $e^A=O e^D O^{-1}$, where $e^D$ is nothing but the diagonal matrix obtained by taking the exponential of the diagonal entries of $D$. Then they asked – what if $A$ is nilpotent. In that case the expansion contains only finite number of terms and hence $e^A$ is readily defined. So finally what can I say about a general matrix, and hinted “semi-simple”. Though I had read semi-simple operators, but I couldn’t recall then. Actually the result is that on a finite-dimensional vector space over the complex field, semisimple is equivalent to diagonalizable, and every operator on that space may be written as a sum of a semisimple operator and nilpotent operator which commute. So using this fact we may define $e^A$ for general operators.

With that I had spent almost half an hour, and according to me the interview was quite OK. After me, was my friend’s turn. Finally we had a nice chat with the professors, and got some good advice. Altogether it was a good experience.

Written by Nirakar Neo

November 19, 2011 at 9:58 am

Posted in Mathematics

Tagged with , ,

## Lecture 1. Notion of Derivative

We shall always assume that ${X,Y}$ are Banach spaces over the complex field ${\mathbb{C}}$. The real number field is denoted by ${\mathbb{R}}$. We shall now explore differential calculus in Banach spaces. It will be instructive to try to understand the theorems in well known Banach spaces like ${\mathbb{R}^{n}}$ (over ${\mathbb{R}}$) or ${\ell^{2}}$ (over ${\mathbb{C}}$) or ${C[a,b]}$ (over ${\mathbb{R}}$). In general there are two notions of derivatives – the Gateaux derivative and the Frechet derivative, where the latter is stronger in the sense that Frechet differentiability implies Gateaux differentiability.

Definition: Let ${f:X \rightarrow Y}$, where ${X,Y}$ are Banach space. For ${a,h\in X}$, define

$\displaystyle \delta f(a,h)=\lim_{t\rightarrow 0}\frac{f(a+th)-f(a)}{t},\quad t\in \mathbb{R}$

provided the limit exists. Then ${\delta f(a,h)}$ is called the directional derivative of ${f}$ at ${a}$ in the direction of ${h}$. If this limit exists for every ${h\in X}$ and if the function ${h\mapsto \delta f(a,h)}$ is linear and continuous in ${h}$, then we say that the function is Gateaux differentiable and the function defined by

$\displaystyle Df(a)(h)=\delta f(a,h)$

is called the Gateaux derivative at ${a}$. Thus the linearity and continuity condition means ${Df(a)\in L(X,Y).}$

[An example to show that the map ${h\mapsto \delta f(a,h)}$ is not always linear, and even if ${Df(a)}$ exists, it may not be continuous at ${a}$.]

Definition: Let ${f:X \rightarrow Y}$, where ${X,Y}$ are Banach space. Let ${U\subset X}$ be open subset. We say that ${f}$ is Frechet differentiable at ${a\in U}$, if there exists a bounded linear map ${A\in L(X,Y)}$ such that

$\displaystyle \lim_{h\rightarrow 0}\frac{\left\|f(a+h)-f(a)-A(h)\right\|}{\left\|h\right\|}=0$

Some remarks should be in order, which we leave as simple exercises. The Frechet derivative is unique, and Frechet differentiability implies continuity. If ${f'(a)}$ exists, then also ${Df(a)}$ exists, and we have

$\displaystyle Df(a,h)=f'(a)h.$

Usually when we say ${f}$ is differentiable, we shall mean Frechet differentiable. [An example for which the Gateaux derivative exists but Frechet derivative fails to exist.]

Our next step is to prove the mean value theorem.

Theorem: Let ${f:X\rightarrow Y}$, and suppose ${\delta f(a+t(b-a),b-a)}$ exists for all ${t\in[0,1]}$, and is a continuous map of ${t}$, then

$\displaystyle f(b)-f(a)=\int_{0}^{1}\delta f(a+t(b-a),b-a) dt.$

Proof: Let ${\phi\in Y^{*}}$. Define ${g(t)=\phi(f(a+t(b-a),b-a)):[0,1]\rightarrow\mathbb{R}}$ is differentiable, since it is the composition of two differentiable function. We have ${g'(t)=\phi(\delta f(a+t(b-a),b-a))}$. Using the fundamental theorem of calculus for real valued functions, we get

$g(1)-g(0) = \int_{0}^{1}g'(t) dt =\int_{0}^{1}\phi(\delta f(a+t(b-a),b-a)) dt$

$= \phi\left(\int_{0}^{1}\delta f(a+t(b-a),b-a) dt\right)$

Also, ${g(1)-g(0)=\phi(f(b)-f(a))}$, so altogether we get,

$\displaystyle \phi\left(f(b)-f(a)-\int_{0}^{1}\delta f(a+t(b-a),b-a) dt\right)=0.$

Since this holds for all ${\phi\in Y^{*}}$, we get the desired result.

Notice further that

$\displaystyle \|f(b)-f(a)\|\leq \int_{0}^{1}\|\delta f(a+t(b-a),b-a)\| dt\leq \sup_{t\in[0,1]}\|\delta f(a+t(b-a),b-a)\|.$

This is referred to as the mean value theorem in higher dimensions. The usual mean value theorem does not hold in higher dimensions as is seen by this example. Define ${f(t)=(\cos(t),\sin(t))}$ for ${t\in[0,2\pi]}$. Then ${f}$ is differentiable, and ${f(2\pi)-f(0)=0}$, but there is no such ${t\in(0,1)}$ for which ${f'(t)=(-\sin(t),\cos(t))}$ is zero.

Our next step is to generalize the chain rule. However we shall not prove the rule using strong hypotheses that both functions in a composition are Frechet differentiable. Instead we have -

Theorem: Let ${a\in X, g(a)=b\in Y}$. Suppose ${f}$ is Frechet differentiable at ${b}$ and ${g}$ is having directional derivative in the direction ${h}$ at ${a}$, then ${f\circ g}$ has a directional derivative at ${a}$, and

$\displaystyle \delta(f\circ g)(a,h)=f'(b)\delta g(a,h).$

Before plunging into a proof, let us have an informal discussion. On the right hand side of the above equation, we have a linear transformation ${f'(b):Y\rightarrow Z}$ acting on an element of Y – ${\delta g(a,b)}$. Let us see what we have got. ${g}$ has directional derivative at ${a}$ in the direction of ${h}$. This translates into

$\displaystyle \lim_{t\rightarrow 0}\frac{g(a+th)-g(a)}{t}=\delta g(a,h),$

that is given ${\epsilon>0}$ there exists ${\delta>0}$ such that if ${\left|t\right|<\eta}$ we have

$\displaystyle \|g(a+th)-g(a)-t\delta g(a,h)\|<\epsilon\left| t\right|.$

${f}$ is Frechet differentiable at ${b=g(a)}$, so

$\displaystyle \|f(b+k)-f(b)-Ak\|\leq\theta\|k\|,$

where the right hand side goes to zero as ${\|k\|\rightarrow 0}$. We may take

$\displaystyle k=g(a+th)-g(a)=t\delta g(a,h)+\tilde{\theta}\left|t\right|,$

which certainly goes to zero at ${t\rightarrow 0}$. If we plug this in the equation of ${f}$, we’ll get

$\displaystyle f(b+k)-f(b)=f'(b)k+\theta \|k\|=t f'(h)\delta g(a,h)+\tilde{\theta}\|t\|f'(b)+\theta \|k\|.$

To complete the proof we merely have to show that the last two terms go to zero. This we shall prove formally in the next post.

Written by Nirakar Neo

September 4, 2011 at 12:06 pm

Posted in Mathematics

Follow

### Follow “Nirakar Neo's Blog”

Get every new post delivered to your Inbox.