Nirakar Neo's Blog

Mathematics and Trivia

Protected: Post to remove content

This content is password protected. To view it please enter your password below:

A Rough Guide to Applying Abroad

The first two batches of NISER have done a commendable job in getting placed at good to jaw-dropping institutes in both India and abroad. This post, asked by some friends in the junior batches, describes some general information regarding preparation of GRE and TOEFL exams, specifically aimed at NISER students.

Disclaimer – Since the title says “rough”, I would like to assert that I may be leaving out some important information which you may have to fill on your own. Also, don’t forget to cross-check the facts given over here. I’ll polish this post with time. I’ll add a FAQ if you got some questions.

The first thing is to make sure that you want to go for a PhD in you subject area. Then you need to ask yourself whether you really want to apply to abroad. This choice may vary from person to person and also remember that applying to US universities will require ample monetary backup. If you are okay with this and want to apply to a university on the foreign soil, you need to do some research on where you want to do your PhD – at the US, or in Europe. Both are attractive options and your choice may depend on research areas and future prospects. In any case, I leave this aspect for you to decide. Also, one should remember that the preparation part is not just for the GRE/TOEFL exams but also to prepare you to handle the US environment more smoothly and to instill confidence in using English fluently.

For better preparation and to avoid last minute panic, you need to decide quite early. Invariably, you need to decide well on this matter in your 4th year, or at most by the end of 4th year. The smart way will be to decide these things by the end of 3rd year. You need not have to have a clear opinion on your choice of universities, but some idea will suffice.

Studying abroad requires that you need to appear for at least three tests – two GRE tests and a TOEFL test. And foremost, you must have a passport. If you don’t have it, close the post and open

For the uninitiated, GRE stands for Graduate Record Examination and TOEFL stands for Test of English as a Foreign Language. GRE primarily tests verbal reasoning, quantitative reasoning, and analytical writing skills, as the website puts; whereas TOEFL tests individual’s ability to use and understand English in an academic setting. Both tests are administered by ETS and are standardized. Standardized means that the test is designed in such a way that the questions, conditions for administering, scoring procedures, and interpretations are consistent and are administered and scored in a predetermined, standard manner. Both test scores are asked by the US universities for graduate admissions. Some universities may leave out the Subject GRE score.

I shall first focus on the GRE exams. There are two types of GRE exams – one is called the General GRE and the other is the Subject GRE. The general GRE assesses the above mentioned skills, whereas the subject GRE tests the knowledge and skills acquired in a particular subject and it covers standard undergraduate syllabus. The skills acquired in the first three years of NISER are more than sufficient to appear for the exam.

So, we first focus on the general GRE exam. It can be tougher than expected. It will be wiser to begin preparing well before the actual exam. Give yourself at least six months to prepare leisurely for the general GRE exam. You need to focus on two things – to understand what’s written and to write with full understanding. The rote-learning of word-meanings is secondary, though somewhat necessary. If you have confidence that you have good command over words, start with a self-test.  There is high frequency word list which you should be acquainted with. You need not remember all the near-thousand words, but reading one should strike in your thought, and at least you should be able to decipher it from the context. It’s also available on the internet, for example here. If you believe that your grasp on words is poor, use the high-frequency word list to learn these new words. Since the list usually contains a short meaning to the word, it will be better to take the extra pain to write each word in a notebook with its meaning and a sentence which illustrates its usage.

The next step is to get hold of a copy of this book published by ETS –

If you can’t buy, you should be able to get the book in the library, or get a copy from a senior. But I would recommend that one should buy. Having a personal copy helps in marking texts and adding notes in the books itself. Plus, if it’s a second hand, then many practice questions would have already been solved a priori, adding to your disadvantage.

Then simply follow the instructions given in the book. Following the book will be more than enough for the preparation, lest you really read it carefully, understand the tasks given and complete them. Remember that you have to close your mind, and just focus on what it’s being asked. That’s the key. Just follow logic and the details given in the question. Do not add your own assumptions in answering them.

** If you find the above way boring, then you can choose an alternative, better way. Read some dense texts, like articles in the Frontline magazine, or editorials of the Hindu. Don’t just read, but try to understand what that article wants to say. And write the meanings of the words that you encounter but do not understand. Finally, close the article and write about the article you just read in your own words. It needn’t be long but should be well-organized. This is arguably the best way to prepare and does give results, believe me. You can start following (**) for the first three months, and you’ll notice later that you won’t have difficulty at all in following the ETS book. Thereafter, in the next three months focus entirely on the tasks given in the book. Use the last two months to hone your writing skills.

For the quantitative section of the exam, focus especially on the probability and statistics part, the other parts are relatively easy.

For the subject GRE exam, again get a book and practice it. Plus you can take help of the internet in the absence of the book (like physics).

Once you have prepared enough for GRE exam, it will be a breeze to ace the TOEFL exam. These days most students take TOEFL iBT test. It’s an internet based test. There are four sections – reading, writing, listening and speaking. But the tasks given usually have mixed tasks. Like you’ll have to listen to a group discussion and then write something on a topic relevant to the discussion.  Here, you’ll have to practice especially these parts.  There are only a few catch points. In the speaking section, you’ll be given a specific topic to speak or place your arguments. It will be better to practice this part. Having a good score on the speaking section is important to get TA at US universities. For preparation, a book is available, but it will be better to get hold of the CD which comes together with the book and complete a sample test.

While registering for the GRE and TOEFL exams you have to give a list of four universities to which the test scores will be sent. Sending scores to universities usually costs additional 18 to 25 dollars for each university. It will be beneficial if you select four universities beforehand.

Taking the tests in the beginning of the fourth year, will spare you the pain and tension in the last year (you could focus on your thesis work better and other aspects of applying abroad), and also leaves you an opportunity to take them again if you need again to improve test results. Remember that GRE scores are valid for 5 years and TOEFL scores are valid for 2 years.

Matrix Convex Sets

In the previous post, we introduced some objects in operator algebra. Those will occur in the examples given in this post. The purpose of this post is to define matrix convex sets.

We begin by defining what we mean by a matrix convex set. There are two definitions, and we shall show that both are equivalent. Then we shall present some examples of matrix convex sets.

Definition 1. A matrix convex set in a vector space {V} is a collection {K=(K_n)}, where for each {n\in\mathbb{N}} {K_n\subset M_n(V)} is a non-empty set such that

\displaystyle \sum_{i=1}^{k}\gamma_i^*v_i\gamma_i\in K_n,

whenever {v_1\in K_{n_1}, ..., v_k\in K_{n_k}}, and {\gamma_1\in \mathbb{M}_{n_1,n},...,\gamma_k\in \mathbb{M}_{n_k,n}} satisfying {\sum_{i=1}^k\gamma_i^*\gamma_i=\mathbb{I}_n}.

The above definition seems like a natural generalisation of convexity. But the next one is usually easier to use in specific cases.

Definition 2. A matrix convex set in a vector space {V} is a collection {K=(K_n)}, where for each {n\in\mathbb{N}} {K_n\subset M_n(V)} is a non-empty set such that

For any {\gamma\in \mathbb{M}_{r,n}}, with {\gamma^*\gamma=\mathbb{I}_n}, we have {\gamma ^* K_r \gamma\subset K_n}.
For any {m,n\in\mathbb{N}}, {K_m\oplus K_n \subset K_{m+n}}.

We now show the equivalence of both definitions.

Definition 1 {\Rightarrow } Definition 2. The first part of definition 2 follows trivially (using {k=1}). To show the second part, let {v\in K_m} and {w\in K_n}; then we have to show that

\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix}\in K_{m+n}


\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix} = \gamma_1^*v\gamma_1+\gamma_2^*w\gamma_2,


\displaystyle \gamma_1=[\mathbb{I}_m \quad 0_{m,n}] \quad\quad \text{and}\quad \quad \gamma_2=[0_{n,m} \quad \mathbb{I}_n],

and clearly, {\gamma_1^*+\gamma_1+\gamma_2^*\gamma_2=\mathbb{I}_{m,n}}. Thus (2) is also satisfied.

Definition 2 {\Rightarrow } Definition 1. Let {v_i\in K_{n_i}}, {\gamma_i\in \mathbb{M}_{n_i,n}} for {i=1,..,k} satisfying {\sum_{i=1}^k\gamma^*_i\gamma_i\in K_n}. Then by extension of (2) in definition 2, we have

\displaystyle v = \begin{pmatrix} v_1 & & & \\ & v_2 & & \\ & & \ddots & \\ & & & v_k \end{pmatrix} \in K_{n_1+n_2+...+n_k}

Now let {\gamma=\begin{pmatrix} \gamma_1 & \hdots & \gamma_k \end{pmatrix}^T\in \mathbb{C}_{n_1+n_2+...+n_k,n}}. Then {\gamma^*\gamma=\sum\gamma_i^*\gamma_i=\mathbb{I}_n} and by definition 1 we have {\gamma^*v\gamma\in K_n} which implies {\sum_1^k\gamma_i^*v_i\gamma_i\in K_n}.

An important consequence is that each {K_n} is a convex set of {M_n(V)}. Let {K=(K_n)} be a matrix convex set in {V}. Let {v_1,v_2\in K_n}. We need to show that {\lambda v_1+(1-\lambda)v_2\in K_n} for {\lambda\in [0,1]}. Set {\gamma_1=\sqrt{\lambda}\;\mathbb{I}_n\in\mathbb{M}_n}, and {\gamma_2=\sqrt{1-\lambda}\;\mathbb{I}_n\in\mathbb{M}_n}. Then

\displaystyle \gamma_1^*\gamma_1+\gamma_2^*\gamma_2=(\sqrt{\lambda}\;\mathbb{I}_n)(\sqrt{\lambda}\;\mathbb{I}_n)+(\sqrt{1-\lambda}\;\mathbb{I}_n)(\sqrt{1-\lambda}\;\mathbb{I}_n) = (\lambda+1-\lambda)\;\mathbb{I}_n=\mathbb{I}_n.

Then {\gamma_1^*v_1\gamma_1+\gamma_2^*v_2\gamma_2\in K_n}, which is nothing but {\lambda v_1+(1-\lambda)v_2\in K_n}.

We present some examples of matrix convex sets. We start with the simplest vector space.

Example 2. Let {a,b\in [-\infty, \infty ]}. On the vector space {\mathbb{C}}, consider the collection

\displaystyle [a\mathbb{I},b\mathbb{I}]=([a\mathbb{I}_n,b\mathbb{I}_n]),

where {[a\mathbb{I}_n,b\mathbb{I}_n]=\{\alpha\in\mathbb{M}_n: a\mathbb{I}_n\leq \alpha\leq b\mathbb{I}_n\}}. This collection defines a matrix convex set in {\mathbb{C}}. We shall now verify that this indeed satisfies the conditions mentioned above (using definition 2).

Let {\alpha\in \mathbb{M}_{r,n}}, with {\alpha^*\alpha=\mathbb{I}_n}. Let {v\in K_r}, then {a\mathbb{I}_r\leq v\leq b\mathbb{I}_r}. This implies

\displaystyle a(\alpha^*\mathbb{I}_r\alpha) \leq \alpha^*v\alpha \leq b( \alpha^*\mathbb{I}_r\alpha)

which means {a\mathbb{I}_n\leq v \leq b\mathbb{I}_n}. Thus {\alpha^*v\alpha\in K_n}. Finally let {v\in K_m} and {w\in K_n}. Then we show that

\displaystyle \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix} \in K_{m+n}

But, for {z_1\in \mathbb{C}^m,z_2\in \mathbb{C}^n}, we have

\displaystyle \left\langle \begin{pmatrix} v- a\mathbb{I}_n & 0 \\ 0 & w-a\mathbb{I}_n \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix},\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}\right\rangle= \langle (v-a\mathbb{I}_m)z_1,z_1 \rangle+\langle (w-a\mathbb{I}_n)z_2,z_2 \rangle\geq 0

And similarly we can show the other parts. This completes our verification. Conversely, we can show that any matrix convex set {K=(K_n)} in {\mathbb{C}}, where {K_1} is a closed convex subset of {\mathbb{R}} is a closed matrix interval.

Theorem 1 Suppose that {K=(K_n)} is a matrix convex set in {\mathbb{C}} where {K_1} is a bounded closed subset of {\mathbb{R}}. Then {K} must be a closed matrix interval.

Proof: Since {K_1} is a bounded closed and convex it must be a closed interval in {\mathbb{R}}, say {K_1=[a,b]}. Let {\gamma\in K_n}. Then we have to show that {\alpha\mathbb{I}_n\leq \gamma\leq \beta\mathbb{I}_n}. Let us first show that {\gamma-\alpha\mathbb{I}_n\geq 0}. Let {\xi\in\mathbb{C}^n}, then

\displaystyle \begin{array}{rcl} && \langle (\gamma-\alpha\mathbb{I}_n)\xi,xi \rangle \geq 0 \\ &\Leftrightarrow& \langle \gamma\xi,\xi\rangle-\langle\alpha\mathbb{I}_n\xi,\xi\rangle \geq 0 \\ &\Leftrightarrow& \xi^*\gamma\xi \geq \alpha \xi^*\xi=\alpha \end{array}

But from property 1, we see that {\xi^*\gamma\xi\in K_1}, and thus the last statement is true; hence {\gamma\geq \alpha\mathbb{I}_n}. Similarly, we can show {\gamma\leq \beta\mathbb{I}_n}. Thus {\alpha\mathbb{I}_n\leq \gamma\leq \beta\mathbb{I}_n}.

Conversely, let {\gamma\in [a\mathbb{I}_n,\beta\mathbb{I}_n]}, then {\gamma} is self-adjoint and hence we can write

\displaystyle \gamma=u^*D u=\begin{pmatrix} u_1^* & \hdots & u_n^* \end{pmatrix}\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}\begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix},

where {\gamma_i\in [\alpha,\beta]}. Then by property 2, we see that {D\in K_n}, and by property 1, {u^*Du\in K_n}, that is {\gamma\in K_n}. \Box

Example 3. Let {\mathcal{M}} be an operator space. Then the collection of unit balls, {B=(B_n)} where

\displaystyle B_n=\{x\in M_n(\mathcal{M}): \|x\|\leq 1\}

is a matrix convex set. It is almost trivial to verify the conditions of matrix convexity using the abstract definition of operator space (given in this post).

Example 4. Let {\mathcal{R}} be an operator system. The collection of positive cones, {P=(P_n)} where

\displaystyle P_n=\{x\in M_n(\mathcal{R}):x\geq 0\}

is a matrix convex set. For let {\gamma\in \mathbb{M}_{r,n}} with {\gamma^*\gamma=\mathbb{I}_n}. Let {v\in P_r}. Then for {\xi\in\mathbb{C}^n} we have

\displaystyle \langle(\gamma^*v\gamma)\xi,\xi\rangle=\xi^*\gamma^*v\gamma\xi=(\gamma\xi)^*v(\gamma\xi)\geq 0

since {v\geq 0}. Similarly, if {v\in P_m} and {w\in P_n}, then it is fairly obvious that {v\oplus w\in P_{m+n}}.

Example 5. Again let {\mathcal{R}} be an operator system. Consider the collection of matrix states {CS(\mathcal{R})=(CS_n(\mathcal{R}))} where

\displaystyle CS_n(\mathcal{R})=\{\varphi:\mathcal{R}\rightarrow \mathbb{M}_n \; | \; \varphi\; \text{is completely positive and} \; \varphi(1)=\mathbb{I}_n\}.

We say {\varphi:\mathcal{R}\rightarrow \mathbb{M}_n} is completely positive if canonically amplified maps {\varphi_r:M_r(\mathcal{R})\rightarrow M_r(\mathbb{M}_n)} are positive for all {r\in\mathbb{N}}. Then {CS(\mathcal{R})} is a matrix convex set in {\mathcal{R}^*}. Here we have identified an {\varphi\in CS_n(\mathcal{R})} as an element in {M_n(\mathcal{R^*})}. For verification, let {\varphi\in CS_r(\mathcal{R})} and {\gamma\in\mathbb{M}_{r,n}}. Then we have to show that {\gamma^*\varphi\gamma} is completely positive and {\gamma^*\varphi\gamma(1)=\mathbb{I}_n}. By definition, {\gamma^*\varphi\gamma(x)= \gamma^*\varphi(x)\gamma}. Then

\displaystyle (\gamma^*\varphi\gamma)(1)=\gamma^*\varphi(1)\gamma=\gamma^*\mathbb{I}_r\gamma=\mathbb{I}_n.

To show complete positivity, let {[x_{ij}]\geq 0}. Then

\displaystyle (\gamma^*\varphi\gamma)[x_{ij}]=[\gamma^*\varphi(x_{ij})\gamma]=\Gamma^*[\varphi(x_{ij})]\Gamma\geq 0,


\displaystyle \Gamma =\begin{pmatrix} \gamma & \hdots & \gamma \end{pmatrix}^T.

Thus {\gamma^*\varphi\gamma} is completely positive since {\varphi} is completely positive. Likewise, the statement for {\varphi\oplus\psi} is analogously proved. {CS(\mathcal{R})} is considered as the matricial version of the state space.

Example 2 and example 5 are the ones which we shall focus upon the next sections.

Some Objects of Operator Algebra

This post is a continuation of this previous post. The aim of this post is to introduce some common objects that are encountered in operator algebra. All of the objects will be needed to understand the paper cited in the previous blog. I am bit terse here and examples are almost absent.

All the vector spaces considered here are assumed to be complex. Let {V} be a vector space. {M_{m,n}(V)} is the vector space of all {m\times n} matrices with entries from {V}, under natural addition and scalar multiplication. In particular, if {m=n}, we write {M_{n,n}(V)} as {M_n(V)}. If {V=\mathbb{C}}, then we denote {M_{m,n}(\mathbb{C})} by {\mathbb{M}_{m,n}}, and {M_{n,n}(\mathbb{C})} by {\mathbb{M}_n}. {\mathbb{I}_n} is the identity matrix in {\mathbb{M}_n}.

We shall be using some mathematical objects which we’ll describe here. The simplest to start with are the operator spaces. Let {\mathcal{H}} be any Hilbert space. An operator space {\mathcal{M}} is a closed linear subspace of {\mathfrak{B}(\mathcal{H})} Using {\mathcal{M}} we construct matricial spaces {M_n(\mathcal{M})}. The natural inclusion

\displaystyle M_n(\mathcal{M})\hookrightarrow M_n(\mathfrak{B}(\mathcal{H}))\simeq \mathfrak{B}(\mathcal{H}^n) \ \ \ \ \ (1)

provides a norm on {M_n(\mathcal{M})} using the operator norm on {\mathfrak{B(\mathcal{H}^n)}}. This description is called a concrete definition of an operator space. Ruan characterised operator spaces in the following abstract way.

We say {\mathcal{M}=(V,\{\|.\|_n\})} is an operator space if {(M_n(V),\|.\|_n)} is a normed linear space, with the norms satisfying

  1. {\|\alpha v\beta\| \leq \|\alpha\|\|v\|\|\beta\|}, where {\alpha,\beta\in\mathbb{M}_m} and {v\in M_m(V)}, and we define {\alpha v \beta=[\Sigma_{j,k}\alpha_{ij}v_{jk}\beta_{kl}]\in M_m(V)}.
  2. {\|v\oplus w\|_{m+n}=\text{max}\{\|v\|_m,\|v\|_n\}}, where {v\in M_m(V), w\in M_n(V)}.

Operator spaces are usually considered as noncommutative analogues of Banach spaces. A special example of operator space is an operator system. Concretely, an operator system {\mathcal{R}} is a closed subspace of {\mathfrak{B}(\mathcal{H})}, which is also closed under adjoint operation and contains the identity operator. Again we can form the different matrix levels {M_n(\mathcal{R})}. The important thing here is that the inclusion (1) defines an ordering on {M_n(\mathcal{R})} which is induced by the usual ordering on {\mathfrak{B}(\mathcal{H}^n)}.

The commutative analogue of an operator system is a function system, which is a closed subspace of {C(X)}, where {X} is a compact, and the subspace is closed under conjugation and contains the constant function, 1 ({1(x)=1} for all {x\in X}). Consider the following important example which we shall refer later. We recall that affine functions are those which preserve convex combinations.

Example 1. Let {X} be a locally convex space, and let {K} be a compact convex subset of {X}. Define

\displaystyle A(K)=\{f:K\rightarrow\mathbb{C} \;|\; f \; \text{is continuous and affine}\}

Then {A(K)} is a function system. Clearly, {A(K)\subset C(K)}. It is closed since uniform limit of a sequence of continuous functions is again continuous, and the limit function is easily seen to be affine. Moreover it is closed under conjugation, and the constant function ({1(x)=1} for all {x\in K}) is trivially contained in {A(K)}. The positive elements in {A(K)} are precisely those functions which map each {x\in K} into {[0,\infty)}. In particular, 1 is a positive element.

Function systems are abstractly characterised as complete order unit spaces. An order unit space is a real-partially ordered vector space – {A} – having an element {e}, which we call the order unit, satisfying

  1. (Order Unit property) For each {a\in A}, there is an {r\in \mathbb{R}} such that {a\leq re}.
  2. (Archimedean property) For each {a\in A}, if {a\leq re} for all {r\in (0,\infty)}, then {a\leq 0}.

On an order unit space {(A,e)}, we define a norm as

\displaystyle \|a\|=\inf\{r\in \mathbb{R}:-re\leq a\leq re\}.

Then {A} becomes a normed linear space. It is easy to see that a function system satisfies all these properties.

Parallel to this characterisation, operator systems have been characterised by Choi-Effros in terms of matrix ordered spaces. Let {V} be {^*}-vector space. Then {M_n(V)} is also {^*}-vector space by {[v_{ij}]^*=[v_{ij}^*]}. We say {V} is a matrix ordered space if there are proper cones {M_n(V)^+\subset M_n(V)_{sa}} for all {n\in \mathbb{N}}, and such that

\displaystyle \alpha^*M_{m}(V)^+\alpha\subset M_n(V)^+

for all {\alpha\in \mathbb{M}_{m,n}}. Then we have the following definition.

A matrix ordered space {(A,M_n(A)^+)} is an operator system if

  1. {A^+} is proper, that is, if {a,-a\in A^+}, then {a=0}
  2. {M_n(A)^+} is Archimedean for all {n\mathbb{N}}
  3. {A} has an order unit.

Given an operator system {\mathcal{R}}, we define the state space of {\mathcal{R}} as

\displaystyle S(\mathcal{R})=\{\varphi:\mathcal{R}\rightarrow \mathbb{C} \;|\; \|\varphi\|=1=\varphi(1)\}.

Let {V,W} are vector spaces and {f:V\rightarrow W} is a linear map, then the canonical amplification of {f} is the sequence of maps {f_n:M_n(V)\rightarrow M_n(W)} defined by {[v_{ij}]\mapsto [f(v_{ij})].}

In the next post, I’ll start by defining matrix convex sets.

Shortest Proof of Irrationality of sqrt(N), where N is not a perfect square

This proof is really the shortest proof I have ever seen for this theorem, and it’s really witty! I stumbled this proof while browsing the American Mathematical Monthly periodicals, to be specific on page 524 in the June-July 2008 issue. I reproduce the proof here.

Theorem  Let {N} be a positive integer. If {N} is a not perfect square, then {\sqrt{N}} is irrational.

Proof: Before embarking on the proof, recall that the standard proof uses the method of contradiction. Here we shall prove the contrapositive statement to prove the theorem. That is, we shall prove that if {\sqrt{N}} is rational then {N} is a perfect square.

Thus assume that {\sqrt{N}} is rational. This means we can write

\displaystyle \sqrt{N}=\frac{p}{q},

where {p,q} are integers, {q\neq 0}, and that it is in the lowest form. This implies {p^2=Nq^2}, which we can rearrange so as to write

\displaystyle \frac{p}{q}=\frac{Nq}{p}.

Since we assumed that {p/q} is in lowest form, we see that both {Nq} and {p} must be a integral multiple of {p} and {q} respectively. Taking that constant multiple to be {c}, we may write {Nq=cp} and {p=qc}. The latter equation says

\displaystyle \frac{p}{q}=c,

but since {p/q} is nothing but {\sqrt{N}}, this implies that {\sqrt{N}=c}, which simply means that {N} is a perfect square. This proves the contrapositive statement, and therefore the theorem follows. \Box

Isn’t it beautiful!

The Krein-Milman Theorem

1. The Krein-Milman theorem in Locally Convex Spaces

My project work this semester focuses to understand the paper the Krein-Milman Theorem in Operator Convexity by Corran Webster and Soren Winkler, which appeared in the Transactions of the AMS [Vol 351, #1, Jan 99, 307-322]. But before reading the paper, it is imperative to understand the (usual) Krein-Milman theorem which is proved in the context of locally convex spaces. My understanding of this part follows the book A Course in Functional Analysis by J B Conway. To begin with we shall collect the preliminaries that we shall need to understand the Krein-Milman theorem.

1.1. Convexity

Let {\mathbb{K}} denote the real({\mathbb{R}}) or the complex({\mathbb{C}}) number fields. Let {X} be a vector space over {\mathbb{K}}. A subset of a vector space is called convex if for any two points in the subset, the line segment joining them lies completely in the subset. We make this idea more precise.

If {a,b\in X}, the line segment from {a} to {b} is given by the set

\displaystyle [a,b]=\{(1-t)a+tb :0\leq t\leq 1\}.

By an open line segmentwe shall mean that the end points should be excluded, that is

\displaystyle (a,b)=\{(1-t)a+tb :0<t<1\}.

We shall call a line segment proper if {a\neq b}. A subset {V\subset X} is called convexif for any {a,b\in V}, the line segment {[a,b]\subset V}. It is easy to see that intersection of arbitrary number of convex subsets is again convex. We characterize a convex set as –

Proposition 1.1 : {V} is convex if and only if whenever {x_1,...,x_n\in V}, and {t_1,...,t_n\in [0,1]} with {\sum_i t_i=1}, then {\sum_i t_ix_i\in V}.

Proof: If the latter condition holds, then it is easy to see that {V} is convex. For the converse part, we use induction on the number of points, {n}. For {n=1}, this is trivially true, while the case {n=2} follows from the definition. Moreover note that if {t_1=1}, the theorem holds. So we shall assume {t_1\neq 1}. Suppose the theorem holds for {k=n>2}. Then, if there are {n+1} points, say {x_1,...,x_{n+1}}, and {t_1,...,t_{n+1}}, with {t_1\neq 1} and {\sum_{i=1}^{n+1}t_i=1} then we can write

\displaystyle t_1x_1+\sum_{i=2}^{n+1}t_ix_i=t_1x_1+(1-t_1)\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i.

Now {\sum_{i=2}^{n+1}\frac{t_i}{1-t_1}x_i\in V} by the induction hypothesis, and by the convexity of {V}, we get that {\sum_{i=1}^{n+1}t_ix_i\in V}. Thus the proof is complete. \Box

If {V\subset X}, the convex hull of {V}, denoted by {\text{co}\;(V)}, is the intersection of all convex sets containing {V}. Clearly this definition is meaningful since {X} itself a convex set containing {V}, and that convexity is preserved under intersection. {\text{co}\;(A)} is also convex. Using proposition 1.1, we have the following characterization of the convex hull of {V}.

The convex hull of {V} is the set of all convex combinations of elements of {V}, that is

\displaystyle co(A)=\left\lbrace t_1x_1+...+t_nx_n: x_i\in V,\; 0\leq t_i\leq 1,\; \sum_{i=1}^nt_i=1,\; n\; \text{is arbitrary}\right\rbrace

As an example, the convex hull of three points {x_1,x_2,x_3\in\mathbb{R}^2} is the closed triangular region formed by those points.

Definition 1. Let {V} be a convex subset of a vector space {X}. A point {a\in V} is called an extreme point of {V}, if there does not exist any proper open line segment containing {a} and which lies completely in {V}. We denote the set of extreme points of {V} by {\text{ext}\;(V)}.

Convex sets may or may not have extreme points. For example the set {\{(x,y)\in\mathbb{R}^2:x^2+y^2<1\}}, does not have any extreme point, whereas the extreme points of the set {\{(x,y)\in\mathbb{R}^2:x^2+y^2\leq 1\}} are all the boundary points. The extreme points of a closed square region in {\mathbb{R}^2} are its four corners.

We have following equivalent characterizations.

Proposition 1.3 : Let {V} is convex subset of a vector space {X}, and let {a\in V}. Then the following are equivalent –

  • (a) {a\in\text{ext}\;V }.
  • (b) If {x_1,x_2\in V} and {0<t<1} and {a=tx_1+(1-t)x_2}, then either {x_1\notin V} or {x_2\notin V} or {x_1=x_2=a}.
  • (c) If {x_1,...,x_n\in V} and {a\in\text{co}\;\{x_1,...,x_n\}}, then {a=x_k} for some {k}.
  • (d) {V-\{a\}} is a convex set.

Proof:The equivalence of {(a)} and {(b)} is straightforward from the definition of extreme point and convexity of {V}.

{(a)\Rightarrow (c)} : Let {a\in\text{co}\;\{x_1,...,x_n\}}. Then by proposition 1.2, we get that {a=t_1x_1+...+t_nx_n}, where {0\leq t_i\leq 1}, and {\sum t_i=1}. If {t_i=1} for some {i}, we get that {a=x_i}. So suppose that {0<t_i<1} for each {i}. Then we can write

\displaystyle a=t_1x_1+(1-t_1)\sum_{i=2}^{n}\frac{t_i}{1-t_1}x_i

This happens only if {(1-t_1)x_1=t_2x_2+...+t_nx_n} which means {a=x_1}.

{(c)\Rightarrow (a)} : Suppose {a\notin\text{ext}\;V}. Then there are points {x,y\in V} such that {a=tx+(1-t)y}, where {0<t<1}, and {x\neq y}. So {a\in\text{co}\;\{x,y\}}. Then {(c)} dictates that {a=x} or {a=y}. But then {x=y}.

{(a)\Rightarrow (d)} : We take two points {x,y\in V-\{a\}}, then {\{tx+(1-t)y:0\leq t\leq 1\}\subset V-\{a\}}, since {a\neq tx+(1-t)y} for any {t\in (0,1)}. Thus {V-\{a\}} is convex.

{(d)\Rightarrow (a)} : Suppose that {a} is not an extreme point. Then there exists {t\in (0,1)} and {x,y\in V} with {x\neq y}, and that {a=tx+(1-t)y}. But this means that {[x,y]} does not completely in {V-\{a\}}, which contradicts its convexity. \Box

We shall return to convexity once again after we have introduced the concept of a topological vector space.

Read the rest of this entry »

A Day @ISI-Kol

Interviews always make me nervous. And that nervousness starts precisely 30 minutes before my call. Dunno why (about the timing)! Anyway, the interview was for the NBHM MA/MSc Scholarship. Well I already get a handsome scholarship from the Govt Of India (INSPIRE) but I wanted just to get a feel, whether I am still grounded after having plenty of courses. The interview was scheduled to be at 4 pm on 18th November and the venue was ISI, Kolkata. I reached Kolkata in the morning and stayed at my friend’s room – thanks to him. I was also accompanied by my friend and classmate who was also called for the interview.

ISI Kolkata is a big place, and has numerous departments. From studies in social sciences to research advances in mathematics and theoretical statistics. No wonder why this institute has been conferred the title of institute of national importance. Coming back to point, we waited for almost the whole day, till our turn at around 5:30 pm. Till then we met students from Vivekananda University, and guys from IMA; and interacted with them about there courses etc. We took lunch at the ISI Canteen, where the food rate is incredibly subsidized – 50 “paise” for 5 chapatis and curry. We waited, waited and waited. Not to mention I had cold, and a bad headache.

Finally I was called, and already my nervousness has reached a saturated point. But it declined appreciably once I entered the lecture room. There were four professors, and one among them was Prof SC Bagchi, whom I had already acquaintance with; the rest looked familiar but I didn’t know the names. We exchanged smiles, and I eased a lot. The interview started with the obvious question – “what courses have you studied?”. Needless to say I would have easily spent 10 minutes just recounting the name of the courses. I named a few, and mentioned along that there were many in which I wasn’t comfortable with. Maintaining truthfulness is necessary to avoid questions from an area which one doesn’t know well. The first question was asked by Prof Bagchi, who asked me to prove that the closed interval [0,1] is a connected set in the real line. I said that the question boils down to proving that the only subsets of the real line which are both open and closed are the empty set and the whole real line. He asked me to prove this. Now this requires a little thinking. And I’ll leave it to you to figure it out or take the help of a book [Topology-Munkres]. I did succeed only after considerable hints from him.

The next question was an easy one – to give an example of a non-holomorphic function. I shot back saying f(z)=\overline{z};  coz it didn’t satisfy the Cauchy Riemann equations. Immediately the next question was when we can say that the function is analytic – which is again simple – that in addition to satisfying the Cauchy-Riemann equations, it should also have continuous partial derivatives. Then I was asked that given a point in a domain where a function is analytic (=holomorphic), what can I say about the radius of convergence. I said that we can get the radius of convergence by the formula R^{-1}=\limsup |a_n|^{1/n}. They clarified that what I can say readily by just looking at the domain, and hinted Cauchy’s theorem. I told that by the theorem, it will be the radius of the largest circle that lies completely in the domain. They asked me whether I can put this in topological terms. I thought a bit, and then it dawned. “Yeah, distance from the boundary”. This was exactly they wanted me to tell.  Another guy from the committee asked me whether I was familiar with the notion of analytic continuation. Yes was the reply, and then I explained the rudimentary idea of how one can analytically continue a holomorphic function which agrees with another holomorphic function on a common domain. I was then asked to write a function which is the analytic continuation of \sum_{n=0}^{\infty} z^n, which of course if \frac{1}{1-z} but didn’t hit me then.

Finally the linear algebra part. I was asked how can assign a meaning to e^A, where A is a m\times n matrix. I didn’t know so told them flatly. They then modified the question that suppose the matrix given is a symmetric matrix, and hinted some diagonalisation theorem. I got it immediately, so the point is that if A is a symmetric matrix, it is similar to a diagonal matrix via real orthogonal matrix. This means A=ODO^{-1}, where $\latex D$ is diagonal. Then the computation is easy and the expression e^A is also well-defined via e^A=O e^D O^{-1}, where e^D is nothing but the diagonal matrix obtained by taking the exponential of the diagonal entries of D. Then they asked – what if A is nilpotent. In that case the expansion contains only finite number of terms and hence e^A is readily defined. So finally what can I say about a general matrix, and hinted “semi-simple”. Though I had read semi-simple operators, but I couldn’t recall then. Actually the result is that on a finite-dimensional vector space over the complex field, semisimple is equivalent to diagonalizable, and every operator on that space may be written as a sum of a semisimple operator and nilpotent operator which commute. So using this fact we may define e^A for general operators.

With that I had spent almost half an hour, and according to me the interview was quite OK. After me, was my friend’s turn. Finally we had a nice chat with the professors, and got some good advice. Altogether it was a good experience.

Lecture 1. Notion of Derivative

We shall always assume that {X,Y} are Banach spaces over the complex field {\mathbb{C}}. The real number field is denoted by {\mathbb{R}}. We shall now explore differential calculus in Banach spaces. It will be instructive to try to understand the theorems in well known Banach spaces like {\mathbb{R}^{n}} (over {\mathbb{R}}) or {\ell^{2}} (over {\mathbb{C}}) or {C[a,b]} (over {\mathbb{R}}). In general there are two notions of derivatives – the Gateaux derivative and the Frechet derivative, where the latter is stronger in the sense that Frechet differentiability implies Gateaux differentiability.

Definition: Let {f:X \rightarrow Y}, where {X,Y} are Banach space. For {a,h\in X}, define

\displaystyle \delta f(a,h)=\lim_{t\rightarrow 0}\frac{f(a+th)-f(a)}{t},\quad t\in \mathbb{R}

provided the limit exists. Then {\delta f(a,h)} is called the directional derivative of {f} at {a} in the direction of {h}. If this limit exists for every {h\in X} and if the function {h\mapsto \delta f(a,h)} is linear and continuous in {h}, then we say that the function is Gateaux differentiable and the function defined by

\displaystyle Df(a)(h)=\delta f(a,h)

is called the Gateaux derivative at {a}. Thus the linearity and continuity condition means {Df(a)\in L(X,Y).}

[An example to show that the map {h\mapsto \delta f(a,h)} is not always linear, and even if {Df(a)} exists, it may not be continuous at {a}.]

Definition: Let {f:X \rightarrow Y}, where {X,Y} are Banach space. Let {U\subset X} be open subset. We say that {f} is Frechet differentiable at {a\in U}, if there exists a bounded linear map {A\in L(X,Y)} such that

\displaystyle \lim_{h\rightarrow 0}\frac{\left\|f(a+h)-f(a)-A(h)\right\|}{\left\|h\right\|}=0

Some remarks should be in order, which we leave as simple exercises. The Frechet derivative is unique, and Frechet differentiability implies continuity. If {f'(a)} exists, then also {Df(a)} exists, and we have

\displaystyle Df(a,h)=f'(a)h.

Usually when we say {f} is differentiable, we shall mean Frechet differentiable. [An example for which the Gateaux derivative exists but Frechet derivative fails to exist.]

Our next step is to prove the mean value theorem.

Theorem: Let {f:X\rightarrow Y}, and suppose {\delta f(a+t(b-a),b-a)} exists for all {t\in[0,1]}, and is a continuous map of {t}, then

\displaystyle f(b)-f(a)=\int_{0}^{1}\delta f(a+t(b-a),b-a) dt.

Proof: Let {\phi\in Y^{*}}. Define {g(t)=\phi(f(a+t(b-a),b-a)):[0,1]\rightarrow\mathbb{R}} is differentiable, since it is the composition of two differentiable function. We have {g'(t)=\phi(\delta f(a+t(b-a),b-a))}. Using the fundamental theorem of calculus for real valued functions, we get

g(1)-g(0) = \int_{0}^{1}g'(t) dt =\int_{0}^{1}\phi(\delta f(a+t(b-a),b-a)) dt

= \phi\left(\int_{0}^{1}\delta f(a+t(b-a),b-a) dt\right)

Also, {g(1)-g(0)=\phi(f(b)-f(a))}, so altogether we get,

\displaystyle \phi\left(f(b)-f(a)-\int_{0}^{1}\delta f(a+t(b-a),b-a) dt\right)=0.

Since this holds for all {\phi\in Y^{*}}, we get the desired result.

Notice further that

\displaystyle \|f(b)-f(a)\|\leq \int_{0}^{1}\|\delta f(a+t(b-a),b-a)\| dt\leq \sup_{t\in[0,1]}\|\delta f(a+t(b-a),b-a)\|.

This is referred to as the mean value theorem in higher dimensions. The usual mean value theorem does not hold in higher dimensions as is seen by this example. Define {f(t)=(\cos(t),\sin(t))} for {t\in[0,2\pi]}. Then {f} is differentiable, and {f(2\pi)-f(0)=0}, but there is no such {t\in(0,1)} for which {f'(t)=(-\sin(t),\cos(t))} is zero.

Our next step is to generalize the chain rule. However we shall not prove the rule using strong hypotheses that both functions in a composition are Frechet differentiable. Instead we have –

Theorem: Let {a\in X, g(a)=b\in Y}. Suppose {f} is Frechet differentiable at {b} and {g} is having directional derivative in the direction {h} at {a}, then {f\circ g} has a directional derivative at {a}, and

\displaystyle \delta(f\circ g)(a,h)=f'(b)\delta g(a,h).

Before plunging into a proof, let us have an informal discussion. On the right hand side of the above equation, we have a linear transformation {f'(b):Y\rightarrow Z} acting on an element of Y – {\delta g(a,b)}. Let us see what we have got. {g} has directional derivative at {a} in the direction of {h}. This translates into

\displaystyle \lim_{t\rightarrow 0}\frac{g(a+th)-g(a)}{t}=\delta g(a,h),

that is given {\epsilon>0} there exists {\delta>0} such that if {\left|t\right|<\eta} we have

\displaystyle \|g(a+th)-g(a)-t\delta g(a,h)\|<\epsilon\left| t\right|.

{f} is Frechet differentiable at {b=g(a)}, so

\displaystyle \|f(b+k)-f(b)-Ak\|\leq\theta\|k\|,

where the right hand side goes to zero as {\|k\|\rightarrow 0}. We may take

\displaystyle k=g(a+th)-g(a)=t\delta g(a,h)+\tilde{\theta}\left|t\right|,

which certainly goes to zero at {t\rightarrow 0}. If we plug this in the equation of {f}, we’ll get

\displaystyle f(b+k)-f(b)=f'(b)k+\theta \|k\|=t f'(h)\delta g(a,h)+\tilde{\theta}\|t\|f'(b)+\theta \|k\|.

To complete the proof we merely have to show that the last two terms go to zero. This we shall prove formally in the next post.

Non-linear Analysis. Preliminary

Hello everyone. I’ll be posting a series of lecture notes of a course that I am credited for this semester. The course is named Non-Linear Analysis, and is being taught by Prof PC Das. The structure for this course is as follows –

Calculus in Banach spaces, inverse and multiplicit function theorems, fixed point theorems of Brouwer, Schauder and Tychonoff, fixed point theorems for non-expansive and set-valued maps, pre-degree results, compact vector fields, homotopy, homotopy extension, invariance theorems and applications.

In this preliminary post, I’ll outline some prerequisites that’ll be needed. Prerequisites include a basic course in functional analysis, where one learns the four important theorems – Hahn-Banach theorem, open mapping theorem, closed graph theorem and uniform boundedness principle; some Hilbert space theory, and some results of compact operators. Though, I’ll state the theorem, when I’ll use somewhere. One should of course be familiar with real analysis. Multi-variable calculus is not necessary but one will definitely benefit understanding the theorems in more general context if one has read. Also, it is assumed that the reader knows basic concepts (like, completeness, compactness) in point set topology.

The story starts from calculus in {\mathbb{R}}, that is differentiation and integration of functions {f:\mathbb{R}\rightarrow\mathbb{R}}, which is familiar to everyone. For scalar valued functions {f:\mathbb{R}^{n}\rightarrow\mathbb{R}}, we may define the derivative as the gradient vector; whereas for a vector valued function {f:\mathbb{R}\rightarrow\mathbb{R}^{n}}, we may define the derivative as the vector {(\frac{df_{1}}{dt},...,\frac{df_{n}}{dt})}, where {f_{1},...,f_{n}} are the components of {f}. However it is not immediately clear how to define the derivative for a function {f:\mathbb{R}^{m}\rightarrow\mathbb{R}^{n}}, where {m>1,n>1}. It turns out that the derivative of such functions is actually a linear transformation from {\mathbb{R}^{m}} to {\mathbb{R}^{n}}, and this fact generalises all the previous cases. After one digests this fact, it is then routine to prove the extensions of the results in one variable to multivariable case. Like, we then prove the chain rule (which is now composition of two linear transformations), inverse and implicit function theorems(which are a lot complicated), and Taylor series(no one bothers to actually try hands on). A good book to refer for these stuffs is [Rudin,PoMA].

It is then obvious to get a thought to extend calculus to more general topological vector spaces. The first thing we’ll handle is, how to generalize the calculus from {R^{n}} to arbitrary Banach spaces. A Banach space {X} is a vector space equipped with a norm, which makes {X} into a complete metric space. Completeness is necessary if we wish to handle limit operations on the space. The most important thing that fails to hold for Banach spaces is that now compact sets are no longer characterised by closed and bounded sets, which tends to make the proofs complicated. As we shall see that the derivative is indeed a linear transformation which now comes with a fancy name – Frechet derivative. Again, we’ll routinely complete the proofs of the basic theorems listed above.

If one has followed the proof of inverse function theorem [Rudin, PoMA, Pg-220], one notices that a fixed point theorem is used. In general, a fixed point theorem asserts the existence and uniqueness of {x_{0}\in X} such that {f(x_{0})=x_{0}}. These theorems come with different conditions which can be applied to different situations. These theorems are very important in many areas of analysis to the fact that many books have entirely been devoted to such results. We shall also investigate few such results.

I would like any comments, suggestions for improvement or any question that pops. Also please point out any typographical\factual errors.

Summer @TIFR

The monsoon is on its full flow, and it’s raining heavily outside, with the aroma of the wet soil incensing the atmosphere romantically. And I am here typing this blog post.

The VSRP program has ended and I have returned home. It happened to follow the KISS principle – Keep It Short, Silly; though I wasn’t very happy with it. The time when I had started enjoying the program, it came to an abrupt end. It was good though with dramatic twists in the starting and I enjoyed it thoroughly, learnt new things and made a lot of new friends. Let me share my experience with you, after I read a similar kind by Vipul Naik.

VSRP stands for Visiting Students Research Program, an initiative taken by Tata Institute of Fundamental Research (TIFR), Mumbai – India’s foremost research institute – to help expose students to some advanced topics. More detailed info can be found here. This year VSRP for mathematics was from 13 June to 8 July.

To those who don’t know me, I am majoring in mathematics, and I have completed three years in my 5 year integrated MSc program. The VSRP program started on 13th and after regular academic chores, we were assigned our respective guides by Prof. S. Subramanian who was the coordinator for Mathematics this year. Ankit Rai (ISI Bangalore) and I were placed under Prof. R.V. Gurjar who specialises in complex algebraic surfaces. We shared the common interest to study Riemann surfaces. He gave a short lecture on what Riemann surfaces are and advised us to follow the book of Otto Forster. We had to read the first few sections of the book. Thus my project kicked off.

TIFR has a beautiful campus. It is not that big compared to most academic institutes, but it is enough for not getting bored. It faces the sea on one side, and is an awesome place for a stroll in the evening with the waves clattering, with the wind blowing through hairs, and music flowing in the ears. The food is good, with three canteens to suit varied tastes.

The first week went good and I learnt some basic stuffs about Riemann surfaces. Let me give a layman idea. In an elementary study one notices that the elementary functions like square-root and logarithm are multi-valued, and therefore technically are not functions. But for doing analysis one needs the function to be single-valued, and hence one usually restricts the range so that the function is single-valued, and then we talk about analyticity and else. However, multi-valuedness is a basic nature of the complex functions, and therefore theory should incorporate this feature, instead of circumventing it. Riemann introduced the concept of Riemann surfaces as the proper setting to study such functions. A Riemann surface is a pair (X,\Sigma), where X is a connected two-dimensional manifold and \Sigma is a complex structure on X. Locally a Riemann surface is simply an open set in the complex plane. Simple examples include the complex plane \mathbb{C} and domains of a Riemann surface X. Two non-trivial examples are the Riemann sphere \mathbb{P}^{1} and the torus S^{1} \times S^{1}, when equipped with a suitable complex structure. Maybe later, I’ll try to write a more detailed exposition.

Weekdays were spent on reading the book, and the weekends on trips and hanging out with friends. TIFR had also organised a trip to Giant Metrewave Radio Telescope (GMRT), Pune. It was fun getting acquainted with some basic Radio Astronomy. Besides, the views of the landscape during the trip was spectacular, and given that I am interested in photography, it was exciting experience to take photographs. The album can be seen here.

In contrast, this time there was no presentation kind of thing – which was both a relief as well as a disappointment. Relief, coz, I could read more freely, and a disappointment, coz, I wanted to hone my speaking skills. However, I did submit my one page report!

Altogether, it was a welcome experience. Making new friends, and learning about their curriculum and college life. Goodbye to them. Yeah! That’s life!!