Today I wish to discuss a simple question in probability, which goes by the name of the birthday problem or the birthday paradox. It is hardly a paradox; it earns that name only because the answer does not agree with naive intuition. But such is the nature of things in mathematics.
Suppose you are at someone’s birthday party; the cake has been cut and the party is in full swing. The mood is there, but you are alone. Thinking. And wishing, perhaps, that it was your birthday party, with you being the center of attention. Suddenly the following question strikes you.
Assuming the room is closed, that there are people in the room, and that it is a usual year with 365 days, what is the probability that at least two people share a birthday (and hence, perhaps, could throw a common and grander party)?
Of course, if there are 366 people in the room (hypothetically), then there is always a pair sharing the same birthday. This follows from the same principle as the observation that if there are two boxes into which one has to put three letters, then one box must contain at least two letters. Can you see how it works in the birthday case?
But what if there are fewer people? In fact you can ask: what is the smallest number of people in the room for which there is at least a 50% chance that two people share the same birthday?
The answer is 23. (Does this surprise you?)
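The claim is easy to check numerically. The sketch below (plain Python) multiplies out the probability that all birthdays are distinct and finds the smallest number of people at which a shared birthday becomes more likely than not.

```python
# Probability that, among n people, at least two share a birthday
# (assuming 365 equally likely birthdays and independence).
def collision_probability(n: int) -> float:
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (365 - i) / 365
    return 1 - p_distinct

# Smallest n with at least a 50% chance of a shared birthday.
n = 1
while collision_probability(n) < 0.5:
    n += 1
print(n, round(collision_probability(n), 4))  # 23, about 0.5073
```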
In the previous post, we introduced some objects in operator algebra. Those will occur in the examples given in this post. The purpose of this post is to define matrix convex sets.
We begin by defining what we mean by a matrix convex set. There are two definitions, and we shall show that both are equivalent. Then we shall present some examples of matrix convex sets.
Definition 1. A matrix convex set in a vector space is a collection , where for each is a non-empty set such that
whenever , and satisfying .
The above definition seems like a natural generalisation of convexity. But the next one is usually easier to use in specific cases.
Definition 2. A matrix convex set in a vector space is a collection , where for each is a non-empty set such that
- For any , with , we have .
- For any , .
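In symbols, the two conditions can be stated as follows; I write the collection as $K = (K_n)$ with $K_n \subseteq M_n(V)$, following the notation of Webster and Winkler (the choice of symbols here is my own assumption).

```latex
% Sketch of the two conditions of definition 2:
% closure under direct sums and under isometric compressions.
\[
  v \in K_m,\; w \in K_n
  \;\Longrightarrow\;
  v \oplus w = \begin{pmatrix} v & 0 \\ 0 & w \end{pmatrix} \in K_{m+n},
\]
\[
  v \in K_m,\; \gamma \in M_{m,n} \text{ with } \gamma^* \gamma = I_n
  \;\Longrightarrow\;
  \gamma^* v \gamma \in K_n.
\]
```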
We now show the equivalence of both definitions.
Definition 1 ⇒ Definition 2. The first part of definition 2 follows trivially (using ). To show the second part, let and ; then we have to show that
and clearly, . Thus (2) is also satisfied.
Definition 2 ⇒ Definition 1. Let , for satisfying . Then by extension of (2) in definition 2, we have
Now let . Then and by definition 1 we have which implies .
An important consequence is that each is a convex subset of . Let be a matrix convex set in . Let . We need to show that for . Set , and . Then
Then , which is nothing but .
We present some examples of matrix convex sets. We start with the simplest vector space.
Example 2. Let . On the vector space , consider the collection
where . This collection defines a matrix convex set in . We shall now verify that this indeed satisfies the conditions mentioned above (using definition 2).
Let , with . Let , then . This implies
which means . Thus . Finally let and . Then we show that
But, for , we have
And similarly we can show the other parts. This completes our verification. Conversely, we can show that any matrix convex set in , where is a closed convex subset of , is a closed matrix interval.
Theorem 1. Suppose that is a matrix convex set in , where is a bounded closed subset of . Then must be a closed matrix interval.
Proof: Since is bounded, closed, and convex, it must be a closed interval in , say . Let . Then we have to show that . Let us first show that . Let ; then
But from property 1, we see that , and thus the last statement is true; hence . Similarly, we can show . Thus .
Conversely, let , then is self-adjoint and hence we can write
where . Then by property 2, we see that , and by property 1, , that is .
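As a sanity check of the compression condition at the lowest matrix levels, the following sketch (plain Python, with a hypothetical 2×2 matrix and unit vector of my own choosing) verifies that compressing a matrix in the interval [0, 1], i.e. a positive contraction, by an isometry lands back in [0, 1].

```python
import math

# For a 2x2 real symmetric matrix v with 0 <= v <= I (eigenvalues in [0,1]),
# a compression by a unit column vector g gives the scalar g^T v g,
# which must again lie in [0, 1].  (Hypothetical numbers, just a sanity check.)
v = [[0.6, 0.2],
     [0.2, 0.3]]

# Eigenvalues of a 2x2 symmetric matrix, in closed form.
tr = v[0][0] + v[1][1]
det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
disc = math.sqrt(tr * tr / 4 - det)
eigs = (tr / 2 - disc, tr / 2 + disc)
assert 0 <= eigs[0] and eigs[1] <= 1          # v lies in the matrix interval

# A unit vector g, i.e. an isometry in M_{2,1} (g^* g = 1).
g = (3 / 5, 4 / 5)
compression = sum(g[i] * v[i][j] * g[j] for i in range(2) for j in range(2))
print(compression)  # lies in [0, 1]
```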
Example 3. Let be an operator space. Then the collection of unit balls, where
is a matrix convex set. It is almost trivial to verify the conditions of matrix convexity using the abstract definition of operator space (given in this post).
Example 4. Let be an operator system. The collection of positive cones, where
is a matrix convex set. For let with . Let . Then for we have
since . Similarly, if and , then it is fairly obvious that .
Example 5. Again let be an operator system. Consider the collection of matrix states where
We say is completely positive if the canonically amplified maps are positive for all . Then is a matrix convex set in . Here we have identified an as an element in . For verification, let and . Then we have to show that is completely positive and . By definition, . Then
To show complete positivity, let . Then
Thus is completely positive since is completely positive. The statement for is proved analogously. is considered the matricial version of the state space.
Examples 2 and 5 are the ones we shall focus on in the next sections.
This post is a continuation of the previous post. The aim of this post is to introduce some common objects that are encountered in operator algebra. All of these objects will be needed to understand the paper cited in the previous blog post. I am a bit terse here, and examples are almost absent.
All the vector spaces considered here are assumed to be complex. Let be a vector space. is the vector space of all matrices with entries from , under natural addition and scalar multiplication. In particular, if , we write as . If , then we denote by , and by . is the identity matrix in .
We shall be using some mathematical objects which we’ll describe here. The simplest to start with are the operator spaces. Let be any Hilbert space. An operator space is a closed linear subspace of . Using , we construct the matricial spaces . The natural inclusion
provides a norm on using the operator norm on . This description is called a concrete definition of an operator space. Ruan characterised operator spaces in the following abstract way.
We say is an operator space if is a normed linear space, with the norms satisfying
- , where and , and we define .
- , where .
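For reference, here is the standard form of Ruan’s axioms, which are presumably the two conditions intended above (the notation, with $x \in M_m(V)$ and scalar matrices $\alpha$, $\beta$, is my own assumption).

```latex
% Ruan's axioms for an abstract operator space: for x in M_m(V),
% y in M_n(V), and scalar matrices alpha in M_{n,m}, beta in M_{m,n},
\[
  \|x \oplus y\|_{m+n} = \max\{\, \|x\|_m,\; \|y\|_n \,\},
  \qquad
  \|\alpha\, x\, \beta\|_n \;\le\; \|\alpha\| \,\|x\|_m\, \|\beta\|.
\]
```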
Operator spaces are usually considered as noncommutative analogues of Banach spaces. A special example of operator space is an operator system. Concretely, an operator system is a closed subspace of , which is also closed under adjoint operation and contains the identity operator. Again we can form the different matrix levels . The important thing here is that the inclusion (1) defines an ordering on which is induced by the usual ordering on .
The commutative analogue of an operator system is a function system, which is a closed subspace of , where is compact, and the subspace is closed under conjugation and contains the constant function 1 ( for all ). Consider the following important example, which we shall refer to later. We recall that affine functions are those which preserve convex combinations.
Example 1. Let be a locally convex space, and let be a compact convex subset of . Define
Then is a function system. Clearly, . It is closed since the uniform limit of a sequence of continuous functions is again continuous, and the limit function is easily seen to be affine. Moreover, it is closed under conjugation, and the constant function ( for all ) is trivially contained in . The positive elements of are precisely those functions which map each into . In particular, 1 is a positive element.
Function systems are abstractly characterised as complete order unit spaces. An order unit space is a partially ordered real vector space having an element , called the order unit, satisfying
- (Order Unit property) For each , there is an such that .
- (Archimedean property) For each , if for all , then .
On an order unit space with order unit $e$, we define a norm as $\|x\| = \inf\{\, t > 0 : -te \le x \le te \,\}$.
Then becomes a normed linear space. It is easy to see that a function system satisfies all these properties.
Parallel to this characterisation, operator systems have been characterised by Choi and Effros in terms of matrix ordered spaces. Let be a -vector space. Then is also a -vector space via . We say is a matrix ordered space if there are proper cones for all , such that
for all . Then we have the following definition.
A matrix ordered space is an operator system if
- is proper, that is, if , then
- is Archimedean for all
- has an order unit.
Given an operator system , we define the state space of as
Let and be vector spaces and let be a linear map. Then the canonical amplification of is the sequence of maps $\varphi^{(n)} : M_n(V) \to M_n(W)$ defined by applying entrywise: $\varphi^{(n)}([x_{ij}]) = [\varphi(x_{ij})]$.
In the next post, I’ll start by defining matrix convex sets.
This is really the shortest proof I have ever seen of this theorem, and it’s really witty! I stumbled upon it while browsing the American Mathematical Monthly, to be specific on page 524 of the June-July 2008 issue. I reproduce the proof here.
Theorem. Let be a positive integer. If is not a perfect square, then is irrational.
Proof: Before embarking on the proof, recall that the standard proof uses the method of contradiction. Here we shall instead prove the contrapositive: if is rational, then is a perfect square.
Thus assume that is rational. This means we can write
where and are integers, , and the fraction is in lowest terms. This implies , which we can rearrange so as to write
Since we assumed that is in lowest terms, we see that and must be integral multiples of and respectively. Taking that common multiple to be , we may write and . The latter equation says
but since is nothing but , this implies that , which simply means that is a perfect square. This proves the contrapositive statement, and therefore the theorem follows.
Isn’t it beautiful?
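The statement can be illustrated numerically: since the proof shows that a rational square root of an integer is in fact an integer, the integers with rational square roots are exactly the perfect squares. A short check in Python (using `math.isqrt`):

```python
import math

# The theorem says sqrt(n) is rational exactly when n is a perfect square;
# the proof shows a rational sqrt(n) is in fact an integer.  So the n with
# rational square roots are exactly those with an integer square root:
def is_perfect_square(n: int) -> bool:
    return math.isqrt(n) ** 2 == n

rational_sqrts = [n for n in range(1, 26) if is_perfect_square(n)]
print(rational_sqrts)  # [1, 4, 9, 16, 25]
```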
1. The Krein-Milman theorem in Locally Convex Spaces
My project work this semester focuses on understanding the paper The Krein-Milman Theorem in Operator Convexity by Corran Webster and Soren Winkler, which appeared in the Transactions of the AMS [Vol. 351, No. 1, Jan 1999, pp. 307-322]. But before reading the paper, it is imperative to understand the (usual) Krein-Milman theorem, which is proved in the context of locally convex spaces. My understanding of this part follows the book A Course in Functional Analysis by J. B. Conway. To begin, we shall collect the preliminaries that we shall need to understand the Krein-Milman theorem.
Let denote the real () or the complex () number field. Let be a vector space over . A subset of a vector space is called convex if, for any two points in the subset, the line segment joining them lies completely in the subset. We make this idea more precise.
If , the line segment from to is given by the set
$[x, y] = \{\, (1-t)x + ty : 0 \le t \le 1 \,\}.$
By an open line segment we shall mean that the end points are excluded, that is,
$(x, y) = \{\, (1-t)x + ty : 0 < t < 1 \,\}.$
We shall call a line segment proper if . A subset is called convex if for any , the line segment . It is easy to see that the intersection of an arbitrary number of convex subsets is again convex. We characterize a convex set as follows.
Proposition 1.1 : is convex if and only if whenever , and with , then .
Proof: If the latter condition holds, then it is easy to see that is convex. For the converse, we use induction on the number of points . For , this is trivially true, while the case follows from the definition. Moreover, note that if , the proposition holds, so we shall assume . Suppose the proposition holds for points. Then, if there are points, say , and , with and , we can write
Now by the induction hypothesis, and by the convexity of , we get that . Thus the proof is complete.
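The induction step above can be checked exactly, say in the rational plane. A small sketch (Python’s `Fraction`, with hypothetical points and weights of my own choosing): a 3-point convex combination equals a 2-point combination of a 2-point combination with the third point.

```python
from fractions import Fraction as F

# The induction step of Proposition 1.1, checked exactly in Q^2:
#   l1*x1 + l2*x2 + l3*x3  ==  (l1 + l2)*y + l3*x3,
# where y = (l1/(l1+l2))*x1 + (l2/(l1+l2))*x2 is a 2-point combination.
x1, x2, x3 = (F(0), F(0)), (F(1), F(0)), (F(0), F(1))
l1, l2, l3 = F(1, 2), F(1, 3), F(1, 6)
assert l1 + l2 + l3 == 1

def comb(coeffs, points):
    """Return the linear combination sum(c * p) of points in the plane."""
    return tuple(sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(2))

direct = comb([l1, l2, l3], [x1, x2, x3])
y = comb([l1 / (l1 + l2), l2 / (l1 + l2)], [x1, x2])
nested = comb([l1 + l2, l3], [y, x3])
print(direct == nested)  # True
```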
If , the convex hull of , denoted by , is the intersection of all convex sets containing . This definition is meaningful, since the whole space is itself a convex set containing , and convexity is preserved under intersection. is also convex. Using proposition 1.1, we have the following characterization of the convex hull of .
The convex hull of is the set of all convex combinations of elements of , that is
As an example, the convex hull of three points is the closed triangular region formed by those points.
Definition 1. Let be a convex subset of a vector space . A point is called an extreme point of , if there does not exist any proper open line segment containing and which lies completely in . We denote the set of extreme points of by .
Convex sets may or may not have extreme points. For example, the set does not have any extreme point, whereas the extreme points of the set are all its boundary points. The extreme points of a closed square region in are its four corners.
We have following equivalent characterizations.
Proposition 1.3: Let be a convex subset of a vector space , and let . Then the following are equivalent:
- (a) .
- (b) If and and , then either or or .
- (c) If and , then for some .
- (d) is a convex set.
Proof: The equivalence of and is straightforward from the definition of an extreme point and the convexity of .
: Let . Then by proposition 1.2, we get that , where , and . If for some , we get that . So suppose that for each . Then we can write
This happens only if which means .
: Suppose . Then there are points such that , where , and . So . Then dictates that or . But then .
: We take two points , then , since for any . Thus is convex.
: Suppose that is not an extreme point. Then there exist and with , such that . But this means that does not lie completely in , which contradicts its convexity.
We shall return to convexity once again after we have introduced the concept of a topological vector space.
Interviews always make me nervous. And that nervousness starts precisely 30 minutes before my call. Dunno why (about the timing)! Anyway, the interview was for the NBHM MA/MSc Scholarship. Well, I already get a handsome scholarship from the Govt of India (INSPIRE), but I wanted just to get a feel for whether I am still grounded after having taken plenty of courses. The interview was scheduled for 4 pm on 18th November, and the venue was ISI, Kolkata. I reached Kolkata in the morning and stayed at my friend’s room – thanks to him. I was also accompanied by my friend and classmate, who was also called for the interview.
ISI Kolkata is a big place, and has numerous departments, from studies in the social sciences to research advances in mathematics and theoretical statistics. No wonder this institute has been conferred the title of institute of national importance. Coming back to the point, we waited for almost the whole day, till our turn at around 5:30 pm. Till then we met students from Vivekananda University and guys from IMA, and interacted with them about their courses etc. We took lunch at the ISI canteen, where the food is incredibly subsidized – 50 “paise” for 5 chapatis and curry. We waited, waited and waited. Not to mention I had a cold, and a bad headache.
Finally I was called, and by then my nervousness had reached its saturation point. But it declined appreciably once I entered the lecture room. There were four professors, one of whom was Prof SC Bagchi, with whom I was already acquainted; the rest looked familiar but I didn’t know their names. We exchanged smiles, and I eased up a lot. The interview started with the obvious question – “what courses have you studied?”. Needless to say, I could have easily spent 10 minutes just recounting the names of the courses. I named a few, and mentioned along the way that there were many I wasn’t comfortable with. Maintaining truthfulness is necessary to avoid questions from an area which one doesn’t know well. The first question was asked by Prof Bagchi, who asked me to prove that the closed interval [0,1] is a connected set in the real line. I said that the question boils down to proving that the only subsets of the real line which are both open and closed are the empty set and the whole real line. He asked me to prove this. Now this requires a little thinking, and I’ll leave it to you to figure it out or take the help of a book [Topology-Munkres]. I succeeded only after considerable hints from him.
The next question was an easy one – to give an example of a non-holomorphic function. I shot back with , since it doesn’t satisfy the Cauchy-Riemann equations. Immediately came the next question: when can we say that a function is analytic? Again simple – in addition to satisfying the Cauchy-Riemann equations, it should also have continuous partial derivatives. Then I was asked: given a point in a domain where a function is analytic (= holomorphic), what can I say about the radius of convergence? I said that we can get the radius of convergence from the formula . They clarified: what can I say readily by just looking at the domain, and hinted at Cauchy’s theorem. I told them that by the theorem, it will be the radius of the largest circle that lies completely in the domain. They asked whether I could put this in topological terms. I thought a bit, and then it dawned. “Yeah, the distance from the boundary.” This was exactly what they wanted me to tell them. Another member of the committee asked whether I was familiar with the notion of analytic continuation. Yes was the reply, and then I explained the rudimentary idea of how one can analytically continue a holomorphic function which agrees with another holomorphic function on a common domain. I was then asked to write a function which is the analytic continuation of , which of course is , but it didn’t hit me then.
Finally, the linear algebra part. I was asked how one can assign a meaning to , where is a matrix. I didn’t know, so I told them flatly. They then modified the question: suppose the given matrix is a symmetric matrix, and hinted at some diagonalisation theorem. I got it immediately; the point is that if is a real symmetric matrix, it is similar to a diagonal matrix via a real orthogonal matrix. This means , where is diagonal. Then the computation is easy and the expression is well-defined via , where is the diagonal matrix obtained by taking the exponential of the diagonal entries of . Then they asked: what if is nilpotent? In that case the expansion contains only a finite number of terms and hence is readily defined. So finally, what can I say about a general matrix? They hinted “semi-simple”. Though I had read about semisimple operators, I couldn’t recall the notion then. Actually, the result is that on a finite-dimensional vector space over the complex field, semisimple is equivalent to diagonalizable, and every operator on that space may be written as the sum of a semisimple operator and a nilpotent operator which commute. Using this fact we may define for general operators.
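The two easy cases of the matrix exponential discussed above can be sketched in a few lines of plain Python (with hypothetical 2×2 matrices of my own choosing, not the ones from the interview):

```python
import math

# exp(A) in the two easy cases, with 2x2 matrices as nested lists.

def mat_mul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Case 1: D diagonal -> exponentiate the diagonal entries.
d = [[1.0, 0.0], [0.0, 2.0]]
exp_d = [[math.exp(d[0][0]), 0.0], [0.0, math.exp(d[1][1])]]

# Case 2: N nilpotent (N^2 = 0) -> the series exp(N) = I + N terminates.
n = [[0.0, 1.0], [0.0, 0.0]]
assert mat_mul(n, n) == [[0.0, 0.0], [0.0, 0.0]]
exp_n = [[1.0, 1.0], [0.0, 1.0]]  # I + N
print(exp_d[1][1])  # e^2
```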
With that I had spent almost half an hour, and according to me the interview was quite OK. After me, was my friend’s turn. Finally we had a nice chat with the professors, and got some good advice. Altogether it was a good experience.
We shall always assume that are Banach spaces over the complex field . The real number field is denoted by . We shall now explore differential calculus in Banach spaces. It will be instructive to try to understand the theorems in well known Banach spaces like (over ) or (over ) or (over ). In general there are two notions of derivatives – the Gateaux derivative and the Frechet derivative, where the latter is stronger in the sense that Frechet differentiability implies Gateaux differentiability.
Definition: Let , where and are Banach spaces. For , define
provided the limit exists. Then is called the directional derivative of at in the direction of . If this limit exists for every and if the function is linear and continuous in , then we say that the function is Gateaux differentiable and the function defined by
is called the Gateaux derivative at . Thus the linearity and continuity condition means
[An example to show that the map is not always linear, and even if exists, it may not be continuous at .]
Definition: Let , where and are Banach spaces, and let be an open subset. We say that is Frechet differentiable at if there exists a bounded linear map such that
Some remarks are in order; we leave them as simple exercises. The Frechet derivative is unique, and Frechet differentiability implies continuity. If exists, then also exists, and we have
Usually when we say is differentiable, we shall mean Frechet differentiable. [An example for which the Gateaux derivative exists but Frechet derivative fails to exist.]
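As a concrete illustration of the directional derivative (an example of my own, not from the text): for $f(x) = \|x\|^2$ on $\mathbb{R}^2$, the limit works out to $2\langle a, h\rangle$, which a finite difference confirms.

```python
# Directional derivative of f(x) = x1^2 + x2^2 at a in the direction h:
#   Df(a; h) = lim_{t -> 0} (f(a + t*h) - f(a)) / t = 2 <a, h>.
def f(x):
    return x[0] ** 2 + x[1] ** 2

def directional(f, a, h, t=1e-7):
    """One-sided finite-difference approximation of the directional derivative."""
    fp = f((a[0] + t * h[0], a[1] + t * h[1]))
    return (fp - f(a)) / t

a, h = (1.0, 2.0), (3.0, -1.0)
exact = 2 * (a[0] * h[0] + a[1] * h[1])   # 2 <a, h>
numeric = directional(f, a, h)
print(exact, numeric)  # 2.0 and approximately 2.0
```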
Our next step is to prove the mean value theorem.
Theorem: Let , and suppose exists for all , and is a continuous map of , then
Proof: Let . Define . Then is differentiable, since it is the composition of two differentiable functions. We have . Using the fundamental theorem of calculus for real-valued functions, we get
Also, , so altogether we get,
Since this holds for all , we get the desired result.
Notice further that
This is referred to as the mean value theorem in higher dimensions. The usual mean value theorem does not hold in higher dimensions, as is seen by the following example. Define $f(t) = (\cos t, \sin t)$ for $t \in [0, 2\pi]$. Then $f$ is differentiable, and $f(2\pi) = f(0)$, but there is no $t$ for which $f'(t)$ is zero.
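The classic counterexample, presumably the one intended here, is $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$: the endpoints agree, yet $f'$ never vanishes. A quick numerical confirmation:

```python
import math

# f(t) = (cos t, sin t): f(2*pi) == f(0), yet f'(t) = (-sin t, cos t)
# has norm 1 for every t, so no c can satisfy f(b) - f(a) = f'(c)(b - a).
def fprime_norm(t):
    return math.hypot(-math.sin(t), math.cos(t))

endpoints_equal = (abs(math.cos(2 * math.pi) - 1) < 1e-12
                   and abs(math.sin(2 * math.pi)) < 1e-12)
never_zero = all(abs(fprime_norm(k * 0.1) - 1) < 1e-12 for k in range(63))
print(endpoints_equal, never_zero)  # True True
```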
Our next step is to generalize the chain rule. However, we shall not prove the rule under the strong hypothesis that both functions in the composition are Frechet differentiable. Instead we have -
Theorem: Let . Suppose is Frechet differentiable at and has a directional derivative in the direction at ; then has a directional derivative at , and
Before plunging into the proof, let us have an informal discussion. On the right-hand side of the above equation, we have a linear transformation acting on an element of Y – . Let us see what we have got. has a directional derivative at in the direction of . This translates into
that is, given , there exists such that if , we have
is Frechet differentiable at , so
where the right hand side goes to zero as . We may take
which certainly goes to zero as . If we plug this into the equation for , we’ll get
To complete the proof we merely have to show that the last two terms go to zero. This we shall prove formally in the next post.