Mathematics from the eyes of a High School Student! – Bidit Acharya

The Collatz Conjecture

Before we begin the discussion, I’d like to play a little game with you

  1. Pick any natural number.
  2. If your number is even, divide it by 2. If it is odd, multiply the number by 3 and add 1 to it. (i.e. for an even number n compute \frac n2 and for an odd number n, compute 3n+1. )
  3. You now have with you a new number. Repeat step 2. Do not stop until you end up with the number 1.

For no natural number n will this process continue indefinitely. You will get to 1 after some finite number of repetitions. This is the essence of the Collatz Conjecture.

Example I choose the number 21. Since it is odd, I multiply it by 3 to get 63 and add 1 to it to get 64. Since 64 is even, I divide it by 2 to get 32 which gives me 16 which gives me 8, 4, 2 and then finally 1. As desired!

Now, this was a pretty easy example, but you can try it out for yourself. Let me warn you that for some numbers, it takes a ridiculously long time to get to 1. For example, for the number 521, it takes 124 steps to get to 1!
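The game above can be sketched in a few lines of Python (the function name collatz_steps is mine, not standard):

```python
def collatz_steps(n: int) -> int:
    """Count the iterations of the rule above needed to reach 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(21))  # 7 -- the example worked out above
```

Running it on 27 already takes 111 iterations, so even small starting values can wander surprisingly far before settling down.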

Significance of the Conjecture

The Collatz Conjecture was first proposed by German mathematician Lothar Collatz in 1937. The conjecture has many different names, but the most amusing of them is HOTPO, which stands for Half Or Triple Plus One. At first glance, the conjecture may seem rather unimportant and inapplicable. Despite its frivolous appearance, though, it seems to be pretty important to number theorists.

According to Greg Muller‘s answer here on Math.SE, “…the Collatz conjecture seems to say that there is some sort of abstract quantity like ‘energy’ which cannot be arbitrarily increased by adding 1. That is, no matter where you start, and no matter where this weird prime-shuffling action of adding 1 takes you, eventually the act of pulling out 2s takes enough energy out of the system that you reach 1. I think it is for reasons like this that mathematicians suspect that a solution of the Collatz conjecture will open new horizons and develop new and important techniques in number theory.”

My take on this conjecture

The Collatz conjecture is probably one of the first conjectures I encountered. Since it was pretty easy to understand I thought that maybe I could prove it without any help (at that point, I did not know what a conjecture was!) I tried for a few weeks and then finally decided to look it up on the internet for a hint. It was then that I found out that it was yet to be solved!

This is how I tried to tackle this problem back then

Any power of 2 will certainly get to 1. In fact, this is the only way to get to 1. The sequence obtained from any number (called the hailstone sequence) must always “enter” the series 1,2,4,8,16,32,64,\ldots (the powers of 2) if it is to eventually reach 1. What's more, the hailstone sequence can only enter the series of powers of 2 at an even power of 2 (like 16 or 64). This is pretty easy to prove.

Consider a number x that is not a power of 2 but will be one on the next iteration (like 5, since 5 \times 3 +1=16=2^4 ). Note that x has to be odd, because if x were even, then x/2 being a power of 2 would make x itself a power of 2, so it would already have entered the series of powers of 2.

So, since x is odd, the next number in line is 3x+1 , and this should be a power of 2. In other words, \exists n,x \in \mathbb N such that,

3x+1=2^n \iff x= \frac{2^n-1}{3}\;\;\;\;\;\;\;\;(1)

What we need to prove is that n is even.

Lemma \text{ For }n\in \mathbb N,\; \exists x\in\mathbb N\,(3x+1=2^n)\iff 2\mid n

Proof  The fact that x\in \mathbb N implies 3\mid(2^n-1) . This follows from (1). We now find the solution set of the equation 2^n-1\equiv 0\; (\bmod \;3) , or 2^n \equiv 1\; (\bmod \;3) .

We begin by noting 2\equiv -1\; (\bmod \;3) \implies 2^n\equiv (-1)^n\; (\bmod \;3). Since (-1)^n=1\iff 2\mid n, it follows that 2^n\equiv 1\; (\bmod \; 3)\iff 2\mid n. This means that the quantity 2^n-1 is divisible by 3 iff n is even. It follows from this that if a quantity 3x+1 equals some power 2^n of 2, then n has to be even. As desired.
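The lemma is also easy to check numerically; here is a small Python sanity check (my own, not part of the original argument) confirming that 3\mid 2^n-1 exactly when n is even:

```python
# Check the lemma for n = 1, ..., 20: 3 divides 2^n - 1 precisely when n is even.
for n in range(1, 21):
    assert ((2**n - 1) % 3 == 0) == (n % 2 == 0)
print("lemma verified for n up to 20")
```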

Obviously, not every odd number will lead to a power of 2 on the next step. For instance, 7\times 3+1=22 is not a power of 2, but 21\times 3+1=64 is. Is there some pattern to it? As it turns out, there is, and it is pretty easy to spot too. For this we go back to (1) :

x= \frac{2^n-1}{3}

The odd number x has to satisfy this relation. So to find new x ’s, all we need to do is plug even numbers n into the equation. We see that the series of x ’s obtained follows a certain pattern.

To make things easier, define a function f\colon n\mapsto\frac{2^n-1}{3}. The domain of the function is the set of even numbers.

For an even number n, if we look at the difference f(n+2)-f(n), we see that it is exactly 2^n. So the odd number x that leads to 2^n on the next iteration and the odd number that leads to 2^{n+2} differ by 2^n.

Example The number 5 leads to 16 ( 5\times 3+1 ), which is 2^4 , and the number 21 leads to 64 , which is 2^6 . The difference between 5 and 21 is 16 , which is 2^4 .
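The pattern is easy to check in Python as well (the helper f mirrors the function defined above):

```python
def f(n: int) -> int:
    """f(n) = (2^n - 1) / 3, defined for even n (an exact integer there)."""
    assert n % 2 == 0
    return (2**n - 1) // 3

print([f(n) for n in (2, 4, 6, 8, 10)])  # [1, 5, 21, 85, 341]
for n in (2, 4, 6, 8):
    assert f(n + 2) - f(n) == 2**n  # the observed difference pattern
```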

While these things hardly have anything to do with proving the conjecture itself, these are the observations I made on the hailstone sequence. I thought I could prove it back then, but these observations are as far as I got.

I actually had many other things to post about, but none of them involved my own observations. So I decided to flip through the pages of my scrapbook and post this here.

Please do comment and share this post. Any constructive feedback is welcome!

Greetings Fellow Earthlings!

(I initially wrote “Hello Fellow Earthlings!”, but since “Hello” and “Fellow” rhymed, it sounded too cheesy and unnatural)

So anyways, I recently got my own domain at biditacharya.com. I am working on building the website right now.

I had no idea that it’d take so much time to do this. I’ve been trying to set the website up for almost 2 months now. But with all the college applications and SATs and TOEFLs going on, I can hardly make things work out for me. It’s been ages since I’ve been able to do the things that I actually enjoy doing: like learn math or watch movies. For the past few months, it’s been like “register for the tests, study for the SATs, fill up the Common App, send scores to colleges…blah blah blah.” The only interesting thing about this process is writing the essays. They’ve really helped me understand myself better.

Anyways, all rantings aside, I hope I’ll be able to build a better and a more updated blog as soon as possible.

Cheers!

Bidit

This post will be the first in a series of posts regarding Countability and Cardinality of Infinite Sets. I do not intend to make this post too mathematically formal.

Toward the end of the 19th century, mathematicians struggled with the concept of infinity. They had trouble figuring out its exact nature. Was it just another number, only extremely large, or was it something else entirely? As we will find out later on, it is more of the latter than the former.

The traditional way that mathematics gets done is by one mathematician modifying and expanding on the work of others who came before. This model, however, does not seem to apply to Georg Cantor (1845-1918), at least with regard to his work on the theory of infinite sets. One wouldn’t be exaggerating to say that Cantor single-handedly developed the foundations of this theory. The neat thing about Cantor’s work is that it is very simple and easy to understand. On studying this theory, one begins to appreciate the fact that perspective makes all the difference when it comes to solving a problem. If one finds the right way (whatever that means) to look at a problem, one is half-way there. Cantor’s perspective was (and still is) beautiful, and very intuitive. What once eluded even the best minds is now within the reach of any layman. This feat is worthy of appreciation.

We begin the study of Cantor’s work, with the notion of something called Cardinality of Sets.

Cardinality

Very simply speaking, the cardinality of a set is a measure of the size of the set. For a finite set (a set having only a finite number of elements), it is pretty straight-forward. The cardinality of the set \{ 1,2,3,4 \} is 4. All we need to do is attach a natural number to each set. Although this seems to be a clumsy way of looking at it, the notion of attachment or correspondence is what makes our job a lot easier when dealing with cardinalities of infinite sets.

How do we define the Cardinalities of Infinite Sets, then? Well, as you may have noticed, this is the title of the post, and this is what we will be focusing on in this series of posts.

What Cantor had in mind was to see if he could extend the notion of order (the idea of something being bigger or smaller than something else) from the cardinality of finite sets to the cardinality of infinite sets. This means that he wanted to see if some infinities were bigger than others. This is a valid concern too. Normally, it is very unclear whether the cardinality of \mathbb R is the same as the cardinality of the open interval (0,1). To a normal person, it seems that the open interval (0,1) is smaller than \mathbb R. We’ll see if this is true.

The Larger Set: Defining Order in Cardinality

The set of planets in the solar system (its cardinality being 8) is definitely larger than the set of basketball players in a starting line-up of a team (its cardinality being 5). As far as finite sets are concerned, the order of the natural numbers is sufficient to define order in cardinality. Since 8 > 5, the set of planets in the solar system is larger than the set of basketball players in a starting line-up. But this method doesn’t seem to work for infinite sets. So, we need to change a few things in our approach so that it works for infinite sets and, along with that, still holds true for finite sets (like a universal law, or something of that sort).

As I mentioned earlier, the notion of correspondence seems to be the key. When we say that the set of planets in the solar system is larger than the set of basketball players in a starting line-up, what we do is try to put the sets into some kind of 1 to 1 correspondence with each other. From now on, it should be noted that the set of planets is larger than the set of players because, if the planets were removed from their orbits and replaced by the players, there would still be 3 empty spots in the solar system, and not just because 8 > 5. In other words, when the players were “assigned” to the planets, some of the planets had no partners. But if we consider the set of Ivy League colleges (there are eight of them) instead of the basketball players, then the correspondence between the Ivy League colleges and the planets would have no unpaired member (or, there would be no “empty spots”). This correspondence is said to be 1-1 and onto (or a bijection).

The correspondence could be something like:

\begin{matrix}\small\text{Mercury} &\small\text{Venus} &\small\text{Earth} &\small\text{Mars} &\small\text{Jupiter} &\small\text{Saturn} &\small\text{Uranus} &\small\text{Neptune}\\ \updownarrow &\updownarrow &\updownarrow &\updownarrow &\updownarrow &\updownarrow &\updownarrow &\updownarrow\\ \small\text{Harvard} &\small\text{Princeton} &\small\text{Yale} &\small\text{Cornell} &\small\text{Dartmouth} &\small\text{UPenn} &\small\text{Columbia} &\small\text{Brown}\end{matrix}

As you can see, this is a perfect 1-1 (and onto) correspondence, and thus we say that these two sets have the same cardinality. So, we have a formal definition:

Definition Two sets A and B have the same cardinality if there exists f: A \to B that is 1-1 and onto. In this case, we write A\sim B

Extending the Definition to Infinite Sets

Now that we have defined equivalence in cardinality of sets, we will see how this method works for infinite sets too.

Consider the sets \mathbb N (the set of natural numbers) and E= \{2,4,6,\ldots\} (the set of even numbers). Which one should be the “larger” set?

At first glance (and many glances after that, if you have no idea about what we are discussing!), it seems that the cardinality of \mathbb N is greater than that of E , and that seems logical too. After all, E \subset \mathbb N .

However, if we really think about it, we will come to realize that they have the same cardinality. Strange?

For us to prove that \mathbb N\sim E , all we need to do is define a suitable correspondence. A well-defined correspondence is often called a function. Do note that the function we desire should be a bijection (1 to 1 and onto).

Before moving on any further, I’d encourage the reader to try and define, by himself/herself, such a function (mapping) f:\mathbb N\to E  that is one to one and onto.

After some thinking, the reader should have figured out that the function f:\mathbb N\to E defined by f(n)=2n works.

\begin{matrix}\small{\mathbb{N}:} & \; &\small 1 &\small 2 &\small 3 &\small{\ldots} &\small n &\small{\ldots}\\ \; &\; &\updownarrow &\updownarrow &\updownarrow &\ldots &\updownarrow &\ldots\\ \small{E:} & &\small 2 &\small 4 &\small 6 &\small{\ldots} &\small{2n} &\small{\ldots}\end{matrix}

From the definition of equivalence, it thus follows that \mathbb N \sim E. This is something that is very hard to digest, especially for those who are new to this field of study. But I read something on the internet that helped me when I was just learning all this:

It is certainly true that E is a proper subset of \mathbb N , and for this reason, it may seem logical to say that E is a “smaller” set than \mathbb N . This is one way to look at it, but it represents a point of view that is heavily biased from an overexposure to finite sets.

The definition is quite specific, and according to it, E and \mathbb N are equivalent (they have the same cardinality).

Note: Some good exercises would be to try to show that \mathbb N \sim \mathbb Z , or to show that \mathbb N \sim O (where O is the set of odd numbers).
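In case you get stuck on the first exercise, here is one possible mapping g:\mathbb N\to\mathbb Z sketched in Python (a hint only; it is certainly not the unique choice): odd naturals go to 0,-1,-2,\ldots and even naturals go to 1,2,3,\ldots

```python
def g(n: int) -> int:
    """Map 1, 2, 3, 4, 5, ... to 0, 1, -1, 2, -2, ...: a candidate bijection N -> Z."""
    return n // 2 if n % 2 == 0 else -(n - 1) // 2

print([g(n) for n in range(1, 10)])  # [0, 1, -1, 2, -2, 3, -3, 4, -4]
# no repeats among the first values, as a one-to-one map requires
assert len({g(n) for n in range(1, 100)}) == 99
```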

While this is all neat and pretty straight-forward, one often wonders: what good does it do us? What use is showing the equivalence between the cardinalities of two sets? What do we get out of knowing whether two sets have the same “size” or not?

Well, this is where the concept of Countability comes in.

Countability of Sets

 Mathematics is believed to have evolved from counting. Early cattle rearers somehow learnt how to count their cattle even though they did not know any numbers. Their method of counting heavily relied on the concept of 1-1 correspondence, just like the one we discussed earlier (but a little less advanced).

Say that the cattle rearer left home with 20 cows to graze in the grasslands. He had to make sure that all the cows returned home, or else he would be at a loss. How did he do that? Well, he used a very simple trick. He put up 20 wooden posts, one for each cow to be tied to. At the end of the day, he would tie each cow back to its post. If any of the posts were empty, this would mean that some of the cows were missing. There did not exist a bijection (a few posts did not map to a cow at all), and hence the two sets had different cardinalities. Obviously, the rearer did not think of the situation in terms of cardinality or bijections, but essentially they are the same thing.

Cantor realized that when it came to the notions of infinity, he was in no better position than the early cattle rearers. We humans seemed to struggle with counting infinity, just as the cattle rearers struggled before they knew their 1s, 2s, and 3s. So Cantor thought it was best to adopt the principle of one-to-one correspondence to understand such infinite collections. And it definitely paid off.

All Cantor needed to do was figure out a precise meaning of a countable set, one powerful enough to apply to finite as well as infinite collections. We will attempt to do the same.

If we think about it, it makes perfect sense to say that a set (or a collection) is countable if it is possible to look at any member of the collection and say that this is the 3^{rd} element, or this is the 227^{th} element (or, generally speaking, the n^{th} element); that is, if it is possible to count the elements. Or, in other words, if it is possible to assign a number (a natural number, to be more precise) to each and every element of the set.

So, we formally define a countable set as follows:

Definition A set \mathcal A is countable iff  \mathbb N \sim \mathcal A

So naturally, an infinite set that is not countable is called an uncountable set. We will see in later posts what it truly means for a set to be uncountable, but for now, know that a set is uncountable if no bijective mapping is possible between the set and \mathbb N .

From our definition, it follows that the set of even numbers is countable. From the exercises suggested above, the reader should be able to deduce that \mathbb Z as well as O (the set of odd numbers) are both countable. We will deal with uncountable sets in the next post, as that requires a little more elaboration, but for now, I would suggest the reader think about whether all infinite sets are countable or not. Given some infinite set such as \mathbb Q or \mathbb R , it might seem as though, with some amount of ingenuity, we should be able to fit all the elements of our set into a single list (if it is possible to list all the elements of the set, then it should definitely be countable, since the bijection would then be pretty straight-forward). After all, this list is infinitely long, so there ought to be plenty of room. This is something to think about.

I would like to pause here for now. We will continue our discussion in the future posts.


PS: I would really value constructive feedback on this post. Please do comment in the comment section below.

One way of testing whether a set of vectors is linearly independent is to arrange the vectors as columns of a matrix and then work out its determinant. If the determinant of that matrix is 0, then the set is linearly dependent, but if the determinant is anything other than 0, then the set is linearly independent.

Obviously, this only works if the matrix so formed is a square matrix, where the number of rows equals the number of columns. This is the same as saying that the number of components of the vectors in our set equals the total number of vectors in the set. While this might at first seem like a letdown, it isn’t totally useless, especially when it comes to determining whether the set is a basis for a particular subspace. What do I mean by that? Well, I’d recommend you continue reading!
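The technique itself can be sketched in a few lines of Python (an exact cofactor-expansion determinant, fine for classroom-sized matrices; the function names are my own):

```python
def det(m):
    """Exact determinant via cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def is_independent(*columns):
    """Arrange the vectors as columns of a square matrix; nonzero det <=> independent."""
    matrix = [list(row) for row in zip(*columns)]
    return det(matrix) != 0

print(is_independent((1, 0), (0, 1)))  # True: the standard basis of R^2
print(is_independent((1, 2), (2, 4)))  # False: the second vector is twice the first
```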

I have been using this “technique” for quite some time in Linear Algebra classes at school, but mostly without really understanding what is going on underneath. At first sight, this might seem like a “trick” of some sort, but as Qiaochu Yuan says, “In mathematics nothing is a trick if seen from a sufficiently high level”, this is no trick.

Well, from here on, I’ll try to be as mathematically formal as I can.


Let \mathcal A be a linearly dependent set, with cardinality n , of n -vectors. Now construct a matrix M whose columns are the vectors of \mathcal A . Or, if c_1,c_2,\ldots , c_n are the n -vector elements of \mathcal A , then construct M such that,

M=\begin{bmatrix} c_1 & c_2 &\ldots & c_n \end{bmatrix}

Now let \nabla be the determinant of M . Then,

\nabla =| ~c_1 ~c_2 ~\cdots ~c_n ~|

Now, because the set \mathcal A is a linearly dependent set, one of the vectors (without loss of generality, let’s say it’s the vector c_n ) can be represented as a linear combination of the other vectors in the set. Or,

c_n = a_1 c_1+a_2 c_2 + \ldots + a_{n-1} c_{n-1} ~~~ \text{for some } a_i \in \mathbb R .

Now, take the matrix M and subtract a_1c_1+a_2c_2 +\ldots +a_{n-1}c_{n-1} from the n^{th} column (that is, from c_n ). This results in another matrix (say M' ) whose last column is the 0 vector.

So, we now have that


M'=\begin{bmatrix} c_1 & c_2 &\ldots & c_{n-1} & 0 \end{bmatrix}

Since one of the columns of M' is 0 , it follows that \det(M') =0 .

Because the determinant of a matrix remains unchanged when a scalar multiple of one column (or row) is added to (or subtracted from) another column (or row), it thus follows that

\det(M) = \det(M') ~\implies ~\nabla=0

Therefore, we have just shown that if the columns of a matrix are linearly dependent, the determinant of that matrix must be zero. The converse is also true, and it shouldn’t be that difficult to show.
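To make the argument concrete, here is a tiny sketch (my own toy example, with a hard-coded 3-by-3 determinant): take c_3=2c_1+3c_2 , so the columns are dependent and the determinant comes out to 0, exactly as the proof above predicts.

```python
def det3(m):
    """Hard-coded 3x3 determinant (cofactor expansion along the first row)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

c1, c2 = (1, 0, 2), (0, 1, 3)
c3 = tuple(2 * a + 3 * b for a, b in zip(c1, c2))  # c3 = 2*c1 + 3*c2: dependent
M = [list(row) for row in zip(c1, c2, c3)]         # vectors as columns
print(det3(M))  # 0
```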


Now that it has been shown (not rigorously, but bearably solidly) why it works, I’d like to focus on something I mentioned earlier in the post: M being a square matrix. After reading the proof, it should be very clear why M must be a square matrix. Our technique uses the idea of a determinant, which is only defined for a square matrix; hence, if M isn’t a square matrix, then we cannot compute its determinant.

Now when you think about what it actually means for M to be a square matrix, you realize a certain fact about this technique that is, in my opinion, worthy of a mention. Because M is a square matrix, the number of vectors in \mathcal A and the number of components in each vector \vec {c_i} \in \mathcal A have to be equal; else, M wouldn’t be a square matrix. This means that the linear independence of the vectors in \mathcal A implies that they span the vector space \mathbb {R}^n . So, if a set is linearly independent and spans a particular subspace, the set is said to be a basis for that subspace, right?

So, what do we have now? Our little “theorem” now extends to

If a set of n vectors is a basis for \mathbb {R}^n , then the determinant of the matrix whose columns are those basis vectors is a non-zero number. If the set isn’t a basis for \mathbb {R}^n , then the determinant of the matrix is 0 .

The converse of this statement is true too, and it shouldn’t be difficult to see why.

In the process of learning Real Analysis, I encountered a definition of an additive inverse of a cut \alpha to be

\text {add inv of } \alpha \colon= \{p:\exists r>0\, (-p-r\notin \alpha)\}

The definition does make sense but I couldn’t seem to understand the motivation behind defining it this way.
Why is this definition significant? Isn’t it a bit uncalled for? What good does this definition do us? Does it make some things easier (and if so, how)?

With many days of struggle, I finally came up with an acceptable answer and I thought I’d share it here.

To understand where the definition of -\alpha comes from, you have to go back to the definition of the sum of two cuts. I’m going to assume the following definitions of cut and of addition of cuts:

\alpha\subseteq \mathbb{Q} is a cut iff,

\qquad(1)\qquad\varnothing\ne \alpha\ne\mathbb{Q} ;
\qquad(2)\qquad\text{if }p<q\in \alpha,\text{ then }p\in \alpha ; and
\qquad(3)\qquad \alpha\text{ has no greatest element} .

If \alpha and \beta are cuts, \alpha+\beta\triangleq\{p\in\mathbb{Q}:\exists a\in \alpha\,\exists b\in \beta(p=a+b)\} .

I’ll also assume that the embedding ^* of \mathbb{Q} into the set of cuts has been defined so that q^*=\{p\in\mathbb{Q}:p<q\} , so that 0^*=\{q\in\mathbb{Q}:q<0\} is intended to be the additive identity.

Clearly we want -\alpha to be defined so that \alpha+(-\alpha)=0^* . Thus, I want -\alpha to satisfy the following condition:

\qquad\qquad\qquad\qquad for all p\in\mathbb{Q} , p<0 iff \exists a\in \alpha\,\exists b\in -\alpha(p=a+b) .

This is a bit hard to sort out, so let’s pretend for a moment that we’ve actually constructed \mathbb{R} and look at a concrete example. Suppose that \alpha is intended to be \sqrt 2 : \alpha=\{p\in\mathbb{Q}:p<\sqrt 2\} . Clearly -\alpha should be -\sqrt 2 , or \{p\in\mathbb{Q}:p<-\sqrt 2\} . But p<-\sqrt 2 \iff -p>\sqrt 2 , so -\alpha ought to be \{p\in\mathbb{Q}:-p>\sqrt 2\}=\{p\in\mathbb{Q}:-p\notin\alpha\} . This suggests that perhaps we should define -\alpha in general to be \{p\in\mathbb{Q}:-p\notin \alpha\} . This almost works, but there’s a small problem if \alpha is a rational cut q^* : in that case \{p\in\mathbb{Q}:-p\notin q^*\}=\{p\in\mathbb{Q}:-p\ge q\}=\{p\in\mathbb{Q}: p\le -q\} , which is not a cut, since it has a greatest element. We want -(q^*) to be simply (-q)^* , i.e., \{p\in\mathbb{Q}:p<-q\} . There’s a simple (if slightly clumsy-looking) way around the problem: we simply set,

-\alpha \triangleq \{ p\in \mathbb{Q}:-p \notin \alpha \wedge -p \text{ is not the least element of } \mathbb{Q} \setminus \alpha \}\;\;\;\;\;\;\;\;(1)
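Before checking the general case, here is a quick sanity check on a rational cut (an aside of my own). Since \mathbb{Q}\setminus q^*=\{p\in\mathbb{Q}:p\ge q\} has least element q , the definition gives

-(q^*)=\{p\in\mathbb{Q}:-p\ge q\wedge -p\ne q\}=\{p\in\mathbb{Q}:-p>q\}=\{p\in\mathbb{Q}:p<-q\}=(-q)^*\;,

so the clause about the least element excludes exactly the one problematic point, and rational cuts behave as we wanted.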

It’s not hard to check that this really does define a cut. To check that it defines the right cut, suppose first that a\in\alpha and b\in-\alpha . Then -b\notin\alpha , so a<-b , and therefore a+b<0 ; this shows that \alpha+(-\alpha)\subseteq 0^* . Now suppose that p<0 , and let r=-p>0 ; then we want to find a\in\alpha and -b\in\mathbb{Q}\setminus\alpha such that -b is not the least element of \mathbb{Q}\setminus\alpha and a+r=-b . To give this some intuitive content, we’re trying to find rationals a and -b that are exactly r apart and that ‘straddle’ the cut \alpha . To do this, let q\in\alpha and s\in\mathbb{Q}\setminus\alpha be arbitrary. Since \mathbb{Q} is Archimedean, there is an n\in\mathbb{Z}^+ such that n>(s-q)/r and hence q+nr>s , so \{n\in\mathbb{Z}^+:q+nr\notin\alpha\}\ne\varnothing . Let m=\min\{n\in\mathbb{Z}^+:q+nr\notin\alpha\} ; then q+(m-1)r\in\alpha and q+mr\notin\alpha . Thus, if q+mr is not the least element of \mathbb{Q}\setminus\alpha , we can simply take a=q+(m-1)r and -b=q+mr to get a+r=-b with a and -b as desired.

But now if q+mr is the least element of \mathbb{Q}\setminus\alpha , we have to be a little cleverer: in this case we let a=q+\left(m-\frac12\right)r and -b=q+\left(m+\frac12\right)r . Clearly a+r=-b\in\mathbb{Q}\setminus\alpha , -b is not the least element of \mathbb{Q}\setminus\alpha , and a\in\alpha (it lies below q+mr , the least element of \mathbb{Q}\setminus\alpha ), so again a and -b are as desired. This shows that 0^*\subseteq\alpha+(-\alpha) , and hence that \alpha+(-\alpha)=0^* .

Finally, this definition is equivalent to the one I started with. If -p-r\notin\alpha for some r>0 , then -p\notin\alpha (it is greater than the non-element -p-r ) and -p is not the least element of \mathbb{Q}\setminus\alpha (that element would have to be at most -p-r ), so p\in-\alpha ; in other words, \{p:\exists r>0\;(-p-r\notin\alpha)\}\subseteq-\alpha . On the other hand, if p\in-\alpha by definition (1) , then -p\notin\alpha and -p is not the least element of \mathbb{Q}\setminus\alpha , so there is some s\in\mathbb{Q}\setminus\alpha with s<-p ; set r=-p-s , and note that r>0 and -p-r=s\notin\alpha , showing that

-\alpha\subseteq\{p:\exists r>0\;(-p-r\notin\alpha)\} .

Thus, definition (1) can be replaced by:

-\alpha\triangleq\{p\in\mathbb{Q}:\exists r>0\;(-p-r\notin\alpha)\}\;.
