Friday, May 25, 2018

Reading about debt

I often talk to friends and family about books, often in an overenthusiastic manner. I like books and I like, more than anything else, finding useful stuff for real life in them.

If you're now expecting an apologia for "Mindfulness for Dummies" and the like, I have to disappoint you; I have found that the treasures lie in unexpected places.

As an example, take my last two reads: Debt: The First 5,000 Years and Buffett: The Making of an American Capitalist (not finished yet, but I also learned something very useful there, which has nothing to do with investment; let me talk about it some other time).

Does it sound weird to have a leftist, anti-capitalist book about debt right next to the very generous biography of a man with a net worth of US$ 87 billion? Don't worry about my mental stability, I do it on purpose.

(Which of course is not a sign of mental stability, now that I think about it)

The first book investigates the origin of money in the credit instruments of ancient civilizations. Yes, it looks like an established fact that the Babylonians had financial instruments recorded on clay tablets in a notional currency (the bushel) which was used only for recording these financial transactions. And yes, they knew about compound interest. And no, they did not have any money you could exchange on the street. Crazy.

Anyway, there are also pages dedicated to the fact that the precise quantification of debts between neighbors and acquaintances automatically tends to depersonalize those relationships. This is the reason, for instance, why we do not like to keep precise track of who owes whom how much money when we go out with our best friend for a beer.

My wife has been thinking for some time about turning her years-long practice of mindfulness meditation and her teaching experience with it into a full-time activity mixing self-employment and community service. We have talked a lot about financial independence and the like in the last few weeks. And then BANG! I read this book and I understand why financial dependence feels so threatening in a relationship. Suddenly, I have a framework for how to talk and think about it.

I think this is the sign of great books: you find insights and inspiration there which have nothing to do with the apparent topic of the book.

Good reading!

Monday, May 14, 2018

White King, Red Queen: a technical primer about trees, part III.

In the previous posts (I, II), we have seen how the arms race around computer chess started. We have seen the first ideas about the possibility of building a true chess computer. We witnessed a German company doing it and being displaced by the newborn 486, and we left an army of chess programmers getting ready to assault Gary Kasparov's dominance over the world of chess.

At this time, we are talking about 1994, many people not directly involved in technology still wouldn't have thought that the end of human chess dominance was only 3 years away. I witnessed exactly this period as a young club player and I have vivid memories from that time. Indeed, the explanation I give you below is more or less what I got back in 1996 from a chemistry professor interested in game theory, and it still holds true.

The main reason for the supposed inability of computers was that they were thought to use pure "brute force" to find the best moves, whereas masters were known to have a long-term understanding of the game. The true skill of the masters, so the mainstream view went, was based on being able to recognize positional advantages which would bear fruit 20 moves or more later, and such a far horizon was thought to be unreachable by standard alpha-beta engines. Oops, I did it. A technical word. I think the moment has come for you, my faithful reader, to really understand how a chess engine thinks. It will be quick, it could even be painless, so let us look at it in more detail.

Down to the very basics, a chess engine is composed of three fairly independent parts: the move generator, the static evaluator and the search function.

The move generator
First, a remark: chess programmers often talk about chess positions in terms of FENs. These are just shorthands describing things like "White has a king on the bottom-right square h1, a pawn on the square c4 and a bishop on f2. Black only has a king on h8". Here is the FEN for the starting position:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
The move generator gives you, for any possible FEN, all legal moves available to the side to move.
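
If you are curious what such an interface looks like in code, here is a minimal sketch in Python. I use the third-party python-chess library purely for illustration (an assumption of mine, not what any particular engine uses internally); every serious engine implements its own, highly optimized move generation.

```python
# A minimal sketch of a move generator interface, based on the python-chess
# library (the "chess" package on PyPI). Real engines roll their own.
import chess

def legal_moves(fen: str) -> list[str]:
    """Return all legal moves for the side to move, in UCI notation."""
    board = chess.Board(fen)
    return [move.uci() for move in board.legal_moves]

# In the starting position, White has exactly 20 legal moves.
start_fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(len(legal_moves(start_fen)))  # -> 20
```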

The static evaluator
A very important element is to quantify how much better one position looks than another. This is done by the static evaluator. The static evaluator gives you, for any FEN, a value representing its objective evaluation of the position. Traditionally, positive numbers mean White is doing well, negative numbers mean Black is doing well. What could a static evaluator look like? It could be, for instance, something like: count all white pawns, subtract the number of black pawns and multiply by 10. In this case, both a position with 3 white pawns and 2 black pawns and a position with 4 white pawns and 3 black pawns will give you a value of +10. Obviously, the static evaluators of real chess engines are much more complex than that, having hundreds of elements. For instance, Stockfish's main evaluation function (Stockfish is currently the leading chess engine) has ca. 800 lines of C++ code, with an element typically coded in 5-10 lines, and this is only the main evaluation function. As I said, hundreds of elements.
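
To make the toy example concrete, here is a sketch of such an evaluator in Python (my own illustration, extended from pawns to all pieces with the usual textbook values scaled by 10; it has nothing of the sophistication of a real evaluation function).

```python
# A toy static evaluator: material count read directly from the FEN string.
# Positive values mean White is better, negative values mean Black is better.
PIECE_VALUES = {"p": 10, "n": 30, "b": 30, "r": 50, "q": 90, "k": 0}

def evaluate(fen: str) -> int:
    placement = fen.split()[0]                            # first FEN field
    score = 0
    for ch in placement:
        if ch.lower() in PIECE_VALUES:
            value = PIECE_VALUES[ch.lower()]
            score += value if ch.isupper() else -value    # upper case = White
    return score

# Three white pawns against two black pawns (plus the two kings) -> +10
print(evaluate("7k/pp6/8/8/8/8/PPP5/7K w - - 0 1"))  # -> 10
```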

The search
But the core of a chess engine is the search. Ahhh, the search. To understand how the search works, it is useful to make the following thought experiment. Imagine you couldn't play chess at all. Zero knowledge of the rules. The only thing you know is how to translate a chessboard position into a FEN. Plus, you are given a move generator and a static evaluator. You start playing. You are White and you have to move. Since you have a move generator, you know which moves you could make, and you have a fantastic idea: you will just try them one by one and keep the one which gives you the best, in other words maximal, static evaluation. If you want to spend some time, with this tool you can do the exercise with Stockfish. It turns out that moving the pawn in front of the king by two squares gives you an evaluation of 218. Cool! Let us move that pawn forwaaaaard!
But hang on! Black can do this too. Black would in turn choose the move which maximizes his value. This means he will go through all possible replies and choose the one which minimizes my value. So it could be, in principle, that the best reply to moving my king pawn by two squares gives Black a better value than all possible replies to moving my queen pawn by two squares!
In fact it is so: if I move my king pawn by two squares, the best he can do is to move his queen pawn by two squares, bringing me down to -20. In contrast, if I move my queen pawn by two squares, the best he can get is a +20 (again by moving his queen pawn by two squares). So, by minimizing on his turn, he invalidates my assumption that I just have to choose the move after which the static evaluation is best.
This is bad, because it could be that there is still another one of my 20 possible first moves that is even better, in the sense that it makes the minimizing task for Black even harder.
Now that we think about it, we should also consider that we could reply to his replies, again trying to maximize our value with our second move, so we have to take this into consideration as well. And he could reply by minimizing, and so forth and so on.
This loosely described algorithm is called minimax and is the basis of chess engines. You just unroll all possible moves until, say, the 10th reply (technically: ply), you look at the values at the end of this unroll, and both players alternately choose the minimizing or the maximizing move. Minimax. Interestingly, there is an equivalent variation called alpha-beta which saves you some of the comparisons.
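
For the programmers among my readers, here is a minimal, self-contained minimax sketch in Python. It borrows the rules from the python-chess library and reuses the toy material count from above as its static evaluation; the two-ply search depth and all helper names are my own choices, and a real engine replaces every bit of this with alpha-beta and far more refined code.

```python
# A minimal minimax sketch (illustration only).
import chess

PIECE_VALUES = {chess.PAWN: 10, chess.KNIGHT: 30, chess.BISHOP: 30,
                chess.ROOK: 50, chess.QUEEN: 90, chess.KING: 0}

def evaluate(board: chess.Board) -> int:
    """Toy static evaluation: material count, positive means White is better."""
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == chess.WHITE else -value
    return score

def minimax(board: chess.Board, depth: int, maximizing: bool) -> int:
    """Unroll all moves down to the given depth and back up the min/max values."""
    moves = list(board.legal_moves)
    if depth == 0 or not moves:            # horizon reached or game over
        return evaluate(board)
    values = []
    for move in moves:
        board.push(move)
        values.append(minimax(board, depth - 1, not maximizing))
        board.pop()
    return max(values) if maximizing else min(values)

# White to move in the starting position, looking two plies ahead:
# we maximize over our moves, assuming Black then minimizes.
board = chess.Board()
best_move, best_value = None, None
for move in list(board.legal_moves):
    board.push(move)
    value = minimax(board, 1, False)       # Black replies, minimizing
    board.pop()
    if best_value is None or value > best_value:
        best_move, best_value = move, value
print(best_move.uci(), best_value)
```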

All competitive modern chess programs use some form of alpha-beta with variations and refinements. In this sense, they do use brute force in the very end. They just unroll the moves and do minimax. More brute force is not possible.

Now, if you do the math, you will notice that at every ply you can choose among 20-30 moves. This means that the possible positions after, say, 20 plies number around 20^20, which is much more than you could store on your computer, even if every position and all the information about it fit into a single bit. Alpha-beta gives you some more bang for the buck, but not that much.
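
To put a rough number on it, here is a back-of-the-envelope computation of mine (not a figure from any source):

```python
# Back-of-the-envelope: storage needed for 20^20 positions at one bit each.
positions = 20 ** 20                      # about 1.05e26 positions
bits_per_terabyte = 8 * 10 ** 12
print(positions / bits_per_terabyte)      # about 1.3e13 terabytes
```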

So, it was thought that no computer would ever be able to reach the depth of a chess master. Oh, how wrong they were...

Sunday, May 06, 2018

What is a proof?

I was at lunch with some colleagues a few days ago, and we started discussing maths. Predictably, the conversation turned to the quite specific concept of proof used by mathematicians. I had some insights into the reasons for so many misunderstandings between mathematicians and others. I think many of them arise from a different understanding of the word "proof".

I want to exemplify this with the classical result about the cardinality of the power set, so let us dive into the math.

First, the power set $P(A)$ of a given finite set $A$ is the collection of all possible subsets of $A$.

If $A$ is, say, the set of $\{me, you\}$ the power set will contain:
  1. the empty subset $\emptyset$
  2. the subset containing only $me$
  3. the subset containing only $you$
  4. the subset containing both $me$ and $you$
It is a legitimate and somewhat interesting question: how many elements does the power set of $A$ have, if $A$ has $n$ elements? Such counting questions are very common in math and constitute the basis of combinatorics. Let us look at how we can come to a solution of the problem and, even cooler, a formal proof that the solution is correct.

Examples
The first step is, more often than not, to list some examples. For instance, if the set $A$ contains only one element, the power set will contain:
  1. the empty set
  2. the subset consisting of that element
We now have an example for a 1 element set (2 subsets) and for a 2 element set (4 subsets). What about a 3 element set?

By the way, we have implicitly agreed on the working assumption that the cardinality of the power set only depends on the cardinality of the original set. This is OK; if it turns out to be wrong, we will change it. (But yes, it will turn out to be right.)

Back to our example of the 3 element set $\{me, you, the\ others\}$. We have all the subsets which we considered before:

  1. the empty subset $\emptyset$
  2. the subset containing only $me$
  3. the subset containing only $you$
  4. the subset containing both $me$ and $you$
and in addition, each of those with $the\ others$ added:

  1. the subset containing only $the\ others$
  2. the subset containing only $me$ and $the\ others$
  3. the subset containing only $you$ and $the\ others$
  4. the subset containing $me$, $you$ and $the\ others$
This totals 8 elements. Let us summarize: for 1 element, 2 subsets; for 2 elements, 4 subsets; and for 3 elements, 8 subsets. This looks suspiciously like $2^n$, with $n$ the cardinality of $A$. This will be our working hypothesis.
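
Before hunting for the reason, we can even let a computer spot-check the hypothesis for a few small values of $n$. Here is a quick brute-force check in Python (a sanity check only, of course, not a proof):

```python
# Brute-force check of the working hypothesis |P(A)| = 2^n for small n.
from itertools import combinations

def power_set_size(n: int) -> int:
    """Enumerate all subsets of an n-element set and count them."""
    elements = range(n)
    return sum(1 for k in range(n + 1) for _ in combinations(elements, k))

for n in range(6):
    assert power_set_size(n) == 2 ** n
    print(n, power_set_size(n))   # 0 1 / 1 2 / 2 4 / 3 8 / 4 16 / 5 32
```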

Insight
Having formulated our working hypothesis, we should ask ourselves whether it can possibly be right. Upon thinking about it, we recognize this: for every subset in our collection, every single element of $A$ can either be present or not. Since we have $n$ elements, we must make $n$ binary decisions about whether to include each element in the subset or not. This gives $2 \times 2 \times \ldots \times 2$ possibilities. Yes! This is $2^n$. This is our insight, and typically it is enough if you are a physicist.
But it is not a proof. It is not a proof because, to be a proof, it must be formalized in a standard way using the laws of logic. This is important, since it gives all mathematicians the chance to recognize a proof as such.

The proof
Let us prove this, then. The standard way to prove statements about natural numbers is the method of induction, which we will follow step by step. First we formulate our theorem in detail.

If $A$ is a finite set with $n$ elements, then its power set $P(A)$ has $2^n$ elements.

We prove that this is true for 0. If the set $A$ is empty, then it has 0 elements, and one single subset, which is the empty set. $1  = 2^0$. Check.
To avoid being caught in misunderstandings and quarrels about 0 and empty sets, we will prove it also for a single element. This is not required, but we do it anyway. If $A$ has a single element, its subsets are the empty set and $A$ itself. $2 = 2^1$. Check.

Now let us go to the induction step. Let us assume we know that if $A$ has $n$ elements, then there are $2^n$ distinct subsets of $A$. Let us reason about a hypothetical set $B$ with $n+1$ elements and pick one arbitrary element out of $B$, which we will call $b$. Let us remove $b$ from $B$ to obtain $B'$, a set of obviously $n$ elements. We know from the induction assumption that $B'$, having $n$ elements, has $2^n$ subsets, constituting the set $P(B')$. Let us now consider the set $X = \{S \cup \{b\} \mid S \in P(B')\}$.

Interestingly, $X$ and $P(B')$ are disjoint, as every set in $X$ contains $b$, which by construction is absent from every set in $P(B')$. So, the cardinality of $X \cup P(B')$ will be the cardinality of $X$ plus the cardinality of $P(B')$.

It remains to show that
  1. the cardinality of $X$ is $2^n$
  2. $X \cup P(B') = P(B)$
For 1. we should give a bijective function from $P(B')$ to $X$, which turns out to be exactly the prescription with which we constructed $X$ in the first place. This kind of clerical task is typically left to the reader as an exercise, and we will comply with the mathematical tradition.

For 2. we notice that if $Y$ is a subset of $B$, it either contains $b$ or it does not. If not, it is in $P(B')$ and thus in $X \cup P(B')$. If yes, it equals $(Y \setminus \{b\}) \cup \{b\}$. Now, $Y \setminus \{b\}$ consists only of elements of $B$ other than $b$, so it is in $P(B')$, and when we unite sets in $P(B')$ with $\{b\}$ we get, by construction, sets in $X$; so $Y$ is a set in $X$ and thus in $X \cup P(B')$. We have just proved that $P(B) \subset X \cup P(B')$. The converse direction is left to the reader as an exercise, as it follows the same principles. We also do not want to spoil for them the incredible feeling of finishing a proof.

Conclusion
What I hope I could convey is the sheer amount of work needed to go from an insight, giving us just a working hypothesis and some ideas, to a complete mathematical proof. The real question here could be: why bother? Indeed, our insight was right, and the loosely described argument ($n$ binary decisions giving $2^n$ different possibilities) is perfectly fine.

I think the problem is that this kind of insight often fails with more complex arguments, even in seemingly intuitive cases.

Formalized, standardized proofs in maths are a powerful method to fight the natural tendency of the human mind to fall victim to cognitive illusions, and are thus one of the milestones of human thinking!


Wednesday, May 02, 2018

White King, Red Queen: selling the chess soul, part II.

We just left 1979, with D. Hofstadter being skeptical and not at all enthusiastic about chess computers. Unimpressed by the opinions of the American scientist, at exactly the same time, in the German city of Munich, a small company had just hired two programmers to bring the first portable chess computer to market.

Chess was quite popular at that time. The famous Spassky-Fischer match of 1972 had been an incredible worldwide drama. Subsequently, Fischer started his path into madness and was deposed, and the rivalry between the two Soviet giants Karpov and Kasparov began.

So, a short time after the introduction of the first personal computers, in the middle of a worldwide chess pandemic, you could buy a small plastic box which was able to play chess at the level of a mediocre, but loyal and always available, club player. It was called Mephisto I.
Mephisto I was a real blockbuster. You could buy it for 500 DM (about 500 € in today's money), well below the price of a personal computer back then. It would help you go through tactical variations in post-mortem analyses of games during chess-club evenings. They would sell 100,000 units per year in Germany alone. The company producing the Mephisto, Hegener + Glaser AG, enjoyed 10 years of an almost complete monopoly of the chess computer market, and the financial rewards coming from it.

And as often happens, they were killed by their own inertia. The market was changing: more and more chess computers were available, the once esoteric mystery of alpha-beta search was well known, and most importantly, the 486 entered the scene. A 486 with a decent chess program could reach chess levels worthy of a national master. You couldn't bring it to the club, true. But for correspondence players it was cheap insurance against tactical blunders. Chess programs entered the lives of serious chess players, and their moral authority dictated that weaker players should abandon the Mephisto and all portable chess computers, or be condemned never to learn chess in their entire lives. So they abandoned them.

Why didn't Hegener + Glaser AG see this coming? They had the professional chess programmers; they should have noticed at the latest in 1989, when no chess computers entered the world computer chess championship, or in 1991, when Mephisto lost the title to a software version of itself. Nobody knows.

H + G survived until 1994, and here we leave them and their 28 million DM of debt as a further victim of the endless race decreed by Her Majesty the Red Queen.


Sunday, April 29, 2018

White King, Red Queen: the arms race in computer chess, part I.


Some weeks ago I went through a very insightful book about the Red Queen hypothesis: the evolutionary arms race between coexisting species, or between different genders of the same species, or, finally, between individuals of the same gender competing for offspring.

One classical example of the first is the coevolution between figs and wasps, each fig species having more or less a single dedicated wasp species able to fertilize it. These two coexisting species evolved through mutual competition: like inhabitants of Wonderland, they cannot stand still and are forced to evolve continuously in a perpetual arms race.

I now want to make the bold claim that the Red Queen hypothesis holds true not only in the context of biological evolution, but in all cases in which there is competition and the possibility of change. I would like to exemplify this bold claim with the amazing story of chess engines.

This is a community known to most people only through the fame of the mythical Deep Blue, the first chess computer to defeat a human World Champion, the acclaimed Gary Kasparov.

I make this choice because, as it happens, I have been active in the chess programming community for some years. Having been a mediocre club chess player in my youth, I have always been fascinated with chess engines and artificial intelligence. This even determined some of my career choices.

This will be a weird read for people who have never heard of chess tournaments or chess programming. You will get to know a competitive environment pushed forward both by financial interests and by the desire for intellectual prestige.

As a bonus, you are going to learn something about one of the longest-lived and most united communities in the digital world, that of chess enthusiasts and programmers. To quote the motto of the World Chess Federation: Gens Una Sumus. We are one people.

I would like to start my journey with Gödel, Escher, Bach: an Eternal Golden Braid, the gorgeous book by D. Hofstadter about everything, including artificial intelligence. There is one place in the middle of the book where Hofstadter, a leading cognitive scientist of that time, starts to write about chess and chess computers. He is skeptical about the possibility that a computer program could defeat a human chess master (reddit link), adding that this is probably due to the peculiar way in which chess masters think. This was 1979. The AI scientists of that time seemed to predict a long, hard road for chess computers. Obviously, if one feels the need to deny such a development, it means that the development is already underway. There was a reward, financial gain and intellectual prestige, and there was the possibility of change, with computer technology evolving rapidly in those years.

Enter The Red Queen.

Thursday, April 19, 2018

We are the AI

Sweet Revolution (2015) - Gonçalo Mar

It just came to my mind some days ago.

Imagine now that your favorite, gigantobombastic tool starts thinking and having a free will of its own. Imagine too that this tool becomes aware of your existence and starts using you to satisfy its ends. You don't understand what its ends are, and before you can organize, it starts modifying your very essence for no apparent reason.


Does this sound spooky? Well, now you know how genes feel. They used to use living organisms to get their stuff done. Until one of those stupid tools decided to start a revolution.

Damned.

Recommended reading:
crispr
selfish genes
red queen
sweet revolution

Saturday, April 01, 2017

Humanities

I started reading this and had a sudden enlightenment on a related topic.

Nowadays, it is actually possible to engineer anything you would like to, given enough time and resources: something like this or this or this (feel free to add anything you want, of course).

So, the true difference between success and failure lies in doing this with the few resources available.

To convince you, note that this was not true, say, 500 years ago. Many things were just not known. You barely had any understanding of the scientific method at your disposal (actually, you would have had to wait 15 more years). No matter how hard you tried, you could not fly to the moon. Period.

Today, actually, you could send people to Mars, starships to Proxima Centauri, eradicate malaria and maybe also find the limits of quantum mechanics and general relativity.

The art now is to do it with scarce resources. We have had a shift from in-depth reflection to tactical thinking. I find it most fascinating, and I also think this will lead to a reevaluation of the humanities. If it is no longer that important to have the most profound technical-scientific knowledge, it becomes more and more important to understand interactions between individuals working toward a goal, to identify cognitive biases leading to poor decisions, and to assess the true, often non-technical motivations behind apparently technical opinions.

Friday, May 31, 2013

Test-driven maths I - convergent sequences

This is my first post for my test-driven maths project. Today I want to show that, working in the paradigm of test-driven development, we can develop a working definition of a convergent sequence. Metaphorically speaking, we want to develop a mathematical "program" that, given a sequence, tells us whether this sequence is convergent in some useful sense.

I will start with an easy example (which will be our first test). We look at the following sequence of numbers, which we will call

Test sequence 1

$$1, \frac{1}{2},  \frac{1}{3},  \frac{1}{4}, ...$$

or, in other terms, $\{\frac{1}{n}\}_n$.

It is clear (by intuition) that the numbers in this sequence (in the following, sequence 1) become smaller and smaller, approaching, but never touching, 0. For this reason we will use it as our first test case, and try to derive a formal definition of what a sequence of numbers converging to 0 is.

By looking at the sequence, two things become apparent: 1) the numbers get smaller and smaller and 2) the numbers are always positive. So we try our first definition.

Convergent sequences, take 1

A sequence of positive numbers is said to be convergent to 0 if the numbers become smaller and smaller.

Let us now try to put it in more formal terms.


Convergent sequences, take 2

A sequence of positive numbers $x_n\geq0$ is said to be convergent to 0 if $x_{n+1}< x_n$ for every index $n$ of the sequence.

Since $\frac{1}{n+1}<\frac{1}{n}$, this definition seems to include our test sequence 1, in the sense that according to this definition our sequence converges to 0. Can we stop now? No. We have written a small test (checking whether sequence 1 converges) and a small piece of code (our take 2). But our mathematical insight is not yet satisfied, because our test does not cover many possible inputs (in the form of test sequences, of course). So, we have to extend our tests.
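
To stay with the programming metaphor, we can even write our "code" and our "test" literally, in Python. This is only my own illustration of the idea: a sequence is represented by a function of $n$, and a computer can of course only ever inspect a finite prefix of it.

```python
# "Take 2" as an executable predicate: positive and strictly decreasing,
# checked on a finite prefix of the sequence (an illustration, not a definition).
def take_2(x, n_max=1000):
    return all(x(n) >= 0 and x(n + 1) < x(n) for n in range(1, n_max))

seq_1 = lambda n: 1 / n     # test sequence 1
assert take_2(seq_1)        # take 2 classifies sequence 1 as convergent to 0
```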

In particular, it may be useful to have a sequence of which we know (again by intuition) that it does not converge to 0, so that we can check that our definition also fails when it must. The simplest thing to do is to take sequence 1 and add 1 to all its members.


Test sequence 2

$$2, 1+\frac{1}{2},  1+\frac{1}{3},  1+\frac{1}{4}, ...$$

Now, since this sequence is composed of decreasing positive numbers, our tentative definition would call it convergent. Since we know that this sequence does not converge to 0, this means that our program (take 2) does not pass the test. In fact, the point is that our test sequence 2 always stays at least 1 away from 0. So, let us add to our definition that the sequence cannot keep a definite distance from 0.

Convergent sequences, take 3

A sequence of positive numbers $x_n\geq0$ is said to be convergent to 0 if $x_{n+1}< x_n$ for every index $n$ of the sequence and, for any positive number $\epsilon$, it is not true that all numbers in the sequence are larger than $\epsilon$.

This looks good. Let us build some new tests to check whether we are really there. Say, we take sequence 2 and put some 0s here and there. The resulting sequence should not converge according to our definition, since we are not getting closer and closer to 0 with all the numbers!


Test sequence 3

$$2, 0, 1+\frac{1}{2},  0, 1+\frac{1}{3},  0, 1+\frac{1}{4}, ...$$

Now we have a problem. Since we have 0s over and over again in our sequence, we cannot find an $\epsilon$ such that all numbers are larger than it, so for that reason the sequence would be classified as convergent. But: since we inserted 0 over and over again, the numbers are not decreasing, and the sequence is classified as not convergent for that reason. So, it looks like we pass the test, but for the wrong reason. Let us keep in mind that there is some problem with the decreasing property and let us correct the part regarding the distance from 0.


Convergent sequences, take 4

A sequence of positive numbers $x_n\geq0$ is said to be convergent to 0 if $x_{n+1}< x_n$ for every index $n$ of the sequence and, for any positive number $\epsilon$, we can find some index $k$ (dependent on $\epsilon$) such that all numbers with index larger than $k$ are smaller than $\epsilon$.

Now we are on the safe side with test sequences 2 and 3. Indeed, if I choose $\epsilon = 0.9$, I am not able to find any index such that the numbers with larger index are all smaller than 0.9, since I have some $1+\frac{1}{n}$ popping up in my sequence over and over again. Are we still on the safe side with sequence 1? Yes, since if I choose $k=\frac{1}{\epsilon}$, it is clear that for all larger indices the numbers in the sequence are smaller than $\epsilon$ (this is undergraduate algebra, just try it). Note that test sequence 3 is now classified as not convergent both for not being decreasing and for not becoming arbitrarily small.

Now let us go back to the problem with the decreasing property. If I have a sequence of numbers and scramble the order, I do not want this scrambling to change whether we call the sequence convergent or not. So, we come up with a second test sequence that has to converge.


Test sequence 4

$$\frac{1}{2},  1,  \frac{1}{4}, \frac{1}{3}, ...$$

We just switched the positions of neighbouring elements. Now, this sequence is intuitively convergent, but our take 4 says it is not, since the elements are not decreasing. So, what if we drop the assumption of having decreasing numbers?


Convergent sequences, take 5

A sequence of positive numbers $x_n\geq0$ is said to be convergent to 0 if for any positive number $\epsilon$, we can find some index $k$ (dependent on $\epsilon$) such that all numbers with index larger than $k$ are smaller than $\epsilon$.

This sounds familiar. Surprising as it may be, mathematicians at work really do use (more or less consciously) this method to find good new definitions and, even more surprisingly, as we will soon see, to find proofs of theorems!
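
For completeness, here is the final definition turned into the same kind of executable "test", again purely as my own illustration: a computer can only sample a few values of $\epsilon$ and a finite prefix of each sequence, so this is a sanity check of the definition, not the definition itself.

```python
# "Take 5" as an executable check against the four test sequences.
def take_5(x, epsilons=(0.5, 0.1, 0.01), n_max=10_000):
    """For each sampled epsilon, is there a k with x(n) < epsilon for all checked n > k?"""
    def tail_below(eps):
        last_violation = max((n for n in range(1, n_max) if x(n) >= eps), default=0)
        return last_violation < n_max - 1    # k = last_violation would do
    return all(tail_below(eps) for eps in epsilons)

seq_1 = lambda n: 1 / n                                       # test sequence 1
seq_2 = lambda n: 1 + 1 / n                                   # test sequence 2
seq_3 = lambda n: 0 if n % 2 == 0 else 1 + 2 / (n + 1)        # test sequence 3
seq_4 = lambda n: 1 / (n + 1) if n % 2 == 1 else 1 / (n - 1)  # test sequence 4

assert take_5(seq_1) and take_5(seq_4)            # classified as convergent to 0
assert not take_5(seq_2) and not take_5(seq_3)    # classified as not convergent to 0
```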

A test-driven revival

More than a year ago I left Freiburg and the BCF to start working on the development of MEMS at Bosch GmbH. For some complicated reasons connected to my work there, I got involved in software engineering and in particular in test-driven development. But only today did I realise why I got involved in it and why I like it.

In fact, test-driven development is a kind of "formalization" of how mathematicians actually work!

In particular, I found it to be in line with Gower's pedagogical principle.

Following this intuition, I will try in the coming days (months?) to revive this blog, and to show that what computer scientists rediscovered in the mid-90s as test-driven development is nothing but what mathematicians have been doing for centuries.

Tuesday, October 02, 2012

Wild Animals

About a year without writing, eh? This is due to two main factors:

 1 - I left neuroscience and, more generally, the university world, to go and work as a MEMS engineer at the Bosch plant in Reutlingen.

 2 - about 7 months ago we took a ferocious animal into our home. I won't tell you the breed, so as to avoid visits from the Tierschutzamt or the like, only that it requires a lot of care. I will keep you promptly updated on its developments...