Tuesday, May 29, 2018

Never lose money (or anything else)

As I mentioned a few days ago, I have just gone through a biography of Warren Buffett. One of his principles, it seems, has been to force company managers to always return sizable value to the shareholders, even when this means reducing investments in growth.

In a way, this reminds me of the agile obsession in software development with continuously returning value to the stakeholder: small, stable, robust increments, delivering additional value to the company and the stakeholder. Abhor big-bang releases, make sure to move from one stable situation to another, and always make sure to give your shareholders some value, if only in the form of added knowledge.

I even suspect that the terminology used in the agile community is, at the very least, inspired by value-based investment theory. You often read about stakeholders, return on investment and similar vocabulary.

I have to admit that this point of view has great appeal to me. I personally see applications of this principle in other situations. Some years ago, I was reading a book about relationships and communication in married couples (no, I cannot recall the title this time), and one of the behavioral psychologists authoring the book observed that much suffering in the modern world arises from underestimating the pain caused by a divorce or other separation, and from the failure to objectively manage the unpleasant situation of a difficult relationship. The unexciting course of action would be to give to (and request from) the partner a small but constant return on emotional investment. In the book, they even used the term "relationship account" and devised a structured approach for making (and requesting) small, continuous payments into this account. I have to admit that this structured approach has navigated my wife and me out of dangerous waters in bad periods.

In contrast, a large investment with an uncertain return (separation and the hope of a better future relationship, to stay with the current example) is often preferred to the small return of a currently difficult relationship, and one fails to focus on effective measures to increase this small but steady advantage.

As I was discussing exactly this topic with my wife some days ago (she always has to endure my rants about whatever book I read last), I realized that this effect is wonderfully explained in the masterpiece Thinking, Fast and Slow in terms of cognitive illusions. In fact, what happens here is overconfidence in our own imagination of future situations, due to the cognitive ease with which we ignore the difficulties that will arise in the future, just because we don't see them now. And, of course, we overestimate the effect of our actions, since we can easily imagine them.

A great future reward with little chance of being realized is preferred to less exciting options, in particular when the current situation is slightly adverse. So, never lose money or anything else!

Friday, May 25, 2018

Reading about debt

I often talk to friends and family about books, often in an overenthusiastic manner. I like books and I like, more than anything else, finding useful stuff for real life in books.

If you're now expecting a eulogy of "Mindfulness for dummies" and the like, I have to disappoint you; I found out that the treasures are to be found in unexpected places.

As an example, take my last two reads: Debt: The First 5000 Years and Buffett: The Making of an American Capitalist (not finished yet, but I also learned something very useful there, which does not have anything to do with investment; let me talk about it some other time).

Does it sound weird to have a leftist, anti-capitalist book about debt in the direct neighborhood of the very generous biography of a man with a net worth of US$ 87 billion? Don't worry about my mental stability, I do it on purpose.

(Which of course is not a sign of mental stability, now that I think about it)

The first book investigates the origin of money in the credit instruments of ancient civilizations. Yes, it looks like it is an established fact that the Babylonians had financial instruments recorded on clay tablets in a notional currency (the bushel) which was only used for recording these financial transactions. And yes, they knew about compound interest. And no, they did not have any money you could exchange on the street. Crazy.

Anyway, there are also pages dedicated to the fact that the precise quantification of debts between neighbors and acquaintances automatically tends to depersonalize these relationships. This is the reason, for instance, why we do not like to keep precise track of who owes whom how much money when we go out with our best friend for a beer.

My wife has been thinking for some time about turning her years-long practice of mindfulness meditation and her teaching experience with it into a full-time activity mixing self-employment and community service. We have talked a lot about financial independence and the like in the last weeks. And then BANG! I read this book and I understand why financial dependence feels so threatening in a relationship. Suddenly, I have a framework for talking and thinking about it.

I think this is the sign of great books: you find insights and inspirations in them which have nothing to do with the book's apparent topic.

Good reading!

Monday, May 14, 2018

White King, Red Queen: a technical primer about trees, part III.

In the previous posts (I, II), we saw how the arms race around computer chess started. We saw the first ideas about the possibility of building a true chess computer. We witnessed a German company doing it and being displaced by the newborn 486, and we left an army of chess programmers getting ready to assault Garry Kasparov's dominance over the world of chess.

At this time, we are talking about 1994, many people not directly involved in technology still wouldn't have thought that the end of human chess dominance was only 3 years away. I witnessed exactly this period as a young club player and I have vivid memories from that time. Indeed, the explanation I will give you later is more or less what I got back then in 1996 from a chemistry professor interested in game theory, and it still holds true.

The main reason for the supposed inability of computers was that they were thought to use pure "brute force" to find the best moves, whereas masters were known to have a long-term understanding of the game. The true skill of the masters, so went the mainstream view, was based on being able to recognize positional advantages which would bear fruit 20 moves or more later, and such a far horizon was thought to be unreachable by standard alpha-beta engines. Ooops, I did it. A technical word. I think the moment has come for you, my faithful reader, to really understand how a chess engine thinks. It will be quick, it could even be painless, so let us look at it in more detail.

Down at the very basics, a chess engine is composed of three fairly independent parts: the move generator, the static evaluator and the search function.

The move generator
First a remark: chess programmers often talk about chess positions in terms of FENs. These are just shorthand for describing things like "White has a king on the bottom-right square h1, a pawn on the square c4 and a bishop on f2. Black only has a king on h8". Here is the FEN for the starting position:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
The move generator gives you, for any possible FEN, all legal moves which can be made by the side to move.
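To make this a bit more tangible, here is a minimal sketch (the helper name `fen_to_board` is mine, not from any engine) of how the piece-placement field of a FEN can be expanded into an 8x8 board; a move generator would operate on a representation like this:

```python
def fen_to_board(fen):
    """Expand the piece-placement field of a FEN into an 8x8 grid.

    Ranks are listed from 8 down to 1, separated by '/'. Letters are
    pieces (uppercase = White, lowercase = Black); digits encode runs
    of empty squares.
    """
    placement = fen.split()[0]
    board = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend(["."] * int(ch))
            else:
                row.append(ch)
        board.append(row)
    return board

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
for row in fen_to_board(start):
    print(" ".join(row))
```

Running this prints the familiar starting position, with dots for empty squares.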

The static evaluator
A very important element is quantifying how much better one position looks than another. This is done by the static evaluator. The static evaluator gives you, for any FEN, a value representing its objective evaluation of the position. Traditionally, positive numbers mean White is better, negative numbers mean Black is better. What could a static evaluator look like? It could, for instance, be something like: count all white pawns, subtract the number of black pawns and multiply by 10. In this case, both a position with 3 white pawns and 2 black pawns and a position with 4 white pawns and 3 black pawns will give you a value of +10. Obviously, the static evaluators of real chess engines are much more complex than that, having hundreds of elements. For instance, the main evaluation function of Stockfish (currently the leading chess engine) runs to about 800 lines of C++ code, with an element typically coded in 5-10 lines, and this is only the main evaluation function. As I said, hundreds of elements.
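As a toy illustration in the spirit of the pawn-counting example (this is emphatically not how Stockfish does it), a static evaluator could simply count material; the piece values below are the conventional centipawn figures:

```python
# Toy static evaluator: pure material count, positive favors White.
PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

def evaluate(fen):
    """Sum piece values over the FEN's placement field."""
    placement = fen.split()[0]
    score = 0
    for ch in placement:
        if ch.upper() in PIECE_VALUES:
            value = PIECE_VALUES[ch.upper()]
            score += value if ch.isupper() else -value
    return score

# The starting position is materially balanced, so it scores 0:
print(evaluate("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"))
```

A real evaluator would add hundreds of positional terms on top of this material core.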

The search
But the core of a chess engine is the search. Ahhh, the search. To understand how the search works, it is useful to make the following experiment. Imagine you couldn't play chess at all. Zero knowledge of the rules. The only thing you know is how to translate a chessboard position into a FEN. Plus, you are given a move generator and a static evaluator. You start playing. You are White and you have to move. Since you have a move generator, you know what moves you could make, and you have a fantastic idea: you will just try them one by one and keep the one which gives you the best, in other words maximal, static evaluation. If you want to spend some time, with this tool you can do the exercise with Stockfish. It turns out that moving the pawn in front of the king by two squares gives you an evaluation of 218. Cool! Let us move that pawn forwaaaaard!
But hang on! Black can do this too. Black would in turn choose the moves which maximize his value. This means he will go through all possible replies and choose the one which minimizes my value. So it could be, in principle, that the best possible reply to moving my king pawn by two squares gives Black a better value than any possible reply to moving my queen pawn by two squares!
In fact it is so: if I move my king pawn by two squares, the best he can do is to move his queen pawn by two squares and get a -20. In contrast, if I move my queen pawn by two squares, the best he can get is a +20 (again by moving his queen pawn by two squares). So, by minimizing on his turn, he invalidates my assumption that I just have to choose the move after which the static evaluation is best.
This is bad, because it could be that there is still another one of my 20 possible first moves that is even better, in the sense that it makes the minimizing task for Black even harder.
Now that we think about it, we should also consider that we could reply to his replies, again trying to maximize our value with our second move, so we have to take this into consideration too. And he could reply by minimizing, and so forth and so on.
This loosely described algorithm is called minmax and is the basis of chess engines. You just unroll all possible moves until, say, the 10th reply (technically: ply), look at the values at the end of this unroll, and let both players alternately choose the minimizing or maximizing move. Minmax. Interestingly, there is an equivalent variation called alpha-beta which saves you some of the comparisons.
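The alternation of maximizing and minimizing can be sketched in a few lines. The two callbacks stand in for the move generator and the static evaluator described above; the tiny hand-built tree and its leaf values are a hypothetical example, not a real chess position:

```python
def minmax(node, depth, maximizing, children, evaluate):
    """Return the best achievable value, alternating max and min."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        return max(minmax(k, depth - 1, False, children, evaluate) for k in kids)
    return min(minmax(k, depth - 1, True, children, evaluate) for k in kids)

# Toy tree: two first moves ("a", "b"), two replies each; leaves
# carry static evaluations from White's point of view.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
leaves = {"a1": -20, "a2": 218, "b1": 20, "b2": 35}

best = minmax("root", 2, True,
              children=lambda n: tree.get(n, []),
              evaluate=lambda n: leaves[n])
print(best)  # 20: White picks "b", because Black would answer "a" with -20
```

Note how move "a", despite containing the juicy 218 leaf, loses to the minimizing opponent; this is exactly the trap of the naive one-move maximization described above.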

All competitive modern chess programs use some form of alpha-beta with variations and refinements. In this sense, they do use brute force, in the very end. They just unroll the moves and do minmax. More brute force is not possible.

Now, if you do the math, you will notice that at every reply you can choose between 20 and 30 moves. This means that the possible positions after, say, 20 plies number at least 20^20, which is much more than you could store on your computer, even if every position and all information about it would fit in one bit. Alpha-beta gives you some more bang for the buck, but not that much.
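If you want to see the explosion in numbers, here is the back-of-the-envelope calculation, assuming (as above) a branching factor of 20 and one bit of storage per position:

```python
# Positions after 20 plies at ~20 moves per ply, and the storage
# they would need at one bit each.
positions = 20 ** 20
terabytes = positions / (8 * 10 ** 12)  # bits -> terabytes
print(f"{positions:.2e} positions, about {terabytes:.0e} TB at one bit each")
```

That is on the order of 10^26 positions, some ten trillion terabytes; no hardware of 1994, or of today, comes anywhere close.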

So, it was thought that no computer would ever be able to reach the depth of a chess master. Oh, were they wrong...

Sunday, May 06, 2018

What is a proof?

I was at lunch with some colleagues a few days ago, and we started discussing maths. Predictably, the conversation turned to the quite specific concept of proof used by mathematicians. I had some insights into the reasons for so many misunderstandings between mathematicians and others. I think many of them arise from a different understanding of the word "proof".

I want to exemplify this with the classical result about the cardinality of the power set, so let us dive into the math.

First, the power set ${P}(A)$ of a given finite set $A$ is the collection of all possible subsets of $A$.

If $A$ is, say, the set $\{me, you\}$, the power set will contain:
  1. the empty subset $\emptyset$
  2. the subset containing only $me$
  3. the subset containing only $you$
  4. the subset containing both $me$ and $you$
It is a legitimate and somewhat interesting question: how many elements does the power set of $A$ have, if $A$ has $n$ elements? Such counting questions are very common in math and constitute the basis of combinatorics. Let us look at how we can come to a solution of the problem and, even cooler, to a formal proof that the solution is correct.

Examples
The first step is, more often than not, to list some examples. For instance, if the set $A$ contains only one element, the power set will contain:
  1. the empty set
  2. the subset consisting of that element
We now have an example for a 1 element set (2 subsets) and for a 2 element set (4 subsets). What about a 3 element set?

By the way, we have implicitly agreed on the working assumption that the cardinality of the power set depends only on the cardinality of the original set. This is ok; if it turns out to be wrong, we will change it. (But yes, it will turn out to be right.)

Back to our example of our 3 element set $\{me, you, the\ others\}$. We have all which we have considered before:

  1. the empty subset $\emptyset$
  2. the subset containing only $me$
  3. the subset containing only $you$
  4. the subset containing both $me$ and $you$
and in addition, all of those plus $the\ others$

  1. the subset containing only $the\ others$
  2. the subset containing only $me, the\ others$
  3. the subset containing only $you, the\ others$
  4. the subset containing both $me, you$ and $the\ others$
This totals 8 elements. Let us summarize: for 1 element, 2 subsets; for 2 elements, 4 subsets; for 3 elements, 8 subsets. This looks suspiciously like $2^n$, with $n$ the cardinality of $A$. This will be our working hypothesis.
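Since we are dealing with small finite sets, the working hypothesis can also be checked by brute force; here is a quick sketch (the helper name `power_set` is mine):

```python
from itertools import combinations

def power_set(elements):
    """Enumerate all subsets of the given collection."""
    subsets = []
    for r in range(len(elements) + 1):
        subsets.extend(combinations(elements, r))
    return subsets

for n in range(4):
    print(n, len(power_set(range(n))))  # 0 1, 1 2, 2 4, 3 8
```

Of course, a computer check for small $n$ is evidence, not a proof, which is precisely the point of the rest of this post.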

Insight
Having formulated our working hypothesis, we should ask ourselves whether it can possibly be right. Upon thinking about it, we recognize this: for every subset in our collection, every single element of $A$ can be present or not. Since we have $n$ elements, we must make $n$ binary decisions of whether to include each element in the subset or not. This gives $2 \times 2 \times ... \times 2$ possibilities. Yes! This is $2^n$. This is our insight, and typically it is enough if you are a physicist.
But it is not a proof. It is not a proof because, to be a proof, it must be formalized in a standard way using the laws of logic. This is important, since this gives all mathematicians the chance of recognizing a proof as such.

The proof
Let us prove it, then. The standard way to prove statements about natural numbers is the method of induction, which we will follow step by step. First we formulate our theorem in detail.

If $A$ is a finite set with $n$ elements, then its power set $P(A)$ has $2^n$ elements.

We prove that this is true for 0. If the set $A$ is empty, then it has 0 elements and exactly one subset, which is the empty set. $1 = 2^0$. Check.
To avoid being caught in misunderstandings and quarrels about 0 and empty sets, we will prove it also for a single element. This is not required, but we do it anyway. If $A$ has a single element, its subsets are the empty set and $A$ itself. $2 = 2^1$. Check.

Now let us go to the induction step. Let us assume we know that if $A$ has $n$ elements, then there are $2^n$ distinct subsets of $A$. Let us reason about a hypothetical set $B$ with $n+1$ elements and pick one arbitrary element of $B$, which we will call $b$. Let us remove $b$ from $B$ to obtain $B'$, a set of obviously $n$ elements. We know from the induction assumption that $B'$, having $n$ elements, has $2^n$ subsets, constituting the set $P(B')$. Let us now consider the set $X = \{S \cup \{b\} \mid S \in P(B')\}$.

Interestingly, $X$ and $P(B')$ are disjoint, as every set in $X$ contains $b$, which by construction is absent from every set in $P(B')$. So, the cardinality of $X \cup P(B')$ will be the cardinality of $X$ plus the cardinality of $P(B')$.

It remains to show that
  1. the cardinality of $X$ is $2^n$
  2. $X \cup P(B') = P(B)$
For 1., we should give a bijective function from $P(B')$ to $X$, which turns out to be exactly the prescription with which we constructed $X$ in the first place. This kind of clerical task is typically left to the reader as an exercise, and we will comply with the mathematical tradition.

For 2., we notice that if $Y$ is a subset of $B$, it either contains $b$ or not. If not, it is in $P(B')$ and thus in $X \cup P(B')$. If yes, it equals $(Y \setminus \{b\}) \cup \{b\}$. Now, $Y \setminus \{b\}$ consists only of elements of $B$ apart from $b$, so it is in $P(B')$, and when we unite sets in $P(B')$ with $\{b\}$ we get, by construction, sets in $X$; so $Y$ is a set in $X$ and thus in $X \cup P(B')$. We have just proved that $P(B) \subset X \cup P(B')$. The converse direction is left to the reader as an exercise, as it follows the same principles. We also do not want to deprive the reader of the incredible feeling of finishing a proof.
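For readers who like to see a construction run, here is a small brute-force check of the induction step on a concrete three-element set, mirroring the names $B$, $b$, $B'$ and $X$ from the proof (a sketch for illustration, not part of the formal argument):

```python
from itertools import combinations

def power_set(elements):
    """All subsets, as a set of frozensets."""
    return {frozenset(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)}

B = {"me", "you", "the_others"}
b = "the_others"
B_prime = B - {b}                      # B' = B without b

P_B_prime = power_set(B_prime)         # the 2^n subsets of B'
X = {s | {b} for s in P_B_prime}       # every subset of B', plus b

print(X.isdisjoint(P_B_prime))         # True: every set in X contains b
print(X | P_B_prime == power_set(B))   # True: together they form P(B)
```

Both checks come out True, exactly as the disjointness and union arguments above require.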

Conclusion
What I hope I could convey is the sheer amount of work needed to go from an insight, giving us just a working hypothesis and some ideas, to a complete mathematical proof. The real question here could be: why bother? Indeed, our insight was right, and the loosely described proof ($n$ binary decisions giving $2^n$ different possibilities) is perfectly fine.

I think the problem is that this kind of insight often fails with more complex arguments, even in seemingly intuitive cases.

Formalized, standardized proofs in maths are a powerful method to fight the natural tendency of the human mind to fall victim to cognitive illusions, and are thus one of the milestones of human thinking!


Wednesday, May 02, 2018

White King, Red Queen: selling the chess soul, part II.

We just left 1979, with D. Hofstadter being skeptical and not at all enthusiastic about chess computers. Unimpressed by the opinions of the American scientist, a small company in the German city of Munich had, at exactly the same time, just hired two programmers to bring to market the first portable chess computer.

Chess was quite popular at that time. The famous Spassky-Fischer match of 1972 had been an incredible worldwide drama. Subsequently, Fischer started down his path into madness and was deposed, and the rivalry between the two Soviet giants Karpov and Kasparov began.

So, a short time after the introduction of the first personal computers, in the middle of a worldwide chess pandemic, you could buy a small plastic box which was able to play chess at the level of a mediocre, but loyal and always available, club player. It was called Mephisto I.
Mephisto I was a real blockbuster. You could buy it for 500 DM (roughly 500 € today), well below the price of a personal computer back then. It would help you go through tactical variations in post-mortem analyses of games during chess-club evenings. They sold 100,000 units per year in Germany alone. The company producing the Mephisto, Hegener + Glaser AG, enjoyed 10 years of almost complete monopoly over the chess computer market, and the financial reward coming with it.

And, as it often happens, they were killed by their own inertia. The market was changing: more and more chess computers were available, the once esoteric mystery of alpha-beta search was well known, and, most importantly, the 486 entered the scene. A 486 with a decent chess program could reach a chess level worthy of a national master. You couldn't bring it to the club, right. But for correspondence players it was a cheap insurance against tactical blunders. Chess programs entered the life of serious chess players, whose moral authority dictated that the Mephisto and all portable computers should be abandoned by the weaker chess players, lest they be condemned never to learn chess in their entire life. So they were abandoned.

Why didn't Hegener + Glaser AG see this coming? They had the professional chess programmers; they should have noticed at the latest in 1989, when no chess computers entered the world computer chess championship, or in 1991, when Mephisto lost the title to a software version of itself. Nobody knows.

H + G survived until 1994, and here we leave them and their 28 million DM of debt as a further victim of the endless race ordered by Her Majesty the Red Queen.