Lap(l)aciano: algoritmico

Visualizzazione post con etichetta algoritmico. Mostra tutti i post

domenica, agosto 12, 2018

A word about the AI hype in chess

Some months ago, many of you have probably heard how a neural network trained from scratch in an afternoon by Deep Mind defeated the strong chess engine in the world, Stockfish.

Deep Mind get the hype started

This legit hype was started by Deep Mind, a subsidiary of google strong in artificial intelligence, published the paper "Mastering Chess and Shogy by Self-Play with a General Reinforcement Learning Algorithm".

In this paper, they demonstrate a general learning algorithm which was able to train a a neural network which would power a Monte Carlo Treee Search based chess engine to such a level to be competitive in chess with top notch engines. They measured competitiveness in chess by letting their engine playing a 100 games match against the back then latest release of Stockfish, Stockfish 8, the world-leading chess engine, maintained by a team of volunteers (disclaimer: I am part of this team) . It won by a convincing margin of 28 wins, 72 draws and 0 losses, in their own words,

calling into question the widely held belief that alpha-beta search is inherently superior in these domains

I do now want to dispute this call by the Deep Mind team, but it could be done based on the match condition. It can be observed that, if one of the goals of this claim was to generate a hype, then it succeeded.

Open source takes over

Following this paper, a team of developer decided to start an open-source project to replicate the results. The ultimate goal was (and is) to show that this calling into question is a well-founded claim. They had of course some work to do: the paper by Deep Mind contained many details, but not all of them. Also, one of the criticisms was the Deep Mind engine was not a real UCI-compliant chess engine, and wouldn't have been able to participate in chess computer tournaments. So they had to replicate the scientific part of the paper, and also make a real chess engine out of it, being able to compete in chess tournaments. In that, they were supported by nonetheless than Gary Linscott, the gifted programmer who built Stockfish testing framework, see a previous post.
One very important remark: this neural network based engines need GPUs to work well, as the generic type of neural network used there are specifically designed to make use of the massive matrix multiply capabilities of GPUs.

LC0 go to TCEC

After some weeks in the project, the LC0 team (this is the name of the open-source project) was invited to send their program to TCEC. TCEC stands for top chess engine competition: it is a league where all chess engines can compete on standardized hardware. When invited, they start in division 4 and they can work themselves up to division 3,2,1,premier. The first two engines in division premier fight a titanic superfinal over 100 games. Last season (number 12) was won by Stockfish 9. Since the divisions are played out serially, for an engine it is possible, in principle, to start in division 4 and, in the same season, climb up all the divisions and win the Superfinal. TCEC management, however, decided not to invest in a GPU, yet as they were not sure of whether LC0 would keep existing and also for the speed in which they had to organize it. To make the story short, LC0 finished last.

LC0 hype grows

This did not discourage the team. They knew that the lack of a GPU would seriously cripple the engine and so they continued working, recruiting tester around the world for playing millions of games. In the meanwhile, private people would test the engine and publish the results online. In particular, kingcrusher, a well known and respected chess expert with 100k youtube follower, decided to back the project posting videos and analyses of this awesome new engine. The were invited back to TCEC, which was able to organize access to a server with 2 high-end GPUs. It was predicted that LC0 would rush through the divisions and challenge Stockfish itself. To add more drama, some days before the start of Season 13, it was announced that a second neural network engine would enter division 4. After some speculations it turned out that it was a more or less a derivative of LC0 and arguments followed. Nevertheless, this second engine, DeusX, was allowed to enter TCEC too. Expectations were running high for the new era of computer chess. The hype was at its maximum.

Full power neural network at TCEC

As I write now, division 4 ended some days ago. With a sort of an anticlimax, after some games, it was clear that, although LC0 and DeusX were very strong engines, they would not rush through divisions. LC0 finished first, and DeusX second, qualifying by 0.5 point margin for division 3. Division 3 is in course now: between the divisions, the admin of the site was started, and the new one had some experience with LC0 itself and was able to improve the settings of the dedicated server to give the two neural networks some more power. They also submitted updated engines. In spite of the efforts, after 20% of division 3 played out, they float in the mid-range of the league. It is somewhat expected that the division will be won by Ethereal, another very young, though classical engine, developed by a 2 persons team. The second slot for qualification is still open. Currently it seems plausible though not sure that one of both neural engines will win a ticket.

What do we learn from the hype

Maybe neural networks are the future of computer chess. Maybe not. Time will say. As usual, there is no free lunch and you shouldn't run where they say you get one. And by the way, don't do momentum trading. Avoid diets. And most important, if you start from scratch, don't expect to succeed soon.

domenica, luglio 22, 2018

White King, Red Queen: the matrix. Part VI.

They are coming.

In 2009 Hiarcs 13 on a smartphone dominated a world-class chess tournament with humans. At latest in the moment, it should have became clear that the world of computer chess should get separated by the world of human chess.

Indeed they were: as an example, in 2006 a group of compute chess enthusiasts founded the CCRL rating list, dedicated to testing of chess engines in controlled hardware conditions. They would donate their computers to let engines playing against each other just for the pleasure of compiling a list of the strongest chess computers. This list, still maintained after 12 years, contains 353 engines as of July 22th, 2018, many of them tested in multiple versions.

The arena

We have seen it often: if an appropriate, fair arena is set up, with a meaningful reward, people will gravitate towards it and, given time, the arms race will begin. In the 2000ies, many of such arenas were being made available, in the form of rating lists, computer chess tournaments and the like. The arena was set. As for the reward, chess engines were now very hot even for professional players and you could still sell them as programs for up to 100 $. If you think that you can squeeze a chess engine under 10k-20k lines of code, and you only need to comply with a specified protocol, it sounds like the perfect challenge for algorithms enthusiasts. The realm of nerds, so to speak. And the nerds came. In few years, we had many engines fighting for the title of strongest chess program: Rybka, Houdini, Komodo. And then came Stockfish.

The crowd

Until 2013, Stockfish was a good, but not top open-source chess engine. Until Gary Linscott, one of the contributors, constructed a distributed test framework for the engine. In this framework, anybody would be able to write modifications of the code and test this against the currently best version. How would this be tested? People from all over the world would donate their machine and a program would take care of let those computers playing games of the proposed modification (called patch) against the current best version (called master). After some dozen of thousands of games, it would be decided statistically whether the modification could be applied. A round of human based checks for code quality and you could get your patch included into master in less than a week! In the first 5 years of its life, Stockfish had 16 people contributing code. In the second 5 years, 94. Furthermore, since the patches would be tested with some very high predefined criteria, the risk of introducing errors just dropped to almost 0. Result: in less than 1 year Stockfish was sent to the top of all rating lists.

The arms race

One further piece to the puzzle: Stockfish is an open source engine. This means whoever can go and read the source. Furthermore, all discussion is done in public in a google forum, so everybody can know the background of decision and design. On one hand, we have the sheer amount of ideas and computing power available on the its framework. On the other side, these ideas are public, so all proprietary engines can go a dig for ideas which will work also in their context. In the subsequent years a true arms race between three competitors, Stockfish, Komodo, Houdini, followed. They would continuously exchange the crown of the strongest engine in the world, informally assigned as the winner of the last TCEC tournament. This continues until today.

Clouds on the horizon

Some of you will have known that chess engines were against on the main page some weeks ago. A team from Deep Mind, the AI subsidiary of Google, trained a neural network able to compete with Stockfish on custom hardware. The history repeats: first a year-long quest to have a alpha-beta engine beating the human chess world champion and its biological neural network. Now a year-long quest (to come) for an artificial neural network to regain the crown. We'll see how it ends! In the meanwhile, an open source team is trying to replicate the results of the Google team in an effort call Leela Zero. Again, one of founder of this project is Gary Linscott, the former Stockfish maintainer which created its testing framework.

Why the matrix?

Interestingly, now the challenge is not anymore between persons writing programs. But between engines and their development paradigm. In a way, the engines are using us, the humans, to improve themselves such that they still will be operated in the chess engines arena. As I said: the matrix.

lunedì, maggio 14, 2018

White King, Red Queen: a technical primer about trees, part III.

In the previous posts (I, II), we have seen the arms race around computer chess started. We have seen the first ideas about the possibility of building a true chess computer. We witnessed a German company doing it and being displaced by the newborn 486 and we left the an army of chess programmers getting ready to assault Gary Kasparov dominance over the world of chess.

At this time, we are talking about 1994, many people not directly involved in technology still wouldn't think that the end of human chess dominance was only 3 years away. I witnessed exactly this period as a young club player and I have vivid memories from that time. Indeed, the explanation I give you later is more or less what I got back then in 1996 by a chemistry professor interested in game theory and it still holds true.

The main reason for the supposed inability of computers, was that they were thought to purely use "brute force" to find the best moves, whereas master were known to be able to have long-term understanding of the game. The true skill of the masters, so the mainstream, was based on being able to recognize positional advantages which would bear fruit 20 moves or more later, and such a far horizon was thought to be unreachable by standard alpha-beta engines. Ooops, I did it. A technical word. I think the moment has come for you, my faithful reader, to really understand how a chess engine thinks. It will be quick, it could even be painless, so let us look at it in more detail.

Down the very basics, a chess engine is composed by 3 fairly independent parts: the moves generator, the static evaluator and the search function.

The moves generator

First a remark: chess programmers often talk about chess positions as of FENs. These are just shorthands for describing things like "White has a king on the right bottom square h1, a pawn on the square c4 and a bishop on f2. Black only has a king on h8". Here is the FEN for the starting position

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1

The move generator gives you for any possible FEN all legal moves which can be done by the side to move.

The static evaluator

A very important element is to quantify how a position looks better than another one. This is done by the static evaluator. The static evaluator gives you for any FEN a value, representing its objective evaluation of the position. Traditionally, positive numbers mean white is good, negative numbers mean black is good. How could a static evaluator look like? It could be for instance something like: count all white pawns, subtract the number of black pawns and multiply by 10. In this case, for instance, both a position with 3 white pawns and 2 black pawns and a position with 4 white pawns and 3 black pawns will give you a value of +10. Obviously static evaluators of real chess engines are much more complex than that, having hundreds of elements. For instance Stockfish main evaluation function (Stockfish is the leading chess engine currently) has ca. 800 lines of C++ codes, with an element typically be coded in 5-10 lines, and this is only the main evaluation function. As I said, hundreds of elements.

The search

But the core of a chess engine is the search. Ahhh, the search. To understand how the search works, it is useful to make the following experiment. Imagine you couldn't play chess at all. Zero knowledge of the rules. The only thing you know is how to translate a chessboard position in a FEN. Plus, you are given a moves generator and a static evaluator. You start playing. You are white and you have to move. Since you have a move generator, you know what moves you could do, and you have a fantastic idea: you will just try them one by one and keep the one which gives you the best, in other words: maximal, static evaluation. If you want to spend some time, with this tool you can do the exercise with Stockfish. It turns out that moving the pawn in front of the king by two squares gives you a evaluation of 218. Cool! Let us move that pawn forwaaaaard!

But hang on! Black can do this too. Black would at its turn chooses the moves which maximize its value. This means, he will go through all possible replies and choose the one which minimizes my value. So it could be, in principle, that a possible reply to moving my king pawn by 2 squares give black a better value than all possible replies to moving my queen pawn by 2 values!

In fact it is so, if I move my king pawn by 2 squares, the best he can do is to move his queen pawn by 2 squares and get a -20. In contrast, if I move the queen pawn by two squares, he can get is a +20 (again by moving its queen pawn by two squares). So, by minimizing on its turn, he invalidates my assumption that I just have to choose the move after which the static evaluation is best.

This is now bad, because it could be that there still another one of my possible 20 first moves that it is even better, in the sense that it makes the minimizing task for black even worse.

Now that we think about it, we should also consider that we also could reply to its replies, again trying to maximizing our value with our second move, so we have to also take this is consideration. And he could reply by minimizing and so forth and so on.

This loosely described algorithm is called minmax and is the basis of the chess engines. You just unroll all possible moves until, say, the 10th reply (technically: ply) you look at the values at the end of this unroll and both players choose alternately the minimizing or maximizing move. Minmax. Interestengliy, there is a variation which is equivalent and which is called alpha-beta which saves you some of the comparisons.

All competitive modern chess programs use some form of alpha-beta with variations and refinements. In this sense, they use brute force, in the very end. They just unroll the moves and do minmax. More brute force is not possible.

Now, if you do the math, you will notice that at every reply you can choose between 20-30 moves. This means, the possible positions after, say, 20 plies are 20^20, which is much more that you could store on your computer, even if every position and all information about it would stay on one bit. Alpha-beta gives you some more bang for the bucks, but not that much.

So, it was though that no computer will be ever able to reach the depth of the chess master. Oh, were they wrong...

domenica, aprile 19, 2009

Al Mercato Nero

Ho varie cose su cui vorrei scrivere nei prossimi giorni: spero di riuscire ad evaderle tutte prima che il semestre cominci: varie conferenze a cui sono stato nell'ultimo mese, un'altra che stiamo organizzandon con un collega, l'ultimo paper che ho messo su arXiv, il corso che terrò questo semestre.

Comincio con la bizzarra installazione teatrale a cui ho partecipato ieri. La compagnia teatrale del teatro cittadino di Friburgo ha realizzato un progetto con delle scuole, in cui gli studenti dovevano realizzare delle rappresentazioni teatrali che trattasero di alcuni temi neuroscientifici: stimulazione cerebrale profonda, interfaccia cervello-computer ed altre amenità.

Ieri hanno presentato questi spettacoli, insieme a conferenze e dibattiti diretti da alcuni specialisti del settore. La sera è stato organizzato dalla compagnia un mercato nero del sapere. 48 esperti hanno offerto delle conversazioni di 1/2 ora su un tema a loro scelta al pubblico. Il pubblico poteva scegliere l'espero, e per 1 euro, parlare con lui del tema proposto.

Il mio tema era "Conway's game of life: l'ultimo gioco dove l'uomo è ancora superiore alla macchina". La mia idea era di partire dalla notizia che da pochi mesi esistono programmi giocatori di go in grado di competere con giocatori professionisti. Questo è stata una grande soprpresa per me, perchè nell'ambiente dei giocatori di strategia si andava mormorando che, a differenza degli scacchi, il go era insolubile per i computer, perchè non è attaccabile in maniera brute force, e per altri motivi più profondi. Da quello che ho capito essenzialmente è difficile scrivere un algoritmo di valutazione.

Poi volevo introdurre il game of life: spiegarne le regole, e giocare alcune situazioni semplici, per far capire che è in teoria è possibile costruire un computer composto da automi cellulari.

L'idea era di poi passare a Turing, Gödel, e argomentare che il computer non può, a causa del teorema di Gödel, superare l'uomo nel game of life.

Cosa a cui non credo fino in fondo, a dire il vero. Ma tant'è: tutte e due le conversazioni sono andate benissimo, nonostante il primo ragazzo fosse uno scolaro di 17 anni e il secondo uno studente di storia al 3° anno.

Di più: certamente un'esperienza ispiratrice e che ripeterò, qualora ce ne sia ancora l'occasione.

mercoledì, gennaio 21, 2009

MCCN X

Giovedì scorso ho spiegato qualcosa sui grafi casuali. Questi sono variabili aleatorie con valori in insiemi di grafi; di solito vengono realizzati utilizzando un qualche tipo di algoritmo casuale.

Faccio un esempio: disegnate N nodi su un foglio e scegliete un valore p fra 0 e 1, che rappresenta la connettività attesa del grafo.

Per ogni lato possibile (sono 0.5N(N+1), quindi prendetevi un po' di tempo) estraete un numero casuale uniformente distribuito tra 0 e 1. Se non avete un generatore di numeri casuali a portata di mano (basta Excel) scegliete un numero fra 1 e 6 al posto di p, e tirate un dado. Se questo numero è minore di p, disegnate il lato che state esaminando. Altimenti no. Il disegno che ottenete dopo aver tirato per tutti i possibili lati è un grafo casuale alla Erdös-Renyi.

La cosa interessante: c'è gente che afferma che le connessioni cerebrali sono, più o meno, un grafo casuale di questo tipo.

domenica, dicembre 07, 2008

Operazioni senza riporto

Questo fine settimana l'ho passato ad insegnare matematica alla cugina della mia ragazza. E mentre le spiegavo in ancora un'altra maniera la divisone polinomiale, ho capito a cosa servono i polinomi, e anche perchè i polinomi sono meglio dei numeri.

Consideriamo un numero N. Evidentemente, N si può scrivere come

$a_n10^n+ a_{n-1} 10^{n-1} + \ldots + a_0$

dove gli a_n altro non sono che le cifre del numero in base 10. Si noti ch la decomposizione è unica, dato che i coefficienti devono essere naturali fra 1 e 10. In efftti, contando in base 10,

$N=a_na_{n-1}\ldots a_0$

Ovviamente si può fare lo stesso sostituendo 10 con 2 (notazione binaria) o con qualsiasi altro naturale venga in mente.

A cosa serve? Per quale motivo lo scrivere i numeri in questa maniera è stata una delle più grandi conquiste dell'umanità? Principalmente a questo: se abbiamo fissato una base, è possibile addizionare, moltiplicare o dividere numeri fra loro semplicemente applicando degli algoritmi alla sequenza di cifre che lo rappresentano.

Il problema di tutti questi algoritmi è il riporto. Sia sommando, che moltiplicando, che dividendo numeri fra loro, è possibile che i coefficienti di una certa potenza di 10 (o di 2, o della base scelta) si sommino in maniera tale da portare ad un cambio di potenza.

Mi spiego con un esempio. Nella somma di 5 e 5, ognuno dei due sommandi e una potenza zeresima di 10, mentre il risultato è una potenza prima di 10. Le potenze della base si sono mischiate! Per tenere conto di questo fatto è necessario "fare il riporto", cioè tenere conto di come le potenze di un certo grado si sono sommate per dare origine ad una potenza di grado superiore.

Nei polinomi tutto questo non esiste! Per prima cosa notiamo che scrivere un polinomio mi da una rappresentazione parametrica dei numeri, in cui la base scelta è un parametro. Per capire cosa voglio dire, si osservi che la forma di un polinomio

$a_nx^n+ a_{n-1} x^{n-1} + \ldots + a_0$

e analoga alla rappresentazione di un numero in somme di potenze di 10, se sostitutiamo i coefficienti interi fra 0 e 9 con coefficienti reali e 10 con x.

Il grande vantaggio dei polinomi è che le potenze della variabile non possono sommarsi per ottenere un elemento di grado maggiore! Quello che voglio dire è che

$ax +bx \quad \mbox{differisce da} \quad x^2$

per ogni scelta dei valori a e b. Nel senso che il primo polinomio non è mai uguale al secondo nel senso dell'uguaglianza di funzioni, anche se possono assumere lo stesso valore per un certo valore di x.

A rigore, dunque, addizione, moltiplicazione e divisione di polinomi sono più facili di quelle fra numeri, dato che possono essere eseguite senza riporto. Un esempio? Voglio eseguire la moltiplicazione

(x^3 + 3x^2 + 5)(2x^2 -5 x)

A scuola avreste insultato l'insegnate, dato che sono un trinomio e un binomio: fanno sei termini da sommare appropriatamente! Eseguiamo invece la moltiplicazione dei coefficienti senza riporto

$\begin{array}{ccccccc}&&1& 3 & 0& 5 & \times \\&&&2&-5& 0 & =\\\hline\\&&0& 0 & 0& 0\\& -5& -15& 0& -25 & - \\ 2 & 6& 0 & 10& -&\\\hline\\ 2 & 1& -15& 10& -25& 0\end{array}$

Il risultato della moltiplicazione dei polinomi è dunque

2x^5 + x^4 -15x^3 +10x^2 -25x

Eseguire la moltiplicazione nella maniera convenzionale porta, ovviamente, allo stesso risultato.

C'è un prezzo da pagare per questa semplicità: il problema dell'unicità della rappresentazione di un numero è molto più complesso che nel caso delle basi fisse.

martedì, settembre 16, 2008

De memoris

In questo post sul blog che tengo con due miei cari amici si è sviluppata una interessante discussione sulla differenza fra la memoria procedurale e la memoria dichiarativa.

Stavo riflettendo che in matematica c'è un'interessante analogia che riguarda la definizione delle funzioni.

Cos'è una funzione? (Versione dichiarativa)

La risposta rigorosa, diciamo Bourbakistica è questa.

Definizione
Una funzione è una terna (A,B,R) dove A e B sono insiemi e R è un sottoinsieme del prodotto cartesiano A x B tale che se (a,b) e (a',b') sono in R, allora b=b'

L'idea sottesa a questa definizione è che una funzione altro non è che una lista, che accanto ad ogni valore della variabile indipendente (gli a in A) riporta uno ed un solo valore della variabile dipendente (il b in B tale che (a,b) in R).

Definire una funzione in questa maniera è rigoroso e utile. Tuttavia non corrisponde alla maniera in cui pensiamo. Quando definiamo una funzione, infatti, pensiamo a delle operazioni da eseguire, e non ad una lista di valori!

Cos'è una funzione? (Versione procedurale)

La risposta intuitiva, empirica, è quella che si dava nel 1800. Una funzione è una legge di associazione. Una funzione è definita se per ogni valore x so calcolare f(x).

Vantaggi e svantaggi

Cominciamo coi vantaggi della maniera procedurale. I vantaggi della maniera dichiarativa li sappiamo: è la maniera rigorosa con la quale si fa matematica.

Però pensiamo adesso di voler definire la funzione quadrato, cioè f(x):=x**2. Se vogliamo definirla in maniera dichiarativa, dobbiamo decidere in quali insieme tale funzione deve vivere. Dobbiamo scrivere, ad esempio, f:R-->R, f(x)=x**2 se vogliamo parlare del quadrato nei numeri reali, f:C-->C, f(x)=x**2 se vogliamo parlare del quadrato nei numeri complessi. Un po' complicato, no?

Ovviamente possiamo salvarci definendo f:A-->A, f(x)=x*x, dove A è uncampo, ma non è questo il punto. Il punto è che non possiamo dire: sia f(x)=x*x ogni qualvolta abbia senso, perchè staremmo quantificando su l'insieme totale che contiene tutte le strutture algebriche.

Fin qua è solo un problema formalistico, quasi di estesi. Un problema più grave si presenta nel caso in cui alla mia funzione definita in maniera procedurale non corrisponda nessuna funzione dichiarativa. Fissiamo una funzione continua f a valori reali. Decidiamo di definire la funzione g tramite l'operazione

g(x) = lim n(f(x+1/n)-f(x))

Ovviamente questo limite può esistere o non esistere. (O scelto di prendere il limite rispetto a n invece della definizione normale di derivata per evitare polemiche del tipo: ma guarda che non puoi definire il limite di una funzione se la funzione non è definita in maniera dichiarativa, etc...)

Il problema adesso è che non è possibile definire g in maniera dichiarativa, se non applicando una sorta di ragionamento circolare: trovo l'insieme D dove il limite esiste, e poi scrivo

g: D-->R, g(x) := lim n(f(x+1/n)-f(x))

In pratica, la definizione dichiarativa di g deve contenere la definizione procedurale di g. Poco soddisfacente.

venerdì, ottobre 19, 2007

apologia

meglio tardi che mai

mio nonno

la matematica è una cosa strana: m si studia per anni e anni un argomento, e non si capisce mai quale ne sia il senso, e lo si disprezza, e ci si dice "si, dovrei studiare anche questo", ma non se ne ha voglia e si cerca di evitare.

fino a quando, un giorno, d'improvviso, mentre si cerca di dimostrare un risultato per la densità spettrale di sequenze di potenziali d'azioni, si viene fulminati dal vero significato della vita.

in questo caso, delle variabili aleatorie.

consideriamo un insieme finito di numeri reali A=(a_1, ... , a_k). si definisca adesso una successione tramite

$x_n:= a_{n{\rm mod}k}, \qquad n \in \mathbb N$

in pratica percorriamo tutti gli a_j dal primo all'ultimo, e poi ricominciamo. è evidente che tale successione non converge: ha esattamente k punti di accumulazione. tuttavia, se consideriamo il limite secondo cesaro, di cui ho parlato anche l'ultima volta, allora si vede subito che

$\lim^C_{n\to \infty} x_n= \frac{1}{k}\sum_{j=1}^k a_j$

in realtà, si vede subito che non è necessario definire la successione in tale maniera artificiosa. come prima generalizzazione si scelga ad ogni "giro" un nuovo ordine in cui vengono assunti i valori. si vede, quindi, che è possibile definirla in una maniera arbitraria, purchè la densità relativa che i valori valori a_j sia uguale. se le densità sono diverse (si noti che non ho ancora definito cos'è questa densità e che non lo farò), allora il limite secondo cesaro altro non sarà che una media pesata, dove il peso altro non è che la densità del valore in questione.

ora, supponiamo di non volerci fissare su una specifica, per quanto arbitraria, scelta dell'ordine dei valori assunti dalla successione. vogliamo lasciare la massima libertà, e scegliere una successione in maniera in parte algoritmica cioè deterministica, e in parte casuale, cioè stocastica.

per ogni n scegliamo un numero a caso fra gli elementi di A, purchè alla fine (per n grande) siano rispettate le densità relative. cosa abbiamo fatto? non abbiamo fatto altro che definire una successione indipendente di variabili aleatorie X, ognuna di esse avente la seguente distribuzione: con probabilità p_j pari alla densità del numero in questione, viene assunto il valore a_j.

per capire la connessione tra limite secondo cesaro e il valore atteso si consideri ogni successione scelta secondo tale algoritmo come una particolare realizzazione di questa successione di variabili aleatorie. il limite secondo cesaro di tale realizzazione esiste ed è uguale al valore atteso della variabile aleatoria. cioè, indicando con X tale variabile aleatoria,

$\lim^C_{n\to \infty} x_n=E(X)$

qua si vede il vantaggio dell'approccio stocastico: non è necessario fermarsi a variabili aleatorie a valori in un insieme finito, o numerabile. si può assumere che la variabile aleatoria abbia valori reali. e mentre nel caso numerabile sarebbe possibile definire la successione in questione in maniera algoritmica, dato che è possibile assegnare ad ogni valore che può assumere X una densità finita maggiore di 0, ciò diventa impossibile nel caso continuo, rendendo necessario il ricorso al concetto di variabile aleatoria.

prima o poi devo spiegare cosa ha che fare tutto ciò con la densità spettrale di una popolazione neuronale.

mercoledì, ottobre 10, 2007

l'ultima crociata

guarda, ci sono molti tipi di dimostrazione. uno di essi è un metodo a cascata, in cui parti da un numero e riesci a proseguire per tutti i numeri naturali.

p. milella

giusto per portare acqua al mio mulino: gowers fa notare che una dimostrazione índuttiva sui numeri naturali può sempre essere trasformata in una dimostrazione per assurdo.

io vado un po' più in la: dico che la dimostrazione giusta è quella induttiva, e che quella per assurdo è solo una complicazione inutile. e questo perchè una dimostrazione induttiva ha sempre nascosto dentro di se un algoritmo, a differenza di quella per assurdo.

sabato, agosto 18, 2007

machineries

hat der alte hexenmeister
sich doch einmal wegbegeben!
und nun sollen seine geister
auch nach meinem willen leben.

der zauberlehrling

chris chatham segnala questa interessante intervista con steve grand. qualche riflessione sparsa.

strumenti

come spiega bene steve grand, viviamo in un mondo simulato, e siamo coscienti di questa simulazione - anche se non sempre del fatto che è una simulazione. una delle conseguenze di questo dato di fatto abbastanza sconcertante è la disposizione tipicamente umana a costruire strumenti. questi strumenti, infatti, permettono di controllare la realtà, permettendo al possessore dello strumento di risolvere, almeno parzialmente, uno dei grandi problemi dell'essere coscienti della propria simulazione: cioè eventuali discrepanze che si verificano fra quanto noi avevamo simulato nel futuro ed il feedback fornito dalla realtà.

platonismo

questa passione per lo strumento si trasferisce anche in altri campi. quello in cui sono più esperto e quello matematico, e di questo scrivo. teoremi, lemmi e teorie altro non sono, in questo contesto, altro che strumenti astratti che permettono di tenere sotto controllo un mondo non fisico. piccola osservazione: così come nel mondo reale ci si da delle regole per costruire degli strumenti, in maniera tale che essi siano adoperabili da tutti (chi sa il tedesco dia un'occhiata qui), anche nella matematica si è scelto un certo sistema di notazioni e di assiomi standard che permettono a tutti di comprendersi, cosicchè è possibile intraprendere esplorazioni di gruppo - "mathematical stories" è il termine che usa terence tao. questo concetto dell'esplorazione mi è caro: mi permette di dare un significato platonistico all'idea del teorema come strumento.

algoritmi

un posto di particolare rilievo in questa visione del mondo lo hanno gli algoritmi, in cui la natura strumentale del teorema è immediatamente visibile. d'altra parte si corre anche il rischio di considerare strumenti matematici genuini solo gli algoritmi, o, più in generale, espressioni finite, dimenticando l'utilità "esplorativa" di strumenti più astratti. il punto di vista speculare è la tendenza tipica dei matematici astratti ad essere spaventati da calcoli e stime, paura che può spingere persino al rifiutare come matematica corretta il risultato di tali calcoli e stime - bourbaki è un dinosauro la cui testa è troppo lontana dalla coda, si dice abbia detto cartier.

venerdì, luglio 06, 2007

impressioni

legend has it that the discovery was made at sea and that hippasus' fellow pythagoreans threw him overboard.

wikipedia su ippaso da metaponto

ieri sono stato ad un talk di barbara hammer: era un overview su tutto quello che si può dire sulle reti neurali ricorrenti.

di questo bellissimo talk mi ha in particolare colpito una cosa che lei ha detto di sfuggita e che già sapevo; cioè che le macchine di turing non possono manipolare i numeri reali e che, se potessero, allora porebbero fare veramente qualcosa di più. mi si perdoni il linguaggio poco scientifico, ma sono le 8 di mattina...

quindi siamo andati alla grigliata della facoltà di informatica, e seduti ad un tavolo, gustandomi il mio buon tannenzäpfle, ho potuto assistere ad un'accesa discussione fra lei, g.p. e c.t. sul problema dei numeri reali; che, afferma sempre g.p., tanto reali non possono essere, dato che si può sopravvivere benissimo senza di essi, approssimando tutto in maniera razionale.

a quel punto mi sono ricordato di una conversazione con r.n. in cui lui mi spiegava che, se si decide di essere rigorosamente costruttivisti, allora ogni funzione su R è automaticamente continua. purtroppo non ricordo come funzionava la dimostrazione. che ci sia una connessione frai due fatti, si vede subito: accettare di approssimare tutto con razionali, è simile al punto di vista costruttivista dell'accettare solo numeri creati da un algoritmo.

e poi mi è venuto in mente che quando si esegue l'analisi di fourier, la continuità della funzione influisce sulla velocità di convergenza della sua serie di fourier. da questo punto di vista, dato che la serie di fourier è un ente fisico, o almeno oggi mi piace pensarla così, se si accetta questa serie di impressioni e analogie come fossero un ragionamento, sembra quasi che ci sia una possibilità di verificare fisicamente l'esistenza dei numeri reali.

giovedì, giugno 07, 2007

gödel (II)

there are more things in heaven and earth, horatio,
than are dreamt of in your philosophy.

hamlet

riguardo al teorema di gödel, c'é un`altra questione che mi stupisce, probabilmente perché sono un analista e non un logico.

ogni sistema assiomatico ha evidentemente infinite proposizioni indecidibili. tuttavia, aggiungere una proposizione, o la sua negazione, al sistema assiomatico, restringe immediatamente il campo di queste proposizioni indecidibili. questa operazione é evidentemente associativa, nel senso che il sistema assiomatico A U {p,q} ha lo stesso campo di proposizioni indecidibili che A U {p} U {q}.

in pratica, nonostante si possano aggiungere assiomi uno alla volta, man mano che si trovano nuove proposizioni indecidibili, tuttavia un sistema assiomatico completo - che definisco adesso come il limite di un tale processo - é un insieme e non una successione di assiomi.

venerdì, aprile 27, 2007

grafi casuali

bizzarrie della matematica: se si cerca di spiegare ad un profano cos'è un grafo casuale, allora si ricorre ad un punto di vista algoritmico. si spiega, in pratica, come si trova una realizzazione di un grafo casuale. peró, e allora il matematico si gira verso i suoi colleghi e sorride ironicamente per sottolineare la sua superioritá rispetto al profano, ignaro della Vera Scienza, peró in veritá un grafo casuale altro non è che uno spazio di probabilitá.

giusto per farsi capire, no?

Lap(l)aciano