PDA

View Full Version : It is Probably Time for the Stochastic Calculus


Macrobius
08-07-2010, 12:07 AM
This is a three part series on the Stochastic Calculus, among other topics, and why you should care. WNs should care about the Stochastic Calculus and other subjects introduced here because they provide a high-end use of Mathematics, at which our race excels, in a number of practical situations that may provide immediate benefit if researched and exploited. We can't all be particle physicists and have university positions or government grants to fund our research. However, much of the apparatus of Physics and good part of recent Mathematics are directly applicable to practical fields such as traffic flow on the internet or the order in which computations are executed. We must learn, collectively, to separate the presence or absence of patronage from progress in research.

It is important to have timely information about the 'next big thing'. In the 80s, it would be nice to know all about chaos theory and dynamical systems, say, or to be well up on string theory. Today, there is a convergence of many fields into a coherent way of looking at dynamical systems, which builds on earlier work, but has many surprises and applications *outside* the conventional bounds of laboratory research.

This series will highlight, in particular, the spectacular connections that are emerging between the following fields: Percolation Theory, Stochastic Process Limits (Brownian motion is one example), applications of the Stochastic Calculus to problems in Finance and Economics, such as the valuation of 'Derivatives', the 'Long Tail' of the Internet, stability and performance of packet-switched networks such as AlohaNet and Ethernet (queuing theory), Heavy tailed distributions such as the Levy alpha-stable, Linguistics and Group Theory (the theory of syntactic monoids, freely generated lattices, 'generative' grammars in computer Science), the new field of 'Ludics' -- think computer games, Reliability Engineering and Extreme Value Theory, non-equilibrium Statistical Mechanics, Path Integral methods of computation, Conformal Field Theory (CFT), Stochastic Loewner Evolution (SLE), and Quantum Gravity (QG).

The mathematics discussed is the same used in particle physics, and even more in certain areas of Quantum Field Theory, General Relativity, and Cosmology. Yet, the exciting thing is these formalisms apply equally well to everyday problems in Computer Engineering, of the sort routinely assigned in IT departments -- say the application of Queuing Theory to Capacity Planning [Denning Buzen analysis is its formal name]. I cannot possibly exhaust the subjects above in a short series of course. What I am hoping to do is show the connections that make these theories, in just the last 10 years, highly convergent and exciting. Real advances and even experiments are now possible that may well elucidate fundamental problems in particle physics, Cosmology string theory, or supersymmetry, only from an unexpected direction -- the everyday Physics of the Internet.

The first part will be rather dilettantish -- no formulas, just talk, to find out what the subject of this series is about.

The second part will be more a Wikiography -- definitions of terms and links to articles. Those articles will certainly have formulas, but can be quickly skimmed to get the sense of it. Wikipedia is an annotated ontology, and learning what all the things are and how they connect to each other is the first order of business. And of course the external links provide food for thought -- and in an unstructured research program, that may be all you need to get started.

The third part, which may extend into multiple parts as needed, will drill into the details.

So much for the Introduction: Let's begin part 1.

1. We must motivate the most controversial part of my claim first. Let us start with a notion of a 'Rewrite System' -- this is a key notion of modern computer grammars, often called Backus-Naur form for writing down a grammar, or in linguistics, 'Transformational-Generative Grammar'.

[url]http://en.wikipedia.org/wiki/String_rewriting_system[/url] [don't click unless you want formulas early]

These are in fact the younger and more general cousin of Groups [as in Group Theory]. There is a generalisation of a Group called a Monoid, and rewrite systems are related to a certain type of monoid -- the 'generation' part of generative grammar is like generating a Boolean Lattice, and the 'transformational' part is like rewriting the strings of symbols. In fact, there is a group of tranformations over the monoid, in mathematical terms.

Thus far, we have some pretty straightforward and maybe even boring standard subjects in computer science. The subject of Semi-Thue systems was worked out in the 20th century. Some of the names here are Emil Post, Noam Chomsky, and Marvin Minsky.

Let's make it more interesting: Lindenmayer systems, also known as parallel rewrite systems or L-systems for short, can generate fractals.

[url]http://en.wikipedia.org/wiki/L-system[/url] [few formulas, lots of pretty fractals to set the mood]

The connection is via 'turtle graphics' of the sort made famous in the children's programming language, Logo. By interpreting the sequence of symbols in the string or grammar as 'take a step' or 'turn 90 degrees' or whatever command, the rewrite rules generate more and more complex paths.

Of course, this sort of 'graph' is exactly what we mean by a [non-random] walk, and such graphs are used in computer science to model -- oh, everything. browsing the web, states of a program or the thread of execution, and not least computer graphics!

It is, of course, a simple matter to make such things 'stochastic' or a random process, which brings us to our next step:

2. Stochastic Process Limits. We have a pretty good handle on 'stochastic process' now -- it is something like staggering around drunk, or deciding to get in line at the grocery store, or deciding to cross the 520 bridge to Seattle at rush hour. Ordinary Calculus can deal with deterministic functions pretty well, but to deal with *stochastic* processes, unless we are very lucky, we will need to invent a new sort of Calculus.

One thing we will be giving up is the ideal conditions much of our statistics is founded on (the kind that works well with well calibrated equipment in a physics laboratory, but seems to break down on the floor of a Stock Exchange or in the Bond Trading Pit). You may recall, in Statistics, there is something called the Central Limit Theorem, which in turn makes most processes end up looking like Gaussians, or Normally distributed. The Normal Distribution is the famous Bell Curve, against which you might have been graded, as a child.

The key assumptions that lead to a process having a gaussian as its stochastic process limit [which is to say, ending up looking like Brownian Motion], is that it be memoryless, with bounded variance and identical distributions at all times. In statistics, this is often summarised IID(N) -- independent and identically disributed, subcase 'normal'. Independent samples will certainly give you 'no memory of the past' except maybe the fact that you are where you are, on the next step. 'identically distributed' means no biases, and no changes in time to the 'width' of the distribution controlling the next step.

There are plenty of ways of 'relaxing' these assumptions -- for one thing, you might suppose there is *no* prior bound on how funky things can get. Certainly, the real world is that way. Thus, over time, the 'variance' of the process may well 'go to infinity' -- that is, the process may be unbounded. Another way is to allow the 'volatility' of the distribution, called in the financial world its 'beta', to change over time. Perhaps you will even draw the variance from two different distributions: one for when Republicans are in power, and one for Democratic years, and have a 'two-level markovian' process for your stock prices.

Or, you might choose any of several methodologies: Hidden Markov Model (HMM), Gaussian Process (GP), 'Particle Filter model', or MCMC - Markov Chain Monte Carlo. These are all methods of solving problems in computer science relating to data mining, signal processing (think van Kalman filters), or machine learning for pattern recognition. In short, the full set of 'interesting applications' in the Military-Financial complex. Good reason for WNs to think carefully here, if only to understand the threat offered by the Yellow Peril or whatever we should be worried about.

Only, as WLP said in that famous radio broadcast, in a very low whisper at the very end, 'Only there were no Chinese'.

3. The next step to understand is that, once we have understood Rewrite Systems as (possibly random) walks, have observed their fractal dynamics, and noted that Stochastic Processes are very important and rather ubiquitous -- we should realize that the subject matter has been discovered a couple times, from different perspectives.

Not only have we seen the take Economists, Mathematical Logicians (Post, Kleene, Goedel), Linguists (Chomsky), Frequentist and Bayesian Statisticians, and Real Analysts studying Borel fields, sigma algebras, Measure Theory and the like -- but we have the *Functional* Central Limit Theorem (FCLT, due to Donsker), Ito's Lemma, and the Stochastic Calculus, which will be the subject in depth of Part 2. That is, the standard way of tacking probability theory on to Analysis -- Lebesgue Measure and Measure Theory and the Quantum Mechanics of Hilbert Space -- has generalisations that deal with special cases of *unbounded* measure to handle those pesky stock market like aspects of the world.

To make a long matter short: there are two basic theories why the Internet is so screwy. One stresses 'Heavy Tailed Distributions' and the other suggests 'Long Range Dependence' (LRD) is the culprit. These are sometimes given the biblical names 'The Noah Effect' [we just had the 1000 year flood], and the 'Joseph Effect' [the seven good years were not followed by a memory reset of Markovian/Mosaic proportions, but by the seven bad years]. It turns out these are not the same, and one or both can be present to screw up Stochastic Processes.

Thus, a lot of the learning to be done is just to develop a catalog of interesting models and how to compute with them, and how to recognise which case is which.

I have written before on Stochastic Process Limits in relation to the 'Mathematics of the Current Crisis', so I won't recapitulate all the connections here. We will revisit the relevant links in Part 2.

One final note: besides knowing that Stochastic Process Limits is an important subject and has something to do with Long Range correlations and 'the long tail' of the internet. There is an independent subject that is, it turns out, mathematically the same: Percolation Theory. This exists in two forms -- bond percolation and site percolation -- that are duals of each other. It arises in computer science but also in geophysics and even the engineering of soil. If matter is packed together loosely, with enough interstices that water can 'percolate' around it and drain, there comes a point in compacting the matter, called the critical point, that the system no longer drains well. This is because water can no longer filter from hole to hole, and the system undergoes a 'phase transition'.

In queuing theory, the same transition is observed in going from the light traffic case [no other cars on the road, you can get anywhere in predictable time], through the fluid-like heavy traffic phase, to 'transition' in which congestion shows up, and you start and stop. Finally, to unstable growth.

4. In the interests of brevity, I can now point to an excellent article that puts together the history of how Percolation theory, the theory of Ising models [spin lattices], Conformal Field Theory (CFT), and Quantum Gravity (QG) evolved. We will be going over this in detail, but the first part of this paper, while not easy reading, has almost no formulas at all and is mostly the recent history of the subject, lucidly rendered:

[url]http://arxiv.org/PS_cache/math-ph/pdf/0303/0303034v2.pdf[/url] [PDF]

[note: this topic was already cited here: [url]http://www.thephora.net/forum/showthread.php?p=878838#post878838[/url]
and mentioned here:
[url]http://www.thephora.net/forum/showpost.php?p=878868&postcount=23][/url]

Macrobius
09-10-2010, 02:45 AM
Here's a nice link on one man's journey learning the Stochastic Calculus

http://www.systomatics.info/?page_id=94


I do not have any background in Maths or Science and I have decided to learn stochastic calculus on my own (from books).

What I am attempting here in my blog is to translate what I have learnt about stochastic calculus. I hope it will be of benefit to people who do not have academic background in quantitative analysis (or mathematics and statistics). I am planning to add at least one article every week. So if you want to understand exactly what it is that the quants do then please read all articles from beginning to end.
...


19 articles at the link.

DonkeyKong
12-01-2010, 05:58 PM
Thank you for this, Mac. I have an interest in solving "interesting applications" in computer science. It's hard to find computer scientists who know anything of mathematics, physics, or philosophy. They usually know a tiny bit about their subject matter, and that's about it.

Just as a heads up, I will make some attempts at following your work on the Internet.