Monday, 11 August 2025

The Fresh Air of Mathematics

"It is a pleasure to constantly inhale the fresh air of mathematics" - so remarked A.J.M. Taylor-Woodhead in the 1920s at Kings College, Cambridge, where he studied and lectured on the "divine art" of mathematics aka. "language of the universe". Taylor-Woodhead was known as saying that walking was a useful way for the mathematician to reframe their relationship with reality and emphasised in these "walks of truth" that the focus should not be on mathematical problems but meditation on mathematical perfection which, in his belief, would spur more fruitful development of one's innate mathematical talents.

Machine Learny Language: Exchangeable RVs

The concept of exchangeable random variables is commonplace but the terminology might seem new.  

An exchangeable sequence of random variables is one in which re-ordering any finite subsequence leaves the joint probability distribution unchanged.

An alternative phrasing is that the joint distribution is invariant under finite permutations.
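
To make the definition concrete, here is a minimal sketch, assuming a toy urn of 3 red and 2 blue balls drawn without replacement (the urn and the function name joint_probability are illustrative choices, not from any library). The draws are not independent, yet every re-ordering of the same outcomes has the same joint probability, which is the sense of exchangeability described above.

```python
from fractions import Fraction
from itertools import permutations


def joint_probability(draws, red=3, blue=2):
    """Exact probability of seeing `draws` (a sequence of 'R'/'B') when
    drawing without replacement from an urn of `red` and `blue` balls."""
    prob = Fraction(1)
    for colour in draws:
        total = red + blue
        if colour == "R":
            prob *= Fraction(red, total)
            red -= 1
        else:
            prob *= Fraction(blue, total)
            blue -= 1
    return prob


# Every distinct re-ordering of two reds and one blue.
sequence = ("R", "R", "B")
probabilities = {
    perm: joint_probability(perm) for perm in set(permutations(sequence))
}

for perm, prob in sorted(probabilities.items()):
    print("".join(perm), prob)
```

All three orderings (RRB, RBR, BRR) come out with the same probability, even though the chance of red on the second draw depends on what was drawn first.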

The term "exchangeable" was a neologism when it was coined in the 1920s, in a book on logic written by an alumnus of King's College, Cambridge, where the coiner of the term had studied mathematics.

Tuesday, 5 August 2025

The Kantorovich Paradox, Expressed in FAAM

Leonid Vitaliyevich Kantorovich was an innovator in mathematical programming. His autobiography, which describes the evolution of his ideas, is extraordinary. It explores the paradox of deep theory and application in numerical analysis: "In those days, my theoretical and applied research had nothing in common. But later, especially in the postwar period, I succeeded in linking them and showing broad possibilities for using the ideas of functional analysis in Numerical Mathematics. This I proved in my paper, the very title of which, 'Functional Analysis and Applied Mathematics' (Russian), seemed, at that time, paradoxical. In 1949, the work was awarded the State Prize and later was included in the book, 'Functional Analysis in Normed Spaces' (Russian), written with G P Akilov (1959)."

Wednesday, 9 July 2025

Revisiting Map Reduce

Map reduce is very prevalent in distributed data processing. 

It's an idea from functional programming. The basic idea is that a map step applies a function to each element of a list (which can include filtering out the elements you don't want), and a reduce step then applies a summary operation to the results.

A Mickey Mouse example to bring this to life: you have a list of fishing vessels, you filter for foreign flags, and then you count them up. You do this every 6 hours and build a time series database from the results.
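
The fishing vessel example can be sketched in a few lines of plain Python, with map and functools.reduce standing in for the distributed versions. The vessel records, the HOME_FLAG constant and the count_foreign function are all made up for illustration, and this is a single-machine sketch under those assumptions rather than anything Hadoop-specific.

```python
from functools import reduce

# Hypothetical snapshot of vessels; names and flags are invented for illustration.
vessels = [
    {"name": "Aurora", "flag": "GB"},
    {"name": "Bonito", "flag": "ES"},
    {"name": "Cormorant", "flag": "GB"},
    {"name": "Dorade", "flag": "FR"},
]

HOME_FLAG = "GB"  # assumption: anything else counts as foreign


def count_foreign(snapshot, home_flag=HOME_FLAG):
    # Map: turn each vessel into 1 if it is foreign-flagged, otherwise 0.
    mapped = map(lambda v: 1 if v["flag"] != home_flag else 0, snapshot)
    # Reduce: fold the mapped values into a single total.
    return reduce(lambda total, x: total + x, mapped, 0)


foreign_count = count_foreign(vessels)
print(foreign_count)
```

Running count_foreign on a fresh snapshot every 6 hours and appending the result with a timestamp would give the time series described above.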

A popular open source implementation is Apache Hadoop.

Thursday, 3 July 2025

The Autoregressive Nature of LLM Operations

LLMs are AUTOREGRESSIVE generative models.

An autoregression is a regression of a variable against its own lagged values.

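For example, an AR(1) model predicts the current value from the immediately preceding value, while an AR(n) model uses the n most recent values. An LLM does the analogous thing with tokens: each new token is predicted from the tokens that precede it.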
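
As a rough illustration of the AR(1) case, the sketch below simulates a series x_t = phi * x_{t-1} + noise and then recovers phi by regressing each value on its predecessor. The function names simulate_ar1 and fit_ar1, the coefficient 0.8 and the absence of an intercept are assumptions chosen to keep the example short.

```python
import random


def simulate_ar1(phi, n, noise_sd=1.0, seed=0):
    """Generate n values where each one is phi times the previous value
    plus Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, noise_sd))
    return x


def fit_ar1(x):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise
    (no intercept term)."""
    lagged, current = x[:-1], x[1:]
    numerator = sum(a * b for a, b in zip(lagged, current))
    denominator = sum(a * a for a in lagged)
    return numerator / denominator


series = simulate_ar1(phi=0.8, n=5000)
phi_hat = fit_ar1(series)

# One-step-ahead forecast: the prediction depends only on the latest value.
next_value_forecast = phi_hat * series[-1]
print(round(phi_hat, 3), round(next_value_forecast, 3))
```

The estimate should land close to the true coefficient of 0.8, and the forecast is built purely from the series' own most recent value, which is what makes the model autoregressive.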

One remarkable feature of these models is in-context learning, which has been hypothesised to be Bayesian in nature.

The Python Datasets library

Hugging Face maintains a Python library, datasets, which provides access to natural language training data sets, amongst others.

Tuesday, 11 March 2025

Designing an Experiment - Single and Double Blind Trials

Single-blind and double-blind trials relate to how experiments are set up.

  • Double-blind - neither subjects nor experimenters know who is in the TEST or CONTROL group
  • Single-blind - only the experimenters know who is in which group; the subjects (e.g. patients in a medical study) do not

This is about experimental research methods rather than mathematical experimental design (hence it does not lend itself well off the bat to Python automation).

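The idea is to prevent bias and personal beliefs from influencing the results of the experiment or clinical trial. Both setups help control for the so-called placebo effect, since subjects do not know whether they are receiving the treatment; double-blinding additionally guards against experimenter bias.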