Monday, 11 August 2025
The Fresh Air of Mathematics
Machine Learny Language: Exchangeable RVs
The concept of exchangeable random variables is commonplace but the terminology might seem new.
An exchangeable sequence of random variables is one whose joint probability distribution is unchanged by any re-ordering of any finite selection of its terms.
An alternative phrasing is that the joint distribution is invariant under finite permutations.
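As a small illustration of the definition above, here is a sketch in Python that checks permutation invariance exactly for draws without replacement from an urn. The urn contents (2 red, 3 blue) and the function name `joint_prob` are assumptions chosen for the example, not something from a particular library. The draws are not independent, yet every re-ordering of a colour sequence has the same joint probability.

```python
from fractions import Fraction
from itertools import permutations, product


def joint_prob(seq, red=2, blue=3):
    """Exact probability of drawing the colour sequence `seq`
    ("R" or "B" per draw) without replacement from the urn."""
    p = Fraction(1)
    counts = {"R": red, "B": blue}
    for colour in seq:
        total = counts["R"] + counts["B"]
        if counts[colour] == 0:
            return Fraction(0)  # this colour has run out
        p *= Fraction(counts[colour], total)
        counts[colour] -= 1
    return p


# Check every length-3 colour sequence against all of its re-orderings.
is_exchangeable = all(
    joint_prob(perm) == joint_prob(seq)
    for seq in product("RB", repeat=3)
    for perm in permutations(seq)
)

# The draws are dependent: P(R then R) differs from P(R) * P(R).
p_rr = joint_prob(("R", "R"))
p_r = joint_prob(("R",))
is_independent = p_rr == p_r * p_r
```

Here `is_exchangeable` comes out true while `is_independent` comes out false, which is the point: exchangeability is a weaker condition than independence.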
The term exchangeable was a neologism when it was coined in the 1920s, in a book on logic written by an alumnus of King's College, Cambridge, where the coiner of the term had studied mathematics.
Tuesday, 5 August 2025
The Kantarovich Paradox, Expressed in FAAM
Wednesday, 9 July 2025
Revisiting Map Reduce
Map reduce is very prevalent in distributed data processing.
It's an idea from functional programming. The basic idea is that a map step applies a function to each item in a list (for example, to pick out the items of interest), and a reduce step then combines the results with a summary operation.
A Mickey Mouse example to bring this to life: you have a list of fishing vessels, you want to filter for foreign flags and then count them, and you want to do this every 6 hours and build a time series database from the results.
A popular open source implementation is Apache Hadoop.
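The fishing vessel example can be sketched on a single machine with Python's built-in `filter` and `map` and with `functools.reduce`. This is only an illustration of the pattern, not how Hadoop is programmed; the vessel records, the `HOME_FLAG` value and the function name `count_foreign` are made up for the example, and the 6-hourly scheduling and the database are left out.

```python
from functools import reduce

# Hypothetical vessel records, invented for this example.
vessels = [
    {"name": "Aurora", "flag": "GB"},
    {"name": "Bjorn", "flag": "NO"},
    {"name": "Carmen", "flag": "ES"},
    {"name": "Dover", "flag": "GB"},
]

HOME_FLAG = "GB"  # assumption: any other flag counts as foreign


def count_foreign(records, home_flag):
    """Filter for foreign-flagged vessels, map each to 1, reduce by summing."""
    foreign = filter(lambda v: v["flag"] != home_flag, records)
    ones = map(lambda v: 1, foreign)
    return reduce(lambda total, one: total + one, ones, 0)


foreign_count = count_foreign(vessels, HOME_FLAG)
```

In a distributed setting the list would be split across machines, each machine would run the filter and map steps on its own share, and the partial sums would then be reduced into one total.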
Thursday, 3 July 2025
The Autoregressive Nature of LLM Operations
LLMs are AUTOREGRESSIVE generative models.
An autoregression is a regression of a variable against its own lagged values.
For example, an AR(1) model predicts the current value from the immediately preceding value, while an AR(n) model uses the n most recent values.
One remarkable feature of LLMs is in-context learning, which has been hypothesised to be Bayesian in nature.
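To make the AR(1) idea above concrete, here is a minimal sketch using only the standard library. It simulates a series x_t = phi * x_(t-1) + noise, recovers phi by least squares, and then forecasts by feeding each prediction back in as the next input, which is the same loop shape an LLM uses when it generates one token at a time. The function names, the value phi = 0.8 and the seed are assumptions made for the example.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Generate n values of x_t = phi * x_(t-1) + Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x):
    """Least-squares estimate of phi (regression through the origin)."""
    num = sum(prev * cur for prev, cur in zip(x, x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    return num / den


def forecast(x_last, phi, steps):
    """Roll the model forward, feeding each prediction back as input."""
    out = []
    value = x_last
    for _ in range(steps):
        value = phi * value
        out.append(value)
    return out


series = simulate_ar1(0.8, 5000)
phi_hat = fit_ar1(series)  # should land close to 0.8
next_values = forecast(series[-1], phi_hat, steps=3)
```

An AR(n) model follows the same pattern with n lagged values as regressors instead of one.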
The Python Datasets library
Hugging Face provides a Python library, datasets, which gives access to natural language training data sets, amongst others.
Tuesday, 11 March 2025
Designing an Experiment - Single and Double Blind Trials
Single-blind and double-blind trials relate to how experiments are set up.
- Double-blind - neither the subjects nor the experimenters know who is in the TEST or CONTROL group
- Single-blind - the subjects (e.g. patients in a medical study) do not know which group they are in, but the experimenters do