Tuesday, 11 September 2018

Category Theory for Pythonistas

Category theory is a language built from "objects" and "arrows" between them, where arrows that meet end to end can be composed. It is a general mathematical theory of structures, and it can be regarded as philosophy as much as mathematics or theoretical computer science.
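For a Pythonista, one rough way to picture this is to treat types as the objects and functions as the arrows. The sketch below is only an illustration under that assumption; the helper names (identity, compose, length, is_even, negate) are made up for the example and are not part of any library.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def identity(x: A) -> A:
    """The identity arrow: returns its argument unchanged."""
    return x


def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """Return the arrow 'g after f', i.e. x -> g(f(x))."""
    return lambda x: g(f(x))


def length(s: str) -> int:
    """An arrow from the object str to the object int."""
    return len(s)


def is_even(n: int) -> bool:
    """An arrow from int to bool."""
    return n % 2 == 0


def negate(b: bool) -> bool:
    """An arrow from bool to bool."""
    return not b


# Composition is associative: both groupings give the same arrow str -> bool.
left = compose(negate, compose(is_even, length))
right = compose(compose(negate, is_even), length)
```

Calling left("abcd") and right("abcd") gives the same answer, which is the associativity law that every category requires of its arrows.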

Saturday, 8 September 2018

Mathematics as the Path to Progress in Computer Programming

Mathematics enables a wide range of computer algorithms to be defined and created. But pure mathematics (the study of logical structures and patterns, and the extraction of general principles and axioms) is the path to progress in many areas of computing - it is a "fuel" that can help push the boundaries of computers and increase our capacity for invention. Many computing advances were created by mathematicians who chose to apply themselves to computers.

Tuesday, 4 September 2018

PyPI - The Python Package Index

PyPI is the Python Package Index. Newish stuff on there includes neural network toolkits, IoT utilities such as mbed-flasher, and math language interfaces, e.g. amplpy.

PyTorch

PyTorch is a library for doing DL/neural nets in Python. Anaconda is the recommended package manager (used by Bloomberg, BMW, PIMCO). The latest pip (the replacement for easy_install) and NumPy packages are required.

"Deep Learning" (DL) as a Subfield of "Machine Learning" (ML) - The Chris Manning View

Chris Manning, of the Departments of Computer Science and Linguistics at Stanford University, describes deep learning as a subfield of machine learning, which is itself a form of computational statistics.

He emphasises the human-computer partnership in successful machine learning, in the sense that ML methods shown to work well have done so due to "human-designed features or representations". 

Examples given are SIFT (scale-invariant feature transform) or HOG (histogram of oriented gradients) features for vision and MFCC (mel-frequency cepstral coefficients) or LPC (linear predictive coding) features for speech.

In these cases, ML reduces to optimising the weights placed on those hand-designed features so as to make the best prediction.
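To make the "hand-designed features plus learned weights" idea concrete, here is a minimal sketch. It is not Manning's example: the two features (exclamation-mark count and word count), the toy messages and the choice of a perceptron as the weight-fitting rule are all assumptions made for illustration.

```python
def features(text):
    """Hand-designed features: number of '!' characters and number of words."""
    return [text.count("!"), len(text.split())]


def predict(weights, bias, x):
    """Return +1 or -1 depending on the sign of the weighted feature sum."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


def train_perceptron(examples, max_epochs=100):
    """Learn one weight per feature with the classic perceptron update."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for text, label in examples:
            x = features(text)
            if predict(weights, bias, x) != label:
                # Nudge the weights towards the correct side of the boundary.
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


# Hypothetical toy data: +1 means "excited", -1 means "neutral".
EXAMPLES = [
    ("Wow!! Great!!", 1),
    ("Yes!!!", 1),
    ("The meeting is at noon today", -1),
    ("Please send the report by Friday.", -1),
]
weights, bias = train_perceptron(EXAMPLES)
```

The human decides what to measure (the features function); the algorithm only decides how much each measurement should count.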

So how does deep learning (DL) differentiate itself from more "conventional" machine learning (ML)? What are the key characteristics of this much-touted subfield?

One element is representation learning (also known as "feature" learning): good features and representations are learned from the data rather than designed by hand, with DL learning multiple levels of these representations. Neural networks are currently the tool of choice for this.

One could almost claim that "DL" is the new marketing spin on Neural Networks. "Differentiable programming" is another trendy name for this.

Why DL now? Large amounts of training data, modern multi-core CPUs/GPUs and, just maybe, some progress in algorithmic science along the way.

Monday, 3 September 2018

Who coined the term "machine learning" anyhow?

The term "machine learning" was coined by American AI pioneer Arthur Lee Samuel in 1959 in the context of a self-learning checkers program. He also developed one of the earliest implementations of hash tables.

"The Most Important Numerical Algorithm of Our Time" - Gilbert Strang

Gilbert Strang described the Fast Fourier Transform as "the most important numerical algorithm of our time".

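The FFT takes a signal sampled over time and decomposes it into its frequency components. More precisely, it is a fast way of computing the discrete Fourier transform, needing on the order of n log n operations for n samples rather than n^2.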
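As a rough illustration, here is a minimal recursive radix-2 FFT in the Cooley-Tukey style, using only the standard library. It is a teaching sketch that assumes the input length is a power of two; real work would normally use an optimised library routine instead. The test signal (a cosine completing two cycles over eight samples) is made up for the example.

```python
import cmath
import math


def fft(samples):
    """Radix-2 decimation-in-time FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    # Split into even- and odd-indexed samples and transform each half.
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # The "twiddle factor" rotates the odd half before recombining.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Eight samples of a pure tone that completes two cycles over the window.
N = 8
signal = [math.cos(2 * math.pi * 2 * t / N) for t in range(N)]
magnitudes = [abs(c) for c in fft(signal)]
```

For this signal the energy shows up in frequency bins 2 and 6 (the tone and its mirror image), while the other bins are zero up to rounding error.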

Cooley and Tukey are credited with the development of the FFT in its modern form, although Gauss had developed his own method in an unpublished work from 1805 (using it to interpolate the orbits of asteroids).

John Tukey was an American statistician who coined the term "bit" (a short form of "binary digit"). He and James Cooley published the FFT algorithm in 1965; Cooley was working at IBM at the time.

Manchester-born Frank Yates (a graduate of St John's College, Cambridge and one of the pioneers of 20th-century statistics, known to A-level students for Yates's continuity correction) also created something similar in 1932, which he called the interaction algorithm.

The modern FFT was developed in the context of processing sensor data and has become a leading algorithm in the art of digital signal processing.

Mathematician Marcus du Sautoy (Fellow of New College, Oxford) also talks about it in the Fourier episode of the BBC podcast series A Brief History of Mathematics.

Introduction to Classification - The Spam Filter

A spam filter is a simple example of statistical classification, putting data into known categories.

Classification is a case of supervised learning, where a training set of correctly labelled observations is provided. An algorithm that performs classification is called a classifier.

A simple class of classification algorithms is called Naive Bayes, which has been studied extensively since the 1950s.

Naive Bayes (NB) is a popular choice for text classification, e.g. deciding whether a text is a novel, a poem or an essay. It is also used in word sense disambiguation.
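To tie this back to the spam filter, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, written with the standard library only. The function names and the four training messages are invented for illustration, and splitting on whitespace is a deliberately crude tokeniser.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log posterior score for the text."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are skipped
                # Add-one smoothing keeps unseen (word, label) pairs non-zero.
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train_naive_bayes(TRAINING)
```

With this toy data, classify(model, "free money") comes out as "spam" and classify(model, "monday meeting") as "ham". The "naive" part is the assumption that words occur independently of one another given the label, which is why the per-word log probabilities can simply be added.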

Discriminative Models (aka Conditional Models)

Discriminative models are also known as conditional models. They are used in machine learning to model the dependence of unobserved variables on observed variables. The dependence is modeled probabilistically by the conditional distribution P(y|x), where y is the vector of unobserved variables and x is the vector of observed variables.
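Logistic regression is a standard example of a discriminative model, since it parameterises P(y|x) directly and says nothing about how x itself is distributed. The sketch below fits a one-feature version by plain gradient descent; the data points, learning rate and epoch count are arbitrary choices for illustration rather than recommendations.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=500):
    """Fit P(y=1|x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for each observation at the current parameters.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


# Hypothetical observed feature x and label y (the quantity to be predicted).
XS = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
YS = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(XS, YS)


def prob_positive(x):
    """The fitted conditional probability P(y=1|x)."""
    return sigmoid(w * x + b)
```

After fitting, prob_positive(x) rises above 0.5 for positive x and falls below it for negative x, so the model can be used as a classifier by thresholding the conditional probability.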

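What is Nonparametric Statistics?

"Conventional" statistics assumes the data follow a distribution from a known family, described by parameters like mean and variance.

"Nonparametric" statistics relies on being "distribution free", making no assumption about the form of the distribution, or on leaving the parameters of the distribution unspecified.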

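A small example of a distribution-free method is the sign test, which checks a claimed median using nothing but whether each observation falls above or below it. The sketch below computes an exact two-sided p-value with the standard library; the function name and the sample data are made up for illustration.

```python
from math import comb


def sign_test_p_value(data, hypothesised_median):
    """Exact two-sided sign test: only the signs of the differences are used."""
    above = sum(1 for x in data if x > hypothesised_median)
    below = sum(1 for x in data if x < hypothesised_median)
    n = above + below  # observations equal to the median are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # Under the null hypothesis each sign is a fair coin flip, so the count
    # of the rarer sign follows a Binomial(n, 1/2) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical measurements.
DATA = [12, 15, 11, 18, 14, 16, 13, 17]
p_low = sign_test_p_value(DATA, 10)    # every observation lies above 10
p_mid = sign_test_p_value(DATA, 14.5)  # four above, four below
```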
"Support Vector Machines" are a form of nonparametric statistics useful in machine learning. It is a "discriminative" classifier.