Wednesday, 9 July 2025

Revisiting Map Reduce

Map reduce is a widely used programming model for distributed data processing.

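It's an idea from functional programming. The basic idea is that you have a map function that is applied to every item in a list, transforming it or, in the filter variant, deciding whether to keep it, and then a reduce function that applies a summary operation to the results and combines them into a single value.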
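To make the functional programming roots concrete, here is a minimal single-machine sketch using Python's built-in map and filter plus functools.reduce. The numbers and lambdas are made up for illustration; a distributed framework would run the same kind of steps across many machines.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]  # illustrative input only

# Map: apply a function to every item.
squared = map(lambda n: n * n, numbers)

# Filter: keep only the items that pass a test.
even_squares = filter(lambda n: n % 2 == 0, squared)

# Reduce: fold the surviving items into one summary value.
total = reduce(lambda acc, n: acc + n, even_squares, 0)

print(total)  # 4 + 16 + 36 = 56
```

Each step takes a sequence and hands a result to the next step. Because the function passed to map only ever looks at one item at a time, the work can be split across many workers.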

A Mickey Mouse example to bring this to life: you have a list of fishing vessels, you filter for foreign flags, and then you count the ones that remain. You do this every 6 hours and store each count with its timestamp in a time series database.
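The fishing vessel example could be sketched roughly as follows. Everything here is hypothetical: the record layout, the HOME_FLAG value, and the count_foreign and snapshot helpers are invented for illustration, and the 6-hourly scheduling and the actual time series database write are left out.

```python
from datetime import datetime, timezone
from functools import reduce

HOME_FLAG = "NZ"  # assumption: any flag other than this counts as foreign


def count_foreign(vessels, home_flag=HOME_FLAG):
    """Filter for foreign flags, map each match to 1, reduce by summing."""
    foreign = filter(lambda v: v["flag"] != home_flag, vessels)
    ones = map(lambda v: 1, foreign)
    return reduce(lambda acc, n: acc + n, ones, 0)


def snapshot(vessels, observed_at):
    """Build one time series point: a timestamp and the foreign vessel count."""
    return {
        "timestamp": observed_at.isoformat(),
        "foreign_vessels": count_foreign(vessels),
    }


# Made-up sample data standing in for one 6-hourly observation.
vessels = [
    {"name": "Vessel A", "flag": "NZ"},
    {"name": "Vessel B", "flag": "PA"},
    {"name": "Vessel C", "flag": "LR"},
]

point = snapshot(vessels, datetime(2025, 7, 9, 0, 0, tzinfo=timezone.utc))
print(point)
```

Calling snapshot once per 6-hour window and appending each point to a store would give the time series. In a distributed setting the vessel list would be split into chunks, each worker would produce a partial count, and the reduce step would add the partial counts together.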

A popular open source implementation is Apache Hadoop.
