Tag: Algorithms
-
Developing an Algorithm in F#: Fast Rotational Alignments with Gosper’s Hack
This post is for the 7th day of the 2014 F# Advent Calendar. It has been said that functional languages can’t be as fast as their imperative cousins because of all of the allocation and garbage collection, this is patently false (as far as F# is concerned at least) largely because the CLR has value types.…
-
Record Linkage Algorithms in F# – Extensions to Jaro-Winkler Distance (Part 3)
While writing the previous article on tokenized matching I realized I left out some important background information on Jaro-Winkler distance. First, there’s something important to know about the Jaro-Winkler distance: it’s not a metric distance and so does not obey the triangle inequality. That is, if you found the JW distance between strings A and B, and then…
-
Imperative Pseudocode to Pure Functional Algorithm with Gale-Shapely and F#
Continuing from last time, let’s look at how one goes from imperative pseudocode to pure functional using Gale-Shapely as an example. Overall, to convert an algorithm from imperative to functional is a fairly simple process once you understand how to convert from a while loop to recursion with accumulators. This post is just a more advanced…
-
Record Linkage in F# – Token Matching, Stable Marriages and the Gale-Shapley algorithm
This post was originally published September 13th, 2011. Initially, one of the biggest problems I found when trying to marry records was the god awful quality of much of data I often have to work with. It’s mostly old mainframe and database data with truncated fields, limited character sets and fields with nonsensical contents. Even…
-
Record Linkage Algorithms in F# – Jaro-Winkler Distance (Part 2)
Last time we dove into the Jaro distance algorithm and picked apart how each of its components are calculated. However, from a modern perspective Jaro alone is a rather weak method of string matching. It was Winkler’s extension that brought this algorithm into widespread modern use. Matthew Jaro’s insight when inventing the Jaro distance algorithm was that…
-
Record Linkage Algorithms in F# – Jaro-Winkler Distance (Part 1)
When first approaching the task of record linkage I was initially overwhelmed by the huge number of different algorithms available for comparing strings. Now I know that the secret to finding your way in this sea of algorithms is two fold. First, know that many are outdated and have newer and better implementations, so they can be…