Posts Tagged: R Type Provider


30
Dec 13

My 2013 F# Year in Review

It’s been a great year for F# with the blossoming of the fsharp.org working groups. It’s been amazing watching the community come together to form a movement outside of Microsoft. This is most certainly the long term future of F#, protected from the whims of layer upon layer of management. Who knows, in the coming year we might even see community contributions to the F# Core libraries. Who would have thought that would ever have been possible?

I’m very happy to see that Sergey Tihon has maintained his wonderful weekly roundup of F# community goings on. It’s a big time investment week after week to keep the weekly news going. After leaving Atalasoft, and no longer being paid to blog on a regular basis, I found I couldn’t keep investing the time and felt very badly about not being able to continue my own weekly roundups. Sergey has picked up that mantle with a passion, and I’m so very glad for this extremely useful service he provides to the community.

Meanwhile Howard Mansell and Tomas Petricek (at his BlueMountain sabbatical), worked toward building a bunch of great new tools for data science in F#. The R Type Provider has become extremely polished and while Deedle may be fresh out of the oven, it already rivals pandas in its ability to easily manipulate data.

At Bayard Rock Paulmichael Blasucci, Peter Rosconi, and I have been working on a few small community contributions as well. iFSharp Notebook (An F# Kernel for iPython Notebook) is in a working and useful state, but is still missing intellisense and type information as the iPython API wasn’t really designed with that kind of interaction in mind. The Matlab Type Provider is also in a working state, but still missing some features (I would love to have some community contributions if anyone is interested). Also in the works is a nice set of F# bindings for the ACE Editor, I’m hoping we can release those early next year.

Finally, I wanted to mention what a great time I had at both the F# Tutorials both in London and in NYC this year. I also must say that the London F# culture is just fantastic; Phil is a thoughtful and warm community organizer and it shows in his community. I’ve been a bit lax in my bloggings but they were truly both wonderful events and are getting better with each passing year.

F# Tutorials NYC 2013 Group Photo

F# Tutorials NYC 2013

That right there was the highlight of my year. Just look at all of those smiling functional programmers.


22
Jul 13

The Promise of F# Language Type Providers

In most software domains you can safely stick with one or two languages and, because the tools you are using are fairly easy to replicate, you’ll find almost anything you might need to finish your project. This isn’t true in data science and data engineering however. Whether it be some hyper-optimized data structure or a cutting edge machine learning technique often you only have a single language or platform choice.

Even worse, when you want to build a system that uses one or more platform specific components, things can become quite an engineering mess. No matter what you do you can’t avoid the high cost of serialization and marshaling. This makes some combinations of tools non-options for some problems. You often make trade-offs that you shouldn’t need to make, for example using a worse algorithm just because the better option hasn’t been written for your platform.

In .NET this is a particularly bad problem. There are quite a few dedicated people working on open source libraries, but they are tiny in number compared to the Matlab, Python, R or Java communities. Meanwhile, Microsoft research has several fantastic libraries with overly restrictive licenses that make them impossible to use commercially. These libraries drive away academic competition, but at the same time can’t be used outside of academia. It’s a horrible situation.

Thankfully, there is a silver lining in this dark cloud. With the release of F# 3.0 in VS 2012 we were given a new language feature called Type Providers. Type Providers are compiler plugins that generate types at compile time and can run arbitrary code to do it. Initially, these were designed for access databases and getting types from the schema for free, but when Howard Mansell released the R Language Type Provider everything changed. We now realized that we had a way to build slick typed APIs on top of almost any other language.

This means that now it doesn’t matter if someone has written the algorithm or data structure for our platform as long as there’s a Type Provider for a platform where it has been done. The tedious work of building lots of little wrapped sub-programs is completely gone. It shouldn’t even matter if the kind of calculation you’d like to do is fast on your native platform, as you can just transparently push it to another. Of course, we still must pay the price of marshaling, but we can do it in a controlled way by dealing in handles to the other platform’s variables.

The language Type Providers themselves are a bit immature for the moment but the idea is sound and the list is growing. There is now the beginnings of an IKVM Type Provider (for Java) and I’m working on a Matlab Type Provider. The Matlab Provider doesn’t yet have all of the functionality I am aiming for, but I’ve been working on it for several months and it’s quite usable now. All that’s left is for someone to start in on a Python type provider and we’ll practically have all of the data science bases covered.

It’s an exciting time to be an F#’er.