Skip to content

haskell

These are the stories that have been posted to the haskell category.

F# and Haskell, Estranged Cousins


Published to Rick Minerich's Development Wonderland by Richard Minerich January 29, 2009 20:24

In this post I compare and contrast Haskell and F#.  It may come as no surprise that with so much shared history they share so much in common.  However, it’s interesting to consider how the perspectives of the languages’ developers play a large role in determining the differences between the languages.

 

A Shared History

As far as the family tree of functional programming is concerned, F# and Haskell are not too distant cousins. haskellandfsharphistory

They both share a very similar syntax as well as a large number of features.  A great example of this is Hindley–Milner type inference. 

ML was the first widely used language to leverage Hindley–Milner for static inferred typing, a feature to which it owes much of it’s success.  However, almost all functional programming languages now also have this feature.  The FP community has always been fast to adopt obviously useful features.  Some other things that fit into this category are garbage collection (Lisp) and lazy evaluation (Lazy ML).

The most obvious difference between Haskell and F# is somewhat easy to infer from this graph: object oriented constructs.  That is to say, OCaml pioneered the use of object oriented data structures in functional programming and F# is it’s direct descendant.  This has made OCaml (and in turn F#) somewhat of a black sheep in the theoretical functional programming world. 

The reason many functional programming theorists dislike objects is because they want a language based on math.  Unlike the majority of the ideas in functional programming, objects don’t have roots in either lambda calculus or category theory.  However, this has not stopped OCaml from being successful.  In fact, quite the opposite. 

The use of objects mitigates one of the largest roadblocks in the path to functional programming adoption by engineers:  the difficulty inherent in organizing large functional programs.  The OCaml language engineers also showed that leveraging the object oriented paradigm did not hamper their ability to use static analysis techniques.  Because of this OCaml approaches the speed of C

While it is not pure, OCaml is almost an ideal compromise between theory and engineering.   Indeed, nothing approaches it in terms of a functional language which fits into the paradigms of the Microsoft .NET framework.  It’s easy to see why Microsoft chose to extend OCaml when building a functional language to bring to it’s software engineering masses.

On the other hand, Haskell is almost the ideal language for academic exploration of functional programming.  The fact that it’s strictly limiting in terms of side effecting and adherence to abstract mathematical concepts means no side effecting surprises.  Also, the fact that it’s a committee language means that if a researcher can get enough support for an idea, they can almost be sure it will be included in the next iteration of the language.

 

Haskell as a Committee Language

Repeat the mantra after me:  Haskell is Lazy; Haskell is Pure; Haskell has Type Classes; Haskell is a Committee Language.

Of all of these, the most defining characteristic of Haskell is that it is a committee language.  It’s an amalgamation of many different goals with no clear vision.  This is at the same time Haskell’s greatest strength and greatest weakness.  While it is the most widely used pure functional programming language, the quirks of committee design are obvious.

Some I ran into within two hours of starting with Haskell:

The first was integer rollover.  Haskell has two integer datatypes: integer and int.  Integer is infinitely sized but can be quite slow to use and due to that, it’s rather infrequently used.  On the other hand int is fast but, just like in C, can roll over. There is no way to check the overflow bit.

So, ints can roll over, I can accept that.  What it implies to me is that speed is more important to Haskell than robustness.  However, this brings me to my second point:  Many basic list operations will throw errors on an empty list.  This seems entirely inconsistent to me. 

I understand that if they didn’t, a logic error would be much more likely to cause an infinite loop in a tail recursive function.  However, this seems completely at odds with the “speed first” definition of an int.  It also means that almost everyone ends up wrapping the default list operations with the Maybe monad.

The third issue was that operations with the float data type are slow.  Real World Haskell suggests always using a double due to the fact that a great deal of focus has gone into optimizing double arithmetic but very little into floats.  This demonstrates another thing that comes about with committee languages: often things as important as optimization of basic data types can fall through the cracks because everyone involved wants to work on more exciting things.

Please don’t misunderstand me here, I really like Haskell.  I’m hard on it because I can see that it has a great deal of unrealized potential.  If Haskell is to be a language used for real software engineering, the committee needs to sit down and think hard about an overarching vision for the project.

 

What is the goal here?

The biggest difference between the world of theorists and the world of engineers is that each group has an entirely different set of concerns.

Theorists want to implement ideas fast so that they can crank out papers fast.  A large part of this is having a language that is very close to math so that implementing ideas directly from the chalkboard is trivial.  As the theory world changes so fast, they don’t often care much about organization or maintainability. 

As the committee responsible for Haskell is mainly made up of theorists, it’s easy to see why the language has taken the direction it has.  It’s a language that is very close to math.  As the lifecycle of most academic code is very short, small implementation details which might cause a reduction in robustness are less important.

Engineers want to minimize time spent maintaining code.  Part of this is having a language that emphasizes safety in that it facilitates catching as many bugs as possible, as early in the process as possible.  Another important part of this is code organization as every moment that is spent trying to find a bug is a moment spent not fixing it.  As the cost of maintaining software generally dwarfs the initial development cost, development speed must take a back seat to testing and organization.

The syntax heavy C# is a great example of this.  It’s slow to write in but provides many constructs for the organization and testing of code.  On top of this a great number of design patterns exist to further categorize substructures in a computer program.  C# is slow to write, but it’s relatively safe and mountains of patterns and best practices have been made to guide it’s developers.

However, we in the software engineering world are in the midst of a crisis.  It turns out that traditional imperative object oriented programs do not lend themselves to heavy parallelization.  Yet, parallelize we must.  We are looking at exponential growth in the number of cores contained in each processor. Because of this we engineers find ourselves at a bit of an impasse.  Those that are looking ahead know…

Engineers will soon want very badly to minimize time spent maintaining parallelized code.  We need our programs to be easy to organize, manage and test.  Yet, as we will soon need to deal with massively parallelized systems, we find many of our ideas about what makes code robust and maintainable are broken.  At the same time, to move to a purely functional language means leaving behind years of thought on how computer programs ought to be constructed, tested and maintained.  Having any pattern, even if it’s wasteful or has many corner cases, is much better than having none.  This is why a hybrid language is so important.

OCaml and F# provide engineers with the set and forget concurrency that comes along with the functional tradition.  At the same time these languages have all of the organizational constructs of object oriented programming as well.  This means that we can continue to use the same types of large scale organizational structures in our programs and also gain the safe parallelism that implicit immutability provides.

 

Conclusion

And so we see that it’s important to consider a language in terms of how it’s creators envisioned it’s use.  Haskell has been developed mainly with research in mind and so is a fantastic research language.  F# has been developed mainly with engineering in mind and so is much better suited for engineering.

In understanding this it’s also easy to see that comparing Haskell to F# is like comparing the tools of a physicist to those of an engineer.  They may have much in common superficially, but they are designed with much different ends in mind.

 

Links

A History of Haskell: Being Lazy with Class
Tutorial: OCaml for Scientific Computation (Contains some history)

Image Processing as Sets of Transformations


Published to Rick Minerich's Development Wonderland by Richard Minerich April 28, 2009 00:21

In the image processing world, like most computational problems, we often think our work is composed of only two basic ideas: representation and transformation.  Of course, one may have many layers of both representations of transformations and transformations of representations which can make things appear quite complex at times.

However,  the problem is much more simple than it appears.  This is because a representation can be considered as a transformation from a zero or identity state.  Thus, in writing a symbolic language for image processing, we are left with only a single idea to consider:  transformations.  By composting layers of transformations we can apply image processing techniques in way which is not only bidirectional and platform agnostic but also comes along with a host of other benefits.

 

Let us consider a simplified example of processing an image:

1) We read in a file (representation) and use a codec (transformation) to convert it into a format understood by our API (representation).

2) We then perform some type of algorithm on that data (transformation) which results in some type of output (representation).

3) Finally, via another codec (transformation), another file is saved to disk (representation).

 

In most cases there are a great number of intermediate representations.  Each is a full copy of the previous iteration with whatever changes have been so far applied.  Essentially, the same information is copied over and over again in memory.  We do allow for some kinds of in-place processing, however, this is bad as when the operation has been completed, the previous representation has been destroyed.

 

Instead, what if we batched up sets of transformations?  This could have many benefits:

1) The most obvious benefit is that of parallelization.  Even at the simplest level of functional composition, these transformations could be handed off to a cluster for asynchronous processing or saved for a later batch processing job.

2) With an intermediate symbolic transformation language, processing algorithms could potentially be combined and reduced to produce a single transformation out of many.  This would significantly reduce the processing overhead as well as the number of intermediate memory representations.

3) An intermediate symbolic language which encompassed both codec and processing may make it possible to push the processing transformation through the codec transformation and in so doing no longer need to have any intermediate memory representation.  This could provide significant memory and processing speed time benefit. 

4) The intermediate symbolic language could be saved into the files themselves thus removing the need for the codec to be present on the end machine.  Admittedly, the user would also need the image language interpreter.

5) Instead of applying simple image processing algorithms to an image, the symbolic representation could be appended to the end of the file.  This would be quite similar to layers in practice.  In this way it would be possible to view the image at all stages of transformation.

6) For large or proprietary transformations, the representation could be kept on the internet and either be downloaded or, in the case where the owner did not want to expose their algorithm, a flattened representation could be sent out and a processing delta could be sent back.

 

Conclusion

Of course, when I speak of data I don’t only mean the image itself.  This technique could also be applied to many classes of data or algorithm.  Most notably for us, image metadata.

My initial goal is to build a basic codec representation along with some simple transformations.  Currently, I am researching bidirectional, reversible and declarative languages as examples.  With F# as a base language I believe it will be possible to build something portable to other ML variants.

Image Processing as Sets of Transformations


Published to Rick Minerich's Development Wonderland by Richard Minerich April 28, 2009 00:21

In the image processing world, like most computational problems, we often think our work is composed of only two basic ideas: representation and transformation.  Of course, one may have many layers of both representations of transformations and transformations of representations which can make things appear quite complex at times.

However,  the problem is much more simple than it appears.  This is because a representation can be considered as a transformation from a zero or identity state.  Thus, in writing a symbolic language for image processing, we are left with only a single idea to consider:  transformations.  By composting layers of transformations we can apply image processing techniques in way which is not only bidirectional and platform agnostic but also comes along with a host of other benefits.

 

Let us consider a simplified example of processing an image:

1) We read in a file (representation) and use a codec (transformation) to convert it into a format understood by our API (representation).

2) We then perform some type of algorithm on that data (transformation) which results in some type of output (representation).

3) Finally, via another codec (transformation), another file is saved to disk (representation).

 

In most cases there are a great number of intermediate representations.  Each is a full copy of the previous iteration with whatever changes have been so far applied.  Essentially, the same information is copied over and over again in memory.  We do allow for some kinds of in-place processing, however, this is bad as when the operation has been completed, the previous representation has been destroyed.

 

Instead, what if we batched up sets of transformations?  This could have many benefits:

1) The most obvious benefit is that of parallelization.  Even at the simplest level of functional composition, these transformations could be handed off to a cluster for asynchronous processing or saved for a later batch processing job.

2) With an intermediate symbolic transformation language, processing algorithms could potentially be combined and reduced to produce a single transformation out of many.  This would significantly reduce the processing overhead as well as the number of intermediate memory representations.

3) An intermediate symbolic language which encompassed both codec and processing may make it possible to push the processing transformation through the codec transformation and in so doing no longer need to have any intermediate memory representation.  This could provide significant memory and processing speed time benefit. 

4) The intermediate symbolic language could be saved into the files themselves thus removing the need for the codec to be present on the end machine.  Admittedly, the user would also need the image language interpreter.

5) Instead of applying simple image processing algorithms to an image, the symbolic representation could be appended to the end of the file.  This would be quite similar to layers in practice.  In this way it would be possible to view the image at all stages of transformation.

6) For large or proprietary transformations, the representation could be kept on the internet and either be downloaded or, in the case where the owner did not want to expose their algorithm, a flattened representation could be sent out and a processing delta could be sent back.

 

Conclusion

Of course, when I speak of data I don’t only mean the image itself.  This technique could also be applied to many classes of data or algorithm.  Most notably for us, image metadata.

My initial goal is to build a basic codec representation along with some simple transformations.  Currently, I am researching bidirectional, reversible and declarative languages as examples.  With F# as a base language I believe it will be possible to build something portable to other ML variants.

Discoveries This Week 05/17/2009


Published to Rick Minerich's Development Wonderland by Richard Minerich May 17, 2009 21:43

A bit of a slow week.  Perhaps some are out playing in the recently fantastic weather instead of blogging about functional programming.  My favorites this week were Paul Hudak’s talk on Haskell, Jason Olson’s Channel9 discussion, Niklas Gustafssons’s Actors in F# and Kurt Schelfthout’s Testing DSLs with FsCheck.

 

Paul Hudak on InfoQ - Haskell

An interview that begins with a discussion of when to introduce difficult Haskell concepts like monads, moves to a discussion of the philosophy of higher order programming.

Paul Hudak, as a principal designer of Haskell, explains FP topics with a passion few others could.  Also, I’m curious about this idea of arrows (Freyd-categories) as opposed to monads.

 

Jason Olson’s on Channel9 - Composing Programming Languages

We talk about his [Lang.NET F# and OO] presentations and his perspectives on object orientation, F# and his own language.

As it seems impossible to pin down exactly what OO is, perhaps we should try to categorize it’s subcomponents across different languages.

 

Niklas Gustafsson on Actors in F#

One of the advantages of this model over the Axum model is that any number of clients can communicate with an actor, all that is needed is access to the mailbox reference.

I must beg to differ with Niklas on one of his stated disadvantages of the Actor model.  It would be easy enough to include a “closing down state” in your Actor FSA.  The inability to statically check your agreements is correct though.  I’d be interested in seeing a simple Axum sample which could handle N clients and still be statically checked.

 

Kurt Schelfthout on How to test DSLs (and: FsChecking FsCheck)

It occurred to me that (domain specific) languages seem difficult to test using traditional unit tests: after all, the usage possibilities are far greater for a language than for, say, a typical user interface.

An extremely clever way to test DSLs.

 

I understand it is exceedingly painful to sign up for an account to leave a simple comment.  That given, I am still trying to figure out a solution to my spam problem.  For now, if you would like to comment and don’t wish to set up an account, please email me your comment and I will manually post it here.

Code Camp 12: Boston – Why F#?


Published to Rick Minerich's Development Wonderland by Richard Minerich October 16, 2009 15:18

A couple of months ago I was talking to Lou Franco, the head of our Software Engineering department and fellow functional programming enthusiast, about the possibility of using F# for projects in the future.  Being business minded, he replied that he would need a compelling reason to bring F# on board.  This presentation is dedicated to him and others who have functional programming on their radar but haven’t yet found a compelling reason to bring it in to their company.

I acknowledge that, as for now, it’s difficult to suggest anyone do more than play with F#.  I have been anxiously awaiting the stabilization of the F# API which will come along with the release of VS2010.  With the recent changes breaking backwards compatibility, maintaining my old F# samples has become quite a nightmare.  Indeed, not a single code sample I have from a year ago works out of the box with the current release.

However, VS2010 is only a few months away.  Now is the time to start learning about F# and the paradigms which make it so powerful.  Functional programming has amazing benefits in terms of parallelization, code compression and overall code robustness.  

At Code Camp 12 Boston, I will talk about the soon-to-be-realized world where programmers are divided into groups which each use different types of languages to build different kinds of things.  This is easy to predict as it is already occurring.  UI, data processing and data storage programmers are already diversifying both in working knowledge and tools.

As is evidenced by WPF, HTML and CSS it seems UI design is moving more and more to a declarative style.  Similarly, the rise of F#, Scala, Erlang and Haskell indicates that algorithmic programmers are migrating to the functional programming languages.  SQL has long been the language of those involved in data storage.  It’s no wonder that this has happened.  When your tool is better designed for your job, the work gets done faster and you end up with a better result.

Where does this leave imperative and object oriented languages?  Languages like Java, VB and C# will be relegated to being used as “glue” for existing systems while abstract languages slowly eat away their market share.  This will happen more and more as the number of cores per processor continues to increase and those with imperative implementations find that they are unable to scale.

When: Oct 16th, 2009 (11:50am)
Where: 201 Jones Rd, 6th Floor, Waltham MA USA (Room TBC)

Slides are available here.

Also, be sure to also check out my fellow F# User Group leader Talbott Crowell’s talk.  It’s right before mine (10:30) in the same room (Thanks Chris!).  You can find out more by heading over to Chris Bowen’s blog and reading his post on Code Camp 12.

F# and You! - New Hampshire User’s Group and Beyond


Published to Rick Minerich's Development Wonderland by Richard Minerich October 26, 2009 16:24

I’ve been working for a while on a new presentation which I was finally able to give last week at the New Hampshire .NET User GroupF# and You! focuses on painting the big picture about F# instead of the off-putting details like having to learn new syntax.

 

functional

For this new presentation I start by discussing the adoption of functional programming on other platforms by using Scala, Erlang and Haskell as examples.  I then continue on to how the algorithmic programmers are moving to functional languages while UI developers are moving towards declarative.

This then naturally raises the question of why programmers would choose a specific style of programming for a specific task.  The answer, of course, is that when you work close to a domain you can build things more quickly and with a lower error rate.

This then transitions easily into the specific beneficial properties of functional programming (and F#).

 

General Overview:

  • The Separation of UI and Back End
  • What is Functional Programming? (Functional Concepts)
    • Pure functions
    • Immutability
    • Lambda Expressions
    • Higher Order Functions
    • Recursion
  • The Benefits of Functional Programming
    • Code Compression
    • Parallelism
    • Robustness
  • Proof is in the Pudding
    • TrueSkill
    • Grange Insurance Rating Engine
    • MSN adCenter
  • Wrap-up

 

I’ve also found that it is useful to show the latest circulating version of the Ant Colony simulation I wrote over a year ago.  I found a version in Don Syme’s JAOO Talk code samples but am not sure who has been keeping it up to date.  A big thanks to whoever that person is.  It provides some sweet eye candy to dull the bitterness of the technical Functional Concepts section.

 

Planned (and Past) “F# and You!” Locations:

11/10/2009 – CT .NET Developers Group (Farmington, CT)
10/28/2009 – Technology Users Group (Charleston, SC)
10/21/2009 – New Hampshire .NET User Group (Nashua, NH)

See the new Other Events section of our F# User Group site for information on other upcoming F# talks.

 

Downloads:

Latest Slides
VS2008 Code Samples

F# Discoveries This Week 12/06/2009


Published to Rick Minerich's Development Wonderland by Richard Minerich December 06, 2009 23:14

We have a great selection of links this week with topics including discriminated unions, equality and comparison constraints, purely functional data structures, a language performance comparison, and a couple of 1.9.7.8 compatibility tweaks for XNA and Silverlight.

Also, the first Monday of the month approaches and with it another meeting of the New England F# User Group.  Talbott Crowell will be speaking on F#’s asynchronous programming features with a beginner audience in mind. 

 

-- Events --

Talbott Crowell is Speaking at the New England F# User Group

 

-- Links --

Gordon Hogenson Explains Discriminated Unions

In this video, programming writer, Gordon Hogenson explains and gives examples of discriminated unions in F#. 

 

Brian McNamara’s Motivating F# equality and comparison constraints

By reading this blog entry, the reader should learn: what structural equality and comparison are, what is the problem that F# equality/comparison constraints solve, why this problem cannot be solved without a specific language feature, [and] when does a programmer need to care about this.

 

Julien Ortin’s Purely Functional Data Structures in F#: Leftist Heap, Finite Map and Binary Search Tree.

This post describes the F# implementation of the [leftist heap, finite map, binary search tree] from Chris Okasaki’s “Purely functional data structures”.

 

Phillip Trelford’s C++ vs C# vs F# vs Haskell

Since I first posted an F# solution to Left-Truncatable Primes, C# and Haskell have entered the frame, and although this is not a good problem for a serious comparison of languages, I think it is still interesting.

 

Steve Gilham’s F# October CTP and Silverlight 3.0

and also his F# and Silverlight 3 addendum

There's an official Silverlight 3.0 project template for Visual Studio 2008; while targetted at the May CTP, these can be used with the October CTP by opening up the .fsproj file and amending the file path for the FSharp.Core assembly.

 

Joh’s F# on the XBox 360

This article describes how to build an XNA Game Studio application using an F# library in such a way that it can be run on the Xbox 360. It applies to the current version of F#, 1.9.7.8.

 

Talbott Crowell’s Adding Parallel Extensions to F# for VS2010 Beta 2

Matthew Podwysocki comes to the rescue with his blog post Adding Parallel Extensions to F#.  Unfortunately his code example doesn't work in VS 2010 Beta 2 (his post was from February.)

F# Discoveries This Week 01/13/2010


Published to Rick Minerich's Development Wonderland by Richard Minerich January 13, 2010 16:59

Back again this week with a fresh batch of F# Posts, Videos and Events.  I’ve been enjoying Matthew Podwysocki’s “Much Ado About Monads” series quite a lot.  They are well worth checking out for beginner and advanced alike.

 

Events

If you would like to see your event here, send me an email via the link at the top of the page.

Rick Minerich - Charleston SC Technology Users Group on the 27th of January (check out the awesome flier)

Steffen Forkmann - Frankfurt .NET Usergroup on the 21st of January

 

Posts

Don Syme’s Async and Parallel Design Patterns in F#: Parallelizing CPU and I/O Computations

One simple way to write parallel and reactive programs is with F# async expressions. In this and future posts, I will cover some of the basic ways in which you can use F# async programming - roughly speaking, these are design patterns enabled by F# async programming.

 

Don Syme’s F# Interactive Tips and Tricks: Visualizing Data in a Grid

The demos in my F# talks use a number of coding snippets to acquire, generate and display data interactively. Some of these little snippets are not so well known, but they are useful :-)

 

Don Syme’s F# Interactive Tips and Tricks: Formatting Data using AddPrinter, AddPrintTransformer and %A in sprintf/printf/fprintf

Here are some tips and tricks for formatting data in F# Interactive. This is not meant to be a comprehensive guide, just enough to get you started. Please let me know if you need more examples.

 

Matthew Podwysocki’s Much Ado About Monads – Reader Edition

So, our ultimate goal would be instead to have our environment set once and then read from it implicitly.  We still want to keep what we have here in terms of our script, but change the underlying mechanism for how it happens.

 

Anton Schwaighofer’s SkillsMatter.com talk: F# and Units-of-measure for Technical Computing

I will start by giving an introduction to units-of-measure and their implementation in F#. I'll work through smaller and larger code examples that make use of units-of-measure.

 

Nancy Strickland’s MSDev.com Training Session on F#

Visual Studio 2010 includes a new programming language, F#. This session explains and provides a walk-through demonstration of basic programming in F#.

 

Julien Ortin continues his series on Purely Functional Data Structures with a Lazy Pairing Heap, a Real-time Queue and Bottom-up Merge Sort

This post describes the F# implementation of the <insert data structure here> from Chris Okasaki’s “Purely functional data structures”.

 

Mauricio Scheffer’s Translating Haskell to F# and other considerations

BUT, syntactic similarity does not imply that they're semantically the same. There's a fundamental difference from the original code: Haskell is a lazy language, while F#, coming from the ML-family, does eager evaluation.

 

Mark Needham’s F#: Refactoring to pattern matching

I was looking through some of the F# code I've written recently and I realised that I was very much writing C# in F# with respect to the number of if statements I've been using.

 

Mark Needham’s F# attempt at Roy Osherove’s TDD Kata

As I've mentioned in a few of my recent posts I've been having another go at Roy Osherove's TDD Kata but this time in F#.

 

Holoed’s Weak Subscribe to an IObservable Source

A code snippet describing how to subscribe to an event via a weak reference.

 

Erik Schulz’s Resource Pool in F#

I’ve been toying around with F# recently, it’s good to see an example that you can easily compare and contrast with the C# version.

 

Ade Miller’s Gotchas: Common Traps for the F# n00b

I spent a bunch of time over the holidays getting to know F# a bit better. I think I now consider myself to be truly dangerous with it.  A couple of things which repeatedly bit me as I stumbled through learning F# as a n00b.