Muck and Brass – Chas Emerick

These are the stories that have been posted to the Muck and Brass – Chas Emerick blog.

All my methods take 316 arguments, and I like it that way


Published to Muck and Brass – Chas Emerick by Chas Emerick December 31, 2009 11:53

News Item edited by Chas Emerick

Of course, I'm not so daft as to say that, but:

If you use an imperative programming language that provides for mutable state, that's what you are saying.

For some background, I read this article yesterday, which contains this choice passage (emphasis mine):

Imagine you've implemented a large program in a purely functional way. All the data is properly threaded in and out of functions, and there are no truly destructive updates to speak of. Now pick the two lowest-level and most isolated functions in the entire codebase. They're used all over the place, but are never called from the same modules. Now make these dependent on each other: function A behaves differently depending on the number of times function B has been called and vice-versa.

In C, this is easy! It can be done quickly and cleanly by adding some global variables. In purely functional code, this is somewhere between a major rearchitecting of the data flow and hopeless.
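
To make that scenario concrete, here is a minimal sketch of the global-variable approach, written in Java with static fields standing in for C globals. The class and method names (`Interdependent`, `a`, `b`) and the arithmetic are invented purely for illustration; the only point is that each function's result depends on how often the other has been called.

```java
// Hypothetical example: nothing here comes from the quoted article.
final class Interdependent {
    // Shared mutable state, the Java stand-in for C global variables.
    static int aCalls = 0;
    static int bCalls = 0;

    // a's result depends on how many times b has been called so far.
    static int a(int x) {
        aCalls++;
        return x + bCalls;
    }

    // b's result depends on how many times a has been called so far.
    static int b(int x) {
        bCalls++;
        return x * (aCalls + 1);
    }

    public static void main(String[] args) {
        System.out.println(a(10)); // b has not been called yet
        b(1);
        b(1);
        System.out.println(a(10)); // same argument, different answer
    }
}
```

Running `main` prints 10 and then 12: the same call, `a(10)`, gives two different answers because of calls to `b` made somewhere else in the program.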

A comment on proggit very concisely summed up just how crazy the above passage is:

Considering that one of the major reasons to use FP is so that you don't have such inter-dependencies, it's odd to point that out as an issue.

The whole problem with imperative programming is that state gets threaded everywhere, and you can't look at any function individually and know how it will behave. I won't even go into problems associated with concurrency, where state becomes incredibly difficult to reason about if you allow that sort of thing.

I really appreciated the notion of imperative programming "threading state everywhere". Let's drive the point home, though.

Hey, I'm just the messenger

Consider a method you might see in any Java application (I oh-so-love the jvm, so I get to pick on Java), but the same sort of thing applies in C, C++, C#, python, ruby, perl, et al.:

public void doSomething (String arg1, int arg2, FooBar arg3) throws IOException;

Simple enough, right? Hey, we're programming, life is good. But, what if you saw a signature like this:

public void doSomething (String arg1, int arg2, FooBar arg3, .....,
                         String arg316) throws IOException;

316 arguments to a method (which I don't think is actually possible in the jvm, but bear with me)? "That's absurd!", you'd say. The problem, of course, is that the 3-arg doSomething actually has far more arguments than its signature implies:

The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked.

So, if you have 313 other variables in your program, that 3-arg doSomething is functionally (ha!) operating over 316 arguments.
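
Here's a small sketch of what those hidden arguments look like in practice. Everything in it is made up for illustration (the `HiddenArguments` class, the `mode` and `log` variables, the `describe` methods); it just contrasts a method whose signature understates its inputs with one whose signature lists all of them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: the names and behaviour are assumptions for illustration.
final class HiddenArguments {
    // Two of the "313 other variables": mutable and reachable from any method.
    static String mode = "quiet";
    static final List<String> log = new ArrayList<>();

    // The signature claims two inputs; the body also reads mode and log.
    static String describe(String name, int count) {
        String line = mode.equals("loud") ? name.toUpperCase() + "!" : name;
        line = line + " x" + count + " (#" + log.size() + ")";
        log.add(line); // ...and it changes what the next call will see
        return line;
    }

    // The same computation with every input in the signature and no shared state.
    static String describePure(String name, int count, String mode, int priorCalls) {
        String line = mode.equals("loud") ? name.toUpperCase() + "!" : name;
        return line + " x" + count + " (#" + priorCalls + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("foo", 2));
        mode = "loud"; // some unrelated code flips a variable...
        System.out.println(describe("foo", 2)); // ...and the same call now differs
        System.out.println(describePure("foo", 2, "quiet", 0));
    }
}
```

The two `describe("foo", 2)` calls return different strings, while `describePure` returns the same string every time it is given the same four arguments. With only two extra inputs the honest signature is already twice as long; scale that up to every variable the method can reach and you arrive at the 316-argument signature.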

Would you ever intentionally write a method signature that takes 316 arguments? Would you use any library that contained such a function signature? No? Then why are you using tools that force such craziness upon you?

Postscript

Of course, there is a place for mutable, imperative programming. The fellow who wrote the blog post to which I linked above appears to work on games, one of the few places where one could unapologetically use an imperative programming language with mutable state. Update: Looks like the state-of-the-art in game programming is heading towards FP languages more than I thought. Thanks to this comment, here's a LtU thread, with slides, about the guys who wrote Gears of War and the Unreal engine recommending FP as the future of game development.

However, we need to collectively get past encouraging other software developers – the vast majority of whom do not have the particular requirements of game, systems, or embedded development – to inflict the pain of imperative languages and mutable state upon themselves, especially given the concurrency challenges that lie ahead (never mind the general problems such environments present, as I argue above). The languages are ready, the runtimes are widespread...let's stop doing it wrong.

Mavenization of NetBeans Platform projects


Published to Muck and Brass – Chas Emerick by Chas Emerick December 28, 2009 21:24

News Item added by Chas Emerick

Over the past month, I've been gradually porting all of our projects' builds from Ant to Maven. Everything's gone swimmingly, especially given the excellent clojure-maven-plugin, which allowed me to cleave off all of our comparatively complicated ant scripts for building and testing Clojure code. One part that did require some work was the porting of the builds associated with our NetBeans Platform-based applications – so, I thought I'd post a couple of hints to help others over the rough spots.

A plug for NetBeans
We've had a good deal of success in using the NetBeans Platform recently (often referred to as the NB RCP). It provides a metric ton of fairly high-quality plumbing for thick-client applications, and definitely saved our asses in a couple of key areas insofar as we've been able to reuse large pieces of the Platform, essentially unchanged, to meet critical new requirements. Of course, that's why we chose to use it in the first place.

Extemporaneous and Lengthy Background

To be clear, the rough spots in question aren't associated with the actual Mavenization of the NetBeans Platform-based projects – that's a relatively straightforward affair, with archetypes available in the NetBeans IDE to get one started, and very well-documented goals available, all provided by the NBM Maven Plugin. Given an existing ant-based build process, I found the actual porting of the build fairly straightforward.

The dicey part had to do with having a set of Platform artifacts available to build against. Under the ant-based build regime, it was common for those building on top of the NB RCP to keep a set of RCP artifacts available in every build environment. This was always a pain (for potentially-obvious reasons that I don't really want to get into now), and the general non-composability of the ant-based build process drove NB RCP users (and the Platform developers themselves) to extreme lengths of hacking to get stuff working properly. (BTW, just so everyone knows, I'm not picking on Fabrizio here – he's just the one who appears to have pushed the envelope more than anyone else vis-à-vis improving the composability of the ant-based RCP build process.)

One great thing about the NBM Maven Plugin is that it cuts this knot quite elegantly, making it possible to treat NetBeans Modules (NBMs) as first-class citizens within the maven world. So, if you have a maven repository that contains NBMs (like this one hosted by the NetBeans folks themselves), you can readily add NBM dependencies just like you would jar dependencies from maven central:

<dependency>
   <groupId>org.netbeans.api</groupId>
   <artifactId>org-openide-nodes</artifactId>
   <version>${netbeans.version}</version>
</dependency>

...and the NBM plugin will take care of using those NBM dependencies as appropriate:

  • injecting the NBMs' associated jars into the project's compile classpath
  • adding the NBMs as runtime dependencies of whatever NBM(s) your project/application produces
  • adding the NBMs to the (optional) "update site" associated with your NB RCP application (making remote updating of that application in the field trivial)

And, to complete the cycle, the nbm-maven-plugin provides an nbm packaging type, so that you can build NBMs independently, deploy them as you'd expect, and then compose them without any ceremony into however many NB RCP applications you'd like. No suite-chaining, no special platform or cluster artifacts in every build environment, nothing at all different from what one is used to in any other jvm/maven environment.

The Rough Spot

All of the above works flawlessly (at least it has for me in my ~month of usage). The key prerequisite though, is having access to a repository that contains the Platform NBMs that you'd like to use. The repository that I linked to above does not track NetBeans releases in lockstep (e.g. at the time of this posting, the http://bits.netbeans.org/maven2 repo has NBMs from NetBeans v6.5 and v6.7, but not v6.7.1, or the recently-released v6.8). The solution is to populate your own maven repository with those NBM artifacts.

Deploying NetBeans Platform artifacts to your own repository

This might have been a tedious process, were it not for another handy goal from the NBM Maven Plugin, populate-repository, which will push all of the artifacts produced by a NetBeans Platform build (the NBMs themselves, their sources, javadoc, and appropriate non-NetBeans dependency metadata) into your own maven repository.

There's a fair bit of configuration and setup that goes into this though. A HOWTO is provided by the nbm-maven-plugin project, but there are a number of things that it leaves unspoken. So, here's a dump of what I did to successfully populate a Nexus maven repo with a full set of NetBeans Platform artifacts:

  1. Pull the NetBeans Platform sources from the associated hg repo (I used the release68 repo, as we're targeting v6.8 of the NB RCP now). It appears that populating your repo with NB RCP artifacts from a binary download is possible, but then you'll not have the associated javadoc, source artifacts, etc.
  2. Build the entire project – I'm sure it's possible to restrict the build to certain clusters, but I don't see any reason to optimize this process since doing so only saves a little bit of disk.
    1. You must set your JAVA_HOME environment variable to point to a Sun JDK, especially in linux environments that often come with non-Sun JDKs (I'm looking at you, Ubuntu, with your cute gcj JDK). Not doing this will result in very strange compilation errors.
    2. You must set your ANT_OPTS environment variable to specify a higher-than-default maximum heap (export ANT_OPTS=-Xmx1024m worked for me).
    3. Within the top-level of your NetBeans Platform source checkout, run ant; ant nbms build-source-zips build-javadoc – this will build everything you care about in order to populate your maven repo.
  3. You want to have the NBMs in your repository to have appropriate dependency relationships established with third-party artifacts, right? Achieving this is easy if you have Nexus:
    1. unzip sonatype-work/nexus/storage/central/.index/nexus-maven-repository-index.zip somewhere (I used /tmp/nexus-index).
    2. set the nexusIndexDirectory property in the last step to the path where you unzipped central's index; the nbm-maven-plugin will search that Lucene index to find dependencies referred to within the Platform's NBMs
  4. set MAVEN_OPTS to specify a higher-than-default maximum heap (export MAVEN_OPTS=-Xmx512m worked for me). I'm not sure why this would be required, but I got OutOfMemoryErrors with max heap set to anything less than 512MB. Perhaps searching the maven central repo index is what pushed allocation so high.
  5. Make sure you don't have a pom.xml in your current directory. Bad things will happen.
  6. Decide on a version number for the deployed artifacts, and use it as the value of the forcedVersion property. I used RELEASE68 to go along with the pattern established at http://bits.netbeans.org/maven2; 6.8 makes more sense to me, but if/when the NetBeans maven repo comes up to date with the NetBeans release schedule, sticking with their convention will allow us to use that authoritative repository with no changes to our projects.
  7. Assuming you're deploying to a release repository, make absolutely sure that you've (temporarily) enabled redeployment for that repository! nbm-maven-plugin deploys some NBMs multiple times (presumably while traversing various dependency graphs), and not enabling redeployment will result in errors (400 errors from Nexus, specifically – I can't say what might happen with different repository managers).
  8. Now for the big finish:
    mvn org.codehaus.mojo:nbm-maven-plugin:3.1:populate-repository \
        -DforcedVersion=RELEASE68 \
        -DnetbeansInstallDirectory=nbbuild/netbeans \
        -DnetbeansSourcesDirectory=nbbuild/build/source-zips \
        -DnexusIndexDirectory=/tmp/nexus-index \
        -DnetbeansJavadocDirectory=nbbuild/build/javadoc \
        -DnetbeansNbmDirectory=nbbuild/nbms \
        -DdeployUrl=<nexus_repo_url> \
        -DskipLocalInstall=true

Whew! Let that sucker run for a while, and you should be left with a maven repository fully populated with NetBeans Platform artifacts.

Sane web development with Compojure, Jetty, and Maven


Published to Muck and Brass – Chas Emerick by Chas Emerick January 08, 2010 13:12

News Item added by Chas Emerick

I find myself slipping back into web development in the new year. I've known this was coming for some time, so I've had a fair chance to carefully choose my weapons:

What has really tied this all together is Maven (and a couple of plugins for it), which has enabled me to fill in a couple of gaps in what is otherwise the most pleasant web development environment I've ever used (where Pylons was the prior champ, FWIW).

The biggest gap is in automatic application reloading/redeployment – in concrete terms, when I save a Clojure source file, my application should be reloaded nearly immediately, thereby avoiding any code-build-deploy cycle. To be precise, this capability is built into Jetty (as it is in many other Java-based app servers). The question is, how to most readily utilize it.

I came across this post by Jim Downing, which describes how to set up a Maven project for a Compojure application, enabling development-mode app reloading using the maven-jetty-plugin (the formatting on that post appears to have degraded since it was published; you can check out the project described in the post here). This certainly appears to fit the bill; unfortunately, the setup that Jim describes there doesn't quite work for me – when I save a source file, the application is automatically redeployed, but no changes are picked up.

Thankfully, the fix is easy. Below is the relevant section of my pom.xml, configuring maven-jetty-plugin to add my Clojure source root as an extra classpath element. This allows Clojure, running in the jetty application server, to find and load any Clojure source files that are newer than their AOT-compiled counterparts in the usual target/classes directory (note the webAppConfig/extraClasspath elements):

<plugin>
    <groupId>org.mortbay.jetty</groupId>
    <artifactId>maven-jetty-plugin</artifactId>
    <version>6.1.15</version>
    <configuration>
        <contextPath>/</contextPath>
        <webAppConfig>
            <extraClasspath>src/main/clojure</extraClasspath>
        </webAppConfig>
        <scanIntervalSeconds>5</scanIntervalSeconds>
        <connectors>
            <connector implementation="org.mortbay.jetty.nio.SelectChannelConnector">
                <port>8080</port>
                <maxIdleTime>60000</maxIdleTime>
            </connector>
        </connectors>
        <scanTargetPatterns>
            <scanTargetPattern>
                <directory>src/main/clojure</directory>
                <includes>
                    <include>**/*.clj</include>
                </includes>
            </scanTargetPattern>
        </scanTargetPatterns>
    </configuration>
</plugin>

With that, I'm just a mvn jetty:run away (or, really, a single click away in NetBeans) from having a development process identical to paster serve --reload, with the added benefit of Clojurey goodness.

♫The more you know...♬♪

(Apologies to those who aren't familiar with American pop culture.)

If you want to compile Clojure code (and really, if you're involved in a project of any size or importance, you should be, if only to avoid forcing Clojure to generate bytecode at runtime, which will slow down the sort of rapid development enabled by automatic app redeployment as described above), do me a favor and use clojure-maven-plugin. (The post I reference above manually invokes the Clojure compiler using ant's exec task, but that was what you had to do back in July 2009.) It's a great piece of kit, and additionally serves as a perfect gateway drug to Maven – which, despite the controversy, and my own quibbles with various aspects of it, will eventually save your bacon in any larger project.

Western Mass Developers Meet at Snowtide!


Published to Muck and Brass – Chas Emerick by Chas Emerick March 13, 2009 22:23

News Item added by Chas Emerick

I just wanted to say ‘thank you’ to everyone who came to last night’s Western Mass. Developers’ meeting.  Further, many thanks to those who helped out in one way or another, especially Miles and Doug for running for the D’Angelos, Doug for bringing the ice and cooler, and Joe and Lou and Greg and Brian and everyone else who helped to set up or tear down.  I think everyone pitched in, which made it all work out pretty smoothly.

FYI, we collected $170 last night.  That covered all of our food expenses and then some — I think once I tally up everything, we’ll have a surplus of ~$40 (and we have a bunch of generic supplies that we can put to use in the future).  Thank you very much to everyone who pitched in in this way, too.  Hopefully we can keep that pot flowing.

Some highlights from the meeting, and random thoughts of mine, in no particular order:

  • Doug gave what sounded like a rousing talk about the PHP templating system that he conjured up.  It seemed like most of the group really enjoyed that.
  • I generally don’t touch PHP, so I hung back and talked about entrepreneurship and software business models with Lou, Maria, Michael, and….darnit, I can’t recall the name of the other gentleman that joined us.  Sorry, man, I can be bad with names at times.  Keep coming to the meetings, and I’ll straighten out, I promise.
  • In the second time slot, I instigated a discussion about the current state of rich client platforms, through the lens of some particular requirements that we have for current/future projects.  That turned out to be pretty entertaining and productive, with a big chunk of time dedicated to people being impressed by the surface features of Titanium/Appcelerator.  That may be a good topic for future blog posts if we end up really digging into it.
  • Just about everyone was down on Adobe Flex/AIR as being very unpleasant from an end-user perspective (widgets not behaving as one would expect, etc).  I unintentionally sort of ended up trashing JavaFX – or more specifically, the current lack of an integration story between Swing and JavaFX, as well as the oddities of JavaFX script.  In the clear light of day, I feel like I should probably give it a closer look, simply because of our established JVM codebase.
  • There was widespread speculation that a “shadow group” got together at Panera, despite all of the chatter and announcements about the change in venue.  Maybe next time (if there’s a next time here @ Snowtide), someone could swing by Panera and gather up those who aren’t as plugged-in to the group’s chatter.
  • Gerard walked away with Managing Humans, graciously provided to us by Apress’ developer group book program.
  • Will and I ended up holding on to the Terracotta book (also from Apress), though we promise to pass it on to Miles when we’re done!

It seemed like everyone had a good time and that most were pretty happy with the results compared to the usual Panera experience, but I’m clearly biased.  One way or the other, shout out what you liked and didn’t like (either on the mailing list or in the comments below).

FWIW, I’m happy to have Snowtide continue hosting the group’s meetings if people enjoyed the result.  If there’s a next time, get the shared conference room, and see how that works out.

Again, thanks to everyone who came!

Venture capitalists are entertaining, but please don’t take them too seriously


Published to Muck and Brass – Chas Emerick by Chas Emerick March 05, 2009 23:23

News Item added by Chas Emerick

I often enjoy the Entrepreneurial Thought Leaders podcasts, which deliver talks from the Stanford Technology Ventures Program.  In particular, it is often useful to glean an idea or moral from the war stories told by some of the weathered entrepreneurs that the STVP invites to talk about their past or current companies.

Every now and then, though, a podcast lands in my iPod that involves a roundtable of venture capitalists.  VCs are often very dynamic, engaging people that are entertaining to listen to, but just as often, they say the most amazingly absurd things.  A recent roundtable podcast (entitled What is the Next Big Thing) really pegged the absurdity meter, though.

Addressing a gathering of Stanford students, alumni, and associates, three venture capitalists, Tony Perkins, Tim Draper, and Michael Moe, discussed the recent economic conditions, with the bottom-line message that it is in “times like these”, when markets and economies look their bleakest, that the most successful and impactful businesses are often forged.  That’s an oldie but goodie – so far, so good.

Things go off the rails around the 13:20 mark, though.  One of the three speakers (I believe it was Tony Perkins, but these things are hard to be sure of in an ensemble podcast) relayed how Marc Andreessen (co-founder of Netscape and now also a part-time investor) was talking with Charlie Rose about how the New York Times should just kill their paper version.  That’s no huge new idea, but that got Tony off on a slight tangent that led him straight into the weeds (bold emphasis mine):

A lot of the whole [dot-com] bubble period was based upon a vision of the Internet steamrolling the way people do business and creating what was then called the “New Economy”.  My theory right now is that all of those things we talked about that were going to happen, like the end of television, the end of newspapers, all that stuff that we poured a bunch of money in because we thought it was going to happen ten years ago is actually happening now.

So a lot of the destruction in the market, a lot of the jobs that are being destroyed, are jobs that are being steamrolled — a lot by the Internet — but increasingly by the “green tech” movement because entrepreneurs are looking at how we do everything, and they’re saying “how can I do that same thing in a way that is better for the environment?”.  That’s bringing the Silicon Valley mentality into the whole green space, which is super-exciting.

The reason I share your optimism is because we are the future.  Silicon Valley is the future; a lot of the jobs we’re seeing being destroyed are never going to come back, but it is our world that is causing the destruction, and therefore is going to be the one that creates the jobs.

Hey, I’m essentially a nobody, so maybe Tony’s really got the inside track, and I’m not seeing the forest for the trees.  But wow, the U.S. economy lost 598,000 jobs in January, including:

  • 22,000 cut from Caterpillar
  • 4,500 from Kodak
  • 19,800 cut from Pfizer
  • 5,000 cut from Microsoft (the first mass layoff in that company’s history)
  • 2,400 cut from EMC
  • 13,500 cut from Alcoa

Etc., etc.  Sorry Tony, these job losses aren’t due to Silicon Valley and VC-backed internet and green-tech companies owning the world and replacing Caterpillar’s earth movers and minimizing the need for Alcoa’s aluminum.  There are a lot of theories about why the economy is what it is of late (lending practices, creative derivative strategies, poor Federal Reserve policy, etc.), but honestly it never occurred to me that I’d come across anyone with the chutzpah to say that recent shrinkage (and reversal) of economic growth and the attendant job losses are due to internet and green-tech companies “steamrolling” the Old Economy [1].

Even more crazy to me is the notion that Silicon Valley is going to be singularly responsible for reinvigorating the economy.  It certainly has a role to play, and has had tremendous impact in the past, but from where I sit, Silicon Valley has been far too busy over the past couple of years building Web 2.0 trinkets to be ready with any kind of game-saver anytime in the near future.  Thinking (and saying) otherwise is good marketing within that particular echo-chamber, but it likely sounds like simple self-aggrandizement anywhere else.  (Hopefully there’s a stealth-mode clean energy startup that will prove me wrong on this point.)

I don’t mean to pick on Tony here.  Lots of other VCs have said similar things; it’s just that in this case, the usual VC rhetoric happens to bump up pretty hard into real-world facts and real-world struggle.  Big-picture notions about how entrepreneurship and innovation are the keys to building a stronger economy and a better world are good, but watch out for the odd notions that are born out of the VC bubble (which seems to have its effect upon almost everyone that steps inside for a time).

My general point is simply that VCs say the darnedest things, and especially as it’s become clear that venture capital isn’t at all required (or even desirable) in many situations, one needs to be careful about how much of the VC worldview one takes to heart.

Footnotes:

[1] Wow, typing “Old Economy” right there reminded me of back-in-the-day when Wired was raving on and on about the new economy and introduced The Wired Index consisting of 40 New Economy companies.  That’s classic entertainment.

Why MIT now uses python instead of scheme for its undergraduate CS program


Published to Muck and Brass – Chas Emerick by Chas Emerick March 24, 2009 12:34

News Item added by Chas Emerick

This week, I find myself lucky enough to be at the International Lisp Conference at MIT in Cambridge, MA.  I won’t get into why I’m here right now, for those of you who might be surprised.  The purpose of this post is simply to paraphrase what Gerald Jay Sussman, one of the original creators of Scheme, said yesterday in a brief impromptu talk about why the computer science department at MIT had recently switched to using python in its undergraduate program.  When it was announced, this change was widely panned by people from various disciplines all across the programming and computing world, so it seems worthwhile to try to document what Prof. Sussman said.

(The impromptu talk happened much after Monday’s formal talks and presentations, and I don’t think that anyone was recording Prof. Sussman’s remarks.  If anyone does have a recording, by all means, post it, and I’ll link to it here — and probably just drop my paraphrasing.)

This is all from memory, so I’ll just apologize ahead of time for any errors or misinterpretations I propagate. If anyone has any corrections, by all means, leave a comment (try to keep your debate reflex in check, though).  In a couple of places, I’ve added notes in italics.  Just to keep things simple and concise, the following is written in first-person perspective:

When we conceived of scheme in the 1970’s, programming was a very different exercise than it is now.  Then, what generally happened was a programmer would think for a really long time, and then write just a little bit of code, and in practical terms, programming involved assembling many very small pieces into a larger whole that had aggregate (did he say ‘emergent’?) behaviour.  It was a much simpler time.

Critically, this is the world for which scheme was originally designed.  Building larger programs out of a group of very small, understandable pieces is what things like recursion and functional programming are built for.

The world isn’t like that anymore.  At some point along the way (he may have referred to the 1990’s specifically), the systems that were being built and the libraries and components that one had available to build systems were so large that it was impossible for any one programmer to be aware of all of the individual pieces, never mind understand them.  For example, the engineer who designs a chip, which now has hundreds of pins, generally doesn’t talk to the fellow who’s building a mobile phone user interface.

The fundamental difference is that programming today is all about doing science on the parts you have to work with.  That means looking at reams and reams of man pages and determining that POSIX does this thing, but Windows does this other thing, and patching together the disparate parts to make a usable whole.

Beyond that, the world is messier in general.  There’s massive amounts of data floating around, and the kinds of problems that we’re trying to solve are much sloppier, and the solutions a lot less discrete than they used to be.

Robotics is a primary example of the combination of these two factors.  Robots are magnificently complicated and messy, with physical parts in the physical world.  A robot doesn’t just move forward along the ground linearly and without interruption: the wheels will slip on the ground, the thing will get knocked over, etc.

This is a very different world, and we decided that we should adjust our curriculum to account for that.  So, a committee (here, Prof. Sussman peaked his hands over his head, which I interpreted to indicate pointy-headedness) got together and decided that python was the most appropriate choice for future undergraduate education.  Why did they choose python?  Who knows, it’s probably because python has a good standard library for interacting with the robot.

That is my best paraphrasing of Prof. Sussman’s remarks.  I spoke with him briefly earlier today, primarily to ask his permission for me to post this sort of first-person paraphrasing; he replied: “Sure, as long as you paraphrase me accurately.”  Hopefully I succeeded; I’ll mention again my solicitation for corrections in the comments.

As a short addendum, while I had Prof. Sussman’s ear, I asked him whether he thought that the shift in the nature of a typical programmer’s world minimizes the relevancy of the themes and principles embodied in scheme.  His response was an emphatic ‘no’; in the general case, those core ideas and principles that scheme and SICP have helped to spread for so many years are just as important as they ever were.  However, he did say that starting off with python makes an undergraduate’s initial experiences maximally productive in the current environment.  To that, I suggested that that dynamic makes it far easier to “hook” undergrads on “computer science” and programming, and retaining people’s interest and attracting people to the field(s) is a good thing in general; Prof. Sussman agreed with that tangential point.

Working with git submodules recursively


Published to Muck and Brass – Chas Emerick by Chas Emerick September 28, 2009 15:45

News Item edited by Chas Emerick

Git submodules are a relatively decent way to compose multiple source trees together, but they definitely fall short in a number of areas (which others have discussed at length elsewhere).  One thing that immediately irritated me was that there is no way to recursively update, commit, push, etc., across all of one's project's submodules.  This is something I ran into immediately upon moving to git from svn some months back, and it almost scared me away from git (we used a lot of svn:externals, and now a lot of git submodules).

Thankfully, the raw materials are there in git to work around this.  (I've since noticed a bunch of other attempts to do similar things, but they all seem way more complicated than my approach...maybe it's the perl? ;-))

Here's the script we use for operating over git submodules recursively:

git-submodule-recur.sh
#!/bin/sh

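# "init" is shorthand for "submodule update --init";
# any other arguments are handed to git unchanged.
case "$1" in
        "init") set -- submodule update --init ;;
esac

# "$@" keeps each argument intact (e.g. a quoted commit message).
git "$@"
git submodule foreach "$0" "$@"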

Throw that into your $PATH (I trim the .sh), chmod +x, and git submodules become pretty pleasant to work with.  All this is doing is applying whatever arguments you would otherwise provide to git within each submodule, and their submodules, etc., all the way down.  The one special invocation, git-submodule-recur init, just executes git submodule update --init in all submodules.

So, want to get the status of your current working directory, and all submodules?

git-submodule-recur status

Want to commit all modifications in cwd and all submodules?

git-submodule-recur commit -a -m "some comment"

Want to push all commits?

git-submodule-recur push

You get the picture.
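To see what the script expands to without touching a repository, here is a minimal dry-run sketch. The recur_plan function is a hypothetical helper (it is not part of the script above), and it assumes the script is installed as git-submodule-recur; it only prints the two commands that would be run at each level of the tree.

```shell
#!/bin/sh

# Hypothetical dry-run helper: prints the commands git-submodule-recur
# would run at one level of the tree, without invoking git at all.
recur_plan() {
    case "$1" in
        "init") CMD="submodule update --init" ;;
        *) CMD="$*" ;;
    esac
    # Step 1: the command runs in the current repository.
    echo "git $CMD"
    # Step 2: the script is re-run inside each submodule,
    # where both steps repeat for that submodule's own submodules.
    echo "git submodule foreach git-submodule-recur $CMD"
}

recur_plan init
recur_plan push
```

Each call prints a pair of lines: the command for the current repository, then the foreach invocation that carries the same arguments down into the submodules.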

Note
Starting in git 1.6.5, git submodule will grow a --recursive option for the foreach, update and status commands. That's very helpful for the most common actions (and critical for building projects that have submodules in CI containers like hudson), but git-submodule-recur definitely still has a place IMO, especially for pushing.
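For comparison, the built-in forms mentioned in the note look roughly like the following. The builtin_recursive function is a hypothetical wrapper that only prints command lines; the --recursive option on update and foreach is the one described in the note. One difference worth keeping in mind: git submodule foreach visits submodules only, so unlike git-submodule-recur it does not also run the command in the top-level repository.

```shell
#!/bin/sh

# Hypothetical helper: prints the built-in command line that roughly
# corresponds to a git-submodule-recur invocation (git 1.6.5 or later).
builtin_recursive() {
    case "$1" in
        # Initialize and update every submodule, all the way down.
        "init") echo "git submodule update --init --recursive" ;;
        # Run an arbitrary git command in every nested submodule.
        *) echo "git submodule foreach --recursive git $*" ;;
    esac
}

builtin_recursive init
builtin_recursive push
```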

This script has saved me a *ton* of typing over the past months.  Hopefully, it finds a good home elsewhere, too.

Edited 2009/09/28
I tweaked the git-submodule-recur script to quote the path to the script ("$0" instead of $0); this became necessary when I dropped the script into C:\Program Files\Git\bin in our Windows-hosted Hudson environment.

Snowtide Informatics Welcomes Ben Fry (of Processing fame) to Northampton


Published to Muck and Brass – Chas Emerick by Chas Emerick April 28, 2009 13:03

News Item added by Chas Emerick

Next Tuesday, the 5th of May @ 6:30PM, Snowtide Informatics and Atalasoft will be hosting Ben Fry, creator of the Processing programming language and environment and author of Visualizing Data from O’Reilly, at Snowtide’s offices in Northampton, MA.

(This hasn’t been a secret or anything (for good reason!), but I thought I’d put out an announcement post.)

Dr. Fry will be presenting “Computational Information Design” – a mix of his work in visualization and coding plus a quick introduction to the Processing language and environment.  Processing has had a huge impact on the field of data visualization, and Dr. Fry’s presentation will no doubt be enlightening for anyone who engages in data visualization at any level.

There will be refreshments.  There’s a Google Maps link on this page if you need directions; please note that the presentation will be held in the second-floor conference room, Suite 234.

Small afterthought: the three avid readers of my blog may recall that a similar event was held a year ago, when we hosted Rich Hickey, creator of the Clojure language.  I think we (meaning Snowtide, Atalasoft, the Western Mass. Developer’s Group, et al.) have a pretty unique combination in this area of outrageously talented people with a collectively broad set of experience and specialties, and a relatively intimate environment where ideas or presentations can be fully fleshed out with lively feedback from everyone involved.  I think there’s some potential to build this foundation up into something very worthwhile; perhaps a regular flow of software wizards to give talks, show off their newest ideas, and recruit evangelists (zealots? ;-)).  Something to think about anyway…

Java is dead, but you'll learn to love it


Published to Muck and Brass – Chas Emerick by Chas Emerick October 01, 2009 14:22

News Item added by Chas Emerick

A favorite hobby-horse among various programming-related communities is to talk about why "Java is dead", and further, that programmers working in the Java ecosystem should really look for greener pastures elsewhere.  You see these sorts of posts pop up on proggit, for example, often enough for it to get old.  That's a lot of hot air, with plenty blowing in the other direction from various folks that have been pushing hard for significant improvements and changes to Java. Both sides are wrong, though, because as a result of its success and a series of historical accidents:

Java-the-language is dead.
Get over it, and realize that because of that fact, you'll probably come to depend upon Java more than you ever thought possible.

The JVM is probably one of the most vibrant platforms for developing new programming languages there is, in part because of the status of Java-the-language.

First, let's settle the premise. In comments on one of his recent blog posts, Joe Darcy, one of the fellows that heads up Sun's management of the JVM and JDK (I'm not sure of his exact title and portfolio), said a couple of key things about the never-ending saga regarding closures in Java:

There are millions upon millions of Java developers who would have to learn about closures if they were added in the platform.

...there is far from unanimity in the Java community on the underlying choice of whether or not closures would be an appropriate language change for Java at this time.

OK, there it is, closures are never going to be added to the Java language.  Done, and done.  And if closures aren't going in, then you can surely bet that other things aren't going to make it, either.  To further make the point, Joe commented on an earlier blog post of his here, saying in reference to a question about why the Java standard libraries don't slough off deprecated APIs:

To date, we have valued continued binary compatibility with code calling the deprecated elements more than cleaning up the API.

This sort of stuff pisses a lot of people off, and leads others to propose mildly absurd things IMO, like forking the Java language into "stable" and "experimental" versions. This is a lot of wasted effort.


It seems that Sun decided long ago, through pressure from its customers and developers, that compatibility is more important than innovating at the language level. With that, managing Java and the JDK became more an exercise in stewardship than anything else. The quotes above from an authoritative source are proof-positive that this is the case.

That may make the Java language dead with regard to features, but it's hardly useless – it's simply transitioned to be the stable "systems language" for the JVM that a large swath of programmers (who Sun likely correctly identifies as being uninterested in things like closures, syntactic improvements, etc. etc.) happen to use for applications as well.

Trading off "progress" for stability bestows upon Java at least two characteristics that are shared by other systems languages:

  • screaming into the void about how improvements and changes should be made yesterday is generally pointless and irrelevant
  • knowing that the language is essentially fixed for years to come means that it fades into the background as a very useful artifact for those that want to build on top of a system with well-known characteristics

A side effect of this is that the JVM is a very fertile spot for new(er) languages, where language implementers don't have to worry about their building blocks being taken away or changed radically from year to year. At the same time, the JVM itself has been getting tweaked and tuned heavily under the covers to support non-Java languages, not the least of which is Sun's JavaFX, their entry into the post-Java JVM language fray. So, you want your fork of Java that pushes boundaries? They are many and plentiful, so go choose one, already.

The upshot of all this is that it's more likely than not that over the course of the coming years, your life (and quite likely your professional life as well, if you're involved in software) will come to rely upon Java, the JVM behind it, and many different other language stacks built on one or both of those technologies.

Of course, interop between these languages is a concern: only APIs matching Java's binary signatures are accessible by all languages, there's no standard interface for closures, there's no standard (sane) numeric tower, etc. etc. These things are frustrating if one happens to be working in a polyglot environment, but I've no doubt that necessity will draw the larger players in the JVM language space together to establish certain baselines to ensure interoperability.

In the end, we might have all been better off if the current state of affairs had arrived years ago. A steady drip, drip, drip of Java language improvements serves only to keep developers tied around what is functionally a frozen language, and away from superior alternatives (on the same JVM platform!) if they're so inclined to look up from their work. Since the state of play vis-à-vis Java-the-language is clear, maybe those that care so deeply about programming language productivity, innovation, and progress can set about enjoying the advantages of the future that Java has ensured for us all.

Activity is not Progress (or, 'Did you really need to shave that yak')


Published to Muck and Brass – Chas Emerick by Chas Emerick October 08, 2009 16:47

News Item edited by Chas Emerick

Anyone who is accountable for any sufficiently complex objective constantly has their focus pulled away from that larger goal by a thousand different fiddly tasks. Christened yak shaving some time ago by a fellow at the MIT Media Lab, the concept has become a favorite shorthand in various programming and software development circles. I only heard of it this year, but it's helped to coalesce my thinking about focused work and the relationship between activity and progress.

In particular, I think it's helpful to occasionally check one's activity using what I'd call "root objective analysis".

Many people in technical fields are familiar with root cause analysis, where a problem or failure is analyzed in such a way as to determine its root cause. There are lots of flavors of root cause analysis, with Five Whys being popular among programmers due to the Joel Effect and probably some loose association between Five Whys and the lean development/startup methodologies that are all the rage these days.

In contrast, root objective analysis runs in the "opposite direction", so to speak: for any given activity, you trace the likely causal link between that activity you're engaged in, and the progress you want to make. In short: "Is what you're doing right now getting you closer to your end goal?" 1 If you do this right, or at all, you'll go down fewer dead-ends, waste less time, and prioritize the yaks you do shave so that you get to your desired end state sooner rather than later.

There's obviously a lot of fuzziness in any kind of speculative analysis like this; if there weren't, then project management would always bring jobs in on time and within budget. However, if your work often leads you far afield of your "main line" of focus, then asking yourself the question above from time to time may help you to ensure that every yak you shave is a necessary one, as opposed to a distraction caused by mistaking activity for progress.

And Now for Something Completely Different

A yak shaving that is near and dear to my heart is the fable of the software developer and the PDF documents (not surprising, since we talk to a lot of developers who have worked with lots of PDF documents). There are many variations, but the most extreme goes something like this:

  1. Joe the developer needs to get some chunk of data into his company's database (maybe it's financial data, maybe he's working with excerpts of academic journal articles – such details are mostly irrelevant)
  2. The data is only available in PDF documents, and there's a lot of them. Thousands, perhaps millions of chunks of data in as many different PDF documents.
  3. Joe's first thought is that he needs to build a function to extract text from these PDFs so that he can get at the data he needs.  But, after...
    • reading the 1,000+ page PDF specification,
    • adding support for the 8 different versions of the spec,
    • adding support for a half-dozen encryption protocols, and
    • adding support for extracting Chinese (or Japanese, or Korean, or Icelandic with its lovely ð ("eth") character) along with the embedded fonts that go along with it
  4. ...Joe has now spent nearly a year building a one-off PDF text extraction library that (again, depending on the version of the fable) fails on 24% of the documents his company needs to access, and still doesn't run fast enough to finish in the batch window he has to work with.

Seriously, scout's honor, I've heard this story at least 5 times...and each time right before or right after the developer/company in question purchased PDFTextStream to replace their homebrew PDF library. That, my friends, is activity without progress, yak shaving at its most epic.