Paul B

These are the stories posted by Paul B.

Branching and merging in real life


Published to E-Scribe News: a programmer's blog by Paul B November 08, 2009 03:18

At work I still mostly use Subversion for version control. Its main selling points: stable, performs as expected, integrates nicely with Trac, holds all our old stuff (legacy inertia).

Note that "pain-free branching and merging" is not on that list. (And don't give me the old "branching is cheap in svn!" line. It's not about the branching, it's about the merging.) A couple years ago I started also using Mercurial and plan to eventually replace svn with it entirely. The aspect of Mercurial that made my life better recently is its support for branching and merging.

The scenario: an important internal web app (in use all day every school day) needed some significant changes on a short timetable. Normally I'd work on the app thus: edit the staging copy, commit, update the live copy. I didn't want to take that approach here. I knew that during the development window there might arise unrelated urgent change requests; I wanted to keep the new code isolated during development, but also deploy and track those unrelated urgent changes. Branching seemed like the right approach.

I could have made a full clone of the app (hg clone mainrepo newrepo). However, handling environment dependencies (web server, PythonPath, database) would have added time and fussiness to the job, and time was in short supply. So, using Mercurial's named-branches feature, I made a new branch (hg branch newstuff) right inside the fully-functional staging copy of the app. That way I was able to develop and test as usual, secure that my unproven work-in-progress was not "polluting" the current app's revision history.

To handle "unrelated urgent changes" as mentioned above, I'd:

  1. Commit any current work on the "newstuff" branch
  2. Switch to the main branch (hg update -r default)
  3. Make the urgent change, test, commit, update the live copy
  4. Switch back to the new branch (hg update -r newstuff)

It took me a couple tries to understand how branch-switching worked, but it's simple: you really are updating your working directory to a new revision, it just happens to be a revision stored in a different branch from the current one.

It was fun looking at the graph (via HgWeb) and seeing my two parallel branches with their individual commits.

The moment of truth came at the end of the day Friday, when it was time to merge the tested and complete "newstuff" code with the current live codebase. It was dead simple, and effectively instantaneous. Condensed version: hg update -r default; hg merge -r newstuff; hg ci -m "merged new stuff". Followed by: update live copy and let out a big sigh.
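The workflow above can be written down as a small Python sketch. This is only an illustration, not how the work was actually scripted: the helper names (`urgent_fix_steps`, `final_merge_steps`, `run`) are made up, and only the `hg` commands themselves come from the post. It defaults to a dry run that prints the commands without executing anything.

```python
import subprocess

MAIN_BRANCH = "default"      # Mercurial's name for the main branch
FEATURE_BRANCH = "newstuff"  # the named branch used in the post


def urgent_fix_steps(wip_message, feature=FEATURE_BRANCH):
    """Return the commands that bracket an urgent fix on the main branch."""
    before = [
        ["hg", "commit", "-m", wip_message],   # 1. commit work in progress
        ["hg", "update", "-r", MAIN_BRANCH],   # 2. switch to the main branch
    ]
    # 3. make the urgent change, test, commit, update the live copy (by hand)
    after = [
        ["hg", "update", "-r", feature],       # 4. switch back to the branch
    ]
    return before, after


def final_merge_steps(message, feature=FEATURE_BRANCH):
    """Return the commands for merging the finished branch into main."""
    return [
        ["hg", "update", "-r", MAIN_BRANCH],
        ["hg", "merge", "-r", feature],
        ["hg", "commit", "-m", message],
    ]


def run(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Dry run: shows the three merge commands without touching any repository.
run(final_merge_steps("merged new stuff"))
```

Keeping the commands as plain lists makes the sequence easy to review before anything touches a real repository.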

Summer Spam


Published to E-Scribe News: a programmer's blog by Paul B July 24, 2009 03:44

Spam is occupying more than its customary share of my attention in recent weeks. I've long had a morbid fascination with sleazy human communication (hence Purportal.com). That makes the always-relentless stream of spam, though not exactly welcome, at least interesting.

Spam volume also seems to have increased during this period. The number of spam attempts my mail server rejects per day had been steady at around 3,000 for months. Now it's back up around 5,000 or 6,000.

I run my own mail server and fight spam via greylisting, blacklisting, and other strict technical rules. This setup rejects 99+% of the spam aimed at the domains I host, but some still gets through to me. Never enough to displace real mail, but enough to keep my little hobby-interest alive. Here are some of the spam highlights of my summer so far:

  • After one too many identical HTML spams, I took the rare step of adding a custom rule to my mail server config. I started rejecting all mail with "Content-Type: text/html; charset=us-ascii". In this age of Unicode, that's turned out to be a pretty safe bet. Lots of rejections and no known false positives.

  • I received a weird email about money via Craigslist. It looked like a response to an ad -- one I'd never seen before, and certainly hadn't placed. Naturally my first thought was that the Craigslist bit was all a ruse, but a look at the message headers showed it was real: it had been sent via Craigslist in response to an ad with my email address attached. In other words, a Craigslist ad that had been created (copied verbatim from a legit ad) just to send spam to me via Craigslist's email forwarding feature.

  • I spent a few minutes trying to convince emusic.com (via email) of the fact that since I received spam at an email address that I had invented purely for use with their service, and which had never been used for anything else, this meant that somebody had poached their list from inside. They are still thinking about this silently.

  • I encountered a new form of referrer-spam. Remember referrer spam? Spammers would put their URLs in the HTTP_REFERER header when hitting blogs and other websites that had dynamically generated lists of "top referrers", then the spammers' sites would show up in those lists. Well, this week I saw an inscrutable but surely related anomaly in the headers of some requests made to one of my sites (which I was looking at for other reasons, not spam-hunting). This HTTP_REFERER header was a giant comma-delimited list of approximately 10 or 15 URLs.
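Two of the checks described in the list above can be sketched in Python. This is not the actual mail server rule, which lives in server configuration the post doesn't show; the function names are hypothetical, and the referrer check is a rough heuristic (commas can legitimately appear inside a single URL).

```python
from email import message_from_string


def is_ascii_html(raw_message):
    """True if the top-level Content-Type is text/html; charset=us-ascii."""
    msg = message_from_string(raw_message)
    return (
        msg.get_content_type() == "text/html"
        and msg.get_content_charset() == "us-ascii"
    )


def referer_url_count(referer):
    """Count comma-separated entries in a Referer value that look like URLs."""
    parts = [part.strip() for part in referer.split(",")]
    return sum(1 for part in parts if part.startswith(("http://", "https://")))
```

A request whose `referer_url_count` comes back well above 1 would match the "giant comma-delimited list" anomaly; a normal Referer header holds a single URL.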

And finally, there was the phishing message I received today. It was a fake eBay notice, with the usual "click here to resolve the dispute" links. Those links were supposed to take the victim to a fake eBay page the scammers had set up (where the victim would type in all sorts of exploitable personal information). Looking at the message's raw source, I noticed something very odd -- the pages they were trying to link to were on an FTP server in Russia. Even weirder and better, the link code contained their FTP username and password! A minute later I was logged into their FTP server, looking at the one file there: the fake eBay page.
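For illustration, here is how credentials embedded in a link like that can be read with Python's standard library. The URL in the usage note is invented (the real one is not reproduced here), and `embedded_credentials` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def embedded_credentials(url):
    """Return (username, password) if the URL carries userinfo, else None."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return None
    return parts.username, parts.password
```

Called on a made-up link such as `ftp://someuser:s3cret@ftp.example.org/page.html`, this returns the username and password pair, which is all an FTP client needs to log in.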

This was a darkly humorous reminder that the international spam-and-scam business is, from what I can see, a refuge for IT people (or wannabes) with poor skills and poorer ethics. So by this point I was kind of feeling bad for the incompetent underling who had put this thing together for his terrible boss.

However, I didn't let my compassion interfere with my sense of justice and fun. I replaced their fake eBay page with my own content, a much simpler message in plain text: "We are scammers."

SPF-enabled spam domains


Published to E-Scribe News: a programmer's blog by Paul B June 03, 2009 22:22

Among the many anti-spam measures on my mail server -- which help me reject 5,000 spam attempts per day -- is SPF. SPF allows domain name owners to specify which mail servers are allowed to send their mail. That makes it an excellent way to detect address forgeries, a favorite spammer tool.
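To make the idea concrete, here is a heavily simplified sketch of an SPF check in Python. It assumes the domain's SPF record has already been fetched from DNS, and it understands only the `ip4`, `ip6`, and `all` mechanisms; real evaluation also covers `a`, `mx`, `include`, and more, so this is an illustration rather than a usable validator. The name `spf_check` and the sample addresses are hypothetical.

```python
import ipaddress

# SPF qualifiers and the results they map to.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record, sender_ip):
    """Evaluate a record's ip4/ip6/all mechanisms against a sender IP."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Mechanisms that need DNS lookups are skipped in this sketch.
    return "neutral"  # nothing matched
```

With the record `v=spf1 ip4:192.0.2.0/24 -all`, mail from 192.0.2.10 passes and mail from any other address fails, which is how a forged sender address gets caught.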

One of the early questions raised about SPF was: won't spammers just buy their own domains and set up their own SPF records that say it's all OK? You can read the answer in the SPF FAQ, but the short version is: Yes, they will, but it won't give them a free pass.

That's because if spammers register a domain, publish SPF records for it, and send spam, they've identified that domain as one intended to be used for spam. Very good blacklist fodder.

With that in mind, here's a list of about 50 domain names that have recently been used to send me spam. All of these have published SPF records, and all the spam I received was from servers approved by those SPF records.

In other words, as far as I can tell, these are domains that exist primarily, if not purely, to send spam.

Update -- Here are the latest as of 2009-07-24: alg.com barrewardonline.com blueheavenbooks.com boudy.com eautocentral.com export2000.ro outpost.mm302.com qeentreeforlife.com ronaldvnash.com sistemas.com.ar smartserv.net solorpowernowme.com spig-int.com synergynetfour.com topproducerhelp.com truehouseinfo.com truelifeproducts.com unafraidrewardonline.com weathersearchontheweb.com

If for some reason a perfectly innocent non-spammy domain of yours has made it into this list, please let me know. (You might have to use my contact form, since I've already blacklisted all these domains!)

Chess via iPod


Published to E-Scribe News: a programmer's blog by Paul B May 11, 2009 20:53

I'm still loving my iPod touch. It's really a great little handheld computer. I'm able to do almost everything I need with the stock apps, but there are a couple free third-party apps that have earned a permanent place on it. One is the game Chess With Friends from NewToy.

[Image: Chess with God]

This is a version of what is also known as "postal" or "correspondence" chess. You make a move and send it to your opponent; your opponent makes a move and sends it back to you. (In this version, the CWF app rather than your mail carrier is the middleman.) You can pick somebody out of your address book, or ask the CWF app to find you a random opponent. Nice touches include in-game chat, step-by-step replays, and optional email or SMS notifications.

The human angle is what makes it fun. Most chess players have been periodically disheartened by computer opponents that beat humans (those who play at a mortal level like I do, anyway) coldly, soundly, and rapidly. The variety of human players that the CWF random-opponent feature delivers is a welcome change.

You get to pick your own screen name. People who know you can search by this name if they like, so it serves a useful purpose in addition to being a nametag. It also is occasionally the source of some amusement, as in the screen capture included here from the end of a recent game.

Aesthetics and computation


Published to E-Scribe News: a programmer's blog by Paul B May 06, 2009 01:33

This evening, the Western Mass. Developers Group was treated to a talk by Ben Fry of Processing fame. It was excellent and inspiring. Having not much prior exposure to Processing or his work, I left hungry for more. (The title of this post is taken from the name of the group at the MIT Media Lab where Fry did his PhD work.)

I liked the graphical-REPL flavor of his live demos. Surprisingly, the feeling reminded me of being a kid flipping through Alan Kay's article about the Xerox Alto in Scientific American 30 years ago.

He gave a fun tour of creations by Processing users, with various highlights along the way including magazine cover art, a Super Bowl ad, a scene from Minority Report, and the work by Robert Hodgin that was picked up by Apple for the iTunes 8 visualizer. Along the way he was conscientious about giving his co-conspirator Casey Reas (not in attendance) his share of the credit.

Turnout was good, by our small-town standards: a full room, 25 people or so. Many had come out of the woodwork from local colleges (notably Smith and UMass). O'Reilly gave us a few copies of his book, which we had a drawing for at the end.

I found his work to be a heady mix of technical acuity, aesthetic commitment, and pragmatism. And I liked his dry sense of humor -- jokes that many non-technical audiences probably wouldn't have even known were jokes.

His work is especially interesting to me because I've straddled the design/engineering line most of my professional life.

At the end I asked him about this cross-disciplinary world of his, and whether he had observations about qualities that were good predictors of success. He thought for a moment. His answer, which included mention of a Harvard class he taught to a mix of art/literature/CS/etc. majors, began with one clear word: "Curiosity."

Hello from BarCampBoston


Published to E-Scribe News: a programmer's blog by Paul B April 25, 2009 14:18

Greetings from Boston -- specifically, BarCampBoston. My first "unconference". Nerds galore.

The format is (mostly) half-hour talks from attendees on whatever subjects interest them -- as long as other attendees have also expressed interest. It's all tracked on a big board in the lobby. So far I've been in discussions involving localization, designing for technophobes, cloud computing, physics simulation in games, and Lisp. The level of interactivity is high -- as is the collective expertise brought by the participants.

[Image: Stata Center]

This is taking place in MIT's Stata Center, a wild-looking Frank Gehry creation that clearly houses a lot of fun regular old MIT stuff in addition to transient visitors like us. Walking down a side hall during lunch I peered through the glass of a closed door into a large office containing a pile of what looked like robots.

Update: In BarCamp spirit, I gave a talk. The title was "Moving from PHP to Django". Much like some of my earlier blog posts on the subject, I talked about how to make the transition from PHP to Python/Django smoother and more enjoyable. At the end, I noted that I had two copies of our book on hand. I made a special offer: buy one (at a great discount price, in fact) and I would donate the proceeds to BarCamp. I had two takers and made my donation immediately after. Thanks, guys!

robots.txt via Django, in one line


Published to E-Scribe News: a programmer's blog by Paul B April 25, 2009 11:13

A significant difference between developing Django sites and static-HTML-based approaches (among which I count PHP and the like) is that static files, aka "media", live in a dedicated spot.

Sometimes you need a piece of static content to be available at a specific URL outside your media root. robots.txt for example. This can be done in pure Django (i.e. without even touching your Apache configuration), and is especially nice if your robots.txt content is short. The example below serves a basic "keep out" configuration.

At the top of your root URLconf, add this import:

from django.http import HttpResponse

and below, among your list of URL patterns, add:

(r'^robots\.txt$', lambda r: HttpResponse("User-agent: *\nDisallow: /", mimetype="text/plain"))

The lambda r bit is a concise way of creating a function object which accepts (and discards) the HttpRequest object that Django provides to all views. The "mimetype" setting (aka "content_type" in Django 1.0) is important too, because robots don't like text/html.

So there you have it -- a classic one-line (plus an import) robots.txt solution.
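One way to sanity-check a robots.txt body before deploying it is to feed it to the parser in Python's standard library. The sketch below is independent of Django; `ROBOTS_BODY` uses the plain `Disallow: /` form of a "keep out" file, and `allowed` and the `ExampleBot` agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A "keep out" body in the classic form: block every path for every robot.
ROBOTS_BODY = "User-agent: *\nDisallow: /"


def allowed(robots_body, path, agent="ExampleBot"):
    """Return True if the given robots.txt text lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(agent, path)
```

For the body above, `allowed(ROBOTS_BODY, "/any/page.html")` comes back false for any path, which is the intended "keep out" behavior.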

Django LogEntry to the rescue


Published to E-Scribe News: a programmer's blog by Paul B February 14, 2009 00:47

If you use Django's admin application, you're familiar with its "Recent Actions" sidebar. It gives a simple summary of your latest edits, including clickable links to the relevant objects (not any ones you deleted, naturally, but ones you added or changed).

It's probably not something you look at very often, unless you do such intensive work in the admin that you lose track of things.

Django stores that log data (via the admin's LogEntry model) for all admin users, a fact which has caused me to repeatedly daydream about writing a custom view or two to display it. In other words, I'd like to let superusers browse all object editing history. Because sometimes you need to answer questions like "When was that changed?" and/or "Who changed it?"

Today at work, a question arose about some data that was deleted via the admin several months ago. It didn't need recovering, we just needed a record of its deletion. An audit trail.

LogEntry to the rescue! Via manage.py shell and manage.py dbshell I was able to do some quick spelunking and get exactly the records we needed.
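The kind of query involved looks roughly like the sketch below. It runs against an in-memory SQLite stand-in rather than a real Django database: the table name `django_admin_log`, the column names, and the deletion flag value of 3 match Django's LogEntry model, but the table is trimmed to a few columns and every row is invented for illustration.

```python
import sqlite3

DELETION = 3  # LogEntry action_flag values: 1 = addition, 2 = change, 3 = deletion

# Simplified stand-in for Django's admin log table (made-up sample rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE django_admin_log (
        id INTEGER PRIMARY KEY,
        action_time TEXT,
        user_id INTEGER,
        object_repr TEXT,
        action_flag INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO django_admin_log "
    "(action_time, user_id, object_repr, action_flag) VALUES (?, ?, ?, ?)",
    [
        ("2008-09-02 10:15:00", 1, "Example record A", 1),
        ("2008-10-20 14:30:00", 2, "Example record A", 3),
        ("2008-11-05 09:00:00", 1, "Example record B", 2),
    ],
)


def deletions(connection):
    """Return (when, who, what) for every logged deletion, oldest first."""
    return connection.execute(
        "SELECT action_time, user_id, object_repr FROM django_admin_log "
        "WHERE action_flag = ? ORDER BY action_time",
        (DELETION,),
    ).fetchall()
```

From `manage.py shell`, the rough ORM equivalent would be filtering `LogEntry.objects` (from `django.contrib.admin.models`) on `action_flag`.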

It was a very positive experience. I love being able to answer questions that begin, "Paul, is there any way to..." with: "Yes!" After this, I may even be a little bit closer to writing that code I've been daydreaming about.

Keeping emacs backup files tidy


Published to E-Scribe News: a programmer's blog by Paul B February 02, 2009 13:36

In the shell, emacs is my editor of choice. However, it has one default behavior that has gotten in the way more often than it has helped -- automatic generation of backup files in the same directory as the original.

Emacs is great for making quick edits to files on the web server. But I don't want or need all those *~ files sitting around. The material is all in version control, so I can revert to any point in history already.

I went to the #emacs IRC channel on Freenode to ask about this, and was promptly handed a canned help message that led right to the solution to my problem: backup-directory-alist lets you specify a directory where backup files get saved.

A little googling yielded the following nice snippet from Sean B. Palmer which I promptly deployed in my .emacs file. Problem solved. Thanks, everyone!

;; Turn off the annoying default backup behaviour
(if (file-directory-p "~/.emacs.d/backup")
    (setq backup-directory-alist '(("." . "~/.emacs.d/backup")))
    (message "Directory does not exist: ~/.emacs.d/backup"))

Email servers: how not to do it


Published to E-Scribe News: a programmer's blog by Paul B December 17, 2008 22:45

I run my own mail server. I don't consider myself an especially skilled administrator, so I shouldn't point fingers. However, in recent weeks I've had the following experience more than once.

  1. A delivery-failure message arrives from an unfamiliar host.
  2. The (quoted) original message is nothing I ever sent.
  3. The recipient is unfamiliar to me.
  4. The "sender" of the original message is an email address I control, but not one I ever send mail with.
  5. OK, so this is backscatter.
  6. I email the postmaster suggesting they learn how to avoid sending it.
  7. The message to the postmaster bounces back because of some server misconfiguration.

Argh! Nothing spoils the catharsis of a good complaint like a bounce.

I trust that most of these servers will disappear over time, given that they already are showing signs of neglect. So maybe that's the happy ending! A bit too deferred for me, though.