Joe Ganley

June 16, 2008 4:33 PM | Permalink | Comments (0)
Wordle
Wordle creates really nice-looking tag clouds. Here is one using the entire text of my dissertation as input:

Dissertation tag cloud

(Click to enlarge.)

Labels: ,

April 23, 2008 11:09 AM | Permalink | Comments (0)
Résumé tag cloud
A trendy thing these days seems to be to represent your résumé as a tag cloud. Just for kicks, here's mine:
created at TagCrowd.com

Labels: , ,

April 06, 2008 7:47 AM | Permalink | Comments (0)
Text messaging trivia
The other day my daughter was watching me send a text message, and how when two successive letters are on the same key I had to pause a moment until it timed out, and she wondered aloud what was the longest word that did not have two successive letters on the same key. Searching the web, I was surprised not to be able to find an answer, so I fed the 480K-word dictionary on my Linux box through a few tr and grep commands and a little Python code to find the answer, plus a number of other special words.
  • The longest word that is texted with no repeated keys is dichlorodiphenyltrichloroethane (that's the full name of DDT).
  • The longest word that is texted with no repeated keystrokes (i.e. not only no repeated keys, but each letter is the first choice on its key) is pagatpat (a variety of mangrove tree).
  • There is a word that repeats no keys (even non-consecutively): idolatry.
  • The longest words that are texted with only a single key are bacaba (a variety of palm tree) and deeded (the dictionary also contains deedeed, but I can't find a definition and believe that might be a typo of deeded).
  • Several words contain a sequence of seven of the same key, including: fiddledeedee, homonomous, mononomial, and nonmonogamous.

Labels:

February 27, 2008 6:19 PM | Permalink | Comments (0)
White Oak Technologies is hiring
I just had a post on Slashdot about finding great programmers for the company where I work. My plug for the company got (understandably) dropped, so I'll put it here where hopefully some /. readers will end up.

The company is White Oak Technologies. It's a great place to work, and we're hiring. Be surrounded by really smart people, work on hard, interesting problems, and get better-than-market compensation and benefits. What we do is data exploitation work on very large volumes of data. It's really interesting stuff. If you're really good, exactly what languages you prefer aren't all that important; we have a lot of flexibility in that regard. However, among the technologies we use most are Python and C++. Not to put too fine a point on it, our standards are very high, so bring your A game.

We are in the DC Metro area. Due to the nature of the business, telecommuting is not possible. Also, you must be a US citizen and will be subject to a clearance investigation.

If you're interested, check out our openings (be sure to tell them how you found us!) or send me email.

Labels:

Great-great-great-great-grandfather Alonzo
I saw where someone had posted their academic genealogy, so just for fun I looked up my own. Here it is, working backward: me - James Cohoon - Sartaj Sahni - Ellis Horowitz - George Collins - Barkley Rosser - Alonzo Church.

Labels: ,

August 15, 2007 1:19 PM | Permalink | Comments (0)
Some problems are never really solved
Via the Beautiful Code blog, a couple of nice sorting results: First, the proportion extend sort, a recursive sort that allegedly outperforms quicksort in both theory and practice. (Warning: Their code is pretty hard to understand, undercommented and obfuscated.) Second, the Python distribution contains a very nice writeup of the sorting routing Tim Peters used to implement listsort. The filename is listsort.txt, and it describes a nicely engineered adaptive mergesort with some elegant tricks to make it perform better on lists that already contain some order.

Labels: ,

August 08, 2007 11:37 AM | Permalink | Comments (0)
How many times did this have to happen to justify a sign?

Labels:

July 21, 2007 2:46 PM | Permalink | Comments (0)
1-card stud
We went to Las Vegas last week (my first time there), and the one casino game that surprised me was War. Yes, just like the one you used to play as a kid until you realized that it is boring. I walked past a table, and then doubled back in disbelief that this was what I thought it was. This is about the most brain-dead card game I can think of, far dumber than anything else in the casino (except, of course, the slot machines).

Labels: ,

May 13, 2007 2:48 PM | Permalink | Comments (0)
Helpful error message

Labels:

January 26, 2007 8:15 PM | Permalink | Comments (0)
Use distinctive usernames
Years ago, the first time I worked at Cadence, my username was "ganley". I left to go to a startup, where my username was "joe", and when Cadence acquired that startup I kept the username "joe". I quickly discovered that this was a mistake, since spammers send to name@domain for every common name. Only much later did I discover that "ganley" would have been better for another reason: I worked on an open source project, and my username still appears in some of the CVS tags in the source code. It would be much cooler if it was "ganley" instead of "joe" - after all, "joe" could be anyone, right?

Labels:

January 03, 2007 7:00 PM | Permalink | Comments (0)
Online random selection
Consider the problem of choosing a random element from a linked list whose length you do not know. Obviously, you could count the length of the list, then pick a random number between 1 and length, and then walk to that element, but there is a clever way to do this in one pass.

The algorithm is as follows:

Element choice;
int count = 0;
for (Element e = head; e != NULL; e = e->next) {
   if (rand01() <= (1.0 / ++count)) {
      choice = e;
   }
}

You can easily show that when this finishes, choice is equal to any given element with probability 1/n, where n is the number of elements: The probability of setting the ith element to be the new choice is 1/i. The probability of not selecting any of the subsequent elements to be the new choice is the product as j goes from i+1 to n of (1 - 1/j), which if you work out the math ends up as i/n. Multiply the two probabilities together and you get 1/n.

Labels: ,

May 17, 2006 4:43 PM | Permalink | Comments (3)
iAlarmClock
If I were inclined to start a company, something that I've always thought about is a company that applies Apple-style design sensibilities to relatively mundane consumer products. My company's first product would be an alarm clock. I've probably owned a dozen alarm clocks in my life, and used a couple of dozen more, and every single one has had an atrocious interface. Even the best of them lacked what I would consider some basic requirements. Few of them leave me confident that I've set the alarm properly and it is actually going to go off when I intended it to. It is difficult for me to turn them off in the dark, much less be able to actually set the alarm in the dark. For those of us not inclined to snooze, I've never seen one with a disable button as readily accessible as the snooze bar, but that kills the alarm until the same time tomorrow. And if there are multiple alarms or weekday-only alarms or such, forget it - the interface is so difficult that you'll probably never use those features. Someday...

Update (May 02007): Hmmm, WidgetStation might be just the thing. Looking forward to its release.
Update (Mar 02008): WidgetStation seems to be indefinitely delayed, but Chumby seems to be exactly what I want anyway. It's a little pricey ($180), but I still may have to buy one.

Labels:

May 04, 2006 3:35 PM | Permalink | Comments (0)
Start your shredders
A scary article in which the author takes an airline ticket stub out of the trash and uses it to find all sorts of personal data on its owner. I'm pretty paranoid about shredding, but I admit to occasionally considering tossing my airline stub. And I routinely toss the tag that they put on your luggage, and I bet that contains much (all?) of the same data. No more!

I disagree with the author's claim that all of this data collection is bad, however; the data itself isn't the problem, it's the fact that this data is used as authentication of who you are. Experts have long derided security through obscurity, and security techniques that rely on your personal data being secret are doomed in today's world.

Labels:

April 13, 2006 9:26 AM | Permalink | Comments (1)
Chinese menu enumeration
A number of times I've been faced with what I call the "Chinese menu enumeration" problem - you know, one from column A, one from column B, etc. Given an array num of N entries where each entry contains the number of items in that column, enumerate [the indices of] all of the possible meals. Equivalently, list every N-digit number in which the i'th digit is less than num[i]. This is not a hard problem, but it's all too easy to write ugly code for it. My latest attempt ended up like this:

int val[N];
for (int i = N - 1; i >= 0; --i) val[i] = 0;
for (int inc = 0; inc >= 0;) {
   // do whatever you need to do with val[]
   inc = N - 1;
   while (inc >= 0 && ++val[inc] == num[inc]) {
      val[inc--] = 0;
   }
}

Pretty succinct, but I can't shake the feeling that it could be obfuscated simplified further. Any takers?

Labels:

February 07, 2006 1:48 PM | Permalink | Comments (0)
It's all about bits
One of the more comprehensive collections of bit-twiddling hacks I've seen.

Labels:

December 22, 2005 1:04 PM | Permalink | Comments (0)
Five people, loosely joined
Yesterday I was reading Paul Graham's site, and wondered what other Lisp hackers might be around, so I went to one of the Lispier pages and Googled 'similar links.' To my surprise, one of the first hits was my friend Lawrence. That led me to revisit my friend Brad; he was on the programming team with me, and he and Lawrence were roommates (of each other, not of me). Somewhere in here I also lost half an hour to Drew's site; I don't know him at all, but he's a friend of Lawrence's. (I particularly liked the manually-raytraced sphere.)

The good that comes of this sort of jaunt is that Lawrence, Brad, and I are making plans to get together when I'm in California next month; it's been an overly large number of years since I've seen them.

Labels:

December 07, 2005 9:29 AM | Permalink | Comments (0)
PhD duration and getting things done
A coworker and I were discussing the relative rarity of people like me, namely PhDs who are great software engineers and who are good at actually getting things done. He conjectured that there would be a strong correlation between those traits and how quickly one finished their PhD. Certainly that fits in my case; I did my PhD in 2.5 years. I'm curious as to how strong this correlation might be, and I'll be collecting data about it going forward.

Labels:

November 29, 2005 6:37 PM | Permalink | Comments (0)
Lisp in JavaScript bug fix
Reader Bennet Yee pointed out that my Lisp in JavaScript interpreter failed on his Y-combinator code, as follows:
    ((lambda (x y) (x x y)) (lambda (me n) (cond ((< n 1) 1) (t (* n (me me (- n 1)))))) 4)
Naturally, I suspected a bug in the lambda machinery, but it turned out to be much simpler: The symbol-table lookup code had a conditional that depended on the value of the variable being looked up, and JavaScript (as in C/C++) equates 0 to false. Thus, it failed to find the variable when its value was 0. I hadn't caught this before because I habitually write that terminating condition as the slightly more efficient "< n 2", and thus it terminates before n reaches 0.

This task also led me to revisit the Y combinator, which is really beautiful. For those who don't want to wade through the details, it's a clever mechanism for implementing a recursive call to a lambda (i.e. nameless) function.

Labels: ,

November 27, 2005 2:47 PM | Permalink | Comments (0)
Numb3rs on Steiner trees
My wife called me into the room the other day because the Numb3rs episode that she was watching had mentioned Steiner trees (the principal subject of my Ph.D. dissertation). As usual for that show, they got the math half wrong and half ornamented with extraneous jargon, but still it was a tiny thrill to see Steiner trees mentioned on TV.

Labels: ,

October 07, 2005 11:08 AM | Permalink | Comments (0)
C++ operator precedence reference card
Annoyed at having to look up, yet again, one of the couple of C++ operator precedence rules that I don't have memorized (this time, whether equality or ternary is tighter), I took the C++ precedence rules from here and formatted them into a 3x5 reference card. Here it is in PDF form, or in Word if anyone wants to tweak it. (My graphic-design skills are fairly meager, so if anyone makes substantial improvements, please let me know.)

Labels: ,

September 29, 2005 9:26 AM | Permalink | Comments (0)
Template parameters cannot be friends
Recently I wrote some code in which a templatized class declared one of its template parameters a friend. This worked fine in Visual C++, and I was surprised to find that GCC gave the rather unambiguous error, Template parameters cannot be friends. I was even more surprised to find that, indeed, this is not GCC being overly picky - it is not legal C++! There are extremely crufty workarounds, but in general it's clearly a bad idea to rely on platform-specific violations of the C++ standard in production code.

Labels: ,

September 26, 2005 1:44 PM | Permalink | Comments (0)
Fuse-bead cellular automata
SnowflakeIt struck me the other day that fuse beads could be a good way to teach my daughter about cellular automata, so we used them to build this snowflake cellular automaton.

Labels: ,

August 30, 2005 6:32 PM | Permalink | Comments (0)
Good developers like to learn
An interesting essay about why developers leave. In particular, I found the analysis in the morale section very interesting. This is one place where I think Google really has it right: Giving your people time to work on projects of their choosing pays huge dividends in morale. (Via particletree.)

Labels:

May 11, 2005 1:51 PM | Permalink | Comments (0)
Type systems > Coding standards
Joel Spolsky's latest essay is on coding standards, and in particular why Hungarian notation is good. I've always liked Hungarian and similar notations, but as I read this article I found myself thinking that these classifications could be not only denoted but also enforced by making them classes instead of just variables of the same type with differently-denoted names. This requires a little finesse with simple types such as int, but it can be done. The result not only denotes the same information as Hungarian (and less cryptically) in the class names, but the compiler even enforces the relationships for you. No need to "watch" for types with mismatched notation (as in Spolsky's essay); it's part of the type system. Which is not to say that I no longer find Hungarian notation useful, but that in many situations and in many languages it should be supplemented (or in some cases replaced) by type definitions.

Labels:

April 07, 2005 11:09 AM | Permalink | Comments (0)
Hackers != painters
Maciej Ceglowski writes an excellent, half-facetious rebuttal to Paul Graham's Hackers and Painters essay. I didn't buy this analogy when I read it, and still don't, and Ceglowski does a great job of enumerating why it doesn't fly. I've always thought of programmers more like 'design and build' contractors. The best of them are both good architects and good builders, and like programmers who are good at both of these things, they are rare. Many more are either good architects and spotty builders (their buildings look nice and function well but are poorly constructed) or vice-versa (their buildings are well put-together but don't flow well or are unattractive), and of course some aren't good at either.

Labels:

April 01, 2005 11:54 AM | Permalink | Comments (0)
On not knowing
I just ran across an Eric Sink article that I've somehow previously missed called The Hazards of Hiring. There are many good points in this article; in particular, he does a good job on one of my favorite points: in his words, The "very best" people never stop learning. ... They know their own weaknesses, and they're not insecure in talking about them. Many people seem afraid to say "I don't know." I love to say that, because it means we've just identified a gap in my knowledge, and almost always, it means that we're just about to fill that gap. That is the single most fulfilling thing in life, work-wise anyway.

Another interesting point in that article is his skepticism toward people with advanced degrees. I see this a lot, and in fact I spend a lot of my time in interviews trying to convince people that despite my having a Ph.D., I am not some sort of ivory-tower computer science researcher. First and foremost, my love is for writing software. Sure, I love to sink my teeth into a really hard CS problem from time to time, but there is also fun to be found in all of the other facets of the software development process, even those that many consider mundane. I really enjoy positions that offer a lot of variety, from hard algorithmic problems to user interface design to library architecture to low-level infrastructure.

Labels:

January 11, 2005 10:29 AM | Permalink | Comments (0)
Duff's Device
Among the entries in Google's recent usenet timeline is the original Tom Duff post describing Duff's Device, a loop unrolling technique that exploits a surprisingly legal C quirk. I remember seeing this for the first time in a job interview in 1990 or so, when my interviewer wrote the code on the board and asked me what it did. I still think it's quite cool; though compiler optimization technology has probably rendered this unnecessary, it's still a great illustration of the sorts of things that are made possible by the "anything goes" design philosophy of C.

Labels:

January 06, 2005 11:11 AM | Permalink | Comments (0)
Talent trumps experience
Somewhat in the same spirit as that last post, another great engineering story, this one from Damien Katz. This story illustrates perfectly why it makes me so crazy when job postings demand, "10 years experience with XYZ." Clearly if you can get a really great person who also has XYZ experience, that's optimal, but if you have to choose, you want a really great programmer (who doubtless can learn XYZ very quickly) over a mediocre programmer with lots of XYZ experience. (Link via Ned Batchelder.)

Labels:

December 23, 2004 1:56 PM | Permalink | Comments (0)
Projects with a life of their own
I just love this story of the Macintosh Graphing Calculator. It was part of a project that got cancelled, but its authors continued to sneak into Apple to work on it for free, and eventually got it shipped with MacOS. This is an extreme and wonderful illustration of the dedication a good engineer can have to an important project. I just wish it was available for Windows!

Labels:

December 03, 2004 1:08 PM | Permalink | Comments (1)
C++ constructors shouldn't throw exceptions
Ned Batchelder tells a C++ debugging story, the punchline of which is this: If a C++ constructor throws an exception, then that object's destructor is called. I encountered this in my own work, and it turns out exactly the opposite: If the constructor throws, the destructor is not called. These stories have the same moral, though: Constructors really shouldn't throw exceptions. Update May 2005: I reported my findings to Ned, and he wrote an update that describes in detail the precise semantics of what happens when a C++ constructor throws an exception. In short, the destructors of every base class and member whose constructor successfully completed will be called; note that this necessarily precludes the destruction of the class whose constructor threw the exception, since that constructor obviously did not complete execution.

Labels: ,

November 03, 2004 6:13 PM | Permalink | Comments (0)
SCCS in Haskell!
Via Ned Batchelder I learned about Darcs, which is a fully peer-to-peer source code control system. This sounds ideal for home hobby work, as it doesn't require a dedicated server machine. Also, it surprised the heck out of me that it's written in the [almost] purely functional language Haskell!

Labels:

October 25, 2004 6:14 PM | Permalink | Comments (0)
Think before coding
I agree pretty much 100% with these Hallmarks of a Great Developer. In particular, the first point ("plans before coding") resonated with me. This is a former weakness of mine, and the area where I've made the most improvement in the past couple of years, since joining my current project. The point is even better made by Michael Abrash in his Graphics Programming Black Book (chapter 10) and by Ned Batchelder in his diamond cutter analogy.

Labels:

October 15, 2004 11:55 AM | Permalink | Comments (0)
If it's complicated, it's broken
Red Hat's Alan Cox on writing better software. There is a lot of good stuff in there, but my favorite quote is, "If it does everything it's complicated, and if it's complicated, it's broken."

Labels:

September 08, 2004 9:53 PM | Permalink | Comments (0)
Software economics
Some interesting observations from Brian Cantrill on The Economics of Software.

Labels:

August 26, 2004 11:08 AM | Permalink | Comments (0)
Loop unrolling
In last month's C/C++ Users Journal, there was a clever article that described using templates to write loop constructs that are unrolled at compile time. In my last project I wrote some multiobjective optimization code that was full of loops that went from 0 to 2 or 3; this mechanism would've been great for that. On the other hand, five years ago I highly doubt that all the compilers I supported would have processed this kind of fancy template code correctly.

Labels: ,

August 18, 2004 2:09 PM | Permalink | Comments (0)
Graphics portability
Michael Abrash has a series of articles appearing in Dr. Dobb's currently. The subject of the articles is code optimization, but the thing I found most interesting is the code being optimized. It is a software-only rasterizer. I wouldn't have thought it possible (even for Michael Abrash) to get DirectX7 performance in software. But most of all, it's noteworthy that state-of-the-art graphics hardware and drivers still have so many bugs and portability problems that Abrash finds it worthwhile to spend a lot of time and sacrifice two generations off the state of the art in order to avoid them. I certainly feel his pain, and I'm jealous that he is able to focus on a single type of processor and thus achieve this level of optimization.

The other thing that caught my eye was an illustration of how on modern processors, optimization can be extremely nonobvious. The example was a subproblem I've solved before: Given a value x between 0 and 1 and an array of ascending values in the same range, figure out which pair of values x falls between. I used a binary search. It turns out that on the x86, binary search is predicted poorly by the processor's branch prediction logic, while in a linear search every compare but the last one is predicted correctly. A mispredicted branch is very expensive. So, for a small number of bins a linear search is 30-50% faster (!), and the problem size has to reach 64 bins before the improved asymptotic efficiency of the binary search actually beats the linear search in real life. My problem had at most a few tens of bins, and the search was deep in an inner loop, so this would have been useful information to have known.

Labels: ,

August 17, 2004 2:52 PM | Permalink | Comments (0)
Improving assertion readability
Ned Batchelder discusses rewriting assert conditions to make them more readable. Since one of the purposes of assertions is as documentation, this is good advice.

Labels: