The Raw and The Cooked: C++0x Extensible Literals

Posted by Christopher Smith Tue, 03 Jun 2008 07:43:00 GMT

When I look at C++, and squint a certain way, it appears to be a heroic attempt to retrofit a real type system on top of C’s terribly weak one. C’s weak typing is, for the most part (because we can’t possibly break backwards compatibility… except for when we do), augmented with C++’s strong typing. C’s typeless preprocessor is augmented with C++’s so-thoughtful-about-types-it-is-sentient template system. C’s structs are augmented with C++ objects and operator overloading. C’s weak typing and don’t-ask-me-why-just-cast-it operator are augmented with the far stricter, precise, and admittedly verbose static_cast<>, dynamic_cast<>, const_cast<>, reinterpret_cast<>. C’s unfortunate format string and varargs oriented I/O functions are augmented with C++’s strongly typed std::iostreams. Retrofits can sometimes be stronger and more powerful than other approaches, but they are almost inevitably more complex, less elegant, and generally less lovable than “from scratch” solutions. I think many programmers feel the C++ committee was engaged in an academic exercise to demonstrate just how true this principle could be. Looking over the C++0x proposals, it appears that a strong sentiment on the C++ committee’s part was that C++98 was too limited a case study and that they could go further to produce more spectacular results, an opinion that has been greeted by jeers and cheers (seems like mostly jeers on the blogosphere, but then… it’s the blogosphere).

One of the benefits of all this effort is that user-defined types are practically first class citizens in C++’s type system, largely indistinguishable from primitive types (Java has also realized they missed this boat and is attempting to correct it, albeit through an alternative trajectory more consistent with its nature, with things like autoboxing). I say “largely”, because there are some subtle differences (well, in true C++ fashion, they are only subtle until you encounter them, at which point they are as subtle as a punch in the face) that continue to annoy, and the C++0x committee’s holy quest is leading them to find new ways to address this. Perhaps one of the more interesting efforts to bridge the remaining gaps between user-defined types and primitive types in C++ is the Extensible Literals proposal.

The Abysmal Status Quo

If you’ve worked in C or C++ (or Java for that matter) for a while, particularly if you’ve also had exposure to scripting languages, you have probably come to recognize how limited the built-in literals are. You have literals for the built-in types. You also have initializers for arrays and structs, which were so limited that the C99 committee felt compelled to improve them (see designated initializers). The Java world, after a great deal of reflection which I presume involved the use of powerful hallucinogens, has concluded XML is the best language around for defining data and there is no way another programming language can match it. So Java has simply chosen an idiom where what one would normally think of as a fine opportunity to use literals is instead a fine opportunity to define a new DTD and/or XML Schema and then write out some verbosely tagged data (always being careful to insert XML entities instead of > or < signs).

When one looks at Boost’s Program Options Library, one can’t help but marvel at the syntactic trickery employed to do something that is so simple and straightforward in, say, Perl. (Look mom! I used “Perl” and “simple” in the same sentence!). What I find most disturbing though, is that user-defined types can’t have literals associated with them. The closest you can get is a constructor that takes primitive types (which do have literals) as arguments. So, for example, if you are working with Unicode strings, you invariably end up writing something like:

UnicodeString("this is a Unicode string", "UTF-8")

Now, in their infinite wisdom, the C++0x committee has addressed the Unicode issue separately from user-defined types, so now you can do something like:

u8"this is a Unicode string"

The existence of this particular literal extension to C++0x if anything demonstrates literals are important. This is particularly true as the committee, in what can be described as a philosophical feud with the C99 folks, has not provided a corresponding built in set of literals for the complex number type (“see, we can do complex numbers entirely as a library, thereby keeping our syntax simple…” —just try to say that with a straight face).

To Boldly Go Where No C++ Compiler Has Gone Before

Into this mess, the C++0x committee has bravely charged. The result is the extensible literals (i.e. user-defined literals) proposal. As with all good things in C++, there turns out to be a fair bit of complexity to the matter, but it all makes sense once one thinks about the hoops one is having the compiler jump through. The new literals mechanism is built off of suffixes (apparently, to use prefixes for user defined literals would invoke a computer apocalypse of sorts… I don’t understand the details, but I heard mumblings about California’s governor going back in time in starkers and Google’s distributed computer calling itself “Skynet”), allowing for things like 123km to translate literally in to some object representing 123 kilometers, which I have to say seems rather cool and… straightforward at first glance. That’s the simple concept from which the inevitable complexity begins.

Two Dancers Alternating Through Double Hops, One in Black and One in Yellow, Various Other Pairs of Dancers, a Guy with a Truly Impressive Falsetto, and a Guy In Stripes Who Is Definitely Not A Dancer

It turns out, to be able to express all the wonderfulness that should be C++ literals, the proposal introduces two distinct types of literals: The Raw And The Cooked.

Editor’s Note: For those of you who didn’t enter the workforce until after the original C++ standard was introduced, you may have to contact one of your elders to properly understand the cultural reference. For those of you wondering what this link has to do with the dichotomy between the natural and artificial world… I can only say that most of us at the time didn’t get the video either, but during that era the cool thing to do was have an abstract, obtuse video that completely went over the heads of most of the audience —that’s the way it was, and we liked it!

The raw literal is defined as the raw sequence of characters that form a literal. It’s the raw bytes of the literal before the compiler has had its way with it (although after the preprocessor has expanded any macros and any string literal concatenations have been done… just so that we can’t have a completely simple definition and the C preprocessor can continue to be the bane of every C++ developer’s existence).

The cooked form is defined as the typed value that the literal string represents (before all the magical user-defined literal processing happens). In particular, this allows one to be able to have that user defined literal in the “123km” example operate on the integer value “123” rather than have to first transform { ‘1’, ‘2’, ‘3’, ‘\0’ } in to a useful binary value.

My God… It’s Full of Operators

C++ operator overloading is simultaneously one of its most useful and most abused features. In what will undoubtedly ensure that everyone will either love or hate extensible literals, the proposal follows the C++ standards tradition and adds… more operators to overload in to the language. Bravely eschewing the C++/CLI’s approach of overloading the meaning of yet another symbol (I seriously could have imagined something like Foo::$Foo()), instead we have some new operators. For raw literals, we have the form:

T operator "suffix" (char const*)

Where T is the type of the literal, and “suffix” the magical suffix that identifies the user defined literal. So, when the compiler sees: 123km it interprets that (if the appropriate function is defined) as a call to long long operator"km"("123") (just like a constructor it has the option of throwing an exception… that one is going to keep the exception safety nuts up for months).
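For anyone trying this on a C++11 or later compiler: the spelling that was eventually standardized is operator"" followed by a suffix that begins with an underscore (suffixes without one are reserved), rather than the operator"km" form shown in the proposal. Here is a minimal sketch of the raw form under that assumption. The _km suffix and the digit loop are illustrative, not taken from the proposal.

```cpp
#include <cassert>

// Raw form: the compiler passes the literal's spelling as a
// null-terminated string, so 123_km becomes operator""_km("123").
// Assumption: C++11 spelling (operator"" plus an underscore suffix),
// not the proposal's operator"km".
long long operator""_km(const char* digits) {
    long long value = 0;
    for (const char* p = digits; *p != '\0'; ++p) {
        // This sketch only understands plain decimal digits.
        assert(*p >= '0' && *p <= '9');
        value = value * 10 + (*p - '0');
    }
    return value;
}
```

Since the function body is ordinary code, it could just as well throw on bad input, which is the constructor-like behaviour mentioned above.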

So far, though, things aren’t complicated enough to really meet the usual standard, multi-paradigm mischief that we’ve come to expect from C++. I mean, it’s barely OO, and doesn’t really tie in to the whole generic/functional programming world C++ developers have come to know and love. Fortunately, this proposal leaves no stone unturned. Indeed, this particular stone has been turned… and kicked around a fair bit afterwards, in the form of this:

template<char...> T operator "suffix" ();

Yup, that’s not just a templated version, but a variadic templated version. For those of you playing this at home, Variadic Templates are also part of the C++0x standard. They are essentially a way of turning LISP-y looking Typelists in to C-y looking vararg functions. So, our 123km example, if someone had defined an extensible literal operator like this:

template <char...> long long operator"km"();

then the compiler interprets that as a call to:

operator"km"<'1', '2', '3'>()

Note the lack of null at the end there? I’m sure that was put in there to ensure inconsistency with the other form. ;-)

Anyway, the primary advantage of the variadic template form is that if you tag the function with the new “constexpr” keyword, the entire thing can be evaluated at compile time like all good template metaprogramming foo. One could argue a sufficiently smart compiler might be able to determine when to do compile time evaluation of the former form in cases where it was possible. However, if there is one myth that the C++ committee considers heresy, it must be the myth of the Sufficiently Smart Compiler (one of life’s little ironies is how this view directly results in C++ compilers having to be the most sophisticated, complex, and nuanced compilers known to man… but I digress).
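To make the compile-time claim concrete, here is a sketch of the template form using the C++11 spelling (operator"" with an underscore suffix, which is an assumption relative to the proposal's syntax). The ParseDecimal helper is a hypothetical name invented for this example. It folds the character pack into a number one digit at a time.

```cpp
// Recursive helper: Acc is the value so far, the pack holds the
// remaining digits. ParseDecimal is a made-up name for illustration.
template <long long Acc, char... Cs>
struct ParseDecimal;

// Base case: no characters left, so Acc is the answer.
template <long long Acc>
struct ParseDecimal<Acc> {
    static constexpr long long value = Acc;
};

// Recursive case: consume one digit and continue with the rest.
template <long long Acc, char C, char... Rest>
struct ParseDecimal<Acc, C, Rest...> {
    static_assert(C >= '0' && C <= '9', "decimal digits only");
    static constexpr long long value =
        ParseDecimal<Acc * 10 + (C - '0'), Rest...>::value;
};

// Template form: 123_km becomes operator""_km<'1', '2', '3'>(),
// with no trailing null character in the pack.
template <char... Cs>
constexpr long long operator""_km() {
    return ParseDecimal<0, Cs...>::value;
}

// The literal is usable wherever a constant expression is required.
static_assert(123_km == 123, "evaluated at compile time");
```

A malformed literal such as 12a_km would be rejected by the static_assert inside the helper during compilation rather than at run time.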

I hope your head isn’t spinning and spewing forth pea soup right now, because we’re just firing up the barbie to move on to “cooked form” extensible literals.

Looking at the proposal, cooked literals seem actually quite straight forward and the rational way to handle the 123km example I keep bandying about. According to the proposal, one would define:

Kilometer operator"km"(unsigned long long);

which would then be invoked with 123. There is no special templated form, and cooked literals wisely take precedence over raw literals. There are similar forms for doubles and the various forms of C-style strings that now exist in C++0x (strings prefixed with “u” and “U” both get their special form), although surprisingly the functions are length terminated, so for example, one could make a convenience literal for std::strings such that:

"this is an std::string\n"s

creates an std::string by defining a literal operator as follows:

std::string operator"s"(const char* s, size_t length) { return std::string(s, length); }

While at first this might seem counter intuitive, it does provide a nice way to distinguish between cooked literals and non-templated raw literals.
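Both cooked forms can be sketched with the C++11 spelling as follows. Kilometer is a hypothetical unit type, and the _km and _str suffixes are chosen for illustration (the underscore is required for user code in the standardized version). The explicit length parameter is what lets the string form keep embedded null characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical unit type used only for this example.
struct Kilometer {
    unsigned long long value;
};

// Cooked integer form: 123_km calls this with the parsed value 123.
constexpr Kilometer operator""_km(unsigned long long v) {
    return Kilometer{v};
}

// Cooked string form: pointer plus explicit length, no strlen needed.
std::string operator""_str(const char* s, std::size_t length) {
    return std::string(s, length);
}

// Small usage check for both operators.
void cooked_demo() {
    assert((123_km).value == 123);
    // The embedded null survives because the length is passed in.
    assert(("a\0b"_str).size() == 3);
}
```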

I’ve Come To Bury Extensible Literals, Not to Praise Them

The one thing I didn’t see in the proposal is how negative integers are handled (it seems odd that the proposal would imply defaulting to unsigned integers, but perhaps I’m missing something). Unfortunately, the proposal also doesn’t really address a simple way to define hierarchical/structured literals like Boost’s Program Options library desperately cries out for. While one could define literals for maps and hash maps, I just don’t see them even remotely approaching the elegance you find in scripting languages, and it still seems like there isn’t a convenient way to have a literal composed of compound expressions. Then again, it’d be hard to distinguish between the latter and a sufficiently clever constructor and literals for the constructor’s type parameters.
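On the negative-integer question: in the rules as they were eventually standardized in C++11 (which may differ from the draft discussed here), the minus sign is never part of an integer literal, so -123km is unary minus applied to the result of 123km. That would explain the unsigned parameter. A sketch under that assumption, with a hypothetical Kilometer type and the C++11 underscore suffix:

```cpp
// Hypothetical unit type used only for this example.
struct Kilometer {
    long long value;
};

// The literal operator only ever sees the non-negative magnitude.
constexpr Kilometer operator""_km(unsigned long long v) {
    return Kilometer{static_cast<long long>(v)};
}

// -123_km parses as -(123_km), so negation is an ordinary
// overloaded operator on the resulting type.
constexpr Kilometer operator-(Kilometer k) {
    return Kilometer{-k.value};
}

static_assert((-123_km).value == -123, "minus is applied after the literal");
```

Without the operator- overload, -123_km would simply fail to compile for this type, since there is nothing to negate a Kilometer with.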

There are lots of cool uses one can foresee for this proposal. Obviously one could make a units/measurements system that was much more seamlessly integrated through the use of literals for constructing measurement instances. Having agreed-upon literal syntaxes for string objects would bring them one step closer to first class status beside their C-style predecessors. One particularly cool example from the proposal is to have literals for internationalization efforts, such that “foo”_i18n might translate to: ‘lookup the key “foo” in the appropriate i18n table and use the appropriate value’, which might reduce i18n friction enough that developers would adopt sane i18n development practices without first having several sessions on the rack. One can see some interesting abuses, as well. I have to wonder at the extent to which they can be employed recursively. I can’t see a reason why one couldn’t create literals with side effects, although hopefully this practice would be viewed as in poor taste.
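The i18n idea might look something like the sketch below, using the C++11 spelling of literal operators. Everything here is an assumption made for illustration: the i18n_table function, its single hard-coded entry, and the fall-back-to-the-key behaviour are invented, not taken from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical translation table; a real one would be loaded per locale.
inline const std::map<std::string, std::string>& i18n_table() {
    static const std::map<std::string, std::string> table = {
        {"greeting", "bonjour"},
    };
    return table;
}

// "key"_i18n looks the key up and falls back to the key itself
// when no translation exists (a design choice of this sketch).
std::string operator""_i18n(const char* key, std::size_t length) {
    const std::string k(key, length);
    const auto it = i18n_table().find(k);
    return it != i18n_table().end() ? it->second : k;
}

// Small usage check.
void i18n_demo() {
    assert("greeting"_i18n == "bonjour");
    assert("missing"_i18n == "missing");
}
```

Note that the lookup happens at run time on every evaluation, which is also exactly the hook a literal with side effects would use.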

Despite the tongue-in-cheek comments found throughout this article, and some of the shortcomings in the concept and the specifics of the proposal, I actually quite like the design of extensible literals and hope that a somewhat more polished version of it will make it into the standard and the top tiered C++ compilers quickly. While it is a complex solution for what at first glance seems like a simple problem, like a lot of C++ features, most of the complexity is pushed to the library designer side (i.e. the person writing the extensible literal), while using the feature seems likely to be simple and straightforward in the most common cases. The former is, in my opinion, intrinsic to C++’s nature, forgivable, and arguably a feature. Designing code for reuse is always difficult, and sometimes languages which make it deceptively easy encourage very poor designs that would effectively be stillborn in C++. If, however, you can make using said code fairly straightforward, you make it easier for less sophisticated developers to leverage the skills of the masters. In the end, this proposal strikes me as emblematic of the language itself: yes, it is complex under the hood; yes, it has a face only a mother could love; yet, beneath all that, it is powerful, pragmatic, and cleaner to work with than the typical hackery C and C++ programmers tend to come up with to address this issue.

CSI: Dialog Written By Millions of Monkeys Copying Tech Manuals

Posted by Christopher Smith Sun, 01 Jun 2008 17:58:00 GMT

Just let me whip up a Python script here to recursively query the root server for the host name of that IP address… wait, even that isn’t as bad.

Anatomy of Javascript Hack

Posted by Christopher Smith Fri, 02 May 2008 18:28:00 GMT

NOTE: Several of the links in this article point to the original Javascript of this exploit or transformations of the original Javascript. If you actually execute the Javascript you will be performing the exploit. I suggest readers download the links and then look at the source in an editor, rather than clicking on the link and risking their browser attempting to execute the Javascript. I’ve set the content types of these links to “text” in order to minimize the risk of this, and I’m sorry if that creates an inconvenience.

One of the user groups I participate in is the UUASC. Recently, one of the BOFHSysAdmins in the group posted a rather cryptic bit of Javascript that they saw flowing over their network. Their question was pretty simple. What does this do?

So starting with the outer bits of the code first, it is pretty clear that the bulk of the message is escaped Javascript which is fed in to eval. Not exactly a good sign, but not necessarily a bad thing. The first step I took was to unescape the relevant data (without performing the eval of course), which yielded this.
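The "unescape without eval" step is easy to do safely outside a browser. As a rough sketch, assuming the payload used plain %XX escapes (the original script is not reproduced here, so that is a guess), a decoder only needs to turn each escape back into a byte and hand back text to read. The percent_decode and hex_value names are made up for this example, and %uXXXX escapes are deliberately not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Value of one hexadecimal digit; callers check std::isxdigit first.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
}

// Decode %XX sequences and return the text for inspection.
// Nothing is executed; malformed escapes are copied through unchanged.
std::string percent_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(hex_value(in[i + 1]) * 16 +
                                     hex_value(in[i + 2]));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Because the result is just a string, it can be written to a file and opened in an editor, which keeps the exploit code away from anything that might run it.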

The source was clearly obfuscated, so I went through it, cleaned up the formatting and attempted to assign more meaningful names to variables and functions. This yielded this. The transform1 function included a use of eval(someString.replace(/blah/, '')), which was clearly just a way of obfuscating how myCallee was computed, so I performed the replace and substituted the result, yielding this much clearer source, which shows that myCallee is actually the function itself!

So, it now becomes clear that the obfuscation of transform1 was not entirely random in nature, as the source itself was used as part of the key to decode the “payload” (the string passed in to transform1 after its definition). The easiest way to accurately decode this was to create both the original fpTu function in all its obfuscated glory, then set myCallee to fpTu in transform1, and then see what we got back from transform1 when we passed it the payload.

Not surprisingly, we have another somewhat obfuscated chunk of Javascript. After reformatting it (thank you Steve Yegge for js2-mode) and providing some more meaningful function and variable names, the payload now looks like this.

There are still a bunch of places where the replace(/blah/, '') idiom is being used to obfuscate things, and another place where a variant is used where all alphanumeric values are being replaced with a period, and then all instances of multiple periods are replaced with a single one. After unraveling those, the intent of the code becomes clear and we can attach more meaningful names to the functions and variables. Thus I ended up with this.

We can now see that the code points to 3traff.cn (don’t you feel safer already? ;-), and it cleverly embeds an iframe pointing to 3traff.cn in to each document as it is loaded. I’m not up on my browser exploits, but it looks like the intent of all this is to at the very least track users as they go off site, and also might be hoping to confuse a browser about which domain a document came from, and therefore potentially cause cookies to leak to 3traff. Either way, this isn’t the kind of code I like running in my browser.

Leading the Horse to the Fountain of Knowledge

Posted by Christopher Smith Fri, 02 May 2008 05:48:00 GMT

While truly great rants are mostly entertaining, some rants can be enlightening for the insight they provide during rare moments of candor about things that aren’t obvious to those who don’t share the ranter’s paradigm. Such is the case with Design patterns are from hell!. I’ve seen a lot of critiques of Design Patterns, and I’ve seen a lot of misuse of Design Patterns, but I totally didn’t grok what the real problem was until reading this rant. Now I finally get it.

The killer insight that I was lacking was the notion that people reading the books would focus on the wrong thing. I was blinded by what I found to be very clear statements by the authors of the book about the benefits and purposes of patterns, how one should select patterns, how they differed from other reuse techniques, and indeed even the title itself. It seemed clear to me that it was all about design, all about practical solutions to common problems encountered in OO designs (and yes, the authors explicitly state that these solutions are nothing new, but rather well established), and most importantly: not about the code.

What the Design Patterns book teaches is for people to think in the box instead of outside of it; just as I did on that exam. Pattern thinking has now permeated and perverted peoples’ thinking to the extent where patterns are perceived as being an ends; something you need to use to correctly solve problems.

First, I don’t get how that principle relates to “the Plank”, because it seems clear from the statement that he knew how to solve the problem using a different one of the cookie-cutter solutions he’d learned. Rushing to judgment before understanding a problem doesn’t strike me as “thinking in the box”. Normally, people “thinking in the box” spend most of their effort on careful consideration of how a problem resembles and differs from problems they’ve seen before, giving up if it is too different from anything they’ve worked with before. Secondly, he jumps from talking about “Design Patterns” the book to “Patterns” in general, apparently failing to recognize that Design Patterns was just one book detailing commonly encountered design techniques of OO developers and comes far short of representing all patterns (indeed, the PLoP books detail new patterns presented each year); they were just the low-hanging fruit. Even if you overlook this misnomer, it somehow misses the very logical conclusion one should reach in not finding a pattern that fits your problem: your problem isn’t a commonly recurring one. It boggles the mind to think one would instead think to apply a pattern to a problem it was explicitly not designed for.

Now I understand why people gripe about the Singleton all the time: they’ve too often run in to developers who use it because they can, not because they need to, developers who think of the Singleton solution instead of the kinds of problems the pattern addresses.

Here’s the problem: if one is the sort of person who tries to figure out the minimal amount of studying to get through one’s statistics class (ironically applying approaches to achieving this that could have been derived from statistical principles ;-), one probably skimmed (or worse, skipped) over the beginning of the book that takes the time to explain how to use the patterns. One probably went straight to the patterns themselves, and after looking at several of them, one probably decided the real meat of the patterns was in the code itself, because one’s main use for programming books was to look for code samples and apply them to one’s work without having to think so carefully to avoid making mistakes.

To paraphrase JWZ: now they have two problems.

Thinking in patterns is exactly the wrong thing to do! It makes you think in terms of the solution instead of in terms of the problem! Pattern thinking makes you try to fit a round, square, or oval hole (your choice of patterns) to the triangular peg (your problem). When all you have is a hammer…

A designer who goes around thinking about design in terms of trying to shove a bunch of pre-built solutions in to a problem is screwed whether they’ve read Design Patterns or not. A designer’s first focus should always be on what problems they have to solve, and only think about solutions as a consequence of addressing said problems. If you focus on the solutions instead, you will inevitably be wandering around with a hammer, thinking all the problems you face are pegs.

What was curious to me when first reading the quoted text above was that I thought it was odd to think of Patterns as solutions looking for a problem. The book is explicit about the specific way that patterns are documented, and right there at the beginning of each pattern is talk about the kinds of problems the pattern is appropriate for. Heck, the pattern catalog was even grouped by the kinds of problems each pattern solved! It seemed obvious that the proper way to organize your thinking about patterns was in terms of the problems they solved (indeed, I’d often forget details of specific aspects of designs or implementation techniques for patterns I didn’t use much, but I tended to remember their applicability quite clearly). It’s odd to me that one would think one had to organize one’s thinking in terms of the solution, particularly since the solution was inevitably so much more complex than the problems it solved. What I was missing, of course, was that someone trying to find the “meat” of a pattern without reading and understanding the whole thing would tend to focus on where the biggest sections of the pattern description were, and where the complexity was. So now you’ve compounded the problem: you’ve dramatically increased the cognitive load necessary to select appropriate patterns and you are focused on the wrong thing for coming up with a good design.

But wait, it gets worse!

For example, before dumb and dumber decided to call it the “visitor pattern” every programmer worth his salt would just call it a “map” operation (as in the LISP functions “map”, “mapcar”, “maplist”, etc). That’s just, oh, something like 1994-1960(?) = 34 years of previously established terminology! Twits.

Yup, he just completely missed that the visitor pattern is primarily about double dispatch, and that the book even talks about how CLOS does this natively. It talks about how Smalltalk’s do: (the moral equivalent of map/mapcar/maplist) won’t get you double dispatch support. Most importantly, it even talks about how putting traversing behavior in the Visitor, as done in the sample code, only really makes sense if the logic for determining what to traverse depends on results of the algorithm itself (i.e. when map/mapcar/maplist won’t work).
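For readers who have only seen Visitor described as "map with extra steps", a small sketch may help show where the double dispatch happens. The Shape, Circle, Square and NameVisitor names are invented for this example and do not come from the book. The operation that runs is selected by both the element's dynamic type and the visitor's dynamic type, which a plain map over a collection does not give you in a single-dispatch language.

```cpp
#include <cassert>
#include <string>

struct Circle;
struct Square;

// Visitor interface: one overload per concrete element type.
struct ShapeVisitor {
    virtual ~ShapeVisitor() = default;
    virtual void visit(const Circle&) = 0;
    virtual void visit(const Square&) = 0;
};

struct Shape {
    virtual ~Shape() = default;
    // First dispatch: on the dynamic type of the shape.
    virtual void accept(ShapeVisitor& v) const = 0;
};

struct Circle : Shape {
    // Second dispatch: on the dynamic type of the visitor, with
    // *this statically known to be a Circle.
    void accept(ShapeVisitor& v) const override { v.visit(*this); }
};

struct Square : Shape {
    void accept(ShapeVisitor& v) const override { v.visit(*this); }
};

// One concrete operation; adding another needs no change to the shapes.
struct NameVisitor : ShapeVisitor {
    std::string names;
    void visit(const Circle&) override { names += "circle "; }
    void visit(const Square&) override { names += "square "; }
};

// Plain traversal stays outside the visitor, as a map-style loop would.
std::string describe(const Shape* const* shapes, int count) {
    NameVisitor v;
    for (int i = 0; i < count; ++i) shapes[i]->accept(v);
    return v.names;
}
```

In this sketch the loop in describe plays the role of map, and the accept/visit pair is the part that map alone does not provide.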

So, by focusing on the code, rather than the discussion of the problem, its various facets, and how they affect possible solutions, one not only misses out on understanding the problem, but also on understanding the solution.

It really is a poor craftsman who blames his tools. The problem is being focused on solutions, and that doesn’t come from Design Patterns; it was there beforehand (as demonstrated by The Plank).

In retrospect, I think the real mistake some pattern advocates may have made was in encouraging people who just didn’t get it, or didn’t want to get it, to use patterns anyway. To really mix up some metaphors: once you’ve brought the horse to the fountain of knowledge, if it doesn’t drink, just shoot it and put it out of its misery. If you force it, it’s just going to pretend to drink while secretly gargling, and you’re going to find yourself stuck in the middle of the desert with a dehydrated horse. ;-)

Ruby on... Gemstone?

Posted by Christopher Smith Wed, 30 Apr 2008 23:02:00 GMT

Really, when you think about it, how can a company called Gemstone NOT get involved with a language called Ruby? So, Gemstone, of Gemstone and GLASS fame, have apparently decided to get the traditionally lackadaisical Ruby runtime running on their VM. From the first time I dabbled with Ruby it seemed like “file-based Smalltalk with some ugly Perl-isms and a crappy VM” (and yes, in fairness, the ugly Perl-isms are also part of its strength), so this makes a lot of sense, and may yet drag Ruby in to the real world. Gemstone gets bonus points for providing yet another example of confusing efficiency with scalability.

BTW: Mike came up with a great acronym for Gemstone to use: GLARE: “Gemstone Linux Apache and Ruby Emulation”.

UPDATE: Avi caught me red handed for not reading the entire interview. Upon further reading of the interview and Avi’s excellent blog posting comparing Gemstone to Rails, it appears the Gemstone folks are very much talking about scalability as opposed to efficiency. In fact, it seems they are expecting the primary advantage of MagLev to be through Gemstone’s persistence architecture (here’s hoping it is also a lot more efficient).

Awesome Comic On Java & Javascript's Tortured History

Posted by Christopher Smith Mon, 21 Apr 2008 16:29:00 GMT

Who's Your Boss?

Posted by Christopher Smith Thu, 10 Apr 2008 09:07:00 GMT

Paul Graham fired yet another shot across the bow and engulfed the blogosphere in extensive flames, and particularly produced enough heat to melt Kevin’s webserver. Honestly, the whole thing amuses the heck out of me, because the entire premise of the discussion strikes me as built upon an implied assertion that is, at best, a matter of perspective and, at worst, an illusion. From there it seems to just get worse.

[Note: This was going to cover a lot of different points, but I’ve decided to focus on the main one, and perhaps address the others at a later time.]

So, the fundamental premise of the essay is that startup founders don’t have bosses. This is the great deceit of the self-employment dream. I remember when I was in college, a successful entrepreneur visited the school to lecture us wet-behind-the-ears types, in hopes of imparting some of his wisdom from his successes and failures (“successful entrepreneur” just means you’ve succeeded at least once, and most “successful” entrepreneurs have more failures than successes). One of the first things he tackled was the notion that by starting your own company, you get to be your own boss. Having participated in my own startup, I couldn’t agree with him more.

Who’s the boss of a startup founder?

Well, for starters, there are investors. Most investors want some control over their money, and they literally own a chunk of the company. VCs and other seasoned investors will also exert additional influence thanks to extensive shareholder rights that they tend to negotiate in to funding agreements. Of course, these days, you can supposedly start a company on a shoestring budget, and if you are lucky (strictly in the sense of having more “freedom”) you can either find a silent partner or self finance. Of course, your chances of success tend to be higher if you don’t go either route, but we’re talking about living your life to the fullest here.

Then there are lenders. A lot of software startups avoid lenders initially, but sooner or later, a lot of small companies come to involve lenders in their financing. The good thing about lenders is that they tend not to be as nosy as investors. They are however, more risk averse. If they start to think that they aren’t going to get paid, they can get squirrely and start talking about things like collateral and what are you going to do to improve your cash flow? Lenders can literally break a company when it is in trouble, so once you have their attention, they are your boss until you can pay them off (Steve Jobs underscored this when he wrote with palpable relief when Apple was debt free for the first time after his return to the company). Again though, if you don’t need money for your startup, and don’t stand to benefit from the extra cash flow, you can avoid this one too.

Then there are your board members. These will often include representatives for your investors, and sometimes even lenders depending on the terms of your debt, so there is often overlap between this group and the previous ones. If you provide your own funding, you get to choose who is on your board, so you might feel you are really their boss, and to a certain degree you are right. However, they still have a fiduciary responsibility to you, the startup owner, to tell you, the startup officer (isn’t it great wearing many hats?), what to do. Indeed, you should probably consider firing them if they never do. Still, if you provide your own money and have no real interest in chasing down investment, you can have whatever pushover board members you want.

Then there are your cofounders. These guys typically have a similar stake in your company as you, and they frequently are on the board, but they obviously have special status. In fact, if you have more than a couple of founders for your company, it is likely that collectively your cofounders own more of the company than you do. Of course, the shared adversity of a startup environment tends to bind you even closer to your cofounders, creating additional responsibilities. A common phenomenon is a feeling of guilt about not working as hard as your other cofounders, which can often lead to an unspoken competition to see who will work hardest. More importantly though, it is really key to have all the founders of a company on the same page, so when even one founder raises a concern, you have to listen to them. Of course, if you found the company entirely by yourself, you can avoid these concerns too.

Then there are your employees. Your employees? Yup. You might think the term “employee” kind of implies that you are their boss, and you wouldn’t be entirely wrong. In practice, nothing is more important to the success of a startup than the initial choice of employees. Competition for talent is fierce, particularly in the valley, and you should assume that whenever the risk/reward equation for a given employee looks better elsewhere, they’ll start thinking about making a move. Furthermore, at a startup, you typically have to offer employees some level of equity in your company, so they also qualify as investors. Consequently, if they think you should do something, you need to listen, or your labour pool can’t grow beyond the founders. Of course, maybe that’s what you want, even if that lowers your chances of success. It is, after all, your company.

The one boss you can never escape is your customers. That’s right. The owners of a startup are, ultimately, looking to get customers to part with their money in exchange for some kind of product or service. Whatever terms your customers demand ought to be treated as divine commandments, and even if they just want or like something, you probably should listen. This is of course true with a large company as well, but with a startup, customers exert significantly more leverage. Indeed, this is why customers often like smaller businesses: “they are more flexible and listen to my needs [because I’m some measurable chunk of their revenue and they have to listen to me]”. It’s not uncommon for potential customers to have more capital than the entire net worth of a startup. They know this, and they love to exploit it. Customers can actually be far more demanding than your typical manager at a large company, because ultimately they might not care whether your business succeeds or fails, so long as they get what they want (IBM and Walmart are famous for actually driving companies into bankruptcy by making and/or changing outrageous demands of their suppliers, and then happily moving on to another company, leaving investors and lenders to pick up the pieces). A startup that doesn’t need customers isn’t a company: it’s a hedge fund. ;-)

Now, you still might say, “yeah, but I still get to call the shots more when I run my own company,” and you might very well be right. I will concede that there are three aspects of being a founder that are different from working at a large company.

First, there is your own mindset. When people found their own companies, they feel a sense of propriety that causes them to try to exert control more: “it’s my company, I’ll set my own hours” or “I don’t care what they say, I know this is the right way to do it, and it is my company”. I’d argue this is mostly a failure of mindset. Large companies are rife with people who play with new technologies, take on high risk projects, set their own hours, etc. The trick is learning how to create those opportunities for yourself, and it is a trick that takes some time to learn how to do.

Second, the risk/reward equation is typically fundamentally different for founders. Unless you’ve handed over almost all your equity to VCs and employees, as a founder you tend to have a huge stake in the company’s future; the kind of stake that allows people to retire after five years of work if they are lucky. You also typically are risking a significant chunk of capital and/or sweat equity. As an employee at a large company, one typically has some options that might at best double or triple one’s income for a year, but normally the “I don’t have to work any more” scenario doesn’t enter into the equation. An employee also has weighty guaranteed income in the form of their salary. Now, startup founders can sometimes negotiate a nice salary with their investors, but smart investors (which frustratingly are the ones you want) will typically seek to keep founder salaries low in order to ensure founders have a similar risk/reward profile to their own. The employee’s compensation package can be a set of golden handcuffs that limits their flexibility, making them risk averse and lazy, and a founder’s compensation package is typically akin to a ticket at a state fair concession that encourages them to assert themselves and swing for the fences, but beyond the undeniable economics of the situation, that is really a matter of choice.

Finally, and I think this is the one that a lot of people, including Mr. Graham, tend to not fully appreciate: the potential rewards for conformity in a large company are much, much higher. Large companies are very much like big ships at sea: they may not be able to change directions as quickly as startups, but they can bring tremendous power to bear in whatever direction they choose to point. This tends to create tremendous pressures to conform. If you try to negotiate a unique benefits package for your team, people will look at you funny (and look for an explanation), because you can get way better cost/benefit by going with the corporate plan. Same goes for having different hours, dress codes, coding styles, technologies, etc. This often creates situations where what would be the wrong decision at a startup actually turns out to be the right decision at a larger company just for this reason. If you don’t like conforming though, it isn’t hard to find places/roles in a company where the value of conformity is substantially less.

When you look at all this, you realize that the one difference that is hard to overcome is the risk/reward equation. Now, being in a high risk/reward situation does tend to change how you live your life, but unless you are lucky, it is not necessarily a net positive change (as Kevin has noted); feral cats might be living the call of the wild, but the stress of their “natural” lifestyle limits their lifespan to a small fraction of that of domesticated cats. When appealing to evolutionary notions of how we are “meant to live”, it is important to keep in mind that often what was “meant” was for us to die (particularly if you’re a weak geek with poor hand-eye coordination ;-).

Linux AIO sucks less

Posted by Christopher Smith Sun, 30 Mar 2008 23:11:00 GMT

So, for the last little while on the Zumastor project, I’ve been working on integrating AIO into the code base in order to absorb some of the latency penalty that we experience from disk seeks.

Critical for this was getting AIO to work with poll(2), because the ddsnapd daemon follows the tried-and-true “poll then do something” loop that allows for efficient, scalable, and relatively simple Unix server design. Unfortunately (or fortunately, if you are familiar with POSIX ;-), Linux’s native AIO doesn’t follow the POSIX AIO spec, and instead implements its own event queue for notification of completion of IO operations. This event queue isn’t exposed as a file, so you can’t poll it. So, I hacked together a library that spawns a separate thread which does nothing but read in events and copy them out to a pipe, so that the main thread can poll said pipe just like any other file descriptor. Ugly? Yes. Wasteful? Yes. Easier to work with than the apparent alternatives? Yes.

I got most of the way through the process. I discovered what appears to be some kind of race condition in AIO where the vast majority of the time I was losing completion events if I submitted multiple IO requests at once. I still haven’t tracked it down, but while looking for possible sources of the problem, I discovered a heretofore unknown (well, by those of us on the project at least) syscall: eventfd(2).

eventfd does for AIO what signalfd(2) does for signals. In other words: it does the obvious thing that we wanted in the first place but were too mentally challenged to find. The moral of the story: even if you think a Linux API (AIO in this case) sucks, expect it to suck less, and question why when it doesn’t.

I See Dead People

Posted by Christopher Smith Sun, 02 Mar 2008 19:56:00 GMT

This article amused me in the way only someone with a name like Christopher Smith can fully appreciate.

During interviews, I always hate getting the question “Why did you become a software developer?”, because to me it’s like asking, “Why did you become a heterosexual male?”. I’m sure there are factors, both genetic and environmental, but I can’t claim to have any real understanding of them. So, when I got this kind of question early on, I would fib a bit and tell people it was because my name had gotten me mixed up in who knows how many databases, and I wanted to fix that.

There is some truth to it. When I was in grade school I missed two days of school because I was sick, but because there was another boy with the same name attending school with me, all the records showed me as present, and I was called in to the Dean’s office because he had noticed I wasn’t in his class, and he assumed I had skipped out on his class. I once had all my Physics grades disappear because they overlapped with grades for another Christopher Smith. When I was at MIT (of all places), I was the source of much drama during rush week, because frats are obligated to tell other frats if a potential pledge is with them (imagine the confusion with multiple Chris Smiths, sometimes at the same place, sometimes at different places). My French teacher in high school had the unfortunate case of six students named Chris in a class of twenty-two. She decided to name us “Christophe Un, Christophe Deux…”, <sarcasm> which in no way made us feel like we weren’t getting individual attention </sarcasm>. I remember thinking my title as “Christophe Cinq” was a likely indication of how little she liked me, particularly since “Christophe Six” had shown up a month into the school year. When I was in boarding school, my parents arrived for the end of my first year, and my mother ran into who she thought was my math teacher (in fact, he’d been my teacher for the first week of school before I transferred to a different class) and almost fainted when he said my impressive performance on the final exam had “really made a difference” (my mom had correctly been under the impression that I was an A math student before I took the final). When I first visited the financial aid office in university, I felt quite the ego stroke as I was mistaken for a grad student with the same name but five years my senior. While flattering, I think it took most of the first semester to straighten out my financial aid status.
In a recent election, I discovered a small flaw in the voting process: my name had been crossed off for another Christopher Smith in the same precinct. I had to write up a provisional ballot, and to this day I doubt that my votes were counted. Most embarrassing was probably when I was invited to an orientation seminar for new students on campus that was clearly intended for women (if I’d been less insecure I would have realized it was a terrific opportunity to find a date, but somehow the emasculating experience of being mistaken for a female removed whatever limited confidence I had). I think my favourite one though was when I found out how much more a product manager with my name made when a notice of a raise got routed the wrong way.

I could probably drum up a half dozen more stories, but suffice it to say, it has been an adventure walking through life with a name disturbingly common amongst my peer group, and I certainly have become sensitive to issues of referential integrity and unique identifiers as a consequence (my wife remembers this from a recent discussion of how the grading system worked at her school ;-). I find that most databases have far more logical constraints in them than the actual system they are trying to represent. I’m sure the personal angst from this has helped me at various points, but I think the biggest lesson I’ve learned from it is how most systems tend to have cases that fall through the cracks, and the need to identify and respond to this as best one can. Reading this article, the SSA gets a big fail from me.

With something as significant as whether a person is alive or not, corrections should be trivial. Listing someone as dead should come from a death certificate, which in turn should be tied to all kinds of personal identifiers, including name, SSN, date of birth, time of death, place of birth, etc., and the identifier for the death certificate should be tied in to the transaction that lists them as “dead” (of course, that only works if your system actually collects this information, but the fact that it doesn’t strikes me as insane). If someone challenges the notion that they are dead, it ought to be as simple as comparing the data in their death certificate against other data on record, and quickly reversing the operation in the case where it doesn’t add up (even better would be if the system notified those doing the data entry of a likely error at the time of entry). Sure there are errors in death certificates too, but it is probably pretty easy to determine that there was someone else who should have been marked dead, and this person was just collateral damage. Just name, date and place of birth ought to avoid collisions except in the most exceptional circumstances. Only the most exceptional cases would require significant review or effort on behalf of the “deceased” party. The system should also have an additional state between “deceased” and “alive” along the lines of “under review”. Sure, you might not hand out social security cheques to folks in that state, but you probably wouldn’t throw out their tax returns either.
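For the programmers in the audience, the scheme above might look something like the following Python sketch. Every name in it (`VitalStatus`, `Identity`, `PersonRecord`, `record_death`, `challenge_death`) is made up for illustration; I have no idea what the SSA's actual schema looks like, and a real system would compare far more fields than these three.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class VitalStatus(Enum):
    ALIVE = "alive"
    UNDER_REVIEW = "under review"  # the in-between state suggested above
    DECEASED = "deceased"


@dataclass(frozen=True)
class Identity:
    # Name plus date and place of birth: enough to avoid most collisions.
    name: str
    born: date
    birthplace: str


@dataclass
class PersonRecord:
    ssn: str
    identity: Identity
    status: VitalStatus = VitalStatus.ALIVE
    death_certificate_id: Optional[str] = None


def record_death(record, certificate_id, certificate_identity):
    """Mark a record deceased only if the certificate matches it."""
    if certificate_identity != record.identity:
        return False  # likely data-entry error: refuse at time of entry
    record.status = VitalStatus.DECEASED
    # Keep the certificate tied to the transaction that listed them as dead.
    record.death_certificate_id = certificate_id
    return True


def challenge_death(record, certificate_identity):
    """Handle an 'I am not dead' claim by re-checking the certificate."""
    if record.status is not VitalStatus.DECEASED:
        return record.status
    if certificate_identity != record.identity:
        # The certificate belonged to someone else: reverse immediately.
        record.status = VitalStatus.ALIVE
        record.death_certificate_id = None
    else:
        # Everything matches, so a human needs to take a look.
        record.status = VitalStatus.UNDER_REVIEW
    return record.status
```

The point of the sketch is only that the mismatch check is cheap: one comparison at entry time, and one more when somebody objects.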

Honestly, given how error prone and time consuming the system is, I have to wonder at the thinking which led the powers that be not to streamline the process along these lines long ago. I’m sure it makes sense in some really unfortunate way, but really, this is the kind of problem that you imagine happens somewhere else.

Where Is the Collaborative Filtering?

Posted by Christopher Smith Thu, 31 Jan 2008 05:39:00 GMT

Rob Malda recently discussed why Digg, reddit, etc. all stink. He’s bang on the money, but this brings to mind the thing that has been driving me batty about these news sites: what’s going wrong with the collaborative filtering?

In theory, collaborative filtering algorithms should effectively work like this: lots of people label different bits of a dataset based on their tastes. The collaborative filtering algorithm chews through all the labels in the dataset and then predicts how you would label other bits of the dataset, based on how the people whose labels most closely resemble yours have labeled them. When it comes to news, this should mean the engine selects news items based on what is interesting to other people who usually find the same news interesting as you do. My experience on reddit is that this somehow means that no matter how many Ron Paul articles I rate as totally uninteresting, it still seems to find new Ron Paul articles which reddit believes I will find completely fascinating.
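To be concrete about what I mean, here is a toy user-based version in Python. It is only a sketch under my own assumptions: votes are +1 (interesting) or -1 (uninteresting), similarity is the average agreement on articles two users have both rated, and the users, article names, and function names are all invented. It is not a description of what reddit actually runs.

```python
def similarity(mine, theirs):
    """Average agreement (+1/-1 votes) over articles both users rated."""
    shared = set(mine) & set(theirs)
    if not shared:
        return 0.0
    return sum(mine[a] * theirs[a] for a in shared) / len(shared)


def predict(ratings, user, article):
    """Predict a user's vote as a similarity-weighted average of other votes."""
    total = 0.0
    weight = 0.0
    for other, votes in ratings.items():
        if other == user or article not in votes:
            continue
        sim = similarity(ratings[user], votes)
        total += sim * votes[article]
        weight += abs(sim)
    return total / weight if weight else 0.0


# Hypothetical votes: alice shares my tastes, bob is my exact opposite.
ratings = {
    "me": {"ron_paul_1": -1, "ron_paul_2": -1, "linux_aio": 1},
    "alice": {"ron_paul_1": -1, "ron_paul_2": -1, "linux_aio": 1,
              "ron_paul_3": -1, "cpp0x": 1},
    "bob": {"ron_paul_1": 1, "ron_paul_2": 1, "linux_aio": -1,
            "ron_paul_3": 1, "cpp0x": -1},
}

print(predict(ratings, "me", "ron_paul_3"))  # -1.0: predicted uninteresting
print(predict(ratings, "me", "cpp0x"))       # 1.0: predicted interesting
```

Note that in this sketch bob's enthusiasm for a third Ron Paul article counts against it for me, because his past votes are the opposite of mine. A gentler variant would clamp negative similarities to zero so that such users are simply ignored.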

Now, I’m well aware that creative people will find ways to game the system, but frankly, proper collaborative filtering should make it really hard to game the system so overwhelmingly. If you spam the system much, other people most similar to you start labeling your spam the opposite to how you have, and very quickly your recommendations don’t impact them any more. The only way to get back in the game would be to create a new account and quickly try to label a whole bunch of data to get into a trusted position again. This will get increasingly difficult as a community matures, as there will be more and more labeled data for at least the older accounts. I’ve been on reddit for ages, faithfully labeling articles almost every day, and the recommended page is still completely useless.

I’ve seen comments from the folks behind reddit that describe their collaborative filtering algorithm as “not working well when there are wide divergences between segments of the community”, which makes me think it isn’t collaborative filtering at all, but rather a machine learning algorithm that is trying to predict an overall “interesting” score, with a karma system to boost the weights of redditors’ labels. When you read the FAQ, this is how it all seems to work.

So is that the deal? The term is just horribly abused, and nobody has bothered to put together a proper collaborative filtering news site, or is there some inherent problem with the concept that I’m missing out on?
