Two Speeches

Posted by Christopher Smith Wed, 19 Mar 2008 17:33:00 GMT

It’s a black and white difference that you can see with your eyes closed.

And Now For Something Completely Different

Posted by Christopher Smith Wed, 20 Sep 2006 20:28:00 GMT

So, normally I don’t talk about pop culture, because I don’t properly appreciate it, which makes me a lousy critic, but I’m making an exception for Studio 60, or as I like to call it: season three of The Sorkin and Schlamme Show. I finally got around to getting my TiVo to show me the pilot. If you can’t wait for the episodes of this show to come out, I suggest you find a friend with DVDs of Sports Night (that’d be me) and ask to borrow them. The parallels are so numerous that upon seeing Felicity Huffman in an obvious guest star role, I didn’t register that she was a guest star until about five minutes after I started wondering why she hadn’t been on the screen for a while. Similarly, anyone who has been in mourning since the show was canceled can put on a happy face again. For all intents and purposes, it’s back.

I’d do more of a write-up on the show, but I trust that every TV critic on the planet has already gone over it in depth, so I’ll just hit some highlights. Sorkin’s politics are, as usual, not being held close to the vest. The show is so insanely autobiographical that if you read the Wikipedia pages on Sorkin and on the show, you can get confused about which page you are on. Amanda Peet must be thanking her lucky stars for landing her part: she’s going to look great in it. And just how awesome is it to pilot a show with an homage to Network?

Ruby on Rails Memory Efficiency

Posted by Christopher Smith Wed, 13 Sep 2006 16:00:00 GMT

Okay, yesterday I said I suspected that Ruby wasn’t terribly memory efficient. Today I have some strong indications that at least Typo isn’t.

I had noticed that my blog was getting quite slow at handling updates, and it wasn’t clear to me why. I am particularly sensitive to efficiency issues, because I run this whole thing inside a User Mode Linux instance, which imposes its own inefficiencies on pretty much all aspects of the Linux kernel. So I generally have to be fairly careful to tune everything I run inside it to minimize kernel involvement. I was all ready to blame UML again until I looked at what was going on with the instance: it was swapping like crazy.

Some peeking around in the system showed working set sizes for my Ruby FastCGI processes in the 35MB-50MB range (very bad considering my instance only has 64MB of RAM allocated to it). That’s the kind of footprint that makes you start to think of J2SE/EE! To get an idea of how inefficient that is, Postgres’s working set size for handling this puny blog is about 4-5MB.

So far, I’ve been able to get back to some kind of decent performance by trimming my FastCGI process count down to one (which has some unfortunate side effects, but none as unfortunate as having 20-30MB of memory constantly swapping).

This piques my curiosity as to what’s going on under the covers, and how much of the overhead is Ruby, how much is Rails, and how much is Typo.

On Efficiency, Scalability, and the Wisdom to Know the Difference

Posted by Christopher Smith Wed, 13 Sep 2006 00:11:00 GMT

Joel Spolsky has been on a tear lately. He’s managed to really kick up a lot of dust. I’ve ignored most of the excitement, but I couldn’t ignore his latest post. He seems to have completely confused the differences between efficiency and scalability and has curious notions about the reasons for the importance of either.

Let’s review: efficiency is the ability to get something done while consuming few resources. Efficient code uses less memory, less IO bandwidth, and less CPU time to get the same job done. Scalability is a made up term used in industry to refer to the notion of software being able to handle growing levels of work gracefully, where gracefully generally translates to “costs grow no more than linearly with the size of the work”.
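To make the distinction concrete, here is a minimal sketch in Python (my choice for illustration; nothing in Joel’s post or this one depends on it). Both functions answer the same question, and the function names and the step counter are hypothetical helpers invented for this example. The point is how the work grows when the input gets ten times bigger, not how fast either one is in absolute terms.

```python
def has_duplicate_quadratic(items):
    """Compare every pair: roughly n*n/2 comparisons when there is no duplicate."""
    steps = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            steps += 1
            if items[i] == items[j]:
                return True, steps
    return False, steps


def has_duplicate_linear(items):
    """Remember what has been seen: one set lookup per item."""
    steps = 0
    seen = set()
    for item in items:
        steps += 1
        if item in seen:
            return True, steps
        seen.add(item)
    return False, steps


def growth_factor(func, small_n, large_n):
    """How many times more steps the larger (duplicate-free) input costs."""
    _, small_steps = func(list(range(small_n)))
    _, large_steps = func(list(range(large_n)))
    return large_steps / small_steps


if __name__ == "__main__":
    # Ten times the input: the linear version does ten times the steps,
    # the quadratic version does roughly a hundred times the steps.
    print(growth_factor(has_duplicate_linear, 100, 1000))
    print(growth_factor(has_duplicate_quadratic, 100, 1000))
```

Rewriting the quadratic version in a faster language makes each step cheaper (efficiency) but does nothing about the growth curve (scalability).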

All of Joel’s arguments were about efficiency, not scalability. This is important because, particularly for web applications, it is well recognized that non-scalable solutions are really not that useful, so there is hardly any debate about that. Scalability is generally tied to algorithms and the infamous Big O notation, so it’s very hard to point at a programming language or other low-level component and say, “that’s not scalable”. You can find frameworks (sometimes libraries) and identify places where they fail to scale, but Joel declines to do so. His beef is with Ruby, so it is all about efficiency (and actually specifically CPU efficiency), despite all his statements about it being “scalability”.

Efficiency is an interesting point of contention. People tend to make a huge deal about it even when they shouldn’t, and ironically don’t tend to make a huge deal about it for the one reason they should. I’m surprised that Joel fell into the same trap.

First, I haven’t yet seen an implementation of Ruby that is particularly CPU efficient. I haven’t looked at memory consumption, but I’d be willing to guess it’s not so great there. Like most languages, it’s fine for IO efficiency.

So, right off the bat, if your application’s limiting factor is IO, there isn’t an efficiency disadvantage to using Ruby. Check around, and you’ll find there are a LOT of apps that are essentially IO bound. If your competition is using C and needs 10 servers, you could use even dog-slow Ruby and still only need 10 servers, not 100.
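The 10-versus-100 arithmetic can be sketched with a deliberately crude capacity model. Everything here is an assumption made up for illustration: the function name, the per-request costs, and the idea that each server offers exactly one second of CPU time and one second of IO time per wall-clock second. Real capacity planning is messier, but the shape of the argument survives.

```python
import math


def servers_needed(requests_per_sec, cpu_ms_per_request, io_ms_per_request,
                   cpu_ms_per_server_sec=1000.0, io_ms_per_server_sec=1000.0):
    """Servers required under a toy model where the scarcer resource decides."""
    cpu_servers = requests_per_sec * cpu_ms_per_request / cpu_ms_per_server_sec
    io_servers = requests_per_sec * io_ms_per_request / io_ms_per_server_sec
    return math.ceil(max(cpu_servers, io_servers))


if __name__ == "__main__":
    # IO-bound app: 10 ms of IO per request dominates either runtime's CPU cost.
    print(servers_needed(1000, cpu_ms_per_request=0.5, io_ms_per_request=10))  # fast runtime
    print(servers_needed(1000, cpu_ms_per_request=5.0, io_ms_per_request=10))  # 10x slower CPU

    # CPU-bound app: now the 10x CPU penalty shows up directly in server count.
    print(servers_needed(1000, cpu_ms_per_request=10, io_ms_per_request=1))
    print(servers_needed(1000, cpu_ms_per_request=100, io_ms_per_request=1))
```

With these made-up numbers the IO-bound app needs 10 servers either way, while the CPU-bound app goes from 10 to 100.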

Now, Joel claims you’ll inevitably run into some place where you are CPU bound. I’ve seen exceptions to this rule, but for the most part he’s right. There is always some performance hot spot that comes up that needs to be optimized. A lot of the time, even that hotspot can be addressed algorithmically, which means that you really don’t care about the language it’s implemented in. In the cases where it ultimately comes down to a language runtime’s CPU efficiency, the faulty logic here is that because this one part of your app can’t be implemented in language X, you can’t use language X for the rest of your app. That’s just silly. You can always implement that hotspot in some other language, provided your language has some reasonably efficient way to hand off computation to code in another language (which Ruby seems to do reasonably well). If it is the difference between 10 and 100 servers, it is probably worth the development overhead to do it.

Joel also pokes fun at “advocates singing hymns about developer cycles vs. CPU cycles”, which I found surprising as well. Sure, you have small parts of your application where CPU cycles are key, and it’s worth sacrificing developer cycles for that added efficiency, but generally for apps the bulk of your code is much more sensitive to developer efficiency, because developer efficiency translates to “more features that work better”. You can find evidence of this in almost every software paradigm: interpreters in embedded systems, languages like Lua bound to high performance C++ game engines in the gaming industry, web servers written in C calling PHP/Perl/Ruby/VBScript/Python/whatever which in turn invoke functions in highly tuned databases (and it’s worth pointing out that Yahoo and Google use PHP, Python and Java despite scaling their apps to literally thousands of servers), and desktop apps like Word whose core is carefully tuned C/C++ and assembler, but that use languages like Basic to implement a lot of their features. The biggest example is the web browser. Most web apps are implemented in XML and JavaScript (neither of which is about to set any efficiency records) that are executed by some very highly tuned browsers. So, there is considerable evidence that while you often need some expertise with a CPU efficient runtime, for almost any problem domain, what Joel calls an “inefficient” runtime is still useful and desirable.

Joel also makes some funny claims about duck typing affecting performance. Sure, it has an effect, but it is hardly the kind of thing that can’t be overcome. Yes, it makes type inferencing harder, but the key word there is harder, not impossible. Lots of folks have demonstrated how you can do really simple things like “hey, if self is of type X when I make this first call, it’s probably of type X when I make subsequent calls, and in fact, I can prove that it is always true until someone loads some more code into this image”. With a sufficiently clever runtime (which Ruby lacks at this time), you can and should be able to get to the point where you are no worse than half as CPU efficient as C code. Joel’s right on one key point though: Ruby lacks this at this time, and that is a concern, but the concern is one he fails to mention.
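The “it’s probably still type X” trick is usually called an inline cache. Below is a minimal sketch of the bookkeeping in Python; the CallSite, Duck, and Robot classes are hypothetical names invented for this example, and this is not how any particular Ruby or Python runtime implements it. Written at this level it won’t make anything faster (a real runtime does this in generated machine code), so treat it purely as an illustration of the logic.

```python
class CallSite:
    """One call site that remembers the last receiver type it saw."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_type = None
        self.cached_func = None
        self.hits = 0
        self.misses = 0

    def call(self, receiver, *args):
        receiver_type = type(receiver)
        if receiver_type is self.cached_type:
            # Fast path: same type as last time, so skip the method lookup.
            self.hits += 1
        else:
            # Slow path: do the full lookup once and remember the result.
            self.misses += 1
            self.cached_type = receiver_type
            self.cached_func = getattr(receiver_type, self.method_name)
        return self.cached_func(receiver, *args)

    def invalidate(self):
        """What a runtime would do when new code is loaded into the image."""
        self.cached_type = None
        self.cached_func = None


class Duck:
    def speak(self):
        return "quack"


class Robot:
    def speak(self):
        return "beep"


if __name__ == "__main__":
    site = CallSite("speak")
    for _ in range(3):
        site.call(Duck())
    site.call(Robot())
    # Three Duck calls cost one lookup; the Robot call forces a second one.
    print(site.hits, site.misses)
```

Duck typing still works (the Robot call succeeds), it just pays for a lookup when the guess about the receiver’s type turns out to be wrong.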

Read that last sentence again: after two paragraphs pointing out that efficiency isn’t really that important, now I’m saying it is. Isn’t life full of contradictions?

Efficiency *is* important because it is a fairly reasonable proxy for the maturity of a platform. There’s a funny little factoid about software: there is almost always a way to write code in a way that gives the runtime enough information to execute efficiently. When your runtime doesn’t do this, you have to ask the question: why?

Joel makes the argument that you should be able to get the overhead of a function call down to the level where it’s a single CALL instruction. First, a CALL instruction can be expensive, thanks to the wonder of cache misses. That aside, you can in fact get it down to where the overhead isn’t even a single instruction, thanks to the wonders of inlining. As Herb Sutter pointed out in his article “Inline Redux”, inlining is almost always possible, because there are so many places where you can do it (Java, which Joel suggests has poor performance, has runtimes that inline far more aggressively than most C/C++ runtimes). As I mentioned above, you can do tricks with type inferencing that get around the performance costs of late binding, except for when you are actually taking advantage of late binding’s benefits (in which case, as per Greenspun’s 10th law, the late binding runtime probably performs better than most attempts to get equivalent capabilities using C/C++). The same can be said for automatic heap management through all kinds of tricks. You can get message dispatch or generic dispatch to perform like function dispatch for the cases where you only need the simpler functionality of the latter. Zero overhead bounds checking can be done by code analysis or in the worst case using page faults. Really, the list of performance optimizations available tends to trump the best efforts of language designers to make things slow. ;-)

So that brings us back to the key question: if your runtime isn’t that efficient, why? The answer is that nobody has put that kind of effort into making it that efficient. It just hasn’t been worth the effort yet, and that strongly suggests that the platform just isn’t that mature yet. If it were, that’d be one of the things that would have been addressed along with integration with legacy systems, sophisticated development and debugging tools, dealing with corner cases that could break the runtime, building out a complete set of support libraries, integration with various platforms and technologies, etc., etc. The bottom line is that if efficiency hasn’t been tackled to the point where you are at least about half as efficient as the ideal solution, some of those things haven’t been addressed. While efficiency might not matter to you, at least one of those other things probably will. Lack of CPU efficiency should be treated as a strong indicator that some other shortcoming that really matters to you might exist.

Now, there are important exceptions to consider. In particular, there are languages like Erlang, where the whole point is to deal with one very difficult domain efficiently (distributed computing/parallelism), and they actively encourage you to use another language (C) for more “regular” tasks. Even in those cases, you can expect that Erlang will show some lack of maturity if you try to use it for “regular” tasks, but you’re probably more than happy to use it for what it’s good at, and use something else for the rest.

Now, a runtime can be lacking in CPU efficiency in your application domain and still be useful for you. It could be your problem domain is all about other efficiencies, like IO or memory, and the runtime is great for that stuff (indeed, CPU and memory trade-offs in particular create cases where it really only matters if a runtime can be memory efficient). You might be at a startup where the maturity of your platform just isn’t as important as your ability to get something up and running before the company runs out of capital, and some degree of risk due to platform immaturity is acceptable. You might be working on a problem that is so complex, and your resources are so constrained, that just getting something that works is such a victory that nobody cares about how efficient it is, how well it integrates with other technologies, etc. Fine, but most people benefit from the advantages of working with a mature platform, and as such, efficiency is a very good proxy for the far more difficult to quantify property of maturity.

So yes, I’d agree that efficiency is important, but in none of the ways that Joel suggests.