Benchmarking Typo

I finally had a bit of time to do some Typo benchmarking over the weekend and (as usual) found that my instincts were all wrong.

I was specifically interested in the performance difference between the page cache and the action cache–my guess was that the action cache was a 10x performance hit.

So I set up a test environment under Xen, running Typo r683, PostgreSQL, Apache 2, FastCGI, Ruby 1.8.2, and Rails 0.13.1. I didn’t do any Apache or Postgres tuning–I just ran them out of the box.

Then I ran ab against a snapshot of scottstuff.net from a couple weeks ago. I used the index page for my testing, as it’s a fairly large page and I wanted to give Typo a real workout.

Here’s what I found:

Cache Type      Requests per second
Page Cache      2357
Action Cache    10.6
No cache        1.01

That really wasn’t what I’d expected. The action cache underperformed my expectations by a factor of 20.

I then did a bit of experimentation. I created a new uncached action in my test Typo setup that did nothing but render :text => 'foo', :layout => false, just to see if the caching system was slowing things down. Result? 10 requests/second. Then I created a new Rails project from scratch and added a new controller with the same action, and saw the same results. Still 10 requests/second.

However, in Rails’s logs, it said that it handled the request in 2 ms, and I should be seeing 500 hits/sec for the “foo” page. So something is adding an extra 98 ms to each request. I’m still hunting for this–I don’t know if it’s something Xen-related on my system, an artifact of my Apache config, or what.
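
The arithmetic behind that 98 ms figure can be sketched as follows. The variable names are made up for illustration, the numbers are the ones quoted above, and the calculation assumes requests are handled one at a time, which is a simplification.

```ruby
# Per-request overhead implied by the measurements above.
# Assumes strictly serial request handling (a simplification).
observed_rate  = 10.0  # requests/second reported by ab
rails_time_ms  = 2.0   # per-request time reported in the Rails log

wall_time_ms   = 1000.0 / observed_rate        # time per request as seen by ab
ideal_rate     = 1000.0 / rails_time_ms        # rate if the logged 2 ms were the whole story
unexplained_ms = wall_time_ms - rails_time_ms  # time spent outside the logged window

puts "wall time per request: #{wall_time_ms} ms"
puts "ideal rate:            #{ideal_rate} req/sec"
puts "unexplained overhead:  #{unexplained_ms} ms"
```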

I’ve tried upgrading to Rails 0.14.0, but that’s a whole other article.

Conclusions:

  1. The page cache is really, really fast.
  2. The action cache is a substantial improvement over the uncached case–about 10x on this system–but can’t touch the performance of the page cache.
  3. Changing the concurrency settings on ab and/or the number of FastCGI backends in use didn’t make a substantial performance difference. Settings 2-15 gave roughly the same results.
  4. My caches_action_with_params is slightly faster than the stock action cache.
  5. Moving the action/fragment cache from the FileStore (Typo default) to MemoryStore gives no real performance boost. Moving to the MemCacheStore is a substantial performance hit (~2 hits/sec vs 10 hits/sec).
  6. Adding a new uncached action in ArticlesController that simply returns a fixed string (render :layout => false, :text => 'foo') is no faster than the action cache.
  7. Moving the new action from the previous step to a Controller of its own doesn’t help.
  8. Removing all of the routes except the default :controller/:action/:id route doesn’t help, either.

Frankly, on this hardware, I don’t seem to be able to get more than 10 requests/sec out of Rails no matter what I do. I’m pretty sure that this is a mistake, so I’ll post a followup when I figure out what’s wrong.

Update 1: Another datapoint. Running an example Ruby FCGI on this box gives me 832 hits per second. So whatever the problem is, it’s not fundamental to Ruby FCGI on this box. So I need to look into Rails and see what’s happening.

Update 2: Making some progress. Apparently I screwed up when I tested a new, standalone Rails FCGI app before (forgot to restart FCGI?). This time, I got 130 req/sec, which is a vast improvement over the 10 req/sec that I was seeing before. Most of that speed hit seems to come from using Postgres for session storage. Unfortunately, even after making that change, my null controller is still only getting 40 req/sec with Typo. Transporting the same controller to a blank Rails project gives me 130 req/sec with the same code. Watching strace, it looks like something is forcing Typo to reload the digest/md5 module for every hit. Unfortunately, I can’t figure out how that’s happening–my environment.rb is identical between the two trees, as is my database.yml. I’ll get back to this later; I have other things that I need to finish today.

Posted by Scott Laird Mon, 17 Oct 2005 16:51:13 GMT


Asterisk extension language

While I really like all of the things that Asterisk allows me to do with my phone system, I’m really not very fond of its configuration language. The language provided in Asterisk 1.0 is slightly better than sendmail.cf, but it’s still a lousy language. They made a few small improvements early in the Asterisk 1.2 development process that helped a bit, but I assumed that we’d have to wait for another year or two before someone broke down and wrote a decent language for Asterisk.

It looks like I may have been overly pessimistic. I discovered the Asterisk extension language on a list of new features for Asterisk 1.2 today. Somehow I’d missed this when it first went into Asterisk.

Here’s a chunk out of my old config file. It’s not perfect, but it’s what you get when you’re stuck dealing with line numbers:

[macro-diallocal]
  exten => s,1,AbsoluteTimeout(7200)
  exten => s,n,SetAMAFlags(default)
  exten => s,n(analog),Dial(${TRUNK}/${ARG1})
  exten => s,n,Congestion
  exten => s,analog+101,Macro(condsetcid)
  exten => s,n,SetCIDName(LAIRD SCOTT)
  exten => s,n,SetAMAFlags(billing)
  exten => s,n(nufone),Dial(${NUFONE}/1${ARG1})
  exten => s,n,Congestion
  exten => s,nufone+101,Busy

And here’s the equivalent using the new configuration language:

macro diallocal( number ) {
  AbsoluteTimeout(7200);
  SetAMAFlags(default);
  Dial(${TRUNK}/${number});
  if(${DIALSTATUS} = "CONGESTION" || ${DIALSTATUS} = "CHANUNAVAILABLE") {
    &condsetcid();
    SetCIDName("LAIRD SCOTT");
    SetAMAFlags(billing);
    Dial(${NUFONE}/1${number});
    switch(${DIALSTATUS}) {
      case BUSY:
        Busy;
      default:
        Congestion;
    }
  }
}

It’s still not ideal–${VAR} is ugly–but it’s vastly better than the old syntax. This plus a bit of Rails integration should give us a really nice phone environment. I’m rapidly approaching the “tear it down and rebuild it” point with my Asterisk system–there’s a lot of stuff that I’d like to add that just doesn’t fit in with my current configuration files–so I’ll have a set of articles on integrating Asterisk, AEL, and Rails in a few weeks.

Posted by Scott Laird Mon, 10 Oct 2005 19:28:54 GMT


Typo Theme Contest

Geoffrey Grosenbach just announced the start of the official Typo theme contest. There are a number of fantastic prizes (including a 4 GB iPod nano) available, so take a look at the rules and fire up your editor now.

Posted by Scott Laird Mon, 10 Oct 2005 17:39:19 GMT


Rails caches_action_with_params

One of the big problems with caching in Rails is the way that Rails’s caching system handles query parameters. Page caching completely screws this up–the page cache will turn /articles/read?id=100 into /articles/read.html, and Apache will then hand all future hits on /articles/read off to that static HTML file, even if the user was looking for /articles/read?id=99. You can mostly get around this by making sure that you always use named parameters via Rails’s routes, but even then a malicious user can do weird things to your cache by feeding query parameters via ?.
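
To make the collision concrete, here is a minimal sketch of a path-only mapping like the one described above. page_cache_file is a hypothetical helper written for this illustration, not Rails’s actual implementation; it only shows that dropping the query string sends both requests to the same file.

```ruby
require 'uri'

# Hypothetical helper (not a Rails API): map a request URL to the static
# file a path-only page cache would write. The query string is ignored.
def page_cache_file(url)
  path = URI.parse(url).path
  path = '/index' if path.empty? || path == '/'
  "#{path}.html"
end

puts page_cache_file('/articles/read?id=100')  # prints /articles/read.html
puts page_cache_file('/articles/read?id=99')   # prints /articles/read.html
```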

The action cache is slightly better, but it’ll still misbehave with the examples above. What we really need is a caching system that pays attention to all parameters, not one that ignores all of them that aren’t part of a route.

Towards that end, I’ve created caches_action_with_params. It’s a minor derivative of caches_action with a different fragment cache key; instead of using the URL (as generated by url_for), it ignores URLs completely and uses ACTION_PARAM/<host>/<controller>/<action>/<params>. This way caching isn’t dependent on routing, which will help with some of the stranger problems that Typo has seen. On the downside, if your actions explicitly check the URL that the user used, then caches_action_with_params won’t work for you.
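
A rough sketch of how a key in that shape could be built. The function name action_param_key and the way the parameters are serialized (sorted, URL-escaped, joined with &) are assumptions made for this example; the text above only gives the overall ACTION_PARAM/<host>/<controller>/<action>/<params> layout.

```ruby
require 'uri'

# Hypothetical sketch: build a fragment cache key from the request's host,
# controller, action, and parameters, never from the URL itself.
# The parameter serialization used here is an assumption for illustration.
def action_param_key(host, controller, action, params)
  # Escape and sort so the same parameters always produce the same key,
  # whatever order they arrived in.
  pairs = params.map do |name, value|
    [URI.encode_www_form_component(name.to_s),
     URI.encode_www_form_component(value.to_s)]
  end
  param_part = pairs.sort.map { |name, value| "#{name}=#{value}" }.join('&')
  ['ACTION_PARAM', host, controller, action, param_part].join('/')
end

puts action_param_key('example.com', 'articles', 'read', :id => 100)
puts action_param_key('example.com', 'articles', 'read', :id => 99)
```

With a key like this, id=100 and id=99 land in separate cache entries no matter which route or query string delivered them.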

Once I’m off the train and sitting somewhere with usable IP, I’ll post some sample code to the Typo bug tracker and generate a couple benchmarks. I expect this to be about 10% as fast as the page cache, but it should still be faster than 100 hits/second, which is my personal definition of “fast enough” this week. Then, if no one has any big complaints, I’ll commit this and switch off the page cache.

Once that is done, it’ll be fairly easy to add a lifespan to cached pages, so we can say “this page is only good for 2 hours” and have it regenerate automatically after that.
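
The lifespan idea can be sketched with a tiny in-memory store. ExpiringStore and its fetch method are hypothetical names invented for this example and say nothing about how Rails’s fragment stores work; the point is only the “regenerate once the entry is older than N seconds” logic.

```ruby
# Hypothetical in-memory store where each lookup states how long a cached
# entry stays valid. Not a Rails class; a sketch of the expiry logic only.
class ExpiringStore
  def initialize
    @entries = {}
  end

  # Return the cached value for key if it is younger than ttl seconds;
  # otherwise run the block, store its result, and return that.
  # now is injectable so the behaviour can be checked without sleeping.
  def fetch(key, ttl, now = Time.now)
    entry = @entries[key]
    return entry[:value] if entry && (now - entry[:stored_at]) < ttl

    value = yield
    @entries[key] = { :value => value, :stored_at => now }
    value
  end
end

store = ExpiringStore.new
two_hours = 2 * 60 * 60
html = store.fetch('articles/read/100', two_hours) { '<html>rendered page</html>' }
puts html
```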

Posted by Scott Laird Tue, 04 Oct 2005 15:54:24 GMT


Rails caching presentation

I gave a short talk on caching with Rails last night at the Seattle.rb meeting. The short version is “the page cache is going to hurt a lot worse than you’d expect,” but anyone who read my previous article on caching should already know that. I did the slides for the talk with S5, which was new to me–I had planned on using Keynote, but it seems to have died in the year and a half since I last had a use for it. S5 worked well enough, although there were some formatting issues that kept popping up as the browser window size changed. By and large it was easy to use, and it’s nice to have an HTML version of the talk that doesn’t look like a nasty afterthought.

About halfway through preparing for the talk, I realized that I really need to add a new action cache option, something like caches_action_with_params, so we can explicitly say how query strings and other parameters affect the cache. Here’s a bit of sample code:

class ArticlesController < ApplicationController
  caches_action :index
  caches_action_with_params :read, :id
  caches_action_with_params :permalink, :year, :month, :day, :title

  def index
    @pages, @articles = paginate(
      :article, 
      :per_page => config[:limit_article_display], 
      :conditions => 'published != 0', 
      :order_by => "created_at DESC")
  end

  def read  
    @article = Article.find(
      params[:id], 
      :conditions => "published != 0", 
      :include => [:categories])    
  end

  def permalink
    @article = Article.find_by_permalink(
      params[:year], 
      params[:month], 
      params[:day], 
      params[:title])
  end

  ...
end

At least as of Rails 0.13.1, calling /articles/read?id=10 will create a cache entry for /articles/read, which is wrong, and then asking for /articles/read?id=20 will return the cached entry for id=10. Yes, the user is supposed to use routes for this, but explicit query params still work, and there are times when you really need to use them. Fortunately, this is really only 20 lines of code, so it shouldn’t be too hard to write.
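
Here is a plain-Ruby sketch of what such a declaration might record and how it could feed a cache key. ParamCaching, FakeController, and cache_key_for are hypothetical names, and nothing here hooks into Rails filters; it only illustrates keeping a per-action list of the parameters that matter.

```ruby
# Hypothetical sketch, plain Ruby only: remember which parameters each
# cached action depends on, then build a key from just those parameters.
module ParamCaching
  def caches_action_with_params(action, *param_names)
    cached_action_params[action] = param_names
  end

  def cached_action_params
    @cached_action_params ||= {}
  end

  # Parameters not named in the declaration are ignored, so stray
  # query strings cannot create extra cache entries.
  def cache_key_for(action, params)
    names = cached_action_params.fetch(action)
    picked = names.map { |name| "#{name}=#{params[name]}" }.join('&')
    "#{action}?#{picked}"
  end
end

class FakeController
  extend ParamCaching
  caches_action_with_params :read, :id
  caches_action_with_params :permalink, :year, :month, :day, :title
end

puts FakeController.cache_key_for(:read, :id => 10, :junk => 'x')  # prints read?id=10
puts FakeController.cache_key_for(:read, :id => 20)                # prints read?id=20
```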

Posted by Scott Laird Wed, 28 Sep 2005 15:53:57 GMT


Comments for Typo

I’m starting to look at expanding Typo’s comment system. I have a few goals:

  1. Give Typo the best comment system of any weblog engine.

  2. Make it easier to handle large numbers of comments. This includes some sort of comment threading as well as comment pagination. Yes, those conflict with each other. No, I’m not sure how I’ll resolve it.

  3. Make the comment system more resistant to spam. I’m not really sure which approaches I’m going to use for this either, but this is a well-explored problem, and I have a few ideas. Fortunately, Typo has always had a few features that discourage comment spam, and we’d like to add to that a bit.

  4. Add support for authenticated comments, ideally using one or more external user identity systems, like TypeKey or OpenID. Personally, I’m really excited about OpenID, and I’d like to have Typo become both an OpenID client and server.

  5. Increase the “socialness” of the comment system. Make it easier to conduct discussions in comments. Allow frequent users to track responses to their comments via RSS or email. Allow “community building” via comments. Consider allowing some forum-like features, like allowing users to re-edit their comments after they’re posted. This obviously depends on having a reasonable identity and authentication system.

  6. Make it faster while we’re at it. Posting new comments currently invalidates Typo’s entire page cache; there has to be a way around that so we can keep up with Slashdot when they link to a Typo site.

  7. Keep it easy to run. One of my basic goals with Typo is to minimize the number of configuration options in the system, following DHH’s mantra: “flexibility is overrated.” At the same time, different people *do* have different needs; we need to find the right balance between having 50,000 little configuration options and force-feeding the One True Comment System down people’s throats.

My goal is to have something to show for this in the next couple weeks; that depends on my consulting schedule and a few other things that are hard to predict, but I’m pretty optimistic about this.

Before I dig deeply into design, does anyone have any particularly good comment systems that we should look at? My current favorite is Dunstan Orchard’s on 1976design.com, but I’m open for suggestions. I’d love to hear about needs that I’m overlooking, too.

Posted by Scott Laird Fri, 16 Sep 2005 02:10:00 GMT


RubyConf '05 registration is closed

DHH just announced that RubyConf ’05 is now full and they’re not taking any more registrations. When I went in 2002, there were around 50 people; apparently they drew the line at 195 this year.

I was still debating going, too. Oh well. Next year.

Posted by Scott Laird Fri, 16 Sep 2005 01:47:36 GMT


Rails and Asterisk

Oooh. Just when it looks like I’m going to have to stoop to programming in Perl to do a bit of Asterisk integration, someone comes out with a way to use Rails with Asterisk’s AGI scripting interface.

This has the potential to be really interesting, because Asterisk+Rails should be about a decade more advanced than anything that the IVR people are used to seeing.

I have a couple small Asterisk scripting jobs that I’ve promised to various family members; if my schedule allows, I’ll give RAGI a spin and share how it goes later this week.

Posted by Scott Laird Mon, 12 Sep 2005 14:11:20 GMT


IMAP on Ruby

I just noticed that there is now a Ruby-based IMAP server available. Ximapd is only at version 0.0.4, which suggests that it isn't exactly production-ready, but I'll be following it as it develops over the next few months. Like the Ruby WebDAV server that I talked about a month ago, the big value in these sorts of servers is the fantastic things that you can do when you integrate them into non-traditional contexts. For example, a workflow system that can give a web view of the documents that it manages, while also acting like a file server and an email server. Changes to files or email messages are interpreted as actions in the workflow system and then the system state changes appropriately. That lets the users manipulate the data in the system in natural ways using familiar tools.

Another idea: an email-support ticket system that only shows internal users the mail associated with open tickets that they own. When the ticket is closed, the mail goes away on its own.

Posted by Scott Laird Tue, 06 Sep 2005 21:25:16 GMT


Push me, pull me

Someone pointed out today that none of the “convert from your old blog system to Typo” converters in the current Typo development tree were working. They all produce articles without any HTML in them. This was caused by my big filter update from a week or so ago; apparently no one has tried to convert directly to a development version of Typo in the last week or two. The problem is that none of the text filters were running. Unfortunately, there’s no easy way to make them run because they need access to a working Rails controller, and there isn’t one available from inside of the converters.

At the same time, Piers Cawley asked for an easy way to rebuild all of the HTML generated by filters on his site–he was doing filter development and he needed to rebuild everything. Unfortunately, the filter design doesn’t make this easy, either.

These two are basically the same problem–the way that we run text filters is kind of painful in the current Typo tree. In Typo 2.5 and earlier, filters were applied at the model level, and nothing outside of the model really needed to worry about them–the filters were automatically applied every time that the article (or comment, or page) body changed. Due to the changes in the dev tree, this just isn’t possible any more, but I’d tried to hack it together by changing the dozen or so actions that changed Articles. It worked, but it was ugly, and it breaks when something like a converter needs to create a new article, because the converter has no way to run the filter.

So I’ve been making a few changes to Typo.

The basic problem is that we’ve been using a “push” model for updating the HTML version of articles, when we should really be using a “pull” model. That is, instead of updating the HTML when the article changes, we should really be generating the HTML when the article is viewed and then caching the HTML so we don’t have to do it more than once per article.

Fortunately, this change was pretty easy to make–I just had to search for every reference to body_html, extended_html, or full_html and change it to a reference to article_html(article). Then I moved the filter calls into article_html(article), saving the generated HTML back into article.body_html.
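
In outline, the pull model looks something like the sketch below. The Article struct and apply_filters are stand-ins invented for this example (the real model and filter chain are more involved); only the shape of article_html follows the description above.

```ruby
# Stand-in for the real model: just the raw body and the cached HTML.
Article = Struct.new(:body, :body_html)

# Placeholder for the real text-filter chain (hypothetical).
def apply_filters(text)
  "<p>#{text}</p>"
end

# Pull model: generate the HTML the first time it is asked for,
# save it back on the article, and reuse it afterwards.
def article_html(article)
  article.body_html ||= apply_filters(article.body)
end

article = Article.new('Hello, world', nil)
puts article_html(article)   # runs the filters and caches the result
puts article_html(article)   # served from article.body_html

# An edit clears the cached copy so the next view regenerates it.
article.body = 'Edited text'
article.body_html = nil
puts article_html(article)
```

Anything that creates or edits an article, including a converter, only has to leave body_html empty; the filters run later, when a view first asks for the HTML.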

Once that was done, I could rip out all of the complicated filtering code that I’d had to put in to make the new filters work right, and everything Just Worked. I had to tweak a few tests that expected the HTML to be available in the database immediately after posting new content, but I already had tests that verified that the content viewed right, so it was just a matter of removing code, not really adding new code.

There’s one more change that I’m debating making. From an architectural standpoint, we shouldn’t really be stuffing things back into body_html–we should be using Rails’ fragment cache. Switching to the fragment cache would be trivial; it would only take a couple extra lines in article_html, and then I could rip out a bunch of lines in the editor actions, because I could use a sweeper instead of explicit calls to article.body_html = nil.

Unfortunately, if we do that then we’ll end up killing Typo’s performance when it’s running in development mode, because caching is disabled in dev mode. So it’d be cleaner, but probably too slow to be useful. I’ll probably revisit this again before the next Typo release–there are a bunch of performance tweaks that we need to make before the next release; once those are done, we might be able to stand the performance hit.

Posted by Scott Laird Mon, 05 Sep 2005 01:25:00 GMT