Typo 2.5.7: testers wanted
I’d love to release Typo 2.5.7 today, but I’d like a few people to verify that the current 2.5.x Subversion repository actually works correctly in their environment first. If you’re interested, please grab it from svn://leetsoft.com/typo/branches/branch_2_5_x and kick the tires. I’ll take feedback any way I can get it–via IRC, direct mail, the Typo mailing list, or comments here. I’d prefer not to clutter up the Typo trac with this, though, as there’s no good way to say that the report is about the 2.5.x SVN tree.
Typo 2.5.7 developments
I’m just about to start in on Typo 2.5.7 development. I have two goals for this update to Typo 2.5.6:
- Make Typo work correctly with Rails 0.14.x.
- Backport the new theme updates from the Typo trunk to the 2.5 series so Typo Theme Contest developers can use them.
I’d like to have this on RubyForge before I leave for the airport on Wednesday.
Counting RSS users
One of the great problems with RSS is that it’s really hard to know how many readers you have. Feedburner is supposed to be able to help with that, but I’m reluctant to outsource my RSS feeds to them–I’m not really sure how I’d get them back if I decided to can Feedburner. So, while I know that I’m averaging around 1,300 JavaScript-enabled page hits per day on my blog, I have *no* idea how many people are reading via RSS. On one level, it doesn’t really matter, but I find that I’m more willing to write when I know that people are reading, and the more readers I have, the more time I’m willing to spend writing.
The problem is that there isn’t a 1:1 correspondence between RSS downloads and readers, like there is for normal web pages (modulo caching and a few other issues). Bloglines is helpful enough to tell me that it has around 60 subscribers, and I know that I’ve served up around 24,000 RSS and Atom feeds so far this month, but I have no easy way to know if that’s 1,000 people with a slow refresh set or 11 people refreshing every 5 minutes, or even 50,000 people all reading via a portal. Plus, there are at least three “planet” sites syndicating one feed or another (PlanetRubyOnRails, PlanetTypo, and Planet Foo), and I have no clue how many readers they have, either via HTML or RSS.
I’ve been tempted to integrate a 1-pixel “web bug” into Typo’s RSS feeds more than once, but I don’t really like the privacy implications. Fortunately or unfortunately, I get the same effect any time I post an image here. The Flickr montage that I posted almost 8 hours ago has resulted in 347 image hits. Of those, 150 have no referrer, so they’re probably from standalone RSS readers, like NetNewsWire. Another 95 are from scottstuff.net, followed by 42 from Planet Ruby On Rails, then 24 from Bloglines, 16 from Planet Typo, 3 from Planet Foo, 3 from Google Reader, and a couple that are either comment spammers or internal feeds from stealth companies.
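A breakdown like this can be pulled out of an ordinary access log. What follows is only a rough sketch, assuming Apache's "combined" log format; the sample log lines, the image path, and the tally_referrers helper are all made up for illustration and aren't taken from my real logs.

```ruby
require 'uri'

# Hypothetical sample lines in Apache "combined" format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
SAMPLE_LOG = <<~LOG
  10.0.0.1 - - [01/Nov/2005:10:00:00 +0000] "GET /files/montage.jpg HTTP/1.1" 200 1234 "-" "NetNewsWire/2.0"
  10.0.0.2 - - [01/Nov/2005:10:00:05 +0000] "GET /files/montage.jpg HTTP/1.1" 200 1234 "http://scottstuff.net/" "Mozilla/5.0"
  10.0.0.3 - - [01/Nov/2005:10:00:09 +0000] "GET /files/montage.jpg HTTP/1.1" 200 1234 "http://www.bloglines.com/myblogs" "Mozilla/5.0"
  10.0.0.4 - - [01/Nov/2005:10:00:12 +0000] "GET /files/montage.jpg HTTP/1.1" 200 1234 "-" "NetNewsWire/2.0"
  10.0.0.5 - - [01/Nov/2005:10:00:20 +0000] "GET /index.html HTTP/1.1" 200 999 "http://scottstuff.net/" "Mozilla/5.0"
LOG

# Count hits on +path+ grouped by referring host.
# Requests with no referrer ("-") are grouped under "(none)".
def tally_referrers(log, path)
  counts = Hash.new(0)
  log.each_line do |line|
    # Capture the request path and the quoted referrer field.
    match = line.match(/"(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "([^"]*)"/)
    next unless match && match[1] == path

    referrer = match[2]
    host = if referrer == '-' || referrer.empty?
             '(none)'
           else
             begin
               URI.parse(referrer).host || '(unparseable)'
             rescue URI::InvalidURIError
               '(unparseable)'
             end
           end
    counts[host] += 1
  end
  # Most frequent referrers first, ties broken alphabetically.
  counts.sort_by { |host, n| [-n, host] }
end

tally_referrers(SAMPLE_LOG, '/files/montage.jpg').each do |host, n|
  puts format('%4d  %s', n, host)
end
```

This counts image hits rather than readers, so it has the same caveats as everything else here: a planet site or a portal shows up as a single referring host no matter how many people are behind it.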
Does anyone have any good leads on how to track this sort of thing on a more regular basis? While we’re at it, does Feedburner just play session cookie games, or are they doing something clever? Finally, it seems clear that embedding images into RSS feeds works most of the time, but I’ve never heard of anyone using web bugs with RSS–did I just miss the discussion, or are people avoiding them?
Typo Theme Contest, again
As a reminder, the Typo Theme Contest is still running. The deadline for entry is November 28th, so there’s still time to start, but you don’t want to wait much longer. The prize pool has grown again–the top prizes are a new 15” PowerBook, a 12” iBook, an iPod Nano, and a year’s free hosting. Although it pales in comparison to a new PowerBook, I’d love to bundle the top couple themes with future Typo releases, so you can count on thousands of Typo users enjoying your work.
Typo moves to Rails 0.14.2
As of this morning, the Typo trunk requires Rails 0.14.2. I ran rails . and then spent a couple hours cleaning up Rakefile and environment.rb, and then updated all of the tests to work without instantiated fixtures. As advertised, this makes tests run quite a bit quicker; since Typo’s test suite had been flirting with the 5 minute mark on my laptop, anything that we can do to speed up tests is welcome.
Now that we’re safely requiring 0.14.x versions of Rails, I’ve started adding session :off all over the place. My sessions table for scottstuff.net has around 70,000 sessions in it right now, and I doubt that more than 15 or 20 of them were ever useful. We use sessions for user authentication, but that really only matters for admin pages plus a few other cases, like comment posting. In essence, any page that can be cached will never need access to a session. So, I made a little change to the Article controller. Instead of
    class ArticlesController < ApplicationController
      caches_page :index, :read, :permalink, :category,
                  :find_by_date, :archives, :view_page, :tag
      ...
    end

I’m now using
    class ArticlesController < ApplicationController
      cached_pages = [:index, :read, :permalink, :category,
                      :find_by_date, :archives, :view_page, :tag]
      caches_page *cached_pages
      session :off, :only => cached_pages
      ...
    end

This is a really common pattern, because pages that can use the page cache really shouldn’t depend on the session in any way. If I were doing this in more than one place, I’d probably want to extract it into a nice little plugin, but I’m just not feeling the motivation right now.
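For the curious, here is roughly what such an extraction might look like. This is a sketch only: the caches_page_without_session name is invented, and FakeController is a stand-in that merely records the macro calls so the example runs without Rails. In a real app the module would be mixed into the actual controller base class instead.

```ruby
# Hypothetical helper: declare page-cached actions and switch sessions off
# for exactly those actions in a single call.
module CachesPageWithoutSession
  def caches_page_without_session(*actions)
    caches_page(*actions)
    session :off, :only => actions
  end
end

# Stand-in for the Rails controller base class so the sketch runs on its own.
# It only records which macros were called; real Rails would act on them.
class FakeController
  def self.macro_calls
    @macro_calls ||= []
  end

  def self.caches_page(*actions)
    macro_calls << [:caches_page, actions]
  end

  def self.session(*args)
    macro_calls << [:session, args]
  end

  extend CachesPageWithoutSession
end

class ArticlesController < FakeController
  caches_page_without_session :index, :read, :permalink, :category,
                              :find_by_date, :archives, :view_page, :tag
end

ArticlesController.macro_calls.each { |call| p call }
```

The point of the wrapper is that the list of cached actions is written once, so the page cache and the session settings can't drift apart.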
Migrating in two dimensions
This seems to be the season for talking about Rails migrations. A lot of people are finally discovering them and finding that they’re very useful for maintaining your database schema over time. I’m a big fan of Rails migrations; we’ve been using them with Typo since the middle of July, when they were all new and shiny. We’re currently up to 24 migrations in the Typo source tree. We’re even using migrations to create our initial database, via my Schema Generator. I haven’t done a formal survey, but I suspect that Typo is the biggest open-source user of migrations, and may actually be the biggest user overall.
The big problem is that we’ve been using migrations wrong the whole time, and we just realized it.
There are probably a dozen bugs in Typo’s bug tracker that boil down to “I fell behind the trunk and now rake migrate throws exceptions and I can’t upgrade anymore.” The problem is that migrations are designed to run against an earlier version of your database, but they use the current version of your code. The first time that this caused problems was with the migration from Typo 2.0 to 2.5–we’d added two new fields to articles. Migration number 7 added the permalink field and a before_save hook to make sure that all saved articles have permalinks. Then migration number 9 added GUIDs and a second before_save hook to fill the guid field. Both migrations did Articles.find(:all).each { |a| a.save } to update each Article and populate the new fields.
This worked great for developers who frequently upgraded. A few days after the GUID migration went in, though, we started getting weird bug reports–users who tried to do both upgrades at the same time found that migration number 7 was dying. What was happening was that migration number 7 added the new permalink field to articles, but when it went to run the save loop both before_save hooks ran, and Typo tried to add a GUID to each article. However, the guid field didn’t exist yet, so the migration threw a bunch of exceptions and died.
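The failure is easy to reproduce in miniature. The sketch below is a toy model in plain Ruby, not ActiveRecord: ToyTable and ToyArticle are invented stand-ins, and the hook bodies are simplified guesses at what the real before_save hooks did. It only illustrates the ordering problem.

```ruby
# A toy stand-in for a database table: a list of column names plus rows.
class ToyTable
  attr_reader :columns, :rows

  def initialize(columns, rows)
    @columns = columns
    @rows = rows
  end

  def add_column(name)
    @columns << name
    @rows.each { |row| row[name] = nil }
  end

  # Writing to a column that the schema doesn't have yet fails.
  def write(row, column, value)
    raise ArgumentError, "unknown column: #{column}" unless @columns.include?(column)
    row[column] = value
  end
end

# The *current* model code: both before_save hooks are present,
# no matter which schema version the database is actually at.
class ToyArticle
  BEFORE_SAVE = [
    ->(table, row) { table.write(row, :permalink, row[:title].downcase.tr(' ', '-')) },
    ->(table, row) { table.write(row, :guid, "guid-#{row[:id]}") }
  ].freeze

  def self.save_all(table)
    table.rows.each do |row|
      BEFORE_SAVE.each { |hook| hook.call(table, row) }
    end
  end
end

# A database that is still at the pre-permalink schema.
articles = ToyTable.new([:id, :title], [{ id: 1, title: 'Hello World' }])

# Migration 7: add the permalink column, then re-save every article.
articles.add_column(:permalink)
begin
  ToyArticle.save_all(articles)
  migration_7_error = nil
rescue ArgumentError => e
  migration_7_error = e.message
end

puts "migration 7 failed: #{migration_7_error}"
```

Someone who ran migration 7 back when only the permalink hook existed never sees the error; it only appears once the second hook has landed in the code and the guid column hasn't landed in the database.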
This caused a bunch of grumbling on the Typo IRC channel. We threw around a bunch of possible fixes. Our favorite was separating migrations into two parts–a schema change part and a data change part. First we’d run all of the schema changes, and then update all of the data. As a work-around, we added a hack that checked the current schema version and disabled specific before_save filters for older versions.
We managed to keep this little bandaid working until a couple weeks ago, when a huge set of new migrations went in; they renamed the articles table and merged several other tables into the new contents table using STI. And, again, we found that older migrations broke when users tried to upgrade from Typo 2.5.6 to the current dev tree. Unlike the permalink/guid case, this time there was no simple workaround. We couldn’t just add a couple if statements in a filter and make it all go away.
The fundamental problem is that we were using the wrong mental model for migrations. I saw migrations as a one-dimensional thing–a list of steps for migrating old data into the new format. In this view, the migration for going from schema version 6 to schema version 7 is constant–once it’s been written, the only reason to change it is if a bug turns up in the logic for that migration. Otherwise, the migration code should remain unchanged over time.
And that’s the problem–migrations aren’t one-dimensional. They are (and need to be) two dimensional–the schema version is one dimension and the code version is the other. Individual migrations exist to migrate from a specific old schema version to the current version, using the current code. Each migration should change over time to adapt to the changes in the code. So, the right fix for the permalink migration that caused so many problems wasn’t to add a bunch of logic to before_save. Instead, we should have deleted the entire save loop from the migration, and trusted the GUID migration to update both fields. If that wasn’t good enough, then we should have added a new migration at the end to do permalink cleanup after the GUIDs were added.
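Here is the same idea as a small, self-contained sketch. It's a toy model in plain Ruby rather than real migration code: Table, HOOKS, MIGRATIONS, and migrate are all invented for illustration. Migration 7 has been edited down to a pure schema change, and migration 9's save loop populates both fields.

```ruby
# Toy schema/data model (not ActiveRecord): a column list plus rows.
Table = Struct.new(:columns, :rows) do
  def add_column(name)
    columns << name
    rows.each { |row| row[name] = nil }
  end

  def write(row, column, value)
    raise ArgumentError, "unknown column: #{column}" unless columns.include?(column)
    row[column] = value
  end
end

# Current model code: both hooks always run on save.
HOOKS = [
  ->(t, row) { t.write(row, :permalink, row[:title].downcase.tr(' ', '-')) },
  ->(t, row) { t.write(row, :guid, "guid-#{row[:id]}") }
].freeze

def save_all(table)
  table.rows.each { |row| HOOKS.each { |hook| hook.call(table, row) } }
end

# Migrations keyed by the schema version they produce.
MIGRATIONS = {
  # Schema change only; the old save loop has been deleted.
  7 => ->(t) { t.add_column(:permalink) },
  # By now both columns exist, so saving fills guid *and* permalink.
  9 => ->(t) { t.add_column(:guid); save_all(t) }
}.freeze

# Run every migration newer than +from+, up to and including +to+.
def migrate(table, from:, to:)
  MIGRATIONS.keys.sort.each do |version|
    MIGRATIONS[version].call(table) if version > from && version <= to
  end
  table
end

articles = Table.new([:id, :title], [{ id: 1, title: 'Hello World' }])
migrate(articles, from: 6, to: 9)
p articles.rows.first
```

A database that stops at version 7 is left with empty permalinks in this sketch, which is exactly the case where a trailing cleanup migration would be needed.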
Once I came to grips with this, the migration changes needed to allow 2.5.x users to upgrade to the current trunk were pretty simple, and took about 5 minutes to write and test.
Or was I the only person in the Rails universe who thought about migrations this way?
The Great Typo Memory Leak
A number of users complained this weekend that Typo was using way too much memory, with reports of 100+ MB per FastCGI dispatcher. Typo usually uses around 20 MB, and even that’s too much; 100 MB is enough to cause big problems with hosting providers like TextDrive.
The first step that I took was to verify that the problem actually exists outside of TextDrive. I set up a test Apache/FastCGI/Typo server, disabled caching, and then pounded on it using curl:
# while true; do curl http://typo1/ > /dev/null; done
I let that run for a few seconds and watched while my dispatch.fcgi processes grew from 22 MB to 80 MB. I then did a bit of experimenting:
- The main index page leaked
- RSS feeds didn’t leak
- Individual article pages leaked
- Static pages, like /pages/about, leaked
- Even error pages leaked
Disabling the layout for a leaking page and then re-testing it showed that the leak followed the layout. Turning layouts back on and removing the sidebar block fixed the leak.
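Watching process size by eye works, but the measurements can also be scripted. The helpers below are a rough sketch, assuming a Unix-like ps that supports -o rss= (resident set size in KB); the function names and the fcgi_pid variable in the usage comment are hypothetical.

```ruby
# Parse the output of `ps -o rss= -p <pid>`: resident set size in kilobytes.
def parse_rss_kb(ps_output)
  Integer(ps_output.strip, 10)
end

# Current resident set size of a process, in KB. Assumes a Unix-like `ps`.
def rss_kb(pid)
  parse_rss_kb(`ps -o rss= -p #{Integer(pid)}`)
end

# Given [request_count, rss_kb] samples, estimate the KB gained per request
# between the first and the last sample.
def kb_per_request(samples)
  (first_n, first_kb), (last_n, last_kb) = samples.first, samples.last
  return 0.0 if last_n == first_n

  (last_kb - first_kb).to_f / (last_n - first_n)
end

# Example usage (not run here): sample one dispatcher every 100 requests.
#   samples = [[0, rss_kb(fcgi_pid)]]
#   10.times do |i|
#     100.times { system('curl', '-s', '-o', '/dev/null', 'http://typo1/') }
#     samples << [(i + 1) * 100, rss_kb(fcgi_pid)]
#   end
#   puts kb_per_request(samples)
```

A per-request figure that stays roughly constant as the request count grows points at a steady leak rather than one-time warm-up allocation, which makes it easier to compare pages with and without the sidebar.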
Entertainingly enough, disabling the sidebar from inside of the sidebar infrastructure didn’t fix the leak. The mere act of calling render_component to generate the sidebars seemed to be causing the memory leak. Since Typo is one of the very few users of Rails components, this suggests that render_component may have a leak that no one else has noticed, so I created a new test Rails app with only two files. First, app/controllers/foo_controller.rb:
    class FooController < ApplicationController
      def bar
        render_component :layout => false,
                         :controller => 'sidebars/sidebar', :action => 'index'
      end
    end

Then components/sidebars/sidebar_controller.rb:
    module Sidebars
      class SidebarController < ApplicationController
        def index
          render :text => 'test', :layout => false
        end
      end
    end

This is about as minimal as a Rails app can get. Then I set up a FastCGI server running this project, and ran curl against /foo/bar, and watched the process size climb. So the leak is part of Rails, not really part of Typo.
Unfortunately, I’m not sure where the leak is coming from. I read component.rb and made a few small changes, but the leak hasn’t stopped. So I’m going to file this as a Rails bug and see if we can get it fixed before 1.0.
Update: Rails bug 2589.
Update: Thanks to Scott Barron, the bug has been fixed. Users with memory problems should probably install the patch, although a bit of testing would obviously be recommended first. The next release of Rails (either 1.0rc3 or 1.0; I’m not sure what they’re planning) should include this fix.
Typo 2.5.6 and Rails 1.0
As far as I can see, Typo 2.5.6 (the most recently released stable version) should work fine with Rails 1.0. I just did a brief round of testing with 1.0rc2, and all of the tests pass. Er, except for one test that had a stupid typo that somehow still worked with Rails 0.13.1; the bug is in the test itself, though, so it’s not worth releasing Typo 2.5.7 just for that. If we ever release Typo 2.5.7, then I’ll make sure that the fixed test is included.
Also, Typo 2.5.6 should work just fine with Ruby 1.8.3, too, as long as you’re using Rails 1.0. I haven’t actually tested this yet, but I’d be surprised if it doesn’t work perfectly.
Surprisingly enough, the current Typo trunk (r683 or so) doesn’t work with Rails 1.0. All of the filtering code is broken; I’ll fix it shortly and check in the fix. Fortunately, the current Typo trunk is pinned to Rails 0.13.1 for now, so it should be safe to upgrade the version of Rails on the box; Typo will just ignore the new Rails for now.
Update: the Typo trunk r685 or later should be compatible with Rails 1.0rc2. I’ll probably break Rails 0.13.1 compatibility soon, so it’s time to upgrade.
Typo and Ruby 1.8.3
Just for the record, current versions of Typo (either 2.5.6 or the current Subversion trunk) don’t work with Ruby 1.8.3. There are two problems–the Logger bug that keeps Rails 0.13.1 from working with Ruby 1.8.3 (this is easy to fix), and a second bug that I haven’t read about anywhere else–apparently YAML serialization is broken with Ruby 1.8.3 and Rails 0.13.1. This keeps Typo’s sidebar from working properly.
I’m going to see what it’ll take to get the Typo trunk working with Rails 1.0 (rc1 or rc2, if it’s out today), and then see if it works properly with Ruby 1.8.3. Once that’s done, the trunk will probably shift from 0.13.1-only to 1.0-only.
Update: That was quick. ChrisNolan on IRC pointed out that Rails bug #2304 contains a patch to fix this. The patch is already a part of the current Rails trunk, but you’ll need to patch 0.13.1 manually if you want to use it with Ruby 1.8.3.
Benchmarking Typo
I finally had a bit of time to do some Typo benchmarking over the weekend and (as usual) found that my instincts were all wrong.
I was specifically interested in the performance difference between the page cache and the action cache–my guess was that the action cache was a 10x performance hit.
So I set up a test environment under Xen, running Typo r683, PostgreSQL, Apache 2, FastCGI, Ruby 1.8.2, and Rails 0.13.1. I didn’t do any Apache or Postgres tuning–I just ran them out of the box.
Then I ran ab against a snapshot of scottstuff.net from a couple weeks ago. I used the index page for my testing, as it’s a fairly large page and I wanted to give Typo a real workout.
Here’s what I found:
| Cache Type | Requests per second |
|---|---|
| Page Cache | 2357 |
| Action Cache | 10.6 |
| No cache | 1.01 |
That really wasn’t what I’d expected. The action cache underperformed my expectations by a factor of 20.
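A quick sketch of the arithmetic behind that figure, using only the numbers from the table. The variable names are just for illustration, and "a factor of 20" is a round number for what works out to about 22.

```ruby
# Requests per second, copied from the table above.
page_cache   = 2357.0
action_cache = 10.6
no_cache     = 1.01

# The guess: the action cache would cost about 10x relative to the page cache.
expected_action = page_cache / 10
shortfall       = expected_action / action_cache

puts format('expected action cache:     %.1f req/sec', expected_action)
puts format('shortfall vs expectation:  %.1fx', shortfall)
puts format('page cache vs action:      %.0fx', page_cache / action_cache)
puts format('action cache vs no cache:  %.1fx', action_cache / no_cache)
```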
I then did a bit of experimentation. I created a new uncached action in my test Typo setup that did nothing but render :text => 'foo', :layout => false, just to see if the caching system was slowing things down. Result? 10 requests/second. Then I created a new Rails project from scratch and added a new controller with the same action, and saw the same results. Still 10 requests/second.
However, in Rails’s logs, it said that it handled the request in 2 ms, and I should be seeing 500 hits/sec for the “foo” page. So something is adding an extra 98 ms to each request. I’m still hunting for this–I don’t know if it’s something Xen-related on my system, an artifact of my Apache config, or what.
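The numbers in that paragraph follow from simple arithmetic, sketched here under the assumption that requests are handled one at a time, so throughput is just the inverse of per-request time. The variable names are mine.

```ruby
logged_ms_per_request = 2.0   # what the Rails log reported
observed_req_per_sec  = 10.0  # what ab actually measured

# Throughput implied by the logged time, and time implied by the throughput.
expected_req_per_sec    = 1000.0 / logged_ms_per_request
observed_ms_per_request = 1000.0 / observed_req_per_sec

# Time per request that the Rails log doesn't account for.
unexplained_ms = observed_ms_per_request - logged_ms_per_request

puts "expected:    #{expected_req_per_sec} req/sec"
puts "observed:    #{observed_ms_per_request} ms/request"
puts "unexplained: #{unexplained_ms} ms/request"
```

With several FastCGI backends working in parallel the per-request overhead would be even larger than this, so 98 ms is better read as a lower bound.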
I’ve tried upgrading to Rails 0.14.0, but that’s a whole other article.
Conclusions:
- The page cache is really, really fast.
- The action cache is a substantial improvement over the uncached case–about 10x on this system–but can’t touch the performance of the page cache.
- Changing the concurrency settings on ab and/or the number of FastCGI backends in use didn’t make a substantial performance difference. Settings 2-15 gave roughly the same results.
- My caches_action_with_params is slightly faster than the stock action cache.
- Moving the action/fragment cache from the FileStore (Typo default) to MemoryStore gives no real performance boost. Moving to the MemCacheStore is a substantial performance hit (~2 hits/sec vs 10 hits/sec).
- Adding a new uncached action in ArticlesController that simply returns a fixed string (render :layout => false, :text => 'foo') is no faster than the action cache.
- Moving the new action from the previous step to a Controller of its own doesn’t help.
- Removing all of the routes except the default :controller/:action/:id route doesn’t help, either.
Frankly, on this hardware, I don’t seem to be able to get more than 10 requests/sec out of Rails no matter what I do. I’m pretty sure that this is a mistake, so I’ll post a followup when I figure out what’s wrong.
Update 1: Another datapoint. Running an example Ruby FCGI on this box gives me 832 hits per second. So whatever the problem is, it’s not fundamental to Ruby FCGI on this box. So I need to look into Rails and see what’s happening.
Update 2: Making some progress. Apparently I screwed up when I tested a new, standalone Rails FCGI app before (forgot to restart FCGI?). This time, I got 130 req/sec, which is a vast improvement over the 10 req/sec that I was seeing before. Most of that speed hit seems to come from using Postgres for session storage. Unfortunately, even after making that change, my null controller is still only getting 40 req/sec with Typo. Transporting the same controller to a blank Rails project gives me 130 req/sec with the same code. Watching strace, it looks like something is forcing Typo to reload the digest/md5 module for every hit. Unfortunately, I can’t figure out how that’s happening–my environment.rb is identical between the two trees, as is my database.yml. I’ll get back to this later; I have other things that I need to finish today.