Typo 4.0 plans
We’ve been talking about releasing a new stable version of Typo since October, with no real success. Frankly, I’ve only spent a few hours per month on Typo since I started working for Google in late November, and that’s not enough to get a major release out. Fortunately, most of the weirdness in my personal schedule seems to be over, now that I’m done commuting back and forth between Seattle and Silicon Valley once or twice per month. I still don’t have as much free time as I used to (Google keeps me busy), but I’m not spending every minute preparing to travel and dealing with the consequences of life on the road.
So, now that I’m back on a normal schedule, it’s time to get serious about releasing a stable version of the current Typo trunk. I’d like to release Typo 4.0.0 before April 1st. Frankly, 95% of the features that we want for Typo 4.0 are already in the tree, so mostly we just need bug fixing and testing. There are a few features that need some finishing work, and one feature that still needs to be implemented, but late March should be workable, if people can pitch in and help a bit.
As I see it, here are the areas that we really need to work on:
- Trackbacks (they’re still utterly broken)
- Podcasting (cleanup and testing)
- Notifications
- Threaded comment support (optional)
- Migration cleanups
- Merge one or two theme contest themes.
- Bugs, bugs, bugs
I should probably explain the “threaded comments” bit–months and months ago, I wrote some threaded comment code for Typo. Tobi didn’t want it in Typo, so he (correctly) refused to merge it. I’m still using my threaded comment code here, maintaining it in parallel with the Typo trunk. The problem is that there’s no easy way to merge the old threaded comment code with Piers’s big STI patchset, so I’m stuck running an older version of Typo here. I can’t upgrade without either losing all of the threading information that I’ve accumulated or spending some time adding minimal threaded comment support to Typo. I’ve discussed my Big Plan for comments in Typo, but I don’t think I’ll have time to implement everything in the next month; instead I want to concentrate on getting something released, and then we’ll spend some more time enhancing comments for Typo 4.1.
There’s one big thing that I need from people if we’re going to be able to make this schedule: help getting back on top of all of the bugs in Typo’s Trac. We were trying to keep the number of open tickets below 50 for most of last year, but now we’re up to 165, and that’s too many for me to easily manage. So, I need to start applying patches and closing tickets. If there’s a patch that’s ready to apply, or a simple ticket that doesn’t need more than 5 minutes’ worth of work, can you please leave a comment here? Thanks.
Finally, to answer the inevitable question: where is Typo 3.0, you ask? Months ago, we decided to skip version 3.0, to avoid confusion with Typo3. So we’re going from Typo 2.x to Typo 4.x. That’s it.
Time-limited caching for Rails
I’m finally getting back into Typo hacking after too long away. I tried to apply a few patches last weekend, but I was traveling and my network access was too spotty. So, I spent my time adding a bit of new functionality. Since then, I’ve been debating whether to commit it to Typo or not. I decided I’d write about it here and see which way the comments go.
The code in question is a time-limited cache for Rails. I’d like to be able to say “cache this page, but only for three hours. After three hours, re-render the page.” This sort of thing comes up in Typo all the time. The most obvious example is the sidebar–some of the sidebar components display information with a short lifetime, and it’d be dumb to keep pages in the cache for weeks when they include sidebar data that’s only good for hours. This isn’t really a problem on busy sites, because the current cache sweeper usually resorts to sweeping the entire cache every time a new article is posted, but it’s a pain on slower sites.
There are certainly other ways to fix the sidebar problem (AJAX sidebars are the obvious example), but the same basic pattern comes up all over Typo. A few examples:
Users keep requesting the ability to create articles with a publication date in the future. The article won’t appear on the site until after the publication date. This is a common CMS feature, and apparently other blog engines have it as well, but it really doesn’t mesh with Typo’s current cache, because there’s no way to say “sweep the cache at 7:30 today” short of adding a cron job for every article that’s posted this way.
We have a bunch of aggregation classes that suck data off of other sites, like Flickr, Upcoming.org, and so on. These usually end up as sidebars, but we need to cache the back-end data somewhere. An expiring fragment cache would work perfectly for this.
On really busy sites, we could use something like this to avoid rebuilding comment pages on every comment–we could drop the sweep-on-new-comment code and swap for expire-after-5-minutes. If you’re getting more than 1 comment every 5 minutes, this would be a win. If you’re getting a comment every few seconds (think Slashdot or Curt Hibbs’s “hammer my comments” post), this would be a major win.
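The future-posting case boils down to choosing a lifetime that ends when the next scheduled article is due. Here is a minimal sketch of that calculation, assuming a hypothetical helper named `lifetime_until_next_publication` and plain `Time` values rather than Typo's real article model:

```ruby
# Hypothetical helper, not part of Typo or Rails: given the publication
# times of all articles, return how many seconds a cached page may live
# before a scheduled article becomes visible. Returns nil when nothing is
# scheduled, meaning "no time limit; sweep as usual".
def lifetime_until_next_publication(publish_times, now = Time.now)
  upcoming = publish_times.select { |t| t > now }
  return nil if upcoming.empty?
  (upcoming.min - now).ceil
end

now = Time.utc(2006, 3, 1, 7, 0, 0)
times = [
  Time.utc(2006, 2, 28, 12, 0, 0), # already published
  Time.utc(2006, 3, 1, 7, 30, 0),  # goes live in 30 minutes
  Time.utc(2006, 3, 2, 9, 0, 0)    # goes live the next day
]
puts lifetime_until_next_publication(times, now).inspect
```

The returned value could be fed to whatever lifetime setting the cache exposes, so the page expires right as the scheduled article becomes due.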
To accomplish this, I added two new features. First, I added a set of “meta-fragment cache” methods, building on Rails’ existing fragment cache. The fragment cache stores (key, value) pairs, while the meta-fragment code stores (key, value, metadata_hash) triples. This is simply implemented as two fragment cache entries, one for the data and one for the serialized metadata hash.
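The two-entries-per-key idea can be sketched roughly like this. The class name `MetaFragmentStore`, its methods, and the `__meta` key suffix are all assumptions made for illustration, and a plain Hash stands in for the Rails fragment cache:

```ruby
# A toy stand-in for the fragment cache: a plain Hash of key => string.
# MetaFragmentStore and its method names are hypothetical.
class MetaFragmentStore
  META_SUFFIX = "__meta".freeze

  def initialize(store = {})
    @store = store
  end

  # Write the data under the key and the serialized metadata hash under a
  # second, derived key.
  def write(key, value, metadata = {})
    @store[key] = value
    @store[key + META_SUFFIX] = Marshal.dump(metadata)
    value
  end

  # Return [value, metadata], or nil if either entry is missing.
  def read(key)
    value = @store[key]
    raw_meta = @store[key + META_SUFFIX]
    return nil if value.nil? || raw_meta.nil?
    [value, Marshal.load(raw_meta)]
  end

  # Remove both entries for the key.
  def delete(key)
    @store.delete(key)
    @store.delete(key + META_SUFFIX)
  end
end

cache = MetaFragmentStore.new
cache.write("articles/read/1", "<html>...</html>", :lifetime => 3600)
value, meta = cache.read("articles/read/1")
puts meta[:lifetime]
```

Keeping the metadata in its own entry means the underlying store only ever sees ordinary (key, value) pairs, so any fragment cache backend should work unchanged.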
Then, on top of that, I re-implemented my caches_action_with_params code. This is a variant of Rails’ native action cache with a number of cleanups and bugfixes.
When all is said and done, you’re left with a controller that looks something like this:
    class ArticleController < ApplicationController
      caches_action_with_params :read

      def read
        response.lifetime = 3600 # 1 hour
        ...
      end
    end

That’s it–the read action will now be cached with a 1 hour lifespan. After an hour, the cached version will expire. If response.lifetime isn’t set, then the page won’t expire on its own, and it’ll need to be swept as usual.
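For readers who want to see the expiry rule itself, here is a minimal sketch of a time-limited lookup. `ExpiringCache` is a hypothetical name and this is not the actual caches_action_with_params code; it only assumes that each entry remembers when it stops being valid:

```ruby
# Hypothetical sketch of a time-limited cache lookup; it only illustrates
# the expiry rule, not the real action cache.
class ExpiringCache
  Entry = Struct.new(:body, :expires_at)

  def initialize
    @entries = {}
  end

  # lifetime is in seconds; nil means "never expires on its own".
  def write(key, body, lifetime = nil, now = Time.now)
    expires_at = lifetime ? now + lifetime : nil
    @entries[key] = Entry.new(body, expires_at)
    body
  end

  # Return the cached body, or nil if it is missing or past its lifetime.
  def read(key, now = Time.now)
    entry = @entries[key]
    return nil if entry.nil?
    if entry.expires_at && now >= entry.expires_at
      @entries.delete(key) # expired: force a re-render
      return nil
    end
    entry.body
  end

  # Serve from the cache when possible; otherwise render via the block
  # and store the result.
  def fetch(key, lifetime = nil, now = Time.now)
    read(key, now) || write(key, yield, lifetime, now)
  end
end

cache = ExpiringCache.new
start = Time.utc(2006, 3, 1, 12, 0, 0)
renders = 0
cache.fetch("read/1", 3600, start)        { renders += 1; "page" }
cache.fetch("read/1", 3600, start + 1800) { renders += 1; "page" } # still cached
cache.fetch("read/1", 3600, start + 3600) { renders += 1; "page" } # expired
puts renders
```

Passing the clock in as an argument is only there to make the sketch easy to exercise; a real implementation would presumably just use the current time.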
So here’s the big question–should this go into Typo? I can see good arguments on each side.
Pro:
- It solves a lot of cache-with-parameter problems that we’ve had.
- Switching to some variant of the action cache means that switching between production and development mode doesn’t leave cache problems behind. This is a major cause of bug reports from new users.
- It’ll let us implement future posting easily.
- It’ll make it easy for sidebars to stay current.
- It’ll let us move the aggregation backends behind the sidebars to a more reasonable architecture. For example, we’ll be able to use the Flickr class for the Flickr sidebar instead of (mis-)parsing their RSS feed.
- It’ll make us less dependent on web server configuration and weird rewrite rules.
Con:
- It’s slower than the page cache. I haven’t benchmarked my new code yet, but the last time I checked, on my box I could handle almost 2400 page cache requests per second, while the action cache was good for *10* hits per second. That exposed a couple of major Typo performance bugs; I suspect that retesting with the new code would give us 100-200 hits/second, which is pretty busy for a blog. Still, this may be an issue for shared hosting providers.
- The action cache serves cached pages via Rails, while the page cache serves the same pages directly from the webserver without invoking Rails at all. Because of this, I suspect that a lot of sites will want to increase the number of FastCGI Typo processes that they run. With the page cache, running with one FCGI process was usually okay; with the action cache, it might be better to use a second process.
Those are the only two major problems that I see. Basically, if we switch to using the action cache (in any form), we’re going to be harder on big hosting companies like TextDrive and Planet Argon, and they’ve been very supportive of Typo in the past.
Does anyone feel strongly about this one way or another?
Rails Schema Generator 1.0.0
It’s taken forever, but I finally have a version of my schema generator that works with Rails 1.0. The actual fix was fairly minor (7 lines), but getting to a point where I could test it has been amazingly difficult–my new PowerBook had a bit of disk corruption and Ruby stopped working. Rebuilding the entire Ruby distribution from DarwinPorts wasn’t enough to fix the problem, somehow, but installing a new copy of REXML did the trick for reasons that are too obscure for understanding.
So, go and enjoy.
Update: I just bumped it to 1.0.1 to fix a MySQL dependency bug that I’d forgotten about for 1.0.0. Have fun.
Rails Schema Generator 0.9.0
I just released a new version of my Rails Schema Generator on Rubyforge. The schema generator takes a collection of Rails database migration scripts and assembles a complete set of SQL schema files using only the information from the migrations. This release supports MySQL, PostgreSQL, and SQLite; it can generate schemas for all three DB types even if the databases aren’t installed on the system.
We’ve found that the schema generator drastically lowers the work needed to keep Typo working correctly with multiple database types.
This release fixes a number of bugs and should finally work correctly with all common migration operations, including field renaming. This has only been tested with Rails 0.14.4, and at least one API has changed recently, so this may not work with earlier 0.14.x releases.
To use the schema generator, run gem install schema_generator, and then (from your Rails project directory) ./script/generate schema. This will create several schema files in the db/ directory, prompting you before overwriting existing files.
Typo notifications
I just committed a big patch to Typo that adds the first part of a notification framework. This piece adds the ability to send email or Jabber messages whenever certain events happen. The events currently supported are new article creation and new comment posts.
This is still a work-in-progress, but it should be safe for a few brave people to use. You’ll have to edit your user settings in Typo to configure your notification settings and then edit the general settings to configure which email address Typo should use. After that, it should just work.
Photo Montages in Ruby
I just uploaded my photo montage software to my Subversion server. It’s still a bit rough around the edges, but it’s usable. I’ll upload it to RubyForge when I have time, but for now I just want to have it out in the public eye before I start at Google.
Flickr montages
I’ve been working on a cool new toy–a Ruby script that sucks up all of the images from a Flickr photo set and turns them into a random montage. The results are surprisingly pleasing, at least to me:

Once we’ve pushed the next Typo release out the door, I have a few ideas for cool and useful things to add to Typo, but you’ll have to wait to see what they are.
Typo moves to Rails 0.14.2
As of this morning, the Typo trunk requires Rails 0.14.2. I ran rails . and then spent a couple hours cleaning up Rakefile and environment.rb, and then updated all of the tests to work without instantiated fixtures. As advertised, this makes tests run quite a bit quicker; since Typo’s test suite had been flirting with the 5 minute mark on my laptop, anything that we can do to speed up tests is welcome.
Now that we’re safely requiring 0.14.x versions of Rails, I’ve started adding session :off all over the place. My sessions table for scottstuff.net has around 70,000 sessions in it right now, and I doubt that more than 15 or 20 of them were ever useful. We use sessions for user authentication, but that really only matters for admin pages plus a few other cases, like comment posting. In essence, any page that can be cached will never need access to a session. So, I made a little change to the Article controller. Instead of
    class ArticlesController < ApplicationController
      caches_page :index, :read, :permalink, :category,
                  :find_by_date, :archives, :view_page, :tag
      ...
    end

I’m now using
    class ArticlesController < ApplicationController
      cached_pages = [:index, :read, :permalink, :category,
                      :find_by_date, :archives, :view_page, :tag]
      caches_page *cached_pages
      session :off, :only => cached_pages
      ...
    end

This is a really common pattern, because pages that can use the page cache really shouldn’t depend on the session in any way. If I were doing this in more than one place, I’d probably want to extract it into a nice little plugin, but I’m just not feeling the motivation right now.
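If someone did want to extract the pattern, it might look something like the sketch below. The macro name `caches_page_without_session` is made up, and `FakeController` is a stand-in that only records calls so the example runs without Rails; in a real app the module would extend the actual controller base class instead:

```ruby
# Hypothetical macro, not part of Rails: declares a set of actions as
# page-cached and turns sessions off for exactly those actions.
module CachesPageWithoutSession
  def caches_page_without_session(*actions)
    caches_page(*actions)
    session :off, :only => actions
  end
end

# Minimal stand-in for the controller base class so the sketch runs
# without Rails; it just records what the class methods were given.
class FakeController
  extend CachesPageWithoutSession

  def self.declarations
    @declarations ||= []
  end

  def self.caches_page(*actions)
    declarations << [:caches_page, actions]
  end

  def self.session(mode, options = {})
    declarations << [:session, mode, options]
  end
end

class ArticlesController < FakeController
  caches_page_without_session :index, :read, :permalink
end

p ArticlesController.declarations
```

The point of the macro is that the list of cached actions is written once, so the cache declaration and the session declaration cannot drift apart.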
to_proc
Dave Thomas just pointed out a great little hack from the Ruby Extensions Project. They added this little snippet to Symbol:
    def to_proc
      proc { |obj, *args| obj.send(self, *args) }
    end

and now instead of writing this:

    result = names.map {|name| name.upcase}

you can write this:

    result = names.map(&:upcase)

The “wave this method over all objects in a collection” idiom is very common in Ruby, and Ruby’s native block syntax isn’t too bad for this use, but it’s not as clean as Python’s list comprehension syntax would be:

    result = [n.upper() for n in names]

With the to_proc hack in place, Ruby’s code is a bit cleaner, at the cost of using a non-standard extension to the language. Personally, I’d prefer to see map extended to take an optional symbol, then we could ditch the ugly & when we’re using a single method with no parameters.
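Here is a small runnable sketch of both ideas. Later Ruby versions ship Symbol#to_proc in core, so rather than redefining it, the sketch builds the same proc by hand under the hypothetical name `symbol_to_proc`; `map_by` is likewise a made-up helper illustrating the "map takes an optional symbol" wish:

```ruby
# Builds the same proc the to_proc hack would, without touching Symbol.
def symbol_to_proc(sym)
  proc { |obj, *args| obj.send(sym, *args) }
end

# Hypothetical helper: map that accepts either a method name or a block.
def map_by(collection, method_name = nil, &block)
  block = symbol_to_proc(method_name) if method_name
  collection.map(&block)
end

names = %w[alice bob]
p names.map { |name| name.upcase }      # plain block
p names.map(&symbol_to_proc(:upcase))   # hand-built proc passed as a block
p map_by(names, :upcase)                # no & needed at the call site
```

All three calls should produce the same upcased list; the last form is the one without the & at the call site.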
Migrating in two dimensions
This seems to be the season for talking about Rails migrations. A lot of people are finally discovering them and finding that they’re very useful for maintaining your database schema over time. I’m a big fan of Rails migrations; we’ve been using them with Typo since the middle of July, when they were all new and shiny. We’re currently up to 24 migrations in the Typo source tree. We’re even using migrations to create our initial database, via my Schema Generator. I haven’t done a formal survey, but I suspect that Typo is the biggest open-source user of migrations, and may actually be the biggest user overall.
The big problem is that we’ve been using migrations wrong the whole time, and we just realized it.
There are probably a dozen bugs in Typo’s bug tracker that boil down to “I fell behind the trunk and now rake migrate throws exceptions and I can’t upgrade anymore.” The problem is that migrations are designed to run against an earlier version of your database, but they use the current version of your code. The first time that this caused problems was with the migration from Typo 2.0 to 2.5–we’d added two new fields to articles. Migration number 7 added the permalink field and a before_save hook to make sure that all saved articles have permalinks. Then migration number 9 added GUIDs and a second before_save hook to fill the guid field. Both migrations did Article.find(:all).each { |a| a.save } to update each Article and populate the new fields.
This worked great for developers who frequently upgraded. A few days after the GUID migration went in, though, we started getting weird bug reports–users who tried to do both upgrades at the same time found that migration number 7 was dying. What was happening was that migration number 7 added the new permalink field to articles, but when it went to run the save loop both before_save hooks ran, and Typo tried to add a GUID to each article. However, the guid field didn’t exist yet, so the migration threw a bunch of exceptions and died.
This caused a bunch of grumbling on the Typo IRC channel. We threw around a bunch of possible fixes. Our favorite was separating migrations into two parts–a schema change part and a data change part. First we’d run all of the schema changes, and then update all of the data. As a work-around, we added a hack that checked the current schema version and disabled specific before_save filters for older versions.
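The failure and the version-check workaround can be modeled in a few lines of plain Ruby. Nothing below is Typo's real code: `FakeDb`, `HOOKS`, and the `guard_by_version` flag are hypothetical names, and a list of column names stands in for the actual database schema:

```ruby
# Toy database: a list of columns, one row, and a schema version.
class FakeDb
  attr_reader :columns, :rows
  attr_accessor :version

  def initialize
    @columns = [:title]
    @rows = [{ :title => "Hello World" }]
    @version = 6
  end

  # Saving a row that carries a field the table lacks fails, much like a
  # missing-column error would.
  def save(row)
    unknown = row.keys - @columns
    raise "unknown column(s): #{unknown.inspect}" unless unknown.empty?
    row
  end
end

# Each before_save-style hook records the schema version that introduced
# the column it fills.
HOOKS = [
  { :since => 7, :fill => lambda { |row| row[:permalink] ||= row[:title].downcase.gsub(" ", "-") } },
  { :since => 9, :fill => lambda { |row| row[:guid] ||= "guid-" + row[:title] } }
]

# The save loop; with the guard on, hooks newer than the schema are skipped.
def save_all(db, guard_by_version)
  db.rows.each do |row|
    HOOKS.each do |hook|
      next if guard_by_version && db.version < hook[:since]
      hook[:fill].call(row)
    end
    db.save(row)
  end
end

def migration_7(db, guard_by_version)
  db.columns << :permalink
  db.version = 7
  save_all(db, guard_by_version)
end

begin
  migration_7(FakeDb.new, false)
rescue RuntimeError => e
  puts "unguarded: #{e.message}"
end

db = FakeDb.new
migration_7(db, true)
p db.rows.first
```

Without the guard, the guid hook runs against a schema that has no guid column and the save blows up; with the guard, only the permalink hook runs.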
We managed to keep this little bandaid working until a couple weeks ago, when a huge set of new migrations went in; they renamed the articles table and merged several other tables into the new contents table using STI. And, again, we found that older migrations broke when users tried to upgrade from Typo 2.5.6 to the current dev tree. Unlike the permalink/guid case, this time there was no simple workaround. We couldn’t just add a couple if statements in a filter and make it all go away.
The fundamental problem is that we were using the wrong mental model for migrations. I saw migrations as a one-dimensional thing–a list of steps for migrating old data into the new format. In this view, the migration for going from schema version 6 to schema version 7 is constant–once it’s been written, the only reason to change it is if a bug turns up in the logic for that migration. Otherwise, the migration code should remain unchanged over time.
And that’s the problem–migrations aren’t one-dimensional. They are (and need to be) two dimensional–the schema version is one dimension and the code version is the other. Individual migrations exist to migrate from a specific old schema version to the current version, using the current code. Each migration should change over time to adapt to the changes in the code. So, the right fix for the permalink migration that caused so many problems wasn’t to add a bunch of logic to before_save. Instead, we should have deleted the entire save loop from the migration, and trusted the GUID migration to update both fields. If that wasn’t good enough, then we should have added a new migration at the end to do permalink cleanup after the GUIDs were added.
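As a rough illustration of that fix, here is a toy model in plain Ruby. All names are hypothetical and this is not Typo's real migration code: a Hash stands in for the database, and `current_save` plays the role of "the current code", which always fills every field the model knows about:

```ruby
# "Current code": fills both fields, then fails if the schema lacks any
# column the row now carries.
def current_save(db, row)
  row[:permalink] ||= row[:title].downcase.gsub(" ", "-")
  row[:guid] ||= "guid-" + row[:title]
  missing = row.keys - db[:columns]
  raise "unknown column(s): #{missing.inspect}" unless missing.empty?
end

MIGRATIONS = {
  # Migration 7 once ran a save loop here. It was deleted because the
  # current save code needs the guid column, which does not exist yet.
  7 => lambda { |db| db[:columns] << :permalink },
  # Migration 9 adds guid, so its save loop can populate both fields.
  9 => lambda { |db|
    db[:columns] << :guid
    db[:rows].each { |row| current_save(db, row) }
  }
}

# Apply every migration newer than the database's schema version, in order.
def migrate(db)
  MIGRATIONS.keys.sort.each do |version|
    next if version <= db[:version]
    MIGRATIONS[version].call(db)
    db[:version] = version
  end
  db
end

db = migrate(:version => 6, :columns => [:title], :rows => [{ :title => "Hello World" }])
p db[:rows].first
```

Migration 7 has been edited after the fact to match what the current code can cope with, which is exactly the "second dimension" described above.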
Once I came to grips with this, the migration changes needed to allow 2.5.x users to upgrade to the current trunk were pretty simple, and took about 5 minutes to write and test.
Or was I the only person in the Rails universe who thought about migrations this way?