Typo RSS enhancements

Last week’s filter project is now a standard part of Typo, so I’ve been spending time on a new Typo project: unified RSS feeds. Typo has had support for RSS 2.0 and Atom 0.3 for months, as well as RSS feeds for site-wide comments, site-wide trackbacks, and per-article comments. Each feed was generated by its own distinct view code. This wasn’t scaling very well–every time we add a new feed type (like per-category or per-tag feeds), we have to write a bunch of new code and copy existing templates, and replicate that work once for each feed format that we want to support. That’s why most of the feed types were only available in RSS–creating Atom feeds for them was too much work and too error-prone.

So I decided to rip all of the feed code out and replace it with a new system. Instead of having one template per feed type, I have one master template per feed format (RSS 2.0, Atom 0.3, and now Atom 1.0), plus a per-item template for articles, comments, and trackbacks in each format. That’s a total of 12 templates for 3 different formats. My controller then generates a list of items based on some set of queries and then asks the view to assemble itself based on the type of items that it finds in @items. So the recent articles RSS feed produces a list of the 15 most recent articles, shoves that into @items, and asks the view to take care of it. The trackbacks Atom feed works the same way–@items gets a list of the 15 most recent trackbacks and the same basic view logic does the work. Mixed-type feeds like the article+comments feed work too. This will make adding new feed types mostly trivial.
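The dispatch idea can be sketched in plain Ruby. This is only an illustration under assumptions, not Typo's actual view code: the Struct classes, the ITEM_TEMPLATES and MASTER_TEMPLATES tables, and render_feed are hypothetical names, and lambdas stand in for the real template files.

```ruby
require 'erb'

# Hypothetical stand-ins for Typo's models.
Article   = Struct.new(:title)
Comment   = Struct.new(:author)
Trackback = Struct.new(:blog_name)

# Shorthand for HTML/XML escaping.
def h(text)
  ERB::Util.html_escape(text)
end

# One per-item template per (format, item class) pair.
ITEM_TEMPLATES = {
  rss20: {
    Article   => ->(a) { "<item><title>#{h(a.title)}</title></item>" },
    Comment   => ->(c) { "<item><title>Comment by #{h(c.author)}</title></item>" },
    Trackback => ->(t) { "<item><title>Trackback from #{h(t.blog_name)}</title></item>" }
  }
}

# One master template per format wraps the rendered items.
MASTER_TEMPLATES = {
  rss20: ->(body) { "<rss version=\"2.0\"><channel>#{body}</channel></rss>" }
}

# Picks the per-item template from each item's class, so mixed-type
# lists work without any feed-specific code.
def render_feed(format, items)
  body = items.map { |item| ITEM_TEMPLATES.fetch(format).fetch(item.class).call(item) }.join
  MASTER_TEMPLATES.fetch(format).call(body)
end
```

Under this sketch, adding a format means adding one master entry plus three item entries, and adding a feed type only means building a different list of items.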

While I was at it, I added Atom 1.0 support. NetNewsWire 2.0.1 has Atom 1.0 support, so it’s time for us to add it too. Because of our nice new template infrastructure, every feed type that was available in RSS before is now available in Atom 1.0 as well. I also added per-category and per-tag feeds, although they aren’t linked in anywhere yet, so users have to type in the feed URL by hand in order to use them.

The big problem with having so many different feed types (6 types times 3 different formats) is keeping them all standards-compliant. Ruby doesn’t have a feed validator class yet, so I decided to cheat. The Python code from feedvalidator.com is available from SourceForge, so I grabbed it and installed it locally. Then I took their demo wrapper, copied it into /usr/local/bin/feedvalidator, and told Typo’s feed tests to test each feed type using the Python validator code. This helped immensely, and I was able to get all three feed formats back into compliance in no time at all. Frankly, this was the coolest application of unit testing that I’ve ever done, and it was massively and immediately useful. I’ve been trying to get into unit tests for a long time, and this is the first time that I feel like I’m really getting things right.
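A test helper along these lines could shell out to the validator. It is a minimal sketch: the path comes from the paragraph above, but the wrapper's command-line interface (one file argument, zero exit status when the feed is valid) is an assumption, and validate_feed is a hypothetical name.

```ruby
require 'open3'
require 'tempfile'

# Path from the post; the one-file-argument interface and the meaning of
# the exit status are assumptions about the wrapper script.
VALIDATOR = ['/usr/local/bin/feedvalidator'].freeze

# Writes the feed to a temp file, runs the validator on it, and returns
# [valid?, combined stdout/stderr output].
def validate_feed(xml, command: VALIDATOR)
  Tempfile.create(['feed', '.xml']) do |file|
    file.write(xml)
    file.flush
    output, status = Open3.capture2e(*command, file.path)
    [status.success?, output]
  end
end
```

A feed test would then render each feed, pass the XML to validate_feed, and fail with the validator's output whenever the first element is false.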

Hopefully I’ll be able to commit this to the Typo trunk later tonight, and then I can move on to my next little Typo project. I’m not really sure which one to pick up next–either podcasts, context-sensitive sidebars, action caching, statistics, or schema generation from migrations. So many choices.

Posted by Scott Laird Wed, 31 Aug 2005 05:13:00 GMT


Introduction to Typo filters

Although blogs are inherently HTML-based, HTML isn’t really a great format for writing plain-text documents. If nothing else, manually adding <p> and </p> around paragraphs interrupts the flow of writing. Most people would prefer to write in a more user-friendly manner, either via a GUI editor or a light-weight markup language like Markdown, which is then translated to HTML automatically. Very few people really want to write raw HTML blog postings on a daily basis.

Out of the box, Typo 2.5 supports the Markdown and Textile markup languages and the SmartyPants HTML post-processing filter, which adds typographical quotes and dashes to HTML. Adding additional filters is difficult because the filter setup is hard-coded into Typo. One of the new features that I’ve been working to add to Typo is the ability to easily add new text filters via filter plugins, similar to the sidebar plugins in Typo 2.5. At the same time, I’ve also added several new filtering plugins that extend Typo’s abilities in a number of useful ways.

The goal of all of this is to make it easier to write using Typo. I’ve tried to find things that cause me pain and then fix them. I want to make it easy to do common writing tasks without having to fire up an external tool. Admittedly, my definition of “common writing tasks” is probably different from most people’s, but the easy ability to extend Typo’s filtering system will allow people to adapt Typo to their own needs without having a deep understanding of Typo’s internals.

Inside Typo Filters

The new filter code supports three different types of filter plugins:

  1. Markup filters, like Textile and Markdown
  2. Macro filters
  3. Post-processing filters, like SmartyPants

Markup filters convert from a specific markup language into XHTML. You generally only want to use one markup language per article.

Macro filters convert certain Typo-specific macro tags into longer HTML sequences. These will be explained below.

Post-processing filters convert valid HTML into valid (but possibly enhanced) HTML.

Typo’s filtering system allows the user to create filter sets that use one markup filter and any mixture of post-processing filters. Macro filters are always enabled; they’re difficult to trigger accidentally and this greatly simplifies the filter management user interface.
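The ordering implied by these rules can be sketched as a small helper. This is a guess at the shape, not Typo's code: filter_chain is a hypothetical name, and :macropre is an assumed symbol for the macro stage that runs before markup.

```ruby
# Builds the ordered filter list for a filter set: macros that run before
# markup, the (optional) markup filter, macros that run after markup, then
# the chosen post-processing filters. The macro stages are always present.
def filter_chain(markup: nil, postprocess: [])
  [:macropre, markup, :macropost, *postprocess].compact
end

filter_chain(markup: :markdown, postprocess: [:smartypants])
# => [:macropre, :markdown, :macropost, :smartypants]
```

Because the macro stages are fixed, a filter set only has to store one markup choice and a list of post-processors.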

Using Typo Filters

Typo 2.5 came with 5 hard-coded filter sets:

  • No filtering
  • Textile
  • Markdown
  • SmartyPants
  • Markdown with SmartyPants

The new filtering code comes with the same filters defined. If one of these fits your needs perfectly, then you can continue using it unchanged. If you need to make changes, Typo’s admin system now includes a “Text Filters” tab that lets you edit these filter sets and create new ones.

Each text filter defined in the admin interface has a drop-down box for the markup language used (currently None, Markdown, or Textile) and check boxes for each available post-processing filter.

Macro filters

Macro filters convert certain Typo-specific tags to longer HTML sequences. The new filter code comes with three macro filter plugins:

  • <typo:code>: displays formatted code snippets, optionally with syntax highlighting and line numbering.
  • <typo:flickr>: produces an image tag linked to an image on Flickr, optionally with a caption.
  • <typo:sparkline>: displays a Sparkline, Tufte’s name for a small in-line chart.

All macro filters use <typo:NAME>-style tags. The <typo:NAME> tag is then replaced by the output of the macro filter during the filtering process. For example, the Flickr macro filter would replace this:

<typo:flickr img="31366117" size="square" style="float:left"/>

with

<div style="float:left" class="flickrplugin">
  <a href="http://www.flickr.com/photo_zoom.gne?id=31366117&size=sq">
    <img src="http://photos23.flickr.com/31366117_b1a791d68e_s.jpg" width="75" height="75" alt="Matz" title="Matz"/>
  </a>
  <p class="caption" style="width:75px">
      This is Matz, Ruby's creator
  </p>
</div>

Notice that the <typo:flickr> line is a lot less typing.
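The substitution itself can be approximated with a regular expression. The sketch below is not how Typo implements it: MACROS, parse_attributes, and expand_macros are hypothetical names, the demo <typo:hello> macro is invented for illustration, and only self-closing tags are handled.

```ruby
# Hypothetical registry mapping macro names to handlers. Each handler
# receives the tag's attributes as a hash and returns replacement HTML.
MACROS = {
  'hello' => ->(attrs) { "<b>Hello, #{attrs.fetch('name', 'world')}</b>" }
}

# Parses key="value" pairs out of a tag's attribute string.
def parse_attributes(str)
  str.scan(/([\w:-]+)="([^"]*)"/).to_h
end

# Replaces each self-closing <typo:NAME .../> tag with its macro's output.
# Tags with no registered macro are left untouched.
def expand_macros(text)
  text.gsub(%r{<typo:(\w+)([^>]*?)/>}) do |tag|
    handler = MACROS[$1]
    handler ? handler.call(parse_attributes($2)) : tag
  end
end

expand_macros('Say <typo:hello name="Matz"/>!')
# => "Say <b>Hello, Matz</b>!"
```

Leaving unknown tags alone matches the idea that macros should have no side effects when they aren't used.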

The other macro tags work similarly. Here’s a brief example of the code plugin in action:

<typo:code lang="ruby">
  class Foo
    def bar
      "abcde"
    end
  end
</typo:code>

The end result is basically the same as <pre>...</pre>, except that the text in the middle gets Ruby-specific syntax highlighting and all HTML is escaped.
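The escaping half of that behavior is easy to sketch. This is an assumption-laden illustration, not the plugin's code: render_code is a hypothetical name, the class attribute is invented, and syntax highlighting is left out entirely.

```ruby
require 'erb'

# Wraps a code snippet in <pre><code>, escaping HTML so that markup inside
# the snippet is displayed rather than interpreted. Highlighting is omitted.
def render_code(text, lang: nil)
  css = lang ? " class=\"#{ERB::Util.html_escape(lang)}\"" : ''
  "<pre><code#{css}>#{ERB::Util.html_escape(text)}</code></pre>"
end

render_code('a < b && c', lang: 'ruby')
# => "<pre><code class=\"ruby\">a &lt; b &amp;&amp; c</code></pre>"
```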

Documentation enhancements

Each filter plugin has the opportunity to define a self.help_text method that returns a help string. The admin interface currently has a button to show the help text for each filter; in the near future we’ll extend this to the content and comment editing pages as well. This way users will be able to see text formatting help that’s specific to the exact filter configuration in use.

Writing filters

Basic filters are pretty simple. Here’s a minimal markup filter, for example:

class Plugins::Textfilters::TextileController < TextFilterPlugin::Markup
  def self.display_name
    "Textile"
  end

  def self.description
    'Textile markup language'
  end

  def filtertext
    text = params[:text]
    render :text => RedCloth.new(text).to_html
  end
end

This is about as basic as it can be–it doesn’t include any help text, but it’s a fully functional text filter. Drop this into components/plugins/textfilters/textile_controller.rb, and Typo will automatically gain the ability to use Textile formatting.

To create markup filters, your filter class needs to be a subclass of TextFilterPlugin::Markup. Post-processing filters are essentially the same, except they’re subclasses of TextFilterPlugin::PostProcess.

Macro filters are slightly different. First, there are two different macro classes, TextFilterPlugin::MacroPre and TextFilterPlugin::MacroPost–one runs before markup filters, and the other runs after. Second, macro filters don’t define a filtertext method; instead they define a macrofilter method that looks like this:

def macrofilter(attrib,params,text="")
  data = text.to_s.split(/\s+/).join(',')

  if(attrib['data'])
    data = attrib.delete('data').to_s.split.join(',')
  end

  url = url_for(
    {:controller => '/textfilter', 
     :action => 'public_action', 
     :filter => 'sparkline',
     :public_action => 'plot', 
     :data => data}.update(attrib))
  "<img src=\"#{url}\"/>"
end

The attrib parameter is a hash of all attributes to the <typo:macroname> tag, params contains filter-wide parameters (see below), and text is the text between <typo:macro>...</typo:macro> tags, if any.

Filters are controllers, and they have access to all of the usual ActionController methods, like url_for and friends. By default, none of the actions in plugins are visible to the public, so you don’t have to worry about someone feeding http://blog.example.com/plugins/textfilters/foo/exploit_me into their web browser and running code inside of your plugin. In some cases, though, you want to have certain methods in your plugin be accessible via URL. For instance, your plugin might need to use Ajax for something, or it might need to produce images, like the Sparkline plugin does.

To accomplish this, use plugin_public_action, like this:

class Plugins::Textfilters::SparklineController < TextFilterPlugin::MacroPost
  plugin_public_action :plot
  def plot
    ...
  end
end

This will connect http://blog.example.com/plugins/textfilters/sparkline/plot to SparklineController#plot. If you need to use views, then create a components/plugins/textfilters/<plugin> directory and put your views in there.

Plugin parameters

Some filter plugins need more information than they can easily collect when filtering each article. For instance, think about a hypothetical WikiWords auto-linking filter that turned WikiWords into links to a Wiki somewhere. If it’s going to link words, then it’ll need to know which wiki to link them to. That’s where filter parameters come in. Each filter plugin can have a default_config method like this:

def self.default_config
  {"wiki-link" => {
    :default => "", 
    :description => "Wiki URL to link WikiWords to",
    :help => "The WikiWords plugin links..."}}
end

Typo collects all of the default_config items from all enabled plugins and presents them to the user in the Text Filter admin area. If the WikiWords filter was installed, then each filter set would have an editing box labeled “Wiki URL to link WikiWords to”.
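The collection step might look roughly like this. Only the default_config shape is taken from the example above; the plugin classes, collect_config, and effective_params are hypothetical, and how Typo really stores per-filter-set values is not shown here.

```ruby
# Hypothetical plugin classes; only default_config follows the post.
class WikiWordsFilter
  def self.default_config
    { 'wiki-link' => { :default => '',
                       :description => 'Wiki URL to link WikiWords to' } }
  end
end

class PlainFilter
  # A plugin with no parameters simply omits default_config.
end

# Merges the default_config hashes of every plugin that defines one.
def collect_config(plugins)
  plugins.each_with_object({}) do |plugin, config|
    config.update(plugin.default_config) if plugin.respond_to?(:default_config)
  end
end

# Resolves the values a filter set would see: stored settings win,
# otherwise the plugin's :default applies.
def effective_params(config, stored = {})
  config.each_with_object({}) do |(key, spec), params|
    params[key] = stored.fetch(key, spec[:default])
  end
end
```

The merged hash gives the admin page one labeled field per key, and the resolved values are what a filter would receive as its params argument.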

Using filters from inside of Typo

In Typo 2.5, filters were called via the HtmlEngine.transform library method. Unfortunately, this had to change with the new plugin system, because several plugins need to be called from a Controller context so they can use views and helpers like url_for.

Unfortunately, this means that it’s no longer possible to call filters directly from Models–they have to be called from Controllers so that they have the right context available. Fortunately, the code wasn’t too hard to convert, even though there was a lot of it.

To use filter plugins from inside of a controller, just call filter_text, like this:

filter_text('text to be filtered',[:markdown, :macropost, :smartypants])

This is rather low-level. To use whole filter sets, use this:

filter_text_by_name('more text to be filtered','markdown')

This will look up the filter set named ‘markdown’ in the text_filters table and apply it to the text more text to be filtered.
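The relationship between the two calls can be modeled in a few lines. This is a toy model, not Typo's implementation: FILTERS stands in for the installed plugins, FILTER_SETS for rows in the text_filters table, and the two demo filters are invented.

```ruby
# Hypothetical in-memory stand-ins: FILTERS plays the role of the installed
# plugins, FILTER_SETS the role of rows in the text_filters table.
FILTERS = {
  upcase:  ->(text) { text.upcase },
  exclaim: ->(text) { "#{text}!" }
}

FILTER_SETS = {
  'loud' => [:upcase, :exclaim]
}

# Low-level form: apply the listed filters in order.
def filter_text(text, filters)
  filters.inject(text) { |result, name| FILTERS.fetch(name).call(result) }
end

# High-level form: look up a named filter set, then apply it.
def filter_text_by_name(text, name)
  filter_text(text, FILTER_SETS.fetch(name))
end

filter_text_by_name('hello', 'loud')
# => "HELLO!"
```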

Any time that Article#body (or any of the similar models, like Comment and Page) changes, the controller must manually call filter_text_by_name. This happens around 10 times in the current Typo tree.

Update: The API for filters changed somewhat around r685; the programming examples given here are a bit out of date now. I’ll write a “writing filters” document once the interface is stable.

Posted by Scott Laird Wed, 24 Aug 2005 02:13:00 GMT


Typo filters nearing completion

My little Typo filter project is finally nearing completion. I think I’ve been working on this for almost two weeks now, which makes it the most time-consuming Typo project that I’ve undertaken yet. I’ve added about 400 lines of code and 200 lines of new tests, and changed at least 400 more lines. Typo’s current trunk is only 2600 lines long, so I’ve touched almost 40% of the code.

At this point, almost everything works. I can drop new filters into components/plugins/textfilters and they’re immediately available for use. All of the current filters work (Textile, Markdown, SmartyPants), and I’ve added several new filters as well. There’s still a lot of cleanup left to do, and there are a bunch of corner cases that I need to write tests for, but the core code seems pretty solid, and it’s essentially feature-complete.

Posted by Scott Laird Sun, 21 Aug 2005 06:19:50 GMT


Typo 4.0 development begins

Typo development slowed down for a while after we released Typo 2.5.0 so we could make sure that we had a stable release for people to use. It’s been a couple weeks now, and Typo 2.5.5 looks good, so we’ve started working on the next release, which will be called Typo 4.0. We’re going to skip Typo 3.0, because it conflicts with Typo3.

The two biggest changes so far are tags and file upload support. Over the next few weeks we’ll add the beginnings of podcast support, unify our RSS and Atom feed code, and (hopefully) add the filter code that I’ve been working on for the last week. Our current wishlist for Typo 4.0 is available in the Typo wiki.

I’m starting to feel good about my filter code. It’s a huge patch–it currently touches 68 files, not including vendor updates. I have three or four things left on my to-do list, and then I’ll spend a bit of time doing audit, cleanup, and testing. Most likely it’ll be ready by early next week.

Posted by Scott Laird Fri, 19 Aug 2005 17:43:45 GMT


Another day, another filter plan

I’ve spent a bit of time playing with moving my Typo filter patch to use controller instances for filtering, and it doesn’t look too hideous on the filter side, but I’m going to need a lot of infrastructure changes before I can deploy it.

Unfortunately, the main user of filters right now is Article#set_defaults, which is called whenever an Article is saved. If I make filtering a controller issue, then I’ll have to rip out the filtering in Article (also Comment and Page) and move it to ArticleController. This won’t be a big performance hit, because I can use fragment caching, but it’s very invasive.

So, before I go any further down this road, I need to decide if this is really cleaner than just caching a base URL and using it to manually generate URLs. I was originally against this, but a thread on the Typo mailing list reminded me that we’re going to need a base URL if we want to support multiple blogs in a single Typo install anyway.

Ugh.

Posted by Scott Laird Wed, 17 Aug 2005 15:13:38 GMT


Typo filters hit a bit of a wall

I just hit a bit of a problem with Typo filters, and I’m not sure what the best way out is.

The problem first showed up with the sparkline filter. This filter turns a block like <typo:sparkline data="10 20 30 40 50"/> into an <img/> tag that points to a Sparkline generator on the current website. Things were going great with this until I realized that the filters don’t know anything about “the current website.” Even though my filters are technically Rails controllers (there’s a reason for this, I just haven’t fully implemented it yet–it’s next on the list once this problem is fixed), the actual filter method is a class method, not an instance method, and anyway most of the time, the filtering code is called from outside of a controller context. Basically, when the filter code gets called, it doesn’t know which website it’s getting called for.

And that sucks. Even ignoring complex things like the sparkline code, this makes it impossible for filters to produce URL references to elsewhere in the current Typo site. That means filters can’t do locally-hosted images, or WikiWord-style links, or AJAX. And that’s going to be a problem.

As I see it, I have two options:

  1. Re-spin the filter API so that it’s all controller helpers and components. This way they’ll always be able to use url_for. It’s a big conceptual change, but the code will be cleaner, and I doubt I’ll actually have to change more than 50 lines of code, not counting unit tests.
  2. Cache the base URL for the site somewhere and hand-code URLs myself. This is easy, but ugly.

Since I’m trying to avoid ugly, it looks like I’m going to be taking option #1. That’ll probably push the first filter release out to this weekend, but it’ll be better code.

As a side note, my Rails Book finally showed up yesterday, so I now have printed documentation to work with. PDFs are nice for skimming and searching, but they’re a pain to read cover-to-cover.

Posted by Scott Laird Tue, 16 Aug 2005 15:36:07 GMT


Questions about typo filters

After posting yesterday’s typo filter announcement, I started to have a few misgivings about the way that I was planning on configuring filters. I asked a few questions on the Typo IRC channel and got a number of wildly different suggestions and opinions, but one line from cDlm stuck with me:

it’s gonne be a nightmare to keep all those filters orthogonal

There was also a comment that every blog was going to end up with its own markup language, incompatible with everyone else. I thought about it a bit, and I think I’m approaching filters wrong. Or, rather, I’m approaching them like a programmer, not like an end user. The filter interface that I’ve been planning is too generic, and we’ll probably be better-served if we remove some of the genericness, at least on the front end.

As I see it, filters fall into 3 basic categories:

  1. Markup languages–they convert from some non-HTML markup language into HTML. Examples include Markdown and Textile.
  2. HTML post-processors–they convert generic HTML elements into other generic HTML elements. Example: SmartyPants, and sort of the Amazon filter that I discussed earlier.
  3. Typo macro tags. This includes the <flickr> and <sparkline> filters from yesterday.

Looking at this list, I see 2 very specific things:

  1. There’s really no reason to run two markup languages on the same post. You want at most one of them, and possibly 0 if you’re writing raw HTML.
  2. As long as the Typo macro tags fit into a clean namespace and they don’t have side-effects when they aren’t used, there’s no reason to ever turn them off. If a macro tag filter is installed, then it should be used on all articles.

This really simplifies things, because the filter configuration no longer requires the user to set up an arbitrary set of filters. Now it just needs to know:

  1. Which markup language
  2. Which post-processing filters.

I think we can call the Amazon filter a post-processing filter without breaking anything. Furthermore, we can probably pre-order the post-processing filters by having a priority built into them. This way, the user just needs to click checkboxes. This is a lot less complex than dragging a half-dozen filters around in a script.aculo.us Sortable like we have to use for the sidebar.
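A built-in priority could work something like the sketch below. Everything here is hypothetical: the PostFilter struct, the priority numbers, and the convention that lower numbers run first are illustrative choices, not a settled design.

```ruby
# Hypothetical post-processing filters with a built-in priority;
# lower numbers run first. Names and values are illustrative only.
PostFilter = Struct.new(:name, :priority)

AVAILABLE = [
  PostFilter.new(:smartypants, 50),
  PostFilter.new(:amazon, 20),
  PostFilter.new(:wikiwords, 30)
]

# Turns the set of checked boxes into an ordered filter list.
def ordered_filters(checked)
  AVAILABLE.select { |f| checked.include?(f.name) }
           .sort_by(&:priority)
           .map(&:name)
end

ordered_filters([:smartypants, :amazon])
# => [:amazon, :smartypants]
```

The order in which the boxes were checked never matters, which is the point of moving the ordering decision into the filters themselves.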

This won’t handle 100% of the cases that people want, but it’s almost certainly well over 90%, and I think it’ll have about 10% of the complexity. The common cases, like “I want Markdown with SmartyPants” or “I want Textile with SmartyPants and WikiWords” will be really simple, and that’s much more important than the ability to stack filters in arbitrary orders. Anyway, there will be a way to get around the remaining cases by hand-creating TextFilter database entries; if someone really wants to do sufficiently weird things, like applying Markdown and Textile to the same article, then I don’t think asking them to do something like this is excessive:

$ ./script/console production
>> TextFilter.create(:name => 'weird filter', :description => 'My Weird Filter', :filters => [:markdown, :textile, :macros, :smartypants, :textile, :piglatin])

I’m going to start re-working my filter code to fit into this framework; I should have something to show in a few days.

Posted by Scott Laird Sat, 13 Aug 2005 23:29:08 GMT


Typo theme tutorial

Geoffrey Grosenbach just posted his new Typo theme tutorial. It looks pretty good, and manages to show how simple Typo themes really are right now.

I’d really like to get the ability to override individual view files in place before the next major release. I haven’t looked into the guts of Rails yet, but I suspect that it’ll either be really easy (under 5 lines of code) or nearly impossible.

Posted by Scott Laird Sat, 13 Aug 2005 00:21:31 GMT


Pluggable text filters for Typo

Now that tags are working, I’ve started work on adding text-filter plugins for Typo. The current release (Typo 2.5.3) has support for 5 different combinations of Textile, Markdown, and SmartyPants hard-coded into it. The different combinations are actually repeated in 3 different places–the filtering code itself, the drop-down list for the built-in editor, and the Movable Type API code to list filter options.

That’s all gone now, replaced with a plugin system, similar to the sidebar plugins that made it into Typo 2.5. Individual filters get dropped into components/plugins/textfilters/ and the system picks up on them automatically. Then there’s an interface in the admin UI that lets you combine the filters into named filter sets, so you can combine Markdown and SmartyPants into “My Filters” (or “Markdown with SmartyPants”, which Ecto recognizes and performs some magic to get Markdown previews to work right). The UI isn’t really complete yet, but the entire back end is there, and I’ve added two new filters as a demonstration of what we can do. Here’s the current list:

  • Markdown. This is my favorite lightweight markup language, and I use it for everything that I write here.
  • SmartyPants. A companion to Markdown, it does a typographical cleanup on HTML, turning ASCII single and double quotes into their typographically correct cousins and fixing em-dashes.
  • Textile. Another lightweight markup language, like Markdown.
  • Amazon. This turns URLs like <a href="amazon:097669400X" ...> into a link to Amazon’s page for ASIN 097669400X, optionally attaching your Amazon affiliate tag. This is mostly a demonstration of what you can do with filters, although I’ll be using it on my blog.
  • Flickr. This sticks a picture from Flickr on the page. This is a bit more complex than the Amazon filter, but similar in concept. It turns <flickr img="31366117" ...> into a formatted inline image, linked to Flickr’s full-sized image page, optionally with a caption attached. The full HTML produced is something like <div style=""><a><img/></a><p>Caption</p></div>, which saves a lot of typing.
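As a rough idea of how small a filter like the Amazon one can be, here is a sketch of the rewrite step. It is not the plugin's code: amazon_links is a hypothetical name, and the target URL shape (/dp/ASIN with an optional tag parameter) is an assumption.

```ruby
# Rewrites href="amazon:ASIN" into a full product URL. The URL shape
# (/dp/ASIN with an optional ?tag= parameter) is an assumption.
def amazon_links(html, affiliate_tag = nil)
  html.gsub(/href="amazon:([0-9A-Za-z]{10})"/) do
    url = "https://www.amazon.com/dp/#{$1}"
    url += "?tag=#{affiliate_tag}" if affiliate_tag
    %(href="#{url}")
  end
end

amazon_links('<a href="amazon:097669400X">Book</a>', 'example-20')
# => "<a href=\"https://www.amazon.com/dp/097669400X?tag=example-20\">Book</a>"
```

Other attributes on the link pass through untouched, since only the href value is matched.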

I’m currently working on a Sparklines plugin, using Glyph’s Ruby sparklines code. It’ll be similar to the <flickr> tag, except it’ll spit out an <img> tag that points to a built-in sparkline generator. Turning <sparkline ...> into an image tag is trivial; allowing a text filter to export an action to the world is a bit more work.

There are currently two things that bother me about this code that I’ll need to resolve before releasing it:

  1. The <flickr> and <sparkline> tags–should they look like plain XHTML, or is that a mistake? Should I turn them into pseudo-bbcode tags, like [flickr]? I’m currently leaning towards sticking a typo pseudo-namespace on the front of them, and turning them into <typo:flickr .../> and <typo:sparkline ...>. Any objections to that?
  2. The admin interface to this is killing me. I’d love to have a nice, simple way of editing each filter set, but it’s turning into a nightmare. I could just copy the sidebar config page (with a few changes–you can only include each filter once, unlike sidebars), but lots of people have had problems with the sidebar editor, and I’d like something a bit cleaner. Except I have no idea what to do.

If all goes well, I’ll post a public patch for comment early next week, and then kick off the Typo 4.0 process by committing this and the tag code later in the week.

Posted by Scott Laird Fri, 12 Aug 2005 15:20:00 GMT


More Typo wishlist items

I updated my Typo to-do list this morning and uploaded it to the Typo wiki.

Hopefully we can get most of those features into the next major Typo release.

Posted by Scott Laird Tue, 09 Aug 2005 20:54:25 GMT