Canonical Web Addresses

Although I had often heard the word, I didn’t really know what canonical meant until recently. I am starting to understand now, after working on WordPress for Automattic.

Simon Willison wrote the interesting “Why you should be using disambiguated URLs”. It describes why canonical web addresses (URLs) are important for search engines (SEO), browsers and web infrastructure (cache), and people (sharing).

Although you can brute-force canonical addresses, a better first step is often to ask how a person arrived at an address you did not want used.

The main areas seem to relate to redirecting (301):

  1. www. or no-www, but not both.
  2. Trailing slashes on “directory”-based URLs or not.
  3. Index filenames (such as index.php) in URLs, or not.
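
As a rough illustration of the three rules above, here is a minimal Python sketch that reduces a URL to a single form. The policy choices (no-www, trailing slash on, index filename off), the list of index filenames, and the example.com addresses are my assumptions for the example, not something prescribed by Simon’s article or by WordPress.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: an illustrative set of index filenames, not an exhaustive list.
INDEX_FILENAMES = {"index.php", "index.html", "index.htm"}


def canonicalize(url: str) -> str:
    """Return one canonical form of url under three assumed policies:
    no-www host, no index filename, trailing slash on directory-style paths."""
    parts = urlsplit(url)

    # Rule 1: www. or no-www, but not both (this sketch picks no-www).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # Rule 3: drop a trailing index filename, keeping the directory slash.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in INDEX_FILENAMES:
        path = head + "/"
        last = ""

    # Rule 2: treat a final segment without a dot as a "directory"
    # and make sure it ends with a slash.
    if last and "." not in last:
        path += "/"

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))
```

A request would then be redirected (301) only when canonicalize(url) differs from the URL that was requested, so an address that is already canonical passes straight through.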

When should this be handled by the web server (.htaccess), and when in the application?
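
To make the application side of that question concrete, here is a minimal sketch of handling the www/no-www case inside the application, written as a generic Python WSGI wrapper rather than anything WordPress-specific. The function name canonical_host_middleware and the example.com host are hypothetical, and the sketch ignores SCRIPT_NAME and non-default ports.

```python
from urllib.parse import quote, urlunsplit


def canonical_host_middleware(app, canonical_host):
    """Wrap a WSGI app and 301-redirect requests for any other Host header."""

    def wrapper(environ, start_response):
        host = environ.get("HTTP_HOST", canonical_host).lower()
        if host != canonical_host:
            # WSGI hands over PATH_INFO already decoded, so re-encode it.
            path = quote(environ.get("PATH_INFO", "/").encode("latin-1"))
            location = urlunsplit((
                environ.get("wsgi.url_scheme", "http"),
                canonical_host,
                path,
                environ.get("QUERY_STRING", ""),
                "",
            ))
            headers = [("Location", location), ("Content-Length", "0")]
            start_response("301 Moved Permanently", headers)
            return [b""]
        # Host is already canonical: hand the request to the real application.
        return app(environ, start_response)

    return wrapper
```

The trade-off, as I understand it, is that a web server rule answers before the application ever starts, while application code like this knows which addresses actually exist and travels with the application when it moves hosts.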

WordPress does a very good job once you turn on permalinks, but I still installed Scott Yang’s Permalink Redirect WordPress Plugin to redirect addresses missing the trailing slashes. It also deals with /index.php, not that anyone would ever manually enter that URL — but maybe Simon will stop by.

2007-03-02 Update: Thanks to Scott Yang’s clean code, Matt was able to quickly add a stripped-down version of the Permalink Redirect WordPress Plugin to WordPress.com, and the team is considering the functionality for installed WordPress (core).

6 Comments

  1. Posted March 1, 2007 at 3:59 pm | Permalink

    I took a look at all the funky ways people saved one of my posts to del.icio.us in What’s a URL to do?

  2. Posted March 1, 2007 at 4:14 pm | Permalink

    I’d like to get something like that into WordPress core. For 2.2, I worked on improving WP’s trailing-slash (or not) consistency (see [4886]), but that doesn’t account for human typos, for someone moving from a ?p=x structure to a “pretty” structure, or for www/no-www variations. We pretty much have the “how did they get there?” angle covered as far as WP-generated links go, but it’d be nice to cover the human-error angle as well.

  3. Posted March 1, 2007 at 4:39 pm | Permalink

    engtech, Brilliant investigation! Using a URL too long for someone to hand craft also isolates some of the issues.

    Your post reminded me to install Bennett McElwee’s less plugin. Not because of the issues discussed here, but because I find the normal ‘more’ behavior jarring.

  4. Posted March 1, 2007 at 4:52 pm | Permalink

    Mark, it would be nice.

    You are correct, WordPress does a fantastic job of canonical web addresses throughout its experience. That is probably why I am enjoying working through these little details.

    I am reminded how I am one of those people that have to do it to understand it. Most of these are bridges that the WordPress team likely crossed long ago.

  5. Posted March 2, 2007 at 1:22 am | Permalink

    Mark, I’m down for it in core. I’ve got a stripped down version of the plugin I’ve been testing on WP.com, want me to send it to you?

  6. Posted February 24, 2008 at 7:37 am | Permalink

    i have a heck of a time figuring out how to .htaccess 301 redirect my http://www. to my non-www. it seems almost everything i find on the web tells me how to do the opposite

3 Trackbacks

  1. By Hot Links on March 2, 2007 at 7:18 am

    Simon Willison : Permalink Redirect WordPress Plugin - Permalink Redirect WordPress Plugin. Neat WordPress plugin that forces a redirect to an item’s permalink if the URL has any extra crud in it.

  2. By Simon Willison’s Weblog on March 5, 2007 at 4:40 am

    don’t like, you ball it up and throw it away, and rip off a new, fresh one. (Jeff Atwood) Steampunk Star Wars (via) Beautiful illustrations of Star Wars re-imagined in a steampunk context. Permalink Redirect WordPress Plugin (via) Neat WordPress plugin that forces a redirect to an item’s permalink if the URL has any extra crud in it.

  3. Permalink Redirect WordPress Plugin (via) Neat WordPress plugin that forces a redirect to an item’s permalink if the URL has any extra crud in it.
