Canonical Web Addresses

Although I had often heard the word, I didn’t really know what canonical meant until recently. I am starting to understand now, after working for Automattic on WordPress.

Simon Willison wrote the interesting “Why you should be using disambiguated URLs”. It describes why canonical web addresses (URLs) are important for search engines (SEO), browsers and web infrastructure (cache), and people (sharing).

Although you can brute-force your way to canonical addresses, often the first step is asking: how did a person get to an address you did not want used?

The main areas seem to relate to redirecting (301):

  1. www. or no-www, but not both.
  2. Trailing slashes on “directory”-based URLs or not.
  3. Index filenames (such as index.php) in URLs or not.
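
The three choices above can be sketched as a single normalizing function. This is only an illustration, not how WordPress or any plugin does it: the names `canonical_url`, `redirect_target`, and `INDEX_FILENAMES` are made up for this example, and it assumes a site that prefers no-www, trailing slashes on directory-style paths, and no index filenames.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumption: the filenames a server might treat as a directory index.
INDEX_FILENAMES = {"index.php", "index.html", "index.htm"}


def canonical_url(url, prefer_www=False):
    """Return one canonical form of `url` under the three rules above."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # 1. www. or no-www, but not both.
    if prefer_www and not host.startswith("www."):
        host = "www." + host
    elif not prefer_www and host.startswith("www."):
        host = host[len("www."):]
    if parts.port:
        # Keep an explicit port (any user:password part is dropped).
        host = f"{host}:{parts.port}"

    path = parts.path or "/"
    head, tail = posixpath.split(path)
    if tail in INDEX_FILENAMES:
        # 3. Drop the index filename, keeping the directory's slash.
        path = head if head.endswith("/") else head + "/"
    elif tail and "." not in tail:
        # 2. Directory-style path (no file extension): add the slash.
        path = path + "/"

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def redirect_target(url, prefer_www=False):
    """Return the address to 301 to, or None if `url` is already canonical."""
    canonical = canonical_url(url, prefer_www)
    return None if canonical == url else canonical
```

For example, `redirect_target("http://www.example.com/blog/index.php")` gives `http://example.com/blog/`, while an address that is already canonical gives `None`, so no redirect is needed.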

When should this be handled by the web server (.htaccess), and when in the application?

WordPress does a very good job once you turn on permalinks, but I still installed Scott Yang’s Permalink Redirect WordPress Plugin to redirect addresses missing the trailing slashes. It also deals with /index.php, not that anyone would ever manually enter that URL — but maybe Simon will stop by.
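
To show what the application-side option might look like, here is a minimal sketch written as Python WSGI middleware. It is not the plugin's code (the plugin is PHP and I have not reproduced it); `trailing_slash_redirect` and its `index_name` parameter are names invented for this example. It answers with a 301 when the path is missing its trailing slash or ends in the index filename, and otherwise passes the request through.

```python
def trailing_slash_redirect(app, index_name="index.php"):
    """Wrap a WSGI app and 301-redirect non-canonical paths."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "") or "/"
        last = path.rsplit("/", 1)[-1]  # final path segment

        if last == index_name:
            # /blog/index.php -> /blog/
            target = path[: -len(index_name)]
        elif last and "." not in last:
            # /about -> /about/ (assumes extension-less means directory-style)
            target = path + "/"
        else:
            # Already canonical, or a real file such as /style.css.
            return app(environ, start_response)

        # Keep any mount prefix and the query string intact.
        target = environ.get("SCRIPT_NAME", "") + target
        query = environ.get("QUERY_STRING", "")
        if query:
            target += "?" + query

        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]

    return middleware
```

Doing this in the application means the redirect travels with the code wherever it is installed; doing it in the web server keeps the request from reaching the application at all. Which trade-off is better is exactly the open question above.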

2007-03-02 Update: Thanks to Scott Yang’s clean code, Matt was able to quickly add a stripped-down version of the Permalink Redirect WordPress Plugin to WordPress.com, and the team is considering the functionality for installed WordPress (core).


9 Responses to Canonical Web Addresses

  1. Pingback: Hot Links

  2. Pingback: Simon Willison’s Weblog

  3. engtech says:

    I took a look at all the funky ways people saved one of my posts to del.icio.us in What’s a URL to do?

  4. Mark Jaquith says:

    I’d like to get something like that into WordPress core. For 2.2, I worked on improving WP’s trailing-slash (or not) consistency (see [4886]), but that doesn’t account for human typos, the issue of someone moving from a ?p=x structure to a “pretty” structure, or www/no-www variations. We pretty much have the “how did they get there?” angle covered, as far as WP-generated links, but it’d be nice to cover the human-error angle as well.

  5. Lloyd says:

    engtech, Brilliant investigation! Using a URL too long for someone to hand craft also isolates some of the issues.

    Your post reminded me to install Bennett McElwee’s less plugin. Not because of the issues discussed here, but because I find the normal ‘more’ behavior jarring.

  6. Lloyd says:

    Mark, it would be nice.

    You are correct, WordPress does a fantastic job of canonical web addresses throughout its experience. That is probably why I am enjoying working through these little details.

    I am reminded that I am one of those people who have to do it to understand it. Most of these are bridges that the WordPress team likely crossed long ago.

  7. Matt says:

    Mark, I’m down for it in core. I’ve got a stripped down version of the plugin I’ve been testing on WP.com, want me to send it to you?

  8. Pingback: Simon Willison’s Weblog Friday, 2nd March 2007

  9. Dito says:

    i have a heck of a time figuring out how to .htaccess 301 redirect my www. to my non-www. it seems almost everything i find on the web tells me how to do the opposite
