
Drupal SEO

By Studge | May 12, 2007

Drupal is hands down my favorite content management system, but there are a few initial steps you need to follow before it performs as well for SEO as we expect.

Note: This article will not cover the installation of Drupal.

The first thing I do is turn on clean URLs. This option is available when logged in as an administrator under Administer>Site Configuration>Clean URLs. Drupal requires that you run the test first to determine whether or not your server is set up with Apache's mod_rewrite module. Clean URLs give you meaningful addresses such as /node/2 rather than query strings such as ?q=node/2.

To further complement the use of clean URLs, we are going to install a Drupal module that will automatically name our posts and pages for us. This module is pathauto. Download the appropriate package and upload the pathauto folder into the modules folder on your server. Be sure to enable it under Drupal's module section. If you have already created content and are only now installing this module, then it is important that you navigate to the pathauto configuration section and have it bulk generate index aliases.

One problem with Drupal is the creation of duplicate pages. For example: if you create a new post, then you will be presented with four different URLs for the same content:

http://example.com/new-post
http://example.com/new-post/
http://example.com/node/2
http://example.com/node/2/

Serving the same content at several URLs looks bad to search engines. So, we need to edit our .htaccess file, located at the root of our webserver, to redirect any URL with a trailing slash to the version without it. We will then edit our robots.txt file to keep the search engines out of the /node area, so those duplicates stop being an issue.

First we open our .htaccess file and add the following to the beginning of the file (replace example\.com with your domain name):

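# remove trailing slashes
# match the site with or without the www. prefix, case-insensitively
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
# permanently (301) redirect any path ending in / to the same path without it
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]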
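
To see which of the four URLs above the rule actually touches, here is a rough Python sketch that applies the same pattern, ^(.+)/$, to a request path. It is only an illustration of the regular expression, not of how Apache works internally. The helper name strip_trailing_slash is made up for this example, and it assumes the path arrives without its leading slash, which is how a RewriteRule in .htaccess sees it.

```python
import re

# Same pattern as the RewriteRule above: one or more characters, then a final slash.
TRAILING_SLASH = re.compile(r"^(.+)/$")


def strip_trailing_slash(path):
    """Return the redirect target for `path`, or None if the rule does not apply.

    Hypothetical helper for illustration only. `path` is the request path
    without its leading slash (assumption: .htaccess context).
    """
    match = TRAILING_SLASH.match(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    for path in ["new-post", "new-post/", "node/2", "node/2/", ""]:
        print(repr(path), "->", strip_trailing_slash(path))
```

Only new-post/ and node/2/ get a redirect target (new-post and node/2). The slash-less URLs and the empty path for the front page are left alone, so the rule cannot loop.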

Next we need to edit our robots.txt file to keep the search engines out of our /node area. Add the following line inside a User-agent group (for example, under User-agent: *):

Disallow: /node/
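
If you want to check the effect of that line before you upload the file, Python's standard urllib.robotparser module can evaluate it. This is just a sketch. It assumes the rule sits under a User-agent: * group, and the helper name is_crawlable is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Assumed minimal robots.txt: the Disallow line from above under a catch-all group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /node/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Hypothetical helper: True if the rules above let `user_agent` fetch `url`."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in [
        "http://example.com/new-post",
        "http://example.com/node/2",
        "http://example.com/node",
    ]:
        print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

The aliased URL stays allowed and /node/2 is blocked. Note that /node itself (no trailing slash) does not start with /node/, so this rule leaves it alone.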

That should take care of the basic SEO setbacks that come with a default installation of Drupal. You may also want to implement the XML Sitemap module to automatically produce a sitemap when new content is created. It will also notify several search engines when it is updated.

Topics: SEO, Web Development


3 Comments »

Comment by Drupal SEO
2007-09-30 07:28:15

Great article. The one other thing to consider is adding the nodewords module, which gives you meta tag elements for each node or site wide. Care should be taken not to get into duplicate content issues with the sophisticated capabilities of any CMS package like Drupal.

 
Comment by Drupalzilla
2007-10-02 20:12:14

You could also use the Global Redirect Module to automatically remove the trailing slashes on Drupal URLs. It also redirects the /node URLs to their URL aliases.

 
Comment by Hagrin
2008-01-31 22:58:51

Good guide, but like Drupalzilla mentioned you could definitely benefit by including the Global Redirect module which works well with clean URLs and pathauto.

 