Author Topic: Google Sitemap Mod  (Read 29915 times)

Ken Dahlin

  • Full Member
  • ***
  • Karma: 30
  • Posts: 139
    • http://www.kendahlin.com/
Google Sitemap Mod
« on: August 08, 2007, 10:47:23 am »

This mod dynamically generates an XML sitemap so that Google can index each page of your site. It includes articles, categories and pages. You can find more information about XML sitemaps at http://sitemaps.org.

Update 8/01/08: See Joost's important information about the "trailing slash issue": http://snewscms.com/forum/index.php?topic=7570.0. The current mod does not use trailing slashes. If you use the new .htaccess file Joost suggests, you'll need to find every occurrence of "</loc>" and replace it with "/</loc>". This is highly recommended for reasons I posted in that thread.
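
A side note on that replacement: db('website') appears to end in a slash already (the code further down concatenates it directly with "archive"), so a blind replace would presumably give the home page entry a double slash. Here is a minimal PHP sketch of one way to avoid that. The helper with_trailing_slash() is hypothetical and is not part of sNews or this mod.

```php
<?php
// Hypothetical helper (not part of sNews): make a URL end in exactly one slash.
function with_trailing_slash($url)
{
    // Strip any slashes already at the end, then append a single one.
    return rtrim($url, '/') . '/';
}

echo with_trailing_slash('http://example.com/archive') . "\n"; // http://example.com/archive/
echo with_trailing_slash('http://example.com/') . "\n";        // http://example.com/
```

If you went this route, you would wrap each URL in the helper before echoing the closing "</loc>" instead of hardcoding the extra slash.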

Update 9/16/07: If you don't want to modify your snews.php file, you can download this mod as a more or less standalone program here: http://dahlin.googlecode.com/files/snews-google-sitemap02.zip. Just drop the files in the root of your install and tell Google to find your sitemap at googlesitemap.php instead of sitemap.xml. This approach works well and should be more compatible with heavily modded versions of sNews. I recommend using the standalone.

If you love your text editor and choose to not use the standalone, here are the instructions for modifying your snews.php:

Add a new function in snews.php called sitemapxml(); the code is based on the normal sitemap function, but for XML.

Code: [Select]
//SITEMAP.XML
function sitemapxml() {

$sitemap_css = "smstyle.xsl";

header('Content-type: text/xml; charset='.s('charset').'');
echo '<?xml version="1.0" encoding="UTF-8"' . '?' . '>'. "\n";
if (file_exists($sitemap_css)) {
echo '<' . '?xml-stylesheet type="text/xsl" href="'.db('website').'smstyle.xsl"?' . '>'. "\n";
} else {
echo "<!-- Debug: No smstyle.xsl -->\n";
}


echo '<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'. "\n";

echo "<!-- Debug: Main, Archive and Sitemap Pages -->\n";
echo "  <url>\n";
echo "    <loc>".db('website')."</loc>\n";
echo "    <changefreq>daily</changefreq>\n";
echo "    <priority>1</priority>\n";
echo "  </url>\n";
echo "  <url>\n";
echo "    <loc>".db('website')."archive</loc>\n";
echo "    <changefreq>daily</changefreq>\n";
echo "    <priority>.8</priority>\n";
echo "  </url>\n";

echo "  <url>\n";
echo "    <loc>".db('website')."contact</loc>\n";
echo "    <changefreq>monthly</changefreq>\n";
echo "    <priority>.3</priority>\n";
echo "  </url>\n";

echo "  <url>\n";
echo "    <loc>".db('website')."sitemap</loc>\n";
echo "    <changefreq>daily</changefreq>\n";
echo "    <priority>.8</priority>\n";
echo "  </url>\n";

$link = "<loc>".db('website');
$query = "SELECT * FROM ".db('prefix')."articles WHERE position = 3 AND published = '1' ORDER BY date";
$result = mysql_query($query);
while ($r = mysql_fetch_array($result)) {
    $art_date = date('Y-m-d', strtotime($r['date']));
    $art_time = date('H:i:s', strtotime($r['date']));
    echo "<!-- Debug: Pages -->\n";
    echo "  <url>\n";
    echo "    ".$link.l('home_sef')."/".$r['seftitle']."</loc>\n";
    echo "    <lastmod>".$art_date."T".$art_time."+00:00</lastmod>\n";
    echo "    <changefreq>monthly</changefreq>\n";
    echo "    <priority>.5</priority>\n";
    echo "  </url>\n";
}

$art_query = "SELECT * FROM ".db('prefix')."articles WHERE position = 1 AND published = '1'";
$query = $art_query." AND category = 0 ORDER BY date DESC";
$result = mysql_query($query);
while ($r = mysql_fetch_array($result)) {
    $art_date = date('Y-m-d', strtotime($r['date']));
    $art_time = date('H:i:s', strtotime($r['date']));
    echo "<!-- Debug: Uncategorized Article -->\n";
    echo "  <url>\n";
    echo "    ".$link.l('home_sef')."/".$r['seftitle']."</loc>\n";
    echo "    <lastmod>".$art_date."T".$art_time."+00:00</lastmod>\n";
    echo "    <changefreq>weekly</changefreq>\n";
    echo "    <priority>.8</priority>\n";
    echo "  </url>\n";
}


$cat_query = "SELECT * FROM ".db('prefix')."categories WHERE published = 'YES' ORDER BY catorder";
$cat_result = mysql_query($cat_query);
while ($c = mysql_fetch_array($cat_result)) {
    $catid = $c['id'];
    $query = $art_query." AND category = $catid ORDER BY id DESC";
    $result = mysql_query($query);
    echo "<!-- Debug: Category -->\n";
    echo "  <url>\n";
    echo "    <loc>".db('website').$c['seftitle']."</loc>\n";
    echo "    <changefreq>weekly</changefreq>\n";
    echo "    <priority>.8</priority>\n";
    echo "  </url>\n";
    // Each article in the category gets its own <url> element.
    while ($r = mysql_fetch_array($result)) {
        $art_date = date('Y-m-d', strtotime($r['date']));
        $art_time = date('H:i:s', strtotime($r['date']));
        echo "<!-- Debug: Category Article -->\n";
        echo "  <url>\n";
        echo "    ".$link.$c['seftitle'].'/'.$r['seftitle']."</loc>\n";
        echo "    <lastmod>".$art_date."T".$art_time."+00:00</lastmod>\n";
        echo "    <changefreq>weekly</changefreq>\n";
        echo "    <priority>.8</priority>\n";
        echo "  </url>\n";
    }
}
echo "</urlset>\n";
}
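
The function above repeats the same echo pattern for every entry. As a rough sketch only, that pattern could be folded into one helper that also escapes the URL for XML. The name sitemap_url_entry() and its parameters are my own invention, not part of sNews or the mod, and unlike the date() calls above it takes a Unix timestamp and formats it in UTC.

```php
<?php
// Hypothetical helper (not part of sNews or the mod): builds one <url> entry.
// $lastmod is a Unix timestamp or null; $priority is a number from 0.0 to 1.0.
function sitemap_url_entry($loc, $changefreq, $priority, $lastmod = null)
{
    $xml  = "  <url>\n";
    // Escape &, <, > and quotes so the URL is safe inside an XML element.
    $xml .= "    <loc>" . htmlspecialchars($loc, ENT_QUOTES, 'UTF-8') . "</loc>\n";
    if ($lastmod !== null) {
        // gmdate() formats in UTC, which matches the fixed +00:00 offset.
        $xml .= "    <lastmod>" . gmdate('Y-m-d\TH:i:s', $lastmod) . "+00:00</lastmod>\n";
    }
    $xml .= "    <changefreq>" . $changefreq . "</changefreq>\n";
    // Always print one decimal place, e.g. 0.8 or 1.0.
    $xml .= "    <priority>" . number_format($priority, 1, '.', '') . "</priority>\n";
    $xml .= "  </url>\n";
    return $xml;
}

echo sitemap_url_entry('http://example.com/news/a&b', 'weekly', 0.8, 0);
```

Each block in sitemapxml() would then shrink to a single echo of the helper's return value, with strtotime($r['date']) passed as the last argument for articles and pages.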

OK, now you need a way to call the function, so add "sitemap.xml" to the "$l['cat_listSEF'] =" line:

Quote
$l['cat_listSEF'] = $l['home_sef'].',archive,contact,sitemap,sitemap.xml,rss-articles,rss-pages,rss-comments,login,administration,admin_category,admin_article,article_new,extra_new,page_new,categories,articles,extra_contents,pages,settings,files,logout'; //SEF links of the hardcoded categories

Then flip on the switch in snews_startup(): right after "connect_to_db();", add the line "if (get_id('category') == "sitemap.xml") {sitemapxml(); die;}", like this:

Quote
   connect_to_db();
   if (get_id('category') == "sitemap.xml") {sitemapxml(); die;}
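
To illustrate what that check does, here is a small self-contained sketch. It assumes the "category" is simply the first segment of the request path. The function first_url_segment() is a hypothetical stand-in for get_id('category'), and the real sNews lookup may well work differently.

```php
<?php
// Hypothetical stand-in for get_id('category'): returns the first path segment.
function first_url_segment($path)
{
    // Drop a query string if there is one.
    $q = strpos($path, '?');
    if ($q !== false) {
        $path = substr($path, 0, $q);
    }
    // Remove leading and trailing slashes, then split on the rest.
    $path = trim($path, '/');
    if ($path === '') {
        return '';
    }
    $parts = explode('/', $path);
    return $parts[0];
}

foreach (array('/sitemap.xml', '/sitemap.xml/', '/archive', '/') as $request) {
    $target = first_url_segment($request) === 'sitemap.xml' ? 'XML sitemap' : 'normal page';
    echo $request . ' -> ' . $target . "\n";
}
```

Under that assumption, both /sitemap.xml and /sitemap.xml/ reach the sitemap, and every other request falls through to normal sNews handling.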

Now you can tell Google to read your XML sitemap at http://[yourdomain.com]/sitemap.xml

SNEWS MULTIUSER INSTRUCTIONS:

Same as above, but insert the following code into function getGetParm:
Quote
Right under the line:

         case 'category' :

Add:
         if ( $url[0] == 'sitemap.xml' ) {
            $parmValue = $url[0];
         } else





You can look at a generated sitemap here: http://snews.kendahlin.com/google_sitemap/sitemap.xml

* Modified 9/24/07 to include links to contact, sitemap, and archive pages thanks to Dom.
* Modified 9/10/07 to include <lastmod> information for articles and pages, <priority>, <changefreq>, and will use smstyle.xsl for style information if it exists in the root directory. (example smstyle.xsl attached to this message). Should now work with sNews MultiUser.
* Modified 9/08/07 to fix bug in sitemapxml() with categories that have more than one article.
* Modified 9/07/07 to validate XML thanks to Joost for the code and Sven for bringing it to my attention.
« Last Edit: August 01, 2008, 05:17:24 pm by Ken Dahlin »
Logged

sanyez

  • Newbie
  • *
  • Karma: 0
  • Posts: 2
Google Sitemap Mod
« Reply #1 on: August 11, 2007, 05:09:22 pm »

Works fine, thanks.

I've got some problems. I'll post about it later. I use v1.6.
Logged

Joost

  • Guest
Re: Google Sitemap Mod
« Reply #2 on: September 05, 2007, 02:37:37 pm »

If one of the moderators would 'touch' the first post of this mod to make the code readable, it would be appreciated.

Thanks, :)
Logged

Patric Ahlqvist

  • Nobodys perfect, but Im pretty effing close
  • ULTIMATE member
  • ******
  • Karma: 65
  • Posts: 4867
  • “I'm a self-made man and worships my creator.”
    • p-ahlqvist.com
Re: Google Sitemap Mod
« Reply #3 on: September 05, 2007, 02:39:28 pm »

What do you mean by "touch", Joost? Is the text too small, or what? I can read it, although it's too small, but that's the only flaw I can find...

Edit: quoted the code and changed font size..(somehow changing fontsize in a code bracket didn't work.)
« Last Edit: September 05, 2007, 02:52:13 pm by Patric Ahlqvist »
Logged
"It's only dead fish that goes with the flow... "
Updated

Joost

  • Guest
Re: Google Sitemap Mod
« Reply #4 on: September 05, 2007, 02:52:07 pm »

Quote
What do you mean by "touch", Joost? Is the text too small, or what? I can read it, although it's too small, but that's the only flaw I can find...

I mean the forum migration issue, which leaves code malformed. Saving the post again without modifying it will restore the code to the way it should be rendered.
Forum migration is especially hard on HTML and XML code: neither is displayed. Ken's mod has a lot of XML.
Thanks.

For those who need to know: the code is fully functional again.
« Last Edit: September 06, 2007, 02:58:56 am by Joost »
Logged

Vasile Rusnac

  • Newbie
  • *
  • Karma: 7
  • Posts: 49
Re: Google Sitemap Mod
« Reply #5 on: September 07, 2007, 11:36:30 pm »

Hi, a very useful mod from my point of view; thanks for sharing.
I have tested this mod with 1.6MU, and whenever I point my browser to "sitemap.xml" I am shown the front page.
I suppose this could be related to the SEF function overriding the sitemap.xml link, but I cannot figure out the problem.
I have also tested this mod on the normal 1.6 version, and it works there.
Logged

areyouami

  • Newbie
  • *
  • Karma: 3
  • Posts: 24
Re: Google Sitemap Mod
« Reply #6 on: September 08, 2007, 04:15:47 am »

I was having the same issue.

That was strange, but apparently it just works. I just didn't have the path in there correctly. I thought adding it to the .htaccess file would make it work, but it works already:

domain.com/sitemap.xml/
« Last Edit: September 08, 2007, 04:20:02 am by areyouami »
Logged

Ken Dahlin

  • Full Member
  • ***
  • Karma: 30
  • Posts: 139
    • http://www.kendahlin.com/
Re: Google Sitemap Mod
« Reply #7 on: September 08, 2007, 08:17:42 am »

Quote
Hi, a very useful mod from my point of view; thanks for sharing.
I have tested this mod with 1.6MU, and whenever I point my browser to "sitemap.xml" I am shown the front page.
I suppose this could be related to the SEF function overriding the sitemap.xml link, but I cannot figure out the problem.
I have also tested this mod on the normal 1.6 version, and it works there.

Just tested this with MU and it worked for me. If you want, upload me your snewsMU.php, minus your database information of course, and I'll compare it with mine to see if we can find your issue.
Logged

Ken Dahlin

  • Full Member
  • ***
  • Karma: 30
  • Posts: 139
    • http://www.kendahlin.com/
[BUGFIX] Google Sitemap Mod
« Reply #8 on: September 08, 2007, 04:35:53 pm »

I noticed a bug in sitemapxml() which has now been corrected in the original post. Basically I had <url> and </url> outside of one of the while loops:

Code: [Select]
    echo "  <url>\n";
    while ($r = mysql_fetch_array($result)) {
        echo "    ".$link.$c['seftitle'].'/'.$r['seftitle']."</loc>\n";
    }
    echo "  </url>\n";
}

This would cause Google to reject the sitemap in categories which had more than one article...

The correct code looks like this:

Code: [Select]
    while ($r = mysql_fetch_array($result)) {
        echo "  <url>\n";
        echo "    ".$link.$c['seftitle'].'/'.$r['seftitle']."</loc>\n";
        echo "  </url>\n";
    }
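
If you want to check a generated sitemap for this kind of mistake, a quick test along these lines might help. It is only an illustration: loc_counts_per_url() is a made-up name, and it relies on a simple regular expression instead of a real XML parser, so it assumes plain "<url>" tags without attributes.

```php
<?php
// Illustrative check (not part of the mod): counts the <loc> elements
// inside each <url>...</url> block of a sitemap string.
function loc_counts_per_url($xml)
{
    $counts = array();
    // Non-greedy match of each <url> block; the s flag lets . span newlines.
    preg_match_all('#<url>(.*?)</url>#s', $xml, $matches);
    foreach ($matches[1] as $block) {
        $counts[] = substr_count($block, '<loc>');
    }
    return $counts;
}

// The buggy output wrapped several <loc> elements in one <url>.
$broken = "<url><loc>a</loc><loc>b</loc></url>";
// The corrected output has one <loc> per <url>.
$fixed  = "<url><loc>a</loc></url><url><loc>b</loc></url>";

print_r(loc_counts_per_url($broken));
print_r(loc_counts_per_url($fixed));
```

Any count other than 1 points at a "<url>" element that a sitemap reader is likely to reject.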
Logged

Sven

  • ULTIMATE member
  • ******
  • Karma: 88
  • Posts: 2029
  • Chasing MY bugs!
    • hiseo.fr - rédacteur Web
Re: Google Sitemap Mod
« Reply #9 on: September 08, 2007, 06:17:51 pm »

Thanks a lot, Ken. :)

Ken Dahlin

  • Full Member
  • ***
  • Karma: 30
  • Posts: 139
    • http://www.kendahlin.com/
Re: Google Sitemap Mod
« Reply #10 on: September 10, 2007, 08:04:31 pm »

Original post has been modified. Mod now includes <lastmod> information for articles and pages, <priority>, <changefreq>, and will use sitemap.xsl for style information if it exists in the root directory. (example sitemap.xsl has been attached to the original topic).
Logged

Ken Dahlin

  • Full Member
  • ***
  • Karma: 30
  • Posts: 139
    • http://www.kendahlin.com/
Re: Google Sitemap Mod
« Reply #11 on: September 11, 2007, 05:36:02 am »

Quote
I have tested this mod with 1.6MU, and whenever I point my browser to "sitemap.xml" I am shown the front page.
I suppose this could be related to the SEF function overriding the sitemap.xml link, but I cannot figure out the problem.

Please check the updated original post, I think I solved this problem for sNews MU.
Logged

Vasile Rusnac

  • Newbie
  • *
  • Karma: 7
  • Posts: 49
Re: Google Sitemap Mod
« Reply #12 on: September 11, 2007, 08:52:57 pm »

Yeah Ken, indeed, now it works for the snews MU version. Great news  8)
Thank you for solving the issue, now I am ready to send it to google!
Logged

Vasile Rusnac

  • Newbie
  • *
  • Karma: 7
  • Posts: 49
Re: Google Sitemap Mod
« Reply #13 on: September 18, 2007, 11:44:56 am »

Have just tested the standalone googlesitemap script with both 1.6 and 1.6MU; no issues were encountered ;)
Nice job, and thank you for your work.
cheers!
Logged

codetwist

  • Hero Member
  • *****
  • Karma: 50
  • Posts: 940
Re: Google Sitemap Mod
« Reply #14 on: September 18, 2007, 08:10:44 pm »

Thx, comes handy ;)

Two comments:
1) In your package the short PHP opening tag (<?) is used; IMHO it's better to use the full form (<?php).
2) The XSL is a GPL-ed one, so the license should be supplied along with it, and the whole package should be GPL IMHO... and then if somebody integrates that into a bigger package, that's GPL again, doh... not that it's very important ;)
Logged