
Author Topic: Send correct 404 response headers  (Read 9954 times)

henno

  • Newbie
  • *
  • Karma: 0
  • Posts: 4
Send correct 404 response headers
« on: January 07, 2007, 06:58:06 AM »

EDIT: This code is crap. Do not use until I fix it. Read comments below

This is just a quick note for those who don't like sending HTTP responses of "200 OK" for pages that really should be "404 not found".

sNews by default takes over 404 handling and shows a "Content Not Found" message (see the language variable 'not_found'). This is fine, but by then sNews has already sent the browser (or search engine) a response header of "200 OK", which is less than ideal. Here is my very quick hack to get the response headers working correctly:

Firstly, there are two references to the not_found variable in snews.php, both in the center() function. Search for those; they will look something like this:

Code:
if (!$result_articles) {echo '<h2>'.l('not_found').'</h2>'; break;}

and

Code:
if (!$result || !$numrows) {echo '<h2>'.l('not_found').'</h2>';}

Ideally, you would simply add the correct headers there, but as the original "200 OK" header has already been sent to the browser by this stage, it is too late. So I get around this by setting a new variable, changing the above lines to these (respectively):

Code:
if (!$result_articles) {$set_http_header="404"; echo '<h2>'.l('not_found').'</h2>'; break;}

and

Code:
if (!$result || !$numrows) {$set_http_header="404"; echo '<h2>'.l('not_found').'</h2>';}

Then, in your index.php, right at the top just below include ("snews.php"); add the following:

Code:
if ($set_http_header="404") {
    header("HTTP/1.1 404 Not Found");
}

I will most likely change the above code to handle other response headers and wrap it in a function of its own (such as get_headers(); or something), but for now it works and that is all that matters.

Henno.
« Last Edit: February 23, 2009, 01:48:06 PM by Joost »

Patric Ahlqvist

  • Nobodys perfect, but Im pretty effing close
  • ULTIMATE member
  • ******
  • Karma: 65
  • Posts: 4867
  • “I'm a self-made man and worships my creator.”
    • p-ahlqvist.com
« Reply #1 on: January 07, 2007, 09:56:18 AM »

Found this out as well, when going through Google's webmaster tools... I couldn't do anything about it, so thanks Henno... You'll post the next solution here as well...?
"It's only dead fish that goes with the flow... "

henno

« Reply #2 on: January 07, 2007, 10:10:22 AM »

If I make any changes or improvements, or get around to constructing a better way to achieve the same result, I will be sure to post the code here.

Patric Ahlqvist

« Reply #3 on: January 08, 2007, 01:17:35 PM »

Hmm, this makes it impossible to validate the page... So I'm removing it... for now.

This:

Quote
if ($set_http_header="404") {
    header("HTTP/1.1 404 Not Found");
}

makes the validation service get a 404, so it can't validate the page...

henno

« Reply #4 on: January 08, 2007, 03:02:07 PM »

Hmm... I must have been drunk when I wrote that code, because I sure wasn't thinking.

Firstly, there is a typo: the condition uses = (assignment) where it should use == (comparison), so it is always true and every page is sent a 404. And secondly, as far as I can tell, it is currently impossible to set the headers from the db result in that fashion, given the way the function is called. If I set the headers first, it is too early, as the db query has not yet been run in function center(). If I set the headers after the db query, then it is too late, and the '200 OK' response headers have already been sent.

This could be easily resolved by another db query in a separate function (get_headers() or something), but that is just plain nasty and makes you wonder if it is actually worth another trip to the DB.

I think I need to have a cup of coffee and think about this one some more.

Henno.
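
One possible way out of the too-early/too-late bind, as a rough sketch and not a tested sNews patch: buffer the page output so nothing reaches the browser until the status is known. render_center() and build_response() are made-up names standing in for the real center() call, and the $found flag stands in for the db result; only ob_start(), ob_get_clean(), headers_sent() and header() are actual PHP functions.

```php
<?php
// Hypothetical stand-in for the sNews center() function: it prints the
// page body and returns the HTTP status that matches what it printed.
function render_center($found) {
    if (!$found) {
        echo '<h2>Content Not Found</h2>';
        return 404;
    }
    echo '<h2>Article title</h2>';
    return 200;
}

// Capture the body in an output buffer, so no output (and therefore no
// implicit "200 OK") is sent before the status is known.
function build_response($found) {
    ob_start();
    $status = render_center($found);
    $body = ob_get_clean();
    return array($status, $body);
}

// Assumption: in a real page, $found would come from the db query.
list($status, $body) = build_response(false);
if ($status === 404 && !headers_sent()) {
    header('HTTP/1.1 404 Not Found');
}
echo $body;
```

The same idea should also work by simply calling ob_start() at the very top of index.php, since header() can then still be called from inside center() as long as the buffer has not been flushed. No second trip to the DB is needed either way.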

Sven

  • ULTIMATE member
  • ******
  • Karma: 88
  • Posts: 2029
  • Chasing MY bugs!
    • hiseo.fr - rédacteur Web
« Reply #5 on: July 27, 2007, 02:37:25 PM »

:(
I thought I would find an answer here about managing 404s with sNews 1.5, since I got some issues with Google when trying to remove old pages.
Google said a 404 is needed. :rolleyes:

Joost

  • Guest
« Reply #6 on: July 27, 2007, 05:55:01 PM »

@Sven

Perhaps this works for you.
Google gets the appropriate 404 (no redirect). Visitors might (on rare occasions) still get the page with the same name even when the category is wrong, like this:
domain/wrong-category/same-titled-page/ (this happens when a page is moved to a different category). They still get a 404 header. Personally I don't mind: Google gets what it wants and so does the visitor.
ognennyjstorm made another suggestion: sending a 404 and a blank page. It relies on IE, which has its own error pages.

Sven

« Reply #7 on: July 28, 2007, 10:35:56 AM »

Thanks a lot Joost.
I don't get why Google refused to delete those old urls (created 4 years ago).
I'm going to try this solution.

Have a nice day.
Sven

Joost

« Reply #8 on: July 28, 2007, 08:30:44 PM »

@Sven

Consider using '410' for stubborn search engines. 410 means: gone forever, bugger off!!!!
It might work (not tested).
Insert in .htaccess:

Quote
Redirect gone /virtual/path/to/deleted/file

Make a 410 error file and reference it in .htaccess (with the right path and name):

Quote
ErrorDocument 410 /gone.html

I guess you have to put it before the rewrite rules.
Read about '410'
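
If you would rather not touch .htaccess, roughly the same thing could be done from PHP at the top of index.php. This is only a sketch under assumptions: $gone_paths and status_line_for() are hypothetical names (not part of sNews), and the paths in the list are placeholders.

```php
<?php
// Hypothetical list of paths that were deleted for good (placeholders).
$gone_paths = array('/old-category/old-page/', '/another-removed-page/');

// Return the status line to send for a request path,
// or null when the request should be handled as usual.
function status_line_for($path, $gone_paths) {
    if (in_array($path, $gone_paths, true)) {
        return 'HTTP/1.1 410 Gone';
    }
    return null;
}

// REQUEST_URI is not set on the command line, so fall back to '/'.
$request_path = isset($_SERVER['REQUEST_URI'])
    ? parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH)
    : '/';

$status_line = status_line_for($request_path, $gone_paths);
if ($status_line !== null && !headers_sent()) {
    header($status_line);
    echo '<h2>This page has been removed.</h2>';
}
```

On a real site you would normally stop the script right after the 410 message, so the rest of the page is not rendered.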

Sven

« Reply #9 on: July 30, 2007, 10:23:52 AM »

That's working really fine, Joost!
A BIG thanks to you, Dude! :cool:

EDIT:
It seems that 410 is not supported when accessing through HTTP/1.0 (without a Host header)
(see: http://www.w3.org/Protocols/rfc1945/rfc1945)

Joost

« Reply #10 on: July 30, 2007, 05:45:23 PM »

HTTP/1.1 has been around since 1999. Let's hope Google developers have noticed this ;)
Unfortunately there is nothing we can do about badly designed web crawlers. The only solution is banning or killing these creatures!

Sven

« Reply #11 on: July 30, 2007, 06:31:00 PM »

:lol:
Let me introduce my 410  ;)

Thanks again Joost. You helped me a lot, Dude.