Author Topic: sNews and Duplicate Content  (Read 13216 times)

Joost

  • Guest
sNews and Duplicate Content
« Reply #15 on: April 24, 2007, 10:59:25 PM »

@ki11
I don't see any reason for Google to work this way; it would not benefit the search-engine user. Why would they degrade quality, relevant content?
Logged

4dd1ct

  • Newbie
  • *
  • Karma: 0
  • Posts: 7
sNews and Duplicate Content
« Reply #16 on: April 25, 2007, 11:48:38 PM »

Duplicate content is bad all round since it breaks the one document per URI model of the web (which is what search engines expect, and after all, is the way the web was intended to work). I can go into why this is harmful to both users and search engines in more depth, if requested.

AFAIK sNews currently has two major problems that create duplicate content in a way that is harmful.

There is no validation of categories, so ...

domain/category-SEF/article-SEF

domain/any-characters/article-SEF

...result in the same article, but different URIs. Try http://www.solucija.com/ghjghj/new-free-template-internet-corporation/
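For illustration only, here is a minimal sketch (in Python, not sNews's own PHP) of what validating the category could look like. The lookup table, the `news` category name and the `resolve` function are all made-up assumptions, not sNews code. The idea is simply that each article has one real category, and any other category segment gets a permanent redirect to the one true URI instead of a second copy of the page.

```python
# Hypothetical data: article slug -> the single category slug it belongs to.
# The "news" category is an assumption for this example only.
ARTICLE_CATEGORY = {
    "new-free-template-internet-corporation": "news",
}


def resolve(path):
    """Return (status, canonical_path) for a path like /category/article/."""
    parts = [p for p in path.split("/") if p]
    if len(parts) != 2:
        return 404, None
    category, article = parts
    real_category = ARTICLE_CATEGORY.get(article)
    if real_category is None:
        # Unknown article: there is nothing to show under any category.
        return 404, None
    canonical = f"/{real_category}/{article}/"
    if category != real_category:
        # Right article, wrong category: point at the canonical URI.
        return 301, canonical
    return 200, canonical
```

With this table, `/ghjghj/new-free-template-internet-corporation/` would come back as a 301 to the `/news/...` address rather than as a duplicate page.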

The other issue is the broken missing article code, so any deleted, renamed or mistyped URI results in the accidental creation of a valid URI (discussed already on these forums - http://www.solucija.com/forum/viewtopic.php?id=3822). Try http://www.solucija.com/ghjghj/fghfhghgfhghg/
Logged

Joost

  • Guest
sNews and Duplicate Content
« Reply #17 on: April 26, 2007, 01:26:50 AM »

Quote from: 4dd1ct
Duplicate content is bad all round since it breaks the one document per URI model of the web (which is what search engines expect, and after all, is the way the web was intended to work). I can go into why this is harmful to both users and search engines in more depth, if requested.

AFAIK sNews currently has two major problems that create duplicate content in a way that is harmful.
OK, sNews is bad for the Internet :( , but how harmful are near duplicates for the sNews user? And yes, I am looking forward to an in-depth explanation, if you would like to make the effort.
Quote from: 4dd1ct
There is no validation of categories, so ...

domain/category-SEF/article-SEF

domain/any-characters/article-SEF

...result in the same article, but different URIs. Try http://www.solucija.com/ghjghj/new-free-template-internet-corporation/
No well designed webcrawler has ever looked for http://www.solucija.com/ghjghj/new-free-template-internet-corporation/,  until today. :D
Quote from: 4dd1ct
The other issue is the broken missing article code, so any deleted, renamed or mistyped URI results in the accidental creation of a valid URI (discussed already on these forums - http://www.solucija.com/forum/viewtopic.php?id=3822). Try http://www.solucija.com/ghjghj/fghfhghgfhghg/
There is a mod available for this issue, you can find it here. It doesn't look like the mod is finished.
The basic idea is to send a 404 status header (as an ErrorDocument 404 page would).
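A rough sketch of that idea, written as a tiny Python WSGI app rather than as the actual sNews mod (which I have not seen finished). The `KNOWN_PATHS` set and the `app` function are hypothetical. The point is only that an unknown address should answer with a real "404 Not Found" status, not a normal page with a 200 status.

```python
# Hypothetical set of addresses that really exist on the site.
KNOWN_PATHS = {"/", "/news/new-free-template-internet-corporation/"}


def app(environ, start_response):
    """Minimal WSGI app: 200 for known paths, a real 404 for everything else."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path in KNOWN_PATHS:
        start_response("200 OK", headers)
        return [b"article body\n"]
    # Deleted, renamed or mistyped address: tell crawlers it does not exist.
    start_response("404 Not Found", headers)
    return [b"Sorry, no such article.\n"]
```

It can be tried locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library.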

Regards,

Joost
Logged

quaffapint

  • Newbie
  • *
  • Karma: 0
  • Posts: 14
sNews and Duplicate Content
« Reply #18 on: April 26, 2007, 01:32:18 AM »

I have had pages of mine end up in the supplemental index due to the very reasons described. The robot effectively says, "I can get to it via main\page6, so why do I also need to list the-real-article-link/?" Taking some of the actions ki11 mentioned would probably be a good idea, just to be sure the 'real' page link gets into the index and does not end up in the supplemental index.
Logged

codetwist

  • Hero Member
  • *****
  • Karma: 50
  • Posts: 940
sNews and Duplicate Content
« Reply #19 on: April 26, 2007, 01:21:10 PM »

Quote from: Joost
...
No well designed webcrawler has ever looked for http://www.solucija.com/ghjghj/new-free-template-internet-corporation/,  until today. :D
...
There is a mod available for this issue, you can find it here. It doesn't look like the mod is finished.
The basic idea is to send a 404 status header (as an ErrorDocument 404 page would).
...
Well ... it's more of a problem when applying/writing mods - it's easy to create a crappy (as in complete nonsense) URI that will still allow access to the article. And everything will look dandy to the user in this case ;)

As for that 404 page - it's still not finished, so there is no mod yet.
Logged

iatbm

  • Sr. Member
  • ****
  • Karma: 0
  • Posts: 251
    • Public domain photos
sNews and Duplicate Content
« Reply #20 on: April 26, 2007, 02:04:29 PM »

There are no duplicate content issues with sNews. I can confirm that from running 20+ sites with sNews ... it is all fine.

codetwist

  • Hero Member
  • *****
  • Karma: 50
  • Posts: 940
sNews and Duplicate Content
« Reply #21 on: April 26, 2007, 02:23:10 PM »

From the posts in this thread I'd say that duplicate content could be an issue only if the site is set up in such a way that almost exactly the same stuff is shown on different URIs. But this definitely isn't a problem with the sNews code, just a not-so-good site configuration.

Loose category handling is a slightly different story, but it still doesn't qualify as a bug that breaks things. And of course, this isn't a problem for those who mod their sNews code anyway ;)

And that 404 - yet another feature request in the queue.
Logged

Joost

  • Guest
sNews and Duplicate Content
« Reply #22 on: April 26, 2007, 02:35:43 PM »

Quote from: codetwist
Quote from: Joost
...
No well designed webcrawler has ever looked for http://www.solucija.com/ghjghj/new-free-template-internet-corporation/,  until today. :D
...
There is a mod available for this issue, you can find it here. It doesn't look like the mod is finished.
The basic idea is to send a 404 status header (as an ErrorDocument 404 page would).
...
Well ... it's more of a problem when applying/writing mods - it's easy to create a crappy (as in complete nonsense) URI that will still allow access to the article. And everything will look dandy to the user in this case ;)

As for that 404 page - it's still not finished, so there is no mod yet.
Nice way of quoting, codetwist. You can make me say anything this way. :/

Anyway, ki11 started a very interesting discussion here, which should take place somewhere else on the forum (that is what I think) and more often. It started as an SEO issue, 'the one document per URI model of the web' was mentioned, and now we are talking about implementing Hypertext Transfer Protocol -- HTTP/1.1.
Logged

codetwist

  • Hero Member
  • *****
  • Karma: 50
  • Posts: 940
sNews and Duplicate Content
« Reply #23 on: April 26, 2007, 05:32:49 PM »

Quote from: Joost
Quote from: 4dd1ct
Duplicate content is bad all round since it breaks the one document per URI model of the web (which is what search engines expect, and after all, is the way the web was intended to work). I can go into why this is harmful to both users and search engines in more depth, if requested.

AFAIK sNews currently has two major problems that create duplicate content in a way that is harmful.
OK, sNews is bad for the Internet :( , but how harmful are near duplicates for the sNews user? And yes, I am looking forward to an in-depth explanation, if you would like to make the effort.
Quote from: 4dd1ct
There is no validation of categories, so ...

domain/category-SEF/article-SEF

domain/any-characters/article-SEF

...result in the same article, but different URIs. Try http://www.solucija.com/ghjghj/new-free-template-internet-corporation/
No well designed webcrawler has ever looked for http://www.solucija.com/ghjghj/new-free-template-internet-corporation/,  until today. :D
Quote from: 4dd1ct
The other issue is the broken missing article code, so any deleted, renamed or mistyped URI results in the accidental creation of a valid URI (discussed already on these forums - http://www.solucija.com/forum/viewtopic.php?id=3822). Try http://www.solucija.com/ghjghj/fghfhghgfhghg/
There is a mod available for this issue, you can find it here. It doesn't look like the mod is finished.
The basic idea is to send a 404 status header (as an ErrorDocument 404 page would).

Regards,

Joost
Well ... it's more of a problem when applying/writing mods - it's easy to create a crappy (as in complete nonsense) URI that will still allow access to the article. And everything will look dandy to the user in this case ;)

As for that 404 page - it's still not finished, so there is no mod yet.

P.S. OK, Joost, here is the full quote. I thought I hadn't changed the meaning, sorry. I only hope that the quoted post is still the same; I'm not checking that.
Logged

Joost

  • Guest
sNews and Duplicate Content
« Reply #24 on: April 26, 2007, 05:48:13 PM »

Very considerate of you, codetwist. It was not such a big deal, but I thought these two lines together could be misinterpreted. So I used the :/ icon and not the :mad: icon.

Regards
Logged

piXelatedEmpire

  • MIA
  • ULTIMATE member
  • ******
  • Karma: 37
  • Posts: 1401
  • currently MIA
sNews and Duplicate Content
« Reply #25 on: April 27, 2007, 02:02:30 AM »

Quote from: Joost
Quote from: codetwist
Quote from: Joost
...
No well designed webcrawler has ever looked for http://www.solucija.com/ghjghj/new-free-template-internet-corporation/,  until today. :D
...
There is a mod available for this issue, you can find it here. It doesn't look like the mod is finished.
The basic idea is to send a 404 status header (as an ErrorDocument 404 page would).
...
Well ... it's more of a problem when applying/writing mods - it's easy to create a crappy (as in complete nonsense) URI that will still allow access to the article. And everything will look dandy to the user in this case ;)

As for that 404 page - it's still not finished, so there is no mod yet.
Nice way of quoting, codetwist. You can make me say anything this way. :/
Actually, this way of quoting is much cleaner, as you can edit out anything that is irrelevant and keep post sizes smaller.

Now, back on topic lads :D
Logged
my apologies to the sNews crew, but I will be MIA for the foreseeable future

Joost

  • Guest
sNews and Duplicate Content
« Reply #26 on: April 27, 2007, 02:22:06 AM »

Quote from: piXelatedEmpire
Quote from: Joost
Quote from: codetwist
Well ... it's more of a problem when applying/writing mods - it's easy to create a crappy (as in complete nonsense) URI that will still allow access to the article. And everything will look dandy to the user in this case ;)

As for that 404 page - it's still not finished, so there is no mod yet.
Nice way of quoting, codetwist. You can make me say anything this way. :/
Actually, this way of quoting is much cleaner, as you can edit out anything that is irrelevant and keep post sizes smaller.

Now, back on topic lads :D
OK :P
Logged

piXelatedEmpire

  • MIA
  • ULTIMATE member
  • ******
  • Karma: 37
  • Posts: 1401
  • currently MIA
sNews and Duplicate Content
« Reply #27 on: May 02, 2007, 02:33:32 AM »

Guys, a heads up... this issue is being addressed in the next version of sNews.  Stay tuned!  :cool:
Logged
my apologies to the sNews crew, but I will be MIA for the foreseeable future

Joost

  • Guest
sNews and Duplicate Content
« Reply #28 on: May 02, 2007, 02:57:31 AM »

Quote from: piXelatedEmpire
Guys, a heads up... this issue is being addressed in the next version of sNews.  Stay tuned!  :cool:
Yes, less quoting = less duplicated content :lol:  :lol:  :lol:
Logged