
How To Exclude URLs, And A Possible Bug


6 replies to this topic

#1 travlang

    RAGE Newbie

  • RAGE Members
  • 1 posts

Posted 13 September 2006 - 08:45 AM

I am trying to exclude certain URLs from the map. I only want http://www.travlang.com URLs and nothing else.
I tried to exclude all URLs at dictionaries.travlang.com (and other subdomains) by using the filter:
do not add if the full URL contains dictionaries.travlang.com, or full URL starts with http://dictionaries.travlang.com, but that doesn't seem to work. I have several other subdomains to exclude.

Also, URLs whose extensions aren't in the default preferences are being added anyway, such as .au.

One other problem is that the sitemap program keeps running; I have to stop it after more than 24 hours just to generate a site map.
Please let me know the correct filter setup to exclude the subdomains, and whether anyone else is experiencing the perpetual run of the program.
Thanks
Howard

#2 RageSW

    Administrator

  • RAGE Admin
  • 2,082 posts

Posted 13 September 2006 - 07:37 PM

The extension settings in the preferences apply to your web pages, not to the top-level domain (i.e. .com, .net, etc.). Also, those URLs will still be added; they just will not be scanned for links. If you do not want them to be added, create a filter (such as: do not add the URL if the Full URL contains .au).

If you would like to create a filter to exclude a subdomain, create a new filter that does not add the URL if the Full URL contains 'subdomain.' (include the period, without the quotes). In your case you would put

dictionaries.

in your filter.

Your web site is taking a long time because of the number of links you have. Add more filters so pages that don't need to be in your sitemap are not added.
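To illustrate how a "do not add if the Full URL contains ..." filter behaves, here is a minimal sketch in Python. The filter strings and the helper function are assumptions for illustration only; Sitemap Automator's filters are configured in its interface, and this is not its actual code.

```python
# Minimal sketch of "do not add if the Full URL contains ..." filtering.
# The filter strings mirror the examples in this thread; the logic is
# illustrative only, not Sitemap Automator's actual implementation.

EXCLUDE_IF_CONTAINS = [
    "dictionaries.",  # skip dictionaries.travlang.com and similar subdomains
    ".au",            # skip URLs containing .au
]

def should_add(url: str) -> bool:
    """Return True only if no 'do not add' filter matches the URL."""
    return not any(fragment in url for fragment in EXCLUDE_IF_CONTAINS)

for url in [
    "http://www.travlang.com/languages/",
    "http://dictionaries.travlang.com/FrenchEnglish/",
    "http://www.travlang.com.au/",
]:
    print(url, "->", "add" if should_add(url) else "skip")
```

Note that a plain substring match like this will also skip any www.travlang.com page whose URL happens to contain ".au" somewhere in its path, which is the usual trade-off of "contains" filters.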

#3 stoptime

    RAGE User

  • RAGE Members
  • 6 posts

Posted 14 September 2006 - 11:02 AM

I'm also having similar issues. On a largish eCommerce site (osCommerce), the program seems to loop over the same files over and over, whether or not I exclude them. For example, I created a filter to exclude all files containing "cookie_usage.php", but instead of excluding that file the filter seems to act like a magnet for it: it pops up over and over again in the "Web Pages Found" dialog window, along with some other pages that I do want to include but that also keep reappearing. When I stop it, the program freezes on the Cleaning Up alert box.

#4 RageSW

    Administrator

  • RAGE Admin
  • 2,082 posts

Posted 15 September 2006 - 12:17 AM

Are you using the latest version? If not, download version 1.3 from http://www.ragesw.co...glesitemap.html

Does your web page use mod_rewrite or any kind of redirection? That may be the problem. Please let me know what your web site is and I will look into it and hopefully get it fixed.

#5 stoptime

    RAGE User

  • RAGE Members
  • 6 posts

Posted 15 September 2006 - 08:07 AM

RageSW said:
    Are you using the latest version? If not, download version 1.3 from http://www.ragesw.co...glesitemap.html

    Does your web page use mod_rewrite or any kind of redirection? That may be the problem. Please let me know what your web site is and I will look into it and hopefully get it fixed.


Yup, latest version 1.3 is installed. The site in question is: http://lingerieandmore4u.com/

The site's not using mod_rewrite or any redirection.

#6 RageSW

    Administrator

  • RAGE Admin
  • 2,082 posts

Posted 15 September 2006 - 05:16 PM

Ok, I believe the problem on your site is that there are a lot of redirects to the cookie_usage.php page.

I have fixed an issue with redirects that should help with the problems. However, you will still see a lot of cookie_usage.php pages show up as RAGE Google Sitemap Automator scans your web site. They will NOT, however, be added to your final sitemap. They are the result of all the pages that require a login on your web site.
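As a rough illustration of what is going on (a sketch in Python using the requests library; the URLs and the final filtering step are assumptions for this example, not the application's actual code): every login-protected page answers with a redirect to the same cookie page, so that one page keeps appearing during the scan, and it is simply dropped before the sitemap is written.

```python
# Rough sketch: many pages redirect to one login/cookie page, so it keeps
# appearing during the scan but is excluded from the final sitemap.
# Uses the requests library; the URLs below are illustrative only.
import requests

pages_to_scan = [
    "http://lingerieandmore4u.com/account.php",
    "http://lingerieandmore4u.com/checkout_shipping.php",
]

found = set()
for url in pages_to_scan:
    response = requests.get(url, allow_redirects=False, timeout=10)
    if response.status_code in (301, 302, 303, 307):
        # Every login-protected page lands on the same redirect target,
        # which is why it shows up again and again in "Web Pages Found".
        target = response.headers.get("Location", "")
        print("redirect:", url, "->", target)
        found.add(target)
    else:
        found.add(url)

# Drop the redirect target before writing the sitemap.
sitemap_urls = [u for u in found if "cookie_usage.php" not in u]
print(sitemap_urls)
```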

I have fixed the problem and have a version waiting for you to try out. Just send an email using the form at the URL below with your web page URL as the subject and I will send you a download link.

Your web site is very large, that's why it is taking a while to scan. Hope this helps,

#7 stoptime

    RAGE User

  • RAGE Members
  • 6 posts

Posted 10 October 2006 - 12:41 PM

Hi again - sorry, I've been away! Is that URL still available?



