djalmabright, posted September 17, 2011:

No matter what I try, I can't get my blog's XML sitemap accepted by Google: http://earthmaps.heliohost.org/sitemap.xml

The map is created by the XML-Sitemaps WP plugin - it's large, and the initial sitemap file references. This file is fully accessible to the world, and my robots.txt file has the following information:

    User-Agent: *
    Allow: /
    Sitemap : http://earthmaps.heliohost.org/sitemap.xml

I've validated this XML file with two different validators and no errors were found. And yet I consistently receive the following message when I submit the sitemap:

"General HTTP error: HTTP 403 error (Forbidden). We encountered an error while trying to access your Sitemap. Please ensure your Sitemap follows our guidelines and can be accessed at the location you provided and then resubmit."

Can anyone tell me what I'm doing wrong? Or is my server (http://earthmaps.heliohost.org) blocking Googlebot?

Thanks.
Byron, posted September 17, 2011:

I'm getting a 403 error when I try accessing this page: earthmaps.heliohost.org/sitemap.xml

Check your root htaccess file for anything that would be blocking everybody but you.
jje, posted September 18, 2011:

Meanwhile I can see the sitemap.xml fine?
Tjoene, posted September 18, 2011:

jje wrote: "Meanwhile I can see the sitemap.xml fine?"

Same here.
Byron, posted September 18, 2011:

I'm still seeing a 403. Even my Header Tool is showing a 403 error. Can you post what you have in your root htaccess file?
Tjoene, posted September 18, 2011:

Byron wrote: "I'm still seeing a 403. Even my Header Tool is showing a 403 error. Can you post what you have in your root htaccess file?"

Now this is odd. I can see the XML file without problems, but when I check it using Byron's tool, it gives a 403 error.
claus, posted September 18, 2011:

Delete your robots.txt - it's invalid. There is no Allow directive, because allowing access is already the default behaviour.

Depending on which tool you use to create your sitemap, the format may well be invalid. Google is a little picky about this one. Assuming you are on some version of Windows, try the free Xenu Link Sleuth. It also verifies your links and outputs other useful information. When asked whether you want a report, click YES, then CANCEL on the FTP form, to load the report in your default browser. I keep a link on my site at http://www.peterheinrichclaus.uk.tc/resources/#Xenu

Well, okay, I should have known. You are using XML Sitemaps 1.6. Whatever it was, either the htaccess or the malformed robots.txt, it appears you have it fixed. At least, when I click the root link from the sitemap, some UFO blog opens.

You seem to like bizarre relations, don't you? What exactly is the connection between UFOs, Vikings, the UK, and whatever other unrelated stuff you place there? Oh, and consider using at least some margin and padding. It really helps.

Regards,
Peter Heinrich Claus
Byron, posted September 18, 2011:

His robots.txt is invalid, but that wouldn't produce a 403 error. Neither my browser nor my header tool checks for a robots.txt.

This is the correct code to allow all robots:

    User-agent: *
    Disallow:

Reference: http://www.robotstxt.org/robotstxt.html

Or just leave it blank.
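The point that robots.txt cannot be behind the 403 can be checked offline. The sketch below is a minimal illustration, assuming Python's standard-library `urllib.robotparser` as a stand-in for a real crawler's parser (Google's own parser may differ in details). It feeds in both the minimal file and the original poster's file as text, with no network access, and asks whether a crawler calling itself "Googlebot" may fetch the sitemap URL. The helper name `allows` is made up for this example.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_URL = "http://earthmaps.heliohost.org/sitemap.xml"

# The minimal "allow everything" file suggested in the post above.
MINIMAL_ROBOTS = """\
User-agent: *
Disallow:
"""

# The original poster's file, kept as posted (including "Sitemap :").
ORIGINAL_ROBOTS = """\
User-Agent: *
Allow: /
Sitemap : http://earthmaps.heliohost.org/sitemap.xml
"""


def allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text offline and ask whether user_agent may fetch url.

    Hypothetical helper for illustration; it only reflects how the
    standard-library parser reads the rules, not any specific crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


minimal_ok = allows(MINIMAL_ROBOTS, "Googlebot", SITEMAP_URL)
original_ok = allows(ORIGINAL_ROBOTS, "Googlebot", SITEMAP_URL)
print("minimal file allows sitemap:", minimal_ok)
print("original file allows sitemap:", original_ok)
```

Under this parser both files permit the fetch, which fits the argument in the thread: robots.txt is advice that well-behaved crawlers read, while a 403 is something the web server itself sends back.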
djalmabright (author), posted September 19, 2011:

Thanks for the reply. I already changed the robots.txt, but I don't know if something in my .htaccess is blocking the crawling. Please look at its content:

    # -FrontPage-

    IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

    <Limit GET POST>
    order deny,allow
    deny from all
    allow from all
    </Limit>
    <Limit PUT DELETE>
    order deny,allow
    deny from all
    </Limit>
    AuthName earthmaps.heliohost.org
    AuthUserFile /home1/donovan/public_html/_vti_pvt/service.pwd
    AuthGroupFile /home1/donovan/public_html/_vti_pvt/service.grp

    <Files ~ ".xml">
    Order allow,deny
    Deny from all
    Satisfy All
    </Files>

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteCond /home1/donovan/public_html/wp-content/sitemaps%{REQUEST_URI} -f
    RewriteRule \.xml(\.gz)?$ /wp-content/sitemaps%{REQUEST_URI} [L]
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress

I also moved sitemap.xml from the root directory to http://earthmaps.heliohost.org/wp-content/sitemaps/ and resubmitted it to Google Webmaster Tools, but now I am getting a 404 error (Not Found).

Regards,
Djalma Bina.
Byron, posted September 19, 2011:

This is what is causing the 403 error. It says to deny access from everyone to any file whose name matches .xml. Remove it and the 403 error will go away.

    <Files ~ ".xml">
    Order allow,deny
    Deny from all
    Satisfy All
    </Files>

The 404 may be because the server is down right now.
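To see which file names that `<Files ~ ".xml">` block catches, here is a small sketch. It assumes that the `~` form treats the quoted string as an unanchored regular expression tested against the file name, and it uses Python's `re` module as an approximation of Apache's regex engine (the two agree on patterns this simple). The function name `files_block_applies`, the sample file names, and the tighter `\.xml$` pattern are illustrative, not taken from the thread.

```python
import re

# The pattern exactly as it appears in the posted .htaccess: <Files ~ ".xml">
POSTED_PATTERN = r".xml"
# A tighter, hypothetical alternative: a literal dot, anchored at the end.
ANCHORED_PATTERN = r"\.xml$"


def files_block_applies(pattern: str, filename: str) -> bool:
    """Approximate <Files ~ "pattern">: unanchored regex search on the file name."""
    return re.search(pattern, filename) is not None


# Sample file names chosen for illustration only.
for name in ("sitemap.xml", "sitemap.xml.gz", "myxml.php", "index.php"):
    posted = files_block_applies(POSTED_PATTERN, name)
    anchored = files_block_applies(ANCHORED_PATTERN, name)
    print(f"{name:16} posted={posted!s:5} anchored={anchored}")
```

Because the dot in the posted pattern matches any character and nothing anchors it, the block covers sitemap.xml and also any other name that merely contains "xml" after at least one character. Whatever the pattern, everything it matches is then refused by the `Deny from all` inside the block.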
claus, posted September 22, 2011:

Well, wherever you moved your sitemap, if it is not in the root, a proper 404 ought to pop up. Instead, I get your page all right, though without any styles. Refer to the attached screenshot.

Somebody mentioned the 404 may be caused by the server being down. ??Which planet do you live on?? If a server is down, it doesn't send anything. The only visible output in any such case is a "Server not found" message in your browser window.

I agree about the error in the .htaccess regarding Deny from all. This is used for high-security areas, where you first block the whole world and then add a filter to allow only the IPs you want to grant access to whatever contents are there.

Did it ever occur to you to temporarily disable that odd .htaccess, if just for testing, to see what happens without it? Just rename it to something like .htaccess-- to prevent the server from parsing it. Then create a fresh copy, adding one piece at a time. As soon as you run into trouble, at least you can pinpoint it exactly. Take it from there to address the issue in question.

Another thing is that Google really doesn't care where your sitemap lives on the server, as long as its syntax is correct.

Regards,
Peter Heinrich Claus
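The "block the whole world, then let specific IPs in" pattern and the blocks in the posted .htaccess all rest on how `Order` combines `Allow` and `Deny`. The sketch below is a simplified model of the Apache 2.2-style rules, covering only IP addresses, networks and the keyword `all` (no hostnames or environment variables). The function names and the addresses from the documentation ranges are assumptions made up for this example.

```python
import ipaddress
from typing import Sequence


def _matches(client: str, rules: Sequence[str]) -> bool:
    """True if the client IP matches any rule ('all', an address, or a network)."""
    address = ipaddress.ip_address(client)
    for rule in rules:
        if rule == "all":
            return True
        if address in ipaddress.ip_network(rule, strict=False):
            return True
    return False


def access_allowed(order: str, allow: Sequence[str], deny: Sequence[str], client: str) -> bool:
    """Simplified model of Order/Allow/Deny evaluation (hypothetical helper)."""
    allowed = _matches(client, allow)
    denied = _matches(client, deny)
    if order == "deny,allow":
        # Deny is checked first, Allow last: an Allow match wins,
        # and clients matching neither list are let in by default.
        return allowed or not denied
    if order == "allow,deny":
        # Allow is checked first, Deny last: a Deny match wins,
        # and clients matching neither list are refused by default.
        return allowed and not denied
    raise ValueError(f"unknown order: {order!r}")


visitor = "198.51.100.1"  # documentation-range address, stands in for any visitor

# <Limit GET POST> from the posted file: deny all, then allow all.
print(access_allowed("deny,allow", allow=["all"], deny=["all"], client=visitor))
# <Files ~ ".xml"> from the posted file: allow,deny with only "Deny from all".
print(access_allowed("allow,deny", allow=[], deny=["all"], client=visitor))
# High-security pattern: deny everyone except one trusted address.
print(access_allowed("deny,allow", allow=["203.0.113.7"], deny=["all"], client="203.0.113.7"))
print(access_allowed("deny,allow", allow=["203.0.113.7"], deny=["all"], client=visitor))
```

In this model the `<Limit GET POST>` block ends up letting everyone in, while the `.xml` block refuses everyone, because under `Order allow,deny` a request needs an Allow match and the block has none.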
Byron, posted September 22, 2011:

claus wrote: "Somebody mentioned, the 404 may be caused by the server being down. ??Which planet do you live on?? If a server is down, it doesn't send anything."

Evidently not on the same planet you live on. When the server is down and returning a Queued page for all accounts, the pages in my sub-directories return a 404 for me. And at the time of my post, that was the case.
claus, posted September 22, 2011:

Whatever gimmicks you may have configured to handle HTTP requests is one thing. However, any such setup must be considered bending the rules, aka standards. So we both have a point. A stand-alone machine on its own dedicated IP returns nothing if it is down, while anything on a shared IP may well return foo, depending on the server admin's tweaks.

However, none of this is really helping the poor guy to solve his screwed .htaccess, or server, or whatever. And, as I have posted above, the screenshot is proof enough that there is something utterly wrong with the configuration: either the somewhat exotic interpretation of an .htaccess, or else I would place a bet on some sort of server-side mismatch.

At least in the UK we have a saying, something about too many cooks spoiling the soup...
jje, posted September 22, 2011:

Here is the reason why the Queued page is seen when the server goes down.

Most servers have a "default webpage", which is displayed when a domain points to the server but the server has no record of that domain and therefore cannot find the appropriate web page to load. djbob changed this default webpage to his own Account Queued page, so it appears if a user has set up a domain and pointed it to HelioHost but the account has not been activated and configured yet.

However, in the event that the server goes down, the server cannot find any record of the domain in its register (since all accounts are down) and therefore falls back to the default webpage, which is the Queued page.
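That fallback can be pictured with a toy lookup. This is only an illustration of the explanation above, not HelioHost's or Apache's actual implementation: each site is modelled as a plain dictionary from path to page body, and the function name `serve`, the page contents, and the sitemap file name under /wp-content/sitemaps/ are all assumptions made up for the example.

```python
from typing import Dict, Tuple

Site = Dict[str, str]  # maps a request path to a page body

# The default site that answers for any domain the server has no record of.
DEFAULT_SITE: Site = {"/": "Account Queued"}


def serve(host: str, path: str, accounts: Dict[str, Site]) -> Tuple[int, str]:
    """Toy model: pick the site for host, falling back to the default site."""
    site = accounts.get(host, DEFAULT_SITE)
    if path in site:
        return 200, site[path]
    return 404, "Not Found"


host = "earthmaps.heliohost.org"
sitemap_path = "/wp-content/sitemaps/sitemap.xml"  # hypothetical file name

# Normal operation: the account is registered and its files are found.
accounts_up = {host: {"/": "blog front page", sitemap_path: "<urlset/>"}}
print(serve(host, "/", accounts_up))
print(serve(host, sitemap_path, accounts_up))

# Accounts unavailable: every domain falls through to the default site.
accounts_down: Dict[str, Site] = {}
print(serve(host, "/", accounts_down))
print(serve(host, sitemap_path, accounts_down))
```

In this model the front page turns into the Queued page while the accounts are unavailable, and any deeper path answers 404 because the default site has no such file. That would be consistent with the 404s on sub-directory pages reported earlier in the thread, though the thread itself does not confirm the mechanism.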
claus, posted September 22, 2011:

Which applies only if any such thing is explicitly configured. Try setting up Apache using the defaults. Create a host entry in the hosts file. Now, if Apache is running and directory listing is turned on, the default page is "Index of". End of communication. Take Apache down, return to the host, and the only thing you get is "Server not found". End of communication.

Don't try fancy stuff, at least not with Apache. Chances are it's creating conflicts with individual users' settings. Any such thing is the cause of all evil in web hosting. What seems to be working just nicely on your own little dev box does not necessarily lead to the same results in the real world.

Nothing personal, but some of you probably just have too much time between high school classes, so keep going; happy bug-hunting.

EOF