Prevent Subdomain and Relative Path indexing on exported sites


#1

This isn't Webflow-specific, but I figured someone in the community would know.

I have a few static exported Webflow sites hosted on a server, but the relative path is being indexed and showing up in search engines.

I just want the domain "websitename.com" to be indexed. I don't want the subdomain "websitename.serverdomain.com" to be indexed nor do I want the relative server path "http://serverdomain.ca/websitename.com/" to be indexed.

I know how to block a whole folder from being indexed via robots.txt, but not the relative path. For the subdomain, I figure I would just forward it to the domain or handle removing it from Google via Webmaster Tools.

Thoughts are appreciated.


#2

Hey @KProServices you should be able to prevent your webflow.io subdomain sites from being indexed by following the directions available here:
http://help.webflow.com/article/how-to-prevent-search-engines-from-indexing-your-webflow-site

As far as preventing relative server paths from being indexed, you may find this article helpful: http://webmasters.stackexchange.com/questions/2956/how-to-stop-certain-urls-from-being-indexed

Please let me know if this helps, if not, I'm happy to look into it further. :smile:


#3

So @Waldo if I used:

User-agent: *
Disallow: /path/with-trailing-slash/

replacing "/path/with-trailing-slash/" with "/websitename" it would prevent indexing of that path and any content in that folder but still allow the main domain to be indexed? Like I said, I want to prevent the relative path from being indexed but ensure the domain and all related pages/content are still being indexed properly.
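
One way to sanity-check a rule like this before deploying it is to run it through Python's standard-library robots.txt parser. This is only a rough sketch: it assumes the rule lives in the robots.txt served at the root of serverdomain.ca and that the folder is named /websitename.com/ as in the first post. The file names index.html and the /other/ folder are made up for illustration. The parser only matches URL paths against the rules. Crawlers read robots.txt separately for each host, so a rule in serverdomain.ca's file should not affect websitename.com itself.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for http://serverdomain.ca/robots.txt
# (the folder name is taken from the first post; adjust to the real path).
rules = """\
User-agent: *
Disallow: /websitename.com/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(url):
    """Return True if a generic crawler may fetch this URL under the rules."""
    return parser.can_fetch("*", url)


# A page inside the site's folder on the server domain (hypothetical file name).
blocked_path = allowed("http://serverdomain.ca/websitename.com/index.html")

# A page in some other folder on the same server (hypothetical folder).
other_path = allowed("http://serverdomain.ca/other/index.html")

print("folder path allowed:", blocked_path)
print("other path allowed:", other_path)
```

If the first check comes back as allowed, the Disallow path probably does not match the folder name exactly as it appears in the URL.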


#4

Hey @KProServices can you please show me an example of where you're seeing the relative path being indexed? When we prevent subdomains from being indexed, it prevents the full subdomain and all of its content from being indexed. You may need to perform an Un-Publish & fresh Publish of your site to resolve this. (It may take a few days for web crawlers to catch up.)

Please let me know if this solves the issue, if not, I'm happy to help further :smiley:


#5

Hi @KProServices, updating the robots.txt for the site should work. Take a peek at this question on Stack Overflow: http://stackoverflow.com/questions/7609031/robots-txt-allow-sub-folder-but-not-the-parent

I hope this helps!


#6

Thanks @Waldo and @cyberdave - haven't had a chance to look at this yet but I will tonight and report back. Much appreciated.


#7