I published a Webflow site (empathconnection.com) and checked the “Exclude this page from site search results” option for several pages. The “auto generate sitemap” setting was turned on. We launched the site, submitted the sitemap to Google, and then ran the search “site:empathconnection.com”. The excluded pages were showing up in the results.
I then tried the suggested “noindex” code on those specific pages. The problem is that the pages are still listed in the sitemap, so anyone who knows how to crawl sitemaps (specifically the thieves who steal material that is meant to be downloadable only after purchase) can gain access to them.
I understand that all pages need to be in the sitemap so that when Google crawls them, it will see the “noindex” directive. Or do they?
Can anyone tell me if the answer to this problem is to:
Leave the noindex code on the individual pages
Submit a custom sitemap to Google that excludes the noindex pages
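To make the custom sitemap idea concrete, here is a minimal sketch of how one could be produced by filtering a copy of the auto-generated one. It assumes the sitemap follows the standard sitemaps.org format. The helper name `filter_sitemap` and the path `/downloads` are hypothetical, used only for illustration; this is not a Webflow feature.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def filter_sitemap(xml_text, excluded_paths):
    """Return sitemap XML with every <url> entry whose path is excluded removed.

    excluded_paths holds URL paths such as "/downloads" (hypothetical example).
    """
    # Normalise trailing slashes so "/downloads/" and "/downloads" match.
    excluded = {p.rstrip("/") or "/" for p in excluded_paths}

    # Keep the default namespace unprefixed when serialising the result.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(xml_text)

    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not loc.text:
            continue  # leave malformed entries untouched
        path = urlparse(loc.text.strip()).path.rstrip("/") or "/"
        if path in excluded:
            root.remove(url)

    return ET.tostring(root, encoding="unicode")


# Tiny stand-in for the auto-generated sitemap; the second URL is made up.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://empathconnection.com/</loc></url>
  <url><loc>https://empathconnection.com/downloads</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(filter_sitemap(SAMPLE, {"/downloads"}))
```

The filtered output would then be what gets submitted to Google in place of the auto-generated sitemap, while the noindex code stays on the pages themselves.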
ADDITIONAL QUESTION: What will happen if I add the Google Analytics tracking code to the header of a page that is “noindex”? I want to compare how many people are actually accessing those pages with the number of people I know paid for access.