SEO - Duplicate, submitted URL not selected as canonical

Google is not indexing almost all my pages.

It says the reason is: Duplicate, submitted URL not selected as canonical
Status: Excluded

They are not duplicates. Not even close.

It also appears that Webflow is somehow adding extra internal links on every page.

For instance my home page has 2,363 internal links from only 194 pages. WTH. I definitely didn’t add an average of 12 links back to the home page on any page much less on every page.

When I try to “View Page Source” to see if I can figure out what is happening, it won’t even load the full page.

Anyone have thoughts on a fix? Are these issues related as I suspect? Or is something else going on with the not indexing?

Hi @techincommon, thanks for your post.

I checked the site pages and there are not that many links on the published pages, so most likely Google is duplicating the links in its index.

Webflow does not add any links to a page automatically; they have to be set in the Designer UI by the designer. I do see a lot of links to the home page from other pages on the site. It looks like you are using symbols on each collection detail page, and the nav symbol has two links per collection item pointing at the home page (check out the Events page, for example). So there are at least 900 links from collections alone, but I would have to look at each page to count them.

There may be other static links on other pages, but not 2,363 actually published.
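To check the count on a single page yourself, a minimal Python sketch like the one below may help. It assumes you have already saved the published page's HTML to a string; the function name `count_home_links` and the example.com URL are made up for illustration and are not anything Webflow provides.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class HomeLinkCounter(HTMLParser):
    """Counts <a href> tags that resolve to the home page of the same host."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        # Resolve relative links such as "/" against the page's own URL.
        target = urlparse(urljoin(self.page_url, href))
        if target.netloc == self.host and target.path in ("", "/"):
            self.count += 1


def count_home_links(html, page_url):
    """Return how many links in `html` point at the site's home page."""
    parser = HomeLinkCounter(page_url)
    parser.feed(html)
    return parser.count


# Hypothetical nav with a logo link and a "Home" link, as in the symbol described above.
sample = '<nav><a href="/">Logo</a><a href="/">Home</a><a href="/events">Events</a></nav>'
print(count_home_links(sample, "https://www.example.com/events/new-year-event"))
```

Running this over each published page and summing the results would give a number to compare against what Search Console reports.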

One issue I do see is that the WWW domain has not been set as the default domain in Webflow (see here: https://share.getcloudapp.com/WnuE9lRP). Set the WWW domain as the default canonical domain and then republish the site.

Having no canonical domain set may cause Google to index the root domain and then the www domain as well, making it appear as if there were duplicate links on the page when the duplication is really just in the index.
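One rough way to confirm which host a page declares as canonical is to look for a `<link rel="canonical">` tag in its HTML. The sketch below is only an illustration: it assumes the page HTML is already in a string and that the page emits such a tag at all, which depends on the site settings. `canonical_host` is a made-up helper name.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attrs.get("href")


def canonical_host(html):
    """Return the host named by the canonical tag, or None if there is no tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    return urlparse(finder.canonical).netloc


# Hypothetical page head; example.com stands in for the real domain.
page = '<head><link rel="canonical" href="https://www.example.com/events"/></head>'
print(canonical_host(page))
```

If the root and www versions of the same page report different hosts, or no canonical at all, that would be consistent with the default domain not being set.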

After republishing the site, resubmit the sitemap in Google Search Console.

For reference, see how to Connect a custom domain in Webflow. It seems that the step of setting the WWW domain as the default may have been missed when the domain was originally added in Webflow.

I hope this helps.

In general, it is hard to solve this without full access to the Google Search Console data.

The Excluded tab does not always represent an error or mistake. Sometimes you really do want to exclude pages, and this tab helps you monitor them.

Some reasons for exclusion (full Search Console Help docs on this topic here):

  • Excluded by ‘noindex’ tag
  • Blocked by page removal tool
  • Blocked by robots.txt
  • Alternate page with proper canonical tag
  • Duplicate without user-selected canonical
  • Not found (404)
  • Page with redirect
  • Duplicate, submitted URL not selected as canonical

Each point here is an issue in itself.

On Webflow, empty CMS pages that you block from indexing end up under the “Excluded” tab, because there is no way yet to exclude pages from the auto-generated sitemap. The scenario:

I am the Google robot: I see the page “/events/new-year-event” in your sitemap, but you block this page (by robots.txt, for example).
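To find pages caught in that scenario, a small Python sketch can compare the sitemap against robots.txt. It assumes you have both files' contents as strings (downloading them is left out), and `blocked_sitemap_urls` is a hypothetical helper name. It only covers robots.txt blocks, not noindex tags or password protection.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="Googlebot"):
    """Return sitemap URLs that robots.txt disallows for `user_agent`."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not robots.can_fetch(user_agent, url)]


# Hypothetical inputs mirroring the scenario above.
sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/events</loc></url>"
    "<url><loc>https://www.example.com/events/new-year-event</loc></url>"
    "</urlset>"
)
robots = "User-agent: *\nDisallow: /events/new-year-event\n"
print(blocked_sitemap_urls(sitemap, robots))
```

Any URL this returns is one that is submitted in the sitemap yet blocked from crawling, so it is expected to show up as excluded.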

This is a very minor issue, by the way. In any case, you can set unused CMS items to draft. One way I solved this in the past was by adding password protection to the empty CMS collection page (just to clean up the Google report).

SEO

In general, is your ranking OK? Do users find your site?