🚨 Major Webflow CMS Security Vulnerability

Do you use Webflow’s CMS to hold any sensitive information or data? Don’t. Your data has likely been exposed and is showing up on search engine results.

Webflow does not protect files on their CDN used in the CMS. Anything you upload to the Webflow CMS is accessible to anyone.

From Webflow Support:

It sounds like there may have been some confusion around the Webflow CMS. While it is a great way to store dynamic content like blog posts, listings, etc., it’s not meant to be a restricted database to hold sensitive data. All the data in a published CMS is discoverable by search engines. Just to clarify, files that are uploaded through a form can be restricted so you can collect sensitive files from web visitors. It’s the file distribution (CMS) side of things that can’t be restricted currently.

Additionally, deleting the files from the CMS collection does not delete the files from the CDN, which means anything you upload stays on Webflow servers.

1 Like

Wow…

Wooow…

So, CMS was built and intended to be a content system and not an important database? Okay…

A security flaw is an unintended vulnerability. When a web page is published to a website and has no access control restrictions, it is publicly visible, as is the case on all platforms. Pages protected by a password are not accessible without the password. If you need to secure a page, then password protect it. I advise that if data is sensitive (financial, personal, or medical), it should not be published to Webflow’s CDN or pages.

As for assets persisting on the CDN after you delete them, this is by design, not an oversight. Think of scenarios where an asset is used across multiple pages and then deleted. Since the asset still exists on the CDN, no 404 errors arise. Deleting assets outright would also complicate restore operations: restores are fast because assets don’t need to be published back out to the CDN. I do think Webflow should have a switch in the asset manager to completely delete an asset (remove it from the CDN), but they don’t. As I understand it, customer support may have that ability when contacted.

So nothing you shared is a “Security Vulnerability.” You may just not have been aware of these limitations existing.

Mod edit: I added a ? to the title to change it from a false statement to a question.

4 Likes

Also, a web page showing up in search engines is hardly revelatory. You can easily block search engine crawlers and robots with a couple of lines of code, which is something you need to consider in any web project.
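For reference, the "couple of lines" is typically a robots.txt rule (or a noindex meta tag). Below is a minimal sketch, using Python's standard `urllib.robotparser`, of how a well-behaved crawler would interpret such a rule. The `/private/` path and the example.com URLs are hypothetical. Note that a robots.txt file only applies to the host that serves it, so a rule like this says nothing about files served from a separate CDN hostname.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on the site that serves the robots.txt above.
    for url in ("https://example.com/private/resume", "https://example.com/blog/post"):
        print(url, is_crawl_allowed(ROBOTS_TXT, url))
```

This only models what a compliant crawler is asked to do; it does not restrict access to anything.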

1 Like

@webdev the pages are password protected. Yes, your advice is the whole purpose of my post and why I’m sharing with the community.

Do not put sensitive data on Webflow pages, CMS, or CDN by any means.

Also, please do not edit my words. My topic is a statement, not a question. I’ve changed the word from flaw to vulnerability to appease your syntax, as this is a weakness that allows unauthorized users to cross privilege boundaries. The password protected data is still viewable and crawled on the CDN.

@Fonsume it is revelatory when the pages are in fact using a no index / no crawl code. You’ve missed the point entirely above. Webflow moves the data to a place that is exposed even with these precautions.

Are you saying Google etc. is indexing the pages even if you’ve excluded them in robots.txt?

I wonder if this is also the case when you have populated your CMS but have also left your CMS Collection Page empty (totally blank)?

@ChrisDrit Interesting question, my observation is that if a CMS page has not been ‘initialised’ (by which I mean has never been edited in the designer) - it doesn’t get published.
Test: Create CMS items and then visit the URL for a CMS item of that template page.
Expected result: 404 response error page.
However, once a template is edited, all CMS item pages become accessible. I don’t see how that affects CMS data aside from content that has been intentionally rendered into a page.
Whilst I fully agree that the CMS should not be used for sensitive data, the issue in question is about page content, not the CMS database itself.

1 Like

If a page is password protected then no actual content is loaded on that page until the password has been entered. The only content is the password form. This also applies to Collection Items when the collection template page has password protection set to on.

If assets are used on a password protected page or template, those assets are stored on the CDN like any other asset. If you had the URL to the asset, it would load in the browser regardless. Of course there is the issue of obscurity. Since the asset includes a GUID in the name, one would have to have a link to it to use/access/share it. Google can’t crawl the asset CDN (there is no index) and can only follow a link that exists on a page that is indexable. A password protected page is not indexable by any search bot.
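To put the obscurity point in numbers, here is a rough sketch of how large a random identifier space is. It assumes the identifier behaves like a random version-4 UUID (122 random bits) and that an attacker can try one billion URLs per second. Neither figure comes from Webflow, and the actual filename format may differ.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def guess_space(random_bits: int) -> int:
    """Number of distinct identifiers that random_bits of randomness can produce."""
    return 2 ** random_bits


def years_to_enumerate(random_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every identifier at a constant guessing rate."""
    return guess_space(random_bits) / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    bits = 122   # assumption: version-4 UUID style identifier
    rate = 1e9   # assumption: one billion guesses per second
    print(f"{guess_space(bits):.2e} possible identifiers")
    print(f"{years_to_enumerate(bits, rate):.2e} years to try them all")
```

Under these assumptions brute-force guessing is impractical, so in practice a file is found because its link was crawled or shared, not because the name was guessed.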

I am not following where you think an issue exists. If you need to password protect assets for your project then you can’t use Webflow’s CDN for that since it does not support it. There are plenty of CDNs you can use that do allow for access control of hosted objects.

Hi there. Thanks everyone here for the great conversation around uploading files to the CMS in a Webflow project.

I wanted to add in a little extra information to help confirm and clarify.

When a file (for example a PDF) is uploaded into a Webflow CMS item field, it is stored on AWS (Amazon Web Services) and a unique filename is generated that also links the file to the project.

These files are not restricted, so if you have the direct link then they can be loaded in a web browser. The files could be discoverable, however it is unlikely unless the unique filename (GUID) can be guessed. So it has to be linked or shared somewhere for someone to find the file, and also for Google (or other search engines) to be able to crawl and index the file it has to be linked somewhere on a publicly viewable web page.

If a CMS template page links to the file, or if the file is linked elsewhere on a static page (or another CMS template page) by using a CMS collection list, then Google can crawl the page and index any links it follows. If the page is password protected then the link would not be found on the page, as the page would return a password protected response and the content would not be loaded. But if the page is not password protected until later, then Google may have already crawled and indexed the content, including any links to files on the page. In this instance the actual file link is still able to be loaded, so Google will continue to index that file. The link to the uploaded file is not password protected, only the page itself.

To avoid indexing a link, an option is to set a rel="nofollow" attribute on any links that go to the PDF file. For more information on rel=nofollow please see https://support.google.com/webmasters/answer/96569?hl=en.

This can be set as a custom attribute on links in the project settings.

However please note that this is a suggestion to Google not to follow the link to the PDF file, and that Google may still index if it crawls the file from another link elsewhere. Also it will not remove PDFs that are currently indexed on Google search.
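To check which links on a page still lack the attribute, something like the following could be run over the exported or fetched HTML. This is a minimal sketch using Python's standard `html.parser`; the class and function names and the cdn.example.com URLs are hypothetical, and it only inspects `<a>` tags whose `href` ends in `.pdf`.

```python
from html.parser import HTMLParser
from typing import List


class PdfLinkAuditor(HTMLParser):
    """Collects hrefs of <a> tags that point at a .pdf and lack rel="nofollow"."""

    def __init__(self) -> None:
        super().__init__()
        self.missing_nofollow: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        # rel can hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        # Ignore any query string when checking the file extension.
        is_pdf = href.lower().split("?")[0].endswith(".pdf")
        if is_pdf and "nofollow" not in rel_tokens:
            self.missing_nofollow.append(href)


def find_pdf_links_missing_nofollow(html: str) -> List[str]:
    """Return the PDF link targets in html that do not carry rel="nofollow"."""
    auditor = PdfLinkAuditor()
    auditor.feed(html)
    auditor.close()
    return auditor.missing_nofollow


# Hypothetical page fragment for illustration.
SAMPLE_HTML = """
<a href="https://cdn.example.com/abc123_resume.pdf">Resume</a>
<a href="https://cdn.example.com/abc123_terms.pdf" rel="nofollow noopener">Terms</a>
<a href="/about">About</a>
"""

if __name__ == "__main__":
    print(find_pdf_links_missing_nofollow(SAMPLE_HTML))
```

As noted above, adding the attribute is only a hint to search engines and does not remove files that are already indexed.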

Another option for uploading content to a project, if it needs to be restricted, is to upload the file through a form file upload and toggle on the restrict file upload access setting (see “Form file upload” on Webflow University). These files would not be accessible for search engines to index or for others to view without logged-in access. Note that these files cannot then be used on CMS pages or static pages, as the links are restricted and require being logged in.

I hope that helps. If you have any specific issues please contact Webflow support at Webflow Customer Support | Webflow University and we can help further.

2 Likes
  1. Webflow should absolutely offer the ability to delete files off the CDN. I spent the last 21 days trying to have files removed from the servers and was passed on to 3 different people, one of whom said they couldn’t do anything to help me.

  2. I’m still trying to get Webflow to submit a reindex request to Google. Webflow Support says they cannot do it and keeps telling me to do it myself through Google Analytics, which is impossible since I don’t own any webflow.com servers (i.e. global-uploads.webflow.com or uploads-ssl.webflow.com).

  3. This should be made clear as a disclaimer somewhere in plain sight as a preventative measure to protect people. Even going through the terms, it’s pretty vague… I imagine there are a few other people unaware of this.

The files could be discoverable, however it is unlikely unless the unique filename (GUID) can be guessed.

The files are actually very easily discoverable. The file hosted on the CDN is returned in the first 3 results when searching for sensitive data contained in a PDF, e.g. an email address. For example, if my resume is uploaded and I search for my email address on Google, my resume (the file hosted on the Webflow CDN) is served up as the 3rd result. This is true for other known keywords, e.g. a website or a phone number.

An additional note on the penultimate paragraph above: anyone using Zapier with Webflow forms needs the restrict file option toggled off, so this solution will not work for them.

These two points are indeed pretty bad.

My client is running into the exact same problem right now. We’re using Zapier to automate a bunch of file upload flows and now user resumes are surfacing on Google search as one of the top results when you search their names. This needs to be fixed asap @david_T as there is sensitive data floating around everywhere right now with no option to bulk delete the files from our side. Already messaged support but they’re not helping out — they just told me to delete hundreds of files submitted through forms by myself inside of file settings.

Hi Ian,

Just wanted to know if you ever got your files removed off the CDN. I’m in a pretty urgent situation and wondering how I could accelerate this process with Webflow support. I’ve been in email threads for over a month with the same issue trying to sort this out for my client. Webflow Support has been acting the same for me — passing me on to 3 different people, the most recent one telling me they’ll provide me an update “at the earliest moment” (it’s been 2 weeks since the last response). Let me know if you can, thanks.

Eric,

Sorry to hear that. Webflow did eventually remove them after nearly 1 month of repeated requests. So, yes, it can be done. Webflow still says they cannot submit a reindex request to Google.

Good luck.

If you don’t publish CMS data to a template page or collection list then how would one access it?