
How To Hide PDFs from Google

Hey, before you proceed, I just want to make sure you know that I use affiliate links on this website and in the email newsletter. I assume you already know what affiliate links are, but if you don't, you can get more info here. Now that's out of the way:

Onwards...

Lead magnets are a great way to grow your email list and PDF lead magnets can take many forms: Checklists, guides, workbooks, white papers or even the first chapter of your new e-book.

Are PDFs bad for SEO?

Did you know that Google and other search engines can easily crawl and index your PDF files?

It’s not widely known, but Google has been indexing PDF files since 2001, and it can use Optical Character Recognition (OCR) to extract text from scanned PDFs. To date, Google has crawled and indexed hundreds of millions of PDFs.


And because your PDFs are likely to contain keywords which are relevant to your audience, your PDFs can show up in search results even when people aren’t looking for them.

Your PDF could even outrank your landing page in search results. That’s because the PDF will likely have much more useful information in it than the info on your landing page.

So Google thinks “the PDF is more valuable so it must show up higher in results”.

And sometimes Google also shows PDFs in search snippets which appear above all other search results.


Things get even worse if the PDF is a paid product like an e-book. This means that people are getting access to what you are selling, for free!

Ouch!

So if you don’t protect your lead magnets from the prying eyes of search engines, people can download your lead magnet without providing you their email address.

And you could end up losing email subscribers and sales.


The bottom line is: PDFs can be really bad for your site’s SEO performance.

Today You'll Learn
  • Why most methods used to hide PDFs don’t work
  • How to properly hide lead magnet PDFs (and any other files like mp3, avi, mov, mp4 etc)

What most marketers do

If you're like most online marketers, you've simply uploaded your lead magnet to your site and didn’t really bother hiding it.

Because you figured, who can find them without knowing the exact URL, right?

And to be fair, you did nofollow all links leading to the download page and to the actual PDF itself, so it should be ok, right?

Well, that’s a no.

The problem is that it’s really easy to find PDF files.

How to search for PDFs

Google indexes and displays PDF files in search results because your PDF contains keywords that Google might find useful in search results. So, finding unprotected PDFs on the internet is super easy.

Just head over to Google’s advanced search, enter the domain name of the website and select file type as PDF.

Voila, now you can find all PDF files on any website.

OR if you don’t want to restrict yourself to PDFs from a particular site, you can even search the entire web using a query like this:

filetype:pdf how to get a student pilot license


Try it right now and see how many of your PDFs are visible to anyone willing to run a simple Google search.
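If you check several sites regularly, the search above can be sketched as a small Python helper. This is just an illustration: the function name pdf_search_url and the domain example.com are made up for the example, while site: and filetype: are the standard Google search operators.

```python
from urllib.parse import urlencode


def pdf_search_url(site: str = "", keywords: str = "") -> str:
    """Build a Google search URL that only returns PDF files.

    `site` and `keywords` are both optional; the helper name and
    parameters are hypothetical, chosen for this example only.
    """
    terms = []
    if site:
        terms.append(f"site:{site}")  # restrict results to one domain
    terms.append("filetype:pdf")      # restrict results to PDF files
    if keywords:
        terms.append(keywords)
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


# All PDFs Google knows about on one (placeholder) domain
print(pdf_search_url(site="example.com"))

# PDFs from the whole web matching a phrase
print(pdf_search_url(keywords="how to get a student pilot license"))
```

Paste either printed URL into your browser to run the search.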

You can also use sites like PDF Search Engine and PDF Drive to search for PDF files on the web.

Spy on your competition

Ok, so this is not totally and completely 100% ethical, but you can also run such a search for your competitors and find out what their PDFs contain.


Smart marketers will do this to get a competitive advantage.


(And hey, if your competitors are smart, they might have read your PDFs already!)


I often get asked the question:

What if the PDFs aren’t used for lead generation and aren’t a paid product?

Should I still hide them?

Short answer: yes

I understand that sometimes you’ll want to give away a PDF file to your website visitors without requiring anything from them.

But you should still hide your PDFs because they are likely to contain more or less similar keywords to other parts of your website.

If you don’t, Google may treat the content in your PDFs as duplicate content, and the PDFs can end up competing with your own pages in search results.

But you also can’t edit your lead magnet PDFs and remove duplicate content without ruining them completely.

So, what’s the solution?

Hide Your Lead Magnet PDFs

Before we talk about the proper way to hide your files, let’s take a look at some of the common solutions found elsewhere on the internet:

Send as attachment

You set up your lead magnet or purchased PDF to be delivered through your email marketing service (like ActiveCampaign, which I use and recommend) by attaching the PDF to the email your subscribers receive.

This way you completely avoid hosting the file on your server.

The problems with this approach are:

  1. Email services like Gmail and Yahoo have file size limits, so if you have a large PDF document, it can’t be delivered as an attachment.
  2. It’s not what smart marketers do. Smart marketers get new subscribers to come back to their website to download the PDF. This has many advantages.

Third party hosting

You could host your PDFs on a third-party service like Dropbox and simply provide the link in your email or on the lead magnet download page.

And no, that doesn’t work either.

See, unless you password protect your lead magnets (or the folders they are in), Google will still index them. So the file will likely benefit from the higher domain ranking of Dropbox and feature even more prominently in the search results.

That would be more harmful than just leaving the PDFs on your server.

Don’t do it!

Password protect PDFs on your website

This will prevent Google from indexing your PDFs. And that’s great.

But of course, Google has already indexed the PDF files on your website and there’s no telling when Google will update their index to reflect this. It could take weeks or even months.

But that’s not the main issue.

The main issue is that everyone who downloads the PDF file will need to enter a password every time they want to open the file, sometimes months or years after they download the file.

Do you think they’ll be able to remember the password?

And if people are unable to access the file at a later date, that will leave a bad impression of you and your website. They might never return.

Here’s why:

See, when I sign up on a mailing list to download a lead magnet, I pretty much know what to expect.

  1. I sign up on the list and am automatically redirected to a page which contains content like: “Thank you for signing up, now check your inbox for the email which contains the link to the download file.”
  2. I check my inbox and find the email. When I click the link in it, I’m sent to a page from where I can download the lead magnet.

As a digital marketer, I’m sure you must have been through this process so often that you don’t even think about the steps anymore. You just do it automatically.

I don’t even read the instructions anymore because they are just so standard.

Guess what? It’s the same for your website visitors.

They don’t expect the lead magnets to be password protected. So even when you provide them with the password, they are likely not to find it later.

Nofollow and noindex every link leading to the PDF

Sure, you could add a noindex meta tag to the lead magnet download page and also nofollow all the links to that page.

But that doesn’t actually noindex the PDF file itself and Google still might be able to index it.

So that doesn’t work either but we’re getting closer to a solution that works …

Use robots.txt or .htaccess

This is the solution that actually works. But first, it’s disclaimer time:

If you’re already tech savvy and are comfortable using robots.txt or .htaccess then you’ll find these methods quite easy.


If you’re not tech savvy, please don’t make these changes on your own. If you make a mistake somewhere, it can cause big problems.


I suggest you contact your hosting company and send them a link to this page and they’ll do this properly.

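These methods work because they tell search engine bots to stay away from your PDF files, without the need to password protect the PDFs.

You can use either the robots.txt method or the .htaccess method. The robots.txt method stops bots from crawling the files, while the .htaccess method tells them not to index the files.

Be careful about using both methods together, though. A bot that is blocked by robots.txt never fetches the PDF, so it never sees the noindex instruction from .htaccess. If Google has already indexed your PDFs, use the .htaccess method on its own until they drop out of the index.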

Method one: edit robots.txt

Adding the following code to your robots.txt file will block bots from crawling PDFs across your site. The * and $ wildcards are understood by Google and Bing, but some other bots may ignore them:

User-agent: *
Disallow: /*.pdf$
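To get a feel for what a wildcard rule covers, here is a simplified Python sketch of the matching behaviour Google documents for robots.txt, where * stands for any run of characters and a trailing $ means "the URL must end here". It is my own approximation for illustration, not Google's actual code, and the function name rule_matches is made up. It uses the pattern /*.pdf$, which is the form Google shows in its robots.txt documentation.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style robots.txt matching for one Disallow pattern.

    Simplified sketch: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and everything else matches literally
    from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


for path in ["/files/guide.pdf", "/files/guide.pdf?download=1", "/blog/post.html"]:
    print(path, rule_matches("/*.pdf$", path))
```

Notice that with the trailing $, a PDF URL that carries a query string is not covered by the rule. If your download links use query strings, the folder method below is the simpler option.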

If you only want to block PDF files in a specific folder, you can block access to the entire folder using the code below.
* This assumes your PDFs are in a folder named "pdfs". If your PDF files are in another folder, you'll need to replace "pdfs" with the name of your folder.

User-agent: *
Disallow: /pdfs/
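If you want to double-check a folder rule before uploading it, Python's built-in urllib.robotparser can parse the rules and tell you whether a URL is blocked. This is a minimal sketch: the domain example.com, the file names and the helper is_crawlable are placeholders I made up. Also keep in mind that this parser does simple prefix matching, so it suits the folder rule but not wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# The same folder rule as above, held in a string for testing
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in `robots_txt` let `user_agent` crawl `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A PDF inside the blocked folder, then a normal page outside it
print(is_crawlable(ROBOTS_TXT, "https://example.com/pdfs/checklist.pdf"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/blog/"))
```

The first check should come back False (blocked) and the second True (still crawlable), which is what you want: the PDFs are off limits but the rest of your site is not.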

Method two: edit .htaccess

This method will noindex the PDF files themselves. This means that Google will get the instruction to not index the PDF files on your website.

Google will also drop the PDFs from its index the next time it recrawls them, because Google takes noindex instructions seriously.

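Add the following code to your .htaccess file (usually located in the public_html folder). It needs Apache's mod_headers module to be enabled on your server:

<IfModule mod_headers.c>
  <FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
  </FilesMatch>
</IfModule>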
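Once the .htaccess change is live, it's worth confirming that your server really sends the header. Here's a rough Python sketch of how that check could look. The function names has_noindex and fetch_x_robots_tag are my own, and the URL in the usage note is a placeholder you'd swap for one of your real PDFs.

```python
import urllib.request


def has_noindex(header_values) -> bool:
    """Return True if any X-Robots-Tag value tells bots not to index.

    Simplified: handles plain values like "noindex, nofollow" (and "none",
    which means the same thing), but not bot-specific values such as
    "googlebot: noindex".
    """
    for value in header_values:
        directives = [part.strip().lower() for part in value.split(",")]
        if "noindex" in directives or "none" in directives:
            return True
    return False


def fetch_x_robots_tag(url: str):
    """Send a HEAD request and return every X-Robots-Tag header value."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get_all("X-Robots-Tag") or []


# The check itself needs no network access
print(has_noindex(["noindex, nofollow"]))
```

To test a live file, you'd call has_noindex(fetch_x_robots_tag("https://example.com/pdfs/checklist.pdf")) with your own URL. If it returns False, the header isn't being sent, and mod_headers or the .htaccess location is the first thing to check.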

And that’s it

Take a deep breath and relax. It’s done.

There’s no need to inform Google or any other search engines of the change. They will catch on fairly quickly.

If you have any questions: Please leave a comment below

About the Author

I'm obsessed with building profitable online funnels.
I created this blog to teach digital marketers the skills, tools & techniques needed to generate evergreen revenue from their online business.
Blogs I like: ActiveCampaign, Ahrefs, Backlinko, Thrive Themes.
