Delete Pages From Google Index


This topic contains 25 replies and was last updated by David James.

  • #6005

    Sara Grant
    Member

    How do you properly delete pages from the Google index?

    I have set these unwanted pages to noindex (using the “X-Robots-Tag: noindex” HTTP header).
    Then I removed them with the “Remove URLs” tool in Search Console.
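
    For illustration, on an Apache server that header can be set roughly like this (the file name is just a placeholder, not my actual setup):

        # .htaccess — send a noindex directive for matching files (needs mod_headers)
        <Files "unwanted-page.html">
            Header set X-Robots-Tag "noindex"
        </Files>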

    These pages disappeared from Google for a while, but now they have come back (and I really don’t want them there).

    What am I missing?

  • #6006

    Jenette Bush
    Member

    P.S. These pages are not blocked in robots.txt

  • #6007

    Giacomo Pieri
    Member

    Patience.

    • #6008

      William Souter
      Member

      Are you saying they will disappear eventually? Have you experienced something like this before?

    • #6009

      Giacomo Pieri
      Member

      Since you have noindexed them, they won’t be indexed anymore. Try blocking them in robots.txt as well, so bots don’t reach them for crawling anymore.

      As for removing the cached version from Google, that totally depends on Google’s mood. It can take a few days or maybe weeks.
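
      To be concrete, the robots.txt change I mean is just a Disallow rule, something like this (the path is a placeholder):

        # robots.txt — keep all well-behaved crawlers away from these paths
        User-agent: *
        Disallow: /unwanted-section/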

    • #6010

      Alexandra Gorman
      Member

      Noindexing a page and blocking it with robots.txt are two different paths; it depends on the goal. Do you want to stop indexation or crawling? Because if you block it in robots.txt, even when the page is set to follow, the crawler won’t reach it to read the robots directive, so the link-equity flow breaks (in case the page matters to you from a link-equity perspective).
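
      To illustrate: a plain noindex keeps a page out of the index while still letting the crawler reach it and follow its links (follow is the default; it’s spelled out here only for emphasis):

        <!-- in the page <head>: stay out of the index, but keep passing link equity -->
        <meta name="robots" content="noindex, follow">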

    • #6011

      William Souter
      Member

      Are you sure you’re understanding his requirements? Why would he need link equity when he is going to noindex it anyway?

    • #6012

      Richard Fisher
      Member

      Btw, you have just rephrased what I said, lol.

    • #6013

      Syed Shah
      Member

      Link equity does “flow”: we don’t apply a nofollow directive just because we applied noindex. Again, indexation and crawling are two different things.

    • #6014

      Siobhan Newman
      Member

      I didn’t rephrase what you said. You recommended blocking crawling on top of indexation with robots.txt, which I’m not recommending, since we are only talking about indexation, not crawling as well.

    • #6015

      Christopher Hammond
      Member

      We’re not talking about anything, bro. If you have a better solution, go for it, but instead of rephrasing my solution, post your own.

    • #6016

      Ryan Richardson
      Member

      Forget about my typos.. they won’t hurt this “SEO” conversation… not as much as “nofollow” will, anyway.

    • #6017

      Simon Chapman
      Member

      But before posting your solution, make sure you have understood his requirements! He wants the pages to disappear completely from search engines, and he wants it fast!

    • #6018

      Giacomo Pieri
      Member

      You still insist on “rephrasing”? Lol, okay, I give up.. it’s clear that indexation and crawling are the same thing to you. Okay, keep nofollowing everything you noindex just because Google didn’t take it out of the index as fast as you wanted.

    • #6019

      Nadine Cairns
      Member

      I am clearly telling him that since he has already noindexed the page, it should be removed soon.

      But if he wants to speed up the process, he can try blocking robots from reaching the page as well.

      Noindexing and blocking a robot from reaching the page are surely two different things.

      He doesn’t care about any link juice or equity, so blocking, noindexing, or any other technique is fine for him.

  • #6020

    Christopher Hammond
    Member

    Recheck your noindex meta tags.

    • #6021

      Elizabeth Allen
      Member

      I’m using the HTTP header “X-Robots-Tag: noindex” instead of meta tags, but yes, it’s there.

  • #6022

    Christopher Hammond
    Member

    You can also make these pages return an HTTP 410 response.
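
    On Apache, for instance, that can be done with a one-liner in .htaccess (the path is a placeholder):

        # .htaccess — answer requests for this path with 410 Gone (mod_alias)
        Redirect gone /unwanted-page.html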

    • #6023

      Alexandra Gorman
      Member

      Can’t, these documents are necessary. I just don’t want them indexed in Google.

    • #6024

      Geoffrey Claughton
      Member

      Make sure the noindex, nofollow meta tags are properly placed in the <head>, and that these pages are blocked in robots.txt.

    • #6025

      Dawn Cotton
      Member

      I only set noindex, without nofollow. Should I also add nofollow?

  • #6026

    Michal Kahn
    Member

    Yea, a 410 in .htaccess does it for me.

    • #6027

      Nicola Hawkins
      Member

      Can’t, these documents are necessary. I just don’t want them indexed in Google.

  • #6028

    Alisdair James
    Member

    You could always change the URLs and serve a 410 in their place; if you don’t want them in search, I’d presume you don’t want the links anywhere else either. Changing the URLs won’t harm the site in any way, and a 410 in the .htaccess will keep Google away for good — it’s the fastest way to remove indexed links. A 410 is a specific instruction saying the page has been permanently removed, whereas 404s can take months to drop out. Make sure to also exclude the links from any sitemaps that you use.
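
    For instance, on Apache a whole group of retired URLs can be given a 410 with one .htaccess rule (the pattern is illustrative):

        # .htaccess — return 410 Gone for everything under /retired/ (mod_alias)
        RedirectMatch gone ^/retired/.*$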

  • #6029

    Jenette Bush
    Member

    You should be quite clear about them to Google. Noindexing them is great and it’s the first step, but they’re technically still in the Google index, even though they’re not delivered to users within the search results.
    This means that Google still crawls them (and if there are a lot of them, they consume your crawl budget).

    Your goal is to remove them.

    You should:
    1. Noindex them.
    2. Include all of the pages to be deleted in a dedicated XML sitemap and submit it in Google Search Console.
    3. Make sure Google sees all of them (or most, if there are a lot) via the XML sitemap.
    4. Create and upload an orphan, basic HTML page (no CSS or any kind of layout from your site) with links to all the URLs you want to delete, and noindex the orphan page (see the sketch after this list).
    5. Delete them all (404).
    6. Fetch as Google (in GSC) the orphan HTML page: Fetch and render + Request indexing (“Crawl this URL and its direct links”).
    7. Check the sitemap (Coverage) and monitor if/when the pages are removed from the Google index.
    8. Check the logs, if you have access to them, to make sure the bot reached all the URLs.
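
    As a sketch, the orphan page in step 4 can be as bare as this (the URLs are placeholders):

        <!DOCTYPE html>
        <html>
        <head>
            <meta name="robots" content="noindex">
            <title>URLs to remove</title>
        </head>
        <body>
            <a href="https://example.com/unwanted-1.html">unwanted-1</a>
            <a href="https://example.com/unwanted-2.html">unwanted-2</a>
        </body>
        </html>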

    *Don’t* use Disallow in robots.txt.

    Hope this helps

  • #6030

    David James
    Member

    410
