More Panda 4.0 Findings: Syndication, User Engagement, Indexation & Keyword Hoarding
Now that the dust has settled, it's clear that Panda 4.0 was a powerful update that impacted many sites across the web. And that impact has been fascinating to analyze.
Once I heard the official announcement from Google's Matt Cutts, I began to dig into sites that had been impacted by Panda 4.0. That included clients that saw recovery and new companies reaching out to me with fresh hits. During the first week, I analyzed 27 websites hit by our new bamboo-eating friend.
I posted my initial Panda 4.0 findings on my blog, based on heavy analysis of both recoveries and new Panda victims. During that research, it was incredible to see the highs and lows of Panda as I jumped from recovery to damage.
From a recovery standpoint, I had some clients absolutely surge after making a lot of changes (which I call "the nuclear option"). In addition, I saw industry experts rise in the rankings, I noticed a softer Panda, and I saw several Phantom victims recover. Again, you can read through my initial findings to learn more.
And on the flip side, I unfortunately saw many companies get pummeled. And when I say pummeled, I'm referring to losing more than 60 percent of Google organic traffic overnight. Some websites I analyzed lost closer to 75 percent. You can see two very different situations below.
Panda 4.0 recovery:
Panda 4.0 fresh hit:
A Quick Note About the Rollout of Panda 4.0
Panda 4.0 was officially announced on May 20, but I saw websites begin getting impacted the weekend prior (Saturday, May 17). So it looks like Google began rolling Panda out a few days before the announcement (something I have seen in the past with major algorithm updates).
But I also saw another interesting trend. I started seeing more volatility the following weekend (beginning Saturday, May 24). And then more jumps and decreases on Monday, May 26 and Tuesday, May 27.
Others saw this too, and several of us thought it could be the start of a Penguin update (or some type of major update). But Google denied Penguin (or some other spam effort) was rolling out. I had a hard time believing that, based on the additional volatility I was seeing.
Since we've been told that Panda can take 10 days to fully roll out, I wondered if the Panda tremors I noticed a week in were the tail end of Panda 4.0. I asked Cutts via Twitter, and he responded yesterday, saying that Panda 4.0 actually rolled out a bit faster than usual (and that it didn't take 10 days to fully roll out).
Wow, now that we know it wasn’t a 10-day rollout causing secondary volatility, what in the world was it? Maybe it was Google tweaking the algorithm based on initial findings and then re-rolling it out. Or maybe there was some type of secondary update, separate from Panda. It’s hard to say, but the data reveals that something was going on.
From an analysis standpoint, I saw websites that surged increase even more the following week, and I saw websites that took a hit decrease even more. I didn't see mixed trending (up then down or down then up). That's another reason I thought the volatility early last week was Panda 4.0 still rolling out (or getting re-rolled out).
Here's a quick example of the secondary surge:
More Panda 4.0 Findings
In my first post about Panda 4.0, I mentioned that it wouldn't be my last on the subject (not even close). I had a lot of information to share based on my research across industries. In addition, I've now analyzed 14 more websites hit by Panda 4.0, so I have additional findings to share.
So, if you are interested in Panda, if you have been impacted by Panda 4.0, or if you are just an algo junkie, then sit back and grab a cup of coffee. This post provides several more findings based on analyzing fresh hits and more recoveries from the latest Panda update. Let's dig in.
Syndication Problems
There were a number of companies that reached out to me with major Panda hits (losing more than 60 percent of their Google organic traffic overnight). Upon digging into several of those sites, I noticed a pretty serious syndication problem.
Those websites had been consuming syndicated content at various levels, and much of that content got hammered during Panda 4.0. The percentage of syndicated content residing on the websites ranged from 20 to 40 percent.
Each site that experienced a serious drop was in a different niche, so this wasn't a case of one specific category being targeted (as I have seen in the past). After analyzing the sites, it was hard to overlook syndication as a factor in their Panda hits.
When reviewing the syndicated content on the sites in question, none of them had an optimal technical setup, such as using rel=canonical to point to the original article on another domain (via the cross-domain canonical tag). Some pieces of content linked back to the original articles on third-party websites, while others did not.
In addition, the syndicated content was freely indexable; none of the sites used the meta robots tag to noindex it. That's always an option, and I explain more about managing indexation soon.
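To make those two options concrete, here's a minimal sketch of what each looks like in the head of the syndicated copy. The URLs are hypothetical, and you would typically choose one approach or the other (not both), depending on your agreement with the content provider:

```html
<!-- Option 1: cross-domain canonical pointing to the original article -->
<link rel="canonical" href="http://www.original-publisher.com/original-article/" />

<!-- Option 2: keep the syndicated copy out of Google's index entirely -->
<meta name="robots" content="noindex, follow" />
```

Either tag is trivial to add, and it's exactly what the hard-hit sites I analyzed were missing.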
From a Panda standpoint, I've never been a fan of consuming a lot of syndicated content. If it can be handled properly from a technical SEO standpoint, then some syndicated content is fine. But knowing that Panda can target duplicate content, scraped content, tech problems that create quality problems, etc., I've always felt that webmasters should focus on providing high-quality, original content. That's obviously the strongest way to proceed SEO-wise.
I'm not saying to nuke all syndicated content, but you need to be very careful with how it's handled. In addition, you need to always manage how much syndicated content you have on your site (as compared to original content).
Side Note: Indexation Numbers and Syndicated Content
By the way, I have a client that called me a year ago after being impacted by both Phantom and Panda. They had also been consuming syndicated content. Actually, they were consuming a lot of it (more than 200,000 pages). It's a large site, but that's still a significant amount of third-party content.
We worked hard to identify the content across the site, how attribution was being handled, and then drilled into the performance of that content SEO-wise. Then we had to make some hard decisions.
We decided that nuking the risky content would help them win long-term. Their indexation has dropped by approximately 70 percent over the past year. And guess what? The site is performing better in Google now than when all of that additional content was indexed.
That's right, 70 percent fewer pages indexed yielded more Google organic traffic. It's a great example of making sure Google has the right content indexed versus just any content. And since Panda 4.0, the site's Google organic traffic is up 140 percent.
More Syndication Problems – Downstream Considerations
During my analysis, I found that some of the companies consuming syndicated content were also syndicating their own content to various partners. And much of that content wasn't handled properly either (attribution-wise).
To make matters worse, some of the partners were more powerful SEO-wise than the sites syndicating the content. That led to some of the partners outranking the websites that created the original content. Not good, and now that Panda 4.0 has struck, it's obviously even worse.
If you're going to syndicate your content, make sure the websites consuming it handle attribution properly. In a perfect world, they would use rel=canonical to point back to your content (the same cross-domain canonical shown earlier). It's easy to set up and can help you avoid an attribution problem.
In addition, the sites republishing your content could simply noindex it and keep it out of Google's index. They can promote and highlight the content on their site (so it can still be valuable for users), but it won't be indexed and found via search. In aggregate, handling syndicated content properly can help everyone involved avoid potential SEO problems.
Even More Problems – Copying/Scraping Content
It's worth noting that during my analysis, I also found supposedly original content with sections copied from third-party sources. In these cases, the body of the page wasn't copied in its entirety; just pieces of the content were copied word for word. I saw this across several sites experiencing a steep drop in traffic due to Panda 4.0.
The solution here is simple. Write original content, and don't copy or duplicate content from third parties. It was clear during my analysis that content with sections lifted from third-party content experienced serious drops in traffic after Panda 4.0 struck.
Strong Engagement Tames the Savage Panda
Let's shift gears for a minute and talk about user engagement and Panda. If you have read my posts about Panda over the years, then you know that user engagement matters. Strong engagement can keep the Panda at bay, while weak engagement can invite the Panda to a bamboo feast.
It's one of the reasons I highly recommend implementing adjusted bounce rate (ABR) via Google Analytics. ABR takes time on page into account and can give you a much stronger view of engagement. Standard bounce rate does not, and is a flawed metric.
High standard bounce rates may not be a problem at all, and I know that confuses many webmasters. By the way, if you are using Google Tag Manager, then follow my tutorial for implementing adjusted bounce rate via GTM. If you implement ABR today, you'll have fresh data to analyze tomorrow.
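If you're hard-coding your tracking snippet rather than using GTM, the core of ABR is simply a delayed event. Here's a minimal sketch assuming Universal Analytics (analytics.js), a 30-second threshold, and placeholder event labels (adjust all three to fit your site):

```html
<script>
  // Assumes the standard analytics.js loader snippet is already on the page.
  ga('create', 'UA-XXXXXXXX-1', 'auto');  // replace with your property ID
  ga('send', 'pageview');

  // Adjusted bounce rate: fire an interaction event after 30 seconds.
  // Because the event counts as an interaction, visitors who stay on
  // the page for at least 30 seconds are no longer counted as bounces.
  setTimeout(function() {
    ga('send', 'event', 'Engagement', 'Time on Page: 30+ seconds');
  }, 30000);
</script>
```

Conceptually, the GTM version works the same way: a timer trigger that fires a Google Analytics event tag once your chosen threshold passes.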
Well, once again I saw poor user engagement (and bad user experience) get sites in trouble Panda-wise. As more and more companies reached out to me about fresh Panda 4.0 hits, I could clearly identify serious engagement issues once I dug in.
For example, pages that were ranking well in Google prior to Panda 4.0 had thin content, horrible usability, affiliate redirects so fast they would make your head spin, risky downstream links, stimulus overload, etc. There are too many examples to cover in detail here, but if you've been hit by Panda, take a hard look at your top content prior to Panda arriving. You just might find some glaring issues that can lead you down the right path.
One site in particular that I recently analyzed was driving a lot of Google organic traffic to pages that didn't provide a lot of solid information based on the query. The user interface was weak, the content (which was thin) was located down the page, and if you did click on anything on that page, you were taken to an order form.
I like to call that the Panda recipe of death:
- 1 part confusion
- 2 parts thin content
- 1.5 parts horrible usability
- 2.5 parts deception
- And you might as well throw in a bamboo garnish.
This recipe is guaranteed to produce low dwell time and a nasty hangover. Nothing attracts Pandas like frustrated users and low dwell time. Avoid this scenario like the plague.
Keyword Hoarding – Should Your Website Be Ranking For That Many Keywords?
While analyzing more fresh hits, I came across a few situations where smaller websites were ranking for thousands of keywords in a competitive niche without having enough content to warrant those rankings (in my opinion).
For example, one site had fewer than 40 pages indexed, yet ranked for thousands of competitive keywords. In addition, the homepage was ranking for most of those keywords. So you essentially had one page ranking for thousands of keywords, when there were clearly other pages on the web with more targeted content (that could better match the query).
Once Panda 4.0 rolled out, that site's traffic was cut in half. The content on the site is definitely high quality, but there's no way the homepage should have ranked for all of those keywords, especially when there were articles and posts on other websites specifically targeting those torso (mid-tail) and longer-tail terms. The site had been teed up to get smoked by Panda, and it finally was.
I saw this situation several times during my research, and to me, it makes a lot of sense. Google is going to want to match users with content that closely matches their query.
If a user searches for something specific, they should get specific answers in return (and not the homepage of a site that targets the category). I'm assuming that had to be frustrating for users, which could impact bounce rate, dwell time, etc.
Again, low engagement is an invitation to the mighty Panda. Unfortunately, the site only had standard bounce rate set up to review and not adjusted bounce rate. See my points earlier about the power of ABR.
The Answer: Develop a Stronger Content Generation Plan
Since Panda 4.0, the site I analyzed is naturally getting pushed down by third-party content targeting specific queries. My recommendation to anyone who is experiencing this situation is to expand and strengthen your content strategy. Write high quality content targeting more than just head terms. Understand your niche, what people are searching for, and provide killer content that deserves to rank highly.
Don't rely on one page (like a homepage) to rank for all of your target keywords. That's a dangerous road to travel. High quality, thorough content about your category can naturally target thousands of keywords.
Recommendations and Next Steps
The scenarios listed above are just a few more examples of what I've seen while analyzing websites impacted by Panda 4.0. So what can you do now that Panda 4.0 has rolled out? There are some recommendations below.
If you've been impacted negatively by Panda 4.0, then you need to move quickly to analyze and rectify content quality problems. And if you've been positively impacted, don't sit back and assume the gains will last forever. I've seen many sites that once gained traffic from a Panda update drop during subsequent updates.
Therefore, I have provided two sets of bullets below. One set is for those negatively impacted, while the other is for those experiencing a surge in traffic.
If you have been negatively impacted:
- Have a Panda audit completed. Hunt down low-quality content, technical problems impacting content quality, engagement and usability problems, etc. I've always said that Panda should have been named "Octopus" since it has many tentacles. Thoroughly analyze your site to find major problems.
- Move quickly to make changes. Don't sit back and over-analyze the situation. Find low quality content, understand the best way to proceed, and execute. Rapidly executing the right changes can lead to a faster recovery. Delaying action will only keep the Panda filter in place.
- Keep driving forward and act like you aren't impacted by Panda. Keep producing high quality content, keep using social to get the word out, and keep driving strong referral traffic. The work you do now while impacted by Panda will help you on several levels. And the content you produce while being impacted could very well end up ranking highly once the Panda filter is lifted.
- Since Panda rolls out monthly, you technically have a chance of recovery once per month. You might not see full recovery in one shot, but the more changes you implement, the more recovery you can see. It's another reason to move as quickly as you can.
If you have been positively impacted:
- If you've taken action to recover from Panda, then you were obviously on the right track. Keep driving forward with that plan. Don't stop just because you recovered.
- If you haven't taken action to recover, and you have seen an increase in Google traffic after Panda 4.0 rolled out, then don't assume that bump will remain. You should still understand the content quality risks you have in order to rectify them now (while you have stronger traffic). I've seen sites yo-yo with subsequent Panda updates. Band-aids do not yield long-term Panda recoveries. A strong SEO strategy will.
- After seeing a recovery, review your top landing pages from Google organic prior to Panda being rolled out, look at user engagement, and analyze that content. You might find a strong recipe for "high quality" content based on your audience. Then you can use that as a model for content creation moving forward.
- Review the keywords and keyword categories driving traffic to pages now that Panda 4.0 has rolled out. You might find interesting trends with how Google is matching queries with content on your site.
- Continually crawl and analyze your content to pick up quality problems and technical problems. I do this for some of my clients on a monthly basis, and there are always findings that need to be addressed (especially on larger websites). You would be surprised what you can find. For example, I just found 100,000 pages that were once blocked and now can be crawled. That can be fixed quickly if picked up during a crawl analysis (see the sketch after this list). Technical SEO checks should be ongoing.
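On that last point about blocked pages, here's a hypothetical robots.txt sketch (the directory is made up, but the failure mode is real). If a rule like this gets dropped during a redesign or CMS migration, an entire section can quietly become crawlable, and a regular crawl analysis (or a simple robots.txt diff) is what catches it:

```
# Hypothetical rule: if this Disallow is accidentally removed during
# a redesign, everything under /print/ suddenly becomes crawlable.
User-agent: *
Disallow: /print/
```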
Summary – Panda 4.0 Was Significant
Again, Panda 4.0 was a major algorithm update. There were many examples of sites recovering, and many examples of sites getting crushed.
I plan to keep analyzing websites on both sides of the equation to better understand Panda 4.0 and its signature. And I'll be writing more posts covering my findings in the coming weeks.
Until then, good luck with your Panda work. Remember, the next update is only a few weeks away.