
Thursday, June 5, 2014

More Panda 4.0 Findings: Syndication, User Engagement, Indexation & Keyword Hoarding


By Glenn Gabe
Now that the dust has settled, it's clear that Panda 4.0 was a powerful update that impacted many sites across the web. And that impact has been fascinating to analyze.
Once I heard the official announcement from Google's Matt Cutts, I began to dig into sites that had been impacted by Panda 4.0. That included clients that saw recovery and new companies reaching out to me with fresh hits. During the first week, I analyzed 27 websites hit by our new bamboo-eating friend.
I posted my initial Panda 4.0 findings on my blog, based on heavily analyzing recoveries and new Panda victims. During my research, it was incredible to see the highs and lows of Panda as I jumped from recovery to damage.
From a recovery standpoint, I had some clients absolutely surge after making a lot of changes (which I call "the nuclear option"). In addition, I saw industry experts rise in the rankings, I noticed a softer Panda, and I saw several Phantom victims recover. Again, you can read through my initial findings to learn more.
And on the flip side, I unfortunately saw many companies get pummeled. And when I say pummeled, I'm referring to losing more than 60 percent of Google organic traffic overnight. Some websites I analyzed lost closer to 75 percent. You can see two very different situations below.
Panda 4.0 recovery:
[Image: Panda 4.0 recovery]
Panda 4.0 fresh hit:
[Image: Panda 4.0 negative impact]

A Quick Note About the Rollout of Panda 4.0

Panda 4.0 was officially announced on May 20, but I saw websites begin getting impacted the weekend prior (Saturday, May 17). So it looks like Google started rolling Panda out a few days before the official announcement, something I have seen in the past with major algorithm updates.
But I also saw another interesting trend. I started seeing more volatility the following weekend (beginning Saturday, May 24). And then more jumps and decreases on Monday, May 26 and Tuesday, May 27.
Others saw this too, and several of us thought it could be the start of a Penguin update (or some other major update). But Google denied that Penguin (or any other spam-focused update) was rolling out. I had a hard time believing that, based on the additional volatility I was seeing.
Since we've been told that Panda can take 10 days to fully roll out, I wondered if the Panda tremors I noticed a week in were the tail end of Panda 4.0. I asked Cutts via Twitter, and he responded yesterday, saying that Panda 4.0 actually rolled out a bit faster than usual (and didn't take 10 days to fully roll out).
Wow, now that we know it wasn’t a 10-day rollout causing secondary volatility, what in the world was it? Maybe it was Google tweaking the algorithm based on initial findings and then re-rolling it out. Or maybe there was some type of secondary update, separate from Panda. It’s hard to say, but the data reveals that something was going on.
From an analysis standpoint, I saw websites that surged increase even more the following week, and I saw websites that took a hit decrease even more. I didn't see mixed trending (up then down or down then up). That's another reason I thought the volatility early last week was Panda 4.0 still rolling out (or getting re-rolled out).
Here's a quick example of the secondary surge:
[Image: Panda 4.0 initial rollout and second surge]

More Panda 4.0 Findings

In my first post about Panda 4.0, I mentioned that it wouldn't be my last on the subject (not even close). I had a lot of information to share based on my research across industries. In addition, I've now analyzed 14 more websites hit by Panda 4.0, so I have additional findings to share.
So, if you are interested in Panda, if you have been impacted by Panda 4.0, or if you are just an algo junkie, then sit back and grab a cup of coffee. This post provides several more findings based on analyzing fresh hits and more recoveries from the latest Panda update. Let's dig in.

Syndication Problems

There were a number of companies that reached out to me with major Panda hits (losing greater than 60 percent of their Google organic traffic overnight). Upon digging into several of those sites, I noticed a pretty serious syndication problem.
Those websites had been consuming syndicated content at various levels, and much of that content got hammered during Panda 4.0. The percentage of syndicated content residing on the websites ranged from 20 to 40 percent.
Each site that experienced a serious drop was in a different niche, so this wasn't a specific category that was targeted (like I have seen in the past). After analyzing the sites, it was hard to overlook syndication as a factor that led to their specific Panda hits.
When reviewing the syndicated content on the sites in question, none of them had the optimal technical setup in place, such as using rel=canonical (the cross-domain canonical) to point to the original content on another domain. Some pieces of content linked back to the original articles on third-party websites, while others did not.
In addition, the syndicated content could be freely indexed rather than being kept out of the index via the meta robots noindex tag. That's always an option, and I explain more about managing indexation below.
Handling Syndicated Content
From a Panda standpoint, I've never been a fan of consuming a lot of syndicated content. If it can be handled properly from a technical SEO standpoint, then some syndicated content is fine. But knowing that Panda can target duplicate content, scraped content, tech problems that create quality problems, etc., I've always felt that webmasters should focus on providing high-quality, original content. That's obviously the strongest way to proceed SEO-wise.
I'm not saying to nuke all syndicated content, but you need to be very careful with how it's handled. In addition, you need to always manage how much syndicated content you have on your site (as compared to original content).
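To make that kind of audit concrete, here is a minimal sketch (TypeScript, assuming Node 18+ for the built-in fetch) that classifies how a syndicated copy is handled: cross-domain canonical to the source, noindexed, or freely indexed with no safeguard. The URLs are hypothetical, and the regex extraction is deliberately naive; a real audit should use a proper HTML parser and crawl at scale.

```typescript
// Minimal sketch: classify how a syndicated copy handles attribution.
// Assumes Node 18+ (built-in fetch); the URLs below are hypothetical examples.

type SyndicationStatus =
  | 'canonical-to-source'  // cross-domain rel=canonical points at the original
  | 'noindexed'            // meta robots keeps the copy out of the index
  | 'freely-indexed-copy'; // neither safeguard is present

async function classifySyndicatedPage(copyUrl: string, originalUrl: string): Promise<SyndicationStatus> {
  const html = await (await fetch(copyUrl)).text();

  // Naive regex extraction; a real audit should use a proper HTML parser.
  const canonical = html.match(/<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']+)["']/i)?.[1];
  const robots = html.match(/<meta[^>]+name=["']robots["'][^>]+content=["']([^"']+)["']/i)?.[1];

  if (canonical && canonical === originalUrl) return 'canonical-to-source';
  if (robots && /noindex/i.test(robots)) return 'noindexed';
  return 'freely-indexed-copy';
}

// Hypothetical usage: a syndicated copy on your site and the original it came from.
classifySyndicatedPage(
  'https://www.example.com/news/syndicated-article',
  'https://www.original-publisher.com/article'
).then((status) => console.log(status));
```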

Side Note: Indexation Numbers and Syndicated Content

By the way, I have a client that called me a year ago after previously being impacted by both Phantom and Panda. They had also been consuming syndicated content, and a lot of it (more than 200,000 pages). It's a large site, but that's still a significant amount of third-party content.
We worked hard to identify the content across the site, how attribution was being handled, and then drilled into the performance of that content SEO-wise. Then we had to make some hard decisions.
We decided that nuking risky content now would help them win long-term. Their indexation has dropped by approximately 70 percent over the past year. And guess what? The site is performing better now in Google than when they had all of that additional content indexed.
[Image: 70 percent drop in indexation]
That's right, 70 percent fewer pages indexed yielded more Google organic traffic. It's a great example of making sure Google has the right content indexed versus just any content. And since Panda 4.0, Google organic traffic is up 140 percent.

More Syndication Problems – Downstream Considerations

During my analysis, I found some of the companies that were consuming syndicated content were also syndicating their own content to various partners. And much of that content wasn't handled properly either (attribution-wise).
To make matters worse, some of the partners were more powerful SEO-wise than the sites syndicating the content. That led to some of the partners outranking the websites that created the original content. Not good, and now that Panda 4.0 has struck, it's obviously even worse.
If you're going to syndicate your content, make sure the websites consuming your content handle attribution properly. In a perfect world, they would use rel=canonical to point back to your content. It's easy to set up and can help you avoid an attribution problem.
In addition, the sites republishing your content could simply noindex it and keep it out of Google's index. They can promote and highlight the content on their site (so it can still be valuable for users), but it won't be indexed and found via search. In aggregate, handling syndicated content properly can help everyone involved avoid potential SEO problems.
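If you syndicate your content out at any scale, it's worth spot-checking your partners periodically. The sketch below (again TypeScript on Node 18+, with hypothetical URL pairs) flags partner copies that neither point a canonical back at your original URL nor carry a noindex directive; treat it as a starting point, not a complete attribution audit.

```typescript
// Sketch: flag partner pages that republish your content without proper attribution.
// Assumes Node 18+ (built-in fetch); the URL pairs are hypothetical examples.

const syndicationPairs = [
  {
    original: 'https://www.yoursite.com/articles/original-post',
    partnerCopy: 'https://www.partner-site.com/republished/original-post',
  },
];

async function flagRiskyPartnerCopies(): Promise<void> {
  for (const { original, partnerCopy } of syndicationPairs) {
    const html = await (await fetch(partnerCopy)).text();

    // Naive checks: the original URL is not regex-escaped, so treat this as a spot-check.
    const pointsBack = new RegExp(
      `<link[^>]+rel=["']canonical["'][^>]+href=["']${original}["']`,
      'i'
    ).test(html);
    const noindexed = /<meta[^>]+name=["']robots["'][^>]+content=["'][^"']*noindex/i.test(html);

    if (!pointsBack && !noindexed) {
      console.warn(`Attribution risk: ${partnerCopy} could outrank ${original}`);
    }
  }
}

flagRiskyPartnerCopies();
```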

Even More Problems – Copying/Scraping Content

It's worth noting that during my analysis, I also found original content that had sections copied from third-party content. In this case, the body of the page wasn't copied in its entirety. There were just pieces of the content that were copied word for word. I saw this across several sites experiencing a steep drop in traffic due to Panda 4.0.
The solution here is simple. Write original content, and don't copy or duplicate content from third parties. It was clear during my analysis that pages with sections lifted from third-party sources experienced serious drops in traffic once Panda 4.0 struck.
[Image: an entire paragraph copied word for word]

Strong Engagement Tames the Savage Panda

Let's shift gears for a minute and talk about user engagement and Panda. If you have read my posts about Panda over the years, then you know that user engagement matters. Strong engagement can keep the Panda at bay, while weak engagement can invite the Panda to a bamboo feast.
It's one of the reasons I highly recommend implementing adjusted bounce rate (ABR) via Google Analytics. ABR takes time on page into account and can give you a much stronger view of engagement. Standard bounce rate does not, and is a flawed metric.
High standard bounce rates may not be a problem at all, and I know that confuses many webmasters. By the way, if you are using Google Tag Manager, then follow my tutorial for implementing adjusted bounce rate via GTM. If you implement ABR today, you'll have fresh data to analyze tomorrow.
[Image: bounce rate dropping by 62 percent once ABR was implemented]
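For anyone wondering what an adjusted bounce rate tag actually boils down to, here is a minimal browser-side sketch in TypeScript. It assumes the Universal Analytics ga() function is already loaded on the page and uses a hypothetical 30-second threshold; the GTM tutorial mentioned above covers the real-world setup.

```typescript
// Minimal adjusted bounce rate sketch (browser-side).
// Assumes the Universal Analytics ga() function is already loaded on the page.
// The 30-second threshold is a hypothetical example; pick what fits your content.

declare const ga: (...args: unknown[]) => void;

const ENGAGEMENT_THRESHOLD_MS = 30 * 1000;

setTimeout(() => {
  // Firing any interaction event marks the visit as a non-bounce in Google Analytics,
  // so single-page visits lasting 30+ seconds stop inflating your bounce rate.
  ga('send', 'event', 'Adjusted Bounce Rate', 'Stayed 30+ seconds');
}, ENGAGEMENT_THRESHOLD_MS);
```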
Well, once again I saw poor user engagement (and bad user experience) get sites in trouble Panda-wise. As more and more companies reached out to me about fresh Panda 4.0 hits, I could clearly identify serious engagement issues once I dug in.
For example, pages that were ranking well in Google prior to Panda 4.0 had thin content, horrible usability, affiliate jumps so fast they would make your head spin, risky downstream links, stimulus overload, and more. There are too many examples to cover in detail here, but if you've been hit by Panda, take a hard look at your top content from before Panda arrived. You just might find some glaring issues that can lead you down the right path.
One site in particular that I recently analyzed was driving a lot of Google organic traffic to pages that didn't provide much solid information for the query at hand. The user interface was weak, the thin content was buried far down the page, and if you clicked on anything on the page, you were taken to an order form.
I like to call that the Panda recipe of death:
  • 1 part confusion
  • 2 parts thin content
  • 1.5 parts horrible usability
  • 2.5 parts deception
  • And you might as well throw in a bamboo garnish.
[Image: the ultimate Panda cocktail]
This recipe is guaranteed to drive low dwell time and a nasty hangover. There's nothing that attracts Pandas like frustrated users and low dwell time. Avoid this scenario like the plague.

Keyword Hoarding – Should Your Website Be Ranking For That Many Keywords?

While analyzing more fresh hits, I came across a few situations where smaller websites were ranking for thousands of keywords in a competitive niche without having enough content to warrant those rankings (in my opinion).
For example, one site had fewer than 40 pages indexed, yet ranked for thousands of competitive keywords. In addition, the homepage was ranking for most of those keywords. So you essentially had one page ranking for thousands of keywords, when there were clearly other pages on the web with more targeted content (that could better match the query).
Once Panda 4.0 rolled out, that site's traffic was cut in half. The content on the site is definitely high quality, but there's no way the homepage should have ranked for all of those keywords, especially when there were articles and posts on other websites specifically targeting torso and longer-tail terms. The site had been teed up to get smoked by Panda. And it finally did.
I saw this situation several times during my research, and to me, it makes a lot of sense. Google is going to want to match users with content that closely matches their query.
If a user searches for something specific, they should get specific answers in return (and not the homepage of a site that targets the category). I'm assuming that had to be frustrating for users, which could impact bounce rate, dwell time, etc.
Again, low engagement is an invitation to the mighty Panda. Unfortunately, the site only had standard bounce rate set up to review and not adjusted bounce rate. See my points earlier about the power of ABR.

The Answer: Develop a Stronger Content Generation Plan

Since Panda 4.0, the site I analyzed is naturally getting pushed down by third-party content targeting specific queries. My recommendation to anyone who is experiencing this situation is to expand and strengthen your content strategy. Write high quality content targeting more than just head terms. Understand your niche, what people are searching for, and provide killer content that deserves to rank highly.
Don't rely on one page (like a homepage) to rank for all of your target keywords. That's a dangerous road to travel. High quality, thorough content about your category can naturally target thousands of keywords.

Recommendations and Next Steps

The scenarios listed above are just a few more examples of what I've seen while analyzing websites impacted by Panda 4.0. So what can you do now that Panda 4.0 has rolled out? There are some recommendations below.
If you've been impacted negatively by Panda 4.0, then you need to move quickly to analyze and rectify content quality problems. And if you have been positively impacted, don't sit back and assume that will last forever. I've seen many sites that gained traffic during one Panda update drop during a subsequent one.
Therefore, I have provided two sets of bullets below. One set is for those negatively impacted, while the other is for those experiencing a surge in traffic.
If you have been negatively impacted:
  • Have a Panda audit completed. Hunt down low-quality content, technical problems impacting content quality, engagement and usability problems, etc. I've always said that Panda should have been named "Octopus" since it has many tentacles. Thoroughly analyze your site to find major problems.
  • Move quickly to make changes. Don't sit back and over-analyze the situation. Find low quality content, understand the best way to proceed, and execute. Rapidly executing the right changes can lead to a faster recovery. Delaying action will only keep the Panda filter in place.
  • Keep driving forward and act like you aren't impacted by Panda. Keep producing high quality content, keep using social to get the word out, and keep driving strong referral traffic. The work you do now while impacted by Panda will help you on several levels. And the content you produce while being impacted could very well end up ranking highly once the Panda filter is lifted.
  • Since Panda rolls out monthly, you technically have a chance of recovery once per month. You might not see full recovery in one shot, but the more changes you implement, the more recovery you can see. It's another reason to move as quickly as you can.
If you have been positively impacted:
  • If you've taken action to recover from Panda, then you were obviously on the right track. Keep driving forward with that plan. Don't stop just because you recovered.
  • If you haven't taken action to recover, and you have seen an increase in Google traffic after Panda 4.0 rolled out, then don't assume that bump will remain. You should still understand the content quality risks you have in order to rectify them now (while you have stronger traffic). I've seen sites yo-yo with subsequent Panda updates. Band-aids do not yield long-term Panda recoveries. A strong SEO strategy will.
  • After seeing a recovery, review your top landing pages from Google organic prior to Panda being rolled out, look at user engagement, and analyze that content. You might find a strong recipe for "high quality" content based on your audience. Then you can use that as a model for content creation moving forward.
  • Review the keywords and keyword categories driving traffic to pages now that Panda 4.0 has rolled out. You might find interesting trends with how Google is matching queries with content on your site.
  • Continually crawl and analyze your content to pick up quality problems and technical problems. I do this for some of my clients on a monthly basis, and there are always findings that need to be addressed (especially for larger websites). You would be surprised what you can find. For example, I just found 100,000 pages that were once blocked and can now be crawled. That can be fixed quickly if it's picked up during a crawl analysis. Technical SEO checks should be ongoing (a simple spot-check is sketched below).
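As one example of the kind of ongoing technical check described in that last bullet, here is a simplified sketch (TypeScript on Node 18+, hypothetical URLs) that flags URLs which are no longer blocked by robots.txt. It only looks at plain Disallow prefixes under User-agent: *, so treat it as a spot-check rather than a full robots.txt parser.

```typescript
// Simplified sketch: spot-check whether URLs you expect to be blocked are still
// disallowed in robots.txt. Assumes Node 18+ (built-in fetch); URLs are hypothetical.
// Only plain "Disallow:" prefixes under "User-agent: *" are considered here.

async function getDisallowedPrefixes(siteRoot: string): Promise<string[]> {
  const robotsTxt = await (await fetch(`${siteRoot}/robots.txt`)).text();
  const prefixes: string[] = [];
  let appliesToAll = false;

  for (const rawLine of robotsTxt.split('\n')) {
    const line = rawLine.split('#')[0].trim();
    if (/^user-agent:/i.test(line)) {
      appliesToAll = /^user-agent:\s*\*$/i.test(line);
    } else if (appliesToAll && /^disallow:/i.test(line)) {
      const path = line.replace(/^disallow:\s*/i, '').trim();
      if (path) prefixes.push(path);
    }
  }
  return prefixes;
}

async function reportNowCrawlableUrls(siteRoot: string, urls: string[]): Promise<void> {
  const prefixes = await getDisallowedPrefixes(siteRoot);
  for (const url of urls) {
    const path = new URL(url).pathname;
    const blocked = prefixes.some((prefix) => path.startsWith(prefix));
    if (!blocked) console.warn(`Now crawlable (was that intended?): ${url}`);
  }
}

// Hypothetical example: a URL pattern that used to sit behind a Disallow rule.
reportNowCrawlableUrls('https://www.example.com', [
  'https://www.example.com/internal-search/widgets',
]);
```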

Summary – Panda 4.0 Was Significant

Again, Panda 4.0 was a major algorithm update. There were many examples of sites recovering, while also many examples of sites getting crushed.
I plan to keep analyzing websites on both sides of the equation to better understand Panda 4.0 and its signature. And I'll be writing more posts covering my findings in the coming weeks.
Until then, good luck with your Panda work. Remember, the next update is only a few weeks away.

Tuesday, June 3, 2014

Salesforce's Marketing Secret: The Fourth Marketing P

In his book Behind the Cloud: The Untold Story of How Salesforce Went from Idea to Billion-Dollar Company and Revolutionized an Industry, Marc Benioff shares the 111 plays he learned during Salesforce's triumphant rise to become the most valuable SaaS company in the world.
Play 15 is my favorite from the book. Benioff writes, "position yourself either as the leader or against the leader in your industry." Play 15 highlights the most frequently forgotten of marketing's four Ps: positioning. Positioning is easily forgotten because it's the least tangible of the four. Price immediately impacts revenue. Product, well, everyone has a point of view on product. Placement in today's ecosystem most often means ad placements. In performance marketing, the numbers speak for themselves.
Positioning can be amorphous. Without a concentrated focus on unique positioning, a company's persona in the market blurs together with the competition in a customer's mind. In Benioff's words, positioning means:
Every experience you give a journalist or potential customer must explain why you are different and incorporate a clear call to action. This does not require a large team or big budget; it just requires your time and focus.
Mastering positioning creates huge advantages for a company. Play 38 sums it up well. "Make Every Customer a Member of Your Sales Team." In other words, equip your champions to make a sale on your behalf. If the purpose of enterprise sales is helping customers get through their own internal buying processes, strong and clear positioning empowers internal evangelists to help close deals.
Getting back to Play 15, there's a simple brilliance to Benioff's advice of positioning a company either as the leader or as the rival of the industry leader. The company instantly becomes part of every conversation, every blog post, every sales process. This is precisely what happened when Salesforce lobbed its stone at the giant, Siebel. With an unwavering resolve and a simple but powerful NO SOFTWARE positioning, Salesforce overtook the leader in the CRM market, Siebel Systems, in less than a decade.
In particular, Salesforce employed guerrilla marketing tactics at Siebel events. In San Francisco, Salesforce hired mock protesters to proclaim the end of software, drawing crowds and police and making Salesforce a major topic at the Siebel conference. In Cannes, Benioff's team rented out quite literally all the taxis and converted them into mobile Salesforce marketing booths, ferrying Siebel conference attendees from the airport to the venue. In San Diego, Salesforce replicated this play with bike rickshaws.
Because of the simple, powerful and clear positioning against the market leader, and a relentless insistence to be part of every Siebel event, Salesforce injected themselves into every buying conversation.
Positioning is one of those intangible marketing concepts. But the intangibility shouldn't get in the way of understanding its potency to dramatically alter the trajectory of a business. When combined with the three other Ps, positioning can pack quite a punch.

Wednesday, May 21, 2014

The Link Graph Conundrum: Why Citations Remain Critical to SEO Survival


By Eric Enge
It's a popularly held belief that the link graph is broken. This post will explore the roots of the problem, and why it is such a tough problem for Google and Bing to resolve.
It all starts with the original Larry Page and Sergey Brin thesis. At the time they were developing this concept, the leading search engines were almost solely dependent on keyword analysis of on-page content to determine rankings. Spammers had so thoroughly assaulted this model that change had become an imperative, lest the concept of a search engine go the way of the dinosaurs.
Here are a couple of key sentences at the beginning of the thesis:
The citation (link) graph of the web is an important resource that has largely gone unused in existing web search engines. We have created maps containing as many as 518 million of these hyperlinks, a significant sample of the total. These maps allow rapid calculation of a web page's "PageRank", an objective measure of its citation importance that corresponds well with people's subjective idea of importance. Because of this correspondence, PageRank is an excellent way to prioritize the results of web keyword searches.
The concept of a "citation" is a critical one. To understand why, let's step away from the web and consider an academic research paper, which might include citations that look like this:
[Image: example academic citations]
The writer of a paper normally chooses what appears in this list to acknowledge the major sources referenced during the creation of the paper. If you studied all the papers in a given topic area, you could fairly easily identify the most important ones, because they would have the most citations (votes) from other papers.
Using a technique like the PageRank algorithm, you could build a citation graph in which these "votes" are not all counted equally: a vote from a heavily cited paper counts for more than one from a rarely cited paper. And, just like the PageRank algorithm, you could apply the calculation recursively to identify the most important papers (a small sketch of this appears after the list below). The reasons this works well in the academic citation environment are:
  1. Small Scale: The number of papers in a given academic space is reasonably finite. You might have hundreds, or thousands, of documents, not millions.
  2. No Incentive to Spam: You can't really buy a citation placement in an academic paper. If you were the author of a paper and had some illogical references in your citations, the perceived authority of your own paper would be negatively impacted.
  3. Small Communities: In a given area of academic research, all the major players know each other. Strange, out-of-place behavior stands out in a way that it doesn't in an open, chaotic environment like the web.
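To make the recursive idea concrete, here is a toy sketch of PageRank-style power iteration over a small, made-up citation graph (TypeScript). The graph, damping factor, and iteration count are illustrative only; the point is simply that votes from important, heavily cited papers carry more weight, and the calculation is repeated until the scores settle.

```typescript
// Toy PageRank-style power iteration over a tiny, made-up citation graph.
// The graph, damping factor, and iteration count are illustrative only.

type Graph = Record<string, string[]>; // paper -> papers it cites

const citations: Graph = {
  paperA: ['paperB', 'paperC'],
  paperB: ['paperC'],
  paperC: ['paperA'],
  paperD: ['paperC'],
};

function pageRank(graph: Graph, damping = 0.85, iterations = 50): Record<string, number> {
  const nodes = Object.keys(graph);
  const n = nodes.length;
  // Start with an even score for every paper.
  let rank = Object.fromEntries(nodes.map((id): [string, number] => [id, 1 / n]));

  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(nodes.map((id): [string, number] => [id, (1 - damping) / n]));
    for (const source of nodes) {
      const cited = graph[source];
      for (const target of cited) {
        // A vote from a highly ranked paper is worth more, and its weight is
        // split across everything that paper cites.
        next[target] += (damping * rank[source]) / cited.length;
      }
    }
    rank = next;
  }
  return rank;
}

console.log(pageRank(citations)); // paperC, the most cited paper, ends up on top
```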

Citations and the Web

At the time of the Page-Brin thesis, the spammers of the world were attacking search engines using a variety of keyword stuffing techniques. Practical implementation of a link-based algorithm was a revelation, and it had a huge impact very quickly. The spammers of the world had not yet figured out how to assault the link based model.
As Google gained traction, this changed. Link buying and selling, large scale link swapping, blog and forum comment stuffing, and simply building huge sites and placing site-wide links on them were some of the many tactics that emerged.
Fast forward to 2014 and it appears that Google has partially won this battle. The reason we can say that they have partially won is that these days almost no one publishes articles in support of spammy link building tactics.
In fact, the concept of link building itself has been replaced with content marketing, which the overwhelming majority of people position as being about building reputation and visibility. This has happened because Google has gotten good at detecting enough of the spammers out there that the risks of getting caught are quite high. No business with investors or employees can afford to invest in spammy techniques because the downside risks aren't acceptable.
On the other hand, if you spend enough time studying search results, you can easily find many examples of sites using really bad link building practices that rank highly for some terms. If you're playing by the rules and one of these sites is outranking you, it can be infuriating.

Sources of the Problem

Why does this still happen? Part of the reason is that the web isn't at all like the world of academic papers. Here are some reasons why:
  1. Commercial Environment with High Stakes: Fortunes are made on the interwebs. People have a huge incentive to figure out how to rank higher in Google.
  2. Huge Scale: It was back in October/November 2012 that Google's Matt Cutts told me that Google knew about 100 trillion web pages. By now, that has to be more like 500 trillion.
  3. No Cohesive Community: The academic community would probably argue that it isn't as cohesive as one might think, but compared to the web there is a clear difference. There are all different types of people on the web, including those who are ignorant of SEO, those who have incorrect information about how it works, those who attempt to abuse it, and finally those who try to do it the right way.
  4. User-Generated Content (UGC): Blog comments, forum comments, reviews, and social media sites are all examples of UGC in action. Google tries to screen all of this out, and most of these platforms use the rel="nofollow" attribute, but not all of them do. As a result, spammers implement algorithms to spew comments with rich anchor text references to their sites across the web.
  5. Advertising: The web is a commercial place. People sell advertising, and even if their intent is not to sell PageRank, many of them don't use nofollow attributes on the links and simply label them as "Sponsored" or "Ads". Google is not always able to detect such labeling.
  6. Practical Anonymity: The chances of blowback if you link to a crappy site are much smaller than they are in the academic paper scenario. Because of the scale of the web, the advertising environment, and the structure of web content, a crappy link or two may just be seen as an ad, and the average visitor to a web page simply does not care.
  7. Complete Lack of Structure: Let's face it, the web is a chaotic place. The way sites are built, the way people interact with pages, the types of content, and the varying goals of such content lead to a web that has little real structure.
[Image: one little corner of the web]

Why Haven't Google and Bing Fixed This?

Of course the search engines are trying to fix it. Don't pay any attention to anyone who suggests otherwise.
Google lives in terror of someone doing to them what they did to Altavista. A fundamentally better algorithm would represent a huge threat to their business. And, of course, Bing would love to be the one to find such a new algo.
The money at stake here is huge, and both search engines are investing heavily in trying to develop better algorithms. The size of the spoils? The current market cap of Google is $356 billion.
The reason they haven't fixed it is that they haven't figured out how to yet. Social media signals aren't the answer either, nor is measuring user interaction with the SERPs or on the pages of your site. These things might help, but if they were the answer, search engines would already be weighting them quite a bit more than they do.

What Does This Mean to You?

Frankly, it's a tough environment. Here it is in a nutshell:
  1. Publishers that use crappy link building practices may outrank you on key terms, and they may stay there for a while.
  2. Google will continue to discover and punish bad tactics to the best of their ability, uneven though that may be. They do this well enough that any serious business just needs to stay away from such tactics (most likely that means you!).
  3. Search engines will keep looking for creative new ways to reduce their dependence on links. This will include more ways to use social media, user interaction signals, and other new concepts as well. However, Cutts says that links will remain a ranking factor for many more years.
  4. As search engines use more and more of these new signals, we aren't going to get a roadmap of what they are. Yes, they patent new ideas all the time, but you won't know which patents they use and which ones they don't. In addition, even when they use an idea from a published patent, the practical implementation will likely differ greatly from what you see in the patent.
It isn't an ideal situation. Your best course of action? Focus your efforts on building your reputation and visibility online outside of the search engines. Ultimately, you want to build your own loyal audience. Here are a few ideas for doing that:
  1. Organic social media: Just recognize that this opportunity may be transient too. As we have seen, Facebook is reducing organic visibility in order to drive revenue growth. For that reason, new emerging social platforms are particularly powerful opportunities to get visible, provided that you pick the right horse to ride.
  2. Earned Media (Guest Posting): Cutts may have signalled the decay and fall of guest blogging for SEO, but writing regular columns on the top websites in your market is something you should strive to do anyway. Don't view it as an SEO activity; it's still a surefire way to build up reputation and visibility.
  3. Speaking at Conferences: This is a great technique as standing up in front of a room full of people and sharing your thoughts allows them to begin developing a connection with you.
  4. Writing Books or eBooks: Another traditional reputation builder, but a really good one. Don't underestimate the work in writing a book though. However hard you think it is, the reality is 4 to 10 times harder.
  5. Develop Relationships with Influential Media and Bloggers: Building meaningful relationships with other people that already have large audiences and adding value to their lives is always a good thing.
These activities will all give you alternative ways to build your reputation, visibility, and traffic. They also give you the best chance that your site will be sending out the types of signals that search engines want to discover and value anyway.
Ideally, your reputation will be so strong that Google's search results will be damaged in the event you aren't ranking for relevant terms, because searchers will be looking for you. You don't have to like the way the environment operates, but it's the environment we have. Complaining won't help you, so just go out and win anyway!