The Link Graph Conundrum: Why Citations Remain Critical to SEO Survival
It's a commonly held belief that the link graph is broken. This post explores the roots of the problem, and why it is such a tough one for Google and Bing to resolve.
It all starts with the original Larry Page and Sergey Brin thesis. At the time they were developing this concept, the leading search engines were almost solely dependent on keyword analysis of on-page content to determine rankings. Spammers had so thoroughly assaulted this model that change had become an imperative, lest the concept of a search engine go the way of the dinosaurs.
Here are a couple of key sentences at the beginning of the thesis:
The citation (link) graph of the web is an important resource that has largely gone unused in existing web search engines. We have created maps containing as many as 518 million of these hyperlinks, a significant sample of the total. These maps allow rapid calculation of a web page's "PageRank", an objective measure of its **citation** importance that corresponds well with people's subjective idea of importance. Because of this correspondence, PageRank is an excellent way to prioritize the results of web keyword searches.
The concept of a "citation" (bolding above was mine, for emphasis) is a critical one. To understand why, let's step away from the web and consider the example of an academic research paper, which might include citations in them that look like this:
Placement in this list is normally made by the writer of the paper to acknowledge major sources they referenced during the creation of their paper. If you did a study of all the papers on a given topic area, you could fairly easily identify the most important ones, because they would have the most citations (votes) by other papers.
Using a technique like the PageRank algorithm, you could build a citation graph in which these "votes" are not counted equally: a vote cast by a heavily cited paper counts for more than a vote cast by a paper with few citations of its own. And, just like the PageRank algorithm, you could apply this recursively to identify the most important papers (a short code sketch follows the list below). The reasons this works well in the academic citation environment are:
- Small Scale: The number of papers in a given academic space is reasonably finite. You might have hundreds, or thousands, of documents, not millions.
- No Incentive to Spam: You can't really buy a citation placement in an academic paper. If you were the author of a paper and had some illogical references in your citations, the perceived authority of your own paper would be negatively impacted.
- Small Communities: In a given area of academic research, all the major players know each other. Strange, out-of-place behavior stands out in a way that it doesn't in an open, chaotic environment like the web.
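To make the recursive "weighted votes" idea concrete, here is a minimal sketch of a PageRank-style computation in Python. The toy citation graph, damping factor, and iteration count are illustrative assumptions on my part, not values taken from the thesis.

```python
# Minimal power-iteration sketch of PageRank over a toy citation graph.
# The graph maps each paper to the papers it cites.

def pagerank(graph, damping=0.85, iterations=50):
    nodes = list(graph)
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}  # start everyone equal

    for _ in range(iterations):
        new_ranks = {}
        for node in nodes:
            # A vote from a heavily cited paper carries more weight,
            # because the voter's own rank is higher; each vote is
            # split across all of the voter's outgoing citations.
            incoming = sum(
                ranks[other] / len(graph[other])
                for other in nodes
                if node in graph[other]
            )
            new_ranks[node] = (1 - damping) / n + damping * incoming
        ranks = new_ranks
    return ranks

# Toy example: paper C is cited by both A and B.
citations = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A"],
}
print(pagerank(citations))  # C ends up with the highest score
```

Run on the toy graph, paper C collects votes from both A and B and ends up with the highest score, which is exactly the "most cited wins" intuition described above.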
Citations and the Web
At the time of the Page-Brin thesis, the spammers of the world were attacking search engines with a variety of keyword stuffing techniques. Practical implementation of a link-based algorithm was a revelation, and it had a huge impact very quickly. The spammers had not yet figured out how to assault the link-based model.
As Google gained traction, this changed. Link buying and selling, large scale link swapping, blog and forum comment stuffing, and simply building huge sites and placing site-wide links on them were some of the many tactics that emerged.
Fast forward to 2014 and it appears that Google has partially won this battle. The reason we can say that they have partially won is that these days almost no one publishes articles in support of spammy link building tactics.
In fact, the concept of link building itself has been replaced with content marketing, which the overwhelming majority of people position as being about building reputation and visibility. This has happened because Google has gotten good at detecting enough of the spammers out there that the risks of getting caught are quite high. No business with investors or employees can afford to invest in spammy techniques because the downside risks aren't acceptable.
On the other hand, if you spend enough time studying search results, you can easily find examples of sites that use really bad link building practices ranking high for some terms. If you're playing by the rules and one of these sites is outranking you, it can be infuriating.
Sources of the Problem
Why does this still happen? Part of the reason is that the web isn't at all like the world of academic papers. Here are some reasons why:
- Commercial Environment with High Stakes: Fortunes are made on the interwebs. People have a huge incentive to figure out how to rank higher in Google.
- Huge Scale: It was back in October/November 2012 that Google's Matt Cutts told me that Google knew about 100 trillion web pages. By now, that has to be more like 500 trillion.
- No Cohesive Community: The academic community would probably argue that they aren't as cohesive as one might think, but compared to the web there is a clear difference. The web holds all different types of people: those who are ignorant of SEO, those who have incorrect information about how it works, those who attempt to abuse it, and finally those who try to do it the right way.
- User-Generated Content (UGC): Blog comments, forum comments, reviews, and social media sites are all examples of UGC in action. Google tries to screen all of this out, and most of these platforms mark links with the rel="nofollow" attribute, but not all of them do. As a result, spammers run automated tools that spew comments with keyword-rich anchor text links to their sites across the web (a sketch of how a crawler might handle nofollow follows this list).
- Advertising: The web is a commercial place. People sell advertising, and even if their intent is not to sell PageRank, many of them don't use the nofollow attribute on the links, simply labeling them as "Sponsored" or "Ads". Google is not always able to detect such labeling.
- Practical Anonymity: The chances of blowback if you link to a crappy site are much smaller than they are in the academic paper scenario. Because of the scale of the web, the advertising environment, and the structure of web content, a crappy link or two may just be seen as an ad, and the average visitor to a web page simply does not care.
- Complete Lack of Structure: Let's face it, the web is a chaotic place. The way sites are built, the way people interact with pages, the types of content, and the varying goals of such content lead to a web that has little real structure.
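Since the UGC point above turns on the rel="nofollow" mechanics, here is a minimal sketch, using only Python's standard library, of how a crawler building a link graph might skip nofollow links. The sample HTML and the skip-the-link-entirely policy are simplifying assumptions; the search engines' actual handling is far more nuanced.

```python
# Extract hrefs from <a> tags, ignoring any link marked rel="nofollow".
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.followed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel can hold multiple space-separated tokens, e.g. "ugc nofollow"
        rel = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel:
            return  # pass no credit through this link
        if attrs.get("href"):
            self.followed_links.append(attrs["href"])

html = '''
<a href="https://example.com/editorial">An editorial link</a>
<a href="https://example.com/spam" rel="nofollow">A blog-comment link</a>
'''

parser = LinkExtractor()
parser.feed(html)
print(parser.followed_links)  # only the editorial link survives
```

The example.com URLs are placeholders. The point of the sketch is simply that a nofollow link never enters the link graph, which is why spammers hunt for the platforms that don't apply the attribute.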
Why Haven't Google and Bing Fixed This?
Of course the search engines are trying to fix it. Don't pay any attention to anyone who suggests otherwise.
Google lives in terror of someone doing to them what they did to AltaVista. A fundamentally better algorithm would represent a huge threat to their business. And, of course, Bing would love to be the one to find such a new algo.
The money at stake here is huge, and both search engines are investing heavily in trying to develop better algorithms. The size of the spoils? The current market cap of Google is $356 billion.
The reason they haven't fixed it is that they haven't figured out how yet. Social media signals aren't the answer either, nor is measuring user interaction with the SERPs or on the pages of your site. These things might help, but if they were the answer, the search engines would already be weighting them far more heavily than they do.
What Does This Mean To You?
Frankly, it's a tough environment. Here it is in a nutshell:
- Publishers that use crappy link building practices may outrank you on key terms, and they may stay there for a while.
- Google will continue to discover and punish bad tactics to the best of their ability, uneven though that may be. They do this well enough that any serious business just needs to stay away from such tactics (most likely that means you!).
- Search engines will keep looking for creative new ways to reduce their dependence on links. This will include more ways to use social media, user interaction signals, and other new concepts as well. However, Cutts says that links are here to stay as a ranking factor for many more years.
- As search engines use more and more of these new signals, we aren't going to get a roadmap as to what they are. Yes, they patent new ideas all the time, but you won't know which patents they use and which ones they don't. In addition, even when they use an idea from a published patent, the practical implementation will likely differ greatly from what you see in the patent.
It isn't an ideal situation. Your best course of action? Focus your efforts on building your reputation and visibility online outside of the search engines. Ultimately, you want to build your own loyal audience. Here are a few ideas for doing that:
- Organic social media: Just recognize that this opportunity may be transient too. As we have seen, Facebook is reducing organic visibility in order to drive revenue growth. For that reason, new emerging social platforms are particularly powerful opportunities to get visible, provided that you pick the right horse to ride.
- Earned Media (Guest Posting): Cutts may have signaled "The Decay and Fall of Guest Blogging for SEO," but writing regular columns on the top websites in your market is something you should strive to do anyway. Don't view it as an SEO activity; it's still a surefire way to build up reputation and visibility.
- Speaking at Conferences: This is a great technique as standing up in front of a room full of people and sharing your thoughts allows them to begin developing a connection with you.
- Writing Books or eBooks: Another traditional reputation builder, but a really good one. Don't underestimate the work in writing a book though. However hard you think it is, the reality is 4 to 10 times harder.
- Develop Relationships with Influential Media and Bloggers: Building meaningful relationships with people who already have large audiences, and adding value to their lives, is always a good thing.
These activities will all give you alternative ways to build your reputation, visibility, and traffic. They also give you the best chance that your site will be sending out the types of signals that search engines want to discover and value anyway.
Ideally, your reputation will be so strong that Google's search results would be damaged if you didn't rank for relevant terms, because searchers will be looking for you. You don't have to like the way the environment operates, but it's the environment we have. Complaining won't help you, so just go out and win anyway!