A Guide to Canonical URLs - A Beginner's Guide to SEO
SEO is crucial for getting a website or page to rank favourably on search engine result pages. One of the things that can harm SEO is duplicate content. This happens when pages and sometimes websites with different URLs have the same content or content that is very similar. The issue is that search engines do not know which of the two pieces of content to index or show on their result pages. It also becomes impossible to know where to apply link metrics like trust and authority. Canonical URLs help solve this problem.
What are Canonical URLs?
A canonical URL tells search engines which of two or more pages with similar content is the main one, so they can index and rank that one. Canonical links are known as such because they carry the attribute rel="canonical", which indicates the preferred webpage or even website for a piece of content.
Canonical URLs, also called canonical links, are especially important for e-commerce websites and websites using content management systems. In e-commerce websites, you might have two links like “/products/product-name” and “/products/category/product-name” pointing to the same content.
If you want the first URL to be the main one, add a link element with the attribute rel="canonical" to the head section of each page that serves this content, with its href pointing to that first URL. The element in the head will look something like this:
<link rel="canonical" href="/products/product-name" />
This element is not displayed to visitors; it sits in the page source, where search engines read it. It has SEO benefits and does not harm or change the user experience. All websites where there is a potential for duplicate content should be using it.
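To check which canonical URL a page actually declares, you can read it out of the HTML. The following is a minimal sketch using only Python's standard library; the class name CanonicalFinder, the helper find_canonicals, and the sample page are illustrative assumptions, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens; compare them case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text):
    """Return a list of canonical hrefs found in the given HTML text."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Hypothetical page built from the article's example URL.
page = """<html><head>
<title>Product name</title>
<link rel="canonical" href="/products/product-name" />
</head><body><h1>Product name</h1></body></html>"""

print(find_canonicals(page))
```

A page would normally declare a single canonical URL, so a result with zero or several entries is worth a closer look.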
Why You Need Canonical Links
As mentioned, canonical links are crucial for dealing with duplicate content and telling search engines which page to index and rank on result pages. Apart from internal duplicate content, canonical links also help with external duplicate content.
An example is where a writer provides a guest post to a blog they do not own, or to a syndication service. Adding a canonical link from the guest post or syndicated copy to the version on the writer's own site helps eliminate any external duplicate content issues that could come up.
Canonical links can also help with backlinking. You want people who find your content and want to link to it to do so using the best URL. Because the canonical URL is the one search engines show in their results, it is the one people are most likely to copy, which helps build the link equity and backlink profile of the URL you prefer. For a better understanding of backlinks and how they help with your SEO, read this SEO beginner's guide. It will help you get started with SEO quickly.
Lastly, canonical links help simplify metric tracking. Your data will get confusing and fragmented if you have many links pointing to the same content. Canonical links consolidate that data under one URL, which also keeps things simple when reporting performance data and metrics to SEO clients.
When to Use Canonical URLs
There is no scenario where it is a bad idea to include canonical URLs. However, you still need to know when and how to use this type of link.
The first instance is when you have a single version of a page, making the page unique. In such instances, you can use a canonical link that references the page itself. Such a self-referencing link tells search engines that this URL is the preferred one and that it is the only one to index for that content.
Another instance is where a canonical link is required to reference another page, the main page that should be indexed. You may need to do this when:
- Multiple versions of a page were created intentionally
- Pages are very similar with only slight differences
- The link includes query parameters like “&product=product-name” but you still want it to point to a specific page
Following our e-commerce example, product pages with query parameters that dictate different languages such as “&lang=uk” and “&lang=de” will have different links but might point to the same product. Product variations can also lead to duplicate content, with links like “/product/shoes/men-38-red” and “/product/shoes/men-38-white” pointing to the same page but showing up as two different URLs.
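One way to work out the canonical URL for a parameterised link is to drop the query parameters that do not change the content. The sketch below assumes a hypothetical shop where the lang parameter is such a parameter; the IGNORED_PARAMS set and the function name are illustrative, and which parameters are safe to ignore depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical shop, these parameters do not change the product shown.
IGNORED_PARAMS = {"lang"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Return the URL without the ignored query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


for variant in (
    "https://example.com/products/product-name?lang=uk",
    "https://example.com/products/product-name?lang=de",
):
    print(variant, "->", strip_ignored_params(variant))
```

Both variants collapse to the same URL, which is the value you would place in the canonical link of each variant page.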
Choosing The Canonical URL Structure
Even in cases where you don't think you have duplicate content, the scenarios discussed above can lead to the issue. It is therefore better to have canonical links than not to have them. To get started, you need to consider the different variations you could use.
Consider the following URLs:
- https://example.com/product-name
- https://www.example.com/product-name
- http://example.com/product-name
- http://www.example.com/product-name
All point to the same page but introduce a duplicate content issue. This is before we start talking about URLs with or without a slash (/) at the end.
To avoid all this, pick one version and use it everywhere. Best practice is to choose the “https” version and to settle on a single trailing-slash convention; this guide uses URLs without the trailing slash. HTTPS URLs are secure, and a consistent format avoids complications wherever the links are placed.
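The convention above can be sketched as a small normalisation function. This is only an illustration: it assumes the www and non-www hosts serve the same site and that example.com is the preferred host, and the function name normalise is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise(url, preferred_host="example.com"):
    """Force https, fold the www host into the preferred host, drop the trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: "www.<host>" and "<host>" are the same site.
    if host == "www." + preferred_host:
        host = preferred_host
    # Keep a bare "/" for the home page so the path is never empty.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "https://example.com/product-name",
    "https://www.example.com/product-name",
    "http://example.com/product-name",
    "http://www.example.com/product-name/",
]
for variant in variants:
    print(variant, "->", normalise(variant))
```

All four variants map to https://example.com/product-name, so each of them would declare that URL as its canonical.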
It is also important to consider which URL you think is more important. There is a lot of discussion surrounding URL structures, but it is better to pick a structure that is easy to copy and remember. For example, it is much easier to work with “/products/product-name” than “/products/men/shoes/product-name”.
Adding The Canonical Element
This is perhaps the easiest part of the whole process. Go to each page you do not want to be the main page and add a link element with rel="canonical" to its head section, with the href set to the preferred URL. For our e-commerce website, we would add the following to the head section:
<link rel="canonical" href="/products/product-name" />
Do the same for the other duplicate pages and you are done. Different URLs that serve the same content and carry the same canonical link will now be treated as one: the canonical URL.
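If your pages come from templates, the element can be generated instead of typed by hand. The sketch below is one possible approach, not a feature of any particular content management system; canonical_tag and add_to_head are hypothetical helpers, and add_to_head assumes the page contains exactly one lowercase </head> tag.

```python
from html import escape


def canonical_tag(canonical_url):
    """Return the link element for the head section, with the URL escaped for HTML."""
    return '<link rel="canonical" href="{}" />'.format(escape(canonical_url, quote=True))


def add_to_head(html_text, canonical_url):
    """Insert the canonical tag just before </head> (assumes one lowercase </head>)."""
    return html_text.replace("</head>", canonical_tag(canonical_url) + "\n</head>", 1)


# Hypothetical duplicate page from the e-commerce example.
duplicate_page = "<html><head>\n<title>Product name</title>\n</head><body></body></html>"
print(add_to_head(duplicate_page, "/products/product-name"))
```

Escaping matters once the preferred URL contains characters such as & in a query string.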
Using Canonical Links In Copied Content
There are some cases where you cannot avoid copying content from a main site verbatim. In those cases, you should treat your website as hosting duplicate content and add a canonical link that points to the original content.
However, adding a canonical link to such content is not always necessary, and search engines will likely not penalise you for leaving it out. Why? Search engines have become much smarter at identifying the type of content added to a website, so they know a press release or other type of content is likely to be copied.
The second reason is that search engines are much better at identifying the primary source of a piece of content. They use various sources of information, including timestamps in sitemaps, to work this out.
Avoiding the use or addition of canonical links in copied content might work in specific cases, but it is sometimes considered grey-hat SEO. This means that even though it is an option, it is best to avoid it.
Things To Keep in Mind When Using Canonical Links
We mentioned that search engines “merge” two or more links so they act like the main canonical link. When this happens, these URLs act as soft redirects to the main page. In some cases, you might want to consider whether a 301 redirect would be better for the job.
Using a 301 redirect says there is only one copy of your content and you redirect everyone there. Using a canonical link says you have multiple page versions people can view, but search engines are asked to index and rank only the main or preferred URL.
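The difference can be seen in what the server sends back. The following is a minimal sketch of a WSGI application using only the standard library conventions; the REDIRECTS mapping, the app function, and the request helper are hypothetical and only illustrate the idea. A redirected path answers with a 301 and a Location header, while any other path answers with a 200 and a page that carries the canonical link.

```python
# Hypothetical mapping from retired paths to the page that replaces them.
REDIRECTS = {
    "/products/category/product-name": "/products/product-name",
}

# Page served with a 200; it still declares its preferred URL.
PAGE = (
    b'<html><head><link rel="canonical" href="/products/product-name" /></head>'
    b"<body>Product page</body></html>"
)


def app(environ, start_response):
    """WSGI app: 301 for retired paths, otherwise the page itself."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # Nobody can view this URL any more; everyone is sent to the target.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response(
        "200 OK",
        [("Content-Type", "text/html; charset=utf-8"), ("Content-Length", str(len(PAGE)))],
    )
    return [PAGE]


def request(path):
    """Call the app the way a WSGI server would and return (status, headers)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status
        seen["headers"] = dict(headers)

    app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response)
    return seen["status"], seen["headers"]


print(request("/products/category/product-name"))
print(request("/products/product-name"))
```

To try it in a browser, the app could be served locally with wsgiref.simple_server.make_server from the standard library.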
Another thing to think about is whether you want to stop search engines from accessing pages with similar or duplicate content. You can do this on your website by using a robots.txt file. This can be a better solution in some cases, especially for pages you never want to be found, but it is not always the best solution.
Blocking a page like this can mean you miss out on content signals, engagement signals, and other signals and metrics that might have contributed to the ranking of the original page. Unless there is a technical need, it is best to use a canonical URL before using a rule in your robots.txt file.
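The trade-off is easier to see with a concrete rule. The sketch below uses Python's standard urllib.robotparser module to test URLs against a hypothetical robots.txt; the rule and the crawlable helper are assumptions made for illustration. A crawler that obeys the rule never fetches the blocked page, so it also never reads any canonical link in that page's head.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the duplicate category URLs.
rules = """\
User-agent: *
Disallow: /products/category/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)


def crawlable(url, user_agent="*"):
    """Return True if the rules above allow the given user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


for url in (
    "https://example.com/products/category/product-name",
    "https://example.com/products/product-name",
):
    print(url, "crawlable:", crawlable(url))
```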
Lastly, some people consider deleting pages with duplicate content as “cleaning things up”. This inclination is understandable, but it can be harmful. Consider whether people might have saved your non-canonical URL in a bookmark or linked to it from a social media profile, and whether someone might want to reference it later. If you delete the page, those visitors will land on a 404 page. If enough people do this, you may see a hit to your search engine rankings.
If you must delete pages with duplicate content, maybe because you have updated the main page, think about providing a 301 redirect that does not break the user experience.
Canonical URLs are crucial for dealing with duplicate content. Used right, they can help search engines rank the right pages. To ensure they are as effective as possible, website owners and SEOs should know how to create and leverage them in different situations and in different ways. They can also be used alongside other tools like redirects to give your visitors the best user experience.