I’m afraid that "keyword cannibalization" will occur when, for example, I have the following two pages on my site: /en/shoe and /en-us/shoe. Can I assume with absolute certainty that Google in the United States shows only /en-us/shoe in the SERP when it is clear to the search engine that the person searching for the keyword "shoe" speaks English and is in the United States? That only /en/shoe is shown when the search engine knows only that the person speaks English? And that the "x-default" page is the only one shown to that person when neither is known? I understand that this outcome, if it can occur at all, is only possible if the hreflang tags are applied perfectly across the site.
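To make the behaviour I am hoping for concrete, here is a minimal sketch of the fallback order described above. It only models my expectation (exact language-region match, then language only, then x-default); it is not a description of how Google actually selects a page. The `ALTERNATES` table and the `expected_page` function are hypothetical names made up for this illustration.

```python
from typing import Optional

# Hypothetical hreflang -> path table; a shortened version of my page set.
ALTERNATES = {
    "x-default": "/en/shoe",
    "en": "/en/shoe",
    "en-us": "/en-us/shoe",
    "fr": "/fr/chaussure",
}


def expected_page(language: Optional[str], region: Optional[str]) -> str:
    """Return the page I would expect to be shown for a given searcher.

    This encodes the assumption in the question, not Google's algorithm.
    """
    # 1. Language and region both known: prefer the exact variant.
    if language and region:
        exact = f"{language}-{region}".lower()
        if exact in ALTERNATES:
            return ALTERNATES[exact]
    # 2. Only the language is usable: fall back to the language-only page.
    if language and language.lower() in ALTERNATES:
        return ALTERNATES[language.lower()]
    # 3. Nothing matches: fall back to x-default.
    return ALTERNATES["x-default"]
```

Under this model an English speaker in the United States gets /en-us/shoe, an English speaker in a country without its own variant gets /en/shoe, and a searcher whose language has no page at all gets the x-default page.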
If I have understood everything correctly, a perfect application of hreflang should look like this.
You can write the word ‘shoe’ in the following languages:
- French (fr) : chaussure
- Spanish (es) : zapato
My fictitious site has the following pages with the corresponding hreflangs:
- /en/shoe (en) and (x-default)
- /es/zapato (es)
- /fr/chaussure (fr)
- /en-us/shoe (en-us)
- /en-ca/shoe (en-ca)
- /en-uk/shoe (en-uk)
- /es-us/zapato (es-us)
- /es-ca/zapato (es-ca)
- /es-uk/zapato (es-uk)
- /fr-us/chaussure (fr-us)
- /fr-ca/chaussure (fr-ca)
- /fr-uk/chaussure (fr-uk)
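One detail in this list is worth a sanity check: as far as I know, the region part of an hreflang value is expected to be an ISO 3166-1 alpha-2 country code, and the code for the United Kingdom in that standard is GB, not UK. The sketch below checks values against small hand-picked subsets of the language and region codes. The two sets and the `check_hreflang` helper are assumptions made for this example, not a complete validator.

```python
# Hand-picked subsets for illustration only; the real ISO lists are much longer.
ISO_639_1_LANGUAGES = {"en", "es", "fr"}
ISO_3166_1_REGIONS = {"us", "ca", "gb", "au", "nz"}


def check_hreflang(value: str) -> str:
    """Return "ok" or a short description of what looks wrong."""
    value = value.lower()
    if value == "x-default":
        return "ok"
    # Split "en-us" into language "en" and region "us"; region is "" if absent.
    language, _, region = value.partition("-")
    if language not in ISO_639_1_LANGUAGES:
        return f"unknown language '{language}'"
    if region and region not in ISO_3166_1_REGIONS:
        return f"unknown region '{region}'"
    return "ok"


codes = ["en", "en-us", "en-ca", "en-uk", "es-uk", "fr-uk"]
# Collect only the values that fail the check.
problems = {c: check_hreflang(c) for c in codes if check_hreflang(c) != "ok"}
```

With these subsets, `problems` contains the three `-uk` values, which suggests the hreflang values should read en-gb, es-gb and fr-gb even if the URL paths keep /en-uk/ and so on.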
Each of the pages listed above carries the following hreflang structure:
<head>
  <title>Wha efa</title>
  <link rel="alternate" hreflang="x-default" href="http://example.com/en/shoe" />
  <link rel="alternate" hreflang="en" href="http://example.com/en/shoe" />
  <link rel="alternate" hreflang="es" href="http://example.com/es/zapato" />
  <link rel="alternate" hreflang="fr" href="http://example.com/fr/chaussure" />
  <link rel="alternate" hreflang="en-us" href="http://example.com/en-us/shoe" />
  <link rel="alternate" hreflang="en-ca" href="http://example.com/en-ca/shoe" />
  <link rel="alternate" hreflang="en-uk" href="http://example.com/en-uk/shoe" />
  <link rel="alternate" hreflang="es-us" href="http://example.com/es-us/zapato" />
  <link rel="alternate" hreflang="es-ca" href="http://example.com/es-ca/zapato" />
  <link rel="alternate" hreflang="es-uk" href="http://example.com/es-uk/zapato" />
  <link rel="alternate" hreflang="fr-us" href="http://example.com/fr-us/chaussure" />
  <link rel="alternate" hreflang="fr-ca" href="http://example.com/fr-ca/chaussure" />
  <link rel="alternate" hreflang="fr-uk" href="http://example.com/fr-uk/chaussure" />
</head>
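Because every page has to carry exactly the same set of annotations, including a reference to itself, I would probably generate the tags from a single table rather than maintain them by hand. The following is a minimal sketch of that idea; `PAGES` is a shortened, hypothetical version of my page list and `hreflang_links` is a name invented for this example.

```python
from html import escape

BASE_URL = "http://example.com"  # same host as in the markup above

# Shortened page table for illustration; the real one would list every variant.
PAGES = {
    "x-default": "/en/shoe",
    "en": "/en/shoe",
    "es": "/es/zapato",
    "fr": "/fr/chaussure",
    "en-us": "/en-us/shoe",
}


def hreflang_links(pages: dict, base_url: str = BASE_URL) -> list:
    """Build one <link rel="alternate"> tag per hreflang value."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" '
        f'href="{escape(base_url + path)}" />'
        for code, path in pages.items()
    ]


links = hreflang_links(PAGES)
head_fragment = "\n".join(links)  # identical fragment for every page's <head>
```

Emitting the same `head_fragment` on every page keeps the annotations in sync whenever a variant is added or removed.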
Before I go any further, it is important to note the following. Every page about "shoe" that is written in English has almost exactly the same text, so it is duplicate content. The same goes for the Spanish and French pages on the subject of "shoe". However, the countries have to be distinguished, because the shoes themselves are, in a manner of speaking, the only thing that is country-specific. Using only the /en, /es and /fr variants is therefore not sufficient.
So I am looking for sources and facts that conclusively establish that keyword cannibalization will not take place. Hopefully you can provide them. 🙂
Do you need the /en directory? Is that for customers from Australia and New Zealand? Do the pricing and shipping options even make sense for them? – Stephen Ostermiller♦ 4 hours ago
Very good point. Thank you for your comment. In practice I will also have an /en-au/shoe and an /en-nz/shoe page. The /en/shoe page is meant for people in non-English-speaking countries who search with the keyword "shoe". What if someone in Pakistan searches for "shoe"? That person is not supposed to land on the /en-us/shoe page.
Just so we’re clear: my question may have given the impression that this is a webshop, but the idea behind it is that people should land on the page that best (!!) matches the language they speak (and search in) and where they are located. Sometimes the search engine does not have all of this information, so instead of showing /en-us/shoe it can show /en/shoe. So I am willing to create all those different pages for the same keyword, but only if there is no keyword cannibalization.
To sum up: I’m afraid that with all these different pages for different geo-targets and languages, Google’s algorithm will get confused and show several variants of the same page in the United States search results for the keyword "shoe", for example /shoe, /en/shoe and /en-us/shoe. That is the cannibalization I’m so afraid of.