Content authoring excellence when handling multiple countries and languages in Sitecore
In my previous post I explained the conceptual basics & recommended principles for building global multilingual websites. This post takes a deep dive into your first type of stakeholder, the "Sitecore CMS users" (content managers), and how to master the challenges of authoring content for multi-country and multi-language websites on the Sitecore Experience Platform.
What areas turn out to be challenging when adding multilingual content using Sitecore?
1) Using Sitecore's built-in language fallback mechanisms to one's advantage
2) Cleverly setting up shared & final page layouts for delivering pages in multiple language cultures
3) Properly allowing the use of custom “fake” language cultures
4) Mastering Solr search: making sure content in every language culture – also custom cultures – can be indexed & delivered for searches
5) Providing your content managers with additional tools to simplify their content authoring experience and managing translations
Let's go through these topics in detail.
How does the Sitecore language fallback work?
Sitecore provides extensive official documentation on the language fallback mechanism, especially on how to configure it: Enable and set up language fallback
Simply put, language fallback lets a page or component inherit layout or content data from another language culture when that data has not been added in the requested culture. The following scheme shows how cultures can be related to each other, so that content missing in one language version is displayed from one of its relatives.
[caption id="attachment_5050" align="alignnone" width="300"] Sitecore's language fallback principles simply explained[/caption]
Hint: once you have identified all countries and languages your website needs to serve content in (you should have this in a language concept document), define the relations and fallback scenarios between the identified cultures.
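To make the idea of fallback relations more tangible, here is a minimal sketch that models them as a chain lookup in plain PowerShell. It is an illustration only, not Sitecore's implementation or API: the function names `Get-FallbackChain` and `Resolve-FieldValue`, the culture relations in `$fallback` and the sample values are all hypothetical.

```powershell
# Hypothetical fallback relations: each culture points to the culture it falls back to.
$fallback = @{
    'de-CH' = 'de-DE'
    'de-DE' = 'en'
    'en-GB' = 'en'
}

function Get-FallbackChain {
    param(
        [Parameter(Mandatory)] [string] $Culture,
        [Parameter(Mandatory)] [hashtable] $Map
    )
    $chain = [System.Collections.Generic.List[string]]::new()
    $current = $Culture
    # Follow the relations until no fallback is defined; stop on cycles.
    while ($current -and -not $chain.Contains($current)) {
        $chain.Add($current)
        $current = $Map[$current]
    }
    return $chain
}

function Resolve-FieldValue {
    param(
        [Parameter(Mandatory)] [string] $Culture,
        [Parameter(Mandatory)] [hashtable] $Map,
        [Parameter(Mandatory)] [hashtable] $Values
    )
    # Return the first non-empty value found along the fallback chain.
    foreach ($c in Get-FallbackChain -Culture $Culture -Map $Map) {
        if (-not [string]::IsNullOrEmpty($Values[$c])) {
            return $Values[$c]
        }
    }
    return $null
}

# Sample content: only the "en" version has a title.
$titles = @{ 'en' = 'Hello'; 'de-DE' = '' }
$title  = Resolve-FieldValue -Culture 'de-CH' -Map $fallback -Values $titles
```

Writing the relations down in this form, even just in your language concept document, makes it easy to spot cultures without any fallback and accidental cycles.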
Deciding on how to use shared & final layouts and fields
The key for this topic is to decide and act early: you will lay a major cornerstone with this decision for all your content managers & your content authoring process!
Basically you can ask yourself and discuss with stakeholders:
Will all webpages share the same layout ("design", placement of elements) across every language version…
…or will we have individual arrangements of elements on pages, based on their target language/country/market?
For the former, use the so-called "Shared layout". For the latter, you will need to work with the "Final layout". To simplify life for your content managers, use the Sitecore configuration to define the shared or final layout as the default "mode" they will work in!
But it doesn't stop at that level: make smart use of the field-level “Shared” setting in Sitecore! Shared means that a field has the same value across all versions in all languages (remark: it also overrides the versioned / unversioned flag, which matters if you plan to use version history for changes at field level).
[caption id="attachment_5051" align="alignnone" width="300"] An example of a "Shared" field in Sitecore[/caption]
Obviously, you should only use this setting for fields that hold language-independent values, such as a configuration parameter or an e-mail address. If in doubt, don’t enable this checkbox: it will become a mess when your global content management teams start adding their content and continuously overwrite each other's values…
Working with custom cultures in Sitecore
By default, you can only add and work with language cultures in the Sitecore CMS that are also registered in the underlying Windows operating system. Sometimes, however, you need to deliver content to a "special case". This is where "custom cultures" come in: basically any combination of language and country that is not officially recognized for that particular country. Some examples: English for France, Spanish in the US, German in the Netherlands, and so on.
Therefore, you always need to register custom cultures in Windows first, before making them available in the Sitecore backend! Keep in mind: a culture specifies not only its language and country relation, but also which time, date and number formats to use!
In order to make custom cultures available for your website, you will need to…
- use a PowerShell script on the Windows OS level to register them
- update Sitecore’s App_Config/LanguageDefinitions.config (this is required since Sitecore 9.0)
- add the language in the Sitecore backend under /sitecore/system/languages (or via the Control Panel using "Add new language")
[caption id="attachment_5057" align="alignnone" width="300"] How to add new language cultures in Sitecore[/caption]
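The Windows registration step could look roughly like the following sketch. It uses the .NET Framework class `System.Globalization.CultureAndRegionInfoBuilder` from `sysglobl.dll`. The culture name `en-FR` and the choice of `en-GB` and `FR` as data sources are assumptions for illustration only. Behavior differs between Windows versions (newer versions may already know the culture), so try it on a non-production machine first.

```powershell
# Run in an elevated Windows PowerShell session: registering a culture
# requires administrator rights and the .NET Framework (not PowerShell Core).

# Assumptions for illustration: "English for France", built from existing data.
$cultureName = 'en-FR'   # hypothetical custom culture to register
$baseCulture = 'en-GB'   # source for language, date, time and number formats
$baseRegion  = 'FR'      # source for region data (country, currency, ...)

# CultureAndRegionInfoBuilder lives in sysglobl.dll.
[void][System.Reflection.Assembly]::LoadWithPartialName('sysglobl')

$builder = New-Object System.Globalization.CultureAndRegionInfoBuilder -ArgumentList `
    $cultureName, ([System.Globalization.CultureAndRegionModifiers]::None)

# Copy culture and region data from the chosen sources.
$builder.LoadDataFromCultureInfo((New-Object System.Globalization.CultureInfo -ArgumentList $baseCulture))
$builder.LoadDataFromRegionInfo((New-Object System.Globalization.RegionInfo -ArgumentList $baseRegion))

# Register the culture with the operating system.
$builder.Register()
```

The script has to run on every server that hosts Sitecore, because the culture is registered per machine.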
Teaching Solr-search to understand your website's languages
Whether or not you need custom cultures on your website, there are some pitfalls to address before Solr search properly understands and handles content in various languages. If you don't "help" Solr with multiple cultures, you will see a lot of Solr exceptions in the log files, which can cause pages not to be indexed at all or search results not to show up.
- There are some widely used, but missing languages in Solr out-of-the-box: Chinese (zh), Korean (ko), Polish (pl), Slovak (sk)
- There are some out-of-the-box supported cultures in Solr but with wrong language identifier: Czech (is "cz" => should be "cs"), Norwegian (is "no" => should be "nb")
- Consider that some language-dependent features in Solr won't work out-of-the-box, such as Tokenizers (e.g. HMMChineseTokenizerFactory), Stopwords & Stemmers (e.g. StempelPolishStemFilterFactory, …)
- As already mentioned: you have to introduce your custom cultures to Solr, too, by modifying its managed-schema file.
Read more on that in this excellent post by my colleague, Fabian: Adding new languages to Sitecore’s Solr indexes
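To spot such gaps before they show up as indexing exceptions, it can help to compare the language codes used in Sitecore with the language-specific field types in the Solr schema. The following is a minimal sketch, not part of Sitecore or Solr: the function name `Get-MissingSolrLanguage` is hypothetical, and it assumes that the schema names its language field types `text_<language code>` (for example `text_en`), which you should verify against your own managed-schema.

```powershell
function Get-MissingSolrLanguage {
    param(
        [Parameter(Mandatory)] [string[]] $Language,
        [Parameter(Mandatory)] [string[]] $FieldTypeName
    )
    foreach ($code in $Language) {
        # Assumption: language field types are named text_<code>, e.g. text_en.
        if ($FieldTypeName -notcontains "text_$code") {
            $code   # emit every language code without a matching field type
        }
    }
}

# Example with hard-coded values. In practice the field type names could be
# read from the Solr Schema API (GET <solr-url>/<core>/schema/fieldtypes).
$missing = Get-MissingSolrLanguage -Language 'en', 'cs', 'pl' `
    -FieldTypeName 'text_en', 'text_cz', 'text_general'
```

With these sample values the check reports `cs` and `pl`: Czech because the schema only knows the identifier "cz", Polish because no field type exists at all.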
Hint: keep in mind that your search solution, Solr in this case, does not *automagically* understand your content in all languages. We often see customers freely adding new languages in Sitecore, because it's fairly easy, only to find that search is suddenly broken.
Helpful tools for content authoring & translations
When your content managers face the challenge of working with multiple language cultures in Sitecore, some helpful tools can greatly increase content authoring efficiency, so you should plan to implement them.
- automated content translation with your preferred external agencies
- the “Language Checklist Report Tool” available on the Sitecore Marketplace: Sitecore_Language_Checklist_Report_Tool.aspx
- We at Namics built a public Sitecore module for easy “Copy-Content-to-Languages”: namics/SitecoreCopyPageToVersions. It allows, for example, copying whole page content from version en-US to en-GB, en-AU, etc. at once.
- If shared & final layouts are both used, allow content managers to merge the final layout into the shared layout using the Sitecore PowerShell Extensions, which come with a Merge-Layout command for that. Example code snippet:
# Merge the final layout into the shared layout for all items below the given path
Get-ChildItem -Path "master:\content\Showcase\int" -Recurse | Merge-Layout
What's next?
In the next post of this series you will learn what steps to take to achieve strong, internationally targeted SEO (search engine optimization) for global, multilingual websites delivered through the Sitecore Experience Platform.
Background story
This October my colleague Fabian Geiger and I had the pleasure of holding a breakout session at the Sitecore Symposium 2018 in Orlando. This blog post is a summary of and follow-up to the knowledge we shared there on building globally focused websites based on the Sitecore Experience Platform, with multiple languages and country cultures. Consider this a non-exhaustive compilation of learnings and recommendations from our project work at Namics.