Forcing Lowercase URLs in Optimizely CMS During Auto-Translation

Auto-translation tools like EPiServer.Labs.LanguageManager are fantastic for scaling multilingual sites, but they often come with a frustrating side effect: they play fast and loose with your URL segments.

Recently, I ran into an issue where translated URL segments were retaining uppercase letters and, even worse, including stray punctuation. For example, the segment for an English page titled "Cheese Cheese" came out as "Käse-,Käse" instead of the clean, SEO-friendly "käse-käse".

Here is why this happens and how to fix it by intercepting the translation workflow.

Bypassing the Standard Pipeline

Usually, you'd expect UrlSegmentOptions or a custom IUrlSegmentGenerator to handle this. However, the translation tool bypasses the standard Optimizely URL generation pipeline in three ways:

- Direct Assignment: The tool sets the URLSegment property directly on the content object.
- Property Overwriting: After creating a new language version, the tool copies properties from the source language, often overwriting any values set during initial page initialization.
- Configuration Ignored: Because the segment is set manually by the tool, standard UrlSegmentOptions (like UseLowercase) are simply ignored.
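To make the bypass concrete, here is a minimal standalone sketch. FakePage, CreateViaPipeline and CreateViaTranslation are hypothetical stand-ins invented for illustration; they are not Optimizely types and only mimic the difference between a segment derived from the page name and a segment assigned directly.

```csharp
using System;

// Hypothetical stand-in for a page; not an Optimizely type.
class FakePage
{
    public string Name { get; set; } = "";
    public string URLSegment { get; set; } = "";
}

static class BypassDemo
{
    // Stand-in for the standard pipeline: the segment is derived from the name.
    public static void CreateViaPipeline(FakePage page) =>
        page.URLSegment = page.Name.ToLowerInvariant().Replace(' ', '-');

    // Stand-in for the translation tool: the segment is assigned directly,
    // so no normalization ever runs.
    public static void CreateViaTranslation(FakePage page, string translatedSegment) =>
        page.URLSegment = translatedSegment;

    static void Main()
    {
        var viaPipeline = new FakePage { Name = "Cheese Cheese" };
        CreateViaPipeline(viaPipeline);
        Console.WriteLine(viaPipeline.URLSegment);    // cheese-cheese

        var viaTranslation = new FakePage { Name = "Käse Käse" };
        CreateViaTranslation(viaTranslation, "Käse-,Käse");
        Console.WriteLine(viaTranslation.URLSegment); // Käse-,Käse (left as assigned)
    }
}
```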

A Multi-Part Event Interceptor

To fix this, we need to intercept the content at the right moment: after the translation tool has done its work, but before the data is committed to the database.

The Custom URL Generator

First, we need a generator that handles Unicode characters (to support Japanese, Arabic, Cyrillic, etc.) while forcing everything to lowercase.

using System.Text.RegularExpressions;
using EPiServer.Web;

public class LowercaseUrlSegmentGenerator : IUrlSegmentGenerator
{
    public string Create(string source, UrlSegmentOptions options)
    {
        if (string.IsNullOrWhiteSpace(source)) return "page";

        var segment = source.ToLowerInvariant();

        // Replace spaces with hyphens
        segment = segment.Replace(' ', '-');

        // Clean up: Keep Unicode letters (\p{L}), numbers (\p{N}), and standard URL chars
        segment = Regex.Replace(segment, @"[^\p{L}\p{N}\-_\.~\$]", "");

        // Remove multiple/leading/trailing hyphens
        segment = Regex.Replace(segment, "-+", "-");
        segment = segment.Trim('-', '.');

        return string.IsNullOrWhiteSpace(segment) ? "page" : segment;
    }

    public bool IsValid(string urlSegment, UrlSegmentOptions options)
    {
        return !string.IsNullOrWhiteSpace(urlSegment) && 
               Regex.IsMatch(urlSegment, @"^[\p{L}\p{N}\-_\.~\$]+$");
    }
}

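Then register the generator with the service collection (for example, in Startup.ConfigureServices):

services.AddSingleton<IUrlSegmentGenerator, LowercaseUrlSegmentGenerator>();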
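As a quick sanity check, the normalization steps can be tried outside the CMS. The sketch below is a standalone copy of the logic in Create with no Optimizely dependency; SegmentDemo and Normalize are names made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class SegmentDemo
{
    // Standalone copy of the normalization steps in Create (illustrative only).
    public static string Normalize(string source)
    {
        if (string.IsNullOrWhiteSpace(source)) return "page";

        // Lowercase first, then turn spaces into hyphens.
        var segment = source.ToLowerInvariant().Replace(' ', '-');

        // Drop everything except Unicode letters, numbers and - _ . ~ $
        segment = Regex.Replace(segment, @"[^\p{L}\p{N}\-_\.~\$]", "");

        // Collapse repeated hyphens and trim leading/trailing hyphens and dots.
        segment = Regex.Replace(segment, "-+", "-");
        segment = segment.Trim('-', '.');

        return string.IsNullOrWhiteSpace(segment) ? "page" : segment;
    }

    static void Main()
    {
        Console.WriteLine(Normalize("Käse-,Käse"));    // käse-käse
        Console.WriteLine(Normalize("Cheese Cheese")); // cheese-cheese
        Console.WriteLine(Normalize("Привет Мир"));    // привет-мир
        Console.WriteLine(Normalize("!!!"));           // page
    }
}
```

The last call shows the fallback: when nothing usable survives the cleanup, the segment becomes "page".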

Tracking the State

We don't want to constantly overwrite URLs (especially if an editor has manually changed one). We need a tracking property on our BasePageData.

Note: The [CultureSpecific] attribute is critical here. Each language version must track its own "processed" state independently.

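using System.ComponentModel.DataAnnotations;
using EPiServer.Core;
using EPiServer.DataAnnotations;

public class BasePageData : PageData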
{
    [Display(Name = "URL segment processed", GroupName = "Settings")]
    [CultureSpecific] 
    public virtual bool? UrlSegmentProcessed { get; set; }
}

The Event Handler

This is where we detect if the page was just created by a translation tool. We hook into the SavingContent event.

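using System;
using EPiServer;
using EPiServer.Core;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.ServiceLocation;
using EPiServer.Web;

[InitializableModule]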
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class UrlSegmentEventHandler : IInitializableModule
{
    private IContentEvents _contentEvents;
    private IUrlSegmentGenerator _urlSegmentGenerator;

    public void Initialize(InitializationEngine context)
    {
        _contentEvents = context.Locate.Advanced.GetInstance<IContentEvents>();
        _urlSegmentGenerator = context.Locate.Advanced.GetInstance<IUrlSegmentGenerator>();
        _contentEvents.SavingContent += OnSavingContent;
    }

    private void OnSavingContent(object sender, ContentEventArgs e)
    {
        if (e.Content is BasePageData basePage)
        {
            // The "Detection" Logic
            if (basePage.UrlSegmentProcessed == null && 
                basePage.Status == VersionStatus.CheckedOut && 
                basePage.StartPublish == null &&
                basePage.Created > DateTime.Now.AddMinutes(-5))
            {
                basePage.UrlSegmentProcessed = false;
            }

            if (basePage.UrlSegmentProcessed == false && !string.IsNullOrWhiteSpace(basePage.URLSegment))
            {
                RegenerateUrlSegment(e.Content, basePage);
            }
        }
    }

    private void RegenerateUrlSegment(IContent content, BasePageData basePage)
    {
        var options = new UrlSegmentOptions { SupportIriCharacters = true };
        var newSegment = _urlSegmentGenerator.Create(content.Name, options);
        
        basePage.URLSegment = newSegment;
        basePage.UrlSegmentProcessed = true;
    }

    public void Uninitialize(InitializationEngine context) 
        => _contentEvents.SavingContent -= OnSavingContent;
}

The secret is in the detection logic inside OnSavingContent. We look for four specific criteria:

- UrlSegmentProcessed == null: The property hasn't been set yet.
- Status == CheckedOut: The content is a draft.
- StartPublish == null: The page has never been live.
- Created > DateTime.Now.AddMinutes(-5): The page was created in the last 5 minutes.

This specific combination allows us to target brand-new translations without accidentally touching old drafts or existing published content from years ago. We could probably shorten the time window, as the translation happens pretty much instantly, but 5 minutes felt like a fair balance.
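The flow can be simulated without the CMS to see how the flag keeps later saves from touching the segment. This is only a sketch: DraftPage, SaveDemo and the simplified segment generation are hypothetical stand-ins, not Optimizely types, and IsCheckedOut stands in for the Status == VersionStatus.CheckedOut check.

```csharp
using System;

// Hypothetical stand-in for the page; not an Optimizely type.
class DraftPage
{
    public string Name { get; set; } = "";
    public string URLSegment { get; set; } = "";
    public bool? UrlSegmentProcessed { get; set; }
    public bool IsCheckedOut { get; set; } = true;
    public DateTime? StartPublish { get; set; }
    public DateTime Created { get; set; } = DateTime.Now;
}

static class SaveDemo
{
    // Mirrors the four detection criteria from OnSavingContent.
    public static bool LooksLikeFreshTranslation(DraftPage p, DateTime now) =>
        p.UrlSegmentProcessed == null
        && p.IsCheckedOut
        && p.StartPublish == null
        && p.Created > now.AddMinutes(-5);

    // Mirrors the handler: flag the page, regenerate once, then leave it alone.
    public static void OnSaving(DraftPage p, DateTime now)
    {
        if (LooksLikeFreshTranslation(p, now)) p.UrlSegmentProcessed = false;

        if (p.UrlSegmentProcessed == false && !string.IsNullOrWhiteSpace(p.URLSegment))
        {
            // Simplified stand-in for the generator call.
            p.URLSegment = p.Name.ToLowerInvariant().Replace(' ', '-');
            p.UrlSegmentProcessed = true;
        }
    }

    static void Main()
    {
        var now = DateTime.Now;

        var fresh = new DraftPage { Name = "Käse Käse", URLSegment = "Käse-,Käse", Created = now };
        OnSaving(fresh, now);
        Console.WriteLine(fresh.URLSegment); // käse-käse

        fresh.URLSegment = "mein-kaese";     // an editor overrides the segment
        OnSaving(fresh, now);
        Console.WriteLine(fresh.URLSegment); // mein-kaese (not regenerated)

        var old = new DraftPage { Name = "Old Draft", URLSegment = "Old-Draft", Created = now.AddDays(-30) };
        OnSaving(old, now);
        Console.WriteLine(old.URLSegment);   // Old-Draft (outside the 5-minute window)
    }
}
```

Once UrlSegmentProcessed is true, neither branch fires again, which is why the editor's manual change survives the second save.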

Conclusion

Ultimately, this solution is a workaround for how the Labs translation tool handles URL segments. By bypassing the standard IUrlSegmentGenerator pipeline, the tool leaves us with inconsistent, non-standard URLs that aren't ideal for a production site.

However, by hooking into the SavingContent event and using a bit of intelligent state-tracking, we can force the system back into a predictable state. It gives us exactly what we want: a fully automated translation workflow that still respects the clean, lowercase URL standards required for a professional multilingual site.