Securing mailto links using Sitecore Publish Replacer

Posted 28 Aug 2024 by Marek Musielak

Including mailto: links on your website can be a great way to make it easy for potential clients to contact you. However, it also opens up the risk of your email address being harvested by malicious bots and added to spam lists. This blog post will explain how you can protect your email address in mailto: links, using the Sitecore Publish Replacer functionality.

I wrote this blog post while working for Blastic, a company that delivers great Sitecore solutions and much more.

One way to ensure that malicious bots cannot scrape email addresses from mailto: links on your website is by replacing the mailto: href attribute with a custom data attribute and triggering JavaScript code that reads the attribute and initiates the mailto: action on click. The solution I'm going to show is even simpler - it replaces the mailto: href with a javascript: href and splits the email address into a few parts. It's a basic solution, and depending on your requirements, you may want to consider something more complex.
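
The data-attribute approach mentioned above could look roughly like the following JavaScript sketch. The attribute names data-user and data-domain and both function names are assumptions made for illustration; they are not part of Sitecore or of the configuration shown in this post.

```javascript
// Sketch only: data-user / data-domain are hypothetical attribute names.
// Example markup: <a href="#" data-user="marek" data-domain="hello.world">Email me</a>

// Rebuild the mailto: URL from the separate parts stored on the element.
function mailtoFromDataset(dataset) {
  if (!dataset || !dataset.user || !dataset.domain) {
    return null; // nothing to build from
  }
  return 'mailto:' + dataset.user + '@' + dataset.domain;
}

// Attach a click handler to every link that carries both attributes.
// "navigate" is passed in so the logic can be exercised outside a browser.
function attachMailtoHandlers(root, navigate) {
  root.querySelectorAll('a[data-user][data-domain]').forEach(function (link) {
    link.addEventListener('click', function (event) {
      event.preventDefault();
      var target = mailtoFromDataset(link.dataset);
      if (target) {
        navigate(target);
      }
    });
  });
}

// In a browser, wire the handlers up as soon as the script runs.
if (typeof document !== 'undefined') {
  attachMailtoHandlers(document, function (url) {
    window.location.href = url;
  });
}
```

With this variant the address never appears in one piece in the markup, but it needs a script on every page, which is why the rest of this post uses the simpler javascript: href.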

To automatically convert mailto: links into javascript: links, I used Sitecore Publish Replacer. It's a feature that not many people are familiar with, and it requires no code, only configuration. In the Sitecore configuration, you can find the <replacers> section. You can configure multiple replacers, each of which may define multiple simple or regex replacements. One replacer, called publish, is treated by Sitecore in a special way: it is applied to every item that is being published. See the following configuration:

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <replacers>
      <!-- Remove the existing publish replacer so the definition below replaces it entirely -->
      <replacer id="publish">
        <patch:delete />
      </replacer>
      <replacer id="publish" type="Sitecore.Text.Replacer, Sitecore.Kernel" singleInstance="true">
        <param desc="name">$(id)</param>
        <replacements hint="raw:AddReplacement">
          <regex 
            find="(href=([&quot;'])mailto:((?:(?!\2).)+)@((?:(?!\2).)+)\.((?:(?!\2).)+)\2)" 
            replaceWith="href=&quot;javascript:window.location.href = 'mailto:' + '$3' + '@' + '$4' + '.' + '$5'&quot;" 
            simpleTest="mailto:" 
            ignoreCase="false" 
            forPublish="true" />
        </replacements>
      </replacer>
    </replacers>
  </sitecore>
</configuration>

You can see a single replacer called publish with a single regex replacement. It uses a regular expression to find strings like

href="mailto:marek@hello.world"

and convert them into

href="javascript:window.location.href = 'mailto:' + 'marek' + '@' + 'hello' + '.' + 'world'"

Additionally, I added the simpleTest="mailto:" parameter. This is a small performance improvement: the regex is skipped entirely when the field value does not contain the mailto: phrase. Now, when any item is published, matching field values are rewritten automatically, and crawlers that scan the markup for plain email addresses no longer find one. When a visitor clicks the link, the browser still performs the default mailto: action.
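
To see what the replacement does without publishing anything, the same pattern can be tried in plain JavaScript. This is only a sketch: Sitecore evaluates the regex with the .NET engine, and the helper name obfuscateMailto and the sample markup are made up for this example. The early return imitates the simpleTest check.

```javascript
// Same pattern as in the config, with &quot; written as a literal quote.
var FIND = /(href=(["'])mailto:((?:(?!\2).)+)@((?:(?!\2).)+)\.((?:(?!\2).)+)\2)/g;
var REPLACE_WITH =
  "href=\"javascript:window.location.href = 'mailto:' + '$3' + '@' + '$4' + '.' + '$5'\"";

// Hypothetical helper, not a Sitecore API.
function obfuscateMailto(fieldValue) {
  // Cheap substring check first, similar in spirit to simpleTest="mailto:".
  if (fieldValue.indexOf('mailto:') === -1) {
    return fieldValue;
  }
  return fieldValue.replace(FIND, REPLACE_WITH);
}

console.log(obfuscateMailto('<a href="mailto:marek@hello.world">Email me</a>'));
// <a href="javascript:window.location.href = 'mailto:' + 'marek' + '@' + 'hello' + '.' + 'world'">Email me</a>
```

Because the groups are greedy, an address with several dots is split at the last one: john.doe@mail.example.co.uk becomes 'john.doe', 'mail.example.co' and 'uk'.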

You may ask why I decided to use Sitecore Publish Replacer to achieve this. First of all, it allows content authors to create regular mailto: links without worrying about obfuscating them manually. Publish Replacer is executed in the PerformAction processor of the <publishItem> pipeline, for each version of every item that is being published, so it's totally transparent for content authors. Secondly, I think it's always good to learn something new about built-in Sitecore features, so I decided to shed some light on Sitecore Replacers.

In this blog post, I demonstrated how to protect your email address in mailto: links from being harvested by malicious bots using Sitecore Publish Replacer. By converting mailto: links into JavaScript-based links during the publishing process, you can keep your email address out of the plain markup while still allowing visitors to easily contact you. This solution is simple, requires no coding, and is transparent for content authors, making it an effective way to reduce the spam risk that comes with publishing contact addresses.

Comments? Find me on Sitecore Chat.