Slicing Japanese Google Fonts Yourself for Self-Hosting

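As is well known, Google Fonts improves loading performance by splitting each font file into roughly 120 subsets, so that only the files covering the code points actually used on a page are downloaded. If you simply use the fonts from Google's CDN, loading the CSS file gives you this optimization without any extra effort. The question is what to do when you want to host the fonts yourself. For example, downloading the Noto Sans JP font file (.ttf) as-is gives you a 5.7 MB file. If you also need the bold weight, you need twice that. You could download all of the sliced files from the Google Fonts CDN in bulk, but it is still preferable to be able to slice and host the fonts yourself.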
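To make the slicing mechanism concrete, here is a minimal sketch of how a set of unicode-range slices decides which files are needed for a given piece of text. The helper names and the three-entry slice table are hypothetical and heavily shortened; a real browser performs this matching internally.

```python
def parse_range(token):
    """Turn 'U+3042' or 'U+3040-309f' into an inclusive (start, end) pair."""
    body = token.strip().upper().replace("U+", "")
    start, _, end = body.partition("-")
    return int(start, 16), int(end or start, 16)


def slices_needed(text, slices):
    """Return the indices of the slices covering at least one character of text."""
    needed = []
    for index, tokens in enumerate(slices):
        ranges = [parse_range(token) for token in tokens]
        if any(lo <= ord(ch) <= hi for ch in text for lo, hi in ranges):
            needed.append(index)
    return needed


# Hypothetical slice table for illustration (a real one has about 120 entries).
example_slices = [
    ["U+4e00-4e0f"],            # a few kanji
    ["U+3041-3096", "U+30fc"],  # hiragana and the long-vowel mark
    ["U+0020-007e"],            # basic Latin
]

print(slices_needed("あいう abc", example_slices))
```

Only the slices whose ranges intersect the text are returned, which is why a page that uses a small part of the character set downloads only a few of the files.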

What we want to generate

A CSS file for each font and weight
Font files sliced by predefined ranges of code points

What we need to prepare

The .ttf file for each font
A file that defines the range of Unicode code points for each slice

Let's get started. First, we need the unsliced .ttf file for each font. These can be obtained through the Google Fonts API.

Developer API | Google Fonts | Google for Developers

developers.google.com

The endpoint is shown below. You need to obtain an API key from the Google developer console.

https://www.googleapis.com/webfonts/v1/webfonts?key=YOUR-API-KEY

The API returns the font definitions in JSON format.

    {
      "family": "Noto Sans JP",
      "variants": [
        "100",
        "200",
        "300",
        "regular",
        "500",
        "600",
        "700",
        "800",
        "900"
      ],
      "subsets": [
        "cyrillic",
        "japanese",
        "latin",
        "latin-ext",
        "vietnamese"
      ],
      "version": "v53",
      "lastModified": "2024-08-07",
      "files": {
        ...
        "regular": "https://fonts.gstatic.com/s/notosansjp/v53/-F6jfjtqLzI2JPCgQBnw7HFyzSD-AsregP8VFBEj75vY0rw-oME.ttf",
        ...
      },
      "category": "sans-serif",
      "kind": "webfonts#webfont",
      "menu": "https://fonts.gstatic.com/s/notosansjp/v53/-F6jfjtqLzI2JPCgQBnw7HFyzSD-AsregP8VFBEj35rS1g.ttf"
    },

The "files" object contains the download URL of the .ttf file for each weight.
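As a rough sketch, the .ttf URLs for one family could be pulled out of the response like this. It assumes the response wraps the entries shown above in a top-level "items" list; the function name `ttf_urls` and the example.com URLs are placeholders for illustration only.

```python
def ttf_urls(spec, family):
    """Return {variant: url} for the given family, or raise KeyError if absent."""
    # Assumption: font entries live under a top-level "items" list.
    for item in spec.get("items", []):
        if item.get("family") == family:
            return dict(item.get("files", {}))
    raise KeyError(f"Font family not found: {family}")


# Shortened stand-in for the API response; the URLs are placeholders.
sample_spec = {
    "items": [
        {
            "family": "Noto Sans JP",
            "variants": ["regular", "700"],
            "files": {
                "regular": "https://example.com/notosansjp-regular.ttf",
                "700": "https://example.com/notosansjp-700.ttf",
            },
        }
    ]
}

for variant, url in ttf_urls(sample_spec, "Noto Sans JP").items():
    print(variant, url)
```

Note that the variant key for the 400 weight is "regular" rather than a number, so it has to be mapped when writing font-weight into the CSS.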

Next, we need to define the range of Unicode code points for each slice. For Japanese, these ranges are the same regardless of the font, so we take the code points from the CSS file of any Japanese Google Font.

Normally, Google Fonts loads its CSS file from an API. This API generates the CSS based on the requested font set and the browser's User-Agent.
https://fonts.googleapis.com/css2?family=Noto+Sans+JP
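If more than one weight is needed, the request URL can be assembled programmatically. The sketch below is an illustration rather than part of the original tool: the helper name `build_css2_url` is hypothetical, and the `:wght@` weight list follows the css2 API's axis syntax, which this article does not otherwise use. Because the response depends on the User-Agent, the request should be sent with a browser-like header, as get_ranges.py does.

```python
from urllib.parse import quote

CSS2_ENDPOINT = "https://fonts.googleapis.com/css2"


def build_css2_url(family, weights=()):
    """Build a css2 URL for one family, optionally restricted to some weights."""
    # Spaces in the family name are written as '+', as in the URL above.
    spec = quote(family, safe="").replace("%20", "+")
    if weights:
        # Weights are de-duplicated and listed in ascending order.
        spec += ":wght@" + ";".join(str(w) for w in sorted(set(weights)))
    return f"{CSS2_ENDPOINT}?family={spec}"


print(build_css2_url("Noto Sans JP"))
print(build_css2_url("Noto Sans JP", [700, 400]))
```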

The stylesheet consists of @font-face rules like the following. This example is for Noto Sans JP. All we need to do is extract the code point ranges from this information.

/* [0] */
@font-face {
  font-family: 'Noto Sans JP';
  font-style: normal;
  font-weight: 400;
  src: url(https://fonts.gstatic.com/s/notosansjp/v53/-F6jfjtqLzI2JPCgQBnw7HFyzSD-AsregP8VFBEj757Y0rw_qMHVdbR2L8Y9QTJ1LwkRmR5GprQAe69m.0.woff2) format('woff2');
  unicode-range: U+25ee8, U+25f23, U+25f5c, U+25fd4, U+25fe0, U+25ffb, U+2600c, U+26017, U+26060, U+260ed, U+26222, U+2626a, U+26270, U+26286, U+2634c, U+26402, U+2667e, U+266b0, U+2671d, U+268dd, U+268ea, U+26951, U+2696f, U+26999, U+269dd, U+26a1e, U+26a58, U+26a8c, U+26ab7, U+26aff, U+26c29, U+26c73, U+26c9e, U+26cdd, U+26e40, U+26e65, U+26f94, U+26ff6-26ff8, <omitted>;
}

The program that outputs the font definitions and the code point definitions turned out as follows.

get_ranges.py

import os
import re
import json
import requests

def download_file(url, filename):
    # Download the file with user agent
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36'
    }
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        with open(filename, 'wb') as f:
            f.write(response.content)
    else:
        raise Exception(f"Failed to download the file: {url}")

def extract_unicode_ranges(file_path):
    unicode_ranges = []

    with open(file_path, 'r') as file:
        content = file.read()

        # Collect all unicode-range values
        matches = re.findall(r'unicode-range:\s*([^;]+);', content)

        for match in matches:
            # Split multiple ranges
            ranges = [item.strip() for item in match.split(',')]
            unicode_ranges.append(ranges)

    return unicode_ranges

def save_to_json(data, output_file):
    with open(output_file, 'w') as file:
        json.dump(data, file, ensure_ascii=False, indent=4)

if __name__ == "__main__":
    spec_file_path = 'config/webfonts_spec.json'
    input_file_path = 'config/notosansjp.css'
    output_file_path = 'config/ranges.json'

    # Get API key from environment variable; stop early if it is missing
    API_KEY = os.environ.get('GOOGLE_FONTS_API_KEY')
    if not API_KEY:
        raise SystemExit("GOOGLE_FONTS_API_KEY is not set")
    spec_url = f"https://www.googleapis.com/webfonts/v1/webfonts?key={API_KEY}"

    # Download the spec file
    download_file(spec_url, spec_file_path)

    # Download CSS file
    css_url = 'https://fonts.googleapis.com/css2?family=Noto+Sans+JP'
    download_file(css_url, input_file_path)

    # Extract unicode ranges
    unicode_ranges = extract_unicode_ranges(input_file_path)
    save_to_json(unicode_ranges, output_file_path)

    print(f"Unicode ranges have been saved to {output_file_path}")

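Generating the sliced .woff2 files

Now that we have the font files and the code point ranges, we can finally generate the sliced .woff2 files and the stylesheets.

The code that subsets a .ttf by code points and converts it to .woff2 looks like this.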

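import os
import re

from fontTools.subset import Options, Subsetter
from fontTools.ttLib import TTFont

def parse_unicode_ranges(unicode_ranges):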
    codepoints = set()
    for range_str in unicode_ranges:
        # Remove "U+"
        range_str = range_str.replace("U+", "").upper()
        
        # Get range of codepoints
        match = re.match(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+)', range_str)
        if match:
            start, end = match.groups()
            start, end = int(start, 16), int(end, 16)
            codepoints.update(range(start, end + 1))
        else:
            # Single codepoint
            codepoints.add(int(range_str, 16))
    return codepoints

def convert_ttf_to_woff2(ttf_filename, woff2_filename, unicode_ranges):
    font = TTFont(ttf_filename)

    options = Options()
    codepoints = parse_unicode_ranges(unicode_ranges)

    subsetter = Subsetter(options=options)
    subsetter.populate(unicodes=codepoints)
    subsetter.subset(font)

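    # Saving as WOFF2 requires the brotli package to be installed
    font.flavor = 'woff2'

    # Create the output directory if needed (dirname is '' for a bare filename)
    directory = os.path.dirname(woff2_filename)
    if directory:
        os.makedirs(directory, exist_ok=True)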
    font.save(woff2_filename)
    
    print(f"Converted to WOFF2 file: {woff2_filename}")

After that, all that remains is to generate a stylesheet to match.
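A minimal sketch of that stylesheet step is shown below. It takes the list of range lists that get_ranges.py writes to ranges.json and emits one @font-face rule per slice. The function name `build_font_face_css` and the `<prefix>.<index>.woff2` file naming are assumptions for illustration, modelled on the Google Fonts CSS above, and may differ from what the actual tool produces.

```python
def build_font_face_css(family, weight, file_prefix, unicode_ranges):
    """Return CSS text with one @font-face rule per slice of code points."""
    blocks = []
    for index, ranges in enumerate(unicode_ranges):
        blocks.append(
            f"/* [{index}] */\n"
            "@font-face {\n"
            f"  font-family: '{family}';\n"
            "  font-style: normal;\n"
            f"  font-weight: {weight};\n"
            # Assumed naming: slice N of a font is stored as <prefix>.N.woff2
            f"  src: url({file_prefix}.{index}.woff2) format('woff2');\n"
            f"  unicode-range: {', '.join(ranges)};\n"
            "}\n"
        )
    return "\n".join(blocks)


# Two shortened slices in the same shape as ranges.json (a list of lists).
example_ranges = [["U+25ee8", "U+26ff6-26ff8"], ["U+3041-3096"]]
css = build_font_face_css("Noto Sans JP", 400, "notosansjp-400", example_ranges)
print(css)
```

The slice index in the file name has to match the index used when the .woff2 files were written, otherwise the browser would fetch a file that does not contain the glyphs it needs.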

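That was a lot of explanation, but I have put code that performs all of the steps so far in the repository below. With a single command, it generates the sliced files (.woff2) and the stylesheets for the font names you specify in advance. It also fetches the code points automatically.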

GitHub - qloba/gen_webfonts

github.com

Usage

Define the API key in the following file.

/environments/python.env

GOOGLE_FONTS_API_KEY=<YOUR_API_KEY>

In the following file, map the output file name of each font you want to subset to its Google Fonts family name.

config/webfonts.json

{
  "notosansjp": "Noto Sans JP",
  "notoserifjp": "Noto Serif JP"
}

Generate the sliced font files and the CSS.

$ docker-compose up

About 120 files are generated for each font and weight, so specifying many fonts results in a fairly large number of files. When uploading them to your hosting server, take care that none are left out.
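One way to guard against missing files is to check, before uploading, that every file referenced by a generated stylesheet exists next to it. The following is a sketch under the assumption that the CSS references the .woff2 files by relative path; the function name and the file names in the demo are hypothetical.

```python
import os
import re
import tempfile


def missing_font_files(css_path):
    """Return the relative url(...) targets in the CSS that do not exist on disk."""
    with open(css_path, encoding="utf-8") as f:
        css = f.read()
    base_dir = os.path.dirname(os.path.abspath(css_path))
    missing = []
    for target in re.findall(r"url\(\s*['\"]?([^'\")\s]+)", css):
        # Absolute URLs (https://..., //host/...) cannot be checked locally.
        if re.match(r"[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("//"):
            continue
        if not os.path.exists(os.path.join(base_dir, target)):
            missing.append(target)
    return missing


# Demo in a throwaway directory: the CSS references two slices, only one exists.
with tempfile.TemporaryDirectory() as tmp:
    css_path = os.path.join(tmp, "notosansjp.css")
    with open(css_path, "w", encoding="utf-8") as f:
        f.write(
            "@font-face { src: url(notosansjp-400.0.woff2) format('woff2'); }\n"
            "@font-face { src: url(notosansjp-400.1.woff2) format('woff2'); }\n"
        )
    open(os.path.join(tmp, "notosansjp-400.0.woff2"), "wb").close()
    demo_missing = missing_font_files(css_path)

print(demo_missing)
```

Running the same check against the upload directory for each generated stylesheet should return an empty list when nothing has been left out.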
