test: consolidate utf8 text fixtures in tests

We previously used a text that appears to be an excerpt of https://zh.wikipedia.org/wiki/%E5%8D%97%E8%B6%8A%E5%9B%BD, which may carry copyright or licensing complications. It may also include some geopolitical nuances. The text was repeated throughout the code base with little reuse.

This patch consolidates the fixtures by adding a common helper string, fixtures.utf8TestText, which is identical to the copy in test/fixtures/utf8_test_text.txt. It also changes the text to a copy of 蘭亭集序, which was chosen because:

  1. It's a well-known Chinese classical piece written in 353 CE and is therefore in the public domain. The string is copied from https://zh.wikisource.org/zh-hant/%E8%98%AD%E4%BA%AD%E9%9B%86%E5%BA%8F, which carries a copyright disclaimer for this reason.
  2. The text is of a suitable length for general UTF-8 string read/write tests (389 code points and 1167 bytes, including punctuation).
  3. It is also commonly used as a reference text for Chinese text layout tests.
  4. It's a timeless and harmless preface to a collection of poems, written by an uncontroversial figure who passed away >1600 years ago, and it contains no geopolitical nuances. Background and an English translation of this text can be found at https://en.wikipedia.org/wiki/Lantingji_Xu
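
The length figures in item 2 can be checked with a few lines of Node.js. The sketch below is illustrative only: `utf8Stats` is a hypothetical helper, not part of this patch, and `sample` is just the opening phrase of the text rather than the full fixture. In a real test the input would presumably be `fixtures.utf8TestText`.

```javascript
'use strict';

// Hypothetical helper (not part of the patch): counts Unicode code points
// and UTF-8 encoded bytes of a string.
function utf8Stats(text) {
  return {
    // Array.from iterates by code point, not by UTF-16 code unit.
    codePoints: Array.from(text).length,
    bytes: Buffer.byteLength(text, 'utf8'),
  };
}

// Assumption: a short stand-in for the fixture, taken from the opening
// of 蘭亭集序. The full text would be measured the same way.
const sample = '永和九年，歲在癸丑';
const stats = utf8Stats(sample);

// Write the string to a Buffer and read it back, as a basic
// UTF-8 read/write round trip.
const roundTripped = Buffer.from(sample, 'utf8').toString('utf8');

console.log(stats, roundTripped === sample);
```

Every character in the sample encodes to three bytes in UTF-8, which matches the ratio of the figures quoted above (1167 = 389 × 3).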
