文林 Wénlín
CDL Online Help
Summary
This document describes Wenlin Institute’s online system for rendering CDL and for converting IDS and free-form queries into CDL.
Login Help
- Need a valid login and password? Please email the CDL team.
- Did your session time out? Please log in again through the main CDL query portal.
Query Basics
Buttons
The query form has three main buttons:
- CDL: converts a non-CDL query to CDL and displays the result.
- SVG: renders the query as SVG.
- BMP: renders the query as a bitmap.
Valid Queries
The query form generates an image for valid queries in three different CJK description languages:
- CDL: Character Description Language [XML; see the CDL Specification]
- CRS: Character Representation Sequences [simple grid framework with 4 optional operators, for rapid CDL drafting; see below]
- IDS: Ideographic Description Sequences [see the syntax description in TUS 5.0:428]
Valid queries in any of these three languages may be named (describing an encoded character at a specific Unicode codepoint) or anonymous (describing an unencoded character, or a character at an unknown codepoint).
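As a rough illustration of how the three languages differ on the surface, the following Python sketch guesses the language of a query string. The function name `guess_query_language` and the heuristic itself are assumptions made for this example; they are not a description of how the online system actually dispatches queries.

```python
# IDC operators U+2FF0..U+2FFB, the twelve operators listed in the IDS section.
IDC_OPERATORS = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}


def guess_query_language(query: str) -> str:
    """Guess the description language of a query (hypothetical heuristic)."""
    text = query.strip()
    if text.startswith("<"):
        # CDL is XML, so a CDL query begins with a tag.
        return "CDL"
    if any(ch in IDC_OPERATORS for ch in text):
        # Only IDS uses the Ideographic Description Characters.
        return "IDS"
    # Anything else is treated as a CRS grid.
    return "CRS"
```

For instance, `guess_query_language("字:宀;子")` falls through to the CRS branch because the string is neither XML nor contains an IDC operator.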
Example Queries
1. CDL: Character Description Language
- named CDL (cdl element with char and/or uni attributes), e.g. :
<cdl char='字' uni='5b57'>
<comp char='宀' uni='5b80' points='0,0 128,40' />
<comp char='子' uni='5b50' points='0,48 128,128' />
</cdl>
- anonymous CDL (cdl element without char and uni attributes), e.g. :
<cdl>
<comp char='宀' uni='5b80' points='0,0 128,40' />
<comp char='子' uni='5b50' points='0,48 128,128' />
</cdl>
- Valid CDL queries include comp and/or stroke elements:
- Each CDL comp element must have a char and/or uni attribute, drawn from the set of characters that currently have CDL descriptions. This set currently includes:
- the two Radical blocks (Kang Xi and Supplement);
- all BMP CJK characters (URO and Ext. A);
- most Extension B characters (SIP);
- all CJK Strokes block characters (bxg ⇒ ㇃; d ⇒ ㇔; h ⇒ ㇐; hg ⇒ ㇖; hp ⇒ ㇇; hpwg ⇒ ㇌; hxwg ⇒ ㇠; hz ⇒ ㇕; hzg ⇒ ㇆; hzt ⇒ ㇊; hzw ⇒ ㇍; hzwg ⇒ ㇈; hzz ⇒ ㇅; hzzp ⇒ ㇋; hzzz ⇒ ㇎; hzzzg ⇒ ㇡; n ⇒ ㇏; p ⇒ ㇒; pd ⇒ ㇛; pg ⇒ ㇢; pz ⇒ ㇜; q ⇒ ㇣; s ⇒ ㇑; sg ⇒ ㇚; sp ⇒ ㇓; st ⇒ ㇙; sw ⇒ ㇄; swg ⇒ ㇟; swz ⇒ ㇘; sz ⇒ ㇗; szwg ⇒ ㇉; szz ⇒ ㇞; t ⇒ ㇀; tn ⇒ ㇝; wg ⇒ ㇁; xg ⇒ ㇂);
- many BMP PUA CJK characters;
- assorted Compatibility ideographs (results with these may not be what you expect).
- Each CDL stroke element must have a type attribute, the value of which is written with an alphabetic stroke abbreviation as defined in the CDL Specification. Characters from the CJK Strokes block cannot currently be used as type attributes, and a few CJK Strokes block character names are not currently supported as CDL stroke abbreviations (this will change in the near future).
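The rules above can be checked mechanically before submitting a query. The following Python sketch is one way to do so with the standard library; the function name `check_cdl` and the shape of its return value are assumptions for illustration, and it does not check whether a component is actually in the set of characters with CDL descriptions, nor whether a stroke abbreviation is one defined in the CDL Specification.

```python
import xml.etree.ElementTree as ET


def check_cdl(xml_text: str) -> dict:
    """Classify a CDL query as named or anonymous and list rule violations.

    Hypothetical helper: it only checks the attribute rules stated on this page.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "cdl":
        raise ValueError("root element must be <cdl>")

    # A named query carries char and/or uni on the cdl element itself.
    named = "char" in root.attrib or "uni" in root.attrib

    problems = []
    for comp in root.iter("comp"):
        # Each comp needs a char and/or uni attribute.
        if "char" not in comp.attrib and "uni" not in comp.attrib:
            problems.append("comp element without char or uni attribute")
    for stroke in root.iter("stroke"):
        # The type must be an alphabetic abbreviation, not a CJK Strokes character.
        stroke_type = stroke.get("type", "")
        if not (stroke_type.isascii() and stroke_type.isalpha()):
            problems.append(f"stroke type missing or not alphabetic: {stroke_type!r}")
    return {"named": named, "problems": problems}
```

Applied to the named example for 字 above, this returns `{"named": True, "problems": []}`; a stroke written as `type='㇐'` instead of `type='h'` would be reported as a problem.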
2. CRS: Character Representation Sequences
- named CRS (with leading CJK Ideograph followed by colon, optional span operators, and semicolon or carriage-return row separator), e.g. :
- In-line (no span operator):
字:宀;子
or multi-line (no span operator):
兊:
公
儿
- anonymous CRS (with semicolon or carriage-return row separator, and optional span operators), e.g. :
-
- CRS aims at rapid CDL drafting of simple and complex character descriptions.
- All CRS operators are optional: a simple grid layout of components streamlines CDL drafting.
- CDL generated from CRS can be easily refined: users do not have to write XML from scratch.
- CRS descriptions are best edited in an environment in which operators and components have the same width, so that columns and rows line up.
- The span operator ☯ (U+262f) controls both colspan and rowspan; this permits “L”-type enclosure, as in the last example.
- CRS resolution (grid size) is a variable rectangle, determined by the user for the interpreter.
- If there is a span operator anywhere in the CRS, all row lengths are automatically regularized to max row length, filling right with colspan operator “■” (U+25a0).
- Override auto-right-filling with the gap operator (U+3000 “Ideographic Space”), as in the next-to-last example (point the cursor at the characters in those examples, to see their Unicode values).
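The grid rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the interpreter used by the online system: the function name `parse_crs` is hypothetical, the name is taken to be a single character before an ASCII colon, and the gap operator is simply treated as an ordinary cell, so a row that the user has already filled out with U+3000 receives no “■” padding.

```python
import re

SPAN = "\u262f"  # ☯ span operator
FILL = "\u25a0"  # ■ colspan operator used for automatic right-filling
GAP = "\u3000"   # Ideographic Space, the gap operator


def parse_crs(query: str):
    """Return (name, rows) for a CRS query; name is None when anonymous."""
    name = None
    body = query
    # A named CRS starts with one character followed by a colon.
    if len(query) >= 2 and query[1] == ":":
        name, body = query[0], query[2:]
    # Rows are separated by semicolons or line breaks; empty rows are dropped.
    # No whitespace stripping here, because U+3000 is a meaningful cell.
    rows = [list(row) for row in re.split(r"[;\r\n]+", body) if row]
    # With a span operator anywhere, pad short rows on the right with ■.
    if any(SPAN in row for row in rows):
        width = max(len(row) for row in rows)
        rows = [row + [FILL] * (width - len(row)) for row in rows]
    return name, rows
```

With the in-line example above, `parse_crs("字:宀;子")` gives the name 字 and two one-cell rows. A made-up layout such as `"☯口\n木"` (chosen only to show the padding, not taken from this page) has its second row padded to `['木', '■']`, whereas writing the second row as 木 followed by U+3000 keeps the gap.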
3. IDS: Ideographic Description Sequences
- named IDS (with leading CJK Ideograph, and following IDC operator [⿰⿱⿲⿳⿴⿵⿶⿷⿸⿹⿺⿻]), e.g. :
字⿱宀子
- anonymous IDS (with leading IDC operator), e.g. :
⿱宀子
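To make the named/anonymous distinction and the prefix structure of IDS concrete, here is a small Python sketch of a recursive parser. The function name `parse_ids` and the tuple-based tree are assumptions for illustration. The operand counts follow the Unicode definition of the operators: ⿲ and ⿳ take three operands, the others two.

```python
# Operand count per IDC operator: ⿲ and ⿳ take three operands, the rest two.
IDC_ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
IDC_ARITY["\u2ff2"] = 3  # ⿲
IDC_ARITY["\u2ff3"] = 3  # ⿳


def parse_ids(query: str):
    """Return (name, tree); name is None for an anonymous IDS."""
    name = None
    # A named IDS has a leading ideograph followed by an IDC operator.
    if len(query) >= 2 and query[0] not in IDC_ARITY and query[1] in IDC_ARITY:
        name, query = query[0], query[1:]

    def parse(pos: int):
        if pos >= len(query):
            raise ValueError("IDS ended before all operands were supplied")
        ch = query[pos]
        if ch not in IDC_ARITY:
            return ch, pos + 1  # a plain component
        operands = []
        pos += 1
        for _ in range(IDC_ARITY[ch]):
            node, pos = parse(pos)
            operands.append(node)
        return (ch, *operands), pos

    tree, end = parse(0)
    if end != len(query):
        raise ValueError("unexpected trailing characters in IDS")
    return name, tree
```

Both examples above yield the same tree, `("⿱", "宀", "子")`; only the named form also returns 字 as the name.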
Last modified: 2008-06-23