Page 136 DICOM PS3.5 2020a - Data Structures and Encoding
Table H.3-1. Character Sets and Escape Sequences Used in Example 1
Character Component |
Value of |
ISO |
Standard forESCSequence Code |
Character Set: |
||||
Set |
Group |
|
(0008,0005) |
Registration Code |
|
Element Purpose of Use |
||
Description |
|
Defined Term |
Number |
Extension |
|
|
|
|
Japanese |
First: |
Value 1: |
ISO-IR 6 |
|
|
GL |
ISO 646: |
|
|
Single-byte |
none |
|
|
|
|
|
|
|
Second: |
Value 2: |
ISO-IR 87 |
ISO 2022 |
ESC 02/04 |
GL |
JIS X 0208: |
|
|
Ideographic |
ISO 2022 IR 87 |
|
|
04/02 |
|
Japanese kanji, |
|
|
|
|
|
|
||||
|
|
|
|
|
|
|
|
hiragana, katakana |
|
|
Value 1: |
ISO-IR 6 |
ISO 2022 |
ESC 02/08 |
GL |
ISO 646: |
|
|
|
none |
|
|
04/02 |
|
for delimiters |
|
|
|
|
|
|
|
|||
|
Third: |
Value 2: |
ISO-IR 87 |
ISO 2022 |
ESC 02/04 |
GL |
JIS X 0208: |
|
|
Phonetic |
ISO 2022 IR 87 |
|
|
04/02 |
|
Japanese hiragana, |
|
|
|
|
|
|
||||
|
|
|
|
|
|
|
|
and katakana |
|
|
Value 1: |
ISO-IR 6 |
ISO 2022 |
ESC 02/08 |
GL |
ISO 646: |
|
|
|
none |
|
|
04/02 |
|
for delimiters |
|
|
|
|
|
|
|
|||
H.3.2 Value 1 of Attribute Specific Character Set (0008,0005) is ISO 2022 IR 13.
Example H.3-2. Value 1 of Attribute Specific Character Set (0008,0005) is ISO 2022 IR 13
Specific Character Set:
(0008,0005) ISO 2022 IR 13\ISO 2022 IR 87
Character String:
^ = ^ = ^
^ = ESC 02/04 04/02 ESC 02/08 04/10 ^ ESC 02/04 04/02 ESC 02/08 04/10 = ESC 02/04 04/02 ESC 02/08 04/10 ^ ESC 02/04 04/02 ESC 02/08 04/10
Encoded representation:
13/04 12/15 12/00 13/14 05/14 12/00 13/11 11/03 03/13 01/11 02/04 04/02 03/11 03/03 04/05 04/04 01/11 02/08 04/10 05/14 01/11 02/04 04/02 04/02 04/00 04/15 03/10 01/11 02/08 04/10 03/13 01/11 02/04 04/02 02/04 06/04 02/04 05/14 02/04 04/00 01/11 02/08 04/10 05/14 01/11 02/04 04/02 02/04 03/15 02/04 06/13 02/04 02/06 01/11 02/08 04/10
An example of what might be displayed or printed by an ASCII based machine that displays or prints the Control Character ESC (01/11) using \033:
\324\3l7\300\336^\300\333\263=\033$B;3ED\033(J^\033$BB@O:\033(J=\033$B$d$^$@\033(J^\033$B$?$m$&\033(J
- Standard -
DICOM PS3.5 2020a - Data Structures and Encoding Page 137
Table H.3-2. Character Sets and Escape Sequences Used in Example 2
CharacterComponent Value of |
ISO |
Standard forESC Sequence Code Character Set: Purpose |
||||||
Set |
Group |
(0008,0005) |
Registration Code |
|
Element |
of Use |
||
Description |
Defined Term |
Number |
Extension |
|
|
|
|
|
Japanese First: |
Value 1: |
ISO-IR 13 |
ISO 2022 |
ESC 02/09 |
GR |
JIS X 0201: |
||
|
Single-byteISO 2022 IR 13 |
|
|
04/09 |
|
Japanese katakana |
||
|
|
|
|
|
||||
|
|
|
ISO-IR 14 |
ISO 2022 |
ESC 02/08 |
GL |
JIS X 0201: |
|
|
|
|
|
|
04/10 |
|
Japanese romaji for |
|
|
|
|
|
|
|
|
||
|
|
|
|
|
|
|
delimiters |
|
|
Second: |
Value 2: |
ISO-IR 87 |
ISO 2022 |
ESC 02/04 |
GL |
JIS X 0208: |
|
|
IdeographicISO 2022 IR 87 |
|
|
04/02 |
|
Japanesekanji,hiragana, |
||
|
|
|
|
|
||||
|
|
|
|
|
|
|
katakana |
|
|
|
Value 1: |
ISO-IR 14 |
ISO 2022 |
ESC 02/08 |
GL |
JIS X 0201: |
|
|
|
ISO 2022 IR 13 |
|
|
04/10 |
|
Japanese romaji for |
|
|
|
|
|
|
|
|||
|
|
|
|
|
|
|
delimiters |
|
|
Third: |
Value 2: |
ISO-IR 87 |
ISO 2022 |
ESC 02/04 |
GL |
JIS X 0208: |
|
|
Phonetic |
ISO 2022 IR 87 |
|
|
04/02 |
|
Japanese hiragana, and |
|
|
|
|
|
|
||||
|
|
|
|
|
|
|
katakana |
|
|
|
Value 1: |
ISO-IR 14 |
ISO 2022 |
ESC 02/08 |
GL |
JIS X 0201: |
|
|
|
ISO 2022 IR 13 |
|
|
04/10 |
|
Japanese romaji for |
|
|
|
|
|
|
|
|||
|
|
|
|
|
|
|
delimiters |
|
- Standard -
Page 138 |
DICOM PS3.5 2020a - Data Structures and Encoding |
- Standard -
DICOM PS3.5 2020a - Data Structures and Encoding |
Page 139 |
I Character Sets and Person Name Value Representation in the Korean Language (Informative)
I.1 Character Sets For The Korean Language in DICOM
KS X 1001 (registered as ISO-IR 149) is used as a Korean character set in DICOM. This character set is the one most broadly used for the representation of Korean characters. It can be encoded by ISO 2022 code extension techniques, and is registered in ISO 2375.
The Escape Sequence is shown for reference in Table I.1-1 (see PS3.3)
Table I.1-1. ISO/IEC 2022 Escape Sequence for ISO-IR 149
ISO-IR 149
G0 set |
ESC 02/04 02/08 04/03 |
G1 set |
ESC 02/04 02/09 04/03 |
Note |
|
1.ISO-IR 149 is only used as a G1 set in DICOM.
2.TheKoreancharacterset(ISOIR149)isinvokedtotheG1area.ThisisdifferentfromtheJapanesemulti-bytecharacter sets (ISO 2022 IR 87 and ISO 2022 IR 159), which use the G0 code area. Japan's choice of G0 is due to the adoption of an encoding method based on "ISO-2022-JP". ISO-2022-JP, the most familiar encoding method in Japan, and uses only the G0 code area. In Korea, most operating systems adopt an encoding method that invokes the Hangul character set(KSX1001)intheG1codearea.So,thedifferencebetweencodeareasofKoreanandJapanesecharacteroriginates inconvention,notatechnicalproblem.Invocationofmulti-bytecharactersetstotheG1areadoesnotchangethecurrent DICOM normative requirements.
I.2 Example of Person Name Value Representation in the Korean Language
Example I.2-1. Example of Person Name Value Representation in the Korean Language
Person names in the Korean language may be written in Hangul (phonetic characters), Hanja (ideographic characters), or Latin (al- phabetic characters). The three component groups should be written in the order of alphabetic, ideographic, and phonetic (see Table 6.2-1).
Specific Character Set:
(0008,0005) \ISO 2022 IR 149
Character String:
Hong^Gildong= ^ = ^
Hong^Gildong= ESC 02/04 02/09 04/03 ^ ESC 02/04 02/09 04/03 = ESC 02/04 02/09 04/03 ^ ESC 02/04 02/09 04/03
Encoded representation:
- Standard -
Page 140 |
DICOM PS3.5 2020a - Data Structures and Encoding |
04/08 06/15 06/14 06/07 05/14 04/07 06/09 06/12 06/04 06/15 06/14 06/07 03/13 01/11 02/04 02/09 04/03 15/11 15/03 05/14 01/11 02/04 02/09 04/03 13/01 12/14 13/04 13/07 03/13 01/11 02/04 02/09 04/03 12/08 10/11 05/14 01/11 02/04 02/09 04/03 11/01 14/06 11/05 11/15
An example of what might be displayed or printed by an ASCII based machine that displays or prints the Control Character ESC (01/11) using \033:
Hong^Gildong=\033$) C\373\363^\033$)C\321\316\324\327=\033$)C\310\253^\033$)C\261\346\265\277
Note
1.The multi-byte character set (ISO-IR 149) and single-byte character set (ISO 646) can be used intermixed without any explicit escape sequence after the initial escape sequence. Once ISO 646 has been designated to the GL area and ISO-IR 149 to the GR area, each character set has different code area, thus can be used intermixed. The decoder will check the most significant bit of a character to know whether it is a two byte character in the GR area (high bit one) or a one byte character in the GL area (high bit zero).
2.Intheaboveexampleofpersonnamerepresentation,explicitescapesequencesprecedeeachHangulandHanjastring. These escape sequences are to meet the requirements of the code extension technique that specifies a switch to the Default Character Repertoire before delimiters. In the previous example, it is assumed that the Default Character Rep- ertoire (ISO-646) is invoked to G0 code area and no character set to G1 area after delimiters ("^" and "=" signs). See Section 6.1.2.5.3.
I.3 Example of Long Text Value Representation in the Korean Language Without
Explicit Escape Sequences Between Character Sets
Example I.3-1. Example of Long Text Value Representation in the Korean Language Without Explicit Escape Sequences Between Character Sets
Hangul (ISO IR 149) and ASCII (ISO 646) character sets can be used intermingled without explicit escape sequences between them. The Hangul character set ISO IR 149 is invoked to the G1 area, so this invocation doesn't affect the G0 area to which the ASCII character set has been invoked. The following is an example of a Long Text value representation that includes ASCII and Hangul character set.
Specific Character Set:
(0008,0005) \ISO 2022 IR 149
Character String:
The first line includes .
The second line includes , too.
The third line
Encoded String:
ESC 02/04 02/09 04/03 The first line includes .
ESC 02/04 02/09 04/03 The second line includes , too.
The third line
Once having invoked the ISO IR 149 character set to G1 area by the escape sequence in the head of line, one can use Hangul and ASCII intermixed in that line.
- Standard -