Page 36 |
DICOM PS3.5 2020a - Data Structures and Encoding |
•Implementation level: ISO 2022 Level 1 - Elementary 7-bit code (code-level identifier 1)
•Initial designation: ISO-IR 6 (ASCII) as G0.
•Code Extension shall not be used.
b.Attribute Specific Character Set (0008,0005) single value other than "ISO_IR 192", "GB18030" or "GBK":
•8-bit code
•Implementation level: ISO 2022 Level 1 - Elementary 8-bit code (code-level identifier 11)
•Initial designation: One of the ISO 8859-defined character sets, or the 8-bit code table of JIS X 0201 specified by value 1 of the Attribute Specific Character Set (0008,0005), as G0 and G1.
•Code Extension shall not be used.
c.Attribute Specific Character Set (0008,0005) multi-valued:
•8-bit code
•Implementation level: ISO 2022 Level 4 - Redesignation of Graphic Character Sets within a Code (code-level identifier 14)
•Initial designation: One of the ISO 8859-defined character sets, or the 8-bit code table of JIS X 0201 specified by value 1 of the Attribute Specific Character Set (0008,0005), as G0 and G1. If value 1 of the Attribute Specific Character Set (0008,0005) is empty, ISO-IR 6 (ASCII) is assumed as G0, and G1 is undefined.
•All character sets specified in the various values of Attribute Specific Character Set (0008,0005), including value 1, may participate in Code Extension.
d.Attribute Specific Character Set (0008,0005) single value "ISO_IR 192", "GB18030" or "GBK":
•variable length code
•Implementation level: not specified (not compatible with ISO 2022)
•Initial designation: as specified by value 1 of the Attribute Specific Character Set (0008,0005)
•Code Extension shall not be used.
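The four cases above can be read as a lookup from the values of Specific Character Set (0008,0005) to a code type, an ISO 2022 implementation level, and whether Code Extension is permitted. The following is a minimal sketch of that lookup. The names `EncodingProfile` and `classify_specific_character_set` are hypothetical, and treating an absent Attribute as case a is an assumption, since the heading of case a is not part of this passage.

```python
from typing import NamedTuple, Optional, Sequence

# Defined Terms named in the text as not compatible with ISO 2022.
NON_ISO_2022_TERMS = {"ISO_IR 192", "GB18030", "GBK"}


class EncodingProfile(NamedTuple):
    code: str                      # "7-bit", "8-bit" or "variable length"
    iso_2022_level: Optional[int]  # None when ISO 2022 does not apply
    code_extension_allowed: bool


def classify_specific_character_set(
    values: Optional[Sequence[str]],
) -> EncodingProfile:
    """Map the values of Specific Character Set (0008,0005) to cases a-d."""
    if not values:
        # Assumption: case a covers an absent or empty Attribute.
        return EncodingProfile("7-bit", 1, False)
    if len(values) > 1:
        # Case c: multi-valued, ISO 2022 Level 4, Code Extension permitted.
        return EncodingProfile("8-bit", 4, True)
    if values[0] in NON_ISO_2022_TERMS:
        # Case d: variable length code, ISO 2022 level not specified.
        return EncodingProfile("variable length", None, False)
    # Case b: any other single value, ISO 2022 Level 1, 8-bit code.
    return EncodingProfile("8-bit", 1, False)
```

For example, `classify_specific_character_set(["", "ISO 2022 IR 87"])` falls under case c, where value 1 is empty and ISO-IR 6 is assumed as G0.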
6.1.3 Control Characters
Textual data that is interchanged may require some formatting information. Control Characters are used to indicate formatting, but their use in DICOM is kept to a minimum since some machines may handle them inappropriately. ISO 646:1990 and ISO 6429:1990 define Control Characters. As shown in Table 6.1-1 below, only a subset of five Control Characters from the C0 set shall be used in DICOM for the encoding of Control Characters in text strings.
Table 6.1-1. DICOM Control Characters and Their Encoding
Acronym | Name | Coded Value
LF | Line Feed | 00/10
FF | Form Feed | 00/12
CR | Carriage Return | 00/13
ESC | Escape | 01/11
TAB | Horizontal Tab | 00/09
The ESC character shall be used only for ISO 2022 character set control sequences, in accordance with Section 6.1.2.5.
In text strings (value representation ST, LT, or UT) a new line shall be represented as CR LF.
- Standard -
Note
1.Some machines (such as UNIX based machines) may interpret LF (00/10) as a new line. In such cases, it is expected that the DICOM format is converted to the correct internal representation for that machine.
2.In previous editions of the Standard (see PS3.5 2015a), the TAB character was not listed as a Control Character.
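As an illustration of the rules in this section, the sketch below flags C0 control bytes that are not listed in Table 6.1-1 and converts bare LF line endings to CR LF. The function names are hypothetical. Treating every byte below 20H as a C0 control character, and converting only bare LF (the UNIX case mentioned in the Note), are assumptions of this sketch.

```python
import re
from typing import List

# Coded values from Table 6.1-1 (column/row notation -> column * 16 + row).
ALLOWED_CONTROL_BYTES = {
    0x0A,  # LF  00/10
    0x0C,  # FF  00/12
    0x0D,  # CR  00/13
    0x1B,  # ESC 01/11
    0x09,  # TAB 00/09
}


def disallowed_control_offsets(raw: bytes) -> List[int]:
    """Return offsets of bytes below 20H that Table 6.1-1 does not list."""
    return [
        i for i, b in enumerate(raw)
        if b < 0x20 and b not in ALLOWED_CONTROL_BYTES
    ]


def to_dicom_newlines(text: str) -> str:
    """Replace each LF not already preceded by CR with CR LF."""
    return re.sub(r"(?<!\r)\n", "\r\n", text)
```

A writer might call `to_dicom_newlines` on ST, LT, or UT text that originated on a machine using LF alone, and `disallowed_control_offsets` on the encoded bytes before storing them.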
6.2 Value Representation (VR)
The Value Representation of a Data Element describes the data type and format of that Data Element's Value(s). PS3.6 lists the VR of each Data Element by Data Element Tag.
Values with VRs constructed of character strings, except in the case of the VR UI, shall be padded with SPACE characters (20H, in the Default Character Repertoire) when necessary to achieve even length. Values with a VR of UI shall be padded with a single trailing NULL (00H) character when necessary to achieve even length. Values with a VR of OB shall be padded with a single trailing NULL byte value (00H) when necessary to achieve even length.
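The padding rule can be sketched as a small helper. This is an illustrative sketch, not a complete encoder: `pad_to_even_length` is a hypothetical name, and the set of character-string VRs is taken from the wider Standard rather than from this paragraph, so it should be read as an assumption.

```python
# Assumption: character-string VRs other than UI, per the wider Standard.
STRING_VRS = {
    "AE", "AS", "CS", "DA", "DS", "DT", "IS", "LO", "LT",
    "PN", "SH", "ST", "TM", "UC", "UR", "UT",
}


def pad_to_even_length(vr: str, value: bytes) -> bytes:
    """Append one padding byte when the encoded value has odd length."""
    if len(value) % 2 == 0:
        return value
    if vr in ("UI", "OB"):
        return value + b"\x00"   # single trailing NULL (00H)
    if vr in STRING_VRS:
        return value + b"\x20"   # SPACE (20H)
    raise ValueError(f"no padding rule in this sketch for VR {vr!r}")
```

For instance, the UID `1.2.3` (five bytes) would be written as six bytes ending in 00H, while the Code String `ABC` would gain a trailing SPACE.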
All new VRs defined in future versions of DICOM shall be of the same Data Element Structure as defined in Section 7.1.2 with reserved bytes after the VR and a 32-bit unsigned integer VL (i.e., following the format for VRs such as OB or UT), and may or may not permit undefined length.
Note
1.Since all new VRs will be defined as specified in Section 7.1.2, an implementation may choose to ignore VRs not recognized by applying the rules stated in Section 7.1.2.
2.When converting a Data Set from an Explicit VR Transfer Syntax to a different Transfer Syntax, an implementation may copy Data Elements with unrecognized VRs in the following manner:
•If the endianness of the Transfer Syntaxes is the same, the Value of the Data Element may be copied unchanged and if the target Transfer Syntax is Explicit VR, the VR bytes copied unchanged. In practice this only applies to Little Endian Transfer Syntaxes, since there was only one Big Endian Transfer Syntax defined.
•If the source Transfer Syntax is Little Endian and the target Transfer Syntax is the (retired) Big Endian Explicit VR Transfer Syntax, then the Value of the Data Element may be copied unchanged and the VR changed to UN, since being unrecognized, whether or not byte swapping is required is unknown. If the VR were copied unchanged, the byte order of the value might or might not be incorrect.
•If the source Transfer Syntax is the (retired) Big Endian Explicit VR Transfer Syntax, then the Data Element cannot be copied, because whether or not byte swapping is required is unknown, and there is no equivalent of the UN VR to use when the value is big endian rather than little endian.
The issues of whether or not the element may be copied, and what VR to use if copying, do not arise when converting a Data Set from Implicit VR Little Endian Transfer Syntax, since the VR would not be present to be unrecognized, and if the data element VR is not known from a data dictionary, then UN would be used.
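The three bullets in item 2 of the Note amount to a decision on byte order alone. A minimal sketch follows, assuming the source is an Explicit VR Transfer Syntax as the Note requires. `CopyDecision` and `decide_copy` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class CopyDecision(NamedTuple):
    can_copy: bool
    vr: Optional[str]  # VR to write in the target; None if none is written


def decide_copy(
    source_vr: str,
    source_big_endian: bool,
    target_big_endian: bool,
    target_explicit_vr: bool,
) -> CopyDecision:
    """Decide how to copy a Data Element whose VR is not recognized."""
    if source_big_endian == target_big_endian:
        # Same endianness: Value copied unchanged; VR bytes are copied
        # only when the target is Explicit VR.
        return CopyDecision(True, source_vr if target_explicit_vr else None)
    if target_big_endian:
        # Little Endian -> (retired) Big Endian Explicit VR: the Value is
        # copied unchanged and the VR becomes UN.
        return CopyDecision(True, "UN")
    # (Retired) Big Endian Explicit VR -> Little Endian: cannot be copied.
    return CopyDecision(False, None)
```

Here `decide_copy("XX", False, True, True)` yields a copy with VR `UN`, while `decide_copy("XX", True, False, True)` reports that the element cannot be copied.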
An individual Value, including padding, shall not exceed the Length of Value, except in the case of the last Value of a multi-valued field as specified in Section 6.4.
Note
The lengths of Value Representations for which the Character Repertoire can be extended or replaced are expressly specified in characters rather than bytes in Table 6.2-1. This is because the mapping from a character to the number of bytes used for that character's encoding may be dependent on the character set used.
Escape Sequences used for Code Extension shall not be included in the count of characters.
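To illustrate why lengths are counted in characters, the sketch below decodes a value before counting, so that multi-byte characters count once and Escape Sequences are consumed by the decoder rather than counted. `character_count` is a hypothetical helper. Using Python's `iso2022_jp` codec as a stand-in for a value that switches to JIS X 0208 by Code Extension, and `utf-8` for "ISO_IR 192", is an assumption of this sketch.

```python
def character_count(raw: bytes, python_codec: str) -> int:
    """Count characters, not bytes, by decoding the value first."""
    return len(raw.decode(python_codec))


# Two kanji encoded with ISO 2022 escape sequences around them.
kanji = "\u5c71\u7530".encode("iso2022_jp")
# One non-ASCII letter followed by four ASCII letters, in UTF-8.
umlaut = "\u00c4rzte".encode("utf-8")

print(len(kanji), character_count(kanji, "iso2022_jp"))
print(len(umlaut), character_count(umlaut, "utf-8"))
```

In both cases the byte length exceeds the character count, which is the reason a limit stated in bytes would be ambiguous for these VRs.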
VR Name
AE Application Entity
AS Age String
AT Attribute Tag
CS Code String
DA Date
VR Name | Definition | Character Repertoire | Length of Value
DS Decimal String | A string of characters representing either a fixed point number or a floating point number. A fixed point number shall contain only the characters 0-9 with an optional leading "+" or "-" and an optional "." to mark the decimal point. A floating point number shall be conveyed as defined in ANSI X3.9, with an "E" or "e" to indicate the start of the exponent. Decimal Strings may be padded with leading or trailing spaces. Embedded spaces are not allowed. Note: Data Elements with multiple values using this VR may not be properly encoded if Explicit-VR Transfer Syntax is used and the VL of this attribute exceeds 65534 bytes. | "0"-"9", "+", "-", "E", "e", "." and the SPACE character of Default Character Repertoire | 16 bytes maximum
DT Date Time | | "0"-"9", "+", "-", "." and the SPACE character of Default Character Repertoire | 26 bytes maximum. In the context of a Query with range matching (see PS3.4), the length is 54 bytes maximum.
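The DS (Decimal String) constraints in the table above lend themselves to a simple check on a single non-empty value. The following is a sketch only: `is_valid_ds` is a hypothetical name, and accepting forms such as `5.` or `.5` is an assumption, since the definition does not say whether digits are required on both sides of the decimal point.

```python
import re

# Optional leading spaces, optional sign, digits with an optional ".",
# optional exponent introduced by "E" or "e", optional trailing spaces.
DS_PATTERN = re.compile(rb" *[+-]?(\d+\.?\d*|\.\d+)([Ee][+-]?\d+)? *")

DS_MAX_BYTES = 16


def is_valid_ds(value: bytes) -> bool:
    """Check one Decimal String value against the repertoire and length."""
    if len(value) > DS_MAX_BYTES:
        return False
    return DS_PATTERN.fullmatch(value) is not None
```

With this sketch, `is_valid_ds(b" -1.5e+3 ")` is accepted (padding spaces and an exponent), while `is_valid_ds(b"1 2")` is rejected because of the embedded space.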