Page 46 |
DICOM PS3.5 2020a - Data Structures and Encoding |
VR Name | Definition | Character Repertoire | Length of Value
UN Unknown | An octet-stream where the encoding of the contents is unknown (see Section 6.2.2). | not applicable | Any length valid for any of the other DICOM Value Representations
UR Universal Resource Identifier or Universal Resource Locator (URI/URL) | A string of characters that identifies a URI or a URL as defined in [RFC3986]. Leading spaces are not allowed. Trailing spaces shall be ignored. Data Elements with this VR shall not be multi-valued. | The subset of the Default Character Repertoire required for the URI as defined in IETF RFC3986 Section 2, plus the space (20H) character permitted only as trailing padding. Characters outside the permitted character set must be "percent encoded". Note: The Backslash (5CH) character is among those disallowed in URIs. | 2^32-2 bytes maximum. See Note 2
US Unsigned Short | Unsigned binary integer 16 bits long. Represents integer n in the range: 0 <= n < 2^16. | not applicable | 2 bytes fixed
UT Unlimited Text | A character string that may contain one or more paragraphs. It may contain the Graphic Character set and the Control Characters, CR, LF, FF, and ESC. It may be padded with trailing spaces, which may be ignored, but leading spaces are considered to be significant. Data Elements with this VR shall not be multi-valued and therefore character code 5CH (the BACKSLASH "\" in ISO-IR 6) may be used. | Default Character Repertoire and/or as defined by (0008,0005) excluding Control Characters except TAB, LF, FF, CR (and ESC when used for ISO 2022 escape sequences). | 2^32-2 bytes maximum. See Note 2
UV Unsigned 64-bit Very Long | Unsigned binary integer 64 bits long. Represents an integer n in the range: 0 <= n < 2^64. | not applicable | 8 bytes fixed
Note
1. For attributes that were present in ACR-NEMA 1.0 and 2.0 and that have been retired, the specifications of Value Representation and Value Multiplicity provided are recommendations for the purpose of interpreting their values in objects created in accordance with earlier versions of this Standard. These recommendations are suggested as most appropriate for a particular attribute; however, there is no guarantee that historical objects will not violate some requirements or specified VR and/or VM.
2. The length of the value of UC, UR and UT VRs is limited only by the size of the maximum unsigned integer representable in a 32 bit VL field minus two, since FFFFFFFFH is reserved and lengths are required to be even.
3. In previous editions of the Standard (see PS3.5 2015a), the TAB character was not listed as permitted for the ST, LT and UT VRs. It has been added for the convenience of formatting and the encoding of XML text.
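As an illustration of the fixed-length binary VRs (US, UV) and of the limit described in Note 2, the following Python sketch packs single values and derives the 2^32-2 maximum. It assumes Little Endian byte ordering (the actual ordering depends on the Transfer Syntax in use), and the function and constant names are hypothetical, not part of this Standard.

```python
import struct

# Largest even length representable in a 32-bit VL field. FFFFFFFFH is
# reserved, so the limit is 2^32 - 2 (see Note 2).
MAX_32BIT_VALUE_LENGTH = 0xFFFFFFFF - 1


def encode_us(n: int) -> bytes:
    """Encode one US value (0 <= n < 2^16) as 2 bytes, little endian (assumed)."""
    if not 0 <= n < 2**16:
        raise ValueError("US value out of range")
    return struct.pack("<H", n)


def encode_uv(n: int) -> bytes:
    """Encode one UV value (0 <= n < 2^64) as 8 bytes, little endian (assumed)."""
    if not 0 <= n < 2**64:
        raise ValueError("UV value out of range")
    return struct.pack("<Q", n)
```

Values outside the stated ranges are rejected rather than truncated, which is one possible design choice for a writer.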
- Standard -
6.2.1 Person Name (PN) Value Representation
6.2.1.1 Examples of PN VR and Notes
Examples:
• Rev. John Robert Quincy Adams, B.A. M.Div.
"Adams^John Robert Quincy^^Rev.^B.A. M.Div."
[One family name; three given names; no middle name; one prefix; two suffixes.]
• Susan Morrison-Jones, Ph.D., Chief Executive Officer
"Morrison-Jones^Susan^^^Ph.D., Chief Executive Officer"
[Two family names; one given name; no middle name; no prefix; two suffixes.]
• John Doe
"Doe^John"
[One family name; one given name; no middle name, prefix, or suffix. Delimiters have been omitted for the three trailing null components.]
• (for examples of the encoding of Person Names using multi-byte character sets see Annex H)
• "Smith^Fluffy"
[A cat, rather than a human, whose responsible party family name is Smith, and whose own name is Fluffy]
• "ABC Farms^Running on Water"
[A horse whose responsible organization is named ABC Farms, and whose name is "Running On Water"]
Note
1. A similar multiple component convention is also used by the HL7 v2 XPN data type. However, the XPN data type places the suffix component before the prefix, and has a sixth component "degree" that DICOM subsumes in the name suffix. There are also differences in the manner in which name representation is identified.
2. In typical American and European usage the first occurrence of "given name" would represent the "first name". The second and subsequent occurrences of the "given name" would typically be treated as a middle name(s). The "middle name" component is retained for the purpose of backward compatibility with existing standards.
3. The implementer should remain mindful of earlier usage forms that represented "given names" as "first" and "middle" and that translations to and from this previous typical usage may be required.
4. For reasons of backward compatibility with older versions of this Standard, person names might be considered a single family name complex (single component without "^" delimiters).
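The component layout used in the examples above can be sketched in Python as follows. This is a minimal illustration that assumes the value holds a single component group already decoded to text; the function name and the dictionary keys are hypothetical labels, not terms defined by this Standard.

```python
PN_COMPONENTS = ("family", "given", "middle", "prefix", "suffix")


def parse_pn_components(group: str) -> dict:
    """Split one PN component group on "^" into its five components."""
    parts = group.split("^")
    if len(parts) > len(PN_COMPONENTS):
        raise ValueError("more than 5 components in a component group")
    # Trailing null components and their delimiters may be omitted,
    # so missing components are filled in as empty strings.
    parts += [""] * (len(PN_COMPONENTS) - len(parts))
    return dict(zip(PN_COMPONENTS, parts))
```

A value without any "^" delimiter ends up entirely in the first component, which is consistent with the backward-compatibility reading in Note 4.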
6.2.1.2 Ideographic and Phonetic Characters in Data Elements with VR of PN
Character strings representing person names are encoded using a convention for PN value representations based on component groups with 5 components.
For the purpose of writing names in ideographic characters and in phonetic characters, up to 3 component groups may be used. The delimiter of the component group shall be the equals character "=" (3DH). The three component groups in their order of occurrence are: an alphabetic representation, an ideographic representation, and a phonetic representation.
Any component group may be absent, including the first component group. In this case, the person name may start with one or more "=" delimiters. Delimiters are also required for interior null component groups. Trailing null component groups and their delimiters may be omitted.
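The handling of component groups described above might be sketched in Python as follows. The sketch assumes the value has already been decoded to text; the function name and dictionary keys are hypothetical, and the sample names in the usage note are illustrative only.

```python
GROUP_NAMES = ("alphabetic", "ideographic", "phonetic")


def split_pn_groups(value: str) -> dict:
    """Split a PN value on "=" into up to three component groups."""
    groups = value.split("=")
    if len(groups) > len(GROUP_NAMES):
        raise ValueError("more than 3 component groups")
    # Trailing null component groups and their delimiters may be omitted.
    groups += [""] * (len(GROUP_NAMES) - len(groups))
    return dict(zip(GROUP_NAMES, groups))
```

For instance, `split_pn_groups("=山田^太郎")` yields an empty alphabetic group, an ideographic group of "山田^太郎", and an empty phonetic group, reflecting a value that starts with a "=" delimiter.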
The first component group (identified by DICOM as "alphabetic") shall be encoded using the character set specified by the Attribute Specific Character Set (0008,0005), value 1. If Attribute Specific Character Set (0008,0005) is not present, the Default Character Repertoire ISO-IR 6 shall be used. ISO 2022 escapes for Code Extension shall not be used in this component group. When Specific Character Set (0008,0005) value 1 specifies a multi-byte character set without Code Extension (i.e., Unicode in UTF-8, GB18030 or GBK), the characters of this component group may be encoded with multiple bytes, but shall be drawn from the code points U+0020 through U+1FFF of ISO/IEC 10646, or the following ISO/IEC 10646 code points:
U+3001, U+3002, U+300C, U+300D, U+3099 through U+309C, and U+30A0 through U+30FF
The second group shall be used for ideographic characters. The character sets used will usually be those from Attribute Specific Character Set (0008,0005), value 2 through n, and may use ISO 2022 escapes.
The third group shall be used for phonetic characters. The character sets used shall be those from Attribute Specific Character Set (0008,0005), value 1 through n, and may use ISO 2022 escapes.
Delimiter characters "^" and "=" are taken from the character set specified by value 1 of the Attribute Specific Character Set (0008,0005). If Attribute Specific Character Set (0008,0005), value 1 is not present, the Default Character Repertoire ISO-IR 6 shall be used.
At the beginning of the value of the Person Name data element, the following initial condition is assumed: if Attribute Specific Character Set (0008,0005), value 1 is not present, the Default Character Repertoire ISO-IR 6 is invoked, and if the Attribute Specific Character Set (0008,0005), value 1 is present, the character set specified by value 1 of the Attribute is invoked.
At the end of the value of the Person Name data element, and before the component delimiters "^" and "=", the character set shall be switched to the Default Character Repertoire ISO-IR 6, if value 1 of the Attribute Specific Character Set (0008,0005) is not present. If value 1 of the Attribute Specific Character Set (0008,0005) is present, the character set shall be switched to that specified by value 1 of the Attribute.
The value length of each component group is 64 characters maximum, including the delimiter for the component group. Each combining character (e.g., diacritics or vowel marks) shall be considered a separate character for this maximum length, regardless of how an application may display such combining characters (i.e., combined into the glyph for the base character, or rendered separately).
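A length check along these lines could look like the following Python sketch. It rests on two assumptions that are not stated in the text: that "including the delimiter" means each group is counted together with the "=" that follows it, and that the value is held as a decoded, un-normalized string so that `len()` counts each combining character separately. The names are hypothetical.

```python
MAX_GROUP_CHARS = 64


def group_lengths(pn_value: str) -> list:
    """Character count of each component group, counting its trailing "="."""
    groups = pn_value.split("=")
    lengths = []
    for index, group in enumerate(groups):
        has_delimiter = index < len(groups) - 1
        # len() counts code points, so a base letter followed by a
        # combining mark contributes two characters.
        lengths.append(len(group) + (1 if has_delimiter else 0))
    return lengths


def pn_within_limit(pn_value: str) -> bool:
    """True if every component group fits in 64 characters (as assumed above)."""
    return all(n <= MAX_GROUP_CHARS for n in group_lengths(pn_value))
```

Applying Unicode normalization before the check would change the count for composed characters, so a caller would need to decide whether the stored or the normalized form is being measured.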
6.2.2 Unknown (UN) Value Representation
The Unknown (UN) VR shall only be used for Private Attribute Data Elements and Standard Data Elements previously encoded as some DICOM VR other than UN using the DICOM Default Transfer Syntax (Implicit VR Little Endian), and whose Value Representation is currently unknown, or whose known Value Representation is none of OB, OD, OF, OL, OW, SQ, UC, UR or UT and whose value length exceeds 65534 (2^16-2) and therefore cannot be encoded as a 16-bit unsigned integer in the Value Length Field defined for the known Value Representation (see Section 6.2.1). As long as the VR is unknown the Value Field is insensitive to byte ordering and shall not be 'byte-swapped' (see Section 7.3). In the case of undefined length sequences, the value shall remain in implicit VR form. See Section 7.8 for a description of Private Data Attribute Elements and Section 10 and Annex A for a discussion of Transfer Syntaxes.
The UN VR shall not be used for Private Creator Data Elements (i.e., the VR is equal to LO, see Section 7.8.1).
The UN VR shall not be used for File Meta Information Data Elements (any Tag (0002,xxxx), see PS3.10).
Note
1. All other (non-default) DICOM Transfer Syntaxes employ explicit VR in their encoding, and therefore any Private and/or Standard Data Element Value Field Attribute value encoded and decoded using any Transfer Syntax other than the default, and not having been translated to the DICOM Default Transfer Syntax in the interim, will have a known VR.
2. If at some point an application knows the actual VR for an Attribute of VR UN (e.g., has its own applicable data dictionary), it can assume that the Value Field of the Attribute is encoded in Little Endian byte ordering with implicit VR encoding, irrespective of the current Transfer Syntax.
3. This VR of UN is needed when an explicit VR must be given to a Data Element whose Value Representation is unknown (e.g., store and forward).
4. This VR of UN is also needed for the encoding of Data Elements with explicit VR whose value length exceeds 65534 (2^16-2) (FFFEH, the largest even length unsigned 16 bit number) but which are defined to have a 16 bit explicit VR length field.
5. The length field of the Value Representation of UN may contain the value of Undefined Length, in which case the contents can be assumed to be encoded with implicit VR. See Section 7.5.1 to determine how to parse Data Elements with an Undefined Length.
6. An example of a Standard Data Element using a UN VR is a Type 3 or Type U Standard Attribute added to an SOP Class definition. An existing application that does not support that new Attribute (and encounters it) could convert the VR to UN.
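The choice of VR when re-encoding an element with explicit VR, as described in this section, might be sketched in Python as follows. This is a simplified illustration only: it assumes the element was previously encoded in Implicit VR Little Endian, it does not check the Private Creator or File Meta Information exclusions, and the function and constant names are hypothetical.

```python
from typing import Optional

# VRs listed in this section as exempt from the 16-bit length restriction.
LONG_LENGTH_VRS = {"OB", "OD", "OF", "OL", "OW", "SQ", "UC", "UR", "UT"}
MAX_16BIT_LENGTH = 2**16 - 2  # 65534 (FFFEH), the largest even 16-bit length


def explicit_vr_for(known_vr: Optional[str], value_length: int) -> str:
    """Return the VR to write: the known VR where possible, otherwise UN."""
    if known_vr is None:
        # The Value Representation is currently unknown.
        return "UN"
    if known_vr not in LONG_LENGTH_VRS and value_length > MAX_16BIT_LENGTH:
        # The length does not fit the 16-bit Value Length Field of this VR.
        return "UN"
    return known_vr
```

For example, a DS value of 70000 bytes would be written as UN under this sketch, while a UT value of the same length keeps its VR.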
6.2.3 URI/URL (UR) Value Representation
The URI/URL (UR) VR uses a subset of the Default Character Repertoire as defined in [RFC3986], and shall not use any code extension or replacement techniques. URI/URL domain name components that in their original form use characters outside the permitted character set shall use the Internationalized Domain Names for Applications encoding in accordance with IETF RFC5890 and RFC5891. Other URI/URL content that uses characters outside the permitted character set shall use the Internationalized Resource Identifiers encoding mechanism of IETF RFC 3987, representing the content string in UTF-8 and percent encoding characters as required.
Note
For example, the use of a patient name in a URI/URL string may require use of the [RFC3987] technique.
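The UTF-8 percent-encoding step can be illustrated with Python's standard library. This is a minimal sketch for non-domain content only: it does not perform the domain name encoding of RFC5890 and RFC5891, it pads odd-length values with one trailing space on the basis that lengths are required to be even, and the helper names are hypothetical.

```python
from urllib.parse import quote


def percent_encode_component(text: str) -> str:
    """Percent-encode text as UTF-8, leaving only unreserved characters as-is."""
    return quote(text, safe="", encoding="utf-8")


def pad_ur_value(uri: str) -> bytes:
    """Encode a UR value as bytes, adding one trailing space if the length is odd."""
    raw = uri.encode("ascii")  # raises UnicodeEncodeError if non-ASCII text remains
    return raw + b" " if len(raw) % 2 else raw
```

A patient name such as "Müller^Hans" would become "M%C3%BCller%5EHans" when embedded in a URI this way; the same call also encodes the Backslash character as "%5C".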
6.3 Enumerated Values and Defined Terms
The value of certain Data Elements may be chosen among a set of explicit Values satisfying its VR. These explicit Values are either Enumerated Values or Defined Terms and are specified in PS3.3 and PS3.4.
Enumerated Values are used when the specified explicit Values are the only Values allowed for a Data Element. A Data Element with Enumerated Values that does not have a Value equivalent to one of the Values specified in this Standard has an invalid value within the scope of a specific Information Object/SOP Class definition.
Note
1. Patient Sex (0010,0040) is an example of a Data Element having Enumerated Values. It is defined to have a Value that is either "M", "F", or "O" (see PS3.3). No other Value shall be given to this Data Element.
2. Future modifications of this Standard may add to the set of allowed values for Data Elements with Enumerated Values. Such additions by themselves may or may not require a change in SOP Class UIDs, depending on the semantics of the Data Element.
Defined Terms are used when the specified explicit Values may be extended by implementers to include additional new Values. These new Values shall be specified in the Conformance Statement (see PS3.2) and shall not have the same meaning as currently defined Values in this Standard. A Data Element with Defined Terms that does not contain a Value equivalent to one of the Values currently specified in this Standard shall not be considered to have an invalid value. An empty (zero length) value is not a valid new Value for a Defined Term; empty values shall be considered invalid unless the Standard specifically permits empty values. New Values shall not have a meaning of unknown, since that concept, if permitted by the Standard, shall be conveyed explicitly either by allowing the Data Element to be zero length or by provision of a standard Defined Term with such a meaning.
Note
1. Reporting Priority (0040,1009) is an example of a Data Element having Defined Terms. It is defined to have a Value that may be one of the set of standard Values: HIGH, ROUTINE, MEDIUM, or LOW (see PS3.3). Because this Data Element has Defined Terms, other reporting priorities may be defined by the implementer.
2. The validity of empty values is usually specified by the attribute being defined as Type 2 (see Section 7.4.3). However, in the context of a required Type 1 attribute with multiple values, some (but not all) values may be allowed to be empty (see Section 7.4.1); in this case the Standard explicitly specifies the validity of empty values in the list of Defined Terms for each value. Specific Character Set (0008,0005) is an example of a Data Element for which the Standard specifically permits the first value to be empty when multiple values are present. Image Type (0008,0008) is an example of a Data Element that in some IODs defined in PS3.3 is required to be present with multiple values, but if an empty value is not explicitly listed in the Defined Terms for Value 3 by an IOD, an empty value is invalid.
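The difference between the two kinds of value sets can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the check is applied to a single non-padded value, and whether an empty value is permitted is passed in by the caller because it depends on the attribute definition.

```python
def is_valid_enumerated(value: str, enumerated_values: set) -> bool:
    """Enumerated Values: only the listed Values are allowed."""
    return value in enumerated_values


def is_valid_defined_term(value: str, empty_permitted: bool = False) -> bool:
    """Defined Terms: implementers may add new Values, but an empty value
    is valid only where the Standard specifically permits it."""
    if value == "":
        return empty_permitted
    return True
```

With the Patient Sex set {"M", "F", "O"}, a value of "X" fails the enumerated check, whereas an implementer-defined Reporting Priority such as a hypothetical "URGENT" passes the Defined Term check.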
The Value Representation may affect the interpretation of Defined Terms and Enumerated Values for numeric values. For binary Value Representations, the textual representation of the Value in the Standard does not affect the interpretation. For string Value Representations (IS and DS), the meaning of the Value in the Standard shall be used, not the literal string.
Note
For example, an Enumerated Value of "1" expressed in the text of the Standard matches an IS or DS value encoded as "001", or a DS value encoded as "1.0" or "1." or "1.0000E+00" or any permitted encoding. Leading and trailing spaces are defined in Table 6.2-1 not to be significant and hence do not affect the interpretation.
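Matching by numeric meaning rather than by literal string could be sketched in Python with the `decimal` module. The function name is hypothetical, and the sketch assumes the encoded value is a well-formed IS or DS string.

```python
from decimal import Decimal


def matches_numeric_term(encoded: str, term: str) -> bool:
    """Compare an IS/DS encoded value with a term by numeric meaning."""
    # Leading and trailing spaces are not significant for IS and DS.
    return Decimal(encoded.strip(" ")) == Decimal(term)
```

Under this comparison, "001", "1.0", "1." and "1.0000E+00" all match a term of "1".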
6.4 Value Multiplicity (VM) and Delimitation
The Value Multiplicity of a Data Element specifies the number of Values that can be encoded in the Value Field of that Data Element. The VM of each Data Element is specified explicitly in PS3.6. If the number of Values that may be encoded in an element is variable, it shall be represented by two numbers separated by a dash; e.g., "1-10" means that there may be 1 to 10 Values in the element.
Note
Elements having a multiplicity of "S", which represented "single", in older versions of this Standard, will have a multiplicity of "1" in this version of this Standard.
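A VM specification of the form shown above can be parsed with a few lines of Python. This sketch handles only a single number or two numbers separated by a dash, as described in this section; other notations are not covered, and the function names are hypothetical.

```python
def parse_vm(vm: str) -> tuple:
    """Parse a VM such as "1" or "1-10" into (minimum, maximum)."""
    low, dash, high = vm.partition("-")
    if not dash:
        return int(low), int(low)
    return int(low), int(high)


def vm_allows(vm: str, count: int) -> bool:
    """True if `count` Values are permitted by the VM specification."""
    low, high = parse_vm(vm)
    return low <= count <= high
```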
When a Data Element has multiple Values, those Values shall be delimited as follows:
• For character strings, the character 5CH (BACKSLASH "\" in the case of the repertoire ISO-IR 6) shall be used as a delimiter between Values.
Note
BACKSLASH ("\") is used as a delimiter between character string Values that are of fixed length as well as variable length.
• Multiple binary Values of fixed length shall be a series of concatenated Values without any delimiter.
Each string Value in a multi-valued character string may be of even or odd length, but the length of the entire Value Field (including "\" delimiters) shall be of even length. If padding is required to make the Value Field of even length, a single padding character shall be applied to the end of the Value Field (to the last Value), in which case the length of the last Value may exceed the Length of Value by 1.
Note
A padding character may need to be appended to a fixed length character string value in the above case.
Only the last UID Value in a multi-valued Data Element with a VR of UI shall be padded with a single trailing NULL (00H) character when necessary to ensure that the entire Value Field (including "\" delimiters) is of even length.
Data Elements with a VR of LT, OB, OD, OF, OL, OW, SQ, ST, UN, UR or UT shall always have a Value Multiplicity of one. See Table 6.2-1.
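The delimitation and padding rules of this section might be sketched in Python as follows. The sketch assumes string Values drawn from the Default Character Repertoire (encoded here as ASCII) and Little Endian byte ordering for binary Values; the function names are hypothetical.

```python
import struct


def encode_string_values(values, pad: bytes = b" ") -> bytes:
    """Join string Values with BACKSLASH and pad the Value Field to even length."""
    field = "\\".join(values).encode("ascii")
    if len(field) % 2:
        field += pad  # a single padding character after the last Value
    return field


def encode_us_values(values) -> bytes:
    """Concatenate fixed-length 16-bit Values without any delimiter."""
    return b"".join(struct.pack("<H", v) for v in values)
```

For a VR of UI the same helper can be called with `pad=b"\x00"`, so that only the last UID receives a trailing NULL when the field length would otherwise be odd.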