HL7 guidelines for 'escaping' data

IMPORTANT: When creating your HL7 messages in the schema database for export you must be very careful that your data doesn't UNINTENTIONALLY contain any reserved characters. These characters are typically:

 

1. The HL7 segment delimiter (ANSI default is the 'Carriage Return' character, ASCII character 13, hex 0x0D)

2. The HL7 field delimiter (ANSI default is the 'Pipe' character [ | ])

3. The HL7 'encoding' characters (ANSI default [ ^~\& ]: component, repetition, escape and subcomponent)
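
A message declares its own delimiters at the start of the MSH segment, so the defaults listed above do not have to be assumed. The following is a minimal Python sketch of reading them; the `Delimiters` type and the `read_delimiters` function are hypothetical names used only for illustration, and the segment delimiter is passed in because it is not declared inside the message itself.

```python
from typing import NamedTuple


class Delimiters(NamedTuple):
    """Hypothetical container for the reserved characters of one message."""
    segment: str
    field: str
    component: str
    repetition: str
    escape: str
    subcomponent: str


def read_delimiters(message: str, segment: str = "\r") -> Delimiters:
    """Read the delimiters declared at the start of an HL7 v2 message."""
    if not message.startswith("MSH") or len(message) < 8:
        raise ValueError("message must start with a complete MSH header")
    field = message[3]  # MSH-1: the field separator
    # MSH-2: the four encoding characters, in this fixed order
    component, repetition, escape, subcomponent = message[4:8]
    return Delimiters(segment, field, component, repetition, escape, subcomponent)


d = read_delimiters("MSH|^~\\&|SENDER|FACILITY\r")
print(d.field, d.component, d.repetition, d.escape, d.subcomponent)
# | ^ ~ \ &
```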

 

When exporting HL7 data, the outbound processor automatically applies some basic 'escaping' rules to try to ensure that your raw data will not corrupt the outbound message. When a field is 'escaped', only the HL7 segment delimiter and the HL7 encoding (component, subcomponent, repeat and escape) delimiters are modified to be "safe" according to the HL7 standard escape rules (see below).

 

The "Escape" rule is only applied if the following conditions are detected when exporting the data:

 

1. The data column is NOT a C0 column. All data in C0 columns is assumed to be 'raw' HL7.

2. The data column contains the HL7 segment delimiter (HL7 default: ASCII character 0x0D) OR the HL7 field delimiter (HL7 default: the pipe [ | ] character).

3. If the database column containing the field is of type "Text" (a blob field), the Escape rule is applied automatically.

 

Programmer's Note: When creating your HL7 data in the database you should 'scrub' your data to ensure that it either:

 

1. Does not contain any reserved characters which are not automatically handled, or

2. Has had every character which could potentially corrupt the message (like a component delimiter or a repeat delimiter) escaped by you, using the HL7 standard escaping rules outlined below.
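
As a rough illustration of scrubbing data yourself, the Python sketch below replaces each reserved character with the escape sequence defined in the HL7 excerpt that follows. The function name `escape_hl7` is hypothetical, the delimiters default to the usual ANSI values, and encoding a carriage return as \X0D\ is an assumption (a common convention, not something this guide specifies).

```python
def escape_hl7(text, field="|", component="^", repetition="~",
               escape="\\", subcomponent="&"):
    """Replace reserved characters with HL7 escape sequences."""
    replacements = {
        escape: f"{escape}E{escape}",
        field: f"{escape}F{escape}",
        component: f"{escape}S{escape}",
        subcomponent: f"{escape}T{escape}",
        repetition: f"{escape}R{escape}",
        # Assumption: the segment delimiter is sent as hexadecimal data.
        "\r": f"{escape}X0D{escape}",
    }
    # A single pass over the input, so the escape characters that are
    # inserted here are never escaped a second time.
    return "".join(replacements.get(ch, ch) for ch in text)


print(escape_hl7("Smith & Jones | 5^2"))
# Smith \T\ Jones \F\ 5\S\2
```

Working character by character avoids the classic mistake of chaining string replacements, where the backslashes produced by one replacement are escaped again by the next.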

 

Below is an excerpt from version 2.3.1 of the HL7 standard on the use of escape sequences.

 

Use of escape sequences in text fields

 

Formatting codes

 

When a field of type TX, FT, or CF is being encoded, the escape character may be used to signal certain special characteristics of portions of the text field. The escape character is whatever display ASCII character is specified in the Escape Character component of MSH-2-encoding characters. For purposes of this section, the character \ will be used to represent the character so designated in a message. An escape sequence consists of the escape character followed by an escape code ID of one character, zero (0) or more data characters, and another occurrence of the escape character. The following escape sequences are defined:

 

\H\        start highlighting        

 

\N\        normal text (end highlighting)        

 

\F\        field separator        

 

\S\        component separator        

 

\T\        subcomponent separator        

 

\R\        repetition separator        

 

\E\        escape character        

 

\Xdddd...\        hexadecimal data        

 

\Zdddd...\        locally defined escape sequence        

 

The escape sequences for field separator, component separator, subcomponent separator, repetition separator, and escape character are also valid within an ST data field.

 

No escape sequence may contain a nested escape sequence.

 

Escape sequences supporting multiple character sets for PN, XPN, XCN, XON, and XAD data types

 

The following HL7 escape sequences are defined to support multiple character sets for fields of the PN and XPN data type. They allow HL7 parsers to use escape codes (defined in the standards used below), without breaking, and without being non-conformant, to the HL7 escape paradigm defined in this section.

 

\Cxxyy\ single-byte character set escape sequence with two hexadecimal values, xx and yy, that indicate the escape sequence defined for one of the character repertoires supported for the current message (i.e., ISO-IR xxx).

 

\Mxxyyzz\ multi-byte character set escape sequence with three hexadecimal values, xx, yy and zz. zz is optional.

 

Common character set escape sequences include the following which are defined in the standards mentioned:

 

Single-byte character sets:

 

\C2842\        ISO-IR 6 G0 (ISO 646 : ASCII)        

 

\C2D41\        ISO-IR 100 (ISO 8859 : Latin Alphabet 1)        

 

\C2D42\        ISO-IR 101 (ISO 8859 : Latin Alphabet 2)        

 

\C2D43\        ISO-IR 109 (ISO 8859 : Latin Alphabet 3)        

 

\C2D44\        ISO-IR 110 (ISO 8859 : Latin Alphabet 4)        

 

\C2D4C\        ISO-IR 144 (ISO 8859 : Cyrillic)        

 

\C2D47\        ISO-IR 127 (ISO 8859 : Arabic)        

 

\C2D46\        ISO-IR 126 (ISO 8859 : Greek)        

 

\C2D48\        ISO-IR 138 (ISO 8859 : Hebrew)        

 

\C2D4D\        ISO-IR 148 (ISO 8859 : Latin Alphabet 5)        

 

\C284A\        ISO-IR 14 (JIS X 0201 -1976: Romaji)        

 

\C2949\        ISO-IR 13 (JIS X 0201 : Katakana)        

 

Multi-byte codes:

 

\M2442\        ISO-IR 87 (JIS X 0208 : Kanji, hiragana and katakana)        

 

\M242844\        ISO-IR 159 (JIS X 0212 : Supplementary Kanji)        
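
To make the structure of these sequences concrete, here is a small Python sketch that splits a \Cxxyy\ or \Mxxyyzz\ sequence into its kind and its hexadecimal byte values, and looks up a description. The function `parse_charset_escape` and the `KNOWN_CHARSETS` dictionary are hypothetical; the dictionary holds only a few entries copied from the tables above.

```python
import re

# A subset of the tables above, keyed by the text between the escape characters.
KNOWN_CHARSETS = {
    "C2842": "ISO-IR 6 G0 (ISO 646 : ASCII)",
    "C2D41": "ISO-IR 100 (ISO 8859 : Latin Alphabet 1)",
    "C2D4C": "ISO-IR 144 (ISO 8859 : Cyrillic)",
    "M2442": "ISO-IR 87 (JIS X 0208)",
    "M242844": "ISO-IR 159 (JIS X 0212 : Supplementary Kanji)",
}


def parse_charset_escape(sequence, escape="\\"):
    """Return (kind, byte values, description or None) for one sequence."""
    esc = re.escape(escape)
    hex_pair = "[0-9A-Fa-f]{2}"
    # C takes exactly two hex pairs; M takes two, plus an optional third.
    pattern = rf"{esc}(C(?:{hex_pair}){{2}}|M(?:{hex_pair}){{2,3}}){esc}"
    match = re.fullmatch(pattern, sequence)
    if match is None:
        raise ValueError(f"not a character set escape sequence: {sequence!r}")
    body = match.group(1).upper()
    return body[0], bytes.fromhex(body[1:]), KNOWN_CHARSETS.get(body)


kind, values, name = parse_charset_escape("\\C2D41\\")
print(kind, values.hex(), name)
# C 2d41 ISO-IR 100 (ISO 8859 : Latin Alphabet 1)
```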

 

 

 

Highlighting

 

In designating highlighting, the sending application is indicating that the characters that follow somehow should be made to stand out, but leaving the method of doing so to the receiving application. Depending on device characteristics and application style considerations, the receiving application may choose reverse video, boldface, underlining, blink, an alternate color or another means of highlighting the displayed data. For example the message fragment:

 

DSP| TOTAL CHOLESTEROL \H\240*\N\ [90 - 200]

 

might cause the following data to appear on a screen or report, with 240* highlighted (for example, in boldface):

 

TOTAL CHOLESTEROL 240* [90 - 200]

 

whereas another system may choose to show the 240* in red.
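
Because the method of highlighting is left to the receiver, a receiving application can simply map \H\ and \N\ to whatever markers its display uses. The Python sketch below assumes HTML bold tags as the markers; `render_highlight` is a hypothetical helper, not part of any HL7 library.

```python
def render_highlight(text, start="<b>", end="</b>", escape="\\"):
    """Replace the start/end highlighting sequences with display markers."""
    return (text.replace(f"{escape}H{escape}", start)
                .replace(f"{escape}N{escape}", end))


field = "TOTAL CHOLESTEROL \\H\\240*\\N\\ [90 - 200]"
print(render_highlight(field))
# TOTAL CHOLESTEROL <b>240*</b> [90 - 200]
```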

 

Special characters

 

The special character escape sequences (\F\, \S\, \R\, \T\, and \E\) allow the corresponding characters to be included in the data in a text field, though the actual characters are reserved. For example, the message fragment

 

DSP| TOTAL CHOLESTEROL 180 \F\90 - 200\F\

 

DSP| \S\----------------\S\

 

would cause the following information to be displayed, given suitable assignment of separators:

 

TOTAL CHOLESTEROL 180 |90 - 200|

 

^----------------^
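
On the receiving side, the same five sequences are turned back into the characters they stand for. The following Python sketch shows one way to do that for the examples above; `unescape_hl7` is a hypothetical name and the delimiters default to the usual ANSI values.

```python
import re


def unescape_hl7(text, field="|", component="^", repetition="~",
                 escape="\\", subcomponent="&"):
    """Replace the five special character escape sequences with their characters."""
    table = {"F": field, "S": component, "T": subcomponent,
             "R": repetition, "E": escape}
    esc = re.escape(escape)
    pattern = re.compile(f"{esc}([FSTRE]){esc}")
    # One left-to-right pass: a character produced by a replacement is
    # never examined again, so decoded text cannot be decoded twice.
    return pattern.sub(lambda m: table[m.group(1)], text)


print(unescape_hl7("TOTAL CHOLESTEROL 180 \\F\\90 - 200\\F\\"))
# TOTAL CHOLESTEROL 180 |90 - 200|
```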

 

Hexadecimal

 

When the hexadecimal escape sequence (\Xdddd...\) is used the X should be followed by 1 or more pairs of hexadecimal digits (0, 1, . . . , 9, A, . . . , F). Consecutive pairs of the hexadecimal digits represent 8-bit binary values. The interpretation of the data is entirely left to an agreement between the sending and receiving applications that is beyond the scope of this Standard.
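
A short Python sketch of decoding hexadecimal escape sequences follows. Since the standard leaves the interpretation of the data to the two applications, treating each byte as a Latin-1 character is purely an assumption made for this example, and `decode_hex_escapes` is a hypothetical name.

```python
import re


def decode_hex_escapes(text, escape="\\", encoding="latin-1"):
    """Replace each hexadecimal escape sequence with the bytes it encodes."""
    esc = re.escape(escape)
    # X followed by one or more PAIRS of hexadecimal digits.
    pattern = re.compile(f"{esc}X((?:[0-9A-Fa-f]{{2}})+){esc}")
    return pattern.sub(
        lambda m: bytes.fromhex(m.group(1)).decode(encoding), text)


print(repr(decode_hex_escapes("Line one\\X0D0A\\Line two")))
# 'Line one\r\nLine two'
```

A sequence with an odd number of hexadecimal digits does not match the pattern and is left untouched, which keeps malformed input visible instead of silently guessing at it.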

 

 

 

Formatted text

 

If the field is of the formatted text (FT) data type, formatting commands also may be surrounded by the escape character. Each command begins with the . (period) character. The following formatting commands are available:

 

.sp <number>        End current output line and skip <number> vertical spaces. <number> is a positive integer or absent. If <number> is absent, skip one space. The horizontal character position remains unchanged. Note that for purposes of compatibility with previous versions of HL7, “^\.sp\” is equivalent to “\.br\”.        

 

.br        Begin new output line. Set the horizontal position to the current left margin and increment the vertical position by 1.        

 

.fi        Begin word wrap or fill mode. This is the default state. It can be changed to a no-wrap mode using the .nf command.        

 

.nf        Begin no-wrap mode.        

 

.in <number>        Indent <number> of spaces, where <number> is a positive or negative integer. This command cannot appear after the first printable character of a line.        

 

.ti <number>        Temporarily indent <number> of spaces where number is a positive or negative integer. This command cannot appear after the first printable character of a line.        

 

.sk <number>        Skip <number> spaces to the right.        

 

.ce        End current output line and center the next line.        

 

 

 

The component separator that marks each line defines the extent of the temporary indent command (.ti), and the beginning of each line in the no-wrap mode (.nf). Examples of formatting instructions that are NOT included in this data type include: width of display, position on page or screen, and type of output devices.
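
A receiver usually has to separate the formatting commands from the text before it can lay anything out. The Python sketch below does only that first step: it splits an FT value into text runs and (command, argument) pairs. `tokenize_ft` is a hypothetical helper; it does not render the text, and it leaves the component separator (^) inside the text runs.

```python
import re


def tokenize_ft(text, escape="\\"):
    """Split an FT value into ("text", str) and (command, int or None) tokens."""
    esc = re.escape(escape)
    pattern = re.compile(
        rf"{esc}\.(sp|br|fi|nf|in|ti|sk|ce)\s*([+-]?\d+)?{esc}")
    tokens = []
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            tokens.append(("text", text[pos:match.start()]))
        number = int(match.group(2)) if match.group(2) else None
        tokens.append((match.group(1), number))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens


E = "\\"  # the escape character
sample = (f"{E}.in+4{E}{E}.ti-4{E} 1. First.^{E}.sp{E}"
          f"{E}.ti-4{E} 2. Second.{E}.in-4{E}")
tokens = tokenize_ft(sample)
for token in tokens:
    print(token)
```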

 

 

 

Figure 2-3 is an example of the FT data type from a radiology impression section of a radiology report:

 

 

 

Figure 2-3. Formatted text as transmitted

 

|\.in+4\\.ti-4\ 1. The cardiomediastinal silhouette is now within normal limits.^\.sp\\.ti-4\ 2. Lung fields show minimal ground glass appearance.^\.sp\\.ti-4\ 3. A loop of colon visible in the left upper quadrant is distinctly abnormal with the appearance of mucosal effacement suggesting colitis.\.in-4\|        

 

 

 

Figure 2-4 shows one way of presenting the data in Figure 2-3. The receiving system can create many other interpretations by varying the right margin.

 

Figure 2-4. Formatted text in one possible presentation

 

1. The cardiomediastinal silhouette is now within normal limits.

2. Lung fields show minimal ground glass appearance.

3. A loop of colon visible in the left upper quadrant is distinctly abnormal with the appearance of mucosal effacement suggesting colitis.