Glycominds Linear Code Tutorial


The linear code representation for glycan structures has two components: sugar residues and linkages. Monosaccharides are represented by one or two capital letters, while each linkage is abbreviated as a lowercase letter (a or b, for the α or β anomeric configuration) followed by a number (the position of linkage). The codes for some of the more commonly found monosaccharides are listed below:

G=glucose
A=galactose
GN=N-acetylglucosamine
AN=N-acetylgalactosamine
M=mannose
NN=N-acetylneuraminic acid
N=neuraminic acid
F=fucose
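
As a minimal sketch of how these codes can be read by a program, the Python snippet below splits an unbranched chain into residues and linkages. The names MONOSACCHARIDES and tokenize_linear are hypothetical, and the snippet assumes that only the eight codes listed above occur and that every linkage is fully specified.

```python
import re

# Codes taken from the list above. Assumption: no other codes are handled.
MONOSACCHARIDES = {
    "G": "glucose",
    "A": "galactose",
    "GN": "N-acetylglucosamine",
    "AN": "N-acetylgalactosamine",
    "M": "mannose",
    "NN": "N-acetylneuraminic acid",
    "N": "neuraminic acid",
    "F": "fucose",
}

# Try two-letter codes first so "GN" is not read as "G" followed by "N".
_CODES = sorted(MONOSACCHARIDES, key=len, reverse=True)
_TOKEN = re.compile(r"(%s)(?:([ab])(\d))?" % "|".join(_CODES))


def tokenize_linear(chain):
    """Split an unbranched chain into (code, anomer, position) tuples."""
    tokens = []
    pos = 0
    while pos < len(chain):
        m = _TOKEN.match(chain, pos)
        if m is None:
            raise ValueError(f"unrecognised text at offset {pos}: {chain[pos:]!r}")
        tokens.append((m.group(1), m.group(2), m.group(3)))
        pos = m.end()
    return tokens


for code, anomer, position in tokenize_linear("Mb4GNb4GN"):
    print(MONOSACCHARIDES[code], anomer, position)
```

The residue at the reducing end carries no linkage, so its anomer and position come back as None.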

The non-reducing end of a glycan is written on the left, working toward the reducing end on the right. Branching is represented with parentheses. The branch attached to the lowest position of a branching residue is considered part of the main chain and is not parenthesized, while branches attached to higher positions are parenthesized in sequence. For example, Man-5 is written as Ma3(Ma3(Ma6)Ma6)Mb4GNb4GN.
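
The branching rule can be illustrated with a small recursive parser. This is only a sketch, not a reference implementation: parse_glycan and count_residues are hypothetical names, only the eight codes listed in this tutorial are recognised, and linkages are assumed to be fully specified. Each residue becomes a dictionary, and every residue or parenthesized branch to its left is attached to the next residue on its right.

```python
import re

# Assumption: only the eight codes from this tutorial, each with an optional linkage.
_RESIDUE = re.compile(r"(GN|AN|NN|G|A|M|N|F)([ab]\d)?")


def parse_glycan(code):
    """Return the reducing-end residue as a nested dict:
    {"sugar": ..., "link": ..., "children": [...]}."""
    node, pos = _parse_chain(code, 0)
    if pos != len(code):
        raise ValueError(f"unexpected {code[pos]!r} at offset {pos}")
    return node


def _parse_chain(code, pos):
    pending = []  # residues waiting to be attached to the next residue on the right
    while pos < len(code) and code[pos] != ")":
        if code[pos] == "(":
            # A parenthesized branch: parse it, then require the closing parenthesis.
            branch, pos = _parse_chain(code, pos + 1)
            if pos >= len(code) or code[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            pending.append(branch)
        else:
            m = _RESIDUE.match(code, pos)
            if m is None:
                raise ValueError(f"unrecognised text at offset {pos}")
            # Everything collected so far is linked to this residue.
            pending = [{"sugar": m.group(1), "link": m.group(2), "children": pending}]
            pos = m.end()
    if len(pending) != 1:
        raise ValueError("a chain must end in exactly one residue")
    return pending[0], pos


def count_residues(node):
    """Count this residue plus everything attached to it."""
    return 1 + sum(count_residues(child) for child in node["children"])


man5 = parse_glycan("Ma3(Ma3(Ma6)Ma6)Mb4GNb4GN")
print(man5["sugar"], count_residues(man5))
```

For the Man-5 string, the root of the tree is the reducing-end GN, and the tree holds seven residues in total: five mannose and two N-acetylglucosamine.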

Uncertainty in linkage or composition can be expressed in several ways, depending on the information available. The most generic form uses a "?" symbol in place of an unknown entity (e.g., Mb3?b4GN). A slash ("/") indicates that one of two possibilities applies, as in Mb3/4GNb4GN. In this case, the mannose could be linked to either the 3 or the 4 position of the N-acetylglucosamine.
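
To make the slash notation concrete, the sketch below expands a code containing such alternatives into every fully specified candidate. The function name expand_alternatives is hypothetical, and the snippet assumes that a slash always separates exactly two single-digit linkage positions; the "?" symbol is left untouched.

```python
import itertools
import re

# Assumption: a slash always sits between two single-digit positions, as in "3/4".
_ALTERNATIVE = re.compile(r"(\d)/(\d)")


def expand_alternatives(code):
    """List every fully specified code implied by the slash alternatives."""
    # With two capturing groups, split() yields: text, digit, digit, text, ...
    parts = _ALTERNATIVE.split(code)
    fixed = parts[0::3]
    choices = list(zip(parts[1::3], parts[2::3]))
    results = []
    for combo in itertools.product(*choices):
        pieces = [fixed[0]]
        for digit, text in zip(combo, fixed[1:]):
            pieces.append(digit + text)
        results.append("".join(pieces))
    return results


print(expand_alternatives("Mb3/4GNb4GN"))
```

For Mb3/4GNb4GN this yields the two candidates Mb3GNb4GN and Mb4GNb4GN; a code without slashes is returned unchanged as a single-item list.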

Modifications to individual monosaccharides are indicated by square brackets. For example, glucose with a 3-O sulfate would be represented as G[3S]. For more detailed information, including complete lists of monosaccharides and modifications, please see this.
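
A bracketed modification can be separated from its residue with a short pattern, sketched below. The function name parse_modified_residue is hypothetical, and the snippet assumes that a modification is written as a single position digit followed by an uppercase code, as in the G[3S] example; other forms that the full specification may allow are not handled.

```python
import re

# Assumption: one optional modification of the form [<digit><CODE>], e.g. "[3S]".
_MODIFIED = re.compile(r"([A-Z]{1,2})(?:\[(\d)([A-Z]+)\])?")


def parse_modified_residue(text):
    """Split a residue such as "G[3S]" into sugar, position, and modification."""
    m = _MODIFIED.fullmatch(text)
    if m is None:
        raise ValueError(f"cannot parse residue: {text!r}")
    sugar, position, modification = m.groups()
    return {"sugar": sugar, "position": position, "modification": modification}


print(parse_modified_residue("G[3S]"))
```

An unmodified residue such as GN is returned with position and modification set to None.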
