01/22/85 sort_strings, sstr Syntax as a command: sstr {-control_args} strings Syntax as an active function: [sstr {-control_args} strings] Function: orders the argument strings according to the ASCII collating sequence. Arguments: strings are the strings to be sorted. All arguments following the first strings are treated as strings. You can use -string to identify a first string that looks like a control argument or to separate a numeric string from operands of -field. Control arguments (sort units): -all, -a makes the primary (and only) sort field the entire sort unit; i.e., each string is considered to be a sort unit when sorting. (Default) -block N, -bk N makes the sort unit a block of N strings, where N must be a positive integer (see "Examples" below). (Default: 1 string) Control arguments (handling duplicates): -duplicates, -dup retains duplicate sort units in the sorted results. (Default) -only_duplicates, -odup only sort units that occur more than once in the input appear in the sorted results. One unit from each set of duplicate sort units is placed in the return value, in sorted order. -only_duplicate_keys, -odupk only sort units that have duplicate sort fields appear in the sorted results. All such units having duplicate sort fields are placed in the return value, since the nonsort field portions of the units may differ. -only_unique, -ouq only sort units that are unique appear in the sorted results. Whenever a set of duplicate units are found, they are removed entirely from the return value. -only_unique_keys, -ouqk only sort units that have unique sort fields appear in the sorted results. All units having duplicate sort fields are removed entirely from the return value. -unique, -uq deletes duplicate sort units from the sorted results. For each set of duplicate sort units, only the first appears in the sorted results, along with nonduplicate sort units. -unique_keys, -uqk deletes sort units having duplicate sort fields from the sorted results. For each set of sort units having duplicate fields, only the first appears in the sorted results, along with nonduplicate sort units. Control arguments (input strings): -string strings, -str strings identifies the strings that follow as the strings to be sorted. All remaining arguments are treated as input strings. Control arguments (sort order): -ascending, -asc returns the sorted results in ascending order. (Default) -case_sensitive, -cs makes the sort by comparing sort fields without translating letters to lowercase. (Default) -character, -ch makes the sort based on the character representation of the sort field. (Default) -descending, -dsc returns the sorted results in descending order. -field field_specs, -fl field_specs specifies the field(s) to be used when comparing two sort units. This allows units to be sorted based upon comparison of only a part of each sort unit. (See "Notes on field specifications.") Multiple -field control arguments may be used to specify multiple fields. -integer, -int makes the sort by converting the sort field to fixed binary (71,0) integers when comparing one sort unit with another (see "Notes"). -non_case_sensitive, -ncs makes the sort by translating letters in the sort fields to lowercase when comparing one sort unit with another. The actual sorted results remain unchanged. -numeric, -num makes the sort by converting the sort field to float decimal (59) numbers when comparing one sort unit with another (see "Notes"). Syntax of field specification: field_start field_length {sort_controls} Notes on field specification: The field_spec operands of -field define the fields within each sort unit by which the unit is sorted. The first field_spec defines the primary sort field, the second, a secondary sort field, and so forth. Each field_spec consists of a field start location, field length, and optional sorting controls. List of field_start formats: You can give the field start location in one of the following formats: S a positive integer, giving the character position of the start of the field in the sort unit (e.g., 1 if the field begins at the first character). If the sort unit contains fewer than S characters, then the unit is sorted as if space characters appeared in the sort field. -from S, -fm S where S is a positive integer giving the character position of the start of the field in the sort unit. -from STR, -fm STR where STR is a character string that identifies the beginning of the sort field. The field begins with the first character of the sort unit that follows STR. If STR does not appear in the sort unit, then the unit is sorted as if the sort field contained space characters. -from /REGEXP/, -fm /REGEXP/ where REGEXP is a regular expression that identifies the beginning of the sort field. The field begins with the first character of the sort unit that follows the part of the sort unit matching REGEXP (see the qedx command). If no match for REGEXP is found in the sort unit, then the unit is sorted as if the sort field contained space characters. -from -string STR, -fm -str STR treats STR as a character string that identifies the beginning of the sort field, even though STR may look like an integer or a regular expression. For example, -from -string 25 identifies a sort field that begins with the character following 25 in the sort unit. List of field_length formats: You can specify the sort field length in one of the following ways: L a positive integer, giving the length of the sort field in characters. If the sort unit is too short to hold a sort field of L characters (i.e., if the number of characters from the first character of the sort field to the end of the sort unit is less than L), then the unit is sorted as if the field were extended on the right with space characters to a length of L characters. Alternately, L can be -1 to indicate that the remainder of the sort unit is to be used as the sort field. -for L where L is a positive integer giving the length of the sort field in characters, or -1 to use the remainder of the sort unit as the sort field. -to E where E is a positive integer giving the character position of the end of the sort field in the sort unit (e.g., 5 if the field stops after the fifth character of the sort unit). If the sort unit contains fewer then E characters, then the unit is sorted as if space characters were added on the right to extend the unit to E characters. -to STR where STR is a character string that identifies the end of the sort field. The field ends with the first character of the sort unit preceding STR. If STR does not appear in the sort unit after the starting position of the sort field, then the unit is sorted as if space characters appeared in the sort field. -to /REGEXP/ where REGEXP is a regular expression that identifies the end of the sort field. The field ends with the first character of the sort unit that precedes the part of the sort unit matching REGEXP (see the qedx command). If no match for REGEXP is found in the sort unit after the starting position of the sort field, then the unit is sorted as if space characters appeared in the sort field. -to -string STR treats STR as a character string that identifies the end of the sort field, even though STR may look like an integer or a regular expression. Note that when you use -to to indicate the end of the field, then sort_strings examines all sort units to determine the length of the longest instance of this sort field in any sort unit; it then sort units as if the sort field in each unit were extended on the right with space characters to the length of the longest sort field instance. List of sort_controls: The sort controls may be one from each of the following three sets of arguments; the arguments within each set are incompatible with each other. If you give none, then the default is specified by the corresponding control argument. ascending, asc sorts units with this field in ascending order. descending, dsc sorts units with this field in descending order. case_sensitive, cs sorts units by treating uppercase letters in this field as being different from lowercase letters. non_case_sensitive, ncs sorts units by translating this field to lowercase. character, ch sorts units with this field by the character representation. integer, int sorts unit with this field by converting the character representation to its integer value (fixed binary (71,0)). numeric, num sorts units with this field by converting the character representaion to its numeric value (float decimal (59)). Notes: Using the control arguments, each string (or group of strings if you supply -block) is treated as a separate sort unit. These sort units are then sorted, and the ordered units are printed or returned as the active function return value. If you invoke sort_strings without any control arguments, -ascending, -all, and -character are assumed. You can sort a maximum of 261,119 units. The sort is stable; i.e., duplicate units appear in the same order in the sorted results as in the original input. The input strings are sorted using temporary segments in the process directory. The determination of whether or not a sort unit is to be deleted (see -unique) is independent of sort field specifications; i.e., given a number of nonidentical sort units that contain identical sort fields, all the units do appear in the sorted results. The following groups have control arguments that are mutually exclusive with each other. If you provide more than one from a group in a single command, the last one given in the command overrides the others. 1. -all, -field 2. -ascending, -descending 3. -case_sensitive, -non_case_sensitive 4. -character, -integer, -numeric 5. -duplicates, -only_duplicates, -only_duplicate_keys, -unique, -unique_keys. ----------------------------------------------------------- Historical Background This edition of the Multics software materials and documentation is provided and donated to Massachusetts Institute of Technology by Group BULL including BULL HN Information Systems Inc. as a contribution to computer science knowledge. This donation is made also to give evidence of the common contributions of Massachusetts Institute of Technology, Bell Laboratories, General Electric, Honeywell Information Systems Inc., Honeywell BULL Inc., Groupe BULL and BULL HN Information Systems Inc. to the development of this operating system. Multics development was initiated by Massachusetts Institute of Technology Project MAC (1963-1970), renamed the MIT Laboratory for Computer Science and Artificial Intelligence in the mid 1970s, under the leadership of Professor Fernando Jose Corbato. Users consider that Multics provided the best software architecture for managing computer hardware properly and for executing programs. Many subsequent operating systems incorporated Multics principles. Multics was distributed in 1975 to 2000 by Group Bull in Europe , and in the U.S. by Bull HN Information Systems Inc., as successor in interest by change in name only to Honeywell Bull Inc. and Honeywell Information Systems Inc. . ----------------------------------------------------------- Permission to use, copy, modify, and distribute these programs and their documentation for any purpose and without fee is hereby granted,provided that the below copyright notice and historical background appear in all copies and that both the copyright notice and historical background and this permission notice appear in supporting documentation, and that the names of MIT, HIS, BULL or BULL HN not be used in advertising or publicity pertaining to distribution of the programs without specific prior written permission. Copyright 1972 by Massachusetts Institute of Technology and Honeywell Information Systems Inc. Copyright 2006 by BULL HN Information Systems Inc. Copyright 2006 by Bull SAS All Rights Reserved