MULTICS DESIGN DOCUMENT MDD-007 To: MDD Distribution From: Benson I. Margulies Ed Sharpe Date: August 1, 1985 Subject: VTOCE File System Abstract: The management and internal organization of storage system physical disk volumes on Multics. Revisions: REVISION DATE AUTHOR initial 85-06-01 Benson I. Margulies 01 85-08-01 Ed Sharpe _________________________________________________________________ Multics Design Documents are the official design descriptions of the Multics Trusted Computing Base. They are internal documents, which may be released outside of Multics System Development only with the approval of the Director. i MDD-007 VTOCE File System CONTENTS Page Section 1 Overview of the VTOCE File System . . . . 1-1 1.1 Services of the VFS . . . . . . . 1-1 1.2 Databases of the VFS . . . . . . . 1-2 1.3 Programs of the VFS . . . . . . . 1-3 Section 2 Security Issues in the VTOCE file system 2-1 2.1 AIM restrictions on Logical Volume Contents . . . . . . . . . . . . . . 2-1 2.2 System Integrity Policies . . . . 2-1 2.2.1 Address Management . . . . . 2-1 2.2.2 VTOCE/branch consistency checking . . . . . . . . . . . . . 2-2 2.2.3 VTOCE checksums . . . . . . . 2-2 2.2.4 Salvaging and Scavenging . . 2-2 ii VTOCE File System MDD-007 SECTION 1 OVERVIEW OF THE VTOCE FILE SYSTEM The Multics user-visible file system is a hierarchy of directories and segments. Interfaces and maintenance of this hierarchy is provided by the Directory Control subsystem, which described in the published user documentation and the Directory Control SDN. Segment Control and Page Control are subsystems that support the movement of segments between disk volumes and main memory. The VTOC File System (VFS) provides the interface between these subsystems and the individual physical disk volumes. The acronym VTOC represents Volume Table of Contents. Each physical disk volume has a separate VTOC, which is a vector of Volume Table of Contents Entries (VTOCE). Each VTOCE describes the location of all pages of the segment with which it is associated. It also contains some properties of the segment (others are kept in the directory branch). Within the VFS, a segment consists of a VTOCE and zero or more records. In comments and names of programs, though, "VTOCE" is used to refer to a VTOCE and its associated records. _1_._1 _S_E_R_V_I_C_E_S _O_F _T_H_E _V_F_S The VFS provides the following services to other Multics subsystems: ox Initialization of new physical volumes. When a new physical volume is added to a logical volume, the storage on the volume must be divided amongst the label, the VTOC, the paging region (storage used for the contents of segments) and other special partitions. The label and VTOC must be initialized. ox Creation and destruction of segments. When a new segment is created on a physical volume, the VFS 1-1 MDD-007 VTOCE File System must allocate a VTOCE on that volume. When a segment is deleted, the VTOCE must be freed. ox VTOCE update While a segment is active, the file map and associated quantities (records used, etc.) are maintained by segment control and page control in the Active Segment Table (AST). The VFS is called to update this information in the permanent copy of the VTOCE on disk. ox VTOCE get, put, await Directory Control and/or Page control call upon the VFS to read and write the VTOCE. Under some circumstances, system security policy requires that a VTOCE write be completed before an operation proceeds. The VFS provides an interface for this purpose. ox Volume Consistency After crashes without ESD, the per-volume maps of assigned disk records can become inconsistent with the record assignments recorded in the VTOCE's. The volume salvager and scavenger scan the disk and check to be sure that no record is claimed to be free while in use by a VTOCE, and that no record is claimed in use by two different VTOCE's. They also check the other fields of the VTOCE data structure for consistency. _1_._2 _D_A_T_A_B_A_S_E_S _O_F _T_H_E _V_F_S There are three databases with which the VTOC File System is concerned: the VTOC itself; the VTOC stock; and the VTOC buffer. The VTOC is an integral part of the Multics physical volume. It is shown below in context of the physical volume layout. The addresses shown are offsets of 1024 word records on the disk. 0 Label 1-3 Volume Map 4-5 Dumper Map 6 VTOC Map 7 unused 8-n VTOC n+1-N paging region/ partitions The label contains information about the volume such as its name, the name of the logical volume with which it is associated, the access class range of segments on the volume, the location and extent of partitions, and the length of the VTOC. The Volume Map is a bit map of free records on the volume. The Dumper Map 1-2 VTOCE File System MDD-007 is a bit map of segments (VTOCEs) modified since the last volume dumper pass. The VTOC Map is a bit map of free VTOCEs. Any VTOCE indicated as free in the VTOC Map may be used for storage of a new segment. When segments are deleted, the bit in this map corresponding to the segment's VTOCE is set to indicate that it is once again free for use. Each VTOCE contains a table of record addresses for the pages of the segment. This is called the file map. The VTOCE also contains a number of properties of the segment such as the time is was last modified, its current length, and dumper information. Some of the properties are duplicated in the segment's branch (directory entry) to facilitate consistency checking. The most important of these is the segment's unique ID and access class. The VTOC stock is a per-volume list of VTOCE indicies which have been reserved for use. A number of VTOCEs are marked as used in the VTOC map at the time the volume is mounted. VTOCEs for new segments are drawn from this stock. More entries are added as necessary. (An analogous mechanism is used for segment pages). This is part of a mechanism which ensures volume integrity in the event of a crash. If the system crashes without ESD, the only ill effect is to leave all of the VTOCE's so-marked in use whether or not they are in use. The on-line volume scavenger is later used to free those which were not actually in use. The VTOC buffer is an area of memory used for VTOCE I/O. Each entry contains room for the VTOCE itself, properties/status of the vtoce or buffer entry, and information necessary to notify waiting processes. _1_._3 _P_R_O_G_R_A_M_S _O_F _T_H_E _V_F_S The code of the VFS may be divided into the following general areas: ox Physical Volume Initialization Volume initialization is performed via the operator "init_vol" command. The volume must be pre-registered by a system administrator. The registration information is used to fill in the volume label with the physical and logical volume identifiers and the logical volume access class range. The operator must supply other information such as the name and sizes of special partitions and the size of the VTOC. 1-3 MDD-007 VTOCE File System ox VTOCE I/O management I/O of VTOCEs is performed upon request from Segment Control, Page Control, or Directory Control. The programs manage the VTOC buffer and make I/O requests if necessary. Interrupts due to VTOCE I/O completion are transferred to the VFS by Disk Control; notification of any waiting processes is done via the appropriate Traffic Control entrypoints. ox VTOC stock management The VFS maintains the in-memory stock of free VTOCEs. It must be initialized when the volume is mounted, and drained when the volume is demounted. The stock must be replenished as VTOCEs in the stock are allocated for use. ox File System Support VTOCE manipulations corresponding to file system operations are supported by the VFS. Information from the Active Segment Table (AST) must be updated in the VTOCE. Operations which change a segment's length results in requests of the VFS to update VTOCE's to reflect pages added or nulled. ox Support for Salvage, Scavenge, and Rebuild These subsystems provide the mechanisms for recovery after system failures and for general file system maintenance. The volume salvager and scavenger ensure consistancy of physical volumes. (The volume salvager runs during system initialization before volumes are in use, the scavenger cooperates with page control to check volumes during normal service). The hierarchy salvager checks consistancy between directory contents and the VFS. The rebuild facility provides the ability to reallocate areas on a volume (e.g. change VTOC size, change partitions, etc). The actual modules of the VFS are all in the unit bound_vtoc_man. Each module in the VFS contributes in some way to the support of the various subsystems described above. The following paragraphs outline their major functions. vtoc_man provides the lowest level support within the VFS. It has entrypoints for reading and writing VTOCEs, allocating and freeing VTOCEs, and some system cleanup functions. It is the primary manager of the VTOC buffer. vtoc_search provides support for vtoc_man in the maintenance of the VTOC buffer. Specifically, it implements the hashing algorithm used in the allocation and reference of buffer entries. 1-4 VTOCE File System MDD-007 vtoc_stock_man manages the per-volume stock of available VTOCEs. It is called upon by vtoc_man for allocation of new VTOCEs and freeing of old VTOCEs. It adjusts the size of the stock as necessary. vtoc_interrupt is called upon by Disk Control upon VTOCE I/O completion. Its task is to notify any waiting processes of the I/O completion. (This module is actually in bound_wired_1 since it is an interrupt handler). create_vtoce supports segment creation. It is called by the file system primitives append and retv_copy. It in turn uses vtoc_man to accomplish its work. delete_vtoce supports segment deletion. If is called by the file system primitives delentry and append (if some error occurred late in segment creation). priv_delete_vtoce supports privileged deletion of VTOCEs via the hp_delete_vtoce command or the sweep_pv volume maintenance command. update_vtoce updates VTOCE information from the segments ASTE. It is called by the file system during segment deactiviation, by the disk rebuilder, by the volume scavenger, by system initilization, and by force write primitives. truncate_vtoce update the VTOCE file map when a segment is truncated. It calls upon page control to perform the operation on active segments. For inactive segments, it reads and updates the VTOCE via vtoc_man. It is also responsible for reporting quota changes for inactive segment truncations. Its callers are the file system truncate primitive, delte_vtoce, and the hierarchy salvager. 1-5 VTOCE File System MDD-007 SECTION 2 SECURITY ISSUES IN THE VTOCE FILE SYSTEM The VTOCE file system implements only one element of the system's functional security policy: the mandatory (AIM) restrictions on logical volume contents. In addition, the VTOCE file system implements parts of several important system integrity protocols which are essential to system security assurance. _2_._1 _A_I_M _R_E_S_T_R_I_C_T_I_O_N_S _O_N _L_O_G_I_C_A_L _V_O_L_U_M_E _C_O_N_T_E_N_T_S Multics system administrators specify an AIM range that constrains the information that may be stored on a logical volume. This information is stored in the volume registration for the logical volume and in the label of each physical volume. When the logical volume is accepted (mounted and made ready for use), the AIM restrictions are recorded in the Logical Volume Table (LVT) (see the Volume Management SDN). The program create_vtoce.pl1 enforces these restrictions, by comparing the access class from the branch of the new entry to the fields in the LVT Attempts to create a VTOCE ouside the AIM boundaries of the volume are audited. (Note that the multiple_class bit of the branch is not checked. As a result, the only restriction enforced for multi-class segments is that the maximum access class of the segment is within the range. The minimum is not checked. Thus, only the maximum is enforced for multi-class segments.) _2_._2 _S_Y_S_T_E_M _I_N_T_E_G_R_I_T_Y _P_O_L_I_C_I_E_S The VTOCE file system implements all or part of several system integrity protocols. They are briefly described here. _2_._2_._1 _A_d_d_r_e_s_s _M_a_n_a_g_e_m_e_n_t The system guarantees that normal operation, including no-ESD crash, can never result in two segments claiming the same page of data (record), or in a record being associated with the wrong 2-1 MDD-007 VTOCE File System segment. Most of this policy is implemented by Page Control. Page Control depends on the VTOCE file system to write VTOCE's to disk in order to guarantee that the permanent list of records associated with a segment is updated before any records are freed. _2_._2_._2 _V_T_O_C_E_/_b_r_a_n_c_h _c_o_n_s_i_s_t_e_n_c_y _c_h_e_c_k_i_n_g There are several important fields that are duplicated in the branch and the VTOCE, including the access class. Inconsistencies between the branch and the VTOCE cause the segment to be marked security-out-of-service by the program salv_check_vtoce_.pl1. This program is part of the directory salvager, and is invoked only if the directory salvager is running in -check_vtoce mode. _2_._2_._3 _V_T_O_C_E _c_h_e_c_k_s_u_m_s To improve the reliability of the above, the record list in the VTOCE (the file map) is checksummed. Unfortunately, there is a bit that indicates whether a particular VTOCE has a valid checksum. Should that bit be zeroed by disk errors, the rest of the data in the VTOCE will be trusted. The volume and hierarchy salvagers should be run after any hardware errors that may have resulted in data errors on the volume. _2_._2_._4 _S_a_l_v_a_g_i_n_g _a_n_d _S_c_a_v_e_n_g_i_n_g In spite of the reused address policy, hardware errors can result in multiple segments claiming the same page, or a page in use by a segment to be marked free. The volume salvager and scavenger find and correct all such errors. (This may be done by truncating the VTOCEs involved and setting the damaged switch to prevent use of the data until a retrieval is performed). ----------------------------------------------------------- Historical Background This edition of the Multics software materials and documentation is provided and donated to Massachusetts Institute of Technology by Group BULL including BULL HN Information Systems Inc. as a contribution to computer science knowledge. This donation is made also to give evidence of the common contributions of Massachusetts Institute of Technology, Bell Laboratories, General Electric, Honeywell Information Systems Inc., Honeywell BULL Inc., Groupe BULL and BULL HN Information Systems Inc. to the development of this operating system. Multics development was initiated by Massachusetts Institute of Technology Project MAC (1963-1970), renamed the MIT Laboratory for Computer Science and Artificial Intelligence in the mid 1970s, under the leadership of Professor Fernando Jose Corbato. Users consider that Multics provided the best software architecture for managing computer hardware properly and for executing programs. Many subsequent operating systems incorporated Multics principles. Multics was distributed in 1975 to 2000 by Group Bull in Europe , and in the U.S. by Bull HN Information Systems Inc., as successor in interest by change in name only to Honeywell Bull Inc. and Honeywell Information Systems Inc. . ----------------------------------------------------------- Permission to use, copy, modify, and distribute these programs and their documentation for any purpose and without fee is hereby granted,provided that the below copyright notice and historical background appear in all copies and that both the copyright notice and historical background and this permission notice appear in supporting documentation, and that the names of MIT, HIS, BULL or BULL HN not be used in advertising or publicity pertaining to distribution of the programs without specific prior written permission. Copyright 1972 by Massachusetts Institute of Technology and Honeywell Information Systems Inc. Copyright 2006 by BULL HN Information Systems Inc. Copyright 2006 by Bull SAS All Rights Reserved