09/21/87  hardcore 
Known errors in the current release of hardcore.
#	Associated TR's
Description

922  phx20933
Some hardcore modules need to know if the disk is operative.  This is
done by calling disk_control$test_disk or dctl$test_disk.  The module
loops until the IO is complete.  The problems come when the hardware is
broken in such a way that the IO never completes, so pvte.testing is
never reset.  I guess that this is another place where the disk dim
should give up, because it knows that the IO did not complete and that
it is a test type IO.  One of the problems with making disk_control
smarter is that more pages need to be wired in ring 0.

921  phx20930
During a BCE restore, tape record sequence errors are occurring at the
end of the tape.  Sometimes the sequence error shows an actual disk
record missing; other times only the tape record numbers appear to be
in error, with the disk record numbers still in sequence (no gap).

920  phx20152
vacate_pv is setting pvte.pc_vacating and pvte.vacating.  The use of
pvte.vacating is to keep new segments from being created on this pv.
pc_vacating will inhibit any new pages being created on this pv.  The
contract of vacate_pv is only to keep new segments from being created.
Therefore pvte.pc_vacating should not be set in vacate_pv.pl1.

919  phx20908
Another call in disk_queue to code in >udd>m>lib.  The fix will be to
remove the -interpret support from the disk_queue command.  This should
not present any great problems because it could not be working at sites
other than system M.

918  phx20868
The TR claims a 17th level can exist in the hierarchy and problems
exist if the pack is demounted when this segment is active?
deactivate_for_demount.pl1 line 261.

917  phx20922
disk_control will, on certain types of disk errors such as MPC data
alerts, continuously retry the failing IO.  The main complaint is for
bootload_io type at BCE.  This includes such things as the copy_disk,
save and restore commands.  The reason for this is that disk_control
determines that this is a "bad_path" status.  Its job is then to delete
this channel and try another, until all channels save one have been
deleted.  Then it adds them all back and just keeps doing it over and
over again.
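
The retry cycle described above, with a limit added, can be sketched as
follows.  This is a Python illustration under stated assumptions, not
disk_control's logic: try_io, the status strings other than "bad_path",
and max_passes are hypothetical.

```python
def run_io(channels, try_io, max_passes=2):
    """Try an I/O over the channel set, stopping after max_passes cycles.

    try_io(channel) is a hypothetical callable returning "ok",
    "bad_path", or some other error status.
    """
    for _ in range(max_passes):
        usable = list(channels)          # "add them all back"
        while usable:
            status = try_io(usable[0])
            if status == "ok":
                return usable[0]
            if status == "bad_path" and len(usable) > 1:
                usable.pop(0)            # delete this channel, try another
            else:
                break                    # only one channel left: end pass
    return None                          # report the error, do not retry
```

Without the max_passes bound the outer loop is the endless retry the
entry complains about.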

897  phx13424 phx17773 phx17819
Problems with directory quota management/enforcement.

895  
No automatic hierarchy salvage is occurring when "boot rpvs" or "boot
rlvs" is done.

894  phx20661
Linkage error at bce early loading firmware in mpcs.

891  
delete_ calls hcs_$get_segment_ptr_path to determine if a segment is
known in the calling ring (it wants to call term_ only when known
segments are being deleted).  The hcs_ gate target is
initiate_$get_segment_ptr_path, which currently calls
dc_find$obj_initiate to find the object's directory entry.  This can
cause a superfluous GRANT audit message, since $get_segment_ptr_path
only returns a pointer to the segment if it is already known (in any
ring) to the process.  And it can cause a superfluous DENY audit
message, since no operation is performed unless the segment is known.

The fix involves creating a new entrypoint, dc_find$obj_initiate_priv,
which bypasses access checks and auditing, and changing
initiate_$get_segment_ptr_path to call this new entrypoint.

The intent of the fix would be to never audit the operation of
hcs_$get_segment_ptr_path.  This is true even if the caller asks about
a segment known only in a ring other than the caller's ring.  Since the
original audit message included the ring brackets of the segment, it
documents the caller's access to the segment from all rings within
those ring brackets.

890  phx19527
ioa_$ioa_stream prints garbage or blows up when no control string is
given.

887  phx19986
The disk_control$test_drive entry does not wait for an interrupt for
its I/O, but polls the status word.  For FIPS devices or those on a
DAU, this will not work since the status words are not valid until the
interrupt is sent.

885  
The program install_ttt_ does no auditing.

884  
The hcs_$truncate_file entry logs a DENIED message even though other
entries log GRANTED, as the reason the call fails (this operation is
not allowed for a directory) has nothing to do with access control.

882  
It appears that hcs_$make_entry does not null its output argument when
it returns an error code, although the documentation states that it
does.  Since it doesn't modify the output argument at all in this case,
this is not a security problem.

881  
Several problems with hcs_$fs_move_file and hcs_$fs_move_seg.

They return an error code if the caller has rw access to both the
source and destination segments, but null access to the directory in
which they are contained.  The audit messages show various GRANTED and
DENIED fs_obj_prop_read's.  The reason is that the inner ring module
attempts to get the status on the destination to find out its current
length.  Unfortunately it uses an entry in status_ which returns more
information (which requires S on the parent).

Since the entries are considered obsolete, it's not worth fixing this
silly restriction.

Another, more serious problem with hcs_$fs_move_file is that if the
user does not have RW access to the destination, error_table_$no_move
is returned, but no DENIED is logged.  It audits GRANTED read of fs_obj
prop, and GRANTED initiation of FS_obj.  This was in a case where the
user's authorization was greater than the access class of the existing
destination segment, so the process had R effective access to the
segment and S effective access to the containing dir.  This bug should
be fixed, but it requires a new entry into dc_find.

880  
Many filesystem operations consist of a name lookup followed by an
access check.  The way dc_find implements these, an operation which
requires more than S access to a directory can fail (with
error_table_$namedup or error_table_$seg_not_found) and generate no
audit message, even though the caller has insufficient access to
perform the operation.  This occurs when the eventual failure of the
operation can be determined from the name lookup.

879  
The hcs_$tty_get_name returns a channel name for a channel belonging to
a process other than the caller.

877  
None of the entries in the dm_hcs_ gate do any auditing.

876  
Several file system attribute setting operations generate audit
messages which say GRANTED even though the operation is later denied.
This happens when M access is required to the parent and the process
must be in the write bracket of the entry.  Worse, no DENIED audit
message is ever generated.  The entries in question are:  set_$(copysw
volume_dump_switches safety_sw_ptr safety_sw_path synchronized_sw
max_length_ptr max_length_path entry_bound_ptr entry_bound_path)

With the fixing of the bug described in entry 23, the entries
set$(damaged_sw_path damaged_sw_ptr dnzp_sw_path dnzp_ptr) must be
added to the list.

875  
Upgraded directories created under dir privilege are left in a
process's address space after dir privilege is turned off.  The
suspected cause is that the pathname associative memory is not being
flushed when dir privilege is turned off.

This poses no security problems since only a person with privileges
could have gotten into this position.

874  
log_read_$position_time will not find any messages later than the
latest message in the log at the time that the log was opened.
log_read_$position_sequence has the equivalent problem.

872  
There is an ambiguity in the definition of "security auditing" that is
particularly apparent in the case of append.  The ambiguity is this:
some system operations make both security-related and
non-security-related checks.  Either check can fail.  If the security
check passes, but the non-security check fails, it is unclear what the
"correct" security audit message is:  Grant, or Deny?

The ideal implementation would probably be to indicate the exact
situation in the audit message:  that access would have been granted,
but was not.

The current implementation of append (and others) is to audit the
access grant, but later abort the operation if the non-security check
fails.  This is particularly confusing in the case where the requested
multi-class max authorization is above the process authorization or in
the case that the requested authorization is below the containing
directory access class.  This is considered to be a non-security
related failure (no attempt was made to access information or destroy
it) but the error code, ai_restricted, appears security-related.
Nonetheless, the audit is a GRANT.

This behavior should be documented in MDD004 and in the MDD on
Ring 0 Auditing and Logging.

863  phx19695
If Data Management has not yet been used during a bootload, and a fault
while in ring-0 causes verify_lock to be invoked, a ring-0 loop will
result because verify_lock attempts to reference dm_journal_seg without
first checking the switch sst$dm_enabled to determine if data
management has been enabled.

861  phx19582
The entry dc_find$dir_move_quota performs a superfluous and incorrect
AIM check.  It is superfluous because the KST access modes will ensure
there is no writedown path and it is incorrect because the call to
aim_check_ attempts to compare the access class in the directory header
with the access class in the entry for the directory -- both of these
should always be equal.  The check may be safely removed.

858  phx19491
The alarm_clock_meters command is missing its addname, "acm".  The
documentation claims the addname exists.

856  phx19472
ioi_page_table$ptx_to_ptp may return an invalid pointer if the supplied
ptx is invalid.  The verify_ptx internal subroutine causes a non-local
return (via procedure quit) if the ptx is invalid; this will result in
a return to the caller of ioi_page_table$ptx_to_ptp with an invalid
return pointer.

852  phx19433
The check_vtoce dir salvager and the volume retriever can both produce
segments whose security out of service switch is set on.  reset_soos,
however, refuses to work on non-directory segments.

851  phx19285
sys_trouble.alm lacks message documentation for "Fault while in masked
environment"

850  phx16984
Nothing in MDC will replace missing add-names in >lv.  This can cause
various inconsistencies.

849  phx17979
Disk MPC's get confused when individual drives generate many, many,
errors, and begin to report errors for other drives.  This is reported
here to cover the TR and to record it for future reference.

841  phx19270
Because page control will not decrement a quota through zero, this can
invalidate the assumptions made by fix_quota_used with respect to the
constancy of the quota error during operation.

839  phx19254
initiate_ does not distinguish calls from phcs_$initiate's gate target
(ring0_init_$initiate) from calls to hcs_$initiate.  For the former,
attempts to initiate a directory should return error_table_$moderr if
user does not have proper ACL or AIM access to the directory.  For the
latter, it should return the "traditional" error_table_$dirseg, since
directories can never be initiated (via hcs_$initiate) from an outer
ring.

Fixing this may require a change to dc_find$obj_initiate and
$obj_initiate_raw since these entrypoints currently map
error_table_$moderr into error_table_$dirseg.  And the fix may require
separating the entrypoint in initiate_ used by ring 0 modules (eg,
ring0_init_) from that used by hcs_$initiate.

836  phx19180
vtoc buffer allocation and usage can too easily crash the system from
lack of buffers.  A more graceful way to warn about pending doom
appears in the TR, along with a suggestion for avoiding the problem at
ast flush time.

835  phx15923
hc_ipc$send_wakeup should protest if a non-null info pointer is
supplied for a fast channel.

833  phx19071
quota uses error_table_$invalid_qmax for any error.  It should be more
informative.

832  phx19073
You can set maxe as high as max_maxe.  Unfortunately, this is too high
(does not count max stopped stack_0's) and therefore crashes the system
when the system runs out of stack_0's.

831  phx19074
The two calls to range in hc_tune for setting mine are out of order.
As such, attempts to set mine above maxe produce the wrong error
message.

829  phx18779
add_bit_offset_ (and the corresponding addbitno pl1 builtin) do not
properly handle negative bit offsets.  Similarly, add_char_offset_ (and
the corresponding addcharno pl1 builtin function) do not properly
handle negative character offsets.

The failure lies in the abd and a9bd instructions, which assume that
only positive offsets will be used.  These instructions assume that
negative offsets will be handled by negating the offset and using the
sbd or s9bd instruction to subtract from the bit or character
displacement.  The proper solution is to detect negative offsets,
negate the offsets and use the sbd or s9bd instruction.
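
The proposed handling can be sketched in Python as plain arithmetic on
a (word, bit) address.  This models only the arithmetic, not the abd or
sbd instructions themselves; the function names and the use of 36-bit
words for the example are assumptions.

```python
BITS_PER_WORD = 36  # assumed word size for this illustration


def sub_bit_offset(word, bit, offset):
    """Move a (word, bit) address back by a non-negative bit offset."""
    total = word * BITS_PER_WORD + bit - offset
    return divmod(total, BITS_PER_WORD)


def add_bit_offset(word, bit, offset):
    """Move a (word, bit) address by offset bits; offset may be negative."""
    if offset < 0:
        # Mirror the proposed fix: negate the offset and subtract.
        return sub_bit_offset(word, bit, -offset)
    total = word * BITS_PER_WORD + bit + offset
    return divmod(total, BITS_PER_WORD)
```

For example, add_bit_offset(2, 0, -1) yields (1, 35): one bit before
the start of word 2 is the last bit of word 1.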

828  phx15340
terminate_proc should not truncate the ring 0 stack; it should leave it
around for analysis.  terminate_proc needs clean up in general.

827  phx18873
Inner rings should not be allowed to set search rules or working dirs.

825  phx15219
Attempts to type start after a call to sub_err_ with the can't restart
option cause an illegal return.

822  phx18837
make_msf_ copies the IACL from a dir onto the components of an MSF it
creates.  If the IACL does not give the specified user w access to
these components, then copy/move will fail to be able to copy/move the
MSF into the directory.

815  phx18756
Having any AIM privilege on makes RCP think that you are a system_high
process.

810  phx18607
If an SCU's size (as correctly described by its config card) is less
than the port switches on the CPUs (i.e., it is 3M whereas the CPU says
4M, as it must), running ISOLTS (memory tests) in this case can crash
the system with a store fault.

809  phx18517
The system has been known to crash in ioi_masked while processing a
channel time-out.

806  phx18566
Typos in fim, et al, misinterpret the hregs bits associated with parity
faults.

805  phx18565
The history registers for a parity fault that crashes the system do not
appear in the pds.  See the TR for details.

798  phx18352
sct_manager_$get is supposed to return a null pointer for non-set sct
values.  However, it checks for the sct entry being null after
converting the null value (a zero packed pointer) into an unpacked
pointer.  This unpacked pointer is not all zero so sct_manager_'s zero
check fails.  The fix is to check for zero before the pointer
assignment.
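
The ordering problem can be sketched in Python.  The pointer
representations here are hypothetical stand-ins: a packed pointer is an
integer with 0 meaning "not set", and the converted form is a tuple
that is never equal to zero.

```python
PACKED_UNSET = 0  # an unset sct entry, per the description above


def unpack(packed):
    """Hypothetical packed-to-unpacked conversion; result is never 0."""
    return ("ptr", packed)


def sct_get_buggy(table, index):
    ptr = unpack(table[index])
    # Zero check made after conversion: it never fires for unset entries.
    return None if ptr == 0 else ptr


def sct_get_fixed(table, index):
    packed = table[index]
    if packed == PACKED_UNSET:  # check before the pointer assignment
        return None
    return unpack(packed)
```

With table [0, 5], sct_get_fixed returns None for index 0, while
sct_get_buggy returns a non-null value for the same unset entry.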

783  phx09958
The default potential attributes for a resource in the RTDT can be
mistreated when the RTDT is installed.  The symptoms are that the
attributes are shifted in the attributes word, causing all attempts to
access the resource to fail.

775  phx17026
The limit and process_limit fields in the rtdt are ignored.  (Actually,
only the values for the fields in the default_rtdt on the MST are
used.)

765  phx18243
The ring zero derail fault mechanism needs improvement.  In particular,
it should save as much information as other faults (fault_time
especially) so that azm displays this fault in proper order with the
others.

760  phx18185
Calling hcs_$grow_lot makes your lot of max size.  Calling it again
causes a FPE even if you have more room left in the lot.

754  phx17875
It has been experienced, on single physical volume logical volumes,
that when the volume becomes full (and a user encounters the logical
volume full error), deleting segments from the volume does not seem to
reset the logical volume full condition for some number of minutes
afterwards.  This is not well understood.

751  phx17482
msf_manager_ does not understand multiclass msfs.  For such an msf,
msf_manager_ will add new components at the aim level of the dir that
is the msf, rather than at the aim class of the components of the msf.

749  phx17981
ips signals are not correctly masked in mrd_util_.  As a result, it is
possible to hit QUIT or have other conditions which can cause
operations to fail, killing off the daemon in question.  A fix is
known.

744  phx17943 phx18054
status_ won't allow the allocated return structures to be in a
different segment than the segment supplied as the return area (that
is, it doesn't allocate into extensible areas).

742  phx17838
The volume salvager should report page and vtoce bit map
inconsistencies.

735  phx17815
set_mdir_quota correctly sets the quota in the vtoce, but incorrectly
sets the value in the aste, when inferior dirs have terminal quotas.

733  phx17690
If an error is indicated when an i/o completion of a volmap page is
posted, volmap_page does not strip the state away from the page number
producing a bogus error message.

732  phx15640
Hardcore sets damage switches for directories and there is no way for
users to turn them off.  The Salvager should be changed to salvage
directories that have the damage switch set and turn it off once
salvaging is complete.

731  phx17688
Hardcore should validate pds$stacks (validation_level) before using it.

730  phx17662
A second call to delmain to delete a frame previously deleted will
cause the calling process to hang on a bogus page wait event.

723  phx17551
More errors in hdx (not copying args, not terminating segments
correctly).

722  phx17553
More errors in mdx (not copying parameters, not terminating
disk_table_).

721  phx17615
init_disk_pack_ (actually, calling countervalidate_label_) produces an
error message not documented within init_disk_pack_.

720  phx17614
init_disk_pack_ references an unreferenced variable when looking for
the undocumented copy option.

718  phx17552
mdc_status_ does not properly copy all of its args.  For that matter,
it doesn't even compile.

717  phx17597
io_syserr_msg is declared to be three words long, but is overlaid with
a structure which is five words long.

712  phx17186
You will die if another process deletes your working dir.

711  phx16992
A page error uses mc.errcode to encode the relevant information.
Unfortunately, system_startup_ cannot decipher this and crashes the
system (which would have happened probably anyway).

708  phx17416
hcs_$status_mins does not work on the root.

707  phx17413
act_proc uses the wrong value when determining maximum possible access
class.

705  phx17394
A timeout from resetting a channel from a timeout will cause a fault
while in masked environment, crashing the system.

704  phx17374
hcs_$quota_read returns "Some directory in path..." instead of "Entry
not found" when the target does not exist but its parent does.

701  phx17302
hcs_$fs_get_brackets will not return the ring brackets of an inner ring
object.

700  phx17259
attach_lv references the nonexistent error_table_$notacted.

699  phx17257
scavenge_vol refers to the nonexistent error_table_$no_arg.

698  phx17219
disk_rebuild examines too many bits in a vtoce file map when examining
it to see if it is free, when performing volmap compression.  This
sometimes causes the compression to fail.

696  phx17141
The aste/vtoce.dtm fields are examined to set the dbm_map bits used by
the volume dumper when dumping objects.  For directories, these fields
lead to an incorrect interpretation as to whether a directory has been
modified, leading to extraneous directory dumping.

694  phx17132
The volume retriever does not collect enough AIM related information.
To process a retrieval request, it needs to store, in ring 1, the user
auth, and max auth.  Now it only stores the auth, which is
automatically stored by message_segment_.

The volume retriever needs its own gate to ring 1 which will store the
ring, auth, and max auth securely in the message.

693  phx17132
append$retv_append cannot possibly append a multi-class object, since
it only has two of the three quantities

    user auth
    user max auth
    desired object max acc

The structure passed to it needs to be changed.

692  phx17141
The volume dumper examines the wrong field when determining if it
should dump a directory, thus dumping unneeded directories.

691  phx16992
A page_fault_error occurring at the Initializer's ring-1 command level
causes a crash, but the attempt to produce the crash message itself
produces a crash because the ring-1 condition handler cannot interpret
the mc.errcode value.

690  phx15255
The SCU can return the same value for the clock twice.  Some software
uniquification is needed.
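
One common form of software uniquification is to force each returned
value to exceed the previous one.  The Python sketch below illustrates
that idea only; read_hw is a hypothetical stand-in for reading the SCU
clock, and the sketch ignores locking between processors.

```python
class UniqueClock:
    """Wrap a clock reader so successive reads are strictly increasing."""

    def __init__(self, read_hw):
        self._read_hw = read_hw  # hypothetical hardware clock reader
        self._last = None

    def read(self):
        now = self._read_hw()
        if self._last is not None and now <= self._last:
            # Hardware repeated (or went back): bump past the last value.
            now = self._last + 1
        self._last = now
        return now
```

If the hardware returns 100, 100, 100, 105, the wrapper returns 100,
101, 102, 105.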

689  phx14716
When the directory salvager determines that the sons LVID in a
directory header is different from the value in the branch for the
directory, it mindlessly copies the value from the branch into the
directory header.  This has the effect that if the value is wrong in
the branch, it will be wrong everywhere afterwards.

At least, the salvager should check the value to see whether it's zero
(and obviously invalid) before propagating it.

This is a genuine problem, and not already on the hardcore error list.
The particular problem that provoked this report has been fixed
elsewhere, and is no longer relevant, but the general problem remains.

685  phx17055
Various modules, in particular sys_trouble, are missing some error
message documentation.

684  phx15585
A situation (not understood) exists in which the records used exceeds
the current length, preventing further access to the segment.

682  phx15752
core flushing (for pleasure from the as) should not flush pdir segs.
Also, the scheduling of the core flush is not at precise times.

681  phx15833
reclassify_seg should avoid the work if what it is reclassifying is
already at the level it needs to be.

679  phx15852
Both illegal_procedure.pl1 and the documentation suggest that illegal
op_code, illegal addr/modifier and other illegal procedure faults
should be audited.  This third group, however, is not.

678  phx15172
syserr_real should check its error code parameter for non-zero-ness
when producing the message text.

676  phx14420
The ascii_to_ebcdic_ and ebcdic_to_ascii_ tables and routines should
handle the 256 character ebcdic set and map it onto some extended ascii
set.

664  phx17116
The vtoce_checksum implementation is hamstrung by two problems:

  1) the "checksum_valid" flag is quite likely to be turned off by
damage, causing the checksum to be recalculated for invalid data.

  2) part 3 has no checksum, and disk damage quite frequently fries it.

663  phx17010
Hot buffers can fill up vtoc_buffer_seg, crashing the system.  The
retry fix for 662 reduces the problem, but not all of it, since an
authentically broken disk can fill up the segment.

657  phx17050
No gullibility checking, checksumming, or other protection against
damage exists for

    Record 6 -- the vtoc map
    Record 0 -- the label (except for "Multics Storage System Volume")

Damage to these areas can cause widespread disaster, due to confusion
as to the location of the paging region!

We need:

   1) Sentinels on all records of the label
   2) Checksums on all records of the label
   3) A (or multiple) safe-store records that store only permanent
information for recovery from damage to one of the records (like the
vtoc map) that contain both permanent and dynamic information.

656  phx17052
Detaching a device with I/O in progress can cause a fault while in
wired environment due to an uninitialized pointer in the reset_device
entry of the program ioi_masked.

654  phx16046
No re-verification of the label of an offline disk is made when it
comes back online.  As a result, mistakes with patch plugs are
extremely dangerous.  disk_control should not declare a disk back to
life unless the label checks out in some simple fashion.

652  phx16979
The ring 0 portion of the three-ring circus (volume management) is not
protected by a cleanup handler, and can leave pvtes in an inconsistent
state.

647  phx16929
See the TR for a complete exposition of this.  When all 4K aste's have
a page in memory get_aste behaves very badly (very slowly).

643  phx16592
master directory acs checking should use raw access.  Otherwise, it is
impossible to get e access to work right in both ring 4 and ring 1.

642  phx16743
disk_pack.incl.pl1 has the wrong include file listed as the home of the
dumper bit map.

634  phx16905
boundsfault.pl1 does not recognize the case where the bound is less
than the msl but still within the page table size.  This breaks setting
the max length within the page table but larger for active segments,
since the 10.2 performance optimization for set_max_length took out the
setfaults in this case.

628  phx14990
Volume backup to an IO disk does not work with the current
implementation of rdisk_ stream IO.  The current version has no
buffering ability and no sense of logical End of Space (a la EOT on
tape) and physical End of Space on the pack, which is needed to allow
flushing of IO when this (EOT) is detected.

626  phx16692
append$retv_append has a bug wherein it misuses the "max_authorization"
field of the structure.  It should just consider that the max to put in
the multi-class segment max.

There is a companion bug in the retriever (volume) that fills in the
structure wrong to begin with.  The field has to be filled in with the
authorization out of the message segment for the retrieval request.

625  phx16548
When you try to terminate a segment with more than about 250 ref names,
the call aborts with the message "The RNT is in an inconsistent state."

623  phx02779
Because of a problem with accepting a zero buffer size, it has been
found that a returned hardware status that contains channel or central
fault status is being overlooked and assumed to be good.

614  phx16489
ring0_get_ misdeclares code parameters as fixed bin.

613  phx16351
set_bc should not let you set a negative bit count.  (set or change).

611  phx16506
append only checks mountedp when segments are appended, not dirs or
links.  While this may be convenient, it is inconsistent.  The marginal
utility of creating dirs and links on unmounted LV's is outweighed by:

  1) the inconsistency: some operations work, some don't.
  2) for private LV's: the desire to have NOTHING happen to the LV when
     unmounted. Even if your access to attach a private logical volume
     has been taken away, you can still append links and dirs.
  3) If we ever move dirs onto the LV that they describe, this will
     clearly have to have the restriction.
  4) LV aim restrictions cannot be enforced if the LV is not mounted.

605  phx16501
check_mdcs does not salvage quota inconsistencies between master
directories and their registration in the mdcs.  Only register_mdir
does this.  This requires the administrator to run register_mdir over
each mdir on a logical volume to be sure that everything is consistent.

Also, check_mdcs does not validate that a master directory actually has
the correct sons logical volume.

604  phx16500
Master directory control allows up to fixed bin (35) worth of quota for
an entire logical volume, but many fields are only declared fixed bin.
This creates periodic disasters in the control segments.
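
The size mismatch can be illustrated with a small range check in
Python.  It assumes that a bare "fixed bin" declaration defaults to
precision 17 and that fixed bin (p) holds magnitudes up to 2**p - 1;
both are assumptions stated for the example, and fits_fixed_bin is a
hypothetical helper.

```python
DEFAULT_PRECISION = 17  # assumed default for a bare "fixed bin"


def fits_fixed_bin(value, precision=DEFAULT_PRECISION):
    """True if value is representable in fixed bin (precision)."""
    limit = 2 ** precision - 1
    return -limit <= value <= limit


# A volume-wide quota total that is legal as fixed bin (35) ...
total_quota = 200000
assert fits_fixed_bin(total_quota, 35)
# ... does not fit a field declared with the assumed default precision.
assert not fits_fixed_bin(total_quota)
```

Under these assumptions any total above 131071 records is at risk when
it passes through one of the narrower fields.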

603  phx16499
Master directory control was not updated when quota was increased to 18
bits.  This can cause a wide variety of misbehaviors.

593  phx16015
The file system should log or meter invalid quota changes (attempts to
decrement used below 0).
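
A metered decrement might look like the Python sketch below.  It
assumes that an invalid decrement leaves the used count at zero; the
class and the invalid_decrements counter are hypothetical.

```python
class QuotaCell:
    """Quota-used counter that meters attempts to go below zero."""

    def __init__(self, used=0):
        self.used = used
        self.invalid_decrements = 0  # hypothetical meter

    def decrement(self, records):
        if records > self.used:
            # Invalid change: count it rather than silently ignoring it.
            self.invalid_decrements += 1
            self.used = 0
        else:
            self.used -= records
```

A nonzero invalid_decrements value would then flag that the quota
accounting for the directory has drifted.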

592  phx16093
quota_received is not supported very nicely.  The TR complains that it
is not reported by any existing hcs_ entry.  There are other problems,
such as failure of salvagers to correct it and the lack of a way to
forcibly set it.

587  phx15298 phx16005
peruse_crossref bugs:  does not detect LV not mounted; does not
initialize brief_sw; does not print satisfactory message when module is
not referenced.

583  phx15258 phx15275
Invalid iacl terms cause append to fail.  asd_ allows acl terms that
are invalid, like R..*, to be added to an initial acl.  append fails
trying to copy then the assumption that the
entire RVL will be mounted, else you will be doing 1pack recovery (a
risky assumption).

This is a limitation rather than a suggestion since we really ought to
have such a mechanism.

581  phx15044
fim should not save history registers that have just been freshly
cleared by fim_util.

572  phx14942
act_proc$create fails to return the empty APT entry in almost all error
cases.

569  phx14225
Incorrect warning message from scas_init.

568  phx14877
It is impossible to run hc_pf_meters without phcs_ access; metering
gate access should be sufficient.

566  phx14824
sweep_pv (segment_mover actually) cannot move rpv-only segments.  This
makes it difficult or impossible to compress the RPV VTOC.

565  phx14875
When the operator does an x deny (using RCPRM at site) the process
still thinks it has the drive.

561  phx14705
The accept_fs_disk check for partitions overlapping gets confused by
HIGH hardcore partitions.

557  phx14657
ebcdic_to_ascii_ and ascii_to_ebcdic_ should be in the same bound
segment, and not bound in with anything that uses them.  This will
allow people to replace them when reading tapes with nonstandard (or
non-Multics) EBCDIC encoding.

529  phx10098
save_dir_info fails if any of the entries in the dir are connection
failures.

527  phx08068
Strange things are done with the IC for certain faults in the FIM.
Perhaps they should be improved.  In particular, the IC reported in the
machine conditions for dfmp taking underflows is unexpected.

523  phx05319
ioa_ ^( and ^) execute at least once, instead of zero times, when fed
zero things to iterate over.

520  phx14440
page_error displays an erroneous disk address in the error message for
an I/O error on the volume map.  The fix is to ANA -1,dl before saving
the Areg, which contains the disk address in the lower.
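
The effect of the proposed masking can be modeled in Python.  This
assumes that "ANA -1,dl" ANDs the 36-bit A register with a word whose
upper 18 bits are zero and whose lower 18 bits are all ones; the
function name is hypothetical.

```python
LOWER_HALF = 0o777777  # low 18 bits of a 36-bit word


def disk_address(areg):
    """Keep only the lower half of a 36-bit register value."""
    return areg & LOWER_HALF
```

For example, disk_address(0o123456654321) gives 0o654321, discarding
whatever was left in the upper half.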

518  phx14405
print_configuration_deck does not display negative numbers correctly.
It prints them as very large positive numbers.  This is not currently a
problem, since the BOS command parser does not understand negative
numbers completely (and marks them as octal in the config deck).  It
will be a problem when BOS is fixed or superseded.

516  phx14381
copy_out will fail if requested to copy a segment whose length is
larger than 255K.  In this case, it should attempt to set the max
length to 256K via phcs_ (or hcs_$something, when this operation
becomes non-privileged).

514  phx14387
rebuild_disk for the RPV may not copy the root directory correctly.
Specifically, modified pages in memory will not be copied; the earlier
instances on disk will be copied instead.  This may cause a crash
during the subsequent initialization until the root is salvaged
(due to bad_dir_).  The problem is that disk_rebuild (the ring-0 module
which does the rebuild) does not call pc$cleanup for entry-held
segments (indeed, it should not do so in general).  The root directory
is entry-held, and so it goes.

513  phx14276
If a trouble fault occurs at a point where it is not caught by
fim_util$check_fault, the history registers from the trouble fault will
be over-written by those from the subsequent sys_trouble connect.  This
destroys potentially useful diagnostic data.

501  phx14181
There is a window in ring-0 ITT message processing.  If a fault occurs
in that window, ITT entries are lost for the bootload.  Further, they
are lost in a way which disables the logic in pxss which prevents ITT
overflow.  The likely result is a crash in pxss when the system runs
out of ITT entries.

498  phx05686
The time-record product maintained for a directory with a terminal
quota account is only an approximation to an ideal space-time integral
of disk usage.  This approximation is reasonably accurate for accounts
which have stable usage, but it has several anomalies for more volatile
accounts.  The problem is that the cumulative time-record product is
updated only when the directory VTOCE is updated (it is incremented by
the product of the instantaneous quota used and the delta-time since
the last update).  If, for example, a large amount of space was used
and returned in the interval between updates, there is no accounting
for that space.  A visible anomaly results from a further approximation
when get_quota is invoked.  At this time, the time-record product is
reported as the value it would have if the VTOCE were being updated at
that time (although it is not).  For the reasons cited, this can cause
time-record product to decrease with time.  The only reasonable
solution is to maintain time-record product continuously.  This would
not be expensive computationally, but it would require significantly
more wired storage per active segment.
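
The anomalies in entry 498 can be illustrated with a small Python
sketch.  It is not the page control code; the function names and the
sample usage history are hypothetical, and usage is modelled as a
piecewise-constant list of (time, records used) change points.

```python
# Hypothetical sketch for entry 498: sampled vs. continuous
# time-record product (TRP).  usage is a list of (time, records_used)
# change points, sorted by time.

def continuous_trp(usage, end_time):
    """Exact space-time integral of a piecewise-constant usage history."""
    total = 0
    for (t, used), (t_next, _) in zip(usage, usage[1:] + [(end_time, 0)]):
        total += used * (t_next - t)
    return total

def used_at(usage, when):
    """Records used at time `when` (last change point at or before it)."""
    current = 0
    for t, used in usage:
        if t > when:
            break
        current = used
    return current

def sampled_trp(usage, update_times):
    """At each VTOCE update, add (used now) * (time since last update)."""
    total = 0
    last = update_times[0]
    for now in update_times[1:]:
        total += used_at(usage, now) * (now - last)
        last = now
    return total

def reported_trp(stored_trp, usage, last_update, now):
    """Value reported as if the VTOCE were being updated at `now`."""
    return stored_trp + used_at(usage, now) * (now - last_update)

# 100 records are used at t=10 and returned at t=90.
# The VTOCE is updated only at t=0 and t=100.
history = [(0, 0), (10, 100), (90, 0)]
exact = continuous_trp(history, 100)
approx = sampled_trp(history, [0, 100])
report_mid = reported_trp(0, history, 0, 50)
report_late = reported_trp(0, history, 0, 95)
print(exact, approx, report_mid, report_late)
```

In this example the space used between updates is never charged, and
the value reported at t=95 is lower than the value reported at t=50.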

497  phx14069
Most store faults should be recorded into the Syserr Log, as they are
usually indicative of faulty hardware [sic.].  hardware_fault should
filter out store faults in BAR mode, however, as they are caused by
program error.

490  phx13931
Values for select_switch parameters to hcs_$star_XXX entries in
star_structures.incl.pl1 are declared as fixed bin (2) (e.g.,
star_LINKS_ONLY).  They should be fixed bin (3).

487  phx13896
It should be possible to change the size of the AST pools while the
system is running (well, it should be possible to increase them,
anyway).  If the SST is expanded to multiple segments, this could be
done with moderately more work.

486  phx13897 phx14320
A volume which is inoperative cannot be demounted.  There should be a
way to do this, such as abandoning everything associated with the
volume which is in memory (VTOC buffers, ASTEs, pages, etc.)  and
marking it as demounted.  Also, disk I/O error processing should be
smarter about detecting inoperative devices, particularly devices which
appear operative but cannot do I/O without errors.

Note that this is the one case where it is safe to abandon VTOCE
buffers, since nobody will do an await_vtoce afterwards and lose (if
demounting does things in the proper order).  If there are I/O errors
and the volume remains mounted, it is never safe to abandon VTOCE
buffers.

468  phx13716
The various tables used in disk volume management (ring-0, ring-1, and
ring-4) can become inconsistent.  Several instances of this problem
have been corrected.  One which has not shows itself after an "alv"
followed by an "av -all".  The ring-4 copy of the disk table is not
updated after the second command, preventing pdir_volume_manager_ from
knowing that the logical volume is mounted (and hence eligible for
pdirs).

460  phx13544
master directory control can become confused if a master directory has a
subordinate directory with quota.  A set_mdir_quota {plus or minus} X
will cause the page control quota of the master directory to be the same
as the master directory quota.

448  phx12864
KST overflow has strange effects, not readily traceable to this problem.
KST overflow should probably be signalled, rather than indicated by an
error code.

436  phx05497
When signaller.alm pushes a stack frame, it first extends the previous
frame by 48 words to allow for interrupted push operations.  If a
non-local goto is used to transfer control back into that extended stack
frame, it never gets shrunk.  Repeated occurrences of this will
eventually use up the stack.

The fix should be to change signaller.alm to put the new frame 48 words
up the stack without doing an extension of the existing frame.  This
requires hand-coding the push, but that's not too hard.  The alternative
is to try to use a cleanup handler to shrink it, which would be awfully
hard since the cleanup handler would be associated with the frame above,
which would still be on the stack.  It's hard to shrink your caller's
stack frame.
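
The growth described in entry 436 can be modelled with a few lines of
Python.  This is only a sketch of the bookkeeping, counting words in the
interrupted frame; the function names are hypothetical and nothing here
reflects the actual code in signaller.alm.

```python
EXTENSION = 48  # words, per entry 436

def current_behavior(frame_size, signals):
    """Each signal extends the interrupted frame by 48 words.  A non-local
    goto back into that frame pops the handler frame but leaves the
    extension in place."""
    for _ in range(signals):
        frame_size += EXTENSION
    return frame_size

def proposed_behavior(frame_size, signals):
    """Proposed fix: the new frame is placed 48 words up the stack and the
    interrupted frame itself is never extended."""
    for _ in range(signals):
        handler_base = frame_size + EXTENSION  # gap above the old frame
        del handler_base                       # gone after the goto
    return frame_size

leaked = current_behavior(100, 1000)
stable = proposed_behavior(100, 1000)
print(leaked, stable)
```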

429  phx12689
When cpt is invoked with the -lg control argument, it does not print
full pathnames in the summary report.  It does, however, print full
pathnames in the detailed trace file if -trace is also specified.

410  phx12355
Attempted logins to ring-6 or ring-7 fail, since makestack requires
non-null effective access (at the validation level of the initial ring)
to signal_, unwinder_, operator_pointers_, and pl1_operators_.  These
have ring brackets of 0,5,5.  The general solution is not clear.  Rings
6 and 7 are supposed to be available for totally encapsulated
subsystems, with only facilities provided explicitly by the subsystem
available.  The difficulty is to balance this against the need to
provide a rudimentary environment to initialize the subsystem.

409  phx12251
A more compact method of logging I/O errors is needed.  Currently, each
I/O error is logged into the syserr log.  This can flood the log with
largely meaningless I/O error messages (for example, when reading a tape
of marginal quality).  One approach is to write summary records,
periodically (based on time or on error thresholds), and optionally
record detailed messages.
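
One possible shape for the summary approach in entry 409 is sketched
below in Python.  The class, its parameters, and the device name are
all hypothetical; the real syserr interfaces are not modelled.

```python
from collections import Counter

class ErrorSummarizer:
    """Hypothetical sketch: count I/O errors per device and write
    summary records instead of one log message per error."""

    def __init__(self, threshold, interval, detailed=False):
        self.threshold = threshold  # flush when a device reaches this count
        self.interval = interval    # flush when this much time has passed
        self.detailed = detailed    # optionally keep every message too
        self.pending = Counter()
        self.last_flush = 0
        self.log = []

    def record(self, now, device, detail):
        if self.detailed:
            self.log.append((now, device, detail))
        self.pending[device] += 1
        threshold_hit = self.pending[device] >= self.threshold
        interval_hit = now - self.last_flush >= self.interval
        if threshold_hit or interval_hit:
            self.flush(now)

    def flush(self, now):
        for device, count in sorted(self.pending.items()):
            message = f"{count} I/O errors since last summary"
            self.log.append((now, device, message))
        self.pending.clear()
        self.last_flush = now

# Seven errors on a marginal tape produce two summary records.
summarizer = ErrorSummarizer(threshold=3, interval=60)
for tick in range(1, 8):
    summarizer.record(tick, "tape_01", "data alert")
print(summarizer.log)
```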

407  phx12250
Deletion of a segment with wired pages causes the segment not to be
deleted, left active, with PTWs for the wired pages having nulled
addresses and wired bits on.  Under some circumstances, this can cause a
system crash.  This situation can be caused by a user wiring pages
(through hphcs_).  This can also happen if a process terminates with an
active ioi buffer.

399  phx12134
append$retv should validate the entry supplied more carefully.  An
instance of the problem is that the cross-retrieval of an
object with multiple names will contain a non-null forward name thread
in the primary name field.

393  phx12070 phx10495
Segments should be created with access of r to *.SysDaemon, rather than
rw.

383  
There should be a system-maintained database which keeps track of recent
crash history, and types of shutdowns.  Possibly it could be as simple
as logging, at bootload, the time and type of the last shutdown.  The
syserr log is probably robust enough, and can easily be scanned to find
the information.

382  phx04847
fix_quota_used should also adjust TRP totals in accordance with the
adjustment being applied to quota used and the length of time since the
last ESD failure crash.  This should be automatically driven from the
last crash info, and be manually overridable if necessary.

378  phx12013
setfaults should have a recovery strategy for page_fault_errors on a
target dseg; probably it should kill the other process, rather than
crashing the system with a crawlout with AST lock set.

376  phx12003
trace_mc should use a hardcore segment for the buffer, to avoid problems
with recursive faults caused by flushing trailers or dseg ptw misses.

364  phx01612
The iocb structure in iocb.incl.pl1 contains an implicit word of padding
between iocb.name and iocb.actual_iocb_ptr, which should be explicitly
declared as pad.

362  phx11904
verify_lock should check all ring-0 locks which could be held on
call-side.  It should not allow a process to crawlout with any ring-0
lock held.  For some locks detected by verify_lock, the system should be
crashed immediately; for others (vtoc buffer lock), some recovery is
possible.

360  phx11870
On a multi-process salvage, one of the processes may take an unexpected
error (page_fault_error, for example).  This will cause the process to
go to a new command level and wait for terminal input.  Eventually, all
other processes will hang (blocked) waiting for this process to respond
to the dispatch wakeup.  The solution is probably for do_subtree to
establish an any_other handler and do something appropriate on
unexpected signals.

357  phx11839
The supervisor should take more pains to ensure that a setfaults
operation is performed on segments dynamically marked as damaged, either
when the damage is detected, or soon thereafter.

356  phx10004
The primitive for setting the damaged switch should perform a setfaults
operation, since it operates in a better environment than page control
does when doing so, and it is desirable to provide damage notification
as quickly as possible to other processes.

352  phx11831
If a directory hash table overflows while the directory is being rebuilt
by salv_dir_checker_, some names on the entry which caused the overflow
may not be hashed in correctly.  This is because the special-case code
to keep hash from faulting on the partially rebuilt directory does not
ensure that all the names already processed are rehashed.

306  phx11600
The entry structure (dir_entry.incl.pl1) is misdeclared; the structure
takes only 37 words, despite the comment claiming that it takes 38.
This seems to be benign, but should be rectified.

305  phx11593
Although there are hcs_ entries to set it, the DNZP switch is not
reported by any status_ entrypoints.

303  phx11555 phx06112 phx04846
The quota salvager should correct inconsistencies in quota allocated and
quota received fields, as well as quota used.  There is presently no way
to repair these fields other than BOS PATCH.

300  phx11553
Damage to >lv and >disk_table_ should be detected and acted upon
automatically at bootload, rather than requiring use of BOOT NOLV and
NODT.

272  phx11009
traffic_control_queue should never be reporting a negative value for
tssc.  It does so because the snap of the APTEs consumes non-negligible
time (due to paging) with no locks held.  A fix is to read the current
time immediately after copying out the APTEs.
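
The ordering problem in entry 272 can be shown with a toy model in
Python.  FakeSystem and the idea of a per-process state change time are
assumptions made for illustration only; the sketch just shows why
reading the clock before the copy can yield a negative difference.

```python
class FakeSystem:
    """Hypothetical stand-in for the clock and the APTEs."""

    def __init__(self):
        self.time = 100
        self.state_change_times = [40, 90]

    def read_clock(self):
        return self.time

    def snapshot_aptes(self):
        # The copy takes 5 ticks (paging) with no locks held; one
        # process changes state 3 ticks into the copy.
        self.state_change_times[1] = self.time + 3
        self.time += 5
        return list(self.state_change_times)

def tssc_values(system, clock_first):
    """Return (now - state change time) for each copied APTE."""
    if clock_first:
        now = system.read_clock()
        aptes = system.snapshot_aptes()
    else:
        aptes = system.snapshot_aptes()
        now = system.read_clock()
    return [now - changed for changed in aptes]

current_order = tssc_values(FakeSystem(), clock_first=True)
fixed_order = tssc_values(FakeSystem(), clock_first=False)
print(current_order, fixed_order)
```

With the clock read first, the second value is negative; with the clock
read after the copy, no value can be negative in this model.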

260  phx10996
A volume administrator can adjust the quota on a master directory of
which he is not the owner, if he has sma access.  This use charges the
quota account of the Initializer, which is clearly bogus.

239  phx10114
Although the salvager can set the security-out-of-service bit for
segment branches as well as directories, the privileged gate entry to
reset the switch works only on directories.  It should work on segments
as well.

229  phx09675
There should be a mechanism for establishing hardcore crash handlers
which would be executed by sys_trouble before crashing the system, so
that (for instance) the IMPDIM could shut itself down, by establishing a
handler to send a going-down connect to the IMP.

223  phx09383
Attempting to add a memory which is already online causes an OOB fault
in reconfigure (line 193) because it fumbles one of the error codes.

222  phx09341
The error message for incorrect access should be specific about the type
of access which the process lacks:  ACL, ring bracket, or AIM.
Presently, some primitives distinguish between ring bracket and ACL
violations, and others do not.  AIM violations would have to be detected
specially; there is no error code for this today.  See also entries 78
and 157.

219  phx09240 phx11009
system_performance_graph cannot properly represent more than 100
logged-in users.  It should use a different scale, or wrap around.

217  phx09162
When walking the AST to demount a volume, demount_pv gives up upon
encountering very minor anomalies, causing ESD to fail completely when
it should have almost succeeded.  It needs a better way of walking the
AST, to eliminate the "demount_pv:  AST out of sync" message.  The AST
pools should be described by pointers and counts kept in the SST, rather
than just by count.

215  phx09082 phx12302
Checking of CPUs which are being added should be both more complete and
more flexible.  Proper settings for both cache and associative memories
should be checked.  It should also be possible for a site to over-ride
these checks (by arguments to add_cpu).

214  phx09047
There should be a DRL instruction at the beginning of page_fault, so
that history registers would be saved if a wild transfer occurred.

213  phx08965
There should be more state recorded in the PVT when a volume cannot be
accessed, such as the real fsdisk error code, rather than just
pvte.device_inoperative.  This lack causes add_vol to be unable to
distinguish between "drive in protect" and "drive offline".

212  phx08963
The check_trailers procedure can only be enabled by recompilation.  It
should be possible to simply patch something.

211  phx10123
Messages from hardcore (disk_control, get_aste, hc_dmpr_primitives,
etc.)  should include the physical volume name where appropriate.  This
must be preceded by putting the name into ring zero.  (see entry 210)

210  phx11769 phx08952
The ring one volume management tables should be direct copies of the
ring zero PVT and LVT, which should be changed to include all the
information (names and special flags) now only in the disk_table.  This
is the only real way to fix the problems due to inconsistencies between
these databases.

203  phx11765
hcs_$fs_get_mode always returns the 4 bit set in directory modes.  It
should leave this bit off, like hcs_$get_user_effmode.

199  phx11761
The ioa controls ^e and ^f have difficulty formatting integers.  For
instance, ^.2f gives completely inappropriate results when given
1234567, though it does fine with 1234567.12

193  phx08451 phx11705
There should be special entries to status_ for the primary name, the
link path, and the list of names.  The existing status_ interfaces are
seriously defective here (see entry 192).  See phx11705 for interface
details.

189  phx08286
There should be a way to turn on the audit flag in the branch.  A
primitive mechanism, but better than nothing.  Now that the audit flag
does nothing, this will become a limitation until a proper per branch
audit mechanism is created.

188  phx08284
The privileged quota-setting primitives should log a message when used,
to aid in keeping track of the operations.

187  phx08076
When a process running ISOLTS is terminated abnormally, the CPU and
memory it was using for the test are not released.  This happens despite
the code in deact_proc which appears to do just that.

186  phx08263 phx03859 phx06694
There should be a way to interrupt the Initializer process, "no matter
what".  Perhaps a tiny debugging environment entered on receipt of an
execute fault.

184  phx10589
The MPC error counters should be read out and stored in the syserr log
when a pack is mounted or dismounted; this would make it much easier to
keep track of per-drive error histories.

183  phx07983 phx11700
The system should perform probabilistic verification of disk writes,
checking some small fraction of them for success.  The fraction would be
increased if errors occurred, decreased as the drive was seen to
operate, and be manually tunable, as well.
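
A minimal Python sketch of the adaptive scheme in entry 183 follows.
The class name, the starting fraction, the limits, and the adjustment
factors (multiply by 4 on error, by 0.9 on success) are all arbitrary
assumptions chosen for illustration.

```python
import random

class WriteVerifier:
    """Hypothetical sketch of adaptive write verification for one drive."""

    def __init__(self, fraction=0.01, low=0.001, high=1.0, rng=None):
        self.low = low
        self.high = high
        self.fraction = fraction
        self.rng = rng or random.Random()

    def should_verify(self):
        """Decide whether the write just issued is read back and checked."""
        return self.rng.random() < self.fraction

    def report(self, ok):
        """Raise the fraction on an error; decay it slowly on success."""
        if ok:
            self.fraction = max(self.low, self.fraction * 0.9)
        else:
            self.fraction = min(self.high, self.fraction * 4)

    def tune(self, fraction):
        """Manual override, clamped to the allowed range."""
        self.fraction = min(self.high, max(self.low, fraction))

verifier = WriteVerifier(fraction=0.01, rng=random.Random(1))
verifier.report(ok=False)
after_error = verifier.fraction
for _ in range(100):
    verifier.report(ok=True)
after_recovery = verifier.fraction
print(after_error, after_recovery)
```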

181  phx08237
There should be a way to change the time zone (CLOK card and sys_info
correction constant) while the system is running.

179  phx07814
verify_lock will recurse, faulting, if it tries to unlock a directory
which is no longer accessible due to seg_fault_error or page_fault_error
problems.  It should have condition handlers for this.

176  phx07711
The traffic_control_queue command should display the states of all the
interesting APTE flags; pre_empt_pending, in particular.

170  phx06979
The system should further analyze the MOS EDAC error messages to the
extent that it determines which pages in the SCU are affected by the
error, so that the pages can be removed, either manually or
automatically.  This will also save syserr log space.

167  phx06374
When a hardware fault occurs as a result of an Illegal Action from an
SCU, software should unlock the SCU history registers on that SCU, to
allow data from a fault which crashes the system (later) to be retained.
Unfortunately, it is not possible for software to read these registers.

166  phx06326
The hp_delete command tries to set some AIM flags in the directory it is
trying to delete.  This will not work if the directory is
connection-failed.  Since initiate was changed to activate directories
immediately, this problem is masked, but hp_delete shouldn't do this
anyway.

164  phx04854 phx05954
The UID generator and pxss should check the difference between the last
clock reading and the current one periodically, and crash the system if
it is too large.  This situation arises when a clock makes a sudden
jump, and could otherwise seriously damage the file system.
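
The periodic check in entry 164 might look like the following Python
sketch.  The names are hypothetical, an exception stands in for
crashing the system, and treating a backward jump the same as a forward
jump is an assumption not stated in the entry.

```python
class ClockJump(Exception):
    """Raised where the real system would crash."""

class ClockMonitor:
    """Hypothetical sketch: compare each clock reading with the last."""

    def __init__(self, first_reading, max_delta):
        self.last = first_reading
        self.max_delta = max_delta

    def check(self, reading):
        delta = reading - self.last
        if abs(delta) > self.max_delta:
            raise ClockJump(f"clock moved by {delta}")
        self.last = reading
        return delta

monitor = ClockMonitor(first_reading=1000, max_delta=60)
normal_delta = monitor.check(1030)
try:
    monitor.check(999999)
    jump_detected = False
except ClockJump:
    jump_detected = True
print(normal_delta, jump_detected)
```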

163  phx04854 phx05954
Dates in VTOCEs and directories should be corrected by the volume and
hierarchy salvagers.  Dates in the future should be set to the current
time, and dates from before NSS should be set to some early date.  This
situation can arise either from damage, or because the clock was
incorrectly set.  UIDs should also be checked for validity, and reset to
new UIDs (from getuid) if they fall outside the range of acceptable
times.
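
The date rule in entry 163 amounts to a clamp, sketched here in Python.
The function and parameter names are hypothetical; in particular, the
cutoff for "before NSS" and the replacement "early date" are left as
parameters because the entry does not give values for them.

```python
def corrected_date(date, now, earliest_valid, early_replacement):
    """Return the date a salvager would store under the rule in entry 163."""
    if date > now:
        return now                # dates in the future become "now"
    if date < earliest_valid:
        return early_replacement  # dates before the cutoff become early
    return date                   # plausible dates are left alone

print(corrected_date(500, now=100, earliest_valid=10, early_replacement=10))
```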

161  phx07238
The system should make some attempt to determine whether all the
configured IOMs can access a memory module being added.  This is
probably difficult to do, since it would have to be done by experiment,
which might prove disastrous if the IOM configuration panel were not
set properly.

157  phx06101
When attempting to append an entry, if the append cannot be performed
because of containing directory ring brackets, the error message should
be Validation level not in ring bracket, rather than Incorrect access to
directory containing entry.

155  phx06075
When a name on a branch is changed, it should be changed in place, so it
remains in the same place in the list of names, rather than behave as if
it had been deleted and added back.

145  phx03708
The attach_lv command should accept -a as well as -all.

142  phx03109
The FIM should distinguish (via different error codes for termination)
between an out-of-bounds on the ring zero stack and one on an outer ring
stack, to aid in identifying situations which cause this particular ring
zero error condition.

139  phx07240
When there is bad parity in memory, the resulting error messages are
very verbose.  Especially at ESD time, they should simply be flushed.
This requires more specific info about the messages in question to solve
well.

137  phx08082
The reclassify_sys_seg primitive doesn't work when system_high equals
system_low, because it requires that the segment end up with an access
class greater than that of the containing directory.  This is a
limitation derived from the implementation of multi-class segments,
which are required by various modules of directory control to really be
multi-class.

135  phx07543
When a directory is deleted from another process, strange things happen
when it is referenced.  Most often, lock takes a fault trying to look at
the UID.  Perhaps it should have a handler for that condition.

130  phx05245
It is possible for a user's virtual CPU time to become very inaccurate as
the result of a large number of faults, because of the adjustment which
must be applied to compensate for fault processing time.  There is no
real way to fix this.

121  
A crawlout may leave a directory initiated which really should be
terminated, cluttering the KST.

119  
Reference names for inner ring segments can be made available to outer
ring programs; a violation of security.  Not well understood.

118  
copy_on_write makes the copy unencachable until the next setfaults
restores access.  Not well understood.

114  
The messages in the syserr log describing page control errors are
truncated when printed.  This appears to be a problem in the printing
routines, rather than in page_error or the log itself.

108  phx04071 phx04955
The cleanup handler in an absentee job is never executed if the absentee
terminates by a call to cu_$cl.  This mechanism should be considerably
more robust.

102  phx03345 phx09268
The fim does not properly handle EIS decimal overflows and underflows,
in that it does not respect the values to be reset, and also does not
reset the IC properly.

95  phx03943
The machine conditions resulting from inability to add a processor
should be saved somewhere for later analysis.  Presently they are just
discarded by init_processor.

80  phx03232
The write_limit is reset at each memory reconfiguration, resulting in
the PARM WLIM value apparently being ignored if reconfigurations occur.
Should fix it by having reconfiguration not reset it.

77  phx11596
The error code from hcs_$fs_move_xxx is not specific enough, partly due
to the lack of a corresponding source/target switch.

73  
Pathnames can be much longer than 168 characters (max is 16*32+1, 513).
This causes problems for all the interfaces which use the standard char
(168) declarations.  Fortunately, find_ can handle it, but many user
ring programs behave inconsistently.  The solution is not easy.

69  phx03152
The initializer can "find" directories by its linker search rules, due
to the special-casing in access_mode$effective.  This leads to
surprising, though harmless, behaviour.

68  phx11588
The structure for hcs_$create_branch_ has not kept up to date with file
system changes, and no longer contains all the values which might want
to be set when a branch is created.  It should be upgraded whenever the
file system is changed.

65  
The SST, limited in size to but one segment, cannot be made large
enough to optimally support the largest configurations available today,
and this situation can only get worse.  The fix is to split it up into
several tables, possibly using more than one segment for the AST
itself.  This is very hard.  83-01-18:  well, try this.  Get a pointer
register back by changing all references to sst|foo to sst$foo (use
pr4, that is).  Now, make a wired table of packed pointers to astes.
Interpret the aste threads as indexes into this table.  This costs only
1 word per aste, as opposed to changing all 6 threads to packed
pointers (3 words).  It should just be grunt work to implement.
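
The indexing scheme suggested in the 83-01-18 note can be sketched in
Python as follows.  Everything here is illustrative: the segment names,
the single forward thread, and the use of index 0 as a null thread are
assumptions, and Python tuples stand in for packed pointers.

```python
from dataclasses import dataclass

@dataclass
class ASTE:
    name: str
    fp: int = 0  # forward thread: index into aste_table, 0 means null

# Hypothetical layout: ASTEs live in more than one segment, and a
# wired table of (segment, offset) entries locates each of them.
segments = {"sst_1": [ASTE("a"), ASTE("b")], "sst_2": [ASTE("c")]}
aste_table = [None, ("sst_1", 0), ("sst_2", 0), ("sst_1", 1)]

def deref(index):
    """Follow a thread index through the table to the ASTE itself."""
    segment, offset = aste_table[index]
    return segments[segment][offset]

def walk(head):
    """Collect ASTE names by following forward threads from head."""
    names = []
    index = head
    while index != 0:
        aste = deref(index)
        names.append(aste.name)
        index = aste.fp
    return names

# Thread a -> c -> b across both segments using table indexes only.
deref(1).fp = 2
deref(2).fp = 3
threaded = walk(1)
print(threaded)
```

The thread fields stay the same size; only their interpretation
changes, at the cost of one table entry per ASTE.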

60  
There is no general mechanism for determining how many pages should be
wired by pmut$wire_and_mask, since error cases (calls to syserr, mainly)
may use up a large amount of stack space not normally required.  This
has been partially fixed by changing syserr to run on the PRDS when
called masked.

53  phx01533 phx01978
ESD will fail if an MPC is broken.  Multics should be more robust about
dealing with bad hardware, and delete the devices more rapidly.

32  
Many system meters overflow when the system stays up for a long time.
This causes faults in the idle process, and in various places in ring
zero.  This is a catch-all error list entry, to be reserved for the
general solution if we ever invent one.  Other specific entries address
specific instances of the problem.

22  phx02203
The quota moving primitives sometimes fail to adjust things properly
when working on active directories.  More details are not known at this
time.

19  
If the HC partition on the RPV is not large enough, it may not be
possible to boot with a partial RLV.

11  
A bad error message is provided if process initialization fails; for
instance, if the user has incorrect access to the process overseer.
This is possibly an answering service problem, actually.

10  
The linker and the fim look at instructions in the object segment
itself, rather than in the SCU data.  This is just one more reason why
execute-only code does not work.

9  
The system loops or otherwise misbehaves when the permanent syserr log
is damaged.  (>sc1>perm_syserr_log) This is partly a vfile_ problem in
dealing with trashed keyed vfiles.  Should fix syserr_log_man_ to be
better about dealing with problems in >sc1>perm_syserr_log.  If it has
difficulty, it should rename the old one and create a new one, rather
than simply giving up and not copying the partition.


                                          -----------------------------------------------------------


Historical Background

This edition of the Multics software materials and documentation is provided and donated
to Massachusetts Institute of Technology by Group BULL including BULL HN Information Systems Inc. 
as a contribution to computer science knowledge.  
This donation is made also to give evidence of the common contributions of Massachusetts Institute of Technology,
Bell Laboratories, General Electric, Honeywell Information Systems Inc., Honeywell BULL Inc., Groupe BULL
and BULL HN Information Systems Inc. to the development of this operating system. 
Multics development was initiated by Massachusetts Institute of Technology Project MAC (1963-1970),
renamed the MIT Laboratory for Computer Science and Artificial Intelligence in the mid 1970s, under the leadership
of Professor Fernando Jose Corbato. Users consider that Multics provided the best software architecture 
for managing computer hardware properly and for executing programs. Many subsequent operating systems 
incorporated Multics principles.
Multics was distributed in 1975 to 2000 by Group Bull in Europe, and in the U.S. by Bull HN Information Systems Inc.,
as successor in interest by change in name only to Honeywell Bull Inc. and Honeywell Information Systems Inc.

                                          -----------------------------------------------------------

Permission to use, copy, modify, and distribute these programs and their documentation for any purpose and without
fee is hereby granted,provided that the below copyright notice and historical background appear in all copies
and that both the copyright notice and historical background and this permission notice appear in supporting
documentation, and that the names of MIT, HIS, BULL or BULL HN not be used in advertising or publicity pertaining
to distribution of the programs without specific prior written permission.
    Copyright 1972 by Massachusetts Institute of Technology and Honeywell Information Systems Inc.
    Copyright 2006 by BULL HN Information Systems Inc.
    Copyright 2006 by Bull SAS
    All Rights Reserved