UOR_5.4.4

5.4.4 System Performance Measures (Infinite Line Capacity)

Now that we know how to obtain the steady-state probabilities of the hypercube model, it is natural to inquire how to use these probabilities to obtain values of useful system performance measures.

Workloads. We can immediately obtain the workloads of the individual servers. The workload _n of server n, which is the fraction of time that server n is busy, is equal to the sum of all steady-state probabilities having the state of server n equal to 1 (rather than 0) plus the fraction of time that a queue exists (during which time all servers are working). Thus, for our three-server example,

₁ = P₀₀₁ +P₁₀₁ +P₀₁₁ +P₁₁₁ + P_Q = 0.5574	(5.30a)
₁ = P₀₁₀ +P₁₁₀ +P₀₁₁ +P₁₁₁ + P_Q = 0.4734	(5.30b)
₁ = P₁₀₀ +P₁₁₀ +P₁₀₁ +P₁₁₁ + P_Q = 0.4693	(5.30c)

These results check (to four significant figures) with the requirement that the average workload = /3 = 0.5. Note that the workload sharing among response units caused the workloads of the units to be more evenly distributed than the workloads of the primary response areas; if each unit served only the customers of its own response area, the workloads would have been ₁ = 0.75, ₂ = 0.35, ₃ = 0.40. In fact, it is possible for a particular primary response area to generate more work than one unit could handle, and workload sharing would facilitate the overflow (assuming, of course, that the total system is not saturated, which in this case would require that the sum of the _i's be less than 3).

Interatom dispatch frequencies. For virtually all non-workload-oriented performance measures it is necessary to compute

f_nj fraction of all dispatches that send unit n to geographical atom j(f_nj = 1)

Let

E_nj set of states in which unit n is to be assigned any service request from atom j (assigning any ties arbitrarily)

For instance, in our three-server example, E₂₁ = {001, 101} and E₁₁ = {000, 010, 100, 110}. If the system is in any state in the set E_nj, then the fraction of service requests that result in the dispatch of unit n to atom j is _j/. So, excluding delays in queue, the fraction of all dispatches that send unit n to atom j is (_j/), multiplied by the sum of probabilities of states in E_nj. However, unit n can also be dispatched to atom j from a queue of waiting service requests. Thus, it is convenient to write

f_nj = f_nj^[1] + f_nj^[2]

(5.31)

As argued above, for nonsaturated states we have

where	f_nj^[1] = fraction of all dispatches that send unit n to atom j and incur no queue delay
	f_nj^[2] = fraction of all dispatches that send unit n to atom j and incur a positive queue delay

f_nj^[1] =

P_{B_i}

(5.32)

The term f_nj^[2] is equal to the product of three terms: (1) the probability that a randomly arriving service request incurs a queue delay; (2) the conditional probability that the request originated from atom j, given it incurs a queue delay; and (3) the conditional probability that the request results in the dispatch of unit n, given that it originates from atom j and incurs a queue delay. Clearly, a queue delay will be incurred by any request arriving while all N response units are busy, and thus the first term is

P'_Q

P_Q + P_{B_2^N
-1}

(5.33)

The second term is equal to the fraction of calls (_j / ) that are generated from atom j, and is not dependent on the fact that a queue exists. To obtain the third term we use the fact the queued calls are handled in a FCFS manner, or in some other manner that ignores the location of the service request and the location of the responding unit (at the scene of the previous service request); thus, any of the N busy units is equally likely to be assigned to any particular queued request, yielding a conditional probability of 1 / N. Summarizing, we have

f_nj^[2] =

(5.34)

and thus

f_nj =

P_{B_i} +

(5.35)

Given that we now know how to calculate the f_nj's from the state probabilities, we can immediately obtain several related performance measures of interest:

Fraction of total dispatches that are interresponse area

f_I = f_nj (5.36)
Fraction of dispatches of unit n that are interresponse area

f_{I_n} = (5.37)
Fraction of response area i requests that require other than unit i

f'_{I_i} = (5.38)

We illustrate each of these computations with our three-server example. First computing the additive term associated with queued requests, we note that P'_Q = 0.23684 and thus f_nj^[2] = _j(0.052631). Using this fact, we obtain for f₁₁, for instance,

f₁₁	= (P₀₀₀ + P₀₁₀ + P₁₀₀ + P₁₁₀) + ₁(0.052631)
	= 0.25[0.6667(0.4426) + 0.052631]
	= 0.08692

In other words, about 8.7 percent of all service requests are the type that result in unit 1 being sent to atom 1. The full set of f_nj's is displayed in Table 5-5.

Employing (5.36), the fraction of responses that are interresponse area is 0.43529. This figure checks with our intuition which might argue, roughly, that the fraction of intraresponse area responses would be equal to the average availability (50 percent) plus one-third the probability of incurring a queue delay (P'_Q = 0.23684), yielding a fraction of intraresponse area responses equal to 0.57895, or a fraction of interresponse area responses equal to 0.42105. The actual figure of 0.435 is larger than that conjectured due to workload imbalances, as we will argue in Section 5.6.

Employing (5.37), the fraction of dispatches of unit 1 that are interresponse area is 0. 11077/0.3715 = 0.2982.

Employing (5.38), the fraction of responses into response area 1 that are interresponse area responses is 0.23919/0.500 = 0.4784.

Thus, we see that the unit-specific frequency of interresponse area responses can be significantly different from the response-area-specific frequency. In this case, fully (1.0 - 0.4784) x 100 percent = 52.16 percent of response area 1's workload is handled by unit 1, and thus 47.84 percent is handled by units 2 and 3. But (1.0 - 0.2982) x 100 percent = 70.18 percent of unit 1's workload is within response area 1.

Travel times. Travel time is a central performance measure computed by the hypercube model. All travel times are computed from a travel-time matrix whose generic element is

, the mean travel time from atom i to atom j. The numerical values of the

_ij's may reflect complications to travel such as

one-way streets, barriers, traffic conditions, and so on. Thus, the time required to travel from i to j may not be the same as the time to travel from j to i and thus we allow _ij _ji. If no matrix of _ij's is obtainable empirically for the period of time under study, then by specifying the centroid (_i, _i) of each atom, one can approximate _ij to be (|_i - _j| + |_i - _j|)/, where is the effective response speed. One might wish to selectively override this equation, particularly for the case i = j, in which case a simple area square-root law (see Chapter 3) might be used to estimate mean intraatom travel time. For our 3-unit example, for simplicity we have set = 1 unit of distance per minute and have used the right-angle distance metric (without overrides) to obtain the travel-time matrix shown in Table 5-6.

To compute system mean travel times, we also require knowledge of the location of a unit when dispatched to a request. In the hypercube framework, the geographical depiction of the "location" of a response unit is general enough to model the fixed locations of ambulances and emergency repair vehicles and the mobile locations of police patrol units. This is accomplished by specifying a location matrix L = (l_nj), where l_nj is the probability that response unit n is located in atom j while available for dispatch (or, equivalently, the fraction of available or idle time that response unit n spends in atom j). We require that L be a stochastic matrix (i.e., for all n, l_nj = 1). A fixed-location unit would have l_nj = 1 for some (small) atom j and l_nk = 0 for k j. In fact, if one wished to be precise about the fixed location, the atom j having l_nj = 1 could be defined to be a point in the city (having zero area and zero workload). A mobile location unit would most likely have several l_nj's nonzero. Note that within this structure it is very natural to allow mobile units to have overlapping patrol areas; for instance, atom k would belong to overlapping patrol areas if l_n₁k 0 and l_n₂k 0 for some n₁ and n₂ n₁. A unit n's patrol area contains all atoms j for which l_nj > 0, whereas unit n's primary response area contains all atoms for which unit n is the first preferred unit to dispatch. In certain applications a unit's patrol area and response area are identical, but in many they are not.

To obtain travel-time performance measures, it is necessary to compute

t_nj mean time required for unit n, when available, to travel to atom j; n = 1, 2,. . . , N; j = 1, 2,. . . , N_A

Since unit n will be located in atom k with probability l_nk, we can write

t_nj =

l_nk_kj

(5.39)

To include the case of assignments from queues, we define the mean "queued request travel time,"

(5.40)

To interpret this expression, consider any particular service request which incurs a queue delay. The request is generated from atom j with probability _j / . Each one of the N busy response units (due to equal service rates) is equally likely to be dispatched to the request. The probability that the dispatched unit must travel from atom i is _i / , the fraction of region-wide workload generated from atom i, and the dispatch assignment is made independently of the particular values assumed by i and j. Thus, with conditional probability _i_j / ², the travel time to a queued request will be the travel time _ij from atom i to atom j. Hence, (5.40) is the mean travel time to a request that incurs a queue delay.

The expression for the region-wide unconditional mean travel time can now be written

f_nj^[1]t_nj + P'_Q

(5.41)

If we define

_j = average travel time to atom j

following reasoning analogous to that above, we write

_j =

(1 - P'_Q) +

_ijP'_Q

(5.42)

If we are interested in the mean travel time to requests in a particular primary response area, we define

average travel time to requests in primary response area n

Assuming response area n includes at least one atom j with _j 0, we have (following the reasoning above)

(1 - P'_Q) +

(P'_Q

(5.43)

Unfortunately, there is no known (exact) expression for the average travel time of unit n (assuming infinite queue capacity). The problem in deriving such an expression arises from the fact that the unit's position when dispatched for the first time back-to-back (with no idle time) during a system busy period is not selected from the probability distribution of request locations _j/; such a unit was most probably assigned to its current service request when several units were available, and thus its location tends to be near its home location (or patrol area). As an approximation, we estimate the mean travel time for unit n as follows:

(5.44)

We must recognize that the term reflecting travel time to queued requests is an overestimate which becomes asymptotically exact as the system utilization factor = /N 1.

Returning to our 3-unit example, substitution into (5.40) yields a mean queued request travel time _Q = 4.34 minutes. To compute the remainder of the (conditional) mean travel times, we must specify the location matrix L = (l_nj). For simplicity in this example, suppose that l₁₄ = l₂₅ = 1 and l₃₇ = l₃₈ = ½:

L = (l_nj) =

From this information, we can compute that the overall system mean travel time, from (5.41), is = 3.26 minutes. The area-specific and unit-specific mean travel times are displayed in Table 5-7.⁹ Note the wide variation in atom-specific mean travel times, ranging from 1.75 minutes to 5.26 minutes, while the higher aggregation averages ( and ) exhibit much less variability. This type of behavior is consistent with the analytical models we studied in Chapters 2 and 3.

⁹If the computed mean travel times in Table 5-7 are found to be inconsistent with the assumption that the mean total service time of each unit is the same known constant, then one must enter these travel times into a new computation for (unit-specific) mean service times and execute the model again to accomplish mean service time calibration (see Sec. 5.3.1). Several such executions may be necessary to achieve convergence.