Dmytro Taranovsky
July 17, 2005
Extensively Expanded and Revised: December 6, 2016

Finitistic Properties of High Complexity

Abstract: We use fast-growing finite and infinite sequences of natural numbers and more complicated constructs to define models of hypercomputation and interpret non-arithmetic predicates, with the strongest extensions reaching full second order arithmetical truth and beyond. Since the predicates are interpreted using properties of certain natural finite structures, they are arguably finitistic.

Contents:
1 Introduction
2 A Model of Hypercomputation
3 Strengthenings to Multiple Sequences
4 Finite Structures
5 $Σ^1_2$ Truth
6 Second Order Arithmetic
7 Beyond Second Order Arithmetic
    7.1 Estimators of Transfinite Levels
    7.2 Using Pattern Enumerators
    7.3 Using Countable Infinity

1 Introduction

Finitism refers to definitions and arguments that do not use an actual infinity, though they may involve a potential infinity. The potential infinity refers to the fact that the natural numbers are unlimited, and that there are arbitrarily large natural numbers. Hyperfinitism by contrast deals only with "small" natural numbers, such as those whose binary representation is feasible.

The central question about finitism is what properties are definable and what claims are demonstrable finitistically. At one extreme is the view that finitistic properties are the recursive ones, and that finitistic demonstration does not go much further beyond PRA. At the other extreme is the view that every definable property (including the truth predicate of set theory) of natural numbers can be defined finitistically, and that every true mathematical claim is finitistically demonstrable. On that view, infinite structures are useful as conceptual aids and for argument simplification, but can ultimately be replaced by finitistically definable finite analogues. Finitistic strong consistency proofs can involve either new finitistic properties or just new insights on the recursive properties. Most logicians have a narrow view of finitistic definability because of absence of evidence to the contrary. Plausibly finitistic ways to define properties of high complexity or to prove strong consistency statements were simply not known before this paper (or at least not generally known). One compromise view is that philosophically, there is a continuum of notions between being fully finitistic and using infinity, and our results help to elucidate that continuum. And one formalist (or under Platonism, metaphoric) view is that our notions are an alternative approach to infinity. In any case, whether or not the definitions here are finitistic, they enrich our foundational understanding by interpreting statements involving infinity as properties of certain natural finite structures.

Hypercomputation refers to idealized computers computing non-recursive functions. Models of hypercomputation may be divided into two types. One uses oracles, either directly or through some auxiliary mechanism. The other uses an infinite amount of computation in finite time. In the models we propose (which are technically oracle based), finite computations correspond to infinite computations by using fast-growing sequences to convert unbounded searches into bounded ones.

In a way, this paper complements my paper [1], especially the last section, and its follow-up [2]. These papers use largeness to extend the language — of number theory in this paper, and of set theory in the other papers — though here the extensions are definable in preexisting mathematics.

For each notion here, the strategy is to intuitively define the notion using just finite numbers, give a formal definition using infinite sets, argue that the intuitive and formal notions agree, and prove the complexity of the formal notion. Precise notions will be intuitively defined by defining a vague notion, and then defining the precise notion such that the definition does not depend on how vagueness is resolved. In the infinitary formal treatment, possibilities for the vague notion will form a directed system such that every sufficiently large choice works correctly.

We assume some understanding of set theory and second order arithmetic; relevant background can be found in a standard text such as [3] (especially for second order arithmetic).

The second section describes computation using a single sufficiently fast-growing infinite sequence. The third section discusses computation using multiple sequences. The fourth section presents similar results for finite sequences. The fifth section describes an arguably finitistic way to compute $Σ^1_2$ truth. The sixth section extends the results to full second order arithmetic, and the final section goes beyond second order arithmetic using transfinite levels of estimators (7.1), pattern enumerators (7.2), and countable infinity (7.3).

2 A Model of Hypercomputation

A powerful hypercomputation model rests on a surprisingly mild physical assumption:

There is a type of event that occurs an unlimited number of times, but with frequency of occurrence decreasing sufficiently fast.

Definition 2.1: A language L is recognized by a Turing machine with the fast growing sequence oracle iff there is a total function $A$ such that for every $B$ with $B(n)≥A(n)$ (for every natural number $n$), the machine using $B$ as an oracle halts on input $S$ iff $S$ is in $L$.

A variation on the definition would be to require $A(n)$ to be sufficiently large relative to $(A(0),A(1),...,A(n-1))$, but the expressive power is the same as we can get the stronger version using $(A(0),A(A(0)), A(A(A(0))), ...)$.

It is not known whether such machines are physically constructible. Currently, the biggest problem is surviving for the extremely long time required, rather than the apparent lack of the required physical events. In any case, studying the machines improves our understanding of the recursion theory. Below, we prove that a language is recognized by such a hypercomputer iff it is $Π^1_1$. Thus, a language is decidable in the model iff it is hyperarithmetic. Note that $Π^1_1$ rather than $Σ^1_1$ is the correct analogue of recursively enumerable. Also, a set is definable in first order arithmetic with a sufficiently fast growing function $A$ (independent of $A$) iff it is recursive in a finite hyperjump of 0.

Theorem 2.2: Recognizability for Turing machines with the fast growing sequence oracle equals $Π^1_1$.
Proof: For a sufficiently fast growing sequence $A$, a recursive relation has an infinite descending path through $n$ iff it has an infinite descending path through $n$ and then through a natural number less than $A(\text{max}(n, m))$ where $m$ is the length of the given recursive definition of the relation. By König's lemma, if the relation is well-founded, then the tree is finite, and thus the machine will eventually discover that the relation is well-founded.
For the converse, for each such machine and input, the machine halts iff for every $A:ℕ→ℕ$ there is a halting computation such that the answers the machine receives are at least as large as the answers given by $A$.

If we do not require a Turing machine to act uniformly on all sufficiently fast-growing $g$, we have the following.

Theorem 2.3:
(a) The set of Turing machines that halt for some sufficiently fast-growing $g$ (that is $g(n)$ is sufficiently large relative to $n$) is $Π^1_1$-complete.
(b) The set of Turing machines that halt for every monotonic sufficiently fast-growing $g$ is $Π^1_1$-complete.
(c) The set of Turing machines that halt for every sufficiently fast-growing g captures (under one-one reduction) the game quantifier for $Π^0_2$-games on integers.
Note: Lack of monotonicity makes the condition in (c) less natural. It is unclear if the bound in (c) is optimal; it is far beyond a finite iteration of hyperjump.
Proof:
(a) Follows from the proof of the theorem.
(b) We reduce this to recognizability with the fast-growing sequence oracle. Let f be monotonic and sufficiently fast-growing. Given a Turing machine T, consider the tree of sequences $S$ with $f(n+|T|)<S(n)$, and $S(n+1)<f(S(n)+|T|)$, and halt if $T$ halts on every path through the tree. If the machine halts for every sufficiently fast-growing $g$, then the tree (being finitely branching) is finite. Conversely, if for some sufficiently fast-growing $g$, the machine does not halt, consider the subtree of nodes for which there is a sufficiently fast-growing (not necessarily part of the tree) nonhalting continuation. This subtree is infinite, as desired.
(c) Given a $Π^0_2$ ($¬∃m∀n \, φ(m,n)$)-game quantifier, run the following machine: Use $g$ (and some coding consistent with $g(n)$ being sufficiently large relative to $n$) to get a number $N$ and moves for the first player, and restrict the second player moves to those below g(number_coding_the_play_so_far). Simulate the game, and halt if $∃m<N ∀n \, φ(m,n)$ is refuted on one of the paths.

Computational Complexity

For each machine, input, and the set of possible $A$, we may define its computation tree, where we start at the root, each vertex is a query to $A$, and each branch is an answer to the query, and we may define the ordinal given machine and input as the ordinal depth of the computation tree for the set of sufficiently fast-growing $A$ (where 'sufficiently fast-growing' may depend on the machine and input). Intuitively, the ordinal says how long in the worst case the computation will take relative to the rate of growth of $A$. From there, the ordinal for a machine is the supremum of ordinals across inputs (assuming the machine always halts for sufficiently fast-growing $A$), and the ordinal for a problem is smallest ordinal for a machine (whose output is independent of the choice of a sufficiently fast-growing $A$) that solves the problem.

Theorem 2.4: For each recursive ordinal $α$, the ordinal for a problem is at most $α$ iff the problem is recursive in $0^{(α)}$ (αth Turing jump of 0).
Proof: To compute $0^{(α+1)}$, given an input $n$, compute $0^{(α)}$ up to $A(n)$, and then check whether machine $n$ halts on $0^{(α)}$ before getting out of bounds. Conversely, given a computation tree of depth $α$, start with accepting halting positions, and work backwards, and at each stage, mark positions such that for every sufficiently large answer, the resulting position was marked. For $α=β+n$, note that each level adds a $Δ^0_2$-quantifier, so $n$ levels merge into $Δ^0_{n+1}$ as required. (Each quantifier $Δ^0_2$ since assuming the sequence so far was sufficiently fast-growing, and given that we measure at maximum depth, every sufficiently large choice for the next element will give the same acceptance value.)

Informally, we can speak of computational complexity on a much more fine-grained level as follows: A problem can be solved in time $T(n)$ if there is a Turing machine and a natural set $A$ such that the machine using $A$ solves the problem in time $T(n)$ and that if $∀n B(n)≥A(n)$, the machine using $B$ correctly solves the problem. Thus, recursive languages have recursive complexity, arithmetic languages have arithmetic complexity, and so on. The converse might be false, and it is unclear what the complexity of the inverse Busy Beaver function should be. Unfortunately, we do not know how to define complexity formally because using unnatural $A$ we can code solutions to recursive problems as follows: If instance $k$ is true, set $A(2k)$ to a certain value, and otherwise make $A(k)$ larger, and similarly with k being false and $A(2k+1)$. One mitigation (which still breaks down at high complexity) would be to make $A$ sufficiently sparse, for example by requiring that $A$ is coded through a binary tape with sequential access such that every '1' is at a position $2^n$ (for some n) and that if a '1' is position $2^n$, the next '1' cannot occur until position $2^{n^2}$. That way, for problems with easily verifiable solutions there is at most a polylogarithmic speed up (relative to the running time) as we can try all possible initial oracle tape segments.

3 Strengthenings to Multiple Sequences

Still higher complexity is reachable by computers that use a sufficiently fast growing function $A$ and a function $B$ that grows sufficiently fast relative to $A$. We can extend the model to allow more sequences, each of which growing sufficiently fast relative to the previous ones. We can even (not part of definition 3.1) allow transfinitely many sequences through the use of diagonalization. To handle such cases, an infinite alternation of quantifiers is defined to mean existence of a winning strategy in the corresponding game.

Definition 3.1: A language L is recognizable by a Turing machine with the fast-growing level $k$ sequence oracle iff there is a Turing a machine T and a function $f$ such that for every $A_1,...,A_k$ with $∀i< k \, ∀n \, A_{i+1}(n)>f(A_i)(n)$ ($A_0$ is 0 here), T halts on input S with oracles $A_1,...,A_k$ iff S is in L.
Note: We use the term "level $k$" (or $k$-level) because for a number $n$ in the range, the maximum $i$ such that $n∈A_i$ is from 1 to $k$.

Proposition 3.2: The set of winning positions in a $Σ^0_2$ game is recognizable using fast-growing level 2 sequence oracle.
Proof: If a player has a winning strategy, then he has a winning strategy using moves less than $A_1$(position), where position includes the position in the game and a canonical description of the rules. Given this constraint on the moves, the second player has a winning strategy iff he has a strategy that enters then $n$th open set before time $A_2$($n$+position). Given these constraints, the game becomes an open one with finite number of moves per turn, and hence recognizable.

Conjecture 3.3: We conjecture the following to have equal complexity ($n>0$):

  1. Recognizability using sequences of level $n+1$.
  2. Existence (or the predicate for existence) of winning strategies in level n of the difference hierarchy of (lightface) $Δ^0_3$ sets (level 1 is $Σ^0_2$; see below for the right sign for higher levels).
  3. Monotone inductive definitions for co-recognizable sets (relative to the current real number) for sequences of level $n$ (for $Π^0_1$ if $n=0$).

Moreover, we conjecture that allowing number quantification in the model (i.e. allowing arbitrary arithmetical formulas in place of Turing machines) corresponds to boldface (in terms of $Σ^1_1$ strength of the corresponding theory) or finitely iterated versions of these properties.

Furthermore, using the results in [4], we appear to have the following equivalence:
(a) $x$ is expressible using using (2) for some $n$.
(b) $x$ is in the least $\text{L}_κ$ satisfying $∀n ∃κ_1,...,κ_n \, κ_1<...<κ_n<κ ∧ \text{L}_{κ_1}≺_1...≺_1 \text{L}_{κ_n}$ (these sets for each numeral $n$ and relativized to each real $y$ exactly capture the $Π^1_3$ strength of $Π^1_2-\text{CA}_0$).
(c) $x$ is recursively reducible to the set of trees accepted by a tree automaton (equivalently by a formula in S2S) (the reduction $f$ is such that $x(i)=1$ iff $f(i)$ is a (possibly infinite) tree that is accepted).

Towards Proving the Conjecture

Conjectured computation of recognizability for these machines using the game quantifier:
Let player 1 choose oracles $A_1,A_2,...,A_k$, at move $m$ disclosing $\{(x,i): A_i(x)=m\}$, and similarly for player 2 and $B_1,B_2,...,B_k$ with $∀i,n \, A_i(n)≤B_i(n)$. Player 1 wins if the machine halts using $B_1,B_2,...,B_k$ as oracles or there is $i$ such that each of $A_1,...,A_i$ is infinite while one of $B_1,...,B_i$ is finite.

Conjectured computation of the game quantifier using recognizability for these machines:
To compute the game quantifier for the ($k-1$)th level of the lightface $Δ^0_3$ difference hierarchy (for the Baire space, which corresponds to level $k$ for the Cantor space), it is sufficient to consider games of the following form. Given a polynomial time computable $f:N→{0..k}$ (which will determine the game rules), the first player wins iff the highest $f(m)$ seen infinitely often is even, where $m$ codes the sequence of moves (each move being 0 or 1) made so far. To recognize the game quantifier (or its complement depending on the parity of $k$) using the oracles, modify the game as follows. Halt at stage $A_i(x)$ if $x≥n$ and no value of $f$ that is $≥i$ occured after stage $x$ (with each player's move being one stage). First player wins iff $i$ is odd (if there are multiple i that simultaneously trigger the halt, pick the lowest one). If we never halt, the winner is the player corresponding to value k being seen infinitely often.

Relation with monotonic induction:
* The game quantifier can be computed using monotonic induction: Use the previous quantifier to find positions where player two can win while preventing the game from getting into state $k$ (where player one wins if $k$ is seen infinitely often), and mark these positions as player two wins, and repeat with the modified game, continuing using monotonic induction until no new positions are added. The other positions are player one wins. (Side note: A winning strategy for player two would be to play the auxiliary game (for the step where the current position was marked), and each time a marked position for the auxiliary game is reached, switch to the auxiliary game that led to the position being marked. For player 1, the auxiliary game is keeping to unmarked positions, with a win if state $k$ is reached, and if the game does not halt at a finite step, the winning condition is the same as in the original game.)
* To simulate the multiple sequence oracle, we can build the computation tree where each vertex corresponds to a query to $A_k$. Work backwards from halting configurations, and using the previous quantifier for each step, use monotonic induction to reach the initial state if it is halting.

For transfinite sets of sequences, the above constructions suggest correspondence with winning strategies for transfinite levels of $Δ^0_3$ difference hierarchy.

4 Finite Structures

The infinite structures have finite analogues. A key concept is that of a sufficiently fast-growing sufficiently long finite sequence. Sufficiently long is with respect to the rate of growth, and is preserved under extending the sequence. In computation, along with input, we will be given such a sequence, with the first element larger than the input size. Vagueness will be avoided because there is no error in interpreting "sufficiently" too strictly. What we are defining are not the sequences themselves, but the results of the computations — for computations that agree for all appropriate sequences. The appropriate sequences can then be formally defined as the ones giving the correct results. Also, by using infinite sets, we can quantify over all possible definitions (like we did for the fast-growing sequence oracle) and thus formally define the results without using "sufficiently".

Let us now describe the construction in detail. In the definitions, 'vague' indicates dependency on a notion of 'sufficient'.

Vague Definition 4.1:
(a) A sufficiently long fast-growing sequence is a finite function $g:\{0,...,N\}→ℕ$ such that
- $g(0)$ is sufficiently large (relative to the problem size)
- $g(n+1)$ is sufficiently large relative to $g(n)$ ($n<N$)
- $g$ or an initial segment of $g$ is sufficiently long relative to its rate growth. The rate of growth is how large is $g(0)$, and how large is $g(n+1)$ relative to $g(n)$ (for $n+1$ in the range of the initial segment).
(b) A sufficiently long fast-growing sequence or its extension is any finite extension of a sequence in (a).
(c) (generalization of (a)) A sufficiently long fast-growing level $k+1$ sequence is a sufficiently long finite sequence of sufficiently long fast-growing level $k$ sequences such that once the sequences are expanded to numbers, each number is sufficiently large relative to the previous one. Here, a level 0 sequence is a number sufficiently large relative to the problem size. "Sufficiently long" may depend on the sequence (specifically, required length is relative to the rate of growth or its analog) but if a sequence is sufficiently long, then so are all of its extensions. One representation of the sequence is a function $g:\{0,...,n\}→\{0..k\}$ with $g(x)≥i$ iff $x$ is the last element (after expansion into a sequence of numbers) of an included level $i-1$ sequence (and $g(n)=k$).

In one finitistic treatment, we assume that in certain contexts, "sufficiently" is a primitive concept — it can be explained, partially axiomatized, and connected to other concepts, but not formally defined in terms of more primitive concepts. (On the other hand, one could argue that our uses of 'sufficiently' and other terms are a recharacterization of aspects of infinity.) Using that concept, we can define a predicate $P$ as $P(n)$ holds iff Turing machine $T$ accepts given input $n$ and a sufficiently large/complete/etc example (relative to $n$ and $T$) of the appropriate type, provided that $T$ (independent of $n$) is a machine that halts with the same (dependent on $n$) output for every $n$ and sufficient example. While without context, 'sufficient' is vague, the above usage is (at least arguably) precise because it does not depend on how the vagueness is resolved. One would then finitistically study the properties of $P$ and other such predicates, and develop axiom systems. However, with a little reasoning, many uses of "sufficiently" can also be formally defined using infinite sets, and we will use infinity as a shortcut and to unambiguously interpret the concepts in existing mathematics.

Template Definition 4.2.
Assumption: We are given a set of potential examples, and a set $S$ of possible notions of sufficiency: These must form a directed set under strictness, and each notion must be nonempty, or the definition fails. Alternatively, instead of $S$, we may given $T$ that is equivalent to $S$ in that for every notion in $S$ there is at least as strict notion in $T$ and vice versa.
(a) A property holds for every 'sufficient' example iff there is a notion $Q$ in $S$ such that the property holds for every example in $Q$.
(b) A predicate $P$ is computable using a sufficient example (assuming the examples are reducible to integers) iff there is a Turing machine $T$ such that for every $n$, $T$ outputs $P(n)$ given $n$ and a sufficient example (that is for every sufficient example) as input.

To apply the definition (which we will sometimes (for example, in section 7.2) do implicitly), we need the set of possible notions of sufficiency. Here is one choice, noting that selection between equivalent (as used in 4.2) sets may be arbitrary.

Definition 4.3:
(a) A notion of sufficiently long fast-growing sequences is valid iff there is a function $f$ such that for every infinite sequence $g$ with $g(0)>f(0)$, $g(n+1)>f(g(n))$ there is $m$ such that every initial segment of $g$ of length at least $m$ is sufficient.
(b) The extension for 4.1(b) is trivial. (A notion of sufficiently long fast-growing sequence or extension thereof is valid if it is the closure under finite extension of a valid notion in (a).)
(c) A notion of sufficiently long level $k+1$ fast-growing sequences is valid iff there is a function $f$ and a valid notion $P$ of sufficiently long level $k$ sequences such that for every infinite sequence $g$ of elements of $P$ with, when $g$ is expanded into a sequence of numbers, $g(0)>f(0)$, $g(n+1)>f(g(n))$, there is $m$ such that every initial segment of $g$ of length at least $m$ is sufficient. For every $n$, $\{x:x≥n\}$ is a valid notion of level 0 sequences.

Theorem 4.4:
(a) Predicates that are $Δ^1_1$ in the hyperjump of 0 are computable using a sufficiently long fast-growing sequence.
(b) A predicate is computable using a sufficiently long fast-growing sequence or extension thereof iff it is $Δ^1_1$.
Notes:
* We conjecture the result in (a) to be optimal. There should also be an analogous statement for 4.1(c).
* A finite sequence (as in (a)) is more expressive than the corresponding infinite sequence because it gives us a stopping point.
* For mass problems (where the correct answer is non-unique), the notion in 4.1(b) can be used to compute $Π^1_1$ separation (with the choice of the correct answer depending the sequence). A weakening of 4.1(b) that does not allow that would be to store the sequence as a predicate so one cannot test whether one reached the end.
Proof:
(a) The algorithm above (Theorem 2.2) for testing well-foundness (with reaching the end of the sequence indicating an infinite path), will work correctly and compute the hyperjump of 0 up to a sufficiently large unspecified point in the sequence. In deciding a $Δ^1_1$ in the hyperjump of 0 problem, use the partially computed hyperjump of 0 to check which of the two opposing sequences (for the sentence and its negation) terminates first: For inputs in the size limit, that will happen before that unspecified point.
(b) $Δ^1_1$ predicates are clearly computable. For the converse, for every rate of growth $g$, there is a finite sequence $h$ with $h(n+1)>g(h(n))$ such that the output is given on every continuation of $h$ (and one similarly has a $Σ^1_1$ description).

Intuitively, one can treat a sufficiently fast-growing sufficiently long sequence as if it is a definite object, but restricting reasoning to forms that remain valid; I am not sure if there is a reasonable formal system (perhaps analogous to nonstandard analysis) that corresponds to such a restriction.

If we allow the computers to produce sufficiently long level $n$ fast growing sequences unlimited number of times (for variable input length), that corresponds to an infinite sequence of level $n+1$. However, to make the (arguably) finitistic definition as concrete as possible (and to allow a more fine-grained hierarchy of notions), we limit ourselves to a single application (of the producer of the sequences), and to correctness with respect to an ordinary Turing machine using that application.

Restrictions on Length
(Disclaimer: Statements here were not formally verified.)
     In addition to considering sufficiently long sequences, we can also consider specific length conditions. Given a recursive well-founded relation '≺', a natural corresponding length condition is the following: There is no $x_0 ≻ x_1 ≻ ... ≻ x_n$ with $x_i < g_i$ and $n+1$ being the length of $g$. Computability for sufficiently fast-growing $g$ with this length condition equals $0^{(α+1)}$ (or $0^{(α)}$ for finite $α$) where $α$ is the rank of '≺'.
     For fixed finite $k$, $g$ having length $k$ corresponds to ordinal $k$ for '≺'. If $h$ is a fixed recursive function $\lim_{n→∞} h(n) = ∞$, then $g$ having length $h(g(0))$ corresponds $0^{(ω+1)}$. The independence of expressiveness of $h$ may appear counterintuitive — if we increase the length of $g$ by 1, we can compute the halting problem for appropriate machines that use the shorter $g$ — but we cannot determine which machines are valid (in the sense of being independent of the choice of sequence), hence halting reversal does not increase expressiveness. Going further, length $h(g(k))$ corresponds to $0^{(ω+k+1)}$, length $g(h(g(k))$ corresponds to $0^{(ω 2 + k + 1)}$, length $g^{h(g(k))}(0)$ (superscript indicates repetition) corresponds to $0^{(ω^2 (k+1) + 1)}$, and with natural extensions (allowing nested superscripts) for ordinals below $ω^ω$.

5 $Σ^1_2$ Truth

We now give an apparently finitistic definition of $Σ^1_2$ truth. Essentially, this section informally describes in detail a variation on level 2 estimators of the next section.

Along with input, we are given a finite set of finite sequences. Each sequence is sufficiently long relative to itself, or relative to its rate of growth. (Optional) If $A$ is sufficiently long and $|A|≤|B|$ and $∀n<|A| \, B(n)≤A(n)$, then $B$ may be considered sufficiently long. (Optional) Arbitrary conditions may be imposed on the sequences as long as they do not impose an upper bound on the rate of growth. (An upper bound would be a function $f$ such that for every permitted sequence $A$, $∃n<|A|-1 \, A(n+1)<f(A(n))$ or $A(0)<f(0)$.) Given these conditions, the set is sufficiently complete for the input size. Completeness is preserved under addition of new sequences or decrements to input size. We are not given the conditions on the sequences, but regardless of what they are, the algorithm below is guaranteed to be correct.

The algorithm aims to check whether the sentence encoded in the input has a transitive model. It enumerates potential initial segments (whose length equals the length of the longest sequence in the set of sequences we were given) of the truth predicate for the language with a distinguished symbol for a well-ordering of the elements. Each segment is checked for a quick (relative to its length) inconsistency with the sentence. If not found inconsistent, it is converted into a finite quasi-model, with elements coded by sentences of the right syntactic form: $∃x \, P(x)$ codes the least such $x$ if true, and 0 if false. Then, for every sequence $A$ with $A(0)$ larger than the input size, it tries to form a descending path $(a, b, c, …)$ of elements with $a<A(1), b<A(2), c<A(3), …$ . If the path reaches length $|A|-1$, the truth predicate is rejected; otherwise it is accepted by the sequence. The algorithm returns 'yes' iff a truth predicate segment is accepted by all sequences.

To see that the computation is correct, note that if a sentence has a transitive model, then it has a transitive model of bounded complexity, and hence without sufficiently long descending sequences of the required type. If it does not have a transitive model, then in the complete binary tree of candidate truth predicate initial segments, every path will have an infinite descending sequence (or inconsistency). Thus, regardless of how strictly "sufficiently long" is defined, every path will have sufficiently long infinite descending sequences. By Konig's lemma, a finite set of such sequences can cover all paths.

The construction has an infinite version, namely a predicate or a tree on finite sequences of natural numbers. Consider a well-founded tree of such sequences (ordered by extension) such that every non-leaf node can be extended by any natural number, and if a sequence is a leaf node, then it is sufficiently long relative to itself (or relative to its rate of growth; the difference does not affect the expressive power) — or any tree that is a well-founded extension of such a tree. Using such a tree, recognizability is $Π^1_2$ complete. (However, $Σ^1_2$ may be a better analogue of being recursively enumerable: The relevant parameter is not the halting time but the ordinal depth of the visited portion of the tree for nonhalting runs, which corresponds to the complexity of the least witness for a $Σ^1_2$ statement.) In one direction, the above computational procedure works, except that if the theory has a well-founded model, we do not halt. In the other direction, the machine halts iff for every well-founded tree $T$ for every non-halting computation corresponding to an extension $S$ of $T$, $S$ has an infinite path, which is $Π^1_2$.

We can extend definability be using an infinite sequence growing sufficiently fast relative to the tree. Recognizability should then correspond to having a $Π^1_2$ monotonic inductive definition. A stronger extension is to use an n-tuple of trees, with sequences in each tree sufficiently long relative to themselves, and to every previous tree.

6 Second Order Arithmetic

Vague Definition 6.1:
A level 0 estimator is a pair $(a, b)$ with $b$ sufficiently large relative to $a$ and the problem size.
A level n+1 estimator is a sufficiently complete finite set of level $n$ estimators, where 'sufficient completeness' is defined such that it does not prevent the notion of level $n$ estimators from being arbitrarily strict.

The above definition is vague but the following is well-defined regardless of how the vagueness is resolved.

Definition 6.2: Computation of second order arithmetical truth:
* Use standard rules to convert the sentence (or its negation) to $Q X_1 Q X_2 … ∃ X_n ∀x ∃y>x \, P$ where $n>0$, each $Q$ is a quantifier, and $P$ is a bounded quantifier formula that does not use $x$ (and $P$ can even be chosen to be polynomial time computable), and each $X_i$ is a set of natural numbers.
* Pick any (it does not matter which one) level $n$ estimator on problem sizes at least as large as the sentence. The sentence is true iff it passes the estimator.
* A formula $∀x ∃y>x \, P(y, …)$ ($P$ as above) passes a level zero estimator $(a, b)$ iff $∃y (a≤y<b) P(y, …)$.
* A formula $∀Xφ(X)$ passes an estimator iff for every binary sequence $X$, $φ(X)$ passes at least one of the estimators included.
* A formula $∃Xφ(X)$ passes an estimator iff for some binary sequence $X$, $φ(X)$ passes all of the estimators included.
(Note: Because $P$ is bounded quantifier and an estimator is finite, the search over $X$ is also finite).

To informally see that the description is correct, if a $Σ^1_n$ statement is true, then it has a witness (for the outermost existential quantifier) of bounded complexity. Level $n-1$ estimators will work correctly with that witness and hence the sentence. On the other hand, if a $Σ^1_n$ statement is false, then every potential witness will be intercepted by some level $n-1$ estimator, and hence the falsehood will be witnessed by a sufficiently complete set of such estimators. By induction on $n$, the method is correct.

To prove correctness, we first have to formalize what we want to prove. If we treat 'sufficiently' as a parameter in the definition, we naturally get the following:
Definition 6.3:
* A notion $R$ of 0-estimators is valid iff $R(a,b) ∧ c≤a ∧ b≤d ⇒ R(c,d)$ and $∀a∃b R(a,b)$.
* A notion $R$ of $n+1$ estimators is valid iff
    (1) every element of $R$ is finite.
    (2) $∪R$ is valid and $x∈R ∧y∈∪R ⇒ x∪\{y\}∈R$. (Equivalently, there is a valid notion $T$ of $n$-estimators such that every element of $R$ consists of elements of $T$, and adding an element of $T$ to an element of $R$ results in an element of $R$.)
    (3) for every valid notion $S$ of $n$-estimators, there is an element of $R$ consisting of elements of $S$.

Definition 6.4: A sentence (in second order arithmetic or another well-defined language) using a (sufficiently good) $n$-estimator as a parameter is well-defined iff there is a valid notion $R$ such that all elements of $R$ agree on whether the sentence is true, which is then the truth value of the sentence.

The truth value of a well-defined sentence is unambiguous:

Proposition 6.5: Given two valid notions $R$ and $S$ of $n$-estimators, $R∩S$ is also a valid notion.
Proof: Assuming $n>0$ ($n=0$ is easy), note that $∪R$ and $∪S$ are valid. Assume that the proposition applies for $n-1$, and let $T$ be an arbitrary notion of $n-1$ estimators. Thus, $T' = T ∩ ∪R ∩ ∪S$ is valid. There is $x∈R$ and $y∈S$ such that $x⊂T'$ and $y⊂T'$, and for every such $x,y$, we have $x∪y∈R∩S$, which satisfies (3) in the definition. Using a similar argument, $∪R ∩ ∪S = ∪(R∩S)$, and (2) follows.

Theorem 6.6: The above notion/computation of second order arithmetical truth is well-defined and agrees with the actual truth.
Proof: Given problem size $s$, and a real number parameter $X$, consider the following notions:
$(a,b)$ is a level 0 $(s,X)$ estimator iff for every $φ$ in $\text{Z}_2$ for with two free variables and length at most $2^{s+a}$, $∃c \, φ(c, X) ⇒ ∃c<b \, φ(c, X)$. For this notion, if working in just ZFC without projective determinacy, augment $\text{Z}_2$ with a function that uniformizes projective sets.
$x$ is a level $n+1$ $(s,X)$ estimator iff every element of $x$ is a level $n$ $(s,X)$ estimator and for every $Y⊂N$, there is an element of $x$ that is $(s,X,Y)$ $n$-estimator (coding $(X,Y)$ by a single real number).
By induction and using Konig's lemma to get finiteness, these are valid notions of $n$-estimators. Given a level $n$ $(s,X)$ estimator, computation of $Σ^1_n$ truth works for sentences of length at most $s$ that use $X$ as a parameter. At the top level (and similarly for other levels):
true $∃Xφ(X)$ — there is a bounded complexity witness $X_0$, and all included estimators will accept $φ(X_0)$.
false $∃Xφ(X)$ — for every $X$, an $X$-estimator is included, and it will reject $φ(X)$.
true $∀Xφ(X)$ — for every $X$, an $X$-estimator is included, and it will accept $φ(X)$.
false $∀Xφ(X)$ — there is a bounded complexity counterexample $X_0$, and all included estimators will reject $φ(X_0)$.

Variations on the definition:
* Instead of 0-estimators, we could have started with level -1 estimators being sufficiently large numbers (or even every number being a sufficiently good level -2 estimator), and $k$-estimators for $k>0$ should then still have the same power.
* If sufficiently complete did not obey the "arbitrarily strict" qualification and we used the definition like in Strengthenings to Multiple Sequences, the expressibility using $k$-estimators ($k>0$) would collapse to that of being hyperarithmetic in the $k$th hyperjump of 0.

The estimators have an infinite version: A predicate that codes a sufficiently strict valid notion of $n$-estimators. Recognizability for machines using such predicates (where the machine must work for every sufficiently strict notion of $n$-estimators) appears to equal $Π^1_{n+1}$. Also, a single predicate suffices for all input sizes: Given input size $s$, the machine can pick $n$-estimators such that for every included level 0 estimator $(a,b)$, $b>s$, and then ignore 0-estimators $(a',b')$ with $a'<s$.

This completes our description at the level of the second order arithmetic. What remains is finitistically studying which axioms to include in the system, and relating infinite statements to natural statements about finite structures. If, for example, projective determinacy for the results of the estimations will be deemed a finitistically acceptable axiom schema, we will have a finitistic consistency proof of ZFC.

7 Beyond Second Order Arithmetic

7.1 Estimators of Transfinite Levels

To go further, we can define transfinite levels of estimators. For an ordinal $α>0$ (alternatively, for consistency with $n$-estimators, limit $α$), an $α$-estimator is a sufficiently complete collection of estimators of lower levels, with α being marked in the estimator, and where 'sufficiently' complete is defined so as not to preclude notions of lower level estimators being arbitrarily strict.

A notion $R$ of $α$-estimators ($α>0$) is valid iff:
* every element of $R$ is finite,
* for every sequence $S$ of length $α$ of valid notions of 0,1,... estimators, there is an element of $R$ all of whose elements are elements of $∪S$, and
* there is a sequence $S$ of length $α$ of valid notions of 0,1,... estimators such that every element of R consists of elements of $∪S$, and $x∈R ∧ y∈∪S ⇒ x∪{y}∈R$.

As before, for $α$-estimators $R$ and S, $R∩S$ will be valid, so we can meaningfully speak of 'for every sufficiently good $α$-estimator'. We can also meaningfully speak of 'for every sufficiently strict predicate $P$ for $<α$-estimators'.

Definability of reals using estimators of levels below $ω(α+1)$ appears to equal to definability in $\text{L}_α(ℝ)$, at least when $α$ is not too large. Note that for the definition not to (overtly) use infinity, we would need a coding of for ordinals up to $α$ by integers.

Estimators of Rational Levels: One possibility for further extension is for the levels to be nonnegative rationals, with each estimator of positive level being a sufficiently complete finite collection of estimators of lower levels, with a label for the level. It is unclear how to define sufficiently complete here (and whether such a definition is possible). In the infinitary treatment, the set of valid notions would be a set of real numbers, and we would want to specify conditions so as to define (or sufficiently constrain) that set; it is not clear whether the analogue of $α$-estimator definition for well-ordered subsets of $ℚ$ works as the condition.
Note that while there are infinite descending sequences of rationals, infinite regress is avoided here because in such a sequence the denominators become increasingly large, so for each given problem size, for a sufficiently small $r>0$, an $r$-estimator might consist of just 0-estimators. We can embed ordinals that we can construct into rationals, and one possibility is that the expressiveness corresponds to $Δ_1(\{ℝ\})$ definability in $\text{L}_α(ℝ)$ for the least $α$ such that $\text{L}_α(ℝ)$ is admissible.

7.2 Using Pattern Enumerators

A (not necessarily finitistic) notion related to that of a sufficiently large Turing degree (or more closely, fast growing infinite sequence of Turing degrees) is that of a sufficiently closed pattern enumerator.

Vague Definition 7.2.1: A sufficiently closed pattern enumerator is a function $f:N→2^N$ such that
- $f(n)(i)$ is character $i$ in the infinite string corresponding to pattern $n$. Numbering of patterns is permitted to be arbitrary.
- if $x$ follows a sufficiently simple pattern relative to $f(n)$, then $x$ is in the range of $f$.
- (optional for our purposes) if $x$ and $y$ are in the range of $f$, then so is $x_0,y_0,x_1,y_1,...$ .

For example, for some $n$, $f(n)(i)=0$ for all $i$, but the least such $n$ may be arbitrarily large. Also, without the optional condition, enumerators $A$ and $B$ can be trivially combined into an enumerator of patterns included in $A$ or $B$.

Let $A_1, A_2, ..., A_n $be sufficiently closed pattern enumerators such that $A_{i+1}$ includes $A_i$ as a pattern, and $S$ be a sufficiently fast growing sequence relative to the enumerators.

Assuming sufficient determinacy, we may recognize the set of winning positions of games of integers of length $ω^2 n$ with $Σ^0_1$ payoff as follows. Replace the first $ω$ moves of the first player by a strategy $p_1$ for these moves where $p_1$ is in $A_1$, replace the first $ω$ moves of player 2 by strategy $s_1$, such that $p_1$ join $s_1$ is in by $A_1$, and so on replacing the first $ω^2$ moves and making game length $ω^2 (n-1)$ (if $n>1$). We might not be able to go further with $A_1$ since $A_1$ does not recognize its behavior as a pattern, but we can use $A_2$ for the next $ω^2$ moves and so on, converting the game to one of length $ω$. We then constrain the move values using $S$, and if there is a strategy that wins at a finite stage, halt with 'accept'. We conjecture the converse to hold as well (assuming sufficient determinacy).

Notes:
* Without a sufficiently fast-growing sequence $S$, an unspecified sufficiently closed pattern enumerator is essentially useless to a Turing machine since it may be arbitrary up to an arbitrarily large finite point.
* Sufficiently closed pattern enumerators have a number of equivalent (in terms of expressiveness) definitions. For example, we may view them as pattern detectors with $f'(s)$ being the least $n$ such that $f(n)$ starts with $s$.
* We may use a transfinite sequence of pattern enumerators, with each enumerator including the sequence of enumerators of lower levels (combined into a binary sequence) as a pattern.
* If we used 'sufficiently simple pattern' instead 'sufficiently simple relative pattern relative to $f(n)$', we should get the same expressiveness by using $ωα$ enumerators in place of $α$, while requiring each enumerator to include all sufficiently simple patterns with respect to the join of the lower level enumerators.
* Like fast growing sequences, the enumerators (and chains of them with the fast-growing sequences), have a finite version; simply cut off the chain (coded as an infinite string) at a sufficiently large number relative to the chain and the problem size.

7.3 Using Countable Infinity

We can also consider what can be expressed using real numbers but without using an uncountable infinity. One way to increase expressiveness is to allow an infinite alternation of real quantifiers. Formally, $∃X_1 ∀X_2 …$ can be defined as true when the first player has a winning strategy in the game where the players takes turns to play $X_i$ and the goal of the first player is to satisfy the formula after the quantifiers. Under reasonable axioms, this leads to expressiveness beyond L(ℝ). Technically, for every countable ordinal $α$, real numbers definable from a countable ordinal using $1+α+1$ real quantifiers are precisely those that are in the minimal iterable inner model with $α$ Woodin cardinals. Just $ω+2$ real quantifiers go beyond first order definability in L(ℝ).

This definition uses strategies, which are formally uncountable. However, we can avoid the uncountable by introducing a notion of $Y$ having a sufficiently large complexity (under Turing reducibility) relative to $X$. As before, vagueness is avoided because every sufficiently strict notion works. We obtain the necessary complexity by using a sequence of $α$ real numbers, each sufficiently complex relative to a join of the previous real numbers. Statements with infinite alternations of quantifiers can be decided by restricting the complexity of the strategies: For $γ$ levels of play, the strategies must be recursive in the $γ$'s member of the sequence. By determinacy, and by sufficiency of the sequence, this does not affect who wins. We can go even further by using sequences with ordinal length sufficiently long relative to themselves. Reaching higher complexity is an endless endeavor.

REFERENCES

[1] "Extending the Language of Set Theory" by Dmytro Taranovsky (2005), arXiv:math/0504375.
[2] "Reflective Cardinals" by Dmytro Taranovsky (2012), arXiv:1203.2270.
[3] Simpson, Stephen G. Subsystems of Second Order Arithmetic (Perspectives in Logic) 2nd Edition (2009).
[4] "How unprovable is Rabin's decidability theorem?" by Leszek Aleksander Kołodziejczyk, Henryk Michalewski (2015), arXiv:1508.06780.