Foundation of Mathematics

First-order logic
First-order logic is not a theory in itself, but a guideline on the construction of first-order theories.

Syntax
The syntax determines which tuples of symbols are well-formed expressions in the first-order theory, and which expressions are syntactic consequences of which expressions. The syntax consists of an alphabet, a collection of formation rules, and a deductive system:
- Alphabet
  The alphabet is the collection of symbols allowed to form a well-formed expression in the first-order theory. It is implicitly assumed that whenever we define new symbols, they are distinct from all symbols already defined.
  - Logical symbols
    Logical symbols always have the same meaning. They include:
    - Variable symbols
      We have a distinct variable symbol for every natural number. Note that distinct notations for variable symbols do not necessarily represent distinct variable symbols.
    - Logical connectives
      We have negation $\neg$, a unary connective, and implication $\to$, a binary connective.
    - Punctuation symbols
      We have left parenthesis $($, right parenthesis $)$, and comma $,$.
    - Quantifier
      We have the universal quantification $\forall$. Note that given a variable symbol $x$, the tuple of symbols $\forall x$ may also be called a quantifier.
  - Non-logical symbols
    The non-logical symbols of a first-order theory are specified via a theory-dependent signature, which consists of a collection of predicate symbols and a collection of function symbols. Each predicate symbol or function symbol has an arity number, which is a natural number representing the number of arguments the predicate symbol or function symbol takes. A function symbol with arity $0$ is also called a constant symbol. We require the inclusion of a predicate symbol $=$ called equality in the signature, with arity number $2$.
- Formation rules
  Given a signature, formation rules are a collection of rules the symbols have to follow in order to form a well-formed expression in the first-order theory.
  - Terms
    A tuple of symbols $t$ is called a term if and only if there exists a tuple $(t_1,\ldots,t_n)$ of tuples of symbols for some non-zero natural number $n$, such that $t$ is $t_n$, and for all $i$ from $1$ to $n$, one, and thus exactly one, of the following is true:
    - 1. $t_i$ is $x$, where $x$ is a variable symbol.
    - 2. $t_i$ is $f(t_{s_1},\ldots,t_{s_k})$, where $f$ is a $k$-ary function symbol in the signature and $s_1,\ldots,s_k$ are among $1,\ldots,i-1$.
    In particular, since constant symbols are nullary function symbols, $c()$ is a term if $c$ is a constant symbol, and we may use the notation $c$ to denote $c()$. We will also allow other forms of the expression $f(t_1,\ldots,t_n)$ that unambiguously have the same meaning, such as $(a=b)$ in place of $=(a,b)$. Given a term $t$ and a variable symbol $x$, an occurrence of $x$ in $t$ is an index at which $x$ occurs in the tuple $t$.
  - Formulas
    A tuple of symbols $\varphi$ is called a formula if and only if there exists a tuple $(\varphi_1,\ldots,\varphi_n)$ of tuples of symbols for some non-zero natural number $n$, such that $\varphi$ is $\varphi_n$, and for all $i$ from $1$ to $n$, one, and thus exactly one, of the following is true:
    - 1. $\varphi_i$ is $P(t_1,\ldots,t_k)$, where $P$ is a $k$-ary predicate symbol in the signature and $t_1,\ldots,t_k$ are terms.
    - 2. $\varphi_i$ is $\neg\varphi_j$, where $j$ is among $1,\ldots,i-1$.
    - 3. $\varphi_i$ is $(\varphi_j\to\varphi_k)$, where $j$ and $k$ are among $1,\ldots,i-1$.
    - 4. $\varphi_i$ is $\forall x\varphi_j$, where $x$ is a variable symbol and $j$ is among $1,\ldots,i-1$.
    We call formulas that take the form $P(t_1,\ldots,t_k)$ atomic formulas. Given a formula $\varphi$ and a variable symbol $x$, we call $\forall x\varphi$ the closure of $\varphi$ with respect to $x$. We will allow other forms of the expression $P(t_1,\ldots,t_n)$ that unambiguously have the same meaning, such as $(a=b)$ in place of $=(a,b)$. By induction, given a formula $\varphi$,
    - at any index $i$, the number of left parentheses at or prior to $i$ is greater than or equal to the number of right parentheses at or prior to $i$, and
    - the number of left parentheses in $\varphi$ equals the number of right parentheses in $\varphi$.
    Thus at any index $i$ in $\varphi$, we can define the depth at $i$ to be the number of left parentheses at or prior to $i$ minus the number of right parentheses at or prior to $i$. By induction, given a formula $\varphi$,
    - every symbol $,$ and every symbol $\to$ in a formula has depth at least $1$,
    - for each left parenthesis in $\varphi$, let $i$ be its index and $d$ be its depth; then $d$ is non-zero and there exists a right parenthesis in $\varphi$ with index greater than $i$ and depth $d-1$, and
    - the last symbol of $\varphi$ is the unique right parenthesis in $\varphi$ with depth $0$.
    Thus for each left parenthesis in $\varphi$, with index $i$ and depth $d$, we can pair it up with the first right parenthesis in $\varphi$ with index greater than $i$ and depth $d-1$, which is said to close the left parenthesis, and we define the scope of this pair of parentheses to be the indexes of $\varphi$ that are greater than the index of the left parenthesis and less than the index of the right parenthesis. And by induction,
    - every symbol in the scope of the parentheses has depth at least $d$.
    Given a formula $\varphi$ and a variable symbol $x$, an occurrence of $x$ in $\varphi$ is an index at which $x$ occurs in the tuple $\varphi$ and is not immediately preceded by a quantifier. By induction:
    - The tuple of symbols formed by replacing an occurrence of a variable symbol $x$ in a formula $\varphi$ by a term $t$ is a formula.
    - The tuple of symbols formed by replacing the variable symbol $x$ in a quantifier $\forall x$ in a formula $\varphi$ by a variable symbol $y$ is a formula.
    Every occurrence of a variable symbol in a formula $\varphi$ is either free or bounded by a quantifier, determined recursively by the inductive steps $(\varphi_1,\ldots,\varphi_n)$ that form the formula $\varphi$:
    - 1. If $\varphi_i$ is an atomic formula, then every occurrence of every variable symbol in $\varphi_i$ is free.
    - 2. If $\varphi_i$ is $\neg\psi$ for some formula $\psi$, it preserves the freeness or boundedness of every occurrence of every variable symbol as in $\psi$.
    - 3. If $\varphi_i$ is $(\psi\to\theta)$ for some formulas $\psi$ and $\theta$, it preserves the freeness or boundedness of every occurrence of every variable symbol as in $\psi$ and in $\theta$.
    - 4. If $\varphi_i$ is $\forall x\psi$ for some variable symbol $x$ and formula $\psi$, then every occurrence in $\varphi_i$ corresponding to a free occurrence of $x$ in $\psi$ is bounded by the quantifier $\forall x$, as the first two symbols of $\varphi_i$, while preserving the freeness or boundedness of other occurrences of variable symbols as in $\psi$.
    By induction:
    - The freeness or boundedness of an occurrence in a formula is independent of the choice of inductive steps that form the formula.
    - Given a formula $\varphi$ and a formula $\varphi'$ formed by replacing an occurrence of a variable symbol in $\varphi$ by a term, $\varphi'$ preserves the freeness or boundedness of other occurrences of variable symbols, besides the replacements, as in $\varphi$.
    If a variable symbol has at least one free occurrence in a formula, we call the variable symbol a free variable of the formula. A formula with no free variables is called a sentence. For a formula $\varphi$, we denote the collection of free variables in $\varphi$ by $\text{free}(\varphi)$. Given a formula $\varphi$, a term $t$, and a variable symbol $x$, we recursively define the substitutability of $t$ for $x$ in $\varphi$ by the inductive steps $(\varphi_1,\ldots,\varphi_n)$ that form the formula $\varphi$:
    - 1. If $\varphi_i$ is an atomic formula, then $t$ is substitutable for $x$ in $\varphi_i$.
    - 2. If $\varphi_i$ is $\neg\psi$ for some formula $\psi$, then $t$ is substitutable for $x$ in $\varphi_i$ if and only if $t$ is substitutable for $x$ in $\psi$.
    - 3. If $\varphi_i$ is $(\psi\to\theta)$ for some formulas $\psi$ and $\theta$, then $t$ is substitutable for $x$ in $\varphi_i$ if and only if $t$ is substitutable for $x$ in $\psi$ and $\theta$.
    - 4. If $\varphi_i$ is $\forall y\psi$ for some variable symbol $y$ and formula $\psi$, then $t$ is substitutable for $x$ in $\varphi_i$ if and only if either $x$ does not occur free in $\varphi_i$, or $y$ does not occur in $t$ and $t$ is substitutable for $x$ in $\psi$.
    By induction:
    - The substitutability of a term $t$ for a variable symbol $x$ in a formula $\varphi$ is independent of the choice of inductive steps that form $\varphi$.
    - A term $t$ is substitutable for a variable symbol $x$ in a formula $\varphi$ if and only if no variable symbol in $t$ becomes bounded as a result of replacing all free occurrences of $x$ by $t$ in $\varphi$.
    We denote the formula formed by substituting $t$ for every free occurrence of $x$ in $\varphi$ by $\varphi[t/x]$, whether $t$ is substitutable for $x$ in $\varphi$ or not.
  - Abbreviations
    Let $x$ be an arbitrary variable symbol, let $\varphi$ and $\psi$ be arbitrary formulas, let $y$ be a variable symbol not occurring in $\varphi$ and distinct from $x$, then:
    - Conjunction $\land$: $(\varphi\land\psi)$ is an abbreviation for $\neg(\varphi\to\neg\psi)$, and we allow the expression $(\varphi_1\land\ldots\land\varphi_n)$ in place of $(\ldots(\varphi_1\land\varphi_2)\land\ldots\land\varphi_n)$ for natural number $n$ at least $2$.
    - Disjunction $\lor$: $(\varphi\lor\psi)$ is an abbreviation for $(\neg\varphi\to\psi)$, and we allow the expression $(\varphi_1\lor\ldots\lor\varphi_n)$ in place of $(\ldots(\varphi_1\lor\varphi_2)\lor\ldots\lor\varphi_n)$ for natural number $n$ at least $2$.
    - Biconditional $\leftrightarrow$: $(\varphi\leftrightarrow\psi)$ is an abbreviation for $((\varphi\to\psi)\land(\psi\to\varphi))$.
    - Existential Quantifier $\exists$: $\exists x\varphi$ is an abbreviation for $\neg\forall x\neg\varphi$.
    - Uniqueness Quantifier $\exists!$: $\exists!x\varphi$ is an abbreviation for $\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))$.
  A first-order theory contains a collection of formulas called axioms, including logical axioms and non-logical axioms, with the latter being theory-dependent. Axioms may be generated from an axiom schema, a rule to find a collection of axioms when given a signature.
- Deductive system
  Deductive systems are used to demonstrate, on a purely syntactic basis, that one formula is a logical consequence of a collection of formulas in a first-order theory. Given a signature and a collection of non-logical axioms, below is the deductive system we employ.
  - Logical axioms
    Let $x$ and $y$ be variable symbols and let $\varphi$ and $\psi$ be formulas, the following are the logical axioms:
    - LA1. Tautologies, with every propositional variable substituted by a formula (we can show that these are indeed formulas by using induction on the inductive steps that form the tautologies).
    - LA2. $(\forall x\varphi\to\varphi[t/x])$, where $t$ is a term substitutable for $x$ in $\varphi$.
    - LA3. $(\forall x(\varphi\to\psi)\to(\forall x\varphi\to\forall x\psi))$.
    - LA4. $(\varphi\to\forall x\varphi)$, where $x$ is not a free variable of $\varphi$.
    - LA5. $(x=x)$.
    - LA6. $((x=y)\to(\varphi\to\varphi'))$, where $\varphi'$ is obtained from $\varphi$ by replacing a sub-collection of free occurrences of $x$ in $\varphi$ by $y$, such that these occurrences of $y$ are free in $\varphi'$.
    LA5 and LA6 are called axioms of equality. Technically, every entry above is an axiom schema, as we may replace the variable symbols. But we will only refer to those quantifying over the collection of formulas as axiom schemas. Clearly, all logical axioms can be generated by an axiom schema, called the logical axiom schema.
  - Inference rules
    Let $\varphi$ and $\psi$ be formulas, the following are the inference rules:
    - MP. Modus ponens. $\cfrac{\varphi\quad(\varphi\to\psi)}{\psi}$.
    - UG. Universal generalization. For any variable symbol $x$, $\cfrac{\varphi}{\forall x\varphi}$.
  If we denote the collection of axioms in the theory, logical or non-logical, as $\Lambda$, a formal proof from a collection of formulas $\Gamma$ called premises in a theory $T$ is a tuple of formulas, constructed by the following inductive rules:
  - 1. An empty tuple of formulas is a formal proof from $\Gamma$ in $T$.
  - 2. For any natural number $k$, if $(\psi_1,\ldots,\psi_k)$ is a formal proof from $\Gamma$ in $T$ and $\psi$ is in $\Gamma$ or $\Lambda$, then $(\psi_1,\ldots,\psi_k,\psi)$ is a formal proof from $\Gamma$ in $T$.
  - 3. For any natural number $k$, if $(\psi_1,\ldots,\psi_k)$ is a formal proof from $\Gamma$ in $T$ and there is an inference rule $\cfrac{\varphi_1\quad\ldots\quad\varphi_n}{\psi}$ in $T$, such that all of $\varphi_1,\ldots,\varphi_n$ have appeared in said formal proof, then $(\psi_1,\ldots,\psi_k,\psi)$ is a formal proof from $\Gamma$ in $T$.
  For every non-empty formal proof $\mathcal P$ from $\Gamma$ in $T$, let $\psi^*$ be the last formula in $\mathcal P$, then we say $\mathcal P$ is a formal proof of $\psi^*$ from $\Gamma$ in $T$. For a formula $\varphi$ and a collection of formulas $\Gamma$, we say $\varphi$ is a syntactic consequence of $\Gamma$ in a theory $T$, denoted $T,\Gamma\vdash\varphi$ or $T\vdash\varphi$ if $\Gamma$ is empty, if and only if there is a formal proof of $\varphi$ from $\Gamma$ in $T$. A sentence $\theta$ is said to be provable in a theory $T$, if and only if $T\vdash\theta$. A theory $T$ is called consistent if and only if there is no sentence $\theta$ such that both $\theta$ and $\neg\theta$ are provable.
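The inductive rules for formal proofs can be read as a checking procedure. The following Python sketch is only an illustration under stated assumptions: formulas are encoded as nested tuples rather than as tuples of symbols, atomic formulas are kept opaque, the axioms $\Lambda$ and the premises $\Gamma$ are passed in as finite sets, and the function names (`is_formal_proof`, `proves`, and the two helpers) are hypothetical.

```python
# Formulas as nested tuples (an encoding assumed for this sketch):
#   ("atom", name)       an atomic formula, kept opaque here
#   ("not", phi)         negation
#   ("imp", phi, psi)    implication (phi -> psi)
#   ("all", x, phi)      universal quantification over the variable x

def follows_by_mp(psi, earlier):
    """True if some phi and (phi -> psi) both occur in `earlier`."""
    return any(("imp", phi, psi) in earlier for phi in earlier)

def follows_by_ug(psi, earlier):
    """True if psi is (all x phi) for some phi occurring in `earlier`."""
    return psi[0] == "all" and psi[2] in earlier

def is_formal_proof(lines, premises, axioms):
    """Rules 1-3: every line is a premise, an axiom, or follows from
    earlier lines by MP or UG. The empty proof is accepted."""
    earlier = []
    for psi in lines:
        ok = (psi in premises or psi in axioms
              or follows_by_mp(psi, earlier)
              or follows_by_ug(psi, earlier))
        if not ok:
            return False
        earlier.append(psi)
    return True

def proves(lines, premises, axioms, goal):
    """A non-empty formal proof is a proof of its last formula."""
    return (bool(lines) and lines[-1] == goal
            and is_formal_proof(lines, premises, axioms))

P, Q = ("atom", "P"), ("atom", "Q")
gamma = {P, ("imp", P, Q)}
proof = [P, ("imp", P, Q), Q, ("all", "x", Q)]
print(proves(proof, gamma, set(), ("all", "x", Q)))  # True
print(is_formal_proof([Q], gamma, set()))            # False
```

The sketch takes the axioms as an explicit finite set; recognizing instances of the logical axiom schemas would require a separate procedure.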
Semantics
The semantics determines the meanings behind well-formed expressions in the first-order theory by defining a universe and an interpretation for the signature. It also determines which expressions are semantic consequences of which expressions.
- Universe
  A universe is a non-empty collection of objects. It is the collection of objects over which variable symbols may range. Note that a first-order theory does not specify a universe.
- Interpretation
  An interpretation $I$ consists of two functions defined in the following way (we will denote both as $I$):
  - Given a universe $U$, let $P$ be an $n$-ary predicate symbol in the signature, $I(P)$ is a collection of $n$-tuples of $U$. Note that $I(P)$ may also be interpreted as a function from the collection of all $n$-tuples of $U$ to the collection of truth values, such that if an $n$-tuple of $U$ is in the collection $I(P)$, it is mapped to $\top$ by the function $I(P)$; otherwise, it is mapped to $\bot$.
  - Given a universe $U$, let $f$ be an $n$-ary function symbol in the signature, $I(f)$ is a function from the collection of all $n$-tuples of $U$ to $U$. In particular, for a constant symbol $c$, the domain of $I(c)$ contains only an empty tuple, so the range of $I(c)$ contains exactly one object in $U$, and we say that $c$ represents that object.
  Note that the theory also does not specify an interpretation.
A universe and an interpretation together form a structure. Suppose we have a structure $\mathcal M$, consisting of the universe $U$ and the interpretation $I$. Given a term $t$, there exists a tuple of steps $(t_1,\ldots,t_n)$ that forms $t$. We will recursively define functions $(F_1,\ldots,F_n)$ such that $F_i$ maps a $k_i$-tuple of $U$ to $U$, where $k_i$ is the number of occurrences of variable symbols of $t_i$, by the following procedure:
- 1. If $t_i$ is a variable symbol, then $F_i$ maps $(x)$ to $x$.
- 2. If $t_i$ takes the form $f(t_{s_1},\ldots,t_{s_l})$, then $F_i$ maps $(x_1,\ldots,x_k)$ to $I(f)(y_1,\ldots,y_l)$, where $y_j$ is the result of applying $F_{s_j}$ on the corresponding subtuple of $(x_1,\ldots,x_k)$.
Then we define $\mathcal F_t$ to be $F_n$. By induction, the function $\mathcal F_t$ is independent of the choice of inductive steps that form the term $t$.
Given a formula $\varphi$, there exists a tuple of steps $(\varphi_1,\ldots,\varphi_n)$ that forms $\varphi$. We will recursively define functions $(F_1,\ldots,F_n)$ such that $F_i$ maps a $k_i$-tuple of $U$ to a truth value, where $k_i$ is the number of occurrences of free variable symbols of $\varphi_i$, by the following procedure:
- 1. If $\varphi_i$ is an atomic formula that takes the form $P(t_1,\ldots,t_l)$, then $F_i$ maps $(x_1,\ldots,x_k)$ to $I(P)(y_1,\ldots,y_l)$, where $y_j$ is the result of applying $\mathcal F_{t_j}$ on the corresponding subtuple of $(x_1,\ldots,x_k)$.
- 2. If $\varphi_i$ takes the form $\neg\varphi_j$, then $F_i$ maps $(x_1,\ldots,x_k)$ to $\neg F_j(x_1,\ldots,x_k)$.
- 3. If $\varphi_i$ takes the form $(\varphi_j\to\varphi_{j'})$, then $F_i$ maps $(x_1,\ldots,x_k)$ to $F_j(x_1,\ldots,x_l)\to F_{j'}(x_{l+1},\ldots,x_k)$, where $(x_1,\ldots,x_l)$ and $(x_{l+1},\ldots,x_k)$ correspond to $\varphi_j$ and $\varphi_{j'}$.
- 4. If $\varphi_i$ takes the form $\forall x\varphi_j$, let $l$ be the number of occurrences of free variable symbols in $\varphi_j$, then $F_i$ maps $(x_1,\ldots,x_k)$ to $\top$ if for every object $u$ in $U$, every $l$-tuple $(y_1,\ldots,y_l)$ of $U$ that
  - matches $(x_1,\ldots,x_k)$ at the corresponding indexes and
  - is $u$ at other indexes
  is mapped to $\top$ by $F_j$; otherwise, $F_i$ maps $(x_1,\ldots,x_k)$ to $\bot$.
Then we define $\mathcal F_\varphi$ to be $F_n$. By induction, the function $\mathcal F_\varphi$ is independent of the choice of inductive steps that form the formula $\varphi$.
Given a collection $\mathcal C$ of variable symbols, an assignment function on $\mathcal C$ is a function from $\mathcal C$ to $U$. Since $U$ is by definition non-empty, given an arbitrary collection of variable symbols, there always exists an assignment function on it. Given a formula $\varphi$, we define an assignment function of $\varphi$ to be an assignment function on $\text{free}(\varphi)$. Then we can define a validity function $\delta_\varphi$ for $\varphi$, from the collection of assignment functions of $\varphi$ to the collection of truth values, that maps $\sigma$ to $\mathcal F_\varphi(\sigma(v_1),\ldots,\sigma(v_k))$, where $v_j$ is the variable symbol at the index of the $j$-th free occurrence in $\varphi$.
Given a structure $\mathcal M$ and a formula $\varphi$ with an assignment function $\sigma$, if $\delta_\varphi(\sigma)$ is $\top$, we say $\varphi$ is true in $\mathcal M$ with respect to $\sigma$; if $\delta_\varphi(\sigma)$ is $\bot$, we say $\varphi$ is false in $\mathcal M$ with respect to $\sigma$. If $\varphi$ is true in $\mathcal M$ with respect to every assignment function of $\varphi$, we say $\mathcal M$ satisfies $\varphi$. A sentence has a unique assignment function from an empty collection, so the sentence itself can be assigned a unique truth value by $\mathcal M$. And we can say a sentence is true or false in a structure depending on the assigned truth value. A structure satisfying every axiom of a theory, logical or non-logical, is called a model of the theory.
For a formula $\varphi$ and a collection of formulas $\Gamma$, we say $\varphi$ is a semantic consequence of $\Gamma$ in a theory $T$, denoted $T,\Gamma\vDash\varphi$ or $T\vDash\varphi$ if $\Gamma$ is empty, if and only if every model of $T$ that satisfies all members of $\Gamma$ also satisfies $\varphi$. A sentence $\theta$ is said to be a theorem of a theory $T$, if and only if $T\vDash\theta$.
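For a finite universe, truth with respect to an assignment function and satisfaction can be computed directly. The sketch below is a simplification under stated assumptions: it uses the usual recursion on an assignment (a dictionary from variable symbols to objects) instead of the occurrence-indexed functions $F_i$ above, it encodes terms and formulas as nested tuples, and it interprets $=$ as identity on a three-element universe. All names (`holds`, `satisfies`, `U`, `I`, and so on) are hypothetical.

```python
from itertools import product

# Encoding assumed for this sketch:
#   terms:    a variable is a str; ("app", f, (t1, ..., tn)) applies f
#   formulas: ("atom", P, (t1, ..., tn)), ("not", phi),
#             ("imp", phi, psi), ("all", x, phi)

def term_vars(t):
    """Variable symbols occurring in the term t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[2]))

def free(phi):
    """free(phi): the variable symbols with a free occurrence in phi."""
    tag = phi[0]
    if tag == "atom":
        return set().union(*(term_vars(a) for a in phi[2]))
    if tag == "not":
        return free(phi[1])
    if tag == "imp":
        return free(phi[1]) | free(phi[2])
    return free(phi[2]) - {phi[1]}           # ("all", x, body)

def eval_term(t, interp, sigma):
    """Object denoted by the term t under the assignment sigma."""
    if isinstance(t, str):
        return sigma[t]
    return interp[t[1]](*(eval_term(a, interp, sigma) for a in t[2]))

def holds(phi, universe, interp, sigma):
    """Truth value of phi in (universe, interp) with respect to sigma."""
    tag = phi[0]
    if tag == "atom":
        return tuple(eval_term(a, interp, sigma) for a in phi[2]) in interp[phi[1]]
    if tag == "not":
        return not holds(phi[1], universe, interp, sigma)
    if tag == "imp":
        return (not holds(phi[1], universe, interp, sigma)
                or holds(phi[2], universe, interp, sigma))
    x, body = phi[1], phi[2]                 # ("all", x, body)
    return all(holds(body, universe, interp, {**sigma, x: u}) for u in universe)

def satisfies(phi, universe, interp):
    """True if phi is true with respect to every assignment of free(phi)."""
    xs = sorted(free(phi))
    return all(holds(phi, universe, interp, dict(zip(xs, us)))
               for us in product(universe, repeat=len(xs)))

U = (0, 1, 2)
I = {"=": {(u, u) for u in U},               # equality read as identity
     "s": lambda u: (u + 1) % 3,             # a unary function symbol
     "c": lambda: 0}                         # a constant symbol
x_eq_x = ("atom", "=", ("x", "x"))
no_fixed_point = ("all", "x", ("not", ("atom", "=", (("app", "s", ("x",)), "x"))))
x_eq_c = ("atom", "=", ("x", ("app", "c", ())))
print(satisfies(x_eq_x, U, I))          # True
print(satisfies(no_fixed_point, U, I))  # True
print(satisfies(x_eq_c, U, I))          # False
print(holds(x_eq_c, U, I, {"x": 0}))    # True
```

The last two lines show the difference between being true with respect to one assignment function and being satisfied by the structure.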

A first-order theory provides the theory-dependent parts: a signature and a collection of axioms containing every logical axiom generated by the logical axiom schema.

Note. Given a formula $\varphi$, certain subtuples of $\varphi$, defined by a starting index and an ending index, are called subformulas of $\varphi$, determined recursively by the inductive steps $(\varphi_1,\ldots,\varphi_n)$ that form the formula $\varphi$:

1. If $\varphi_i$ is an atomic formula, then a subtuple $\phi$ of $\varphi_i$ is a subformula of $\varphi_i$ if and only if $\phi$ is $\varphi_i$.
2. If $\varphi_i$ is $\neg\psi$ for some formula $\psi$, then a subtuple $\phi$ of $\varphi_i$ is a subformula of $\varphi_i$ if and only if $\phi$ is $\varphi_i$ or $\phi$ is a subformula of $\psi$.
3. If $\varphi_i$ is $(\psi\to\theta)$ for some formulas $\psi$ and $\theta$, then a subtuple $\phi$ of $\varphi_i$ is a subformula of $\varphi_i$ if and only if $\phi$ is $\varphi_i$ or $\phi$ is a subformula of $\psi$ or $\theta$.
4. If $\varphi_i$ is $\forall x\psi$ for some variable symbol $x$ and formula $\psi$, then a subtuple $\phi$ of $\varphi_i$ is a subformula of $\varphi_i$ if and only if $\phi$ is $\varphi_i$ or $\phi$ is a subformula of $\psi$.

By induction:

Whether a subtuple $\psi$ of $\varphi$ is a subformula of $\varphi$ is independent of the choice of inductive steps that form $\varphi$.
A subformula of a formula is indeed a formula.
Suppose $\psi$ is a subformula of $\varphi$, then there exists a unique tuple $(\varphi_0,\ldots,\varphi_n)$ of subformulas of $\varphi$ such that $\psi$ is $\varphi_0$, $\varphi$ is $\varphi_n$, and for each $i$ from $1$ to $n$, one of the following is true:
- 1. $\varphi_i$ is $\neg\varphi_{i-1}$.
- 2. $\varphi_i$ is $(\varphi_{i-1}\to\theta)$ for some subformula $\theta$ of $\varphi$.
- 3. $\varphi_i$ is $(\theta\to\varphi_{i-1})$ for some subformula $\theta$ of $\varphi$.
- 4. $\varphi_i$ is $\forall x\varphi_{i-1}$ for some variable symbol $x$.
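The recursive definition of subformulas and the chain $(\varphi_0,\ldots,\varphi_n)$ can be sketched in Python. This is an illustration under assumptions: formulas are encoded as nested tuples, so a subformula occurrence is a node of the tuple tree rather than a pair of indexes, and the names `children`, `chains`, and `subformulas` are hypothetical.

```python
# Encoding assumed for this sketch:
#   ("atom", P, (t1, ..., tn)), ("not", phi), ("imp", phi, psi), ("all", x, phi)

def children(phi):
    """Immediate subformulas of phi, following rules 1-4 of the definition."""
    tag = phi[0]
    if tag == "atom":
        return ()
    if tag == "not":
        return (phi[1],)
    if tag == "imp":
        return (phi[1], phi[2])
    if tag == "all":
        return (phi[2],)
    raise ValueError(f"not a formula: {phi!r}")

def chains(phi):
    """Yield, for each subformula occurrence psi of phi, the tuple
    (phi_0, ..., phi_n) with phi_0 = psi and phi_n = phi."""
    yield (phi,)
    for child in children(phi):
        for chain in chains(child):
            yield chain + (phi,)

def subformulas(phi):
    """Subformula occurrences of phi, outermost first."""
    return [chain[0] for chain in chains(phi)]

Px = ("atom", "P", ("x",))
Qx = ("atom", "Q", ("x",))
body = ("imp", Px, ("not", Qx))
phi = ("all", "x", body)                      # forall x (P(x) -> not Q(x))
print(len(subformulas(phi)))                  # 5
print([c for c in chains(phi) if c[0] == Qx] == [(Qx, ("not", Qx), body, phi)])  # True
```

If the same formula occurs at two positions, it is listed once per occurrence, which matches the index-based definition above.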

By induction, given a formula $\varphi$,

for each quantifier $\forall x$ in $\varphi$, the subtuple of symbols from the quantifier symbol $\forall$ to the right parenthesis that closes the first left parenthesis after $\forall$ in $\varphi$ takes the form $\forall x\psi$, where $\psi$ and $\forall x\psi$ are both subformulas of $\varphi$, and the occurrences bounded by $\forall x$ in $\varphi$ are exactly the free occurrences of $x$ in $\psi$. We call $\psi$ the range of the quantifier $\forall x$.

Then by induction, given a formula $\varphi$,

for each quantifier $\forall x$ in $\varphi$, every occurrence of $x$ in the range of that quantifier $\forall x$ is bounded.

Given a formula $\varphi$ and a term $t$, the formula $\varphi'$ obtained by replacing an occurrence of a variable symbol in $\varphi$ by $t$ preserves the depths of other symbols and the ranges of the quantifiers in $\varphi$. And given a subformula $\psi$ of a formula $\varphi$, by induction on the inductive steps to construct $\varphi$ from $\psi$,

$\varphi$ preserves the ranges of the quantifiers in $\psi$.

Thus if terms $t$ and $t'$ are substitutable for a variable symbol $x$ in a formula $\varphi$, and we choose a sub-collection $\mathcal C$ of free occurrences of $x$ in $\varphi$, let $y$ be a variable symbol not occurring in $t$, $t'$, and $\varphi$, and let $\varphi'$ be the formula obtained from $\varphi$ by replacing the occurrences in $\mathcal C$ with $y$, then $\varphi'$ preserves the ranges of the quantifiers in $\varphi$ and these occurrences of $y$ are free in $\varphi'$. Hence by LA6 we have the axiom $((x=y)\to(\varphi\to\varphi'))$, which preserves the ranges of the quantifiers in $\varphi$ and in $\varphi'$. By what we have shown above, it is easy to see that $t$ is substitutable for $x$ in $((x=y)\to(\varphi\to\varphi'))$, thus by UG, LA2, and MP, we have $((t=y)\to(\varphi[t/x]\to\varphi'[t/x]))$. Again, by what we have shown above, $t'$ is substitutable for $y$ in $((t=y)\to(\varphi[t/x]\to\varphi'[t/x]))$, thus by UG, LA2, and MP, we have $((t=t')\to(\varphi[t/x]\to\varphi'[t/x][t'/y]))$. Note that given any term $t$, by LA5, UG, LA2, and MP, we have $(t=t)$. With these, we will extend the axioms of equality with the following:

LA5. $(t=t)$, where $t$ is a term.
LA6. $((t=t')\to(\varphi[t/x]\to\varphi'[t/x][t'/y]))$, where $x$ and $y$ are variable symbols, $t$ and $t'$ are terms, $\varphi$ and $\varphi'$ are formulas, such that
- $t$ and $t'$ are substitutable for $x$ in $\varphi$,
- $y$ does not occur in $t$, $t'$, and $\varphi$, and
- $\varphi'$ is obtained from $\varphi$ by replacing a sub-collection of free occurrences of $x$ in $\varphi$ with $y$.

In addition, we will allow LA1 through LA6 to represent arbitrary closures of the logical axioms, which are provable by simply applying UG on the axioms.
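The substitution $\varphi[t/x]$ and the substitutability condition used in LA2 and LA6 can be sketched as follows. This is a minimal Python illustration, assuming a nested-tuple encoding of terms and formulas; the function names are hypothetical, and `substitutable` follows the four recursive clauses of the definition of substitutability.

```python
# Encoding assumed for this sketch:
#   terms:    a variable is a str; ("app", f, (t1, ..., tn)) applies f
#   formulas: ("atom", P, (t1, ..., tn)), ("not", phi),
#             ("imp", phi, psi), ("all", x, phi)

def term_vars(t):
    """Variable symbols occurring in the term t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[2]))

def free(phi):
    """Variable symbols with a free occurrence in phi."""
    tag = phi[0]
    if tag == "atom":
        return set().union(*(term_vars(a) for a in phi[2]))
    if tag == "not":
        return free(phi[1])
    if tag == "imp":
        return free(phi[1]) | free(phi[2])
    return free(phi[2]) - {phi[1]}           # ("all", y, body)

def substitutable(t, x, phi):
    """Is the term t substitutable for the variable x in phi?"""
    tag = phi[0]
    if tag == "atom":
        return True
    if tag == "not":
        return substitutable(t, x, phi[1])
    if tag == "imp":
        return substitutable(t, x, phi[1]) and substitutable(t, x, phi[2])
    y, body = phi[1], phi[2]                 # ("all", y, body)
    return x not in free(phi) or (y not in term_vars(t)
                                  and substitutable(t, x, body))

def subst_term(s, t, x):
    """Replace every occurrence of the variable x in the term s by t."""
    if isinstance(s, str):
        return t if s == x else s
    return ("app", s[1], tuple(subst_term(a, t, x) for a in s[2]))

def subst(phi, t, x):
    """phi[t/x]: replace every free occurrence of x by t, whether or not
    t is substitutable for x in phi."""
    tag = phi[0]
    if tag == "atom":
        return ("atom", phi[1], tuple(subst_term(a, t, x) for a in phi[2]))
    if tag == "not":
        return ("not", subst(phi[1], t, x))
    if tag == "imp":
        return ("imp", subst(phi[1], t, x), subst(phi[2], t, x))
    y, body = phi[1], phi[2]
    return phi if y == x else ("all", y, subst(body, t, x))

phi = ("all", "y", ("not", ("atom", "=", ("x", "y"))))   # forall y not (x = y)
print(substitutable("z", "x", phi))   # True
print(substitutable("y", "x", phi))   # False: y would become bounded
print(subst(phi, "z", "x") == ("all", "y", ("not", ("atom", "=", ("z", "y")))))  # True
```

Substituting $y$ for $x$ in this example would yield $\forall y\neg(y=y)$, which is why LA2 requires substitutability.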

Note. Suppose $x$ is a variable symbol, and $\varphi,\psi$ are formulas.

By "for all $x$ with $\varphi$, we have $\psi$", we mean $\forall x(\varphi\to\psi)$.
By "there exists $x$ with $\varphi$, such that $\psi$", we mean $\exists x(\varphi\land\psi)$.
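Since every abbreviation unfolds into the primitive symbols $\neg$, $\to$, and $\forall$, the two phrases above can be expanded mechanically. The following Python sketch does this under assumptions: formulas are nested tuples, atomic formulas are opaque LaTeX snippets such as `\varphi`, and all function names are hypothetical. The uniqueness quantifier is left out because it additionally needs substitution and a fresh variable symbol.

```python
# Primitive constructors (nested-tuple encoding assumed for this sketch).
def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def forall(x, a):
    return ("all", x, a)

# Abbreviations, unfolded exactly as in the list of abbreviations.
def conj(a, b):
    return neg(imp(a, neg(b)))               # (a and b) is not (a -> not b)

def disj(a, b):
    return imp(neg(a), b)                    # (a or b) is (not a -> b)

def iff(a, b):
    return conj(imp(a, b), imp(b, a))

def exists(x, a):
    return neg(forall(x, neg(a)))            # exists x a is not forall x not a

# The two phrases of the note.
def for_all_with(x, a, b):
    return forall(x, imp(a, b))

def exists_with(x, a, b):
    return exists(x, conj(a, b))

def show(phi):
    """Render a formula as LaTeX source; atoms are LaTeX snippets."""
    tag = phi[0]
    if tag == "atom":
        return phi[1]
    if tag == "not":
        return r"\neg" + show(phi[1])
    if tag == "imp":
        return "(" + show(phi[1]) + r"\to" + show(phi[2]) + ")"
    return r"\forall " + phi[1] + show(phi[2])

A, B = ("atom", r"\varphi"), ("atom", r"\psi")
print(show(for_all_with("x", A, B)))  # \forall x(\varphi\to\psi)
print(show(exists_with("x", A, B)))   # \neg\forall x\neg\neg(\varphi\to\neg\psi)
print(show(disj(A, B)))               # (\neg\varphi\to\psi)
```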

Soundness
The deductive system we constructed has the following property: for any theory $T$, any formula $\theta$, and any collection of formulas $\Gamma$, we have $T,\Gamma\vdash\theta$ implies $T,\Gamma\vDash\theta$. The weaker version of this property with $\Gamma$ being empty is called soundness.

Proof. Suppose $T,\Gamma\vdash\theta$, meaning a formal proof $(\theta_1,\ldots,\theta_n)$ of $\theta$ from $\Gamma$ exists. Let $k$ in $1,\ldots,n$, and suppose for inductive hypothesis that $T,\Gamma\vDash\theta_j$ for all $j$ in $1,\ldots,k-1$, then:

If $\theta_k$ is an axiom or a member of $\Gamma$, then trivially, $T,\Gamma\vDash\theta_k$.
If $\theta_k$ is obtained from MP, then for some $i,j$ in $1,\ldots,k-1$, $\theta_j$ is $(\theta_i\to\theta_k)$. Let $\mathcal M$ be a model of $T$ that satisfies every member of $\Gamma$, then $\mathcal M$ satisfies $\theta_i$ and $\theta_j$. Let $\sigma_k$ be an assignment function of $\theta_k$, let $\sigma_j$ be an extension of $\sigma_k$ on $\text{free}(\theta_j)$, and let $\sigma_i$ be the restriction of $\sigma_j$ on $\text{free}(\theta_i)$. Then $\mathcal M$ satisfies $\theta_i$ with respect to $\sigma_i$ and $\theta_j$ with respect to $\sigma_j$. Suppose $(a_1,\ldots,a_m)$ are occurrences of free variables in $\theta_i$ and $(b_1,\ldots,b_l)$ are occurrences of free variables in $\theta_k$ in order of occurrence, then $(a_1,\ldots,a_m,b_1,\ldots,b_l)$ are occurrences of free variables in $\theta_j$ in order of occurrence. And both $\mathcal F_{\theta_i}(\sigma_i(a_1),\ldots,\sigma_i(a_m))$ and $\mathcal F_{\theta_j}(\sigma_j(a_1),\ldots,\sigma_j(b_l))$ are $\top$. Note that $\mathcal F_{\theta_i}(\sigma_j(a_1),\ldots,\sigma_j(a_m))$ is $\top$. If $\mathcal F_{\theta_k}(\sigma_k(b_1),\ldots,\sigma_k(b_l))$ is $\bot$, then $\mathcal F_{\theta_k}(\sigma_j(b_1),\ldots,\sigma_j(b_l))$ is $\bot$, implying $\mathcal F_{\theta_j}(\sigma_j(a_1),\ldots,\sigma_j(b_l))$ is $\bot$, a contradiction. Thus we have that $\mathcal F_{\theta_k}(\sigma_k(b_1),\ldots,\sigma_k(b_l))$ is $\top$. Hence $\mathcal M$ satisfies $\theta_k$. Therefore $T,\Gamma\vDash\theta_k$.
If $\theta_k$ is obtained from UG, then for some variable symbol $x$ and some $j$ in $1,\ldots,k-1$, $\theta_k$ is $\forall x\theta_j$. Let $\mathcal M$ be a model of $T$ that satisfies every member of $\Gamma$, then $\mathcal M$ satisfies $\theta_j$. Let $\sigma_k$ be an assignment function of $\theta_k$. Suppose $(a_1,\ldots,a_m)$ are occurrences of free variables in $\theta_j$ and $(b_1,\ldots,b_l)$ are occurrences of free variables in $\theta_k$ in order of occurrence. Let $u$ be an object in the universe determined by $\mathcal M$, and let $\sigma_j$ be
- an extension of $\sigma_k$ on $\text{free}(\theta_j)$ that maps $x$ to $u$, if $x$ is in $\text{free}(\theta_j)$;
- $\sigma_k$, if $x$ is not in $\text{free}(\theta_j)$.
Then $\mathcal M$ satisfies $\theta_j$ with respect to $\sigma_j$. Thus $\mathcal F_{\theta_j}(\sigma_j(a_1),\ldots,\sigma_j(a_m))$ is $\top$. Note that $(a_1,\ldots,a_m)$ matches $(b_1,\ldots,b_l)$ at the corresponding indexes and is $x$ at other indexes, thus $(\sigma_j(a_1),\ldots,\sigma_j(a_m))$ matches $(\sigma_k(b_1),\ldots,\sigma_k(b_l))$ at the corresponding indexes and is $u$ at other indexes. And we have that $\mathcal F_{\theta_k}(\sigma_k(b_1),\ldots,\sigma_k(b_l))$ is $\top$. Hence $\mathcal M$ satisfies $\theta_k$. Therefore $T,\Gamma\vDash\theta_k$.

In all cases, $T,\Gamma\vDash\theta_j$ for all $j$ in $1,\ldots,k$, which concludes the inductive step. By induction, $T,\Gamma\vDash\theta$. $\blacksquare$

Contradiction

For any theory $T$ and any collection of formulas $\Gamma$, if there exists a formula $\theta$ such that $T,\Gamma\vdash\theta$ and $T,\Gamma\vdash\neg\theta$, then we have a syntactic contradiction, in which case, for every formula $\psi$, we have $T,\Gamma\vdash\psi$.

Proof. Suppose $T,\Gamma\vdash\theta$ and $T,\Gamma\vdash\neg\theta$. Let $\psi$ be an arbitrary formula. By LA1 we have $T,\Gamma\vdash(\theta\to(\neg\theta\to\psi))$. Applying MP twice, we get $T,\Gamma\vdash\psi$. $\blacksquare$
For any theory $T$ and any collection of formulas $\Gamma$, if there exists a formula $\theta$ such that $T,\Gamma\vDash\theta$ and $T,\Gamma\vDash\neg\theta$, then we have a semantic contradiction, in which case, for every formula $\psi$, we have $T,\Gamma\vDash\psi$.

Proof. Suppose $T,\Gamma\vDash\theta$ and $T,\Gamma\vDash\neg\theta$. Let $\mathcal M$ be a model of $T$ that satisfies every member of $\Gamma$, then $\mathcal M$ satisfies both $\theta$ and $\neg\theta$. Let $\sigma$ be an assignment function of $\theta$, then $\sigma$ is also an assignment function of $\neg\theta$. Thus both $\theta$ and $\neg\theta$ are true in $\mathcal M$ with respect to $\sigma$. But then $\neg\theta$ is both true and false in $\mathcal M$ with respect to $\sigma$, implying that $\delta_{\neg\theta}(\sigma)$ is both $\top$ and $\bot$, which implies that $\top$ and $\bot$ are the same object, a (meta-logical) contradiction. Hence every proposition is both true and false, including that for every formula $\psi$, we have $T,\Gamma\vDash\psi$. $\blacksquare$

Deduction theorem
For any theory $T$, any formulas $\psi$ and $\theta$, and any collection of formulas $\Gamma$, if we have a formal proof of $\theta$ from $\Gamma\cup\{\psi\}$ such that for every free variable $x$ in $\psi$, there is no quantifier $\forall x$ in the formal proof, then we have $T,\Gamma\vdash(\psi\to\theta)$. Note that if $\psi$ is a sentence, then we have that $T,\Gamma\cup\{\psi\}\vdash\theta$ implies $T,\Gamma\vdash(\psi\to\theta)$.

Proof. Fix a formal proof of $\theta$ from $\Gamma\cup\{\psi\}$ such that for every free variable $x$ in $\psi$, there is no quantifier $\forall x$ in the formal proof. For any formula $\varphi$ in the formal proof, assume the inductive hypothesis that $T,\Gamma\vdash(\psi\to\varphi')$ for every formula $\varphi'$ before $\varphi$ in the formal proof, then there are four cases for $\varphi$:

If $\varphi$ is an axiom or a member of $\Gamma$, then $T,\Gamma\vdash\varphi$, and since $T,\Gamma\vdash(\varphi\to(\psi\to\varphi))$ by LA1, we have $T,\Gamma\vdash(\psi\to\varphi)$ by MP.
If $\varphi$ is the same formula as $\psi$, then since $T\vdash(\varphi\to\varphi)$ by LA1, we have $T,\Gamma\vdash(\psi\to\varphi)$.
If $\varphi$ is obtained from MP, then there exists some $\varphi^*$ such that both $\varphi^*$ and $(\varphi^*\to\varphi)$ have appeared earlier in the formal proof, and by inductive hypothesis we have $T,\Gamma\vdash(\psi\to\varphi^*)$ and $T,\Gamma\vdash(\psi\to(\varphi^*\to\varphi))$. Since $T,\Gamma\vdash((\psi\to(\varphi^*\to\varphi))\to((\psi\to\varphi^*)\to(\psi\to\varphi)))$ by LA1, applying MP twice, we get $T,\Gamma\vdash(\psi\to\varphi)$.
If $\varphi$ is obtained from UG, then $\varphi$ takes the form $\forall x\varphi^*$ for some variable symbol $x$ and some formula $\varphi^*$ that appeared earlier in the formal proof. Note that $x$ is not a free variable in $\psi$. By inductive hypothesis, we have $T,\Gamma\vdash(\psi\to\varphi^*)$, and by applying UG, we also have $T,\Gamma\vdash\forall x(\psi\to\varphi^*)$. Since we have $T,\Gamma\vdash(\forall x(\psi\to\varphi^*)\to(\forall x\psi\to\forall x\varphi^*))$ by LA3, applying MP, we get $T,\Gamma\vdash(\forall x\psi\to\forall x\varphi^*)$. Since $x$ does not occur free in $\psi$, by LA4 we have $T,\Gamma\vdash(\psi\to\forall x\psi)$. Since by LA1 we have $T,\Gamma\vdash((\psi\to\forall x\psi)\to((\forall x\psi\to\forall x\varphi^*)\to(\psi\to\forall x\varphi^*)))$, applying MP twice, we get $T,\Gamma\vdash(\psi\to\forall x\varphi^*)$, or $T,\Gamma\vdash(\psi\to\varphi)$.

We have shown that in every possible case, $T,\Gamma\vdash(\psi\to\varphi)$. This concludes the inductive step. By induction, we have $T,\Gamma\vdash(\psi\to\theta)$. $\blacksquare$

Additional rules of inference
Given a theory $T$ and premises $\Gamma$, an inference rule $\cfrac{\varphi_1\quad\ldots\quad\varphi_n}{\psi}$ is said to be a deduced rule if, whenever $\varphi_1,\ldots,\varphi_n$ have appeared in a formal proof from $\Gamma$ in $T$, there exists a formal proof of $\psi$ from $\Gamma$ in $T$ constructed from said formal proof. To shorten notation, we can use deduced rules as abbreviations for the steps of a formal proof that lead to the conclusion. Given any $T$ and any $\Gamma$, for arbitrary formulas $\varphi,\psi,\theta$, the following are deduced rules:

RI. Reiteration. $\cfrac{\varphi}{\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi$ appeared in it, then the following is a formal proof of $\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\varphi\to\varphi) & LA1 \\ 2 & \varphi & MP(*,1) \end{matrix} $$
NI. Double negation introduction. $\cfrac{\varphi}{\neg\neg\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi$ appeared in it, then the following is a formal proof of $\neg\neg\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\varphi\to\neg\neg\varphi) & LA1 \\ 2 & \neg\neg\varphi & MP(*,1) \end{matrix} $$
NE. Double negation elimination. $\cfrac{\neg\neg\varphi}{\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\neg\neg\varphi$ appeared in it, then the following is a formal proof of $\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\neg\neg\varphi\to\varphi) & LA1 \\ 2 & \varphi & MP(*,1) \end{matrix} $$
CP. Contraposition. $\cfrac{(\varphi\to\psi)}{(\neg\psi\to\neg\varphi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\to\psi)$ appeared in it, then the following is a formal proof of $(\neg\psi\to\neg\varphi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\to\psi)\to(\neg\psi\to\neg\varphi)) & LA1 \\ 2 & (\neg\psi\to\neg\varphi) & MP(*,1) \end{matrix} $$
MT. Modus tollens. $\cfrac{\neg\psi\quad(\varphi\to\psi)}{\neg\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\neg\psi,(\varphi\to\psi)$ appeared in it, then the following is a formal proof of $\neg\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\neg\psi\to\neg\varphi) & CP(*) \\ 2 & \neg\varphi & MP(*,1) \end{matrix} $$
ID. Implication distribution. $\cfrac{(\varphi\to(\psi\to\theta))}{((\varphi\to\psi)\to(\varphi\to\theta))}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\to(\psi\to\theta))$ appeared in it, then the following is a formal proof of $((\varphi\to\psi)\to(\varphi\to\theta))$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\to(\psi\to\theta))\to((\varphi\to\psi)\to(\varphi\to\theta))) & LA1 \\ 2 & ((\varphi\to\psi)\to(\varphi\to\theta)) & MP(*,1) \end{matrix} $$
CI. Conjunction introduction. $\cfrac{\varphi\quad\psi}{(\varphi\land\psi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi,\psi$ appeared in it, then the following is a formal proof of $(\varphi\land\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\varphi\to(\psi\to(\varphi\land\psi))) & LA1 \\ 2 & (\psi\to(\varphi\land\psi)) & MP(*,1) \\ 3 & (\varphi\land\psi) & MP(*,2) \end{matrix} $$
CE. Conjunction elimination. $\cfrac{(\varphi\land\psi)}{\varphi}$ and $\cfrac{(\varphi\land\psi)}{\psi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\land\psi)$ appeared in it, then the following is a formal proof of $\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\land\psi)\to\varphi) & LA1 \\ 2 & \varphi & MP(*,1) \end{matrix} $$ and the following is a formal proof of $\psi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\land\psi)\to\psi) & LA1 \\ 2 & \psi & MP(*,1) \end{matrix} $$
DI. Disjunction introduction. $\cfrac{\varphi}{(\varphi\lor\psi)}$ and $\cfrac{\psi}{(\varphi\lor\psi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi$ appeared in it, then the following is a formal proof of $(\varphi\lor\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\varphi\to(\varphi\lor\psi)) & LA1 \\ 2 & (\varphi\lor\psi) & MP(*,1) \end{matrix} $$ Let $\mathcal P$ be an arbitrary formal proof such that $\psi$ appeared in it, then the following is a formal proof of $(\varphi\lor\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\psi\to(\varphi\lor\psi)) & LA1 \\ 2 & (\varphi\lor\psi) & MP(*,1) \end{matrix} $$
DE. Disjunctive elimination. $\cfrac{(\varphi\lor\psi)\quad\neg\varphi}{\psi}$ and $\cfrac{(\varphi\lor\psi)\quad\neg\psi}{\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\lor\psi),\neg\varphi$ appeared in it, then the following is a formal proof of $\psi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\lor\psi)\land\neg\varphi) & CI(*,*) \\ 2 & (((\varphi\lor\psi)\land\neg\varphi)\to\psi) & LA1 \\ 3 & \psi & MP(1,2) \end{matrix} $$ Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\lor\psi),\neg\psi$ appeared in it, then the following is a formal proof of $\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\lor\psi)\land\neg\psi) & CI(*,*) \\ 2 & (((\varphi\lor\psi)\land\neg\psi)\to\varphi) & LA1 \\ 3 & \varphi & MP(1,2) \end{matrix} $$
HS. Hypothetical syllogism. $\cfrac{(\varphi\to\psi)\quad(\psi\to\theta)}{(\varphi\to\theta)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\to\psi),(\psi\to\theta)$ appeared in it, then the following is a formal proof of $(\varphi\to\theta)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\to\psi)\land(\psi\to\theta)) & CI(*,*) \\ 2 & (((\varphi\to\psi)\land(\psi\to\theta))\to(\varphi\to\theta)) & LA1 \\ 3 & (\varphi\to\theta) & MP(1,2) \end{matrix} $$
BI. Biconditional introduction. $\cfrac{(\varphi\to\psi)\quad(\psi\to\varphi)}{(\varphi\leftrightarrow\psi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\to\psi),(\psi\to\varphi)$ appeared in it, then the following is a formal proof of $(\varphi\leftrightarrow\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\to\psi)\land(\psi\to\varphi)) & CI(*,*) \\ 2 & (((\varphi\to\psi)\land(\psi\to\varphi))\to(\varphi\leftrightarrow\psi)) & LA1 \\ 3 & (\varphi\leftrightarrow\psi) & MP(1,2) \end{matrix} $$
BE. Biconditional elimination. $\cfrac{(\varphi\leftrightarrow\psi)}{(\varphi\to\psi)}$ and $\cfrac{(\varphi\leftrightarrow\psi)}{(\psi\to\varphi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\varphi\leftrightarrow\psi)$ appeared in it, then the following is a formal proof of $(\varphi\to\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\leftrightarrow\psi)\to((\varphi\to\psi)\land(\psi\to\varphi))) & LA1 \\ 2 & ((\varphi\to\psi)\land(\psi\to\varphi)) & MP(*,1) \\ 3 & (\varphi\to\psi) & CE(2) \end{matrix} $$ and the following is a formal proof of $(\psi\to\varphi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((\varphi\leftrightarrow\psi)\to((\varphi\to\psi)\land(\psi\to\varphi))) & LA1 \\ 2 & ((\varphi\to\psi)\land(\psi\to\varphi)) & MP(*,1) \\ 3 & (\psi\to\varphi) & CE(2) \end{matrix} $$
UI. Universal instantiation. For any variable symbol $x$ and any term $t$ substitutable for $x$ in $\varphi$, $\cfrac{\forall x\varphi}{\varphi[t/x]}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\forall x\varphi$ appeared in it, then the following is a formal proof of $\varphi[t/x]$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\forall x\varphi\to\varphi[t/x]) & LA2 \\ 2 & \varphi[t/x] & MP(*,1) \end{matrix} $$
ST. Substitution. For any variable symbol $x$ and any term $t$ substitutable for $x$ in $\varphi$, $\cfrac{\varphi}{\varphi[t/x]}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi$ appeared in it, then the following is a formal proof of $\varphi[t/x]$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \forall x\varphi & UG(*) \\ 2 & (\forall x\varphi\to\varphi[t/x]) & LA2 \\ 3 & \varphi[t/x] & MP(1,2) \end{matrix} $$
EG. Existential generalization. For any variable symbol $x$ and any term $t$ substitutable for $x$ in $\varphi$, $\cfrac{\varphi[t/x]}{\exists x\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\varphi[t/x]$ appeared in it, then the following is a formal proof of $\exists x\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\forall x\neg\varphi\to\neg\varphi[t/x]) & LA2 \\ 2 & (\neg\neg\varphi[t/x]\to\neg\forall x\neg\varphi) & CP(1) \\ 3 & \neg\neg\varphi[t/x] & NI(*) \\ 4 & \neg\forall x\neg\varphi & MP(3,2) \\ 5 & \exists x\varphi & RI(4) \end{matrix} $$
UD. Universal deduction. For any variable symbol $x$, $\cfrac{\forall x\varphi\quad(\varphi\to\psi)}{\forall x\psi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\forall x\varphi,(\varphi\to\psi)$ appeared in it, then the following is a formal proof of $\forall x\psi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \forall x(\varphi\to\psi) & UG(*) \\ 2 & (\forall x(\varphi\to\psi)\to(\forall x\varphi\to\forall x\psi)) & LA3 \\ 3 & (\forall x\varphi\to\forall x\psi) & MP(1,2) \\ 4 & \forall x\psi & MP(*,3) \end{matrix} $$
ED. Existential deduction. For any variable symbol $x$, $\cfrac{\exists x\varphi\quad(\varphi\to\psi)}{\exists x\psi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\exists x\varphi,(\varphi\to\psi)$ appeared in it, then the following is a formal proof of $\exists x\psi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \neg\forall x\neg\varphi & RI(*) \\ 2 & (\neg\psi\to\neg\varphi) & CP(*) \\ 3 & \forall x(\neg\psi\to\neg\varphi) & UG(2) \\ 4 & (\forall x(\neg\psi\to\neg\varphi)\to(\forall x\neg\psi\to\forall x\neg\varphi)) & LA3 \\ 5 & (\forall x\neg\psi\to\forall x\neg\varphi) & MP(3,4) \\ 6 & (\neg\forall x\neg\varphi\to\neg\forall x\neg\psi) & CP(5) \\ 7 & \neg\forall x\neg\psi & MP(1,6) \\ 8 & \exists x\psi & RI(7) \end{matrix} $$
UC. Universal conjunction. For any variable symbol $x$, $\cfrac{\forall x\varphi\quad\forall x\psi}{\forall x(\varphi\land\psi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\forall x\varphi,\forall x\psi$ appeared in it, then the following is a formal proof of $\forall x(\varphi\land\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\forall x\varphi\to\varphi) & LA2 \\ 2 & \varphi & MP(*,1) \\ 3 & (\forall x\psi\to\psi) & LA2 \\ 4 & \psi & MP(*,3) \\ 5 & (\varphi\land\psi) & CI(2,4) \\ 6 & \forall x(\varphi\land\psi) & UG(5) \end{matrix} $$
DL. DeMorgan's law. For any variable symbol $x$, $\cfrac{\neg\forall x\varphi}{\exists x\neg\varphi}$ and $\cfrac{\neg\exists x\varphi}{\forall x\neg\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\neg\forall x\varphi$ appeared in it, then the following is a formal proof of $\exists x\neg\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \forall x(\neg\neg\varphi\to\varphi) & LA1 \\ 2 & (\forall x(\neg\neg\varphi\to\varphi)\to(\forall x\neg\neg\varphi\to\forall x\varphi)) & LA3 \\ 3 & (\forall x\neg\neg\varphi\to\forall x\varphi) & MP(1,2) \\ 4 & (\neg\forall x\varphi\to\neg\forall x\neg\neg\varphi) & CP(3) \\ 5 & \neg\forall x\neg\neg\varphi & MP(*,4) \\ 6 & \exists x\neg\varphi & RI(5) \end{matrix} $$ Let $\mathcal P$ be an arbitrary formal proof such that $\neg\exists x\varphi$ appeared in it, then the following is a formal proof of $\forall x\neg\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \neg\neg\forall x\neg\varphi & RI(*) \\ 2 & \forall x\neg\varphi & NE(1) \end{matrix} $$
CQ. Commutative law of quantifiers. For any variable symbols $x$ and $y$, $\cfrac{\forall x\forall y\varphi}{\forall y\forall x\varphi}$ and $\cfrac{\exists x\exists y\varphi}{\exists y\exists x\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\forall x\forall y\varphi$ appeared in it, then the following is a formal proof of $\forall y\forall x\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \forall y\varphi & UI(*) \\ 2 & \varphi & UI(1) \\ 3 & \forall x\varphi & UG(2)\\ 4 & \forall y\forall x\varphi & UG(3) \end{matrix} $$ Let $\mathcal P$ be an arbitrary formal proof such that $\exists x\exists y\varphi$ appeared in it, then the following is a formal proof of $\exists y\exists x\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & \neg\forall x\neg\neg\forall y\neg\varphi & RI(*) \\ 2 & \forall x(\forall y\neg\varphi\to\neg\neg\forall y\neg\varphi) & LA1 \\ 3 & (\forall x(\forall y\neg\varphi\to\neg\neg\forall y\neg\varphi)\to(\forall x\forall y\neg\varphi\to\forall x\neg\neg\forall y\neg\varphi)) & LA3 \\ 4 & (\forall x\forall y\neg\varphi\to\forall x\neg\neg\forall y\neg\varphi) & MP(2,3) \\ 5 & (\neg\forall x\neg\neg\forall y\neg\varphi\to\neg\forall x\forall y\neg\varphi) & CP(4) \\ 6 & \neg\forall x\forall y\neg\varphi & MP(1,5) \\ 7 & (\forall y\forall x\neg\varphi\to\forall x\neg\varphi) & LA2 \\ 8 & (\forall x\neg\varphi\to\neg\varphi) & LA2 \\ 9 & (\forall y\forall x\neg\varphi\to\neg\varphi) & HS(7,8) \\ 10 & \forall y(\forall y\forall x\neg\varphi\to\neg\varphi) & UG(9) \\ 11 & (\forall y(\forall y\forall x\neg\varphi\to\neg\varphi)\to(\forall y\forall y\forall x\neg\varphi\to\forall y\neg\varphi)) & LA3 \\ 12 & (\forall y\forall y\forall x\neg\varphi\to\forall y\neg\varphi) & MP(10,11) \\ 13 & (\forall y\forall x\neg\varphi\to\forall y\forall y\forall x\neg\varphi) & LA4 \\ 14 & (\forall y\forall x\neg\varphi\to\forall y\neg\varphi) & HS(13,12) \\ 15 & \forall x(\forall y\forall x\neg\varphi\to\forall y\neg\varphi) & UG(14) \\ 16 & (\forall x(\forall y\forall x\neg\varphi\to\forall y\neg\varphi)\to(\forall x\forall y\forall x\neg\varphi\to\forall x\forall y\neg\varphi)) 
& LA3 \\ 17 & (\forall x\forall y\forall x\neg\varphi\to\forall x\forall y\neg\varphi) & MP(15,16) \\ 18 & (\forall y\forall x\neg\varphi\to\forall x\forall y\forall x\neg\varphi) & LA4 \\ 19 & (\forall y\forall x\neg\varphi\to\forall x\forall y\neg\varphi) & HS(18,17) \\ 20 & (\neg\forall x\forall y\neg\varphi\to\neg\forall y\forall x\neg\varphi) & CP(19) \\ 21 & \neg\forall y\forall x\neg\varphi & MP(6,20) \\ 22 & \exists y\neg\forall x\neg\varphi & DL(21) \\ 23 & \exists y\exists x\varphi & RI(22) \end{matrix} $$
EI. Existential instantiation. For any variable symbol $x$ that does not occur free in $\varphi$, $\cfrac{\exists x\varphi}{\varphi}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $\exists x\varphi$ appeared in it, then the following is a formal proof of $\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\neg\varphi\to\forall x\neg\varphi) & LA4 \\ 2 & (\neg\forall x\neg\varphi\to\neg\neg\varphi) & CP(1) \\ 3 & (\exists x\varphi\to\neg\neg\varphi) & RI(2) \\ 4 & (\neg\neg\varphi\to\varphi) & LA1 \\ 5 & (\exists x\varphi\to\varphi) & HS(3,4) \\ 6 & \varphi & MP(*,5) \end{matrix} $$
PC. Proof by contradiction. $\cfrac{(\neg\psi\to\theta)\quad(\neg\psi\to\neg\theta)}{\psi}$ and $\cfrac{((\varphi\land\neg\psi)\to\theta)\quad((\varphi\land\neg\psi)\to\neg\theta)} {(\varphi\to\psi)}$.

Proof. Let $\mathcal P$ be an arbitrary formal proof such that $(\neg\psi\to\theta),(\neg\psi\to\neg\theta)$ appeared in it, then the following is a formal proof of $\psi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\neg\theta\to\psi) & CP(*) \\ 2 & (\theta\to\psi) & CP(*) \\ 3 & ((\neg\theta\to\psi)\land(\theta\to\psi)) & CI(1,2) \\ 4 & (((\neg\theta\to\psi)\land(\theta\to\psi))\to\psi) & LA1 \\ 5 & \psi & MP(3,4) \end{matrix} $$ Let $\mathcal P$ be an arbitrary formal proof such that $((\varphi\land\neg\psi)\to\theta), ((\varphi\land\neg\psi)\to\neg\theta)$ appeared in it, then the following is a formal proof of $(\varphi\to\psi)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\neg\theta\to\neg(\varphi\land\neg\psi)) & CP(*) \\ 2 & (\theta\to\neg(\varphi\land\neg\psi)) & CP(*) \\ 3 & ((\neg\theta\to\neg(\varphi\land\neg\psi))\land(\theta\to\neg(\varphi\land\neg\psi))) & CI(1,2) \\ 4 & (((\neg\theta\to\neg(\varphi\land\neg\psi))\land(\theta\to\neg(\varphi\land\neg\psi)))\to\neg(\varphi\land\neg\psi)) & LA1 \\ 5 & \neg(\varphi\land\neg\psi) & MP(3,4) \\ 6 & (\neg(\varphi\land\neg\psi)\to(\varphi\to\psi)) & LA1 \\ 7 & (\varphi\to\psi) & MP(5,6) \end{matrix} $$
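
Example. As a sketch of how the deduced rules above abbreviate longer formal proofs (this derivation is only an illustration, not one of the listed rules), let $\mathcal P$ be an arbitrary formal proof such that $\forall x(\varphi\land\psi)$ appeared in it. Taking the term in UI to be $x$ itself, as is done in the proof of UC, the following is a formal proof of $\forall x\varphi$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & (\varphi\land\psi) & UI(*) \\ 2 & \varphi & CE(1) \\ 3 & \forall x\varphi & UG(2) \end{matrix} $$ Each cited deduced rule stands for the short formal proof given in its own entry, so these three lines expand to a formal proof that uses only LA1, LA2, MP, and UG.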

Proposition. For any formula $\varphi$, let $\bar\varphi$ denote any closure of $\varphi$ with respect to any tuple of variable symbols $\mathcal T$ such that all free variables in $\varphi$ are in $\mathcal T$, then $\bar\varphi$ is always a sentence. Given a theory $T$ and premises $\Gamma$, if $\cfrac{\varphi_1\quad\ldots\quad\varphi_n}{\psi}$ is a deduced rule, then

$\cfrac{}{\bar\psi}$ is a deduced rule if $n$ is equal to $0$, or
$\cfrac{}{(\bar\varphi_1\to\bar\psi)}$ is a deduced rule if $n$ is equal to $1$, or
$\cfrac{}{((\bar\varphi_1\land\ldots\land\bar\varphi_n)\to\bar\psi)}$ is a deduced rule if $n$ is greater than $1$.


Proof. Suppose $\cfrac{\varphi_1\quad\ldots\quad\varphi_n}{\psi}$ is a deduced rule. Take $\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_n\}$ as premises, then there exists a formal proof such that all of $\varphi_1,\ldots,\varphi_n$ appeared in it. Since $\cfrac{\varphi_1\quad\ldots\quad\varphi_n}{\psi}$ is a deduced rule, there exists a formal proof of $\psi$ and hence $\bar\psi$; that is, $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_n\}\vdash\bar\psi$. We will discuss by cases:

If $n$ is equal to $0$, then $T,\Gamma\vdash\bar\psi$. Hence, given $\Gamma$ as premises, there exists a formal proof of $\bar\psi$ constructed from any formal proof, so $\cfrac{}{\bar\psi}$ is a deduced rule.
If $n$ is equal to $1$, since $T,\Gamma\cup\{\bar\varphi_1\}\vdash\bar\psi$, by deduction theorem, we have $T,\Gamma\vdash(\bar\varphi_1\to\bar\psi)$. Hence, given $\Gamma$ as premises, there exists a formal proof of $(\bar\varphi_1\to\bar\psi)$ constructed from any formal proof, so $\cfrac{}{(\bar\varphi_1\to\bar\psi)}$ is a deduced rule.
If $n$ is greater than $1$, since $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_{n-1}\}\cup\{\bar\varphi_n\}\vdash\bar\psi$, by deduction theorem, we have $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_{n-1}\}\vdash(\bar\varphi_n\to\bar\psi)$. For any natural number $i$ from $1$ to $n-1$, proceeding downward from $i=n-1$, take the following as an inductive hypothesis: $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_i\}\vdash(\varphi\to\bar\psi)$, where $\varphi$ is $\bar\varphi_n$ if $i=n-1$, and $(\bar\varphi_{i+1}\land\ldots\land(\bar\varphi_{n-1}\land\bar\varphi_n)\ldots)$ otherwise. Given $i$ and the inductive hypothesis, by deduction theorem, we have $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_{i-1}\}\vdash(\bar\varphi_i\to(\varphi\to\bar\psi))$. By LA1, we have $((\psi_1\to(\psi_2\to\psi_3))\to((\psi_1\land\psi_2)\to\psi_3))$ for arbitrary formulas $\psi_1,\psi_2,\psi_3$, so by MP we have $T,\Gamma\cup\{\bar\varphi_1,\ldots,\bar\varphi_{i-1}\}\vdash((\bar\varphi_i\land\varphi)\to\bar\psi)$. This concludes the inductive step. Since the inductive hypothesis for $i=n-1$ is satisfied, by induction, we have $T,\Gamma\vdash((\bar\varphi_1\land\ldots\land(\bar\varphi_{n-1}\land\bar\varphi_n)\ldots)\to\bar\psi)$. By LA1, we have $((\psi_1\land\ldots\land\psi_n)\to(\psi_1\land\ldots\land(\psi_{n-1}\land\psi_n)\ldots))$ for arbitrary natural number $n$ and arbitrary formulas $\psi_1,\ldots,\psi_n$, so by HS we have $T,\Gamma\vdash((\bar\varphi_1\land\ldots\land\bar\varphi_n)\to\bar\psi)$. Hence, given $\Gamma$ as premises, there exists a formal proof of $((\bar\varphi_1\land\ldots\land\bar\varphi_n)\to\bar\psi)$ constructed from any formal proof, so $\cfrac{}{((\bar\varphi_1\land\ldots\land\bar\varphi_n)\to\bar\psi)}$ is a deduced rule. $\blacksquare$
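
Example. A minimal illustration, assuming that $\varphi$ and $\psi$ are formulas whose only free variable is $x$, and that the closure of a formula with respect to the tuple $(x)$ is formed by prefixing $\forall x$ (this reading of closure is an assumption about the definition given earlier). Applying the proposition to NI, where $n$ is equal to $1$, gives the deduced rule $$\cfrac{}{(\forall x\varphi\to\forall x\neg\neg\varphi)}$$ and applying it to CI, where $n$ is equal to $2$, gives the deduced rule $$\cfrac{}{((\forall x\varphi\land\forall x\psi)\to\forall x(\varphi\land\psi))}$$ which is the implication form of UC.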

Proposition. Given any theory and any premises, let $t_1,t_2,t_3$ be arbitrary terms, we have the following deduced rules:

E1. $\cfrac{}{(t_1=t_1)}$.
E2. $\cfrac{}{((t_1=t_2)\to(t_2=t_1))}$.
E3. $\cfrac{}{(((t_1=t_2)\land(t_2=t_3))\to(t_1=t_3))}$.

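
Example. As a sketch of how E2 and E3 combine with the earlier deduced rules, let $\mathcal P$ be an arbitrary formal proof such that $(t_1=t_2),(t_3=t_2)$ appeared in it. Using E2 with $t_3,t_2$ in place of $t_1,t_2$, the following is a formal proof of $(t_1=t_3)$ constructed from $\mathcal P$. $$ \begin{matrix} * & \mathcal P \\ 1 & ((t_3=t_2)\to(t_2=t_3)) & E2 \\ 2 & (t_2=t_3) & MP(*,1) \\ 3 & ((t_1=t_2)\land(t_2=t_3)) & CI(*,2) \\ 4 & (((t_1=t_2)\land(t_2=t_3))\to(t_1=t_3)) & E3 \\ 5 & (t_1=t_3) & MP(3,4) \end{matrix} $$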

Uniqueness
Let $x$ be a variable symbol, let $\varphi$ be a formula, and let $y,z$ be variable symbols not occurring in $\varphi$ such that $x,y,z$ are distinct, then $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\leftrightarrow\exists x(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ and $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\leftrightarrow(\exists x\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))))$$

Proof. We list only the key steps. Note that $\varphi[y/x][x/y]$ is the same formula as $\varphi$, and $\varphi[y/x][z/y]$ is the same formula as $\varphi[z/x]$. $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to((x=x)\to\varphi))$$ $$(((x=x)\to\varphi)\to\varphi)$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to(\varphi[y/x]\to(x=y)))$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to\forall y(\varphi[y/x]\to(x=y)))$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ $$((x=y)\to(\varphi\to\varphi[y/x]))$$ $$(\varphi\to((x=y)\to\varphi[y/x]))$$ $$(\forall y(\varphi[y/x]\to(x=y))\to(\varphi[y/x]\to(x=y)))$$ $$((\varphi\land\forall y(\varphi[y/x]\to(x=y)))\to(\varphi[y/x]\leftrightarrow(x=y)))$$ $$((\varphi\land\forall y(\varphi[y/x]\to(x=y)))\to\forall y(\varphi[y/x]\leftrightarrow(x=y)))$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\leftrightarrow(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\leftrightarrow\exists x(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ And we have the first formula.
$$(\exists x(\varphi\land\forall y(\varphi[y/x]\to(x=y)))\to\exists x\varphi)$$ $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\to\exists x\varphi)$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to((\varphi[y/x]\leftrightarrow(x=y))\land(\varphi[z/x]\leftrightarrow(x=z))))$$ $$(((\varphi[y/x]\leftrightarrow(x=y))\land(\varphi[z/x]\leftrightarrow(x=z)))\to((\varphi[y/x]\land\varphi[z/x])\to(y=z)))$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to((\varphi[y/x]\land\varphi[z/x])\to(y=z)))$$ $$(\forall y(\varphi[y/x]\leftrightarrow(x=y))\to\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))$$ $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\to\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))$$ $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\to(\exists x\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))))$$ $$(\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))\to((\varphi[y/x]\land\varphi)\to(x=y)))$$ $$(((\varphi[y/x]\land\varphi)\to(x=y))\to(\varphi\to(\varphi[y/x]\to(x=y))))$$ $$((\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))\land\varphi)\to(\varphi[y/x]\to(x=y)))$$ $$((\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))\land\varphi)\to\forall y(\varphi[y/x]\to(x=y)))$$ $$(\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))\to(\varphi\to\forall y(\varphi[y/x]\to(x=y))))$$ $$((\varphi\to\forall y(\varphi[y/x]\to(x=y)))\to(\varphi\to(\varphi\land\forall y(\varphi[y/x]\to(x=y)))))$$ $$((\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))\to(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ $$(\exists x(\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))\to\exists x(\varphi\land\forall y(\varphi[y/x]\to(x=y))))$$ Let $A$ denote $\varphi$ and $B$ denote $\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))$. 
$$(\forall x\neg(A\land B)\to\forall x(B\to\neg A))$$ $$(\forall x(B\to\neg A)\to(\forall xB\to\forall x\neg A))$$ $$(B\to\forall xB)$$ $$((\forall xB\to\forall x\neg A)\to(B\to\forall x\neg A))$$ $$((B\to\forall x\neg A)\to\neg(B\land\neg\forall x\neg A))$$ $$(\forall x\neg(A\land B)\to\neg(B\land\neg\forall x\neg A))$$ $$((B\land\neg\forall x\neg A)\to\neg\forall x\neg(A\land B))$$ $$((\exists x\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))\to\exists x(\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))))$$ $$((\exists x\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z)))\to\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y)))$$ $$(\exists x\forall y(\varphi[y/x]\leftrightarrow(x=y))\leftrightarrow(\exists x\varphi\land\forall y\forall z((\varphi[y/x]\land\varphi[z/x])\to(y=z))))$$ And we have the second formula. $\blacksquare$

Extension by definition
A first-order theory $T$ can be extended by the following procedures:

Definition of predicate symbols: Suppose we have a formula $\varphi$ with $n$ distinct free variables $x_1,\ldots,x_n$. We can construct $T'$ from $T$ by adding a new predicate symbol $P$ to the signature, with arity number $n$, and adding the following non-logical axiom called the defining axiom of $P$: $$(P(x_1,\ldots,x_n)\leftrightarrow\varphi)$$
Definition of function symbols: Suppose we have a formula $\varphi$ with $n+1$ distinct free variables $x_1,\ldots,x_n$ and $y$. Suppose $T\vdash\forall x_1\ldots\forall x_n\exists!y\varphi$ and each of $x_1,\ldots,x_n$ is substitutable for $y$ in $\varphi$. We can construct $T'$ from $T$ by adding a new function symbol $f$ to the signature, with arity number $n$, and adding the following non-logical axiom called the defining axiom of $f$: $$\varphi[f(x_1,\ldots,x_n)/y]$$ This procedure can be applied to constant symbols as well.
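
Example. A minimal sketch of the first procedure, assuming purely for illustration that the signature of $T$ contains a binary function symbol $g$ (a hypothetical symbol, not part of the text above). Take $\varphi$ to be $\exists z(g(x_1,z)=x_2)$, whose distinct free variables are $x_1,x_2$. The extension $T'$ then has a new predicate symbol $P$ with arity number $2$ and the defining axiom $$(P(x_1,x_2)\leftrightarrow\exists z(g(x_1,z)=x_2))$$ No provability condition is needed here, whereas the second procedure additionally requires $T\vdash\forall x_1\ldots\forall x_n\exists!y\varphi$ before the function symbol may be added.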

Extension by witness
Suppose we have a theory $T$ and an axiom schema $\mathcal A$ that generates the axioms of $T$, such that given any signature, the collection of axioms generated by the logical axiom schema is a sub-collection of the collection of axioms generated by $\mathcal A$. Define $T_0$ to be $T$. For every natural number $i$, $T_{i+1}$ is an extension of $T_i$ recursively defined by the following procedure. For every formula $\varphi$ of $T_i$, for every variable symbol $x$, for every permutation $(x_1,\ldots,x_n)$ of distinct free variables in $\exists x\varphi$, if each of $x_1,\ldots,x_n$ is substitutable for $x$ in $\varphi$, define a new $n$-ary function symbol $f$ and a non-logical axiom in $T_{i+1}$: $$(\exists x\varphi\to\varphi[f(x_1,\ldots,x_n)/x])$$ The function symbol $f$ is called a witness function, or a witness, for $\exists x\varphi$ in $T_{i+1}$, and the axiom above is called the witness axiom of $f$. Also, the axiom schema $\mathcal A$ should generate axioms for $T_{i+1}$ with its signature. The witness axioms and the axioms generated by $\mathcal A$ together form the axioms of $T_{i+1}$, which contains every logical axiom. Now define $T^*$ by:

The signature of $T^*$ is defined by:
- The collection of predicate symbols of $T^*$ is the same as that of $T$.
- The collection of function symbols of $T^*$ is the union of $\{F_0,F_1,\ldots\}$, where $F_i$ is the collection of function symbols of $T_i$ for every natural number $i$.
The collection of axioms of $T^*$ is the union of $\{\Lambda_0,\Lambda_1,\ldots\}$, where $\Lambda_i$ is the collection of axioms of $T_i$ for every natural number $i$. Note that this collection of axioms contains every logical axiom generated with the signature of $T^*$.

Then $T^*$ is a theory and an extension of $T$, called the witness-complete extension of $T$.
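
Example. A minimal sketch of one step of the construction, assuming purely for illustration that the signature of $T$ contains a binary predicate symbol $Q$ (a hypothetical symbol), and that $x$ and $x_1$ are distinct variable symbols. Take $\varphi$ to be $Q(x,x_1)$. The only free variable in $\exists x\varphi$ is $x_1$, and $x_1$ is substitutable for $x$ in $\varphi$ because $\varphi$ contains no quantifier, so $T_1$ has a new unary function symbol $f$ with the witness axiom $$(\exists xQ(x,x_1)\to Q(f(x_1),x_1))$$ If instead $\varphi$ is $Q(x,x)$, then $\exists x\varphi$ has no free variable, $n$ is $0$, and the witness is a constant symbol $c$ with the witness axiom $$(\exists xQ(x,x)\to Q(c,c))$$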

Note. A witness with its arguments is a term.

Note. From this point on, we may use meta-logical style proofs in place of formal proofs, but the underlying logic is presentable in formal proofs.

Lemma. Suppose $\psi$ is a subformula of $\varphi$, then the tuple of symbols $\varphi'$ formed by replacing $\psi$ with a formula $\theta$ in $\varphi$ is the same as the tuple of symbols $\varphi''$ obtained by applying the inductive steps to obtain $\varphi$ from $\psi$ on $\theta$, and they are the same formula.

Proposition. Suppose $\psi$ is a subformula of $\varphi$, and suppose $\varphi'$ is obtained by replacing $\psi$ with a formula $\psi'$ in $\varphi$, then we have $((\psi\leftrightarrow\psi')\to(\varphi\leftrightarrow\varphi'))$.

Proof. By the lemma above, $\varphi'$ is obtainable by applying the inductive steps to obtain $\varphi$ from $\psi$ on $\psi'$. We will denote these inductive steps as $\varphi_0,\ldots,\varphi_n$ for $\varphi$ and $\varphi'_0,\ldots,\varphi'_n$ for $\varphi'$, for some natural number $n$. For each $i$ from $1$ to $n$, suppose we have $(\varphi_{i-1}\leftrightarrow\varphi'_{i-1})$, then:

1. If $\varphi_i$ is $\neg\varphi_{i-1}$, then $\varphi'_i$ is $\neg\varphi'_{i-1}$. And we have $(\neg\varphi_{i-1}\leftrightarrow\neg\varphi'_{i-1})$ by contraposition.
2. If $\varphi_i$ is $(\varphi_{i-1}\to\theta)$ for some formula $\theta$, then $\varphi'_i$ is $(\varphi'_{i-1}\to\theta)$. If $(\varphi_{i-1}\to\theta)$, then by $(\varphi_{i-1}\leftrightarrow\varphi'_{i-1})$ we have $(\varphi'_{i-1}\to\theta)$. Similarly, if $(\varphi'_{i-1}\to\theta)$ then $(\varphi_{i-1}\to\theta)$. Thus we have $((\varphi_{i-1}\to\theta)\leftrightarrow(\varphi'_{i-1}\to\theta))$.
3. If $\varphi_i$ is $(\theta\to\varphi_{i-1})$ for some formula $\theta$, then $\varphi'_i$ is $(\theta\to\varphi'_{i-1})$. If $(\theta\to\varphi_{i-1})$, then by $(\varphi_{i-1}\leftrightarrow\varphi'_{i-1})$ we have $(\theta\to\varphi'_{i-1})$. Similarly, if $(\theta\to\varphi'_{i-1})$ then $(\theta\to\varphi_{i-1})$. Thus we have $((\theta\to\varphi_{i-1})\leftrightarrow(\theta\to\varphi'_{i-1}))$.
4. If $\varphi_i$ is $\forall x\varphi_{i-1}$ for some variable symbol $x$, then $\varphi'_i$ is $\forall x\varphi'_{i-1}$. By $(\varphi_{i-1}\leftrightarrow\varphi'_{i-1})$, we have $\forall x(\varphi_{i-1}\to\varphi'_{i-1})$ and $\forall x(\varphi'_{i-1}\to\varphi_{i-1})$, thus $(\forall x\varphi_{i-1}\leftrightarrow\forall x\varphi'_{i-1})$.

In all cases, we have $(\varphi_i\leftrightarrow\varphi'_i)$. Thus by induction, we have $((\varphi_0\leftrightarrow\varphi'_0)\to(\varphi_n\leftrightarrow\varphi'_n))$. Since $\varphi_0,\varphi'_0,\varphi_n,\varphi'_n$ are $\psi,\psi',\varphi,\varphi'$ respectively, we have $((\psi\leftrightarrow\psi')\to(\varphi\leftrightarrow\varphi'))$. $\blacksquare$
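
Example. For a quantifier-free illustration of the induction, let $\varphi$ be $(\neg\psi\to\theta)$ for some formula $\theta$. The inductive steps are $\varphi_0$, which is $\psi$; then $\varphi_1$, which is $\neg\psi$ (case 1); then $\varphi_2$, which is $(\neg\psi\to\theta)$ (case 2). Replacing $\psi$ with $\psi'$ gives $(\neg\psi'\to\theta)$ as $\varphi'$, and the proposition yields $$((\psi\leftrightarrow\psi')\to((\neg\psi\to\theta)\leftrightarrow(\neg\psi'\to\theta)))$$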

Lemma. Let $\varphi$ be a formula, and let $y$ be a variable symbol not occurring in $\varphi$, then for every quantifier $\forall x$ in $\varphi$, the formula $\varphi'$ obtained by

replacing the quantifier $\forall x$ in $\varphi$ by $\forall y$ and
replacing all occurrences originally bounded by $\forall x$ by $y$

keeps the freeness or boundedness of every occurrence of variable symbol as in $\varphi$, and we have $(\varphi\leftrightarrow\varphi')$.

Proposition. Let $\varphi$ be a formula, and let $\theta$ be a formula obtained from $\varphi$ by replacing all quantifiers and bounded variable symbols, while keeping the freeness or boundedness of every occurrence of variable symbol as in $\varphi$, then we have $(\varphi\leftrightarrow\theta)$.

Property
When we define a property $P(x_1,\ldots,x_n)$ in the form of a formula $\varphi$, where $n$ is a natural number, $x_1,\ldots,x_n$ are distinct variable symbols, and $\varphi$ is a formula whose free variables are among $x_1,\ldots,x_n$, we are defining a meta-logical function $\mathcal F$ that associates an $n$-tuple $(t_1,\ldots,t_n)$ of terms with a formula $\varphi'$ obtained from $\varphi$ by

replacing all quantifiers and bounded variable symbols with variable symbols not occurring in any of $(t_1,\ldots,t_n)$, while keeping the freeness or boundedness of every occurrence of variable symbol as in $\varphi$, and
replacing all free occurrences of $x_i$ by $t_i$ for $i$ from $1$ to $n$.

When we say $P(t_1,\ldots,t_n)$, or that $t_1,\ldots,t_n$ have the property $P$, we mean the formula $\mathcal F(t_1,\ldots,t_n)$. By the proposition above, any other choice of variable symbols for replacement in $\varphi$ as described above results in a formula equivalent to $\mathcal F(t_1,\ldots,t_n)$.

Note. Suppose we have a theory extended by witnesses. Given a formula $\varphi$ and a variable symbol $x$, we can obtain a formula $\varphi'$ by replacing all quantifiers and bounded variable symbols by variable symbols not occurring in $\varphi$ and distinct from $x$, while keeping the freeness or boundedness of every occurrence of variable symbol as in $\varphi$, so as to guarantee that $\exists x\varphi'$ permits witness axioms. Note that $\varphi$ and $\varphi'$ essentially represent the same property, in the sense that $t_1,\ldots,t_n$ have the property $P_\varphi$ if and only if $t_1,\ldots,t_n$ have the property $P_{\varphi'}$.

Notation. For the rest of this section, we may use the notation $\varphi(x_1,\ldots,x_n)$ to denote a formula $\varphi$ with free variable symbols among the distinct variable symbols $x_1,\ldots,x_n$ for some natural number $n$. And given terms $t_1,\ldots,t_n$, we may use $\varphi(t_1,\ldots,t_n)$ to denote $\varphi[t_1/x_1]\ldots[t_n/x_n]$.

Unique witness
Given a witness-complete theory, if we have $$(\psi(x_1,\ldots,x_n)\to\exists!y\varphi(x_1,\ldots,x_n,y))$$ where $x_1,\ldots,x_n,y$ are distinct variable symbols for some natural number $n$ and $\exists!y\varphi$ permits witnesses, then we have $$(\psi(x_1,\ldots,x_n)\to\forall z(\varphi(x_1,\ldots,x_n,z)\to(f(\vb x)=z)))$$ where $f$ is a witness defined by $\exists!y\varphi(x_1,\ldots,x_n,y)$, $\vb x$ represents the permutation of arguments chosen by the witness, and $z$ is the hidden variable symbol in the uniqueness statement. Given terms $t_1,\ldots,t_n$ with variable symbols among $w_1,\ldots,w_k$ for some natural number $k$, we can replace all quantifiers and bounded variable symbols in $\psi$ and $\varphi$ with new variable symbols, while keeping the freeness or boundedness of every occurrence of variable symbol, to obtain $\psi'$ and $\varphi'$. Let $y',z'$ be new distinct variable symbols, then we have the formulas $$(\psi'(x_1,\ldots,x_n)\to\exists!y'\varphi'(x_1,\ldots,x_n,y'))$$ $$(\psi'(x_1,\ldots,x_n)\to\forall z'(\varphi'(x_1,\ldots,x_n,z')\to(f(\vb x)=z')))$$ where the hidden variable symbol in the uniqueness statement is $z'$. Generalizing and instantiating $x_1,\ldots,x_n$ with new distinct variable symbols $x'_1,\ldots,x'_n$, then generalizing and instantiating $x'_1,\ldots,x'_n$ with $t_1,\ldots,t_n$, we have $$(\psi'(t_1,\ldots,t_n)\to\exists!y'\varphi'(t_1,\ldots,t_n,y'))$$ $$(\psi'(t_1,\ldots,t_n)\to\forall z'(\varphi'(t_1,\ldots,t_n,z')\to(f(\vb t)=z')))$$ where $\vb t$ represents the arguments obtained by replacing $x_j$ with $t_j$ in $\vb x$. Thus we have $$(\psi'(t_1,\ldots,t_n)\to\varphi'(t_1,\ldots,t_n,f'(\vb w)))$$ where $f'$ is a witness defined by $\exists!y'\varphi'(t_1,\ldots,t_n,y')$, and $\vb w$ represents the permutation of arguments chosen by the witness. Hence $$(\psi'(t_1,\ldots,t_n)\to(f(\vb t)=f'(\vb w)))$$ Note that if the original hypothesis is $$\exists!y\varphi(x_1,\ldots,x_n,y)$$ then we have $$(f(\vb t)=f'(\vb w))$$

Higher-order logic
Let the objects in the universe be called first-order objects with shape $()$, and the quantifiers that range over first-order objects be called first-order quantifiers. Then we may recursively define an $(n+1)$th-order theory from an $n$th-order theory by allowing quantifiers that range over relations of $n$th-order objects, in the form of collections of tuples of $n$th-order objects, with the size $k$ of the tuple and the shape $S_i$ allowed in each slot $i$ predetermined. These relations are then called $(n+1)$th-order objects with shape $(S_1,\ldots,S_k)$, and these quantifiers are called $(n+1)$th-order quantifiers. We may call propositional logic a $0$th-order theory as it allows no quantifiers.

Zermelo-Fraenkel set theory
Zermelo-Fraenkel set theory, or $ZF$, is a first-order theory, in which objects represented by variable symbols are called sets. $ZF$ is constructed from the following signature and axioms:

Signature
Two binary predicate symbols: Equality $=$ and Membership $\in$.
Axioms
By default, let $a,b,c,s,u,x,y,z,F,S,X,Y$ be any distinct variable symbols.
- $ZF1$. Axiom of extensionality. If $X$ and $Y$ contain the same elements, then $X$ and $Y$ are equal. $$\forall X\forall Y(\forall u((u\in X)\leftrightarrow(u\in Y))\to(X=Y))$$
- $ZF2$. Axiom of pairing. For any $a$ and $b$, there exists a set that contains exactly $a$ and $b$. $$\forall a\forall b\exists c\forall x((x\in c)\leftrightarrow((x=a)\lor (x=b)))$$
- $ZF3$. Axiom schema of specification. For any natural number $n$, let $X, Y, u, w_1, \ldots, w_n$ be any distinct variable symbols, and let $\varphi$ be any formula with free variables among $X, u, w_1, \ldots, w_n$. Given any arguments $w_1, \ldots, w_n$ and any set $X$, there exists a set $Y$ that contains exactly all members of $X$ that have the property represented by $\varphi$. $$\forall w_1\ldots\forall w_n\forall X\exists Y\forall u((u\in Y)\leftrightarrow((u\in X)\land\varphi))$$
- $ZF4$. Axiom of union. For any $X$ there exists a set $Y$ that is the union of all elements of $X$. $$\forall X\exists Y\forall u((u\in Y)\leftrightarrow\exists x((x\in X)\land(u\in x)))$$
- $ZF5$. Axiom of power set. For any $X$ there exists a set $Y$ that contains exactly all subsets of $X$. $$\forall X\exists Y\forall y((y\in Y)\leftrightarrow\forall x((x\in y)\to(x\in X)))$$
- $ZF6$. Axiom of infinity. There exists an infinite set $S$ that contains the empty set and for every $x$ in $S$, the successor of $x$ is also in $S$. $$\exists S(\forall z(\forall y\neg(y\in z)\to(z\in S))\land\forall x((x\in S)\to \forall X(\forall u((u\in X)\leftrightarrow((u\in x)\lor(u=x)))\to(X\in S))))$$
- $ZF7$. Axiom schema of replacement. For any natural number $n$, let $X, Y, x, y, w_1, \ldots, w_n$ be any distinct variable symbols, and let $\varphi$ be any formula with free variables among $X, x, y, w_1, \ldots, w_n$. Given any arguments $w_1, \ldots, w_n$ and any set $X$, if $\varphi$ defines a function from $X$, then there exists a set $Y$ that can be taken as the codomain of the function. $$\forall w_1\ldots\forall w_n\forall X(\forall x((x\in X)\to\exists!y\varphi)\to\exists Y\forall x((x\in X)\to\exists y((y\in Y)\land\varphi)))$$
- $ZF8$. Axiom of foundation. Every non-empty set $S$ contains an element that is disjoint from $S$. $$\forall S(\exists x(x\in S)\to\exists s((s\in S)\land\neg\exists y((y\in S)\land(y\in s))))$$

Zermelo-Fraenkel set theory with axiom of choice, or $ZFC$, is $ZF$ extended by the following non-logical axiom (we use the same notations as above):

$ZF9$. Axiom of choice. Given any family $F$ of pairwise disjoint non-empty sets, there exists a set $S$ that contains exactly one element in common with each of the sets in $F$. $$\forall F((\forall X((X\in F)\to\exists x(x\in X))\land \forall X\forall Y(((X\in F)\land(Y\in F))\to(\exists u((u\in X)\land(u\in Y))\to(X=Y))))\to \exists S\forall X((X\in F)\to\exists!x((x\in S)\land(x\in X))))$$

ZFW
We will use the witness-complete extension of $ZF$, called $ZFW$, as our foundation.

Note. The concept of sets is purely semantic. Syntactically, sets are represented by terms, not just by variable symbols. A witness together with its arguments represents a set.

Note. From this point on:

by default, we work on $ZFW$;
by default, distinct notations of variable symbols represent arbitrary distinct variable symbols;
we may not state the nature of a notation if it is obvious;
we may omit parentheses if the resulting notations are unambiguous;
when we introduce new variable symbols, unless otherwise specified, we implicitly assume that they are chosen to not occur in anything already introduced;
in general, any unintended collision of variable symbols should be implicitly avoided by replacement of quantifiers and bounded variables with new variable symbols, and/or generalization and instantiation on free variables with new variable symbols;
we may use notations like $\varphi(x_1,\ldots,x_n)$ for formulas or $y(x_1,\ldots,x_n)$ for terms to emphasize the variable symbols $x_1,\ldots,x_n$, which may or may not occur as free variables, and given terms $t_1,\ldots,t_n$, we may denote $\varphi[t_1/x_1]\ldots[t_n/x_n]$ as $\varphi(t_1,\ldots,t_n)$ and $y[t_1/x_1]\ldots[t_n/x_n]$ as $y(t_1,\ldots,t_n)$;
when we say "there exists $x$ such that $\varphi$", we may implicitly use the same notation $x$ to represent a witness with arguments of the statement;
we may use the notation $\{u\subseteq X|\varphi\}$ in place of $\{u\in\mathcal P(X)|\varphi\}$;
we may use the notation $\forall x_1,\ldots,x_n\in X\varphi$ in place of $\forall x_1\ldots\forall x_n(((x_1\in X)\land\ldots\land(x_n\in X))\to\varphi)$;
we may use the notation $\exists x_1,\ldots,x_n\in X\varphi$ in place of $\exists x_1\ldots\exists x_n(((x_1\in X)\land\ldots\land(x_n\in X))\land\varphi)$;
we will allow various notations of quantifiers, if unambiguous.

Notation. Let $t_1,t_2$ be arbitrary terms, let $x$ be a variable symbol not occurring in $t_1,t_2$, we will use the following notations:

non-equality: we use $(t_1\neq t_2)$ to denote $\neg(t_1=t_2)$
non-membership: we use $(t_1\notin t_2)$ to denote $\neg(t_1\in t_2)$
subset: we use $(t_1\subseteq t_2)$ to denote $\forall x((x\in t_1)\to(x\in t_2))$
proper subset: we use $(t_1\subset t_2)$ to denote $((t_1\subseteq t_2)\land\neg(t_2\subseteq t_1))$

Definition. We say $t_1$ is a superset of $t_2$ if and only if $t_2$ is a subset of $t_1$.

Empty set
There exists a unique set that contains no sets (show proof).

Proof. The formula we are trying to prove is $\exists!X\forall u\neg(u\in X)$. We will be skipping obvious steps in this proof. $$ \begin{matrix} 1 & \forall w_1\forall X\exists Y\forall u((u\in Y)\leftrightarrow((u\in X)\land\neg(w_1=w_1))) & ZF3 \\ 2 & \exists Y\forall u((u\in Y)\leftrightarrow((u\in X)\land\neg(w_1=w_1))) \\ 3 & \forall u((u\in f(w_1,X))\leftrightarrow((u\in X)\land\neg(w_1=w_1))) & \text{Witness }f \\ 4 & ((u\in f(w_1,X))\leftrightarrow((u\in X)\land\neg(w_1=w_1))) \\ 5 & (w_1=w_1) \\ 6 & \neg((u\in X)\land\neg(w_1=w_1)) \\ 7 & \neg(u\in f(w_1,X)) & \text{From }4,6 \\ 8 & \forall u\neg(u\in f(w_1,X)) \\ 9 & \exists X\forall u\neg(u\in X) \\ 10 & \forall u\neg(u\in\mathcal E) & \text{Witness }\mathcal E \\ 11 & \neg(u\in\mathcal E) \\ 12 & \forall X\forall Y(\forall u((u\in X)\leftrightarrow(u\in Y))\to(X=Y)) & ZF1 \\ 13 & (\forall u((u\in\mathcal E)\leftrightarrow(u\in Y))\to(\mathcal E=Y)) \\ 14 & \forall u((\neg(u\in\mathcal E)\land\neg(u\in Y))\to((u\in\mathcal E)\leftrightarrow(u\in Y))) & \text{Tautology} \\ 15 & (\forall u(\neg(u\in\mathcal E)\land\neg(u\in Y))\to\forall u((u\in\mathcal E)\leftrightarrow(u\in Y))) \\ 16 & (\neg(u\in\mathcal E)\to(\neg(u\in Y)\to(\neg(u\in\mathcal E)\land\neg(u\in Y)))) & \text{Tautology} \\ 17 & (\neg(u\in Y)\to(\neg(u\in\mathcal E)\land\neg(u\in Y))) & \text{From }11,16 \\ 18 & \forall u(\neg(u\in Y)\to(\neg(u\in\mathcal E)\land\neg(u\in Y))) \\ 19 & (\forall u\neg(u\in Y)\to\forall u(\neg(u\in\mathcal E)\land\neg(u\in Y))) \\ 20 & (\forall u\neg(u\in Y)\to(\mathcal E=Y)) & \text{From }13,15,19 \\ 21 & ((\mathcal E=Y)\to(\forall u\neg(u\in\mathcal E)\to\forall u\neg(u\in Y))) \\ 22 & ((\forall u\neg(u\in\mathcal E)\land((\mathcal E=Y)\to(\forall u\neg(u\in\mathcal E)\to\forall u\neg(u\in Y))))\to((\mathcal E=Y)\to\forall u\neg(u\in Y))) & \text{Tautology} \\ 23 & (\forall u\neg(u\in\mathcal E)\land((\mathcal E=Y)\to(\forall u\neg(u\in\mathcal E)\to\forall u\neg(u\in Y)))) & \text{From }10,21 \\ 24 & 
((\mathcal E=Y)\to\forall u\neg(u\in Y)) & \text{From }22,23 \\ 25 & (\forall u\neg(u\in Y)\leftrightarrow(\mathcal E=Y)) & \text{From }20,24 \\ 26 & \forall Y(\forall u\neg(u\in Y)\leftrightarrow(\mathcal E=Y)) \\ 27 & \exists X\forall Y(\forall u\neg(u\in Y)\leftrightarrow(X=Y)) \\ 28 & \exists!X\forall u\neg(u\in X) \end{matrix} $$

We call this set the empty set and denote it as $\emptyset$.

Note. From this point on, we will not use formal proofs to prove theorems. Instead, we will use meta-logical style reasoning, but the underlying logic is justifiable within $ZFW$.

Proposition. Equal sets contain the same elements. (show proof)

Note. Combining axiom of extensionality and the above proposition, two sets are equal if and only if they contain the same elements.

Proposition. Equal sets must simultaneously belong or not belong to a set. (show proof)

Set-builder notation
Let $n$ be a meta-logical natural number. Let $\varphi$ be a formula with free variables among $X,u,w_1,\ldots,w_n$. Then by axiom schema of specification, there exists, and thus uniquely exists, a set $Y$ such that a set $u$ is in $Y$ if and only if $u$ is in $X$ and $\varphi$ is satisfied. We will denote the set $Y$ as $$\{u\in X|\varphi\}$$

Enumeration notation
Let $n$ be a meta-logical natural number. Given sets $x_1,\ldots,x_n$, there exists a unique set $X$ such that a set $u$ is in $X$ if and only if $u$ is equal to one of $x_1,\ldots,x_n$ (show proof).

Proof. In first-order language, this means

$\exists!X\forall u((u\in X)\leftrightarrow((u=x_1)\lor\ldots\lor(u=x_n)))$ if $n$ is non-zero, and
$\exists!X\forall u(u\notin X)$ if $n$ is $0$.

The case $0$ is proven already, which is instantiated by the empty set. Suppose the case $n$ is proven. Given sets $x_1,\ldots,x_{S(n)}$, there exists some set $X_n$ such that a set $u$ is in $X_n$ if and only if $u$ is equal to one of $x_1,\ldots,x_n$. By axiom of pairing, there exists a set $Y_{S(n)}$ such that for all $u$, $u\in Y_{S(n)}$ if and only if $u=x_{S(n)}$ or $u=x_{S(n)}$, if and only if $u=x_{S(n)}$. Again by axiom of pairing, there exists a set $Z_{S(n)}$ such that for all $u$, $u\in Z_{S(n)}$ if and only if $u=X_n$ or $u=Y_{S(n)}$. By axiom of union, there exists a set $X_{S(n)}$ such that for all $u$, $u\in X_{S(n)}$ if and only if for some $w\in Z_{S(n)}$, $u\in w$. If a set $u$ is in $X_{S(n)}$, then for some $w\in Z_{S(n)}$, $u\in w$. Then $w=X_n$ or $w=Y_{S(n)}$, thus $u\in X_n$ or $u\in Y_{S(n)}$.

If $n$ is $0$, then $u\notin X_n$, thus $u\in Y_{S(n)}$, implying $u=x_{S(n)}$, which is one of $x_1,\ldots,x_{S(n)}$.
If $n$ is non-zero:
- if $u\in X_n$, then $(u=x_1)\lor\ldots\lor(u=x_n)$, thus $u$ is equal to one of $x_1,\ldots,x_{S(n)}$;
- if $u\in Y_{S(n)}$, then $u=x_{S(n)}$, which is one of $x_1,\ldots,x_{S(n)}$.

In all cases, $u$ is equal to one of $x_1,\ldots,x_{S(n)}$. For the other direction, suppose $u$ is equal to one of $x_1,\ldots,x_{S(n)}$, which means $(u=x_1)\lor\ldots\lor(u=x_{S(n)})$.

If $n$ is $0$, then we have $u=x_{S(n)}$, thus $u\in Y_{S(n)}$. Note that $Y_{S(n)}\in Z_{S(n)}$, thus for some $w\in Z_{S(n)}$, $u\in w$. And we have $u\in X_{S(n)}$.
If $n$ is non-zero, then either $(u=x_1)\lor\ldots\lor(u=x_n)$ or $u=x_{S(n)}$. We show above that if $u=x_{S(n)}$, then $u\in X_{S(n)}$. If $(u=x_1)\lor\ldots\lor(u=x_n)$, then $u\in X_n$. Note that $X_n\in Z_{S(n)}$, thus for some $w\in Z_{S(n)}$, $u\in w$. And we have $u\in X_{S(n)}$.

In all cases, $u\in X_{S(n)}$. We have shown that a set $u$ is in $X_{S(n)}$ if and only if $u$ is equal to one of $x_1,\ldots,x_{S(n)}$. By axiom of extensionality, this set is unique. Hence the case $S(n)$ is proven. By meta-logical induction, given any meta-logical natural number $n$, the case $n$ is proven. $\blacksquare$

We will denote this set as $$\{x_1,\ldots,x_n\}$$

Note. By definition, $\{\}=\emptyset$.
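The inductive construction in the proof above can be sketched in Python on hereditarily finite sets, modelling sets as `frozenset` values. This is only an illustration under that assumption; the helper names `pair`, `big_union` and `enumerate_set` are hypothetical and not part of the text.

```python
def pair(a, b):
    """Axiom of pairing: the set containing exactly a and b."""
    return frozenset([a, b])


def big_union(X):
    """Axiom of union: the set of all elements of elements of X."""
    return frozenset(u for x in X for u in x)


def enumerate_set(*xs):
    """Build {x_1, ..., x_n} following the induction in the proof."""
    result = frozenset()                     # case 0: the empty set
    for x in xs:
        singleton = pair(x, x)               # Y_{S(n)} = {x_{S(n)}}
        z = pair(result, singleton)          # Z_{S(n)} = {X_n, Y_{S(n)}}
        result = big_union(z)                # X_{S(n)} = union of Z_{S(n)}
    return result
```

Each pass through the loop corresponds to one inductive step: two uses of pairing followed by one use of union.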

Proposition. There is no set that contains itself. (show proof)

Note. The above proposition implies that there is no set of all sets, because such a set would contain itself.

Union
Given a set $X$, by axiom of union, there exists, and thus uniquely exists, a set $Y$ such that a set $u$ is in $Y$ if and only if there exists $x\in X$ such that $u\in x$. We will denote the set $Y$ as $$\bigcup X$$ Given sets $a,b$, we denote $\bigcup\{a,b\}$ as $a\cup b$.

Proposition. $x\in X$ implies $x\subseteq\bigcup X$. (show proof)

Proposition. $x\in a\cup b$ if and only if $x\in a$ or $x\in b$. (show proof)

Intersection
Given a non-empty set $X$, there exists a set $S_X$ in $X$, and the set $\{u\in S_X|\forall x((x\in X)\to(u\in x))\}$ is the unique set such that a set $u$ is in it if and only if for all $x\in X$ we have $u\in x$. We denote this set as $$\bigcap X$$ Given sets $a,b$, we denote $\bigcap\{a,b\}$ as $a\cap b$.

Proposition. Suppose $X$ is non-empty, then $x\in X$ implies $\bigcap X\subseteq x$. (show proof)

Proposition. $x\in a\cap b$ if and only if $x\in a$ and $x\in b$. (show proof)
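As a small illustration of the union and intersection definitions, the following Python sketch models sets as `frozenset` values of hereditarily finite sets. The function names are hypothetical; the intersection is computed exactly as in the definition, by filtering one chosen element $S_X$ of $X$.

```python
def big_union(X):
    """All u such that u is in x for some x in X."""
    return frozenset(u for x in X for u in x)


def big_intersection(X):
    """All u such that u is in x for every x in X; X must be non-empty."""
    if not X:
        raise ValueError("the intersection is only defined for non-empty X")
    s_x = next(iter(X))  # some set S_X in X
    return frozenset(u for u in s_x if all(u in x for x in X))


def union(a, b):
    """a ∪ b, defined as the union of {a, b}."""
    return big_union(frozenset([a, b]))


def intersection(a, b):
    """a ∩ b, defined as the intersection of {a, b}."""
    return big_intersection(frozenset([a, b]))
```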

Disjoint
Given sets $x,y$, we say $x$ and $y$ are disjoint if and only if $(x\cap y)=\emptyset$. Given a set $X$, we say $X$ is pairwise disjoint if and only if $$\forall a\forall b(((a\in X)\land(b\in X)\land((a\cap b)\neq\emptyset))\to(a=b))$$

Difference
Given sets $x,y$, we define $x\setminus y$ as $\{u\in x|u\notin y\}$.

Complement
If we are given a set $X$ that contains all objects in consideration with respect to the given context, then for a subset $S$ of $X$, the complement of $S$, denoted $S^c$, is $X\setminus S$.

Power set
Given a set $X$, by axiom of power set, there exists, and thus uniquely exists, a set $Y$ such that a set $u$ is in $Y$ if and only if $u$ is a subset of $X$. We denote the set $Y$ as $\mathcal P(X)$.

Ordered pair
Given sets $a,b$, an ordered pair $(a,b)$ is defined as $\{\{a\},\{a,b\}\}$. When we say a set $p$ is an ordered pair of sets $A$ and $B$, we mean for some $a\in A$ and $b\in B$, $p=(a,b)$.

Proposition. Given sets $a,b$, let $p$ denote $(a,b)$, then we have $$a=\bigcup\bigcap p\quad\text{and}\quad b=\bigcup\{u\in\bigcup p|((\bigcup p\neq\bigcap p)\to(u\notin\bigcap p))\}$$ (show proof)

Note. By the above proposition, given a set $p$, if there exist sets $a$ and $b$ such that $p=(a,b)$, then $$p=(\bigcup\bigcap p,\bigcup\{u\in\bigcup p|((\bigcup p\neq\bigcap p)\to(u\notin\bigcap p))\})$$

Proposition. Given sets $a,b,c,d$, if $(a,b)=(c,d)$, then $a=c$ and $b=d$. (show proof)
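The component formulas for ordered pairs can be checked on small examples. The sketch below assumes sets are modelled as Python `frozenset` values of hereditarily finite sets; the names `ordered_pair`, `first` and `second` are illustrative only.

```python
def big_union(X):
    return frozenset(u for x in X for u in x)


def big_intersection(X):
    s_x = next(iter(X))  # X is assumed non-empty
    return frozenset(u for u in s_x if all(u in x for x in X))


def ordered_pair(a, b):
    """Kuratowski pair (a, b) = {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])


def first(p):
    """a = union of the intersection of p."""
    return big_union(big_intersection(p))


def second(p):
    """b = union of {u in union(p) | union(p) != intersection(p) -> u not in intersection(p)}."""
    up, ip = big_union(p), big_intersection(p)
    # The implication (up != ip) -> (u not in ip) is written as a disjunction.
    return big_union(frozenset(u for u in up if up == ip or u not in ip))
```

The case $a=b$ is the reason for the implication in the second formula: then $\bigcup p=\bigcap p=\{a\}$ and the filter keeps $a$ itself.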

Cartesian product
Given sets $A$ and $B$, we define the Cartesian product of $A$ and $B$, denoted $A\times B$, as $\{u\in\mathcal P(\mathcal P(A\cup B))|\exists a\exists b((a\in A)\land(b\in B)\land(u=(a,b)))\}$. Then a set $p$ is in $A\times B$ if and only if $p$ is an ordered pair of $A$ and $B$ (show proof).
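The definition of $A\times B$ as a subset of $\mathcal P(\mathcal P(A\cup B))$ can be replayed literally on tiny sets. The following sketch assumes `frozenset` models of hereditarily finite sets; `power_set`, `ordered_pair` and `cartesian_product` are hypothetical helper names. It is exponential in the size of $A\cup B$, so it is only meant for very small inputs.

```python
from itertools import combinations


def power_set(X):
    """All subsets of X."""
    items = list(X)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def ordered_pair(a, b):
    """Kuratowski pair (a, b) = {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])


def cartesian_product(A, B):
    """{u in P(P(A ∪ B)) | u = (a, b) for some a in A and b in B}."""
    universe = power_set(power_set(A | B))
    return frozenset(
        u for u in universe if any(u == ordered_pair(a, b) for a in A for b in B)
    )
```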

Function
Given sets $f,X,Y$, we say $f$ is a function from $X$ to $Y$, denoted $f:X\to Y$, if and only if $f\in\mathcal P(X\times Y)$ and for all $x\in X$, there exists a unique $y$ such that $(x,y)\in f$. We will denote a witness of "there exists a unique $y$ such that $(x,y)\in f$", with arguments $f$ and $x$, by $f(x)$, so that given any $f:X\to Y$ and $x\in X$, we have $(x,f(x))\in f$, and for all $y$, $(x,y)\in f$ implies $f(x)=y$. If $f$ is a function from $X$ to $Y$, then $X$ is called the domain of $f$, $Y$ is called the codomain of $f$, and $\{y\in Y|\exists x\in X,f(x)=y\}$ is called the range of $f$.
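A function in this sense is just a set of pairs with a uniqueness property. The Python sketch below illustrates the definition under a simplifying assumption: Python tuples stand in for ordered pairs, and the names `is_function`, `apply` and `range_of` are illustrative.

```python
def is_function(f, X, Y):
    """f is a subset of X x Y and every x in X has exactly one y with (x, y) in f."""
    for p in f:
        if not (isinstance(p, tuple) and len(p) == 2 and p[0] in X and p[1] in Y):
            return False
    return all(len({y for (a, y) in f if a == x}) == 1 for x in X)


def apply(f, x):
    """The unique y with (x, y) in f, written f(x) in the text."""
    (y,) = {b for (a, b) in f if a == x}
    return y


def range_of(f, X, Y):
    """{y in Y | there exists x in X with f(x) = y}."""
    return frozenset(y for y in Y if any(apply(f, x) == y for x in X))
```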

Function without codomain
Given sets $f,X$, we say $f$ is a function from $X$ if for all $u\in f$, there exist $a,b$ such that $a\in X$ and $u=(a,b)$, and for all $x\in X$, there exists a unique $y$ such that $(x,y)\in f$. By axiom schema of replacement, there exists $Y$ such that $f$ is a function from $X$ to $Y$.

Equality of functions
Given $f:X\to Y$ and $g:X\to Z$, if for all $x\in X$, $f(x)=g(x)$, then $f=g$. (show proof)

Independence of range on codomain
Suppose $f$ is both a function from $X$ to $Y$ and a function from $X$ to $Z$, then $$\{y\in Y|\exists x\in X,f(x)=y\}=\{y\in Z|\exists x\in X,f(x)=y\}$$ (show proof)

Proposition. Given $f:X\to Y$, suppose $V$ is the range of $f$, then $f$ is a function from $X$ to $Z$ if and only if $V\subseteq Z$. (show proof)

Defining functions with formulas
Given $X$ and $Y$ and a formula $\varphi(x,y)$ such that for all $x\in X$ there exists a unique $y\in Y$ that satisfies $\varphi(x,y)$, $$\{u\in(X\times Y)|\forall x\forall y((x,y)=u\to\varphi(x,y))\}$$ defines a function $f$ from $X$ to $Y$ (show proof). And we say $f:X\to Y$ is defined by $\varphi(x,y)$ with $x$ as input and $y$ as output.

Proposition. Given $X$ and $Y$, suppose $y(x)$ is some term such that for all $x\in X$, $y(x)\in Y$, then we can define a function $f:X\to Y$ such that for all $x\in X$, $f(x)=y(x)$. (show proof)

Proposition. Given $X$, suppose $y(x)$ is some term, then for some $Y$, we can define a function $f:X\to Y$ such that for all $x\in X$, $f(x)=y(x)$. (show proof)

Implicit range notation
Suppose we have a term $y(x)$ and a formula $\varphi(x)$ such that there exists a set $X$ such that for all $x$, $x\in X$ if and only if $\varphi(x)$. Then for some $Y$, we can define a function $f:X\to Y$ such that for all $x\in X$, $f(x)=y(x)$. We use the notation $$\{y(x):\varphi(x)\}$$ to denote the range of $f$. Note that $u\in\{y(x):\varphi(x)\}$ if and only if there exists $x$ such that $\varphi(x)$ and $y(x)=u$ (show proof).

Proposition. Suppose we have terms $y_1(x),y_2(x)$ and a formula $\varphi(x)$ such that there exists a set $X$ with $x\in X$ if and only if $\varphi(x)$. If for all $x$ such that $\varphi(x)$, we have $y_1(x)=y_2(x)$, then $\{y_1(x):\varphi(x)\}=\{y_2(x):\varphi(x)\}$. (show proof)

Restriction
Given $f:X\to Y$, and $U\subseteq X$, $$\{u\in(U\times Y)|u\in f\}$$ is a function from $U$ to $Y$, called the restriction of $f$ on $U$, and denoted $f|_U$. Note that for all $x\in U$, $f|_U(x)=f(x)$.

Image
Given $f:X\to Y$, and $U\subseteq X$, the range of $f|_U$ is called the image of $f$ on $U$, denoted $f(U)$. Note that $y\in f(U)$ if and only if there exists $x\in U$ such that $f(x)=y$.

Preimage
Given $f:X\to Y$, and $V\subseteq Y$, $$\{x\in X|f(x)\in V\}$$ is called the preimage of $f$ on $V$, denoted $f^{-1}(V)$.

Proposition. Given $f:X\to Y$ and $x\in U\subseteq X$, we have $f(x)\in f(U)$. (show proof)

Proposition. Given $f:X\to Y$ and $V\subseteq U\subseteq X$, we have $f(V)\subseteq f(U)$. (show proof)

Proposition. Given $f:X\to Y$ and $T\subseteq S\subseteq Y$, we have $f^{-1}(T)\subseteq f^{-1}(S)$. (show proof)

Proposition. Given $f:X\to Y$, $U\subseteq X$, and $V\subseteq Y$, we have $U\subseteq f^{-1}(f(U))$ and $f(f^{-1}(V))\subseteq V$. (show proof)
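The two inclusions in the last proposition can be strict. The sketch below illustrates this with a small non-injective, non-surjective function, again using Python tuples as a stand-in for ordered pairs; the helper names are hypothetical.

```python
def apply(f, x):
    """The unique y with (x, y) in f."""
    (y,) = {b for (a, b) in f if a == x}
    return y


def image(f, U):
    """f(U) = {f(x) | x in U}, for U a subset of the domain."""
    return frozenset(apply(f, x) for x in U)


def preimage(f, X, V):
    """f^{-1}(V) = {x in X | f(x) in V}."""
    return frozenset(x for x in X if apply(f, x) in V)


X = frozenset({0, 1, 2})
Y = frozenset({'a', 'b', 'c'})
f = frozenset({(0, 'a'), (1, 'a'), (2, 'b')})

U = frozenset({0})
V = frozenset({'b', 'c'})
round_trip_U = preimage(f, X, image(f, U))   # contains 1 as well as 0
round_trip_V = image(f, preimage(f, X, V))   # misses 'c'
```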

Injectivity
Given $f:X\to Y$, if for all $x_1,x_2\in X$, $f(x_1)=f(x_2)$ implies $x_1=x_2$, then $f$ is said to be injective.

Surjectivity
Given $f:X\to Y$, if for all $y\in Y$, there exists $x\in X$ such that $f(x)=y$, then $f$ is said to be surjective.

Bijectivity
Given $f:X\to Y$, if $f$ is both injective and surjective, then $f$ is said to be bijective.

Proposition. Given $f:X\to Y$, $f$ is bijective if and only if for all $y\in Y$, there exists a unique $x\in X$ such that $f(x)=y$. (show proof)

Inverse function
Given bijective $f:X\to Y$, $$\{u\in(Y\times X)|\forall y\forall x((y,x)=u\to f(x)=y)\}$$ is a function from $Y$ to $X$ (show proof), called the inverse function of $f$, and denoted $f^{-1}$.

Proposition. Given bijective $f:X\to Y$, for all $x\in X$, $f^{-1}(f(x))=x$, and for all $y\in Y$, $f(f^{-1}(y))=y$. (show proof)

Proposition. Inverse functions are bijective. (show proof)

Proposition. Given bijective $f:X\to Y$, $$(f^{-1})^{-1}=f$$ (show proof)

Proposition. Given $f:X\to Y$ and $g:Y\to X$, if for all $x\in X$, $g(f(x))=x$, and for all $y\in Y$, $f(g(y))=y$, then $f$ is bijective and $g=f^{-1}$. (show proof)

Proposition. Given bijective $f:X\to Y$ and $U\subseteq X$, let $V=f(U)$, then $f|_U:U\to V$ is bijective and $$(f|_U)^{-1}=(f^{-1})|_V$$ (show proof)

Proposition. Given bijective $f:X\to Y$ and $V\subseteq Y$, the preimage of $f$ on $V$ equals the image of $f^{-1}$ on $V$. (show proof)

Note. The above proposition shows that the notation $f^{-1}(V)$ is unambiguous when $f$ is bijective.

Proposition. Given bijective $f:X\to Y$, $U\subseteq X$, and $V\subseteq Y$, we have $f^{-1}(f(U))=U$ and $f(f^{-1}(V))=V$. (show proof)
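The statements about inverse functions can be checked on a finite example. This sketch assumes Python tuples as ordered pairs and that `f` is already known to be a function from `X` to `Y`; the names `is_bijective` and `inverse` are illustrative.

```python
def apply(f, x):
    """The unique y with (x, y) in f."""
    (y,) = {b for (a, b) in f if a == x}
    return y


def image(f, U):
    return frozenset(apply(f, x) for x in U)


def preimage(f, X, V):
    return frozenset(x for x in X if apply(f, x) in V)


def is_bijective(f, Y):
    """Every y in Y has exactly one x with f(x) = y."""
    return all(len({x for (x, b) in f if b == y}) == 1 for y in Y)


def inverse(f, X, Y):
    """{(y, x) in Y x X | f(x) = y}, a function from Y to X when f is bijective."""
    return frozenset((y, x) for y in Y for x in X if apply(f, x) == y)
```

With these helpers, the preimage of $V$ under $f$ and the image of $V$ under $f^{-1}$ can be computed independently and compared.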

Composite function
Given $f:A\to B$ and $g:C\to D$, $$\{u\in (f^{-1}(C)\times D)|\forall x\forall y(u=(x,y)\to(y=g(f(x))))\}$$ is a function from $f^{-1}(C)$ to $D$ (show proof), denoted $(g\circ f)$. Note that for all $x\in f^{-1}(C)$, $$(g\circ f)(x)=g(f(x))$$

Proposition. Given $f:A\to B$, $g:C\to D$, $h:E\to F$, $$h\circ(g\circ f)=(h\circ g)\circ f$$ (show proof)

Proposition. Given bijective $f:X\to Y$ and bijective $g:Y\to Z$, $g\circ f$ is bijective and $$(g\circ f)^{-1}=f^{-1}\circ g^{-1}$$ (show proof)
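Because the composite is defined on $f^{-1}(C)$ rather than on all of $A$, it makes sense even when the codomain of $f$ and the domain of $g$ differ. The sketch below illustrates this with Python tuples as ordered pairs; `compose` and `swap` are hypothetical names, and the domain of `g` is read off from its pairs.

```python
def apply(f, x):
    """The unique y with (x, y) in f."""
    (y,) = {b for (a, b) in f if a == x}
    return y


def compose(g, f):
    """g ∘ f, defined on {x | f(x) is in the domain of g}."""
    dom_g = {c for (c, _) in g}
    return frozenset((x, apply(g, y)) for (x, y) in f if y in dom_g)


def swap(f):
    """Inverse of a bijective function, obtained by swapping each pair."""
    return frozenset((y, x) for (x, y) in f)


f = frozenset({(0, 'a'), (1, 'b'), (2, 'c')})
g = frozenset({('a', 10), ('b', 20), ('z', 30)})
h = frozenset({(10, 'p'), (30, 'q')})

left = compose(h, compose(g, f))    # h ∘ (g ∘ f)
right = compose(compose(h, g), f)   # (h ∘ g) ∘ f
```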

Note. With the notations we now have, we can rewrite axiom of infinity as $\exists S((\emptyset\in S)\land\forall x((x\in S)\to((x\cup\{x\})\in S)))$ (show proof). For an arbitrary set $X$, we will denote the formula $((\emptyset\in X)\land\forall x((x\in X)\to((x\cup\{x\})\in X)))$ by $\mathcal I(X)$.

Natural number
By axiom of infinity, we have $\exists S\mathcal I(S)$. Denote a witness of this by $C$, then we have $\mathcal I(C)$. Denote $\{A\in\mathcal P(C)|\mathcal I(A)\}$ by $I$, then $C\in I$, so $I$ is non-empty. Denote $\bigcap I$ by $N$, then a set $n$ is called a natural number if and only if $n\in N$. Note that $N$ is a term with no free variables and we have $\mathcal I(N)$ (show proof). Since for all $n\in N$, $n\cup\{n\}\in N$, we can define a function $S:N\to N$ such that for all $n\in N$, $S(n)=n\cup\{n\}$, called the successor function. Note that we can use meta-logical recursion to define a meta-logical function $f$ from meta-logical natural numbers to the collection of terms such that

$f(0)=\emptyset$, and
for every meta-logical natural number $n$, $f(S^*(n))$ and $S(f(n))$ are identical, where $S^*$ is the meta-logical successor function.

By meta-logical induction, for every meta-logical natural number $n$, we have $f(n)\in N$. Given a meta-logical natural number $n$, we use the numeral of $n$ to represent the term $f(n)$.

Proposition. For all $n\in N$, $S(n)\neq0$. (show proof)
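The numerals defined above can be built concretely. The following sketch models sets as Python `frozenset` values, with a Python `int` playing the role of the meta-logical natural number; the names `ZERO`, `successor` and `numeral` are illustrative.

```python
ZERO = frozenset()  # the numeral 0 is the empty set


def successor(n):
    """S(n) = n ∪ {n}."""
    return n | frozenset([n])


def numeral(k):
    """The term f(k): apply the successor k times to the empty set."""
    term = ZERO
    for _ in range(k):
        term = successor(term)
    return term
```

In this model the numeral of $k$ has exactly $k$ elements, namely the numerals of $0,\ldots,k-1$.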

Induction theorem
Suppose $\varphi(n)$ is a formula. If we have $\varphi(0)$, and for every $n\in N$, $\varphi(n)$ implies $\varphi(S(n))$, then for every $n\in N$ we have $\varphi(n)$. (show proof)

Proposition. For all $n\in N$, $n\subset N$. (show proof)

Proposition. For all $n\in N$, for all $x\in n$, we have $x\subset n$. (show proof)

Proposition. For all $n\in N$, if $n\neq0$, then there exists $m\in N$ such that $S(m)=n$. (show proof)

Proposition. Suppose $\varphi(n)$ is a formula. If $n\in N$, $\varphi(0)$, and for all $k\in n$, $\varphi(k)$ implies $\varphi(S(k))$, then we have $\varphi(n)$. (show proof)

Proposition. The successor function is injective. (show proof)

Proposition. Given arbitrary sets $m,n$, if $m,n\in N$ then we have exactly one of $m\in n$, $n\in m$, $m=n$. (show proof)

Proof. We first prove that if $m,n\in N$ then at least one of $m\in n$, $n\in m$, $m=n$ holds. Let $\varphi(m,n)$ denote the formula $((m\in n)\lor(n\in m)\lor(m=n))$, denote $\{s\in N|\varphi(x,s)\}$ by $C(x)$, and denote $\{r\in N|C(r)=N\}$ by $I$.

Since $0=0$, we have $\varphi(0,0)$, so $0\in C(0)$. Suppose $a\in N$ and $a\in C(0)$, then $\varphi(0,a)$,

if $0\in a$, then $0\in S(a)$, so $\varphi(0,S(a))$;
if $a\in 0$, we have a contradiction, hence $\varphi(0,S(a))$;
if $a=0$, then $0\in S(a)$, so $\varphi(0,S(a))$.

And we conclude that $a\in N$ and $a\in C(0)$ imply $S(a)\in C(0)$. So by induction, $C(0)=N$, hence $0\in I$.

Suppose $a\in N$ and $a\in I$, then $C(a)=N$, so $\varphi(a,0)$,

if $a\in 0$, we have a contradiction, hence $\varphi(S(a),0)$;
if $0\in a$, then $0\in S(a)$, so $\varphi(S(a),0)$;
if $a=0$, then $0\in S(a)$, so $\varphi(S(a),0)$.

And we conclude that $0\in C(S(a))$. Suppose $b\in N$ and $b\in C(S(a))$, then $\varphi(S(a),b)$,

if $S(a)\in b$, then $S(a)\in S(b)$, so $\varphi(S(a),S(b))$;
if $b\in S(a)$, then either $b=a$ or $b\in a$,
- if $b=a$, then $S(a)=S(b)$, so $\varphi(S(a),S(b))$;
- if $b\in a$, then $b\subset a$; since $C(a)=N$, we have $\varphi(a,S(b))$,
  - if $a\in S(b)$, then either $a=b$ or $a\in b$,
    - if $a=b$, then $S(a)=S(b)$, so $\varphi(S(a),S(b))$;
    - if $a\in b$, then $a\subset b$, a contradiction, hence $\varphi(S(a),S(b))$;
  - if $S(b)\in a$, then $S(b)\in S(a)$, so $\varphi(S(a),S(b))$;
  - if $a=S(b)$, then $S(b)\in S(a)$, so $\varphi(S(a),S(b))$;
if $S(a)=b$, then $S(a)\in S(b)$, so $\varphi(S(a),S(b))$.

And we conclude that $b\in N$ and $b\in C(S(a))$ imply $S(b)\in C(S(a))$. So by induction, $C(S(a))=N$, hence $S(a)\in I$. Since this was derived under the assumption that $a\in N$ and $a\in I$, we conclude that $a\in N$ and $a\in I$ imply $S(a)\in I$. Again by induction, we have that $I=N$. If $m,n\in N$, then $m\in I$, that is, $C(m)=N$, so $n\in C(m)$, that is, $\varphi(m,n)$. Hence we have at least one of $m\in n$, $n\in m$, $m=n$.

We now prove that if $m,n\in N$, then we have at most one of $m\in n$, $n\in m$, $m=n$. Suppose $m,n\in N$, then $m\in n$ implies $m\subset n$, and $n\in m$ implies $n\subset m$. Clearly, we have at most one of $m\subset n$, $n\subset m$, $m=n$. Hence we have at most one of $m\in n$, $n\in m$, $m=n$.

Combining the above results, we conclude that $m,n\in N$ implies that we have at least and at most one of $m\in n$, $n\in m$, $m=n$; that is, we have exactly one of $m\in n$, $n\in m$, $m=n$. $\blacksquare$

Order on natural numbers
For natural numbers $m$ and $n$, we denote $(m\in n)$ by $(m\lt n)$ or $(n\gt m)$, and $((m\in n)\lor(m=n))$ by $(m\le n)$ or $(n\ge m)$.

Proposition. For all $m,n,k\in N$, if $m\lt n$ and $n\lt k$, then $m\lt k$. (show proof)

Proposition. A non-empty set of natural numbers has a least element. (show proof)

Lemma. For all $m,n\in N$, if $m\lt n$, then $S(m)\le n$. (show proof)
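The order properties above can be spot-checked on the first few numerals. The sketch below is only a finite sanity check, not a proof; it assumes the `frozenset` model of von Neumann numerals, and all function names are hypothetical.

```python
ZERO = frozenset()


def successor(n):
    """S(n) = n ∪ {n}."""
    return n | frozenset([n])


def numeral(k):
    term = ZERO
    for _ in range(k):
        term = successor(term)
    return term


def less_than(m, n):
    """m < n is defined as m ∈ n."""
    return m in n


def less_or_equal(m, n):
    """m ≤ n is defined as (m ∈ n) or (m = n)."""
    return m in n or m == n


def exactly_one(m, n):
    """Trichotomy: exactly one of m ∈ n, n ∈ m, m = n holds."""
    return [m in n, n in m, m == n].count(True) == 1
```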

Recursion theorem
For arbitrary sets $X,a,f$, if $a\in X$ and $f:X\to X$, then there exists a unique function $u:N\to X$ such that $u(0)=a$ and for every $n\in N$, $u(S(n))=f(u(n))$. (show proof)

Proof. Consider $\{A\in\mathcal P(N\times X)|(((0,a)\in A)\land\forall n\forall x(((n,x)\in A)\to((S(n),f(x))\in A)))\}$, denoted $I$. Clearly $N\times X$ is in $I$, so $I$ is non-empty. So we can define $u$ to be $\bigcap I$, then $u$ is a subset of $N\times X$, and clearly,

$(0,a)\in u$, and
$(n,x)\in u$ implies $(S(n),f(x))\in u$.

Now consider $\{n\in N|\exists!x((n,x)\in u)\}$, denoted $C$.

Suppose $0\notin C$. Since $(0,a)\in u$, there exists some $b\neq a$ such that $(0,b)\in u$. Denote $u\setminus\{(0,b)\}$ by $u'$. Since $(0,a)\in u$ and $(0,a)\notin\{(0,b)\}$, $(0,a)\in u'$; if $(n,x)\in u'$, then $(n,x)\in u$, so $(S(n),f(x))\in u$, but then, since $n\in N$, $S(n)\neq0$, so $(S(n),f(x))\notin\{(0,b)\}$, and hence $(S(n),f(x))\in u'$. We have shown that $u'\in I$, so $u\subseteq u'$, a contradiction since $(0,b)\in u$ but $(0,b)\notin u'$. Therefore, $0\in C$.

Suppose there exists some $n^*\in C$ such that $S(n^*)\notin C$, then $n^*\in N$ and for some unique $x^*$ we have $(n^*,x^*)\in u$, thus $(S(n^*),f(x^*))\in u$. Since $S(n^*)\in N$, in order for $S(n^*)$ to not be in $C$, there has to be some $y\neq f(x^*)$, such that $(S(n^*),y)\in u$. Denote $u\setminus\{(S(n^*),y)\}$ by $u'$. Since $(0,a)\in u$ and $(0,a)\notin\{(S(n^*),y)\}$, $(0,a)\in u'$. If $(n,x)\in u'$, then $(n,x)\in u$, so $(S(n),f(x))\in u$.

Suppose $n=n^*$, then $x=x^*$ by uniqueness, so $f(x)\neq y$.
Suppose $n\neq n^*$, since $S$ is injective, we have $S(n)\neq S(n^*)$.

In both cases we have $(S(n),f(x))\notin\{(S(n^*),y)\}$, and we conclude that $(S(n),f(x))\in u'$. We have shown that $u'\in I$, so $u\subseteq u'$, a contradiction, since $(S(n^*),y)\in u$ but $(S(n^*),y)\notin u'$. Therefore, for all $n\in N$, $n\in C$ implies $S(n)\in C$.

By induction, for all $n\in N$, $n\in C$. Since $u$ is a subset of $N\times X$, and for all $n\in N$, there exists a unique $x$ such that $(n,x)\in u$, $u$ is a function from $N$ to $X$. Since $(0,a)\in u$, we have $u(0)=a$. Let $n\in N$, then $(n,u(n))\in u$, thus $(S(n),f(u(n)))\in u$, implying $u(S(n))=f(u(n))$.

Now suppose $u'$ is a function from $N$ to $X$ such that $u'(0)=a$ and for all $n\in N$, $u'(S(n))=f(u'(n))$. Then $u'(0)=a=u(0)$, and given $n\in N$, if $u'(n)=u(n)$, then $u'(S(n))=f(u'(n))=f(u(n))=u(S(n))$. Thus by induction, for all $n\in N$, $u'(n)=u(n)$. Hence $u'=u$. $\blacksquare$
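
The function $u$ given by the recursion theorem can be sketched computationally as iterating $f$ starting from $a$. This is only an illustration under the assumption that Python integers stand in for elements of $N$; the name `recursive_function` is hypothetical.

```python
from typing import Callable, TypeVar

X = TypeVar("X")


def recursive_function(a: X, f: Callable[[X], X]) -> Callable[[int], X]:
    """Return u with u(0) = a and u(n + 1) = f(u(n))."""
    def u(n: int) -> X:
        value = a                # u(0) = a
        for _ in range(n):
            value = f(value)     # u(S(k)) = f(u(k))
        return value
    return u


# Example: a = 1 and f doubles its argument, so u(n) is 2 to the power n.
doubling = recursive_function(1, lambda x: 2 * x)
print(doubling(0))   # 1
print(doubling(5))   # 32
```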

Addition of natural numbers
For every natural number $m$, by the recursion theorem we can define a function $S_m:N\to N$ such that $S_m(0)=m$ and for every $n\in N$, $S_m(S(n))=S(S_m(n))$. Then the formula $\forall a\forall b((x=(a,b))\to(y=S_a(b)))$ defines a function $+:(N\times N)\to N$. Given natural numbers $m,n$, we denote $+((m,n))$ by $m+n$. Note that $m+n=S_m(n)$, thus $m+S(n)=S(m+n)$.
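
As a sketch of this definition, the following code builds $S_m$ by recursion on the second argument, using only the successor. Python integers stand in for natural numbers, and the helper names `S`, `S_m` and `add` are assumptions made for illustration.

```python
def S(n: int) -> int:
    """Successor."""
    return n + 1


def S_m(m: int):
    """Return the function S_m with S_m(0) = m and S_m(S(n)) = S(S_m(n))."""
    def go(n: int) -> int:
        return m if n == 0 else S(go(n - 1))
    return go


def add(m: int, n: int) -> int:
    """m + n is defined as S_m(n)."""
    return S_m(m)(n)


print(add(3, 4))   # 7
```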

Proposition. $0$ is an additive identity for natural numbers: for all $n\in N$, $n+0=n$ and $0+n=n$. (show proof)

Proposition. Addition of natural numbers is associative: for all $m,n,k\in N$, $(m+n)+k=m+(n+k)$, and commutative: for all $m,n\in N$, $m+n=n+m$. (show proof)

Proposition. For all $m,n,k\in N$, if $m\lt n$, then $m+k\lt n+k$ and $k+m\lt k+n$. (show proof)

Proposition. For all $m,n,k\in N$, $m+n=0$ if and only if $m=0$ and $n=0$. (show proof)

Proposition. For all $m,n\in N$, if $m\le n$, then there exists a unique $k\in N$ such that $m+k=n$. (show proof)

Subtraction on natural numbers
For all $p\in\{u\in N\times N|\exists m,n\in N,m\ge n\land u=(m,n)\}$, denoted $D$, there exists a unique $k\in N$ such that there exist $m,n\in N$ such that $p=(m,n)$ and $n+k=m$. Thus we can define a function $-:D\to N$ by the formula "there exist $m,n\in N$ such that $p=(m,n)$ and $n+k=m$" with $p$ as input and $k$ as output. Then given $m,n\in N$ such that $m\ge n$, $(m,n)\in D$, and we have $n+-((m,n))=m$. If we denote $-((m,n))$ by $m-n$, then we have $n+(m-n)=m$.
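
The definition of subtraction as "the unique $k$ with $n+k=m$" can be sketched as a search. This is an illustrative sketch only; `subtract` is a hypothetical name, and raising an error outside $D$ is our own choice for modelling a function whose domain is $D$.

```python
def subtract(m: int, n: int) -> int:
    """Return the unique k with n + k == m; defined only when m >= n."""
    if m < n:
        raise ValueError("(m, n) is not in the domain D")
    k = 0
    while n + k != m:    # terminates because m >= n
        k += 1
    return k


print(subtract(7, 3))   # 4
```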

Multiplication of natural numbers
For every natural number $m$, by the recursion theorem we can define a function $P_m:N\to N$ such that $P_m(0)=0$ and for every $n\in N$, $P_m(S(n))=S_m(P_m(n))$. Then the formula $\forall a\forall b((x=(a,b))\to(y=P_a(b)))$ defines a function $\times:(N\times N)\to N$. Given natural numbers $m,n$, we denote $\times((m,n))$ by $m\times n$, or simply $mn$. Note that $mn=P_m(n)$, thus $mS(n)=m+(mn)$.

Note. By default, multiplication has priority over addition. Thus an expression like $m+mn$ is equivalent to $m+(mn)$.

Proposition. For natural numbers, multiplication with $0$ equals $0$: for all $n\in N$, $n0=0$ and $0n=0$. (show proof)

Proposition. $1$ is a multiplicative identity for natural numbers: for all $n\in N$, $n1=n$ and $1n=n$. (show proof)

Proposition. Multiplication distributes over addition for natural numbers: for all $m,n,k\in N$, $m(n+k)=mn+mk$ and $(m+n)k=mk+nk$. (show proof)

Proposition. Multiplication of natural numbers is associative: for all $m,n,k\in N$, $(mn)k=m(nk)$, and commutative: for all $m,n\in N$, $mn=nm$. (show proof)

Proposition. For all $m,n,k\in N$, if $k\neq0$ and $m\lt n$, then $mk\lt nk$ and $km\lt kn$. (show proof)

Proposition. For all $m,n\in N$, $mn=0$ if and only if $m=0$ or $n=0$. (show proof)

Proposition. For all $m,n\in N$, if $n\neq0$, then there exists a unique pair of natural numbers $q,r$ such that $r\lt n$ and $m=nq+r$. (show proof)

Division of natural numbers
For all $p\in N\times(N\setminus\{0\})$, there exists a unique $s\in N\times N$ such that there exist $m,n,q,r\in N$ such that $p=(m,n)$, $s=(q,r)$, $r\lt n$, and $m=nq+r$. Thus we can define a function $\divsymbol:N\times(N\setminus\{0\})\to N\times N$ by the formula "there exist $m,n,q,r\in N$ such that $p=(m,n)$, $s=(q,r)$, $r\lt n$, and $m=nq+r$" with $p$ as input and $s$ as output. Note that we can define a function that maps a pair of natural numbers $p$ to the first element, denoted $p_0$, and a function that maps a pair of natural numbers $p$ to the second element, denoted $p_1$. We call the composite function $p\mapsto\divsymbol(p)_0$ natural number division and the composite function $p\mapsto\divsymbol(p)_1$ modulo operation. Given natural number $m$ and non-zero natural number $n$, we denote $\divsymbol((m,n))_0$ by $m\divsymbol n$ and $\divsymbol((m,n))_1$ by $m\bmod n$, which are called the quotient and remainder of the division of $m$ by $n$. Then we have $m\bmod n\lt n$ and $m=n(m\divsymbol n)+(m\bmod n)$.
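
A minimal sketch of the pair $(q,r)$ produced by division: repeated subtraction finds the unique $q,r$ with $r\lt n$ and $m=nq+r$. The name `div_pair` is hypothetical, and Python integers stand in for natural numbers.

```python
def div_pair(m: int, n: int) -> tuple[int, int]:
    """Return (q, r) with r < n and m == n * q + r, for n != 0."""
    if n == 0:
        raise ValueError("n must be non-zero")
    q, r = 0, m
    while r >= n:            # invariant: m == n * q + r
        q, r = q + 1, r - n
    return q, r


q, r = div_pair(17, 5)
print(q, r)   # 3 2
```

The first component plays the role of the quotient and the second the role of $m\bmod n$.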

Proposition. For all $m\in N$ and $n\in N\setminus\{0\}$, $mn\divsymbol n=m$ and $mn\bmod n=0$. (show proof)

Proposition. For all $k,m\in N$ and $n\in N\setminus\{0\}$,

$(m\bmod n)\bmod n=m\bmod n$;
$(m+k)\bmod n=((m\bmod n)+(k\bmod n))\bmod n$;
$(mk)\bmod n=((m\bmod n)(k\bmod n))\bmod n$.

(show proof)
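
The three identities above can be spot-checked numerically. The sketch below is a finite check on small values, not a proof; it assumes that Python's `%` on non-negative integers agrees with $\bmod$ as defined here, and `check_mod_identities` is a hypothetical helper.

```python
def check_mod_identities(limit: int) -> bool:
    """Check the three identities for all m, k < limit and 0 < n < limit."""
    for n in range(1, limit):
        for m in range(limit):
            for k in range(limit):
                if (m % n) % n != m % n:
                    return False
                if (m + k) % n != ((m % n) + (k % n)) % n:
                    return False
                if (m * k) % n != ((m % n) * (k % n)) % n:
                    return False
    return True


print(check_mod_identities(12))   # True
```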

Proposition. For all $k,m\in N$ and $n\in N\setminus\{0\}$, if $k\lt m$, then $k\divsymbol n\le m\divsymbol n$. (show proof)

Exponentiation of natural numbers
For every natural number $m$, by the recursion theorem we can define a function $E_m:N\to N$ such that $E_m(0)=1$ and for every $n\in N$, $E_m(S(n))=P_m(E_m(n))$. Then the formula $\forall a\forall b((x=(a,b))\to(y=E_a(b)))$ defines a function $\wedge:(N\times N)\to N$. Given natural numbers $m,n$, we denote $\wedge((m,n))$ by $m\wedge n$, or simply $m^n$. Note that $m^n=E_m(n)$, thus $m^{S(n)}=m(m^n)$.

Note. By default, exponentiation has priority over multiplication. Thus an expression like $mm^n$ is equivalent to $m(m^n)$.

Note. With this definition, $0^0=1$.
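
The recursive definitions of $P_m$ and $E_m$ stack on top of addition in the same way. As a sketch (with hypothetical names `add`, `mul`, `power`, and Python integers standing in for natural numbers), each function recurses on its second argument exactly as in the defining equations:

```python
def add(m: int, n: int) -> int:
    """S_m(0) = m, S_m(S(n)) = S(S_m(n))."""
    return m if n == 0 else add(m, n - 1) + 1


def mul(m: int, n: int) -> int:
    """P_m(0) = 0, P_m(S(n)) = S_m(P_m(n)) = m + P_m(n)."""
    return 0 if n == 0 else add(m, mul(m, n - 1))


def power(m: int, n: int) -> int:
    """E_m(0) = 1, E_m(S(n)) = P_m(E_m(n)) = m * E_m(n)."""
    return 1 if n == 0 else mul(m, power(m, n - 1))


print(mul(4, 3))     # 12
print(power(2, 5))   # 32
print(power(0, 0))   # 1
```

The last line reflects the note above: since $E_m(0)=1$ for every $m$, the base case gives $0^0=1$.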

Proposition. For all $k\in N$, $1^k=1$. (show proof)

Proposition. Exponentiation of natural numbers has the following properties: for all $m,n,k\in N$,

$(mn)^k=m^kn^k$
$m^{n+k}=m^nm^k$
$m^{nk}={(m^n)}^k$

(show proof)

Proposition. For all $m,n\in N$, $m^n=0$ if and only if $m=0$ and $n\neq0$. (show proof)

Proposition. For all $m,n,k\in N$, if $k\neq0$ and $m\lt n$, then $m^k\lt n^k$. (show proof)

Proposition. For all $m,n,k\in N$, if $k\gt 1$ and $m\lt n$, then $k^m\lt k^n$. (show proof)

Tuple
Given a set $X$ and a natural number $n$, we call a function from $n$ to $X$ an $n$-tuple of $X$. Note that the set $\mathcal P(n\times X)$ contains every $n$-tuple of $X$, hence the collection of all $n$-tuples of $X$ is a set, denoted $X^n$. Given an $n$-tuple $t$ of $X$ and a natural number $k\lt n$, we usually denote $t(k)$ by $t_k$, and we may denote $t$ as $(t_0,\ldots,t_{n-1})$. Note that an $n$-tuple $t$ of $X$ may also refer to a function from $\{1,\ldots,n\}$ to $X$, in which case we say $t$ is an $n$-tuple with indexes shifted by $1$, and we may write $t$ as $(t_1,\ldots,t_n)$.

Sequence
Given a set $X$, we call a function from $N$ to $X$ a sequence of $X$. Note that the set $\mathcal P(N\times X)$ contains every sequence of $X$, hence the collection of all sequences of $X$ is a set. Given a sequence $s$ of $X$ and a natural number $n\in N$, we usually denote $s(n)$ by $s_n$, and we may denote $s$ as $(s_n)_{n\in N}$ or just $(s_n)$. Note that a sequence $s$ of $X$ may also refer to a function from $N^+$, the non-zero natural numbers, to $X$, in which case we say $s$ is a sequence with indexes shifted by $1$, and we may write $s$ as $(s_n)_{n\in N^+}$.

Infinity
Given a set $X$, if there exists $n\in N$, such that there exists a bijection from $n$ to $X$, then we say the cardinality of $X$, denoted $|X|$, is $n$, and we say $X$ is finite, otherwise we say $X$ is infinite. If either $X$ is finite or there exists a bijection from $N$ to $X$, we say $X$ is countable, otherwise we say $X$ is uncountable.

Proposition. A finite set has a unique cardinality in $N$. (show proof)

Proposition. If $X$ is a finite set, then there is no bijection from $N$ to $X$. (show proof)

Lemma. Let $n\in N$, then for every subset $S$ of $n$, there exists $m\le n$ such that there is a bijection from $m$ to $S$. (show proof)

Proposition. Suppose $\abs{A}=n$ for some $n\in N$ and $B\subseteq A$, then there exists $m\le n$ such that $\abs{B}=m$, and $\abs{A\setminus B}=n-m$. (show proof)

Proposition. Suppose $\abs{A}=n$ and $\abs{B}=m$ for some $n,m\in N$, then $$\abs{A\cup B}=\abs{A}+\abs{B}-\abs{A\cap B}$$ (show proof)
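
The last two cardinality statements can be illustrated on small finite sets. In this sketch Python's `len` plays the role of $\abs{\cdot}$, which is an assumption about the encoding rather than part of the construction; the sets `A` and `B` are arbitrary examples.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

# |A ∪ B| = |A| + |B| - |A ∩ B|
union_size = len(A) + len(B) - len(A & B)
print(union_size, len(A | B))   # 5 5

# For a subset C of A: |A \ C| = |A| - |C|
C = {2, 4}
print(len(A - C), len(A) - len(C))   # 2 2
```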

Repeated operation
Suppose we have a non-empty set $X$, a function $f:N\to X$, and an operation $\cdot:X\times X\to X$. Then we can define a function $\mathscr F:N\times X\to N\times X$ by $\mathscr F((k,x))=(S(k),x\cdot f(S(k)))$. Thus by the recursion theorem, we can define a function $u:N\to N\times X$ such that $u(0)=(0,f(0))$ and $u(S(k))=\mathscr F(u(k))$. Finally, we can define a function $F:N\to X$ that maps each $k$ to the second element of $u(k)$, then $F(n)$ represents $f(0)\cdot\ldots\cdot f(n)$. By induction, we can see that the first element of $u(n)$ is $n$ and $F(S(n))=F(n)\cdot f(S(n))$. Note that this applies to repeated pairwise union or intersection as well. Properties of repeated operations are usually provable by induction straightforwardly.
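
The construction of $F$ from $f$ and $\cdot$ can be sketched as a loop that carries the pair $(k,x)$ and applies $\mathscr F$ at each step. The name `repeated` is hypothetical, and Python integers stand in for natural numbers.

```python
from typing import Callable, TypeVar

X = TypeVar("X")


def repeated(f: Callable[[int], X], op: Callable[[X, X], X]) -> Callable[[int], X]:
    """Return F with F(n) = f(0) op f(1) op ... op f(n), grouped to the left."""
    def F(n: int) -> X:
        k, x = 0, f(0)                       # u(0) = (0, f(0))
        for _ in range(n):
            k, x = k + 1, op(x, f(k + 1))    # (k, x) -> (S(k), x op f(S(k)))
        return x
    return F


triangular = repeated(lambda k: k, lambda x, y: x + y)
print([triangular(n) for n in range(5)])   # [0, 1, 3, 6, 10]
```

Carrying the index $k$ alongside the accumulated value is what lets a single application of the recursion theorem handle an operation whose next operand depends on the position.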

Note. Let $\mathcal F$ be the meta-logical function that maps meta-logical natural numbers to corresponding terms. By meta-logical induction, for every meta-logical natural number $n$, $\{\mathcal F(1),\ldots,\mathcal F(n)\}=\{1,\ldots,\mathcal F(n)\}$. Also by meta-logical induction, given a formula $\varphi(x)$, for every non-zero meta-logical natural number $n$, $\varphi(\mathcal F(1))\land\ldots\land\varphi(\mathcal F(n))$ is equivalent to $\varphi(j)$ for all $j\in\{\mathcal F(1),\ldots,\mathcal F(n)\}$.

Tuple without codomain
Let $n$ be a natural number. An $n$-tuple without codomain is a function from $n$, or from $\{1,\ldots,n\}$ if indexes are shifted by $1$. A tuple without codomain can also be formed from a meta-logical tuple of terms. By meta-logical induction, we have that for every meta-logical natural number $n$, if $(t_1,\ldots,t_n)$ is a meta-logical tuple of terms, then there exists a function $F$ from $\{\mathcal F(1),\ldots,\mathcal F(n)\}$ such that $(F(\mathcal F(1))=t_1)\land\ldots\land(F(\mathcal F(n))=t_n)$. We denote this function as $(t_1,\ldots,t_n)$, which is an $\mathcal F(n)$-tuple without codomain, and we call it a tuple of the sets $t_1,\ldots,t_n$.

Repeated Cartesian product
Let $X$ be an $n$-tuple without codomain. Then $X_1\times\ldots\times X_n$ represents the set of $n$-tuples $T$ of $\bigcup_{j\in\{1,\ldots,n\}}X_j$ such that $T_j\in X_j$ for all $j\in\{1,\ldots,n\}$. Suppose $n$ is a non-zero meta-logical natural number and $(X_1,\ldots,X_n)$ is a tuple of the sets $X_1,\ldots,X_n$, let $T$ be an element of the repeated Cartesian product of $(X_1,\ldots,X_n)$, denoted $X_1\times\ldots\times X_n$, then $(T_{\mathcal F(1)}\in X_1)\land\ldots\land(T_{\mathcal F(n)}\in X_n)$. Conversely, suppose we have $x_1,\ldots,x_n$ such that $(x_1\in X_1)\land\ldots\land(x_n\in X_n)$, then $(x_1,\ldots,x_n)\in X_1\times\ldots\times X_n$.
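
For finite sets, the repeated Cartesian product can be illustrated with the standard library function `itertools.product`. This sketch uses Python tuples, which are indexed from $0$ rather than from $1$; `cartesian` is a hypothetical wrapper.

```python
from itertools import product


def cartesian(*sets):
    """Return the set of tuples T with T[j] in sets[j] for every index j."""
    return set(product(*sets))


X1, X2 = {0, 1}, {"a", "b", "c"}
P = cartesian(X1, X2)
print(len(P))          # 6
print((1, "b") in P)   # True
```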

Proposition. There is no infinite descending $\in$-sequence. Formally, there is no set $X$ such that

for all $x\in N$ there is exactly one $y$ such that $(x,y)\in X$, and
for all $n\in N$, for all $a,b$ such that $(n,a),(S(n),b)\in X$, we have $b\in a$.

(show proof)

Relation
Given a set $S$ and a natural number $n$, an $n$-ary relation on $S$ is a subset of $S^n$. By default, when we define a relation $\sim$ on $S$ it is implicitly assumed that $\sim$ is binary. A subset of $S\times S$, instead of $S^2$, can also be regarded as a binary relation on $S$. Given $a,b\in S$, we often denote $(a,b)\in \sim$ by $a\sim b$ and $(a,b)\notin \sim$ by $a\nsim b$. A relation is often defined by $$\{u\in S\times S|\forall a,b\in S(u=(a,b)\to\varphi(a,b))\}$$ for some formula $\varphi(a,b)$, then for all $a,b\in S$, $a\sim b$ if and only if $\varphi(a,b)$. Hence, we often write "for all $a,b\in S$, $a\sim b$ if and only if $\varphi(a,b)$" to represent the definition above, and we say that $\sim$ is defined by $\varphi$ with $a$ and $b$, in that order, as variables. Note that $\gt,\lt,\ge,\le$ naturally define relations on natural numbers, and their properties naturally transfer to their corresponding relations.

Partition
Given a set $X$, a partition of $X$ is a pairwise disjoint collection $S$ such that $\cup S=X$.

Definition. We say a tuple $T$ is pairwise disjoint if and only if for all $i,j$ in domain of $T$, if $T_i$ and $T_j$ are not disjoint, then $i=j$.

Indexed partition
Given a set $X$, an indexed partition of $X$ is a pairwise disjoint tuple $T$ such that $\cup_i T_i=X$.

Proposition. Given sets $X$ and $Y$, given an indexed partition $T$ of $X$, and given a tuple $F$ that shares the same domain as $T$ and for each $i$ in domain of $T$, $F_i$ is a function from $T_i$ to $Y$, then $\bigcup_i F_i$ is a function from $X$ to $Y$, such that for all $j$ in domain of $T$, $\p{\bigcup_i F_i}|_{T_j}=F_j$. (show proof)

Equivalence relation
A relation $\sim$ on a set $S$ is called an equivalence relation when it has the following properties:

1. Reflexivity. $\forall x\in S(x\sim x)$
2. Symmetry. $\forall x,y\in S(x\sim y\to y\sim x)$
3. Transitivity. $\forall x,y,z\in S((x\sim y\land y\sim z)\to x\sim z)$

Given an equivalence relation $\sim$ on $S$ and $x\in S$, we denote $\{u\in S|u\sim x\}$ by $[x]$, called the equivalence class of $x$ with respect to $\sim$. Then $\{[x]:x\in S\}$, the set of equivalence classes with respect to $\sim$, is a partition of $S$ (show proof).
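
On a finite set, the passage from an equivalence relation to its partition can be sketched directly from the definition of $[x]$. The example relation (congruence modulo $3$) and the names `equivalence_classes` and `is_partition` are assumptions chosen for illustration.

```python
def equivalence_classes(S, related):
    """Return the set {[x] : x in S}, where [x] = {u in S | u ~ x}."""
    return {frozenset(u for u in S if related(u, x)) for x in S}


def is_partition(classes, S) -> bool:
    """Check that the classes are pairwise disjoint and their union is S."""
    covers = set().union(*classes) == set(S)
    disjoint = all(a == b or not (a & b) for a in classes for b in classes)
    return covers and disjoint


S = range(10)
classes = equivalence_classes(S, lambda u, x: (u - x) % 3 == 0)
print(len(classes))               # 3
print(is_partition(classes, S))   # True
```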

Partial order
Let $\preceq$ be a binary relation on a set $S$. If $\preceq$ has the following properties:

for all $x\in S$, $x\preceq x$;
for all $x,y\in S$, if $x\preceq y$ and $y\preceq x$, then $x=y$;
for all $x,y,z\in S$, if $x\preceq y$ and $y\preceq z$, then $x\preceq z$;

then $\preceq$ is called a partial order on $S$.

Total order
Let $(S,\preceq)$ be a partially ordered set. If for all $x,y\in S$, either $x\preceq y$ or $y\preceq x$, then $\preceq$ is called a total order on $S$. Clearly, if we define $x\prec y$ by $x\preceq y$ and $x\neq y$, then for all $x,y\in S$, we have exactly one of $x\prec y$, $x=y$, $y\prec x$.
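
For a finite set, the defining properties of partial and total orders can be checked exhaustively. In the sketch below, divisibility on $\{1,\ldots,12\}$ serves as an example of a partial order that is not total; the helper names are hypothetical.

```python
def is_partial_order(S, leq) -> bool:
    """Check reflexivity, antisymmetry and transitivity of leq on S."""
    S = list(S)
    reflexive = all(leq(x, x) for x in S)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x in S for y in S)
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x in S for y in S for z in S)
    return reflexive and antisymmetric and transitive


def is_total_order(S, leq) -> bool:
    """A partial order in which any two elements are comparable."""
    S = list(S)
    return is_partial_order(S, leq) and all(leq(x, y) or leq(y, x)
                                            for x in S for y in S)


def divides(x: int, y: int) -> bool:
    return y % x == 0


S = range(1, 13)
print(is_partial_order(S, divides))              # True
print(is_total_order(S, divides))                # False, e.g. 2 and 3
print(is_total_order(S, lambda x, y: x <= y))    # True
```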

Well-order
Let $(S,\preceq)$ be a totally ordered set. If every non-empty subset $E$ of $S$ has a least element $x^*$ given by $\preceq$, such that for all $x\in E$, $x^*\preceq x$, then $\preceq$ is called a well-order on $S$.

Chain
Let $(S,\preceq)$ be a partially ordered set. If $A\subseteq S$ such that $\preceq$ is a total order on $A$, then $A$ is called a chain in $S$.

Choice function
Given any family $F$ of non-empty sets, a function $f:F\to\cup F$ such that for all $X\in F$, $f(X)\in X$ is called a choice function of $F$.
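
For a finite family of non-empty sets of integers, a choice function can be written down explicitly without any appeal to an axiom, for instance by picking the least element. The sketch below only illustrates the definition; `choice_function` and the example family are hypothetical.

```python
def choice_function(family):
    """Map each set X in the family to one of its elements (here the minimum)."""
    return {X: min(X) for X in family}


family = [frozenset({3, 1}), frozenset({5}), frozenset({2, 9, 4})]
f = choice_function(family)
print(all(f[X] in X for X in family))   # True
```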

Note. The formula listed as $ZF9$ is provable in $ZFW$ (show proof).

Axiom of choice
Every family of non-empty sets has a choice function (show proof). From this point on, we will refer to this, instead of the one we gave above, as axiom of choice.

Note. Suppose we have a formula of the form "for all $x\in X$ there exists $y\in Y$ such that $\varphi(x,y)$" where $Y$ is non-empty. With $a$ as input and $b$ as output, the formula $b=\{u\in Y|\varphi(a,u)\}$ defines a function $\mathcal F$ from $X$ to $\mathcal P(Y)\setminus\{\emptyset\}$. By axiom of choice, there exists a choice function $\mathcal C$ from $\mathcal P(Y)\setminus\{\emptyset\}$ to $Y$. Then the composite function $\mathcal C\circ\mathcal F$ is a function from $X$ to $Y$, and for all $c\in X$, we have $\varphi(c,(\mathcal C\circ\mathcal F)(c))$. By "let $f:X\to Y$ be a function that maps every $x\in X$ to a $y\in Y$ such that $\varphi(x,y)$", we are implicitly defining $f$ to be $\mathcal C\circ\mathcal F$. This applies to similar formulas with unique existence as well.

Zorn's lemma
Let $(S,\preceq)$ be a non-empty partially ordered set. If every chain $C$ in $S$ has an upper bound in $S$, an element $s\in S$ such that $c\preceq s$ for all $c\in C$, then $S$ has a maximal element, an element $s^*\in S$ such that $s^*\preceq s$ implies $s^*=s$ for all $s\in S$. (show proof)

Proof. Let $(S,\preceq)$ be a non-empty partially ordered set. Denote the set of all chains in $S$ by $\mathcal C$. Then $(\mathcal C,\subseteq)$ is a partially ordered set. Our main goal is to show that $S$ has a maximal chain, a chain $C^*\in \mathcal C$ such that if $C^*\cup\{s\}$ is a chain then $s\in C^*$ for all $s\in S$.

By axiom of choice, there exists a choice function $f:\mathcal P(S)\setminus\{\emptyset\}\to S$. For each chain $C\in\mathcal C$, denote $\{u\in S|u\notin C\land C\cup\{u\}\in\mathcal C\}$ by $C'$. Then define a function $g:\mathcal C\to\mathcal C$ by

$g(C)=C$ if $C'=\emptyset$, and
$g(C)=C\cup\{f(C')\}$ if $C'\neq\emptyset$.

Now we can define towers $\mathcal T$ in $S$ as subsets of $\mathcal C$ with the following properties:

(1) $\emptyset\in\mathcal T$;
(2) for all $C$, $C\in\mathcal T$ implies $g(C)\in\mathcal T$;
(3) if $\mathcal D\subseteq\mathcal T$ is a chain in $\mathcal C$, then $\cup\mathcal D\in\mathcal T$.

$\mathcal C$ is itself a tower in $S$: (1) and (2) are clearly satisfied. For (3), suppose $\mathcal D\subseteq\mathcal C$ is a chain in $\mathcal C$, if $a,b\in\cup\mathcal D$, then for some $A,B\in\mathcal D$ we have $a\in A$ and $b\in B$. Since $\mathcal D$ is a chain in $\mathcal C$, either $A\subseteq B$ or $B\subseteq A$, so either $a,b\in A$ or $a,b\in B$. Since $A$ and $B$ are both chains in $S$, we have $a\preceq b$ or $b\preceq a$. We have shown that $\cup\mathcal D$ is a chain in $S$, so $\cup\mathcal D\in\mathcal C$.

Since there exists a tower in $S$, we can define $\mathcal T_0$ to be the intersection of all towers in $S$. Then $\mathcal T_0$ is also a tower:

(1) For each tower $\mathcal T$ in $S$, $\emptyset\in\mathcal T$, so $\emptyset\in\mathcal T_0$.
(2) For all $C$, if $C\in\mathcal T_0$, then for each tower $\mathcal T$ in $S$, $C\in\mathcal T$, so $g(C)\in\mathcal T$, therefore $g(C)\in\mathcal T_0$.
(3) Suppose $\mathcal D\subseteq\mathcal T_0$ is a chain in $\mathcal C$, then $\mathcal D\subseteq\mathcal T_0\subseteq\mathcal T$ for every tower $\mathcal T$ in $S$, hence $\cup\mathcal D\in\mathcal T$, therefore $\cup\mathcal D\in\mathcal T_0$.

Our sub-goal is to show that $\mathcal T_0$ is a chain in $\mathcal C$.

Denote $\{A\in\mathcal T_0|\forall B\in\mathcal T_0,(A\subseteq B\lor B\subseteq A)\}$ by $\mathcal E$, then $\mathcal E$ is a tower in $S$:

(1) Since $\emptyset\in\mathcal T_0$ and for each $B\in\mathcal T_0$, $\emptyset\subseteq B$, we know $\emptyset\in\mathcal E$.
(2) Let $A\in\mathcal E$. Since $A\in\mathcal E\subseteq\mathcal T_0$ and $\mathcal T_0$ is a tower, we have $g(A)\in\mathcal T_0$. Denote $\{B\in\mathcal T_0|g(A)\subseteq B\lor B\subseteq A\}$ by $\mathcal F$, then $\mathcal F$ is a tower in $S$:
- (1) Since $\emptyset\in\mathcal T_0$ and $\emptyset\subseteq A$, we know $\emptyset\in\mathcal F$.
- (2) Let $B\in\mathcal F$, then either $g(A)\subseteq B$ or $B\subseteq A$. Since $B\in\mathcal F\subseteq\mathcal T_0$ and $\mathcal T_0$ is a tower, we have $g(B)\in\mathcal T_0$. Since $A\in\mathcal E$ and $g(B)\in\mathcal T_0$, we have either $A\subseteq g(B)$ or $g(B)\subseteq A$.
  - If $g(A)\subseteq B$, then $g(A)\subseteq g(B)$, so $g(B)\in\mathcal F$.
  - If $B\subseteq A$:
    - If $A\subseteq g(B)$, we have $B\subseteq A\subseteq g(B)$, so either $A=B$ or $A=g(B)$,
      - if $A=B$, then $g(A)\subseteq g(B)$, so $g(B)\in\mathcal F$;
      - if $A=g(B)$, then $g(B)\subseteq A$, so $g(B)\in\mathcal F$.
    - If $g(B)\subseteq A$, we have $g(B)\in\mathcal F$.
  In all cases, $g(B)\in\mathcal F$.
- (3) Suppose $\mathcal D\subseteq\mathcal F$ is a chain in $\mathcal C$, then $\mathcal D\subseteq\mathcal F\subseteq\mathcal T_0$; since $\mathcal T_0$ is a tower, we have $\cup\mathcal D\in\mathcal T_0$. For each $B\in\mathcal D$, since $B\in\mathcal F$, either $g(A)\subseteq B$ or $B\subseteq A$.
  - If for all $B\in\mathcal D$, $B\subseteq A$, then for all $u\in\cup\mathcal D$, there exists $D\in\mathcal D$ such that $u\in D$, then $D\subseteq A$, so $u\in A$; therefore, $\cup\mathcal D\subseteq A$.
  - If there exists $B\in\mathcal D$ such that not $B\subseteq A$, then $g(A)\subseteq B\subseteq\cup\mathcal D$.
  In both cases, either $\cup\mathcal D\subseteq A$ or $g(A)\subseteq\cup\mathcal D$, so $\cup\mathcal D\in\mathcal F$.
By definition, $\mathcal F\subseteq\mathcal T_0$; and since $\mathcal F$ is a tower and $\mathcal T_0$ is the intersection of all towers, $\mathcal T_0\subseteq\mathcal F$. So $\mathcal T_0=\mathcal F$. So if $B\in\mathcal T_0$, we also have $B\in\mathcal F$, then we have either $g(A)\subseteq B$ or $B\subseteq A\subseteq g(A)$. Therefore, $g(A)\in\mathcal E$.
(3) Suppose $\mathcal D\subseteq\mathcal E$ is a chain in $\mathcal C$, then $\mathcal D\subseteq\mathcal E\subseteq\mathcal T_0$; since $\mathcal T_0$ is a tower, we have $\cup\mathcal D\in\mathcal T_0$. Now let $B\in\mathcal T_0$. For each $A\in\mathcal D$, since $A\in\mathcal E$, either $A\subseteq B$ or $B\subseteq A$.
- If for all $A\in\mathcal D$, $A\subseteq B$, then for all $u\in\cup\mathcal D$, there exists $D\in\mathcal D$ such that $u\in D$, then $D\subseteq B$, so $u\in B$; therefore, $\cup\mathcal D\subseteq B$.
- If there exists $A\in\mathcal D$ such that not $A\subseteq B$, then $B\subseteq A\subseteq\cup\mathcal D$.
In both cases, either $\cup\mathcal D\subseteq B$ or $B\subseteq\cup\mathcal D$, so $\cup\mathcal D\in\mathcal E$.

By definition, $\mathcal E\subseteq\mathcal T_0$; and since $\mathcal E$ is a tower and $\mathcal T_0$ is the intersection of all towers, $\mathcal T_0\subseteq\mathcal E$. So $\mathcal T_0=\mathcal E$. Given $A,B\in\mathcal T_0$, we have $A\in\mathcal E$, so $A\subseteq B$ or $B\subseteq A$. Therefore, we have shown that $\mathcal T_0$ is a chain in $\mathcal C$.

Let $C^*=\cup\mathcal T_0$. Since $\mathcal T_0$ is a chain in $\mathcal C$, by property (3) of towers, $C^*=\cup\mathcal T_0\in\mathcal T_0$. Then by property (2) of towers, $g(C^*)\in\mathcal T_0$. Hence, $g(C^*)\subseteq\cup\mathcal T_0=C^*$. Since $g(C^*)\subseteq C^*\subseteq g(C^*)$, we have $g(C^*)=C^*$. If ${C^*}'$ were non-empty, then $g(C^*)$ would contain $f({C^*}')\notin C^*$, so ${C^*}'=\emptyset$. Let $s\in S$. If $C^*\cup\{s\}$ is a chain and $s\notin C^*$, then $s\in{C^*}'$, contradicting ${C^*}'=\emptyset$; hence $s\in C^*$. We have shown that $S$ has a maximal chain, which is $C^*$.

Since every chain in $S$ has an upper bound, $C^*$ has an upper bound $c^*$. Let $s\in S$, if $c^*\preceq s$, then for all $c\in C^*$, $c\preceq c^*\preceq s$, so $c\preceq s$. Since we also have $s\preceq s$, $C^*\cup\{s\}$ is a chain, then $s\in C^*$, hence $s\preceq c^*$, therefore $s=c^*$. We have shown that $c^*$ is a maximal element in $S$. $\blacksquare$

Well-ordering principle
Every set is well-orderable. (show proof)

Proof. Let $X$ be a set. Let $W$ be the collection of pairs $(S,\preceq)$ such that $S\subseteq X$ and $\preceq$ well-orders $S$. Define a relation $\preceq_W$ on $W$ by $(S,\preceq)\preceq_W(S',\preceq')$ if and only if

$S\subseteq S'$,
$\preceq$ is the restriction of $\preceq'$ on $S$, and
for all $s\in S$ and $s'\in S'\setminus S$, $s\preceq' s'$.

Then $\preceq_W$ is clearly a partial order on $W$. Note that $(\emptyset,\emptyset)\in W$.

Now we will show that every chain in $W$ has an upper bound in $W$. Let $C$ be a chain in $W$. Define $\cup C$ by $\cup_{(S,\preceq)\in C} S$ and $\preceq_{\cup C}$ by $\cup_{(S,\preceq)\in C} \preceq$. We will first show that $(\cup C,\preceq_{\cup C})\in W$. Clearly, we have $\cup C\subseteq X$. Now we need to show that $\preceq_{\cup C}$ well-orders $\cup C$:

$\preceq_{\cup C}$ is a relation on $\cup C$: if $p\in\preceq_{\cup C}$, then for some $(S,\preceq)\in C$, $p\in\preceq$, so $p\in S\times S$, thus $p\in\cup C\times \cup C$.
$\preceq_{\cup C}$ is a partial order on $\cup C$:
- Let $x\in\cup C$, then for some $(S,\preceq)\in C$, $x\in S$, so $x\preceq x$, so $x\preceq_{\cup C}x$.
- Let $x,y\in\cup C$, if $x\preceq_{\cup C}y$ and $y\preceq_{\cup C}x$, then for some $(S,\preceq),(S',\preceq')\in C$, $x\preceq y$ and $y\preceq' x$. Since $C$ is a chain, either $(S,\preceq)\preceq_W(S',\preceq')$ or $(S',\preceq')\preceq_W(S,\preceq)$. So one of $\preceq$ and $\preceq'$ is a restriction of the other, so either $x\preceq y$ and $y\preceq x$, or $x\preceq' y$ and $y\preceq' x$. We have $x=y$ in either case.
- Let $x,y,z\in\cup C$, if $x\preceq_{\cup C}y$ and $y\preceq_{\cup C}z$, by the same reasoning as above, there exists $(S,\preceq)\in C$ such that $x\preceq y$ and $y\preceq z$, so we have $x\preceq z$, and hence $x\preceq_{\cup C}z$.
$\preceq_{\cup C}$ is a total order on $\cup C$: let $x,y\in\cup C$, then for some $(S,\preceq),(S',\preceq')\in C$, $x\in S$ and $y\in S'$. Since $C$ is a chain, either $(S,\preceq)\preceq_W(S',\preceq')$ or $(S',\preceq')\preceq_W(S,\preceq)$. So one of $S$ and $S'$ is a subset of the other. So either $x,y\in S$ or $x,y\in S'$; in the first case we have $x\preceq y$ or $y\preceq x$, and in the second case we have $x\preceq' y$ or $y\preceq' x$. In either case, either $x\preceq_{\cup C}y$ or $y\preceq_{\cup C}x$.
$\preceq_{\cup C}$ is a well-order on $\cup C$: Let $T$ be a non-empty subset of $\cup C$. Let $t\in T$, then $t\in\cup C$, so for some $(S,\preceq)\in C$, $t\in S$. Let $s$ be a least element of $T\cap S$ given by $\preceq$. Let $t'\in T$. If $t'\in S$, then $s\preceq t'$, so $s\preceq_{\cup C} t'$. If $t'\notin S$, since for some $(S',\preceq')\in C$, $t'\in S'$, and $C$ being a chain implies $(S,\preceq)\preceq_W(S',\preceq')$, we have $s\preceq't'$, and hence $s\preceq_{\cup C}t'$. So $s$ is a least element of $T$ given by $\preceq_{\cup C}$.

Therefore, $(\cup C,\preceq_{\cup C})\in W$.

The next goal is to show that $(\cup C,\preceq_{\cup C})$ is an upper bound of $C$. For all $(S,\preceq)\in C$:

$S\subseteq \cup C$.
Clearly, $\preceq\subseteq\preceq_{\cup C}\cap (S\times S)$. Suppose $p\in\preceq_{\cup C}\cap (S\times S)$, then for some $(S',\preceq')\in C$, $p\in\preceq'$. Since $C$ is a chain, one of $\preceq$ and $\preceq'$ is a restriction of the other. If $\preceq$ is the restriction of $\preceq'$ on $S$, since $p\in\preceq'\cap (S\times S)$, $p\in\preceq$. If $\preceq'$ is a restriction of $\preceq$, then $p\in\preceq$. So $p\in\preceq$ in either case. Hence $\preceq=\preceq_{\cup C}\cap (S\times S)$.
Let $s\in S$ and $s'\in \cup C\setminus S$, then for some $(S',\preceq')\in C$, $s'\in S'$. Since $C$ is a chain and $S'\nsubseteq S$, we have $(S,\preceq)\preceq_W(S',\preceq')$. Note that $s'\in S'\setminus S$, so $s\preceq's'$, and hence $s\preceq_{\cup C}s'$.

Therefore, $(S,\preceq)\preceq_W(\cup C,\preceq_{\cup C})$.

Since $(W,\preceq_W)$ is a non-empty partially ordered set such that every chain in $W$ has an upper bound in $W$, by Zorn's lemma, $W$ has a maximal element $(S^*,\preceq)$. Suppose for contradiction that $S^*\neq X$. Let $x^*\in X\setminus S^*$, define a relation $\preceq'$ on $S^*\cup\{x^*\}$ such that $\preceq'$ contains exactly

$(x^*,x^*)$,
$(x,x^*)$, for all $x\in S^*$, and
$p$, for all $p\in\preceq$.

Then $\preceq'$ clearly well-orders $S^*\cup\{x^*\}$. And we have $(S^*,\preceq)\preceq_W(S^*\cup\{x^*\},\preceq')$ but $(S^*,\preceq)\neq(S^*\cup\{x^*\},\preceq')$, a contradiction to $(S^*,\preceq)$ being a maximal element in $W$. Therefore, $S^*=X$, which implies that $X$ is well-orderable by $\preceq$. $\blacksquare$

Number system (show)

Note. We have defined natural numbers in the previous section. Below we will use natural numbers to construct more number systems.

Integer
Define a relation $\sim$ on $N\times N$ by: for all $u,v\in N\times N$, $u\sim v$ if and only if for all $a,b,c,d\in N$ such that $u=(a,b)$ and $v=(c,d)$, we have $a+d=c+b$. Then for any $a,b,c,d\in N$, $(a,b)\sim(c,d)$ if and only if $a+d=c+b$ (show proof). Note that $\sim$ is an equivalence relation (show proof). If $a,b\in N$, then $[(a,b)]$ is the equivalence class of $(a,b)$ with respect to $\sim$. Denote the set of equivalence classes with respect to $\sim$ by $Z$. Then a set $z$ is called an integer if and only if $z\in Z$. Therefore, for all $a,b\in N$, $[(a,b)]$ is an integer; and for all $z\in Z$, there exist $a,b\in N$ such that $z=[(a,b)]$. If a notation $n$ represents a natural number, then we may use $n$ to denote $[(n,0)]$.
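
Informally, the pair $(a,b)$ stands for the difference "$a$ minus $b$". The sketch below models the relation on pairs of Python integers and picks, as an assumption for illustration only, a canonical representative of each class with at least one zero component; `related` and `canonical` are hypothetical names.

```python
def related(u, v) -> bool:
    """(a, b) ~ (c, d) if and only if a + d == c + b."""
    (a, b), (c, d) = u, v
    return a + d == c + b


def canonical(u):
    """Return the representative of [u] in which min(a, b) has been removed."""
    a, b = u
    k = min(a, b)
    return (a - k, b - k)


print(related((5, 3), (2, 0)))   # True
print(related((5, 3), (3, 5)))   # False
print(canonical((5, 3)))         # (2, 0)
print(canonical((1, 4)))         # (0, 3)
```

Two pairs are related exactly when their canonical representatives coincide, which gives a concrete way to compare equivalence classes.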

Order on integers
Define a relation $\le$ on $Z$ by: for all $x,y\in Z$, $x\le y$ if and only if for all $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $a+d\le c+b$. Then we define the relations $\lt,\ge,\gt$ on $Z$ by: $x\lt y$ if and only if $x\le y$ and $x\neq y$; $x\ge y$ if and only if $y\le x$; $x\gt y$ if and only if $y\lt x$. For any $x,y\in Z$, and any $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$,

$x=y$ if and only if $a+d=c+b$;
$x\le y$ if and only if $a+d\le c+b$;
$x\ge y$ if and only if $a+d\ge c+b$;
$x\lt y$ if and only if $a+d\lt c+b$;
$x\gt y$ if and only if $a+d\gt c+b$.

(show proof)

Proof. Fix $x,y\in Z$, suppose we have $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$. Clearly, if $x=y$, then $[(a,b)]=[(c,d)]$, so $a+d=c+b$; if $a+d=c+b$, then $(a,b)\sim(c,d)$, so $[(a,b)]=[(c,d)]$, and hence $x=y$. We have shown that $x=y$ if and only if $a+d=c+b$.

If $x\le y$, then given any $a',b',c',d'\in N$ such that $x=[(a',b')]$ and $y=[(c',d')]$, we have $a'+d'\le c'+b'$. Hence $a+d\le c+b$. Suppose $a+d\le c+b$. Given any $a',b',c',d'\in N$ such that $x=[(a',b')]$ and $y=[(c',d')]$, we have $[(a,b)]=[(a',b')]$ and $[(c,d)]=[(c',d')]$, so $a+b'=a'+b$ and $c+d'=c'+d$. Then we have $(a+d)+(c'+b')=(c+b)+(a'+d')$. Since $a+d\le c+b$, there exists $k\in N$ such that $(a+d)+k=(c+b)$, so $(a+d)+(c'+b')=(a+d)+k+(a'+d')$, thus $(c'+b')=k+(a'+d')$, hence $a'+d'\le c'+b'$. Therefore, $a+d\le c+b$ implies $x\le y$. We have shown that $x\le y$ if and only if $a+d\le c+b$.

By a symmetric argument, we can show that $y\le x$ if and only if $c+b\le a+d$. Since we also have $x\ge y$ if and only if $y\le x$ and $c+b\le a+d$ if and only if $a+d\ge c+b$, we conclude that $x\ge y$ if and only if $a+d\ge c+b$.

If $x\lt y$, then $x\le y$ and $x\neq y$. By $x\le y$, we have $a+d\le c+b$; by $x\neq y$, we have $a+d\neq c+b$. Hence, we have $a+d\le c+b$ and $a+d\neq c+b$, and therefore, $a+d\lt c+b$. Suppose $a+d\lt c+b$. Then we have $a+d\le c+b$, so $x\le y$. Since $a+d\lt c+b$, we have $a+d\neq c+b$, so $x\neq y$. By $x\le y$ and $x\neq y$, we have $x\lt y$. We have shown that $x\lt y$ if and only if $a+d\lt c+b$.

By a symmetric argument, we can show that $y\lt x$ if and only if $c+b\lt a+d$. Since we also have $x\gt y$ if and only if $y\lt x$ and $c+b\lt a+d$ if and only if $a+d\gt c+b$, we conclude that $x\gt y$ if and only if $a+d\gt c+b$. $\blacksquare$

Proposition. For all $m,n\in N$,

If $m\gt n$, then $[(m,n)]\gt0$;
If $m\lt n$, then $[(m,n)]\lt0$;

(show proof)

Proposition. For all $m,n\in Z$, we have exactly one of $m\lt n$, $m=n$, $m\gt n$. (show proof)

Notation. We denote $\{z\in Z|z\gt 0\}$ by $Z^+$ and $\{z\in Z|z\lt 0\}$ by $Z^-$. These apply to other number systems with order and $0$ defined as well.

Proposition. For all $m,n,k\in Z$, if $m\lt n$ and $n\lt k$, then $m\lt k$. (show proof)

Note. We have shown that $\le$ is a total order on $Z$.

Addition of integers
From $Z\times Z$ to $Z$, with $w$ as input and $z$ as output, "for all $x,y\in Z$ such that $w=(x,y)$, for all $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, $z=[(a+c,b+d)]$" defines a function $+:Z\times Z\to Z$ called addition of integers (show proof). For any $x,y\in Z$, and any $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $x+y=[(a+c,b+d)]$.

Proposition. $0$ is an additive identity for integers: for all $z\in Z$, $z+0=z$ and $0+z=z$. (show proof)

Proposition. Addition of integers is associative: for all $m,n,k\in Z$, $(m+n)+k=m+(n+k)$, and commutative: for all $m,n\in Z$, $m+n=n+m$. (show proof)

Proposition. For all $x,y,z\in Z$, if $y\lt z$, then $x+y\lt x+z$. (show proof)

Proposition. Given $m,n\in Z$, if $m\lt n$, then $m+1\le n$. (show proof)

Proposition. Every integer has a unique additive inverse: for all $z\in Z$, there exists a unique $w\in Z$, such that $z+w=0$ and $w+z=0$. We denote the additive inverse of an integer $z$ by $-z$. Given natural numbers $a,b$, we have $-[(a,b)]=[(b,a)]$. (show proof)

Proposition. For all $m,n\in Z$, if $m\lt n$ then $-m\gt -n$. (show proof)

Proposition. For all $z\in Z$, $--z=z$. (show proof)

Sign function
We define a function $\sgn:Z\to Z$ by

$\sgn(x)=1$ if $x\gt 0$;
$\sgn(x)=0$ if $x=0$;
$\sgn(x)=-1$ if $x\lt 0$.

Then $\sgn$ is called the sign function of $Z$. When we say the sign of a number, we mean the output of the sign function given that number as input. When we say two numbers have opposite signs, we mean their signs add up to $0$. Sign functions may be defined on other number systems as well, but the codomain of a sign function is always $Z$, so that signs of different number systems can be manipulated together.
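As an informal illustration, the sign function can be sketched in Python, with built-in ints standing in for integers (an assumption of this sketch). The helper `opposite_signs` is a hypothetical name that encodes the phrase "their signs add up to $0$" literally.

```python
def sgn(x):
    """Sign of x, always returned as a Python int: 1, 0, or -1."""
    if x > 0:
        return 1
    if x == 0:
        return 0
    return -1

def opposite_signs(x, y):
    """Two numbers have opposite signs when their signs add up to 0."""
    return sgn(x) + sgn(y) == 0
```

Read literally, this definition also makes $0$ and $0$ a pair with opposite signs, since $0+0=0$.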

Notation. Let $z\in Z$. We use the notation $\pm z$ to denote $\{z,-z\}$. Expressions like $y=\pm x$ are understood as $y\in\{x,-x\}$. This applies to any number system where additive inverse makes sense.

Subtraction of integers
From $Z\times Z$ to $Z$, with $w$ as input and $z$ as output, "for all $x,y\in Z$ such that $w=(x,y)$, $z=x+(-y)$" defines a function $-:Z\times Z\to Z$ called subtraction of integers. For any $x,y\in Z$, and any $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $x-y=[(a+d,b+c)]$.

Proposition. For all $m,n\in Z$, $m+(n-m)=n$. (show proof)

Multiplication of integers
From $Z\times Z$ to $Z$, with $w$ as input and $z$ as output, "for all $x,y\in Z$ such that $w=(x,y)$, for all $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, $z=[(ac+bd,ad+bc)]$" defines a function $\times:Z\times Z\to Z$ called multiplication of integers (show proof).

Proof. Fix $w\in Z\times Z$, there exist $x',y'\in Z$ such that $w=(x',y')$, and for all $x,y\in Z$ such that $w=(x,y)$, we have $x=x'$ and $y=y'$. Since $x',y'\in Z$, there exist $a',b',c',d'\in N$ such that $x'=[(a',b')]$ and $y'=[(c',d')]$, and for all $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $[(a,b)]=x=x'=[(a',b')]$ and $[(c,d)]=y=y'=[(c',d')]$, so $a+b'=a'+b$ and $c+d'=c'+d$. Then,

note that either there exists $m\in N$ such that $a+m=a'$, or there exists $m\in N$ such that $a'+m=a$; suppose there exists $m\in N$ such that $a+m=a'$, then $a+b'=a+m+b$, so $b'=m+b$; suppose there exists $m\in N$ such that $a'+m=a$, then $a'+m+b'=a'+b$, so $m+b'=b$;
similarly, either there exists $n\in N$ such that $c+n=c'$, or there exists $n\in N$ such that $c'+n=c$; suppose there exists $n\in N$ such that $c+n=c'$, then $c+d'=c+n+d$, so $d'=n+d$; suppose there exists $n\in N$ such that $c'+n=c$, then $c'+n+d'=c'+d$, so $n+d'=d$.

There are four possible cases:

suppose there exists $m\in N$ such that $a+m=a'$, and there exists $n\in N$ such that $c+n=c'$, then $(ac+bd)+(a'd'+b'c')=ac+bd+(a+m)(d+n)+(b+m)(c+n)=ac+bd+ad+an+md+mn+bc+bn+mc+mn=(a+m)(c+n)+(b+m)(d+n)+ad+bc=(a'c'+b'd')+(ad+bc)$;
suppose there exists $m\in N$ such that $a+m=a'$, and there exists $n\in N$ such that $c'+n=c$, then $(ac+bd)+(a'd'+b'c')=a(c'+n)+b(d'+n)+(a+m)d'+(b+m)c'=ac'+an+bd'+bn+ad'+md'+bc'+mc'=(a+m)c'+(b+m)d'+a(d'+n)+b(c'+n)=(a'c'+b'd')+(ad+bc)$;
suppose there exists $m\in N$ such that $a'+m=a$, and there exists $n\in N$ such that $c+n=c'$, then $(ac+bd)+(a'd'+b'c')=(a'+m)c+(b'+m)d+a'(d+n)+b'(c+n)=a'c+mc+b'd+md+a'd+a'n+b'c+b'n=a'(c+n)+b'(d+n)+(a'+m)d+(b'+m)c=(a'c'+b'd')+(ad+bc)$;
suppose there exists $m\in N$ such that $a'+m=a$, and there exists $n\in N$ such that $c'+n=c$, then $(ac+bd)+(a'd'+b'c')=(a'+m)(c'+n)+(b'+m)(d'+n)+a'd'+b'c'=a'c'+a'n+mc'+mn+b'd'+b'n+md'+mn+a'd'+b'c'=a'c'+b'd'+(a'+m)(d'+n)+(b'+m)(c'+n)=(a'c'+b'd')+(ad+bc)$.

In all cases we have $(ac+bd)+(a'd'+b'c')=(a'c'+b'd')+(ad+bc)$, hence $[(a'c'+b'd',a'd'+b'c')]=[(ac+bd,ad+bc)]$. We have shown that there exists $z\in Z$ that satisfies the above statement, which is $[(a'c'+b'd',a'd'+b'c')]$. To show uniqueness, for all $z'\in Z$ that satisfies the above statement, we have $z'=[(a'c'+b'd',a'd'+b'c')]$. Therefore, for all $w\in Z\times Z$, there exists a unique $z\in Z$ that satisfies the above statement, implying it indeed defines a function. $\blacksquare$

For any $x,y\in Z$, and any $a,b,c,d\in N$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $xy=[(ac+bd,ad+bc)]$.
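The formula $xy=[(ac+bd,ad+bc)]$ can be illustrated with a small Python sketch. It assumes Python non-negative ints in place of natural numbers and pairs in place of representatives; `equiv`, `mul` and `to_int` are hypothetical helper names. The sketch checks, on a small range only, that the product does not depend on the representatives and that it agrees with ordinary integer multiplication when $[(a,b)]$ is read as $a-b$.

```python
from itertools import product

def equiv(p, q):
    """(a, b) ~ (c, d) iff a + d == c + b."""
    (a, b), (c, d) = p, q
    return a + d == c + b

def mul(p, q):
    """Representative (ac + bd, ad + bc) of [(a, b)] times [(c, d)]."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def to_int(p):
    """Read the pair (a, b) as the ordinary integer a - b."""
    a, b = p
    return a - b

pairs = list(product(range(4), repeat=2))

# Equivalent inputs give equivalent products (finite spot-check only).
well_defined = all(
    equiv(mul(p, q), mul(p2, q2))
    for p, p2, q, q2 in product(pairs, repeat=4)
    if equiv(p, p2) and equiv(q, q2)
)

# The pair product matches built-in multiplication of the values a - b.
agrees = all(
    to_int(mul(p, q)) == to_int(p) * to_int(q)
    for p, q in product(pairs, repeat=2)
)
```

For instance, `mul((0, 2), (0, 3))` evaluates to `(6, 0)`, which corresponds to $(-2)(-3)=6$.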

Proposition. For all $z\in Z$, $z0=0$ and $0z=0$. (show proof)

Proposition. $1$ is a multiplicative identity for integers: for all $z\in Z$, $z1=z$ and $1z=z$. (show proof)

Proposition. For all $n\in Z$, $(-1)n=-n$. (show proof)

Proposition. Multiplication of integers is associative: for all $m,n,k\in Z$, $(mn)k=m(nk)$, and commutative: for all $m,n\in Z$, $mn=nm$. (show proof)

Proposition. Multiplication distributes over addition for integers: for all $m,n,k\in Z$, $m(n+k)=mn+mk$ and $(m+n)k=mk+nk$. (show proof)

Proposition. For all $x,y,z\in Z$,

if $x\gt0$ and $y\gt z$, then $xy\gt xz$;
if $x\gt0$ and $y\lt z$, then $xy\lt xz$;
if $x\lt0$ and $y\gt z$, then $xy\lt xz$;
if $x\lt0$ and $y\lt z$, then $xy\gt xz$.

(show proof)

Proposition. For all $x,y\in Z$, $\sgn(xy)=\sgn(x)\sgn(y)$. (show proof)

Parity
Let $X$ be $N$ or $Z$. We define a function $\mathcal P:X\to\{-1,1\}$, where $-1$ and $1$ are integers, by

$\mathcal P(x)=1$ if there exists $k\in X$ such that $k+k=x$;
$\mathcal P(x)=-1$ otherwise.

Then $\mathcal P$ is called the parity function of $X$. If $\mathcal P(x)=1$, we say $x$ is even; if $\mathcal P(x)=-1$, we say $x$ is odd.
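The parity function can be sketched in Python by searching for a witness $k$. This is an illustration only: Python ints stand in for $N$ and $Z$, and the function name `parity` and its flag `is_natural` are hypothetical. The search is finite because any $k$ with $k+k=x$ satisfies $|k|\le|x|$.

```python
def parity(x, is_natural=False):
    """Return 1 if some k in X satisfies k + k == x, and -1 otherwise.

    X is N when is_natural is True (x must then be non-negative),
    and Z otherwise.
    """
    # A witness k, if it exists, lies between -|x| and |x| (or 0 and x in N).
    lo = 0 if is_natural else -abs(x)
    found = any(k + k == x for k in range(lo, abs(x) + 1))
    return 1 if found else -1
```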

Proposition. Define a function $f:N\to Z$ such that $f(n)=[(n,0)]$, then for all $m,n\in N$,

$f(m)\lt f(n)$ if $m\lt n$;
$f(0)=0$;
$f(1)=1$;
$f(m+n)=f(m)+f(n)$;
$f(mn)=f(m)f(n)$;
if $f(m)=f(n)$, then $m=n$.

(show proof)

Rational number
Define a relation on $Z\times Z\setminus\{0\}$ by: for all $u,v\in Z\times Z\setminus\{0\}$, $u\sim v$ if and only if for all $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $u=(a,b)$ and $v=(c,d)$, we have $ad=cb$. Then for any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$, $(a,b)\sim(c,d)$ if and only if $ad=cb$ (show proof). Note that $\sim$ is an equivalence relation (show proof). If $a\in Z$ and $b\in Z\setminus\{0\}$, then $[(a,b)]$ is an equivalence class with respect to $\sim$ and $(a,b)$. Denote the set of equivalence classes with respect to $\sim$ by $Q$. Then a set $q$ is called a rational number if and only if $q\in Q$. Therefore, for all $a\in Z$ and $b\in Z\setminus\{0\}$, $[(a,b)]$ is a rational number; and for all $q\in Q$, there exist $a\in Z$ and $b\in Z\setminus\{0\}$ such that $q=[(a,b)]$. If a notation $z$ represents an integer, then we may use $z$ to denote $[(z,1)]$. If in addition to $z$, a notation $w$ represents a non-zero integer, then we may use $\frac{z}{w}$ to represent $[(z,w)]$.

Order on rational numbers
Define a relation $\le$ on $Q$ by: for all $x,y\in Q$, $x\le y$ if and only if for all $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $(ad)\sgn(bd)\le(cb)\sgn(bd)$. Then we define the relations $\lt,\ge,\gt$ on $Q$ by: $x\lt y$ if and only if $x\le y$ and $x\neq y$; $x\ge y$ if and only if $y\le x$; $x\gt y$ if and only if $y\lt x$. For any $x,y\in Q$, and any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$,

$x=y$ if and only if $ad=cb$;
$x\le y$ if and only if $(ad)\sgn(bd)\le (cb)\sgn(bd)$;
$x\ge y$ if and only if $(ad)\sgn(bd)\ge (cb)\sgn(bd)$;
$x\lt y$ if and only if $(ad)\sgn(bd)\lt (cb)\sgn(bd)$;
$x\gt y$ if and only if $(ad)\sgn(bd)\gt (cb)\sgn(bd)$.

(show proof)

Proof. Fix $x,y\in Q$, and suppose we have $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$. Clearly, if $x=y$, then $[(a,b)]=[(c,d)]$, so $ad=cb$; if $ad=cb$, then $(a,b)\sim(c,d)$, so $[(a,b)]=[(c,d)]$, and hence $x=y$. We have shown that $x=y$ if and only if $ad=cb$. Again, if $ad=cb$, then we have $(ad)\sgn(bd)=(cb)\sgn(bd)$; if $(ad)\sgn(bd)=(cb)\sgn(bd)$, since $b,d\neq0$, $bd\neq0$, so $\sgn(bd)\neq0$, hence $ad=cb$. We have shown that $x=y$ if and only if $(ad)\sgn(bd)=(cb)\sgn(bd)$.

If $x\le y$, then given any $a',c'\in Z$ and $b',d'\in Z\setminus\{0\}$ such that $x=[(a',b')]$ and $y=[(c',d')]$, we have $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$. Hence $(ad)\sgn(bd)\le(cb)\sgn(bd)$. Suppose $(ad)\sgn(bd)\le(cb)\sgn(bd)$. Given any $a',c'\in Z$ and $b',d'\in Z\setminus\{0\}$ such that $x=[(a',b')]$ and $y=[(c',d')]$, we have $[(a,b)]=[(a',b')]$ and $[(c,d)]=[(c',d')]$, so $ab'=a'b$ and $cd'=c'd$. Since $b,b',d,d'\neq0$, we have $bb',dd'\neq0$, and thus $[(ab',bb')]=x=[(a'b,b'b)]$ and $[(cd',dd')]=y=[(c'd,d'd)]$. Since $bd\neq0$, either $bd\lt0$ or $bd\gt0$. Same for $b'd'$.

Suppose $bd\gt0$ and $b'd'\gt0$, then $(ab')(dd')=(b'd')(ad)\sgn(bd)\le(b'd')(cb)\sgn(bd)=(cd')(bb')$, so $(a'd')(bd)=(a'b)(dd')=(ab')(dd')\le(cd')(bb')=(c'd)(bb')=(c'b')(bd)$, hence $a'd'\le c'b'$, thus $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$.
Suppose $bd\gt0$ and $b'd'\lt0$, then $(ab')(dd')=(b'd')(ad)\sgn(bd)\ge(b'd')(cb)\sgn(bd)=(cd')(bb')$, so $(a'd')(bd)=(a'b)(dd')=(ab')(dd')\ge(cd')(bb')=(c'd)(bb')=(c'b')(bd)$, hence $a'd'\ge c'b'$, thus $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$.
Suppose $bd\lt0$ and $b'd'\gt0$, then $-1(ab')(dd')=(b'd')(ad)\sgn(bd)\le(b'd')(cb)\sgn(bd)=-1(cd')(bb')$, so $(a'd')(bd)=(a'b)(dd')=(ab')(dd')\ge(cd')(bb')=(c'd)(bb')=(c'b')(bd)$, hence $a'd'\le c'b'$, thus $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$.
Suppose $bd\lt0$ and $b'd'\lt0$, then $-1(ab')(dd')=(b'd')(ad)\sgn(bd)\ge(b'd')(cb)\sgn(bd)=-1(cd')(bb')$, so $(a'd')(bd)=(a'b)(dd')=(ab')(dd')\le(cd')(bb')=(c'd)(bb')=(c'b')(bd)$, hence $a'd'\ge c'b'$, thus $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$.

In all cases, we have $(a'd')\sgn(b'd')\le(c'b')\sgn(b'd')$. Therefore, $(ad)\sgn(bd)\le(cb)\sgn(bd)$ implies $x\le y$. We have shown that $x\le y$ if and only if $(ad)\sgn(bd)\le(cb)\sgn(bd)$.

By a symmetric argument, we can show that $y\le x$ if and only if $(cb)\sgn(db)\le(ad)\sgn(db)$. Since we also have $x\ge y$ if and only if $y\le x$ and $(cb)\sgn(db)\le(ad)\sgn(db)$ if and only if $(ad)\sgn(bd)\ge(cb)\sgn(bd)$, we conclude that $x\ge y$ if and only if $(ad)\sgn(bd)\ge(cb)\sgn(bd)$.

If $x\lt y$, then $x\le y$ and $x\neq y$. By $x\le y$, we have $(ad)\sgn(bd)\le(cb)\sgn(bd)$; by $x\neq y$, we have $(ad)\sgn(bd)\neq(cb)\sgn(bd)$. Hence, we have $(ad)\sgn(bd)\le(cb)\sgn(bd)$ and $(ad)\sgn(bd)\neq(cb)\sgn(bd)$, and therefore, $(ad)\sgn(bd)\lt(cb)\sgn(bd)$. Suppose $(ad)\sgn(bd)\lt(cb)\sgn(bd)$. Then we have $(ad)\sgn(bd)\le(cb)\sgn(bd)$, so $x\le y$. Since $(ad)\sgn(bd)\lt(cb)\sgn(bd)$, we have $(ad)\sgn(bd)\neq(cb)\sgn(bd)$, so $x\neq y$. By $x\le y$ and $x\neq y$, we have $x\lt y$. We have shown that $x\lt y$ if and only if $(ad)\sgn(bd)\lt(cb)\sgn(bd)$.

By a symmetric argument, we can show that $y\lt x$ if and only if $(cb)\sgn(db)\lt(ad)\sgn(db)$. Since we also have $x\gt y$ if and only if $y\lt x$ and $(cb)\sgn(db)\lt(ad)\sgn(db)$ if and only if $(ad)\sgn(bd)\gt(cb)\sgn(bd)$, we conclude that $x\gt y$ if and only if $(ad)\sgn(bd)\gt(cb)\sgn(bd)$. $\blacksquare$
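The role of the factor $\sgn(bd)$ in the criterion for $\le$ can be seen in a short Python sketch. It assumes Python ints in place of integers and pairs `(a, b)` with `b != 0` in place of representatives; `sgn` and `le` are hypothetical helper names, and the standard-library `fractions.Fraction` is used only as an independent reference for comparison.

```python
from fractions import Fraction

def sgn(x):
    """Sign of an integer: 1, 0, or -1."""
    return (x > 0) - (x < 0)

def le(p, q):
    """[(a, b)] <= [(c, d)] iff (ad) sgn(bd) <= (cb) sgn(bd)."""
    (a, b), (c, d) = p, q
    s = sgn(b * d)
    return (a * d) * s <= (c * b) * s

# Compare with Fraction on a small grid, including negative denominators.
vals = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if b != 0]
matches_fraction = all(
    le(p, q) == (Fraction(*p) <= Fraction(*q))
    for p in vals
    for q in vals
)
```

Without the factor $\sgn(bd)$, the comparison $ad\le cb$ would give the wrong answer whenever exactly one of the denominators is negative, for example for `(1, 2)` and `(1, -3)`.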

Proposition. For all $m\in Z$ and $n\in Z\setminus\{0\}$, $\sgn([(m,n)])=\sgn(m)\sgn(n)$. (show proof)

Proposition. For all $m,n\in Q$, we have exactly one of $m\lt n$, $m=n$, $m\gt n$. (show proof)

Proposition. For all $m,n,k\in Q$, if $m\lt n$ and $n\lt k$, then $m\lt k$. (show proof)

Note. We have shown that $\le$ is a total order on $Q$.

Addition of rational numbers
From $Q\times Q$ to $Q$, with $w$ as input and $z$ as output, "for all $x,y\in Q$ such that $w=(x,y)$, for all $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, $z=[(ad+cb,bd)]$" defines a function $+:Q\times Q\to Q$ called addition of rational numbers (show proof). For any $x,y\in Q$, and any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $x+y=[(ad+cb,bd)]$.

Proposition. $0$ is an additive identity for rational numbers: for all $q\in Q$, $q+0=q$ and $0+q=q$. (show proof)

Proposition. Addition of rational numbers is associative: for all $m,n,k\in Q$, $(m+n)+k=m+(n+k)$, and commutative: for all $m,n\in Q$, $m+n=n+m$. (show proof)

Proposition. For all $x,y,z\in Q$, if $y\lt z$, then $x+y\lt x+z$. (show proof)

Proposition. Every rational number has a unique additive inverse: for all $q\in Q$, there exists a unique $w\in Q$, such that $q+w=0$ and $w+q=0$. We denote the additive inverse of a rational number $q$ by $-q$. Given an integer $a$ and a non-zero integer $b$, we have $-[(a,b)]=[(-a,b)]$. (show proof)

Proposition. For all $m,n\in Q$, if $m\lt n$, then $-m\gt -n$. (show proof)

Proposition. For all $q\in Q$, $--q=q$. (show proof)

Subtraction of rational numbers
From $Q\times Q$ to $Q$, with $w$ as input and $z$ as output, "for all $x,y\in Q$ such that $w=(x,y)$, $z=x+(-y)$" defines a function $-:Q\times Q\to Q$ called subtraction of rational numbers. For any $x,y\in Q$, and any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $x-y=[(ad-cb,bd)]$.

Proposition. For all $m,n\in Q$, $m+(n-m)=n$. (show proof)

Multiplication of rational numbers
From $Q\times Q$ to $Q$, with $w$ as input and $z$ as output, "for all $x,y\in Q$ such that $w=(x,y)$, for all $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, $z=[(ac,bd)]$" defines a function $\times:Q\times Q\to Q$ called multiplication of rational numbers (show proof). For any $x,y\in Q$, and any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $xy=[(ac,bd)]$.

Proposition. For all $q\in Q$, $q0=0$ and $0q=0$. (show proof)

Proposition. $1$ is a multiplicative identity for rational numbers: for all $q\in Q$, $q1=q$ and $1q=q$. (show proof)

Proposition. For all $q\in Q$, $(-1)q=-q$. (show proof)

Proposition. Multiplication of rational numbers is associative: for all $m,n,k\in Q$, $(mn)k=m(nk)$, and commutative: for all $m,n\in Q$, $mn=nm$. (show proof)

Proposition. Multiplication distributes over addition for rational numbers: for all $m,n,k\in Q$, $m(n+k)=mn+mk$ and $(m+n)k=mk+nk$. (show proof)

Proposition. For all $x,y,z\in Q$,

if $x\gt0$ and $y\gt z$, then $xy\gt xz$;
if $x\gt0$ and $y\lt z$, then $xy\lt xz$;
if $x\lt0$ and $y\gt z$, then $xy\lt xz$;
if $x\lt0$ and $y\lt z$, then $xy\gt xz$.

(show proof)

Proposition. For all $x,y\in Q$, $\sgn(xy)=\sgn(x)\sgn(y)$. (show proof)

Proposition. Every non-zero rational number has a unique multiplicative inverse: for all $q\in Q\setminus\{0\}$, there exists a unique $w\in Q$, such that $qw=1$ and $wq=1$. We denote the multiplicative inverse of a non-zero rational number $q$ by $\frac{1}{q}$. Given non-zero integers $a,b$, we have $\frac{1}{[(a,b)]}=[(b,a)]$. (show proof)

Proposition. For all $q\in Q\setminus\{0\}$, $\frac{1}{-q}=-\frac{1}{q}$. (show proof)

Proposition. For all $m,n\in Q^+$ or $m,n\in Q^-$, if $m\lt n$ then $\frac{1}{m}\gt\frac{1}{n}$. (show proof)

Proposition. For all $q\in Q\setminus\{0\}$, $\frac{1}{\frac{1}{q}}=q$. (show proof)

Division of rational numbers
From $Q\times Q\setminus\{0\}$ to $Q$, with $w$ as input and $z$ as output, "for all $x\in Q$ and $y\in Q\setminus\{0\}$ such that $w=(x,y)$, $z=x\frac{1}{y}$" defines a function $\divsymbol:Q\times Q\setminus\{0\}\to Q$ called division of rational numbers. For any $x\in Q$ and $y\in Q\setminus\{0\}$, and any $a,c\in Z$ and $b,d\in Z\setminus\{0\}$ such that $x=[(a,b)]$ and $y=[(c,d)]$, we have $x\divsymbol y=[(ad,cb)]$. We may denote $x\divsymbol y$ by $\frac{x}{y}$.
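The four representative formulas for rational arithmetic, $[(ad+cb,bd)]$, $[(ad-cb,bd)]$, $[(ac,bd)]$ and $[(ad,cb)]$, can be collected in one Python sketch. It is only an illustration: pairs of Python ints stand in for representatives, the function names are hypothetical, and `fractions.Fraction` serves as an independent reference for the value of a pair.

```python
from fractions import Fraction

def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + c * b, b * d)

def sub(p, q):
    (a, b), (c, d) = p, q
    return (a * d - c * b, b * d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def div(p, q):
    (a, b), (c, d) = p, q
    if c == 0:
        # y = [(c, d)] is zero exactly when c == 0; division is undefined then.
        raise ZeroDivisionError("division by the rational number 0")
    return (a * d, c * b)

def value(p):
    """Reference value of the pair (a, b) as a Fraction a/b."""
    return Fraction(p[0], p[1])
```

With $x=[(1,2)]$ and $y=[(1,3)]$, the sketch gives the pairs `(5, 6)`, `(1, 6)`, `(1, 6)` and `(3, 2)` for the sum, difference, product and quotient.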

Note. Given $q\in Q\setminus\{0\}$, the notation $\frac{1}{q}$ takes the same value whether it is interpreted as the multiplicative inverse of $q$ or division of $1$ by $q$.

Proposition. For all $m\in Q\setminus\{0\}$ and $n\in Q$, $m(\frac{n}{m})=n$. (show proof)

Proposition. Define a function $f:Z\to Q$ such that $f(z)=[(z,1)]$, then for all $m,n\in Z$,

$f(m)\lt f(n)$ if $m\lt n$;
$f(0)=0$;
$f(1)=1$;
$f(m+n)=f(m)+f(n)$;
$f(-n)=-f(n)$;
$f(mn)=f(m)f(n)$;
if $f(m)=f(n)$, then $m=n$.

(show proof)

Real number
Define $R$ to be $\{r\in\mathcal P(Q)|\varphi(r)\}$, where $\varphi(r)$ represents the following properties:

$r\ne\emptyset$ and $r\ne Q$.
For all $x,y\in Q$, $x\lt y$ and $y\in r$ imply $x\in r$.
For all $x\in r$, there exists $y\in r$ such that $y\gt x$.

Then a set $r$ is called a real number if and only if $r\in R$. For every rational number $q$, $\{u\in Q|u\lt q\}$ is a real number (show proof). If a notation $q$ represents a rational number, then we may use $q$ to denote $\{u\in Q|u\lt q\}$.

Irrational number
Suppose $r$ is a real number such that for every rational number $q$, $r\neq\{u\in Q|u\lt q\}$, then $r$ is called an irrational number.
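A real number in this sense is an infinite set of rationals, so it cannot be stored as a finite Python set. One possible sketch represents a real number by its membership test instead. Everything named below is an assumption of the sketch: `rational_cut(q)` models $\{u\in Q|u\lt q\}$, `sqrt2_cut` models the set of rationals that are negative or have square less than $2$ (a standard example of a set that is expected to be an irrational number, although that is not proved here), and `witness_above` is a hypothetical helper that produces a larger element of that set. The properties of a real number are only spot-checked on a finite grid of samples.

```python
from fractions import Fraction

def rational_cut(q):
    """Membership test of the real number {u in Q | u < q}."""
    return lambda u: u < q

def sqrt2_cut(u):
    """Membership test of {u in Q | u < 0 or u*u < 2}."""
    return u < 0 or u * u < 2

def witness_above(x):
    """For x in sqrt2_cut, return some y in sqrt2_cut with y > x."""
    if x < 0:
        return Fraction(0)
    # For x >= 0 with x*x < 2: y*y - 2 == 2*(x*x - 2)/(x + 2)**2 < 0.
    return (2 * x + 2) / (x + 2)

samples = [Fraction(n, 8) for n in range(-32, 33)]

def downward_closed(cut):
    """Spot-check: x < y and y in the set imply x in the set."""
    return all(
        cut(x)
        for y in samples if cut(y)
        for x in samples if x < y
    )
```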

Order on real numbers
We define these relations on $R$. For all $x,y\in R$,

$x\le y$ if and only if $x\subseteq y$;
$x\ge y$ if and only if $y\subseteq x$;
$x\lt y$ if and only if $x\subset y$;
$x\gt y$ if and only if $y\subset x$.

Proposition. For all $m,n,k\in R$, if $m\lt n$ and $n\lt k$, then $m\lt k$. (show proof)

Proposition. Suppose $r\in R$ and $q\in Q^+$, then there exists $s\in r$ such that $s+q\notin r$. (show proof)

Proposition. Suppose $r\in R$ and $q\in Q$, if $q\notin r$, then for all $s\in r$, $q\gt s$. (show proof)

Note. Combining the two propositions above, suppose $r\in R$ and $q\in Q^+$, then there exists $s\in r$ such that for all $t\in r$, $s+q\gt t$.

Proposition. For all $a,b\in R$, we have exactly one of $a\lt b$, $a=b$, and $a\gt b$. (show proof)

Note. We have shown that $\le$ is a total order on $R$.

Completeness of real numbers
Every non-empty subset $\mathscr C$ of $R$ that has an upper bound must have a unique least upper bound, called the supremum of $\mathscr C$. (show proof)

Proof. Let $\mathscr C$ be a non-empty subset of $R$ having an upper bound $X$. We will first show that $\bigcup\mathscr C$ is a real number:

Since $\mathscr C$ is non-empty, there exists $A\in\mathscr C$. Since $A$ is a real number, $A$ is non-empty, so there exists $u\in A$, then $u\in\bigcup\mathscr C$, so $\bigcup\mathscr C\neq\emptyset$. Since $\mathscr C$ has an upper bound $X$, for all $A\in\mathscr C$, $A\subseteq X$. Since $X$ is a real number, $X\subset Q$, so there exists $v\in Q$ such that $v\notin X$, then for all $A\in\mathscr C$, $v\notin A$, hence $v\notin \bigcup\mathscr C$, implying $\bigcup\mathscr C\neq Q$.
Suppose $x,y\in Q$, $x\lt y$, and $y\in \bigcup\mathscr C$, then there exists $A\in\mathscr C$ such that $y\in A$. Since $A$ is a real number, we know $x\in A$, hence $x\in \bigcup\mathscr C$.
Suppose $x\in \bigcup\mathscr C$, then there exists $A\in\mathscr C$ such that $x\in A$. Since $A$ is a real number, we know there exists $y\in A$ such that $y\gt x$. And also, $y\in \bigcup\mathscr C$.

Hence $\bigcup\mathscr C$ is a real number.

Then we will show that $\bigcup\mathscr C$ is a least upper bound of $\mathscr C$. Suppose $A\in\mathscr C$, if $u\in A$, then $u\in \bigcup\mathscr C$, so $A\le \bigcup\mathscr C$. Hence $\bigcup\mathscr C$ is an upper bound of $\mathscr C$. Now let $Y$ be any upper bound of $\mathscr C$. If $u\in \bigcup\mathscr C$, then there exists $A\in\mathscr C$ such that $u\in A$. Since $Y$ is an upper bound of $\mathscr C$, $A\subseteq Y$, so $u\in Y$. Hence $\bigcup\mathscr C\le Y$. Therefore $\bigcup\mathscr C$ is a least upper bound of $\mathscr C$.

For uniqueness, let $Z$ be a least upper bound of $\mathscr C$. Suppose for contradiction that $\bigcup\mathscr C\neq Z$, then either $\bigcup\mathscr C\lt Z$ or $\bigcup\mathscr C\gt Z$, implying $\bigcup\mathscr C$ and $Z$ cannot both be least upper bounds, a contradiction. Therefore, we have $\bigcup\mathscr C=Z$. $\blacksquare$

Notation. Recall that in the previous section, we defined a term to represent the first element and a term to represent the second element of an ordered pair $p$, which we will denote as $p_0$ and $p_1$ respectively, regardless of whether $p$ is indeed an ordered pair.

Addition of real numbers
Since for all $p\in R\times R$, $\{u\in Q|\exists (a\in p_0,b\in p_1),u=a+b\}\in R$ (show proof), we can define a function $+:R\times R\to R$ such that for all $x,y\in R$, $x+y=\{u\in Q|\exists (a\in x,b\in y),u=a+b\}$.

Proposition. Addition of real numbers is associative: for all $m,n,k\in R$, $(m+n)+k=m+(n+k)$, and commutative: for all $m,n\in R$, $m+n=n+m$. (show proof)

Proposition. $0$ is an additive identity for real numbers: for all $r\in R$, $r+0=r$ and $0+r=r$. (show proof)

Proposition. For all $x,y,z\in R$, if $y\lt z$, then $x+y\lt x+z$. (show proof)

Lemma. Given a real number $r$ and a positive rational number $q$, there exists an integer (as a rational number) $n$ such that $nq\in r$ but $(n+1)q\notin r$. (show proof)
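The integer $n$ in the lemma can be located by a simple search, sketched below in Python. The sketch assumes that a real number is given by a membership test on `fractions.Fraction` values; `sqrt2_cut` (rationals that are negative or have square less than $2$) and `last_multiple_inside` are hypothetical names. The two loops terminate for a genuine real number because it is non-empty, is not all of $Q$, and is closed downward.

```python
from fractions import Fraction

def sqrt2_cut(u):
    """Membership test of {u in Q | u < 0 or u*u < 2}."""
    return u < 0 or u * u < 2

def last_multiple_inside(cut, q):
    """Return an integer n with n*q in the set and (n+1)*q outside it."""
    n = 0
    while not cut(n * q):       # step down until n*q lies in the set
        n -= 1
    while cut((n + 1) * q):     # step up while the next multiple is inside
        n += 1
    return n
```

For `sqrt2_cut` and $q=\frac{1}{10}$ the search returns $14$, since $\left(\frac{14}{10}\right)^2\lt2$ while $\left(\frac{15}{10}\right)^2\gt2$.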

Proposition. Every real number has a unique additive inverse: for all $r\in R$, there exists a unique $s\in R$, such that $r+s=0$ and $s+r=0$. We denote the additive inverse of a real number $r$ by $-r$. (show proof)

Proposition. For all $x,y\in R$, if $x\lt y$, then $-x\gt -y$. (show proof)

Proposition. For all $r\in R$, $--r=r$. (show proof)

Proposition. Every non-empty subset $\mathscr C$ of $R$ that has a lower bound must have a unique greatest lower bound, also called the infimum of $\mathscr C$. (show proof)

Subtraction of real numbers
We can define a function $-:R\times R\to R$ such that for all $x,y\in R$, $x-y=x+(-y)$.

Proposition. For all $a,b\in R$, $a+(b-a)=b$. (show proof)

Lemma. If $r\in R$ and $r\gt 0$, then there exists $u\in r$ such that $u\gt 0$. (show proof)

Multiplication of real numbers
Since for all $p\in R^+\times R^+$, $\{u\in Q|\exists (a\in p_0,b\in p_1),a\gt0\land b\gt0\land u\le ab\}\in R$ (show proof), we can define a function $\times^+:R^+\times R^+\to R$ such that for all $x,y\in R^+$, $x\times^+y=\{u\in Q|\exists (a\in x,b\in y),a\gt0\land b\gt0\land u\le ab\}$. Then we can define a function $\times:R\times R\to R$ such that for all $x,y\in R$,

if $x=0$ or $y=0$, then $xy=0$;
if $x\gt0$ and $y\gt0$, then $xy=x\times^+y$;
if $x\gt0$ and $y\lt0$, then $xy=-(x\times^+(-y))$;
if $x\lt0$ and $y\gt0$, then $xy=-((-x)\times^+y)$;
if $x\lt0$ and $y\lt0$, then $xy=(-x)\times^+(-y)$.

Since $\times$ is an extension of $\times^+$, given $x,y\in R^+$, we may denote $x\times^+y$ as $xy$.
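The way $\times$ is assembled from $\times^+$ by case analysis on signs can be sketched in Python. This is a loose illustration only: `fractions.Fraction` values stand in for real numbers, and `times_pos` is a hypothetical stand-in for $\times^+$ that deliberately rejects non-positive arguments, so that the sketch only ever calls it on the domain $R^+\times R^+$.

```python
from fractions import Fraction

def times_pos(x, y):
    """Stand-in for the product on positive numbers only."""
    assert x > 0 and y > 0, "times_pos is defined only for positive inputs"
    return x * y

def times(x, y):
    """Extend times_pos to all inputs by the five sign cases."""
    if x == 0 or y == 0:
        return Fraction(0)
    if x > 0 and y > 0:
        return times_pos(x, y)
    if x > 0 and y < 0:
        return -times_pos(x, -y)
    if x < 0 and y > 0:
        return -times_pos(-x, y)
    return times_pos(-x, -y)    # remaining case: x < 0 and y < 0
```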

Proposition. For all $x,y\in R$, $\sgn(xy)=\sgn(x)\sgn(y)$. (show proof)

Proposition. Multiplication of real numbers is associative: for all $m,n,k\in R$, $(mn)k=m(nk)$, and commutative: for all $m,n\in R$, $mn=nm$. (show proof)

Proof. Suppose $m,n,k\in R$ such that $m,n,k\gt 0$.

Suppose $u\in (mn)k$. Then there exist $q\in mn$ and $c\in k$ such that $q\gt0$, $c\gt0$ and $u\le qc$. Since there exist $a\in m$ and $b\in n$ such that $a\gt0$, $b\gt0$ and $q\le ab$, we have $u\le qc\le (ab)c$. Since $a,b,c\in Q$, we have $(ab)c=a(bc)$. Clearly, $bc\in nk$, and $a(bc)\in m(nk)$. Since $u\le (ab)c=a(bc)$, $u\in m(nk)$. By a symmetric argument, suppose $u\in m(nk)$, then $u\in (mn)k$. Hence $(mn)k=m(nk)$.

Suppose $u\in mn$, then there exist $a\in m$ and $b\in n$ such that $a\gt0$, $b\gt0$ and $u\le ab$. Since $a,b\in Q$, $ab=ba$. Clearly, $ba\in nm$. Since $u\le ab=ba$, $u\in nm$. By a symmetric argument, suppose $u\in nm$, then $u\in mn$. Hence $mn=nm$.

Now suppose $m,n,k\in R$.

If any of $m,n,k$ is $0$, then $(mn)k=0=m(nk)$.
If $m\gt0,n\gt0,k\gt0$, we have already shown that $(mn)k=m(nk)$.
If $m\gt0,n\gt0,k\lt0$, $(mn)k=-((mn)(-k))=-(m(n(-k)))=-(m(--(n(-k))))=-(m(-(nk)))=m(nk)$.
If $m\gt0,n\lt0,k\gt0$, $(mn)k=-((-(mn))k)=-((--(m(-n)))k)=-((m(-n))k)=-(m((-n)k))=-(m(--((-n)k)))=-(m(-(nk)))=m(nk)$.
If $m\gt0,n\lt0,k\lt0$, $(mn)k=(-(mn))(-k)=(--(m(-n)))(-k)=(m(-n))(-k)=m((-n)(-k))=m(nk)$.
If $m\lt0,n\gt0,k\gt0$, $(mn)k=-((-(mn))k)=-((--((-m)n))k)=-(((-m)n)k)=-((-m)(nk))=m(nk)$.
If $m\lt0,n\gt0,k\lt0$, $(mn)k=(-(mn))(-k)=(--((-m)n))(-k)=((-m)n)(-k)=(-m)(n(-k))=(-m)(--(n(-k)))=(-m)(-(nk))=m(nk)$.
If $m\lt0,n\lt0,k\gt0$, $(mn)k=((-m)(-n))k=(-m)((-n)k)=(-m)(--((-n)k))=(-m)(-(nk))=m(nk)$.
If $m\lt0,n\lt0,k\lt0$, $(mn)k=-((mn)(-k))=-(((-m)(-n))(-k))=-((-m)((-n)(-k)))=m((-n)(-k))=m(nk)$.

If any of $m,n$ is $0$, then $mn=0=nm$.
If $m\gt0,n\gt0$, we have already shown that $mn=nm$.
If $m\gt0,n\lt0$, $mn=-(m(-n))=-((-n)m)=nm$.
If $m\lt0,n\gt0$, $mn=-((-m)n)=-(n(-m))=nm$.
If $m\lt0,n\lt0$, $mn=(-m)(-n)=(-n)(-m)=nm$.

$\blacksquare$

Proposition. $1$ is a multiplicative identity for real numbers: for all $r\in R$, $r1=r$ and $1r=r$. (show proof)

Proposition. Let $r\in R$, then $(-1)r=-r$. (show proof)

Proposition. Multiplication distributes over addition for real numbers: for all $m,n,k\in R$, $m(n+k)=mn+mk$ and $(m+n)k=mk+nk$. (show proof)

Proof. Suppose $m,n,k\in R$ such that $m,n,k\gt 0$.

Suppose $u\in m(n+k)$, then there exist $a\in m$ and $q\in n+k$ such that $a\gt0$, $q\gt0$ and $u\le aq$, and there exist $b\in n$ and $c\in k$ such that $q=b+c$. Define $b',c'$ by:

If $b\gt0$ and $c\gt0$, $b'=b$ and $c'=c$.
If $b\gt0$ and $c\le0$, since $k\gt0$, there exists $v\in k$ such that $v\gt 0$. Then $b+v\gt b+c=q$, so $0\lt q/(b+v)\lt1$. So let $b'=b(q/(b+v))$ and $c'=v(q/(b+v))$, we have $b'+c'=q$, $b'\lt b$ and $c'\lt v$, so $b'\in n$ and $c'\in k$. Also, $b'\gt0$ and $c'\gt0$.
If $b\le0$ and $c\gt0$, we define $b',c'$ by a symmetric definition to the above case.
If $b\le0$ and $c\le0$, then $q=b+c\le0$, a contradiction.

Now we have $b'\in n$ and $c'\in k$ such that $b'\gt0$, $c'\gt0$ and $q=b'+c'$.

Then $u\le aq=a(b'+c')=ab'+ac'$. Clearly, $ab'\in mn$ and $ac'\in mk$, so $u\in mn+mk$.

Suppose $u\in mn+mk$. Then there exist $q\in mn$ and $p\in mk$ such that $u=q+p$; there exist $a\in m$ and $b\in n$ such that $a\gt0$, $b\gt0$ and $q\le ab$; and there exist $c\in m$ and $d\in k$ such that $c\gt0$, $d\gt0$ and $p\le cd$. Let $v=a$ if $a\ge c$ and $v=c$ otherwise. Then $v\in m$ and $v\gt0$, and clearly, $b+d\in n+k$ with $b+d\gt0$, so $v(b+d)\in m(n+k)$. Since $u=q+p\le ab+cd\le vb+vd=v(b+d)$, we have $u\in m(n+k)$.

We have shown that $m(n+k)=mn+mk$ for positive real $m,n,k$.

Now suppose $m,n,k\in R$.

If any of $m,n,k$ is $0$, then it is trivial that $m(n+k)=mn+mk$.
If $m\gt0,n\gt0,k\gt0$, we have already shown that $m(n+k)=mn+mk$.
If $m\gt0,n\gt0,k\lt0$,
- if $n+k=0$, then $m(n+k)=0=mn+(-(mn))=mn+m(-n)=mn+mk$;
- if $n+k\gt0$, then $mn=m((n+k)+(-k))=m(n+k)+m(-k)=m(n+k)+(-(mk))$, so $m(n+k)=mn+mk$;
- if $n+k\lt0$, then $mk=-(m(-k))=-(m(-((n+k)+(-n))))=-(m(-(n+k)+n))=-(m(-(n+k))+mn)=m(n+k)+(-(mn))$, so $m(n+k)=mn+mk$.
If $m\gt0,n\lt0,k\gt0$, $m(n+k)=m(k+n)=mk+mn=mn+mk$.
If $m\gt0,n\lt0,k\lt0$, $m(n+k)=-(m(-(n+k)))=-(m((-n)+(-k)))=-(m(-n)+m(-k))=m(--n)+m(--k)=mn+mk$.
If $m\lt0,n\gt0,k\gt0$, $m(n+k)=-((-m)(n+k))=-((-m)n+(-m)k)=(--m)n+(--m)k=mn+mk$.
If $m\lt0,n\gt0,k\lt0$,
- if $n+k=0$, then $m(n+k)=0=mn+(-(mn))=mn+m(-n)=mn+mk$;
- if $n+k\gt0$, then $mn=-((-m)n)=-((-m)((n+k)+(-k)))=-((-m)(n+k)+(-m)(-k))=(--m)(n+k)+(--m)(-k)=m(n+k)+(-(mk))$, so $m(n+k)=mn+mk$;
- if $n+k\lt0$, then $mk=(-m)(-k)=(-m)(-((n+k)+(-n)))=(-m)(-(n+k)+n)=(-m)(-(n+k))+(-m)n=m(n+k)+(-(mn))$, so $m(n+k)=mn+mk$.
If $m\lt0,n\lt0,k\gt0$, $m(n+k)=m(k+n)=mk+mn=mn+mk$.
If $m\lt0,n\lt0,k\lt0$, $m(n+k)=(-m)(-(n+k))=(-m)((-n)+(-k))=(-m)(-n)+(-m)(-k)=mn+mk$.

Hence $m(n+k)=mn+mk$. And also, $(m+n)k=k(m+n)=km+kn=mk+nk$. $\blacksquare$

Proposition. For all $x,y,z\in R$,

if $x\gt0$ and $y\gt z$, then $xy\gt xz$;
if $x\gt0$ and $y\lt z$, then $xy\lt xz$;
if $x\lt0$ and $y\gt z$, then $xy\lt xz$;
if $x\lt0$ and $y\lt z$, then $xy\gt xz$.

(show proof)

Proposition. Every non-zero real number has a unique multiplicative inverse: for all $r\in R$ such that $r\neq0$, there exists a unique $s\in R$, such that $rs=1$ and $sr=1$. We denote the multiplicative inverse of a non-zero real number $r$ by $\frac{1}{r}$. (show proof)

Proof. Suppose $r\in R^+$, denote $\{u\in Q|u\le0\}\cup\{u\in Q^+|\exists (v\in Q^+), 1/(u+v)\notin r\}$ by $s$. We will first show that $s$ is a real number:

Trivially, $0\in s$, so $s\neq\emptyset$. Let $x\in r$ such that $x\gt 0$, then for all $v\in Q^+$, $1/(1/x+v)\lt x$, so $1/(1/x+v)\in r$, so $1/x\in Q$ but $1/x\notin s$, implying $s\neq Q$.
Let $x,y\in Q$, suppose $x\lt y$ and $y\in s$. If $x\le0$, then $x\in s$. Suppose $x\gt0$, then $y\gt0$, and there exists $v\in Q^+$ such that $1/(y+v)\notin r$. Since $1/(x+v)\gt 1/(y+v)$, $1/(x+v)\notin r$, implying $x\in s$.
Let $x\in s$. If $x\le0$, it suffices to show that $s$ has a positive element. Let $w\in Q\setminus r$, then $w\gt0$, so $1/(2w)\gt0$. Now we have $1/(1/(2w)+1/(2w))=w\notin r$, so $1/(2w)\in s$, and also, $1/(2w)\gt0\ge x$. If $x\gt0$, then there exists $v\in Q^+$ such that $1/(x+v)\notin r$. So $1/((x+v/2)+v/2)\notin r$, implying $x+v/2\in s$, and also, $x+v/2\gt x$.

We have shown that $s$ is a real number. Also, since $s$ contains a positive element, $s$ is positive.

Then we will show that $s$ is indeed a multiplicative inverse of $r$. Note that $rs=\{u\in Q|\exists (a\in r,b\in s),a\gt0\land b\gt0\land u\le ab\}$ and $1=\{u\in Q|u\lt 1\}$ (in the latter equation, the $1$ on the left is real and the $1$ on the right is rational). If $u\in rs$, then there exist $a\in r,b\in s$ such that $a\gt0$, $b\gt0$ and $u\le ab$, and there exists $v\in Q^+$ such that $1/(b+v)\notin r$. Since $1/b\gt1/(b+v)$, we have $1/b\notin r$. Now we have $1/b\gt a$, so $u\le ab\lt 1$, implying $u\in 1$. If $u\in 1$, then $u\lt 1$. Suppose $u\le0$, then trivially, $u\in rs$. Suppose $u\gt0$. Since $u\lt 1$, there exists some natural number $n$ such that $u\lt m/(m+1)$ for all $m\ge n$. Let $w\in r$ such that $w\gt0$, let $v=w/(n+1)$, then $v\gt0$, and there exists an integer $m$ such that $vm\in r$ and $v(m+1)\notin r$. Now we have $w(m+1)/(n+1)\notin r$ but $w\in r$, so $m\gt n\ge 0$. Hence, $u/(vm)\lt 1/(v(m+1))$. Then there exists some $l\in Q^+$ such that $u/(vm)+l=1/(v(m+1))$, then $1/(u/(vm)+l)=v(m+1)\notin r$, so $u/(vm)\in s$. We now have $vm\in r$, $u/(vm)\in s$, $vm\gt0$, $u/(vm)\gt0$ and $u=(vm)(u/(vm))$, hence $u\in rs$. Therefore, $rs=1$. And $sr=rs=1$.

Now for $r\in R^-$, since $-r\gt0$, we can define $s$ so that $s\gt0$ and $s(-r)=1$, then $(-s)r=(-(-s))(-r)=s(-r)=1$. And $r(-s)=(-s)r=1$.

To show uniqueness, suppose $rs=rt=1$, then $s=s1=s(rt)=(sr)t=1t=t$. $\blacksquare$

Proposition. For all $r\in R\setminus\{0\}$, $\frac{1}{-r}=-\frac{1}{r}$. (show proof)

Proposition. For all $r,s\in R^+$ or $r,s\in R^-$, if $r\lt s$ then $\frac{1}{r}\gt\frac{1}{s}$. (show proof)

Proposition. For all $r\in R\setminus\{0\}$, $\frac{1}{\frac{1}{r}}=r$. (show proof)

Division of real numbers
We can define a function $\divsymbol:R\times R\setminus\{0\}\to R$ such that for all $x\in R$ and $y\in R\setminus\{0\}$, $x\divsymbol y=x\frac{1}{y}$. We may denote $x\divsymbol y$ by $\frac{x}{y}$.

Note. Given $r\in R\setminus\{0\}$, the notation $\frac{1}{r}$ takes the same value whether it is interpreted as the multiplicative inverse of $r$ or division of $1$ by $r$.

Proposition. For all $r\in R\setminus\{0\}$ and $s\in R$, $r(\frac{s}{r})=s$. (show proof)

Exponentiation of real numbers with natural exponent
For every real number $r$, by the recursion theorem in the previous section, we can define a function $E_r:N\to R$ such that $E_r(0)=1$ and for every $n\in N$, $E_r(S(n))=rE_r(n)$. We will denote $E_r(n)$ by $r^n$.

Proposition. Let $r,s\in R$ and $m,n\in N$,

$(rs)^m=r^ms^m$
$r^{m+n}=r^mr^n$
$r^{mn}={(r^m)}^n$

(show proof)
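The recursive definition of $r^n$ and the three identities above can be tried out in a short Python sketch. It assumes `fractions.Fraction` values in place of real numbers and Python non-negative ints in place of natural numbers; `power` is a hypothetical name for $E_r$, and the loop simply unrolls the recursion $E_r(S(n))=rE_r(n)$ starting from $E_r(0)=1$.

```python
from fractions import Fraction

def power(r, n):
    """E_r(n), where E_r(0) = 1 and E_r(S(n)) = r * E_r(n)."""
    result = Fraction(1)        # E_r(0)
    for _ in range(n):
        result = r * result     # E_r(S(k)) = r * E_r(k)
    return result
```

Checking the identities for a few sample values does not replace the proof, but it can catch a misread formula.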

Lemma. Let $x,y\in R^+$ and $n\in N^+$, if $x^n\lt y^n$, then $x\lt y$. (show proof)

Proposition. For all $x\in R^+$ and $n\in N^+$, there is a unique positive real $y$ such that $y^n=x$ (show proof). We denote $y$ by $x^{1/n}$.

Exponentiation of positive real numbers with rational exponent
Let $r\in R^+$ and $q\in Q^+$. Then there exist $m,n\in N^+$ such that $q=[([(m,0)],[(n,0)])]$. We define $r^q$ by $(r^m)^{1/n}$. Note that $r^q$ is invariant with respect to choices of $m,n$ (show proof). Now let $q\in Q$ such that $q\lt0$ and define $r^q=1/(r^{-q})$. This is well-defined since $r^{-q}\gt0$ and has a multiplicative inverse. Also define $r^0=1$, where $0\in Q$. We have defined $r^q$ for all $r\gt0$ and $q\in Q$.

Now let $s\in R$ such that there exists $q\in Q$ such that $s=\{u\in Q|u\lt q\}$, and define $r^s=r^q$. We have defined $r^s$ for all $r\gt0$ and $s\in R$ such that $s$ is also rational.

Exponentiation of real numbers with negative base
Let $r\in R$ such that $r\lt0$. Let $s\in R$ such that there exists $q\in Q$ such that $s=\{u\in Q|u\lt q\}$ and there exists $z\in Z$ such that $q=[(z,1)]$. Then either there exists $n\in N$ such that $z=[(n,0)]$, in which case we define $r^s=r^n$, or there exists $n\in N^+$ such that $z=[(0,n)]$, in which case we define $r^s=1/(r^n)$, which is well-defined since $r^n\neq0$ and has a multiplicative inverse. We have defined $r^s$ for all $r\lt0$ and $s\in R$ such that $s$ is also an integer.

Exponentiation of real numbers with zero base
Let $s\in R$ such that $s\gt0$, define $0^s=0$. Also define $0^0=1$.

Square root
Let $r$ be a non-negative real number, the square root of $r$, denoted $\sqrt r$, is defined as $r^{\frac{1}{2}}$.

Proposition. Suppose $x,y$ are non-negative real numbers, then $\sqrt{xy}=\sqrt x\sqrt y$. (show proof)

Absolute value
The absolute value of a real number $r$ is defined as $\sqrt{r^2}$, also denoted $|r|$. Clearly, $|r|=r$ if $r\ge 0$; $|r|=-r$ if $r\le0$.

Proposition. Let $a,b\in R$, then $\abs{ab}=\abs{a}\abs{b}$. (show proof)

Proposition. Let $a,b\in R$, then $\abs{a+b}\le\abs{a}+\abs{b}$. (show proof)

Extended real number system
We extend $R$ into $\overline R$ by adding two elements: $Q$, also denoted $+\infty$ or just $\infty$, and $\emptyset$, also denoted $-\infty$. Order in $\overline R$ naturally follows from $R$. Hence we have $$-\infty\lt x\lt\infty$$ for all $x\in R$.

Arithmetic partially extends from $R$ to $\overline R$. Given $x\in \overline R$:

if $x\neq -\infty$, $x+\infty=\infty+x=\infty$;
if $x\neq \infty$, $x+(-\infty)=(-\infty)+x=-\infty$;
additive inverse of $\pm\infty$ is defined to be $\mp\infty$;
if $x\gt0$, $x(\pm\infty)=(\pm\infty)x=\pm\infty$;
if $x\lt0$, $x(\pm\infty)=(\pm\infty)x=\mp\infty$;
multiplicative inverse of $\pm\infty$ is defined to be $0$.
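As a loose analogy, IEEE floating-point infinities in Python follow the same rules. This is an assumption made for illustration: floats are not the construction of $\overline R$ used here, and the variable names are hypothetical.

```python
import math

inf = math.inf

sum_pos = 5.0 + inf        # x + inf with x != -inf
sum_neg = 5.0 + (-inf)     # x + (-inf) with x != inf
prod_pos = 2.0 * inf       # x > 0
prod_neg = -2.0 * inf      # x < 0
inv_inf = 1.0 / inf        # multiplicative inverse of inf
undefined = inf + (-inf)   # not defined by the rules above; floats yield nan
```

The cases left out of the list, such as $\infty+(-\infty)$, have no value assigned in the text; floating-point arithmetic signals them with `nan`.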

Supremum function and infimum function
With $\overline R$ defined, we can define a function $\sup:\mathcal P(\overline R)\to\overline R$ that maps

$\emptyset$ to $-\infty$,
a subset of $R$ unbounded above to $\infty$,
a non-empty subset of $R$ bounded above to its supremum,
a set that contains $\infty$ to $\infty$,
a set $S$ that contains $-\infty$ but not $\infty$ to whatever it would map $S\setminus\{-\infty\}$ to.

Similarly, we can define a function $\inf:\mathcal P(\overline R)\to\overline R$ that maps

$\emptyset$ to $\infty$,
a subset of $R$ unbounded below to $-\infty$,
a non-empty subset of $R$ bounded below to its infimum,
a set that contains $-\infty$ to $-\infty$,
a set $S$ that contains $\infty$ but not $-\infty$ to whatever it would map $S\setminus\{\infty\}$ to.
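For finite sets, the case analysis above collapses to a maximum or minimum with a default for the empty set. The sketch below assumes finite inputs and uses floats with `math.inf` as stand-ins for $\overline R$; the names `supremum` and `infimum` are hypothetical, and infinite sets are out of its scope.

```python
import math

def supremum(s):
    """Supremum of a finite set of extended reals; the empty set maps to -inf."""
    return max(s, default=-math.inf)

def infimum(s):
    """Infimum of a finite set of extended reals; the empty set maps to +inf."""
    return min(s, default=math.inf)
```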

Notation. Let $a,b\in\overline R$, we use

$(a,b)$ to denote $\{r\in\overline R|a\lt r\lt b\}$;
$(a,b]$ to denote $\{r\in\overline R|a\lt r\le b\}$;
$[a,b)$ to denote $\{r\in\overline R|a\le r\lt b\}$;
$[a,b]$ to denote $\{r\in\overline R|a\le r\le b\}$.

Proposition. For all $r\in R$, there exist rational numbers $p,q\in R$ such that $p\lt r\lt q$. (show proof)

Proposition. For all $r\in R^+$, there exist non-zero natural numbers $m,n\in R$ such that $1/m\lt r\lt n$. (show proof)

Proposition. For all $r,s\in R$ such that $r\lt s$, there exists a rational $q\in R$ such that $r\lt q\lt s$. (show proof)

Proposition. Define a function $f:Q\to R$ such that $f(q)=\{u\in Q|u\lt q\}$, then for all $m,n\in Q$,

$f(m)\lt f(n)$ if $m\lt n$;
$f(0)=0$;
$f(1)=1$;
$f(m+n)=f(m)+f(n)$;
$f(-n)=-f(n)$;
$f(mn)=f(m)f(n)$;
$f(\frac{1}{n})=\frac{1}{f(n)}$ if $n\neq0$;
if $f(m)=f(n)$, then $m=n$.

(show proof)

Complex number
The set of complex numbers, $C$, is defined to be the Cartesian product $R\times R$. We define the following operations on $C$:

Addition: a function $+:C\times C\to C$ such that for all $p\in C\times C$, $+(p)=(p_{00}+p_{10},p_{01}+p_{11})$.
Multiplication: a function $\times:C\times C\to C$ such that for all $p\in C\times C$, $\times(p)=(p_{00}p_{10}-p_{01}p_{11},p_{00}p_{11}+p_{01}p_{10})$.

Then given $a,b,c,d\in R$,

$(a,b)+(c,d)=(a+c,b+d)$;
$(a,b)(c,d)=(ac-bd,ad+bc)$.
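The two formulas can be written out directly on ordered pairs. In this sketch the function names `c_add` and `c_mul` are hypothetical, and integers stand in for real numbers.

```python
def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

z, w = (1, 2), (3, -4)
total = c_add(z, w)      # (4, -2)
product = c_mul(z, w)    # (11, 2)
```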

If a notation $r$ represents a real number, then we may use $r$ to denote $(r,0)$.

Real part and imaginary part
Given a complex number $z$, we call $z_0$ the real part of $z$, denoted $\Re(z)$, and call $z_1$ the imaginary part of $z$, denoted $\Im(z)$.

Complex conjugate
For a complex number $z$, its conjugate, denoted $\overline{z}$, is defined as $(\Re(z),-\Im(z))$.

Proposition. In $C$,

addition is associative and commutative;
$0$ is an additive identity;
every complex number has a unique additive inverse;
multiplication is associative and commutative;
$1$ is a multiplicative identity;
every non-zero complex number has a unique multiplicative inverse;
multiplication distributes over addition.

(show proof)

Note. We denote the additive inverse of a complex number $z$ by $-z$ and the multiplicative inverse of a non-zero complex number $z$ by $\frac{1}{z}$. We showed in the proof above that given $a,b\in R$,

$-(a,b)=(-a,-b)$;
$\frac{1}{(a,b)}=(\frac{a}{a^2+b^2},\frac{-b}{a^2+b^2})$ if $a\neq0$ or $b\neq0$.
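The inverse formula can be checked by multiplying it back. This sketch uses `Fraction` as an exact stand-in for real numbers; the names `c_mul` and `c_inv` are hypothetical.

```python
from fractions import Fraction

def c_mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def c_inv(z):
    """1/(a, b) = (a/(a^2+b^2), -b/(a^2+b^2)) for (a, b) != (0, 0)."""
    a, b = z
    n = a * a + b * b          # zero only when a = b = 0
    if n == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / n, -b / n)

z = (Fraction(3), Fraction(-4))
z_inv = c_inv(z)               # (3/25, 4/25)
check = c_mul(z, z_inv)        # should be the pair denoting 1
```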

Subtraction and division of complex numbers

We can define a function $-:C\times C\to C$ such that for all $z,w\in C$, $z-w=z+(-w)$.
We can define a function $\divsymbol:C\times C\setminus\{0\}\to C$ such that for all $z\in C$ and $w\in C\setminus\{0\}$, $\frac{z}{w}=z\frac{1}{w}$.

Proposition. For all $z,w\in C$,

$-(-z)=z$;
$z+(w-z)=w$;
$0z=0$;
$(-1)z=-z$;
$\frac{1}{-z}=-\frac{1}{z}$ if $z\neq0$;
$\frac{1}{\frac{1}{z}}=z$ if $z\neq0$;
$z(\frac{w}{z})=w$ if $z\neq0$.

(show proof)

Exponentiation of complex numbers with natural exponent
For every complex number $z$, by the recursion theorem, we can define a function $E_z:N\to C$ such that $E_z(0)=1$ and for every $n\in N$, $E_z(S(n))=zE_z(n)$. We will denote $E_z(n)$ by $z^n$.

Proposition. Let $z,w\in C$ and $m,n\in N$,

$(zw)^m=z^mw^m$
$z^{m+n}=z^mz^n$
$z^{mn}=(z^m)^n$

(show proof)

The number $i$
We define $$i=(0,1)\in C$$

Notation. Let $r\in R$, we denote $(r,0)$ by $r$ and $(0,r)$ by $ir$. Clearly, the product of $i$ and $r$ is $(0,1)(r,0)=(0,r)$, which is the number denoted $ir$. So the notation can be interpreted either as a multiplication or as a number, interchangeably. Note that, given $a,b\in R$, we have $$(a,b)=a+ib$$ And we may use either side to denote a complex number.

Proposition. $i^2=-1$ (show proof)

Proposition. $\frac{1}{i}=-i$ (show proof)

Proposition. Let $z,w\in C$ and $n\in N$, then

$\overline{z+w}=\overline z+\overline w$
$\overline{-z}=-\overline z$
$\overline{zw}=\overline z\text{ }\overline w$
$\overline{1/z}=1/\overline z$ if $z\neq0$
$\overline{z^n}=(\overline z)^n$

(show proof)

Absolute value of complex numbers
Let $z\in C$, the absolute value of $z$, denoted $\abs{z}$, is defined as $$\sqrt{\Re(z\overline z)}$$ Let $a,b\in R$ such that $z=(a,b)$, then $z\overline z=(a,b)(a,-b)=(a^2+b^2,0)$, so $\abs{z}=\sqrt{a^2+b^2}$, which is well-defined. Clearly, if $z=0$, then $\abs{z}=0$; if $z\neq0$, then $\abs{z}\gt0$.

Proposition. Let $z,w\in C$, then $\abs{zw}=\abs{z}\abs{w}$. (show proof)

Proposition. Let $z,w\in C$, then $\abs{z+w}\le\abs{z}+\abs{w}$. (show proof)

Proposition. Define a function $f:R\to C$ such that $f(r)=(r,0)$, then for all $r,s\in R$,

$f(0)=0$;
$f(1)=1$;
$f(r+s)=f(r)+f(s)$;
$f(-r)=-f(r)$;
$f(rs)=f(r)f(s)$;
$f(\frac{1}{r})=\frac{1}{f(r)}$ if $r\neq0$;
$f(r^n)=f(r)^n$ for all $n\in N$;
if $f(r)=f(s)$, then $r=s$.

(show proof)

Hierarchy of number systems
We have defined the number systems $N,Z,Q,R,C$, and the injective maps to transform a number from $N$ to $Z$, from $Z$ to $Q$, from $Q$ to $R$, and from $R$ to $C$. In practice, we may implicitly transform a number across number systems by these maps, or their inverses from their ranges.

Algebraic structure (show)

Algebraic structure
An algebraic structure is a type of objects, which take the form of tuples of sets. When we define an algebraic structure, we define a formula that checks the size of a tuple and whether the entries satisfy certain conditions. Usually, the leading entry of the tuple is the "main set"; that is, when we say an object is in an instance of an algebraic structure, we mean the object is in the leading entry. In some cases, a sub-tuple of a tuple is an instance of an algebraic structure, but the tuple itself is not, due to the existence of extra entries. We may still say that said tuple is an instance of said algebraic structure, while we implicitly mean that said sub-tuple is.

Note. The concept of "function" can be viewed as an algebraic structure. An object $\mathcal F$ is a function, if and only if it has $3$ entries, and "$\mathcal F(0)$ is a function from $\mathcal F(1)$ to $\mathcal F(2)$". Note that both $\mathcal F$ and $\mathcal F(0)$ may be called a function in this case.

Operation
Let $S$ be a non-empty set and let $n\in N$. An $n$-ary operation on $S$ is an ordered pair of the natural number $n$ and a function from the set of $n$-tuples of $S$, denoted $S^n$, to $S$. Note that we may implicitly discuss a $1$-ary operation on $S$ and a function from $S$ to $S$ interchangeably, and since $S^0=\{\emptyset\}$, we may implicitly discuss a $0$-ary operation on $S$ and a member of $S$ interchangeably. The collection of operations on $S$ is a set. (show proof)

Homomorphism
Suppose we have a function $f:A\to B$ with

an indexed set of operations $(F_s)_{s\in S}$ and an indexed set of relations $(R_t)_{t\in T}$ on $A$;
an indexed set of operations $(G_s)_{s\in S}$ and an indexed set of relations $(Q_t)_{t\in T}$ on $B$;

such that

for all $s\in S$, $F_s$ and $G_s$ share the same arity number $k_s$, and for all $a_1,\ldots,a_{k_s}\in A$, $f(F_s(a_1,\ldots,a_{k_s}))=G_s(f(a_1),\ldots,f(a_{k_s}))$;
for all $t\in T$, $R_t$ and $Q_t$ share the same arity number $l_t$, and for all $a_1,\ldots,a_{l_t}\in A$, if $(a_1,\ldots,a_{l_t})\in R_t$ then $(f(a_1),\ldots,f(a_{l_t}))\in Q_t$.

Then $f$ is said to be a homomorphism with respect to said operations and relations.
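For a single binary operation, the first condition can be tested on a finite sample. The sketch below is illustrative only: the helper `is_homomorphism` and the example maps are hypothetical, and a finite sample can refute the condition but cannot prove it for all of $A$.

```python
def is_homomorphism(f, op_a, op_b, sample):
    """Check f(op_a(x, y)) == op_b(f(x), f(y)) for all x, y in a finite sample."""
    return all(
        f(op_a(x, y)) == op_b(f(x), f(y))
        for x in sample
        for y in sample
    )

add = lambda x, y: x + y                 # addition on the integers
add_mod3 = lambda x, y: (x + y) % 3      # addition on {0, 1, 2}
reduce_mod3 = lambda x: x % 3            # candidate homomorphism
shift = lambda x: x + 1                  # not a homomorphism for addition

sample = range(-10, 11)
mod3_ok = is_homomorphism(reduce_mod3, add, add_mod3, sample)
shift_ok = is_homomorphism(shift, add, add, sample)
```

Here reduction modulo $3$ passes the check, while the shift $x\mapsto x+1$ fails it, since $(x+y)+1\neq(x+1)+(y+1)$.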

Isomorphism
An isomorphism is a bijective homomorphism whose inverse is also a homomorphism. Two sets are said to be isomorphic with respect to certain operations and relations, if there exists an isomorphism between them with respect to said operations and relations. If $A$ and $B$ are isomorphic, we write $A\cong B$.

Note. Sometimes, two sets are informally said to be naturally isomorphic if there exists an obvious correspondence between their elements, which is informally called a canonical isomorphism.

Automorphism
An automorphism is an isomorphism from a set to itself.

Proposition. A bijective homomorphism with respect to operations only is an isomorphism. (show proof)

Proposition. Given sets $A,B,C$ and certain operations and relations defined on all of $A,B,C$, we have:

$A\cong A$;
if $A\cong B$ then $B\cong A$;
if $A\cong B$ and $B\cong C$, then $A\cong C$.

(show proof)

Proof. Suppose the operations in question are $(\varphi_i)_{i\in S}$ on $A$, $(\psi_i)_{i\in S}$ on $B$, and $(\tau_i)_{i\in S}$ on $C$, where for all $i\in S$, $\varphi_i$, $\psi_i$ and $\tau_i$ share the same arity number, and the relations in question are $(R_i)_{i\in T}$ on $A$, $(Q_i)_{i\in T}$ on $B$, and $(P_i)_{i\in T}$ on $C$, where for all $i\in T$, $R_i$, $Q_i$ and $P_i$ share the same arity number.

Let $I:A\to A$ be the identity function on $A$, then $I$ is bijective and $I^{-1}=I$. Let $i\in S$, let $k$ be the arity number of $\varphi_i$, and let $x_1,\ldots,x_k\in A$, then $I(\varphi_i(x_1,\ldots,x_k))=\varphi_i(I(x_1),\ldots,I(x_k))$. The relations are trivially preserved by $I$. Thus $I$ is a homomorphism, and $I^{-1}=I$ is also a homomorphism. Hence $I$ is an isomorphism. We have shown that $A\cong A$.

Suppose $A\cong B$, then there exists an isomorphism $f:A\to B$. Then $f^{-1}:B\to A$ is a bijective homomorphism, and $(f^{-1})^{-1}=f$ is also a homomorphism. Thus $f^{-1}$ is an isomorphism, implying $B\cong A$.

Suppose $A\cong B$ and $B\cong C$, then there exist isomorphisms $f:A\to B$ and $g:B\to C$. Note that $g\circ f:A\to C$ is bijective and $(g\circ f)^{-1}=f^{-1}\circ g^{-1}$. Let $i\in S$, let $k$ be the arity number of $\varphi_i$, and let $x_1,\ldots,x_k\in A$, then $(g\circ f)(\varphi_i(x_1,\ldots,x_k))=g(\psi_i(f(x_1),\ldots,f(x_k)))=\tau_i((g\circ f)(x_1),\ldots,(g\circ f)(x_k))$. The relations are preserved by $g\circ f$, since they are preserved by $f$ and then by $g$. Thus $g\circ f$ is a homomorphism. Let $i\in S$, let $k$ be the arity number of $\tau_i$, and let $y_1,\ldots,y_k\in C$, then $(g\circ f)^{-1}(\tau_i(y_1,\ldots,y_k))=f^{-1}(\psi_i(g^{-1}(y_1),\ldots,g^{-1}(y_k)))=\varphi_i((g\circ f)^{-1}(y_1),\ldots,(g\circ f)^{-1}(y_k))$. The relations are preserved by $(g\circ f)^{-1}$, since they are preserved by $g^{-1}$ and then by $f^{-1}$. Thus $(g\circ f)^{-1}$ is a homomorphism. Hence $g\circ f$ is an isomorphism. We have shown that $A\cong C$. $\blacksquare$

Topological space
A topological space is a pair of sets $(X,\tau)$, satisfying the following conditions:

$\tau$ is a collection of subsets of $X$.
$\emptyset,X\in\tau$.
The union of any subset of $\tau$ is in $\tau$.
The intersection of any pair of sets in $\tau$ is in $\tau$.

The elements of $\tau$ are called open sets in $X$ and the collection $\tau$ is called a topology on $X$. A subset $A$ of $X$ is said to be closed in $X$ if and only if $X\setminus A$ is open.
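On a finite set, the four conditions can be checked by brute force. This is a sketch under the assumption that $X$ and $\tau$ are finite, in which case closure under pairwise unions (together with $\emptyset\in\tau$) already gives closure under all unions; the name `is_topology` is hypothetical.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the topology conditions for a finite set X and finite collection tau."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    # tau must be a collection of subsets of X.
    if not all(U <= X for U in tau):
        return False
    # The empty set and X must be open.
    if frozenset() not in tau or X not in tau:
        return False
    # Pairwise unions and intersections must be open.
    return all(U | V in tau and U & V in tau for U, V in combinations(tau, 2))

X = {1, 2, 3}
chain = [set(), {1}, {1, 2}, X]     # a topology on X
broken = [set(), {1}, {2}, X]       # {1} | {2} = {1, 2} is missing
```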

Given a topological space $X$, we can define a function $\mathcal{N}:X\to\mathcal{P}(\mathcal{P}(X))$ that maps each $x\in X$ to a collection of subsets of $X$, each of which is a superset of an open set that contains $x$. In logical symbols, this collection is $$\{N\in\mathcal{P}(X)|\exists(M\in\tau),x\in M\land M\subseteq N\}$$ The elements of $\mathcal{N}(x)$ are called neighborhoods of $x$. For all $x\in X$, the following are satisfied:

For all $N\in\mathcal{N}(x)$, $x\in N$.
For all $M\in\mathcal{P}(X)$, if there exists $N\in\mathcal{N}(x)$ such that $N\subseteq M$, then $M\in\mathcal{N}(x)$.
For all $M,N\in\mathcal{N}(x)$, $M\cap N\in\mathcal{N}(x)$.
For all $N\in\mathcal{N}(x)$, there exists $L\in\mathcal{N}(x)$ such that $L\subseteq N$ and for all $y\in L$, $N\in\mathcal{N}(y)$.

(show proof)

Proposition. Let $X$ be a topological space.

$\emptyset,X$ are closed.
The intersection of any non-empty set of closed sets is closed.
The union of any pair of closed sets is closed.

(show proof)

Proposition. Let $X$ be a topological space and $S\subseteq X$. Then $S$ is open if and only if $S$ is a neighborhood of every point in $S$. (show proof)

Basis
Let $X$ be a topological space, then a set $\mathcal B$ of open subsets of $X$ is said to be a basis of $X$ if every open subset of $X$ is the union of some subset of $\mathcal B$.

Topological subspace
Let $(X,\tau_X)$ be a topological space. Let $U$ be a subset of $X$ and define $$\tau_U=\{S\cap U|S\in\tau_X\}$$ Then $(U,\tau_U)$ is a topological space (show proof), called a topological subspace of $X$. Note that if $U\in\tau_X$, then $$\tau_U=\{S\in\tau_X|S\subseteq U\}$$

Proposition. Let $X$ be a topological space, let $U\subseteq X$, and let $V\subseteq U$. Then the topological subspace $V$ of $X$ is equivalent to the topological subspace $V$ of the topological subspace $U$ of $X$. (show proof)

Note. For the following definitions and propositions, by default, we suppose $S$ is a subset of a topological space $X$.

Interior
The union of all open subsets of $X$ contained in $S$ is called the interior of $S$, denoted $\text{Int}S$.

Exterior
The union of all open subsets of $X$ contained in $X\setminus S$ is called the exterior of $S$, denoted $\text{Ext}S$.

Closure
$X\setminus\text{Ext}S$ is called the closure of $S$, denoted $\overline S$.

Boundary
$X\setminus(\text{Int}S\cup\text{Ext}S)$ is called the boundary of $S$, denoted $\partial S$.
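The four definitions above can be computed directly in a small finite example. The space, the topology and the subset below are hypothetical choices made for illustration, and `interior` is a hypothetical helper name.

```python
def interior(S, tau):
    """Union of all open sets contained in S."""
    result = frozenset()
    for U in tau:
        if U <= S:
            result |= U
    return result

X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]
S = frozenset({1, 3})

int_S = interior(S, tau)            # {1}
ext_S = interior(X - S, tau)        # interior of the complement: empty
closure_S = X - ext_S               # {1, 2, 3}
boundary_S = X - (int_S | ext_S)    # {2, 3}
```

In this example the closed sets are $X$, $\{2,3\}$, $\{3\}$ and $\emptyset$, so the smallest closed set containing $S$ is $X$, in agreement with the computed closure.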

Compactness
A collection of open subsets of $X$ whose union contains $S$ is said to be an open cover of $S$. If every open cover of $S$ has a finite subset that is also an open cover of $S$, then $S$ is said to be compact. If $\overline S$ is compact, then $S$ is said to be precompact. If every point of $X$ has a compact neighborhood, then $X$ is said to be locally compact.

Connectedness
A topological space $X$ is said to be disconnected if it is the union of two disjoint non-empty open subsets; otherwise, it is said to be connected. The connectedness of a subset $S$ of $X$ is determined by the connectedness of $S$ as a topological subspace of $X$.

Isolated point
Let $p\in S$. If there exists a neighborhood $U$ of $p$ such that $U\cap S=\{p\}$, then $p$ is called an isolated point of $S$.

Limit point
Let $p\in X$. If for every neighborhood $U$ of $p$, there exists $q\in S$ such that $q\in U\setminus\{p\}$, then $p$ is called a limit point of $S$.

Dense
If $\overline S=X$, then $S$ is called dense in $X$.

Proposition. For all $p\in S$, $p$ is either a limit point of $S$ or an isolated point of $S$. (show proof)

Proposition. $\overline S$ is closed. (show proof)

Proposition. $S$ is closed if and only if $S=\overline S$. (show proof)

Proposition. $S$ is closed if and only if it contains its limit points. (show proof)

Proposition. $\overline S$ is the union of $S$ and the set of its limit points. (show proof)

Proposition. Suppose $U,V\subseteq X$ such that $U\subseteq V$, then $\overline U\subseteq\overline V$. (show proof)

Proposition. Suppose $U,V\subseteq X$ are compact, then $U\cup V$ is compact. (show proof)

Proposition. If $S$ is compact, then every closed subset of $S$ with respect to the topology of $X$ is compact. (show proof)

Proposition. Let $U$ be a subspace of $X$ and let $S\subseteq U$. Then $S$ is compact in $X$ if and only if $S$ is compact in $U$. (show proof)

Limit
Let $X$ be a topological space, let $(p_i)$ be a sequence of points of $X$, and let $p\in X$. If for every neighborhood $S$ of $p$, there exists $n\in N$ such that for all $k\ge n$, $p_k\in S$, then $p$ is said to be a limit of $(p_i)$, denoted $p_i\to p$, and we also say that $(p_i)$ converges to $p$. If $(p_i)$ does not converge to any $p\in X$, then we say that $(p_i)$ diverges.

Continuity
Let $X$ and $Y$ be topological spaces and let $f$ be a function from a subset of $X$ to $Y$. If for every open subset $U$ of $Y$, the preimage of $U$ under $f$ is an open subset of $X$, then $f$ is said to be continuous.

Homeomorphism
Let $X$ and $Y$ be topological spaces and let $f$ be a function from $X$ to $Y$. If $f$ is bijective and continuous, and the inverse of $f$ is also continuous, then $f$ is called a homeomorphism. If there exists a homeomorphism between topological spaces $X$ and $Y$, then $X$ and $Y$ are said to be homeomorphic.

Proposition. Let $X,Y$ be topological spaces. If $f:X\to Y$ is continuous, then its restriction on every subspace of $X$ is continuous. (show proof)

Proposition. Let $X,Y$ be topological spaces. If $f:X\to Y$ is continuous, then for every topological subspace $Z$ of $Y$ such that $f(X)\subseteq Z$, $f:X\to Z$ is continuous. (show proof)

Proposition. Let $X,Y$ be topological spaces. If we have $f:X\to Y$, and for all $p\in X$, there exists an open subspace $U_p$ of $X$ containing $p$ such that $f|_{U_p}$ is continuous, then $f$ is continuous. (show proof)

Proposition. Let $X,Y,Z$ be topological spaces. If $f:X\to Y$ and $g:Y\to Z$ are continuous, then $g\circ f:X\to Z$ is continuous. (show proof)

Proposition. Let $X,Y$ be topological spaces and let $f:X\to Y$ be continuous. Then for every compact subset $U$ of $X$, $f(U)$ is compact. (show proof)

Proposition. Let $X,Y$ be topological spaces and let $f:X\to Y$ be continuous. Then for every connected subset $U$ of $X$, $f(U)$ is connected. (show proof)

Path
Let $X$ be a topological space and $u,v\in X$. A path from $u$ to $v$ is a continuous map $f:[0,1]\to X$ such that $f(0)=u$ and $f(1)=v$.

Path-connectedness
Let $X$ be a topological space. If for all $u,v\in X$ there exists a path from $u$ to $v$, then $X$ is said to be path-connected.

Lemma. $R$ is connected. (show proof)

Lemma. $[0,1]$ is connected. (show proof)

Proposition. Let $X$ be a topological space. If $X$ is path-connected, then it is connected. (show proof)

Proposition. Let $X$ be a topological space. Then $X$ is connected if and only if every continuous function $f:X\to\{1,-1\}$ is constant (where the topology of $\{1,-1\}$ is its power set). (show proof)

Metric space
A metric space is a set $X$ with a function $d:X\times X\to R$, such that for all $x,y,z\in X$:

$d(x,y)=0\leftrightarrow x=y$,
$d(x,y)=d(y,x)$,
$d(x,z)\le d(x,y)+d(y,z)$.
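The three conditions can be tested by brute force on a finite sample of points. The sketch below is an illustration only: `satisfies_metric_axioms` is a hypothetical helper, rationals stand in for real numbers so the arithmetic is exact, and passing on a sample does not prove the conditions on all of $X$.

```python
from fractions import Fraction
from itertools import product

def satisfies_metric_axioms(points, d):
    """Check the three metric conditions for every triple of sample points."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):      # d(x, y) = 0 <-> x = y
            return False
        if d(x, y) != d(y, x):              # symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z):     # triangle inequality
            return False
    return True

points = [Fraction(-3, 2), Fraction(0), Fraction(1), Fraction(7, 3)]
absolute_difference = lambda x, y: abs(x - y)
squared_difference = lambda x, y: (x - y) ** 2   # fails the triangle inequality
```

The squared difference is symmetric and vanishes exactly on the diagonal, yet it fails the third condition, for instance at $x=-\frac{3}{2}$, $y=0$, $z=1$.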

Metric subspace
Let $(X,d_X)$ be a metric space. It is easily verifiable that a subset $U$ of $X$, together with $$d_U=d_X|_{U\times U}$$ forms a metric space $(U,d_U)$, called a metric subspace of $X$.

Proposition. Given a metric space $X$, for all $x,y\in X$, $d(x,y)\ge0$. (show proof)

Proposition. Given a metric space $X$, for all $x,y\in X$, $d(x,y)\gt0\leftrightarrow x\neq y$. (show proof)

Open ball
In a metric space $X$, an open ball centered at $p\in X$ with radius $r\gt0$, denoted $B_r(p)$, is defined by $$\{u\in X|d(u,p)\lt r\}$$

Standard topology of metric spaces
Let $X$ be a metric space. Define $\tau$ as the collection of subsets $U$ of $X$, such that for every point $p$ in $U$ there is an open ball centered at $p$ and contained by $U$. In logical symbols, $\tau$ is $$\{U\subseteq X|\forall(p\in U),\exists(r\in R),r\gt0\land B_r(p)\subseteq U\}$$ Then $X$ with $\tau$ forms a topological space (show proof). We call $\tau$ the standard topology of $X$. The concepts of open set, closed set and neighborhood in $X$ follow from $\tau$.

Note. From this point on, metric spaces are implicitly topological spaces with the standard topology.

Proposition. Given a metric space $X$, let $U$ be a subset of $X$, then the topology of the topological subspace $U$ equals the standard topology of the metric subspace $U$. (show proof)

Boundedness
A subset $S$ of a metric space $X$ is said to be bounded if there exists $p\in X$ and $r\gt0$ such that for all $s\in S$, $d(s,p)\lt r$.

Limit point in metric spaces
In a metric space $X$, a point $p$ is said to be a limit point of a subset $S$, if for all $r\gt0$, there exists $s\in S$ such that $s\neq p$ and $d(s,p)\lt r$. This is clearly equivalent to the definition provided by $X$ being a topological space.

Interior point in metric spaces
A point $p\in S$ is said to be an interior point of $S$, if there exists $r\gt0$, such that for all $x\in X$ with $d(x,p)\lt r$, we have $x\in S$. This is clearly equivalent to the definition provided by $X$ being a topological space.

Isolated point in metric spaces
A point $p\in S$ is said to be an isolated point of $S$, if there exists $r\gt0$, such that for all $x\in X$ with $0\lt d(x,p)\lt r$, we have $x\notin S$. This is clearly equivalent to the definition provided by $X$ being a topological space.

Note. $R$ or $C$, together with a function $d:R\times R\to R$ or $d:C\times C\to R$, defined by $$d(x,y)=\abs{x-y}$$ clearly forms a metric space.

Group
A group is a set $G$ together with a binary operation $\cdot$ on $G$, satisfying the following conditions:

Associativity: for all $a,b,c\in G$, $(a\cdot b)\cdot c=a\cdot (b\cdot c)$.
Identity element: there exists $e\in G$ such that for all $a\in G$, $a\cdot e=a$ and $e\cdot a=a$.
Inverse element: for all $a\in G$, there exists $b\in G$, such that $a\cdot b=e$ and $b\cdot a=e$, and we denote $b$ by $a^{-1}$.

Note. In a group, identity element is clearly unique; inverse elements are also clearly unique.
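For a finite set, the group conditions can be verified exhaustively. The following sketch uses a hypothetical helper `is_group` and takes arithmetic modulo $5$ as the example; neither appears in the text above.

```python
def is_group(elements, op):
    """Brute-force check of the group conditions on a finite set."""
    G = list(elements)
    # The operation must map G x G into G.
    if any(op(a, b) not in G for a in G for b in G):
        return False
    # Associativity.
    if any(op(op(a, b), c) != op(a, op(b, c)) for a in G for b in G for c in G):
        return False
    # Identity element.
    identities = [e for e in G if all(op(a, e) == a and op(e, a) == a for a in G)]
    if not identities:
        return False
    e = identities[0]
    # Inverse elements.
    return all(any(op(a, b) == e and op(b, a) == e for b in G) for a in G)

add_mod5 = lambda a, b: (a + b) % 5
mul_mod5 = lambda a, b: (a * b) % 5
```

With these definitions, $\{0,\ldots,4\}$ is a group under addition modulo $5$ and $\{1,\ldots,4\}$ is a group under multiplication modulo $5$, while $\{0,\ldots,4\}$ under multiplication modulo $5$ is not, because $0$ has no inverse.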

Note. A group naturally defines the following operations:

A nullary operation: the identity element.
A unary operation: the inverse operation.
A binary operation: the operation $\cdot$.

When we talk about homomorphism or isomorphism between groups, these operations must be preserved.

Ring
A ring is a set $R$ together with two binary operations on $R$ called addition $+$ and multiplication $\times$ (the notation is usually omitted), satisfying the following conditions:

Associativity of addition and multiplication: for all $a,b,c\in R$, $a+(b+c)=(a+b)+c$, and $a(bc)=(ab)c$.
Commutativity of addition: for all $a,b\in R$, $a+b=b+a$.
Additive and multiplicative identity: there exist two distinct elements $0,1\in R$ such that for all $a\in R$, $a+0=a$, $a1=a$, and $1a=a$.
Additive inverses: for all $a\in R$, there exists $b\in R$, such that $a+b=0$, and we denote $b$ by $-a$.
Distributivity of multiplication over addition: for all $a,b,c\in R$, $a(b+c)=ab+ac$ and $(a+b)c=ac+bc$.

Note. In a ring, additive identity and multiplicative identity are clearly unique; additive inverses are also clearly unique.

Note. A ring naturally defines the following operations:

Two nullary operations: $0$ and $1$.
A unary operation: additive inverse.
Two binary operations: addition and multiplication.

When we talk about homomorphism or isomorphism between rings, these operations must be preserved.

Note. $Z$ is a ring.

Field
A field is a set $F$ together with two binary operations on $F$ called addition $+$ and multiplication $\times$ (the notation is usually omitted), satisfying the following conditions:

Associativity of addition and multiplication: for all $a,b,c\in F$, $a+(b+c)=(a+b)+c$, and $a(bc)=(ab)c$.
Commutativity of addition and multiplication: for all $a,b\in F$, $a+b=b+a$, and $ab=ba$.
Additive and multiplicative identity: there exist two distinct elements $0,1\in F$ such that for all $a\in F$, $a+0=a$ and $a1=a$.
Additive inverses: for all $a\in F$, there exists $b\in F$, such that $a+b=0$, and we denote $b$ by $-a$.
Multiplicative inverses: for all $a\in F$ such that $a\neq 0$, there exists $b\in F$, such that $ab=1$, and we denote $b$ by $\frac{1}{a}$.
Distributivity of multiplication over addition: for all $a,b,c\in F$, $a(b+c)=ab+ac$.

Note. In a field, additive identity and multiplicative identity are clearly unique; additive inverses and multiplicative inverses, where they exist, are also clearly unique.

Note. A field naturally defines the following operations:

Two nullary operations: $0$ and $1$.
Two unary operations: additive inverse and extended multiplicative inverse (mapping $0$ to $0$).
Two binary operations: addition and multiplication.

When we talk about homomorphism or isomorphism between fields, these operations must be preserved.

Note. Let $F$ be a field, and let $a,b\in F$. Since $b$ has an additive inverse, denoted $-b$, we can define subtraction by $a-b=a+(-b)$. If $b\neq0$, then $b$ has a multiplicative inverse, denoted $\frac{1}{b}$, so we can define division by $\frac{a}{b}=a\frac{1}{b}$. Also, we can define exponentiation with natural exponent by the recursion theorem, so that $a^0=1$ and for all $n\in N$, $a^{S(n)}=aa^n$.

Proposition. Let $F$ be a field.

For all $a\in F$, $a0=0$.
For all $a,b\in F$, if $ab=0$, then $a=0$ or $b=0$.

(show proof)

Summation and product
Let $E$ be a field. Suppose we have $z_a,\ldots,z_b\in E$ for some $a,b\in Z$ such that $a\le b$. Define $f:N\to E$ by $f(n)=z_{a+n}$ if $n\le b-a$ and $f(n)=0$ otherwise, then by "repeated operation" in the set theory section, we can define a function $F:N\to E$ such that $F(n)$ represents $f(0)+\ldots+f(n)$ and a function $G:N\to E$ such that $G(n)$ represents $f(0)\times\ldots\times f(n)$. Then we will denote $F(b-a)$, representing $z_a+\ldots+z_b$, by $$\sum_{k=a}^bz_k$$ and $G(b-a)$, representing $z_a\times\ldots\times z_b$, by $$\prod_{k=a}^bz_k$$ where $k$ is a variable symbol not occurring in the term $z$. Also, given $a,b\in Z$ such that $a\gt b$, $\sum_{k=a}^bz_k$ denotes $0$, and $\prod_{k=a}^bz_k$ denotes $1$. Note that $k$ does not actually occur in the term and is purely meta-logical serving to indicate the underlying function $(z_a,\ldots,z_b)$ from $\{a,\ldots,b\}$ to $E$.

Proposition. $\sum_{k=a}^az_k=z_a$. (show proof)

Proposition. Given $a-1\le c\le b$, we have $$\sum_{k=a}^bz_k=\sum_{k=a}^cz_k+\sum_{k=c+1}^bz_k$$ and $$\prod_{k=a}^bz_k=\prod_{k=a}^cz_k\prod_{k=c+1}^bz_k$$ (show proof)

Note. With the above propositions, we have that $$\sum_{k=a}^bz_k=\p{\sum_{k=a}^{b-1}z_k}+z_b$$ and $$\prod_{k=a}^bz_k=\p{\prod_{k=a}^{b-1}z_k}z_b$$ if $a\le b$.
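The conventions for $\sum$ and $\prod$, including the empty cases $a\gt b$ and the splitting identity, can be mirrored in a few lines. This is a sketch only: `summation` and `product` are hypothetical names, the indexed family is passed as a function `z`, and `Fraction` stands in for elements of a field.

```python
from fractions import Fraction

def summation(z, a, b):
    """z(a) + ... + z(b); equals 0 when a > b."""
    total = 0
    for k in range(a, b + 1):
        total = total + z(k)
    return total

def product(z, a, b):
    """z(a) * ... * z(b); equals 1 when a > b."""
    result = 1
    for k in range(a, b + 1):
        result = result * z(k)
    return result

z = lambda k: Fraction(k, 2)
a, b = -2, 3

# Splitting identity for every c with a - 1 <= c <= b.
sum_splits = all(
    summation(z, a, b) == summation(z, a, c) + summation(z, c + 1, b)
    for c in range(a - 1, b + 1)
)
prod_splits = all(
    product(z, a, b) == product(z, a, c) * product(z, c + 1, b)
    for c in range(a - 1, b + 1)
)
```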

Proposition. Suppose $a\le b$ and $c$ is a constant, then

$$\sum_{k=a}^b(cz_k)=c\p{\sum_{k=a}^bz_k}$$
$$\sum_{k=a}^b(z_k+w_k)=\sum_{k=a}^bz_k+\sum_{k=a}^bw_k$$
$$\prod_{k=a}^b(cz_k)=c^{b-a+1}\p{\prod_{k=a}^bz_k}$$
$$\prod_{k=a}^b(z_kw_k)=\prod_{k=a}^bz_k\prod_{k=a}^bw_k$$

(show proof)

Note. $Q$, $R$, and $C$ are fields.

Vector space
A vector space is a set $V$, whose members are called vectors, together with a field $F$, whose members are called scalars, and the following operations:

vector addition: $+:V \times V \to V$, and
scalar multiplication: for each $c\in F$, $c:V \to V$

that satisfies the following conditions:

Associativity of vector addition: for all $\vb u,\vb v,\vb w\in V$, $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$.
Commutativity of vector addition: for all $\vb u,\vb v\in V$, $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.
Identity element of vector addition: there exists an element $\mathbf{0} \in V$, such that for all $\mathbf{v} \in V$, $\mathbf{v} + \mathbf{0} = \mathbf{v}$.
Inverse elements of vector addition: for all $\mathbf{v} \in V$, there exists an element $\mathbf{w} \in V$, such that $\mathbf{v} + \mathbf{w} = \mathbf{0}$, and we denote $\mathbf{w}$ by $-\mathbf{v}$.
Compatibility of scalar multiplication with field multiplication: for all $a,b\in F$ and $\vb v\in V$, $a(b\mathbf{v}) = (ab)\mathbf{v}$.
Identity element of scalar multiplication: for all $\mathbf{v} \in V$, $1\mathbf{v} = \mathbf{v}$, where $1$ denotes the multiplicative identity in $F$.
Distributivity of scalar multiplication with respect to vector addition: for all $a\in F$ and $\vb u,\vb v\in V$, $a(\mathbf{u} + \mathbf{v}) = a\mathbf{u} + a\mathbf{v}$.
Distributivity of scalar multiplication with respect to field addition: for all $a,b\in F$ and $\vb v\in V$, $(a + b)\mathbf{v} = a\mathbf{v} + b\mathbf{v}$.

Note. In a vector space, identity element of vector addition is clearly unique; inverse elements of vector addition are also clearly unique.

Note. A vector space naturally defines the following operations:

A nullary operation: $\vb0$.
A unary operation: additive inverse.
A binary operation: vector addition.
A schema of unary operations: scalar multiplication.

When we talk about homomorphism or isomorphism between vector spaces, these operations must be preserved. However, it is trivial that a mapping that preserves vector addition also preserves $\vb0$ and additive inverse, thus preservation of vector addition and scalar multiplication suffices to prove a mapping homomorphic.

Note. Let $V$ be a vector space, and let $\vb v,\vb w\in V$. Since $\vb w$ has an inverse element of vector addition, denoted $-\vb w$, we can define vector subtraction by $\vb v-\vb w=\vb v+(-\vb w)$.

Proposition. Let $V$ be a vector space over a field $F$.

For all $c\in F$, $c\vb0=\vb0$.
For all $\vb v\in V$, $0\vb v=\vb0$.
For all $\vb v\in V$, $-\vb v=(-1)\vb v$.

(show proof)

Note. Let $F$ be a field. For any non-zero natural number $n$, the set consisting of all $n$-tuples of $F$, denoted $F^n$, together with $F$ itself as a field and the following operations:

$+:F^n \times F^n \to F^n$, such that $(a_1,\ldots,a_n)+(b_1,\ldots,b_n)=(a_1+b_1,\ldots,a_n+b_n)$, and
for each $c\in F$, $c:F^n \to F^n$, such that $c(a_1,\ldots,a_n)=(ca_1,\ldots,ca_n)$,

forms a vector space.
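The componentwise operations on $F^n$ can be sketched with tuples. Here $F$ is taken to be $Q$, modeled by `Fraction`, which is an assumption made for exactness; `vec_add` and `scalar_mul` are hypothetical names.

```python
from fractions import Fraction

def vec_add(u, v):
    """(a_1, ..., a_n) + (b_1, ..., b_n) = (a_1 + b_1, ..., a_n + b_n)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return tuple(a + b for a, b in zip(u, v))

def scalar_mul(c, v):
    """c(a_1, ..., a_n) = (c a_1, ..., c a_n)."""
    return tuple(c * a for a in v)

u = (Fraction(1), Fraction(2), Fraction(3))
v = (Fraction(1, 2), Fraction(0), Fraction(-1))
c, d = Fraction(2, 3), Fraction(-5, 4)

# Two of the vector space conditions, checked on sample values.
distributes_over_vectors = scalar_mul(c, vec_add(u, v)) == vec_add(scalar_mul(c, u), scalar_mul(c, v))
distributes_over_scalars = scalar_mul(c + d, u) == vec_add(scalar_mul(c, u), scalar_mul(d, u))
```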

Note. $F^1$ and $F$ are naturally isomorphic. In practice, we may implicitly translate between them.

Normed vector space
A normed vector space is a vector space $V$ over a field $F$, which is either $R$ or $C$, with a norm $\Vert\cdot\Vert:V\to R$ with the following properties:

For all $\vb v\in V$, $\Vert\vb v\Vert\ge0$.
For all $\vb v\in V$, $\Vert\vb v\Vert=0$ if and only if $\vb v=\vb0$.
For all $\vb v\in V$ and $c\in F$, $\Vert c\vb v\Vert=\abs{c}\Vert\vb v\Vert$.
For all $\vb u,\vb v\in V$, $\Vert\vb v+\vb u\Vert\le\Vert\vb v\Vert+\Vert\vb u\Vert$.

Standard metric of normed vector spaces
Let $V$ be a normed vector space. Define a function $d:V\times V\to R$ by $$d(\vb u,\vb v)=\Vert\vb u-\vb v\Vert$$ Then $V$ with $d$ forms a metric space (show proof). We call $d$ the standard metric of $V$.

Note. From this point on, normed vector spaces are implicitly metric spaces with the standard metric.

Inner product space
An inner product space is a vector space $V$ over $R$, together with an inner product $\braket{\cdot}{\cdot}:V\times V\to R$ with the following properties:

For all $\vb u,\vb v,\vb w\in V$, $\braket{\vb u+\vb v}{\vb w}=\braket{\vb u}{\vb w}+\braket{\vb v}{\vb w}$.
For all $\vb u,\vb v\in V$ and $c\in R$, $\braket{c\vb u}{\vb v}=c\braket{\vb u}{\vb v}$.
For all $\vb u,\vb v\in V$, $\braket{\vb u}{\vb v}=\braket{\vb v}{\vb u}$.
For all $\vb u\in V$ such that $\vb u\neq\vb0$, $\braket{\vb u}{\vb u}\gt0$.

The last property is called positive-definiteness.

Proposition. The following properties of inner product can be deduced:

For all $\vb u,\vb v,\vb w\in V$, $\braket{\vb u}{\vb v+\vb w}=\braket{\vb u}{\vb v}+\braket{\vb u}{\vb w}$.
For all $\vb u,\vb v\in V$ and $c\in R$, $\braket{\vb u}{c\vb v}=c\braket{\vb u}{\vb v}$.
For all $\vb u\in V$, $\braket{\vb0}{\vb u}=0$ and $\braket{\vb u}{\vb0}=0$.

(show proof)

Complex inner product space
A complex inner product space is a vector space $V$ over $C$, together with a complex inner product $\braket{\cdot}{\cdot}:V\times V\to C$ with the following properties:

For all $\vb u,\vb v,\vb w\in V$, $\braket{\vb u+\vb v}{\vb w}=\braket{\vb u}{\vb w}+\braket{\vb v}{\vb w}$.
For all $\vb u,\vb v\in V$ and $c\in C$, $\braket{c\vb u}{\vb v}=c\braket{\vb u}{\vb v}$.
For all $\vb u,\vb v\in V$, $\braket{\vb u}{\vb v}=\overline{\braket{\vb v}{\vb u}}$.
For all $\vb u\in V$ such that $\vb u\neq\vb0$, $\Re(\braket{\vb u}{\vb u})\gt0$.

Proposition. The following properties of complex inner product can be deduced:

For all $\vb u,\vb v,\vb w\in V$, $\braket{\vb u}{\vb v+\vb w}=\braket{\vb u}{\vb v}+\braket{\vb u}{\vb w}$.
For all $\vb u,\vb v\in V$ and $c\in C$, $\braket{\vb u}{c\vb v}=\overline{c}\braket{\vb u}{\vb v}$.
For all $\vb u\in V$, $\braket{\vb0}{\vb u}=0$ and $\braket{\vb u}{\vb0}=0$.
For all $\vb u\in V$, $\Im(\braket{\vb u}{\vb u})=0$.

(show proof)

Definition. Given an inner product space $V$, we can define a function $\Vert\cdot\Vert:V\to R$ by $$\Vert\vb v\Vert=\sqrt{\braket{\vb v}{\vb v}}$$ since $\braket{\vb v}{\vb v}\ge0$ for all $\vb v\in V$.

Given a complex inner product space $V$, we can define a function $\Vert\cdot\Vert:V\to R$ by $$\Vert\vb v\Vert=\sqrt{\Re(\braket{\vb v}{\vb v})}$$ since $\Re(\braket{\vb v}{\vb v})\ge0$ for all $\vb v\in V$.

Clearly, $\Vert\cdot\Vert$ satisfies the first three properties of a norm (show proof).

Note. It is trivial that $R$ with $\braket{a}{b}=ab$ is an inner product space, and $C$ with $\braket{z}{w}=z\overline{w}$ is a complex inner product space. For both $R$ and $C$, $\Vert x\Vert=\abs{x}$.
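As an informal check of the complex case, the following sketch evaluates $\braket{z}{w}=z\overline{w}$ on a few sample values using Python's built-in complex numbers. The function names `inner` and `norm` are hypothetical, and the sample values are chosen with small integer parts so that the floating-point arithmetic is exact.

```python
def inner(z, w):
    """Inner product on C from the note: <z, w> = z * conjugate(w)."""
    return z * w.conjugate()

def norm(z):
    """Induced norm sqrt(Re <z, z>)."""
    return inner(z, z).real ** 0.5

z, w, c = 3 + 4j, 1 - 2j, 2 + 1j

conjugate_symmetric = inner(z, w) == inner(w, z).conjugate()
linear_in_first = inner(c * z, w) == c * inner(z, w)
conjugate_linear_in_second = inner(z, c * w) == c.conjugate() * inner(z, w)
norm_is_modulus = norm(z) == abs(z)
```

The third check mirrors the deduced property $\braket{\vb u}{c\vb v}=\overline{c}\braket{\vb u}{\vb v}$: the scalar comes out of the second argument conjugated.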

Orthogonality
Let $V$ be a real/complex inner product space and $\vb u,\vb v\in V$. If $\braket{\vb u}{\vb v}=0$, we say $\vb u$ and $\vb v$ are orthogonal.

Absolute homogeneity
Let $V$ be a real/complex inner product space. For all $\vb v\in V$ and scalar $c$, we have $$\Vert c\vb v\Vert=\abs{c}\Vert\vb v\Vert$$ (show proof)

Pythagorean theorem
Let $V$ be an inner product space. For all $\vb u,\vb v\in V$, $\vb u$ and $\vb v$ are orthogonal if and only if $$\Vert\vb u+\vb v\Vert^2=\Vert\vb u\Vert^2+\Vert\vb v\Vert^2$$ Let $V$ be a complex inner product space. For all $\vb u,\vb v\in V$, if $\vb u$ and $\vb v$ are orthogonal, then $$\Vert\vb u+\vb v\Vert^2=\Vert\vb u\Vert^2+\Vert\vb v\Vert^2$$ (show proof)

Cauchy-Schwarz inequality
Let $V$ be a real/complex inner product space. For all $\vb u,\vb v\in V$, we have $$\Vert\vb u\Vert\Vert\vb v\Vert\ge\abs{\braket{\vb u}{\vb v}}$$ Also, $\Vert\vb u\Vert\Vert\vb v\Vert=\abs{\braket{\vb u}{\vb v}}$ if and only if one of $\vb u$ and $\vb v$ is a scalar multiple of the other. (show proof)

Triangle inequality
Let $V$ be a real/complex inner product space. For all $\vb u,\vb v\in V$, we have $$\Vert\vb u+\vb v\Vert\le\Vert\vb u\Vert+\Vert\vb v\Vert$$ Also, $\Vert\vb u+\vb v\Vert=\Vert\vb u\Vert+\Vert\vb v\Vert$ if and only if one of $\vb u$ and $\vb v$ is a non-negative real multiple of the other. (show proof)

Standard norm of inner product spaces
With the triangle inequality, we have shown that the function $\Vert\cdot\Vert$ is indeed a norm of $V$ (as a real/complex inner product space), which we will call the standard norm of $V$.

Note. From this point on, real/complex inner product spaces are implicitly normed vector spaces with the standard norm.

Euclidean space
For any non-zero natural number $n$, the vector space $R^n$ together with the function $\cdot:R^n\times R^n\to R$, called the dot product, defined by $$(a_1,\ldots,a_n)\cdot (b_1,\ldots,b_n)=\sum_{i=1}^na_ib_i$$ clearly forms an inner product space, where $\cdot$ is the inner product. We call this inner product space the $n$-dimensional Euclidean space, denoted $R^n$. Note that we shifted indexes by $1$ for convenience in this definition, and the same will be done implicitly when Euclidean spaces are concerned. Although not usually included in discussions about Euclidean space, $R^0$ is defined to be the inner product space consisting of the set of $0$-tuples of real numbers, which is just $\{\emptyset\}$, the vector addition operation such that $\emptyset+\emptyset=\emptyset$, the scalar multiplication operation such that $c\emptyset=\emptyset$ for all $c\in R$, and the inner product such that $\braket{\emptyset}{\emptyset}=0$.
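The dot product, the standard norm and the standard metric on $R^n$ can be sketched as follows. This is only a numerical illustration on sample vectors in $R^3$; the helper names `dot`, `norm`, `add` and `dist` are hypothetical. The vectors `u` and `v` are chosen to be orthogonal, so that the Pythagorean theorem can be observed alongside the Cauchy-Schwarz and triangle inequalities.

```python
import math

def dot(u, v):
    """Dot product on R^n: the sum of the componentwise products."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Standard norm induced by the dot product."""
    return math.sqrt(dot(v, v))

def add(u, v):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def dist(u, v):
    """Standard metric d(u, v) = ||u - v||."""
    return norm(tuple(a - b for a, b in zip(u, v)))

u = (3, 4, 0)
v = (-4, 3, 12)
w = (1, 2, 2)

orthogonal = dot(u, v) == 0
# Compare squared norms through dot() to stay in exact integer arithmetic.
pythagoras = dot(add(u, v), add(u, v)) == dot(u, u) + dot(v, v)
cauchy_schwarz = abs(dot(u, w)) <= norm(u) * norm(w)
triangle = norm(add(u, w)) <= norm(u) + norm(w)
```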

Note. A Euclidean space is an inner product space, a normed vector space, a metric space, and a topological space.

Projection function
Let $n\in N$ and $i\in\{1,\ldots,n\}$. We define $f_i:R^n\to R$ such that for all $\vb x\in R^n$, $f_i(\vb x)=x_i$. Then $f_i$ is called the projection function on the $i$-th coordinate from $R^n$.

Lemma. In a topological space, if $E$ is an infinite subset of a compact set $K$, then $E$ has a limit point in $K$. (show proof)

Lemma. In a metric space, if $p$ is a limit point of a set $S$, then every open ball $B_r(p)$ contains infinitely many points of $S$. (show proof)

Proposition. A subset $S$ of $R^n$ is compact if and only if it is closed and bounded. (show proof)

Proof. Suppose $S$ is closed and bounded. Then there exists $r\gt0$ such that $S$ is a subset of the $n$-cell $I^*$ consisting of all points $(x_1,\ldots,x_n)$ where $-r\le x_i\le r$ for $i$ from $1$ to $n$. Note that there exists $\delta\gt0$ such that for all $\vb a,\vb b\in I^*$, $d(\vb a,\vb b)\lt\delta$. Suppose for contradiction that $\mathcal C$ is an open cover of $I^*$ with no finite subcover. Then for every sub-$n$-cell $I$ of $I^*$ such that $\mathcal C$ has no finite subcover of $I$, if we equally divide $I$ into $2^n$ sub-$n$-cells by dividing $I$ into halves in each dimension, then $\mathcal C$ has no finite subcover of at least one of the sub-$n$-cells, because otherwise the union of the finite subcovers of the sub-$n$-cells would be a finite subcover of $I$. Therefore, by the axiom of choice, there exists a function $f$ that maps every "sub-$n$-cell $I$ of $I^*$ such that $\mathcal C$ has no finite subcover of $I$" to a "sub-$n$-cell $I'$ of $I$ obtained by dividing $I$ into halves in each dimension such that $\mathcal C$ has no finite subcover of $I'$". By the recursion theorem, we can define a sequence of $n$-cells $(I_m)$ with $I_0=I^*$ and $I_{m+1}=f(I_m)$. By induction, for any natural number $m$,

for all $k\in N$ such that $k\lt m$, $I_m\subset I_k$;
$I_m$ is not covered by any finite subset of $\mathcal C$;
if $\vb a,\vb b\in I_m$, then $d(\vb a,\vb b)\lt2^{-m}\delta$.

For each $n$-cell $I_m$, in each dimension $i$, $I_m$ has boundaries $a_{m,i}\le x_i\le b_{m,i}$. Note that for all $k\in N$ such that $k\lt m$, $a_{k,i}\le a_{m,i}\lt b_{m,i}\le b_{k,i}$. Let $x^*_i=\sup\{a_{m,i}:m\in N\}$, then clearly, for all $m\in N$, $a_{m,i}\le x^*_i\le b_{m,i}$. Now let $\vb x^*=(x^*_1,\ldots,x^*_n)$, then $\vb x^*\in I_m$ for all $m\in N$. Since $\mathcal C$ covers $I^*$, there exists a member $C$ of $\mathcal C$ such that $\vb x^*\in C$. Since $C$ is open, there exists $\varepsilon\gt 0$ such that $B_\varepsilon(\vb x^*)\subseteq C$. But then there exists some $k\in N$ such that $2^{-k}\delta\lt\varepsilon$. Since $\vb x^*\in I_k$ and any two points of $I_k$ are less than $2^{-k}\delta$ apart, we have $I_k\subseteq B_\varepsilon(\vb x^*)\subseteq C$, so $\{C\}$ is a finite subcover of $I_k$, a contradiction. Hence every open cover of $I^*$ has a finite subcover, that is, $I^*$ is compact. Since $S$ is a closed subset of the compact set $I^*$, $S$ is compact.

Now suppose $S$ is compact. Then every infinite subset of $S$ has a limit point in $S$. Suppose for contradiction that $S$ is not bounded. Then for every $k\in N^+$, there exists $\vb p_k\in S$ such that $\Vert\vb p_k\Vert\gt k$. Let $E=\{\vb p_k\in R^n|k\in N^+\}$, then $E$ is infinite, and for every $\vb p\in R^n$ and every $r\gt0$, $B_r(\vb p)$ contains at most finitely many elements of $E$, hence $\vb p$ is not a limit point of $E$, implying $E$ is an infinite subset of $S$ without a limit point in $R^n$ and hence without a limit point in $S$, a contradiction. Therefore, $S$ is bounded.

Suppose for contradiction that $S$ is not closed. Then there exists a limit point $\vb p$ of $S$ such that $\vb p\notin S$. For each $k\in N^+$, there exists $\vb p_k\in S$ such that $d(\vb p_k,\vb p)\lt1/k$. Let $E=\{\vb p_k\in R^n|k\in N^+\}$, then $E$ is infinite, and $\vb p$ is a limit point of $E$. If $\vb r\in R^n$ such that $\vb r\neq\vb p$, then there exists $m\in N^+$ such that $1/m\lt d(\vb p,\vb r)/2$, so for all $k\in N$ such that $k\gt m$, $d(\vb p_k,\vb r)\ge d(\vb p,\vb r)-d(\vb p,\vb p_k)\gt d(\vb p,\vb r)-1/k\gt d(\vb p,\vb r)-1/m\gt d(\vb p,\vb r)/2$. So, letting $r=d(\vb p,\vb r)/2$, there are at most finitely many elements of $E$ in $B_r(\vb r)$, implying $\vb r$ is not a limit point of $E$. Hence the only limit point of $E$ is $\vb p\notin S$, so $E$ is an infinite subset of $S$ without a limit point in $S$, a contradiction. Therefore, $S$ is closed. $\blacksquare$
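The halving construction used in the first part of the proof can be sketched in code. In the proof, the choice function $f$ selects a sub-$n$-cell with no finite subcover, which is not something a program can decide. The sketch below therefore substitutes a different, purely illustrative rule: it picks a sub-cell containing a fixed target point. The names `subdivide`, `diameter`, `contains` and `nested_cells` are hypothetical. What the sketch does show is the geometric part of the argument: each step yields $2^n$ sub-cells, the cells are nested, and the diameter halves at every step.

```python
import math
from itertools import product

def subdivide(cell):
    """Split an n-cell, given as a list of (low, high) pairs, into 2^n halves."""
    halves = []
    for low, high in cell:
        mid = (low + high) / 2
        halves.append([(low, mid), (mid, high)])
    return [list(choice) for choice in product(*halves)]

def diameter(cell):
    """Length of the diagonal, an upper bound on distances within the cell."""
    return math.sqrt(sum((high - low) ** 2 for low, high in cell))

def contains(cell, point):
    """True if the point lies in the closed cell."""
    return all(low <= x <= high for (low, high), x in zip(cell, point))

def nested_cells(cell, point, steps):
    """Nested sequence I_0, I_1, ..., each a half-size sub-cell containing point.

    Choosing by a target point is an assumption of this sketch; the proof
    chooses a sub-cell that has no finite subcover instead.
    """
    cells = [cell]
    for _ in range(steps):
        cell = next(c for c in subdivide(cell) if contains(c, point))
        cells.append(cell)
    return cells

start = [(-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0)]
target = (0.3, -0.7, 0.1)
cells = nested_cells(start, target, 5)
```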

Order theory (show)

Class
Class is a meta-logical concept. A class defined by a formula $\varphi$ and a variable symbol $x$, denoted $\{x:\varphi(x)\}$, is a meta-logical function that maps each pair of structure and assignment (of free variables in $\varphi$ other than $x$) to the collection of objects in the universe of the structure such that, when assigned to $x$ and hence completing the assignment function for $\varphi$, the validity function defined by the structure and $\varphi$ maps the assignment function to $\top$. In other words, the class $\{x:\varphi(x)\}$ represents the collection of sets $x$ that satisfy $\varphi(x)$. In $ZFW$, certain notations involving classes can be translated into formulas. Most importantly,

$y\in\{x:\varphi(x)\}$ can be translated into $\varphi(y)$;
$y\subseteq\{x:\varphi(x)\}$ can be translated into $\forall u(u\in y\to\varphi(u))$;
$\{x:\varphi(x)\}\subseteq\{x:\psi(x)\}$ can be translated into $\forall u(\varphi(u)\to\psi(u))$;
$\{x:\varphi(x)\}=\{x:\psi(x)\}$ can be translated into $\forall u(\varphi(u)\leftrightarrow\psi(u))$.

Note. The class $\{x:x=x\}$ represents the universe for models, but not for structures in general.

Set-like class
A class $X=\{x:\varphi(x)\}$ is called set-like if there exists a set $Y$ such that for all $u$, $\varphi(u)$ if and only if $u\in Y$. Note that existence implies unique existence, and we denote $X$ being set-like to $Y$ by $X=Y$ or $Y=X$. A class that is not set-like is called a proper class.

Subclass
Given a class $X=\{x:\varphi(x)\}$, $\{u\in X|\theta(u)\}$ is the class $\{t:\psi(t)\}$ where $\psi(t)$ is the formula $\varphi(t)\land\theta(t)$.

Class union
Given a class $X=\{x:\varphi(x)\}$, $\cup X$ is the class $\{t:\psi(t)\}$ where $\psi(t)$ is the formula $\exists u(\varphi(u)\land t\in u)$.

Class intersection
Given a class $X=\{x:\varphi(x)\}$, $\cap X$ is the class $\{t:\psi(t)\}$ where $\psi(t)$ is the formula $\forall u(\varphi(u)\to t\in u)$.
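The translation of class notation into formulas can be mimicked informally by treating a class as a predicate. The sketch below assumes a small finite universe of hereditarily finite sets, represented by `frozenset` values; this universe, and the names `member`, `subclass`, `class_union`, `class_intersection` and `extension`, are hypothetical and serve only the illustration. Quantifiers range over the listed universe, which is a simplification.

```python
# Universe of the sketch: five hereditarily finite sets.
e = frozenset()
one = frozenset({e})
two = frozenset({e, one})
UNIVERSE = [e, one, two, frozenset({one}), frozenset({two})]

def member(y, phi):
    """y in {x : phi(x)} translates into phi(y)."""
    return phi(y)

def subclass(phi, theta):
    """{u in X | theta(u)} is defined by the formula phi(t) and theta(t)."""
    return lambda t: phi(t) and theta(t)

def class_union(phi):
    """Union of X is defined by: there exists u with phi(u) and t in u."""
    return lambda t: any(phi(u) and t in u for u in UNIVERSE)

def class_intersection(phi):
    """Intersection of X is defined by: for all u, phi(u) implies t in u."""
    return lambda t: all((not phi(u)) or t in u for u in UNIVERSE)

def extension(phi):
    """Members of the universe that satisfy phi."""
    return {x for x in UNIVERSE if phi(x)}

# X = {x : one in x}; within this universe its members are two and {one}.
X = lambda x: one in x
```

With this data, `extension(class_union(X))` consists of `e` and `one`, while `extension(class_intersection(X))` consists of `one` alone. The intersection of the empty class comes out as the whole universe, because the defining implication is vacuously true.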

Class relation
Given classes $A$ and $B$, a class relation from $A$ to $B$ is a class $R$ such that for all $u\in R$, there exist $a\in A$ and $b\in B$ such that $u=(a,b)$.

Class function
Given classes $X$ and $Y$, a class function from $X$ to $Y$ is a class relation $f$ from $X$ to $Y$ such that for all $x\in X$, there exists a unique $y$ such that $(x,y)\in f$. We denote the unique $y$ such that $(x,y)\in f$ as $f(x)$.

Proposition. Given class functions $f:X\to Y$ and $g:X\to Y$, $f=g$ if and only if for all $x\in X$, $f(x)=g(x)$. (show proof)

Defining class functions with formulas
Given classes $X$ and $Y$ and a formula $\varphi(x,y)$ such that for all $x\in X$ there exists a unique $y\in Y$ that satisfies $\varphi(x,y)$, $$\{u:\exists x\exists y(x\in X\land y\in Y\land u=(x,y)\land\varphi(x,y))\}$$ defines a class function $f$ from $X$ to $Y$ (show proof). We say $f$ is defined by $\varphi(x,y)$ with $x$ as input and $y$ as output.

Proposition. Given classes $X$ and $Y$, suppose $y(x)$ is some term such that for all $x\in X$, $y(x)\in Y$, then we can define a class function $f:X\to Y$ such that for all $x\in X$, $f(x)=y(x)$. (show proof)

Note. Given a class $X$ and a term $y(x)$, let $V$ denote the universe, we can define a class function $f:X\to V$ such that for all $x\in X$, $f(x)=y(x)$.

Note. Given meta-logical $n$, pairwise disjoint classes $X_1,\ldots,X_n$, and terms $y_1(x),\ldots,y_n(x)$, we can construct a term $y(x)$, using class functions, such that for each meta-logical $i\in\{1,\ldots,n\}$, we have $(x\in X_i)\to(y(x)=y_i(x))$. This is useful in constructing definitions with conditions.

Restriction of class functions on sets
Suppose $F$ is a class function from $A$ to $B$ and $X$ is a set such that $X\subseteq A$, then there exists a unique function $f$ from $X$ such that for all $x\in X$, $f(x)=F(x)$ (show proof), which we will denote as $F\upharpoonright X$.

Domain and Image
Given a class $f$, the domain of $f$, denoted $\text{dom}f$, is the class $\{x:\exists y((x,y)\in f)\}$, and the image, or range, of $f$, denoted $\text{im}f$, is the class $\{y:\exists x((x,y)\in f)\}$. Whether $f$ is indeed a class function does not matter.

Linear order
Also called strict total order. A linear order is a binary relation $\lt$ on a set $S$ such that:

for all $x\in S$, $x\nless x$;
for all $x,y\in S$, if $x\neq y$ then $x\lt y$ or $y\lt x$;
for all $x,y,z\in S$, if $x\lt y$ and $y\lt z$ then $x\lt z$.

Clearly, given $x,y\in S$, we have exactly one among $x=y$, $x\lt y$ and $y\lt x$. Also, by the axiom of foundation, a set $S$ is linearly ordered by $\in$ if and only if given $x,y\in S$, we have exactly one among $x=y$, $x\in y$ and $y\in x$.
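For a finite set, the three defining conditions of a linear order can be checked by brute force. The following is a minimal sketch; the function names `is_linear_order` and `trichotomy` and the sample relations are hypothetical.

```python
from itertools import product

def is_linear_order(S, less):
    """Check irreflexivity, totality and transitivity of `less` on the finite set S."""
    S = list(S)
    irreflexive = all(not less(x, x) for x in S)
    total = all(less(x, y) or less(y, x)
                for x, y in product(S, repeat=2) if x != y)
    transitive = all(less(x, z)
                     for x, y, z in product(S, repeat=3)
                     if less(x, y) and less(y, z))
    return irreflexive and total and transitive

def trichotomy(S, less):
    """Exactly one among x = y, x < y, y < x holds for every pair."""
    S = list(S)
    return all((x == y) + less(x, y) + less(y, x) == 1
               for x, y in product(S, repeat=2))

usual = lambda x, y: x < y                              # a linear order
proper_divisor = lambda x, y: x != y and y % x == 0     # not total on {1, 2, 3}
non_strict = lambda x, y: x <= y                        # not irreflexive
```

On `{1, 2, 3}` the proper-divisor relation fails totality, since neither of 2 and 3 divides the other.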

Well-founded relation
A binary relation $\lt$ on $S$ is called well-founded if every non-empty subset $E$ of $S$ has a strictly least element, that is, an element $x^*\in E$ such that for all $x\in E$, $x\nless x^*$.

Strict well-order
A strict well-order is a well-founded linear order.

Note. Clearly, given a well-order $\le$, "$\le$ and $\neq$" is a strict well-order; given a strict well-order $\lt$, "$\lt$ or $=$" is a well-order. Also, given a well-ordered/strictly well-ordered set, any restriction of the order on a subset well-orders/strictly well-orders it.

Proposition. Every set can be strictly well-ordered. (show proof)

Transitivity
A set $x$ is said to be transitive if every element of $x$ is a subset of $x$.

Ordinal
A set $\alpha$ is said to be an ordinal if it is transitive and linearly ordered by $\in$.

Notation. Given an ordinal $\alpha$, we denote $\alpha\cup\{\alpha\}$ by $\alpha^+$, which is also an ordinal.

Proposition. Every member of an ordinal is also an ordinal. (show proof)

Proposition. An ordinal is strictly well-ordered by $\in$. (show proof)

Proposition. An ordinal is well-ordered by $\subseteq$. (show proof)

Note. Given $\alpha,\beta$ in an ordinal $X$, we have $\beta\subseteq\alpha$ if and only if $\beta\in\alpha$ or $\beta=\alpha$, which in turn holds if and only if $\beta\in\alpha^+$. Hence the relation defined by $\beta\in\alpha^+$ well-orders $X$.
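The definitions of transitivity, ordinal and $\alpha^+$ can be tried out on finite sets. The sketch below represents sets by nested `frozenset` values, builds the first few ordinals starting from $\emptyset$, and checks some of the statements above on them. The function names are hypothetical, and only finite examples are covered.

```python
def successor(alpha):
    """alpha^+ = alpha union {alpha}."""
    return alpha | frozenset({alpha})

def is_transitive(x):
    """Every element of x is a subset of x."""
    return all(y <= x for y in x)

def is_linearly_ordered_by_membership(x):
    """Exactly one of a == b, a in b, b in a holds for all a, b in x."""
    return all((a == b) + (a in b) + (b in a) == 1 for a in x for b in x)

def is_ordinal(x):
    """Transitive and linearly ordered by membership."""
    return is_transitive(x) and is_linearly_ordered_by_membership(x)

# The first five ordinals, obtained by repeatedly taking successors.
zero = frozenset()
ordinals = [zero]
for _ in range(4):
    ordinals.append(successor(ordinals[-1]))

# {{emptyset}} is not transitive, hence not an ordinal.
not_an_ordinal = frozenset({ordinals[1]})
```

The trichotomy form of the linear-order check relies on the observation above that, under the axiom of foundation, being linearly ordered by $\in$ is equivalent to trichotomy; `frozenset` values cannot contain themselves, so the sketch is consistent with that assumption.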

Order embedding
Given relations $\sim_A$ on $A$ and $\sim_B$ on $B$, an order embedding $f$ from $A$ to $B$ is an injection such that $x\sim_A y$ implies $f(x)\sim_B f(y)$.

Order isomorphism
Given relations $\sim_A$ on $A$ and $\sim_B$ on $B$, an order isomorphism $f$ from $A$ to $B$ is a bijection such that $x\sim_A y$ implies $f(x)\sim_B f(y)$ and $x\sim_B y$ implies $f^{-1}(x)\sim_A f^{-1}(y)$.

Order on ordinals
Given ordinals $\alpha$ and $\beta$, we use $\alpha\cong\beta$ to denote that there is an order isomorphism between $\alpha$ and $\beta$, and $\alpha\lt\beta$ to denote that there is an order isomorphism between $\alpha$ and some $\gamma\in\beta$. Also, $\alpha\le\beta$ denotes $\alpha\lt\beta$ or $\alpha\cong\beta$. Note that $\cong,\le,\lt$ are class relations defined on the class of ordinals, and $\cong$ is clearly an equivalence relation.

Lemma. Let $\alpha$ be an ordinal, let $\beta\subseteq\alpha$, and let $F$ be an order isomorphism from $\alpha$ to $\beta$, then for all $x\in\alpha$, $x\in F(x)^+$. (show proof)

Lemma. Let $\alpha$ be an ordinal and $\beta\in\alpha$, then $\alpha\ncong\beta$. (show proof)

Lemma. Let $\alpha,\beta$ be ordinals, if $\beta\subset\alpha$, then $\beta\in\alpha$. (show proof)

Lemma. Let $\alpha,\beta$ be ordinals, if $\alpha\cong\beta$, then for all $\gamma\in\alpha$ there exists a unique $\delta\in\beta$ such that $\gamma\cong\delta$. (show proof)

Proposition. Let $\alpha,\beta$ be ordinals, then exactly one among $\alpha\lt\beta$, $\alpha\cong\beta$, $\beta\lt\alpha$ holds. (show proof)

Proof. Define $F=\{(a,b)\in\alpha^+\times\beta^+|a\cong b\}$. Suppose for contradiction that $\text{dom}F\neq\alpha^+$ and $\text{im}F\neq\beta^+$. Then there exist a strictly least element $a^*$ of $\alpha^+\setminus\text{dom}F$ and a strictly least element $b^*$ of $\beta^+\setminus\text{im}F$. Note that for all $a\in\alpha^+$ such that $a^*\in a$, $a\notin\text{dom}F$, because otherwise $a^*\in\text{dom}F$, a contradiction. Similarly, for all $b\in\beta^+$ such that $b^*\in b$, $b\notin\text{im}F$. Hence $a^*=\text{dom}F$ and $b^*=\text{im}F$.

Suppose $(a,b)\in F$ and $(a,c)\in F$, then $b\cong c$, hence $b=c$, since both $b\in c$ and $c\in b$ are contradictory. Hence $F$ is a function from $a^*$ to $b^*$. Similarly, if $(a,c)\in F$ and $(b,c)\in F$, then $a=b$, implying $F$ is injective. Since $F$ is clearly surjective from $a^*$ to $b^*$, $F:a^*\to b^*$ is a bijection. Suppose $a,b\in a^*$ with $a\in b$, then clearly $F(a)\neq F(b)$. Suppose for contradiction that $F(b)\in F(a)$, then there exists $c\in a$ such that $b\cong F(b)\cong c$, but we have $c\in b$, a contradiction. Hence $F(a)\in F(b)$. Thus $F$ preserves order, and so does $F^{-1}$, implying $F$ is an order isomorphism between $a^*$ and $b^*$. But then we have $a^*\cong b^*$, a contradiction.

Therefore, we have exactly one among:

$\text{dom}F=\alpha^+$ and $\text{im}F\neq\beta^+$, in which case $\alpha\cong\gamma$ for some $\gamma\in\beta^+$ but not $\alpha\cong\beta$, hence $\alpha\lt\beta$ and not $\alpha\cong\beta$ or $\beta\lt\alpha$;
$\text{dom}F=\alpha^+$ and $\text{im}F=\beta^+$, in which case $\alpha\cong\beta$ and not $\alpha\lt\beta$ or $\beta\lt\alpha$;
$\text{dom}F\neq\alpha^+$ and $\text{im}F=\beta^+$, in which case $\beta\cong\gamma$ for some $\gamma\in\alpha^+$ but not $\beta\cong\alpha$, hence $\beta\lt\alpha$ and not $\alpha\cong\beta$ or $\alpha\lt\beta$.

$\blacksquare$

Transfinite induction on ordinals
Suppose $X$ is an ordinal and $\varphi(x)$ is a formula. If $\forall(\alpha\in X),(\forall(\beta\in\alpha),\varphi(\beta))\to\varphi(\alpha)$ then we have $\forall(\alpha\in X),\varphi(\alpha)$. (show proof)

Transfinite recursion on ordinals
Suppose $X$ is an ordinal, and $G$ is a class function defined on the universe $V$. Then there exists a unique function $F$ from $X$ such that for all $\alpha\in X$, $F(\alpha)=G(F|_\alpha)$. (show proof)

Proof. Let $\varphi(f,\alpha)$ denote the property "$f$ is a function from $\alpha^+$ such that $f(\beta)=G(f|_\beta)$ for all $\beta\in\alpha^+$". Define $A=\{\alpha\in X|\forall f(\neg\varphi(f,\alpha))\}$. Suppose for contradiction that $A\neq\emptyset$. Then $A$ has a strictly least element $\kappa$. Note that for all $\alpha\in\kappa$, there exists $f$ with $\varphi(f,\alpha)$. Thus we can define the class $H=\cup\{f:\exists(\alpha\in\kappa),\varphi(f,\alpha)\}$. Note that each set in $H$ is an ordered pair of the form $(\beta,f(\beta))$ where we have $\varphi(f,\alpha)$ for some $\alpha\in\kappa$ and $\beta\in\alpha^+$. We use $H_S$ to denote the subclass of $H$ consisting of ordered pairs $(a,b)$ with $a\in S$.

Let $\alpha\in\kappa$ and suppose for inductive hypothesis that for all $\beta\in\alpha$, $H_{\{\beta\}}=\{(\beta,G(f|_\beta))\}$ for some $f$ with $\varphi(f,\beta)$. Clearly, there exists $f$ with $\varphi(f,\alpha)$, so $(\alpha,f(\alpha))\in H_{\{\alpha\}}$. Since $\varphi(f,\alpha)$, we have $f(\alpha)=G(f|_\alpha)$, hence $(\alpha,G(f|_\alpha))\in H_{\{\alpha\}}$. Suppose $a^*,b^*\in H_{\{\alpha\}}$, then $a^*=(\alpha,f_1(\alpha)),b^*=(\alpha,f_2(\alpha))$ where $\varphi(f_1,\alpha_1),\varphi(f_2,\alpha_2)$ for some $\alpha_1,\alpha_2\in\kappa$ and $\alpha$ in both $\alpha_1^+$ and $\alpha_2^+$. Since for all $\beta\in\alpha$, $(\beta,f_1(\beta)),(\beta,f_2(\beta))\in H_{\{\beta\}}$, we have $f_1(\beta)=G(f|_\beta)=f_2(\beta)$ for some $f$ with $\varphi(f,\beta)$. Thus $f_1|_\alpha=f_2|_\alpha$, implying $f_1(\alpha)=G(f_1|_\alpha)=G(f_2|_\alpha)=f_2(\alpha)$, and hence $a^*=b^*$. Therefore, $H_{\{\alpha\}}=\{(\alpha,G(f|_\alpha))\}$ for some $f$ with $\varphi(f,\alpha)$. By transfinite induction, for all $\alpha\in\kappa$, $H_{\{\alpha\}}=\{(\alpha,G(f|_\alpha))\}$ for some $f$ with $\varphi(f,\alpha)$.

Now let $\alpha\in\kappa$, then there exists $f$ with $\varphi(f,\alpha)$, so $(\alpha,f(\alpha))\in H$. If $(\alpha,a^*),(\alpha,b^*)\in H$, then $(\alpha,a^*),(\alpha,b^*)\in H_{\{\alpha\}}$, so $a^*=b^*$. Hence for all $\alpha\in\kappa$, there exists a unique $\beta$ such that $(\alpha,\beta)\in H$. By the axiom schema of replacement, there exists a set $Y$ such that for all $\alpha\in\kappa$, there exists a unique $\beta\in Y$ with $(\alpha,\beta)\in H$, which defines a function $h:\kappa\to Y$. Let $\alpha\in\kappa$, then for all $f$ with $\varphi(f,\alpha)$, for all $\beta\in\alpha^+$, $(\beta,f(\beta))\in H$, so $h(\beta)=f(\beta)$, implying $h|_\alpha=f|_\alpha$. Note that $h(\alpha)=G(f|_\alpha)$ for some $f$ with $\varphi(f,\alpha)$, hence $h(\alpha)=G(f|_\alpha)=G(h|_\alpha)$.

Now let $g=h\cup\{(\kappa,G(h))\}$, then $g$ is a function from $\kappa^+$, such that for all $\alpha\in\kappa^+$, if $\alpha\in\kappa$, then $g(\alpha)=h(\alpha)=G(h|_\alpha)=G(g|_\alpha)$; if $\alpha=\kappa$, then $g(\alpha)=g(\kappa)=G(h)=G(g|_\kappa)=G(g|_\alpha)$. Therefore, we have $\varphi(g,\kappa)$, a contradiction. Hence we have $A=\emptyset$, implying for all $\alpha\in X$, there exists $f$ such that $\varphi(f,\alpha)$. Thus by the same logic as above, we can define a function $F$ from $X$ (resembling the function $h$ from $\kappa$ above), such that for all $\alpha\in X$, $F(\alpha)=G(F|_\alpha)$.

Now suppose we have functions $F_1,F_2$ from $X$ such that for all $\alpha\in X$, $F_1(\alpha)=G(F_1|_\alpha)$ and $F_2(\alpha)=G(F_2|_\alpha)$. Let $\alpha\in X$ and take as inductive hypothesis that for all $\beta\in\alpha$, $F_1(\beta)=F_2(\beta)$, then we have $F_1(\alpha)=G(F_1|_\alpha)=G(F_2|_\alpha)=F_2(\alpha)$, hence by transfinite induction, $\forall(\alpha\in X),F_1(\alpha)=F_2(\alpha)$, implying $F_1=F_2$. We have shown that there exists a unique function $F$ from $X$ such that for all $\alpha\in X$, $F(\alpha)=G(F|_\alpha)$. $\blacksquare$

Note. Transfinite recursion on ordinals is a schema of theorems.

Note. Suppose we have a set $a$ and a class function $f$ on the universe $V$. Then we can define a class function $G$ on $V$ by

$G(\emptyset)=a$;
else if $g$ is a function from $S(n)$ for some $n\in N$, then $G(g)=f(g(n))$;
otherwise, $G(X)=\emptyset$.

Then there exists a unique function $u$ from $N$ such that for all $n\in N$, $u(n)=G(u|_n)$. Thus $u(0)=G(\emptyset)=a$ and for all $n\in N$, $u(S(n))=G(u|_{S(n)})=f(u|_{S(n)}(n))=f(u(n))$. Therefore, a recursive definition of $u$ of the form $u(0)=a$ and $u(S(n))=f(u(n))$ is valid even when $f$ is a class function defined on the universe.
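The way $G$ drives the recursion can be sketched for finite ordinals. In the sketch below, natural numbers are represented by Python integers and a function from $k$ by a dictionary with keys $0,\ldots,k-1$; both representations, and the names `recurse`, `make_G` and `fib_G`, are assumptions of this illustration. The helper `recurse` builds $F$ so that $F(k)=G(F|_k)$, and `make_G` packages a starting value $a$ and a step function $f$ into a $G$ following the case distinction in the note.

```python
def recurse(G, n):
    """Build F on {0, ..., n-1} with F(k) = G(F restricted to k)."""
    F = {}
    for k in range(n):
        # Before this assignment F has domain {0, ..., k-1}, so it equals F|_k.
        F[k] = G(dict(F))
    return F

def make_G(a, f):
    """The G from the note, restricted to functions from a natural number."""
    def G(g):
        if not g:          # G(emptyset) = a
            return a
        n = max(g)         # g is a function from S(n) = {0, ..., n}
        return f(g[n])     # G(g) = f(g(n))
    return G

# u(0) = 1 and u(S(n)) = 2 * u(n)
u = recurse(make_G(1, lambda x: 2 * x), 6)

def fib_G(g):
    """A G that reads two earlier values, not only the last one."""
    k = len(g)
    return 1 if k < 2 else g[k - 1] + g[k - 2]

fib = recurse(fib_G, 7)
```

The second example indicates why $G$ receives the whole restriction $F|_\alpha$ rather than a single previous value: a definition may depend on any of the earlier values.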

Definition. Let $X$ be an ordinal. Let $\varphi(f,U)$ denote the formula "$U$ is the range of $f$". Let $C_X$ denote the class of functions from some $\alpha\in X^+$. For all $f\in C_X$, by axiom schema of replacement, there exists a set that is the range of $f$, thus there exists a unique $U$ such that $\varphi(f,U)$. Hence $\varphi(f,U)$ defines a class function from $C_X$, which can be extended into a class function $G_X$ from the universe, mapping any set not in $C_X$ to $\emptyset$. By transfinite recursion, there exists a unique function $F_X$ from $X^+$ such that for all $\alpha\in X^+$, $F_X(\alpha)=G_X(F_X|_\alpha)$. Now we can define a class function $O$ from the class of ordinals such that $O(\alpha)=F_\alpha(\alpha)$. The class representing the range of $O$ is denoted $\text{Ord}$.

Lemma. Let $\alpha,\beta$ be ordinals such that $\beta\in\alpha^+$, then $F_\alpha|_{\beta^+}=F_\beta$. (show proof)

Lemma. Let $\alpha$ be an ordinal, then $O(\alpha)=\{O(\beta):\beta\in\alpha\}$. (show proof)

Lemma. Let $\alpha$ be an ordinal, then $O(\alpha)$ is an ordinal and $\alpha\cong O(\alpha)$. (show proof)

Note. The above lemma shows that every set in $\text{Ord}$ is an ordinal.

Lemma. Let $\alpha,\beta$ be ordinals, then $O(\alpha)=O(\beta)$ if and only if $\alpha\cong\beta$. (show proof)

Lemma. For all $\alpha\in\text{Ord}$, $O(\alpha)=\alpha$. (show proof)

Lemma. Let $\alpha$ be an ordinal such that for all $\beta\in\alpha$, $\beta\in\text{Ord}$, then $\alpha\in\text{Ord}$. (show proof)

Proposition. $\text{Ord}$ is the class of ordinals. (show proof)

Note. By the above proposition:

for all $\alpha\in\text{Ord}, \alpha^+\in\text{Ord}$;
for all $\alpha\in\text{Ord}, \alpha\subset\text{Ord}$.

Proposition. Suppose $\alpha,\beta\in\text{Ord}$, then

$\alpha\cong\beta$ if and only if $\alpha=\beta$;
$\alpha\in\beta$ if and only if $\alpha\lt\beta$.

(show proof)

Note. By the above proposition, given $\alpha,\beta\in\text{Ord}$, we have exactly one among $\alpha\in\beta$, $\alpha=\beta$ and $\beta\in\alpha$.

Proposition. Suppose $\alpha,\beta\in\text{Ord}$, then $\alpha\subseteq\beta$ if and only if $\alpha\le\beta$. (show proof)

Proposition. $\text{Ord}$ is strictly well-ordered by $\in$. Formally, $\text{Ord}$ is linearly ordered by $\in$, and given a class $X=\{x:\varphi(x)\}$, if there exists $x\in\text{Ord}\cap X$, then there exists $x^*\in\text{Ord}\cap X$ such that for all $x\in\text{Ord}\cap X$, $x\notin x^*$. (show proof)

Proposition. $\text{Ord}$ is a proper class. (show proof)

Proposition. Suppose $X$ is a set such that $X\subset\text{Ord}$, then $\cup X$ is the unique least upper bound of $X$ in $\text{Ord}$. (show proof)

Von Neumann hierarchy
Let $X\in\text{Ord}$. Let $\varphi(f,U)$ denote the formula $\forall u(u\in U\leftrightarrow\exists x\exists y(((x,y)\in f)\land(u\in\mathcal P(y))))$. Let $C_X$ denote the class of functions from some $\alpha\in X^+$. If $f\in C_X$ with domain $\alpha$, then by axiom schema of replacement, there exists a codomain $Y$ of $f$. Hence we can define the set $U=\bigcup\{\mathcal P(f(\beta)):\beta\in\alpha\}$, and we have $\varphi(f,U)$. Clearly, if $\varphi(f,U_1)$ and $\varphi(f,U_2)$, then $U_1=U_2$. Thus for all $f\in C_X$ there exists a unique $U$ such that $\varphi(f,U)$, then $\varphi(f,U)$ defines a class function from $C_X$, which can be extended into a class function $G_X$ from the universe, mapping any set not in $C_X$ to $\emptyset$. By transfinite recursion, there exists a unique function $F_X$ from $X^+$ such that for all $\alpha\in X^+$, $F_X(\alpha)=G_X(F_X|_\alpha)$. Now we can define a class function $V$ from $\text{Ord}$ such that $V_\alpha=F_\alpha(\alpha)$, called the Von Neumann hierarchy. And the class $$\bigcup\{V_\alpha:\alpha\in\text{Ord}\}$$ is called the Von Neumann universe, denoted $V$.

Proposition. For all $\alpha\in\text{Ord}$, $$V_\alpha=\bigcup\{\mathcal P(V_\beta):\beta\in\alpha\}$$ (show proof)

Proposition. For all $\alpha\in\text{Ord}$, $$\bigcup\{V_\beta:\beta\in\alpha\}\subseteq V_\alpha$$ (show proof)

Note. The above proposition directly implies that given $\alpha,\beta\in\text{Ord}$ with $\beta\in\alpha$, $V_\beta\subseteq V_\alpha$.

Proposition. For all $\alpha\in\text{Ord}$, for all $X\in V_\alpha$, $X\subseteq V_\alpha$. (show proof)

Proposition. For all $\alpha\in\text{Ord}$, for all $X\subseteq V_\alpha$, $X\in V_{\alpha^+}$. (show proof)
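The finite stages of the hierarchy can be computed directly from the formula $V_\alpha=\bigcup\{\mathcal P(V_\beta):\beta\in\alpha\}$. The sketch below represents sets by nested `frozenset` values and natural numbers by Python integers; the names `powerset` and `stage` are hypothetical. It only goes up to $V_4$, because the sizes grow very quickly.

```python
from functools import lru_cache
from itertools import combinations

def powerset(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

@lru_cache(maxsize=None)
def stage(n):
    """V_n as the union of P(V_k) over all k < n, for a natural number n."""
    result = frozenset()
    for k in range(n):
        result |= powerset(stage(k))
    return result

sizes = [len(stage(n)) for n in range(5)]
```

With this indexing $V_0=\emptyset$ and $V_1=\{\emptyset\}$, and the sizes of $V_0,\ldots,V_4$ are $0,1,2,4,16$. The propositions above can be observed on these stages: each stage contains the earlier ones, each element of a stage is a subset of it, and each subset of $V_3$ is an element of $V_4$.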

Proposition. The Von Neumann universe is the universe. (show proof)

Proof. Let $X$ be a set. If for all $x\in X$, $x\in V$, then for all $x\in X$ there exists $\alpha\in\text{Ord}$ such that $x\in V_\alpha$. Thus the class $\text{Ord}\cap\{\alpha:x\in V_\alpha\}$ has a strictly least element $\kappa_x$. Hence $x\mapsto\kappa_x$ defines a function from $X$, which has a set-sized codomain and hence a set-sized range $Y$. Note that $Y\subset\text{Ord}$. So there exists an upper bound $\gamma$ of $Y$ in $\text{Ord}$. Then we have $Y\subseteq\gamma^+$, where $\gamma^+\in\text{Ord}$. Thus $\bigcup\{V_\beta:\beta\in Y\}\subseteq V_{\gamma^+}$. Since for all $x\in X$, $\kappa_x\in Y$ and $x\in V_{\kappa_x}$, we have $x\in\bigcup\{V_\beta:\beta\in Y\}$, and hence $x\in V_{\gamma^+}$. Therefore, $X\in\mathcal P(V_{\gamma^+})$, implying $X\in\bigcup\{\mathcal P(V_\beta):\beta\in\gamma^{++}\}=V_{\gamma^{++}}$. And we conclude that $X\in V$.

We have shown that if $X\notin V$, then there exists $x\in X$ such that $x\notin V$. Thus $\{x\in X|x\notin V\}\neq\emptyset$. Note that there exists a choice function $\mathcal C_X$ of $\mathcal P(X)\setminus\{\emptyset\}$. Hence $\mathcal C_X(\{x\in X|x\notin V\})\in\{x\in X|x\notin V\}$. Suppose for contradiction that there exists a set $A$ such that $A\notin V$. Then we can define a class function $G:V\to V$ such that:

If $f$ is a function from some $k\in N$:
- If $k=0$, $G(f)=A$.
- If $k\neq0$ and $f(k-1)\notin V$, $G(f)=\mathcal C_{f(k-1)}(\{x\in f(k-1)|x\notin V\})$.
- If $k\neq0$ and $f(k-1)\in V$, $G(f)=\emptyset$.
If $f$ is not a function from some $k\in N$, then $G(f)=\emptyset$.

Then we can use transfinite recursion to define a function $F$ from $N$ such that $F(k)=G(F|_k)$. Now we have $F(0)=G(F|_0)=A\notin V$, and suppose $F(k)\notin V$, then $F(k+1)=G(F|_{k+1})=\mathcal C_{F(k)}(\{x\in F(k)|x\notin V\})\in\{x\in F(k)|x\notin V\}$, implying $F(k+1)\in F(k)$ and $F(k+1)\notin V$. Therefore, $F$ is an infinitely descending $\in$-sequence, which contradicts the axiom of foundation. We conclude that every set is in $V$. $\blacksquare$

Rank
Let $X$ be a set, then $X\in V$, thus $\{\alpha:\alpha\in\text{Ord}\land X\in V_\alpha\}$ is non-empty and hence has a least element $\kappa_X$. Note that $X\mapsto\kappa_X$ defines a class function from $V$ to $\text{Ord}$, denoted $\rank$. And $\rank(X)$ is called the rank of $X$.
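For hereditarily finite sets of small rank, the rank can be found by searching the finite stages. The sketch below assumes the same `frozenset` representation and integer indexing as before; the names `powerset`, `stage`, `rank` and `rank_by_recursion` are hypothetical, and the search is capped at $V_4$ to keep the computation small. With the definition used here (the least $\alpha$ with $X\in V_\alpha$), the empty set has rank $1$. The second function computes the same value as one more than the largest rank of an element, which follows from $V_\alpha=\bigcup\{\mathcal P(V_\beta):\beta\in\alpha\}$ and the monotonicity of the stages; it is included only as a cross-check.

```python
from functools import lru_cache
from itertools import combinations

def powerset(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

@lru_cache(maxsize=None)
def stage(n):
    """V_n as the union of P(V_k) over all k < n."""
    result = frozenset()
    for k in range(n):
        result |= powerset(stage(k))
    return result

def rank(x, max_stage=4):
    """Least n with x in V_n, searched up to max_stage (a cap of this sketch)."""
    for n in range(max_stage + 1):
        if x in stage(n):
            return n
    raise ValueError("rank exceeds max_stage")

def rank_by_recursion(x):
    """One more than the largest rank of an element; 1 for the empty set."""
    return 1 + max((rank_by_recursion(y) for y in x), default=0)

e = frozenset()
samples = [
    e,
    frozenset({e}),
    frozenset({frozenset({e})}),
    frozenset({e, frozenset({e})}),
    frozenset({frozenset({frozenset({e})})}),
]
ranks = [rank(x) for x in samples]
```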

Transfinite induction theorem
Let $\varphi(x)$ be a formula. Suppose $\forall(\alpha\in\text{Ord}),(\forall(\beta\in\alpha),\varphi(\beta))\to\varphi(\alpha)$, then $\forall(\alpha\in\text{Ord}),\varphi(\alpha)$. (show proof)

Transfinite recursion theorem
Suppose $G:V\to V$ is a class function, then we can uniquely construct a class function $F:\text{Ord}\to V$ such that for all $\alpha\in\text{Ord}$, $F(\alpha)=G(F\upharpoonright\alpha)$. (show proof)

Proof. Let $\kappa\in\text{Ord}$, then by transfinite recursion, there exists a unique function $F_\kappa$ from $\kappa^+$ such that for all $\alpha\in\kappa^+$, $F_\kappa(\alpha)=G(F_\kappa|_\alpha)$. Thus $F_\kappa(\kappa)=G(F_\kappa|_\kappa)$. And we can define a class function $F:\text{Ord}\to V$ such that for all $\alpha\in\text{Ord}$, $F(\alpha)=F_\alpha(\alpha)$.

Let $\alpha,\beta\in\text{Ord}$ such that $\beta\in\alpha$. Let $\gamma\in\beta^+$ and suppose for inductive hypothesis that for all $\delta\in\gamma$, $F_\alpha(\delta)=F_\beta(\delta)$. Then $F_\alpha|_\gamma=F_\beta|_\gamma$, thus $F_\alpha(\gamma)=G(F_\alpha|_\gamma)=G(F_\beta|_\gamma)=F_\beta(\gamma)$. By transfinite induction, for all $\gamma\in\beta^+$, we have $F_\alpha(\gamma)=F_\beta(\gamma)$.

Let $\alpha\in\text{Ord}$. For all $\beta\in\alpha$, we have $\beta\in\text{Ord}$ and $\beta\in\beta^+$, thus $$F_\alpha|_\alpha(\beta)=F_\alpha(\beta)=F_\beta(\beta)=F(\beta)=F\upharpoonright\alpha(\beta)$$ Hence $F_\alpha|_\alpha=F\upharpoonright\alpha$, and therefore $$F(\alpha)=F_\alpha(\alpha)=G(F_\alpha|_\alpha)=G(F\upharpoonright\alpha)$$
Suppose $E:\text{Ord}\to V$ is a class function such that for all $\alpha\in\text{Ord}$, $E(\alpha)=G(E\upharpoonright\alpha)$. Let $\alpha\in\text{Ord}$ and suppose for inductive hypothesis that for all $\beta\in\alpha$, $E(\beta)=F(\beta)$. Then for all $\beta\in\alpha$, $E\upharpoonright\alpha(\beta)=E(\beta)=F(\beta)=F\upharpoonright\alpha(\beta)$. Thus $E\upharpoonright\alpha=F\upharpoonright\alpha$, and we have $E(\alpha)=G(E\upharpoonright\alpha)=G(F\upharpoonright\alpha)=F(\alpha)$. By transfinite induction, for all $\alpha\in\text{Ord}$, $E(\alpha)=F(\alpha)$. Therefore $E=F$.

Finally, if $E$ is a class such that $E=F$, then $E$ is a class function from $\text{Ord}$ to $V$. Let $\alpha\in\text{Ord}$. Then for all $\beta\in\alpha$, $E\upharpoonright\alpha(\beta)=E(\beta)=F(\beta)=F\upharpoonright\alpha(\beta)$. Thus $E\upharpoonright\alpha=F\upharpoonright\alpha$. Therefore $E(\alpha)=F(\alpha)=G(F\upharpoonright\alpha)=G(E\upharpoonright\alpha)$. $\blacksquare$

Note. The transfinite recursion theorem is a schema of theorems. It essentially states that we can define a meta-logical function $\mathcal F$ from classes to classes, such that given classes $G$ and $E$, we have the formula representing that "if $G$ is a class function from $V$ to $V$, then $E$ is a class function from $\text{Ord}$ to $V$ such that for all $\alpha\in\text{Ord}$, $E(\alpha)=G(E\upharpoonright\alpha)$, if and only if $E=\mathcal F(G)$". And we say that $\mathcal F(G)$ is recursively defined by $G$.

Classification of ordinals
An ordinal $\alpha\in\text{Ord}$ is said to be a

zero ordinal if and only if $\alpha$ is empty;
successor ordinal if and only if $\alpha$ has a maximal element;
limit ordinal if and only if $\alpha$ is non-empty and has no maximal element.

Clearly, every ordinal in $\text{Ord}$ is in exactly one of these classes.

Lemma. Let $\alpha\in\text{Ord}$, then no $\beta\in\text{Ord}$ has $\alpha\lt\beta\lt\alpha^+$. (show proof)

Proposition. Let $\alpha\in\text{Ord}$, then $\alpha^+$ is a successor ordinal. (show proof)

Proposition. Let $\alpha$ be a successor ordinal, then there exists a unique $\beta\in\text{Ord}$ such that $\beta^+=\alpha$. (show proof)

Definition. By the above proposition, the formula $\beta^+=\alpha$ defines a class function $-$ from successor ordinals to $\text{Ord}$ with $\alpha$ as input and $\beta$ as output. For every successor ordinal $\alpha$, we have $(\alpha^-)^+=\alpha$. Also, for all $\alpha\in\text{Ord}$, $(\alpha^+)^-=\alpha$.

Proposition.

$0$ is the only zero ordinal.
Every non-zero natural number is a successor ordinal.
$N$ is the least limit ordinal.

(show proof)

Notation. We use $\omega$ to denote the least limit ordinal, which is $N$.
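
Example. The classification can be checked on a few small cases, taking the natural numbers to be the von Neumann ordinals: $$0=\emptyset,\qquad 3=\{0,1,2\},\qquad\omega,\qquad\omega\cup\{\omega\}$$ Here $0$ is the zero ordinal. $3$ has maximal element $2$, so it is a successor ordinal with $3^-=2$. $\omega$ is non-empty and has no maximal element, since $n\in\omega$ implies $n\lt n^+\in\omega$, so it is a limit ordinal. Finally, $\omega\cup\{\omega\}$ has maximal element $\omega$, so it is a successor ordinal that is not a natural number.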

Notation. Given a term $t$, we use $t^+$ to denote the term $t\cup\{t\}$, regardless of whether $t$ is an ordinal.

Definition. Given classes $A$ and $B$, we define $A\times B$ to be the class "there exist $a\in A$ and $b\in B$ such that $u=(a,b)$" with $u$ as the variable.

Addition on ordinals
Define a class function $G_\alpha:V\to V$ by:

$G_\alpha(X)=\alpha$ if $X$ is a function from a zero ordinal;
$G_\alpha(X)=\bigcup\{p_1^+:p\in X\}$ if $X$ is a function from a successor ordinal;
$G_\alpha(X)=\bigcup\{p_1:p\in X\}$ if $X$ is a function from a limit ordinal;
$G_\alpha(X)=\emptyset$ if otherwise.

If $\alpha\in\text{Ord}$, then $G_\alpha$ recursively defines a class function $+_\alpha:\text{Ord}\to\text{Ord}$ (show proof).

Proof. By transfinite recursion, the class $+_\alpha$ defined by $G_\alpha$ is a class function from $\text{Ord}$ to $V$ such that for all $\beta\in\text{Ord}$, $+_\alpha(\beta)=G_\alpha(+_\alpha\upharpoonright\beta)$. Let $\beta\in\text{Ord}$ and suppose that for all $\gamma\in\beta$, $+_\alpha(\gamma)\in\text{Ord}$.

If $\beta$ is a zero ordinal, then $+_\alpha(\beta)=G_\alpha(+_\alpha\upharpoonright\emptyset)=\alpha\in\text{Ord}$.
If $\beta$ is a successor ordinal, then $+_\alpha(\beta)=G_\alpha(+_\alpha\upharpoonright\beta)=\bigcup\{p_1^+:p\in +_\alpha\upharpoonright\beta\}$. Note that for all $p\in +_\alpha\upharpoonright\beta$, there exist $\gamma,\delta$ such that $\gamma\in\beta$ and $p=(\gamma,\delta)$. Thus $p_1=\delta=+_\alpha\upharpoonright\beta(\gamma)=+_\alpha(\gamma)\in\text{Ord}$, and hence $p_1^+\in\text{Ord}$. Then for all $\delta\in\{p_1^+:p\in +_\alpha\upharpoonright\beta\}$, $\delta\in\text{Ord}$, and thus $\{p_1^+:p\in +_\alpha\upharpoonright\beta\}\subset\text{Ord}$. Hence $\bigcup\{p_1^+:p\in +_\alpha\upharpoonright\beta\}\in\text{Ord}$. Therefore, $+_\alpha(\beta)\in\text{Ord}$.
If $\beta$ is a limit ordinal, then $+_\alpha(\beta)=G_\alpha(+_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in +_\alpha\upharpoonright\beta\}$. By the same reasoning as above, for all $p\in +_\alpha\upharpoonright\beta$, $p_1\in\text{Ord}$. Then for all $\delta\in\{p_1:p\in +_\alpha\upharpoonright\beta\}$, $\delta\in\text{Ord}$, and thus $\{p_1:p\in +_\alpha\upharpoonright\beta\}\subset\text{Ord}$. Hence $\bigcup\{p_1:p\in +_\alpha\upharpoonright\beta\}\in\text{Ord}$. Therefore, $+_\alpha(\beta)\in\text{Ord}$.

In all cases, $+_\alpha(\beta)\in\text{Ord}$. Thus by transfinite induction, for all $\beta\in\text{Ord}$, $+_\alpha(\beta)\in\text{Ord}$. Hence $+_\alpha$ is a class function from $\text{Ord}$ to $\text{Ord}$. $\blacksquare$

And we can thus define a class function $+:\text{Ord}\times\text{Ord}\to\text{Ord}$ by $+(p)=+_{p_0}(p_1)$. Given $\alpha,\beta\in\text{Ord}$, we may denote $+((\alpha,\beta))$ as $\alpha+\beta$.

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\gamma\lt\beta$, then $\alpha+\gamma\lt\alpha+\beta$. (show proof)

Proof. Let $\alpha\in\text{Ord}$. Suppose $\beta\in\text{Ord}$ and for all $\delta\in\beta$, for all $\gamma\lt\delta$, $\alpha+\gamma\lt\alpha+\delta$.

If $\beta$ is a zero ordinal, then for all $\gamma\lt\beta$, $\alpha+\gamma\lt\alpha+\beta$.
If $\beta$ is a successor ordinal, then for some $\delta\in\beta$ we have $\delta^+=\beta$. Thus for all $\gamma\lt\delta$, $\alpha+\gamma\lt\alpha+\delta$. Note that $\alpha+\beta=G_\alpha(+_\alpha\upharpoonright\beta)=\bigcup\{p_1^+:p\in +_\alpha\upharpoonright\beta\}$. And $\alpha+\delta=+_\alpha(\delta)=+_\alpha\upharpoonright\beta(\delta)$. Thus $(\delta,\alpha+\delta)\in +_\alpha\upharpoonright\beta$. And $(\alpha+\delta)^+\in\{p_1^+:p\in +_\alpha\upharpoonright\beta\}$. Thus $(\alpha+\delta)^+\subseteq\bigcup\{p_1^+:p\in +_\alpha\upharpoonright\beta\}$, implying $(\alpha+\delta)^+\le\alpha+\beta$. If $\gamma\in\beta$, then $\gamma\le\delta$, thus $\alpha+\gamma\le\alpha+\delta\lt(\alpha+\delta)^+\le\alpha+\beta$.
If $\beta$ is a limit ordinal, then $\alpha+\beta=G_\alpha(+_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in +_\alpha\upharpoonright\beta\}$. Let $\gamma\in\beta$, then there exists $\delta\in\beta$ such that $\gamma\in\delta$, thus $\alpha+\gamma\lt\alpha+\delta$. Note that $\alpha+\delta=+_\alpha\upharpoonright\beta(\delta)$, implying $(\delta,\alpha+\delta)\in +_\alpha\upharpoonright\beta$. Thus $\alpha+\delta\in\{p_1:p\in +_\alpha\upharpoonright\beta\}$, implying $\alpha+\delta\subseteq\alpha+\beta$ and hence $\alpha+\delta\le\alpha+\beta$. Therefore $\alpha+\gamma\lt\alpha+\delta\le\alpha+\beta$.

In all cases, we have that for all $\gamma\lt\beta$, $\alpha+\gamma\lt\alpha+\beta$. By transfinite induction, for all $\beta\in\text{Ord}$, for all $\gamma\lt\beta$, $\alpha+\gamma\lt\alpha+\beta$. $\blacksquare$

Proposition. Let $\alpha,\beta\in\text{Ord}$, then

$\alpha+0=\alpha$;
$\alpha+\beta^+=(\alpha+\beta)^+$;
$\alpha+\beta=\bigcup_{\gamma\in\beta}(\alpha+\gamma)$ if $\beta$ is a limit ordinal.

(show proof)

Proposition. For all $\alpha\in\text{Ord}$, $0+\alpha=\alpha$. (show proof)

Proposition. Ordinal addition extends natural number addition. (show proof)
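
Example. Ordinal addition is not commutative. As a sketch, using the three clauses above together with $1+n=n^+$ for $n\in\omega$ (which comes from natural number addition): $$1+\omega=\bigcup_{n\in\omega}(1+n)=\bigcup_{n\in\omega}n^+=\omega$$ $$\omega+1=\omega+0^+=(\omega+0)^+=\omega^+$$ The union in the first line equals $\omega$ because each $n^+$ is a subset of $\omega$ and each $n\in\omega$ lies in $n^+$. Since $\omega\in\omega^+$, we get $1+\omega\lt\omega+1$.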

Proposition. For all $\gamma\in\text{Ord}$, for all $\alpha\le\gamma$, there exists a unique $\beta\in\text{Ord}$ such that $\alpha+\beta=\gamma$. (show proof)

Proof. Let $\gamma\in\text{Ord}$ and suppose for all $\delta\in\gamma$, for all $\alpha\le\delta$, there exists a unique $\beta\in\text{Ord}$ such that $\alpha+\beta=\delta$.

If $\gamma$ is a zero ordinal, then for all $\alpha\le\gamma$, there exists a unique $\beta\in\text{Ord}$ such that $\alpha+\beta=\gamma$.
If $\gamma$ is a successor ordinal, then some $\delta\in\gamma$ has $\delta^+=\gamma$. Thus given $\alpha\le\gamma$, either $\alpha=\gamma$ or $\alpha\le\delta$.
- If $\alpha=\gamma$, then $\alpha+0=\gamma$.
- If $\alpha\le\delta$, then some $\beta\in\text{Ord}$ has $\alpha+\beta=\delta$. Thus $\alpha+\beta^+=(\alpha+\beta)^+=\delta^+=\gamma$.
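Uniqueness follows from the strict monotonicity of addition in the right argument: if $\beta\lt\beta'$, then $\alpha+\beta\lt\alpha+\beta'$.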
If $\gamma$ is a limit ordinal, then $\gamma$ is non-empty and has no maximal element. Given $\alpha\le\gamma$, either $\alpha=\gamma$ or $\alpha\in\gamma$. The case when $\alpha=\gamma$ is trivial (take $\beta=0$). Suppose $\alpha\in\gamma$. Note that for all $\delta\in\{u\in\gamma|\alpha\le u\}$, there exists a unique $\beta\in\text{Ord}$ such that $\alpha+\beta=\delta$. By the axiom of replacement, this defines a function $f$ whose range we will denote as $Y$. For all $u\in Y$, some $\delta\in\{u\in\gamma|\alpha\le u\}$ has $f(\delta)=u$. Since $f(\delta)\in\text{Ord}$, $u\in\text{Ord}$. Thus $Y\subset\text{Ord}$, implying $\bigcup Y\in\text{Ord}$, which is also the least upper bound of $Y$. We denote $\bigcup Y$ as $\epsilon$. Let $\zeta\in Y$, then some $\delta\in\{u\in\gamma|\alpha\le u\}$ has $f(\delta)=\zeta$. Note that $\delta^+\in\{u\in\gamma|\alpha\le u\}$ and $$\alpha+f(\delta^+)=\delta^+=(\alpha+f(\delta))^+=(\alpha+\zeta)^+=\alpha+\zeta^+$$ implying $\zeta^+=f(\delta^+)\in Y$, and thus $\zeta\lt\zeta^+\le\epsilon$. Hence $Y\subseteq\epsilon$. Since $\alpha\in\{u\in\gamma|\alpha\le u\}$ and $\alpha+f(\alpha)=\alpha$, we have $f(\alpha)=0$, implying $0\in Y$, and thus $0\in\epsilon$. Hence $\epsilon$ is non-empty. Given $\delta\in\epsilon$, $\delta$ is not an upper bound of $Y$, thus some $\zeta\in Y$ has $\delta\lt\zeta\lt\epsilon$. Hence $\epsilon$ has no maximal element. Therefore $\epsilon$ is a limit ordinal. Then $\alpha+\epsilon=\bigcup_{\delta\in\epsilon}(\alpha+\delta)$. If $\zeta\in\alpha+\epsilon$, then for some $\delta\in\epsilon$, $\zeta\in\alpha+\delta$. Since $\delta$ is not an upper bound of $Y$, for some $x\in\{u\in\gamma|\alpha\le u\}$, $\delta\lt f(x)$, thus $\zeta\lt\alpha+\delta\lt\alpha+f(x)=x\lt\gamma$. Hence $\alpha+\epsilon\le\gamma$. If $\zeta\in\gamma$, then some $x\in\gamma$ has $\alpha,\zeta\in x$. Thus $\zeta\in x=\alpha+f(x)$. Since $f(x)\in Y$, $f(x)\in\epsilon$. Hence $\zeta\in\bigcup_{\delta\in\epsilon}(\alpha+\delta)=\alpha+\epsilon$. Hence $\gamma\le\alpha+\epsilon$. We have shown that $\alpha+\epsilon=\gamma$. Uniqueness follows from the strict monotonicity of addition in the right argument.

By transfinite induction, for all $\gamma\in\text{Ord}$, for all $\alpha\le\gamma$, there exists a unique $\beta\in\text{Ord}$ such that $\alpha+\beta=\gamma$. $\blacksquare$
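
Example. Two small instances of the proposition, read as left subtraction: $$\omega+\beta=\omega+2\iff\beta=2,\qquad 1+\beta=\omega\iff\beta=\omega$$ The second holds because $1+\omega=\bigcup_{n\in\omega}(1+n)=\omega$. The statement is one-sided: there is no $\beta\in\text{Ord}$ with $\beta+1=\omega$, since $\beta+1=(\beta+0)^+=\beta^+$ is a successor ordinal while $\omega$ is a limit ordinal.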

Lemma. For all $\alpha,\beta\in\text{Ord}$, if $\beta$ is a limit ordinal, then $\alpha+\beta$ is a limit ordinal. (show proof)

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, $(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$. (show proof)

Proof. Let $\alpha,\beta\in\text{Ord}$. Suppose $\gamma\in\text{Ord}$ and for all $\delta\in\gamma$, $(\alpha+\beta)+\delta=\alpha+(\beta+\delta)$.

If $\gamma$ is a zero ordinal, then $$(\alpha+\beta)+\gamma=\alpha+\beta=\alpha+(\beta+\gamma)$$
If $\gamma$ is a successor ordinal, then some $\delta\in\gamma$ has $\delta^+=\gamma$. Thus $$(\alpha+\beta)+\gamma=(\alpha+\beta)+\delta^+=((\alpha+\beta)+\delta)^+=(\alpha+(\beta+\delta))^+=\alpha+(\beta+\delta)^+=\alpha+(\beta+\delta^+)=\alpha+(\beta+\gamma)$$
If $\gamma$ is a limit ordinal, then $$(\alpha+\beta)+\gamma =\bigcup_{\delta\in\gamma}((\alpha+\beta)+\delta) =\bigcup_{\delta\in\gamma}(\alpha+(\beta+\delta))$$ If $u\in\bigcup_{\delta\in\gamma}(\alpha+(\beta+\delta))$, then for some $\delta\in\gamma$, $u\in\alpha+(\beta+\delta)$. Since $\beta+\delta\in\beta+\gamma$, for some $\epsilon\in\beta+\gamma$, $u\in\alpha+\epsilon$. Thus $u\in\bigcup_{\epsilon\in\beta+\gamma}(\alpha+\epsilon)$. If $u\in\bigcup_{\epsilon\in\beta+\gamma}(\alpha+\epsilon)$, then for some $\epsilon\in\beta+\gamma$, $u\in\alpha+\epsilon$.
- If $\epsilon\lt\beta$, then $\alpha+\epsilon\lt\alpha+\beta$, thus $u\in\alpha+(\beta+0)$. Since $0\in\gamma$, we have $u\in\bigcup_{\delta\in\gamma}(\alpha+(\beta+\delta))$.
- If $\epsilon\ge\beta$, then for some $\delta\in\text{Ord}$, $\beta+\delta=\epsilon$. Thus $\alpha+\epsilon=\alpha+(\beta+\delta)$, implying $u\in\alpha+(\beta+\delta)$. Since $\epsilon\in\beta+\gamma$, we have $\beta+\delta\lt\beta+\gamma$, thus $\delta\in\gamma$. Hence $u\in\bigcup_{\delta\in\gamma}(\alpha+(\beta+\delta))$.
We have shown that $$\bigcup_{\delta\in\gamma}(\alpha+(\beta+\delta))=\bigcup_{\epsilon\in\beta+\gamma}(\alpha+\epsilon)$$ Since $\gamma$ is a limit ordinal, $\beta+\gamma$ is also a limit ordinal. Thus $$\bigcup_{\epsilon\in\beta+\gamma}(\alpha+\epsilon)=\alpha+(\beta+\gamma)$$

By transfinite induction, for all $\gamma\in\text{Ord}$, $(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$. $\blacksquare$

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\beta\lt\alpha$, then $\beta+\gamma\le\alpha+\gamma$. (show proof)
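
Example. The inequality in this proposition cannot be made strict in general. We have $0\lt1$, yet $$0+\omega=\omega=\bigcup_{n\in\omega}(1+n)=1+\omega$$ so adding $\omega$ on the right collapses the difference between $0$ and $1$.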

Multiplication on ordinals
Define a class function $G_\alpha:V\to V$ by:

$G_\alpha(X)=0$ if $X$ is a function from a zero ordinal;
$G_\alpha(X)=\bigcup\{p_1+\alpha:p\in X\}$ if $X$ is a function from a successor ordinal;
$G_\alpha(X)=\bigcup\{p_1:p\in X\}$ if $X$ is a function from a limit ordinal;
$G_\alpha(X)=\emptyset$ if otherwise.

If $\alpha\in\text{Ord}$, then $G_\alpha$ recursively defines a class function $\times_\alpha:\text{Ord}\to\text{Ord}$ (show proof).

Proof. By transfinite recursion, the class $\times_\alpha$ defined by $G_\alpha$ is a class function from $\text{Ord}$ to $V$ such that for all $\beta\in\text{Ord}$, $\times_\alpha(\beta)=G_\alpha(\times_\alpha\upharpoonright\beta)$. Let $\beta\in\text{Ord}$ and suppose that for all $\gamma\in\beta$, $\times_\alpha(\gamma)\in\text{Ord}$.

If $\beta$ is a zero ordinal, then $\times_\alpha(\beta)=G_\alpha(\times_\alpha\upharpoonright\emptyset)=0\in\text{Ord}$.
If $\beta$ is a successor ordinal, then $\times_\alpha(\beta)=G_\alpha(\times_\alpha\upharpoonright\beta)=\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}$. Note that for all $p\in \times_\alpha\upharpoonright\beta$, there exist $\gamma,\delta$ such that $\gamma\in\beta$ and $p=(\gamma,\delta)$. Thus $p_1=\delta=\times_\alpha\upharpoonright\beta(\gamma)=\times_\alpha(\gamma)\in\text{Ord}$, and hence $p_1+\alpha\in\text{Ord}$. Then for all $\delta\in\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}$, $\delta\in\text{Ord}$, and thus $\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}\subset\text{Ord}$. Hence $\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}\in\text{Ord}$. Therefore, $\times_\alpha(\beta)\in\text{Ord}$.
If $\beta$ is a limit ordinal, then $\times_\alpha(\beta)=G_\alpha(\times_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in \times_\alpha\upharpoonright\beta\}$. By the same reasoning as above, for all $p\in \times_\alpha\upharpoonright\beta$, $p_1\in\text{Ord}$. Then for all $\delta\in\{p_1:p\in \times_\alpha\upharpoonright\beta\}$, $\delta\in\text{Ord}$, and thus $\{p_1:p\in \times_\alpha\upharpoonright\beta\}\subset\text{Ord}$. Hence $\bigcup\{p_1:p\in \times_\alpha\upharpoonright\beta\}\in\text{Ord}$. Therefore, $\times_\alpha(\beta)\in\text{Ord}$.

In all cases, $\times_\alpha(\beta)\in\text{Ord}$. Thus by transfinite induction, for all $\beta\in\text{Ord}$, $\times_\alpha(\beta)\in\text{Ord}$. Hence $\times_\alpha$ is a class function from $\text{Ord}$ to $\text{Ord}$. $\blacksquare$

And we can thus define a class function $\times:\text{Ord}\times\text{Ord}\to\text{Ord}$ by $\times(p)=\times_{p_0}(p_1)$. Given $\alpha,\beta\in\text{Ord}$, we may denote $\times((\alpha,\beta))$ as $\alpha\beta$.

Proposition. For all $\alpha\in\text{Ord}$, $0\alpha=0$. (show proof)

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\alpha\neq0$ and $\gamma\lt\beta$, then $\alpha\gamma\lt\alpha\beta$. (show proof)

Proof. Let $\alpha\in\text{Ord}$ be non-zero. Suppose $\beta\in\text{Ord}$ and for all $\delta\in\beta$, for all $\gamma\lt\delta$, $\alpha\gamma\lt\alpha\delta$.

If $\beta$ is a zero ordinal, then for all $\gamma\lt\beta$, $\alpha\gamma\lt\alpha\beta$.
If $\beta$ is a successor ordinal, then for some $\delta\in\beta$ we have $\delta^+=\beta$. Thus for all $\gamma\lt\delta$, $\alpha\gamma\lt\alpha\delta$. Note that $\alpha\beta=G_\alpha(\times_\alpha\upharpoonright\beta)=\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}$. And $\alpha\delta=\times_\alpha(\delta)=\times_\alpha\upharpoonright\beta(\delta)$. Thus $(\delta,\alpha\delta)\in \times_\alpha\upharpoonright\beta$. And $(\alpha\delta)+\alpha\in\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}$. Thus $(\alpha\delta)+\alpha\subseteq\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta\}$, implying $(\alpha\delta)+\alpha\le\alpha\beta$. If $\gamma\in\beta$, then $\gamma\le\delta$, thus $\alpha\gamma\le\alpha\delta\lt(\alpha\delta)+\alpha\le\alpha\beta$.
If $\beta$ is a limit ordinal, then $\alpha\beta=G_\alpha(\times_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in \times_\alpha\upharpoonright\beta\}$. Let $\gamma\in\beta$, then there exists $\delta\in\beta$ such that $\gamma\in\delta$, thus $\alpha\gamma\lt\alpha\delta$. Note that $\alpha\delta=\times_\alpha\upharpoonright\beta(\delta)$, implying $(\delta,\alpha\delta)\in \times_\alpha\upharpoonright\beta$. Thus $\alpha\delta\in\{p_1:p\in \times_\alpha\upharpoonright\beta\}$, implying $\alpha\delta\subseteq\alpha\beta$ and hence $\alpha\delta\le\alpha\beta$. Therefore $\alpha\gamma\lt\alpha\delta\le\alpha\beta$.

In all cases, we have that for all $\gamma\lt\beta$, $\alpha\gamma\lt\alpha\beta$. By transfinite induction, for all $\beta\in\text{Ord}$, for all $\gamma\lt\beta$, $\alpha\gamma\lt\alpha\beta$. $\blacksquare$

Proposition. Let $\alpha,\beta\in\text{Ord}$, then

$\alpha0=0$;
$\alpha\beta^+=\alpha\beta+\alpha$;
$\alpha\beta=\bigcup_{\gamma\in\beta}(\alpha\gamma)$ if $\beta$ is a limit ordinal.

(show proof)

Proof. $\alpha0=G_\alpha(\times_\alpha\upharpoonright0)=0$.

Note that $\alpha\beta^+=G_\alpha(\times_\alpha\upharpoonright\beta^+)=\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta^+\}$. And $\alpha\beta=\times_\alpha(\beta)=\times_\alpha\upharpoonright\beta^+(\beta)$. Thus $(\beta,\alpha\beta)\in \times_\alpha\upharpoonright\beta^+$. And $\alpha\beta+\alpha\in\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta^+\}$. If $u\in\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta^+\}$, then for some $p\in \times_\alpha\upharpoonright\beta^+$, $u=p_1+\alpha$. And we have $p_0\in\beta^+$, thus $p_0\le\beta$ and $p_1=\times_\alpha\upharpoonright\beta^+(p_0)=\times_\alpha(p_0)=\alpha p_0$. Since $p_0\le\beta$, we have $\alpha p_0\le\alpha\beta$, thus $u=\alpha p_0+\alpha\subseteq\alpha\beta+\alpha$. Hence $\alpha\beta^+=\bigcup\{p_1+\alpha:p\in \times_\alpha\upharpoonright\beta^+\}=\alpha\beta+\alpha$.

Suppose $\beta$ is a limit ordinal. Then $\alpha\beta=G_\alpha(\times_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in \times_\alpha\upharpoonright\beta\}$. Thus if $\delta\in\alpha\beta$, then for some $u\in\{p_1:p\in \times_\alpha\upharpoonright\beta\}$, $\delta\in u$, and for some $p\in \times_\alpha\upharpoonright\beta$, $u=p_1$, thus $p_0\in\beta$ and $\delta\in p_1$. Note that $p_1=\times_\alpha\upharpoonright\beta(p_0)=\times_\alpha(p_0)=\alpha p_0\in\{\alpha\gamma:\gamma\in\beta\}$. Thus $\delta\in\bigcup_{\gamma\in\beta}(\alpha\gamma)$. For the other direction, if $\delta\in\bigcup_{\gamma\in\beta}(\alpha\gamma)$, then for some $u\in\{\alpha\gamma:\gamma\in\beta\}$, $\delta\in u$, and for some $\gamma\in\beta$, $u=\alpha\gamma$. Note that $\alpha\gamma=\times_\alpha(\gamma)=\times_\alpha\upharpoonright\beta(\gamma)$, thus $(\gamma,u)\in \times_\alpha\upharpoonright\beta$, and hence $u\in\{p_1:p\in \times_\alpha\upharpoonright\beta\}$. Then we have $\delta\in\alpha\beta$. $\blacksquare$

Proposition. For all $\alpha,\beta\in\text{Ord}$, $\alpha\beta=0$ if and only if $\alpha=0$ or $\beta=0$. (show proof)

Proposition. For all $\alpha\in\text{Ord}$, $\alpha1=\alpha$. (show proof)

Proposition. For all $\alpha\in\text{Ord}$, $1\alpha=\alpha$. (show proof)

Proposition. Ordinal multiplication extends natural number multiplication. (show proof)
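
Example. Ordinal multiplication is not commutative. As a sketch, using the three clauses above, $\omega1=\omega$, and the agreement with natural number multiplication on $\omega$: $$2\omega=\bigcup_{n\in\omega}(2n)=\omega$$ $$\omega2=\omega1^+=\omega1+\omega=\omega+\omega$$ The union in the first line equals $\omega$ because each $2n$ is a natural number and every $n\in\omega$ lies in $2(n^+)$. Since $0\lt\omega$ gives $\omega=\omega+0\lt\omega+\omega$, we get $2\omega\lt\omega2$.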

Lemma. For all $\alpha,\beta\in\text{Ord}$, if $\alpha\neq0$ and $\beta$ is a limit ordinal, then $\alpha\beta$ is a limit ordinal. (show proof)

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, $\alpha(\beta+\gamma)=\alpha\beta+\alpha\gamma$. (show proof)

Proof. Let $\alpha,\beta\in\text{Ord}$. If $\alpha=0$, this is trivial. Now suppose $\alpha\neq0$. Suppose $\gamma\in\text{Ord}$ and for all $\delta\in\gamma$, $\alpha(\beta+\delta)=\alpha\beta+\alpha\delta$.

If $\gamma$ is a zero ordinal, then $$\alpha(\beta+\gamma)=\alpha\beta=\alpha\beta+\alpha\gamma$$
If $\gamma$ is a successor ordinal, then some $\delta\in\gamma$ has $\delta^+=\gamma$. Thus $$\alpha(\beta+\gamma)=\alpha(\beta+\delta^+)=\alpha(\beta+\delta)^+=\alpha(\beta+\delta)+\alpha =(\alpha\beta+\alpha\delta)+\alpha=\alpha\beta+(\alpha\delta+\alpha)=\alpha\beta+\alpha\delta^+=\alpha\beta+\alpha\gamma$$
If $\gamma$ is a limit ordinal, then $\beta+\gamma$ and $\alpha\gamma$ are limit ordinals. Thus $$\alpha(\beta+\gamma)=\bigcup_{\delta\in\beta+\gamma}(\alpha\delta)$$ If $u\in\bigcup_{\delta\in\beta+\gamma}(\alpha\delta)$, then for some $\delta\in\beta+\gamma$, $u\in\alpha\delta$.
- If $\delta\lt\beta$, since $\alpha\neq0$, we have $\alpha\delta\lt\alpha\beta$. Thus $u\in\alpha\beta$. Since $\gamma$ is a limit ordinal, it is non-zero. Thus $\alpha\gamma\neq0$, implying $0\in\alpha\gamma$. Hence $u\in\bigcup_{\zeta\in\alpha\gamma}(\alpha\beta+\zeta)$.
- If $\delta\ge\beta$, then for some $\epsilon\in\text{Ord}$, $\beta+\epsilon=\delta\lt\beta+\gamma$, thus $\epsilon\lt\gamma$. Since $\alpha\neq0$, we have $\alpha\epsilon\in\alpha\gamma$. Note that $\alpha\delta=\alpha(\beta+\epsilon)=\alpha\beta+\alpha\epsilon$. Thus $u\in\alpha\beta+\alpha\epsilon$. Hence $u\in\bigcup_{\zeta\in\alpha\gamma}(\alpha\beta+\zeta)$.
For the other direction, let $u\in\bigcup_{\zeta\in\alpha\gamma}(\alpha\beta+\zeta)$. Then for some $\zeta\in\alpha\gamma$, $u\in\alpha\beta+\zeta$. Since $\alpha\gamma=\bigcup_{\epsilon\in\gamma}(\alpha\epsilon)$, for some $\epsilon\in\gamma$, $\zeta\in\alpha\epsilon$. Thus $\alpha\beta+\zeta\lt\alpha\beta+\alpha\epsilon=\alpha(\beta+\epsilon)$. And we have $u\in\alpha(\beta+\epsilon)$ and $\beta+\epsilon\in\beta+\gamma$. Hence $u\in\bigcup_{\delta\in\beta+\gamma}(\alpha\delta)$. We have shown that $$\bigcup_{\delta\in\beta+\gamma}(\alpha\delta)=\bigcup_{\zeta\in\alpha\gamma}(\alpha\beta+\zeta)$$ And since $\alpha\gamma$ is a limit ordinal, $$\bigcup_{\zeta\in\alpha\gamma}(\alpha\beta+\zeta)=\alpha\beta+\alpha\gamma$$

By transfinite induction, for all $\gamma\in\text{Ord}$, $\alpha(\beta+\gamma)=\alpha\beta+\alpha\gamma$. $\blacksquare$
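
Example. Distributivity holds only on this side. The mirrored law $(\alpha+\beta)\gamma=\alpha\gamma+\beta\gamma$ fails already for $\alpha=\beta=1$ and $\gamma=\omega$: $$(1+1)\omega=2\omega=\bigcup_{n\in\omega}(2n)=\omega,\qquad 1\omega+1\omega=\omega+\omega$$ and $\omega\lt\omega+\omega$ by strict monotonicity of addition in the right argument.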

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, $(\alpha\beta)\gamma=\alpha(\beta\gamma)$. (show proof)

Proof. Let $\alpha,\beta\in\text{Ord}$. If $\alpha=0$ or $\beta=0$, this is trivial. Now suppose $\alpha$ and $\beta$ are non-zero. Suppose $\gamma\in\text{Ord}$ and for all $\delta\in\gamma$, $(\alpha\beta)\delta=\alpha(\beta\delta)$.

If $\gamma$ is a zero ordinal, then $$(\alpha\beta)\gamma=0=\alpha(\beta\gamma)$$
If $\gamma$ is a successor ordinal, then some $\delta\in\gamma$ has $\delta^+=\gamma$. Thus $$(\alpha\beta)\gamma=(\alpha\beta)\delta^+=(\alpha\beta)\delta+\alpha\beta=\alpha(\beta\delta)+\alpha\beta =\alpha(\beta\delta+\beta)=\alpha(\beta\delta^+)=\alpha(\beta\gamma)$$
If $\gamma$ is a limit ordinal, then $\beta\gamma$ is a limit ordinal. We have $$(\alpha\beta)\gamma=\bigcup_{\delta\in\gamma}((\alpha\beta)\delta)$$ If $u\in\bigcup_{\delta\in\gamma}((\alpha\beta)\delta)$, then for some $\delta\in\gamma$, $u\in(\alpha\beta)\delta$. Note that $(\alpha\beta)\delta=\alpha(\beta\delta)$. Thus $u\in\alpha(\beta\delta)$. Since $\beta\neq0$, $\beta\delta\in\beta\gamma$. Thus $u\in\bigcup_{\epsilon\in\beta\gamma}(\alpha\epsilon)$. For the other direction, let $u\in\bigcup_{\epsilon\in\beta\gamma}(\alpha\epsilon)$, then for some $\epsilon\in\beta\gamma$, $u\in\alpha\epsilon$. Note that $\beta\gamma=\bigcup_{\delta\in\gamma}(\beta\delta)$. Thus for some $\delta\in\gamma$, $\epsilon\in\beta\delta$. Since $\alpha\neq0$, we have $\alpha\epsilon\lt\alpha(\beta\delta)=(\alpha\beta)\delta$, implying $u\in(\alpha\beta)\delta$. Hence $u\in\bigcup_{\delta\in\gamma}((\alpha\beta)\delta)$. We have shown that $$\bigcup_{\delta\in\gamma}((\alpha\beta)\delta)=\bigcup_{\epsilon\in\beta\gamma}(\alpha\epsilon)$$ Since $\beta\gamma$ is a limit ordinal, $$\bigcup_{\epsilon\in\beta\gamma}(\alpha\epsilon)=\alpha(\beta\gamma)$$

By transfinite induction, for all $\gamma\in\text{Ord}$, $(\alpha\beta)\gamma=\alpha(\beta\gamma)$. $\blacksquare$

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\beta\lt\alpha$, then $\beta\gamma\le\alpha\gamma$. (show proof)

Exponentiation on ordinals
Define a class function $G_\alpha:V\to V$ by:

$G_\alpha(X)=1$ if $X$ is a function from a zero ordinal;
$G_\alpha(X)=\bigcup\{p_1\alpha:p\in X\}$ if $X$ is a function from a successor ordinal;
$G_\alpha(X)=\bigcup\{p_1:p\in X\setminus\{(0,1)\}\}$ if $X$ is a function from a limit ordinal;
$G_\alpha(X)=\emptyset$ if otherwise.

If $\alpha\in\text{Ord}$, then $G_\alpha$ recursively defines a class function $\wedge_\alpha:\text{Ord}\to\text{Ord}$ (show proof).

Proof. By transfinite recursion, the class $\wedge_\alpha$ defined by $G_\alpha$ is a class function from $\text{Ord}$ to $V$ such that for all $\beta\in\text{Ord}$, $\wedge_\alpha(\beta)=G_\alpha(\wedge_\alpha\upharpoonright\beta)$. Let $\beta\in\text{Ord}$ and suppose that for all $\gamma\in\beta$, $\wedge_\alpha(\gamma)\in\text{Ord}$.

If $\beta$ is a zero ordinal, then $\wedge_\alpha(\beta)=G_\alpha(\wedge_\alpha\upharpoonright\emptyset)=1\in\text{Ord}$.
If $\beta$ is a successor ordinal, then $\wedge_\alpha(\beta)=G_\alpha(\wedge_\alpha\upharpoonright\beta)=\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}$. Note that for all $p\in \wedge_\alpha\upharpoonright\beta$, there exist $\gamma,\delta$ such that $\gamma\in\beta$ and $p=(\gamma,\delta)$. Thus $p_1=\delta=\wedge_\alpha\upharpoonright\beta(\gamma)=\wedge_\alpha(\gamma)\in\text{Ord}$, and hence $p_1\alpha\in\text{Ord}$. Then for all $\delta\in\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}$, $\delta\in\text{Ord}$, and thus $\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}\subset\text{Ord}$. Hence $\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}\in\text{Ord}$. Therefore, $\wedge_\alpha(\beta)\in\text{Ord}$.
If $\beta$ is a limit ordinal, then $\wedge_\alpha(\beta)=G_\alpha(\wedge_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$. By the same reasoning as above, for all $p\in \wedge_\alpha\upharpoonright\beta$, $p_1\in\text{Ord}$. Thus for all $p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}$, $p_1\in\text{Ord}$. Then for all $\delta\in\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$, $\delta\in\text{Ord}$, and thus $\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}\subset\text{Ord}$. Hence $\bigcup\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}\in\text{Ord}$. Therefore, $\wedge_\alpha(\beta)\in\text{Ord}$.

In all cases, $\wedge_\alpha(\beta)\in\text{Ord}$. Thus by transfinite induction, for all $\beta\in\text{Ord}$, $\wedge_\alpha(\beta)\in\text{Ord}$. Hence $\wedge_\alpha$ is a class function from $\text{Ord}$ to $\text{Ord}$. $\blacksquare$

And we can thus define a class function $\wedge:\text{Ord}\times\text{Ord}\to\text{Ord}$ by $\wedge(p)=\wedge_{p_0}(p_1)$. Given $\alpha,\beta\in\text{Ord}$, we may denote $\wedge((\alpha,\beta))$ as $\alpha^\beta$.

Proposition. For all $\alpha\in\text{Ord}$, if $\alpha$ is non-zero, then $0^\alpha=0$. (show proof)

Proposition. For all $\alpha\in\text{Ord}$, $1^\alpha=1$. (show proof)

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\alpha\gt1$ and $\gamma\lt\beta$, then $\alpha^\gamma\lt\alpha^\beta$. (show proof)

Proof. Let $\alpha\in\text{Ord}$ be greater than $1$. Suppose $\beta\in\text{Ord}$ and for all $\delta\in\beta$, for all $\gamma\lt\delta$, $\alpha^\gamma\lt\alpha^\delta$.

If $\beta$ is a zero ordinal, then for all $\gamma\lt\beta$, $\alpha^\gamma\lt\alpha^\beta$.
If $\beta$ is a successor ordinal, then for some $\delta\in\beta$ we have $\delta^+=\beta$. Thus for all $\gamma\lt\delta$, $\alpha^\gamma\lt\alpha^\delta$. Note that $\alpha^\beta=G_\alpha(\wedge_\alpha\upharpoonright\beta)=\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}$. And $\alpha^\delta=\wedge_\alpha(\delta)=\wedge_\alpha\upharpoonright\beta(\delta)$. Thus $(\delta,\alpha^\delta)\in \wedge_\alpha\upharpoonright\beta$. And $\alpha^\delta\alpha\in\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}$. Thus $\alpha^\delta\alpha\subseteq\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta\}$, implying $\alpha^\delta\alpha\le\alpha^\beta$. Note that $\alpha^\delta\ge\alpha^0=1$, thus $\alpha^\delta$ is non-zero. If $\gamma\in\beta$, then $\gamma\le\delta$, thus $\alpha^\gamma\le\alpha^\delta=\alpha^\delta1\lt\alpha^\delta\alpha\le\alpha^\beta$.
If $\beta$ is a limit ordinal, then $\alpha^\beta=G_\alpha(\wedge_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$. Let $\gamma\in\beta$, then there exists $\delta\in\beta$ such that $\gamma\in\delta$, thus $\alpha^\gamma\lt\alpha^\delta$. Note that $\alpha^\delta=\wedge_\alpha\upharpoonright\beta(\delta)$, implying $(\delta,\alpha^\delta)\in \wedge_\alpha\upharpoonright\beta$. Since $\gamma\in\delta$, $\delta\neq0$, thus $(\delta,\alpha^\delta)\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}$. Hence $\alpha^\delta\in\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$, implying $\alpha^\delta\subseteq\alpha^\beta$ and hence $\alpha^\delta\le\alpha^\beta$. Therefore $\alpha^\gamma\lt\alpha^\delta\le\alpha^\beta$.

In all cases, we have that for all $\gamma\lt\beta$, $\alpha^\gamma\lt\alpha^\beta$. By transfinite induction, for all $\beta\in\text{Ord}$, for all $\gamma\lt\beta$, $\alpha^\gamma\lt\alpha^\beta$. $\blacksquare$

Proposition. Let $\alpha,\beta\in\text{Ord}$, then

$\alpha^0=1$;
$\alpha^{\beta^+}=\alpha^\beta\alpha$;
$\alpha^\beta=\bigcup_{\gamma\in\beta\setminus\{0\}}(\alpha^\gamma)$ if $\beta$ is a limit ordinal.

(show proof)

Proof. $\alpha^0=G_\alpha(\wedge_\alpha\upharpoonright0)=1$.

If $\alpha=0$, then trivially, $\alpha^{\beta^+}=0=\alpha^\beta\alpha$. Now suppose $\alpha\neq0$. Note that $\alpha^{\beta^+}=G_\alpha(\wedge_\alpha\upharpoonright\beta^+)=\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta^+\}$. And $\alpha^\beta=\wedge_\alpha(\beta)=\wedge_\alpha\upharpoonright\beta^+(\beta)$. Thus $(\beta,\alpha^\beta)\in \wedge_\alpha\upharpoonright\beta^+$. And $\alpha^\beta\alpha\in\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta^+\}$. If $u\in\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta^+\}$, then for some $p\in \wedge_\alpha\upharpoonright\beta^+$, $u=p_1\alpha$. And we have $p_0\in\beta^+$, thus $p_0\le\beta$ and $p_1=\wedge_\alpha\upharpoonright\beta^+(p_0)=\wedge_\alpha(p_0)=\alpha^{p_0}$. Since $p_0\le\beta$ and $\alpha\neq0$, we have $\alpha^{p_0}\le\alpha^\beta$, thus $u=\alpha^{p_0}\alpha\subseteq\alpha^\beta\alpha$. Hence $\alpha^{\beta^+}=\bigcup\{p_1\alpha:p\in \wedge_\alpha\upharpoonright\beta^+\}=\alpha^\beta\alpha$.

Suppose $\beta$ is a limit ordinal. Then $\alpha^\beta=G_\alpha(\wedge_\alpha\upharpoonright\beta)=\bigcup\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$. Thus if $\delta\in\alpha^\beta$, then for some $p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}$, $\delta\in p_1$. Note that $p_0\in\beta$. If $p_0=0$, then $p_1=\wedge_\alpha\upharpoonright\beta(p_0)=\wedge_\alpha(p_0)=\alpha^{p_0}=1$, implying $(0,1)=p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}$, a contradiction. Thus $p_0\in\beta\setminus\{0\}$, implying $p_1=\alpha^{p_0}\in\{\alpha^\gamma:\gamma\in\beta\setminus\{0\}\}$. Thus $\delta\in\bigcup_{\gamma\in\beta\setminus\{0\}}(\alpha^\gamma)$. For the other direction, if $\delta\in\bigcup_{\gamma\in\beta\setminus\{0\}}(\alpha^\gamma)$, then for some $\gamma\in\beta\setminus\{0\}$, $\delta\in\alpha^\gamma$. Note that $\alpha^\gamma=\wedge_\alpha(\gamma)=\wedge_\alpha\upharpoonright\beta(\gamma)$, thus $(\gamma,\alpha^\gamma)\in \wedge_\alpha\upharpoonright\beta$. Since $\gamma\neq0$, $(\gamma,\alpha^\gamma)\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}$, and hence $\alpha^\gamma\in\{p_1:p\in(\wedge_\alpha\upharpoonright\beta)\setminus\{(0,1)\}\}$. Then we have $\delta\in\alpha^\beta$. $\blacksquare$

Proposition. For all $\alpha\in\text{Ord}$, $\alpha^1=\alpha$.

Proposition. Ordinal exponentiation extends natural number exponentiation.

Lemma. For all $\alpha,\beta\in\text{Ord}$, if $\alpha\gt1$ and $\beta$ is a limit ordinal, then $\alpha^\beta$ is a limit ordinal.

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, $\alpha^{\beta+\gamma}=\alpha^\beta\alpha^\gamma$.

Proof. Let $\alpha,\beta,\gamma\in\text{Ord}$. Suppose $\alpha=0$.

If $\beta=0$ and $\gamma=0$, then both sides are $1$.
If $\beta\neq0$ or $\gamma\neq0$, then $\beta+\gamma\neq0$, thus both sides are $0$.

If $\alpha=1$, then both sides are $1$. Now suppose $\alpha\gt1$. Suppose for all $\delta\in\gamma$, $\alpha^{\beta+\delta}=\alpha^\beta\alpha^\delta$.

If $\gamma$ is a zero ordinal, then $$\alpha^{\beta+\gamma}=\alpha^\beta=\alpha^\beta\alpha^\gamma$$
If $\gamma$ is a successor ordinal, then for some $\delta\in\gamma$, $\delta^+=\gamma$. Thus $$\alpha^{\beta+\gamma}=\alpha^{\beta+\delta^+}=\alpha^{(\beta+\delta)^+} =\alpha^{\beta+\delta}\alpha=\p{\alpha^\beta\alpha^\delta}\alpha=\alpha^\beta\p{\alpha^\delta\alpha} =\alpha^\beta\alpha^{\delta^+}=\alpha^\beta\alpha^\gamma$$
If $\gamma$ is a limit ordinal, then $\beta+\gamma$ and $\alpha^\gamma$ are limit ordinals. Thus $$\alpha^{\beta+\gamma}=\bigcup_{\delta\in(\beta+\gamma)\setminus\{0\}}(\alpha^\delta)$$ and $$\alpha^\beta\alpha^\gamma=\bigcup_{\zeta\in\alpha^\gamma}(\alpha^\beta\zeta)$$ If $u\in\bigcup_{\delta\in(\beta+\gamma)\setminus\{0\}}(\alpha^\delta)$, then for some $\delta\in(\beta+\gamma)\setminus\{0\}$, $u\in\alpha^\delta$. Thus $\delta\in\bigcup_{\epsilon\in\gamma}(\beta+\epsilon)$, implying that for some $\epsilon\in\gamma$, $\delta\in\beta+\epsilon$. Note that $\alpha^\delta\lt\alpha^{\beta+\epsilon}=\alpha^\beta\alpha^\epsilon$, thus $u\in\alpha^\beta\alpha^\epsilon$. Also note that $\alpha^\epsilon\in\alpha^\gamma$, thus $u\in\bigcup_{\zeta\in\alpha^\gamma}(\alpha^\beta\zeta)$. For the other direction, if $u\in\bigcup_{\zeta\in\alpha^\gamma}(\alpha^\beta\zeta)$, then for some $\zeta\in\alpha^\gamma$, $u\in\alpha^\beta\zeta$. Thus $\zeta\in\bigcup_{\epsilon\in\gamma\setminus\{0\}}(\alpha^\epsilon)$, implying for some $\epsilon\in\gamma\setminus\{0\}$, $\zeta\in\alpha^\epsilon$. Note that $\alpha^\beta\ge\alpha^0=1\neq0$. Thus $\alpha^\beta\zeta\in\alpha^\beta\alpha^\epsilon=\alpha^{\beta+\epsilon}$, implying $u\in\alpha^{\beta+\epsilon}$. Since $\epsilon\in\gamma$, $\beta+\epsilon\in\beta+\gamma$. Since $\epsilon\neq0$, $\beta+\epsilon\neq0$. Hence $u\in\bigcup_{\delta\in(\beta+\gamma)\setminus\{0\}}(\alpha^\delta)$. We have shown that $$\bigcup_{\delta\in(\beta+\gamma)\setminus\{0\}}(\alpha^\delta)=\bigcup_{\zeta\in\alpha^\gamma}(\alpha^\beta\zeta)$$

By transfinite induction, for all $\gamma\in\text{Ord}$, $\alpha^{\beta+\gamma}=\alpha^\beta\alpha^\gamma$. $\blacksquare$

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, $\alpha^{\beta\gamma}=(\alpha^\beta)^\gamma$.

Proof. Let $\alpha,\beta,\gamma\in\text{Ord}$. Suppose $\alpha=0$.

If $\beta=0$ or $\gamma=0$, then both sides are $1$.
If $\beta\neq0$ and $\gamma\neq0$, then $\beta\gamma\neq0$, thus both sides are $0$.

If $\alpha=1$, then both sides are $1$. Now suppose $\alpha\gt1$. If $\beta=0$, then both sides are $1$. Now suppose $\beta\neq0$. Suppose for all $\delta\in\gamma$, $\alpha^{\beta\delta}=(\alpha^\beta)^\delta$.

If $\gamma$ is a zero ordinal, then $$\alpha^{\beta\gamma}=1=(\alpha^\beta)^\gamma$$
If $\gamma$ is a successor ordinal, then for some $\delta\in\gamma$, $\delta^+=\gamma$. Thus $$\alpha^{\beta\gamma}=\alpha^{\beta\delta^+}=\alpha^{\beta\delta+\beta} =\alpha^{\beta\delta}\alpha^\beta=(\alpha^\beta)^\delta\alpha^\beta=(\alpha^\beta)^{\delta^+}=(\alpha^\beta)^\gamma$$
If $\gamma$ is a limit ordinal, then $\beta\gamma$ is a limit ordinal. Thus $$\alpha^{\beta\gamma}=\bigcup_{\delta\in(\beta\gamma)\setminus\{0\}}(\alpha^\delta)$$ and $$(\alpha^\beta)^\gamma=\bigcup_{\epsilon\in\gamma\setminus\{0\}}(\alpha^\beta)^\epsilon$$ If $u\in\bigcup_{\delta\in(\beta\gamma)\setminus\{0\}}(\alpha^\delta)$, then for some $\delta\in(\beta\gamma)\setminus\{0\}$, $u\in\alpha^\delta$. Thus $\delta\in\bigcup_{\epsilon\in\gamma}(\beta\epsilon)$, implying that for some $\epsilon\in\gamma$, $\delta\in\beta\epsilon$. Then we have $\alpha^\delta\in\alpha^{\beta\epsilon}=(\alpha^\beta)^\epsilon$. Thus $u\in(\alpha^\beta)^\epsilon$. Since $\beta\epsilon$ is non-empty, $\epsilon\neq0$. Hence $u\in\bigcup_{\epsilon\in\gamma\setminus\{0\}}(\alpha^\beta)^\epsilon$. For the other direction, let $u\in\bigcup_{\epsilon\in\gamma\setminus\{0\}}(\alpha^\beta)^\epsilon$. Then for some $\epsilon\in\gamma\setminus\{0\}$, $u\in(\alpha^\beta)^\epsilon$. Note that $(\alpha^\beta)^\epsilon=\alpha^{\beta\epsilon}$, thus $u\in\alpha^{\beta\epsilon}$. Also note that $\beta\epsilon\in\beta\gamma$, and since both $\beta$ and $\epsilon$ are non-zero, we have $\beta\epsilon\in(\beta\gamma)\setminus\{0\}$. Hence $u\in\bigcup_{\delta\in(\beta\gamma)\setminus\{0\}}(\alpha^\delta)$. We have shown that $$\bigcup_{\delta\in(\beta\gamma)\setminus\{0\}}(\alpha^\delta)=\bigcup_{\epsilon\in\gamma\setminus\{0\}}(\alpha^\beta)^\epsilon$$

By transfinite induction, for all $\gamma\in\text{Ord}$, $\alpha^{\beta\gamma}=(\alpha^\beta)^\gamma$. $\blacksquare$

Proposition. For all $\alpha,\beta,\gamma\in\text{Ord}$, if $\beta\lt\alpha$, then $\beta^\gamma\le\alpha^\gamma$.

Cardinal
A cardinal is an ordinal $\kappa\in\text{Ord}$ such that for all $\alpha\in\kappa$, there is no bijection between $\kappa$ and $\alpha$. The class of cardinals is denoted $\text{Car}$.

Proposition. Every natural number is a cardinal.

Proposition. $\omega$ is a cardinal.

Proposition. The class of successor cardinals equals the set of non-zero natural numbers.

Proposition. Every set is bijective to a unique cardinal.

Proof. Let $X$ be a set, then it is strictly well-orderable by some $\lt$. Let $x\in X$ and suppose for all $t\lt x$, $\{u\in X|u\lt t\}$ is order-isomorphic to a unique $\alpha_t\in\text{Ord}$, with $\alpha_t=\{\alpha_u:u\lt t\}$. Then $t\mapsto\alpha_t$ defines a function $F$ from $\{t\in X|t\lt x\}$, with a set-sized range $\{\alpha_t:t\lt x\}$. Given $a,b\in\{t\in X|t\lt x\}$, if $a\neq b$, then $a\lt b$ or $b\lt a$, so $\alpha_a\in\alpha_b$ or $\alpha_b\in\alpha_a$, implying $F(a)\neq F(b)$. Hence $F$ is a bijection from $\{t\in X|t\lt x\}$ to $\{\alpha_t:t\lt x\}$. Clearly, $F$ is also an order-isomorphism. Since $\{\alpha_t:t\lt x\}\subset\text{Ord}$, $\{\alpha_t:t\lt x\}$ is linearly ordered by $\in$. If $\gamma\in\{\alpha_t:t\lt x\}$, then there exists $t\lt x$ such that $\gamma=\alpha_t=\{\alpha_u:u\lt t\}\subseteq\{\alpha_t:t\lt x\}$. Hence $\{\alpha_t:t\lt x\}$ is a transitive set. Since $\{\alpha_t:t\lt x\}\subset\text{Ord}$, $\{\alpha_t:t\lt x\}\in\text{Ord}$. Suppose $\{t\in X|t\lt x\}$ is order-isomorphic to both $\alpha_1,\alpha_2\in\text{Ord}$, then $\alpha_1\cong\alpha_2$, implying $\alpha_1=\alpha_2$. Thus $\{t\in X|t\lt x\}$ is order-isomorphic to a unique $\alpha_x\in\text{Ord}$, with $\alpha_x=\{\alpha_t:t\lt x\}$. Using the same logic as transfinite induction, for all $x\in X$, $\{t\in X|t\lt x\}$ is order-isomorphic to a unique $\alpha_x\in\text{Ord}$, with $\alpha_x=\{\alpha_t:t\lt x\}$. And by the same reasoning as above, $X$ is order-isomorphic to some $\alpha\in\text{Ord}$. Let $\kappa$ be the strictly least element of $\text{Ord}$ bijective to $\alpha$, then $X$ is bijective to $\kappa$, which is a cardinal. Suppose $X$ is bijective to $\kappa_1,\kappa_2\in\text{Car}$, then $\kappa_1$ and $\kappa_2$ are bijective and $\kappa_1,\kappa_2\in\text{Ord}$. If $\kappa_1\neq\kappa_2$, then $\kappa_1\in\kappa_2$ or $\kappa_2\in\kappa_1$, implying they cannot both be cardinals, a contradiction. Hence $\kappa_1=\kappa_2$. $\blacksquare$
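For a finite strictly well-ordered set, the recursion $\alpha_x=\{\alpha_t:t\lt x\}$ from the proof can be carried out directly. The following Python sketch is only an illustration under stated assumptions: the names `ordinal_labels` and `von_neumann` are hypothetical, the input is assumed to be a list already sorted in increasing order, and ordinals are modelled as nested `frozenset` values.

```python
def ordinal_labels(elements):
    """Assign alpha_x = {alpha_t : t < x} to each x of a finite ordered list.

    Assumption of this sketch: `elements` is listed in strictly increasing order.
    """
    labels = {}
    for i, x in enumerate(elements):
        # alpha_x collects the labels of all strictly smaller elements.
        labels[x] = frozenset(labels[t] for t in elements[:i])
    return labels


def von_neumann(n):
    """Finite von Neumann ordinal n = {0, 1, ..., n-1} as nested frozensets."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | {ordinal}  # successor step: n+ = n union {n}
    return ordinal


labels = ordinal_labels(["a", "b", "c"])
# The set of all labels is the ordinal order-isomorphic to the whole set.
whole = frozenset(labels.values())
```

In this three-element case, `whole` should coincide with `von_neumann(3)`, and each `labels[x]` with the ordinal counting the elements below `x`.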

Cardinality
We briefly introduced cardinality earlier in the set theory section. Now we define the cardinality of a set $X$ to be the unique cardinal bijective to $X$, denoted $\abs{X}$. Note that this defines a class function from $V$ to $\text{Car}$.

Lemma. Suppose $X$ and $Y$ are non-empty sets. If there exists an injection from $X$ to $Y$, then there exists a surjection from $Y$ to $X$. If there exists a surjection from $X$ to $Y$, then there exists an injection from $Y$ to $X$.

Lemma. If there exist an injection and a surjection from $X$ to $Y$, then there exists a bijection from $X$ to $Y$.

Proposition. Given $X$ and $Y$,

if there exists an injection from $X$ to $Y$, then $\abs{X}\le\abs{Y}$;
if there exists a surjection from $X$ to $Y$, then $\abs{X}\ge\abs{Y}$;
if there exists a bijection from $X$ to $Y$, then $\abs{X}=\abs{Y}$.

Proposition. Let $X$ be a set of cardinals, then $\bigcup X$ is a cardinal.

Proposition. Let $X$ be a set, then $\abs{X}\lt\abs{\mathcal P(X)}$.
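The proof of this proposition is collapsed in the source, so the route it takes is not known here. One standard argument is Cantor's diagonal construction: for any $f:X\to\mathcal P(X)$, the set $\{x\in X:x\notin f(x)\}$ is not in the range of $f$. The Python sketch below checks this exhaustively for a small finite $X$; it assumes that argument, and the function names are illustrative only.

```python
from itertools import combinations, product


def powerset(xs):
    """All subsets of the finite list xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal(xs, f):
    """The set {x in X : x not in f(x)} for a function f given as a dict."""
    return frozenset(x for x in xs if x not in f[x])


def no_surjection(xs):
    """Check that every f: X -> P(X) misses its own diagonal set."""
    subsets = powerset(xs)
    for images in product(subsets, repeat=len(xs)):
        f = dict(zip(xs, images))
        if diagonal(xs, f) in images:
            return False
    return True
```

A finite check of this kind is not a proof, but it shows concretely why no function from $X$ to $\mathcal P(X)$ can be surjective.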

Lemma. $\abs{N\times N}=\abs{N}$.
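The proof of this lemma is collapsed in the source. One common way to exhibit a bijection between $N\times N$ and $N$ is the Cantor pairing function, which enumerates pairs along the anti-diagonals $m+n=\text{const}$. The sketch below assumes that choice (the source may use a different bijection); `pair` and `unpair` are illustrative names.

```python
from math import isqrt


def pair(m, n):
    """Cantor pairing: position of (m, n) when N x N is listed by anti-diagonals."""
    return (m + n) * (m + n + 1) // 2 + n


def unpair(k):
    """Inverse of `pair`: recover (m, n) from its position k."""
    s = (isqrt(8 * k + 1) - 1) // 2  # index of the anti-diagonal containing k
    n = k - s * (s + 1) // 2         # offset of k inside that anti-diagonal
    return s - n, n
```

Here `unpair(pair(m, n))` should return `(m, n)` for all natural numbers `m` and `n`, and the first ten anti-diagonals should be mapped onto `0, ..., 54` without gaps.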

Proposition. $$\omega=\abs{N}=\abs{Z}=\abs{Q}$$

Proposition. Let $\alpha,\beta$ be countable ordinals, then $\alpha^+,\alpha+\beta,\alpha\beta,\alpha^\beta$ are countable.

Proof. We will assume $\abs{\alpha}=\omega$. The case when $\alpha$ is finite trivially follows from this case.

Note that there exists a bijection $f:\omega\to\alpha$. Define $g:\omega\to\alpha^+$ such that $g(0)=\alpha$ and $g(n)=f(n-1)$ if $n\neq0$, then $g$ is a bijection. Thus $\abs{\alpha^+}=\omega$.

Let $\gamma$ be an ordinal and suppose for all $\delta\in\gamma$, if $\delta$ is countable, then $\abs{\alpha+\delta}=\omega$. If $\gamma$ is countable, then every $\delta\in\gamma$ is countable.

If $\gamma$ is a zero ordinal, then $\abs{\alpha+\gamma}=\abs{\alpha}=\omega$.
If $\gamma$ is a successor ordinal, then for some $\delta\in\gamma$, $\delta^+=\gamma$, and we have $\abs{\alpha+\gamma}=\abs{(\alpha+\delta)^+}=\omega$.
If $\gamma$ is a limit ordinal, since $\abs{\alpha+\delta}=\omega$ for all $\delta\in\gamma$, there exists a bijection $f_\delta:\omega\to\alpha+\delta$ for each $\delta\in\gamma$. Since $\gamma$ is countable, there exists a bijection $g:\omega\to\gamma$. Define $h:\omega\times\omega\to\alpha+\gamma$ by $h(m,n)=f_{g(m)}(n)$, then $h$ is a surjection. Thus $\abs{\alpha+\gamma}\le\abs{\omega\times\omega}=\omega$. Also, $\abs{\alpha+\gamma}\ge\abs{\alpha}=\omega$. Thus $\abs{\alpha+\gamma}=\omega$.

Hence if $\gamma$ is countable, then $\abs{\alpha+\gamma}=\omega$. By transfinite induction, for every countable ordinal $\gamma$, $\abs{\alpha+\gamma}=\omega$. Hence $\abs{\alpha+\beta}=\omega$.

Let $\gamma$ be an ordinal and suppose for all $\delta\in\gamma$, if $\delta$ is non-zero countable, then $\abs{\alpha\delta}=\omega$. If $\gamma$ is non-zero countable, then every $\delta\in\gamma$ is either zero or non-zero countable.

If $\gamma$ is a zero ordinal, we have a contradiction.
If $\gamma$ is a successor ordinal, then for some $\delta\in\gamma$, $\delta^+=\gamma$, and we have $\abs{\alpha\gamma}=\abs{\alpha\delta+\alpha}=\omega$.
If $\gamma$ is a limit ordinal, since $\abs{\alpha\delta}=\omega$ for all non-zero $\delta\in\gamma$, there exists a bijection $f_\delta:\omega\to\alpha\delta$ for each non-zero $\delta\in\gamma$. Also define $f_0:\omega\to1$ by $f_0(n)=0$ for all $n\in\omega$. Since $\gamma$ is countable, there exists a bijection $g:\omega\to\gamma$. Define $h:\omega\times\omega\to\alpha\gamma$ by $h(m,n)=f_{g(m)}(n)$, then $h$ is a surjection. Thus $\abs{\alpha\gamma}\le\abs{\omega\times\omega}=\omega$. Also, $\abs{\alpha\gamma}\ge\abs{\alpha}=\omega$. Thus $\abs{\alpha\gamma}=\omega$.

Hence if $\gamma$ is non-zero countable, then $\abs{\alpha\gamma}=\omega$. By transfinite induction, for every non-zero countable ordinal $\gamma$, $\abs{\alpha\gamma}=\omega$. Hence $\abs{\alpha\beta}=\omega$ if $\beta\neq0$.

Let $\gamma$ be an ordinal and suppose for all $\delta\in\gamma$, if $\delta$ is non-zero countable, then $\abs{\alpha^\delta}=\omega$. If $\gamma$ is non-zero countable, then every $\delta\in\gamma$ is either zero or non-zero countable.

If $\gamma$ is a zero ordinal, we have a contradiction.
If $\gamma$ is a successor ordinal, then for some $\delta\in\gamma$, $\delta^+=\gamma$, and we have $\abs{\alpha^\gamma}=\abs{\alpha^\delta\alpha}=\omega$.
If $\gamma$ is a limit ordinal, since $\abs{\alpha^\delta}=\omega$ for all non-zero $\delta\in\gamma$, there exists a bijection $f_\delta:\omega\to\alpha^\delta$ for each non-zero $\delta\in\gamma$. Also define $f_0:\omega\to1$ by $f_0(n)=0$ for all $n\in\omega$. Since $\gamma$ is countable, there exists a bijection $g:\omega\to\gamma$. Define $h:\omega\times\omega\to\alpha^\gamma$ by $h(m,n)=f_{g(m)}(n)$, then $h$ is a surjection. Thus $\abs{\alpha^\gamma}\le\abs{\omega\times\omega}=\omega$. Also, $\abs{\alpha^\gamma}\ge\abs{\alpha}=\omega$. Thus $\abs{\alpha^\gamma}=\omega$.

Hence if $\gamma$ is non-zero countable, then $\abs{\alpha^\gamma}=\omega$. By transfinite induction, for every non-zero countable ordinal $\gamma$, $\abs{\alpha^\gamma}=\omega$. Hence $\abs{\alpha^\beta}=\omega$ if $\beta\neq0$.

$\blacksquare$

Addition on cardinals
Note that for all $p\in\text{Car}\times\text{Car}$, $\abs{p_0\bigsqcup p_1}\in\text{Car}$, where $p_0\bigsqcup p_1$ denotes $(\{0\}\times p_0)\cup(\{1\}\times p_1)$. Thus we can define a class function $+:\text{Car}\times\text{Car}\to\text{Car}$ such that for all $p\in\text{Car}\times\text{Car}$, $+(p)=\abs{p_0\bigsqcup p_1}$. Given $\alpha,\beta\in\text{Car}$, we may denote $+((\alpha,\beta))$ by $\alpha+\beta$, then we have $$\alpha+\beta=\abs{\alpha\bigsqcup\beta}$$

Multiplication on cardinals
Note that for all $p\in\text{Car}\times\text{Car}$, $\abs{p_0\times p_1}\in\text{Car}$. Thus we can define a class function $\times:\text{Car}\times\text{Car}\to\text{Car}$ such that for all $p\in\text{Car}\times\text{Car}$, $\times(p)=\abs{p_0\times p_1}$. Given $\alpha,\beta\in\text{Car}$, we may denote $\times((\alpha,\beta))$ by $\alpha\beta$, then we have $$\alpha\beta=\abs{\alpha\times\beta}$$

Exponentiation on cardinals
Note that for all $p\in\text{Car}\times\text{Car}$, $\abs{\{p_1\to p_0\}}\in\text{Car}$, where $\{p_1\to p_0\}$ denotes the set of functions from $p_1$ to $p_0$. Thus we can define a class function $\wedge:\text{Car}\times\text{Car}\to\text{Car}$ such that for all $p\in\text{Car}\times\text{Car}$, $\wedge(p)=\abs{\{p_1\to p_0\}}$. Given $\alpha,\beta\in\text{Car}$, we may denote $\wedge((\alpha,\beta))$ by $\alpha^\beta$, then we have $$\alpha^\beta=\abs{\{\beta\to\alpha\}}$$

Proposition. For all $\kappa\in\text{Car}$,

$\kappa+0=\kappa$;
$\kappa0=0$;
$\kappa1=\kappa$;
$\kappa^0=1$;
$\kappa^1=\kappa$;
$0^\kappa=0$ if $\kappa\neq0$;
$1^\kappa=1$.

Proposition. Cardinal arithmetic extends natural number arithmetic.

Proof. We will use the usual notation for natural number arithmetic.

Let $m,n\in N$. Given $p\in m\bigsqcup n$, either $p\in\{0\}\times m$ or $p\in\{1\}\times n$.

If $p\in\{0\}\times m$, then for some $k\in m$, $p=(0,k)$, thus $p_0m+p_1=k\lt m\le m+n$.
If $p\in\{1\}\times n$, then for some $k\in n$, $p=(1,k)$, thus $p_0m+p_1=m+k\lt m+n$.

In both cases we have $p_0m+p_1\in m+n$. Thus we can define a function $f:m\bigsqcup n\to m+n$ such that for all $p\in m\bigsqcup n$, $f(p)=p_0m+p_1$. Given $p,q\in m\bigsqcup n$ such that $f(p)=f(q)$, we have $p_0m+p_1=f(p)=f(q)=q_0m+q_1$. If $p\in\{0\}\times m$ and $q\in\{1\}\times n$, then $f(p)=p_1\lt m\le m+q_1=f(q)$, a contradiction. Similarly, $p\in\{1\}\times n$ and $q\in\{0\}\times m$ also leads to a contradiction. Thus $p_0=q_0$, and we have $p_1=q_1$, implying $p=q$. Hence $f$ is injective. Let $k\in m+n$, then either $k\lt m$ or $k\ge m$.

If $k\lt m$, then $(0,k)\in m\bigsqcup n$ and $f((0,k))=k$.
If $k\ge m$, then $(1,k-m)\in m\bigsqcup n$ and $f((1,k-m))=k$.

In both cases there exists $p\in m\bigsqcup n$ such that $f(p)=k$. Hence $f$ is surjective. Therefore, $\abs{m\bigsqcup n}=\abs{m+n}=m+n$.
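The addition part of the proof can be checked mechanically for small natural numbers. The Python sketch below enumerates $m\bigsqcup n$ as tagged pairs and applies $f(p)=p_0m+p_1$; the helper names `disjoint_union`, `f_add`, and `is_bijection_onto_range` are illustrative and not part of the source.

```python
def disjoint_union(m, n):
    """The disjoint union of m and n: ({0} x m) together with ({1} x n)."""
    return [(0, k) for k in range(m)] + [(1, k) for k in range(n)]


def f_add(p, m):
    """The map f(p) = p_0 * m + p_1 from the proof."""
    return p[0] * m + p[1]


def is_bijection_onto_range(values, size):
    """True when `values` lists each of 0, ..., size-1 exactly once."""
    return sorted(values) == list(range(size))
```

For example, with `m = 3` and `n = 2` the five tagged pairs should be sent to `0, 1, 2, 3, 4`, each exactly once.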

Let $m,n\in N$. Given $p\in m\times n$, $p_0n+p_1\lt p_0n+n=S(p_0)n\le mn$. Thus we can define a function $f:m\times n\to mn$ such that for all $p\in m\times n$, $f(p)=p_0n+p_1$. Given $p,q\in m\times n$ such that $f(p)=f(q)$, we have $p_0n+p_1=f(p)=f(q)=q_0n+q_1$. If $n=0$, then we immediately have a contradiction because $p_1\in n$. Thus $n\neq0$. Suppose $p_0\lt q_0$, then $S(p_0)\le q_0$, and we have $p_0n+(n+q_1)=S(p_0)n+q_1\le q_0n+q_1=p_0n+p_1$, implying $n+q_1\le p_1\lt n$, a contradiction. Similarly, $p_0\gt q_0$ also leads to a contradiction, thus $p_0=q_0$, then we have $p_1=q_1$, implying $p=q$. Hence $f$ is injective. Let $k\in mn$, then $n\neq0$. Note that $k=n(k\divsymbol n)+(k\bmod n)$, thus $n(k\divsymbol n)\le n(k\divsymbol n)+(k\bmod n)\lt mn$, implying $k\divsymbol n\lt m$. Thus $(k\divsymbol n,k\bmod n)\in m\times n$, and we have $f((k\divsymbol n,k\bmod n))=k$. Hence $f$ is surjective. Therefore, $\abs{m\times n}=\abs{mn}=mn$.
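The multiplication part admits the same kind of finite check. In the Python sketch below, `f_mul` is the map $f(p)=p_0n+p_1$ and `f_mul_inverse` is the preimage $(k\divsymbol n,k\bmod n)$ used in the surjectivity step; both names are illustrative.

```python
def f_mul(p, n):
    """The map f(p) = p_0 * n + p_1 from m x n to mn."""
    return p[0] * n + p[1]


def f_mul_inverse(k, n):
    """The preimage (k div n, k mod n) used to show surjectivity.

    Only meaningful for n != 0; when n == 0 the product m x n is empty,
    so there is no k to invert.
    """
    return divmod(k, n)
```

Python's built-in `divmod` returns the quotient and remainder as a pair, which matches the pair $(k\divsymbol n,k\bmod n)$ for natural numbers.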

Let $m,n\in N$. Suppose $g\in\{n\to m\}$. Then $\sum_{j=0}^{0-1}g(j)m^j=0\lt 1=m^0$. Let $k\in n$ and suppose $\sum_{j=0}^{k-1}g(j)m^j\lt m^k$. Then $\sum_{j=0}^{S(k)-1}g(j)m^j=\sum_{j=0}^{k-1}g(j)m^j+g(k)m^k\lt m^k+g(k)m^k=S(g(k))m^k\le mm^k=m^{S(k)}$. By induction, we have $\sum_{j=0}^{n-1}g(j)m^j\lt m^n$. Thus we can define a function $f:\{n\to m\}\to m^n$ such that for all $g\in\{n\to m\}$, $f(g)=\sum_{j=0}^{n-1}g(j)m^j$. Given $g,h\in\{n\to m\}$ such that $f(g)=f(h)$, we have $\sum_{j=0}^{n-1}g(j)m^j=f(g)=f(h)=\sum_{j=0}^{n-1}h(j)m^j$. Let $k\in n$ and suppose for all $l\in k$, $g(l)=h(l)$. Note that $$\p{\sum_{j=0}^{n-1}g(j)m^j}\bmod m^{k+1} =\p{\sum_{j=0}^{k}g(j)m^j+\sum_{j=k+1}^{n-1}g(j)m^j}\bmod m^{k+1} =\p{\p{\sum_{j=0}^{k}g(j)m^j}\bmod m^{k+1}+\p{\sum_{j=k+1}^{n-1}g(j)m^j}\bmod m^{k+1}}\bmod m^{k+1} =\p{\sum_{j=0}^{k}g(j)m^j}\bmod m^{k+1} $$ Similarly, we have $\p{\sum_{j=0}^{n-1}h(j)m^j}\bmod m^{k+1}=\p{\sum_{j=0}^{k}h(j)m^j}\bmod m^{k+1}$. Thus $\p{\sum_{j=0}^{k}g(j)m^j}\bmod m^{k+1}=\p{\sum_{j=0}^{k}h(j)m^j}\bmod m^{k+1}$. As we showed above, we have $\sum_{j=0}^{k}g(j)m^j\lt m^{k+1}$ and $\sum_{j=0}^{k}h(j)m^j\lt m^{k+1}$. Thus $\sum_{j=0}^{k}g(j)m^j=\sum_{j=0}^{k}h(j)m^j$. Since $g(l)=h(l)$ for all $l\in k$, we have $g(k)m^k=h(k)m^k$. If $m^k=0$, then $m=0$ and $k\neq0$, implying $0\in n$. But then $\{n\to m\}=\emptyset$, a contradiction. Thus $m^k\neq0$, and we have $g(k)=h(k)$, implying for all $l\in S(k)$, $g(l)=h(l)$. With the trivial base case, by induction, we have that for all $l\in n$, $g(l)=h(l)$, implying $g=h$. Hence $f$ is injective. Let $k\in m^n$. Let $j\in n$ and suppose for contradiction that $(k\bmod m^{j+1})\divsymbol m^j\ge m$, then $k\bmod m^{j+1}\ge m^j((k\bmod m^{j+1})\divsymbol m^j)\ge m^jm=m^{j+1}$, a contradiction. Thus for all $j\in n$, $(k\bmod m^{j+1})\divsymbol m^j\lt m$. And we can define a function $g:n\to m$ such that for all $j\in n$, $g(j)=(k\bmod m^{j+1})\divsymbol m^j$. 
Note that $\sum_{j=0}^{0-1}\p{(k\bmod m^{j+1})\divsymbol m^j}m^j=0=k\bmod m^0$. Let $l\in n$ and suppose $\sum_{j=0}^{l-1}\p{(k\bmod m^{j+1})\divsymbol m^j}m^j=k\bmod m^l$. Then $$\sum_{j=0}^{S(l)-1}\p{(k\bmod m^{j+1})\divsymbol m^j}m^j =\sum_{j=0}^{l-1}\p{(k\bmod m^{j+1})\divsymbol m^j}m^j+\p{(k\bmod m^{S(l)})\divsymbol m^{S(l)-1}}m^{S(l)-1} =k\bmod m^l+\p{(k\bmod m^{S(l)})\divsymbol m^l}m^l$$ Note that $$ k =m^{S(l)}(k\divsymbol m^{S(l)})+k\bmod m^{S(l)} =m^{S(l)}(k\divsymbol m^{S(l)})+m^l\p{(k\bmod m^{S(l)})\divsymbol m^l}+(k\bmod m^{S(l)})\bmod m^l =m^l\p{m(k\divsymbol m^{S(l)})+(k\bmod m^{S(l)})\divsymbol m^l}+(k\bmod m^{S(l)})\bmod m^l $$ and $(k\bmod m^{S(l)})\bmod m^l\lt m^l$. Thus $k\bmod m^l=(k\bmod m^{S(l)})\bmod m^l$, and we have $$k\bmod m^l+\p{(k\bmod m^{S(l)})\divsymbol m^l}m^l =(k\bmod m^{S(l)})\bmod m^l+\p{(k\bmod m^{S(l)})\divsymbol m^l}m^l =k\bmod m^{S(l)}$$ By induction, we have $$f(g)=\sum_{j=0}^{n-1}g(j)m^j=\sum_{j=0}^{n-1}\p{(k\bmod m^{j+1})\divsymbol m^j}m^j=k\bmod m^n=k$$ Hence $f$ is surjective. Therefore, $\abs{\{n\to m\}}=\abs{m^n}=m^n$. $\blacksquare$
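The exponentiation part says, in effect, that a function $g:n\to m$ is the list of base-$m$ digits of the number $f(g)=\sum_{j=0}^{n-1}g(j)m^j$. The Python sketch below encodes $g$ as a tuple of length $n$ and checks both directions for small $m$ and $n$; `f_exp` and `digits` are illustrative names.

```python
from itertools import product


def f_exp(g, m):
    """f(g) = sum over j of g(j) * m**j, with g: n -> m given as a tuple of length n."""
    return sum(g_j * m**j for j, g_j in enumerate(g))


def digits(k, m, n):
    """The preimage from the proof: g(j) = (k mod m**(j+1)) div m**j for j in n.

    Intended for k < m**n; when m == 0 and n > 0 no such k exists.
    """
    return tuple((k % m ** (j + 1)) // m**j for j in range(n))


def all_functions(n, m):
    """All functions n -> m, each encoded as a tuple of length n."""
    return list(product(range(m), repeat=n))
```

For instance, with `m = 2` and `n = 3` the eight functions correspond to the binary expansions of `0, ..., 7`, least significant digit first.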

Proposition. For all $\alpha,\beta,\gamma\in\text{Car}$,

$\alpha+\beta=\beta+\alpha$;
$(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$;
$\alpha+\beta\le\alpha+\gamma$ if $\beta\le\gamma$;
$\alpha\beta=\beta\alpha$;
$(\alpha\beta)\gamma=\alpha(\beta\gamma)$;
$\alpha\beta\le\alpha\gamma$ if $\beta\le\gamma$;
$\alpha^{\beta+\gamma}=\alpha^\beta\alpha^\gamma$;
$\alpha^{\beta\gamma}=\p{\alpha^\beta}^\gamma$;
$\p{\alpha\beta}^\gamma=\alpha^\gamma\beta^\gamma$;
$\alpha^\beta\le\alpha^\gamma$ if $\alpha\neq0$ and $\beta\le\gamma$;
$\alpha^\gamma\le\beta^\gamma$ if $\alpha\le\beta$.

Proof. The map $p\mapsto(1-p_0,p_1)$ from $\alpha\bigsqcup\beta$ to $\beta\bigsqcup\alpha$ is a bijection. Thus $\alpha+\beta=\beta+\alpha$.

Let $f$ be a bijection from $\alpha\bigsqcup\beta$ to $\alpha+\beta$ and $g$ be a bijection from $\beta\bigsqcup\gamma$ to $\beta+\gamma$. Then for all $p\in(\alpha+\beta)\bigsqcup\gamma$, $p_0\in\{0,1\}$ and when $p_0=0$, $f^{-1}(p_1)\in\alpha\bigsqcup\beta$, implying $f^{-1}(p_1)_0\in\{0,1\}$. Thus the map

$p\mapsto(0,f^{-1}(p_1)_1)$ if $p_0=0$ and $f^{-1}(p_1)_0=0$;
$p\mapsto(1,g((0,f^{-1}(p_1)_1)))$ if $p_0=0$ and $f^{-1}(p_1)_0=1$;
$p\mapsto(1,g((1,p_1)))$ if $p_0=1$;

from $(\alpha+\beta)\bigsqcup\gamma$ to $\alpha\bigsqcup(\beta+\gamma)$ is a bijection. Thus $(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$.
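For finite cardinals the three-case map above can be written out and tested. The Python sketch below fixes concrete bijections as an assumption: $f(q)=q_0a+q_1$ from the disjoint union of $a$ and $b$ onto $a+b$, and $g(q)=q_0b+q_1$ from the disjoint union of $b$ and $c$ onto $b+c$. The proof itself allows arbitrary bijections $f$ and $g$, and all names here are illustrative.

```python
def disjoint_union(a, b):
    """The disjoint union of a and b: ({0} x a) together with ({1} x b)."""
    return [(0, k) for k in range(a)] + [(1, k) for k in range(b)]


def f_inverse(k, a):
    """Inverse of the assumed bijection f(q) = q_0 * a + q_1 onto a + b."""
    return (0, k) if k < a else (1, k - a)


def g(q, b):
    """The assumed bijection g(q) = q_0 * b + q_1 onto b + c."""
    return q[0] * b + q[1]


def reassociate(p, a, b):
    """Send p in the disjoint union of (a + b) and c to that of a and (b + c)."""
    if p[0] == 1:                      # third case: p_0 = 1
        return (1, g((1, p[1]), b))
    q = f_inverse(p[1], a)
    if q[0] == 0:                      # first case: the preimage lies in {0} x a
        return (0, q[1])
    return (1, g((0, q[1]), b))        # second case: the preimage lies in {1} x b
```

Applying `reassociate` to every element of `disjoint_union(a + b, c)` should produce each element of `disjoint_union(a, b + c)` exactly once.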

If $\beta\le\gamma$ then $\beta\subseteq\gamma$, thus $\alpha\bigsqcup\beta\subseteq\alpha\bigsqcup\gamma$, implying $\alpha+\beta\le\alpha+\gamma$.

The map $p\mapsto(p_1,p_0)$ from $\alpha\times\beta$ to $\beta\times\alpha$ is a bijection. Thus $\alpha\beta=\beta\alpha$.

Let $f$ be a bijection from $\alpha\times\beta$ to $\alpha\beta$ and $g$ be a bijection from $\beta\times\gamma$ to $\beta\gamma$. The map $p\mapsto(f^{-1}(p_0)_0,g((f^{-1}(p_0)_1,p_1)))$ from $(\alpha\beta)\times\gamma$ to $\alpha\times(\beta\gamma)$ is a bijection. Thus $(\alpha\beta)\gamma=\alpha(\beta\gamma)$.

If $\beta\le\gamma$ then $\beta\subseteq\gamma$, thus $\alpha\times\beta\subseteq\alpha\times\gamma$, implying $\alpha\beta\le\alpha\gamma$.

Let $g_0$ denote the function $x\mapsto(0,x)$ from $\beta$ to $\{0\}\times\beta$, $g_1$ denote the function $x\mapsto(1,x)$ from $\gamma$ to $\{1\}\times\gamma$, and $h_a,h_b,h_c$ denote bijections from $\beta\bigsqcup\gamma$ to $\beta+\gamma$, from $\{\beta\to\alpha\}$ to $\alpha^\beta$, and from $\{\gamma\to\alpha\}$ to $\alpha^\gamma$, respectively. The map $f\mapsto(h_b((f\circ h_a)|_{\{0\}\times\beta}\circ g_0),h_c((f\circ h_a)|_{\{1\}\times\gamma}\circ g_1))$ from $\{\beta+\gamma\to\alpha\}$ to $\alpha^\beta\times\alpha^\gamma$ is a bijection. Thus $\alpha^{\beta+\gamma}=\alpha^\beta\alpha^\gamma$.

Let $g\in\{\gamma\to\{\beta\to\alpha\}\}$. Then there exists a unique function $h:\beta\times\gamma\to\alpha$ such that for all $p\in\beta\times\gamma$, $h(p)=g(p_1)(p_0)$. Then we can define a map $u$ from $\{\gamma\to\{\beta\to\alpha\}\}$ to $\{\beta\times\gamma\to\alpha\}$ such that for all $g\in\{\gamma\to\{\beta\to\alpha\}\}$, for all $p\in\beta\times\gamma$, $u(g)(p)=g(p_1)(p_0)$. Clearly, $u$ is a bijection. Let $h_a,h_b$ denote bijections from $\beta\times\gamma$ to $\beta\gamma$ and from $\{\beta\to\alpha\}$ to $\alpha^\beta$, respectively. The map $f\mapsto u(h_b^{-1}\circ f)\circ h_a^{-1}$ from $\{\gamma\to\alpha^\beta\}$ to $\{\beta\gamma\to\alpha\}$ is a bijection. Thus $\p{\alpha^\beta}^\gamma=\alpha^{\beta\gamma}$.
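The map $u$ in this step is the familiar uncurrying of a function-valued function. As a finite illustration, the Python sketch below encodes functions as dictionaries and implements $u(g)(p)=g(p_1)(p_0)$; the helper `functions` and the dictionary encoding are assumptions of the sketch, not part of the source.

```python
from itertools import product


def functions(domain, codomain):
    """All functions domain -> codomain, each encoded as a dict."""
    domain = list(domain)
    codomain = list(codomain)
    return [dict(zip(domain, values)) for values in product(codomain, repeat=len(domain))]


def u(g):
    """u(g)(p) = g(p_1)(p_0): turn g: gamma -> {beta -> alpha} into beta x gamma -> alpha."""
    return {(b, c): g_c[b] for c, g_c in g.items() for b in g_c}
```

With $\alpha=3$ and $\beta=\gamma=2$, there are $9$ inner functions and $81$ outer ones, and `u` should hit each of the $3^4=81$ functions on $\beta\times\gamma$ exactly once.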

Let $g_a,g_b$ denote the functions from $\alpha\times\beta$ to $\alpha,\beta$ respectively, such that for all $p\in\alpha\times\beta$, $g_a(p)=p_0$ and $g_b(p)=p_1$. Let $h_a,h_b,h_c$ denote bijections from $\alpha\times\beta$ to $\alpha\beta$, from $\{\gamma\to\alpha\}$ to $\alpha^\gamma$, and from $\{\gamma\to\beta\}$ to $\beta^\gamma$, respectively. The map $f\mapsto(h_b(g_a\circ h_a^{-1}\circ f),h_c(g_b\circ h_a^{-1}\circ f))$ from $\{\gamma\to\alpha\beta\}$ to $\alpha^\gamma\times\beta^\gamma$ is a bijection. Thus $\p{\alpha\beta}^\gamma=\alpha^\gamma\beta^\gamma$.

If $\beta\le\gamma$ then $\beta\subseteq\gamma$. Since $\alpha\neq0$, we have $0\in\alpha$. Thus given $f\in\{\beta\to\alpha\}$, there exists a unique $g\in\{\gamma\to\alpha\}$ such that $g(x)=f(x)$ for all $x\in\beta$ and $g(x)=0$ for all $x\in\gamma\setminus\beta$. This defines an injective map from $\{\beta\to\alpha\}$ to $\{\gamma\to\alpha\}$, implying $\alpha^\beta\le\alpha^\gamma$.

If $\alpha\le\beta$ then $\alpha\subseteq\beta$, thus $\{\gamma\to\alpha\}\subseteq\{\gamma\to\beta\}$, implying $\alpha^\gamma\le\beta^\gamma$. $\blacksquare$

Proposition. $\abs{\mathcal P(X)}=2^{\abs{X}}$.
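The proof is collapsed in the source. A standard argument sends each subset of $X$ to its characteristic function $X\to2$, and the Python sketch below assumes that approach for a small finite $X$; `indicator` and `from_indicator` are illustrative names.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of the finite list xs, as frozensets."""
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def indicator(subset, xs):
    """Characteristic function of `subset`, encoded as a tuple of 0/1 indexed like xs."""
    return tuple(1 if x in subset else 0 for x in xs)


def from_indicator(bits, xs):
    """Inverse direction: the subset of xs selected by a function xs -> 2."""
    return frozenset(x for x, bit in zip(xs, bits) if bit)
```

The two functions are mutually inverse, which is what makes the correspondence between $\mathcal P(X)$ and $\{X\to2\}$ a bijection.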

Aleph number
Let $\kappa\in\text{Car}$, then $\abs{\mathcal P(\kappa)}\gt\kappa$, thus there exists a least cardinal greater than $\kappa$. This defines a class function $S:\text{Car}\to\text{Car}$ such that for all $\kappa\in\text{Car}$, $S(\kappa)$ is the least cardinal greater than $\kappa$. Now define a class function $G:V\to V$ by:

$G(X)=\omega$ if $X$ is a function from a zero ordinal;
$G(X)=\bigcup\{S(p_1):p\in X\}$ if $X$ is a function from a successor ordinal;
$G(X)=\bigcup\{p_1:p\in X\}$ if $X$ is a function from a limit ordinal;
$G(X)=\emptyset$ if otherwise.

Then $G$ recursively defines a class function $\aleph:\text{Ord}\to\text{Car}$ that is strictly increasing.

Proof. By transfinite recursion, $\aleph$ is a class function from $\text{Ord}$ to $V$, and for all $\alpha\in\text{Ord}$, $\aleph_\alpha=G(\aleph\upharpoonright\alpha)$. Let $\alpha\in\text{Ord}$ and suppose that for all $\beta\in\alpha$, $\aleph_\beta\in\text{Car}$.

If $\alpha$ is a zero ordinal, then $\aleph_\alpha=G(\aleph\upharpoonright\emptyset)=\omega\in\text{Car}$.
If $\alpha$ is a successor ordinal, then $\aleph_\alpha=G(\aleph\upharpoonright\alpha)=\bigcup\{S(p_1):p\in\aleph\upharpoonright\alpha\}$. Note that for all $p\in \aleph\upharpoonright\alpha$, there exist $\beta,\gamma$ such that $\beta\in\alpha$ and $p=(\beta,\gamma)$. Thus $p_1=\gamma=\aleph\upharpoonright\alpha(\beta)=\aleph_\beta\in\text{Car}$, and hence $S(p_1)\in\text{Car}$. Then for all $\gamma\in\{S(p_1):p\in\aleph\upharpoonright\alpha\}$, $\gamma\in\text{Car}$, and thus $\bigcup\{S(p_1):p\in\aleph\upharpoonright\alpha\}\in\text{Car}$. Hence $\aleph_\alpha\in\text{Car}$.
If $\alpha$ is a limit ordinal, then $\aleph_\alpha=G(\aleph\upharpoonright\alpha)=\bigcup\{p_1:p\in\aleph\upharpoonright\alpha\}$. By the same reasoning as above, for all $p\in \aleph\upharpoonright\alpha$, $p_1\in\text{Car}$. Then for all $\gamma\in\{p_1:p\in\aleph\upharpoonright\alpha\}$, $\gamma\in\text{Car}$, and thus $\bigcup\{p_1:p\in\aleph\upharpoonright\alpha\}\in\text{Car}$. Hence $\aleph_\alpha\in\text{Car}$.

In all cases, $\aleph_\alpha\in\text{Car}$. Thus by transfinite induction, for all $\alpha\in\text{Ord}$, $\aleph_\alpha\in\text{Car}$. Hence $\aleph$ is a class function from $\text{Ord}$ to $\text{Car}$.

Suppose $\alpha\in\text{Ord}$ and for all $\beta\in\alpha$, for all $\gamma\in\beta$, $\aleph_\gamma\lt\aleph_\beta$.

If $\alpha$ is a zero ordinal, then trivially, for all $\gamma\in\alpha$, $\aleph_\gamma\lt\aleph_\alpha$.
If $\alpha$ is a successor ordinal, then for some $\beta\in\alpha$ we have $\beta^+=\alpha$. If $\gamma\in\alpha$, then $\gamma\le\beta$, thus $\aleph_\gamma\le\aleph_\beta\lt S(\aleph_\beta)\le\bigcup\{S(p_1):p\in\aleph\upharpoonright\alpha\}=\aleph_\alpha$.
If $\alpha$ is a limit ordinal, then given $\gamma\in\alpha$, there exists $\beta\in\alpha$ such that $\gamma\in\beta$, implying $\aleph_\gamma\lt\aleph_\beta\le\bigcup\{p_1:p\in\aleph\upharpoonright\alpha\}=\aleph_\alpha$.

In all cases, we have that for all $\gamma\in\alpha$, $\aleph_\gamma\lt\aleph_\alpha$. By transfinite induction, for all $\alpha\in\text{Ord}$, for all $\gamma\in\alpha$, $\aleph_\gamma\lt\aleph_\alpha$. We have shown that $\aleph$ is strictly increasing. $\blacksquare$

And we have

$\aleph_0=\omega$;
$\aleph_{\alpha^+}=S(\aleph_\alpha)$;
$\aleph_\alpha=\bigcup_{\beta\in\alpha}\aleph_\beta$ if $\alpha$ is a limit ordinal.

Proposition. There is no set $X$ such that $X=\text{Car}$.

Lemma. Let $X$ be a non-empty set of ordinals with no maximal element, then $\bigcup X$ is a limit ordinal.

Proposition. For every limit cardinal $\kappa$, there exists a unique ordinal $\alpha$ such that $\aleph_\alpha=\kappa$.

Proof. Suppose there exists a limit cardinal that does not belong to the range of $\aleph$. Then there exists a least limit cardinal $\kappa$ with such property. Let $X$ denote the subset of $\kappa$ consisting of limit cardinals less than $\kappa$. Then $\bigcup X$ is a cardinal and is the least upper bound of $X$.

If $X$ is empty, then $\kappa$ cannot be greater than $\omega$. But then $\kappa$ also cannot be smaller than $\omega$ because $\omega$ is the least limit ordinal. Thus $\aleph_0=\omega=\kappa$, a contradiction.

If $X$ has a maximal element $\mu$, then $\mu$ must fall in the range of $\aleph$, implying for some ordinal $\alpha$, $\aleph_\alpha=\mu$. If $S(\mu)\lt\kappa$, then $S(\mu)\in X$, implying $\mu$ is not a maximal element of $X$, a contradiction. If $S(\mu)\gt\kappa$, then $S(\mu)$ is not the least cardinal greater than $\mu$, a contradiction. Hence $\aleph_{\alpha^+}=S(\aleph_\alpha)=S(\mu)=\kappa$, a contradiction.

If $X$ is non-empty and has no maximal element, then $\bigcup X$ is a limit cardinal. If $\bigcup X\lt\kappa$, then $\bigcup X\in X$, and there exists $\mu\in X$ such that $\mu\gt\bigcup X\ge\mu$, a contradiction. If $\bigcup X\gt\kappa$, then $\bigcup X$ is not the least upper bound of $X$, a contradiction. Thus $\bigcup X=\kappa$. Note that for each $\mu\in X$, there exists an ordinal $\alpha_\mu$ such that $\aleph_{\alpha_\mu}=\mu$. Also note that $\{\alpha_\mu:\mu\in X\}$ is a non-empty set of ordinals with no maximal element. Let $\alpha$ denote $\bigcup_{\mu\in X}\alpha_\mu$, then $\alpha$ is a limit ordinal. If $u\in\aleph_\alpha$, then for some $\beta\in\alpha$, $u\in\aleph_\beta$, and for some $\mu\in X$, $\beta\in\alpha_\mu$, thus $\aleph_\beta\lt\aleph_{\alpha_\mu}=\mu$, implying $u\in\mu$. Hence $u\in\bigcup X$. For the other direction, if $u\in\bigcup X$, then for some $\mu\in X$, $u\in\mu=\aleph_{\alpha_\mu}$. Since $\alpha_\mu\in\{\alpha_\mu:\mu\in X\}$, $\alpha_\mu\le\alpha$, implying $\aleph_{\alpha_\mu}\le\aleph_\alpha$. Hence $u\in\aleph_\alpha$. Therefore, we have $\aleph_\alpha=\bigcup X=\kappa$, a contradiction.

We have shown that every limit cardinal is in the range of $\aleph$. Uniqueness is trivial. $\blacksquare$

Proposition. A well-ordered set $(X,\le)$ is order-isomorphic to an ordinal. (show proof)

Proof. Given a function $f$ from an ordinal $\alpha$ to $X$, if $f$ is not a surjection, then there exists a unique least element of $X$ not in the range of $f$. Let $A$ denote the class of non-surjective functions from an ordinal to $X$, then this defines a class function $F:A\to X$. Define a class function $G:V\to V$ by:

$G(x)=F(x)$ if $x\in A$;
$G(x)=\emptyset$ otherwise.

Then $G$ recursively defines a class function $H:\text{Ord}\to V$ such that for all $\alpha\in\text{Ord}$, $H(\alpha)=G(H\upharpoonright\alpha)$. Assume for contradiction that for every ordinal $\alpha$, $H\upharpoonright\alpha\in A$. Let $\alpha$ be an ordinal, then $H\upharpoonright\alpha$ is a function from $\alpha$ to $X$. Given $\gamma,\delta\in\alpha$ such that $(H\upharpoonright\alpha)(\gamma)=(H\upharpoonright\alpha)(\delta)$, suppose for contradiction that $\gamma\lt\delta$. Note that $H\upharpoonright\delta\in A$, thus $(H\upharpoonright\alpha)(\delta)=H(\delta)=G(H\upharpoonright\delta)=F(H\upharpoonright\delta)$, which is the least element of $X$ not in the range of $H\upharpoonright\delta$. But $(H\upharpoonright\alpha)(\gamma)=H(\gamma)=(H\upharpoonright\delta)(\gamma)$, which is in the range of $H\upharpoonright\delta$, a contradiction. By a symmetric argument, $\gamma\gt\delta$ also leads to a contradiction. Hence $\gamma=\delta$. We have shown that $H\upharpoonright\alpha$ is an injection from $\alpha$ to $X$. Since $\alpha$ is an arbitrary ordinal, taking $\alpha=S(\abs{X})$ gives $S(\abs{X})=\abs{S(\abs{X})}\le\abs{X}$, a contradiction. Hence there exists an ordinal $\alpha$ such that $H\upharpoonright\alpha\notin A$. Let $\epsilon$ be the least ordinal such that $H\upharpoonright\epsilon\notin A$. Let $\alpha$ be an ordinal and suppose that for all $\beta\in\alpha$, if $\beta\le\epsilon$, then $H\upharpoonright\beta$ is an order-embedding from $\beta$ to $X$ with property $\varphi(H\upharpoonright\beta)$, where $\varphi(f)$ states that for all $x,y\in X$, if $x$ is in the range of $f$ and $y$ is not in the range of $f$, then $x\lt y$ (meaning $x\le y$ and $x\neq y$). Let $\alpha\le\epsilon$.

If $\alpha$ is a zero ordinal, then $H\upharpoonright\alpha=\emptyset$, which is trivially an order-embedding from $\alpha$ to $X$ with property $\varphi(H\upharpoonright\alpha)$.
If $\alpha$ is a successor ordinal, then some $\beta\in\alpha$ has $\beta^+=\alpha$. Since $\beta\lt\epsilon$, $H\upharpoonright\beta$ is an order-embedding from $\beta$ to $X$ with property $\varphi(H\upharpoonright\beta)$ and $H\upharpoonright\beta\in A$. Then $(H\upharpoonright\alpha)(\beta)=H(\beta)=G(H\upharpoonright\beta)=F(H\upharpoonright\beta)$, which is the least element of $X$ not in the range of $H\upharpoonright\beta$. By $\varphi(H\upharpoonright\beta)$, this element is greater than every element in the range of $H\upharpoonright\beta$, and by leastness it is less than every element of $X$ outside the range of $H\upharpoonright\alpha$. Then $H\upharpoonright\alpha$ is an order-embedding from $\alpha$ to $X$ with property $\varphi(H\upharpoonright\alpha)$.
If $\alpha$ is a limit ordinal, then given $\gamma,\delta\in\alpha$, there exists $\beta\in\alpha$ such that $\gamma,\delta\in\beta$. Since for any $\beta\in\alpha$, $H\upharpoonright\beta$ is an order-embedding from $\beta$ to $X$ with property $\varphi(H\upharpoonright\beta)$, $H\upharpoonright\alpha$ is clearly an order-embedding from $\alpha$ to $X$ with property $\varphi(H\upharpoonright\alpha)$.

By transfinite induction, for every ordinal $\alpha$, if $\alpha\le\epsilon$, then $H\upharpoonright\alpha$ is an order-embedding from $\alpha$ to $X$. Thus $H\upharpoonright\epsilon$ is an order-embedding from $\epsilon$ to $X$. Note that $H\upharpoonright\epsilon\notin A$, implying $H\upharpoonright\epsilon$ is also a surjection from $\epsilon$ to $X$. Denote $H\upharpoonright\epsilon$ as $f$ and let $x,y\in X$ such that $x\le y$. Suppose $f^{-1}(x)\gt f^{-1}(y)$, then $y\le x$ and $x\neq y$, a contradiction. We have shown that $f$ is an order-isomorphism from $\epsilon$ to $X$. $\blacksquare$
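The recursion used in this proof can be illustrated on a finite well-ordered set. The following is only a sketch under the assumption that $X$ is a finite collection of distinct Python values and that the hypothetical argument `less_than` implements the strict order $\lt$; the name `enumerate_well_order` is likewise chosen for illustration. At each step it picks the least element not yet in the range, mirroring $H(\alpha)=F(H\upharpoonright\alpha)$, and stops once the function built so far is surjective.

```python
def enumerate_well_order(X, less_than):
    """Return a list H where H[k] is the least element of X not among H[0..k-1].

    Assumption: X is finite with distinct elements, and less_than(a, b)
    is the strict part of a total order on X.
    """
    H = []
    remaining = list(X)
    # Loop while the function built so far is not a surjection onto X.
    while remaining:
        least = remaining[0]
        for x in remaining[1:]:
            if less_than(x, least):
                least = x
        H.append(least)          # H(len(H)) := least element not in the range
        remaining.remove(least)
    return H


if __name__ == "__main__":
    H = enumerate_well_order({5, 2, 9}, lambda a, b: a < b)
    print(H)
```

The resulting list, viewed as a map from the ordinal `len(H)` to $X$, is strictly increasing and surjective, which is the finite analogue of the order-isomorphism $H\upharpoonright\epsilon$.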

Lemma. For every limit cardinal $\kappa$, $\kappa^2=\kappa$. (show proof)

Proof. Suppose for contradiction that there exists a limit cardinal $\mu$ such that $\mu^2\gt\mu$. Then there exists a least such limit cardinal; let $\kappa$ denote it, so that $\kappa^2\gt\kappa$.

Define the relation $\le$ on $\kappa\times\kappa$ such that for all $p,q\in\kappa\times\kappa$, $p\le q$ if and only if

$p_0\cup p_1\lt q_0\cup q_1$, or
$p_0\cup p_1=q_0\cup q_1$ and $p_0\lt q_0$, or
$p_0\cup p_1=q_0\cup q_1$, $p_0=q_0$, and $p_1\le q_1$.
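For finite ordinals, $p_0\cup p_1$ is simply the maximum of the two coordinates, so the three clauses above amount to comparing the triples $(\max(p_0,p_1),p_0,p_1)$ lexicographically. The following sketch, restricted to a finite $n$ in place of $\kappa$ (an assumption made purely for illustration, with hypothetical function names), sorts $n\times n$ by this order and shows the property the proof relies on: every pair below a given pair has its maximum bounded, so initial segments sit inside squares $\gamma\times\gamma$.

```python
def canonical_key(p):
    """Sort key realising the order: compare max first, then p_0, then p_1."""
    return (max(p), p[0], p[1])


def canonical_order(n):
    """List the pairs of n x n (n a finite ordinal) in increasing canonical order."""
    pairs = [(a, b) for a in range(n) for b in range(n)]
    return sorted(pairs, key=canonical_key)


if __name__ == "__main__":
    order = canonical_order(3)
    print(order)
    # The pairs preceding (0, g) are exactly those of g x g.
    for g in range(3):
        below = order[: order.index((0, g))]
        print(g, sorted(below))
```

In this finite setting the pair $(0,g)$ has exactly $g^2$ predecessors, namely the elements of $g\times g$.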

Then $(\kappa\times\kappa,\le)$ is clearly a well-ordered set, and thus order-isomorphic to some ordinal $\alpha$. Then $\alpha\ge\abs{\alpha}=\abs{\kappa\times\kappa}=\kappa^2\gt\kappa$. Let $f$ be an order-isomorphism from $\alpha$ to $\kappa\times\kappa$, then there exists $p\in\kappa\times\kappa$ not in the range of $f|_\kappa$, implying $f^{-1}(p)\in\alpha\setminus\kappa$. Then for all $\beta\in\kappa$, $\beta\lt f^{-1}(p)$, thus $f(\beta)\lt p$, implying $f(\beta)_0\cup f(\beta)_1\le p_0\cup p_1$. Since $\kappa$ is a limit ordinal, $(p_0\cup p_1)^+\in\kappa$. Denote $(p_0\cup p_1)^+$ as $\gamma$, then $f(\beta)_0\cup f(\beta)_1\lt\gamma$, thus $f(\beta)\in\gamma\times\gamma$. Since $f|_\kappa$ is an injection from $\kappa$ to $\gamma\times\gamma$, $\kappa\le\abs{\gamma\times\gamma}$. If $\gamma\in\omega$, then $\gamma$ is a cardinal and $\kappa\le\gamma\gamma\lt\omega$, a contradiction. Thus $\gamma\ge\omega$, implying $\abs{\gamma}$ is a limit cardinal. Since $\abs{\gamma}\le\gamma\lt\kappa$, the minimality of $\kappa$ gives $\abs{\gamma}^2\le\abs{\gamma}$, and thus $\abs{\gamma}^2=\abs{\gamma}$. Since $\gamma\times\gamma$ is bijective to $\abs{\gamma}\times\abs{\gamma}$, we have $\abs{\gamma\times\gamma}=\abs{\gamma}^2$. Then we have $\kappa\le\abs{\gamma\times\gamma}=\abs{\gamma}^2=\abs{\gamma}\le\gamma\lt\kappa$, a contradiction.

We have shown that for every limit cardinal $\kappa$, $\kappa^2\le\kappa$, and thus $\kappa^2=\kappa$. $\blacksquare$

Proposition. Suppose $\kappa,\mu$ are cardinals such that $\kappa\ge\mu$ and $\kappa\ge\omega$, then

$\kappa+\mu=\kappa$;
$\kappa\mu=\kappa$ if $\mu\neq0$;
$\kappa^n=\kappa$ if $n\in\omega\setminus\{0\}$.

(show proof)

Proposition. Given non-zero natural number $n$, $$2^{\aleph_0}=\abs{R}=\abs{C}=\abs{R^n}$$ (show proof)

Cofinality
Let $\alpha\in\text{Car}$. The cofinality of $\alpha$ is the strictly least element $\kappa$ of $\text{Car}$ such that there exists a map $f:\kappa\to\alpha$ with the property that for all $\beta\in\alpha$, there exists $\gamma\in\kappa$ such that $\beta\le f(\gamma)$. This defines a class function $\text{cf}$ from $\text{Car}$ to $\text{Car}$.
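For finite cardinals the definition can be checked by brute force. The sketch below assumes that a finite cardinal $n$ is represented by the Python integer `n` and a map $f:\kappa\to\alpha$ by a tuple of length $\kappa$ with entries below $\alpha$; the function names are hypothetical. It only illustrates the definition and says nothing about infinite cardinals.

```python
from itertools import product


def is_cofinal(f, alpha):
    """Check that every beta < alpha satisfies beta <= f(gamma) for some gamma."""
    return all(any(beta <= value for value in f) for beta in range(alpha))


def cofinality(alpha):
    """Least kappa admitting a cofinal map kappa -> alpha (alpha a finite cardinal)."""
    for kappa in range(alpha + 1):
        # Enumerate all maps kappa -> alpha as tuples of length kappa.
        for f in product(range(alpha), repeat=kappa):
            if is_cofinal(f, alpha):
                return kappa
    raise AssertionError("unreachable: the identity map alpha -> alpha is cofinal")


if __name__ == "__main__":
    for n in range(5):
        print(n, cofinality(n))
```

Under this definition the empty map is cofinal in $0$, and any non-zero finite cardinal $n$ has the one-point map $0\mapsto n-1$ as a cofinal map, so the search returns $0$ for $0$ and $1$ otherwise.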

Inaccessibility
Let $\kappa\in\text{Car}$. We call $\kappa$ inaccessible if

$\omega\in\kappa$,
$\text{cf}(\kappa)=\kappa$, and
if $\abs{X}\lt\kappa$, then $\abs{\mathcal P(X)}\lt\kappa$.

Lemma. Suppose $\kappa$ is an inaccessible cardinal, then for all $\alpha\in\kappa$, $\alpha^+\in\kappa$. (show proof)

Lemma. Suppose $\kappa$ is an inaccessible cardinal, then $x\in V_\kappa$ implies $\abs{x}\lt\kappa$. (show proof)

Lemma. Suppose $\kappa$ is an inaccessible cardinal, then $x\subseteq V_\kappa$ and $\abs{x}\lt\kappa$ imply $x\in V_\kappa$. (show proof)

Standard model of ZFW
Assume as a hypothetical concept that we can define a collection $V$ in meta-logic such that:

$V$ is non-empty;
let $\Gamma$ be the collection of axioms in second-order $ZFW$, where axiom schema of specification and axiom schema of replacement are replaced by their second-order forms, the structure with
- $V$ as the universe,
- $x=y$ interpreted as "$x$ and $y$ are identical",
- $x\in y$ interpreted as "$x$ is in $y$", and
- every witness function symbol interpreted to satisfy the corresponding witness axiom
satisfies every axiom in $\Gamma$ and the formula representing that "there exists an inaccessible cardinal".

Then the structure described above, which we will denote as $V$, is a model of $ZFW$ (show proof),

Proof. The only thing to prove is that axiom schema of specification and axiom schema of replacement are satisfied by the structure. We will first make clear the definitions of the second order forms of these axiom schemas:

axiom of specification: given any natural number $n$, for every collection $\mathcal C$ of $(n+2)$-tuples of the universe, for every $n$-tuple $w$ of the universe, for every $X$ in the universe, there exists $Y$ in the universe that contains exactly the objects $u$ in $X$ such that $(w_1,\ldots,w_n,X,u)\in\mathcal C$.
axiom of replacement: given any natural number $n$, for every collection $\mathcal C$ of $(n+3)$-tuples of the universe, for every $n$-tuple $w$ of the universe, for every $X$ in the universe, if for all $x\in X$ there exists a unique $y$ in the universe such that $(w_1,\ldots,w_n,X,x,y)\in\mathcal C$, then there exists $Y$ in the universe, such that for all $x\in X$, there exists $y\in Y$ such that $(w_1,\ldots,w_n,X,x,y)\in\mathcal C$.

By the hypothesis, we have $V$ as the universe of the structure, which satisfies the above axioms.

For any natural number $n$, let $X,Y,u,w_1,\ldots,w_n$ be distinct variable symbols, and let $\varphi$ be a formula with free variables among $X,u,w_1,\ldots,w_n$. Let $\mathcal C$ be the collection of $(n+2)$-tuples $T$ of the universe such that $\varphi$ is true in the structure with respect to the assignment function $\sigma$ defined by

$\sigma(w_j)=T_j$ if $w_j\in\text{free}(\varphi)$ for all $j$ in $1,\ldots,n$,
$\sigma(X)=T_{n+1}$ if $X\in\text{free}(\varphi)$, and
$\sigma(u)=T_{n+2}$ if $u\in\text{free}(\varphi)$.

Assign arbitrary $w^*_j\in V$ to $w_j$ for all $j$ in $1,\ldots,n$, then $(w^*_1,\ldots,w^*_n)$ is an $n$-tuple of $V$. Assign arbitrary $X^*\in V$ to $X$, then there exists $Y^*\in V$ that contains exactly the objects $u^*$ in $X^*$ such that $(w^*_1,\ldots,w^*_n,X^*,u^*)\in\mathcal C$. We assign such $Y^*$ to $Y$. Now assign arbitrary $u^*\in V$ to $u$. If $u^*\in Y^*$, then $u^*\in X^*$ and $(w^*_1,\ldots,w^*_n,X^*,u^*)\in\mathcal C$, implying that $\varphi$ is true in the structure with respect to the assignment function $\sigma$ defined by

$\sigma(w_j)=w^*_j$ if $w_j\in\text{free}(\varphi)$ for all $j$ in $1,\ldots,n$,
$\sigma(X)=X^*$ if $X\in\text{free}(\varphi)$, and
$\sigma(u)=u^*$ if $u\in\text{free}(\varphi)$.

Note that $\sigma$ agrees with our assignments. Thus $(u\in Y)\to((u\in X)\land\varphi)$ is satisfied. If $u^*\in X^*$ and $\varphi$ is true in the structure with respect to the assignment function $\sigma$ described above, then $(w^*_1,\ldots,w^*_n,X^*,u^*)\in\mathcal C$, thus $u^*\in Y^*$. Hence $((u\in X)\land\varphi)\to(u\in Y)$ is satisfied. Therefore, $(u\in Y)\leftrightarrow((u\in X)\land\varphi)$ is satisfied. Since $u^*$ is arbitrary, $\forall u((u\in Y)\leftrightarrow((u\in X)\land\varphi))$ is satisfied. Since we have $Y^*$ as an instance, $\exists Y\forall u((u\in Y)\leftrightarrow((u\in X)\land\varphi))$ is satisfied. Since the rest of the assignments are arbitrary, $\forall w_1\ldots\forall w_n\forall X\exists Y\forall u((u\in Y)\leftrightarrow((u\in X)\land\varphi))$ is satisfied.

For any natural number $n$, let $X,Y,x,y,w_1,\ldots,w_n$ be distinct variable symbols, and let $\varphi$ be a formula with free variables among $X,x,y,w_1,\ldots,w_n$. Let $\mathcal C$ be the collection of $(n+3)$-tuples $T$ of the universe such that $\varphi$ is true in the structure with respect to the assignment function $\sigma$ defined by

$\sigma(w_j)=T_j$ if $w_j\in\text{free}(\varphi)$ for all $j$ in $1,\ldots,n$,
$\sigma(X)=T_{n+1}$ if $X\in\text{free}(\varphi)$,
$\sigma(x)=T_{n+2}$ if $x\in\text{free}(\varphi)$, and
$\sigma(y)=T_{n+3}$ if $y\in\text{free}(\varphi)$.

Assign arbitrary $w^*_j\in V$ to $w_j$ for all $j$ in $1,\ldots,n$, then $(w^*_1,\ldots,w^*_n)$ is an $n$-tuple of $V$. Assign arbitrary $X^*\in V$ to $X$. If for all $x^*\in X^*$ there exists a unique $y^*\in V$ such that $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$, then there exists $Y^*\in V$, such that for all $x^*\in X^*$, there exists $y^*\in Y^*$ such that $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$. Suppose $\forall x((x\in X)\to\exists!y\varphi)$ is satisfied. Let $x^*\in X^*$, if we assigned $x^*$ to $x$, then both $x\in X$ and $(x\in X)\to\exists!y\varphi$ are satisfied, thus $\exists!y\varphi$ is satisfied. Hence there exists a unique $y^*\in V$ such that when $y^*$ is assigned to $y$, $\varphi$ is satisfied. Note that given $y^*\in V$, $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$ if and only if $\varphi$ is satisfied with $y$ assigned $y^*$. Thus there exists a unique $y^*\in V$ such that $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$. Hence there exists $Y^*\in V$, such that for all $x^*\in X^*$, there exists $y^*\in Y^*$ such that $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$. We assign such $Y^*$ to $Y$. Suppose we have arbitrary $x^*\in V$. If $x^*\in X^*$, then there exists $y^*\in Y^*$ such that $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$. Again, note that given $y^*\in V$, $(w^*_1,\ldots,w^*_n,X^*,x^*,y^*)\in\mathcal C$ if and only if $\varphi$ is satisfied with $y$ assigned $y^*$. Thus if we assign such $y^*$ to $y$, then $(y\in Y)\land\varphi$ is satisfied. Since we have $y^*$ as an instance, $\exists y((y\in Y)\land\varphi)$ is satisfied, thus $(x\in X)\to\exists y((y\in Y)\land\varphi)$ is satisfied. If not $x^*\in X^*$, then $(x\in X)\to\exists y((y\in Y)\land\varphi)$ is trivially satisfied. Since $x^*$ is arbitrary, $\forall x((x\in X)\to\exists y((y\in Y)\land\varphi))$ is satisfied. Since we have $Y^*$ as an instance, $\exists Y\forall x((x\in X)\to\exists y((y\in Y)\land\varphi))$ is satisfied. 
Therefore $(\forall x((x\in X)\to\exists!y\varphi)\to\exists Y\forall x((x\in X)\to\exists y((y\in Y)\land\varphi)))$ is satisfied. Since the rest of the assignments are arbitrary, $\forall w_1\ldots\forall w_n\forall X(\forall x((x\in X)\to\exists!y\varphi)\to\exists Y\forall x((x\in X)\to\exists y((y\in Y)\land\varphi)))$ is satisfied. $\blacksquare$

and by soundness, this model satisfies every provable formula of $ZFW$, including the formula representing that "if there exists an inaccessible cardinal, then there exists a strictly least inaccessible cardinal". Since this model satisfies that "there exists an inaccessible cardinal" by hypothesis, it satisfies that "there exists a strictly least inaccessible cardinal". Let $\kappa$ be the strictly least inaccessible cardinal, as a term. Then the model interprets the term $V_\kappa$ as an object in $V$, which we denote by the same notation $V_\kappa$. We call the object $V_\kappa$ the standard universe of $ZFW$. Note that:

For all $X\in V_\kappa$, for all $x\in X$, $x\in V_\kappa$.
For all $X\in V_\kappa$, for all $Y\subseteq X$, $Y\in V_\kappa$.
The object that $V$ assigns to the term $\omega$ is in $V_\kappa$.

(show proof)

Proof. We will show below that for all $X\in V$, for all $Y\subseteq X$, $Y\in V$. Let $X\in V$ and $Y\subseteq X$. Let $\mathcal C$ be the collection of $2$-tuples $w$ of $V$ such that $w_2\in Y$. Then there exists $Y'\in V$ that contains exactly the objects $u$ in $X$ such that $(X,u)\in\mathcal C$. Note that $u\in Y$ if and only if $u\in X$ and $(X,u)\in\mathcal C$, if and only if $u\in Y'$. Thus $Y=Y'$, implying that $Y\in V$.

We will show below that for all $X\in V$, for all $x\in X$, $x\in V$. Let $X\in V$ and $x\in X$. Suppose for contradiction that $x\notin V$. Let $X'$ be the collection of objects $u$ in $X$ such that $u\neq x$. Then $X'\subseteq X$, thus $X'\in V$. Since $V$ satisfies axiom of extensionality, and given $u\in V$, $u\in X$ if and only if $u\in X'$, we have $X=X'$. But $x\in X$ and $x\notin X'$, a contradiction. Hence $x\in V$.

Suppose object $X\in V_\kappa$. Then $X\in V$. Note that the formula $\forall X((X\in V_\kappa)\to(X\subseteq V_\kappa))$ is satisfied by the model $V$. Thus $\forall x((x\in X)\to(x\in V_\kappa))$ is true when $X$ is taken to be the object $X$. That means for every object $x\in V$, we have $(x\in X)\to(x\in V_\kappa)$. Given object $x\in X$, we have $x\in V$, thus $(x\in X)\to(x\in V_\kappa)$, and since $x\in X$, we have $x\in V_\kappa$. We have shown that for all $x\in X$, $x\in V_\kappa$.

Suppose object $X\in V_\kappa$ and $Y$ is a sub-collection of $X$. Then $X\in V$, thus $Y\in V$. And $X\subseteq V_\kappa$, thus $Y\subseteq V_\kappa$. By a lemma above, we have $\abs{X}\lt\kappa$ in $V$. Since $Y\subseteq X$, we have $\abs{Y}\le\abs{X}$ in $V$. Thus $\abs{Y}\lt\kappa$ in $V$. Hence by a lemma above, we have $Y\in V_\kappa$ in $V$.

Since $\omega\in\kappa$, $\omega^+\in\kappa$. Trivially, $0\subseteq V_0$. Let $n\in\omega$ and suppose $n\subseteq V_n$. Then $n\in V_{n^+}$ and $n\subseteq V_{n^+}$. Thus $S(n)\subseteq V_{n^+}=V_{S(n)}$. By induction, given $n\in\omega$, we have $n\subseteq V_n$, and thus $n\in V_{n^+}\subseteq V_\omega$. Hence $\omega\subseteq V_\omega$, implying $\omega\in V_{\omega^+}\subseteq V_\kappa$. Therefore, $\omega\in V_\kappa$. And the object $\omega$ is in the object $V_\kappa$. $\blacksquare$
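The induction $n\subseteq V_n$ and the first two closure properties can be checked concretely on the first few finite stages. The sketch below assumes hereditarily finite sets are modelled as nested `frozenset` values, with $V_0=\emptyset$ and $V_{n^+}=\mathcal P(V_n)$; the helper names are hypothetical and the check stops at $V_4$, which has $16$ elements.

```python
from itertools import combinations


def powerset(s):
    """All subsets of the finite set s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def cumulative_hierarchy(n):
    """Return the list [V_0, ..., V_n] with V_0 empty and V_{k+1} = P(V_k)."""
    stages = [frozenset()]
    for _ in range(n):
        stages.append(powerset(stages[-1]))
    return stages


def von_neumann(n):
    """The finite ordinal n as the set of its predecessors."""
    k = frozenset()
    for _ in range(n):
        k = k | frozenset([k])  # successor: k united with {k}
    return k


if __name__ == "__main__":
    stages = cumulative_hierarchy(4)
    print([len(V) for V in stages])
    for n in range(5):
        print(n, von_neumann(n) <= stages[n])  # n is a subset of V_n
```

Each stage built this way is transitive (every element is also a subset), which is the finite counterpart of the first property listed for $V_\kappa$.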

Define a structure of $ZFW$ with

$V_\kappa$ being the universe,
$x=y$ interpreted as "$x$ and $y$ are identical",
$x\in y$ interpreted as "$x$ is in $y$", and
every witness function symbol interpreted to satisfy the corresponding witness axiom.

Then this structure, which we will denote as $V_\kappa$, is a model of $ZFW$ (show proof).

Proof.

Logical axioms: For LA1, use induction to show that there exists a tuple of steps $(\varphi_1,\ldots,\varphi_n)$ to obtain the tautology $\varphi$ such that the variable symbols of each $\varphi_i$ form a sub-collection of those of $\varphi$. Then use induction to show that if the propositional variable symbols are assigned the same truth values as the corresponding first-order formulas given a first-order assignment function, then the original propositional formula is evaluated to the same truth value as the first-order formula formed by replacing variable symbols with formulas. Since the original formula is a tautology, we can show that LA1 is satisfied. For LA2, use induction to show that there exists a tuple of steps $(\varphi_1,\ldots,\varphi_n)$ to obtain $\varphi$ such that $t$ is substitutable for $x$ in each $\varphi_i$. Then use induction to show that given an assignment function $\sigma$ for the free variables of $\varphi[t/x]$ and $t$, if we define an assignment function $\tau$ for the free variables of $\varphi$ and $x$ that matches $\sigma$ for variables other than $x$, and maps $x$ to the object that $\sigma$ sends $t$ to, then $\delta_\tau(\varphi)=\delta_\sigma(\varphi[t/x])$. Using this, we can show that LA2 is satisfied. For LA6, use induction to show that when $x$ and $y$ are assigned the same object, $\varphi$ and $\varphi'$ are evaluated to the same truth value. The rest are trivial.
Axiom of extensionality: Let $X,Y\in V_\kappa$. Then $X,Y\subseteq V_\kappa$. Hence if for all $u\in V_\kappa$, $u\in X$ if and only if $u\in Y$, then given $u\in X$, we have $u\in V_\kappa$ and $u\in X$, thus $u\in Y$; given $u\in Y$, we have $u\in V_\kappa$ and $u\in Y$, thus $u\in X$. Hence $X=Y$.
Axiom of pairing: Let $a,b\in V_\kappa$. Then $a,b\in V$. Thus there exists $c\in V$ such that for all $u\in V$, $u\in c$ if and only if $u=a$ or $u=b$. Also, for some $\alpha\in\kappa$, $a\subseteq V_\alpha$ and for some $\beta\in\kappa$, $b\subseteq V_\beta$. Thus $a\in V_{\alpha^+}$ and $b\in V_{\beta^+}$.
- If $\alpha^+=\beta^+$, then $a,b\in V_{\alpha^+}$.
- If $\alpha^+\in\beta^+$, then $V_{\alpha^+}\subseteq V_{\beta^+}$, thus $a,b\in V_{\beta^+}$.
- If $\beta^+\in\alpha^+$, then $V_{\beta^+}\subseteq V_{\alpha^+}$, thus $a,b\in V_{\alpha^+}$.
Note that $\alpha^+,\beta^+\in\kappa$, thus there exists $\gamma\in\kappa$ such that $a,b\in V_\gamma$. Let $u\in V$, if $u\in c$, then $u=a$ or $u=b$, and in both cases we have $u\in V_\gamma$. Thus $c\subseteq V_\gamma$ and hence $c\in V_{\gamma^+}$. Since $\gamma^+\in\kappa$, $c\in V_\kappa$, and thus $c\subseteq V_\kappa$. Let $x\in V_\kappa$. If $x\in c$, then $x\in V$ and $x\in c$, thus $x=a$ or $x=b$. If $x=a$ or $x=b$, then $x\in V$ and $(x=a)\lor(x=b)$, thus $x\in c$.
Axiom schema of specification: Let $n$ be a natural number and $\varphi$ be a formula as required. Fix an assignment function for the variable symbols $w_1,\ldots,w_n$. Let $X\in V_\kappa$, then there exists $Y\in V$ such that for all $u\in V$, $u\in Y$ if and only if $u\in X$ and $\varphi$. Thus $Y\subseteq X$. Note that there exists $\alpha\in\kappa$ such that $X\subseteq V_\alpha$. Thus $Y\subseteq V_\alpha$ and hence $Y\in V_{\alpha^+}\subseteq V_\kappa$. Let $u\in V_\kappa$. If $u\in Y$, then $u\in V$ and $u\in Y$, thus $u\in X$ and $\varphi$. If $u\in X$ and $\varphi$, then $u\in V$ and $(u\in X)\land\varphi$, thus $u\in Y$.
Axiom of union: Let $X\in V_\kappa$. Then there exists $Y\in V$ such that for all $u\in V$, $u\in Y$ if and only if there exists $x\in V$ such that $x\in X$ and $u\in x$. Also, for some $\alpha\in\kappa$, $X\subseteq V_\alpha$. Let $u\in V$. If $u\in Y$ then there exists $x\in V$ such that $x\in X$ and $u\in x$. Thus $x\in V_\alpha$ and hence for some $\beta\in\alpha$, $x\subseteq V_\beta$, and we have $u\in V_\beta\subseteq V_\alpha$. Since $u$ is arbitrary, we have $Y\subseteq V_\alpha$, and thus $Y\in V_{\alpha^+}\subseteq V_\kappa$. Let $u\in V_\kappa$. If $u\in Y$, then $u\in V$ and $u\in Y$, thus there exists $x\in V$ such that $x\in X$ and $u\in x$. Since $X\in V_\kappa$ and $x\in X$, we have $x\in V_\kappa$. If there exists $x\in V_\kappa$ such that $x\in X$ and $u\in x$, then $x\in V$, and $u\in V$, thus $u\in Y$.
Axiom of power set: Let $X\in V_\kappa$. Then there exists $Y\in V$ such that for all $y\in V$, $y\in Y$ if and only if for all $x\in V$, $x\in y$ implies $x\in X$. Also, for some $\alpha\in\kappa$, $X\subseteq V_\alpha$. Let $y\in V$. If $y\in Y$ then for all $x\in V$, $x\in y$ implies $x\in X$. Thus $y\subseteq X\subseteq V_\alpha$, and we have $y\in V_{\alpha^+}$. Therefore, we have $Y\subseteq V_{\alpha^+}$, and thus $Y\in V_{\alpha^{++}}\subseteq V_\kappa$. Let $y\in V_\kappa$. If $y\in Y$, then $y\in V$ and $y\in Y$, thus for all $x\in V$, $x\in y$ implies $x\in X$, hence for all $x\in V_\kappa$, $x\in y$ implies $x\in X$. If for all $x\in V_\kappa$, $x\in y$ implies $x\in X$, then let $x\in V$, if $x\in y$, then $x\in V_\kappa$, thus $x\in y$ implies $x\in X$. Since we also have $y\in V$, we have $y\in Y$.
Axiom of infinity: We showed above that $N\in V_\kappa$. Let $z\in V_\kappa$, if for all $y\in V_\kappa$, not $y\in z$, then given $u\in V$, either $u\in V_\kappa$, then $u\notin z$, or $u\notin V_\kappa$, then also $u\notin z$, hence $z=\emptyset$. Since $\emptyset\in N$, we have $z\in N$. Let $x\in V_\kappa$. Suppose $x\in N$. Let $X\in V_\kappa$. Suppose for all $u\in V_\kappa$, $u\in X$ if and only if $u\in x$ or $u=x$. Let $u\in V$. If $u\in X$, then $u\in V_\kappa$ and $u\in X$, thus $u\in x$ or $u=x$, implying $u\in S(x)$. If $u\in S(x)$, then $u\in x$ or $u=x$, in both cases we also have $u\in V_\kappa$, thus $u\in X$. Hence $X=S(x)$. Since $x\in N$, we have $S(x)\in N$, and thus $X\in N$.
Axiom schema of replacement: Let $n$ be a natural number and $\varphi$ be a formula as required. Fix an assignment function for the variable symbols $w_1,\ldots,w_n$. Let $X\in V_\kappa$. Suppose for all $x\in V_\kappa$, if $x\in X$ then there exists a unique $y\in V_\kappa$ such that we have $\varphi$. Let $x\in V$. If $x\in X$, then $x\in V_\kappa$ and $x\in X$, thus there exists a unique $y\in V_\kappa$ such that we have $\varphi$. Now consider the formula $(y\in V_\kappa)\land\varphi$, denoted $\varphi'$. We can see that the unique $y\in V_\kappa$ that satisfies $\varphi$ also uniquely satisfies $\varphi'$ in $V$. Thus there exists $Y\in V$ such that for all $x\in V$, if $x\in X$ then there exists $y\in V$ such that $y\in Y$ and $\varphi'$. Then we can see that for some $Y\in V$, for all $x\in X$, there exists a unique $y\in Y$ such that $\varphi'$. Note that $\varphi'(x,y)$ defines a function $f:X\to Y$. Let $Y'$ denote the range of $f$, then $f$ is a surjection from $X$ to $Y'$, thus $\abs{X}\ge\abs{Y'}$. Since $X\in V_\kappa$, we have $\abs{X}\lt\kappa$, thus $\abs{Y'}\lt\kappa$. If $y'\in Y'$, then there exists $x'\in V$ such that $x'\in X$ and $f(x')=y'$. And we have $\varphi'(x',f(x'))$ and thus $\varphi'(x',y')$, implying $y'\in V_\kappa$. Therefore, $Y'\subseteq V_\kappa$. Since $Y'\subseteq V_\kappa$ and $\abs{Y'}\lt\kappa$, by a lemma above, we have $Y'\in V_\kappa$. Let $x\in V_\kappa$. If $x\in X$, then $x\in V$ and $x\in X$, thus $f(x)\in Y'$ and $\varphi'(x,f(x))$ holds, so taking $y=f(x)$ we have $y\in V_\kappa$, $y\in Y'$, and $\varphi$. Hence $Y'$ is the required witness in $V_\kappa$.
Axiom of foundation: Let $S\in V_\kappa$. Suppose there exists $x\in V_\kappa$ such that $x\in S$, then there exists $x\in V$ such that $x\in S$. Thus there exists $s\in V$ such that $s\in S$ and there does not exist $y\in V$ such that $y\in S$ and $y\in s$. Hence $s\in V_\kappa$ and there does not exist $y\in V_\kappa$ such that $y\in S$ and $y\in s$.
Witness axioms: These are trivial, since witness functions are interpreted to satisfy witness axioms. Just use the fact that the witness term is substitutable for the substituted variable symbol in $\varphi$ and the results we had for LA2.

$\blacksquare$

We call $V_\kappa$ the standard model of $ZFW$.