\newpage \subsection {Simulation.} We considered several types of machines (models of computation). We will now see that all these machines can be simulated by the simplest of them: the Turing Machine. In other words, these powerful machines can compute only those functions computable by a TM. \vfill{\bf Church-Turing Thesis} is a generalization of this conclusion: TMs can compute every function computable in any conceivable physical model of computation. This is not a mathematical result, because the notion of a model is not formally specified. But the long history of investigations of ways to design real and ideal computing devices makes it very convincing. Moreover, the Thesis has a stronger {\em Polynomial Time} version, which states that if any model computes a function in polynomial volume, a TM can do the same. Both forms of the Thesis play a significant role in the foundations of Computer Science. \vfill{\bf PKM Simulation of PPM.} For convenience, we assume each PPM node has an edge into the root. Now, for each node, we reconnect its incoming PPM edges (unbounded in number), two per leaf, to a bidirectional binary tree with new PKM colors $l,r,u$. The number of edges increases at most 4 times. The nodes simulate the PPM's pulling stage by extending their trees to double depth. To simulate the re-coloring stage, each node gets a binary {\em name} formed by the $l,r$ colors on its path through the root's tree. It then broadcasts this name down its own tree. When each node has thus received the identities and pointers of its PPM neighbors, it stores them by creating a little auxiliary chain (acting as a TM). Then it computes and implements the actions of the original PPM and rebalances its tree. This simulation of a PPM step takes polylogarithmic time. \vfill{\bf TM Simulation of PPM.} To simulate the above PKM by a TM, we first represent its memory configuration on the TM tape as the list of all pointers, sorted by the source names (described above) and then by color.
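The fan-in reduction in the PKM simulation above can be sketched in code. This is a sketch only: the node representation, names, and the helper \verb|fanin_tree| are illustrative assumptions, not part of the model; it merely checks the claimed at-most-fourfold growth in the number of edges.

```python
# Sketch (assumed representation): reduce a node's unbounded fan-in to a
# bidirectional binary tree, attaching the original edges two per leaf.

def fanin_tree(incoming):
    """Return the directed tree pointers (colors l, r down; u up) of a
    binary tree whose leaves absorb the `incoming` edges, 2 per leaf."""
    level = [('leaf', i) for i in range((len(incoming) + 1) // 2)]
    edges = []  # (parent, color, child) triples, both directions
    n = 0
    while len(level) > 1:
        nxt = []
        for j in range(0, len(level), 2):
            parent = ('node', n); n += 1
            edges.append((parent, 'l', level[j]))
            edges.append((level[j], 'u', parent))
            if j + 1 < len(level):
                edges.append((parent, 'r', level[j + 1]))
                edges.append((level[j + 1], 'u', parent))
            nxt.append(parent)
        level = nxt
    return edges

for d in [2, 3, 10, 1000]:
    tree = fanin_tree(list(range(d)))
    # d original edges (now ending at leaves) + tree pointers: at most 4d
    assert d + len(tree) <= 4 * d
```

Each non-root tree node contributes one downward and one upward pointer, so the tree adds roughly $2d$ pointers to the $d$ original edges, within the stated factor of 4.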
The PKM program is reflected in the TM's transition table. Now the TM can simulate the PKM's pulling stage as follows: it creates a copy of each pointer and sorts the copies by their sinks. Now each pointer, located at its source, has a copy near its sink. So both components of each 2-pointer path are nearby: the special double-colored pointers can be created and moved to their sources by sorting. The re-coloring stage is straightforward, as all relevant pointers have the same source and are located together. When no active edges remain at the root, the Turing machine stops and its tape represents the PKM output. If a PPM computes a function $f(x)$ in $t(x)$ steps, using $s(x)$ nodes, the simulating TM uses space $S = O(s\log s)$ ($O(\log s)$ bits for each of $O(s)$ pointers) and time $T= O(S^2)\,t$, as TM sorting takes quadratic time. {\small A TM cannot outperform Bubble Sort. Is its quadratic overhead a big deal? In a short time all silicon gates on your PC run, say, $X\!=\! 10^{23}\!\sim\!2^{2^{6.25}}$ clock cycles combined. Silicon parameters double almost annually. Decades may bring micron-thin things that can sail sunlight in space in clouds of great computing and physical (light-beam) power. Centuries may turn them into a Dyson Sphere enveloping the solar system. Still, the power of such an ultimate computer is limited by the number of photons the Sun emits per second: $Y\!\sim\! 2^{2^{7.25}} \!\!=\! X^2$. Giga-years may turn much of the known universe into a computer, but its might is still limited by its total entropy $2^{2^{8.25}}\!\!=\!Y^2$. Squaring matters!} \vfill{\bf Faster PPM Simulations.} Parallel Bubble-Sort on a CA or Merge-Sort on a sequential FCM takes nearly linear time. A parallel FCM can do much better [Ofman 65]. It represents and updates pointer graphs as the above TM does. All steps are straightforward to do locally in parallel polylog time, except the sorting of pointers. We need to create a fixed-connection sorting network.
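The pulling stage done by sorting can be sketched as follows. The triple representation \verb|(source, color, sink)| and the helper \verb|pull| are assumptions for illustration; the point is that after sorting one copy of the pointer list by sink and one by source, the two legs of every 2-pointer path sit together, so the double-colored pointers can be produced by local scans.

```python
# Sketch (assumed representation): pointers as (source, color, sink).
# Sorting brings the two legs of each 2-pointer path together; here the
# merge-join is done with a grouped dict for brevity.
from itertools import groupby

def pull(pointers):
    """For every 2-pointer path s -c1-> m -c2-> t, emit the
    double-colored pointer (s, (c1, c2), t)."""
    second = sorted(pointers, key=lambda p: p[0])       # legs by source
    by_src = {k: list(g) for k, g in groupby(second, key=lambda p: p[0])}
    out = []
    for s, c1, m in sorted(pointers, key=lambda p: p[2]):  # legs by sink
        for _, c2, t in by_src.get(m, []):
            out.append((s, (c1, c2), t))
    return sorted(out)

ptrs = [('a', 'x', 'b'), ('b', 'y', 'c'), ('b', 'z', 'a')]
print(pull(ptrs))
# → [('a', ('x', 'y'), 'c'), ('a', ('x', 'z'), 'a'), ('b', ('z', 'x'), 'b')]
```

A final sort by source then moves each new pointer back to its source, matching the tape layout described above.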
Sophisticated networks sort arbitrary arrays of integers in log time; we need only a simpler polylog method. Merge-Sort splits an array of two or more entries into two halves and sorts each recursively. Batcher-Merge combines two sorted lists in {\em parallel} log time.
% Errata in SIGACT NEWS 28(2):80, June 1997 (submitted 1996):
\vfill{\bf Batcher Merge:} Call array entries {\em $i$-th partners} when their addresses differ only in the $i$-th bit. Operations on partners can be implemented on a {\em Shuffle Exchange} graph of $2^k$ nodes. Each node has pointers to its $k$-th partner and to and from its {\em shift} node, obtained by moving its first address bit to the end. A {\em bitonic cycle} is the combination of two sorted arrays (one may be shorter), connected by max-to-max and min-to-min entries. We merge sorted arrays by appending the reversed second one to the first, treating the last and first entries as neighbors, and sorting the resulting bitonic cycle. The sorting of a bitonic cycle of length $2^k$ proceeds by comparing each entry with its $k$-th partner (i.e., its diametric opposite on the cycle) and switching the two if they are out of order. Each half then becomes a bitonic cycle, and any two entries from different halves are in proper order. The process then repeats for each half recursively (decrementing $k$ through the graph's shift edges).
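The merge just described can be sketched sequentially. This is a sketch only: the real point is that the compare-exchanges at each level touch disjoint partner pairs and so run in parallel log time on the Shuffle Exchange graph; padding to a power of 2 with a sentinel is an added assumption for simplicity (the text allows unequal lengths directly).

```python
# Sketch (sequential, assumed padding): Batcher merge via a bitonic
# sequence, following the recursive partner-comparison scheme above.

def bitonic_sort(a, lo, n):
    """Sort the bitonic segment a[lo:lo+n] in place; n is a power of 2."""
    if n < 2:
        return
    half = n // 2
    for i in range(lo, lo + half):      # compare k-th (diametric) partners
        if a[i] > a[i + half]:
            a[i], a[i + half] = a[i + half], a[i]
    bitonic_sort(a, lo, half)           # each half is again bitonic
    bitonic_sort(a, lo + half, half)

def batcher_merge(xs, ys):
    """Merge two sorted lists: append the reversed second to the first
    (a bitonic sequence), pad to length 2**k, and bitonic-sort."""
    m = len(xs) + len(ys)
    n = 1
    while n < m:
        n *= 2
    big = float('inf')                  # sentinel pad, stripped at the end
    a = xs + (ys + [big] * (n - m))[::-1]   # ascending then descending
    bitonic_sort(a, 0, n)
    return a[:m]

print(batcher_merge([1, 4, 9], [2, 3, 5, 7]))  # → [1, 2, 3, 4, 5, 7, 9]
```

The outer call does one round of $n/2$ independent comparisons and recurses on halves, giving $\log n$ parallel rounds, in contrast with the sequential loop used here.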