% !TeX root = Body.tex
%\todo{Add the Metropolis-matrix analysis for the $2\times 2$ case as well.}
\chapter{Analysis based on Stochastic Matrices}\label{chap:ProofEx}
In this appendix, we prove the existence of the non-equilibrium stationary state in our model with sliding of arbitrary velocity. In the first section, we discuss the formulation in terms of stochastic matrices. In the second section, following Ref.~\cite{Hara2011}, we prove several facts about stochastic matrices which ensure that almost every Monte Carlo simulation converges to a unique stationary state. In the last section, we propose a way to construct the matrix for both equilibrium cases and non-equilibrium stationary cases, and discuss the distributions of its eigenvalues in terms of the convergence.
\section{A Simple Example: Stochastic Ising Model with $N$ Spins}
We now consider a matrix form of the stochastic process. For example, the one-dimensional Ising chain with $N$ spins has $2^{N}$ states. If we label the states by $i=1,2,\dots,2^{N}$, we can describe the stochastic time evolution of the system by the occupation probability $p_{i}(t)$ that the system is in the $i$th state at a time $t$ and the transition probability $T_{ij}$ that the system in the $j$th state changes to the $i$th state.
We additionally define the joint probability $\tilde{p}_{ij}(t+1)$ that the system is in the $j$th state at a time $t$ \textit{and} in the $i$th state at the next time $t+1$. This joint probability relates the occupation probability $p_{j}(t)$ and the transition probability $T_{ij}$ as
\begin{align}
\tilde{p}_{ij}(t + 1) = T_{ij} p_{j}(t)\quad\text{for $1\leq i,j\leq 2^{N}$}.
\end{align}
From the properties of probability, it should hold that $\sum_{i=1}^{2^{N}}p_{i}(t)=1$ and $p_{i}(t) \ge 0$ ($i=1,2,\dots,2^{N}$, $t\in\mathbb{N}$). The joint probability $\tilde{p}_{ij}(t+1)$ should also satisfy $\sum_{j=1}^{2^{N}}\tilde{p}_{ij}(t+1) = p_{i}(t+1)$ ($i=1,2,\dots,2^{N}$, $t\in\mathbb{N}$). Then we have
\begin{align}
p_{i}(t+1) = \sum_{j=1}^{2^{N}}\tilde{p}_{ij}(t + 1) = \sum_{j=1}^{2^{N}}T_{ij}p_{j}(t)\quad\text{for $1\leq i\leq 2^{N}$, $t\in\mathbb{N}$}.
\end{align}
In other words, the system can be described by the probability vector
\begin{align}
\bm{p}(t):={}^{\rm t}\left(p_{1}(t),p_{2}(t),\dots,p_{2^{N}}(t)\right)
\end{align}
and the stochastic matrix $\hat{T}:=\left(T_{ij}\right)$ as
\begin{align}
\bm{p}(t + 1) = \hat{T}\bm{p}(t)\quad\text{for $t\in\mathbb{N}$}.
\end{align}
We have to consider such a matrix when we discuss the convergence to a stationary state or its uniqueness. The discussion below is valid for a general state space of dimension $\Omega$, and thus we denote the number of states by $\Omega$ from now on.
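Before turning to the general theory, the following minimal Python sketch (not part of the thesis code; the matrix here is a random placeholder rather than a physical one) illustrates the vector--matrix formulation: a stochastic matrix, each column of which sums to unity, conserves the total probability under $\bm{p}(t+1)=\hat{T}\bm{p}(t)$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
Omega = 2**3                       # e.g. N = 3 spins -> 2^N = 8 states

# A generic column-stochastic matrix: T[i, j] >= 0 and sum_i T[i, j] = 1.
T = rng.random((Omega, Omega))
T /= T.sum(axis=0)                 # normalize every column

p = np.zeros(Omega)
p[0] = 1.0                         # start from the first state with certainty

for t in range(100):
    p = T @ p                      # p(t+1) = T p(t)
    assert np.isclose(p.sum(), 1.0)    # total probability is conserved

print(p)                           # approaches a stationary distribution
\end{verbatim}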
\section{General Theory of Stochastic Matrices}
In this section we discuss the conditions which ensure the convergence of a Monte Carlo simulation to a unique stationary state. We first define the stochastic matrix and discuss its fundamental properties. We next discuss properties that result from additional conditions called \textit{weak/strong connectivity}, and thereby show the existence and the uniqueness of a stationary state. We also see that we can construct a stochastic matrix which leads to any desired stationary state under the \textit{detailed balance condition}.
From the conditions $\sum_{i=1}^{\Omega}p_{i}(t)=1$ and $p_{i}(t) \ge 0$ ($i=1,2,\dots,\Omega$, $t\in\mathbb{N}$), we have the set of properties $\sum_{i=1}^{\Omega}T_{ij} = 1$ and $T_{ij}\ge 0$ ($1\leq i,j\leq \Omega$). Any matrix with these properties is called a \textit{stochastic matrix} and satisfies the following:
\begin{lemma}
Let $\hat{T}$ be a stochastic matrix. Then every eigenvalue of $\hat{T}$ has absolute value less than or equal to unity. For any eigenvector $\bm{x}={}^{\rm t}\left(x_{1},x_{2},\dots,x_{\Omega}\right)$ which does \textit{not} belong to the eigenvalue $1$, it additionally holds that
\begin{align}
\sum_{i=1}^{\Omega}x_{i}=0.
\end{align}
\end{lemma}
We now define the vector $\bm{d}:={}^{\rm t}\left(1,1,\dots,1\right)$, which we use throughout the proofs below.
\begin{proof}
For any stochastic matrix $\hat{T}$, we have
\begin{align}
&\left({}^{\rm t}\hat{T}\bm{d}\right)_{i}=\sum_{j=1}^{\Omega}\left({}^{\rm t}\hat{T}\right)_{ij}d_{j}=\sum_{j=1}^{\Omega}T_{ji}d_{j}=\sum_{j=1}^{\Omega}T_{ji}=1\quad\text{for $i=1,2,\dots,\Omega$},\\
\Longleftrightarrow\quad &{}^{\rm t}\hat{T}\bm{d} = \bm{d}.
\end{align}
Therefore the matrix ${}^{\rm t}\hat{T}$ has the eigenvalue $1$. The characteristic equation of ${}^{\rm t}\hat{T}$ is rewritten as
\begin{align}
\det\left[\lambda \hat{I}_{\Omega} - {}^{\rm t}\hat{T}\right] = \det\left[{}^{\rm t}\left(\lambda \hat{I}_{\Omega} - \hat{T}\right)\right] = \det\left[\lambda \hat{I}_{\Omega} - \hat{T}\right],
\end{align}
and then the set of eigenvalues of $\hat{T}$ is equal to that of ${}^{\rm t}\hat{T}$. Therefore the matrix $\hat{T}$ also has the eigenvalue $1$. A general eigenvalue equation of $\hat{T}$ can be written as
\begin{align}
\hat{T}\bm{x}_{\lambda} = \lambda\bm{x}_{\lambda}\label{eq:GenEigT},
\end{align}
where $\bm{x}_{\lambda}={}^{\rm t}\left(x_{\lambda,1},x_{\lambda,2},\dots,x_{\lambda,\Omega}\right)$ is the eigenvector belonging to the eigenvalue $\lambda$. We have
\begin{align}
&\left((\text{l.h.s.\ of \eqref{eq:GenEigT}}),\bm{d}\right) = (\hat{T}\bm{x}_{\lambda},\bm{d}) = (\bm{x}_{\lambda},{}^{\rm t}\hat{T}\bm{d}) = (\bm{x}_{\lambda},\bm{d}),\\
&\left((\text{r.h.s.\ of \eqref{eq:GenEigT}}),\bm{d}\right) = (\lambda\bm{x}_{\lambda},\bm{d}) = \lambda(\bm{x}_{\lambda},\bm{d}).\\
\Longrightarrow \quad& (1-\lambda)(\bm{x}_{\lambda},\bm{d}) = 0\quad\Longleftrightarrow \quad \lambda = 1\text{ or }(\bm{x}_{\lambda},\bm{d}) = 0.\\
\Longrightarrow \quad& \sum_{i=1}^{\Omega}x_{\lambda,i} = 0\quad\text{if $\lambda \neq 1$}.
\end{align}
We additionally define the vector $\bm{y}_{\lambda}:={}^{\rm t}\left(|x_{\lambda,1}|,|x_{\lambda,2}|,\dots,|x_{\lambda,\Omega}|\right)$ for any $\lambda$. From the equation $\sum_{j=1}^{\Omega}T_{ij}x_{\lambda,j}=\lambda x_{\lambda,i}$ ($i=1,2,\dots,\Omega$), we have
\begin{align}
\left|\sum_{j=1}^{\Omega}T_{ij}x_{\lambda,j}\right| &\leq \sum_{j=1}^{\Omega}T_{ij}|x_{\lambda,j}|\quad(\because T_{ij}\geq 0\quad\text{for $j=1,2,\dots,\Omega$})\\
&=\left(\hat{T}\bm{y}_{\lambda}\right)_{i}\quad\text{for $i=1,2,\dots,\Omega$}\label{ineq:Ty}.
\end{align}
The left-hand side of \eqref{ineq:Ty} is rewritten as
\begin{align}
\left|\sum_{j=1}^{\Omega}T_{ij}x_{\lambda,j}\right| = |\lambda x_{\lambda,i}| = |\lambda|\times |x_{\lambda,i}| = |\lambda|\times \left(\bm{y}_{\lambda}\right)_{i},
\end{align}
and thus we have
\begin{align}
& |\lambda|\times \left(\bm{y}_{\lambda}\right)_{i} \leq \left(\hat{T}\bm{y}_{\lambda}\right)_{i}\quad\text{for $i=1,2,\dots,\Omega$},\\
\Longrightarrow \quad& |\lambda|\times \left(\bm{y}_{\lambda},\bm{d}\right) \leq \left(\hat{T}\bm{y}_{\lambda},\bm{d}\right) = \left(\bm{y}_{\lambda},{}^{\rm t}\hat{T}\bm{d}\right) = \left(\bm{y}_{\lambda},\bm{d}\right), \\
\Longleftrightarrow \quad& |\lambda| \leq 1\quad\left(\because \left(\bm{y}_{\lambda},\bm{d}\right)=\|\bm{y}_{\lambda}\|_{1}>0\right).
\end{align}
\end{proof}
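The statement of the lemma is easy to check numerically. A short sketch with a randomly generated stochastic matrix (an arbitrary illustrative choice, not one from the thesis) is, for instance:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
Omega = 16
T = rng.random((Omega, Omega))
T /= T.sum(axis=0)                            # column-stochastic

eigvals, eigvecs = np.linalg.eig(T)
assert np.all(np.abs(eigvals) <= 1 + 1e-12)   # |lambda| <= 1

for lam, x in zip(eigvals, eigvecs.T):
    if not np.isclose(lam, 1.0):
        assert np.isclose(x.sum(), 0.0, atol=1e-8)   # sum_i x_i = 0
\end{verbatim}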
\begin{definition}
For arbitrary $1\leq i,j\leq \Omega$, if there exists an integer $n(i,j)>0$ such that
\begin{align}
\left(\hat{T}^{n(i,j)}\right)_{ij}>0,
\end{align}
the matrix $\hat{T}$ is called \textit{weakly connected}. Note that for $n'>n(i,j)$ it does \textit{not} necessarily follow that $\left(\hat{T}^{n'}\right)_{ij}>0$.
\end{definition}
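A cyclic permutation matrix is a handy example of the note above: every pair $(i,j)$ admits some $n(i,j)$, yet no single power has all entries positive. A small sketch (the matrix is an illustrative choice, not one from the thesis):
\begin{verbatim}
import numpy as np

# Cyclic shift on three states: a stochastic (permutation) matrix.
P = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])

# For every pair (i, j) some power has a positive (i, j) entry ...
reached = np.zeros((3, 3), dtype=bool)
for n in range(1, 4):
    reached |= np.linalg.matrix_power(P, n) > 0
assert reached.all()               # P is weakly connected

# ... but (P^{n'})_{ij} vanishes again for other n', as the note warns:
assert (np.linalg.matrix_power(P, 4) > 0).sum() == 3   # only 3 of 9 entries
\end{verbatim}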
We limit the class of stochastic matrices to that of weakly connected ones from now on. To make the proofs of the theorems below easier, we also define the matrix $\hat{\mathcal{T}}_{\epsilon}$ ($\epsilon>0$) and discuss its properties. Denoting the maximum value of $n(i,j)$ by $\displaystyle n_{\rm max}:=\max_{1\leq i,j\leq \Omega}\left[n(i,j)\right]$ and defining the matrix $\hat{\mathcal{T}}_{\epsilon}:=\left(\hat{I}_{\Omega}+\epsilon \hat{T}\right)^{n_{\rm max}}$, we have
\begin{align}
\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij} =& \left(\left(\hat{I}_{\Omega}+\epsilon \hat{T}\right)^{n_{\rm max}}\right)_{ij} = \sum_{k=0}^{n_{\rm max}}\binom{n_{\rm max}}{k}\epsilon^{k}\left(\hat{T}^{k}\right)_{ij}\\
>& 0\quad\text{for $1\leq i,j\leq \Omega$},
\end{align}
because every term is non-negative ($T_{ij}\geq 0$) and the term with $k=n(i,j)\leq n_{\rm max}$ is strictly positive by the weak connectivity.
For the eigenvector $\bm{x}_{1}={}^{\rm t}\left(x_{1,1},x_{1,2},\dots,x_{1,\Omega}\right)$, which belongs to the eigenvalue $1$, it holds that
\begin{align}
\hat{\mathcal{T}}_{\epsilon}\bm{x}_{1} &= \sum_{k=0}^{n_{\rm max}}\binom{n_{\rm max}}{k}\epsilon^{k}\hat{T}^{k}\bm{x}_{1}\\
&= \sum_{k=0}^{n_{\rm max}}\binom{n_{\rm max}}{k}\epsilon^{k}\bm{x}_{1}\\
&= (1+\epsilon)^{n_{\rm max}}\bm{x}_{1},
\end{align}
and each component is
\begin{align}
\sum_{j=1}^{\Omega}\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij}x_{1,j} = (1+\epsilon)^{n_{\rm max}}x_{1,i}\quad\text{for $i=1,2,\dots,\Omega$}\label{eq:EigEqTcal}.
\end{align}
\begin{lemma}
The eigenvector $\bm{x}_{1}$ can be decomposed into a phase factor and a positive vector as follows:
\begin{align}
\bm{x}_{1} = \mathrm{e}^{i\theta}\bm{u}_{1},
\end{align}
where $\theta$ is a phase and $\bm{u}_{1}$ is a vector whose components are all positive.
\end{lemma}
\begin{proof}
Suppose that the phases of the components of $\bm{x}_{1}$ are \textit{not} all equal, so that $\sum_{i=1}^{\Omega}|x_{1,i}|>|\sum_{i=1}^{\Omega}x_{1,i}|$ holds. Since all elements of $\hat{\mathcal{T}}_{\epsilon}$ are strictly positive, we then have
\begin{align}
(1+\epsilon)^{n_{\rm max}}|x_{1,i}| = \left|\sum_{j=1}^{\Omega}\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij}x_{1,j}\right| < \sum_{j=1}^{\Omega}\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij}|x_{1,j}|\quad\text{for $i=1,2,\dots,\Omega$}.
\end{align}
On the other hand, the column-wise sums of the matrix $\hat{\mathcal{T}}_{\epsilon}$ are
\begin{align}
\sum_{i=1}^{\Omega}\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij} = \sum_{k=0}^{n_{\rm max}}\binom{n_{\rm max}}{k}\epsilon^{k}\sum_{i=1}^{\Omega}\left(\hat{T}^{k}\right)_{ij} = (1+\epsilon)^{n_{\rm max}}.
\end{align}
Summing the strict inequality over $i$ and using the column-wise sums, we have
\begin{align}
(1+\epsilon)^{n_{\rm max}}\sum_{i=1}^{\Omega}|x_{1,i}| < \sum_{i=1}^{\Omega}\sum_{j=1}^{\Omega}\left(\hat{\mathcal{T}}_{\epsilon}\right)_{ij}|x_{1,j}| = (1+\epsilon)^{n_{\rm max}}\sum_{j=1}^{\Omega}|x_{1,j}|,
\end{align}
which is a contradiction. Therefore all components of $\bm{x}_{1}$ share a common phase, and we can write $\bm{x}_{1}=\mathrm{e}^{i\theta}\bm{u}_{1}$ with $u_{1,i}=|x_{1,i}|\geq 0$. Furthermore, since all elements of $\hat{\mathcal{T}}_{\epsilon}$ are strictly positive and $\bm{u}_{1}\neq\bm{0}$, the left-hand side of \eqref{eq:EigEqTcal} applied to $\bm{u}_{1}$ is strictly positive, and therefore $u_{1,i}>0$ ($i=1,2,\dots,\Omega$).
\end{proof}
\begin{lemma}
The eigenspace of the matrix $\hat{T}$ which belongs to the eigenvalue $1$ is \textit{one-dimensional}.
\end{lemma}
\begin{proof}
Suppose that there were two linearly independent eigenvectors belonging to the eigenvalue $1$. By the previous lemma, we could then write their eigenequations in terms of two non-parallel \textit{positive vectors} as
\begin{align}
\hat{T}\bm{u}_{1} = \bm{u}_{1},\\
\hat{T}\bm{v}_{1} = \bm{v}_{1}.
\end{align}
For any linear superposition of them, we also have
\begin{align}
\hat{T}(\bm{u}_{1} + t\bm{v}_{1}) = \bm{u}_{1} + t\bm{v}_{1},\quad\text{for any $t\in\mathbb{R}$}.
\end{align}
Since $\bm{u}_{1}$ and $\bm{v}_{1}$ are not parallel to each other, we can choose a value of $t$ (for example $t=-\min_{l}\left[u_{1,l}/v_{1,l}\right]$) such that $\left(\bm{u}_{1} + t\bm{v}_{1}\right)_{l} = 0$ for some $l$ while $\bm{u}_{1} + t\bm{v}_{1}\neq\bm{0}$. The vector $\bm{u}_{1}+t\bm{v}_{1}$ is then a non-zero eigenvector belonging to the eigenvalue $1$ with a vanishing component, which contradicts the previous lemma stating that all components of such an eigenvector are strictly positive up to a phase. Therefore the eigenspace belonging to the eigenvalue $1$ is one-dimensional.
\end{proof}
\begin{definition}\label{def:StrongConn}
If there exists an integer $N_{0}>0$ such that
\begin{align}
\left(\hat{T}^{N_{0}}\right)_{ij}>0
\end{align}
for an arbitrary $1\leq i,j\leq \Omega$, the matrix $\hat{T}$ is called \textit{strongly connected}.
\end{definition}
We additionally limit the class of stochastic matrices to that of strongly connected ones from now on.
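The cyclic-shift example above is weakly but not strongly connected; mixing in a ``lazy'' rejection step restores strong connectivity. A minimal sketch:
\begin{verbatim}
import numpy as np

P = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])       # weakly but not strongly connected

# A "lazy" version that stays put with probability 1/2:
T = 0.5 * np.eye(3) + 0.5 * P      # still column-stochastic

M = np.linalg.matrix_power(T, 3)   # N0 = 3 works for this matrix
assert (M > 0).all()               # strongly connected in the above sense
\end{verbatim}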
\begin{theorem}
Unity is the only eigenvalue of absolute value $1$.
\end{theorem}
\begin{proof}
Let $\bm{u}_{\lambda} = {}^{\rm t}\left(u_{\lambda,1}, u_{\lambda,2}, \dots, u_{\lambda,\Omega}\right)$ be an eigenvector belonging to an eigenvalue $\lambda$, so that $\hat{T}^{n}\bm{u}_{\lambda}={\lambda}^{n}\bm{u}_{\lambda}$ for any positive integer $n$. We take $n=N_{0}$ of Definition~\ref{def:StrongConn}, so that $\left(\hat{T}^{n}\right)_{ij}>0$ for all $i,j$. The components of the eigenvalue equation are written as
\begin{align}
\sum_{j=1}^{\Omega}\left(\hat{T}^{n}\right)_{ij}u_{\lambda,j} = \lambda^{n} u_{\lambda,i}\quad\text{for $i = 1,2,\dots,\Omega$}.
\end{align}
We can divide conditions for $\lambda$ into the following two cases:
\begin{description}
\item[Case 1: $\sum_{i=1}^{\Omega}|u_{\lambda,i}| > |\sum_{i=1}^{\Omega}u_{\lambda,i}|$,]\mbox{}\\
We have
\begin{align}
&\sum_{j=1}^{\Omega}\left(\hat{T}^{n}\right)_{ij}|u_{\lambda,j}| > \left|\sum_{j=1}^{\Omega}\left(\hat{T}^{n}\right)_{ij}u_{\lambda,j}\right| = |\lambda|^{n}|u_{\lambda,i}|\quad\text{for $i = 1,2,\dots,\Omega$}.\\
\Longrightarrow\quad & \sum_{j=1}^{\Omega}|u_{\lambda,j}| > |\lambda|^{n}\sum_{i=1}^{\Omega}|u_{\lambda,i}|\quad\text{(summing over $i$)}\quad\Longleftrightarrow\quad |\lambda| < 1.
\end{align}
\item[Case 2: $\sum_{i=1}^{\Omega}|u_{\lambda,i}| = |\sum_{i=1}^{\Omega}u_{\lambda,i}|$.]\mbox{}\\
In this case all non-zero components of $\bm{u}_{\lambda}$ share a common phase, so that $\bm{u}_{\lambda}=\mathrm{e}^{i\theta}\bm{w}$ with $w_{i}\geq 0$. Summing the eigenvalue equation $\hat{T}\bm{w}=\lambda\bm{w}$ over the components and using $\sum_{i=1}^{\Omega}T_{ij}=1$, we have
\begin{align}
& \sum_{i=1}^{\Omega}\sum_{j=1}^{\Omega}T_{ij}w_{j} = \sum_{j=1}^{\Omega}w_{j} = \lambda\sum_{i=1}^{\Omega}w_{i}.\\
\Longleftrightarrow\quad & \lambda = 1\quad\left(\because \bm{w}\neq\bm{0},\,w_{i}\geq 0 \Rightarrow \sum_{i=1}^{\Omega}w_{i} > 0\right).
\end{align}
\end{description}
Thus the eigenvalue $1$ is the only eigenvalue of absolute value $1$.
\end{proof}
\begin{lemma}
For any vector $\bm{r}\in\mathbb{C}^{\Omega}$ orthogonal to the vector $\bm{d}$, it holds that $\lim_{N\to\infty}\hat{T}^{N}\bm{r} = \bm{0}$.
\end{lemma}
\begin{proof}
For an arbitrary vector $\bm{r}$, we can decompose it into its real and imaginary parts as $\bm{r}=\bm{r}_{\rm R} + i\bm{r}_{\rm I}$; since $\hat{T}$ is real and acts on the two parts separately, it suffices to consider a real vector $\bm{r}$. The condition $(\bm{r},\bm{d}) = 0$ is then equivalent to $\sum_{i=1}^{\Omega}r_{i} = 0$, and we have
\begin{align}
\sum_{j\in I_{+}}r_{j} + \sum_{j\in I_{-}}r_{j} = 0,
\end{align}
where $I_{\pm}:=\left\{j\mid r_{j}\gtrless 0,1\leq j\leq \Omega\right\}$. Note that $\sum_{j\in I_{+}}r_{j} = \sum_{j\in I_{-}}|r_{j}|$. Thus we have
\begin{align}
\sum_{j\in I_{+}}r_{j} = \sum_{j\in I_{-}}|r_{j}| = \|\bm{r}\|_{1} / 2.
\end{align}
Since $\hat{T}$ is strongly connected, there is an integer $N_{0}$ such that $\left(\hat{T}^{N_{0}}\right)_{ij}>0$ for an arbitrary $1\leq i,j\leq \Omega$. For $N_{0}$, we have
\begin{align}
\left(\hat{T}^{N_{0}}\bm{r}\right)_{i} = \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}r_{j} =& \sum_{j\in I_{+}}\left(\hat{T}^{N_{0}}\right)_{ij}r_{j} - \sum_{j\in I_{-}}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\\
=& \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}| - 2\sum_{j\in I_{-}}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\\
\leq& \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}| - 2\delta_{N_{0}}\sum_{j\in I_{-}}|r_{j}|\\
=& \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}| - \delta_{N_{0}}\|\bm{r}\|_{1}\quad\text{for $i=1,2,\dots,\Omega$},
\end{align}
where $\displaystyle\delta_{N_{0}}:= \min_{1\leq i,j\leq \Omega}\left[\left(\hat{T}^{N_{0}}\right)_{ij}\right]$. Note that $\delta_{N_{0}} > 0$ because $\hat{T}$ is strongly connected. Similarly we have
\begin{align}
\left(\hat{T}^{N_{0}}\bm{r}\right)_{i} = \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}r_{j} =& \sum_{j\in I_{+}}\left(\hat{T}^{N_{0}}\right)_{ij}r_{j} - \sum_{j\in I_{-}}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\\
=& 2\sum_{j\in I_{+}}\left(\hat{T}^{N_{0}}\right)_{ij}r_{j} - \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\\
\geq& 2\delta_{N_{0}}\sum_{j\in I_{+}}r_{j} - \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\\
=& \delta_{N_{0}}\|\bm{r}\|_{1} - \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}|\quad\text{for $i=1,2,\dots,\Omega$}.
\end{align}
Combining them, we have
\begin{align}
\left|\left(\hat{T}^{N_{0}}\bm{r}\right)_{i}\right| \leq \sum_{j=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}|r_{j}| - \delta_{N_{0}}\|\bm{r}\|_{1}\quad\text{for $i=1,2,\dots,\Omega$},
\end{align}
and then it holds that
\begin{align}
\|\hat{T}^{N_{0}}\bm{r}\|_{1} = \sum_{i=1}^{\Omega}\left|\left(\hat{T}^{N_{0}}\bm{r}\right)_{i}\right| \leq \sum_{j=1}^{\Omega}|r_{j}| - \Omega\delta_{N_{0}}\|\bm{r}\|_{1} = (1-\Omega\delta_{N_{0}})\|\bm{r}\|_{1}.
\end{align}
The vector $\hat{T}^{N_{0}}\bm{r}$ is also orthogonal to the vector $\bm{d}$:
\begin{align}
\left(\hat{T}^{N_{0}}\bm{r},\bm{d}\right) = \left(\bm{r},{}^{\rm t}\left(\hat{T}^{N_{0}}\right)\bm{d}\right) = \left(\bm{r},\left({}^{\rm t}\hat{T}\right)^{N_{0}}\bm{d}\right) = \left(\bm{r},\bm{d}\right) = 0.
\end{align}
Then, for any positive integer $l$, we can repeat this argument to obtain
\begin{align}
\|\hat{T}^{N_{0}l}\bm{r}\|_{1} \leq (1-\Omega\delta_{N_{0}})^{l}\|\bm{r}\|_{1}.
\end{align}
Since $\delta_{N_{0}}>0$ and $\Omega\delta_{N_{0}}\leq\sum_{i=1}^{\Omega}\left(\hat{T}^{N_{0}}\right)_{ij}=1$, it holds that $0\leq 1-\Omega\delta_{N_{0}} < 1$, and we have
\begin{align}
&\lim_{l\to\infty}(1-\Omega\delta_{N_{0}})^{l}\|\bm{r}\|_{1} = 0\\
\Longrightarrow\quad&\lim_{l\to\infty}\|\hat{T}^{N_{0}l}\bm{r}\|_{1} = 0.
\end{align}
Since $\|\hat{T}\bm{x}\|_{1}\leq\|\bm{x}\|_{1}$ for any vector $\bm{x}$, the norm $\|\hat{T}^{N}\bm{r}\|_{1}$ is non-increasing in $N$, and thus
\begin{align}
\lim_{N\to\infty}\|\hat{T}^{N}\bm{r}\|_{1} = 0.
\end{align}
\end{proof}
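Numerically the contraction is easy to observe. In the sketch below (with a random dense, hence strongly connected, stochastic matrix as a stand-in), the $1$-norm of a vector orthogonal to $\bm{d}$ is non-increasing and decays geometrically:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
Omega = 8
T = rng.random((Omega, Omega))
T /= T.sum(axis=0)                 # all entries positive: N0 = 1 here

r = rng.standard_normal(Omega)
r -= r.mean()                      # sum_i r_i = 0, i.e. (r, d) = 0

norms = [np.abs(r).sum()]
for _ in range(100):
    r = T @ r
    norms.append(np.abs(r).sum())

assert all(a >= b - 1e-12 for a, b in zip(norms, norms[1:]))  # non-increasing
assert norms[-1] < 1e-6 * norms[0]                            # decays to zero
\end{verbatim}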
\begin{lemma}\label{theo:SupPos}
Any vector $\bm{x}$ can be written as a superposition of the eigenvector $\bm{u}_{1}$ and a vector $\bm{r}$ orthogonal to $\bm{d}$.
\end{lemma}
\begin{proof}
Defining the coefficient $c_{1,\bm{x}}:=(\bm{x},\bm{d})/(\bm{u}_{1},\bm{d})$, which is well defined because $(\bm{u}_{1},\bm{d})=\|\bm{u}_{1}\|_{1}>0$, and the vector $\bm{r}_{\bm{x}}:=\bm{x} - c_{1,\bm{x}}\bm{u}_{1}$, we have
\begin{align}
(\bm{r}_{\bm{x}},\bm{d}) &= (\bm{x},\bm{d}) - \frac{(\bm{x},\bm{d})}{(\bm{u}_{1},\bm{d})}(\bm{u}_{1},\bm{d}) = 0,\\
\bm{x} &= c_{1,\bm{x}}\bm{u}_{1} + \bm{r}_{\bm{x}}.
\end{align}
\end{proof}
\begin{theorem}
The limit $\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)}$ is independent of the initial vector $\bm{p}^{(0)}$:
\begin{align}
\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)} = \frac{\bm{u}_{1}}{\|\bm{u}_{1}\|_{1}},
\end{align}
where the vector $\bm{p}^{(0)}={}^{\rm t}\left(p^{(0)}_{1}, p^{(0)}_{2}, \dots, p^{(0)}_{\Omega}\right)$ is in the class of probability vectors normalized as $\sum_{i=1}^{\Omega}p^{(0)}_{i} = 1$.
\end{theorem}
\begin{proof}
From Lemma~\ref{theo:SupPos} and $\hat{T}\bm{u}_{1}=\bm{u}_{1}$, we have
\begin{align}
\hat{T}^{N}\bm{p}^{(0)} = c_{1,\bm{p}^{(0)}}\hat{T}^{N}\bm{u}_{1} + \hat{T}^{N}\bm{r}_{\bm{p}^{(0)}} = c_{1,\bm{p}^{(0)}}\bm{u}_{1} + \hat{T}^{N}\bm{r}_{\bm{p}^{(0)}}.
\end{align}
Taking the limit $N\to\infty$ and using the lemma above, we have
\begin{align}
\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)} = c_{1,\bm{p}^{(0)}}\bm{u}_{1} + \lim_{N\to\infty}\hat{T}^{N}\bm{r}_{\bm{p}^{(0)}} = c_{1,\bm{p}^{(0)}}\bm{u}_{1}.
\end{align}
Using the matrix $\hat{A}:=\bm{u}_{1}{}^{\rm t}\bm{d}/(\bm{u}_{1},\bm{d})$, we have
\begin{align}
\left(\hat{A}\bm{p}^{(0)}\right)_{i} = \sum_{j = 1}^{\Omega}\frac{\left(\bm{u}_{1}\right)_{i}d_{j}}{(\bm{u}_{1},\bm{d})}\left(\bm{p}^{(0)}\right)_{j} = \left(\bm{p}^{(0)},\bm{d}\right)\frac{\left(\bm{u}_{1}\right)_{i}}{(\bm{u}_{1},\bm{d})}\quad\text{for $i=1,2,\dots,\Omega$}.
\end{align}
It leads to
\begin{align}
& \hat{A}\bm{p}^{(0)} = \frac{\left(\bm{p}^{(0)},\bm{d}\right)}{(\bm{u}_{1},\bm{d})}\bm{u}_{1} = c_{1,\bm{p}^{(0)}}\bm{u}_{1}.
\end{align}
The limit $N\to\infty$ for $\hat{T}^{N}\bm{p}^{(0)}$ is rewritten by a simple multiplication as follows:
\begin{align}
\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)} = \frac{\bm{u}_{1}{}^{\rm t}\bm{d}}{(\bm{u}_{1},\bm{d})}\bm{p}^{(0)}.\label{for:TNP}
\end{align}
The expression~\eqref{for:TNP} leads to the relation $\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)} = \bm{u}_{1}/\|\bm{u}_{1}\|_{1}$. Indeed, using $\sum_{j=1}^{\Omega}p^{(0)}_{j}=1$ and $(\bm{u}_{1},\bm{d})=\sum_{j=1}^{\Omega}u_{1,j}=\|\bm{u}_{1}\|_{1}$, it holds that
\begin{align}
\left(\frac{\bm{u}_{1}{}^{\rm t}\bm{d}}{(\bm{u}_{1},\bm{d})}\bm{p}^{(0)}\right)_{i} = \frac{\sum_{j=1}^{\Omega}\left(\bm{u}_{1}\right)_{i}d_{j}\left(\bm{p}^{(0)}\right)_{j}}{(\bm{u}_{1},\bm{d})} = \frac{u_{1,i}}{\|\bm{u}_{1}\|_{1}},
\end{align}
and thus we have
\begin{align}
\lim_{N\to\infty}\hat{T}^{N}\bm{p}^{(0)} = \frac{\bm{u}_{1}}{\|\bm{u}_{1}\|_{1}}.
\end{align}
\end{proof}
Once we obtain a strongly connected stochastic matrix $\hat{T}$, its stationary distribution $\bm{u}_{1}/\|\bm{u}_{1}\|_{1}$ is reached independently of the initial distribution.
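The following sketch checks this convergence numerically: two very different initial distributions, iterated with the same random strongly connected stochastic matrix (again an illustrative stand-in), approach the same normalized eigenvector $\bm{u}_{1}/\|\bm{u}_{1}\|_{1}$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
Omega = 8
T = rng.random((Omega, Omega))
T /= T.sum(axis=0)                 # all entries positive: strongly connected

# Stationary distribution from the eigenvector of the eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(T)
u1 = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
u1 /= u1.sum()                     # equals u1 / ||u1||_1 with positive sign

# Two very different initial probability vectors ...
p = np.zeros(Omega); p[0] = 1.0    # localized on the first state
q = np.full(Omega, 1.0 / Omega)    # uniform
for _ in range(200):
    p, q = T @ p, T @ q

# ... converge to the same limit, independent of the initial vector.
assert np.allclose(p, q) and np.allclose(p, u1)
\end{verbatim}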
\section{Construction of the Stochastic Matrix based on the Detailed Balance Condition}
We can also reverse the discussion, namely, \textit{construct} a matrix $\hat{T}'$ with a \textit{desired} stationary distribution $\bm{p}'$. We can formulate such a procedure using the \textit{detailed balance condition} as follows.
\begin{definition}
The matrix $\hat{T}'$ satisfies the detailed balance condition if it holds that
\begin{align}
T'_{ij}p'_{j} = T'_{ji}p'_{i}\quad\text{for $1\leq i,j\leq \Omega$}\label{cond:D.B.C}
\end{align}
for the stationary distribution $\bm{p}'={}^{\rm t}(p'_{1},p'_{2},\dots,p'_{\Omega})$.
\end{definition}
Summing this condition over the index $i$ yields
\begin{align}
\sum_{i=1}^{\Omega}T'_{ij}p'_{j} = p'_{j} = \sum_{i=1}^{\Omega}T'_{ji}p'_{i} = \left(\hat{T}'\bm{p}'\right)_{j}\quad\text{for $j=1,2,\dots,\Omega$}.
\end{align}
Then, if the detailed balance condition holds for a matrix $\hat{T}'$, the corresponding vector $\bm{p}'$ is a fixed point of $\hat{T}'$. Thus, in order to obtain a matrix $\hat{T}'$ which leads an arbitrary distribution $\bm{p}^{(0)}$ to the desired distribution $\bm{p}'$, we only have to construct each element $T'_{ij}$ according to the condition~\eqref{cond:D.B.C} and make the matrix strongly connected.
In Monte Carlo simulations of statistical mechanics, we calculate the equilibrium distribution $\bm{p}_{\rm eq}(\beta)$ at an inverse temperature $\beta$ using an initial state $i_{0}$ and a rule for the stochastic process $i_{0}\to j\to j'\to\dots$. We can regard the matrix formulation as the collection of such stochastic processes over all possible initial states $i_{0}=1,2,\dots,\Omega$.
The condition~\eqref{cond:D.B.C}, or equivalently
\begin{align}
\frac{T_{ij}}{T_{ji}} = \frac{p_{i}}{p_{j}}
\end{align}
is not sufficient to determine a concrete form of the matrix $\hat{T}$, and thus in general there remains a degree of freedom in the form of $\hat{T}$. For Monte Carlo simulations of equilibrium statistical mechanics, this freedom takes the following form:
\begin{align}
\frac{T_{ij}(\beta)}{T_{ji}(\beta)} = \frac{p_{i}(\beta)}{p_{j}(\beta)} = \mathrm{e}^{-\beta (E_{i}-E_{j})}\quad\text{for $1\leq i,j\leq \Omega$}\label{rel:FreedomProb},
\end{align}
where $T_{ij}(\beta)$ and $p_{i}(\beta)$ are the transition matrix of the simulation and the probability that the $i$th state appears, respectively, at a fixed inverse temperature $\beta$. Nonetheless, the relation~\eqref{rel:FreedomProb} is important in that the ratio of transition probabilities is always given by the difference of the energies $\left\{E_{1},E_{2},\dots,E_{\Omega}\right\}$, and thus we do \textit{not} have to calculate the exact values of $\tilde{p}_{i}(\beta):=\exp\left[-\beta E_{i}\right]/Z(\beta)$, where $Z(\beta):=\sum_{i}\exp\left[-\beta E_{i}\right]$ is the partition function.
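A concrete instance of this construction is sketched below: for an arbitrary target distribution $\bm{p}'$ we build a Metropolis-type matrix on the complete graph of states (a simple resolution of the residual freedom, not the one used in the thesis) and verify both the detailed balance condition~\eqref{cond:D.B.C} and the stationarity $\hat{T}'\bm{p}'=\bm{p}'$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
Omega = 6
p_target = rng.random(Omega)
p_target /= p_target.sum()          # any desired stationary distribution

# Propose j -> i uniformly among the other states and accept the move
# with probability min(1, p'_i / p'_j); rejections fill the diagonal.
T = np.zeros((Omega, Omega))
for j in range(Omega):
    for i in range(Omega):
        if i != j:
            T[i, j] = min(1.0, p_target[i] / p_target[j]) / (Omega - 1)
    T[j, j] = 1.0 - T[:, j].sum()   # keeps every column sum equal to 1

# Detailed balance: T_{ij} p'_j = T_{ji} p'_i for all i, j.
flux = T * p_target                 # flux[i, j] = T_{ij} p'_j
assert np.allclose(flux, flux.T)

# Hence p' is a fixed point: T p' = p'.
assert np.allclose(T @ p_target, p_target)
\end{verbatim}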
If we use the Metropolis algorithm for the $N$-spin system, each element of the matrix $T_{ij}(\beta)$ is written as follows:
\begin{align}
T_{ij}(\beta)=
\begin{cases}
\frac{1}{N}\min\left[1,\mathrm{e}^{-\beta (E_{i}-E_{j})}\right]\quad&\text{for states $i\neq j$ mutually reachable by a single spin flip},\\
1-\sum_{k\neq j}T_{kj}(\beta)\quad&\text{for $i=j$},\\
0\quad&\text{otherwise}.
\end{cases}\label{cond:Metropolis}
\end{align}
The factor $1/N$ describes the random selection of the spin to flip, and the diagonal element collects the rejected flips so that each column sums to unity.
We call the matrix with the condition~\eqref{cond:Metropolis} the \textit{Metropolis matrix} and denote its elements by $M_{ij}$ from now on.
Metropolis matrices generally have few non-zero elements because only transitions between configurations that differ by a single spin flip are taken into account. Thus the Monte Carlo simulation with the Metropolis rate is very simple, but its convergence is not trivial. In the following subsection, we show the convergence for a quite limited case using the strong connectivity of the stochastic matrix.
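As an illustration, the sketch below builds the matrix~\eqref{cond:Metropolis} for a small system. A periodic ferromagnetic Ising chain is assumed here as the concrete Hamiltonian; the construction itself only needs the list of energies and the single-flip adjacency.
\begin{verbatim}
import numpy as np

def metropolis_matrix(N, beta, J=1.0):
    """Single-flip Metropolis matrix for a periodic N-spin Ising chain."""
    Omega = 2**N
    conf = np.array([[1 if (s >> k) & 1 else -1 for k in range(N)]
                     for s in range(Omega)])          # all 2^N configurations
    E = -J * np.sum(conf * np.roll(conf, 1, axis=1), axis=1)

    M = np.zeros((Omega, Omega))
    for j in range(Omega):            # column j: transitions out of state j
        for k in range(N):
            i = j ^ (1 << k)          # state reached by flipping the k-th spin
            M[i, j] = min(1.0, np.exp(-beta * (E[i] - E[j]))) / N
        M[j, j] = 1.0 - M[:, j].sum() # rejected flips keep the state
    return M, E

M, E = metropolis_matrix(N=4, beta=0.5)
assert np.allclose(M.sum(axis=0), 1.0)          # column-stochastic

p_eq = np.exp(-0.5 * E); p_eq /= p_eq.sum()     # Boltzmann distribution
assert np.allclose(M * p_eq, (M * p_eq).T)      # detailed balance
assert np.allclose(M @ p_eq, p_eq)              # hence stationary
\end{verbatim}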
\subsection{Metropolis Matrix for the Model of Size $3\times 2$}
We construct the Metropolis matrix $\hat{M}(\beta)$ for the model of size $L_{x}=3$, $L_{z}=2$ in a concrete form and verify that all elements of the sixth power of the matrix are non-zero. From this fact, the existence of a unique stationary state in the long-time limit follows. In addition, we investigate the distribution of the eigenvalues of $\hat{M}(\beta)$ for several temperatures and verify that the matrix $\hat{M}(\beta)$ has only one eigenvalue equal to unity, while all other eigenvalues lie inside the unit circle on the complex plane, except for the case $\beta=0$; see Fig.~\ref{fig:EigDistMbeta}.
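The checks of this subsection can be reproduced in the same spirit with the sketch below, which reuses \texttt{metropolis\_matrix()} from the previous sketch; a periodic $6$-spin chain ($2^{6}=64$ states) merely stands in for the $3\times 2$ model, whose Hamiltonian is not reproduced here.
\begin{verbatim}
import numpy as np
# Assumes metropolis_matrix() from the previous sketch is in scope.

M, _ = metropolis_matrix(N=6, beta=0.1)   # 64-state stand-in, T = 10 (J = 1)

# Strong connectivity: find a power of M that is positive entrywise.
P, n = M.copy(), 1
while not (P > 0).all():
    P, n = P @ M, n + 1
print(f"all entries of M^{n} are positive")

# Exactly one eigenvalue equals unity; the rest lie inside the unit circle.
lam = np.linalg.eigvals(M)
assert np.sum(np.isclose(np.abs(lam), 1.0)) == 1
assert np.all(np.abs(lam) <= 1.0 + 1e-10)
\end{verbatim}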
\begin{figure}[htbp]
\centering
\subcaptionbox{\label{subcap:M1}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM1_B0-1_Detailed.png}}
\subcaptionbox{\label{subcap:M2}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM2_B0-1_Detailed.png}}
\subcaptionbox{\label{subcap:M3}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM3_B0-1_Detailed.png}}
\subcaptionbox{\label{subcap:M4}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM4_B0-1_Detailed.png}}
\subcaptionbox{\label{subcap:M5}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM5_B0-1_Detailed.png}}
\subcaptionbox{\label{subcap:M6}}[0.32\linewidth]{\includegraphics[width=0.30\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/_ArrayPlotM6_B0-1_Detailed.png}}
\caption{The array plots of powers of $\hat{M}(\beta)$ of size $L_{x}=3, L_{z}=2$ and at the temperature $T=10$: (\subref{subcap:M1}) $\hat{M}(\beta)$; (\subref{subcap:M2}) $\left(\hat{M}(\beta)\right)^{2}$; (\subref{subcap:M3}) $\left(\hat{M}(\beta)\right)^{3}$; (\subref{subcap:M4}) $\left(\hat{M}(\beta)\right)^{4}$; (\subref{subcap:M5}) $\left(\hat{M}(\beta)\right)^{5}$; (\subref{subcap:M6}) $\left(\hat{M}(\beta)\right)^{6}$.}
\end{figure}
\begin{figure}[htbp]
\centering
\subcaptionbox{\label{subcap:EigDistM1IT}}[0.32\linewidth]{\includegraphics[width=0.25\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/EigDist_M1_IT.png}}
\subcaptionbox{\label{subcap:EigDistM1B0-1}}[0.32\linewidth]{\includegraphics[width=0.25\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/EigDist_M1_Beta0-1.png}}
\subcaptionbox{\label{subcap:EigDistM1AZ}}[0.32\linewidth]{\includegraphics[width=0.25\linewidth]{../../NumCalc/ClassicalSpinStochMat/PROGRAMS/Mathematica/EigDist_M1_AZ.png}}
\caption{The distributions of eigenvalues for $\hat{M}(\beta)$: (\subref{subcap:EigDistM1IT}) $\beta\to+0$; (\subref{subcap:EigDistM1B0-1}) $\beta=0.1$; (\subref{subcap:EigDistM1AZ}) $\beta\to+\infty$.}\label{fig:EigDistMbeta}
\end{figure}