Commit 9e93b73

Move orphan status to internals
1 parent d530669 commit 9e93b73

File tree

10 files changed: +490 -453 lines changed

10 files changed

+490
-453
lines changed

doc/concepts/replication/repl_architecture.rst

Lines changed: 10 additions & 0 deletions
@@ -238,3 +238,13 @@ the instances failing in one of the data centers, as well as in case of an
 entire data center failure.
 
 The maximal number of replicas in a mesh is 32.
+
+Orphan status
+-------------
+
+During ``box.cfg()``, an instance will try to join all masters listed
+in :ref:`box.cfg.replication <cfg_replication-replication>`.
+If the instance does not succeed with at least
+the number of masters specified in
+:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
+then it will switch to :ref:`orphan status <internals-replication-orphan_status>`.

doc/dev_guide/internals/index.rst

Lines changed: 2 additions & 3 deletions
@@ -3,9 +3,8 @@
 
 .. _internals:
 
-================================================================================
 Internals
-================================================================================
+=========
 
 .. toctree::
    :maxdepth: 2
@@ -14,4 +13,4 @@ Internals
    msgpack_extensions
    file_formats
    recovery_internals
-   replication_internals
+   replication/index
doc/dev_guide/internals/replication/index.rst

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+:noindex:
+:fullwidth:
+
+.. _internals-replication:
+
+Replication internals
+=====================
+
+.. toctree::
+
+   replication_server_startup
+   orphan
doc/dev_guide/internals/replication/orphan.rst

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
+.. _internals-replication-orphan_status:
+
+Orphan status
+=============
+
+Starting with Tarantool version 1.9, there is a change to the
+procedure when an instance joins a replica set.
+During ``box.cfg()`` the instance will try to join all masters listed
+in :ref:`box.cfg.replication <cfg_replication-replication>`.
+If the instance does not succeed with at least
+the number of masters specified in
+:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
+then it will switch to **orphan status**.
+While an instance is in orphan status, it is read-only.
+
+To "join" a master, a replica instance must "connect" to the
+master node and then "sync".
+
+"Connect" means contact the master over the physical network
+and receive acknowledgment. If there is no acknowledgment after
+:ref:`box.replication_connect_timeout <cfg_replication-replication_connect_timeout>`
+seconds (usually 4 seconds), and retries fail, then the connect step fails.
+
+"Sync" means receive updates
+from the master in order to make a local database copy.
+Syncing is complete when the replica has received all the
+updates, or at least has received enough updates that the replica's lag
+(see :ref:`replication.upstream.lag <box_info_replication_upstream_lag>`
+in ``box.info()``)
+is less than or equal to the number of seconds specified in
+:ref:`box.cfg.replication_sync_lag <cfg_replication-replication_sync_lag>`.
+If ``replication_sync_lag`` is unset (nil) or set to TIMEOUT_INFINITY, then
+the replica skips the "sync" state and switches to "follow" immediately.
+
+In order to leave orphan mode, you need to sync with a sufficient number
+(:ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`) of
+instances. To do so, you may either:
+
+* Set :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
+  to a lower value.
+* Reset ``box.cfg.replication`` to exclude instances that cannot be reached
+  or synced with.
+* Set ``box.cfg.replication`` to ``""`` (empty string).
+
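The three exits from orphan mode listed above can be sketched at the Tarantool console. This is a sketch, not part of the commit; the URI and credentials are hypothetical placeholders:

```lua
-- Orphan instances are read-only; box.info shows the current state.
print(box.info.status)   -- 'orphan' while the quorum is not reached
print(box.info.ro)       -- true while in orphan status

-- Option 1: lower the quorum so the reachable masters suffice.
box.cfg{replication_connect_quorum = 1}

-- Option 2: keep only the masters that can actually be reached
-- (placeholder URI).
box.cfg{replication = {'replicator:password@192.168.0.101:3301'}}

-- Option 3: give up on replication entirely (empty string).
box.cfg{replication = ''}
```

Any one of the three calls is enough; the instance re-evaluates the quorum on every ``box.cfg{}`` call.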
+The following situations are possible.
+
+.. _replication-leader:
+
+**Situation 1: bootstrap**
+
+Here ``box.cfg{}`` is being called for the first time.
+A replica is joining but no replica set exists yet.
+
+1. Set status to 'orphan'.
+2. Try to connect to all nodes from ``box.cfg.replication``,
+   or to the number of nodes required by
+   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
+   Retrying up to 3 times in 30 seconds is possible because this is bootstrap,
+   so :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`
+   is overridden.
+
+3. Abort and throw an error if not connected to all nodes in ``box.cfg.replication``
+   or to the number required by
+   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`.
+
+4. This instance might be elected as the replica set 'leader'.
+   Criteria for electing a leader include vclock value (largest is best)
+   and whether it is read-only or read-write (read-write is best unless there is no other choice).
+   The leader is the master that other instances must join.
+   The leader is the master that executes
+   :doc:`box.once() </reference/reference_lua/box_once>` functions.
+
+5. If this instance is elected as the replica set leader,
+   then perform an "automatic bootstrap":
+
+   a. Set status to 'running'.
+   b. Return from ``box.cfg{}``.
+
+   Otherwise this instance will be a replica joining an existing replica set, so:
+
+   a. Bootstrap from the leader.
+      See examples in section :ref:`Bootstrapping a replica set <replication-bootstrap>`.
+   b. In background, sync with all the other nodes in the replica set.
+
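For reference, the bootstrap sequence above starts from an instance file along these lines. The URIs, credentials, and port are illustrative placeholders, not part of this commit:

```lua
-- instance.lua: first-time bootstrap of a three-member replica set.
-- box.cfg{} blocks until the connect and sync steps above finish or fail.
box.cfg{
    listen      = 3301,
    replication = {'replicator:password@192.168.0.101:3301',
                   'replicator:password@192.168.0.102:3301',
                   'replicator:password@192.168.0.103:3301'},
    replication_connect_quorum = 2,  -- two reachable masters are enough
}
-- After box.cfg{} returns, box.info.status reports the outcome:
-- 'running' if this instance was elected leader, 'orphan' if the
-- quorum was not reached.
print(box.info.status)
```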
+**Situation 2: recovery**
+
+Here ``box.cfg{}`` is not being called for the first time.
+It is being called again in order to perform recovery.
+
+1. Perform :ref:`recovery <internals-recovery_process>` from the last local
+   snapshot and the WAL files.
+
+2. Connect to at least
+   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`
+   nodes. If this fails, set status to 'orphan'.
+   (Attempts to sync will continue in the background and, when/if they succeed,
+   'orphan' will be changed to 'connected'.)
+
+3. If connected, sync with all connected nodes until the difference is not more than
+   :ref:`replication_sync_lag <cfg_replication-replication_sync_lag>` seconds.
+
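The background sync in the recovery case can be observed from the console. A small monitoring sketch (assumes a running Tarantool instance; not part of the commit):

```lua
local fiber = require('fiber')

-- Wait until the background sync lifts the orphan status.
while box.info.status == 'orphan' do
    fiber.sleep(0.1)
end

-- Inspect every upstream connection and its lag, as used in step 3.
for id, replica in pairs(box.info.replication) do
    if replica.upstream then
        print(id, replica.upstream.status, replica.upstream.lag)
    end
end
```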
+.. _replication-configuration_update:
+
+**Situation 3: configuration update**
+
+Here ``box.cfg{}`` is not being called for the first time.
+It is being called again because some replication parameter
+or something in the replica set has changed.
+
+1. Try to connect to all nodes from ``box.cfg.replication``,
+   or to the number of nodes required by
+   :ref:`replication_connect_quorum <cfg_replication-replication_connect_quorum>`,
+   within the time period specified in
+   :ref:`replication_connect_timeout <cfg_replication-replication_connect_timeout>`.
+
+2. Try to sync with the connected nodes,
+   within the time period specified in
+   :ref:`replication_sync_timeout <cfg_replication-replication_sync_timeout>`.
+
+3. If the earlier steps fail, change status to 'orphan'.
+   (Attempts to sync will continue in the background and, when/if they succeed,
+   the 'orphan' status will end.)
+
+4. If the earlier steps succeed, set status to 'running' (master) or 'follow' (replica).
+
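The configuration-update path is triggered simply by calling ``box.cfg{}`` again with changed replication settings, for example (placeholder URIs; a sketch, not part of the commit):

```lua
-- Swap one master for another and give the connect and sync steps
-- explicit time budgets (steps 1 and 2 above).
box.cfg{
    replication = {'replicator:password@192.168.0.101:3301',
                   'replicator:password@192.168.0.103:3301'},
    replication_connect_timeout = 10,   -- seconds for the connect step
    replication_sync_timeout    = 300,  -- seconds for the sync step
}
```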
+.. _replication-configuration_rebootstrap:
+
+**Situation 4: rebootstrap**
+
+Here ``box.cfg{}`` is not being called. The replica connected successfully
+at some point in the past and is now ready for an update from the master,
+but the master cannot provide an update.
+This can happen by accident, or more likely because the replica
+is slow (its :ref:`lag <cfg_replication-replication_sync_lag>` is large)
+and the WAL (.xlog) files containing the
+updates have been deleted. This is not crippling. The replica can discard
+what it received earlier and then ask for the master's latest snapshot
+(.snap) file contents. Since it is effectively going through the bootstrap
+process a second time, this is called "rebootstrapping". However, there has
+to be one difference from an ordinary bootstrap: the replica's
+:ref:`replica id <replication-replica-id>` will remain the same.
+If it changed, the master would think that the replica is a
+new addition to the cluster and would maintain a record of an
+instance ID of a replica that has ceased to exist. Rebootstrapping was
+introduced in Tarantool version 1.10.2 and is completely automatic.
doc/dev_guide/internals/replication_internals.rst renamed to doc/dev_guide/internals/replication/replication_server_startup.rst

Lines changed: 1 addition & 2 deletions
@@ -1,5 +1,4 @@
-.. _internals-replication:
-.. _replication-server_startup:
+.. _internals-replication-server_startup:
 
 Server startup with replication
 ===============================

doc/how-to/replication/repl_add_instances.rst

Lines changed: 1 addition & 149 deletions
@@ -3,7 +3,7 @@
 Adding instances
 ================
 
-.. _replication-add_replica:
+.. _replication-add_replica:
 
 This tutorial is intended as a follow-up to the
 :ref:`replication bootstrapping <replication-bootstrap>` guide.
@@ -164,151 +164,3 @@ read-only mode for this instance:
 We also recommend to specify master #3 URI in all instance files in order to
 keep all the files consistent with each other and with the current replication
 topology.
-
-.. _replication-orphan_status:
-
-Orphan status
--------------
-
(The remaining 142 deleted lines are the orphan status section, moved verbatim
to doc/dev_guide/internals/replication/orphan.rst; see its diff above.)

locale/ru/LC_MESSAGES/concepts/replication/repl_architecture.po

Lines changed: 18 additions & 0 deletions
@@ -450,3 +450,21 @@ msgstr ""
 
 msgid "The maximal number of replicas in a mesh is 32."
 msgstr "Максимальное количество реплик в ячейке -- 32."
+
+msgid "Orphan status"
+msgstr "Статус orphan (одиночный)"
+
+msgid ""
+"During ``box.cfg()``, an instance will try"
+" to join all masters listed in :ref:`box.cfg.replication <cfg_replication-"
+"replication>`. If the instance does not succeed with at least the number of "
+"masters specified in :ref:`replication_connect_quorum <cfg_replication-"
+"replication_connect_quorum>`, then it will switch to "
+":ref:`orphan status <replication-orphan_status>`."
+msgstr ""
+"Во время ``box.cfg()`` экземпляр пытается подключиться "
+"ко всем мастерам, указанным в :ref:`box.cfg.replication <cfg_replication-"
+"replication>`. Если не было успешно выполнено подключение к количеству "
+"мастеров, указанному в :ref:`replication_connect_quorum <cfg_replication-"
+"replication_connect_quorum>`, "
+"экземпляр переходит в :ref:`статус orphan (одиночный) <internals-replication-orphan_status>`."
