[guides] guides: Add postgres migration guidelines #6596
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,202 @@ | ||
Migration guidelines from MySQL to PostgreSQL | ||
============================================= | ||
|
||
.. include:: ../_static/badges/allplans-selfhosted.rst | ||
:start-after: :nosearch: | ||
|
||
From Mattermost v8.0, PostgreSQL is our database of choice for Mattermost to enhance the platform’s performance and capabilities. Recognizing the importance of supporting the community members who are interested in migrating from a MySQL database, we have taken proactive measures to provide guidance and best practices. | ||
|
||
To streamline the migration process and alleviate any potential challenges, we have prepared a comprehensive set of guidelines to facilitate a smooth transition. Additionally, we want to offer recommendations for various tools that have proven to be highly effective in simplifying your migration efforts. | ||
|
||
.. note:: | ||
|
||
  These guidelines are in development, and we will update this guide as new information becomes available. The guide does not yet cover migration configurations for plugins such as Focalboard and Playbooks. If your system uses these plugins, we recommend waiting until we add the configurations they need. Please use this guide as a starting point, and always back up your database before starting a migration. | ||
|
||
.. contents:: On this page: | ||
:backlinks: top | ||
:local: | ||
:depth: 1 | ||
|
||
Required tools | ||
-------------- | ||
|
||
- Install ``pgLoader``. See the official `installation | ||
guide <https://pgloader.readthedocs.io/en/latest/install.html>`__. | ||
- Install the ``morph`` CLI by running the following command: | ||
|
||
|
||
- ``go install github.com/mattermost/morph/cmd/morph@v1`` | ||
|
||
- Optionally, install ``dbcmp`` to compare the data after a migration: | ||
|
||
- ``go install github.com/mattermost/dbcmp/cmd/dbcmp@latest`` | ||
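Before continuing, it can help to confirm that the tools listed above are actually reachable from your shell. The following is a minimal sketch, not part of the official tooling; the ``missing_tools`` helper is a hypothetical name introduced here, and the binary names ``pgloader``, ``morph``, ``dbcmp``, and ``go`` are assumed to match your installation.

```shell
#!/bin/sh
# Print the name of every listed tool that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of any Mattermost tool.)
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the tool cannot be resolved
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in this guide; go is needed for the `go install` commands.
missing=$(missing_tools pgloader morph dbcmp go)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
else
    echo "All required tools found."
fi
```

Binaries installed with ``go install`` land in the ``bin`` directory under ``go env GOPATH`` (or in ``GOBIN`` if set), so that directory may need to be added to ``PATH`` if ``morph`` or ``dbcmp`` are reported as missing.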
|
||
Before the migration | ||
-------------------- | ||
|
||
.. note:: | ||
  Migrating with this guide requires a database schema of at least v6.4. If you are running an earlier version and plan to migrate, please update your Mattermost Server to v6.4 or later first. | ||
|
||
|
||
- Back up your MySQL data. | ||
|
||
- Confirm your Mattermost version. See the **About** modal for details. | ||
Review comment: @isacikgoz do we have a min MM version we would recommend? I would certainly like to call out being on a supported ESR.
Reply: I think we can add that, for now it looks like the minimum support version is
Reply: @nab-77 added a note 👍
|
||
- Determine the migration window needed. This process requires you to stop the Mattermost Server during the migration. | ||
- See the `schema-diffs <#schema-diffs>`__ section to ensure data compatibility between schemas. | ||
- Prepare your PostgreSQL environment by creating a database and user. See the `database </install/prepare-mattermost-database.html>`__ documentation for details. | ||
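The backup step in the list above could be sketched as follows. This is only an illustration under stated assumptions: ``backup_command`` is a hypothetical helper that prints a ``mysqldump`` invocation instead of running it, and the user ``mmuser``, the database ``mattermost``, and the output file name are placeholder values.

```shell
#!/bin/sh
# Compose (but do not run) a mysqldump invocation for a pre-migration backup.
# Arguments: MySQL user, database name, output file.
# --single-transaction takes a consistent snapshot of InnoDB tables;
# --password without a value makes mysqldump prompt for the password.
backup_command() {
    printf 'mysqldump --single-transaction --user=%s --password --result-file=%s %s\n' \
        "$1" "$3" "$2"
}

# Placeholder values; replace them with your own.
backup_command mmuser mattermost "mattermost-$(date +%Y%m%d).sql"
```

Printing the command first makes it easy to review the options before running the dump against a production database.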
Review comment: @isacikgoz do we have a preference for pgsql version here? (outside of ensuring its a supported version).
Reply: I think minimum version that we support is fine, but the latest supported version would be our preference as far as I'm concerned. Do you think we should provide a specific version?
|
||
|
||
Prepare target database | ||
----------------------- | ||
|
||
- Clone the ``mattermost`` repository for your specific version: | ||
  ``git clone -b <your current version (e.g. release-7.8)> git@github.com:mattermost/mattermost.git --depth=1`` | ||
- ``cd`` into the ``mattermost`` project directory\*. | ||
- Create a PostgreSQL database using morph CLI with the following command: | ||
|
||
.. code:: bash | ||
|
||
  morph apply up --driver postgres --dsn "postgres://user:pass@localhost:5432/<target_db_name>?sslmode=disable" --path ./db/migrations/postgres --number -1 | ||
|
||
\* As of ``v8``, due to a project reorganization, the migrations directory moved to ``./server/channels/db/migrations/postgres/`` relative to the project root. For these versions, ``cd`` into ``mattermost/server/channels`` instead. | ||
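The version-dependent path can be captured in a small helper. The sketch below is an assumption-laden illustration: ``migrations_path`` and ``morph_command`` are hypothetical names, the cut-over is taken to be major version 8 as described above, and the command is only printed, not executed. Paths are relative to the repository root.

```shell
#!/bin/sh
# Print the PostgreSQL migrations directory, relative to the repository root,
# for a given Mattermost major version (for example 7 or 8).
# Assumption: every release from major version 8 onward uses the new layout.
migrations_path() {
    if [ "$1" -ge 8 ]; then
        echo "./server/channels/db/migrations/postgres"
    else
        echo "./db/migrations/postgres"
    fi
}

# Compose (but do not run) the morph command from this section.
# Arguments: major version, PostgreSQL DSN.
morph_command() {
    printf 'morph apply up --driver postgres --dsn "%s" --path %s --number -1\n' \
        "$2" "$(migrations_path "$1")"
}

# Placeholder DSN; replace user, password, host, and database name.
morph_command 8 "postgres://user:pass@localhost:5432/mattermost?sslmode=disable"
```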
|
||
Schema diffs | ||
------------ | ||
|
||
Due to differences between the two schemas, some manual steps may be required before the migration to keep it error-free. | ||
|
||
|
||
Text to character varying | ||
~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
The Mattermost MySQL schema uses the ``text`` column type in several tables where the PostgreSQL schema uses ``varchar`` with a fixed maximum length. We encourage you to check that the values stored in these columns fit within the PostgreSQL limits listed below. | ||
|
||
================ ================ ===================== | ||
Table Column Data type casting | ||
================ ================ ===================== | ||
Audits Action text -> varchar(512) | ||
Audits ExtraInfo text -> varchar(1024) | ||
ClusterDiscovery HostName text -> varchar(512) | ||
Commands IconURL text -> varchar(1024) | ||
Commands AutoCompleteDesc text -> varchar(1024) | ||
Commands AutoCompleteHint text -> varchar(1024) | ||
Compliances Keywords text -> varchar(512) | ||
Compliances Emails text -> varchar(1024) | ||
FileInfo Path text -> varchar(512) | ||
FileInfo ThumbnailPath text -> varchar(512) | ||
FileInfo PreviewPath text -> varchar(512) | ||
FileInfo Name text -> varchar(256) | ||
FileInfo MimeType text -> varchar(256) | ||
LinkMetadata URL text -> varchar(2048) | ||
RemoteClusters SiteURL text -> varchar(512) | ||
RemoteClusters Topics text -> varchar(512) | ||
Sessions DeviceId text -> varchar(512) | ||
Systems Value text -> varchar(1024) | ||
UploadSessions FileName text -> varchar(256) | ||
UploadSessions Path text -> varchar(512) | ||
================ ================ ===================== | ||
|
||
As the table shows, the schemas differ in several places, and the size constraints of the PostgreSQL schema can cause errors during the migration. Community members have reported overflows in the ``LinkMetadata`` and ``FileInfo`` tables, so we recommend checking these tables in particular. Check whether any data in your MySQL schema exceeds these limits and, where it does, decide whether the offending rows can be deleted. For example, to delete the rows in the ``Audits`` table whose ``Action`` column exceeds the limit, run: | ||
|
||
.. code:: sql | ||
|
||
DELETE FROM mattermost.Audits where LENGTH(Action) > 512; | ||
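
Since the statement above removes data, it may be worth counting the affected rows first. The following sketch generates a counting query for any table, column, and limit from the table in this section. ``overflow_count_sql`` is a hypothetical helper introduced here. Note that it uses ``CHAR_LENGTH`` rather than the ``LENGTH`` used above: in MySQL, ``LENGTH`` counts bytes while ``CHAR_LENGTH`` counts characters, and PostgreSQL ``varchar(n)`` limits are expressed in characters.

```shell
#!/bin/sh
# Print a MySQL query that counts rows whose column value is longer than the
# target varchar limit. Arguments: table, column, character limit.
# (overflow_count_sql is a hypothetical helper for illustration only.)
overflow_count_sql() {
    printf 'SELECT COUNT(*) FROM %s WHERE CHAR_LENGTH(%s) > %s;\n' "$1" "$2" "$3"
}

# A few table/column/limit triples taken from the table in this section.
overflow_count_sql Audits Action 512
overflow_count_sql LinkMetadata URL 2048
overflow_count_sql FileInfo Path 512
```

The generated query can be piped into the ``mysql`` client, for example ``overflow_count_sql Audits Action 512 | mysql -u mmuser -p mattermost``, where the user and database names are placeholders.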
|
||
Full-text indexes | ||
~~~~~~~~~~~~~~~~~ | ||
|
||
Some words in the ``Posts`` and ``FileInfo`` tables can exceed the `limits of the maximum token length <https://www.postgresql.org/docs/11/textsearch-limitations.html>`__ for full-text search indexing. In these cases, we recommend dropping the ``idx_posts_message_txt`` and ``idx_fileinfo_content_txt`` indexes from the PostgreSQL schema before the migration and re-creating them once the migration is complete. | ||
|
||
|
||
To drop indexes, run the following commands before the migration: | ||
|
||
.. code:: sql | ||
|
||
DROP INDEX IF EXISTS idx_posts_message_txt; | ||
DROP INDEX IF EXISTS idx_fileinfo_content_txt; | ||
|
||
Migrate the data | ||
---------------- | ||
|
||
Once the schema is in the desired state, start migrating the **data** by running ``pgLoader``.\*\* | ||
|
||
|
||
\*\* Use the following configuration for the baseline of the data migration: | ||
Review comment: @isacikgoz what are you thoughts on us providing the migration.load template for users to download/edit? Link here maybe?
Reply: The template is actually on the repo, being used by the workflow: https://github.com/mattermost/mattermost/blob/master/server/tests/template.load we can add the link here.
|
||
|
||
.. code:: | ||
|
||
LOAD DATABASE | ||
FROM mysql://{{ .mysql_user }}:{{ .mysql_password }}@mysql:3306/{{ .source_schema }} | ||
INTO pgsql://{{ .pg_user }}:{{ .pg_password }}@postgres:5432/{{ .target_schema }} | ||
|
||
WITH data only, | ||
workers = 8, concurrency = 1, | ||
multiple readers per thread, rows per range = 50000, | ||
create no tables, | ||
create no indexes, | ||
preserve index names | ||
|
||
SET PostgreSQL PARAMETERS | ||
maintenance_work_mem to '128MB', | ||
work_mem to '12MB' | ||
|
||
SET MySQL PARAMETERS | ||
net_read_timeout = '120', | ||
net_write_timeout = '120' | ||
|
||
CAST column Channels.Type to channel_type drop typemod, | ||
column Teams.Type to team_type drop typemod, | ||
column UploadSessions.Type to upload_session_type drop typemod, | ||
column Drafts.Priority to text, | ||
type int when (= precision 11) to integer drop typemod, | ||
type bigint when (= precision 20) to bigint drop typemod, | ||
type text to varchar drop typemod, | ||
type tinyint when (<= precision 4) to boolean using tinyint-to-boolean, | ||
type json to jsonb drop typemod | ||
|
||
MATERIALIZE VIEWS exclude_products | ||
excluding table names matching ~<IR_>, ~<focalboard> | ||
|
||
BEFORE LOAD DO | ||
$$ ALTER SCHEMA public RENAME TO {{ .source_schema }}; $$ | ||
|
||
AFTER LOAD DO | ||
$$ UPDATE {{ .source_schema }}.db_migrations set name='add_createat_to_teamembers' where version=92; $$, | ||
$$ ALTER SCHEMA {{ .source_schema }} RENAME TO public; $$; | ||
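
The configuration above contains ``{{ .name }}`` placeholders that must be replaced with real values before the file can be used. One way to do that, sketched here with ``sed``, is shown below. ``render_load_template`` is a hypothetical helper, the environment variable names are assumptions chosen to mirror the placeholders, and the sample values are placeholders too.

```shell
#!/bin/sh
# Replace the {{ .name }} placeholders of the template with concrete values.
# Reads the template on stdin and writes the rendered file to stdout.
# Assumption: values contain no '|', '&', or backslash characters, which
# would need escaping in the sed replacement.
render_load_template() {
    sed -e "s|{{ \.mysql_user }}|$MYSQL_USER|g" \
        -e "s|{{ \.mysql_password }}|$MYSQL_PASSWORD|g" \
        -e "s|{{ \.source_schema }}|$SOURCE_SCHEMA|g" \
        -e "s|{{ \.pg_user }}|$PG_USER|g" \
        -e "s|{{ \.pg_password }}|$PG_PASSWORD|g" \
        -e "s|{{ \.target_schema }}|$TARGET_SCHEMA|g"
}

# Placeholder values; replace them with your own.
MYSQL_USER=mmuser
MYSQL_PASSWORD=secret
SOURCE_SCHEMA=mattermost
PG_USER=mmuser
PG_PASSWORD=secret
TARGET_SCHEMA=mattermost

# Render a single line of the template as a quick check.
echo 'FROM mysql://{{ .mysql_user }}:{{ .mysql_password }}@mysql:3306/{{ .source_schema }}' \
    | render_load_template
```

With the template saved as ``template.load``, the full file could then be rendered with ``render_load_template < template.load > migration.load``.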
|
||
Once you save this configuration file, e.g. as ``migration.load``, you can run ``pgLoader`` with the following command: | ||
|
||
|
||
.. code:: bash | ||
|
||
  pgloader migration.load > migration.log | ||
|
||
Review comment: Do we want to place a note here to remind users to run the commands to re-create the indexes as noted in lines 113/114?
Reply: Sure, I can add that.
|
||
To re-create the indexes that were removed before the migration, run the following once the migration is complete: | ||
|
||
.. code:: sql | ||
|
||
CREATE INDEX IF NOT EXISTS idx_posts_message_txt ON posts USING gin(to_tsvector('english', message)); | ||
CREATE INDEX IF NOT EXISTS idx_fileinfo_content_txt ON fileinfo USING gin(to_tsvector('english', content)); | ||
|
||
Feel free to contribute to this guide or to report the findings from your migration to us. | ||
|
||
Compare the data | ||
---------------- | ||
|
||
We internally developed a tool to simplify comparing the contents of two databases. The ``dbcmp`` tool compares every table and reports whether the two schemas diverge. | ||
|
||
The tool includes a few flags to run a comparison: | ||
|
||
.. code:: sh | ||
|
||
Usage: | ||
dbcmp [flags] | ||
|
||
Flags: | ||
--exclude strings exclude tables from comparison, takes comma-separated values. | ||
-h, --help help for dbcmp | ||
--source string source database dsn | ||
--target string target database dsn | ||
-v, --version version for dbcmp | ||
|
||
For our case we can simply run the following command: | ||
|
||
.. code:: sh | ||
|
||
dbcmp --source "${MYSQL_DSN}" --target "${POSTGRES_DSN}" --exclude="db_migrations,ir_,focalboard,systems" | ||
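
The ``--exclude`` value is a comma-separated list, so it can be convenient to build it from individual table names, for example to add or drop ``systems`` between runs. The sketch below is illustrative only: ``join_excludes`` is a hypothetical helper, and ``MYSQL_DSN`` and ``POSTGRES_DSN`` are expected to be set by you, since the exact DSN syntax is not covered in this guide. The command is printed rather than executed.

```shell
#!/bin/sh
# Join table names into the comma-separated value that --exclude expects.
# The function body runs in a subshell so the IFS change does not leak.
join_excludes() (
    IFS=,
    printf '%s\n' "$*"
)

# Start with the tables this guide always excludes, then add systems.
EXCLUDES=$(join_excludes db_migrations ir_ focalboard systems)

# Print the resulting command for review; the DSN variables are left
# unexpanded on purpose and must be set before running it.
printf 'dbcmp --source "${MYSQL_DSN}" --target "${POSTGRES_DSN}" --exclude="%s"\n' "$EXCLUDES"
```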
|
||
Note that this migration guide only covers the tables for Mattermost channels. Support for other plugins, such as Playbooks, will be added in the future. | ||
|
||
We also exclude the ``db_migrations`` table, where a small difference (a typo in a single migration name) creates a diff. Since we created the PostgreSQL schema with morph from the official ``mattermost`` source, we can safely skip it. The ``systems`` table, on the other hand, may contain additional diffs if extra keys were added during some of the migrations. Consider excluding the ``systems`` table if you run into issues, and compare it manually, as the data in the ``systems`` table is relatively small. |
Review comment: I think a known issues table might be a good addition here.
Reply: I tried to collect as much as known issues and tried to address "Schema diffs" section. I think we can add a section for known issues & troubleshooting once these gets accumulated.