Feature/support large dml split #384

Merged
merged 54 commits into from
Jan 19, 2024
54 commits
3b78e01
feat: demo, delete without pause and resume, rebase main(74fe562b1ab9…
newborn22 Dec 5, 2023
54ce6a6
feat:demo, delete, submit display, pause and resume
newborn22 Dec 6, 2023
d2544f3
feat: demo, update,delete,pause,resume,submit display
newborn22 Dec 8, 2023
fbf836b
feat: support crash recover when job status is running or paused
newborn22 Dec 11, 2023
fe019af
feat: support postpone and cancel
newborn22 Dec 11, 2023
ce78605
feat: support throttle
newborn22 Dec 12, 2023
2227c2c
feat: ugly draft of batch job runner, single pk with int type
newborn22 Dec 19, 2023
34d3d57
feat: optimize pause, resume, recover; support table entry gc; all ba…
newborn22 Dec 19, 2023
060e57f
feat: enrich batch process display
newborn22 Dec 20, 2023
9d6e3df
feat: support single pk with mysql types which can convert into int, …
newborn22 Dec 20, 2023
7c4b34b
feat: support multi PK with various types
newborn22 Dec 21, 2023
3a253c5
fix: rewrite code of batch generating process to make it more readable
newborn22 Dec 21, 2023
16cf5bb
fix: rename function
newborn22 Dec 21, 2023
ca7b83a
feat: support dynamic batch split
newborn22 Dec 22, 2023
5a1346d
fix: fix some bugs in nextBatchID generation; batchSize is min(thresh…
newborn22 Dec 23, 2023
ace17f2
feat: support show dml_job [DETAILS]
newborn22 Dec 24, 2023
3ae1947
feat: add time period code in jobsheduler and jobrunner
newborn22 Dec 25, 2023
7a0abb8
feat: support time period
newborn22 Dec 25, 2023
1026359
fix: fix some bugs and time period can work correctly
newborn22 Dec 26, 2023
8df1833
fix: put split batch codes into function splitBatchIntoTwo
newborn22 Dec 28, 2023
bcc2dba
fix: will return error if DML won't affect any rows
newborn22 Dec 28, 2023
c8a63e8
feat: add notify mechanism to jobScheduler
newborn22 Dec 29, 2023
6726067
feat: batchSize = min(userBatchSize, threshold/table_index_count*rati…
newborn22 Dec 29, 2023
ed002ca
fix: move constant var about sql to sqls.go
newborn22 Dec 29, 2023
67e7915
fix: modify some cols in big_dml_jobs_table
newborn22 Dec 29, 2023
43e7644
feat: add batch begin and end cols on batch table
newborn22 Dec 30, 2023
69b8a85
fix: modify dealing batch id format from '1+' to '1-1'
newborn22 Dec 30, 2023
d270b73
feat: dml job details is shown by batchID order
newborn22 Dec 30, 2023
101bb9c
feat: add fail policy
newborn22 Dec 31, 2023
a58e5ba
fix: modify the way to get batchID to execute, otherwise failpolicy s…
newborn22 Dec 31, 2023
cac2082
fix: clear some old todos; inhibit float type used as split col; inhi…
newborn22 Jan 2, 2024
39f6367
fix: modify system table name; add StripComments func; call setResetQ…
newborn22 Jan 3, 2024
7bc6138
fix: gen batchSQL by modify ast node instead of string operations whe…
newborn22 Jan 4, 2024
8b98944
fix: gen batchSQL and batchCountSQL when splitting sql by modify sql …
newborn22 Jan 4, 2024
3a7495d
fix: replace string literal with const var; fix error due to wrong re…
newborn22 Jan 4, 2024
2f7eadd
fix: fix wrong operation when rebase main
newborn22 Jan 4, 2024
71fd11a
fix: refactor genBatchSQL and genBatchCountSQL
newborn22 Jan 8, 2024
ba612ff
fix: refactor splitBatchIntoTwo, add related testcases
newborn22 Jan 8, 2024
0519d9d
fix: solve split brain problem; fix bug in genCountSQL
newborn22 Jan 9, 2024
e9c0412
fix:add failpoints; divide code into files; batchID start from x-2; d…
newborn22 Jan 9, 2024
25acd72
fix: split large functions into smaller ones
newborn22 Jan 10, 2024
a528f8a
fix: fix testcase
newborn22 Jan 10, 2024
299bd56
feat: refactor DML JOB controller
earayu Jan 10, 2024
6d9b575
feat: register flag in fs; table gc
newborn22 Jan 11, 2024
2c36059
fix: merge jobMonitor and JobSchduler as JobManager
newborn22 Jan 12, 2024
62e8bcf
feat: add time zone in feature 'running period time'
newborn22 Jan 12, 2024
e2f960c
test: add integration tests, from parseDML to splitBatch
newborn22 Jan 13, 2024
64526a1
fix: complete time period feature
newborn22 Jan 16, 2024
3f28916
test: e2e test single pk int passed; add github ci
newborn22 Jan 18, 2024
6dc0252
fix: refactor e2e test
newborn22 Jan 18, 2024
844a047
test: enrich e2e test
newborn22 Jan 18, 2024
4c001da
fix: delete runJobController func
newborn22 Jan 18, 2024
52a77ce
fix: let jobcontroller CI run on PR
newborn22 Jan 19, 2024
9ab95d2
fix: refactor e2e test to isolate every testcase; modify time format …
newborn22 Jan 19, 2024
131 changes: 131 additions & 0 deletions .github/workflows/archive/cluster_endtoend_jobcontroller.yml
@@ -0,0 +1,131 @@
# DO NOT MODIFY: THIS FILE IS GENERATED USING "make generate_ci_workflows"

name: Cluster (jobcontroller)
on:
  # pull_request:
  workflow_dispatch:
  push:
    branches:
      - main
concurrency:
  group: format('{0}-{1}', ${{ github.ref }}, 'Cluster (jobcontroller)')
  cancel-in-progress: true

env:
  GITHUB_PR_HEAD_SHA: "${{ github.event.pull_request.head.sha }}"

jobs:
  build:
    name: Run endtoend tests on Cluster (jobcontroller)
    runs-on: ubuntu-22.04

    steps:
    - name: Skip CI
      run: |
        if [[ "${{contains( github.event.pull_request.labels.*.name, 'Skip CI')}}" == "true" ]]; then
          echo "skipping CI due to the 'Skip CI' label"
          exit 1
        fi

    - name: Check if workflow needs to be skipped
      id: skip-workflow
      run: |
        skip='false'
        if [[ "${{github.event.pull_request}}" == "" ]] && [[ "${{github.ref}}" != "refs/heads/main" ]] && [[ ! "${{github.ref}}" =~ ^refs/heads/release-[0-9]+\.[0-9]$ ]] && [[ ! "${{github.ref}}" =~ "refs/tags/.*" ]]; then
          skip='true'
        fi
        echo Skip ${skip}
        echo "skip-workflow=${skip}" >> $GITHUB_OUTPUT

    - name: Check out code
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: actions/checkout@v3

    - name: Check for changes in relevant files
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: frouioui/paths-filter@main
      id: changes
      with:
        token: ''
        filters: |
          end_to_end:
            - 'go/**/*.go'
            - 'test.go'
            - 'Makefile'
            - 'build.env'
            - 'go.sum'
            - 'go.mod'
            - 'proto/*.proto'
            - 'tools/**'
            - 'config/**'
            - 'bootstrap.sh'
            - '.github/workflows/cluster_endtoend_jobcontroller.yml'

    - name: Set up Go
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-go@v3
      with:
        go-version: 1.20.1

    - name: Set up python
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-python@v4

    - name: Tune the OS
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        # Limit local port range to not use ports that overlap with server side
        # ports that we listen on.
        sudo sysctl -w net.ipv4.ip_local_port_range="22768 65535"
        # Increase the asynchronous non-blocking I/O. More information at https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_use_native_aio
        echo "fs.aio-max-nr = 1048576" | sudo tee -a /etc/sysctl.conf
        sudo sysctl -p /etc/sysctl.conf

    - name: Get dependencies
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |

        # Get key to latest MySQL repo
        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A8D3785C
        # Setup MySQL 8.0
        wget -c https://dev.mysql.com/get/mysql-apt-config_0.8.29-1_all.deb
        echo mysql-apt-config mysql-apt-config/select-server select mysql-8.0 | sudo debconf-set-selections
        sudo DEBIAN_FRONTEND="noninteractive" dpkg -i mysql-apt-config*
        sudo apt-get update
        # Install everything else we need, and configure
        sudo apt-get install -y mysql-server mysql-client make unzip g++ etcd curl git wget eatmydata xz-utils libncurses5

        sudo service mysql stop
        sudo service etcd stop
        sudo ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/
        sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld
        go mod download

        # install JUnit report formatter
        go install github.com/vitessio/go-junit-report@HEAD

    - name: Run cluster endtoend test
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      timeout-minutes: 45
      run: |
        # We set the VTDATAROOT to the /tmp folder to reduce the file path of mysql.sock file
        # which mustn't be more than 107 characters long.
        export VTDATAROOT="/tmp/"
        source build.env

        set -x

        # run the tests however you normally do, then produce a JUnit XML file
        # failpoint
        chmod 755 ./test/failpoint/failpoints.sh && source ./test/failpoint/failpoints.sh
        echo "GO_FAILPOINTS=$GO_FAILPOINTS" >> $GITHUB_OUTPUT
        make failpoint-enable
        eatmydata -- go run test.go -docker=false -follow -shard jobcontroller | tee -a output.txt | go-junit-report -set-exit-code > report.xml
        make failpoint-disable

    - name: Print test output and Record test result
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true' && always()
      run: |

        # print test output
        cat output.txt
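
Note on the "Check if workflow needs to be skipped" step: the condition is easier to reason about when pulled out into a small function. The following is a minimal sketch, not part of this PR; the function name `should_skip` and its two arguments are hypothetical. It assumes the intent is to run on pull requests, `main`, `release-X.Y` branches, and tags, and to skip everything else. One deliberate difference from the workflow: the tag pattern here is held in an unquoted variable so bash treats it as a regular expression, whereas a quoted right-hand side of `=~` (as in `"refs/tags/.*"`) is compared as a literal string.

```shell
#!/usr/bin/env bash

# should_skip <is_pull_request> <git_ref>   (hypothetical helper, for illustration only)
# Prints "true" when the workflow would be skipped, "false" otherwise.
should_skip() {
  local is_pr="$1" ref="$2"
  # Patterns are stored in variables and expanded unquoted so that
  # [[ ... =~ ... ]] interprets them as regular expressions.
  local release_re='^refs/heads/release-[0-9]+\.[0-9]$'
  local tag_re='^refs/tags/.*'

  if [[ "$is_pr" == "false" ]] \
     && [[ "$ref" != "refs/heads/main" ]] \
     && [[ ! "$ref" =~ $release_re ]] \
     && [[ ! "$ref" =~ $tag_re ]]; then
    echo "true"
  else
    echo "false"
  fi
}
```

For example, `should_skip false refs/heads/my-feature` prints `true`, while `should_skip false refs/tags/v1.0.0` prints `false` under the assumption stated above.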
141 changes: 141 additions & 0 deletions .github/workflows/archive/cluster_endtoend_jobcontroller_mysql57.yml
@@ -0,0 +1,141 @@
# DO NOT MODIFY: THIS FILE IS GENERATED USING "make generate_ci_workflows"

name: Cluster (jobcontroller) mysql57
on:
  # pull_request:
  workflow_dispatch:
  push:
    branches:
      - main
concurrency:
  group: format('{0}-{1}', ${{ github.ref }}, 'Cluster (jobcontroller) mysql57')
  cancel-in-progress: true

env:
  GITHUB_PR_HEAD_SHA: "${{ github.event.pull_request.head.sha }}"

jobs:
  build:
    name: Run endtoend tests on Cluster (jobcontroller) mysql57
    runs-on: ubuntu-22.04

    steps:
    - name: Skip CI
      run: |
        if [[ "${{contains( github.event.pull_request.labels.*.name, 'Skip CI')}}" == "true" ]]; then
          echo "skipping CI due to the 'Skip CI' label"
          exit 1
        fi

    - name: Check if workflow needs to be skipped
      id: skip-workflow
      run: |
        skip='false'
        if [[ "${{github.event.pull_request}}" == "" ]] && [[ "${{github.ref}}" != "refs/heads/main" ]] && [[ ! "${{github.ref}}" =~ ^refs/heads/release-[0-9]+\.[0-9]$ ]] && [[ ! "${{github.ref}}" =~ "refs/tags/.*" ]]; then
          skip='true'
        fi
        echo Skip ${skip}
        echo "skip-workflow=${skip}" >> $GITHUB_OUTPUT

    - name: Check out code
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: actions/checkout@v3

    - name: Check for changes in relevant files
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: frouioui/paths-filter@main
      id: changes
      with:
        token: ''
        filters: |
          end_to_end:
            - 'go/**/*.go'
            - 'test.go'
            - 'Makefile'
            - 'build.env'
            - 'go.sum'
            - 'go.mod'
            - 'proto/*.proto'
            - 'tools/**'
            - 'config/**'
            - 'bootstrap.sh'
            - '.github/workflows/cluster_endtoend_jobcontroller_mysql57.yml'

    - name: Set up Go
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-go@v3
      with:
        go-version: 1.20.1

    - name: Set up python
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-python@v4

    - name: Tune the OS
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        sudo sysctl -w net.ipv4.ip_local_port_range="22768 65535"
        # Increase the asynchronous non-blocking I/O. More information at https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_use_native_aio
        echo "fs.aio-max-nr = 1048576" | sudo tee -a /etc/sysctl.conf
        sudo sysctl -p /etc/sysctl.conf

    - name: Get dependencies
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        sudo apt-get update

        # Uninstall any previously installed MySQL first
        sudo ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/
        sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld

        sudo systemctl stop apparmor
        sudo DEBIAN_FRONTEND="noninteractive" apt-get remove -y --purge mysql-server mysql-client mysql-common
        sudo apt-get -y autoremove
        sudo apt-get -y autoclean
        sudo deluser mysql
        sudo rm -rf /var/lib/mysql
        sudo rm -rf /etc/mysql

        # Get key to latest MySQL repo
        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A8D3785C

        wget -c https://dev.mysql.com/get/mysql-apt-config_0.8.29-1_all.deb
        # Bionic packages are still compatible for Jammy since there's no MySQL 5.7
        # packages for Jammy.
        echo mysql-apt-config mysql-apt-config/repo-codename select bionic | sudo debconf-set-selections
        echo mysql-apt-config mysql-apt-config/select-server select mysql-5.7 | sudo debconf-set-selections
        sudo DEBIAN_FRONTEND="noninteractive" dpkg -i mysql-apt-config*
        sudo apt-get update
        sudo DEBIAN_FRONTEND="noninteractive" apt-get install -y mysql-client=5.7* mysql-community-server=5.7* mysql-server=5.7* libncurses5

        sudo apt-get install -y make unzip g++ etcd curl git wget eatmydata
        sudo service mysql stop
        sudo service etcd stop

        # install JUnit report formatter
        go install github.com/vitessio/go-junit-report@HEAD

    - name: Run cluster endtoend test
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      timeout-minutes: 45
      run: |
        # We set the VTDATAROOT to the /tmp folder to reduce the file path of mysql.sock file
        # which mustn't be more than 107 characters long.
        export VTDATAROOT="/tmp/"
        source build.env

        set -x

        # run the tests however you normally do, then produce a JUnit XML file
        # failpoint
        chmod 755 ./test/failpoint/failpoints.sh && source ./test/failpoint/failpoints.sh
        echo "GO_FAILPOINTS=$GO_FAILPOINTS" >> $GITHUB_OUTPUT
        make failpoint-enable
        eatmydata -- go run test.go -docker=false -follow -shard jobcontroller | tee -a output.txt | go-junit-report -set-exit-code > report.xml
        make failpoint-disable

    - name: Print test output and Record test result
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true' && always()
      run: |
        # print test output
        cat output.txt