Deployment of build.opensuse.org
You should only need to do this once, unless something in the setup changes.

- Set the correct SSH hostname in `inventory/production.yml`. The SSH hostname needs to match the one that points to the reference instance in your SSH config, usually found in `~/.ssh/config`. Ask someone in the team if you don't know it.
- Set the correct credentials and information regarding the GitHub Deployments and Rocket Chat webhooks as described here.
- Ensure `obs` is configured as one of your hosts, so you can access the reference server via SSH (see the example after this list).
- If you are not on openSUSE Tumbleweed/Leap, run the following command to install the zypper-related commands for Ansible:

  ```
  ansible-galaxy collection install community.general
  ```
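For reference, a minimal `~/.ssh/config` entry for the `obs` host could look like the sketch below. The hostname and user are placeholders; use whatever actually points to the reference instance in your team's setup:

```
# ~/.ssh/config -- hypothetical entry, adjust HostName and User to your setup
Host obs
  HostName <reference-instance-hostname>
  User <your-username>
```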
- Check that your VPN is working and you can access the reference server.
- Check the diff of the changes you are going to introduce. If there are database/data migrations or anything else requiring special care, knowing about it upfront will make it easier to solve any possible failure after deployment. There are two possible ways to do this, depending on what you prefer (see the sketch after the deployment steps below):
  - `obs_deploy check-diff` (this is from obs_deploy and it's in the `bin` directory of `ansible-obs`).
  - Directly on GitHub at https://github.com/openSUSE/open-build-service/compare/commit1...commit2. `commit1` is the version currently deployed, you get it from `obs_deploy deployed-version`. `commit2` is the version to be deployed, you get it from `obs_deploy available-package` by taking the commit SHA from the output (as highlighted here: Available package: obs-api-2.11~alpha.20210623T143055.**cb94d5bb1c**-12578.1.noarch.rpm).
- Check if there is a monkey patch on the server and act accordingly.
- Deploy! Use the correct playbook depending on the changes to be deployed:
  - Most of the time we deploy only code changes that don't introduce any changes to the database schema or require stopping the Apache server for other reasons. Ansible will abort the operation in case the package contains one or more database/data migrations. If that is the case, check the other options below.

    ```
    ansible-playbook -i inventory/production.yml deploy_without_migration.yml -vv
    ```

  - Some database/data migrations are non-disruptive and therefore don't cause downtime. The corresponding Ansible playbook will skip the step of putting the Apache server into maintenance mode. You need to check upfront whether the database/data migration is non-disruptive; Ansible is not able to distinguish between those two cases. Once you've confirmed that there won't be downtime, go ahead. Otherwise, see the other option below.

    ```
    ansible-playbook -i inventory/production.yml deploy_with_migration_without_downtime.yml -vv
    ```

  - In many cases, database migrations require stopping all interactions of the application with the database while they are being executed, therefore causing downtime. Database migrations with downtime should run in the maintenance window, Thursday 8 AM - 10 AM CET/CEST.

    ```
    ansible-playbook -i inventory/production.yml deploy_with_migration.yml -vv
    ```
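As referenced above, here is a minimal sketch of those pre-deployment checks using the `obs_deploy` commands described on this page; it only gathers information, the playbook choice stays with you:

```
# Does the package contain database/data migrations? This determines which playbook to use.
obs_deploy pending-migrations

# Review the changes that are going to be applied:
obs_deploy check-diff

# Build the GitHub compare URL by hand:
obs_deploy deployed-version     # commit1: the currently deployed version
obs_deploy available-package    # commit2: the SHA inside the package name, e.g. ...143055.cb94d5bb1c-12578...
# Then open https://github.com/openSUSE/open-build-service/compare/<commit1>...<commit2>
```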
If you need to monkey-patch something, you'll want to lock deployments until you have a proper fix. To do that, you need to use ObsGithubDeployments. Ansible will detect any locks that are set and refuse to deploy, so you will be safe. As an alternative, you can use Docker as explained in the README.
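As a sketch, locking could look like the following. The `lock` invocation is the same one used in the monkey-patching steps further down this page; the status/unlock counterparts are assumptions, so check the ObsGithubDeployments usage documentation for the real interface:

```
# Lock deployments until the proper fix is merged:
ogd lock --repository $GITHUB_REPOSITORY --token $GITHUB_TOKEN --reason "Monkey patch in place, see /etc/motd"

# Hypothetical counterparts -- verify the exact subcommands in the ObsGithubDeployments README:
# ogd status --repository $GITHUB_REPOSITORY --token $GITHUB_TOKEN
# ogd unlock --repository $GITHUB_REPOSITORY --token $GITHUB_TOKEN
```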
We can use obs_deploy, which provides useful information to anyone about to deploy:

- `obs_deploy check-diff` displays the changes that are going to be applied.
- `obs_deploy pending-migrations` will let you know if there is any pending migration.
- `obs_deploy available-package` displays the currently available package version.
- `obs_deploy systemctl --host=<the host to connect to> --user=<username>` returns the status of some vital systemd services we use for the OBS service. The host should be the same one configured in `inventory/production.yml`. The user should be the user configured in your `~/.ssh/config` for that host (see the example after this list).
- `obs_deploy --help` to discover more interesting commands.
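For example, checking the systemd services could look like this. The host and user are placeholders; use the host from `inventory/production.yml` and the user from your `~/.ssh/config`:

```
# Hypothetical invocation -- replace host and user with your actual settings.
obs_deploy systemctl --host=obs --user=<your-username>
```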
We can also use obs_github_deployments to check for deploy locks or to set/unset them. Please read the Usage section to learn how to do it.
For delayed jobs, clockworkd and Sphinx indexing there is a systemd target:

```
systemctl start|stop|status|restart obs-api-support.target
```

To make sure everything is running fine:

```
# The systemd target should display as active (running), in green
systemctl status obs-api-support.target
# All dependent services should display as active (running), in green
systemctl list-dependencies obs-api-support.target
```

If one of these services is already running and `systemctl stop obs-api-support.target` does not stop it, you have to kill it manually (find it with `ps aux | grep delayed`), otherwise restarting will fail.
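A sketch of that manual cleanup; the restart afterwards is an assumption, and you should double-check with `ps` which processes are actually stuck before killing anything:

```
# Find the stuck worker processes:
ps aux | grep delayed

# Hypothetical cleanup -- verify the PIDs first, then kill and restart the target:
# kill <pid-of-stuck-worker>
systemctl restart obs-api-support.target
```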
To get an overview of the remaining jobs, you can run:

```
run_in_api rails runner script/delayed_job_stats.rb
```
From time to time we have issues with the CSS/JS assets. If `application.css` or `application.js` are missing (you will notice it when you see unusual errors in your JavaScript console, especially a 404 when trying to retrieve them), then there is probably more than one Sprockets manifest in production. Go to the public folder, check which one comes from the package and delete the one that doesn't. After that, reload the application.

```
cd public/assets
rpm -qf .sprockets-manifest*
rm .sprockets-manifest-$SOMEHASH.json
cd ../..
touch tmp/restart.txt
```
When you see errors like `SSL_connect SYSCALL returned=5 errno=0 state=unknown state` in Errbit, this usually means that there is some issue with the RabbitMQ server or the connection to it. The maintenance window of the RabbitMQ server is Thursday, 8:00 AM to 10:00 AM CET; this can also cause the issue.

Errbit can also report `AMQ::Protocol::EmptyResponseError: Empty response received from the server.` errors. This has happened not when the reference server was deployed, but when other machines were updated (regular kernel updates, for example). In this case a deployment of the reference server restored the connection; a reload of Apache was not enough.

To make sure that Airbrake is working, run `run_in_api rake airbrake:test`. It will send an Airbrake event to our Errbit, and it should show up in the OBS frontend App.
If a deployment caused some breakage, you might need to build new packages quickly. In that case you might want to temporarily disable the test suite in our RPM spec. To do that, add `%disable_obs_test_suite 1` to the project config of OBS:Server:Unstable. Important: it needs to go into the `Macros: ... :Macros` section.
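As a sketch, the relevant part of the OBS:Server:Unstable project config would then look roughly like this (any macros already defined in that section stay as they are):

```
Macros:
%disable_obs_test_suite 1
:Macros
```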
It might happen that a deployment breaks OBS badly and you need to get it to work again quickly. In that case you can check if zypper still has the old packages in its cache. Just run `zypper se --details` and verify that the package version you want is still available. If that's the case, run `zypper in --oldpackage obs-api-$VERSION.$ARCH` to downgrade the package.
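A sketch of that downgrade, reusing the package name from the example earlier on this page; substitute the version and architecture you actually see listed:

```
# Check which versions of the package zypper still knows about:
zypper se --details obs-api

# Hypothetical downgrade -- use the exact version/architecture from the output above:
zypper in --oldpackage obs-api-2.11~alpha.20210623T143055.cb94d5bb1c-12578.1.noarch
```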
OBS packages are built whenever a PR gets merged to master. This might delay publishing of built packages. To prevent this, disable the OBS integration in GitHub:

- Go to the settings of the OBS GitHub project
- Select the 'Integration & services' tab and click on 'Edit' in the OBS column
- Uncheck the 'Activate' checkbox and 'Update services'
- Once the deployment is done, activate the checkbox again ;-)
As far as possible, when a bug is found we should act as usual: create a Pull Request, wait for it to be reviewed and merged, and then wait until the changes can be deployed.

This process usually takes so long that sometimes we cannot wait for it to finish, for example when the bug is blocking someone's work or even the whole application.

Only in such cases can we apply the changes manually on production (monkey patch), following these steps:

- Access the production server via SSH.
- Go to the application's directory.
- Apply the fix manually.
- Run `touch tmp/restart.txt` to restart the server (Passenger).
- Add to `/etc/motd` the link to the Pull Request that fixes the problem. In the next deployment, that PR is going to be applied and will replace the manual changes.
- Block the deployment script with:

  ```
  ogd lock --repository $GITHUB_REPOSITORY --token $GITHUB_TOKEN --reason "Same information as you added on /etc/motd"
  ```
- SSH to our cloud uploader instance
- Run `zypper up obs-cloud-uploader`
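A minimal sketch of those two steps; the host alias for the cloud uploader instance is a placeholder, and whether `sudo` is needed depends on which user you log in as:

```
# Hypothetical host alias -- use whatever your SSH config calls the cloud uploader instance.
ssh <cloud-uploader-host>
sudo zypper up obs-cloud-uploader
```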