[AL2] Author-Standby promotion process failed to start AEM Author Service #427

Open
mbloch1986 opened this issue Jan 19, 2021 · 3 comments

mbloch1986 commented Jan 19, 2021

Describe the bug
On Amazon Linux 2, promoting the author-standby to author-primary does not work properly. The process fails when it tries to start the AEM Author service after having stopped it.

To Reproduce
Steps to reproduce the behavior:

  1. Run the script /opt/shinesolutions/aem-tools/promote-author-standby-to-primary.sh on the author-standby
  2. Wait until the script fails

Expected behavior
The Author-Standby promotion should have finished successfully.

Screenshots

Environment (please complete the following information if relevant):

  • AEM OpenCloud, possibly all versions
  • Amazon Linux 2

Additional context
It seems that we should manage all service interactions on Amazon Linux 2 via the systemctl command instead of the service command.

https://github.com/shinesolutions/puppet-aem-curator/blob/master/manifests/action_promote_author_standby_to_primary.pp#L44

Following the process manually works:

  • systemctl stop aem-author
  • Update config files
  • systemctl start aem-author
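
The manual steps above can be sketched as a small script. This is only an illustration, not the promotion script itself: the `promote_manually` and `update_config_files` names are hypothetical, and the issue does not say which config files are updated, so that step is a placeholder. `SYSTEMCTL` can be overridden (for example with `echo`) to dry-run the sequence.

```shell
# Sketch of the manual sequence, using systemctl only.
# SYSTEMCTL and SERVICE are overridable; SYSTEMCTL=echo gives a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
SERVICE="${SERVICE:-aem-author}"

# Placeholder (assumption): the issue does not list the config files to change.
update_config_files() {
  echo "update config files here"
}

# Stop the service, update the config, then start the service again.
# Aborts with a non-zero status as soon as one step fails.
promote_manually() {
  "$SYSTEMCTL" stop "$SERVICE" || return 1
  update_config_files || return 1
  "$SYSTEMCTL" start "$SERVICE"
}
```

Calling `promote_manually` with `SYSTEMCTL=echo` prints the three steps in order without touching any service.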

Alternatively, while the promotion process is waiting for the login/welcome page to appear, you can run the following commands in this order to start AEM:

systemctl stop aem-author <- This will fail as the service is already stopped

systemctl start aem-author

mbloch1986 added the bug label Jan 19, 2021
mbloch1986 changed the title from "AL2 Author-Standby promotion process failed to start AEM Author Service" to "[AL2] Author-Standby promotion process failed to start AEM Author Service" Jan 19, 2021

mbloch1986 commented Mar 3, 2023

The root cause of this failure is that we manage the state of AEM with both the old service command and the new systemctl command.

During provisioning, AEM is started with the systemctl command. The author-standby promotion, however, uses the service command to stop and start AEM.

During the stop (service aem-author stop), the cq.pid PID file gets deleted. During the start (service aem-author start), the service script first triggers a stop via systemctl to ensure that no other service is running. This stop via systemctl fails because the stop script tries to delete the cq.pid file, which no longer exists, and therefore the start via service fails as well.
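
The sequence described above can be reproduced with a toy model. This is not the actual aem-author service script; the function names and the temporary directory are hypothetical, and the only behaviour taken from the description is that the stop step fails when cq.pid is missing and that the start step runs a stop first.

```shell
# Toy model of the failure, using a throwaway directory instead of the real AEM paths.
PID_DIR="$(mktemp -d)"
PID_FILE="$PID_DIR/cq.pid"

# Stop deletes the PID file; plain rm (no -f) exits non-zero if the file is missing.
toy_stop() {
  rm "$PID_FILE" 2>/dev/null
}

# Start triggers a stop first and gives up if that stop fails.
toy_start() {
  toy_stop || { echo "start aborted: stop failed"; return 1; }
  echo "$$" > "$PID_FILE"
  echo "started"
}

echo "$$" > "$PID_FILE"   # state after provisioning: service running, cq.pid present
toy_stop                  # first stop succeeds and removes cq.pid
toy_start || echo "promotion would fail here"   # nested stop fails, so start fails
```

In this model the start only succeeds when cq.pid still exists at the time the nested stop runs, which matches the behaviour reported in this issue.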

The solution is to replace all service commands with the systemd command systemctl.
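
As a reminder of how the two command forms map onto each other, here is a minimal sketch. The `to_systemctl` helper is hypothetical and only prints the equivalent command; it does not run anything. Note that the argument order is swapped: `service NAME ACTION` becomes `systemctl ACTION NAME`.

```shell
# Hypothetical helper: print the systemctl equivalent of a legacy
# `service NAME ACTION` invocation. Nothing is executed.
to_systemctl() {
  if [ "$#" -ne 3 ] || [ "$1" != "service" ]; then
    echo "usage: to_systemctl service NAME ACTION" >&2
    return 2
  fi
  # service takes NAME then ACTION; systemctl takes ACTION then NAME.
  echo "systemctl $3 $2"
}

to_systemctl service aem-author stop    # prints: systemctl stop aem-author
to_systemctl service aem-author start   # prints: systemctl start aem-author
```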

@mbloch1986

AEM AWS Stack Provisioner 6.4.0 includes the replacement of all service commands with systemctl.

@mbloch1986

The next step is to replace the service command with systemctl in the manage service SSM Command, so that the author-standby promotion works even after an offline snapshot has been taken.
