Full Rewrite #6

Open
wants to merge 1 commit into
base: master
from


thesuperzapper

This is a complete rewrite of the CSD.

Highlights:

  • Rewrote README.md
  • Support for Airflow 1.10.3 (and all new configs)
  • Template-based airflow.cfg generation
    • I truly believe this is more maintainable than the crudini approach and allows easier expansion to webserver_config.py.
  • Secure configs stored in environment variables, rather than airflow.cfg
  • Cloudera Manager can now properly start/stop the service
  • Cleaned up all .sh files into control.sh and common.sh
  • Removed RabbitMQ roles (as users should set up RabbitMQ separately if they want to use it)
  • Validate SDL automatically with mvn package using the Cloudera schema-validator-maven-plugin
  • Moved to the airflow Linux user rather than root
  • Fixed licence headers
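
To illustrate the template-based generation and environment-variable points above, here is a minimal sketch. It is not the code from this PR: the `{{KEY}}` placeholder syntax, the `render_template` helper, and every value shown are assumptions chosen for the example. The one part that comes from Airflow itself is the `AIRFLOW__<SECTION>__<KEY>` naming convention, which Airflow reads as an override for the matching `airflow.cfg` entry.

```shell
#!/usr/bin/env bash
# Sketch only: placeholder syntax, helper name, and all values are hypothetical.

# Read a template on stdin and replace each {{KEY}} with the VALUE
# taken from the KEY=VALUE arguments. Writes the result to stdout.
render_template() {
  local content pair key value placeholder
  content="$(cat)"
  for pair in "$@"; do
    key="${pair%%=*}"      # text before the first '='
    value="${pair#*=}"     # text after the first '='
    placeholder="{{${key}}}"
    # Quoting the pattern makes bash match it literally.
    content=${content//"$placeholder"/$value}
  done
  printf '%s\n' "$content"
}

# A tiny stand-in for an airflow.cfg template.
template='[core]
executor = {{EXECUTOR}}
dags_folder = {{DAGS_FOLDER}}'

rendered="$(printf '%s\n' "$template" | render_template \
  EXECUTOR=CeleryExecutor \
  DAGS_FOLDER=/var/lib/airflow/dags)"

# The secret never enters the rendered file; it is exported instead,
# and Airflow picks it up as the [core] sql_alchemy_conn setting.
export AIRFLOW__CORE__SQL_ALCHEMY_CONN='postgresql://airflow:example-password@db.example.com/airflow'

printf '%s\n' "$rendered"
```

Keeping secrets in the environment means the generated file can be world-readable without leaking credentials. Note that this simple substitution does not escape special characters in values (for example `&` on newer bash versions), so a real implementation would need more care.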

NOTE:

  • I have only tested this with PR 3 of the Airflow Parcel
  • I have used src/_aux instead of src/aux to allow Windows users to build the jar
  • I can re-add the Makefile if you want, but since I am using the maven plugin for SDL validation, there is little benefit

TODO:

  • I have included the following code in control.sh
    unset PYTHONPATH
    unset PYTHONHOME
    
    • This is because the current iteration of PR3 for the parcel mistakenly sets them. (This causes issues with spark-submit from the worker roles, which then tries to use Airflow's Python instead of the base Cloudera one.)
    • This can be removed once I have updated PR3
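
As a rough illustration of why the two `unset` lines matter: exported variables are inherited by every child process, so anything launched from the role (such as spark-submit) sees them. The sketch below is not taken from control.sh. The `run_with_clean_python_env` helper and the paths are hypothetical, and it scopes the `unset` to a subshell so the caller's environment is left alone.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and the paths are hypothetical.

# Stand-ins for the values the parcel currently exports.
export PYTHONPATH="/opt/example/airflow/lib"
export PYTHONHOME="/opt/example/airflow"

# Run a command with both variables removed. The parentheses start a
# subshell, so the unset does not affect the calling shell.
run_with_clean_python_env() {
  (
    unset PYTHONPATH
    unset PYTHONHOME
    "$@"
  )
}

# Ask a child process what it sees for each variable.
child_view="$(run_with_clean_python_env sh -c 'echo "${PYTHONPATH:-unset}:${PYTHONHOME:-unset}"')"
echo "$child_view"
```

The child reports both variables as unset, while the parent shell still has its original values. Unsetting them globally, as the PR does, is simpler and equally effective when nothing else in the script needs them.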

Please discuss. (I am more than willing to make changes if needed)

@thesuperzapper
Author

thesuperzapper commented Aug 28, 2019

@rssanders3 @razorsedge @Naveen481 are you interested in these changes? Or should I make a fork?

I recommend looking at the docs/readme here: https://github.com/thesuperzapper/apache-airflow-cloudera-csd/tree/full_rewrite
