Releases: luisbelloch/data_processing_course
2022.12
2021.12
- Improved Airflow samples: added the TaskFlow API from Airflow 2 and samples using S3
- Added a Dataproc tutorial for Cloud Shell
2021.10
- Upgraded Spark to 3.1.2
- Upgraded to Python 3.9.2
- JupyterLab now runs by default in the PySpark container
- Updated base images, kept Java 11 for now
2020.12
- Upgraded Spark to 3.0 (Java 8 + Python 3.9)
2020.1
- Updated to Spark 2.4.5 and Java 11 (AdoptOpenJDK)
- Makefile improvements
2019.2
- Updated Spark to version 2.4.4
- Fixed Ansible playbook to use AdoptOpenJDK
- Vagrant setup updated to debian/buster64
- Reduced log level in Spark docker image
2019.1
- Added basic Apache Airflow samples, including one for Dataproc/Spark
- Updated Spark to version 2.4.0
- Improved live preview scripts
- Minor bug fixes
2018.1
First public release of the 2018 course edition:
- Updated Spark to 2.2.1
- Docker image to execute course assignment tests
- Script to validate and package assignments before sending them
- Added a minimal Ansible playbook, incorporated into the Vagrant setup
- Added container with PySpark + Jupyter Notebook
- Added Minio samples
- Some other minor fixes
2017.3
2017.2
Second public release, with fixes made after the classroom sessions. It may receive further revisions after assignments are delivered.