- PR #117 Add assert timeout for backup/restore test
- PR #116 Increase test running timeout
- PR #114 Update instaclustr-icarus to 2.0.4
- PR #113 Update bootstrap image
- PR #112 make services headless again
- PR #111 debug print patch when statefulset is going to be updated
- PR #110 do not compare status when calculating statefulset patch
- PR #104 Upgrade k8s to 1.24 in e2e tests
- k8s 1.25 compatible
- Bump go libs
- Upgrade Docusaurus website CMS to latest v2.3.1
- Bump node version to 18.15.0 LTS in website CI
- fix: website/package.json & website/yarn.lock to reduce vulnerabilities
- No more need to trigger on PR events
- Fix golangci-lint workflow
- Fix build-args
- Fix context
- Fix context
- Use full path
- Use right context
- Remove extra brace
- Try to fix build
- Upgrade build-push-action everywhere
- Use new hosted image
- Remove temporary pull_request event
- Test with default value
- Remove debug
- Add pull_request back to push it at least once
- Remove Inputs, temporary run on pull_request and push it to gcr.io
- Upgrade docker/build-push-action
- Use GITHUB_ENV
- Do not use deprecated save-output
- Remove double quotes
- Run it in PR for now
- Rename jobs to easily use act tool
- Add workflow to build and host Icarus Docker image
- Add ending line to existing workflows
- Update cassandra-bootstrap image
- Make Service Headless as it was before 2.1.14
- Typo in NOTES.txt for multi-casskop
- Typo in NOTES.txt
- Set timestamp in RFC3339 format
- PodPolicy applies to pods, not services
- Bump go to 1.18
- Fix typo
- Use assignment instead
- Reformat comments
- Move code closer to where it's used
- Reformat comments
- Convert local variable to global variable
- Remove all settings that should not be needed
- Remove VERSION in Makefile and Bundle unused options
- Fix version in controllers as well
- Update version in multi-casskop as well
- Update cassandraclusters CRD
- Fix variable name
- Fetch history to get tags
- Fix for chart loop
- Fix version in Makefile
- Use double $ to run command in Makefile
- Remove version from values.yaml as well
- Remove need to change version in files before releasing
- Cleanup unused bundle-build
- Set version to 2.1.10
- Update CassandraCluster CRD
- Upgrade multi-casskop CRD
- Fix apiVersion in chart as well
- Upgrade apiVersion of multicasskops CRD to v1
- Upgrade rbac apiVersion
- Upgrade apiextensions-apiserver
- Remove old helm command and update terraform
- Helm needs to use the right tag in e2e
- issue_comment event does not set github.head_ref or ref_name
- Tidy go.mod
- Fix Lint errors
- Upgrade kustomize and remove unsupported replaces
- go get is no longer supported
- Revert "Remove unused command"
- Upgrade golint
- Upgrade base image in all Dockerfiles
- Upgrade bootstrap image and version
- Remove unused command
- Bump terser from 5.12.1 to 5.14.2 in /website
- Fix comments
- Update kub libraries and controller-runtime
- Update appId and apiKey given by Algolia
- Add permissions and upgrade actions/checkout
- Bump async from 2.6.3 to 2.6.4 in /website
- Remove duplicate
- Remove email and comments
- Add theme
- Remove tools folder
- Remove unused files
- Fix search key
- Add docker-generate back
- Remove unused tasks and code
- Update step names
- Validate operator-sdk bundle
- Fix operator-sdk download and no need to install controller-gen again
- Add missing build-arg
- Bump minimist from 1.2.5 to 1.2.6 in /website
- Add yq too
- Use latest CI image version
- Install kustomize in CI image
- Always push latest image when it's built
- push branch image only if master branch
- Rename job to casskop-image
- Increase timeout when running golangci-lint
- Support pushes to trigger-integration
- Push only if owner and push them in kuttl tests as well
- Use version from version.go for bundle
- Add bundle support
- Add rbacCreateClusterRole flag in chart
- Add missing clusterrole and binding
- Multicasskop fixes
- Upgrade cassandraImage to 4.0.2
- Upgrade cassandraImage to 4.0.1
- Use 4.0-rc1 like before
- Revert "Try to give it more time"
- Try to give it more time
- Update README and upgrade version in e2e test of Cassie 4
- Remove duplicated documentation
- Fix links
- Fix e2e tests workflow/event
- Update condition
- Remove old dependency
- Trigger e2e test manually or using specific comment
- Rename matrix variable
- No need to append credentials if not backup-restore test
- No need to specify ref
- Exclude all test files for not checked errors
- No need to spend time deleting things
- Use working-directory key
- Add all tests back
- Revert "Push 00-createCluster.yaml to s3 temporarily"
- Push 00-createCluster.yaml to s3 temporarily
- Tag images as latest when needed
- Add fields removed during operator upgrade
- Add debug
- Build docker images
- Update k3d
- Update casskop.name and multi-casskop.name
- Update Readiness and Liveness of backrest-sidecar container
- Remove dependency for now
- Do not disable install-kuttl anymore
- Put concurrency on first job
- Cancel existing jobs
- Update documentation to use right github package
- Update versions/tags
- Change names
- Change name
- Change name of chart
- Update ranger configuration
- Fix syntax
- Disable kuttl-tests
- Disable some jobs for now
- Add backup-restore kuttl test and test only it for now
- Fix lint issues and workflow
- Fix event
- No need to use git history anymore
- Remove useless code
- Rename steps
- Support pull requests
- Dgoss tests are a requirement to push image
- Use available Github Action
- Add required key
- Add golint workflow
- Remove golint
- Remove unused tasks
- More renaming
- Add action on tags too
- Do not run when website is updated
- Upgrade node
- Fix build
- Fix compatibility issues
- Use slug instead of id
- Fix IDs
- Upgrade more modules
- Remove unused sidebar
- Use new Algolia configuration
- Upgrade docusaurus
- Do not deploy if not tag versioned
- Upgrade docusaurus to 2.0.0-beta17
- Update URLs as much as possible
- Fix website URL
- Fix publish_dir
- Modify publish_dir
- All images are now on my account
- Update workflows
- Use cache ability of setup-node
- Deploy to github pages website
- CI and Bootstrap workflow builders
- fix: website/package.json & website/yarn.lock to reduce vulnerabilities
- Add workflow to build more images and some cleaning
- Add condition back
- Change types
- Fix Dockerfile path
- Don't want to run kuttl tests on pushes but only after 1st workflow
- pull_request is annoying as github.ref_name is different
- Put Dockerfiles in docker/
- Run workflows on PRs or master branch only
- Fix it everywhere
- Put if in {{ }}
- Refactor workflow
- Helm by @cscetbon in cscetbon#9
- Remove circleci by @cscetbon in cscetbon#10
- PR 397 - Fix activation of Jolokia Auth
- PR 396 - Allow to configure VolumeMount for backrest-sidecar container
- PR 380 - Fix k3d version
- PR 379 - Do not validate Secret for file protocol in Backup
- PR 377 - Fix: update cc status if StatusFinalizing
- PR 376 - Bump operator sdk v1.13.0
- PR 375 - Fix issue #374 with determining the cassandra version from image
- PR 363 - Bump CRDs version from v1alpha1 to v2
- PR 362 - Added a possibility to configure readinessProbe timeouts for operator deployment
- PR 361 - Update Documentation
- PR 359 - Do not mount multiple times the same path
- PR 358 - Limit ConfigBuilder resources and add ability to specify its image
- PR 356 - Bump tar from 6.1.4 to 6.1.11 in /website
- PR 354 - Bump url-parse from 1.5.1 to 1.5.3 in /website
- PR 353 - Upgrade to Go 1.17
- PR 352 - Bump path-parse from 1.0.6 to 1.0.7 in /website
- PR 350 - Remove pre_stop.sh and update build image
- PR 349 - Bump tar from 6.0.5 to 6.1.4 in /website
- PR #344 - Revert "Fix #316 by labeling the cluster uid to each PVC (#322)"
- PR #343 - Revert uid label
- PR #341 - Document how to upgrade from v1 to v2
- PR #340 - Fix operator & pods restart when scaling up DC with autoUpdateSeedList set to true
- PR #337 - Bump prismjs from 1.23.0 to 1.24.0 in /website
- PR #336 - Bump postcss from 7.0.35 to 7.0.36 in /website
- PR #335 - Scales up and down a DC in multi-dcs
- PR #331 - Bump ws from 6.2.1 to 6.2.2 in /website
- PR #330 - Do not crash if backup is created with non existing Datacenter
- PR #246 - Support of Cassandra 4.0
- PR #329 - Bump dns-packet from 1.3.1 to 1.3.4 in /website
- PR #328 - e2e tests use kuttl only
- PR #325 - Add datacenter to Restore and take it into account during backup/restore operations
- PR #322 - Label each PVC with the cluster uid
- PR #319 - Fix the way FSGroup and RunAsUser are used
- PR #314 - Added `fsGroup` to CassandraCluster.Spec
- PR #313 - Add allow annotations to be passed from Multicasskop
- PR #307 - Patch kuttl test to work with releases
- PR #305 - Bump prismjs from 1.22.0 to 1.23.0 in /website
- PR #298 - Fix Helm publishing
- PR #297 - Fix sonar jdk
- PR #285 - Bump ini from 1.3.5 to 1.3.8 in /website
- PR #286 - Add the 2 new pages (Cassandra Backup & Restore) to the sidebar + bump version
- PR #287 - Restrict psp role in helm chart to get and list
- PR #289 - Do not use data folder to store downloaded sstables
- PR #290 - Enforce HARDLINKS restore strategy as it is faster
- PR #291 - Add check if cassandraBackup.Annotations map is nil before assignment
- PR #293 - Add option to Rename a table when restoring it
- PR #294 - Generate CRDs to deploy dir AND both helms CRDs dirs
- PR #295 - Upgrade Icarus to 1.0.8 (Fix error catching)
- CassandraRestore CRD (restorationStrategyType removed, add rename option)
- v1beta1 CRDs
- PR #283 - Add Kuttl (declarative E2E test) implem
- CassandraBackup CRD (fixed cassandraCluster typo)
- v1beta1 CRDs
- PR #282 - Fix helm 3 crds installation + add some docs
- PR #280 - Upgrade Icarus to 1.0.5
- PR #279 - Add option to specify resources at DCs level
- PR #278 - Make separate liveness/readiness probes possible
- PR #276 - Upgrade Icarus and add errors to status
- PR #275 - Use static name for k3d cluster
- PR #274 - Fix website doc and add search ability
- PR #273 - [CassandraCluster] Fix Jolokia auth.
- PR #265
- PR #263
- PR #256 - [Chart] Fix multi-casskop role
- PR #252 - [Plugin] Remove metadata.resourceVersion from the applied resource
- PR #250 - [CassandraCluster] Scale up node at a time
- PR #233 - [CassandraCluster] Add ShareProcessNamespace option for operator and cassandra nodes
- PR #245 - [Chart] Explicit roles needed by casskop
- PR #245 - [Chart] Explicit roles
- PR #240 - [Documentation] Bump lodash from 4.17.15 to 4.17.19
- PR #242 - [Documentation] Bump elliptic from 6.5.2 to 6.5.3
- PR #244 - [Documentation] Bump prismjs from 1.20.0 to 1.21.0
- PR #234 - [CassandraCluster] Fix having pod to fail during decommissioning / joining, replacing liveness probe.
- PR #235 - [CassandraCluster] Fix multi decommissioning
- PR #241 - [CassandraCluster] Do not do more decommissions than needed
- PR #247 - [CassandraCluster] Update pre-stop bootstrap script
- PR #203 - [CassandraCluster] Data configuration at DC level
- PR #215 - [CassandraCluster] Default resources requirements for init containers
- PR #204 - Fix sonar project
- PR #217 - [Documentation] Website documentation in replacement of MD folder
- PR #225 - [CI/CD] Use k3d instead of Minikube
- PR #205 - [MultiCasskop] Non blocking unused Kubernetes cluster in `MultiCasskop` resources
- PR #206 - [CassandraCluster] Fix readiness & liveness probe configuration update detection
- PR #220 - [Documentation] Fix yarn.lock
- PR #223 - [Documentation] Fix chart name
- PR #230 - [CassandraCluster] Set bootstrap env vars based on Cassandra resources
- PR #201 - Add liveness and readiness probe configurable in CassandraCluster object
- PR #200 - Catch nil pvcSpec error
- PR #199 - Fix Issue #197 helm release
- PR #198 - Add custom metrics to operator
- PR #196 - Fix Issue #170 cross ip
- PR #195 - Ensure generated deepcopy files are always up to date
- PR #193 - Fix Issue #192 Add check on container length for statefulset comparison
Breaking Change in the bootstrap image. See Upgrade section.
- PR #190 - Fix Issue #189 Handle volumemounts per container
- PR #187 - Fix helm repo url
- PR #185 - Add the support of sidecars
- PR #184 - Use Jolokia calls instead of nodetool in readiness and liveness probes
- PR #179 - Fix Issue #168 Do not check topology in CassKop (does not work with MultiCassKop) during rebuild but using Cassandra
- PR #177 - Add documentation on how to add tolerations
- PR #175 - Fix dgoss tests
- PR #174 - Upgrade operator sdk
- PR #173 - Fix documentation
- PR #167 - Fix plugin remove command
- PR #165 - Fix OpenAPI v3.0 schema validation
- PR #164 - Rename repository
- PR #163 - Add documentation regarding the upgrade of the operator
- PR #162 - Adapt CI pipeline for multi-CassKop
- PR #161 - Fix helm chart
- PR #157 - Add logo for CassKop
- PR #156 - Watch only first cluster in MultiCassKop
- PR #155 - Refactor PodAffinityTerm
- PR #153 - Allow Istio to work with Cassandra and encrypt native connections
- PR #152 - Add GKE example
Introduce Multi-Casskop, the Operator to manage a single Cassandra Cluster above multiple Kubernetes clusters.
- PR #145 - Fix Issue #142 PodStatus which rarely fails in unit tests
- PR #146 - Fix Issue #143 External update of SeedList was not possible
- PR #147 - Introduce Multi-CassKop operator
- PR #149 - Get rid of env var SERVICE_NAME and keep current hostname in seedlist
- PR #151 - Fix Issue #150 Makes JMX port remotely available (again)
- uses New bootstrap Image 0.1.3 : ghcr.io/cscetbon/casskop-bootstrap:0.1.3
Breaking Change in API
The fields `spec.baseImage` and `spec.version` have been removed in favor of `spec.cassandraImage`, which is a merge of both of them.
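
As a rough illustration of this breaking change, a manifest could be migrated along the following lines. This is only a sketch: the image name and tag are illustrative values, not taken from the release notes, and the rest of the spec is omitted.

```yaml
# Before: the two fields removed by this change (values are illustrative)
spec:
  baseImage: cassandra
  version: "3.11"
---
# After: a single field merging the image name and its tag
spec:
  cassandraImage: cassandra:3.11
```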
- PR #128 Fix Issue #96: cluster stay pending
- PR #127 Fix Issue #126: update racks in parallel
- PR #124: Add Support for pod & services annotations
- PR #138 Add support for Tolerations
Example of an annotation needed in the CassandraCluster Spec:

    service:
      annotations:
        external-dns.alpha.kubernetes.io/hostname: my.custom.domain.com.

- PR #119 Refactoring Makefile
- tests now use the default cassandra docker image
- initContainerImage and bootstrapContainerImage used to adapt to official cassandra image.
- ReadOnly Container: `Spec.ReadOnlyRootFilesystem` (default: true)
- upgrade to operator-sdk 0.9.0 & go modules (thanks @jsanda)
- Released version
- GitHub open source version
- Add `spec.gcStdout` (default: true): to send gc logs to docker stdout
- Add `spec.topology.dc[].numTokens` (default: 256): to specify a different number of vnodes for each DC
- Move RollingPartition from `spec.RollingPartition` to `spec.topology.dc[].rack[].RollingPartition`
- Add Cassandra psp config files in deploy
- Add `spec.maxPodUnavailable` (default: 1): if a pod is unavailable in the ring, CassKop refuses to make changes to the statefulsets. This can be bypassed by increasing the `maxPodUnavailable` value.
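
The spec fields named in the entries above could be combined as in the sketch below. It is an assumption-laden illustration rather than a reference manifest: the `apiVersion`, the resource, DC and rack names, and the lower-case spelling of `rollingPartition` are assumed here, and the values simply repeat the stated defaults.

```yaml
apiVersion: db.orange.com/v1alpha1   # assumed API group and version
kind: CassandraCluster
metadata:
  name: cassandra-demo               # hypothetical name
spec:
  gcStdout: true                     # send GC logs to docker stdout (default: true)
  maxPodUnavailable: 1               # refuse statefulset changes above this many unavailable pods (default: 1)
  topology:
    dc:
      - name: dc1                    # hypothetical DC name
        numTokens: 256               # vnodes for this DC (default: 256)
        rack:
          - name: rack1              # hypothetical rack name
            rollingPartition: 0      # now set per rack instead of at spec level; casing assumed
```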
- Upgrade to Operator-sdk 0.2.0
- Upgrade to Operator-sdk 0.1.1
- Add and Remove DC
- Decommission now using JMX call instead of exec nodetool
- Configurable operator resyncPeriod via environment RESYNC_PERIOD
- No more uses of the Kubernetes subdomain in Cassandra Seeds --> Need Cassandra Docker Image > cassandra-3.11-v1.1.0
- This also fixes the first node SeedList. A DNS request tells us whether the first node already exists; if it does not, this is the first creation of the cluster, so on subsequent runs we can properly remove node1 from its seedlist.
- Add new parameter `imagePullPolicy: "IfNotPresent"` to the CRD (default is "Always")
- Add `securityContext: runAsUser: 1000` to allow the operator pod to launch with higher cluster security
- Fix Issue 60: Error when RollingUpdate on UpdateResource
- Fix Issue 59: Error on UpdateConfigMap vs UpdateStatefulset
- SeedList Management
  - new param `AutoUpdateSeedList` which defines whether the operator needs to automatically compute and apply the best seedList
- CRD Improvement:
  - CRD protection against forbidden changes in the CRD. The operator now refuses to change:
    - the `dataCapacity`
    - the `dataStorageClass`
  - We can now specify/override the global `nodesPerRack` in each DC section
- Better Status Management
  - Add Cluster level status to have a global view of the whole cluster (composed of several statefulsets)
    - lastClusterAction
    - lastClusterActionStatus
    These statuses are used to know whether there is an ongoing action at cluster level, which makes it possible, for instance, to completely finish a ScaleUp on all Racks before executing Pod-level actions such as NodeCleanup.
  - Add new status:
    - `UpdateResources`: if we change requested Pod resources in the CRD
    - `UpdateSeedList`: when the operator needs to make a rolling update to apply a new seedlist
    We won't update the SeedList unless all Racks are staged for this modification (no other actions ongoing on the cluster)
- Add `ImagePullSecret` parameter in the CRD to allow providing docker credentials to pull images
- SeedList Management
  - SeedList Initialisation before Startup: We try (if available) to take 1 seed in each rack for each DC
  - The Operator will try to apply the best SeedList in case of cluster topology evolution (Scaling, Add DC/Racks..)
  - The Operator will make a Rolling Update (see new status above)
  - The DFY Cassandra Image, coupled with the Operator, ensures that a Pod in the SeedList is removed from its own seedList.
    Limitation: The first Pod of the cluster will be in its own SeedList
  - We can manually update the SeedList on the CRD Object; this will RollingUpdate each statefulset sequentially, starting with the first
- Operator Debug
  - Allow specific `docker-build-debug` target in the Makefile and in the Pipeline to build debug version of the operator
    - debug version of go application
    - debug version of Image docker
    - debug version of helm chart (see below)
Helm Chart Improvment
-
Add Possibility to use images behind authentication (imagePullSecret)
imagePullSecrets:
enabled: true
name: <name of your docker registry secret>
- New way to define Debug Image and delve version API to uses
debug:
enabled: false
image:
repository: orangeopensource/casskop
tag: 0.1.2-debug
version: 2
- When NodeCleanup encounters some errors, we can see the status in the CassandraCluster
- Fix Bug #53: Error which prevented PVCs from being deleted when the CRD is deleted and the `deletePVC` flag is true
- Fix Bug #52: The cluster was not deploying if Topology was empty
- Rack Aware Deployment
  - Add Topology section to declare Cassandra DC and Racks on their deployment using kubernetes nodes labels
  - Note: Rename of `nodes` to `nodesPerRacks` in the CRD yaml file
- add `hardAntiAffinity` flag in CRD to manage if we allow only 1 Cassandra Node per Kubernetes Node.
  Limitation: This parameter checks only for Pods in the same kubernetes Namespace!!
- add `deletePVC` flag in CRD to allow deleting all PersistentVolumeClaims in case we delete the cluster
- Uses Jolokia for nodetool cleanup operation
- Add `autoPilot` flag in CRD to automatically execute the Pod Operation cleanup after a ScaleUp, or to allow running the operation manually by editing the Pods Labels status from Manual to ToDo
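
For orientation, the CRD flags introduced in the entries above could sit together in a CassandraCluster manifest roughly as follows. This is a minimal sketch under stated assumptions: the `apiVersion`, the resource, DC and rack names, the node label keys and values, and the placement of the flags directly under `spec` are illustrative and should be checked against the CRD shipped with the release in use.

```yaml
apiVersion: db.orange.com/v1alpha1   # assumed API group and version
kind: CassandraCluster
metadata:
  name: cassandra-demo               # hypothetical name
spec:
  nodesPerRacks: 1                   # renamed from `nodes`
  hardAntiAffinity: true             # only 1 Cassandra node per Kubernetes node (same namespace only)
  deletePVC: true                    # delete the PersistentVolumeClaims when the cluster is deleted
  autoPilot: true                    # run the cleanup Pod operation automatically after a ScaleUp
  topology:
    dc:
      - name: dc1                    # hypothetical DC name
        rack:
          - name: rack1              # hypothetical rack name
            labels:                  # Kubernetes node labels selecting where this rack is deployed
              example.com/zone: zone-a   # hypothetical label key and value
```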
- Rack Aware Deployment
  - Pod level get infos for Rack & DC. PR #33
    - Exposes CASSANDRA_RACK env var in the Pod from `cassandraclusters.db.orange.com.rack` Pod Labels
    - Exposes CASSANDRA_DC env var in the Pod from `cassandraclusters.db.orange.com.dc` Pod Labels
- Make use of OLM (Operator Lifecycle Manager) to manage the Operator
- #25: change declaration of local-storage in PersistentVolumeClaim
- Upgrade Operator SDK version to latest master (revision=a719b04752a51e5fe723467c7e66bc35830eb179)
- Add start time and end time labels on Pods during Pod Actions
- Add a Test on Operation Name for detecting an end in Cleanup Action
- in ensureDecommission
- Re-Order Status in ensureDecommission
- Add test on CassandraNode status to know if decommissioned is ongoing or not
- Add asynchronous for nodetool decommission operation
- Add Helm charts to deploy the operator
- Add a Pod Disruption Budget which allows at most 2 Cassandra nodes to be down at the same time while working on the kubernetes cluster
- Add a Jolokia client to interact with Cassandra
- Remove old unused code
- Add a test on the Pod Readiness before declaring ScaleUp Done
- Increase HealthCheck Periods and Timeouts
- Add output messages in health checks requests for debug
- Fix GetLastPod if number of pods > 10
- Better management of decommission status (check with nodetool netstats to get node status), and adapt behaviour
- On scale down, test the Date on the pod label so that nodetool decommission is not executed several times before the status changes from NORMAL to LEAVING
- Add test on field readyReplicas of the Statefulset to know operation is Done
- add sample directory for demo manifests.
- Add plantuml algorithm documentation
- If no dataCapacity is specified in the CRD, then No PersistentVolumeClaim is created
- WARNING: this is useful for dev but unsafe for production, meaning that no data will be persisted.
- Increase Timeout for HealthCheck Status from 5 to 40 and add PeriodSeconds to 50 between each healthcheck
- remove `nodetool drain` from the PreStop instruction
- Add PodDisruptionBudget with MaxUnavailable=2
- Initial version port from cassandra-kooper-operator project