README

Docker Compose with Cassandra

Overview

This guide provides instructions for setting up and running a Cassandra database using Docker. It covers the creation of the necessary Docker environment, starting the application, and configuring the database. Additionally, it explains the advantages of using Cassandra over other databases and provides insights into Cassandra's architecture and data modeling. ### Environment Configuration

`.env` File

Ensure you have an .env file in your project directory to manage environment variables. Below is an example of what the .env file might look like: dotenv Copy code # .env CASSANDRA_VERSION=4.0.7 CASSANDRA_PORT=9042

Starting the Application

Run Cassandra Container: docker run -p 9042:9042 --rm --name cassandra -d cassandra:4.0.7
Access the Cassandra Container: docker exec -it cassandra bash
Configure Cassandra Inside the Container: cqlsh -u cassandra -p cassandra
Create Keyspace: CREATE KEYSPACE spring_cassandra WITH replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1};

Cassandra Overview

Advantages of Cassandra

Linearly Scalable: Achieves horizontal scalability for higher throughput.
Multi-Datacenter Deployment: Facilitates distributed data and replication.
High Availability: Ensured by clustering architecture. If a node fails, data retrieval is still possible without a master-slave setup.”
Data Modeling: Based on expected query patterns, emphasizing performance and denormalization of data.

Cassandra Architecture

Cluster Components:
- Datacenter: Location of a cluster deployment.
- Cluster: Collection of nodes.
- Node: Physical or virtual machine.
Request Handling:
- Application requests are load-balanced between nodes using a hash function.
- The coordinator node determines the location of the desired node via the snitch protocol.
- Data is replicated to other nodes based on the replication factor.
Inter-Node Communication:
- Nodes share state information using the gossip protocol.
- In case of cluster or datacenter failure, Cassandra handles data recovery automatically.

Comparison with Relational Databases

Cassandra:
- High availability through clustering.
- No master-slave architecture, avoiding master election delays.
- Sharding is database-managed, reducing application-level complexity.
Relational Databases:
- Client-aware sharding can become cumbersome with scale.
- Sharding management often falls on the application team.

Cassandra Data Modeling

Query-Based Design: Data models are created based on application workflow and query requirements.
Denormalization: Data is often duplicated across tables to optimize performance.
Tunable Consistency: Adjust the number of nodes data must be written to for a write operation to be considered successful.
No Joins: Data is denormalized, influencing table naming conventions to reflect different access patterns.

CAP Theorem in Cassandra

AP System: Guarantees availability and partition tolerance.
Eventual Consistency: Achieved through data replication across nodes.
No Transactional Support: Provides row-level transaction abilities with all-or-nothing writes.

Scalability and High Availability

Node Management:
- Nodes have a unified and consistent view of the cluster.
- New nodes are added easily and rebalanced automatically.
Failure Detection and Recovery:
- Detected using the gossip protocol.
- Coordinator nodes ensure data availability during node failures.

Durability and Consistency

Replication Model: Ensures durability through data replication.
Load Balancing: Operations are distributed among nodes to maximize throughput.
Eventual Consistency: Balances high availability with consistent data replication.

Additional Features

Lightweight Transactions: Achieved with the Paxos consensus protocol.
Batched Writes: Data protection mechanism before committing writes to all nodes.
Secondary Indexes: Enable querying of previously unqueryable tables, maintaining consistency with local replicas.

https://developers.google.com/identity/openid-connect/openid-connect#java

Name	Name	Last commit message	Last commit date
Latest commit mustafa-abuelmagd fix: email preview page, and stats number Oct 1, 2024 ad81685 · Oct 1, 2024 History 18 Commits
.mvn/wrapper	.mvn/wrapper	Initial Commit with basic application start up	Jul 10, 2024
src	src	fix: email preview page, and stats number	Oct 1, 2024
.env	.env	feat: preview and compose and send email	Sep 6, 2024
.gitignore	.gitignore	Add .env to gitignore	Jul 20, 2024
compose.yaml	compose.yaml	Initial Commit with basic application start up	Jul 10, 2024
create-keyspace.cql	create-keyspace.cql	Initial Commit with basic application start up	Jul 10, 2024
create-keyspace.sh	create-keyspace.sh	Initial Commit with basic application start up	Jul 10, 2024
example.env	example.env	fix: convert app properties to yaml	Oct 1, 2024
mvnw	mvnw	Initial Commit with basic application start up	Jul 10, 2024
mvnw.cmd	mvnw.cmd	Initial Commit with basic application start up	Jul 10, 2024
pom.xml	pom.xml	fix: clean code fixes	Sep 30, 2024
readme.md	readme.md	latest changes	Sep 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

README

Docker Compose with Cassandra

Overview

`.env` File

Starting the Application

Cassandra Overview

Advantages of Cassandra

Cassandra Architecture

Comparison with Relational Databases

Cassandra Data Modeling

CAP Theorem in Cassandra

Scalability and High Availability

Durability and Consistency

Additional Features

About

Releases

Packages

Languages

ittovate/demo-cassandra-crud

Folders and files

Latest commit

History

Repository files navigation

README

Docker Compose with Cassandra

Overview

.env File

Starting the Application

Cassandra Overview

Advantages of Cassandra

Cassandra Architecture

Comparison with Relational Databases

Cassandra Data Modeling

CAP Theorem in Cassandra

Scalability and High Availability

Durability and Consistency

Additional Features

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

`.env` File

Packages