Skip to content

ittovate/demo-cassandra-crud

Folders and files

NameName
Last commit message
Last commit date
Jul 10, 2024
Oct 1, 2024
Sep 6, 2024
Jul 20, 2024
Jul 10, 2024
Jul 10, 2024
Jul 10, 2024
Oct 1, 2024
Jul 10, 2024
Jul 10, 2024
Sep 30, 2024
Sep 27, 2024

Repository files navigation

README

Docker Compose with Cassandra

Overview

This guide provides instructions for setting up and running a Cassandra database using Docker. It covers the creation of the necessary Docker environment, starting the application, and configuring the database. Additionally, it explains the advantages of using Cassandra over other databases and provides insights into Cassandra's architecture and data modeling. ### Environment Configuration

.env File

Ensure you have an .env file in your project directory to manage environment variables. Below is an example of what the .env file might look like: dotenv Copy code # .env CASSANDRA_VERSION=4.0.7 CASSANDRA_PORT=9042

Starting the Application

  1. Run Cassandra Container: docker run -p 9042:9042 --rm --name cassandra -d cassandra:4.0.7
  2. Access the Cassandra Container: docker exec -it cassandra bash
  3. Configure Cassandra Inside the Container: cqlsh -u cassandra -p cassandra
  4. Create Keyspace: CREATE KEYSPACE spring_cassandra WITH replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1};

Cassandra Overview

Advantages of Cassandra

  • Linearly Scalable: Achieves horizontal scalability for higher throughput.

  • Multi-Datacenter Deployment: Facilitates distributed data and replication.

  • High Availability: Ensured by clustering architecture. If a node fails, data retrieval is still possible without a master-slave setup.”

  • Data Modeling: Based on expected query patterns, emphasizing performance and denormalization of data.

Cassandra Architecture

  1. Cluster Components:

    • Datacenter: Location of a cluster deployment.
    • Cluster: Collection of nodes.
    • Node: Physical or virtual machine.
  2. Request Handling:

    • Application requests are load-balanced between nodes using a hash function.
    • The coordinator node determines the location of the desired node via the snitch protocol.
    • Data is replicated to other nodes based on the replication factor.
  3. Inter-Node Communication:

    • Nodes share state information using the gossip protocol.
    • In case of cluster or datacenter failure, Cassandra handles data recovery automatically.

Comparison with Relational Databases

  • Cassandra:

    • High availability through clustering.
    • No master-slave architecture, avoiding master election delays.
    • Sharding is database-managed, reducing application-level complexity.
  • Relational Databases:

    • Client-aware sharding can become cumbersome with scale.
    • Sharding management often falls on the application team.

Cassandra Data Modeling

  • Query-Based Design: Data models are created based on application workflow and query requirements.
  • Denormalization: Data is often duplicated across tables to optimize performance.
  • Tunable Consistency: Adjust the number of nodes data must be written to for a write operation to be considered successful.
  • No Joins: Data is denormalized, influencing table naming conventions to reflect different access patterns.

CAP Theorem in Cassandra

  • AP System: Guarantees availability and partition tolerance.
  • Eventual Consistency: Achieved through data replication across nodes.
  • No Transactional Support: Provides row-level transaction abilities with all-or-nothing writes.

Scalability and High Availability

  • Node Management:

    • Nodes have a unified and consistent view of the cluster.
    • New nodes are added easily and rebalanced automatically.
  • Failure Detection and Recovery:

    • Detected using the gossip protocol.
    • Coordinator nodes ensure data availability during node failures.

Durability and Consistency

  • Replication Model: Ensures durability through data replication.
  • Load Balancing: Operations are distributed among nodes to maximize throughput.
  • Eventual Consistency: Balances high availability with consistent data replication.

Additional Features

  • Lightweight Transactions: Achieved with the Paxos consensus protocol.
  • Batched Writes: Data protection mechanism before committing writes to all nodes.
  • Secondary Indexes: Enable querying of previously unqueryable tables, maintaining consistency with local replicas.

https://developers.google.com/identity/openid-connect/openid-connect#java

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published