Spark-Kafka streaming integration demo
This repository demonstrates integrating Apache Spark with Apache Kafka for both batch and streaming data processing, using Avro serialization for structured payloads. The examples cover:

- creating DataFrames in Spark,
- configuring Kafka producers and consumers with SASL_SSL for secure communication,
- writing to and reading from a Kafka topic in both batch and streaming modes.

The code is designed to run in a Databricks environment and uses DBFS for checkpointing and data storage.
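The wiring described above can be sketched in PySpark roughly as follows. This is a minimal illustration, not the repository's actual code: the broker address, credentials, topic name, Avro schema, and DBFS paths are all placeholders, and the SASL mechanism is assumed to be PLAIN.

```python
def kafka_options(bootstrap_servers, username, password):
    """Build the common Kafka options for SASL_SSL authentication.

    Assumes the PLAIN SASL mechanism; adjust the JAAS config for
    SCRAM or OAuth setups.
    """
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
    }


def run_demo():
    """Batch write to Kafka, then a streaming read, as sketched in the README.

    Intended to be called on a cluster with Spark, the Kafka connector,
    and the Avro functions available (e.g. a Databricks runtime).
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import struct
    from pyspark.sql.avro.functions import to_avro, from_avro

    spark = SparkSession.builder.appName("spark-kafka-demo").getOrCreate()
    opts = kafka_options("broker.example.com:9093", "demo-user", "demo-password")

    # Batch mode: create a small DataFrame, serialize each row to Avro,
    # and publish it to the topic as the Kafka record value.
    df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
    (df.select(to_avro(struct("id", "name")).alias("value"))
       .write.format("kafka")
       .options(**opts)
       .option("topic", "demo-topic")
       .save())

    # Streaming mode: consume the topic, deserialize the Avro payload
    # with an explicit schema, and checkpoint progress to DBFS.
    avro_schema = """{"type": "record", "name": "User", "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"}]}"""
    stream = (spark.readStream.format("kafka")
              .options(**opts)
              .option("subscribe", "demo-topic")
              .option("startingOffsets", "earliest")
              .load()
              .select(from_avro("value", avro_schema).alias("user")))
    (stream.writeStream.format("delta")
           .option("checkpointLocation", "dbfs:/tmp/demo/checkpoints")
           .start("dbfs:/tmp/demo/output"))
```

Note that the secure-connection settings are shared between the producer (batch write) and the consumer (streaming read), which is why they are factored into a single options dictionary here.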