Spark-Kafka streaming integration demo
This repository demonstrates integrating Apache Spark with Apache Kafka for both batch and streaming data processing, using Avro serialization for structured payloads. The examples cover:

- creating DataFrames in Spark,
- configuring Kafka producers and consumers with SASL_SSL for secure communication,
- writing to and reading from a Kafka topic in both batch and streaming modes.

The code is designed to run in a Databricks environment and uses DBFS for checkpointing and data storage.
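The wiring described above can be sketched in PySpark roughly as follows. This is a minimal illustration, not the repository's actual code: the broker address, credentials, topic name, Avro schema, and DBFS paths are all placeholders, and the SASL mechanism is assumed to be PLAIN.

```python
def kafka_options(bootstrap_servers, username, password):
    """Build the common Kafka options for SASL_SSL authentication.

    Assumes the PLAIN SASL mechanism; adjust the JAAS config for
    SCRAM or OAuth setups.
    """
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
    }


def run_demo():
    """Batch write to Kafka, then a streaming read, as sketched in the README.

    Intended to be called on a cluster with Spark, the Kafka connector,
    and the Avro functions available (e.g. a Databricks runtime).
    """
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import struct
    from pyspark.sql.avro.functions import to_avro, from_avro

    spark = SparkSession.builder.appName("spark-kafka-demo").getOrCreate()
    opts = kafka_options("broker.example.com:9093", "demo-user", "demo-password")

    # Batch mode: create a small DataFrame, serialize each row to Avro,
    # and publish it to the topic as the Kafka record value.
    df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
    (df.select(to_avro(struct("id", "name")).alias("value"))
       .write.format("kafka")
       .options(**opts)
       .option("topic", "demo-topic")
       .save())

    # Streaming mode: consume the topic, deserialize the Avro payload
    # with an explicit schema, and checkpoint progress to DBFS.
    avro_schema = """{"type": "record", "name": "User", "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"}]}"""
    stream = (spark.readStream.format("kafka")
              .options(**opts)
              .option("subscribe", "demo-topic")
              .option("startingOffsets", "earliest")
              .load()
              .select(from_avro("value", avro_schema).alias("user")))
    (stream.writeStream.format("delta")
           .option("checkpointLocation", "dbfs:/tmp/demo/checkpoints")
           .start("dbfs:/tmp/demo/output"))
```

Note that the secure-connection settings are shared between the producer (batch write) and the consumer (streaming read), which is why they are factored into a single options dictionary here.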