This repository contains a serverless data processing pipeline built with Azure Functions and Azure SQL Database. The pipeline ingests JSON data through an HTTP endpoint, applies transformation and cleaning logic, and stores the results in an Azure SQL Database for further analysis and reporting.
- Serverless Architecture: Powered by Azure Functions for a scalable and cost-efficient solution.
- Real-time Data Ingestion: Receives data via HTTP POST requests, making it suitable for various integrations.
- Data Processing: Includes logic to transform and clean the data before storage.
- Azure SQL Database: Stores the processed data for further analysis and reporting.
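As a sketch of what the processing step might do (the field names and cleaning rules here are assumptions for illustration, not the repository's actual schema):

```python
from datetime import datetime, timezone


def transform_record(raw: dict) -> dict:
    """Clean and normalize one incoming JSON record.

    The field names (deviceId, value) are hypothetical; adapt them to
    the schema your pipeline actually receives.
    """
    return {
        "device_id": str(raw.get("deviceId", "")).strip(),      # trim stray whitespace
        "value": float(raw.get("value", 0.0)),                  # coerce to numeric
        "received_at": datetime.now(timezone.utc).isoformat(),  # add ingest timestamp
    }
```

In `function_app.py`, a function like this would sit between the HTTP trigger that receives the payload and the code that inserts rows into Azure SQL.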
```
azure-serverless-data-pipeline/
├── .funcignore          # Files and directories to ignore during Azure Function deployments
├── .gitignore           # Files and directories to ignore in Git
├── README.md            # Project documentation
├── function_app.py      # Main Azure Function script
├── host.json            # Global configuration options for all functions
├── local.settings.json  # Local settings for running the function locally
└── requirements.txt     # Python dependencies
```
- Python 3.8+: Ensure Python is installed on your machine.
- Azure Account: You need an Azure account to deploy resources.
- Azure CLI: Install the Azure CLI for deploying and managing Azure resources.
- Azure Functions Core Tools: Install the Azure Functions Core Tools for local development and testing.
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/azure-serverless-data-pipeline.git
  cd azure-serverless-data-pipeline
  ```
- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Azure SQL Database:
  - Create an Azure SQL Database through the Azure Portal.
  - Set up the required tables for storing processed data.
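The repository does not specify the table schema, so here is a minimal sketch of what the DDL might look like; the table and column names are assumptions to adapt:

```python
# Hypothetical schema for the processed records; adjust names and types
# to match what your function actually inserts.
CREATE_TABLE_SQL = """
IF OBJECT_ID('dbo.ProcessedData', 'U') IS NULL
CREATE TABLE dbo.ProcessedData (
    Id         INT IDENTITY(1,1) PRIMARY KEY,
    DeviceId   NVARCHAR(64)  NOT NULL,
    Value      FLOAT         NOT NULL,
    ReceivedAt DATETIME2     DEFAULT SYSUTCDATETIME()
);
"""

# Assuming the function connects with pyodbc (the ODBC connection string
# in function_app.py suggests this), the table can be created with:
#
#     with pyodbc.connect(conn_str) as conn:
#         conn.execute(CREATE_TABLE_SQL)
#         conn.commit()
```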
- Azure Function configuration:
  - Update the `function_app.py` file with your Azure SQL Database connection details:

    ```python
    conn_str = (
        "Driver={ODBC Driver 17 for SQL Server};"
        "Server=tcp:<your_server>.database.windows.net,1433;"
        "Database=<your_database>;"
        "Uid=<your_username>;"
        "Pwd=<your_password>;"
        "Encrypt=yes;"
        "TrustServerCertificate=no;"
        "Connection Timeout=30;"
    )
    ```
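Hard-coding credentials in source is risky; one alternative, sketched here with assumed setting names (`SQL_SERVER`, `SQL_DATABASE`, `SQL_USERNAME`, `SQL_PASSWORD`), is to read them from application settings instead:

```python
import os


def build_conn_str() -> str:
    """Assemble the ODBC connection string from environment variables.

    The setting names below are assumptions; define them in
    local.settings.json for local runs and in the Function App's
    application settings in Azure.
    """
    return (
        "Driver={ODBC Driver 17 for SQL Server};"
        f"Server=tcp:{os.environ['SQL_SERVER']}.database.windows.net,1433;"
        f"Database={os.environ['SQL_DATABASE']};"
        f"Uid={os.environ['SQL_USERNAME']};"
        f"Pwd={os.environ['SQL_PASSWORD']};"
        "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
    )
```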
- Local settings:
  - The `local.settings.json` file is used for running the Azure Function locally. Update it with your storage connection string if needed.
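A minimal `local.settings.json` for local runs might look like the following; the two `Values` keys shown are the standard ones the Functions runtime expects, and you can add your own settings alongside them:

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "python"
  }
}
```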
- Log in to Azure:

  ```bash
  az login
  ```
- Create the Function App (Python requires a Linux plan, and Functions runtime version 3 is retired, so version 4 is used here):

  ```bash
  az functionapp create \
    --resource-group <your-resource-group> \
    --consumption-plan-location <your-location> \
    --runtime python \
    --functions-version 4 \
    --os-type Linux \
    --name <your-function-name> \
    --storage-account <your-storage-account>
  ```
- Deploy the code:

  ```bash
  func azure functionapp publish <your-function-name>
  ```
- Start the Azure Function locally:

  ```bash
  func start
  ```
- Send a POST request with JSON data, using a tool like Postman or `curl`:

  ```bash
  curl -X POST http://localhost:7071/api/<your-function-name> \
    -H "Content-Type: application/json" \
    -d @sample_input.json
  ```
- Check the output:
  - The function should process the data and insert it into your local or remote Azure SQL Database.
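The contents of `sample_input.json` depend on the schema your function expects; a purely illustrative payload might be:

```json
{
  "deviceId": "sensor-1",
  "value": 3.5,
  "timestamp": "2024-01-01T00:00:00Z"
}
```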
- Send a POST request to the deployed function:

  ```bash
  curl -X POST https://<your-function-name>.azurewebsites.net/api/<your-function-name> \
    -H "Content-Type: application/json" \
    -d @sample_input.json
  ```
- Monitor the function:
  - Use Azure Monitor or Application Insights to monitor the execution and performance of the deployed function.
Contributions are welcome! Please fork the repository and submit a pull request with your changes.
- Fork the repository.
- Create a feature branch (`git checkout -b feature/new-feature`).
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature/new-feature`).
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.