A modular, OpenAI-compatible FastAPI server powered by GPT4Free (G4F) that provides access to multiple AI providers through a unified API interface.
- OpenAI-Compatible API: Full compatibility with OpenAI API v1 endpoints
- Multiple AI Providers: Access to various AI providers through G4F integration
- Chat Completions: Support for both streaming and non-streaming chat responses
- Image Generation: AI-powered image generation with multiple models
- Modular Architecture: Clean, maintainable code structure with separation of concerns
- Rate Limiting: Built-in request rate limiting and throttling
- Request Logging: Comprehensive logging and monitoring
- Error Handling: Robust error handling with detailed error responses
- CORS Support: Configurable Cross-Origin Resource Sharing
- Health Monitoring: Health check and status endpoints
ai-fast-api/
├── app/
│ ├── __init__.py # Package initialization with version info
│ ├── main.py # FastAPI application factory
│ ├── config.py # Configuration management
│ ├── models.py # Pydantic models for requests/responses
│ ├── api/
│ │ ├── __init__.py # API routes package
│ │ ├── chat.py # Chat completion endpoints
│ │ ├── images.py # Image generation endpoints
│ │ ├── models.py # Models and providers endpoints
│ │ └── health.py # Health check endpoints
│ ├── services/
│ │ ├── __init__.py # Services package
│ │ └── g4f_service.py # G4F integration service
│ └── utils/
│ ├── __init__.py # Utilities package
│ ├── logger.py # Centralized logging
│ └── middleware.py # Custom middleware
├── main.py # Application entry point
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.9+
uv
package manager (recommended)
-
Clone the repository:
git clone <repository-url> cd ai-fast-api
-
Install dependencies using uv (recommended):
python install_dependencies.py
This script uses `uv` to install dependencies from `requirements.txt`. If `uv` is not installed, the script will provide instructions on how to install it.
Or using pip:
```bash
pip install -r requirements.txt
-
Configure environment variables:
cp .env.example .env # Edit .env with your preferred settings
To run the FastAPI application in development mode (activates virtual environment and starts the server):
python dev_start.py
Alternatively, you can run it directly with Uvicorn (ensure your virtual environment is activated):
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Once the server is running, you can access:
- API Documentation: http://localhost:8000/docs
- ReDoc Documentation: http://localhost:8000/redoc
- Health Check: http://localhost:8000/health
curl -X GET "http://localhost:8000/v1/models" \
-H "Authorization: Bearer sk-test-key-123"
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test-key-123" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Hello, how are you?"
}
],
"temperature": 0.7,
"max_tokens": 150
}'
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test-key-123" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Write a short poem about technology"
}
],
"stream": true
}'
from openai import OpenAI
# Initialize client with your local server
client = OpenAI(
api_key="sk-test-key-123",
base_url="http://localhost:8000/v1"
)
# Create a chat completion
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": "Explain quantum computing in simple terms"}
],
temperature=0.7,
max_tokens=200
)
print(response.choices[0].message.content)
from openai import OpenAI
client = OpenAI(
api_key="sk-test-key-123",
base_url="http://localhost:8000/v1"
)
# Create a streaming chat completion
stream = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "user", "content": "Tell me a story about AI"}
],
stream=True
)
for chunk in stream:
if chunk.choices[0].delta.content is not None:
print(chunk.choices[0].delta.content, end="")
gpt-4o-mini
(default)gpt-4o
gpt-4
gpt-3.5-turbo
claude-3-sonnet
gemini-pro
Configuration is managed via environment variables. A .env.example
file is provided to help you set up. Copy this file to .env
and modify the values as needed.
Key variables include:
HOST
: Server host (default:0.0.0.0
)PORT
: Server port (default:8000
)DEBUG
: Enable debug mode (default:false
)G4F_PROVIDER
: Default G4F provider (default:auto
)G4F_MODEL
: Default G4F model (default:gpt-3.5-turbo
)G4F_TIMEOUT
: G4F request timeout in seconds (default:60
)G4F_RETRIES
: Number of G4F request retries (default:3
)RATE_LIMIT_REQUESTS
: Max requests per minute for rate limiting (default:60
)RATE_LIMIT_WINDOW
: Rate limit window in seconds (default:60
)CORS_ENABLED
: Enable CORS (default:true
)CORS_ORIGINS
: Comma-separated list of allowed CORS origins (default:*
)OPENAI_API_BASE
: Base path for OpenAI compatible API (default:/v1
)
These can be set directly in your environment or by creating a .env
file in the project root.
By default, the following API keys are configured for testing:
sk-test-key-123
(test environment)secret
(development)- Custom key from
API_KEY
environment variable
POST /v1/chat/completions
Create a chat completion with support for streaming and non-streaming responses.
Request Body:
{
"model": "gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Hello!"
}
],
"temperature": 0.7,
"max_tokens": 150,
"stream": false
}
GET /v1/models
List all available models.
GET /health
Check server health status.
The API includes comprehensive error handling:
- 401 Unauthorized: Invalid or missing API key
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Server or AI service errors
Errors are returned in OpenAI-compatible format:
{
"error": {
"message": "Invalid API key",
"type": "invalid_request_error",
"code": 401
}
}
The server implements rate limiting to prevent abuse:
- Default: 10 requests per second per client
- Configurable via environment variables
- Uses token bucket algorithm with
aiolimiter
Built-in retry mechanism for handling transient errors:
- Maximum 3 retry attempts
- Exponential backoff (4-10 seconds)
- Automatic retry on network/service errors
ai-fast-api/
├── main.py # Main FastAPI application
├── requirements.txt # Python dependencies
├── .env # Environment configuration
├── README.md # This file
└── .gitignore # Git ignore rules
# With auto-reload
python main.py
# Or with uvicorn directly
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Test the API endpoints:
# Health check
curl http://localhost:8000/health
# List models
curl -H "Authorization: Bearer sk-test-key-123" http://localhost:8000/v1/models
# Chat completion
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-test-key-123" \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello!"}]}'
Create a Dockerfile
:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
-
Security:
- Use strong, unique API keys
- Enable HTTPS/TLS
- Implement proper authentication
- Set up firewall rules
-
Performance:
- Use multiple workers:
uvicorn main:app --workers 4
- Configure appropriate rate limits
- Monitor resource usage
- Use multiple workers:
-
Monitoring:
- Set up logging aggregation
- Monitor API response times
- Track error rates
- Monitor G4F service availability
-
Import Error: Ensure all dependencies are installed
uv pip install -r requirements.txt
-
G4F Connection Issues: Check internet connection and G4F service status
-
Rate Limiting: Adjust
MAX_REQUESTS_PER_SECOND
in.env
-
Authentication Errors: Verify API key in request headers
Check server logs for detailed error information:
# Run with debug logging
LOG_LEVEL=debug python main.py
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License.
- GPT4Free (G4F) for providing free AI model access
- FastAPI for the excellent web framework
- OpenAI for the API specification