title | emoji | colorFrom | colorTo | sdk | pinned | license |
---|---|---|---|---|---|---|
Geo Llm R |
📚 |
blue |
yellow |
docker |
false |
bsd-2-clause |
🤗 Shiny App on Huggingface: https://huggingface.co/spaces/boettiger-lab/geo-llm-r
Work in progress. This is a proof-of-principle for an LLM-driven interface to dynamic mapping. Key technologies include duckdb, geoparquet, pmtiles, maplibre, open LLMs (via VLLM + LiteLLM). R interface through ellmer (LLMs), mapgl (maplibre), shiny, and duckdb.
All edits should be pushed to GitHub. Edits to main
branch are automatically deployed to HuggingFace via GitHub Actions.
When using this scaffold, you will first have to set up your auto-deploy system:
- Create a new HuggingFace Space (any template is fine, will be overwritten).
- Create a HuggingFace Token with write permissions if you do not have one.
- In the GitHub Settings of your repository, add the token as a "New Repository Secret" under the
Secrets and Variables
->Actions
section of settings (https://github.com/{USER}/{REPO}/settings/secrets/actions
). - Edit the
.github/workflows/deploy.yml
file to specify your HuggingFace user name and HF repo to publish to.
This example is designed to be able to leverage open source or open weights models. You will need to adjust the API URL and API key accordingly. This could be a local model with vllm
or ollama
, and of course commercial models should work too. The demo app currently runs on an VLLM+LiteLLM backed model, currently a Llama3 variant, hosted on the National Research Platform.
The LLM plays only a simple role in generating SQL queries from background information on the data including the table schema, see the system prompt for details. Most open models I have experimented with do not support the tool use or structured data interfaces very well compared to commercial models. An important trick in working with open models used here is merely requesting the reply be structured as JSON. Open models are quite decent at this, and at SQL construction, given necessary context about the data. The map and chart elements merely react the resulting data frames, and the entire analysis is thus transparent and reproducible as it would be if the user had composed their request in SQL instead of plain English.
The Dockerfile includes all dependencies required for the HuggingFace deployment, and can be used as a template or directly to serve RStudio server.
Pre-processing the data into cloud-native formats and hosting data on a high bandwidth, highly avalialbe server is essential for efficient and scalable renending. Pre-computing expensive operations such as zonal statistics across all features is also necessary. These steps are described in preprocess.md and corresponding scripts.