
Architecture

Target picture

The challenge lends itself to a modular, layered architecture that separates source connection, processing, storage, analytics, delivery, and documentation. This separation improves traceability, makes individual components easier to replace, and supports further development after the hackathon.

Architecture overview

Layered architecture at a glance

| Layer | Purpose | Typical outputs |
| --- | --- | --- |
| Source Layer | connect public data sources, documents, and services | raw data, metadata, source references |
| Ingestion Layer | retrieval, download logic, and technical capture | API responses, files, snapshots |
| Raw Data Layer | store source artifacts unchanged | versioned originals and retrieval logs |
| Processing / ETL / ELT Layer | parsing, cleaning, normalization, and mapping | harmonized datasets and derived features |
| Storage Layer | structured storage for analysis and reuse | relational tables, geodata, optional document or vector stores |
| Analytics & Intelligence Layer | indicators, trend cards, weak signals, and evidence logic | calculated metrics, signals, assessments |
| API Layer | standardized delivery for frontend or other consumers | REST endpoints, OpenAPI, Swagger UI |
| Presentation Layer | dashboard, web app, and visualization | maps, timelines, indicators, filters |
| Documentation Layer | technical and analytical traceability | source inventory, architecture, reproducibility |
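
As a rough illustration of how the layers in the table might map to code, the sketch below treats each layer as a plain function with a narrow input and output. All names (`RawArtifact`, `Record`, `ingest`, `process`, `compute_indicator`, `fake_fetch`), the example URL, and the sample CSV payload are hypothetical and not part of the challenge material.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class RawArtifact:
    """Raw data layer: an unchanged payload plus its source reference."""
    source_url: str
    payload: str


@dataclass(frozen=True)
class Record:
    """Processing layer output: one harmonized row that keeps its source."""
    region: str
    year: int
    value: float
    source_url: str


def ingest(source_url: str, fetch) -> RawArtifact:
    # Ingestion layer: `fetch` is injected so a real HTTP client or a
    # test stub can be swapped in without touching the other layers.
    return RawArtifact(source_url=source_url, payload=fetch(source_url))


def process(artifact: RawArtifact) -> list[Record]:
    # Processing layer: parse and normalize, carrying the source along.
    reader = csv.DictReader(io.StringIO(artifact.payload))
    return [
        Record(
            region=row["region"].strip(),
            year=int(row["year"]),
            value=float(row["value"]),
            source_url=artifact.source_url,
        )
        for row in reader
    ]


def compute_indicator(records: list[Record]) -> dict[str, float]:
    # Analytics layer: mean value per region as a stand-in indicator.
    grouped: dict[str, list[float]] = {}
    for rec in records:
        grouped.setdefault(rec.region, []).append(rec.value)
    return {region: sum(vals) / len(vals) for region, vals in grouped.items()}


def fake_fetch(url: str) -> str:
    # Hypothetical stub standing in for a real download.
    return "region,year,value\nNorth,2022,10\nNorth,2023,14\nSouth,2023,7\n"


artifact = ingest("https://example.org/data.csv", fake_fetch)
records = process(artifact)
indicators = compute_indicator(records)
```

Because each function only depends on the output type of the previous layer, a connector or parser can be replaced or tested in isolation, and every `Record` still points back to the URL it came from.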

Architecture principles

  • Modularity: every layer should remain replaceable and testable on its own
  • Source transparency: every result should remain tied back to concrete sources
  • Versioning: raw data, transformation logic, and indicators should be reproducibly versioned
  • Reproducibility: documentation, schema, and access logic should remain reusable after the hackathon
  • Small core: prefer a few reliable components over a broad but fragile stack
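
One possible way to combine source transparency and versioning with very little tooling is to store each raw artifact under its content hash and to append every retrieval to a log. The following is a minimal sketch under that assumption; the function name `store_raw`, the file layout, the log file name, and the example URL are illustrative choices, not something the challenge prescribes.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def store_raw(raw_dir: Path, source_url: str, payload: bytes) -> dict:
    """Store a payload unchanged under its SHA-256 hash and log the retrieval."""
    digest = hashlib.sha256(payload).hexdigest()
    raw_dir.mkdir(parents=True, exist_ok=True)
    artifact_path = raw_dir / f"{digest}.bin"
    if not artifact_path.exists():
        # Identical content is stored once; a changed payload gets a new file.
        artifact_path.write_bytes(payload)
    entry = {
        "source_url": source_url,
        "sha256": digest,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "path": artifact_path.name,
    }
    # Append-only retrieval log, one JSON object per line (JSONL).
    with (raw_dir / "retrieval_log.jsonl").open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


# A temporary folder keeps the example self-contained.
raw_dir = Path(tempfile.mkdtemp()) / "raw"
first = store_raw(raw_dir, "https://example.org/data.csv", b"a,b\n1,2\n")
second = store_raw(raw_dir, "https://example.org/data.csv", b"a,b\n1,2\n")
changed = store_raw(raw_dir, "https://example.org/data.csv", b"a,b\n1,3\n")
```

Downstream results can then cite the `sha256` value of the artifact they were derived from, which ties an indicator back to one exact version of one concrete source.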

Technology guidance

| Area | Future hosted/server option | File-based MVP option |
| --- | --- | --- |
| SQL database | PostgreSQL | DuckDB or SQLite |
| Geospatial database | PostGIS | GeoParquet + DuckDB, or later PostGIS |
| Document or NoSQL storage | MongoDB | JSONL/Parquet + DuckDB, SQLite JSON, or TinyDB |
| Vector store | ChromaDB server/cloud | ChromaDB persistent folder |
| Search / full text | OpenSearch or Elasticsearch | SQLite FTS5, DuckDB FTS, Whoosh, or Tantivy |
| Backend / API | FastAPI | FastAPI |
| Frontend | React or Streamlit | React or Streamlit |
| Documentation | MkDocs | MkDocs |
| API documentation | Swagger UI / OpenAPI | Swagger UI / OpenAPI |
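
To make the file-based column more concrete, the sketch below uses SQLite through Python's standard `sqlite3` module to hold harmonized observations and derive a simple year-over-year change. The table name `observations`, its columns, and the sample rows are invented for illustration. The query uses plain SQL, so it should port to DuckDB or PostgreSQL with little change, although that is an expectation and has not been verified here.

```python
import sqlite3

# ":memory:" keeps the example self-contained; passing a file path instead
# would persist the database in the project's data folder.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observations (
        region     TEXT    NOT NULL,
        year       INTEGER NOT NULL,
        value      REAL    NOT NULL,
        source_url TEXT    NOT NULL
    )
    """
)

# Hypothetical sample rows; each row keeps a reference to its source.
rows = [
    ("North", 2022, 10.0, "https://example.org/data.csv"),
    ("North", 2023, 14.0, "https://example.org/data.csv"),
    ("South", 2022, 7.0, "https://example.org/data.csv"),
    ("South", 2023, 6.5, "https://example.org/data.csv"),
]
conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Year-over-year change per region via a self-join on the previous year.
changes = conn.execute(
    """
    SELECT cur.region, cur.year, cur.value - prev.value AS change
    FROM observations AS cur
    JOIN observations AS prev
      ON prev.region = cur.region AND prev.year = cur.year - 1
    ORDER BY cur.region, cur.year
    """
).fetchall()
```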

Minimal technical cut for an MVP

A pragmatic MVP can already work with the following building blocks:

  • connectors for four public sources
  • raw data storage plus simple versioning
  • DuckDB, SQLite, Parquet, and/or ChromaDB as no-budget file-based stores in the curated data folder
  • PostgreSQL/PostGIS, MongoDB, or OpenSearch only if the team later chooses future hosted/server infrastructure
  • FastAPI for data and indicator delivery
  • React or Streamlit frontend with a map and a timeline
  • MkDocs for technical and analytical documentation

Pragmatic focus

Not every optional component needs to be implemented during the hackathon. What matters is that the end-to-end flow from source to visualization works in a traceable way.