Add robust PyPI dependency caching with task queue

Replace unbounded thread spawning with a managed worker pool (sketched after this list):
- New pypi_cache_tasks table tracks caching jobs
- Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS)
- Automatic retries with exponential backoff (30s, 60s, then fail)
- Deduplication to prevent duplicate caching attempts
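
A minimal sketch of the queue/retry behavior described above, assuming a plain ThreadPoolExecutor with an in-memory dedup set. The helper names (enqueue_cache_task, cache_package, record_failure) are illustrative, not this module's actual API; only the worker count, the ORCHARD_PYPI_CACHE_WORKERS variable, and the 30s/60s delays come from this commit:

import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor

RETRY_DELAYS = (30, 60)  # seconds before the 2nd and 3rd attempts; then fail
WORKERS = int(os.environ.get("ORCHARD_PYPI_CACHE_WORKERS", "5"))

_executor = ThreadPoolExecutor(max_workers=WORKERS, thread_name_prefix="pypi-cache")
_in_flight: set = set()     # packages with a queued or running task
_lock = threading.Lock()

def cache_package(package: str) -> None:
    # Placeholder: the real worker downloads and stores the package here.
    raise NotImplementedError

def record_failure(package: str, exc: Exception) -> None:
    # Placeholder: the real worker would record the error (e.g. in pypi_cache_tasks).
    print(f"caching {package} failed: {exc}")

def enqueue_cache_task(package: str) -> bool:
    # Deduplication: refuse to queue a second task for the same package.
    with _lock:
        if package in _in_flight:
            return False
        _in_flight.add(package)
    _executor.submit(_run_with_retries, package)
    return True

def _run_with_retries(package: str) -> None:
    delays = (0,) + RETRY_DELAYS   # try immediately, then after 30s, then 60s
    try:
        for i, delay in enumerate(delays):
            time.sleep(delay)
            try:
                cache_package(package)
                return
            except Exception as exc:
                if i == len(delays) - 1:
                    record_failure(package, exc)
    finally:
        with _lock:
            _in_flight.discard(package)

Persisting task state in the new pypi_cache_tasks table (rather than only in memory, as this sketch does) is what lets the retry endpoints below find failed packages later.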

New API endpoints for visibility and control (usage sketch after this list):
- GET /pypi/cache/status - queue health summary
- GET /pypi/cache/failed - list failed tasks with errors
- POST /pypi/cache/retry/{package} - retry single package
- POST /pypi/cache/retry-all - retry all failed packages
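
For illustration, here is how a client might exercise these endpoints. The base URL and the response shape (a 'package' field on failed tasks) are assumptions about a local deployment, not part of this commit:

import requests  # third-party HTTP client, used here for brevity

BASE = "http://localhost:8000"  # assumed local Orchard instance

# Queue health summary
print(requests.get(f"{BASE}/pypi/cache/status").json())

# Inspect failures and retry them one at a time...
for task in requests.get(f"{BASE}/pypi/cache/failed").json():
    requests.post(f"{BASE}/pypi/cache/retry/{task['package']}")

# ...or retry everything in one call
requests.post(f"{BASE}/pypi/cache/retry-all")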

This fixes silent failures in background dependency caching where
packages would fail to cache without any tracking or retry mechanism.
commit d274f3f375 (parent 490b05438d)
Author: Mondo Diaz
Date: 2026-02-02 11:16:02 -06:00
7 changed files with 1071 additions and 94 deletions


@@ -15,6 +15,7 @@ from .pypi_proxy import router as pypi_router
 from .seed import seed_database
 from .auth import create_default_admin
 from .rate_limit import limiter
+from .pypi_cache_worker import init_cache_worker_pool, shutdown_cache_worker_pool
 
 settings = get_settings()
 logging.basicConfig(level=logging.INFO)
@@ -49,8 +50,13 @@ async def lifespan(app: FastAPI):
     else:
         logger.info(f"Running in {settings.env} mode - skipping seed data")
 
+    # Initialize PyPI cache worker pool
+    init_cache_worker_pool()
+
     yield
 
-    # Shutdown: cleanup if needed
+    # Shutdown: cleanup
+    shutdown_cache_worker_pool()
+
 
 app = FastAPI(
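
For context, a sketch of what the init/shutdown pair wired into the lifespan hook above might look like, assuming a module-level ThreadPoolExecutor; only the two function names, the worker count, and the env var are taken from this commit:

# pypi_cache_worker.py (sketch)
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

_pool: Optional[ThreadPoolExecutor] = None

def init_cache_worker_pool() -> None:
    # Create the shared pool once at app startup (idempotent).
    global _pool
    if _pool is None:
        workers = int(os.environ.get("ORCHARD_PYPI_CACHE_WORKERS", "5"))
        _pool = ThreadPoolExecutor(max_workers=workers, thread_name_prefix="pypi-cache")

def shutdown_cache_worker_pool() -> None:
    # Drain in-flight caching jobs before the process exits.
    global _pool
    if _pool is not None:
        _pool.shutdown(wait=True)
        _pool = None

Tying the pool to FastAPI's lifespan, as the diff does, ensures worker threads are not left running across reloads or shutdown.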