feat: add auto-fetch for missing dependencies from upstream registries

Add an auto_fetch parameter to the dependency resolution endpoint. When
enabled, missing dependencies are fetched from upstream registries
(currently PyPI) during resolution.

- Add RegistryClient abstraction with PyPIRegistryClient implementation
- Extract fetch_and_cache_pypi_package() for reuse
- Add resolve_dependencies_with_fetch() async function
- Extend MissingDependency schema with fetch_attempted/fetch_error
- Add fetched list to DependencyResolutionResponse
- Add auto_fetch_max_depth config setting (default: 3)
- Remove Usage section from Package page UI
- Add 6 integration tests for auto-fetch functionality
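
The resolution flow described above can be illustrated with a minimal sketch. It is not the code from this commit: the method name `fetch_dependencies`, the `InMemoryRegistryClient` stand-in, the `Resolution` container, the `resolve_with_fetch` function, and the exact depth semantics are all assumptions made for illustration. Only `RegistryClient`, `fetch_attempted`, `fetch_error`, `fetched`, and the default max depth of 3 come from the commit message.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class RegistryClient(ABC):
    """Hypothetical interface; the real ABC's methods are not shown here."""

    @abstractmethod
    def fetch_dependencies(self, name: str) -> Optional[List[str]]:
        """Return the dependency names of `name`, or None if upstream lacks it."""


class InMemoryRegistryClient(RegistryClient):
    """Stand-in for an upstream registry such as PyPI (no network access)."""

    def __init__(self, index: Dict[str, List[str]]):
        self._index = index

    def fetch_dependencies(self, name: str) -> Optional[List[str]]:
        return self._index.get(name)


@dataclass
class Resolution:
    resolved: List[str] = field(default_factory=list)
    fetched: List[str] = field(default_factory=list)
    missing: List[dict] = field(default_factory=list)


def resolve_with_fetch(root, cache, client, auto_fetch=True, max_depth=3):
    """Walk the dependency tree, fetching uncached packages up to max_depth."""
    result = Resolution()
    seen = set()

    def visit(name, depth):
        if name in seen:
            return
        seen.add(name)
        if name not in cache:
            entry = {"name": name, "fetch_attempted": False, "fetch_error": None}
            if not auto_fetch:
                result.missing.append(entry)
                return
            if depth > max_depth:
                # Assumed behaviour: stop fetching past the configured depth.
                entry["fetch_error"] = "max depth exceeded"
                result.missing.append(entry)
                return
            entry["fetch_attempted"] = True
            deps = client.fetch_dependencies(name)
            if deps is None:
                entry["fetch_error"] = "not found upstream"
                result.missing.append(entry)
                return
            cache[name] = deps  # cache the fetched package for later lookups
            result.fetched.append(name)
        result.resolved.append(name)
        for dep in cache[name]:
            visit(dep, depth + 1)

    visit(root, 0)
    return result


if __name__ == "__main__":
    upstream = InMemoryRegistryClient({"requests": ["urllib3"], "urllib3": []})
    outcome = resolve_with_fetch("app", {"app": ["requests", "ghost"]}, upstream)
    print(outcome.fetched)
    print(outcome.missing)
```

In this sketch a dependency that cannot be found upstream is reported as missing with `fetch_attempted` set, while one skipped because auto-fetch is off or the depth limit was hit is reported with `fetch_attempted` left false.
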
Mondo Diaz
2026-02-04 12:01:49 -06:00
parent 9f233e0d4d
commit cbc2e5e11a
10 changed files with 1348 additions and 65 deletions

@@ -10,6 +10,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added S3 bucket provisioning terraform configuration (#59)
  - Creates an S3 bucket to be used for anything Orchard
  - Creates a log bucket for any logs tracking the S3 bucket
- Added auto-fetch capability to dependency resolution endpoint
  - `GET /api/v1/project/{project}/{package}/+/{ref}/resolve?auto_fetch=true` fetches missing dependencies from upstream registries
  - PyPI registry client queries PyPI JSON API to resolve version constraints
  - Fetched artifacts are cached and included in response `fetched` field
  - Missing dependencies show `fetch_attempted` and `fetch_error` status
  - Configurable max fetch depth via `ORCHARD_AUTO_FETCH_MAX_DEPTH` (default: 3)
- Added `backend/app/registry_client.py` with extensible registry client abstraction
  - `RegistryClient` ABC for implementing upstream registry clients
  - `PyPIRegistryClient` implementation using PyPI JSON API
  - `get_registry_client()` factory function for future npm/maven support
- Added `fetch_and_cache_pypi_package()` reusable function for PyPI package fetching
- Added HTTP connection pooling infrastructure for improved PyPI proxy performance
  - `HttpClientManager` with configurable pool size, timeouts, and thread pool executor
  - Eliminates per-request connection overhead (~100-500ms → ~5ms)
@@ -36,6 +47,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added `POST /api/v1/cache/resolve` endpoint to cache packages by coordinates instead of URL (#108)
### Changed
- Removed Usage section from Package page (curl command examples)
- PyPI proxy now uses shared HTTP connection pool instead of per-request clients
- PyPI proxy now caches upstream source configuration in Redis
- Dependency storage now uses batch INSERT instead of individual queries