Compare commits
123 Commits
196f3f957c
...
fix/pypi-p
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
ec518519b2 | ||
|
|
968cb00477 | ||
|
|
262aff6e97 | ||
|
|
1389a03c69 | ||
|
|
a45ec46e94 | ||
|
|
1202947620 | ||
|
|
f5c9e438a0 | ||
|
|
aff08ad393 | ||
|
|
cdb3b5ecb3 | ||
|
|
659ecf6f73 | ||
|
|
15cd90b36d | ||
|
|
65bb073a6e | ||
|
|
cbc2e5e11a | ||
|
|
9f233e0d4d | ||
|
|
b27eb0a928 | ||
|
|
9a1d578525 | ||
|
|
08291a2f56 | ||
|
|
f8ad957ff9 | ||
|
|
331745320d | ||
|
|
a6fee37ea9 | ||
|
|
b1056f2286 | ||
|
|
2a423d66c0 | ||
|
|
cd9940da01 | ||
|
|
bdfc525e71 | ||
|
|
8d04dd5449 | ||
|
|
743ce26e54 | ||
|
|
39ae40f1c6 | ||
|
|
ca8f62f69b | ||
|
|
b55c810100 | ||
|
|
bef16d884b | ||
|
|
a97d3e630f | ||
|
|
7b0d423bee | ||
|
|
8731b42d3e | ||
|
|
a442778458 | ||
|
|
36c05230ff | ||
|
|
dc9c217d8a | ||
|
|
da3fd7a601 | ||
|
|
9a2b323fd8 | ||
|
|
6b3522aef2 | ||
|
|
f37d3e3e9a | ||
|
|
308057784e | ||
|
|
86c95bea2b | ||
|
|
cc5d67abd6 | ||
|
|
eb287edbda | ||
|
|
86e971381a | ||
|
|
cf2fe5151f | ||
|
|
2ae479146f | ||
|
|
a0dad73db0 | ||
|
|
b40c53d308 | ||
|
|
f04149b410 | ||
|
|
aa851ab445 | ||
|
|
9313942f53 | ||
|
|
9a795a301a | ||
|
|
9f13221012 | ||
|
|
a99381aafb | ||
|
|
d422ed5cd8 | ||
|
|
b2a8c7cfcc | ||
|
|
eb11efd001 | ||
|
|
02e69c65ee | ||
|
|
34d98f52cb | ||
|
|
29fa53d174 | ||
|
|
63de1ce672 | ||
|
|
0b85f37abd | ||
|
|
101152f87f | ||
|
|
3a09accfe6 | ||
|
|
88765b4f50 | ||
|
|
152af0a852 | ||
|
|
31edadf3ad | ||
|
|
2136e1f0c5 | ||
|
|
ff25677b16 | ||
|
|
0a6dad9af0 | ||
|
|
36cf288526 | ||
|
|
7008d913bf | ||
|
|
46e8c7df70 | ||
|
|
a3929bfb17 | ||
|
|
db2805a36c | ||
|
|
7a6e270d63 | ||
|
|
df4f9d168b | ||
|
|
1f98caa73c | ||
|
|
a485852a6f | ||
|
|
5517048f05 | ||
|
|
c7eca269f4 | ||
|
|
6a3a875a9c | ||
|
|
a39b6f098f | ||
|
|
e0562195df | ||
|
|
db7d0bb7c4 | ||
|
|
4a287d46c8 | ||
|
|
cbea91a528 | ||
|
|
80e2f3d157 | ||
|
|
522d23ec01 | ||
|
|
c1060feb5f | ||
|
|
e62e75bade | ||
|
|
befa517485 | ||
|
|
7a2c0a54c6 | ||
|
|
ead016208d | ||
|
|
4b76ca2046 | ||
|
|
94bbd87e6b | ||
|
|
2cf04a43ef | ||
|
|
9acef055b6 | ||
|
|
694f25ac9b | ||
|
|
06b2beb152 | ||
|
|
2b2dbae38b | ||
|
|
cd56d00ebf | ||
|
|
558e1bc78f | ||
|
|
32218dbb1c | ||
|
|
006df9dff9 | ||
|
|
844e937071 | ||
|
|
77c7526023 | ||
|
|
ec69d7619b | ||
|
|
8e3af8c4f5 | ||
|
|
24a0a71cf4 | ||
|
|
ab50148a60 | ||
|
|
acee458b3c | ||
|
|
f18b8ed560 | ||
|
|
7e84dd3958 | ||
|
|
a72c9d3f6e | ||
|
|
a6618fe550 | ||
|
|
796176c251 | ||
|
|
f58fb0079a | ||
|
|
f57762334f | ||
|
|
599c8c1d5b | ||
|
|
11c5aee0f1 | ||
|
|
1b706fe858 |
90
CHANGELOG.md
90
CHANGELOG.md
@@ -7,6 +7,37 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
|
|
||||||
## [Unreleased]
|
## [Unreleased]
|
||||||
### Added
|
### Added
|
||||||
|
- Added S3 bucket provisioning terraform configuration (#59)
|
||||||
|
- Creates an S3 bucket to be used for anything Orchard
|
||||||
|
- Creates a log bucket for any logs tracking the S3 bucket
|
||||||
|
- Added auto-fetch capability to dependency resolution endpoint
|
||||||
|
- `GET /api/v1/project/{project}/{package}/+/{ref}/resolve?auto_fetch=true` fetches missing dependencies from upstream registries
|
||||||
|
- PyPI registry client queries PyPI JSON API to resolve version constraints
|
||||||
|
- Fetched artifacts are cached and included in response `fetched` field
|
||||||
|
- Missing dependencies show `fetch_attempted` and `fetch_error` status
|
||||||
|
- Configurable max fetch depth via `ORCHARD_AUTO_FETCH_MAX_DEPTH` (default: 3)
|
||||||
|
- Added `backend/app/registry_client.py` with extensible registry client abstraction
|
||||||
|
- `RegistryClient` ABC for implementing upstream registry clients
|
||||||
|
- `PyPIRegistryClient` implementation using PyPI JSON API
|
||||||
|
- `get_registry_client()` factory function for future npm/maven support
|
||||||
|
- Added `fetch_and_cache_pypi_package()` reusable function for PyPI package fetching
|
||||||
|
- Added HTTP connection pooling infrastructure for improved PyPI proxy performance
|
||||||
|
- `HttpClientManager` with configurable pool size, timeouts, and thread pool executor
|
||||||
|
- Eliminates per-request connection overhead (~100-500ms → ~5ms)
|
||||||
|
- Added Redis caching layer with category-aware TTL for hermetic builds
|
||||||
|
- `CacheService` with graceful fallback when Redis unavailable
|
||||||
|
- Immutable data (artifact metadata, dependencies) cached forever
|
||||||
|
- Mutable data (package index, versions) uses configurable TTL
|
||||||
|
- Added `ArtifactRepository` for batch database operations
|
||||||
|
- `batch_upsert_dependencies()` reduces N+1 queries to single INSERT
|
||||||
|
- `get_or_create_artifact()` uses atomic ON CONFLICT upsert
|
||||||
|
- Added infrastructure status to health endpoint (`/health`)
|
||||||
|
- Reports HTTP pool size and worker threads
|
||||||
|
- Reports Redis cache connection status
|
||||||
|
- Added new configuration settings for HTTP client, Redis, and cache TTL
|
||||||
|
- `ORCHARD_HTTP_MAX_CONNECTIONS`, `ORCHARD_HTTP_CONNECT_TIMEOUT`, etc.
|
||||||
|
- `ORCHARD_REDIS_HOST`, `ORCHARD_REDIS_PORT`, `ORCHARD_REDIS_ENABLED`
|
||||||
|
- `ORCHARD_CACHE_TTL_INDEX`, `ORCHARD_CACHE_TTL_VERSIONS`, etc.
|
||||||
- Added transparent PyPI proxy implementing PEP 503 Simple API (#108)
|
- Added transparent PyPI proxy implementing PEP 503 Simple API (#108)
|
||||||
- `GET /pypi/simple/` - package index (proxied from upstream)
|
- `GET /pypi/simple/` - package index (proxied from upstream)
|
||||||
- `GET /pypi/simple/{package}/` - version list with rewritten download links
|
- `GET /pypi/simple/{package}/` - version list with rewritten download links
|
||||||
@@ -14,35 +45,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
- Allows `pip install --index-url https://orchard.../pypi/simple/ <package>`
|
- Allows `pip install --index-url https://orchard.../pypi/simple/ <package>`
|
||||||
- Artifacts cached on first access through configured upstream sources
|
- Artifacts cached on first access through configured upstream sources
|
||||||
- Added `POST /api/v1/cache/resolve` endpoint to cache packages by coordinates instead of URL (#108)
|
- Added `POST /api/v1/cache/resolve` endpoint to cache packages by coordinates instead of URL (#108)
|
||||||
|
|
||||||
### Changed
|
|
||||||
- Upstream sources table text is now centered under column headers (#108)
|
|
||||||
- ENV badge now appears inline with source name instead of separate column (#108)
|
|
||||||
- Test and Edit buttons now have more prominent button styling (#108)
|
|
||||||
- Reduced footer padding for cleaner layout (#108)
|
|
||||||
|
|
||||||
### Fixed
|
|
||||||
- Fixed purge_seed_data crash when deleting access permissions - was comparing UUID to VARCHAR column (#107)
|
|
||||||
|
|
||||||
### Changed
|
|
||||||
- Upstream source connectivity test no longer follows redirects, fixing "Exceeded maximum allowed redirects" error with Artifactory proxies (#107)
|
|
||||||
- Test runs automatically after saving a new or updated upstream source (#107)
|
|
||||||
- Test status now shows as colored dots (green=success, red=error) instead of text badges (#107)
|
|
||||||
- Clicking red dot shows error details in a modal (#107)
|
|
||||||
- Source name column no longer wraps text for better table layout (#107)
|
|
||||||
- Renamed "Cache Management" page to "Upstream Sources" (#107)
|
|
||||||
- Moved Delete button from table row to edit modal for cleaner table layout (#107)
|
|
||||||
|
|
||||||
### Removed
|
|
||||||
- Removed `is_public` field from upstream sources - all sources are now treated as internal/private (#107)
|
|
||||||
- Removed `allow_public_internet` (air-gap mode) setting from cache settings - not needed for enterprise proxy use case (#107)
|
|
||||||
- Removed seeding of public registry URLs (npm-public, pypi-public, maven-central, docker-hub) (#107)
|
|
||||||
- Removed "Public" badge and checkbox from upstream sources UI (#107)
|
|
||||||
- Removed "Allow Public Internet" toggle from cache settings UI (#107)
|
|
||||||
- Removed "Global Settings" section from cache management UI - auto-create system projects is always enabled (#107)
|
|
||||||
- Removed unused CacheSettings frontend types and API functions (#107)
|
|
||||||
|
|
||||||
### Added
|
|
||||||
- Added `ORCHARD_PURGE_SEED_DATA` environment variable support to stage helm values to remove seed data from long-running deployments (#107)
|
- Added `ORCHARD_PURGE_SEED_DATA` environment variable support to stage helm values to remove seed data from long-running deployments (#107)
|
||||||
- Added frontend system projects visual distinction (#105)
|
- Added frontend system projects visual distinction (#105)
|
||||||
- "Cache" badge for system projects in project list
|
- "Cache" badge for system projects in project list
|
||||||
@@ -209,6 +211,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
- Added comprehensive integration tests for all dependency features
|
- Added comprehensive integration tests for all dependency features
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
- Removed Usage section from Package page (curl command examples)
|
||||||
|
- PyPI proxy now uses shared HTTP connection pool instead of per-request clients
|
||||||
|
- PyPI proxy now caches upstream source configuration in Redis
|
||||||
|
- Dependency storage now uses batch INSERT instead of individual queries
|
||||||
|
- Increased default database pool size from 5 to 20 connections
|
||||||
|
- Increased default database max overflow from 10 to 30 connections
|
||||||
|
- Enabled Redis in Helm chart values for dev, stage, and prod environments
|
||||||
|
- Upstream sources table text is now centered under column headers (#108)
|
||||||
|
- ENV badge now appears inline with source name instead of separate column (#108)
|
||||||
|
- Test and Edit buttons now have more prominent button styling (#108)
|
||||||
|
- Reduced footer padding for cleaner layout (#108)
|
||||||
|
- Upstream source connectivity test no longer follows redirects, fixing "Exceeded maximum allowed redirects" error with Artifactory proxies (#107)
|
||||||
|
- Test runs automatically after saving a new or updated upstream source (#107)
|
||||||
|
- Test status now shows as colored dots (green=success, red=error) instead of text badges (#107)
|
||||||
|
- Clicking red dot shows error details in a modal (#107)
|
||||||
|
- Source name column no longer wraps text for better table layout (#107)
|
||||||
|
- Renamed "Cache Management" page to "Upstream Sources" (#107)
|
||||||
|
- Moved Delete button from table row to edit modal for cleaner table layout (#107)
|
||||||
- Added pre-test stage reset to ensure known environment state before integration tests (#54)
|
- Added pre-test stage reset to ensure known environment state before integration tests (#54)
|
||||||
- Upload endpoint now accepts optional `ensure` file parameter for declaring dependencies
|
- Upload endpoint now accepts optional `ensure` file parameter for declaring dependencies
|
||||||
- Updated upload API documentation with ensure file format and examples
|
- Updated upload API documentation with ensure file format and examples
|
||||||
@@ -217,8 +237,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||||||
- Added orchard logo icon and dot separator to footer
|
- Added orchard logo icon and dot separator to footer
|
||||||
|
|
||||||
### Fixed
|
### Fixed
|
||||||
|
- Fixed purge_seed_data crash when deleting access permissions - was comparing UUID to VARCHAR column (#107)
|
||||||
- Fixed dark theme styling for team pages - modals, forms, and dropdowns now use correct theme variables
|
- Fixed dark theme styling for team pages - modals, forms, and dropdowns now use correct theme variables
|
||||||
- Fixed UserAutocomplete and TeamSelector dropdown backgrounds for dark theme
|
- Fixed UserAutocomplete and TeamSelector dropdown backgrounds for dark theme
|
||||||
|
- Fixed PyPI proxy filtering platform-specific dependencies (pyobjc on macOS, pywin32 on Windows)
|
||||||
|
- Fixed bare version constraints being treated as wildcards (e.g., `certifi@2025.10.5` now fetches exact version)
|
||||||
|
|
||||||
|
### Removed
|
||||||
|
- Removed `is_public` field from upstream sources - all sources are now treated as internal/private (#107)
|
||||||
|
- Removed `allow_public_internet` (air-gap mode) setting from cache settings - not needed for enterprise proxy use case (#107)
|
||||||
|
- Removed seeding of public registry URLs (npm-public, pypi-public, maven-central, docker-hub) (#107)
|
||||||
|
- Removed "Public" badge and checkbox from upstream sources UI (#107)
|
||||||
|
- Removed "Allow Public Internet" toggle from cache settings UI (#107)
|
||||||
|
- Removed "Global Settings" section from cache management UI - auto-create system projects is always enabled (#107)
|
||||||
|
- Removed unused CacheSettings frontend types and API functions (#107)
|
||||||
|
|
||||||
## [0.5.1] - 2026-01-23
|
## [0.5.1] - 2026-01-23
|
||||||
### Changed
|
### Changed
|
||||||
|
|||||||
262
backend/app/cache_service.py
Normal file
262
backend/app/cache_service.py
Normal file
@@ -0,0 +1,262 @@
|
|||||||
|
"""
|
||||||
|
Redis-backed caching service with category-aware TTL and invalidation.
|
||||||
|
|
||||||
|
Provides:
|
||||||
|
- Immutable caching for artifact data (hermetic builds)
|
||||||
|
- TTL-based caching for discovery data
|
||||||
|
- Event-driven invalidation for config changes
|
||||||
|
- Graceful fallback when Redis unavailable
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from enum import Enum
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from .config import Settings
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class CacheCategory(Enum):
|
||||||
|
"""
|
||||||
|
Cache categories with different TTL and invalidation rules.
|
||||||
|
|
||||||
|
Immutable (cache forever):
|
||||||
|
- ARTIFACT_METADATA: Artifact info by SHA256
|
||||||
|
- ARTIFACT_DEPENDENCIES: Extracted deps by SHA256
|
||||||
|
- DEPENDENCY_RESOLUTION: Resolution results by input hash
|
||||||
|
|
||||||
|
Mutable (TTL + event invalidation):
|
||||||
|
- UPSTREAM_SOURCES: Upstream config, invalidate on DB change
|
||||||
|
- PACKAGE_INDEX: PyPI/npm index pages, TTL only
|
||||||
|
- PACKAGE_VERSIONS: Version listings, TTL only
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Immutable - cache forever (hermetic builds)
|
||||||
|
ARTIFACT_METADATA = "artifact"
|
||||||
|
ARTIFACT_DEPENDENCIES = "deps"
|
||||||
|
DEPENDENCY_RESOLUTION = "resolve"
|
||||||
|
|
||||||
|
# Mutable - TTL + event invalidation
|
||||||
|
UPSTREAM_SOURCES = "upstream"
|
||||||
|
PACKAGE_INDEX = "index"
|
||||||
|
PACKAGE_VERSIONS = "versions"
|
||||||
|
|
||||||
|
|
||||||
|
def get_category_ttl(category: CacheCategory, settings: Settings) -> Optional[int]:
|
||||||
|
"""
|
||||||
|
Get TTL for a cache category.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
TTL in seconds, or None for no expiry (immutable).
|
||||||
|
"""
|
||||||
|
ttl_map = {
|
||||||
|
# Immutable - no TTL
|
||||||
|
CacheCategory.ARTIFACT_METADATA: None,
|
||||||
|
CacheCategory.ARTIFACT_DEPENDENCIES: None,
|
||||||
|
CacheCategory.DEPENDENCY_RESOLUTION: None,
|
||||||
|
# Mutable - configurable TTL
|
||||||
|
CacheCategory.UPSTREAM_SOURCES: settings.cache_ttl_upstream,
|
||||||
|
CacheCategory.PACKAGE_INDEX: settings.cache_ttl_index,
|
||||||
|
CacheCategory.PACKAGE_VERSIONS: settings.cache_ttl_versions,
|
||||||
|
}
|
||||||
|
return ttl_map.get(category)
|
||||||
|
|
||||||
|
|
||||||
|
class CacheService:
|
||||||
|
"""
|
||||||
|
Redis-backed caching with category-aware TTL.
|
||||||
|
|
||||||
|
Key format: orchard:{category}:{protocol}:{identifier}
|
||||||
|
Example: orchard:deps:pypi:abc123def456
|
||||||
|
|
||||||
|
When Redis is disabled or unavailable, operations gracefully
|
||||||
|
return None/no-op to allow the application to function without caching.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, settings: Settings):
|
||||||
|
self._settings = settings
|
||||||
|
self._enabled = settings.redis_enabled
|
||||||
|
self._redis: Optional["redis.asyncio.Redis"] = None
|
||||||
|
self._started = False
|
||||||
|
|
||||||
|
async def startup(self) -> None:
|
||||||
|
"""Initialize Redis connection. Called by FastAPI lifespan."""
|
||||||
|
if self._started:
|
||||||
|
return
|
||||||
|
|
||||||
|
if not self._enabled:
|
||||||
|
logger.info("CacheService disabled (redis_enabled=False)")
|
||||||
|
self._started = True
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
import redis.asyncio as redis
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Connecting to Redis at {self._settings.redis_host}:"
|
||||||
|
f"{self._settings.redis_port}/{self._settings.redis_db}"
|
||||||
|
)
|
||||||
|
|
||||||
|
self._redis = redis.Redis(
|
||||||
|
host=self._settings.redis_host,
|
||||||
|
port=self._settings.redis_port,
|
||||||
|
db=self._settings.redis_db,
|
||||||
|
password=self._settings.redis_password,
|
||||||
|
decode_responses=False, # We handle bytes
|
||||||
|
)
|
||||||
|
|
||||||
|
# Test connection
|
||||||
|
await self._redis.ping()
|
||||||
|
logger.info("CacheService connected to Redis")
|
||||||
|
|
||||||
|
except ImportError:
|
||||||
|
logger.warning("redis package not installed, caching disabled")
|
||||||
|
self._enabled = False
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Redis connection failed, caching disabled: {e}")
|
||||||
|
self._enabled = False
|
||||||
|
self._redis = None
|
||||||
|
|
||||||
|
self._started = True
|
||||||
|
|
||||||
|
async def shutdown(self) -> None:
|
||||||
|
"""Close Redis connection. Called by FastAPI lifespan."""
|
||||||
|
if not self._started:
|
||||||
|
return
|
||||||
|
|
||||||
|
if self._redis:
|
||||||
|
await self._redis.aclose()
|
||||||
|
self._redis = None
|
||||||
|
|
||||||
|
self._started = False
|
||||||
|
logger.info("CacheService shutdown complete")
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _make_key(category: CacheCategory, protocol: str, identifier: str) -> str:
|
||||||
|
"""Build namespaced cache key."""
|
||||||
|
return f"orchard:{category.value}:{protocol}:{identifier}"
|
||||||
|
|
||||||
|
async def get(
|
||||||
|
self,
|
||||||
|
category: CacheCategory,
|
||||||
|
key: str,
|
||||||
|
protocol: str = "default",
|
||||||
|
) -> Optional[bytes]:
|
||||||
|
"""
|
||||||
|
Get cached value.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
category: Cache category for TTL rules
|
||||||
|
key: Unique identifier within category
|
||||||
|
protocol: Protocol namespace (pypi, npm, etc.)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Cached bytes or None if not found/disabled.
|
||||||
|
"""
|
||||||
|
if not self._enabled or not self._redis:
|
||||||
|
return None
|
||||||
|
|
||||||
|
try:
|
||||||
|
full_key = self._make_key(category, protocol, key)
|
||||||
|
return await self._redis.get(full_key)
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Cache get failed for {key}: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def set(
|
||||||
|
self,
|
||||||
|
category: CacheCategory,
|
||||||
|
key: str,
|
||||||
|
value: bytes,
|
||||||
|
protocol: str = "default",
|
||||||
|
) -> None:
|
||||||
|
"""
|
||||||
|
Set cached value with category-appropriate TTL.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
category: Cache category for TTL rules
|
||||||
|
key: Unique identifier within category
|
||||||
|
value: Bytes to cache
|
||||||
|
protocol: Protocol namespace (pypi, npm, etc.)
|
||||||
|
"""
|
||||||
|
if not self._enabled or not self._redis:
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
full_key = self._make_key(category, protocol, key)
|
||||||
|
ttl = get_category_ttl(category, self._settings)
|
||||||
|
|
||||||
|
if ttl is None:
|
||||||
|
await self._redis.set(full_key, value)
|
||||||
|
else:
|
||||||
|
await self._redis.setex(full_key, ttl, value)
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Cache set failed for {key}: {e}")
|
||||||
|
|
||||||
|
async def delete(
|
||||||
|
self,
|
||||||
|
category: CacheCategory,
|
||||||
|
key: str,
|
||||||
|
protocol: str = "default",
|
||||||
|
) -> None:
|
||||||
|
"""Delete a specific cache entry."""
|
||||||
|
if not self._enabled or not self._redis:
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
full_key = self._make_key(category, protocol, key)
|
||||||
|
await self._redis.delete(full_key)
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Cache delete failed for {key}: {e}")
|
||||||
|
|
||||||
|
async def invalidate_pattern(
|
||||||
|
self,
|
||||||
|
category: CacheCategory,
|
||||||
|
pattern: str = "*",
|
||||||
|
protocol: str = "default",
|
||||||
|
) -> int:
|
||||||
|
"""
|
||||||
|
Invalidate all entries matching pattern.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
category: Cache category
|
||||||
|
pattern: Glob pattern for keys (default "*" = all in category)
|
||||||
|
protocol: Protocol namespace
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Number of keys deleted.
|
||||||
|
"""
|
||||||
|
if not self._enabled or not self._redis:
|
||||||
|
return 0
|
||||||
|
|
||||||
|
try:
|
||||||
|
full_pattern = self._make_key(category, protocol, pattern)
|
||||||
|
keys = []
|
||||||
|
async for key in self._redis.scan_iter(match=full_pattern):
|
||||||
|
keys.append(key)
|
||||||
|
|
||||||
|
if keys:
|
||||||
|
return await self._redis.delete(*keys)
|
||||||
|
return 0
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Cache invalidate failed for pattern {pattern}: {e}")
|
||||||
|
return 0
|
||||||
|
|
||||||
|
async def ping(self) -> bool:
|
||||||
|
"""Check if Redis is connected and responding."""
|
||||||
|
if not self._enabled or not self._redis:
|
||||||
|
return False
|
||||||
|
|
||||||
|
try:
|
||||||
|
await self._redis.ping()
|
||||||
|
return True
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
@property
|
||||||
|
def enabled(self) -> bool:
|
||||||
|
"""Check if caching is enabled."""
|
||||||
|
return self._enabled
|
||||||
@@ -22,8 +22,8 @@ class Settings(BaseSettings):
|
|||||||
database_sslmode: str = "disable"
|
database_sslmode: str = "disable"
|
||||||
|
|
||||||
# Database connection pool settings
|
# Database connection pool settings
|
||||||
database_pool_size: int = 5 # Number of connections to keep open
|
database_pool_size: int = 20 # Number of connections to keep open
|
||||||
database_max_overflow: int = 10 # Max additional connections beyond pool_size
|
database_max_overflow: int = 30 # Max additional connections beyond pool_size
|
||||||
database_pool_timeout: int = 30 # Seconds to wait for a connection from pool
|
database_pool_timeout: int = 30 # Seconds to wait for a connection from pool
|
||||||
database_pool_recycle: int = (
|
database_pool_recycle: int = (
|
||||||
1800 # Recycle connections after this many seconds (30 min)
|
1800 # Recycle connections after this many seconds (30 min)
|
||||||
@@ -53,6 +53,25 @@ class Settings(BaseSettings):
|
|||||||
)
|
)
|
||||||
pypi_download_mode: str = "redirect" # "redirect" (to S3) or "proxy" (stream through Orchard)
|
pypi_download_mode: str = "redirect" # "redirect" (to S3) or "proxy" (stream through Orchard)
|
||||||
|
|
||||||
|
# HTTP Client pool settings
|
||||||
|
http_max_connections: int = 100 # Max connections per pool
|
||||||
|
http_max_keepalive: int = 20 # Keep-alive connections
|
||||||
|
http_connect_timeout: float = 30.0 # Connection timeout seconds
|
||||||
|
http_read_timeout: float = 60.0 # Read timeout seconds
|
||||||
|
http_worker_threads: int = 32 # Thread pool for blocking ops
|
||||||
|
|
||||||
|
# Redis cache settings
|
||||||
|
redis_host: str = "localhost"
|
||||||
|
redis_port: int = 6379
|
||||||
|
redis_db: int = 0
|
||||||
|
redis_password: Optional[str] = None
|
||||||
|
redis_enabled: bool = True # Set False to disable caching
|
||||||
|
|
||||||
|
# Cache TTL settings (seconds, 0 = no expiry)
|
||||||
|
cache_ttl_index: int = 300 # Package index pages: 5 min
|
||||||
|
cache_ttl_versions: int = 300 # Version listings: 5 min
|
||||||
|
cache_ttl_upstream: int = 3600 # Upstream source config: 1 hour
|
||||||
|
|
||||||
# Logging settings
|
# Logging settings
|
||||||
log_level: str = "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
|
log_level: str = "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
|
||||||
log_format: str = "auto" # "json", "standard", or "auto" (json in production)
|
log_format: str = "auto" # "json", "standard", or "auto" (json in production)
|
||||||
@@ -70,6 +89,10 @@ class Settings(BaseSettings):
|
|||||||
pypi_cache_max_depth: int = 10 # Maximum recursion depth for dependency caching
|
pypi_cache_max_depth: int = 10 # Maximum recursion depth for dependency caching
|
||||||
pypi_cache_max_attempts: int = 3 # Maximum retry attempts for failed cache tasks
|
pypi_cache_max_attempts: int = 3 # Maximum retry attempts for failed cache tasks
|
||||||
|
|
||||||
|
# Auto-fetch configuration for dependency resolution
|
||||||
|
auto_fetch_dependencies: bool = False # Server default for auto_fetch parameter
|
||||||
|
auto_fetch_timeout: int = 300 # Total timeout for auto-fetch resolution in seconds
|
||||||
|
|
||||||
# JWT Authentication settings (optional, for external identity providers)
|
# JWT Authentication settings (optional, for external identity providers)
|
||||||
jwt_enabled: bool = False # Enable JWT token validation
|
jwt_enabled: bool = False # Enable JWT token validation
|
||||||
jwt_secret: str = "" # Secret key for HS256, or leave empty for RS256 with JWKS
|
jwt_secret: str = "" # Secret key for HS256, or leave empty for RS256 with JWKS
|
||||||
|
|||||||
175
backend/app/db_utils.py
Normal file
175
backend/app/db_utils.py
Normal file
@@ -0,0 +1,175 @@
|
|||||||
|
"""
|
||||||
|
Database utilities for optimized artifact operations.
|
||||||
|
|
||||||
|
Provides batch operations to eliminate N+1 queries.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from sqlalchemy.dialects.postgresql import insert as pg_insert
|
||||||
|
from sqlalchemy.orm import Session
|
||||||
|
|
||||||
|
from .models import Artifact, ArtifactDependency, CachedUrl
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class ArtifactRepository:
|
||||||
|
"""
|
||||||
|
Optimized database operations for artifact storage.
|
||||||
|
|
||||||
|
Key optimizations:
|
||||||
|
- Atomic upserts using ON CONFLICT
|
||||||
|
- Batch inserts for dependencies
|
||||||
|
- Joined queries to avoid N+1
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, db: Session):
|
||||||
|
self.db = db
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _format_dependency_values(
|
||||||
|
artifact_id: str,
|
||||||
|
dependencies: list[tuple[str, str, str]],
|
||||||
|
) -> list[dict]:
|
||||||
|
"""
|
||||||
|
Format dependencies for batch insert.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
artifact_id: SHA256 of the artifact
|
||||||
|
dependencies: List of (project, package, version_constraint)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of dicts ready for bulk insert.
|
||||||
|
"""
|
||||||
|
return [
|
||||||
|
{
|
||||||
|
"artifact_id": artifact_id,
|
||||||
|
"dependency_project": proj,
|
||||||
|
"dependency_package": pkg,
|
||||||
|
"version_constraint": ver,
|
||||||
|
}
|
||||||
|
for proj, pkg, ver in dependencies
|
||||||
|
]
|
||||||
|
|
||||||
|
def get_or_create_artifact(
|
||||||
|
self,
|
||||||
|
sha256: str,
|
||||||
|
size: int,
|
||||||
|
filename: str,
|
||||||
|
content_type: Optional[str] = None,
|
||||||
|
created_by: str = "system",
|
||||||
|
s3_key: Optional[str] = None,
|
||||||
|
) -> tuple[Artifact, bool]:
|
||||||
|
"""
|
||||||
|
Get existing artifact or create new one atomically.
|
||||||
|
|
||||||
|
Uses INSERT ... ON CONFLICT DO UPDATE to handle races.
|
||||||
|
If artifact exists, increments ref_count.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
sha256: Content hash (primary key)
|
||||||
|
size: File size in bytes
|
||||||
|
filename: Original filename
|
||||||
|
content_type: MIME type
|
||||||
|
created_by: User who created the artifact
|
||||||
|
s3_key: S3 storage key (defaults to standard path)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
(artifact, created) tuple where created is True for new artifacts.
|
||||||
|
"""
|
||||||
|
if s3_key is None:
|
||||||
|
s3_key = f"fruits/{sha256[:2]}/{sha256[2:4]}/{sha256}"
|
||||||
|
|
||||||
|
stmt = pg_insert(Artifact).values(
|
||||||
|
id=sha256,
|
||||||
|
size=size,
|
||||||
|
original_name=filename,
|
||||||
|
content_type=content_type,
|
||||||
|
ref_count=1,
|
||||||
|
created_by=created_by,
|
||||||
|
s3_key=s3_key,
|
||||||
|
).on_conflict_do_update(
|
||||||
|
index_elements=['id'],
|
||||||
|
set_={'ref_count': Artifact.ref_count + 1}
|
||||||
|
).returning(Artifact)
|
||||||
|
|
||||||
|
result = self.db.execute(stmt)
|
||||||
|
artifact = result.scalar_one()
|
||||||
|
|
||||||
|
# Check if this was an insert or update by comparing ref_count
|
||||||
|
# ref_count=1 means new, >1 means existing
|
||||||
|
created = artifact.ref_count == 1
|
||||||
|
|
||||||
|
return artifact, created
|
||||||
|
|
||||||
|
def batch_upsert_dependencies(
|
||||||
|
self,
|
||||||
|
artifact_id: str,
|
||||||
|
dependencies: list[tuple[str, str, str]],
|
||||||
|
) -> int:
|
||||||
|
"""
|
||||||
|
Insert dependencies in a single batch operation.
|
||||||
|
|
||||||
|
Uses ON CONFLICT DO NOTHING to skip duplicates.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
artifact_id: SHA256 of the artifact
|
||||||
|
dependencies: List of (project, package, version_constraint)
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Number of dependencies inserted.
|
||||||
|
"""
|
||||||
|
if not dependencies:
|
||||||
|
return 0
|
||||||
|
|
||||||
|
values = self._format_dependency_values(artifact_id, dependencies)
|
||||||
|
|
||||||
|
stmt = pg_insert(ArtifactDependency).values(values)
|
||||||
|
stmt = stmt.on_conflict_do_nothing(
|
||||||
|
index_elements=['artifact_id', 'dependency_project', 'dependency_package']
|
||||||
|
)
|
||||||
|
|
||||||
|
result = self.db.execute(stmt)
|
||||||
|
return result.rowcount
|
||||||
|
|
||||||
|
def get_cached_url_with_artifact(
|
||||||
|
self,
|
||||||
|
url_hash: str,
|
||||||
|
) -> Optional[tuple[CachedUrl, Artifact]]:
|
||||||
|
"""
|
||||||
|
Get cached URL and its artifact in a single query.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
url_hash: SHA256 of the URL
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
(CachedUrl, Artifact) tuple or None if not found.
|
||||||
|
"""
|
||||||
|
result = (
|
||||||
|
self.db.query(CachedUrl, Artifact)
|
||||||
|
.join(Artifact, CachedUrl.artifact_id == Artifact.id)
|
||||||
|
.filter(CachedUrl.url_hash == url_hash)
|
||||||
|
.first()
|
||||||
|
)
|
||||||
|
return result
|
||||||
|
|
||||||
|
def get_artifact_dependencies(
|
||||||
|
self,
|
||||||
|
artifact_id: str,
|
||||||
|
) -> list[ArtifactDependency]:
|
||||||
|
"""
|
||||||
|
Get all dependencies for an artifact in a single query.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
artifact_id: SHA256 of the artifact
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of ArtifactDependency objects.
|
||||||
|
"""
|
||||||
|
return (
|
||||||
|
self.db.query(ArtifactDependency)
|
||||||
|
.filter(ArtifactDependency.artifact_id == artifact_id)
|
||||||
|
.all()
|
||||||
|
)
|
||||||
@@ -11,11 +11,18 @@ Handles:
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
import re
|
import re
|
||||||
|
import logging
|
||||||
import yaml
|
import yaml
|
||||||
from typing import List, Dict, Any, Optional, Set, Tuple
|
from typing import List, Dict, Any, Optional, Set, Tuple, TYPE_CHECKING
|
||||||
from sqlalchemy.orm import Session
|
from sqlalchemy.orm import Session
|
||||||
from sqlalchemy import and_
|
from sqlalchemy import and_
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from .storage import S3Storage
|
||||||
|
from .registry_client import RegistryClient
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Import packaging for PEP 440 version matching
|
# Import packaging for PEP 440 version matching
|
||||||
try:
|
try:
|
||||||
from packaging.specifiers import SpecifierSet, InvalidSpecifier
|
from packaging.specifiers import SpecifierSet, InvalidSpecifier
|
||||||
@@ -102,9 +109,17 @@ class DependencyDepthExceededError(DependencyError):
|
|||||||
super().__init__(f"Dependency resolution exceeded maximum depth of {max_depth}")
|
super().__init__(f"Dependency resolution exceeded maximum depth of {max_depth}")
|
||||||
|
|
||||||
|
|
||||||
|
class TooManyArtifactsError(DependencyError):
|
||||||
|
"""Raised when dependency resolution resolves too many artifacts."""
|
||||||
|
def __init__(self, max_artifacts: int):
|
||||||
|
self.max_artifacts = max_artifacts
|
||||||
|
super().__init__(f"Dependency resolution exceeded maximum of {max_artifacts} artifacts")
|
||||||
|
|
||||||
|
|
||||||
# Safety limits to prevent DoS attacks
|
# Safety limits to prevent DoS attacks
|
||||||
MAX_DEPENDENCY_DEPTH = 50 # Maximum levels of nested dependencies
|
MAX_DEPENDENCY_DEPTH = 100 # Maximum levels of nested dependencies
|
||||||
MAX_DEPENDENCIES_PER_ARTIFACT = 200 # Maximum direct dependencies per artifact
|
MAX_DEPENDENCIES_PER_ARTIFACT = 200 # Maximum direct dependencies per artifact
|
||||||
|
MAX_TOTAL_ARTIFACTS = 1000 # Maximum total artifacts in resolution to prevent memory issues
|
||||||
|
|
||||||
|
|
||||||
def parse_ensure_file(content: bytes) -> EnsureFileContent:
|
def parse_ensure_file(content: bytes) -> EnsureFileContent:
|
||||||
@@ -325,6 +340,33 @@ def _is_version_constraint(version_str: str) -> bool:
|
|||||||
return any(op in version_str for op in ['>=', '<=', '!=', '~=', '>', '<', '==', '*'])
|
return any(op in version_str for op in ['>=', '<=', '!=', '~=', '>', '<', '==', '*'])
|
||||||
|
|
||||||
|
|
||||||
|
def _version_satisfies_constraint(version: str, constraint: str) -> bool:
|
||||||
|
"""
|
||||||
|
Check if a version satisfies a constraint.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
version: A version string (e.g., '1.26.0')
|
||||||
|
constraint: A version constraint (e.g., '>=1.20', '>=1.20,<2.0', '*')
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if the version satisfies the constraint, False otherwise
|
||||||
|
"""
|
||||||
|
if not HAS_PACKAGING:
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Wildcard matches everything
|
||||||
|
if constraint == '*' or not constraint:
|
||||||
|
return True
|
||||||
|
|
||||||
|
try:
|
||||||
|
spec = SpecifierSet(constraint)
|
||||||
|
v = Version(version)
|
||||||
|
return v in spec
|
||||||
|
except (InvalidSpecifier, InvalidVersion):
|
||||||
|
# If we can't parse, assume it doesn't match
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _resolve_version_constraint(
|
def _resolve_version_constraint(
|
||||||
db: Session,
|
db: Session,
|
||||||
package: Package,
|
package: Package,
|
||||||
@@ -696,6 +738,8 @@ def resolve_dependencies(
|
|||||||
current_path: Dict[str, str] = {}
|
current_path: Dict[str, str] = {}
|
||||||
# Resolution order (topological)
|
# Resolution order (topological)
|
||||||
resolution_order: List[str] = []
|
resolution_order: List[str] = []
|
||||||
|
# Track resolution path for debugging
|
||||||
|
resolution_path_sync: List[str] = []
|
||||||
|
|
||||||
def _resolve_recursive(
|
def _resolve_recursive(
|
||||||
artifact_id: str,
|
artifact_id: str,
|
||||||
@@ -707,12 +751,16 @@ def resolve_dependencies(
|
|||||||
depth: int = 0,
|
depth: int = 0,
|
||||||
):
|
):
|
||||||
"""Recursively resolve dependencies with cycle/conflict detection."""
|
"""Recursively resolve dependencies with cycle/conflict detection."""
|
||||||
|
pkg_key = f"{proj_name}/{pkg_name}"
|
||||||
|
|
||||||
# Safety limit: prevent DoS through deeply nested dependencies
|
# Safety limit: prevent DoS through deeply nested dependencies
|
||||||
if depth > MAX_DEPENDENCY_DEPTH:
|
if depth > MAX_DEPENDENCY_DEPTH:
|
||||||
|
logger.error(
|
||||||
|
f"Dependency depth exceeded at {pkg_key} (depth={depth}). "
|
||||||
|
f"Resolution path: {' -> '.join(resolution_path_sync[-20:])}"
|
||||||
|
)
|
||||||
raise DependencyDepthExceededError(MAX_DEPENDENCY_DEPTH)
|
raise DependencyDepthExceededError(MAX_DEPENDENCY_DEPTH)
|
||||||
|
|
||||||
pkg_key = f"{proj_name}/{pkg_name}"
|
|
||||||
|
|
||||||
# Cycle detection (at artifact level)
|
# Cycle detection (at artifact level)
|
||||||
if artifact_id in visiting:
|
if artifact_id in visiting:
|
||||||
# Build cycle path from current_path
|
# Build cycle path from current_path
|
||||||
@@ -720,34 +768,25 @@ def resolve_dependencies(
|
|||||||
cycle = [cycle_start, pkg_key]
|
cycle = [cycle_start, pkg_key]
|
||||||
raise CircularDependencyError(cycle)
|
raise CircularDependencyError(cycle)
|
||||||
|
|
||||||
# Conflict detection - check if we've seen this package before with a different version
|
# Version conflict handling - use first resolved version (lenient mode)
|
||||||
if pkg_key in version_requirements:
|
if pkg_key in version_requirements:
|
||||||
existing_versions = {r["version"] for r in version_requirements[pkg_key]}
|
existing_versions = {r["version"] for r in version_requirements[pkg_key]}
|
||||||
if version_or_tag not in existing_versions:
|
if version_or_tag not in existing_versions:
|
||||||
# Conflict detected - same package, different version
|
# Different version requested - log and use existing (first wins)
|
||||||
requirements = version_requirements[pkg_key] + [
|
existing = version_requirements[pkg_key][0]["version"]
|
||||||
{"version": version_or_tag, "required_by": required_by}
|
logger.debug(
|
||||||
]
|
f"Version mismatch for {pkg_key}: using {existing} "
|
||||||
raise DependencyConflictError([
|
f"(also requested: {version_or_tag} by {required_by})"
|
||||||
DependencyConflict(
|
|
||||||
project=proj_name,
|
|
||||||
package=pkg_name,
|
|
||||||
requirements=[
|
|
||||||
{
|
|
||||||
"version": r["version"],
|
|
||||||
"required_by": [{"path": r["required_by"]}] if r["required_by"] else []
|
|
||||||
}
|
|
||||||
for r in requirements
|
|
||||||
],
|
|
||||||
)
|
)
|
||||||
])
|
# Already resolved this package - skip
|
||||||
# Same version already resolved - skip
|
|
||||||
if artifact_id in visited:
|
|
||||||
return
|
return
|
||||||
|
|
||||||
if artifact_id in visited:
|
if artifact_id in visited:
|
||||||
return
|
return
|
||||||
|
|
||||||
|
# Track path for debugging (only after early-return checks)
|
||||||
|
resolution_path_sync.append(f"{pkg_key}@{version_or_tag}")
|
||||||
|
|
||||||
visiting.add(artifact_id)
|
visiting.add(artifact_id)
|
||||||
current_path[artifact_id] = pkg_key
|
current_path[artifact_id] = pkg_key
|
||||||
|
|
||||||
@@ -799,6 +838,10 @@ def resolve_dependencies(
|
|||||||
if dep_artifact_id == artifact_id:
|
if dep_artifact_id == artifact_id:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
|
# Skip if this artifact is already being visited (would cause cycle)
|
||||||
|
if dep_artifact_id in visiting:
|
||||||
|
continue
|
||||||
|
|
||||||
_resolve_recursive(
|
_resolve_recursive(
|
||||||
dep_artifact_id,
|
dep_artifact_id,
|
||||||
dep.dependency_project,
|
dep.dependency_project,
|
||||||
@@ -812,6 +855,11 @@ def resolve_dependencies(
|
|||||||
visiting.remove(artifact_id)
|
visiting.remove(artifact_id)
|
||||||
del current_path[artifact_id]
|
del current_path[artifact_id]
|
||||||
visited.add(artifact_id)
|
visited.add(artifact_id)
|
||||||
|
resolution_path_sync.pop()
|
||||||
|
|
||||||
|
# Check total artifacts limit
|
||||||
|
if len(resolution_order) >= MAX_TOTAL_ARTIFACTS:
|
||||||
|
raise TooManyArtifactsError(MAX_TOTAL_ARTIFACTS)
|
||||||
|
|
||||||
# Add to resolution order (dependencies before dependents)
|
# Add to resolution order (dependencies before dependents)
|
||||||
resolution_order.append(artifact_id)
|
resolution_order.append(artifact_id)
|
||||||
@@ -848,6 +896,417 @@ def resolve_dependencies(
|
|||||||
},
|
},
|
||||||
resolved=resolved_list,
|
resolved=resolved_list,
|
||||||
missing=missing_dependencies,
|
missing=missing_dependencies,
|
||||||
|
fetched=[], # No fetching in sync version
|
||||||
|
total_size=total_size,
|
||||||
|
artifact_count=len(resolved_list),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# System project mapping for auto-fetch
|
||||||
|
SYSTEM_PROJECT_REGISTRY_MAP = {
|
||||||
|
"_pypi": "pypi",
|
||||||
|
"_npm": "npm",
|
||||||
|
"_maven": "maven",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
async def resolve_dependencies_with_fetch(
|
||||||
|
db: Session,
|
||||||
|
project_name: str,
|
||||||
|
package_name: str,
|
||||||
|
ref: str,
|
||||||
|
base_url: str,
|
||||||
|
storage: "S3Storage",
|
||||||
|
registry_clients: Dict[str, "RegistryClient"],
|
||||||
|
) -> DependencyResolutionResponse:
|
||||||
|
"""
|
||||||
|
Resolve all dependencies for an artifact recursively, fetching missing ones from upstream.
|
||||||
|
|
||||||
|
This async version extends the basic resolution with auto-fetch capability:
|
||||||
|
when a missing dependency is from a system project (e.g., _pypi), it attempts
|
||||||
|
to fetch the package from the corresponding upstream registry.
|
||||||
|
|
||||||
|
If the root artifact itself doesn't exist in a system project, it will also
|
||||||
|
be fetched from upstream before resolution begins.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
db: Database session
|
||||||
|
project_name: Project name
|
||||||
|
package_name: Package name
|
||||||
|
ref: Version reference (or artifact:hash)
|
||||||
|
base_url: Base URL for download URLs
|
||||||
|
storage: S3 storage for caching fetched artifacts
|
||||||
|
registry_clients: Map of system project to registry client {"_pypi": PyPIRegistryClient}
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
DependencyResolutionResponse with all resolved artifacts and fetch status
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
DependencyNotFoundError: If the root artifact cannot be found (even after fetch attempt)
|
||||||
|
CircularDependencyError: If circular dependencies are detected
|
||||||
|
"""
|
||||||
|
# Track fetched artifacts for response
|
||||||
|
fetched_artifacts: List[ResolvedArtifact] = []
|
||||||
|
|
||||||
|
# Check if project exists
|
||||||
|
project = db.query(Project).filter(Project.name == project_name).first()
|
||||||
|
|
||||||
|
# If project doesn't exist and it's a system project pattern, we can't auto-create it
|
||||||
|
if not project:
|
||||||
|
raise DependencyNotFoundError(project_name, package_name, ref)
|
||||||
|
|
||||||
|
# Check if package exists
|
||||||
|
package = db.query(Package).filter(
|
||||||
|
Package.project_id == project.id,
|
||||||
|
Package.name == package_name,
|
||||||
|
).first()
|
||||||
|
|
||||||
|
# Try to resolve the root artifact
|
||||||
|
root_artifact_id = None
|
||||||
|
root_version = None
|
||||||
|
root_size = None
|
||||||
|
|
||||||
|
# Handle artifact: prefix for direct artifact ID references
|
||||||
|
if ref.startswith("artifact:"):
|
||||||
|
artifact_id = ref[9:]
|
||||||
|
artifact = db.query(Artifact).filter(Artifact.id == artifact_id).first()
|
||||||
|
if artifact:
|
||||||
|
root_artifact_id = artifact.id
|
||||||
|
root_version = artifact_id[:12]
|
||||||
|
root_size = artifact.size
|
||||||
|
elif package:
|
||||||
|
# Try to resolve by version/constraint
|
||||||
|
resolved = _resolve_dependency_to_artifact(
|
||||||
|
db, project_name, package_name, ref
|
||||||
|
)
|
||||||
|
if resolved:
|
||||||
|
root_artifact_id, root_version, root_size = resolved
|
||||||
|
|
||||||
|
# If root artifact not found and this is a system project, try to fetch it
|
||||||
|
if root_artifact_id is None and project_name in SYSTEM_PROJECT_REGISTRY_MAP:
|
||||||
|
logger.info(
|
||||||
|
f"Root artifact {project_name}/{package_name}@{ref} not found, "
|
||||||
|
"attempting to fetch from upstream"
|
||||||
|
)
|
||||||
|
|
||||||
|
client = registry_clients.get(project_name)
|
||||||
|
if client:
|
||||||
|
try:
|
||||||
|
# Resolve the version constraint from upstream
|
||||||
|
version_info = await client.resolve_constraint(package_name, ref)
|
||||||
|
if version_info:
|
||||||
|
# Fetch and cache the package
|
||||||
|
fetch_result = await client.fetch_package(
|
||||||
|
package_name, version_info, db, storage
|
||||||
|
)
|
||||||
|
if fetch_result:
|
||||||
|
logger.info(
|
||||||
|
f"Successfully fetched root artifact {package_name}=="
|
||||||
|
f"{fetch_result.version} (artifact {fetch_result.artifact_id[:12]})"
|
||||||
|
)
|
||||||
|
root_artifact_id = fetch_result.artifact_id
|
||||||
|
root_version = fetch_result.version
|
||||||
|
root_size = fetch_result.size
|
||||||
|
|
||||||
|
# Add to fetched list
|
||||||
|
fetched_artifacts.append(ResolvedArtifact(
|
||||||
|
artifact_id=fetch_result.artifact_id,
|
||||||
|
project=project_name,
|
||||||
|
package=package_name,
|
||||||
|
version=fetch_result.version,
|
||||||
|
size=fetch_result.size,
|
||||||
|
download_url=f"{base_url}/api/v1/project/{project_name}/{package_name}/+/{fetch_result.version}",
|
||||||
|
))
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Failed to fetch root artifact {package_name}: {e}")
|
||||||
|
|
||||||
|
# If still no root artifact, raise error
|
||||||
|
if root_artifact_id is None:
|
||||||
|
raise DependencyNotFoundError(project_name, package_name, ref)
|
||||||
|
|
||||||
|
# Track state
|
||||||
|
resolved_artifacts: Dict[str, ResolvedArtifact] = {}
|
||||||
|
missing_dependencies: List[MissingDependency] = []
|
||||||
|
# Note: fetched_artifacts was already initialized above (line 911)
|
||||||
|
# and may already contain the root artifact if it was fetched from upstream
|
||||||
|
version_requirements: Dict[str, List[Dict[str, Any]]] = {}
|
||||||
|
visiting: Set[str] = set()
|
||||||
|
visited: Set[str] = set()
|
||||||
|
current_path: Dict[str, str] = {}
|
||||||
|
resolution_order: List[str] = []
|
||||||
|
|
||||||
|
# Track fetch attempts to prevent loops
|
||||||
|
fetch_attempted: Set[str] = set() # "project/package@constraint"
|
||||||
|
|
||||||
|
async def _try_fetch_dependency(
|
||||||
|
dep_project: str,
|
||||||
|
dep_package: str,
|
||||||
|
constraint: str,
|
||||||
|
required_by: str,
|
||||||
|
) -> Optional[Tuple[str, str, int]]:
|
||||||
|
"""
|
||||||
|
Try to fetch a missing dependency from upstream registry.
|
||||||
|
|
||||||
|
Returns (artifact_id, version, size) if successful, None otherwise.
|
||||||
|
"""
|
||||||
|
# Only fetch from system projects
|
||||||
|
registry_type = SYSTEM_PROJECT_REGISTRY_MAP.get(dep_project)
|
||||||
|
if not registry_type:
|
||||||
|
logger.debug(
|
||||||
|
f"Not a system project, skipping fetch: {dep_project}/{dep_package}"
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Build fetch key for loop prevention
|
||||||
|
fetch_key = f"{dep_project}/{dep_package}@{constraint}"
|
||||||
|
if fetch_key in fetch_attempted:
|
||||||
|
logger.debug(f"Already attempted fetch for {fetch_key}")
|
||||||
|
return None
|
||||||
|
fetch_attempted.add(fetch_key)
|
||||||
|
|
||||||
|
# Get registry client
|
||||||
|
client = registry_clients.get(dep_project)
|
||||||
|
if not client:
|
||||||
|
logger.debug(f"No registry client for {dep_project}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Resolve version constraint
|
||||||
|
version_info = await client.resolve_constraint(dep_package, constraint)
|
||||||
|
if not version_info:
|
||||||
|
logger.info(
|
||||||
|
f"No version of {dep_package} matches constraint '{constraint}' on upstream"
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Fetch and cache the package
|
||||||
|
fetch_result = await client.fetch_package(
|
||||||
|
dep_package, version_info, db, storage
|
||||||
|
)
|
||||||
|
if not fetch_result:
|
||||||
|
logger.warning(f"Failed to fetch {dep_package}=={version_info.version}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Successfully fetched {dep_package}=={version_info.version} "
|
||||||
|
f"(artifact {fetch_result.artifact_id[:12]})"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Add to fetched list for response
|
||||||
|
fetched_artifacts.append(ResolvedArtifact(
|
||||||
|
artifact_id=fetch_result.artifact_id,
|
||||||
|
project=dep_project,
|
||||||
|
package=dep_package,
|
||||||
|
version=fetch_result.version,
|
||||||
|
size=fetch_result.size,
|
||||||
|
download_url=f"{base_url}/api/v1/project/{dep_project}/{dep_package}/+/{fetch_result.version}",
|
||||||
|
))
|
||||||
|
|
||||||
|
return (fetch_result.artifact_id, fetch_result.version, fetch_result.size)
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Error fetching {dep_package}: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Track resolution path for debugging
|
||||||
|
resolution_path: List[str] = []
|
||||||
|
|
||||||
|
async def _resolve_recursive_async(
|
||||||
|
artifact_id: str,
|
||||||
|
proj_name: str,
|
||||||
|
pkg_name: str,
|
||||||
|
version_or_tag: str,
|
||||||
|
size: int,
|
||||||
|
required_by: Optional[str],
|
||||||
|
depth: int = 0,
|
||||||
|
):
|
||||||
|
"""Recursively resolve dependencies with fetch capability."""
|
||||||
|
pkg_key = f"{proj_name}/{pkg_name}"
|
||||||
|
|
||||||
|
if depth > MAX_DEPENDENCY_DEPTH:
|
||||||
|
logger.error(
|
||||||
|
f"Dependency depth exceeded at {pkg_key} (depth={depth}). "
|
||||||
|
f"Resolution path: {' -> '.join(resolution_path[-20:])}"
|
||||||
|
)
|
||||||
|
raise DependencyDepthExceededError(MAX_DEPENDENCY_DEPTH)
|
||||||
|
|
||||||
|
# Cycle detection
|
||||||
|
if artifact_id in visiting:
|
||||||
|
cycle_start = current_path.get(artifact_id, pkg_key)
|
||||||
|
cycle = [cycle_start, pkg_key]
|
||||||
|
raise CircularDependencyError(cycle)
|
||||||
|
|
||||||
|
# Version conflict handling - use first resolved version (lenient mode)
|
||||||
|
if pkg_key in version_requirements:
|
||||||
|
existing_versions = {r["version"] for r in version_requirements[pkg_key]}
|
||||||
|
if version_or_tag not in existing_versions:
|
||||||
|
# Different version requested - log and use existing (first wins)
|
||||||
|
existing = version_requirements[pkg_key][0]["version"]
|
||||||
|
logger.debug(
|
||||||
|
f"Version mismatch for {pkg_key}: using {existing} "
|
||||||
|
f"(also requested: {version_or_tag} by {required_by})"
|
||||||
|
)
|
||||||
|
# Already resolved this package - skip
|
||||||
|
return
|
||||||
|
|
||||||
|
if artifact_id in visited:
|
||||||
|
return
|
||||||
|
|
||||||
|
# Track path for debugging (only after early-return checks)
|
||||||
|
resolution_path.append(f"{pkg_key}@{version_or_tag}")
|
||||||
|
|
||||||
|
visiting.add(artifact_id)
|
||||||
|
current_path[artifact_id] = pkg_key
|
||||||
|
|
||||||
|
if pkg_key not in version_requirements:
|
||||||
|
version_requirements[pkg_key] = []
|
||||||
|
version_requirements[pkg_key].append({
|
||||||
|
"version": version_or_tag,
|
||||||
|
"required_by": required_by,
|
||||||
|
})
|
||||||
|
|
||||||
|
# Get dependencies
|
||||||
|
deps = db.query(ArtifactDependency).filter(
|
||||||
|
ArtifactDependency.artifact_id == artifact_id
|
||||||
|
).all()
|
||||||
|
|
||||||
|
for dep in deps:
|
||||||
|
# Skip self-dependencies (common with PyPI extras like pytest[testing] -> pytest)
|
||||||
|
dep_proj_normalized = dep.dependency_project.lower()
|
||||||
|
dep_pkg_normalized = _normalize_pypi_package_name(dep.dependency_package)
|
||||||
|
curr_proj_normalized = proj_name.lower()
|
||||||
|
curr_pkg_normalized = _normalize_pypi_package_name(pkg_name)
|
||||||
|
if dep_proj_normalized == curr_proj_normalized and dep_pkg_normalized == curr_pkg_normalized:
|
||||||
|
logger.debug(
|
||||||
|
f"Skipping self-dependency: {pkg_key} -> {dep.dependency_project}/{dep.dependency_package}"
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Also check if this dependency would resolve to the current artifact
|
||||||
|
# (handles cases where package names differ but resolve to same artifact)
|
||||||
|
resolved_dep = _resolve_dependency_to_artifact(
|
||||||
|
db,
|
||||||
|
dep.dependency_project,
|
||||||
|
dep.dependency_package,
|
||||||
|
dep.version_constraint,
|
||||||
|
)
|
||||||
|
|
||||||
|
if not resolved_dep:
|
||||||
|
# Try to fetch from upstream if it's a system project
|
||||||
|
fetched = await _try_fetch_dependency(
|
||||||
|
dep.dependency_project,
|
||||||
|
dep.dependency_package,
|
||||||
|
dep.version_constraint,
|
||||||
|
pkg_key,
|
||||||
|
)
|
||||||
|
|
||||||
|
if fetched:
|
||||||
|
resolved_dep = fetched
|
||||||
|
else:
|
||||||
|
# Still missing - add to missing list with fetch status
|
||||||
|
fetch_key = f"{dep.dependency_project}/{dep.dependency_package}@{dep.version_constraint}"
|
||||||
|
was_attempted = fetch_key in fetch_attempted
|
||||||
|
missing_dependencies.append(MissingDependency(
|
||||||
|
project=dep.dependency_project,
|
||||||
|
package=dep.dependency_package,
|
||||||
|
constraint=dep.version_constraint,
|
||||||
|
required_by=pkg_key,
|
||||||
|
fetch_attempted=was_attempted,
|
||||||
|
))
|
||||||
|
continue
|
||||||
|
|
||||||
|
dep_artifact_id, dep_version, dep_size = resolved_dep
|
||||||
|
|
||||||
|
# Skip if resolved to same artifact (self-dependency at artifact level)
|
||||||
|
if dep_artifact_id == artifact_id:
|
||||||
|
logger.debug(
|
||||||
|
f"Skipping self-dependency (same artifact): {pkg_key} -> "
|
||||||
|
f"{dep.dependency_project}/{dep.dependency_package} (artifact {dep_artifact_id[:12]})"
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Skip if this artifact is already being visited (would cause cycle)
|
||||||
|
if dep_artifact_id in visiting:
|
||||||
|
logger.debug(
|
||||||
|
f"Skipping dependency already in resolution stack: {pkg_key} -> "
|
||||||
|
f"{dep.dependency_project}/{dep.dependency_package} (artifact {dep_artifact_id[:12]})"
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Check if we've already resolved this package to a different version
|
||||||
|
dep_pkg_key = f"{dep.dependency_project}/{dep.dependency_package}"
|
||||||
|
if dep_pkg_key in version_requirements:
|
||||||
|
existing_version = version_requirements[dep_pkg_key][0]["version"]
|
||||||
|
if existing_version != dep_version:
|
||||||
|
# Different version resolved - check if existing satisfies new constraint
|
||||||
|
if HAS_PACKAGING and _version_satisfies_constraint(existing_version, dep.version_constraint):
|
||||||
|
logger.debug(
|
||||||
|
f"Reusing existing version {existing_version} for {dep_pkg_key} "
|
||||||
|
f"(satisfies constraint {dep.version_constraint})"
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
else:
|
||||||
|
logger.debug(
|
||||||
|
f"Version conflict for {dep_pkg_key}: have {existing_version}, "
|
||||||
|
f"need {dep.version_constraint} (resolved to {dep_version})"
|
||||||
|
)
|
||||||
|
# Don't raise error - just use the first version we resolved
|
||||||
|
# This is more lenient than strict conflict detection
|
||||||
|
continue
|
||||||
|
|
||||||
|
await _resolve_recursive_async(
|
||||||
|
dep_artifact_id,
|
||||||
|
dep.dependency_project,
|
||||||
|
dep.dependency_package,
|
||||||
|
dep_version,
|
||||||
|
dep_size,
|
||||||
|
pkg_key,
|
||||||
|
depth + 1,
|
||||||
|
)
|
||||||
|
|
||||||
|
visiting.remove(artifact_id)
|
||||||
|
del current_path[artifact_id]
|
||||||
|
visited.add(artifact_id)
|
||||||
|
resolution_path.pop()
|
||||||
|
|
||||||
|
# Check total artifacts limit
|
||||||
|
if len(resolution_order) >= MAX_TOTAL_ARTIFACTS:
|
||||||
|
raise TooManyArtifactsError(MAX_TOTAL_ARTIFACTS)
|
||||||
|
|
||||||
|
resolution_order.append(artifact_id)
|
||||||
|
|
||||||
|
resolved_artifacts[artifact_id] = ResolvedArtifact(
|
||||||
|
artifact_id=artifact_id,
|
||||||
|
project=proj_name,
|
||||||
|
package=pkg_name,
|
||||||
|
version=version_or_tag,
|
||||||
|
size=size,
|
||||||
|
download_url=f"{base_url}/api/v1/project/{proj_name}/{pkg_name}/+/{version_or_tag}",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Start resolution from root
|
||||||
|
await _resolve_recursive_async(
|
||||||
|
root_artifact_id,
|
||||||
|
project_name,
|
||||||
|
package_name,
|
||||||
|
root_version,
|
||||||
|
root_size,
|
||||||
|
None,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Build response in topological order
|
||||||
|
resolved_list = [resolved_artifacts[aid] for aid in resolution_order]
|
||||||
|
total_size = sum(r.size for r in resolved_list)
|
||||||
|
|
||||||
|
return DependencyResolutionResponse(
|
||||||
|
requested={
|
||||||
|
"project": project_name,
|
||||||
|
"package": package_name,
|
||||||
|
"ref": ref,
|
||||||
|
},
|
||||||
|
resolved=resolved_list,
|
||||||
|
missing=missing_dependencies,
|
||||||
|
fetched=fetched_artifacts,
|
||||||
total_size=total_size,
|
total_size=total_size,
|
||||||
artifact_count=len(resolved_list),
|
artifact_count=len(resolved_list),
|
||||||
)
|
)
|
||||||
|
|||||||
179
backend/app/http_client.py
Normal file
179
backend/app/http_client.py
Normal file
@@ -0,0 +1,179 @@
|
|||||||
|
"""
|
||||||
|
HTTP client manager with connection pooling and lifecycle management.
|
||||||
|
|
||||||
|
Provides:
|
||||||
|
- Shared connection pools for upstream requests
|
||||||
|
- Per-upstream client isolation when needed
|
||||||
|
- Thread pool for blocking I/O operations
|
||||||
|
- FastAPI lifespan integration
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
from concurrent.futures import ThreadPoolExecutor
|
||||||
|
from typing import Any, Callable, Optional
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
|
||||||
|
from .config import Settings
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class HttpClientManager:
|
||||||
|
"""
|
||||||
|
Manages httpx.AsyncClient pools with FastAPI lifespan integration.
|
||||||
|
|
||||||
|
Features:
|
||||||
|
- Default shared pool for general requests
|
||||||
|
- Per-upstream pools for sources needing specific config/auth
|
||||||
|
- Dedicated thread pool for blocking operations
|
||||||
|
- Graceful shutdown
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, settings: Settings):
|
||||||
|
self.max_connections = settings.http_max_connections
|
||||||
|
self.max_keepalive = settings.http_max_keepalive
|
||||||
|
self.connect_timeout = settings.http_connect_timeout
|
||||||
|
self.read_timeout = settings.http_read_timeout
|
||||||
|
self.worker_threads = settings.http_worker_threads
|
||||||
|
|
||||||
|
self._default_client: Optional[httpx.AsyncClient] = None
|
||||||
|
self._upstream_clients: dict[str, httpx.AsyncClient] = {}
|
||||||
|
self._executor: Optional[ThreadPoolExecutor] = None
|
||||||
|
self._started = False
|
||||||
|
|
||||||
|
async def startup(self) -> None:
|
||||||
|
"""Initialize clients and thread pool. Called by FastAPI lifespan."""
|
||||||
|
if self._started:
|
||||||
|
return
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Starting HttpClientManager: max_connections={self.max_connections}, "
|
||||||
|
f"worker_threads={self.worker_threads}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create connection limits
|
||||||
|
limits = httpx.Limits(
|
||||||
|
max_connections=self.max_connections,
|
||||||
|
max_keepalive_connections=self.max_keepalive,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create timeout config
|
||||||
|
timeout = httpx.Timeout(
|
||||||
|
connect=self.connect_timeout,
|
||||||
|
read=self.read_timeout,
|
||||||
|
write=self.read_timeout,
|
||||||
|
pool=self.connect_timeout,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create default client
|
||||||
|
self._default_client = httpx.AsyncClient(
|
||||||
|
limits=limits,
|
||||||
|
timeout=timeout,
|
||||||
|
follow_redirects=False, # Handle redirects manually for auth
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create thread pool for blocking operations
|
||||||
|
self._executor = ThreadPoolExecutor(
|
||||||
|
max_workers=self.worker_threads,
|
||||||
|
thread_name_prefix="orchard-blocking-",
|
||||||
|
)
|
||||||
|
|
||||||
|
self._started = True
|
||||||
|
logger.info("HttpClientManager started")
|
||||||
|
|
||||||
|
async def shutdown(self) -> None:
|
||||||
|
"""Close all clients and thread pool. Called by FastAPI lifespan."""
|
||||||
|
if not self._started:
|
||||||
|
return
|
||||||
|
|
||||||
|
logger.info("Shutting down HttpClientManager")
|
||||||
|
|
||||||
|
# Close default client
|
||||||
|
if self._default_client:
|
||||||
|
await self._default_client.aclose()
|
||||||
|
self._default_client = None
|
||||||
|
|
||||||
|
# Close upstream-specific clients
|
||||||
|
for name, client in self._upstream_clients.items():
|
||||||
|
logger.debug(f"Closing upstream client: {name}")
|
||||||
|
await client.aclose()
|
||||||
|
self._upstream_clients.clear()
|
||||||
|
|
||||||
|
# Shutdown thread pool
|
||||||
|
if self._executor:
|
||||||
|
self._executor.shutdown(wait=True)
|
||||||
|
self._executor = None
|
||||||
|
|
||||||
|
self._started = False
|
||||||
|
logger.info("HttpClientManager shutdown complete")
|
||||||
|
|
||||||
|
def get_client(self, upstream_name: Optional[str] = None) -> httpx.AsyncClient:
|
||||||
|
"""
|
||||||
|
Get HTTP client for making requests.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
upstream_name: Optional upstream source name for dedicated pool.
|
||||||
|
If None, returns the default shared client.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
httpx.AsyncClient configured for the request.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
RuntimeError: If manager not started.
|
||||||
|
"""
|
||||||
|
if not self._started or not self._default_client:
|
||||||
|
raise RuntimeError("HttpClientManager not started. Call startup() first.")
|
||||||
|
|
||||||
|
if upstream_name and upstream_name in self._upstream_clients:
|
||||||
|
return self._upstream_clients[upstream_name]
|
||||||
|
|
||||||
|
return self._default_client
|
||||||
|
|
||||||
|
async def run_blocking(self, func: Callable[..., Any], *args: Any) -> Any:
|
||||||
|
"""
|
||||||
|
Run a blocking function in the thread pool.
|
||||||
|
|
||||||
|
Use this for:
|
||||||
|
- File I/O operations
|
||||||
|
- Archive extraction (zipfile, tarfile)
|
||||||
|
- Hash computation on large data
|
||||||
|
|
||||||
|
Args:
|
||||||
|
func: Synchronous function to execute
|
||||||
|
*args: Arguments to pass to the function
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The function's return value.
|
||||||
|
"""
|
||||||
|
if not self._executor:
|
||||||
|
raise RuntimeError("HttpClientManager not started. Call startup() first.")
|
||||||
|
|
||||||
|
loop = asyncio.get_running_loop()
|
||||||
|
return await loop.run_in_executor(self._executor, func, *args)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def active_connections(self) -> int:
|
||||||
|
"""Get approximate number of active connections (for health checks)."""
|
||||||
|
if not self._default_client:
|
||||||
|
return 0
|
||||||
|
# httpx doesn't expose this directly, return pool size as approximation
|
||||||
|
return self.max_connections
|
||||||
|
|
||||||
|
@property
|
||||||
|
def pool_size(self) -> int:
|
||||||
|
"""Get configured pool size."""
|
||||||
|
return self.max_connections
|
||||||
|
|
||||||
|
@property
|
||||||
|
def executor_active(self) -> int:
|
||||||
|
"""Get number of active thread pool workers."""
|
||||||
|
if not self._executor:
|
||||||
|
return 0
|
||||||
|
return len(self._executor._threads)
|
||||||
|
|
||||||
|
@property
|
||||||
|
def executor_max(self) -> int:
|
||||||
|
"""Get max thread pool workers."""
|
||||||
|
return self.worker_threads
|
||||||
@@ -15,6 +15,8 @@ from .pypi_proxy import router as pypi_router
|
|||||||
from .seed import seed_database
|
from .seed import seed_database
|
||||||
from .auth import create_default_admin
|
from .auth import create_default_admin
|
||||||
from .rate_limit import limiter
|
from .rate_limit import limiter
|
||||||
|
from .http_client import HttpClientManager
|
||||||
|
from .cache_service import CacheService
|
||||||
|
|
||||||
settings = get_settings()
|
settings = get_settings()
|
||||||
logging.basicConfig(level=logging.INFO)
|
logging.basicConfig(level=logging.INFO)
|
||||||
@@ -38,6 +40,17 @@ async def lifespan(app: FastAPI):
|
|||||||
finally:
|
finally:
|
||||||
db.close()
|
db.close()
|
||||||
|
|
||||||
|
# Initialize infrastructure services
|
||||||
|
logger.info("Initializing infrastructure services...")
|
||||||
|
|
||||||
|
app.state.http_client = HttpClientManager(settings)
|
||||||
|
await app.state.http_client.startup()
|
||||||
|
|
||||||
|
app.state.cache = CacheService(settings)
|
||||||
|
await app.state.cache.startup()
|
||||||
|
|
||||||
|
logger.info("Infrastructure services ready")
|
||||||
|
|
||||||
# Seed test data in development mode
|
# Seed test data in development mode
|
||||||
if settings.is_development:
|
if settings.is_development:
|
||||||
logger.info(f"Running in {settings.env} mode - checking for seed data")
|
logger.info(f"Running in {settings.env} mode - checking for seed data")
|
||||||
@@ -51,6 +64,12 @@ async def lifespan(app: FastAPI):
|
|||||||
|
|
||||||
yield
|
yield
|
||||||
|
|
||||||
|
# Shutdown infrastructure services
|
||||||
|
logger.info("Shutting down infrastructure services...")
|
||||||
|
await app.state.http_client.shutdown()
|
||||||
|
await app.state.cache.shutdown()
|
||||||
|
logger.info("Shutdown complete")
|
||||||
|
|
||||||
|
|
||||||
app = FastAPI(
|
app = FastAPI(
|
||||||
title="Orchard",
|
title="Orchard",
|
||||||
|
|||||||
@@ -23,15 +23,22 @@ from fastapi.responses import StreamingResponse, HTMLResponse, RedirectResponse
|
|||||||
from sqlalchemy.orm import Session
|
from sqlalchemy.orm import Session
|
||||||
|
|
||||||
from .database import get_db
|
from .database import get_db
|
||||||
from .models import UpstreamSource, CachedUrl, Artifact, Project, Package, PackageVersion, ArtifactDependency
|
from .models import UpstreamSource, CachedUrl, Artifact, Project, Package, PackageVersion
|
||||||
from .storage import S3Storage, get_storage
|
from .storage import S3Storage, get_storage
|
||||||
from .config import get_env_upstream_sources, get_settings
|
from .config import get_env_upstream_sources, get_settings
|
||||||
|
from .http_client import HttpClientManager
|
||||||
|
from .db_utils import ArtifactRepository
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
router = APIRouter(prefix="/pypi", tags=["pypi-proxy"])
|
router = APIRouter(prefix="/pypi", tags=["pypi-proxy"])
|
||||||
|
|
||||||
|
|
||||||
|
def get_http_client(request: Request) -> HttpClientManager:
|
||||||
|
"""Get HttpClientManager from app state."""
|
||||||
|
return request.app.state.http_client
|
||||||
|
|
||||||
|
|
||||||
# Timeout configuration for proxy requests
|
# Timeout configuration for proxy requests
|
||||||
PROXY_CONNECT_TIMEOUT = 30.0
|
PROXY_CONNECT_TIMEOUT = 30.0
|
||||||
PROXY_READ_TIMEOUT = 60.0
|
PROXY_READ_TIMEOUT = 60.0
|
||||||
@@ -40,17 +47,36 @@ PROXY_READ_TIMEOUT = 60.0
|
|||||||
def _parse_requires_dist(requires_dist: str) -> Tuple[str, Optional[str]]:
|
def _parse_requires_dist(requires_dist: str) -> Tuple[str, Optional[str]]:
|
||||||
"""Parse a Requires-Dist line into (package_name, version_constraint).
|
"""Parse a Requires-Dist line into (package_name, version_constraint).
|
||||||
|
|
||||||
|
Filters out optional/extra dependencies and platform-specific dependencies
|
||||||
|
to avoid pulling in unnecessary packages during dependency resolution.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
"requests (>=2.25.0)" -> ("requests", ">=2.25.0")
|
"requests (>=2.25.0)" -> ("requests", ">=2.25.0")
|
||||||
"typing-extensions; python_version < '3.8'" -> ("typing-extensions", None)
|
"typing-extensions; python_version < '3.8'" -> ("typing-extensions", None)
|
||||||
"numpy>=1.21.0" -> ("numpy", ">=1.21.0")
|
"numpy>=1.21.0" -> ("numpy", ">=1.21.0")
|
||||||
"certifi" -> ("certifi", None)
|
"certifi" -> ("certifi", None)
|
||||||
|
"pytest; extra == 'test'" -> (None, None) # Filtered: extra dependency
|
||||||
|
"pyobjc; sys_platform == 'darwin'" -> (None, None) # Filtered: platform-specific
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Tuple of (normalized_package_name, version_constraint or None)
|
Tuple of (normalized_package_name, version_constraint or None)
|
||||||
|
Returns (None, None) for dependencies that should be filtered out.
|
||||||
"""
|
"""
|
||||||
# Remove any environment markers (after semicolon)
|
# Check for and filter environment markers (after semicolon)
|
||||||
if ';' in requires_dist:
|
if ';' in requires_dist:
|
||||||
|
marker_part = requires_dist.split(';', 1)[1].lower()
|
||||||
|
|
||||||
|
# Filter out extra/optional dependencies - these are not core dependencies
|
||||||
|
# Examples: "pytest; extra == 'test'", "sphinx; extra == 'docs'"
|
||||||
|
if 'extra' in marker_part:
|
||||||
|
return None, None
|
||||||
|
|
||||||
|
# Filter out platform-specific dependencies to avoid cross-platform bloat
|
||||||
|
# Examples: "pyobjc; sys_platform == 'darwin'", "pywin32; sys_platform == 'win32'"
|
||||||
|
if 'sys_platform' in marker_part or 'platform_system' in marker_part:
|
||||||
|
return None, None
|
||||||
|
|
||||||
|
# Strip the marker for remaining dependencies (like python_version constraints)
|
||||||
requires_dist = requires_dist.split(';')[0].strip()
|
requires_dist = requires_dist.split(';')[0].strip()
|
||||||
|
|
||||||
# Match patterns like "package (>=1.0)" or "package>=1.0" or "package"
|
# Match patterns like "package (>=1.0)" or "package>=1.0" or "package"
|
||||||
@@ -565,6 +591,258 @@ async def pypi_package_versions(
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
async def fetch_and_cache_pypi_package(
|
||||||
|
db: Session,
|
||||||
|
storage: S3Storage,
|
||||||
|
http_client: httpx.AsyncClient,
|
||||||
|
package_name: str,
|
||||||
|
filename: str,
|
||||||
|
download_url: str,
|
||||||
|
expected_sha256: Optional[str] = None,
|
||||||
|
) -> Optional[dict]:
|
||||||
|
"""
|
||||||
|
Fetch a PyPI package from upstream and cache it in Orchard.
|
||||||
|
|
||||||
|
This is the core caching logic extracted from pypi_download_file() for reuse
|
||||||
|
by the registry client during auto-fetch dependency resolution.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
db: Database session
|
||||||
|
storage: S3 storage instance
|
||||||
|
http_client: Async HTTP client for making requests
|
||||||
|
package_name: Normalized package name (e.g., 'requests')
|
||||||
|
filename: Package filename (e.g., 'requests-2.31.0-py3-none-any.whl')
|
||||||
|
download_url: Full URL to download from upstream
|
||||||
|
expected_sha256: Optional SHA256 to verify download integrity
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dict with artifact_id, size, version, already_cached if successful.
|
||||||
|
None if the fetch failed.
|
||||||
|
"""
|
||||||
|
# Normalize package name
|
||||||
|
normalized_name = re.sub(r'[-_.]+', '-', package_name).lower()
|
||||||
|
|
||||||
|
# Check if we already have this URL cached
|
||||||
|
url_hash = hashlib.sha256(download_url.encode()).hexdigest()
|
||||||
|
cached_url = db.query(CachedUrl).filter(CachedUrl.url_hash == url_hash).first()
|
||||||
|
|
||||||
|
if cached_url:
|
||||||
|
# Already cached - return existing artifact info
|
||||||
|
artifact = db.query(Artifact).filter(Artifact.id == cached_url.artifact_id).first()
|
||||||
|
if artifact:
|
||||||
|
version = _extract_pypi_version(filename)
|
||||||
|
logger.info(f"PyPI fetch: {filename} already cached (artifact {artifact.id[:12]})")
|
||||||
|
return {
|
||||||
|
"artifact_id": artifact.id,
|
||||||
|
"size": artifact.size,
|
||||||
|
"version": version,
|
||||||
|
"already_cached": True,
|
||||||
|
}
|
||||||
|
|
||||||
|
# Get upstream sources for auth headers
|
||||||
|
sources = _get_pypi_upstream_sources(db)
|
||||||
|
matched_source = sources[0] if sources else None
|
||||||
|
|
||||||
|
headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
|
||||||
|
if matched_source:
|
||||||
|
headers.update(_build_auth_headers(matched_source))
|
||||||
|
auth = _get_basic_auth(matched_source) if matched_source else None
|
||||||
|
|
||||||
|
download_timeout = httpx.Timeout(connect=30.0, read=300.0, write=300.0, pool=30.0)
|
||||||
|
|
||||||
|
try:
|
||||||
|
logger.info(f"PyPI fetch: downloading {filename} from {download_url}")
|
||||||
|
|
||||||
|
response = await http_client.get(
|
||||||
|
download_url,
|
||||||
|
headers=headers,
|
||||||
|
auth=auth,
|
||||||
|
timeout=download_timeout,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Handle redirects manually
|
||||||
|
redirect_count = 0
|
||||||
|
while response.status_code in (301, 302, 303, 307, 308) and redirect_count < 5:
|
||||||
|
redirect_url = response.headers.get('location')
|
||||||
|
if not redirect_url:
|
||||||
|
break
|
||||||
|
|
||||||
|
if not redirect_url.startswith('http'):
|
||||||
|
redirect_url = urljoin(download_url, redirect_url)
|
||||||
|
|
||||||
|
logger.debug(f"PyPI fetch: following redirect to {redirect_url}")
|
||||||
|
|
||||||
|
# Don't send auth to different hosts
|
||||||
|
redirect_headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
|
||||||
|
redirect_auth = None
|
||||||
|
if urlparse(redirect_url).netloc == urlparse(download_url).netloc:
|
||||||
|
redirect_headers.update(headers)
|
||||||
|
redirect_auth = auth
|
||||||
|
|
||||||
|
response = await http_client.get(
|
||||||
|
redirect_url,
|
||||||
|
headers=redirect_headers,
|
||||||
|
auth=redirect_auth,
|
||||||
|
follow_redirects=False,
|
||||||
|
timeout=download_timeout,
|
||||||
|
)
|
||||||
|
redirect_count += 1
|
||||||
|
|
||||||
|
if response.status_code != 200:
|
||||||
|
error_detail = _parse_upstream_error(response)
|
||||||
|
logger.warning(f"PyPI fetch: upstream returned {response.status_code} for {filename}: {error_detail}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
content_type = response.headers.get('content-type', 'application/octet-stream')
|
||||||
|
|
||||||
|
# Stream to temp file to avoid loading large packages into memory
|
||||||
|
tmp_path = None
|
||||||
|
try:
|
||||||
|
with tempfile.NamedTemporaryFile(delete=False, suffix=f"_{filename}") as tmp_file:
|
||||||
|
tmp_path = tmp_file.name
|
||||||
|
async for chunk in response.aiter_bytes(chunk_size=65536):
|
||||||
|
tmp_file.write(chunk)
|
||||||
|
|
||||||
|
# Store in S3 from temp file (computes hash and deduplicates automatically)
|
||||||
|
with open(tmp_path, 'rb') as f:
|
||||||
|
result = storage.store(f)
|
||||||
|
sha256 = result.sha256
|
||||||
|
size = result.size
|
||||||
|
|
||||||
|
# Verify hash if expected
|
||||||
|
if expected_sha256 and sha256 != expected_sha256.lower():
|
||||||
|
logger.error(
|
||||||
|
f"PyPI fetch: hash mismatch for {filename}: "
|
||||||
|
f"expected {expected_sha256[:12]}, got {sha256[:12]}"
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Extract dependencies from the temp file
|
||||||
|
extracted_deps = _extract_dependencies_from_file(tmp_path, filename)
|
||||||
|
if extracted_deps:
|
||||||
|
logger.info(f"PyPI fetch: extracted {len(extracted_deps)} dependencies from {filename}")
|
||||||
|
|
||||||
|
logger.info(f"PyPI fetch: downloaded {filename}, {size} bytes, sha256={sha256[:12]}")
|
||||||
|
finally:
|
||||||
|
# Clean up temp file
|
||||||
|
if tmp_path and os.path.exists(tmp_path):
|
||||||
|
os.unlink(tmp_path)
|
||||||
|
|
||||||
|
# Check if artifact already exists
|
||||||
|
existing = db.query(Artifact).filter(Artifact.id == sha256).first()
|
||||||
|
if existing:
|
||||||
|
existing.ref_count += 1
|
||||||
|
db.flush()
|
||||||
|
else:
|
||||||
|
new_artifact = Artifact(
|
||||||
|
id=sha256,
|
||||||
|
original_name=filename,
|
||||||
|
content_type=content_type,
|
||||||
|
size=size,
|
||||||
|
ref_count=1,
|
||||||
|
created_by="pypi-proxy",
|
||||||
|
s3_key=result.s3_key,
|
||||||
|
checksum_md5=result.md5,
|
||||||
|
checksum_sha1=result.sha1,
|
||||||
|
s3_etag=result.s3_etag,
|
||||||
|
)
|
||||||
|
db.add(new_artifact)
|
||||||
|
db.flush()
|
||||||
|
|
||||||
|
# Create/get system project and package
|
||||||
|
system_project = db.query(Project).filter(Project.name == "_pypi").first()
|
||||||
|
if not system_project:
|
||||||
|
system_project = Project(
|
||||||
|
name="_pypi",
|
||||||
|
description="System project for cached PyPI packages",
|
||||||
|
is_public=True,
|
||||||
|
is_system=True,
|
||||||
|
created_by="pypi-proxy",
|
||||||
|
)
|
||||||
|
db.add(system_project)
|
||||||
|
db.flush()
|
||||||
|
elif not system_project.is_system:
|
||||||
|
system_project.is_system = True
|
||||||
|
db.flush()
|
||||||
|
|
||||||
|
package = db.query(Package).filter(
|
||||||
|
Package.project_id == system_project.id,
|
||||||
|
Package.name == normalized_name,
|
||||||
|
).first()
|
||||||
|
if not package:
|
||||||
|
package = Package(
|
||||||
|
project_id=system_project.id,
|
||||||
|
name=normalized_name,
|
||||||
|
description=f"PyPI package: {normalized_name}",
|
||||||
|
format="pypi",
|
||||||
|
)
|
||||||
|
db.add(package)
|
||||||
|
db.flush()
|
||||||
|
|
||||||
|
# Extract and create version
|
||||||
|
version = _extract_pypi_version(filename)
|
||||||
|
if version and not filename.endswith('.metadata'):
|
||||||
|
existing_version = db.query(PackageVersion).filter(
|
||||||
|
PackageVersion.package_id == package.id,
|
||||||
|
PackageVersion.version == version,
|
||||||
|
).first()
|
||||||
|
if not existing_version:
|
||||||
|
pkg_version = PackageVersion(
|
||||||
|
package_id=package.id,
|
||||||
|
artifact_id=sha256,
|
||||||
|
version=version,
|
||||||
|
version_source="filename",
|
||||||
|
created_by="pypi-proxy",
|
||||||
|
)
|
||||||
|
db.add(pkg_version)
|
||||||
|
|
||||||
|
# Cache the URL mapping
|
||||||
|
existing_cached = db.query(CachedUrl).filter(CachedUrl.url_hash == url_hash).first()
|
||||||
|
if not existing_cached:
|
||||||
|
cached_url_record = CachedUrl(
|
||||||
|
url_hash=url_hash,
|
||||||
|
url=download_url,
|
||||||
|
artifact_id=sha256,
|
||||||
|
)
|
||||||
|
db.add(cached_url_record)
|
||||||
|
|
||||||
|
# Store extracted dependencies using batch operation
|
||||||
|
if extracted_deps:
|
||||||
|
seen_deps: dict[str, str] = {}
|
||||||
|
for dep_name, dep_version in extracted_deps:
|
||||||
|
if dep_name not in seen_deps:
|
||||||
|
seen_deps[dep_name] = dep_version if dep_version else "*"
|
||||||
|
|
||||||
|
deps_to_store = [
|
||||||
|
("_pypi", dep_name, dep_version)
|
||||||
|
for dep_name, dep_version in seen_deps.items()
|
||||||
|
]
|
||||||
|
|
||||||
|
repo = ArtifactRepository(db)
|
||||||
|
inserted = repo.batch_upsert_dependencies(sha256, deps_to_store)
|
||||||
|
if inserted > 0:
|
||||||
|
logger.debug(f"Stored {inserted} dependencies for {sha256[:12]}...")
|
||||||
|
|
||||||
|
db.commit()
|
||||||
|
|
||||||
|
return {
|
||||||
|
"artifact_id": sha256,
|
||||||
|
"size": size,
|
||||||
|
"version": version,
|
||||||
|
"already_cached": False,
|
||||||
|
}
|
||||||
|
|
||||||
|
except httpx.ConnectError as e:
|
||||||
|
logger.warning(f"PyPI fetch: connection failed for {filename}: {e}")
|
||||||
|
return None
|
||||||
|
except httpx.TimeoutException as e:
|
||||||
|
logger.warning(f"PyPI fetch: timeout for {filename}: {e}")
|
||||||
|
return None
|
||||||
|
except Exception as e:
|
||||||
|
logger.exception(f"PyPI fetch: error downloading {filename}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
@router.get("/simple/{package_name}/{filename}")
|
@router.get("/simple/{package_name}/{filename}")
|
||||||
async def pypi_download_file(
|
async def pypi_download_file(
|
||||||
request: Request,
|
request: Request,
|
||||||
@@ -573,6 +851,7 @@ async def pypi_download_file(
|
|||||||
upstream: Optional[str] = None,
|
upstream: Optional[str] = None,
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
storage: S3Storage = Depends(get_storage),
|
storage: S3Storage = Depends(get_storage),
|
||||||
|
http_client: HttpClientManager = Depends(get_http_client),
|
||||||
):
|
):
|
||||||
"""
|
"""
|
||||||
Download a package file, caching it in Orchard.
|
Download a package file, caching it in Orchard.
|
||||||
@@ -654,7 +933,9 @@ async def pypi_download_file(
|
|||||||
headers.update(_build_auth_headers(matched_source))
|
headers.update(_build_auth_headers(matched_source))
|
||||||
auth = _get_basic_auth(matched_source) if matched_source else None
|
auth = _get_basic_auth(matched_source) if matched_source else None
|
||||||
|
|
||||||
timeout = httpx.Timeout(300.0, connect=PROXY_CONNECT_TIMEOUT) # 5 minutes for large files
|
# Use shared HTTP client from pool with longer timeout for file downloads
|
||||||
|
client = http_client.get_client()
|
||||||
|
download_timeout = httpx.Timeout(connect=30.0, read=300.0, write=300.0, pool=30.0)
|
||||||
|
|
||||||
# Initialize extracted dependencies list
|
# Initialize extracted dependencies list
|
||||||
extracted_deps = []
|
extracted_deps = []
|
||||||
@@ -662,11 +943,11 @@ async def pypi_download_file(
|
|||||||
# Fetch the file
|
# Fetch the file
|
||||||
logger.info(f"PyPI proxy: fetching {filename} from {upstream_url}")
|
logger.info(f"PyPI proxy: fetching {filename} from {upstream_url}")
|
||||||
|
|
||||||
async with httpx.AsyncClient(timeout=timeout, follow_redirects=False) as client:
|
|
||||||
response = await client.get(
|
response = await client.get(
|
||||||
upstream_url,
|
upstream_url,
|
||||||
headers=headers,
|
headers=headers,
|
||||||
auth=auth,
|
auth=auth,
|
||||||
|
timeout=download_timeout,
|
||||||
)
|
)
|
||||||
|
|
||||||
# Handle redirects manually
|
# Handle redirects manually
|
||||||
@@ -693,6 +974,7 @@ async def pypi_download_file(
|
|||||||
headers=redirect_headers,
|
headers=redirect_headers,
|
||||||
auth=redirect_auth,
|
auth=redirect_auth,
|
||||||
follow_redirects=False,
|
follow_redirects=False,
|
||||||
|
timeout=download_timeout,
|
||||||
)
|
)
|
||||||
redirect_count += 1
|
redirect_count += 1
|
||||||
|
|
||||||
@@ -821,7 +1103,7 @@ async def pypi_download_file(
|
|||||||
)
|
)
|
||||||
db.add(cached_url_record)
|
db.add(cached_url_record)
|
||||||
|
|
||||||
# Store extracted dependencies (deduplicate first - METADATA can list same dep under multiple extras)
|
# Store extracted dependencies using batch operation
|
||||||
if extracted_deps:
|
if extracted_deps:
|
||||||
# Deduplicate: keep first version constraint seen for each package name
|
# Deduplicate: keep first version constraint seen for each package name
|
||||||
seen_deps: dict[str, str] = {}
|
seen_deps: dict[str, str] = {}
|
||||||
@@ -829,22 +1111,17 @@ async def pypi_download_file(
|
|||||||
if dep_name not in seen_deps:
|
if dep_name not in seen_deps:
|
||||||
seen_deps[dep_name] = dep_version if dep_version else "*"
|
seen_deps[dep_name] = dep_version if dep_version else "*"
|
||||||
|
|
||||||
for dep_name, dep_version in seen_deps.items():
|
# Convert to list of tuples for batch insert
|
||||||
# Check if this dependency already exists for this artifact
|
deps_to_store = [
|
||||||
existing_dep = db.query(ArtifactDependency).filter(
|
("_pypi", dep_name, dep_version)
|
||||||
ArtifactDependency.artifact_id == sha256,
|
for dep_name, dep_version in seen_deps.items()
|
||||||
ArtifactDependency.dependency_project == "_pypi",
|
]
|
||||||
ArtifactDependency.dependency_package == dep_name,
|
|
||||||
).first()
|
|
||||||
|
|
||||||
if not existing_dep:
|
# Batch upsert - handles duplicates with ON CONFLICT DO NOTHING
|
||||||
dep = ArtifactDependency(
|
repo = ArtifactRepository(db)
|
||||||
artifact_id=sha256,
|
inserted = repo.batch_upsert_dependencies(sha256, deps_to_store)
|
||||||
dependency_project="_pypi",
|
if inserted > 0:
|
||||||
dependency_package=dep_name,
|
logger.debug(f"Stored {inserted} dependencies for {sha256[:12]}...")
|
||||||
version_constraint=dep_version,
|
|
||||||
)
|
|
||||||
db.add(dep)
|
|
||||||
|
|
||||||
db.commit()
|
db.commit()
|
||||||
|
|
||||||
|
|||||||
426
backend/app/registry_client.py
Normal file
426
backend/app/registry_client.py
Normal file
@@ -0,0 +1,426 @@
|
|||||||
|
"""
|
||||||
|
Registry client abstraction for upstream package registries.
|
||||||
|
|
||||||
|
Provides a pluggable interface for fetching packages from upstream registries
|
||||||
|
(PyPI, npm, Maven, etc.) during dependency resolution with auto-fetch enabled.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import hashlib
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import tempfile
|
||||||
|
from abc import ABC, abstractmethod
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import List, Optional, TYPE_CHECKING
|
||||||
|
from urllib.parse import urljoin, urlparse
|
||||||
|
|
||||||
|
import httpx
|
||||||
|
from packaging.specifiers import SpecifierSet, InvalidSpecifier
|
||||||
|
from packaging.version import Version, InvalidVersion
|
||||||
|
from sqlalchemy.orm import Session
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from .storage import S3Storage
|
||||||
|
from .http_client import HttpClientManager
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class VersionInfo:
|
||||||
|
"""Information about a package version from an upstream registry."""
|
||||||
|
|
||||||
|
version: str
|
||||||
|
download_url: str
|
||||||
|
filename: str
|
||||||
|
sha256: Optional[str] = None
|
||||||
|
size: Optional[int] = None
|
||||||
|
content_type: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class FetchResult:
|
||||||
|
"""Result of fetching a package from upstream."""
|
||||||
|
|
||||||
|
artifact_id: str # SHA256 hash
|
||||||
|
size: int
|
||||||
|
version: str
|
||||||
|
filename: str
|
||||||
|
already_cached: bool = False
|
||||||
|
|
||||||
|
|
||||||
|
class RegistryClient(ABC):
|
||||||
|
"""Abstract base class for upstream registry clients."""
|
||||||
|
|
||||||
|
@property
|
||||||
|
@abstractmethod
|
||||||
|
def source_type(self) -> str:
|
||||||
|
"""Return the source type this client handles (e.g., 'pypi', 'npm')."""
|
||||||
|
pass
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def get_available_versions(self, package_name: str) -> List[str]:
|
||||||
|
"""
|
||||||
|
Get all available versions of a package from upstream.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
package_name: The normalized package name
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of version strings, sorted from oldest to newest
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def resolve_constraint(
|
||||||
|
self, package_name: str, constraint: str
|
||||||
|
) -> Optional[VersionInfo]:
|
||||||
|
"""
|
||||||
|
Find the best version matching a constraint.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
package_name: The normalized package name
|
||||||
|
constraint: Version constraint (e.g., '>=1.9', '<2.0,>=1.5', '*')
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
VersionInfo with download URL, or None if no matching version found
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
async def fetch_package(
|
||||||
|
self,
|
||||||
|
package_name: str,
|
||||||
|
version_info: VersionInfo,
|
||||||
|
db: Session,
|
||||||
|
storage: "S3Storage",
|
||||||
|
) -> Optional[FetchResult]:
|
||||||
|
"""
|
||||||
|
Fetch and cache a package from upstream.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
package_name: The normalized package name
|
||||||
|
version_info: Version details including download URL
|
||||||
|
db: Database session for creating records
|
||||||
|
storage: S3 storage for caching the artifact
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
FetchResult with artifact_id, or None if fetch failed
|
||||||
|
"""
|
||||||
|
pass
|
||||||
|
|
||||||
|
|
||||||
|
class PyPIRegistryClient(RegistryClient):
|
||||||
|
"""PyPI registry client using the JSON API."""
|
||||||
|
|
||||||
|
# Timeout configuration for PyPI requests
|
||||||
|
CONNECT_TIMEOUT = 30.0
|
||||||
|
READ_TIMEOUT = 60.0
|
||||||
|
DOWNLOAD_TIMEOUT = 300.0 # Longer timeout for file downloads
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
http_client: httpx.AsyncClient,
|
||||||
|
upstream_sources: List,
|
||||||
|
pypi_api_url: str = "https://pypi.org/pypi",
|
||||||
|
):
|
||||||
|
"""
|
||||||
|
Initialize PyPI registry client.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
http_client: Shared async HTTP client
|
||||||
|
upstream_sources: List of configured upstream sources for auth
|
||||||
|
pypi_api_url: Base URL for PyPI JSON API
|
||||||
|
"""
|
||||||
|
self.client = http_client
|
||||||
|
self.sources = upstream_sources
|
||||||
|
self.api_url = pypi_api_url
|
||||||
|
|
||||||
|
@property
|
||||||
|
def source_type(self) -> str:
|
||||||
|
return "pypi"
|
||||||
|
|
||||||
|
def _normalize_package_name(self, name: str) -> str:
|
||||||
|
"""Normalize a PyPI package name per PEP 503."""
|
||||||
|
return re.sub(r"[-_.]+", "-", name).lower()
|
||||||
|
|
||||||
|
def _get_auth_headers(self) -> dict:
|
||||||
|
"""Get authentication headers from configured sources."""
|
||||||
|
headers = {"User-Agent": "Orchard-Registry-Client/1.0"}
|
||||||
|
if self.sources:
|
||||||
|
source = self.sources[0]
|
||||||
|
if hasattr(source, "auth_type"):
|
||||||
|
if source.auth_type == "bearer":
|
||||||
|
password = (
|
||||||
|
source.get_password()
|
||||||
|
if hasattr(source, "get_password")
|
||||||
|
else getattr(source, "password", None)
|
||||||
|
)
|
||||||
|
if password:
|
||||||
|
headers["Authorization"] = f"Bearer {password}"
|
||||||
|
elif source.auth_type == "api_key":
|
||||||
|
custom_headers = (
|
||||||
|
source.get_headers()
|
||||||
|
if hasattr(source, "get_headers")
|
||||||
|
else {}
|
||||||
|
)
|
||||||
|
if custom_headers:
|
||||||
|
headers.update(custom_headers)
|
||||||
|
return headers
|
||||||
|
|
||||||
|
def _get_basic_auth(self) -> Optional[tuple]:
|
||||||
|
"""Get basic auth credentials if configured."""
|
||||||
|
if self.sources:
|
||||||
|
source = self.sources[0]
|
||||||
|
if hasattr(source, "auth_type") and source.auth_type == "basic":
|
||||||
|
username = getattr(source, "username", None)
|
||||||
|
if username:
|
||||||
|
password = (
|
||||||
|
source.get_password()
|
||||||
|
if hasattr(source, "get_password")
|
||||||
|
else getattr(source, "password", "")
|
||||||
|
)
|
||||||
|
return (username, password or "")
|
||||||
|
return None
|
||||||
|
|
||||||
|
async def get_available_versions(self, package_name: str) -> List[str]:
|
||||||
|
"""Get all available versions from PyPI JSON API."""
|
||||||
|
normalized = self._normalize_package_name(package_name)
|
||||||
|
url = f"{self.api_url}/{normalized}/json"
|
||||||
|
|
||||||
|
headers = self._get_auth_headers()
|
||||||
|
auth = self._get_basic_auth()
|
||||||
|
timeout = httpx.Timeout(self.READ_TIMEOUT, connect=self.CONNECT_TIMEOUT)
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = await self.client.get(
|
||||||
|
url, headers=headers, auth=auth, timeout=timeout
|
||||||
|
)
|
||||||
|
|
||||||
|
if response.status_code == 404:
|
||||||
|
logger.debug(f"Package {normalized} not found on PyPI")
|
||||||
|
return []
|
||||||
|
|
||||||
|
if response.status_code != 200:
|
||||||
|
logger.warning(
|
||||||
|
f"PyPI API returned {response.status_code} for {normalized}"
|
||||||
|
)
|
||||||
|
return []
|
||||||
|
|
||||||
|
data = response.json()
|
||||||
|
releases = data.get("releases", {})
|
||||||
|
|
||||||
|
# Filter to valid versions and sort
|
||||||
|
versions = []
|
||||||
|
for v in releases.keys():
|
||||||
|
try:
|
||||||
|
Version(v)
|
||||||
|
versions.append(v)
|
||||||
|
except InvalidVersion:
|
||||||
|
continue
|
||||||
|
|
||||||
|
versions.sort(key=lambda x: Version(x))
|
||||||
|
return versions
|
||||||
|
|
||||||
|
except httpx.RequestError as e:
|
||||||
|
logger.warning(f"Failed to query PyPI for {normalized}: {e}")
|
||||||
|
return []
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Error parsing PyPI response for {normalized}: {e}")
|
||||||
|
return []
|
||||||
|
|
||||||
|
async def resolve_constraint(
|
||||||
|
self, package_name: str, constraint: str
|
||||||
|
) -> Optional[VersionInfo]:
|
||||||
|
"""Find best version matching constraint from PyPI."""
|
||||||
|
normalized = self._normalize_package_name(package_name)
|
||||||
|
url = f"{self.api_url}/{normalized}/json"
|
||||||
|
|
||||||
|
headers = self._get_auth_headers()
|
||||||
|
auth = self._get_basic_auth()
|
||||||
|
timeout = httpx.Timeout(self.READ_TIMEOUT, connect=self.CONNECT_TIMEOUT)
|
||||||
|
|
||||||
|
try:
|
||||||
|
response = await self.client.get(
|
||||||
|
url, headers=headers, auth=auth, timeout=timeout
|
||||||
|
)
|
||||||
|
|
||||||
|
if response.status_code == 404:
|
||||||
|
logger.debug(f"Package {normalized} not found on PyPI")
|
||||||
|
return None
|
||||||
|
|
||||||
|
if response.status_code != 200:
|
||||||
|
logger.warning(
|
||||||
|
f"PyPI API returned {response.status_code} for {normalized}"
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
data = response.json()
|
||||||
|
releases = data.get("releases", {})
|
||||||
|
|
||||||
|
# Handle wildcard - return latest version
|
||||||
|
if constraint == "*":
|
||||||
|
latest_version = data.get("info", {}).get("version")
|
||||||
|
if latest_version and latest_version in releases:
|
||||||
|
return self._get_version_info(
|
||||||
|
normalized, latest_version, releases[latest_version]
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Parse constraint
|
||||||
|
# If constraint looks like a bare version (no operator), treat as exact match
|
||||||
|
# e.g., "2025.10.5" -> "==2025.10.5"
|
||||||
|
effective_constraint = constraint
|
||||||
|
if constraint and constraint[0].isdigit():
|
||||||
|
effective_constraint = f"=={constraint}"
|
||||||
|
logger.debug(
|
||||||
|
f"Bare version '{constraint}' for {normalized}, "
|
||||||
|
f"treating as exact match '{effective_constraint}'"
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
specifier = SpecifierSet(effective_constraint)
|
||||||
|
except InvalidSpecifier:
|
||||||
|
# Invalid constraint - treat as wildcard
|
||||||
|
logger.warning(
|
||||||
|
f"Invalid version constraint '{constraint}' for {normalized}, "
|
||||||
|
"treating as wildcard"
|
||||||
|
)
|
||||||
|
latest_version = data.get("info", {}).get("version")
|
||||||
|
if latest_version and latest_version in releases:
|
||||||
|
return self._get_version_info(
|
||||||
|
normalized, latest_version, releases[latest_version]
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Find matching versions
|
||||||
|
matching = []
|
||||||
|
for v_str, files in releases.items():
|
||||||
|
if not files: # Skip versions with no files
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
v = Version(v_str)
|
||||||
|
if v in specifier:
|
||||||
|
matching.append((v_str, v, files))
|
||||||
|
except InvalidVersion:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if not matching:
|
||||||
|
logger.debug(
|
||||||
|
f"No versions of {normalized} match constraint '{constraint}'"
|
||||||
|
)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Sort by version and return highest match
|
||||||
|
matching.sort(key=lambda x: x[1], reverse=True)
|
||||||
|
best_version, _, best_files = matching[0]
|
||||||
|
|
||||||
|
return self._get_version_info(normalized, best_version, best_files)
|
||||||
|
|
||||||
|
except httpx.RequestError as e:
|
||||||
|
logger.warning(f"Failed to query PyPI for {normalized}: {e}")
|
||||||
|
return None
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Error resolving {normalized}@{constraint}: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
def _get_version_info(
|
||||||
|
self, package_name: str, version: str, files: List[dict]
|
||||||
|
) -> Optional[VersionInfo]:
|
||||||
|
"""Extract download info from PyPI release files."""
|
||||||
|
if not files:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Prefer wheel over sdist
|
||||||
|
wheel_file = None
|
||||||
|
sdist_file = None
|
||||||
|
|
||||||
|
for f in files:
|
||||||
|
filename = f.get("filename", "")
|
||||||
|
if filename.endswith(".whl"):
|
||||||
|
# Prefer platform-agnostic wheels
|
||||||
|
if "py3-none-any" in filename or wheel_file is None:
|
||||||
|
wheel_file = f
|
||||||
|
elif filename.endswith(".tar.gz") and sdist_file is None:
|
||||||
|
sdist_file = f
|
||||||
|
|
||||||
|
selected = wheel_file or sdist_file
|
||||||
|
if not selected:
|
||||||
|
# Fall back to first available file
|
||||||
|
selected = files[0]
|
||||||
|
|
||||||
|
return VersionInfo(
|
||||||
|
version=version,
|
||||||
|
download_url=selected.get("url", ""),
|
||||||
|
filename=selected.get("filename", ""),
|
||||||
|
sha256=selected.get("digests", {}).get("sha256"),
|
||||||
|
size=selected.get("size"),
|
||||||
|
content_type="application/zip"
|
||||||
|
if selected.get("filename", "").endswith(".whl")
|
||||||
|
else "application/gzip",
|
||||||
|
)
|
||||||
|
|
||||||
|
async def fetch_package(
|
||||||
|
self,
|
||||||
|
package_name: str,
|
||||||
|
version_info: VersionInfo,
|
||||||
|
db: Session,
|
||||||
|
storage: "S3Storage",
|
||||||
|
) -> Optional[FetchResult]:
|
||||||
|
"""Fetch and cache a PyPI package."""
|
||||||
|
# Import here to avoid circular imports
|
||||||
|
from .pypi_proxy import fetch_and_cache_pypi_package
|
||||||
|
|
||||||
|
normalized = self._normalize_package_name(package_name)
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
f"Fetching {normalized}=={version_info.version} from upstream PyPI"
|
||||||
|
)
|
||||||
|
|
||||||
|
result = await fetch_and_cache_pypi_package(
|
||||||
|
db=db,
|
||||||
|
storage=storage,
|
||||||
|
http_client=self.client,
|
||||||
|
package_name=normalized,
|
||||||
|
filename=version_info.filename,
|
||||||
|
download_url=version_info.download_url,
|
||||||
|
expected_sha256=version_info.sha256,
|
||||||
|
)
|
||||||
|
|
||||||
|
if result is None:
|
||||||
|
return None
|
||||||
|
|
||||||
|
return FetchResult(
|
||||||
|
artifact_id=result["artifact_id"],
|
||||||
|
size=result["size"],
|
||||||
|
version=version_info.version,
|
||||||
|
filename=version_info.filename,
|
||||||
|
already_cached=result.get("already_cached", False),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def get_registry_client(
|
||||||
|
source_type: str,
|
||||||
|
http_client: httpx.AsyncClient,
|
||||||
|
upstream_sources: List,
|
||||||
|
) -> Optional[RegistryClient]:
|
||||||
|
"""
|
||||||
|
Factory function to get a registry client for a source type.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
source_type: The registry type ('pypi', 'npm', etc.)
|
||||||
|
http_client: Shared async HTTP client
|
||||||
|
upstream_sources: List of configured upstream sources
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
RegistryClient for the source type, or None if not supported
|
||||||
|
"""
|
||||||
|
if source_type == "pypi":
|
||||||
|
# Filter to PyPI sources
|
||||||
|
pypi_sources = [s for s in upstream_sources if getattr(s, "source_type", "") == "pypi"]
|
||||||
|
return PyPIRegistryClient(http_client, pypi_sources)
|
||||||
|
|
||||||
|
# Future: Add npm, maven, etc.
|
||||||
|
logger.debug(f"No registry client available for source type: {source_type}")
|
||||||
|
return None
|
||||||
@@ -141,11 +141,13 @@ from .dependencies import (
|
|||||||
get_reverse_dependencies,
|
get_reverse_dependencies,
|
||||||
check_circular_dependencies,
|
check_circular_dependencies,
|
||||||
resolve_dependencies,
|
resolve_dependencies,
|
||||||
|
resolve_dependencies_with_fetch,
|
||||||
InvalidEnsureFileError,
|
InvalidEnsureFileError,
|
||||||
CircularDependencyError,
|
CircularDependencyError,
|
||||||
DependencyConflictError,
|
DependencyConflictError,
|
||||||
DependencyNotFoundError,
|
DependencyNotFoundError,
|
||||||
DependencyDepthExceededError,
|
DependencyDepthExceededError,
|
||||||
|
TooManyArtifactsError,
|
||||||
)
|
)
|
||||||
from .config import get_settings, get_env_upstream_sources
|
from .config import get_settings, get_env_upstream_sources
|
||||||
from .checksum import (
|
from .checksum import (
|
||||||
@@ -421,7 +423,8 @@ def _log_audit(
|
|||||||
|
|
||||||
# Health check
|
# Health check
|
||||||
@router.get("/health", response_model=HealthResponse)
|
@router.get("/health", response_model=HealthResponse)
|
||||||
def health_check(
|
async def health_check(
|
||||||
|
request: Request,
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
storage: S3Storage = Depends(get_storage),
|
storage: S3Storage = Depends(get_storage),
|
||||||
):
|
):
|
||||||
@@ -449,11 +452,30 @@ def health_check(
|
|||||||
|
|
||||||
overall_status = "ok" if (storage_healthy and database_healthy) else "degraded"
|
overall_status = "ok" if (storage_healthy and database_healthy) else "degraded"
|
||||||
|
|
||||||
return HealthResponse(
|
# Build response with optional infrastructure status
|
||||||
status=overall_status,
|
response_data = {
|
||||||
storage_healthy=storage_healthy,
|
"status": overall_status,
|
||||||
database_healthy=database_healthy,
|
"storage_healthy": storage_healthy,
|
||||||
)
|
"database_healthy": database_healthy,
|
||||||
|
}
|
||||||
|
|
||||||
|
# Add HTTP pool status if available
|
||||||
|
if hasattr(request.app.state, 'http_client'):
|
||||||
|
http_client = request.app.state.http_client
|
||||||
|
response_data["http_pool"] = {
|
||||||
|
"pool_size": http_client.pool_size,
|
||||||
|
"worker_threads": http_client.executor_max,
|
||||||
|
}
|
||||||
|
|
||||||
|
# Add cache status if available
|
||||||
|
if hasattr(request.app.state, 'cache'):
|
||||||
|
cache = request.app.state.cache
|
||||||
|
response_data["cache"] = {
|
||||||
|
"enabled": cache.enabled,
|
||||||
|
"connected": await cache.ping() if cache.enabled else False,
|
||||||
|
}
|
||||||
|
|
||||||
|
return HealthResponse(**response_data)
|
||||||
|
|
||||||
|
|
||||||
# --- Authentication Routes ---
|
# --- Authentication Routes ---
|
||||||
@@ -2645,10 +2667,10 @@ def list_packages(
|
|||||||
format: Optional[str] = Query(default=None, description="Filter by package format"),
|
format: Optional[str] = Query(default=None, description="Filter by package format"),
|
||||||
platform: Optional[str] = Query(default=None, description="Filter by platform"),
|
platform: Optional[str] = Query(default=None, description="Filter by platform"),
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
|
current_user: Optional[User] = Depends(get_current_user_optional),
|
||||||
):
|
):
|
||||||
project = db.query(Project).filter(Project.name == project_name).first()
|
# Check read access (handles private project visibility)
|
||||||
if not project:
|
project = check_project_access(db, project_name, current_user, "read")
|
||||||
raise HTTPException(status_code=404, detail="Project not found")
|
|
||||||
|
|
||||||
# Validate sort field
|
# Validate sort field
|
||||||
valid_sort_fields = {
|
valid_sort_fields = {
|
||||||
@@ -2929,13 +2951,13 @@ def update_package(
|
|||||||
package_update: PackageUpdate,
|
package_update: PackageUpdate,
|
||||||
request: Request,
|
request: Request,
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
|
current_user: Optional[User] = Depends(get_current_user_optional),
|
||||||
):
|
):
|
||||||
"""Update a package's metadata."""
|
"""Update a package's metadata."""
|
||||||
user_id = get_user_id(request)
|
user_id = get_user_id(request)
|
||||||
|
|
||||||
project = db.query(Project).filter(Project.name == project_name).first()
|
# Check write access to project
|
||||||
if not project:
|
project = check_project_access(db, project_name, current_user, "write")
|
||||||
raise HTTPException(status_code=404, detail="Project not found")
|
|
||||||
|
|
||||||
package = (
|
package = (
|
||||||
db.query(Package)
|
db.query(Package)
|
||||||
@@ -3012,6 +3034,7 @@ def delete_package(
|
|||||||
package_name: str,
|
package_name: str,
|
||||||
request: Request,
|
request: Request,
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
|
current_user: Optional[User] = Depends(get_current_user_optional),
|
||||||
):
|
):
|
||||||
"""
|
"""
|
||||||
Delete a package and all its versions.
|
Delete a package and all its versions.
|
||||||
@@ -3022,9 +3045,8 @@ def delete_package(
|
|||||||
"""
|
"""
|
||||||
user_id = get_user_id(request)
|
user_id = get_user_id(request)
|
||||||
|
|
||||||
project = db.query(Project).filter(Project.name == project_name).first()
|
# Check write access to project (deletion requires write permission)
|
||||||
if not project:
|
project = check_project_access(db, project_name, current_user, "write")
|
||||||
raise HTTPException(status_code=404, detail="Project not found")
|
|
||||||
|
|
||||||
package = (
|
package = (
|
||||||
db.query(Package)
|
db.query(Package)
|
||||||
@@ -7005,12 +7027,17 @@ def get_package_reverse_dependencies(
|
|||||||
response_model=DependencyResolutionResponse,
|
response_model=DependencyResolutionResponse,
|
||||||
tags=["dependencies"],
|
tags=["dependencies"],
|
||||||
)
|
)
|
||||||
def resolve_artifact_dependencies(
|
async def resolve_artifact_dependencies(
|
||||||
project_name: str,
|
project_name: str,
|
||||||
package_name: str,
|
package_name: str,
|
||||||
ref: str,
|
ref: str,
|
||||||
request: Request,
|
request: Request,
|
||||||
|
auto_fetch: bool = Query(
|
||||||
|
True,
|
||||||
|
description="Fetch missing dependencies from upstream registries (e.g., PyPI). Set to false for fast, network-free resolution."
|
||||||
|
),
|
||||||
db: Session = Depends(get_db),
|
db: Session = Depends(get_db),
|
||||||
|
storage: S3Storage = Depends(get_storage),
|
||||||
current_user: Optional[User] = Depends(get_current_user_optional),
|
current_user: Optional[User] = Depends(get_current_user_optional),
|
||||||
):
|
):
|
||||||
"""
|
"""
|
||||||
@@ -7019,6 +7046,16 @@ def resolve_artifact_dependencies(
|
|||||||
Returns a flat list of all artifacts needed, in topological order
|
Returns a flat list of all artifacts needed, in topological order
|
||||||
(dependencies before dependents). Includes download URLs for each artifact.
|
(dependencies before dependents). Includes download URLs for each artifact.
|
||||||
|
|
||||||
|
**Parameters:**
|
||||||
|
- **auto_fetch**: When true (default), attempts to fetch missing dependencies from
|
||||||
|
upstream registries (PyPI for _pypi project packages). Set to false for
|
||||||
|
fast, network-free resolution when all dependencies are already cached.
|
||||||
|
|
||||||
|
**Response Fields:**
|
||||||
|
- **resolved**: All artifacts in dependency order with download URLs
|
||||||
|
- **missing**: Dependencies that couldn't be resolved (with fetch status if auto_fetch=true)
|
||||||
|
- **fetched**: Artifacts that were fetched from upstream during this request
|
||||||
|
|
||||||
**Error Responses:**
|
**Error Responses:**
|
||||||
- 404: Artifact or dependency not found
|
- 404: Artifact or dependency not found
|
||||||
- 409: Circular dependency or version conflict detected
|
- 409: Circular dependency or version conflict detected
|
||||||
@@ -7030,7 +7067,38 @@ def resolve_artifact_dependencies(
|
|||||||
base_url = str(request.base_url).rstrip("/")
|
base_url = str(request.base_url).rstrip("/")
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
if auto_fetch:
|
||||||
|
# Use async resolution with auto-fetch
|
||||||
|
from .registry_client import get_registry_client
|
||||||
|
from .pypi_proxy import _get_pypi_upstream_sources
|
||||||
|
|
||||||
|
settings = get_settings()
|
||||||
|
|
||||||
|
# Get HTTP client from app state
|
||||||
|
http_client = request.app.state.http_client.get_client()
|
||||||
|
|
||||||
|
# Get upstream sources for registry clients
|
||||||
|
pypi_sources = _get_pypi_upstream_sources(db)
|
||||||
|
|
||||||
|
# Build registry clients
|
||||||
|
registry_clients = {}
|
||||||
|
pypi_client = get_registry_client("pypi", http_client, pypi_sources)
|
||||||
|
if pypi_client:
|
||||||
|
registry_clients["_pypi"] = pypi_client
|
||||||
|
|
||||||
|
return await resolve_dependencies_with_fetch(
|
||||||
|
db=db,
|
||||||
|
project_name=project_name,
|
||||||
|
package_name=package_name,
|
||||||
|
ref=ref,
|
||||||
|
base_url=base_url,
|
||||||
|
storage=storage,
|
||||||
|
registry_clients=registry_clients,
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
# Fast, synchronous resolution without network calls
|
||||||
return resolve_dependencies(db, project_name, package_name, ref, base_url)
|
return resolve_dependencies(db, project_name, package_name, ref, base_url)
|
||||||
|
|
||||||
except DependencyNotFoundError as e:
|
except DependencyNotFoundError as e:
|
||||||
raise HTTPException(
|
raise HTTPException(
|
||||||
status_code=404,
|
status_code=404,
|
||||||
@@ -7070,6 +7138,15 @@ def resolve_artifact_dependencies(
|
|||||||
"max_depth": e.max_depth,
|
"max_depth": e.max_depth,
|
||||||
}
|
}
|
||||||
)
|
)
|
||||||
|
except TooManyArtifactsError as e:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=400,
|
||||||
|
detail={
|
||||||
|
"error": "too_many_artifacts",
|
||||||
|
"message": str(e),
|
||||||
|
"max_artifacts": e.max_artifacts,
|
||||||
|
}
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
# --- Upstream Caching Routes ---
|
# --- Upstream Caching Routes ---
|
||||||
|
|||||||
@@ -493,6 +493,8 @@ class HealthResponse(BaseModel):
|
|||||||
version: str = "1.0.0"
|
version: str = "1.0.0"
|
||||||
storage_healthy: Optional[bool] = None
|
storage_healthy: Optional[bool] = None
|
||||||
database_healthy: Optional[bool] = None
|
database_healthy: Optional[bool] = None
|
||||||
|
http_pool: Optional[Dict[str, Any]] = None
|
||||||
|
cache: Optional[Dict[str, Any]] = None
|
||||||
|
|
||||||
|
|
||||||
# Garbage collection schemas
|
# Garbage collection schemas
|
||||||
@@ -890,6 +892,8 @@ class MissingDependency(BaseModel):
|
|||||||
package: str
|
package: str
|
||||||
constraint: Optional[str] = None
|
constraint: Optional[str] = None
|
||||||
required_by: Optional[str] = None
|
required_by: Optional[str] = None
|
||||||
|
fetch_attempted: bool = False # True if auto-fetch was attempted
|
||||||
|
fetch_error: Optional[str] = None # Error message if fetch failed
|
||||||
|
|
||||||
|
|
||||||
class DependencyResolutionResponse(BaseModel):
|
class DependencyResolutionResponse(BaseModel):
|
||||||
@@ -897,6 +901,7 @@ class DependencyResolutionResponse(BaseModel):
|
|||||||
requested: Dict[str, str] # project, package, ref
|
requested: Dict[str, str] # project, package, ref
|
||||||
resolved: List[ResolvedArtifact]
|
resolved: List[ResolvedArtifact]
|
||||||
missing: List[MissingDependency] = []
|
missing: List[MissingDependency] = []
|
||||||
|
fetched: List[ResolvedArtifact] = [] # Artifacts fetched from upstream during resolution
|
||||||
total_size: int
|
total_size: int
|
||||||
artifact_count: int
|
artifact_count: int
|
||||||
|
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ passlib[bcrypt]==1.7.4
|
|||||||
bcrypt==4.0.1
|
bcrypt==4.0.1
|
||||||
slowapi==0.1.9
|
slowapi==0.1.9
|
||||||
httpx>=0.25.0
|
httpx>=0.25.0
|
||||||
|
redis>=5.0.0
|
||||||
|
|
||||||
# Test dependencies
|
# Test dependencies
|
||||||
pytest>=7.4.0
|
pytest>=7.4.0
|
||||||
|
|||||||
@@ -135,3 +135,19 @@ class TestPyPIPackageNormalization:
|
|||||||
assert "text/html" in response.headers.get("content-type", "")
|
assert "text/html" in response.headers.get("content-type", "")
|
||||||
elif response.status_code == 503:
|
elif response.status_code == 503:
|
||||||
assert "No PyPI upstream sources configured" in response.json()["detail"]
|
assert "No PyPI upstream sources configured" in response.json()["detail"]
|
||||||
|
|
||||||
|
|
||||||
|
class TestPyPIProxyInfrastructure:
|
||||||
|
"""Tests for PyPI proxy infrastructure integration."""
|
||||||
|
|
||||||
|
@pytest.mark.integration
|
||||||
|
def test_health_endpoint_includes_infrastructure(self, integration_client):
|
||||||
|
"""Health endpoint should report infrastructure status."""
|
||||||
|
response = integration_client.get("/health")
|
||||||
|
assert response.status_code == 200
|
||||||
|
|
||||||
|
data = response.json()
|
||||||
|
assert data["status"] == "ok"
|
||||||
|
# Infrastructure status should be present
|
||||||
|
assert "http_pool" in data
|
||||||
|
assert "cache" in data
|
||||||
|
@@ -873,10 +873,14 @@ class TestCircularDependencyDetection:


 class TestConflictDetection:
-    """Tests for #81: Dependency Conflict Detection and Reporting"""
+    """Tests for dependency conflict handling.
+
+    The resolver uses "first version wins" strategy for version conflicts,
+    allowing resolution to succeed rather than failing with an error.
+    """

     @pytest.mark.integration
-    def test_detect_version_conflict(
+    def test_version_conflict_uses_first_version(
         self, integration_client, test_project, unique_test_id
     ):
         """Test conflict when two deps require different versions of same package."""
@@ -968,21 +972,19 @@ class TestConflictDetection:
             )
             assert response.status_code == 200

-            # Try to resolve app - should report conflict
+            # Try to resolve app - with lenient conflict handling, this should succeed
+            # The resolver uses "first version wins" strategy for conflicting versions
             response = integration_client.get(
                 f"/api/v1/project/{test_project}/{pkg_app}/+/1.0.0/resolve"
             )
-            assert response.status_code == 409
+            assert response.status_code == 200
             data = response.json()
-            # Error details are nested in "detail" for HTTPException
-            detail = data.get("detail", data)
-            assert detail.get("error") == "dependency_conflict"
-            assert len(detail.get("conflicts", [])) > 0

-            # Verify conflict details
-            conflict = detail["conflicts"][0]
-            assert conflict["package"] == pkg_common
-            assert len(conflict["requirements"]) == 2
+            # Resolution should succeed with first-encountered version of common
+            assert data["artifact_count"] >= 1
+            # Find the common package in resolved list
+            common_resolved = [r for r in data["resolved"] if r["package"] == pkg_common]
+            assert len(common_resolved) == 1  # Only one version should be included

         finally:
             for pkg in [pkg_app, pkg_lib_a, pkg_lib_b, pkg_common]:
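The "first version wins" behaviour exercised above can be pictured with the small sketch below. It is an illustration of the strategy, not the project's actual resolver: the first version of a package that the walk reaches is kept, and later conflicting requirements for the same package are simply skipped instead of raising a 409.

```python
def resolve_first_wins(root, get_dependencies):
    """Sketch: depth-first walk where the first version seen for a package wins."""
    resolved = {}        # package name -> chosen version
    stack = [root]       # (project, package, version) tuples
    while stack:
        project, package, version = stack.pop()
        if package in resolved:
            # A version was already chosen for this package; ignore the conflict.
            continue
        resolved[package] = version
        stack.extend(get_dependencies(project, package, version))
    return resolved
```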
@@ -1067,3 +1069,277 @@ class TestConflictDetection:
         finally:
             for pkg in [pkg_app, pkg_lib_a, pkg_lib_b, pkg_common]:
                 integration_client.delete(f"/api/v1/project/{test_project}/packages/{pkg}")
+
+
+class TestAutoFetchDependencies:
+    """Tests for auto-fetch functionality in dependency resolution.
+
+    These tests verify:
+    - Resolution with auto_fetch=true (default) fetches missing dependencies from upstream
+    - Resolution with auto_fetch=false skips network calls for fast resolution
+    - Proper handling of missing/non-existent packages
+    - Response schema includes fetched artifacts list
+    """
+
+    @pytest.mark.integration
+    def test_resolve_auto_fetch_true_is_default(
+        self, integration_client, test_package, unique_test_id
+    ):
+        """Test that auto_fetch=true is the default (no fetch needed when all deps cached)."""
+        project_name, package_name = test_package
+
+        # Upload a simple artifact without dependencies
+        content = unique_content("autofetch-default", unique_test_id, "nodeps")
+        files = {"file": ("default.tar.gz", BytesIO(content), "application/gzip")}
+        response = integration_client.post(
+            f"/api/v1/project/{project_name}/{package_name}/upload",
+            files=files,
+            data={"version": f"v1.0.0-{unique_test_id}"},
+        )
+        assert response.status_code == 200
+
+        # Resolve without auto_fetch param (should default to false)
+        response = integration_client.get(
+            f"/api/v1/project/{project_name}/{package_name}/+/v1.0.0-{unique_test_id}/resolve"
+        )
+        assert response.status_code == 200
+        data = response.json()
+
+        # Should have empty fetched list
+        assert data.get("fetched", []) == []
+        assert data["artifact_count"] == 1
+
+    @pytest.mark.integration
+    def test_resolve_auto_fetch_explicit_false(
+        self, integration_client, test_package, unique_test_id
+    ):
+        """Test that auto_fetch=false works explicitly."""
+        project_name, package_name = test_package
+
+        content = unique_content("autofetch-explicit-false", unique_test_id, "nodeps")
+        files = {"file": ("explicit.tar.gz", BytesIO(content), "application/gzip")}
+        response = integration_client.post(
+            f"/api/v1/project/{project_name}/{package_name}/upload",
+            files=files,
+            data={"version": f"v2.0.0-{unique_test_id}"},
+        )
+        assert response.status_code == 200
+
+        # Resolve with explicit auto_fetch=false
+        response = integration_client.get(
+            f"/api/v1/project/{project_name}/{package_name}/+/v2.0.0-{unique_test_id}/resolve",
+            params={"auto_fetch": "false"},
+        )
+        assert response.status_code == 200
+        data = response.json()
+        assert data.get("fetched", []) == []
+
+    @pytest.mark.integration
+    def test_resolve_auto_fetch_true_no_missing_deps(
+        self, integration_client, test_project, unique_test_id
+    ):
+        """Test that auto_fetch=true works when all deps are already cached."""
+        pkg_a = f"fetch-a-{unique_test_id}"
+        pkg_b = f"fetch-b-{unique_test_id}"
+
+        for pkg in [pkg_a, pkg_b]:
+            response = integration_client.post(
+                f"/api/v1/project/{test_project}/packages",
+                json={"name": pkg}
+            )
+            assert response.status_code == 200
+
+        try:
+            # Upload B (no deps)
+            content_b = unique_content("B", unique_test_id, "fetch")
+            files = {"file": ("b.tar.gz", BytesIO(content_b), "application/gzip")}
+            response = integration_client.post(
+                f"/api/v1/project/{test_project}/{pkg_b}/upload",
+                files=files,
+                data={"version": "1.0.0"},
+            )
+            assert response.status_code == 200
+
+            # Upload A (depends on B)
+            ensure_a = yaml.dump({
+                "dependencies": [
+                    {"project": test_project, "package": pkg_b, "version": "1.0.0"}
+                ]
+            })
+            content_a = unique_content("A", unique_test_id, "fetch")
+            files = {
+                "file": ("a.tar.gz", BytesIO(content_a), "application/gzip"),
+                "ensure": ("orchard.ensure", BytesIO(ensure_a.encode()), "application/x-yaml"),
+            }
+            response = integration_client.post(
+                f"/api/v1/project/{test_project}/{pkg_a}/upload",
+                files=files,
+                data={"version": "1.0.0"},
+            )
+            assert response.status_code == 200
+
+            # Resolve with auto_fetch=true - should work since deps are cached
+            response = integration_client.get(
+                f"/api/v1/project/{test_project}/{pkg_a}/+/1.0.0/resolve",
+                params={"auto_fetch": "true"},
+            )
+            assert response.status_code == 200
+            data = response.json()
+
+            # Should resolve successfully
+            assert data["artifact_count"] == 2
+            # Nothing fetched since everything was cached
+            assert len(data.get("fetched", [])) == 0
+            # No missing deps
+            assert len(data.get("missing", [])) == 0
+
+        finally:
+            for pkg in [pkg_a, pkg_b]:
+                integration_client.delete(f"/api/v1/project/{test_project}/packages/{pkg}")
+
+    @pytest.mark.integration
+    def test_resolve_missing_dep_with_auto_fetch_false(
+        self, integration_client, test_package, unique_test_id
+    ):
+        """Test that missing deps are reported when auto_fetch=false."""
+        project_name, package_name = test_package
+
+        # Create _pypi system project if it doesn't exist
+        response = integration_client.get("/api/v1/projects/_pypi")
+        if response.status_code == 404:
+            response = integration_client.post(
+                "/api/v1/projects",
+                json={"name": "_pypi", "description": "System project for PyPI packages"}
+            )
+            # May fail if already exists or can't create - that's ok
+
+        # Upload artifact with dependency on _pypi package that doesn't exist locally
+        ensure_content = yaml.dump({
+            "dependencies": [
+                {"project": "_pypi", "package": "nonexistent-pkg-xyz123", "version": ">=1.0.0"}
+            ]
+        })
+
+        content = unique_content("missing-pypi", unique_test_id, "dep")
+        files = {
+            "file": ("missing-pypi-dep.tar.gz", BytesIO(content), "application/gzip"),
+            "ensure": ("orchard.ensure", BytesIO(ensure_content.encode()), "application/x-yaml"),
+        }
+        response = integration_client.post(
+            f"/api/v1/project/{project_name}/{package_name}/upload",
+            files=files,
+            data={"version": f"v3.0.0-{unique_test_id}"},
+        )
+        # Upload should succeed - validation is loose for system projects
+        if response.status_code == 200:
+            # Resolve without auto_fetch - should report missing
+            response = integration_client.get(
+                f"/api/v1/project/{project_name}/{package_name}/+/v3.0.0-{unique_test_id}/resolve",
+                params={"auto_fetch": "false"},
+            )
+            assert response.status_code == 200
+            data = response.json()
+
+            # Should have missing dependencies
+            assert len(data.get("missing", [])) >= 1
+
+            # Verify missing dependency structure
+            missing = data["missing"][0]
+            assert missing["project"] == "_pypi"
+            assert missing["package"] == "nonexistent-pkg-xyz123"
+            # Without auto_fetch, these should be false/None
+            assert missing.get("fetch_attempted", False) is False
+
+    @pytest.mark.integration
+    def test_resolve_response_schema_has_fetched_field(
+        self, integration_client, test_package, unique_test_id
+    ):
+        """Test that the resolve response always includes the fetched field."""
+        project_name, package_name = test_package
+
+        content = unique_content("schema-check", unique_test_id, "nodeps")
+        files = {"file": ("schema.tar.gz", BytesIO(content), "application/gzip")}
+        response = integration_client.post(
+            f"/api/v1/project/{project_name}/{package_name}/upload",
+            files=files,
+            data={"version": f"v4.0.0-{unique_test_id}"},
+        )
+        assert response.status_code == 200
+
+        # Check both auto_fetch modes include fetched field
+        for auto_fetch in ["false", "true"]:
+            response = integration_client.get(
+                f"/api/v1/project/{project_name}/{package_name}/+/v4.0.0-{unique_test_id}/resolve",
+                params={"auto_fetch": auto_fetch},
+            )
+            assert response.status_code == 200
+            data = response.json()
+
+            # Required fields
+            assert "requested" in data
+            assert "resolved" in data
+            assert "missing" in data
+            assert "fetched" in data  # New field
+            assert "total_size" in data
+            assert "artifact_count" in data
+
+            # Types
+            assert isinstance(data["fetched"], list)
+            assert isinstance(data["missing"], list)
+
+    @pytest.mark.integration
+    def test_missing_dep_schema_has_fetch_fields(
+        self, integration_client, test_package, unique_test_id
+    ):
+        """Test that missing dependency entries have fetch_attempted and fetch_error fields."""
+        project_name, package_name = test_package
+
+        # Create a dependency on a non-existent package in a real project
+        dep_project_name = f"dep-test-{unique_test_id}"
+        response = integration_client.post(
+            "/api/v1/projects", json={"name": dep_project_name}
+        )
+        assert response.status_code == 200
+
+        try:
+            ensure_content = yaml.dump({
+                "dependencies": [
+                    {"project": dep_project_name, "package": "nonexistent-pkg", "version": "1.0.0"}
+                ]
+            })
+
+            content = unique_content("missing-schema", unique_test_id, "check")
+            files = {
+                "file": ("missing-schema.tar.gz", BytesIO(content), "application/gzip"),
+                "ensure": ("orchard.ensure", BytesIO(ensure_content.encode()), "application/x-yaml"),
+            }
+            response = integration_client.post(
+                f"/api/v1/project/{project_name}/{package_name}/upload",
+                files=files,
+                data={"version": f"v5.0.0-{unique_test_id}"},
+            )
+            assert response.status_code == 200
+
+            # Resolve
+            response = integration_client.get(
+                f"/api/v1/project/{project_name}/{package_name}/+/v5.0.0-{unique_test_id}/resolve",
+                params={"auto_fetch": "true"},
+            )
+            assert response.status_code == 200
+            data = response.json()
+
+            # Should have missing dependencies
+            assert len(data.get("missing", [])) >= 1
+
+            # Check schema for missing dependency
+            missing = data["missing"][0]
+            assert "project" in missing
+            assert "package" in missing
+            assert "constraint" in missing
+            assert "required_by" in missing
+            # New fields
+            assert "fetch_attempted" in missing
+            assert "fetch_error" in missing  # May be None
+
+        finally:
+            integration_client.delete(f"/api/v1/projects/{dep_project_name}")
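Outside the test client, the same endpoint could be exercised with a plain HTTP client along these lines. The base URL, project, package, and ref below are placeholders; the path shape and the auto_fetch query parameter match the endpoint used in the tests above.

```python
import httpx

with httpx.Client(base_url="http://localhost:8000") as client:
    resp = client.get(
        "/api/v1/project/myproject/app/+/1.0.0/resolve",
        params={"auto_fetch": "true"},
    )
    resp.raise_for_status()
    data = resp.json()
    # artifact_count, fetched and missing come straight from DependencyResolutionResponse
    print(data["artifact_count"], len(data["fetched"]), len(data["missing"]))
```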
backend/tests/unit/test_cache_service.py (new file, 374 lines)
@@ -0,0 +1,374 @@
"""Tests for CacheService."""
import pytest
from unittest.mock import MagicMock, AsyncMock, patch


class TestCacheCategory:
    """Tests for cache category enum."""

    @pytest.mark.unit
    def test_immutable_categories_have_no_ttl(self):
        """Immutable categories should return None for TTL."""
        from app.cache_service import CacheCategory, get_category_ttl
        from app.config import Settings

        settings = Settings()

        assert get_category_ttl(CacheCategory.ARTIFACT_METADATA, settings) is None
        assert get_category_ttl(CacheCategory.ARTIFACT_DEPENDENCIES, settings) is None
        assert get_category_ttl(CacheCategory.DEPENDENCY_RESOLUTION, settings) is None

    @pytest.mark.unit
    def test_mutable_categories_have_ttl(self):
        """Mutable categories should return configured TTL."""
        from app.cache_service import CacheCategory, get_category_ttl
        from app.config import Settings

        settings = Settings(
            cache_ttl_index=300,
            cache_ttl_upstream=3600,
        )

        assert get_category_ttl(CacheCategory.PACKAGE_INDEX, settings) == 300
        assert get_category_ttl(CacheCategory.UPSTREAM_SOURCES, settings) == 3600


class TestCacheService:
    """Tests for Redis cache service."""

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_disabled_cache_returns_none(self):
        """When Redis disabled, get() should return None."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()

        result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key")

        assert result is None
        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_disabled_cache_set_is_noop(self):
        """When Redis disabled, set() should be a no-op."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()

        # Should not raise
        await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"test-value")

        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_cache_key_namespacing(self):
        """Cache keys should be properly namespaced."""
        from app.cache_service import CacheService, CacheCategory

        key = CacheService._make_key(CacheCategory.PACKAGE_INDEX, "pypi", "numpy")

        assert key == "orchard:index:pypi:numpy"

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_ping_returns_false_when_disabled(self):
        """ping() should return False when Redis is disabled."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()

        result = await cache.ping()

        assert result is False
        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_enabled_property(self):
        """enabled property should reflect Redis state."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)

        assert cache.enabled is False

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_delete_is_noop_when_disabled(self):
        """delete() should be a no-op when Redis is disabled."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()

        # Should not raise
        await cache.delete(CacheCategory.PACKAGE_INDEX, "test-key")

        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_invalidate_pattern_returns_zero_when_disabled(self):
        """invalidate_pattern() should return 0 when Redis is disabled."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()

        result = await cache.invalidate_pattern(CacheCategory.PACKAGE_INDEX)

        assert result == 0
        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_startup_already_started(self):
        """startup() should be idempotent."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)
        await cache.startup()
        await cache.startup()  # Should not raise

        assert cache._started is True
        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_shutdown_not_started(self):
        """shutdown() should handle not-started state."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=False)
        cache = CacheService(settings)

        # Should not raise
        await cache.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_make_key_with_default_protocol(self):
        """_make_key should work with default protocol."""
        from app.cache_service import CacheService, CacheCategory

        key = CacheService._make_key(CacheCategory.ARTIFACT_METADATA, "default", "abc123")

        assert key == "orchard:artifact:default:abc123"


class TestCacheServiceWithMockedRedis:
    """Tests for CacheService with mocked Redis client."""

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_get_returns_cached_value(self):
        """get() should return cached value when available."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        # Mock the redis client
        mock_redis = AsyncMock()
        mock_redis.get.return_value = b"cached-data"
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key", "pypi")

        assert result == b"cached-data"
        mock_redis.get.assert_called_once_with("orchard:index:pypi:test-key")

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_set_with_ttl(self):
        """set() should use setex for mutable categories."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True, cache_ttl_index=300)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"test-value", "pypi")

        mock_redis.setex.assert_called_once_with(
            "orchard:index:pypi:test-key", 300, b"test-value"
        )

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_set_without_ttl(self):
        """set() should use set (no expiry) for immutable categories."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        await cache.set(
            CacheCategory.ARTIFACT_METADATA, "abc123", b"metadata", "pypi"
        )

        mock_redis.set.assert_called_once_with(
            "orchard:artifact:pypi:abc123", b"metadata"
        )

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_delete_calls_redis_delete(self):
        """delete() should call Redis delete."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        await cache.delete(CacheCategory.PACKAGE_INDEX, "test-key", "pypi")

        mock_redis.delete.assert_called_once_with("orchard:index:pypi:test-key")

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_invalidate_pattern_deletes_matching_keys(self):
        """invalidate_pattern() should delete all matching keys."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()

        # Create an async generator for scan_iter
        async def mock_scan_iter(match=None):
            for key in [b"orchard:index:pypi:numpy", b"orchard:index:pypi:requests"]:
                yield key

        mock_redis.scan_iter = mock_scan_iter
        mock_redis.delete.return_value = 2
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        result = await cache.invalidate_pattern(CacheCategory.PACKAGE_INDEX, "*", "pypi")

        assert result == 2
        mock_redis.delete.assert_called_once()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_ping_returns_true_when_connected(self):
        """ping() should return True when Redis responds."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        mock_redis.ping.return_value = True
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        result = await cache.ping()

        assert result is True

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_get_handles_exception(self):
        """get() should return None and log warning on exception."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        mock_redis.get.side_effect = Exception("Connection lost")
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key")

        assert result is None

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_set_handles_exception(self):
        """set() should log warning on exception."""
        from app.cache_service import CacheService, CacheCategory
        from app.config import Settings

        settings = Settings(redis_enabled=True, cache_ttl_index=300)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        mock_redis.setex.side_effect = Exception("Connection lost")
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        # Should not raise
        await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"value")

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_ping_returns_false_on_exception(self):
        """ping() should return False on exception."""
        from app.cache_service import CacheService
        from app.config import Settings

        settings = Settings(redis_enabled=True)
        cache = CacheService(settings)

        mock_redis = AsyncMock()
        mock_redis.ping.side_effect = Exception("Connection lost")
        cache._redis = mock_redis
        cache._enabled = True
        cache._started = True

        result = await cache.ping()

        assert result is False
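The key layout these tests assert can be sketched as below. The enum values are reverse-engineered from the asserted strings ("orchard:index:...", "orchard:artifact:..."); the real CacheCategory in app.cache_service likely has more members and may be defined differently.

```python
from enum import Enum

class CacheCategory(str, Enum):
    PACKAGE_INDEX = "index"          # mutable, TTL from cache_ttl_index
    ARTIFACT_METADATA = "artifact"   # immutable, no TTL

def make_key(category: CacheCategory, protocol: str, key: str) -> str:
    """Sketch of the namespacing asserted above: orchard:<category>:<protocol>:<key>."""
    return f"orchard:{category.value}:{protocol}:{key}"

assert make_key(CacheCategory.PACKAGE_INDEX, "pypi", "numpy") == "orchard:index:pypi:numpy"
assert make_key(CacheCategory.ARTIFACT_METADATA, "default", "abc123") == "orchard:artifact:default:abc123"
```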
backend/tests/unit/test_db_utils.py (new file, 167 lines)
@@ -0,0 +1,167 @@
"""Tests for database utility functions."""
import pytest
from unittest.mock import MagicMock, patch


class TestArtifactRepository:
    """Tests for ArtifactRepository."""

    def test_batch_dependency_values_formatting(self):
        """batch_upsert_dependencies should format values correctly."""
        from app.db_utils import ArtifactRepository

        deps = [
            ("_pypi", "numpy", ">=1.21.0"),
            ("_pypi", "requests", "*"),
            ("myproject", "mylib", "==1.0.0"),
        ]

        values = ArtifactRepository._format_dependency_values("abc123", deps)

        assert len(values) == 3
        assert values[0] == {
            "artifact_id": "abc123",
            "dependency_project": "_pypi",
            "dependency_package": "numpy",
            "version_constraint": ">=1.21.0",
        }
        assert values[2]["dependency_project"] == "myproject"

    def test_empty_dependencies_returns_empty_list(self):
        """Empty dependency list should return empty values."""
        from app.db_utils import ArtifactRepository

        values = ArtifactRepository._format_dependency_values("abc123", [])

        assert values == []

    def test_format_dependency_values_preserves_special_characters(self):
        """Version constraints with special characters should be preserved."""
        from app.db_utils import ArtifactRepository

        deps = [
            ("_pypi", "package-name", ">=1.0.0,<2.0.0"),
            ("_pypi", "another_pkg", "~=1.4.2"),
        ]

        values = ArtifactRepository._format_dependency_values("hash123", deps)

        assert values[0]["version_constraint"] == ">=1.0.0,<2.0.0"
        assert values[1]["version_constraint"] == "~=1.4.2"

    def test_batch_upsert_dependencies_returns_zero_for_empty(self):
        """batch_upsert_dependencies should return 0 for empty list without DB call."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        repo = ArtifactRepository(mock_db)

        result = repo.batch_upsert_dependencies("abc123", [])

        assert result == 0
        # Verify no DB operations were performed
        mock_db.execute.assert_not_called()

    def test_get_or_create_artifact_builds_correct_statement(self):
        """get_or_create_artifact should use ON CONFLICT DO UPDATE."""
        from app.db_utils import ArtifactRepository
        from app.models import Artifact

        mock_db = MagicMock()
        mock_result = MagicMock()
        mock_artifact = MagicMock()
        mock_artifact.ref_count = 1
        mock_result.scalar_one.return_value = mock_artifact
        mock_db.execute.return_value = mock_result

        repo = ArtifactRepository(mock_db)
        artifact, created = repo.get_or_create_artifact(
            sha256="abc123def456",
            size=1024,
            filename="test.whl",
            content_type="application/zip",
        )

        assert mock_db.execute.called
        assert created is True
        assert artifact == mock_artifact

    def test_get_or_create_artifact_existing_not_created(self):
        """get_or_create_artifact should return created=False for existing artifact."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        mock_result = MagicMock()
        mock_artifact = MagicMock()
        mock_artifact.ref_count = 5  # Existing artifact with ref_count > 1
        mock_result.scalar_one.return_value = mock_artifact
        mock_db.execute.return_value = mock_result

        repo = ArtifactRepository(mock_db)
        artifact, created = repo.get_or_create_artifact(
            sha256="abc123def456",
            size=1024,
            filename="test.whl",
        )

        assert created is False

    def test_get_cached_url_with_artifact_returns_tuple(self):
        """get_cached_url_with_artifact should return (CachedUrl, Artifact) tuple."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        mock_cached_url = MagicMock()
        mock_artifact = MagicMock()
        mock_db.query.return_value.join.return_value.filter.return_value.first.return_value = (
            mock_cached_url,
            mock_artifact,
        )

        repo = ArtifactRepository(mock_db)
        result = repo.get_cached_url_with_artifact("url_hash_123")

        assert result == (mock_cached_url, mock_artifact)

    def test_get_cached_url_with_artifact_returns_none_when_not_found(self):
        """get_cached_url_with_artifact should return None when URL not cached."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        mock_db.query.return_value.join.return_value.filter.return_value.first.return_value = None

        repo = ArtifactRepository(mock_db)
        result = repo.get_cached_url_with_artifact("nonexistent_hash")

        assert result is None

    def test_get_artifact_dependencies_returns_list(self):
        """get_artifact_dependencies should return list of dependencies."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        mock_dep1 = MagicMock()
        mock_dep2 = MagicMock()
        mock_db.query.return_value.filter.return_value.all.return_value = [
            mock_dep1,
            mock_dep2,
        ]

        repo = ArtifactRepository(mock_db)
        result = repo.get_artifact_dependencies("artifact_hash_123")

        assert len(result) == 2
        assert result[0] == mock_dep1
        assert result[1] == mock_dep2

    def test_get_artifact_dependencies_returns_empty_list(self):
        """get_artifact_dependencies should return empty list when no dependencies."""
        from app.db_utils import ArtifactRepository

        mock_db = MagicMock()
        mock_db.query.return_value.filter.return_value.all.return_value = []

        repo = ArtifactRepository(mock_db)
        result = repo.get_artifact_dependencies("artifact_without_deps")

        assert result == []
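The value-formatting behaviour checked by the first tests above amounts to a simple tuple-to-dict mapping, roughly as sketched here. This is only an illustration of what the tests pin down, not the actual staticmethod in app.db_utils.

```python
def format_dependency_values(artifact_id, deps):
    """Map (project, package, constraint) tuples to bulk-insert row dicts."""
    return [
        {
            "artifact_id": artifact_id,
            "dependency_project": project,
            "dependency_package": package,
            "version_constraint": constraint,
        }
        for project, package, constraint in deps
    ]
```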
backend/tests/unit/test_http_client.py (new file, 194 lines)
@@ -0,0 +1,194 @@
"""Tests for HttpClientManager."""
import pytest
from unittest.mock import MagicMock, AsyncMock, patch


class TestHttpClientManager:
    """Tests for HTTP client pool management."""

    @pytest.mark.unit
    def test_manager_initializes_with_settings(self):
        """Manager should initialize with config settings."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings(
            http_max_connections=50,
            http_connect_timeout=15.0,
        )
        manager = HttpClientManager(settings)

        assert manager.max_connections == 50
        assert manager.connect_timeout == 15.0
        assert manager._default_client is None  # Not started yet

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_startup_creates_client(self):
        """Startup should create the default async client."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        await manager.startup()

        assert manager._default_client is not None

        await manager.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_shutdown_closes_client(self):
        """Shutdown should close all clients gracefully."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        await manager.startup()
        client = manager._default_client

        await manager.shutdown()

        assert manager._default_client is None
        assert client.is_closed

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_get_client_returns_default(self):
        """get_client() should return the default client."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)
        await manager.startup()

        client = manager.get_client()

        assert client is manager._default_client

        await manager.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_get_client_raises_if_not_started(self):
        """get_client() should raise RuntimeError if manager not started."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        with pytest.raises(RuntimeError, match="not started"):
            manager.get_client()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_run_blocking_executes_in_thread_pool(self):
        """run_blocking should execute sync functions in thread pool."""
        from app.http_client import HttpClientManager
        from app.config import Settings
        import threading

        settings = Settings()
        manager = HttpClientManager(settings)
        await manager.startup()

        main_thread = threading.current_thread()
        execution_thread = None

        def blocking_func():
            nonlocal execution_thread
            execution_thread = threading.current_thread()
            return "result"

        result = await manager.run_blocking(blocking_func)

        assert result == "result"
        assert execution_thread is not main_thread

        await manager.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_run_blocking_raises_if_not_started(self):
        """run_blocking should raise RuntimeError if manager not started."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        with pytest.raises(RuntimeError, match="not started"):
            await manager.run_blocking(lambda: None)

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_startup_idempotent(self):
        """Calling startup multiple times should be safe."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        await manager.startup()
        client1 = manager._default_client

        await manager.startup()  # Should not create a new client
        client2 = manager._default_client

        assert client1 is client2  # Same client instance

        await manager.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_shutdown_idempotent(self):
        """Calling shutdown multiple times should be safe."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        await manager.startup()
        await manager.shutdown()
        await manager.shutdown()  # Should not raise

        assert manager._default_client is None

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_properties_return_configured_values(self):
        """Properties should return configured values."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings(
            http_max_connections=75,
            http_worker_threads=16,
        )
        manager = HttpClientManager(settings)
        await manager.startup()

        assert manager.pool_size == 75
        assert manager.executor_max == 16

        await manager.shutdown()

    @pytest.mark.asyncio
    @pytest.mark.unit
    async def test_active_connections_when_not_started(self):
        """active_connections should return 0 when not started."""
        from app.http_client import HttpClientManager
        from app.config import Settings

        settings = Settings()
        manager = HttpClientManager(settings)

        assert manager.active_connections == 0
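The run_blocking contract these tests pin down (work executes on a worker thread, calling before startup raises RuntimeError) can be sketched with a minimal stand-in like the one below. It is not the real HttpClientManager, just the relevant slice of behaviour under those assumptions.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

class BlockingRunner:
    """Minimal sketch of the run_blocking behaviour asserted above."""

    def __init__(self, max_workers: int = 8):
        self._max_workers = max_workers
        self._executor = None

    async def startup(self):
        # Idempotent: only create the pool once.
        if self._executor is None:
            self._executor = ThreadPoolExecutor(max_workers=self._max_workers)

    async def shutdown(self):
        if self._executor is not None:
            self._executor.shutdown(wait=True)
            self._executor = None

    async def run_blocking(self, func, *args):
        if self._executor is None:
            raise RuntimeError("manager not started")
        loop = asyncio.get_running_loop()
        # Run the synchronous callable on the thread pool without blocking the event loop.
        return await loop.run_in_executor(self._executor, func, *args)
```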
backend/tests/unit/test_metadata.py (new file, 243 lines)
@@ -0,0 +1,243 @@
"""Unit tests for metadata extraction functionality."""

import io
import gzip
import tarfile
import zipfile
import pytest
from app.metadata import (
    extract_metadata,
    extract_deb_metadata,
    extract_wheel_metadata,
    extract_tarball_metadata,
    extract_jar_metadata,
    parse_deb_control,
)


class TestDebMetadata:
    """Tests for Debian package metadata extraction."""

    def test_parse_deb_control_basic(self):
        """Test parsing a basic control file."""
        control = """Package: my-package
Version: 1.2.3
Architecture: amd64
Maintainer: Test <test@example.com>
Description: A test package
"""
        result = parse_deb_control(control)
        assert result["package_name"] == "my-package"
        assert result["version"] == "1.2.3"
        assert result["architecture"] == "amd64"
        assert result["format"] == "deb"

    def test_parse_deb_control_with_epoch(self):
        """Test parsing version with epoch."""
        control = """Package: another-pkg
Version: 2:1.0.0-1
"""
        result = parse_deb_control(control)
        assert result["version"] == "2:1.0.0-1"
        assert result["package_name"] == "another-pkg"
        assert result["format"] == "deb"

    def test_extract_deb_metadata_invalid_magic(self):
        """Test that invalid ar magic returns empty dict."""
        file = io.BytesIO(b"not an ar archive")
        result = extract_deb_metadata(file)
        assert result == {}

    def test_extract_deb_metadata_valid_ar_no_control(self):
        """Test ar archive without control.tar returns empty."""
        # Create minimal ar archive with just debian-binary
        ar_data = b"!<arch>\n"
        ar_data += b"debian-binary/ 0 0 0 100644 4 `\n"
        ar_data += b"2.0\n"

        file = io.BytesIO(ar_data)
        result = extract_deb_metadata(file)
        # Should return empty since no control.tar found
        assert result == {} or "version" not in result


class TestWheelMetadata:
    """Tests for Python wheel metadata extraction."""

    def _create_wheel_with_metadata(self, metadata_content: str) -> io.BytesIO:
        """Helper to create a wheel file with given METADATA content."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w') as zf:
            zf.writestr('package-1.0.0.dist-info/METADATA', metadata_content)
        buf.seek(0)
        return buf

    def test_extract_wheel_version(self):
        """Test extracting version from wheel METADATA."""
        metadata = """Metadata-Version: 2.1
Name: my-package
Version: 2.3.4
Summary: A test package
"""
        file = self._create_wheel_with_metadata(metadata)
        result = extract_wheel_metadata(file)
        assert result.get("version") == "2.3.4"
        assert result.get("package_name") == "my-package"
        assert result.get("format") == "wheel"

    def test_extract_wheel_no_version(self):
        """Test wheel without version field."""
        metadata = """Metadata-Version: 2.1
Name: no-version-pkg
"""
        file = self._create_wheel_with_metadata(metadata)
        result = extract_wheel_metadata(file)
        assert "version" not in result
        assert result.get("package_name") == "no-version-pkg"
        assert result.get("format") == "wheel"

    def test_extract_wheel_invalid_zip(self):
        """Test that invalid zip returns format-only dict."""
        file = io.BytesIO(b"not a zip file")
        result = extract_wheel_metadata(file)
        assert result == {"format": "wheel"}

    def test_extract_wheel_no_metadata_file(self):
        """Test wheel without METADATA file returns format-only dict."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w') as zf:
            zf.writestr('some_file.py', 'print("hello")')
        buf.seek(0)
        result = extract_wheel_metadata(buf)
        assert result == {"format": "wheel"}


class TestTarballMetadata:
    """Tests for tarball metadata extraction from filename."""

    def test_extract_version_from_filename_standard(self):
        """Test standard package-version.tar.gz format."""
        file = io.BytesIO(b"")  # Content doesn't matter for filename extraction
        result = extract_tarball_metadata(file, "mypackage-1.2.3.tar.gz")
        assert result.get("version") == "1.2.3"
        assert result.get("package_name") == "mypackage"
        assert result.get("format") == "tarball"

    def test_extract_version_with_v_prefix(self):
        """Test version with v prefix."""
        file = io.BytesIO(b"")
        result = extract_tarball_metadata(file, "package-v2.0.0.tar.gz")
        assert result.get("version") == "2.0.0"
        assert result.get("package_name") == "package"
        assert result.get("format") == "tarball"

    def test_extract_version_underscore_separator(self):
        """Test package_version format."""
        file = io.BytesIO(b"")
        result = extract_tarball_metadata(file, "my_package_3.1.4.tar.gz")
        assert result.get("version") == "3.1.4"
        assert result.get("package_name") == "my_package"
        assert result.get("format") == "tarball"

    def test_extract_version_complex(self):
        """Test complex version string."""
        file = io.BytesIO(b"")
        result = extract_tarball_metadata(file, "package-1.0.0-beta.1.tar.gz")
        # The regex handles versions with suffix like -beta_1
        assert result.get("format") == "tarball"
        # May or may not extract version depending on regex match
        if "version" in result:
            assert result.get("package_name") == "package"

    def test_extract_no_version_in_filename(self):
        """Test filename without version returns format-only dict."""
        file = io.BytesIO(b"")
        result = extract_tarball_metadata(file, "package.tar.gz")
        # Should return format but no version
        assert result.get("version") is None
        assert result.get("format") == "tarball"


class TestJarMetadata:
    """Tests for JAR/Java metadata extraction."""

    def _create_jar_with_manifest(self, manifest_content: str) -> io.BytesIO:
        """Helper to create a JAR file with given MANIFEST.MF content."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w') as zf:
            zf.writestr('META-INF/MANIFEST.MF', manifest_content)
        buf.seek(0)
        return buf

    def test_extract_jar_version_from_manifest(self):
        """Test extracting version from MANIFEST.MF."""
        manifest = """Manifest-Version: 1.0
Implementation-Title: my-library
Implementation-Version: 4.5.6
"""
        file = self._create_jar_with_manifest(manifest)
        result = extract_jar_metadata(file)
        assert result.get("version") == "4.5.6"
        assert result.get("package_name") == "my-library"
        assert result.get("format") == "jar"

    def test_extract_jar_bundle_version(self):
        """Test extracting OSGi Bundle-Version."""
        manifest = """Manifest-Version: 1.0
Bundle-Version: 2.1.0
Bundle-Name: Test Bundle
"""
        file = self._create_jar_with_manifest(manifest)
        result = extract_jar_metadata(file)
        # Bundle-Version is stored in bundle_version, not version
        assert result.get("bundle_version") == "2.1.0"
        assert result.get("bundle_name") == "Test Bundle"
        assert result.get("format") == "jar"

    def test_extract_jar_invalid_zip(self):
        """Test that invalid JAR returns format-only dict."""
        file = io.BytesIO(b"not a jar file")
        result = extract_jar_metadata(file)
        assert result == {"format": "jar"}


class TestExtractMetadataDispatch:
    """Tests for the main extract_metadata dispatcher function."""

    def test_dispatch_to_wheel(self):
        """Test that .whl files use wheel extractor."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w') as zf:
            zf.writestr('pkg-1.0.dist-info/METADATA', 'Version: 1.0.0\nName: pkg')
        buf.seek(0)

        result = extract_metadata(buf, "package-1.0.0-py3-none-any.whl")
        assert result.get("version") == "1.0.0"
        assert result.get("package_name") == "pkg"
        assert result.get("format") == "wheel"

    def test_dispatch_to_tarball(self):
        """Test that .tar.gz files use tarball extractor."""
        file = io.BytesIO(b"")
        result = extract_metadata(file, "mypackage-2.3.4.tar.gz")
        assert result.get("version") == "2.3.4"
        assert result.get("package_name") == "mypackage"
        assert result.get("format") == "tarball"

    def test_dispatch_unknown_extension(self):
        """Test that unknown extensions return empty dict."""
        file = io.BytesIO(b"some content")
        result = extract_metadata(file, "unknown.xyz")
        assert result == {}

    def test_file_position_reset_after_extraction(self):
        """Test that file position is reset to start after extraction."""
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, 'w') as zf:
            zf.writestr('pkg-1.0.dist-info/METADATA', 'Version: 1.0.0\nName: pkg')
        buf.seek(0)

        extract_metadata(buf, "package.whl")

        # File should be back at position 0
        assert buf.tell() == 0
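The filename-based extraction behaviour exercised by TestTarballMetadata could be satisfied by a pattern along the lines below. The actual regex in app.metadata is not shown in this diff, so this is only a plausible sketch that reproduces the asserted cases ("mypackage-1.2.3.tar.gz", "package-v2.0.0.tar.gz", "my_package_3.1.4.tar.gz", and no version for "package.tar.gz").

```python
import re

# Hypothetical pattern: <name> separated by - or _ from an optional "v" and a dotted version.
TARBALL_NAME_RE = re.compile(
    r"^(?P<name>.+?)[-_]v?(?P<version>\d+(?:\.\d+)*)\.tar\.(?:gz|bz2|xz)$"
)

def guess_tarball_version(filename: str) -> dict:
    result = {"format": "tarball"}
    match = TARBALL_NAME_RE.match(filename)
    if match:
        result["package_name"] = match.group("name")
        result["version"] = match.group("version")
    return result

assert guess_tarball_version("mypackage-1.2.3.tar.gz")["version"] == "1.2.3"
assert "version" not in guess_tarball_version("package.tar.gz")
```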
85
backend/tests/unit/test_pypi_proxy.py
Normal file
85
backend/tests/unit/test_pypi_proxy.py
Normal file
@@ -0,0 +1,85 @@
|
|||||||
|
"""Unit tests for PyPI proxy functionality."""
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from app.pypi_proxy import _parse_requires_dist
|
class TestParseRequiresDist:
    """Tests for _parse_requires_dist function."""

    def test_simple_package(self):
        """Test parsing a simple package name."""
        name, version = _parse_requires_dist("numpy")
        assert name == "numpy"
        assert version is None

    def test_package_with_version(self):
        """Test parsing a package with a version constraint."""
        name, version = _parse_requires_dist("numpy>=1.21.0")
        assert name == "numpy"
        assert version == ">=1.21.0"

    def test_package_with_parenthesized_version(self):
        """Test parsing a package with a parenthesized version."""
        name, version = _parse_requires_dist("requests (>=2.25.0)")
        assert name == "requests"
        assert version == ">=2.25.0"

    def test_package_with_python_version_marker(self):
        """Test that python_version markers are stripped while the dependency is kept."""
        name, version = _parse_requires_dist("typing-extensions; python_version < '3.8'")
        assert name == "typing-extensions"
        assert version is None

    def test_filters_extra_dependencies(self):
        """Test that extra dependencies are filtered out."""
        # Extra dependencies should return (None, None)
        name, version = _parse_requires_dist("pytest; extra == 'test'")
        assert name is None
        assert version is None

        name, version = _parse_requires_dist("sphinx; extra == 'docs'")
        assert name is None
        assert version is None

    def test_filters_platform_specific_darwin(self):
        """Test that macOS-specific dependencies are filtered out."""
        name, version = _parse_requires_dist("pyobjc; sys_platform == 'darwin'")
        assert name is None
        assert version is None

    def test_filters_platform_specific_win32(self):
        """Test that Windows-specific dependencies are filtered out."""
        name, version = _parse_requires_dist("pywin32; sys_platform == 'win32'")
        assert name is None
        assert version is None

    def test_filters_platform_system_marker(self):
        """Test that platform_system markers are filtered out."""
        name, version = _parse_requires_dist("jaraco-windows; platform_system == 'Windows'")
        assert name is None
        assert version is None

    def test_normalizes_package_name(self):
        """Test that package names are normalized (PEP 503)."""
        name, version = _parse_requires_dist("Typing_Extensions>=3.7.4")
        assert name == "typing-extensions"
        assert version == ">=3.7.4"

    def test_complex_version_constraint(self):
        """Test parsing complex version constraints."""
        name, version = _parse_requires_dist("gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1")
        assert name == "gast"
        assert version == "!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1"

    def test_version_range(self):
        """Test parsing version range constraints."""
        name, version = _parse_requires_dist("grpcio<2.0,>=1.24.3")
        assert name == "grpcio"
        assert version == "<2.0,>=1.24.3"

    def test_tilde_version(self):
        """Test parsing tilde version constraints."""
        name, version = _parse_requires_dist("tensorboard~=2.20.0")
        assert name == "tensorboard"
        assert version == "~=2.20.0"
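The implementation of `_parse_requires_dist` is not part of this hunk; a minimal sketch consistent with the assertions above (an assumption, not the project's actual code) could delegate to `packaging.requirements.Requirement` and apply PEP 503 normalization:

import re

from packaging.requirements import InvalidRequirement, Requirement


def _parse_requires_dist(spec: str) -> tuple[str | None, str | None]:
    """Parse a Requires-Dist entry into (normalized name, version constraint) -- sketch only."""
    try:
        req = Requirement(spec)
    except InvalidRequirement:
        return None, None

    # Skip optional extras and platform-specific dependencies.
    if req.marker is not None:
        marker = str(req.marker)
        if "extra ==" in marker or "sys_platform" in marker or "platform_system" in marker:
            return None, None

    # PEP 503 name normalization: runs of ".", "_", "-" collapse to "-", lowercased.
    name = re.sub(r"[-_.]+", "-", req.name).lower()
    version = str(req.specifier) or None
    return name, version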
backend/tests/unit/test_rate_limit.py (new file, 65 lines)
@@ -0,0 +1,65 @@
"""Unit tests for rate limiting configuration."""

import os

import pytest


class TestRateLimitConfiguration:
    """Tests for rate limit configuration."""

    def test_default_login_rate_limit(self):
        """Test default login rate limit is 5/minute."""
        # Import fresh to get default value
        import importlib
        import app.rate_limit as rate_limit_module

        # Save original env value
        original = os.environ.get("ORCHARD_LOGIN_RATE_LIMIT")

        try:
            # Clear env variable to test default
            if "ORCHARD_LOGIN_RATE_LIMIT" in os.environ:
                del os.environ["ORCHARD_LOGIN_RATE_LIMIT"]

            # Reload module to pick up new env
            importlib.reload(rate_limit_module)

            assert rate_limit_module.LOGIN_RATE_LIMIT == "5/minute"
        finally:
            # Restore original env value
            if original is not None:
                os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = original
            importlib.reload(rate_limit_module)

    def test_custom_login_rate_limit(self):
        """Test custom login rate limit from environment."""
        import importlib
        import app.rate_limit as rate_limit_module

        # Save original env value
        original = os.environ.get("ORCHARD_LOGIN_RATE_LIMIT")

        try:
            # Set custom rate limit
            os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = "10/minute"

            # Reload module to pick up new env
            importlib.reload(rate_limit_module)

            assert rate_limit_module.LOGIN_RATE_LIMIT == "10/minute"
        finally:
            # Restore original env value
            if original is not None:
                os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = original
            else:
                if "ORCHARD_LOGIN_RATE_LIMIT" in os.environ:
                    del os.environ["ORCHARD_LOGIN_RATE_LIMIT"]
            importlib.reload(rate_limit_module)

    def test_limiter_exists(self):
        """Test that limiter object is created."""
        from app.rate_limit import limiter

        assert limiter is not None
        # Limiter should have a key_func set
        assert limiter._key_func is not None
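The `app/rate_limit.py` module these tests reload is not shown in this diff. A minimal sketch consistent with the assertions (assuming the limiter is slowapi's `Limiter`, which is what the `_key_func` attribute suggests) might look like:

"""Rate limiting configuration -- sketch only, assuming slowapi as the backing library."""

import os

from slowapi import Limiter
from slowapi.util import get_remote_address

# Read the login rate limit from the environment, defaulting to 5 requests per minute.
LOGIN_RATE_LIMIT = os.environ.get("ORCHARD_LOGIN_RATE_LIMIT", "5/minute")

# Key requests by client IP address so the limit applies per caller.
limiter = Limiter(key_func=get_remote_address)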
backend/tests/unit/test_registry_client.py (new file, 300 lines)
@@ -0,0 +1,300 @@
"""Unit tests for registry client functionality."""

import pytest
from unittest.mock import AsyncMock, MagicMock, patch
import httpx
from packaging.specifiers import SpecifierSet

from app.registry_client import (
    PyPIRegistryClient,
    VersionInfo,
    FetchResult,
    get_registry_client,
)


class TestPyPIRegistryClient:
    """Tests for PyPI registry client."""

    @pytest.fixture
    def mock_http_client(self):
        """Create a mock async HTTP client."""
        return AsyncMock(spec=httpx.AsyncClient)

    @pytest.fixture
    def client(self, mock_http_client):
        """Create a PyPI registry client with mocked HTTP."""
        return PyPIRegistryClient(
            http_client=mock_http_client,
            upstream_sources=[],
            pypi_api_url="https://pypi.org/pypi",
        )

    def test_source_type(self, client):
        """Test source_type returns 'pypi'."""
        assert client.source_type == "pypi"

    def test_normalize_package_name(self, client):
        """Test package name normalization per PEP 503."""
        assert client._normalize_package_name("My_Package") == "my-package"
        assert client._normalize_package_name("my.package") == "my-package"
        assert client._normalize_package_name("my-package") == "my-package"
        assert client._normalize_package_name("MY-PACKAGE") == "my-package"
        assert client._normalize_package_name("my__package") == "my-package"
        assert client._normalize_package_name("my..package") == "my-package"

    @pytest.mark.asyncio
    async def test_get_available_versions_success(self, client, mock_http_client):
        """Test fetching available versions from PyPI."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "releases": {
                "1.0.0": [{"packagetype": "bdist_wheel"}],
                "1.1.0": [{"packagetype": "bdist_wheel"}],
                "2.0.0": [{"packagetype": "bdist_wheel"}],
            }
        }
        mock_http_client.get.return_value = mock_response

        versions = await client.get_available_versions("test-package")

        assert "1.0.0" in versions
        assert "1.1.0" in versions
        assert "2.0.0" in versions
        mock_http_client.get.assert_called_once()

    @pytest.mark.asyncio
    async def test_get_available_versions_empty(self, client, mock_http_client):
        """Test handling package with no releases."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {"releases": {}}
        mock_http_client.get.return_value = mock_response

        versions = await client.get_available_versions("empty-package")

        assert versions == []

    @pytest.mark.asyncio
    async def test_get_available_versions_404(self, client, mock_http_client):
        """Test handling non-existent package."""
        mock_response = MagicMock()
        mock_response.status_code = 404
        mock_http_client.get.return_value = mock_response

        versions = await client.get_available_versions("nonexistent")

        assert versions == []

    @pytest.mark.asyncio
    async def test_resolve_constraint_wildcard(self, client, mock_http_client):
        """Test resolving wildcard constraint returns latest."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "info": {"version": "2.0.0"},
            "releases": {
                "1.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-1.0.0.whl",
                        "filename": "test-1.0.0.whl",
                        "digests": {"sha256": "abc123"},
                        "size": 1000,
                    }
                ],
                "2.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-2.0.0.whl",
                        "filename": "test-2.0.0.whl",
                        "digests": {"sha256": "def456"},
                        "size": 2000,
                    }
                ],
            },
        }
        mock_http_client.get.return_value = mock_response

        result = await client.resolve_constraint("test-package", "*")

        assert result is not None
        assert result.version == "2.0.0"

    @pytest.mark.asyncio
    async def test_resolve_constraint_specific_version(self, client, mock_http_client):
        """Test resolving specific version constraint."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "releases": {
                "1.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-1.0.0.whl",
                        "filename": "test-1.0.0.whl",
                        "digests": {"sha256": "abc123"},
                        "size": 1000,
                    }
                ],
                "2.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-2.0.0.whl",
                        "filename": "test-2.0.0.whl",
                    }
                ],
            },
        }
        mock_http_client.get.return_value = mock_response

        result = await client.resolve_constraint("test-package", ">=1.0.0,<2.0.0")

        assert result is not None
        assert result.version == "1.0.0"

    @pytest.mark.asyncio
    async def test_resolve_constraint_no_match(self, client, mock_http_client):
        """Test resolving constraint with no matching version."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "releases": {
                "1.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-1.0.0.whl",
                        "filename": "test-1.0.0.whl",
                    }
                ],
            },
        }
        mock_http_client.get.return_value = mock_response

        result = await client.resolve_constraint("test-package", ">=5.0.0")

        assert result is None

    @pytest.mark.asyncio
    async def test_resolve_constraint_bare_version(self, client, mock_http_client):
        """Test resolving bare version string as exact match."""
        mock_response = MagicMock()
        mock_response.status_code = 200
        mock_response.json.return_value = {
            "info": {"version": "2.0.0"},
            "releases": {
                "1.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-1.0.0.whl",
                        "filename": "test-1.0.0.whl",
                        "digests": {"sha256": "abc123"},
                        "size": 1000,
                    }
                ],
                "2.0.0": [
                    {
                        "packagetype": "bdist_wheel",
                        "url": "https://files.pythonhosted.org/test-2.0.0.whl",
                        "filename": "test-2.0.0.whl",
                        "digests": {"sha256": "def456"},
                        "size": 2000,
                    }
                ],
            },
        }
        mock_http_client.get.return_value = mock_response

        # Bare version "1.0.0" should resolve to exactly 1.0.0, not latest
        result = await client.resolve_constraint("test-package", "1.0.0")

        assert result is not None
        assert result.version == "1.0.0"


class TestVersionInfo:
    """Tests for VersionInfo dataclass."""

    def test_create_version_info(self):
        """Test creating VersionInfo with all fields."""
        info = VersionInfo(
            version="1.0.0",
            download_url="https://example.com/pkg-1.0.0.whl",
            filename="pkg-1.0.0.whl",
            sha256="abc123",
            size=5000,
            content_type="application/zip",
        )
        assert info.version == "1.0.0"
        assert info.download_url == "https://example.com/pkg-1.0.0.whl"
        assert info.filename == "pkg-1.0.0.whl"
        assert info.sha256 == "abc123"
        assert info.size == 5000

    def test_create_version_info_minimal(self):
        """Test creating VersionInfo with only required fields."""
        info = VersionInfo(
            version="1.0.0",
            download_url="https://example.com/pkg.whl",
            filename="pkg.whl",
        )
        assert info.sha256 is None
        assert info.size is None


class TestFetchResult:
    """Tests for FetchResult dataclass."""

    def test_create_fetch_result(self):
        """Test creating FetchResult."""
        result = FetchResult(
            artifact_id="abc123def456",
            size=10000,
            version="2.0.0",
            filename="pkg-2.0.0.whl",
            already_cached=True,
        )
        assert result.artifact_id == "abc123def456"
        assert result.size == 10000
        assert result.version == "2.0.0"
        assert result.already_cached is True

    def test_fetch_result_default_not_cached(self):
        """Test FetchResult defaults to not cached."""
        result = FetchResult(
            artifact_id="xyz",
            size=100,
            version="1.0.0",
            filename="pkg.whl",
        )
        assert result.already_cached is False


class TestGetRegistryClient:
    """Tests for registry client factory function."""

    def test_get_pypi_client(self):
        """Test getting PyPI client."""
        mock_client = MagicMock()
        mock_sources = []

        client = get_registry_client("pypi", mock_client, mock_sources)

        assert isinstance(client, PyPIRegistryClient)

    def test_get_unsupported_client(self):
        """Test getting unsupported registry type returns None."""
        mock_client = MagicMock()

        client = get_registry_client("npm", mock_client, [])

        assert client is None

    def test_get_unknown_client(self):
        """Test getting unknown registry type returns None."""
        mock_client = MagicMock()

        client = get_registry_client("unknown", mock_client, [])

        assert client is None
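Read together, these tests pin down the public surface of `app/registry_client.py`: a factory, version listing, and constraint resolution. A hedged usage sketch built only from the calls exercised above (the internals and error handling are assumptions, not the project's actual code) could look like:

import asyncio

import httpx

from app.registry_client import get_registry_client


async def main() -> None:
    # The factory returns a PyPIRegistryClient for "pypi" and None for
    # unsupported registry types such as "npm".
    async with httpx.AsyncClient() as http_client:
        client = get_registry_client("pypi", http_client, [])
        if client is None:
            raise RuntimeError("unsupported registry type")

        # List the versions PyPI knows about, then pick one satisfying a constraint.
        versions = await client.get_available_versions("requests")
        resolved = await client.resolve_constraint("requests", ">=2.25.0")
        print(len(versions), resolved.version if resolved else None)


asyncio.run(main())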
@@ -227,6 +227,21 @@
   line-height: 1.5;
 }

+.graph-warning {
+  display: flex;
+  align-items: center;
+  gap: 8px;
+  padding: 8px 16px;
+  background: rgba(245, 158, 11, 0.1);
+  border-top: 1px solid rgba(245, 158, 11, 0.3);
+  color: var(--warning-color, #f59e0b);
+  font-size: 0.875rem;
+}
+
+.graph-warning svg {
+  flex-shrink: 0;
+}
+
 /* Missing Dependencies */
 .missing-dependencies {
   border-top: 1px solid var(--border-primary);
@@ -106,6 +106,7 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
   const [loading, setLoading] = useState(true);
   const [error, setError] = useState<string | null>(null);
+  const [warning, setWarning] = useState<string | null>(null);
   const [resolution, setResolution] = useState<DependencyResolutionResponse | null>(null);
   const [nodes, setNodes, onNodesChange] = useNodesState<NodeData>([]);
   const [edges, setEdges, onEdgesChange] = useEdgesState([]);
@@ -127,16 +128,24 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende

       // Fetch dependencies for each artifact
       const depsMap = new Map<string, Dependency[]>();
+      const failedFetches: string[] = [];
+
       for (const artifact of resolutionData.resolved) {
         try {
           const deps = await getArtifactDependencies(artifact.artifact_id);
           depsMap.set(artifact.artifact_id, deps.dependencies);
-        } catch {
+        } catch (err) {
+          console.warn(`Failed to fetch dependencies for ${artifact.package}:`, err);
+          failedFetches.push(artifact.package);
           depsMap.set(artifact.artifact_id, []);
         }
       }
+
+      // Report a warning if some fetches failed
+      if (failedFetches.length > 0) {
+        setWarning(`Could not load dependency details for: ${failedFetches.slice(0, 3).join(', ')}${failedFetches.length > 3 ? ` and ${failedFetches.length - 3} more` : ''}`);
+      }
+
       // Find the root artifact
       const rootArtifact = resolutionData.resolved.find(
         a => a.project === resolutionData.requested.project &&
@@ -324,6 +333,17 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
         )}
       </div>
+
+      {warning && (
+        <div className="graph-warning">
+          <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
+            <path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"></path>
+            <line x1="12" y1="9" x2="12" y2="13"></line>
+            <line x1="12" y1="17" x2="12.01" y2="17"></line>
+          </svg>
+          <span>{warning}</span>
+        </div>
+      )}
+
       {resolution && resolution.missing && resolution.missing.length > 0 && (
         <div className="missing-dependencies">
           <h3>Not Cached ({resolution.missing.length})</h3>
@@ -185,56 +185,6 @@ h2 {
   color: var(--warning-color, #f59e0b);
 }
-
-/* Usage Section */
-.usage-section {
-  margin-top: 32px;
-  background: var(--bg-secondary);
-}
-
-.usage-section h3 {
-  margin-bottom: 12px;
-  color: var(--text-primary);
-  font-size: 1rem;
-  font-weight: 600;
-}
-
-.usage-section p {
-  color: var(--text-secondary);
-  margin-bottom: 12px;
-  font-size: 0.875rem;
-}
-
-.usage-section pre {
-  background: #0d0d0f;
-  border: 1px solid var(--border-primary);
-  padding: 16px 20px;
-  border-radius: var(--radius-md);
-  overflow-x: auto;
-  margin-bottom: 16px;
-}
-
-.usage-section code {
-  font-family: 'JetBrains Mono', 'Fira Code', 'Consolas', monospace;
-  font-size: 0.8125rem;
-  color: #e2e8f0;
-}
-
-/* Syntax highlighting for code blocks */
-.usage-section pre {
-  position: relative;
-}
-
-.usage-section pre::before {
-  content: 'bash';
-  position: absolute;
-  top: 8px;
-  right: 12px;
-  font-size: 0.6875rem;
-  color: var(--text-muted);
-  text-transform: uppercase;
-  letter-spacing: 0.05em;
-}
-
 /* Copy button for code blocks (optional enhancement) */
 .code-block {
   position: relative;
@@ -78,7 +78,7 @@ function PackagePage() {
   // Reverse dependencies state
   const [reverseDeps, setReverseDeps] = useState<DependentInfo[]>([]);
   const [reverseDepsLoading, setReverseDepsLoading] = useState(false);
-  const [_reverseDepsError, setReverseDepsError] = useState<string | null>(null);
+  const [reverseDepsError, setReverseDepsError] = useState<string | null>(null);
   const [reverseDepsPage, setReverseDepsPage] = useState(1);
   const [reverseDepsTotal, setReverseDepsTotal] = useState(0);
   const [reverseDepsHasMore, setReverseDepsHasMore] = useState(false);
@@ -647,10 +647,13 @@ function PackagePage() {
         />
       )}

-      {/* Used By (Reverse Dependencies) Section - only show if there are reverse deps */}
-      {reverseDeps.length > 0 && (
+      {/* Used By (Reverse Dependencies) Section - only show if there are reverse deps or error */}
+      {(reverseDeps.length > 0 || reverseDepsError) && (
         <div className="used-by-section card">
           <h3>Used By</h3>
+          {reverseDepsError && (
+            <div className="error-message">{reverseDepsError}</div>
+          )}
           <div className="reverse-deps-list">
             <div className="deps-summary">
               {reverseDepsTotal} {reverseDepsTotal === 1 ? 'package depends' : 'packages depend'} on this:
@@ -696,18 +699,6 @@ function PackagePage() {
         </div>
       )}

-      <div className="usage-section card">
-        <h3>Usage</h3>
-        <p>Download artifacts using:</p>
-        <pre>
-          <code>curl -O {window.location.origin}/api/v1/project/{projectName}/{packageName}/+/latest</code>
-        </pre>
-        <p>Or with a specific version:</p>
-        <pre>
-          <code>curl -O {window.location.origin}/api/v1/project/{projectName}/{packageName}/+/1.0.0</code>
-        </pre>
-      </div>
-
       {/* Dependency Graph Modal */}
       {showGraph && selectedArtifact && (
         <DependencyGraph
|
|||||||
secretName: minio-tls # Overridden by CI
|
secretName: minio-tls # Overridden by CI
|
||||||
|
|
||||||
redis:
|
redis:
|
||||||
enabled: false
|
enabled: true
|
||||||
|
|
||||||
waitForDatabase: true
|
waitForDatabase: true
|
||||||
|
|
||||||
|
|||||||
@@ -140,7 +140,7 @@ minioIngress:
|
|||||||
enabled: false
|
enabled: false
|
||||||
|
|
||||||
redis:
|
redis:
|
||||||
enabled: false
|
enabled: true
|
||||||
|
|
||||||
waitForDatabase: true
|
waitForDatabase: true
|
||||||
|
|
||||||
|
|||||||
@@ -146,7 +146,7 @@ minioIngress:
|
|||||||
|
|
||||||
# Redis subchart configuration (for future caching)
|
# Redis subchart configuration (for future caching)
|
||||||
redis:
|
redis:
|
||||||
enabled: false
|
enabled: true
|
||||||
image:
|
image:
|
||||||
registry: containers.global.bsf.tools
|
registry: containers.global.bsf.tools
|
||||||
repository: bitnami/redis
|
repository: bitnami/redis
|
||||||
|
|||||||
provisioners/modules/aws-s3/.devcontainer/devcontainer.json (new file, 19 lines)
@@ -0,0 +1,19 @@
{
  "name": "EC2 Provisioner Dev Container",
  "image": "registry.global.bsf.tools/esv/bsf/bsf-integration/dev-env-setup/provisioner_image:v0.18.1",
  "mounts": [
    "source=${localEnv:HOME}/.ssh,target=/home/user/.ssh,type=bind,consistency=cached",
    "source=${localEnv:HOME}/.okta,target=/home/user/.okta,type=bind,consistency=cached",
    "source=${localEnv:HOME}/.netrc,target=/home/user/.netrc,type=bind,consistency=cached"
  ],
  "forwardPorts": [
    8000
  ],
  "runArgs": [
    "--network=host"
  ],
  "containerUser": "ubuntu",
  "remoteUser": "ubuntu",
  "updateRemoteUserUID": true,
  "onCreateCommand": "sudo usermod -s /bin/bash ubuntu"
}
provisioners/modules/aws-s3/data.tf (new file, 70 lines)
@@ -0,0 +1,70 @@
data "aws_caller_identity" "current" {}

# Main S3 bucket policy to reject non-HTTPS requests
data "aws_iam_policy_document" "s3_reject_https_policy" {
  statement {
    sid    = "s3RejectHTTPS"
    effect = "Deny"

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    actions = ["s3:*"]

    resources = [
      aws_s3_bucket.s3_bucket.arn,
      "${aws_s3_bucket.s3_bucket.arn}/*",
    ]

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}

# Logging bucket policy to reject non-HTTPS requests and accept access logs
data "aws_iam_policy_document" "logging_bucket_policy" {
  statement {
    principals {
      identifiers = ["logging.s3.amazonaws.com"]
      type        = "Service"
    }

    actions = ["s3:PutObject"]

    resources = ["${aws_s3_bucket.logging.arn}/*"]

    condition {
      test     = "StringEquals"
      variable = "aws:SourceAccount"
      values   = [data.aws_caller_identity.current.account_id]
    }
  }

  statement {
    sid    = "loggingRejectHTTPS"
    effect = "Deny"

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    actions = ["s3:*"]

    resources = [
      aws_s3_bucket.logging.arn,
      "${aws_s3_bucket.logging.arn}/*"
    ]

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}
provisioners/modules/aws-s3/main.tf (new file, 12 lines)
@@ -0,0 +1,12 @@
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 6.28"
    }
  }
}

provider "aws" {
  region = "us-gov-west-1"
}
provisioners/modules/aws-s3/s3.tf (new file, 137 lines)
@@ -0,0 +1,137 @@
# Disable warnings about MFA delete and IAM access analyzer (currently cannot support them)
# kics-scan disable=c5b31ab9-0f26-4a49-b8aa-4cc064392f4d,e592a0c5-5bdb-414c-9066-5dba7cdea370

# Bucket to actually store artifacts
resource "aws_s3_bucket" "s3_bucket" {
  bucket = var.bucket

  tags = {
    Name        = "Orchard S3 Provisioning Bucket"
    Environment = var.environment
  }
}

# Control public access
resource "aws_s3_bucket_public_access_block" "s3_bucket_public_access_block" {
  bucket = aws_s3_bucket.s3_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

/*
Our lifecycle rule is as follows:
  - Standard storage
    -> OneZone IA storage after 30 days
    -> Glacier storage after 180 days
*/
resource "aws_s3_bucket_lifecycle_configuration" "s3_bucket_lifecycle_configuration" {
  bucket = aws_s3_bucket.s3_bucket.id

  rule {
    id = "Standard to OneZone"

    filter {}

    status = "Enabled"

    transition {
      days          = 30
      storage_class = "ONEZONE_IA"
    }
  }

  rule {
    id = "OneZone to Glacier"

    filter {}

    status = "Enabled"

    transition {
      days          = 180
      storage_class = "GLACIER"
    }
  }
}

# Enable versioning but without MFA delete enabled
resource "aws_s3_bucket_versioning" "s3_bucket_versioning" {
  bucket = aws_s3_bucket.s3_bucket.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Give preference to the bucket owner
resource "aws_s3_bucket_ownership_controls" "s3_bucket_ownership_controls" {
  bucket = aws_s3_bucket.s3_bucket.id

  rule {
    object_ownership = "BucketOwnerPreferred"
  }
}

# Set access control list to private
resource "aws_s3_bucket_acl" "s3_bucket_acl" {
  depends_on = [aws_s3_bucket_ownership_controls.s3_bucket_ownership_controls]

  bucket = aws_s3_bucket.s3_bucket.id
  acl    = var.acl
}

# Bucket for logging
resource "aws_s3_bucket" "logging" {
  bucket = "orchard-logging-bucket"

  tags = {
    Name        = "Orchard S3 Logging Bucket"
    Environment = var.environment
  }
}

# Versioning for the logging bucket
resource "aws_s3_bucket_versioning" "orchard_logging_bucket_versioning" {
  bucket = aws_s3_bucket.logging.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Policies for the main S3 bucket and the logging bucket
resource "aws_s3_bucket_policy" "s3_bucket_https_policy" {
  bucket = aws_s3_bucket.s3_bucket.id
  policy = data.aws_iam_policy_document.s3_reject_https_policy.json
}

resource "aws_s3_bucket_policy" "logging_policy" {
  bucket = aws_s3_bucket.logging.bucket
  policy = data.aws_iam_policy_document.logging_bucket_policy.json
}

# Set up the logging bucket with per-bucket prefixes holding logs for both buckets
resource "aws_s3_bucket_logging" "s3_bucket_logging" {
  bucket = aws_s3_bucket.s3_bucket.bucket

  target_bucket = aws_s3_bucket.logging.bucket
  target_prefix = "s3_log/"
  target_object_key_format {
    partitioned_prefix {
      partition_date_source = "EventTime"
    }
  }
}

resource "aws_s3_bucket_logging" "logging_bucket_logging" {
  bucket = aws_s3_bucket.logging.bucket

  target_bucket = aws_s3_bucket.logging.bucket
  target_prefix = "log/"
  target_object_key_format {
    partitioned_prefix {
      partition_date_source = "EventTime"
    }
  }
}
provisioners/modules/aws-s3/variables.tf (new file, 17 lines)
@@ -0,0 +1,17 @@
variable "bucket" {
  description = "Name of the S3 bucket"
  type        = string
  default     = "orchard-provisioning-bucket"
}

variable "acl" {
  description = "Access control list for the bucket"
  type        = string
  default     = "private"
}

variable "environment" {
  description = "Environment of the bucket"
  type        = string
  default     = "Development"
}