63 Commits

Author SHA1 Message Date
Mondo Diaz
c60ed9ab21 Move Dashboard and Teams from navbar to user dropdown menu
Cleaner navbar with just Projects and Docs links.
Dashboard and Teams are now in the user menu dropdown.
2026-02-02 20:44:04 -06:00
Mondo Diaz
34ff9caa08 Fix circular dependency error message to show actual cycle path
The error was hardcoding [pkg_key, pkg_key] regardless of actual cycle.
Now tracks the path through dependencies to report the real cycle.
2026-02-02 20:43:05 -06:00
Mondo Diaz
ac3477ff22 Replace custom dependency graph with React Flow
- Install reactflow and dagre for professional graph visualization
- Use dagre for automatic tree layout (top-to-bottom)
- Custom styled nodes with package name, version, and size
- Built-in zoom/pan controls and minimap
- Click nodes to navigate to package page
- Cleaner, more professional appearance
2026-02-02 20:38:35 -06:00
Mondo Diaz
f87e5b4a51 Improve dependency UI: rename to DependGraph, hide empty Used By
- Rename "Dependency Graph" modal title to "DependGraph"
- Hide "Used By" section when no packages depend on this package
2026-02-02 20:34:32 -06:00
Mondo Diaz
01915bcb45 Fix circular dependency detection and hide empty graph modal
- Add artifact-level self-dependency check (skip if dep resolves to same artifact)
- Close dependency graph modal if package has no dependencies to show
  (only root package with no children and no missing deps)
2026-02-02 20:31:46 -06:00
Mondo Diaz
72952d84a1 Skip self-dependencies in dependency resolver
PyPI packages can have self-referential dependencies for extras
(e.g., pytest[testing] depends on pytest). These were incorrectly
detected as circular dependencies. Now we skip them.
2026-02-02 19:45:34 -06:00
Mondo Diaz
e6d42d91cd Fix [object Object] error when API returns structured error detail
The backend returns detail as an object for some errors (circular dependency,
conflicts, etc.). The API client now JSON.stringifies object details so they
can be properly parsed by error handlers like DependencyGraph.
2026-02-02 18:33:55 -06:00
Mondo Diaz
b3ae3b03eb Show missing dependencies in dependency graph instead of failing
When dependencies are not cached on the server (common since we removed
proactive caching), the dependency graph now:
- Continues resolving what it can find
- Shows missing dependencies in a separate section with amber styling
- Displays the constraint and which package required them
- Updates the header stats to show "X cached • Y not cached"

This provides a better user experience than showing an error when
some dependencies haven't been downloaded yet.
2026-02-02 16:29:37 -06:00
Mondo Diaz
ba0a658611 Fix dependency graph error for invalid version constraints
When a dependency has an invalid version constraint like '>=' (without
a version number), the resolver now treats it as a wildcard and returns
the latest available version instead of failing with 'Dependency not found'.

This handles malformed metadata that may have been stored from PyPI packages.
2026-02-02 16:26:18 -06:00
Mondo Diaz
081cc6df83 Remove proactive PyPI dependency caching feature
The background task queue for proactively caching package dependencies was
causing server instability and unnecessary growth. The PyPI proxy now only
caches packages on-demand when users request them.

Removed:
- PyPI cache worker (background task queue and worker pool)
- PyPICacheTask model and related database schema
- Cache management API endpoints (/pypi/cache/*)
- Background Jobs admin dashboard
- Dependency extraction and queueing logic

Kept:
- On-demand package caching (still works when users request packages)
- Async httpx for non-blocking downloads (prevents health check failures)
- URL-based cache lookups for deduplication
2026-02-02 16:17:33 -06:00
Mondo Diaz
cf7bdccb3a Center text in jobs table columns 2026-02-02 15:30:46 -06:00
Mondo Diaz
1329d380a4 Convert PyPI proxy from sync to async httpx to prevent event loop blocking
The pypi_download_file, pypi_simple_index, and pypi_package_versions endpoints
were using synchronous httpx.Client inside async functions. When upstream PyPI
servers respond slowly, this blocked the entire FastAPI event loop, preventing
health checks from responding. Kubernetes would then kill the pod after the
liveness probe timed out.

Changes:
- httpx.Client → httpx.AsyncClient
- client.get() → await client.get()
- response.iter_bytes() → response.aiter_bytes()

This ensures the event loop remains responsive during slow upstream downloads,
allowing health checks to succeed even when downloads take 20+ seconds.
2026-02-02 15:26:24 -06:00
Mondo Diaz
361210a2bc Add cancel job button and improve jobs table UI
- Remove "All Jobs" title
- Move Status column to front of table
- Add Cancel button for in-progress jobs
- Add cancel endpoint: POST /pypi/cache/cancel/{package_name}
- Add btn-danger CSS styling
2026-02-02 15:18:59 -06:00
Mondo Diaz
415ad9a29a Stream downloads to temp file to reduce memory usage
- Download packages in 64KB chunks to temp file instead of loading into memory
- Upload to S3 from temp file (streaming)
- Clean up temp file after processing
- Reduces memory footprint from 2x file size to 1x file size
2026-02-02 15:10:25 -06:00
Mondo Diaz
1667c5a416 Increase memory to 1Gi and reduce workers to 1 for stability 2026-02-02 15:08:00 -06:00
Mondo Diaz
1021e2b942 Add PyPI cache config and bump memory in values-prod.yaml 2026-02-02 14:38:47 -06:00
Mondo Diaz
d0e91658d7 Add PyPI cache config and bump memory in values-stage.yaml 2026-02-02 14:38:21 -06:00
Mondo Diaz
7b89f41704 Add PyPI cache config and bump memory in values-dev.yaml 2026-02-02 14:37:55 -06:00
Mondo Diaz
ba43110123 Add PyPI cache worker config and increase memory limit
- Add orchard.pypiCache config section to helm values
- Set default workers to 2 (reduced from 5 to limit memory)
- Bump pod memory from 512Mi to 768Mi (request=limit)
- Add ORCHARD_PYPI_CACHE_* env vars to deployment template
2026-02-02 14:37:27 -06:00
Mondo Diaz
92edef92e6 Redesign jobs dashboard with unified table and progress bar
- Add overall progress bar showing completed/active/failed counts
- Unify all job types into single table with Type column
- Simplify status to Working/Pending/Failed badges
- Remove NPM "Coming Soon" section
- Add get_recent_activity() function for future activity feed
- Fix dark mode CSS using CSS variables
2026-02-02 14:34:48 -06:00
Mondo Diaz
47b137f4eb Improve Active Workers table and recover stale tasks
Backend:
- Add _recover_stale_tasks() to reset tasks stuck in 'in_progress'
  from previous crashes (tasks >5 min old get reset to pending)
- Called automatically on startup

Frontend:
- Fix dark mode colors using CSS variables instead of hardcoded values
- Add elapsed time column showing how long task has been running
- Add spinning indicator next to package name
- Add status badge (Running/Stale?)
- Highlight stale tasks (>5 min) in amber
- Auto-updates every 5 seconds with existing refresh
2026-02-02 14:29:17 -06:00
Mondo Diaz
1138309aaa Add Active Workers table to Background Jobs dashboard
Shows currently processing cache tasks in a dynamic table with:
- Package name and version constraint being cached
- Recursion depth and attempt number
- Start timestamp
- Pulsing indicator to show live activity

Backend changes:
- Add get_active_tasks() function to pypi_cache_worker.py
- Add GET /pypi/cache/active endpoint to pypi_proxy.py

Frontend changes:
- Add PyPICacheActiveTask type
- Add getPyPICacheActiveTasks() API function
- Add Active Workers section with animated table
- Auto-refreshes every 5 seconds with existing data
2026-02-02 13:50:45 -06:00
Mondo Diaz
3bdeade7ca Fix nested dependency depth tracking in PyPI cache worker
When the cache worker downloaded a package through the proxy, dependencies
were always queued with depth=0 instead of depth+1. This meant depth limits
weren't properly enforced for nested dependencies.

Changes:
- Add cache-depth query parameter to pypi_download_file endpoint
- Worker now passes its current depth when fetching packages
- Dependencies are queued at cache_depth+1 instead of hardcoded 0
- Add tests for depth tracking behavior
2026-02-02 13:47:22 -06:00
Mondo Diaz
8edb45879f Fix jobs dashboard showing misleading completion message
The dashboard was showing "All jobs completed successfully" whenever
there were no failed tasks, even if there were pending or in-progress
jobs. Now shows:
- "All jobs completed" only when pending=0 and in_progress=0
- "Jobs are processing. No failures yet." when jobs are in queue
2026-02-02 11:56:01 -06:00
Mondo Diaz
97b39d000b Add security fixes and code cleanup for PyPI cache
- Add require_admin authentication to cache management endpoints
- Add limit validation (1-500) on failed tasks query
- Add thread lock for worker pool thread safety
- Fix exception handling with separate recovery DB session
- Remove obsolete design doc
2026-02-02 11:37:25 -06:00
Mondo Diaz
ba708332a5 Add Background Jobs dashboard for admin users
New admin page at /admin/jobs showing:
- PyPI cache job status (pending, in-progress, completed, failed)
- Failed task list with error details
- Retry individual packages or retry all failed
- Auto-refresh every 5 seconds (toggleable)
- Placeholder for future NPM cache jobs

Accessible from admin dropdown menu as "Background Jobs".
2026-02-02 11:26:55 -06:00
Mondo Diaz
d274f3f375 Add robust PyPI dependency caching with task queue
Replace unbounded thread spawning with managed worker pool:
- New pypi_cache_tasks table tracks caching jobs
- Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS)
- Automatic retries with exponential backoff (30s, 60s, then fail)
- Deduplication to prevent duplicate caching attempts

New API endpoints for visibility and control:
- GET /pypi/cache/status - queue health summary
- GET /pypi/cache/failed - list failed tasks with errors
- POST /pypi/cache/retry/{package} - retry single package
- POST /pypi/cache/retry-all - retry all failed packages

This fixes silent failures in background dependency caching where
packages would fail to cache without any tracking or retry mechanism.
2026-02-02 11:16:02 -06:00
Mondo Diaz
490b05438d Add design doc for PyPI cache robustness improvements 2026-02-02 11:06:51 -06:00
Mondo Diaz
3c2ab70ef0 Fix proactive dependency caching HTTPS redirect issue
When background threads fetch from our own proxy using the request's
base_url, it returns http:// but ingress requires https://. The 308
redirect was dropping trailing slashes, causing requests to hit the
frontend catch-all route instead of /pypi/simple/.

Force HTTPS explicitly in the background caching function to avoid
the redirect entirely.
2026-01-30 18:59:31 -06:00
Mondo Diaz
109a593f83 Add debug logging for proactive caching regex failures 2026-01-30 18:43:09 -06:00
Mondo Diaz
1d727b3f8c Fix proactive caching regex to match both hyphens and underscores
PEP 503 normalizes package names to use hyphens, but wheel filenames
may use underscores (e.g., typing_extensions-4.0.0-py3-none-any.whl).

Convert the search pattern to match either separator.
2026-01-30 18:25:30 -06:00
Mondo Diaz
47aa0afe91 Fix proactive caching failing on HTTP->HTTPS redirects
The background dependency caching was getting 308 redirects because
request.base_url returns http:// but the ingress redirects to https://.

Enable follow_redirects=True in httpx client to handle this.
2026-01-30 18:11:08 -06:00
Mondo Diaz
f992fc540e Add proactive dependency caching for PyPI packages
When a PyPI package is cached, its dependencies are now automatically
fetched in background threads. This ensures the entire dependency tree
is cached even if pip already has some packages installed locally.

Features:
- Background threads fetch each dependency without blocking the response
- Uses our own proxy endpoint to cache, which recursively caches transitive deps
- Max depth of 10 to prevent infinite loops
- Daemon threads so they don't block process shutdown
2026-01-30 17:45:30 -06:00
Mondo Diaz
044a6c1d27 Fix duplicate dependency constraint causing 500 errors
- Deduplicate dependencies by package name before inserting
- Some packages (like anyio) list the same dep (trio) multiple times with
  different version constraints for different extras
- The unique constraint on (artifact_id, project, package) rejected these
- Also removed debug logging from dependencies.py
2026-01-30 17:43:49 -06:00
Mondo Diaz
62c77dc16d Add detailed debug logging to _resolve_dependency_to_artifact 2026-01-30 17:29:19 -06:00
Mondo Diaz
7c05360eed Add debug logging to resolve_dependencies 2026-01-30 17:21:04 -06:00
Mondo Diaz
76878279e9 Add backfill script for PyPI package dependencies
Script extracts Requires-Dist metadata from cached PyPI packages
and stores them in artifact_dependencies table.

Usage:
  docker exec <container> python -m backend.scripts.backfill_pypi_dependencies
  docker exec <container> python -m backend.scripts.backfill_pypi_dependencies --dry-run
2026-01-30 15:38:45 -06:00
Mondo Diaz
e1b01abf9b Add PEP 440 version constraint matching for dependency resolution
- Parse version constraints like >=1.9, <2.0 using packaging library
- Find the latest version that satisfies the constraint
- Support wildcard (*) to get latest version
- Fall back to exact version and tag matching
2026-01-30 15:34:19 -06:00
Mondo Diaz
d07936b666 Fix ensure file modal z-index when opened from deps modal 2026-01-30 15:32:06 -06:00
Mondo Diaz
47b3eb439d Extract and store dependencies from PyPI packages
- Add functions to parse Requires-Dist metadata from wheel and sdist files
- Store extracted dependencies in artifact_dependencies table
- Fix streaming response for cached artifacts (proper tuple unpacking)
- Fix version uniqueness check to use version string instead of artifact_id
- Skip creating versions for .metadata files
2026-01-30 15:14:52 -06:00
Mondo Diaz
c5f75e4fd6 Add is_system to all ProjectResponse constructions in routes 2026-01-30 13:34:26 -06:00
Mondo Diaz
ff31379649 Fix: ensure existing _pypi project gets is_system=true 2026-01-30 13:33:31 -06:00
Mondo Diaz
424b1e5770 Add is_system field to ProjectResponse schema 2026-01-30 13:11:11 -06:00
Mondo Diaz
7b5b0c78d8 Hide Tags and Latest columns for system projects in package table 2026-01-30 12:55:28 -06:00
Mondo Diaz
924826f07a Improve system project UX and make dependencies a modal
- Hide tag count stat for system projects (show "versions" instead of "artifacts")
- Hide "Latest" tag stat for system projects
- Change "Create/Update Tag" to only show for non-system projects
- Add "View Artifact ID" menu option with modal showing the SHA256 hash
- Move dependencies section to a modal (opened via "View Dependencies" menu)
- Add deps-modal and artifact-id-modal CSS styles
2026-01-30 12:36:40 -06:00
Mondo Diaz
fe6c6c52d2 Fix PyPI proxy UX and package stats calculation
- Fix artifact_count and total_size calculation to use Tags instead of
  Uploads, so PyPI cached packages show their stats correctly
- Fix PackagePage dropdown menu positioning (use fixed position with backdrop)
- Add system project detection for projects starting with "_"
- Show Version as primary column for system projects, hide Tag column
- Hide upload button for system projects (they're cache-only)
- Rename section header to "Versions" for system projects
- Fix test_projects_sort_by_name to exclude system projects from sort comparison
2026-01-30 12:16:05 -06:00
Mondo Diaz
701e11ce83 Hide format filter and column for system projects
System projects like _pypi only contain packages of one format,
so the format filter dropdown and column are redundant.
2026-01-30 11:55:09 -06:00
Mondo Diaz
ff9e02606e Hide Settings and New Package buttons for system projects
System projects should be system-controlled only. Users should not
be able to create packages or change settings on system cache projects.
2026-01-30 11:54:02 -06:00
Mondo Diaz
f3afdd3bbf Improve PyPI proxy and Package page UX
PyPI proxy improvements:
- Set package format to "pypi" instead of "generic"
- Extract version from filename and create PackageVersion record
- Support .whl, .tar.gz, and .zip filename formats

Package page UX overhaul:
- Move upload to header button with modal
- Simplify table: combine Tag/Version, remove Type and Artifact ID columns
- Add row action menu (⋯) with: Copy ID, Ensure File, Create Tag, Dependencies
- Remove cluttered "Download by Artifact ID" and "Create/Update Tag" sections
- Add modals for upload and create tag actions
- Cleaner, more scannable table layout
2026-01-30 11:52:37 -06:00
Mondo Diaz
4b73196664 Show team name instead of individual user in Owner column
Projects owned by teams now display the team name in the Owner column
for better organizational continuity when team members change.
Falls back to created_by if no team is assigned.
2026-01-30 11:25:01 -06:00
Mondo Diaz
7ef66745f1 Add "(coming soon)" label for unsupported upstream source types
Only pypi and generic are currently supported. Other types now show
"(coming soon)" in both the dropdown and the sources table.
2026-01-30 11:03:44 -06:00
Mondo Diaz
2dc7fe5a7b Fix PyPI proxy: use correct storage method and make project public
- Use storage.get_stream(s3_key) instead of non-existent get_artifact_stream()
- Make _pypi project public (is_public=True) so cached packages are visible
2026-01-30 10:59:50 -06:00
Mondo Diaz
534e4b964f Fix Project and Tag model fields in PyPI proxy
Use correct model fields:
- Project: is_public, is_system, created_by (not visibility)
- Tag: add required created_by field
2026-01-30 10:29:25 -06:00
Mondo Diaz
757e43fc34 Fix Artifact model field names in PyPI proxy
Use correct Artifact model fields:
- original_name instead of filename
- Add required created_by and s3_key fields
- Include checksum fields from storage result
2026-01-30 09:58:15 -06:00
Mondo Diaz
d78092de55 Fix PyPI proxy to use correct storage.store() method
The code was calling storage.store_artifact() which doesn't exist.
Changed to use storage.store() which handles content-addressable
storage with automatic deduplication.
2026-01-30 09:41:34 -06:00
Mondo Diaz
0fa991f536 Allow full path in PyPI upstream source URL
Users can now configure the full path including /simple in their
upstream source URL (e.g., https://example.com/api/pypi/repo/simple)
instead of having the code append /simple/ automatically.

This matches pip's --index-url format, making configuration more
intuitive and copy/paste friendly.
2026-01-30 09:24:05 -06:00
Mondo Diaz
00fb2729e4 Fix test_rewrite_relative_links assertion to expect correct URL
The test was checking for the wrong URL pattern. When urljoin resolves
../../packages/ab/cd/... relative to /api/pypi/pypi-remote/simple/requests/,
it correctly produces /api/pypi/pypi-remote/packages/ab/cd/... (not
/api/pypi/packages/...).
2026-01-30 08:51:30 -06:00
Mondo Diaz
8ae4d7a685 Improve PyPI proxy test assertions for all status codes
Tests now verify the correct response for each scenario:
- 200: HTML content-type
- 404: "not found" error message
- 503: "No PyPI upstream sources configured" error message
2026-01-29 19:35:20 -06:00
Mondo Diaz
4b887d1aad Fix PyPI proxy tests to work with or without upstream sources
- Tests now accept 200/404/503 responses since upstream sources may or
  may not be configured in the test environment
- Added upstream_base_url parameter to _rewrite_package_links test
- Added test for relative URL resolution (Artifactory-style URLs)
2026-01-29 19:34:33 -06:00
Mondo Diaz
4dc54ace8a Fix HTTPS scheme detection behind reverse proxy
When behind a reverse proxy that terminates SSL, the server sees HTTP
requests internally. Added _get_base_url() helper that respects the
X-Forwarded-Proto header to generate correct external HTTPS URLs.

This fixes links in the PyPI simple index showing http:// instead of
https:// when accessed via HTTPS through a load balancer.
2026-01-29 18:02:21 -06:00
Mondo Diaz
64bfd3902f Fix relative URL handling in PyPI proxy
Artifactory and other registries may return relative URLs in their
Simple API responses (e.g., ../../packages/...). The proxy now resolves
these to absolute URLs using urljoin() before encoding them in the
upstream parameter.

This fixes package downloads failing when the upstream registry uses
relative URLs in its package index.
2026-01-29 18:01:19 -06:00
Mondo Diaz
bdfed77cb1 Remove dead code from pypi_proxy.py
- Remove unused imports (UpstreamClient, UpstreamClientConfig,
  UpstreamHTTPError, UpstreamConnectionError, UpstreamTimeoutError)
- Simplify matched_source selection logic, removing dead conditional
  that always evaluated to True due to 'or True'
2026-01-29 16:42:53 -06:00
Mondo Diaz
140f6c926a Fix httpx.Timeout configuration in PyPI proxy
httpx.Timeout requires either a default value or all four parameters.
Changed to httpx.Timeout(default, connect=X) format.
2026-01-29 16:40:06 -06:00
72 changed files with 3829 additions and 7518 deletions

View File

@@ -213,74 +213,6 @@ integration_test_feature:
- if: '$CI_COMMIT_BRANCH && $CI_COMMIT_BRANCH != "main"'
when: on_success
# Reset feature environment after integration tests
# Calls factory-reset to clean up test data created during integration tests
reset_feature:
stage: deploy
needs: [integration_test_feature]
image: deps.global.bsf.tools/docker/python:3.12-slim
timeout: 5m
before_script:
- pip install --index-url "$PIP_INDEX_URL" httpx
script:
# Debug: Check if variable is set at shell level
- echo "RESET_ADMIN_PASSWORD length at shell level:${#RESET_ADMIN_PASSWORD}"
- |
python - <<'RESET_SCRIPT'
import httpx
import os
import sys
BASE_URL = f"https://orchard-{os.environ['CI_COMMIT_REF_SLUG']}.common.global.bsf.tools"
PASSWORD_RAW = os.environ.get("RESET_ADMIN_PASSWORD")
if not PASSWORD_RAW:
print("ERROR: RESET_ADMIN_PASSWORD not set")
sys.exit(1)
# Debug: check for hidden characters
print(f"Raw password repr (first 3 chars): {repr(PASSWORD_RAW[:3])}")
print(f"Raw password repr (last 3 chars): {repr(PASSWORD_RAW[-3:])}")
print(f"Raw length: {len(PASSWORD_RAW)}")
# Strip any whitespace
PASSWORD = PASSWORD_RAW.strip()
print(f"Stripped length: {len(PASSWORD)}")
print(f"Resetting environment at {BASE_URL}")
client = httpx.Client(base_url=BASE_URL, timeout=60.0)
# Login as admin
login_resp = client.post("/api/v1/auth/login", json={
"username": "admin",
"password": PASSWORD
})
if login_resp.status_code != 200:
print(f"ERROR: Login failed: {login_resp.status_code}")
print(f"Response: {login_resp.text}")
sys.exit(1)
# Call factory reset
reset_resp = client.post(
"/api/v1/admin/factory-reset",
headers={"X-Confirm-Reset": "yes-delete-all-data"}
)
if reset_resp.status_code == 200:
print("SUCCESS: Factory reset completed")
print(reset_resp.json())
else:
print(f"ERROR: Factory reset failed: {reset_resp.status_code}")
print(reset_resp.text)
sys.exit(1)
RESET_SCRIPT
variables:
# Use same pattern as integration_test_feature - create new variable from CI variable
RESET_ADMIN_PASSWORD: $DEV_ADMIN_PASSWORD
rules:
- if: '$CI_COMMIT_BRANCH && $CI_COMMIT_BRANCH != "main"'
when: on_success
allow_failure: true # Don't fail the pipeline if reset fails
# Run Python backend unit tests
python_unit_tests:
stage: test

View File

@@ -7,37 +7,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
- Added S3 bucket provisioning terraform configuration (#59)
- Creates an S3 bucket to be used for anything Orchard
- Creates a log bucket for any logs tracking the S3 bucket
- Added auto-fetch capability to dependency resolution endpoint
- `GET /api/v1/project/{project}/{package}/+/{ref}/resolve?auto_fetch=true` fetches missing dependencies from upstream registries
- PyPI registry client queries PyPI JSON API to resolve version constraints
- Fetched artifacts are cached and included in response `fetched` field
- Missing dependencies show `fetch_attempted` and `fetch_error` status
- Configurable max fetch depth via `ORCHARD_AUTO_FETCH_MAX_DEPTH` (default: 3)
- Added `backend/app/registry_client.py` with extensible registry client abstraction
- `RegistryClient` ABC for implementing upstream registry clients
- `PyPIRegistryClient` implementation using PyPI JSON API
- `get_registry_client()` factory function for future npm/maven support
- Added `fetch_and_cache_pypi_package()` reusable function for PyPI package fetching
- Added HTTP connection pooling infrastructure for improved PyPI proxy performance
- `HttpClientManager` with configurable pool size, timeouts, and thread pool executor
- Eliminates per-request connection overhead (~100-500ms → ~5ms)
- Added Redis caching layer with category-aware TTL for hermetic builds
- `CacheService` with graceful fallback when Redis unavailable
- Immutable data (artifact metadata, dependencies) cached forever
- Mutable data (package index, versions) uses configurable TTL
- Added `ArtifactRepository` for batch database operations
- `batch_upsert_dependencies()` reduces N+1 queries to single INSERT
- `get_or_create_artifact()` uses atomic ON CONFLICT upsert
- Added infrastructure status to health endpoint (`/health`)
- Reports HTTP pool size and worker threads
- Reports Redis cache connection status
- Added new configuration settings for HTTP client, Redis, and cache TTL
- `ORCHARD_HTTP_MAX_CONNECTIONS`, `ORCHARD_HTTP_CONNECT_TIMEOUT`, etc.
- `ORCHARD_REDIS_HOST`, `ORCHARD_REDIS_PORT`, `ORCHARD_REDIS_ENABLED`
- `ORCHARD_CACHE_TTL_INDEX`, `ORCHARD_CACHE_TTL_VERSIONS`, etc.
- Added transparent PyPI proxy implementing PEP 503 Simple API (#108)
- `GET /pypi/simple/` - package index (proxied from upstream)
- `GET /pypi/simple/{package}/` - version list with rewritten download links
@@ -45,6 +14,35 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Allows `pip install --index-url https://orchard.../pypi/simple/ <package>`
- Artifacts cached on first access through configured upstream sources
- Added `POST /api/v1/cache/resolve` endpoint to cache packages by coordinates instead of URL (#108)
### Changed
- Upstream sources table text is now centered under column headers (#108)
- ENV badge now appears inline with source name instead of separate column (#108)
- Test and Edit buttons now have more prominent button styling (#108)
- Reduced footer padding for cleaner layout (#108)
### Fixed
- Fixed purge_seed_data crash when deleting access permissions - was comparing UUID to VARCHAR column (#107)
### Changed
- Upstream source connectivity test no longer follows redirects, fixing "Exceeded maximum allowed redirects" error with Artifactory proxies (#107)
- Test runs automatically after saving a new or updated upstream source (#107)
- Test status now shows as colored dots (green=success, red=error) instead of text badges (#107)
- Clicking red dot shows error details in a modal (#107)
- Source name column no longer wraps text for better table layout (#107)
- Renamed "Cache Management" page to "Upstream Sources" (#107)
- Moved Delete button from table row to edit modal for cleaner table layout (#107)
### Removed
- Removed `is_public` field from upstream sources - all sources are now treated as internal/private (#107)
- Removed `allow_public_internet` (air-gap mode) setting from cache settings - not needed for enterprise proxy use case (#107)
- Removed seeding of public registry URLs (npm-public, pypi-public, maven-central, docker-hub) (#107)
- Removed "Public" badge and checkbox from upstream sources UI (#107)
- Removed "Allow Public Internet" toggle from cache settings UI (#107)
- Removed "Global Settings" section from cache management UI - auto-create system projects is always enabled (#107)
- Removed unused CacheSettings frontend types and API functions (#107)
### Added
- Added `ORCHARD_PURGE_SEED_DATA` environment variable support to stage helm values to remove seed data from long-running deployments (#107)
- Added frontend system projects visual distinction (#105)
- "Cache" badge for system projects in project list
@@ -211,24 +209,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added comprehensive integration tests for all dependency features
### Changed
- Removed Usage section from Package page (curl command examples)
- PyPI proxy now uses shared HTTP connection pool instead of per-request clients
- PyPI proxy now caches upstream source configuration in Redis
- Dependency storage now uses batch INSERT instead of individual queries
- Increased default database pool size from 5 to 20 connections
- Increased default database max overflow from 10 to 30 connections
- Enabled Redis in Helm chart values for dev, stage, and prod environments
- Upstream sources table text is now centered under column headers (#108)
- ENV badge now appears inline with source name instead of separate column (#108)
- Test and Edit buttons now have more prominent button styling (#108)
- Reduced footer padding for cleaner layout (#108)
- Upstream source connectivity test no longer follows redirects, fixing "Exceeded maximum allowed redirects" error with Artifactory proxies (#107)
- Test runs automatically after saving a new or updated upstream source (#107)
- Test status now shows as colored dots (green=success, red=error) instead of text badges (#107)
- Clicking red dot shows error details in a modal (#107)
- Source name column no longer wraps text for better table layout (#107)
- Renamed "Cache Management" page to "Upstream Sources" (#107)
- Moved Delete button from table row to edit modal for cleaner table layout (#107)
- Added pre-test stage reset to ensure known environment state before integration tests (#54)
- Upload endpoint now accepts optional `ensure` file parameter for declaring dependencies
- Updated upload API documentation with ensure file format and examples
@@ -237,20 +217,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Added orchard logo icon and dot separator to footer
### Fixed
- Fixed purge_seed_data crash when deleting access permissions - was comparing UUID to VARCHAR column (#107)
- Fixed dark theme styling for team pages - modals, forms, and dropdowns now use correct theme variables
- Fixed UserAutocomplete and TeamSelector dropdown backgrounds for dark theme
- Fixed PyPI proxy filtering platform-specific dependencies (pyobjc on macOS, pywin32 on Windows)
- Fixed bare version constraints being treated as wildcards (e.g., `certifi@2025.10.5` now fetches exact version)
### Removed
- Removed `is_public` field from upstream sources - all sources are now treated as internal/private (#107)
- Removed `allow_public_internet` (air-gap mode) setting from cache settings - not needed for enterprise proxy use case (#107)
- Removed seeding of public registry URLs (npm-public, pypi-public, maven-central, docker-hub) (#107)
- Removed "Public" badge and checkbox from upstream sources UI (#107)
- Removed "Allow Public Internet" toggle from cache settings UI (#107)
- Removed "Global Settings" section from cache management UI - auto-create system projects is always enabled (#107)
- Removed unused CacheSettings frontend types and API functions (#107)
## [0.5.1] - 2026-01-23
### Changed

View File

@@ -1,262 +0,0 @@
"""
Redis-backed caching service with category-aware TTL and invalidation.
Provides:
- Immutable caching for artifact data (hermetic builds)
- TTL-based caching for discovery data
- Event-driven invalidation for config changes
- Graceful fallback when Redis unavailable
"""
import logging
from enum import Enum
from typing import Optional
from .config import Settings
logger = logging.getLogger(__name__)
class CacheCategory(Enum):
"""
Cache categories with different TTL and invalidation rules.
Immutable (cache forever):
- ARTIFACT_METADATA: Artifact info by SHA256
- ARTIFACT_DEPENDENCIES: Extracted deps by SHA256
- DEPENDENCY_RESOLUTION: Resolution results by input hash
Mutable (TTL + event invalidation):
- UPSTREAM_SOURCES: Upstream config, invalidate on DB change
- PACKAGE_INDEX: PyPI/npm index pages, TTL only
- PACKAGE_VERSIONS: Version listings, TTL only
"""
# Immutable - cache forever (hermetic builds)
ARTIFACT_METADATA = "artifact"
ARTIFACT_DEPENDENCIES = "deps"
DEPENDENCY_RESOLUTION = "resolve"
# Mutable - TTL + event invalidation
UPSTREAM_SOURCES = "upstream"
PACKAGE_INDEX = "index"
PACKAGE_VERSIONS = "versions"
def get_category_ttl(category: CacheCategory, settings: Settings) -> Optional[int]:
"""
Get TTL for a cache category.
Returns:
TTL in seconds, or None for no expiry (immutable).
"""
ttl_map = {
# Immutable - no TTL
CacheCategory.ARTIFACT_METADATA: None,
CacheCategory.ARTIFACT_DEPENDENCIES: None,
CacheCategory.DEPENDENCY_RESOLUTION: None,
# Mutable - configurable TTL
CacheCategory.UPSTREAM_SOURCES: settings.cache_ttl_upstream,
CacheCategory.PACKAGE_INDEX: settings.cache_ttl_index,
CacheCategory.PACKAGE_VERSIONS: settings.cache_ttl_versions,
}
return ttl_map.get(category)
class CacheService:
"""
Redis-backed caching with category-aware TTL.
Key format: orchard:{category}:{protocol}:{identifier}
Example: orchard:deps:pypi:abc123def456
When Redis is disabled or unavailable, operations gracefully
return None/no-op to allow the application to function without caching.
"""
def __init__(self, settings: Settings):
self._settings = settings
self._enabled = settings.redis_enabled
self._redis: Optional["redis.asyncio.Redis"] = None
self._started = False
async def startup(self) -> None:
"""Initialize Redis connection. Called by FastAPI lifespan."""
if self._started:
return
if not self._enabled:
logger.info("CacheService disabled (redis_enabled=False)")
self._started = True
return
try:
import redis.asyncio as redis
logger.info(
f"Connecting to Redis at {self._settings.redis_host}:"
f"{self._settings.redis_port}/{self._settings.redis_db}"
)
self._redis = redis.Redis(
host=self._settings.redis_host,
port=self._settings.redis_port,
db=self._settings.redis_db,
password=self._settings.redis_password,
decode_responses=False, # We handle bytes
)
# Test connection
await self._redis.ping()
logger.info("CacheService connected to Redis")
except ImportError:
logger.warning("redis package not installed, caching disabled")
self._enabled = False
except Exception as e:
logger.warning(f"Redis connection failed, caching disabled: {e}")
self._enabled = False
self._redis = None
self._started = True
async def shutdown(self) -> None:
"""Close Redis connection. Called by FastAPI lifespan."""
if not self._started:
return
if self._redis:
await self._redis.aclose()
self._redis = None
self._started = False
logger.info("CacheService shutdown complete")
@staticmethod
def _make_key(category: CacheCategory, protocol: str, identifier: str) -> str:
"""Build namespaced cache key."""
return f"orchard:{category.value}:{protocol}:{identifier}"
async def get(
self,
category: CacheCategory,
key: str,
protocol: str = "default",
) -> Optional[bytes]:
"""
Get cached value.
Args:
category: Cache category for TTL rules
key: Unique identifier within category
protocol: Protocol namespace (pypi, npm, etc.)
Returns:
Cached bytes or None if not found/disabled.
"""
if not self._enabled or not self._redis:
return None
try:
full_key = self._make_key(category, protocol, key)
return await self._redis.get(full_key)
except Exception as e:
logger.warning(f"Cache get failed for {key}: {e}")
return None
async def set(
self,
category: CacheCategory,
key: str,
value: bytes,
protocol: str = "default",
) -> None:
"""
Set cached value with category-appropriate TTL.
Args:
category: Cache category for TTL rules
key: Unique identifier within category
value: Bytes to cache
protocol: Protocol namespace (pypi, npm, etc.)
"""
if not self._enabled or not self._redis:
return
try:
full_key = self._make_key(category, protocol, key)
ttl = get_category_ttl(category, self._settings)
if ttl is None:
await self._redis.set(full_key, value)
else:
await self._redis.setex(full_key, ttl, value)
except Exception as e:
logger.warning(f"Cache set failed for {key}: {e}")
async def delete(
self,
category: CacheCategory,
key: str,
protocol: str = "default",
) -> None:
"""Delete a specific cache entry."""
if not self._enabled or not self._redis:
return
try:
full_key = self._make_key(category, protocol, key)
await self._redis.delete(full_key)
except Exception as e:
logger.warning(f"Cache delete failed for {key}: {e}")
async def invalidate_pattern(
self,
category: CacheCategory,
pattern: str = "*",
protocol: str = "default",
) -> int:
"""
Invalidate all entries matching pattern.
Args:
category: Cache category
pattern: Glob pattern for keys (default "*" = all in category)
protocol: Protocol namespace
Returns:
Number of keys deleted.
"""
if not self._enabled or not self._redis:
return 0
try:
full_pattern = self._make_key(category, protocol, pattern)
keys = []
async for key in self._redis.scan_iter(match=full_pattern):
keys.append(key)
if keys:
return await self._redis.delete(*keys)
return 0
except Exception as e:
logger.warning(f"Cache invalidate failed for pattern {pattern}: {e}")
return 0
async def ping(self) -> bool:
"""Check if Redis is connected and responding."""
if not self._enabled or not self._redis:
return False
try:
await self._redis.ping()
return True
except Exception:
return False
@property
def enabled(self) -> bool:
"""Check if caching is enabled."""
return self._enabled

View File

@@ -22,8 +22,8 @@ class Settings(BaseSettings):
database_sslmode: str = "disable"
# Database connection pool settings
database_pool_size: int = 20 # Number of connections to keep open
database_max_overflow: int = 30 # Max additional connections beyond pool_size
database_pool_size: int = 5 # Number of connections to keep open
database_max_overflow: int = 10 # Max additional connections beyond pool_size
database_pool_timeout: int = 30 # Seconds to wait for a connection from pool
database_pool_recycle: int = (
1800 # Recycle connections after this many seconds (30 min)
@@ -51,26 +51,6 @@ class Settings(BaseSettings):
presigned_url_expiry: int = (
3600 # Presigned URL expiry in seconds (default: 1 hour)
)
pypi_download_mode: str = "redirect" # "redirect" (to S3) or "proxy" (stream through Orchard)
# HTTP Client pool settings
http_max_connections: int = 100 # Max connections per pool
http_max_keepalive: int = 20 # Keep-alive connections
http_connect_timeout: float = 30.0 # Connection timeout seconds
http_read_timeout: float = 60.0 # Read timeout seconds
http_worker_threads: int = 32 # Thread pool for blocking ops
# Redis cache settings
redis_host: str = "localhost"
redis_port: int = 6379
redis_db: int = 0
redis_password: Optional[str] = None
redis_enabled: bool = True # Set False to disable caching
# Cache TTL settings (seconds, 0 = no expiry)
cache_ttl_index: int = 300 # Package index pages: 5 min
cache_ttl_versions: int = 300 # Version listings: 5 min
cache_ttl_upstream: int = 3600 # Upstream source config: 1 hour
# Logging settings
log_level: str = "INFO" # DEBUG, INFO, WARNING, ERROR, CRITICAL
@@ -89,10 +69,6 @@ class Settings(BaseSettings):
pypi_cache_max_depth: int = 10 # Maximum recursion depth for dependency caching
pypi_cache_max_attempts: int = 3 # Maximum retry attempts for failed cache tasks
# Auto-fetch configuration for dependency resolution
auto_fetch_dependencies: bool = False # Server default for auto_fetch parameter
auto_fetch_timeout: int = 300 # Total timeout for auto-fetch resolution in seconds
# JWT Authentication settings (optional, for external identity providers)
jwt_enabled: bool = False # Enable JWT token validation
jwt_secret: str = "" # Secret key for HS256, or leave empty for RS256 with JWKS

View File

@@ -220,7 +220,17 @@ def _run_migrations():
CREATE UNIQUE INDEX idx_packages_project_name ON packages(project_id, name);
END IF;
-- Tag indexes removed: tags table no longer exists (removed in tag system removal)
IF NOT EXISTS (
SELECT 1 FROM pg_indexes WHERE indexname = 'idx_tags_package_name'
) THEN
CREATE UNIQUE INDEX idx_tags_package_name ON tags(package_id, name);
END IF;
IF NOT EXISTS (
SELECT 1 FROM pg_indexes WHERE indexname = 'idx_tags_package_created_at'
) THEN
CREATE INDEX idx_tags_package_created_at ON tags(package_id, created_at);
END IF;
END $$;
""",
),
@@ -277,8 +287,27 @@ def _run_migrations():
Migration(
name="008_create_tags_ref_count_triggers",
sql="""
-- Tags table removed: triggers no longer needed (tag system removed)
DO $$ BEGIN NULL; END $$;
DO $$
BEGIN
DROP TRIGGER IF EXISTS tags_ref_count_insert_trigger ON tags;
CREATE TRIGGER tags_ref_count_insert_trigger
AFTER INSERT ON tags
FOR EACH ROW
EXECUTE FUNCTION increment_artifact_ref_count();
DROP TRIGGER IF EXISTS tags_ref_count_delete_trigger ON tags;
CREATE TRIGGER tags_ref_count_delete_trigger
AFTER DELETE ON tags
FOR EACH ROW
EXECUTE FUNCTION decrement_artifact_ref_count();
DROP TRIGGER IF EXISTS tags_ref_count_update_trigger ON tags;
CREATE TRIGGER tags_ref_count_update_trigger
AFTER UPDATE ON tags
FOR EACH ROW
WHEN (OLD.artifact_id IS DISTINCT FROM NEW.artifact_id)
EXECUTE FUNCTION update_artifact_ref_count();
END $$;
""",
),
Migration(
@@ -325,11 +354,9 @@ def _run_migrations():
Migration(
name="011_migrate_semver_tags_to_versions",
sql=r"""
-- Migrate semver tags to versions (only if both tables exist - for existing databases)
DO $$
BEGIN
IF EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'package_versions')
AND EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'tags') THEN
IF EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'package_versions') THEN
INSERT INTO package_versions (id, package_id, artifact_id, version, version_source, created_by, created_at)
SELECT
gen_random_uuid(),
@@ -538,62 +565,6 @@ def _run_migrations():
WHERE name IN ('npm-public', 'pypi-public', 'maven-central', 'docker-hub');
""",
),
Migration(
name="024_remove_tags",
sql="""
-- Remove tag system, keeping only versions for artifact references
DO $$
BEGIN
-- Drop triggers on tags table (if they exist)
DROP TRIGGER IF EXISTS tags_ref_count_insert_trigger ON tags;
DROP TRIGGER IF EXISTS tags_ref_count_delete_trigger ON tags;
DROP TRIGGER IF EXISTS tags_ref_count_update_trigger ON tags;
DROP TRIGGER IF EXISTS tags_updated_at_trigger ON tags;
DROP TRIGGER IF EXISTS tag_changes_trigger ON tags;
-- Drop the tag change tracking function
DROP FUNCTION IF EXISTS track_tag_changes();
-- Remove tag_constraint from artifact_dependencies
IF EXISTS (
SELECT 1 FROM information_schema.table_constraints
WHERE constraint_name = 'check_constraint_type'
AND table_name = 'artifact_dependencies'
) THEN
ALTER TABLE artifact_dependencies DROP CONSTRAINT check_constraint_type;
END IF;
-- Remove the tag_constraint column if it exists
IF EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name = 'artifact_dependencies' AND column_name = 'tag_constraint'
) THEN
ALTER TABLE artifact_dependencies DROP COLUMN tag_constraint;
END IF;
-- Make version_constraint NOT NULL
UPDATE artifact_dependencies SET version_constraint = '*' WHERE version_constraint IS NULL;
ALTER TABLE artifact_dependencies ALTER COLUMN version_constraint SET NOT NULL;
-- Drop tag_history table first (depends on tags)
DROP TABLE IF EXISTS tag_history;
-- Drop tags table
DROP TABLE IF EXISTS tags;
-- Rename uploads.tag_name to version if it exists and version doesn't
IF EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name = 'uploads' AND column_name = 'tag_name'
) AND NOT EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name = 'uploads' AND column_name = 'version'
) THEN
ALTER TABLE uploads RENAME COLUMN tag_name TO version;
END IF;
END $$;
""",
),
]
with engine.connect() as conn:

View File

@@ -1,175 +0,0 @@
"""
Database utilities for optimized artifact operations.
Provides batch operations to eliminate N+1 queries.
"""
import logging
from typing import Optional
from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.orm import Session
from .models import Artifact, ArtifactDependency, CachedUrl
logger = logging.getLogger(__name__)
class ArtifactRepository:
"""
Optimized database operations for artifact storage.
Key optimizations:
- Atomic upserts using ON CONFLICT
- Batch inserts for dependencies
- Joined queries to avoid N+1
"""
def __init__(self, db: Session):
self.db = db
@staticmethod
def _format_dependency_values(
artifact_id: str,
dependencies: list[tuple[str, str, str]],
) -> list[dict]:
"""
Format dependencies for batch insert.
Args:
artifact_id: SHA256 of the artifact
dependencies: List of (project, package, version_constraint)
Returns:
List of dicts ready for bulk insert.
"""
return [
{
"artifact_id": artifact_id,
"dependency_project": proj,
"dependency_package": pkg,
"version_constraint": ver,
}
for proj, pkg, ver in dependencies
]
def get_or_create_artifact(
self,
sha256: str,
size: int,
filename: str,
content_type: Optional[str] = None,
created_by: str = "system",
s3_key: Optional[str] = None,
) -> tuple[Artifact, bool]:
"""
Get existing artifact or create new one atomically.
Uses INSERT ... ON CONFLICT DO UPDATE to handle races.
If artifact exists, increments ref_count.
Args:
sha256: Content hash (primary key)
size: File size in bytes
filename: Original filename
content_type: MIME type
created_by: User who created the artifact
s3_key: S3 storage key (defaults to standard path)
Returns:
(artifact, created) tuple where created is True for new artifacts.
"""
if s3_key is None:
s3_key = f"fruits/{sha256[:2]}/{sha256[2:4]}/{sha256}"
stmt = pg_insert(Artifact).values(
id=sha256,
size=size,
original_name=filename,
content_type=content_type,
ref_count=1,
created_by=created_by,
s3_key=s3_key,
).on_conflict_do_update(
index_elements=['id'],
set_={'ref_count': Artifact.ref_count + 1}
).returning(Artifact)
result = self.db.execute(stmt)
artifact = result.scalar_one()
# Check if this was an insert or update by comparing ref_count
# ref_count=1 means new, >1 means existing
created = artifact.ref_count == 1
return artifact, created
def batch_upsert_dependencies(
self,
artifact_id: str,
dependencies: list[tuple[str, str, str]],
) -> int:
"""
Insert dependencies in a single batch operation.
Uses ON CONFLICT DO NOTHING to skip duplicates.
Args:
artifact_id: SHA256 of the artifact
dependencies: List of (project, package, version_constraint)
Returns:
Number of dependencies inserted.
"""
if not dependencies:
return 0
values = self._format_dependency_values(artifact_id, dependencies)
stmt = pg_insert(ArtifactDependency).values(values)
stmt = stmt.on_conflict_do_nothing(
index_elements=['artifact_id', 'dependency_project', 'dependency_package']
)
result = self.db.execute(stmt)
return result.rowcount
def get_cached_url_with_artifact(
self,
url_hash: str,
) -> Optional[tuple[CachedUrl, Artifact]]:
"""
Get cached URL and its artifact in a single query.
Args:
url_hash: SHA256 of the URL
Returns:
(CachedUrl, Artifact) tuple or None if not found.
"""
result = (
self.db.query(CachedUrl, Artifact)
.join(Artifact, CachedUrl.artifact_id == Artifact.id)
.filter(CachedUrl.url_hash == url_hash)
.first()
)
return result
def get_artifact_dependencies(
self,
artifact_id: str,
) -> list[ArtifactDependency]:
"""
Get all dependencies for an artifact in a single query.
Args:
artifact_id: SHA256 of the artifact
Returns:
List of ArtifactDependency objects.
"""
return (
self.db.query(ArtifactDependency)
.filter(ArtifactDependency.artifact_id == artifact_id)
.all()
)

View File

@@ -11,18 +11,11 @@ Handles:
"""
import re
import logging
import yaml
from typing import List, Dict, Any, Optional, Set, Tuple, TYPE_CHECKING
from typing import List, Dict, Any, Optional, Set, Tuple
from sqlalchemy.orm import Session
from sqlalchemy import and_
if TYPE_CHECKING:
from .storage import S3Storage
from .registry_client import RegistryClient
logger = logging.getLogger(__name__)
# Import packaging for PEP 440 version matching
try:
from packaging.specifiers import SpecifierSet, InvalidSpecifier
@@ -35,6 +28,7 @@ from .models import (
Project,
Package,
Artifact,
Tag,
ArtifactDependency,
PackageVersion,
)
@@ -53,22 +47,6 @@ from .schemas import (
)
def _normalize_pypi_package_name(name: str) -> str:
"""
Normalize a PyPI package name for comparison.
- Strips extras brackets (e.g., "package[extra]" -> "package")
- Replaces sequences of hyphens, underscores, and dots with a single hyphen
- Lowercases the result
This follows PEP 503 normalization rules.
"""
# Strip extras brackets like [test], [dev], etc.
base_name = re.sub(r'\[.*\]', '', name)
# Normalize separators and lowercase
return re.sub(r'[-_.]+', '-', base_name).lower()
class DependencyError(Exception):
"""Base exception for dependency errors."""
pass
@@ -109,17 +87,9 @@ class DependencyDepthExceededError(DependencyError):
super().__init__(f"Dependency resolution exceeded maximum depth of {max_depth}")
class TooManyArtifactsError(DependencyError):
"""Raised when dependency resolution resolves too many artifacts."""
def __init__(self, max_artifacts: int):
self.max_artifacts = max_artifacts
super().__init__(f"Dependency resolution exceeded maximum of {max_artifacts} artifacts")
# Safety limits to prevent DoS attacks
MAX_DEPENDENCY_DEPTH = 100 # Maximum levels of nested dependencies
MAX_DEPENDENCY_DEPTH = 50 # Maximum levels of nested dependencies
MAX_DEPENDENCIES_PER_ARTIFACT = 200 # Maximum direct dependencies per artifact
MAX_TOTAL_ARTIFACTS = 1000 # Maximum total artifacts in resolution to prevent memory issues
def parse_ensure_file(content: bytes) -> EnsureFileContent:
@@ -167,20 +137,26 @@ def parse_ensure_file(content: bytes) -> EnsureFileContent:
project = dep.get('project')
package = dep.get('package')
version = dep.get('version')
tag = dep.get('tag')
if not project:
raise InvalidEnsureFileError(f"Dependency {i} missing 'project'")
if not package:
raise InvalidEnsureFileError(f"Dependency {i} missing 'package'")
if not version:
if not version and not tag:
raise InvalidEnsureFileError(
f"Dependency {i} must have 'version'"
f"Dependency {i} must have either 'version' or 'tag'"
)
if version and tag:
raise InvalidEnsureFileError(
f"Dependency {i} cannot have both 'version' and 'tag'"
)
dependencies.append(EnsureFileDependency(
project=project,
package=package,
version=version,
tag=tag,
))
return EnsureFileContent(dependencies=dependencies)
@@ -234,6 +210,7 @@ def store_dependencies(
dependency_project=dep.project,
dependency_package=dep.package,
version_constraint=dep.version,
tag_constraint=dep.tag,
)
db.add(artifact_dep)
created.append(artifact_dep)
@@ -299,21 +276,26 @@ def get_reverse_dependencies(
if not artifact:
continue
# Find which package this artifact belongs to via versions
version_record = db.query(PackageVersion).filter(
PackageVersion.artifact_id == dep.artifact_id,
).first()
if version_record:
pkg = db.query(Package).filter(Package.id == version_record.package_id).first()
# Find which package this artifact belongs to via tags or versions
tag = db.query(Tag).filter(Tag.artifact_id == dep.artifact_id).first()
if tag:
pkg = db.query(Package).filter(Package.id == tag.package_id).first()
if pkg:
proj = db.query(Project).filter(Project.id == pkg.project_id).first()
if proj:
# Get version if available
version_record = db.query(PackageVersion).filter(
PackageVersion.artifact_id == dep.artifact_id,
PackageVersion.package_id == pkg.id,
).first()
dependents.append(DependentInfo(
artifact_id=dep.artifact_id,
project=proj.name,
package=pkg.name,
version=version_record.version,
constraint_value=dep.version_constraint,
version=version_record.version if version_record else None,
constraint_type="version" if dep.version_constraint else "tag",
constraint_value=dep.version_constraint or dep.tag_constraint,
))
total_pages = (total + limit - 1) // limit
@@ -340,33 +322,6 @@ def _is_version_constraint(version_str: str) -> bool:
return any(op in version_str for op in ['>=', '<=', '!=', '~=', '>', '<', '==', '*'])
def _version_satisfies_constraint(version: str, constraint: str) -> bool:
"""
Check if a version satisfies a constraint.
Args:
version: A version string (e.g., '1.26.0')
constraint: A version constraint (e.g., '>=1.20', '>=1.20,<2.0', '*')
Returns:
True if the version satisfies the constraint, False otherwise
"""
if not HAS_PACKAGING:
return False
# Wildcard matches everything
if constraint == '*' or not constraint:
return True
try:
spec = SpecifierSet(constraint)
v = Version(version)
return v in spec
except (InvalidSpecifier, InvalidVersion):
# If we can't parse, assume it doesn't match
return False
def _resolve_version_constraint(
db: Session,
package: Package,
@@ -452,7 +407,8 @@ def _resolve_dependency_to_artifact(
db: Session,
project_name: str,
package_name: str,
version: str,
version: Optional[str],
tag: Optional[str],
) -> Optional[Tuple[str, str, int]]:
"""
Resolve a dependency constraint to an artifact ID.
@@ -460,6 +416,7 @@ def _resolve_dependency_to_artifact(
Supports:
- Exact version matching (e.g., '1.2.3')
- Version constraints (e.g., '>=1.9', '<2.0,>=1.5')
- Tag matching
- Wildcard ('*' for any version)
Args:
@@ -467,9 +424,10 @@ def _resolve_dependency_to_artifact(
project_name: Project name
package_name: Package name
version: Version or version constraint
tag: Tag constraint
Returns:
Tuple of (artifact_id, resolved_version, size) or None if not found
Tuple of (artifact_id, resolved_version_or_tag, size) or None if not found
"""
# Get project and package
project = db.query(Project).filter(Project.name == project_name).first()
@@ -483,6 +441,7 @@ def _resolve_dependency_to_artifact(
if not package:
return None
if version:
# Check if this is a version constraint (>=, <, etc.) or exact version
if _is_version_constraint(version):
result = _resolve_version_constraint(db, package, version)
@@ -501,6 +460,31 @@ def _resolve_dependency_to_artifact(
if artifact:
return (artifact.id, version, artifact.size)
# Also check if there's a tag with this exact name
tag_record = db.query(Tag).filter(
Tag.package_id == package.id,
Tag.name == version,
).first()
if tag_record:
artifact = db.query(Artifact).filter(
Artifact.id == tag_record.artifact_id
).first()
if artifact:
return (artifact.id, version, artifact.size)
if tag:
# Look up by tag
tag_record = db.query(Tag).filter(
Tag.package_id == package.id,
Tag.name == tag,
).first()
if tag_record:
artifact = db.query(Artifact).filter(
Artifact.id == tag_record.artifact_id
).first()
if artifact:
return (artifact.id, tag, artifact.size)
return None
@@ -530,16 +514,10 @@ def _detect_package_cycle(
Returns:
Cycle path if detected, None otherwise
"""
# Normalize names for comparison (handles extras like [test] and separators)
pkg_normalized = _normalize_pypi_package_name(package_name)
target_pkg_normalized = _normalize_pypi_package_name(target_package)
# Use normalized key for tracking
pkg_key = f"{project_name.lower()}/{pkg_normalized}"
pkg_key = f"{project_name}/{package_name}"
# Check if we've reached the target package (cycle detected)
# Use normalized comparison to handle extras and naming variations
if project_name.lower() == target_project.lower() and pkg_normalized == target_pkg_normalized:
if project_name == target_project and package_name == target_package:
return path + [pkg_key]
if pkg_key in visiting:
@@ -560,9 +538,9 @@ def _detect_package_cycle(
Package.name == package_name,
).first()
if package:
# Find all artifacts in this package via versions
versions = db.query(PackageVersion).filter(PackageVersion.package_id == package.id).all()
artifact_ids = {v.artifact_id for v in versions}
# Find all artifacts in this package via tags
tags = db.query(Tag).filter(Tag.package_id == package.id).all()
artifact_ids = {t.artifact_id for t in tags}
# Get dependencies from all artifacts in this package
for artifact_id in artifact_ids:
@@ -605,8 +583,8 @@ def check_circular_dependencies(
db: Database session
artifact_id: The artifact that will have these dependencies
new_dependencies: Dependencies to be added
project_name: Project name (optional, will try to look up from version if not provided)
package_name: Package name (optional, will try to look up from version if not provided)
project_name: Project name (optional, will try to look up from tag if not provided)
package_name: Package name (optional, will try to look up from tag if not provided)
Returns:
Cycle path if detected, None otherwise
@@ -615,19 +593,17 @@ def check_circular_dependencies(
if project_name and package_name:
current_path = f"{project_name}/{package_name}"
else:
# Try to look up from version
# Try to look up from tag
artifact = db.query(Artifact).filter(Artifact.id == artifact_id).first()
if not artifact:
return None
# Find package for this artifact via version
version_record = db.query(PackageVersion).filter(
PackageVersion.artifact_id == artifact_id
).first()
if not version_record:
# Find package for this artifact
tag = db.query(Tag).filter(Tag.artifact_id == artifact_id).first()
if not tag:
return None
package = db.query(Package).filter(Package.id == version_record.package_id).first()
package = db.query(Package).filter(Package.id == tag.package_id).first()
if not package:
return None
@@ -643,15 +619,12 @@ def check_circular_dependencies(
else:
return None
# Normalize the initial path for consistency with _detect_package_cycle
normalized_path = f"{target_project.lower()}/{_normalize_pypi_package_name(target_package)}"
# For each new dependency, check if it would create a cycle back to our package
for dep in new_dependencies:
# Check if this dependency (transitively) depends on us at the package level
visiting: Set[str] = set()
visited: Set[str] = set()
path: List[str] = [normalized_path]
path: List[str] = [current_path]
# Check from the dependency's package
cycle = _detect_package_cycle(
@@ -684,7 +657,7 @@ def resolve_dependencies(
db: Database session
project_name: Project name
package_name: Package name
ref: Version reference (or artifact:hash)
ref: Tag or version reference
base_url: Base URL for download URLs
Returns:
@@ -707,22 +680,13 @@ def resolve_dependencies(
if not package:
raise DependencyNotFoundError(project_name, package_name, ref)
# Handle artifact: prefix for direct artifact ID references
if ref.startswith("artifact:"):
artifact_id = ref[9:]
artifact = db.query(Artifact).filter(Artifact.id == artifact_id).first()
if not artifact:
raise DependencyNotFoundError(project_name, package_name, ref)
root_artifact_id = artifact.id
root_version = artifact_id[:12] # Use short hash as version display
root_size = artifact.size
else:
# Try to find artifact by version
# Try to find artifact by tag or version
resolved = _resolve_dependency_to_artifact(
db, project_name, package_name, ref
db, project_name, package_name, ref, ref
)
if not resolved:
raise DependencyNotFoundError(project_name, package_name, ref)
root_artifact_id, root_version, root_size = resolved
# Track resolved artifacts and their versions
@@ -738,8 +702,6 @@ def resolve_dependencies(
current_path: Dict[str, str] = {}
# Resolution order (topological)
resolution_order: List[str] = []
# Track resolution path for debugging
resolution_path_sync: List[str] = []
def _resolve_recursive(
artifact_id: str,
@@ -751,16 +713,12 @@ def resolve_dependencies(
depth: int = 0,
):
"""Recursively resolve dependencies with cycle/conflict detection."""
pkg_key = f"{proj_name}/{pkg_name}"
# Safety limit: prevent DoS through deeply nested dependencies
if depth > MAX_DEPENDENCY_DEPTH:
logger.error(
f"Dependency depth exceeded at {pkg_key} (depth={depth}). "
f"Resolution path: {' -> '.join(resolution_path_sync[-20:])}"
)
raise DependencyDepthExceededError(MAX_DEPENDENCY_DEPTH)
pkg_key = f"{proj_name}/{pkg_name}"
# Cycle detection (at artifact level)
if artifact_id in visiting:
# Build cycle path from current_path
@@ -768,25 +726,34 @@ def resolve_dependencies(
cycle = [cycle_start, pkg_key]
raise CircularDependencyError(cycle)
# Version conflict handling - use first resolved version (lenient mode)
# Conflict detection - check if we've seen this package before with a different version
if pkg_key in version_requirements:
existing_versions = {r["version"] for r in version_requirements[pkg_key]}
if version_or_tag not in existing_versions:
# Different version requested - log and use existing (first wins)
existing = version_requirements[pkg_key][0]["version"]
logger.debug(
f"Version mismatch for {pkg_key}: using {existing} "
f"(also requested: {version_or_tag} by {required_by})"
# Conflict detected - same package, different version
requirements = version_requirements[pkg_key] + [
{"version": version_or_tag, "required_by": required_by}
]
raise DependencyConflictError([
DependencyConflict(
project=proj_name,
package=pkg_name,
requirements=[
{
"version": r["version"],
"required_by": [{"path": r["required_by"]}] if r["required_by"] else []
}
for r in requirements
],
)
# Already resolved this package - skip
])
# Same version already resolved - skip
if artifact_id in visited:
return
if artifact_id in visited:
return
# Track path for debugging (only after early-return checks)
resolution_path_sync.append(f"{pkg_key}@{version_or_tag}")
visiting.add(artifact_id)
current_path[artifact_id] = pkg_key
@@ -806,12 +773,7 @@ def resolve_dependencies(
# Resolve each dependency first (depth-first)
for dep in deps:
# Skip self-dependencies (can happen with PyPI extras like pytest[testing])
# Use normalized comparison for PyPI naming conventions (handles extras, separators)
dep_proj_normalized = dep.dependency_project.lower()
dep_pkg_normalized = _normalize_pypi_package_name(dep.dependency_package)
curr_proj_normalized = proj_name.lower()
curr_pkg_normalized = _normalize_pypi_package_name(pkg_name)
if dep_proj_normalized == curr_proj_normalized and dep_pkg_normalized == curr_pkg_normalized:
if dep.dependency_project == proj_name and dep.dependency_package == pkg_name:
continue
resolved_dep = _resolve_dependency_to_artifact(
@@ -819,11 +781,12 @@ def resolve_dependencies(
dep.dependency_project,
dep.dependency_package,
dep.version_constraint,
dep.tag_constraint,
)
if not resolved_dep:
# Dependency not cached on server - track as missing but continue
constraint = dep.version_constraint
constraint = dep.version_constraint or dep.tag_constraint
missing_dependencies.append(MissingDependency(
project=dep.dependency_project,
package=dep.dependency_package,
@@ -838,10 +801,6 @@ def resolve_dependencies(
if dep_artifact_id == artifact_id:
continue
# Skip if this artifact is already being visited (would cause cycle)
if dep_artifact_id in visiting:
continue
_resolve_recursive(
dep_artifact_id,
dep.dependency_project,
@@ -855,11 +814,6 @@ def resolve_dependencies(
visiting.remove(artifact_id)
del current_path[artifact_id]
visited.add(artifact_id)
resolution_path_sync.pop()
# Check total artifacts limit
if len(resolution_order) >= MAX_TOTAL_ARTIFACTS:
raise TooManyArtifactsError(MAX_TOTAL_ARTIFACTS)
# Add to resolution order (dependencies before dependents)
resolution_order.append(artifact_id)
@@ -896,417 +850,6 @@ def resolve_dependencies(
},
resolved=resolved_list,
missing=missing_dependencies,
fetched=[], # No fetching in sync version
total_size=total_size,
artifact_count=len(resolved_list),
)
# System project mapping for auto-fetch
SYSTEM_PROJECT_REGISTRY_MAP = {
"_pypi": "pypi",
"_npm": "npm",
"_maven": "maven",
}
async def resolve_dependencies_with_fetch(
db: Session,
project_name: str,
package_name: str,
ref: str,
base_url: str,
storage: "S3Storage",
registry_clients: Dict[str, "RegistryClient"],
) -> DependencyResolutionResponse:
"""
Resolve all dependencies for an artifact recursively, fetching missing ones from upstream.
This async version extends the basic resolution with auto-fetch capability:
when a missing dependency is from a system project (e.g., _pypi), it attempts
to fetch the package from the corresponding upstream registry.
If the root artifact itself doesn't exist in a system project, it will also
be fetched from upstream before resolution begins.
Args:
db: Database session
project_name: Project name
package_name: Package name
ref: Version reference (or artifact:hash)
base_url: Base URL for download URLs
storage: S3 storage for caching fetched artifacts
registry_clients: Map of system project to registry client {"_pypi": PyPIRegistryClient}
Returns:
DependencyResolutionResponse with all resolved artifacts and fetch status
Raises:
DependencyNotFoundError: If the root artifact cannot be found (even after fetch attempt)
CircularDependencyError: If circular dependencies are detected
"""
# Track fetched artifacts for response
fetched_artifacts: List[ResolvedArtifact] = []
# Check if project exists
project = db.query(Project).filter(Project.name == project_name).first()
# If project doesn't exist and it's a system project pattern, we can't auto-create it
if not project:
raise DependencyNotFoundError(project_name, package_name, ref)
# Check if package exists
package = db.query(Package).filter(
Package.project_id == project.id,
Package.name == package_name,
).first()
# Try to resolve the root artifact
root_artifact_id = None
root_version = None
root_size = None
# Handle artifact: prefix for direct artifact ID references
if ref.startswith("artifact:"):
artifact_id = ref[9:]
artifact = db.query(Artifact).filter(Artifact.id == artifact_id).first()
if artifact:
root_artifact_id = artifact.id
root_version = artifact_id[:12]
root_size = artifact.size
elif package:
# Try to resolve by version/constraint
resolved = _resolve_dependency_to_artifact(
db, project_name, package_name, ref
)
if resolved:
root_artifact_id, root_version, root_size = resolved
# If root artifact not found and this is a system project, try to fetch it
if root_artifact_id is None and project_name in SYSTEM_PROJECT_REGISTRY_MAP:
logger.info(
f"Root artifact {project_name}/{package_name}@{ref} not found, "
"attempting to fetch from upstream"
)
client = registry_clients.get(project_name)
if client:
try:
# Resolve the version constraint from upstream
version_info = await client.resolve_constraint(package_name, ref)
if version_info:
# Fetch and cache the package
fetch_result = await client.fetch_package(
package_name, version_info, db, storage
)
if fetch_result:
logger.info(
f"Successfully fetched root artifact {package_name}=="
f"{fetch_result.version} (artifact {fetch_result.artifact_id[:12]})"
)
root_artifact_id = fetch_result.artifact_id
root_version = fetch_result.version
root_size = fetch_result.size
# Add to fetched list
fetched_artifacts.append(ResolvedArtifact(
artifact_id=fetch_result.artifact_id,
project=project_name,
package=package_name,
version=fetch_result.version,
size=fetch_result.size,
download_url=f"{base_url}/api/v1/project/{project_name}/{package_name}/+/{fetch_result.version}",
))
except Exception as e:
logger.warning(f"Failed to fetch root artifact {package_name}: {e}")
# If still no root artifact, raise error
if root_artifact_id is None:
raise DependencyNotFoundError(project_name, package_name, ref)
# Track state
resolved_artifacts: Dict[str, ResolvedArtifact] = {}
missing_dependencies: List[MissingDependency] = []
# Note: fetched_artifacts was already initialized above (line 911)
# and may already contain the root artifact if it was fetched from upstream
version_requirements: Dict[str, List[Dict[str, Any]]] = {}
visiting: Set[str] = set()
visited: Set[str] = set()
current_path: Dict[str, str] = {}
resolution_order: List[str] = []
# Track fetch attempts to prevent loops
fetch_attempted: Set[str] = set() # "project/package@constraint"
async def _try_fetch_dependency(
dep_project: str,
dep_package: str,
constraint: str,
required_by: str,
) -> Optional[Tuple[str, str, int]]:
"""
Try to fetch a missing dependency from upstream registry.
Returns (artifact_id, version, size) if successful, None otherwise.
"""
# Only fetch from system projects
registry_type = SYSTEM_PROJECT_REGISTRY_MAP.get(dep_project)
if not registry_type:
logger.debug(
f"Not a system project, skipping fetch: {dep_project}/{dep_package}"
)
return None
# Build fetch key for loop prevention
fetch_key = f"{dep_project}/{dep_package}@{constraint}"
if fetch_key in fetch_attempted:
logger.debug(f"Already attempted fetch for {fetch_key}")
return None
fetch_attempted.add(fetch_key)
# Get registry client
client = registry_clients.get(dep_project)
if not client:
logger.debug(f"No registry client for {dep_project}")
return None
try:
# Resolve version constraint
version_info = await client.resolve_constraint(dep_package, constraint)
if not version_info:
logger.info(
f"No version of {dep_package} matches constraint '{constraint}' on upstream"
)
return None
# Fetch and cache the package
fetch_result = await client.fetch_package(
dep_package, version_info, db, storage
)
if not fetch_result:
logger.warning(f"Failed to fetch {dep_package}=={version_info.version}")
return None
logger.info(
f"Successfully fetched {dep_package}=={version_info.version} "
f"(artifact {fetch_result.artifact_id[:12]})"
)
# Add to fetched list for response
fetched_artifacts.append(ResolvedArtifact(
artifact_id=fetch_result.artifact_id,
project=dep_project,
package=dep_package,
version=fetch_result.version,
size=fetch_result.size,
download_url=f"{base_url}/api/v1/project/{dep_project}/{dep_package}/+/{fetch_result.version}",
))
return (fetch_result.artifact_id, fetch_result.version, fetch_result.size)
except Exception as e:
logger.warning(f"Error fetching {dep_package}: {e}")
return None
# Track resolution path for debugging
resolution_path: List[str] = []
async def _resolve_recursive_async(
artifact_id: str,
proj_name: str,
pkg_name: str,
version_or_tag: str,
size: int,
required_by: Optional[str],
depth: int = 0,
):
"""Recursively resolve dependencies with fetch capability."""
pkg_key = f"{proj_name}/{pkg_name}"
if depth > MAX_DEPENDENCY_DEPTH:
logger.error(
f"Dependency depth exceeded at {pkg_key} (depth={depth}). "
f"Resolution path: {' -> '.join(resolution_path[-20:])}"
)
raise DependencyDepthExceededError(MAX_DEPENDENCY_DEPTH)
# Cycle detection
if artifact_id in visiting:
cycle_start = current_path.get(artifact_id, pkg_key)
cycle = [cycle_start, pkg_key]
raise CircularDependencyError(cycle)
# Version conflict handling - use first resolved version (lenient mode)
if pkg_key in version_requirements:
existing_versions = {r["version"] for r in version_requirements[pkg_key]}
if version_or_tag not in existing_versions:
# Different version requested - log and use existing (first wins)
existing = version_requirements[pkg_key][0]["version"]
logger.debug(
f"Version mismatch for {pkg_key}: using {existing} "
f"(also requested: {version_or_tag} by {required_by})"
)
# Already resolved this package - skip
return
if artifact_id in visited:
return
# Track path for debugging (only after early-return checks)
resolution_path.append(f"{pkg_key}@{version_or_tag}")
visiting.add(artifact_id)
current_path[artifact_id] = pkg_key
if pkg_key not in version_requirements:
version_requirements[pkg_key] = []
version_requirements[pkg_key].append({
"version": version_or_tag,
"required_by": required_by,
})
# Get dependencies
deps = db.query(ArtifactDependency).filter(
ArtifactDependency.artifact_id == artifact_id
).all()
for dep in deps:
# Skip self-dependencies (common with PyPI extras like pytest[testing] -> pytest)
dep_proj_normalized = dep.dependency_project.lower()
dep_pkg_normalized = _normalize_pypi_package_name(dep.dependency_package)
curr_proj_normalized = proj_name.lower()
curr_pkg_normalized = _normalize_pypi_package_name(pkg_name)
if dep_proj_normalized == curr_proj_normalized and dep_pkg_normalized == curr_pkg_normalized:
logger.debug(
f"Skipping self-dependency: {pkg_key} -> {dep.dependency_project}/{dep.dependency_package}"
)
continue
# Also check if this dependency would resolve to the current artifact
# (handles cases where package names differ but resolve to same artifact)
resolved_dep = _resolve_dependency_to_artifact(
db,
dep.dependency_project,
dep.dependency_package,
dep.version_constraint,
)
if not resolved_dep:
# Try to fetch from upstream if it's a system project
fetched = await _try_fetch_dependency(
dep.dependency_project,
dep.dependency_package,
dep.version_constraint,
pkg_key,
)
if fetched:
resolved_dep = fetched
else:
# Still missing - add to missing list with fetch status
fetch_key = f"{dep.dependency_project}/{dep.dependency_package}@{dep.version_constraint}"
was_attempted = fetch_key in fetch_attempted
missing_dependencies.append(MissingDependency(
project=dep.dependency_project,
package=dep.dependency_package,
constraint=dep.version_constraint,
required_by=pkg_key,
fetch_attempted=was_attempted,
))
continue
dep_artifact_id, dep_version, dep_size = resolved_dep
# Skip if resolved to same artifact (self-dependency at artifact level)
if dep_artifact_id == artifact_id:
logger.debug(
f"Skipping self-dependency (same artifact): {pkg_key} -> "
f"{dep.dependency_project}/{dep.dependency_package} (artifact {dep_artifact_id[:12]})"
)
continue
# Skip if this artifact is already being visited (would cause cycle)
if dep_artifact_id in visiting:
logger.debug(
f"Skipping dependency already in resolution stack: {pkg_key} -> "
f"{dep.dependency_project}/{dep.dependency_package} (artifact {dep_artifact_id[:12]})"
)
continue
# Check if we've already resolved this package to a different version
dep_pkg_key = f"{dep.dependency_project}/{dep.dependency_package}"
if dep_pkg_key in version_requirements:
existing_version = version_requirements[dep_pkg_key][0]["version"]
if existing_version != dep_version:
# Different version resolved - check if existing satisfies new constraint
if HAS_PACKAGING and _version_satisfies_constraint(existing_version, dep.version_constraint):
logger.debug(
f"Reusing existing version {existing_version} for {dep_pkg_key} "
f"(satisfies constraint {dep.version_constraint})"
)
continue
else:
logger.debug(
f"Version conflict for {dep_pkg_key}: have {existing_version}, "
f"need {dep.version_constraint} (resolved to {dep_version})"
)
# Don't raise error - just use the first version we resolved
# This is more lenient than strict conflict detection
continue
await _resolve_recursive_async(
dep_artifact_id,
dep.dependency_project,
dep.dependency_package,
dep_version,
dep_size,
pkg_key,
depth + 1,
)
visiting.remove(artifact_id)
del current_path[artifact_id]
visited.add(artifact_id)
resolution_path.pop()
# Check total artifacts limit
if len(resolution_order) >= MAX_TOTAL_ARTIFACTS:
raise TooManyArtifactsError(MAX_TOTAL_ARTIFACTS)
resolution_order.append(artifact_id)
resolved_artifacts[artifact_id] = ResolvedArtifact(
artifact_id=artifact_id,
project=proj_name,
package=pkg_name,
version=version_or_tag,
size=size,
download_url=f"{base_url}/api/v1/project/{proj_name}/{pkg_name}/+/{version_or_tag}",
)
# Start resolution from root
await _resolve_recursive_async(
root_artifact_id,
project_name,
package_name,
root_version,
root_size,
None,
)
# Build response in topological order
resolved_list = [resolved_artifacts[aid] for aid in resolution_order]
total_size = sum(r.size for r in resolved_list)
return DependencyResolutionResponse(
requested={
"project": project_name,
"package": package_name,
"ref": ref,
},
resolved=resolved_list,
missing=missing_dependencies,
fetched=fetched_artifacts,
total_size=total_size,
artifact_count=len(resolved_list),
)

View File

@@ -1,179 +0,0 @@
"""
HTTP client manager with connection pooling and lifecycle management.
Provides:
- Shared connection pools for upstream requests
- Per-upstream client isolation when needed
- Thread pool for blocking I/O operations
- FastAPI lifespan integration
"""
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional
import httpx
from .config import Settings
logger = logging.getLogger(__name__)
class HttpClientManager:
"""
Manages httpx.AsyncClient pools with FastAPI lifespan integration.
Features:
- Default shared pool for general requests
- Per-upstream pools for sources needing specific config/auth
- Dedicated thread pool for blocking operations
- Graceful shutdown
"""
def __init__(self, settings: Settings):
self.max_connections = settings.http_max_connections
self.max_keepalive = settings.http_max_keepalive
self.connect_timeout = settings.http_connect_timeout
self.read_timeout = settings.http_read_timeout
self.worker_threads = settings.http_worker_threads
self._default_client: Optional[httpx.AsyncClient] = None
self._upstream_clients: dict[str, httpx.AsyncClient] = {}
self._executor: Optional[ThreadPoolExecutor] = None
self._started = False
async def startup(self) -> None:
"""Initialize clients and thread pool. Called by FastAPI lifespan."""
if self._started:
return
logger.info(
f"Starting HttpClientManager: max_connections={self.max_connections}, "
f"worker_threads={self.worker_threads}"
)
# Create connection limits
limits = httpx.Limits(
max_connections=self.max_connections,
max_keepalive_connections=self.max_keepalive,
)
# Create timeout config
timeout = httpx.Timeout(
connect=self.connect_timeout,
read=self.read_timeout,
write=self.read_timeout,
pool=self.connect_timeout,
)
# Create default client
self._default_client = httpx.AsyncClient(
limits=limits,
timeout=timeout,
follow_redirects=False, # Handle redirects manually for auth
)
# Create thread pool for blocking operations
self._executor = ThreadPoolExecutor(
max_workers=self.worker_threads,
thread_name_prefix="orchard-blocking-",
)
self._started = True
logger.info("HttpClientManager started")
async def shutdown(self) -> None:
"""Close all clients and thread pool. Called by FastAPI lifespan."""
if not self._started:
return
logger.info("Shutting down HttpClientManager")
# Close default client
if self._default_client:
await self._default_client.aclose()
self._default_client = None
# Close upstream-specific clients
for name, client in self._upstream_clients.items():
logger.debug(f"Closing upstream client: {name}")
await client.aclose()
self._upstream_clients.clear()
# Shutdown thread pool
if self._executor:
self._executor.shutdown(wait=True)
self._executor = None
self._started = False
logger.info("HttpClientManager shutdown complete")
def get_client(self, upstream_name: Optional[str] = None) -> httpx.AsyncClient:
"""
Get HTTP client for making requests.
Args:
upstream_name: Optional upstream source name for dedicated pool.
If None, returns the default shared client.
Returns:
httpx.AsyncClient configured for the request.
Raises:
RuntimeError: If manager not started.
"""
if not self._started or not self._default_client:
raise RuntimeError("HttpClientManager not started. Call startup() first.")
if upstream_name and upstream_name in self._upstream_clients:
return self._upstream_clients[upstream_name]
return self._default_client
async def run_blocking(self, func: Callable[..., Any], *args: Any) -> Any:
"""
Run a blocking function in the thread pool.
Use this for:
- File I/O operations
- Archive extraction (zipfile, tarfile)
- Hash computation on large data
Args:
func: Synchronous function to execute
*args: Arguments to pass to the function
Returns:
The function's return value.
"""
if not self._executor:
raise RuntimeError("HttpClientManager not started. Call startup() first.")
loop = asyncio.get_running_loop()
return await loop.run_in_executor(self._executor, func, *args)
@property
def active_connections(self) -> int:
"""Get approximate number of active connections (for health checks)."""
if not self._default_client:
return 0
# httpx doesn't expose this directly, return pool size as approximation
return self.max_connections
@property
def pool_size(self) -> int:
"""Get configured pool size."""
return self.max_connections
@property
def executor_active(self) -> int:
"""Get number of active thread pool workers."""
if not self._executor:
return 0
return len(self._executor._threads)
@property
def executor_max(self) -> int:
"""Get max thread pool workers."""
return self.worker_threads

View File

@@ -15,8 +15,6 @@ from .pypi_proxy import router as pypi_router
from .seed import seed_database
from .auth import create_default_admin
from .rate_limit import limiter
from .http_client import HttpClientManager
from .cache_service import CacheService
settings = get_settings()
logging.basicConfig(level=logging.INFO)
@@ -40,17 +38,6 @@ async def lifespan(app: FastAPI):
finally:
db.close()
# Initialize infrastructure services
logger.info("Initializing infrastructure services...")
app.state.http_client = HttpClientManager(settings)
await app.state.http_client.startup()
app.state.cache = CacheService(settings)
await app.state.cache.startup()
logger.info("Infrastructure services ready")
# Seed test data in development mode
if settings.is_development:
logger.info(f"Running in {settings.env} mode - checking for seed data")
@@ -64,12 +51,6 @@ async def lifespan(app: FastAPI):
yield
# Shutdown infrastructure services
logger.info("Shutting down infrastructure services...")
await app.state.http_client.shutdown()
await app.state.cache.shutdown()
logger.info("Shutdown complete")
app = FastAPI(
title="Orchard",

View File

@@ -71,6 +71,7 @@ class Package(Base):
)
project = relationship("Project", back_populates="packages")
tags = relationship("Tag", back_populates="package", cascade="all, delete-orphan")
uploads = relationship(
"Upload", back_populates="package", cascade="all, delete-orphan"
)
@@ -119,6 +120,7 @@ class Artifact(Base):
ref_count = Column(Integer, default=1)
s3_key = Column(String(1024), nullable=False)
tags = relationship("Tag", back_populates="artifact")
uploads = relationship("Upload", back_populates="artifact")
versions = relationship("PackageVersion", back_populates="artifact")
dependencies = relationship(
@@ -149,6 +151,65 @@ class Artifact(Base):
)
class Tag(Base):
__tablename__ = "tags"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
package_id = Column(
UUID(as_uuid=True),
ForeignKey("packages.id", ondelete="CASCADE"),
nullable=False,
)
name = Column(String(255), nullable=False)
artifact_id = Column(String(64), ForeignKey("artifacts.id"), nullable=False)
created_at = Column(DateTime(timezone=True), default=datetime.utcnow)
updated_at = Column(
DateTime(timezone=True), default=datetime.utcnow, onupdate=datetime.utcnow
)
created_by = Column(String(255), nullable=False)
package = relationship("Package", back_populates="tags")
artifact = relationship("Artifact", back_populates="tags")
history = relationship(
"TagHistory", back_populates="tag", cascade="all, delete-orphan"
)
__table_args__ = (
Index("idx_tags_package_id", "package_id"),
Index("idx_tags_artifact_id", "artifact_id"),
Index(
"idx_tags_package_name", "package_id", "name", unique=True
), # Composite unique index
Index(
"idx_tags_package_created_at", "package_id", "created_at"
), # For recent tags queries
)
class TagHistory(Base):
__tablename__ = "tag_history"
id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
tag_id = Column(
UUID(as_uuid=True), ForeignKey("tags.id", ondelete="CASCADE"), nullable=False
)
old_artifact_id = Column(String(64), ForeignKey("artifacts.id"))
new_artifact_id = Column(String(64), ForeignKey("artifacts.id"), nullable=False)
change_type = Column(String(20), nullable=False, default="update")
changed_at = Column(DateTime(timezone=True), default=datetime.utcnow)
changed_by = Column(String(255), nullable=False)
tag = relationship("Tag", back_populates="history")
__table_args__ = (
Index("idx_tag_history_tag_id", "tag_id"),
Index("idx_tag_history_changed_at", "changed_at"),
CheckConstraint(
"change_type IN ('create', 'update', 'delete')", name="check_change_type"
),
)
class PackageVersion(Base):
"""Immutable version record for a package-artifact relationship.
@@ -188,7 +249,7 @@ class Upload(Base):
artifact_id = Column(String(64), ForeignKey("artifacts.id"), nullable=False)
package_id = Column(UUID(as_uuid=True), ForeignKey("packages.id"), nullable=False)
original_name = Column(String(1024))
version = Column(String(255)) # Version assigned during upload
tag_name = Column(String(255)) # Tag assigned during upload
user_agent = Column(String(512)) # Client identification
duration_ms = Column(Integer) # Upload timing in milliseconds
deduplicated = Column(Boolean, default=False) # Whether artifact was deduplicated
@@ -463,8 +524,8 @@ class PackageHistory(Base):
class ArtifactDependency(Base):
"""Dependency declared by an artifact on another package.
Each artifact can declare dependencies on other packages, specifying a version.
This enables recursive dependency resolution.
Each artifact can declare dependencies on other packages, specifying either
an exact version or a tag. This enables recursive dependency resolution.
"""
__tablename__ = "artifact_dependencies"
@@ -477,13 +538,20 @@ class ArtifactDependency(Base):
)
dependency_project = Column(String(255), nullable=False)
dependency_package = Column(String(255), nullable=False)
version_constraint = Column(String(255), nullable=False)
version_constraint = Column(String(255), nullable=True)
tag_constraint = Column(String(255), nullable=True)
created_at = Column(DateTime(timezone=True), default=datetime.utcnow)
# Relationship to the artifact that declares this dependency
artifact = relationship("Artifact", back_populates="dependencies")
__table_args__ = (
# Exactly one of version_constraint or tag_constraint must be set
CheckConstraint(
"(version_constraint IS NOT NULL AND tag_constraint IS NULL) OR "
"(version_constraint IS NULL AND tag_constraint IS NOT NULL)",
name="check_constraint_type",
),
# Each artifact can only depend on a specific project/package once
Index(
"idx_artifact_dependencies_artifact_id",

View File

@@ -12,6 +12,7 @@ from .models import (
Project,
Package,
Artifact,
Tag,
Upload,
PackageVersion,
ArtifactDependency,
@@ -59,6 +60,7 @@ def purge_seed_data(db: Session) -> dict:
results = {
"dependencies_deleted": 0,
"tags_deleted": 0,
"versions_deleted": 0,
"uploads_deleted": 0,
"artifacts_deleted": 0,
@@ -101,7 +103,15 @@ def purge_seed_data(db: Session) -> dict:
results["dependencies_deleted"] = count
logger.info(f"Deleted {count} artifact dependencies")
# 2. Delete package versions
# 2. Delete tags
if seed_package_ids:
count = db.query(Tag).filter(Tag.package_id.in_(seed_package_ids)).delete(
synchronize_session=False
)
results["tags_deleted"] = count
logger.info(f"Deleted {count} tags")
# 3. Delete package versions
if seed_package_ids:
count = db.query(PackageVersion).filter(
PackageVersion.package_id.in_(seed_package_ids)
@@ -109,7 +119,7 @@ def purge_seed_data(db: Session) -> dict:
results["versions_deleted"] = count
logger.info(f"Deleted {count} package versions")
# 3. Delete uploads
# 4. Delete uploads
if seed_package_ids:
count = db.query(Upload).filter(Upload.package_id.in_(seed_package_ids)).delete(
synchronize_session=False
@@ -117,7 +127,7 @@ def purge_seed_data(db: Session) -> dict:
results["uploads_deleted"] = count
logger.info(f"Deleted {count} uploads")
# 4. Delete S3 objects for seed artifacts
# 5. Delete S3 objects for seed artifacts
if seed_artifact_ids:
seed_artifacts = db.query(Artifact).filter(Artifact.id.in_(seed_artifact_ids)).all()
for artifact in seed_artifacts:
@@ -129,8 +139,8 @@ def purge_seed_data(db: Session) -> dict:
logger.warning(f"Failed to delete S3 object {artifact.s3_key}: {e}")
logger.info(f"Deleted {results['s3_objects_deleted']} S3 objects")
# 5. Delete artifacts (only those with ref_count that would be 0 after our deletions)
# Since we deleted all versions pointing to these artifacts, we can delete them
# 6. Delete artifacts (only those with ref_count that would be 0 after our deletions)
# Since we deleted all tags/versions pointing to these artifacts, we can delete them
if seed_artifact_ids:
count = db.query(Artifact).filter(Artifact.id.in_(seed_artifact_ids)).delete(
synchronize_session=False
@@ -138,7 +148,7 @@ def purge_seed_data(db: Session) -> dict:
results["artifacts_deleted"] = count
logger.info(f"Deleted {count} artifacts")
# 6. Delete packages
# 7. Delete packages
if seed_package_ids:
count = db.query(Package).filter(Package.id.in_(seed_package_ids)).delete(
synchronize_session=False
@@ -146,7 +156,7 @@ def purge_seed_data(db: Session) -> dict:
results["packages_deleted"] = count
logger.info(f"Deleted {count} packages")
# 7. Delete access permissions for seed projects
# 8. Delete access permissions for seed projects
if seed_project_ids:
count = db.query(AccessPermission).filter(
AccessPermission.project_id.in_(seed_project_ids)
@@ -154,14 +164,14 @@ def purge_seed_data(db: Session) -> dict:
results["permissions_deleted"] = count
logger.info(f"Deleted {count} access permissions")
# 8. Delete seed projects
# 9. Delete seed projects
count = db.query(Project).filter(Project.name.in_(SEED_PROJECT_NAMES)).delete(
synchronize_session=False
)
results["projects_deleted"] = count
logger.info(f"Deleted {count} projects")
# 9. Find and delete seed team
# 10. Find and delete seed team
seed_team = db.query(Team).filter(Team.slug == SEED_TEAM_SLUG).first()
if seed_team:
# Delete team memberships first
@@ -176,7 +186,7 @@ def purge_seed_data(db: Session) -> dict:
results["teams_deleted"] = 1
logger.info(f"Deleted team: {SEED_TEAM_SLUG}")
# 10. Delete seed users (but NOT admin)
# 11. Delete seed users (but NOT admin)
seed_users = db.query(User).filter(User.username.in_(SEED_USERNAMES)).all()
for user in seed_users:
# Delete any remaining team memberships for this user

View File

@@ -6,237 +6,33 @@ Artifacts are cached on first access through configured upstream sources.
"""
import hashlib
import json
import logging
import os
import re
import tarfile
import tempfile
import zipfile
from io import BytesIO
from typing import Optional, List, Tuple
from typing import Optional
from urllib.parse import urljoin, urlparse, quote, unquote
import httpx
from fastapi import APIRouter, Depends, HTTPException, Request, Response
from fastapi.responses import StreamingResponse, HTMLResponse, RedirectResponse
from fastapi.responses import StreamingResponse, HTMLResponse
from sqlalchemy.orm import Session
from .database import get_db
from .models import UpstreamSource, CachedUrl, Artifact, Project, Package, PackageVersion
from .models import UpstreamSource, CachedUrl, Artifact, Project, Package, Tag, PackageVersion
from .storage import S3Storage, get_storage
from .config import get_env_upstream_sources, get_settings
from .http_client import HttpClientManager
from .db_utils import ArtifactRepository
from .config import get_env_upstream_sources
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/pypi", tags=["pypi-proxy"])
def get_http_client(request: Request) -> HttpClientManager:
"""Get HttpClientManager from app state."""
return request.app.state.http_client
# Timeout configuration for proxy requests
PROXY_CONNECT_TIMEOUT = 30.0
PROXY_READ_TIMEOUT = 60.0
def _parse_requires_dist(requires_dist: str) -> Tuple[str, Optional[str]]:
"""Parse a Requires-Dist line into (package_name, version_constraint).
Filters out optional/extra dependencies and platform-specific dependencies
to avoid pulling in unnecessary packages during dependency resolution.
Examples:
"requests (>=2.25.0)" -> ("requests", ">=2.25.0")
"typing-extensions; python_version < '3.8'" -> ("typing-extensions", None)
"numpy>=1.21.0" -> ("numpy", ">=1.21.0")
"certifi" -> ("certifi", None)
"pytest; extra == 'test'" -> (None, None) # Filtered: extra dependency
"pyobjc; sys_platform == 'darwin'" -> (None, None) # Filtered: platform-specific
Returns:
Tuple of (normalized_package_name, version_constraint or None)
Returns (None, None) for dependencies that should be filtered out.
"""
# Check for and filter environment markers (after semicolon)
if ';' in requires_dist:
marker_part = requires_dist.split(';', 1)[1].lower()
# Filter out extra/optional dependencies - these are not core dependencies
# Examples: "pytest; extra == 'test'", "sphinx; extra == 'docs'"
if 'extra' in marker_part:
return None, None
# Filter out platform-specific dependencies to avoid cross-platform bloat
# Examples: "pyobjc; sys_platform == 'darwin'", "pywin32; sys_platform == 'win32'"
if 'sys_platform' in marker_part or 'platform_system' in marker_part:
return None, None
# Strip the marker for remaining dependencies (like python_version constraints)
requires_dist = requires_dist.split(';')[0].strip()
# Match patterns like "package (>=1.0)" or "package>=1.0" or "package"
match = re.match(
r'^([a-zA-Z0-9][-a-zA-Z0-9._]*)\s*(?:\(([^)]+)\)|([<>=!~][^\s;]+))?',
requires_dist.strip()
)
if not match:
return None, None
package_name = match.group(1)
# Version can be in parentheses (group 2) or directly after name (group 3)
version_constraint = match.group(2) or match.group(3)
# Normalize package name (PEP 503)
normalized_name = re.sub(r'[-_.]+', '-', package_name).lower()
# Clean up version constraint
if version_constraint:
version_constraint = version_constraint.strip()
return normalized_name, version_constraint
def _extract_requires_from_metadata(metadata_content: str) -> List[Tuple[str, Optional[str]]]:
"""Extract all Requires-Dist entries from METADATA/PKG-INFO content.
Args:
metadata_content: The content of a METADATA or PKG-INFO file
Returns:
List of (package_name, version_constraint) tuples
"""
dependencies = []
for line in metadata_content.split('\n'):
if line.startswith('Requires-Dist:'):
value = line[len('Requires-Dist:'):].strip()
pkg_name, version = _parse_requires_dist(value)
if pkg_name:
dependencies.append((pkg_name, version))
return dependencies
def _extract_metadata_from_wheel(file_path: str) -> Optional[str]:
"""Extract METADATA file content from a wheel (zip) file.
Args:
file_path: Path to the wheel file
Returns:
METADATA file content as string, or None if not found
"""
try:
with zipfile.ZipFile(file_path) as zf:
for name in zf.namelist():
if name.endswith('.dist-info/METADATA'):
return zf.read(name).decode('utf-8', errors='replace')
except Exception as e:
logger.warning(f"Failed to extract metadata from wheel: {e}")
return None
def _extract_metadata_from_sdist(file_path: str) -> Optional[str]:
"""Extract PKG-INFO file content from a source distribution (.tar.gz).
Args:
file_path: Path to the tarball file
Returns:
PKG-INFO file content as string, or None if not found
"""
try:
with tarfile.open(file_path, mode='r:gz') as tf:
for member in tf.getmembers():
if member.name.endswith('/PKG-INFO') and member.name.count('/') == 1:
f = tf.extractfile(member)
if f:
return f.read().decode('utf-8', errors='replace')
except Exception as e:
logger.warning(f"Failed to extract metadata from sdist: {e}")
return None
def _extract_dependencies_from_file(file_path: str, filename: str) -> List[Tuple[str, Optional[str]]]:
"""Extract dependencies from a PyPI package file.
Supports wheel (.whl) and source distribution (.tar.gz) formats.
Args:
file_path: Path to the package file
filename: The original filename
Returns:
List of (package_name, version_constraint) tuples
"""
metadata = None
if filename.endswith('.whl'):
metadata = _extract_metadata_from_wheel(file_path)
elif filename.endswith('.tar.gz'):
metadata = _extract_metadata_from_sdist(file_path)
if metadata:
return _extract_requires_from_metadata(metadata)
return []
def _parse_upstream_error(response: httpx.Response) -> str:
"""Parse upstream error response to extract useful error details.
Handles JFrog/Artifactory policy errors and other common formats.
Returns a user-friendly error message.
"""
status = response.status_code
try:
body = response.text
except Exception:
return f"HTTP {status}"
# Try to parse as JSON (JFrog/Artifactory format)
try:
data = json.loads(body)
# JFrog Artifactory format: {"errors": [{"status": 403, "message": "..."}]}
if "errors" in data and isinstance(data["errors"], list):
messages = []
for err in data["errors"]:
if isinstance(err, dict) and "message" in err:
messages.append(err["message"])
if messages:
return "; ".join(messages)
# Alternative format: {"message": "..."}
if "message" in data:
return data["message"]
# Alternative format: {"error": "..."}
if "error" in data:
return data["error"]
except (json.JSONDecodeError, ValueError):
pass
# Check for policy-related keywords in plain text response
policy_keywords = ["policy", "blocked", "forbidden", "curation", "security"]
if any(kw in body.lower() for kw in policy_keywords):
# Truncate long responses but preserve the message
if len(body) > 500:
return body[:500] + "..."
return body
# Default: just return status code
return f"HTTP {status}"
def _extract_pypi_version(filename: str) -> Optional[str]:
"""Extract version from PyPI filename.
@@ -411,7 +207,6 @@ async def pypi_simple_index(
# Try each source in priority order
last_error = None
last_status = None
for source in sources:
try:
headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
@@ -457,29 +252,21 @@ async def pypi_simple_index(
return HTMLResponse(content=content)
# Parse upstream error for policy/blocking messages
last_error = _parse_upstream_error(response)
last_status = response.status_code
logger.warning(f"PyPI proxy: upstream returned {response.status_code}: {last_error}")
last_error = f"HTTP {response.status_code}"
except httpx.ConnectError as e:
last_error = f"Connection failed: {e}"
last_status = 502
logger.warning(f"PyPI proxy: failed to connect to {source.url}: {e}")
except httpx.TimeoutException as e:
last_error = f"Timeout: {e}"
last_status = 504
logger.warning(f"PyPI proxy: timeout connecting to {source.url}: {e}")
except Exception as e:
last_error = str(e)
last_status = 502
logger.warning(f"PyPI proxy: error fetching from {source.url}: {e}")
# Pass through 4xx errors (like 403 policy blocks) so users understand why
status_code = last_status if last_status and 400 <= last_status < 500 else 502
raise HTTPException(
status_code=status_code,
detail=f"Upstream error: {last_error}"
status_code=502,
detail=f"Failed to fetch package index from upstream: {last_error}"
)
@@ -509,7 +296,6 @@ async def pypi_package_versions(
# Try each source in priority order
last_error = None
last_status = None
for source in sources:
try:
headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
@@ -562,287 +348,26 @@ async def pypi_package_versions(
if response.status_code == 404:
# Package not found in this source, try next
last_error = f"Package not found in {source.name}"
last_status = 404
continue
# Parse upstream error for policy/blocking messages
last_error = _parse_upstream_error(response)
last_status = response.status_code
logger.warning(f"PyPI proxy: upstream returned {response.status_code} for {package_name}: {last_error}")
last_error = f"HTTP {response.status_code}"
except httpx.ConnectError as e:
last_error = f"Connection failed: {e}"
last_status = 502
logger.warning(f"PyPI proxy: failed to connect to {source.url}: {e}")
except httpx.TimeoutException as e:
last_error = f"Timeout: {e}"
last_status = 504
logger.warning(f"PyPI proxy: timeout connecting to {source.url}: {e}")
except Exception as e:
last_error = str(e)
last_status = 502
logger.warning(f"PyPI proxy: error fetching {package_name} from {source.url}: {e}")
# Pass through 4xx errors (like 403 policy blocks) so users understand why
status_code = last_status if last_status and 400 <= last_status < 500 else 404
raise HTTPException(
status_code=status_code,
detail=f"Package '{package_name}' error: {last_error}"
status_code=404,
detail=f"Package '{package_name}' not found: {last_error}"
)
async def fetch_and_cache_pypi_package(
db: Session,
storage: S3Storage,
http_client: httpx.AsyncClient,
package_name: str,
filename: str,
download_url: str,
expected_sha256: Optional[str] = None,
) -> Optional[dict]:
"""
Fetch a PyPI package from upstream and cache it in Orchard.
This is the core caching logic extracted from pypi_download_file() for reuse
by the registry client during auto-fetch dependency resolution.
Args:
db: Database session
storage: S3 storage instance
http_client: Async HTTP client for making requests
package_name: Normalized package name (e.g., 'requests')
filename: Package filename (e.g., 'requests-2.31.0-py3-none-any.whl')
download_url: Full URL to download from upstream
expected_sha256: Optional SHA256 to verify download integrity
Returns:
Dict with artifact_id, size, version, already_cached if successful.
None if the fetch failed.
"""
# Normalize package name
normalized_name = re.sub(r'[-_.]+', '-', package_name).lower()
# Check if we already have this URL cached
url_hash = hashlib.sha256(download_url.encode()).hexdigest()
cached_url = db.query(CachedUrl).filter(CachedUrl.url_hash == url_hash).first()
if cached_url:
# Already cached - return existing artifact info
artifact = db.query(Artifact).filter(Artifact.id == cached_url.artifact_id).first()
if artifact:
version = _extract_pypi_version(filename)
logger.info(f"PyPI fetch: {filename} already cached (artifact {artifact.id[:12]})")
return {
"artifact_id": artifact.id,
"size": artifact.size,
"version": version,
"already_cached": True,
}
# Get upstream sources for auth headers
sources = _get_pypi_upstream_sources(db)
matched_source = sources[0] if sources else None
headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
if matched_source:
headers.update(_build_auth_headers(matched_source))
auth = _get_basic_auth(matched_source) if matched_source else None
download_timeout = httpx.Timeout(connect=30.0, read=300.0, write=300.0, pool=30.0)
try:
logger.info(f"PyPI fetch: downloading {filename} from {download_url}")
response = await http_client.get(
download_url,
headers=headers,
auth=auth,
timeout=download_timeout,
)
# Handle redirects manually
redirect_count = 0
while response.status_code in (301, 302, 303, 307, 308) and redirect_count < 5:
redirect_url = response.headers.get('location')
if not redirect_url:
break
if not redirect_url.startswith('http'):
redirect_url = urljoin(download_url, redirect_url)
logger.debug(f"PyPI fetch: following redirect to {redirect_url}")
# Don't send auth to different hosts
redirect_headers = {"User-Agent": "Orchard-PyPI-Proxy/1.0"}
redirect_auth = None
if urlparse(redirect_url).netloc == urlparse(download_url).netloc:
redirect_headers.update(headers)
redirect_auth = auth
response = await http_client.get(
redirect_url,
headers=redirect_headers,
auth=redirect_auth,
follow_redirects=False,
timeout=download_timeout,
)
redirect_count += 1
if response.status_code != 200:
error_detail = _parse_upstream_error(response)
logger.warning(f"PyPI fetch: upstream returned {response.status_code} for {filename}: {error_detail}")
return None
content_type = response.headers.get('content-type', 'application/octet-stream')
# Stream to temp file to avoid loading large packages into memory
tmp_path = None
try:
with tempfile.NamedTemporaryFile(delete=False, suffix=f"_{filename}") as tmp_file:
tmp_path = tmp_file.name
async for chunk in response.aiter_bytes(chunk_size=65536):
tmp_file.write(chunk)
# Store in S3 from temp file (computes hash and deduplicates automatically)
with open(tmp_path, 'rb') as f:
result = storage.store(f)
sha256 = result.sha256
size = result.size
# Verify hash if expected
if expected_sha256 and sha256 != expected_sha256.lower():
logger.error(
f"PyPI fetch: hash mismatch for {filename}: "
f"expected {expected_sha256[:12]}, got {sha256[:12]}"
)
return None
# Extract dependencies from the temp file
extracted_deps = _extract_dependencies_from_file(tmp_path, filename)
if extracted_deps:
logger.info(f"PyPI fetch: extracted {len(extracted_deps)} dependencies from {filename}")
logger.info(f"PyPI fetch: downloaded {filename}, {size} bytes, sha256={sha256[:12]}")
finally:
# Clean up temp file
if tmp_path and os.path.exists(tmp_path):
os.unlink(tmp_path)
# Check if artifact already exists
existing = db.query(Artifact).filter(Artifact.id == sha256).first()
if existing:
existing.ref_count += 1
db.flush()
else:
new_artifact = Artifact(
id=sha256,
original_name=filename,
content_type=content_type,
size=size,
ref_count=1,
created_by="pypi-proxy",
s3_key=result.s3_key,
checksum_md5=result.md5,
checksum_sha1=result.sha1,
s3_etag=result.s3_etag,
)
db.add(new_artifact)
db.flush()
# Create/get system project and package
system_project = db.query(Project).filter(Project.name == "_pypi").first()
if not system_project:
system_project = Project(
name="_pypi",
description="System project for cached PyPI packages",
is_public=True,
is_system=True,
created_by="pypi-proxy",
)
db.add(system_project)
db.flush()
elif not system_project.is_system:
system_project.is_system = True
db.flush()
package = db.query(Package).filter(
Package.project_id == system_project.id,
Package.name == normalized_name,
).first()
if not package:
package = Package(
project_id=system_project.id,
name=normalized_name,
description=f"PyPI package: {normalized_name}",
format="pypi",
)
db.add(package)
db.flush()
# Extract and create version
version = _extract_pypi_version(filename)
if version and not filename.endswith('.metadata'):
existing_version = db.query(PackageVersion).filter(
PackageVersion.package_id == package.id,
PackageVersion.version == version,
).first()
if not existing_version:
pkg_version = PackageVersion(
package_id=package.id,
artifact_id=sha256,
version=version,
version_source="filename",
created_by="pypi-proxy",
)
db.add(pkg_version)
# Cache the URL mapping
existing_cached = db.query(CachedUrl).filter(CachedUrl.url_hash == url_hash).first()
if not existing_cached:
cached_url_record = CachedUrl(
url_hash=url_hash,
url=download_url,
artifact_id=sha256,
)
db.add(cached_url_record)
# Store extracted dependencies using batch operation
if extracted_deps:
seen_deps: dict[str, str] = {}
for dep_name, dep_version in extracted_deps:
if dep_name not in seen_deps:
seen_deps[dep_name] = dep_version if dep_version else "*"
deps_to_store = [
("_pypi", dep_name, dep_version)
for dep_name, dep_version in seen_deps.items()
]
repo = ArtifactRepository(db)
inserted = repo.batch_upsert_dependencies(sha256, deps_to_store)
if inserted > 0:
logger.debug(f"Stored {inserted} dependencies for {sha256[:12]}...")
db.commit()
return {
"artifact_id": sha256,
"size": size,
"version": version,
"already_cached": False,
}
except httpx.ConnectError as e:
logger.warning(f"PyPI fetch: connection failed for {filename}: {e}")
return None
except httpx.TimeoutException as e:
logger.warning(f"PyPI fetch: timeout for {filename}: {e}")
return None
except Exception as e:
logger.exception(f"PyPI fetch: error downloading {filename}")
return None
@router.get("/simple/{package_name}/{filename}")
async def pypi_download_file(
request: Request,
@@ -851,7 +376,6 @@ async def pypi_download_file(
upstream: Optional[str] = None,
db: Session = Depends(get_db),
storage: S3Storage = Depends(get_storage),
http_client: HttpClientManager = Depends(get_http_client),
):
"""
Download a package file, caching it in Orchard.
@@ -879,22 +403,9 @@ async def pypi_download_file(
artifact = db.query(Artifact).filter(Artifact.id == cached_url.artifact_id).first()
if artifact:
logger.info(f"PyPI proxy: serving cached {filename} (artifact {artifact.id[:12]})")
settings = get_settings()
# Stream from S3
try:
if settings.pypi_download_mode == "redirect":
# Redirect to S3 presigned URL - client downloads directly from S3
presigned_url = storage.generate_presigned_url(artifact.s3_key)
return RedirectResponse(
url=presigned_url,
status_code=302,
headers={
"X-Checksum-SHA256": artifact.id,
"X-Cache": "HIT",
}
)
else:
# Proxy mode - stream from S3 through Orchard
stream, content_length, _ = storage.get_stream(artifact.s3_key)
def stream_content():
@@ -916,7 +427,7 @@ async def pypi_download_file(
}
)
except Exception as e:
logger.error(f"PyPI proxy: error serving cached artifact: {e}")
logger.error(f"PyPI proxy: error streaming cached artifact: {e}")
# Fall through to fetch from upstream
# Not cached - fetch from upstream
@@ -933,21 +444,16 @@ async def pypi_download_file(
headers.update(_build_auth_headers(matched_source))
auth = _get_basic_auth(matched_source) if matched_source else None
# Use shared HTTP client from pool with longer timeout for file downloads
client = http_client.get_client()
download_timeout = httpx.Timeout(connect=30.0, read=300.0, write=300.0, pool=30.0)
# Initialize extracted dependencies list
extracted_deps = []
timeout = httpx.Timeout(300.0, connect=PROXY_CONNECT_TIMEOUT) # 5 minutes for large files
# Fetch the file
logger.info(f"PyPI proxy: fetching {filename} from {upstream_url}")
async with httpx.AsyncClient(timeout=timeout, follow_redirects=False) as client:
response = await client.get(
upstream_url,
headers=headers,
auth=auth,
timeout=download_timeout,
)
# Handle redirects manually
@@ -974,17 +480,13 @@ async def pypi_download_file(
headers=redirect_headers,
auth=redirect_auth,
follow_redirects=False,
timeout=download_timeout,
)
redirect_count += 1
if response.status_code != 200:
# Parse upstream error for policy/blocking messages
error_detail = _parse_upstream_error(response)
logger.warning(f"PyPI proxy: upstream returned {response.status_code} for {filename}: {error_detail}")
raise HTTPException(
status_code=response.status_code,
detail=f"Upstream error: {error_detail}"
detail=f"Upstream returned {response.status_code}"
)
content_type = response.headers.get('content-type', 'application/octet-stream')
@@ -1004,12 +506,10 @@ async def pypi_download_file(
result = storage.store(f)
sha256 = result.sha256
size = result.size
s3_key = result.s3_key
# Extract dependencies from the temp file before cleaning up
extracted_deps = _extract_dependencies_from_file(tmp_path, filename)
if extracted_deps:
logger.info(f"PyPI proxy: extracted {len(extracted_deps)} dependencies from {filename}")
# Read content for response
with open(tmp_path, 'rb') as f:
content = f.read()
logger.info(f"PyPI proxy: downloaded {filename}, {size} bytes, sha256={sha256[:12]}")
finally:
@@ -1074,6 +574,20 @@ async def pypi_download_file(
db.add(package)
db.flush()
# Create tag with filename
existing_tag = db.query(Tag).filter(
Tag.package_id == package.id,
Tag.name == filename,
).first()
if not existing_tag:
tag = Tag(
package_id=package.id,
name=filename,
artifact_id=sha256,
created_by="pypi-proxy",
)
db.add(tag)
# Extract and create version
# Only create version for actual package files, not .metadata files
version = _extract_pypi_version(filename)
@@ -1103,56 +617,11 @@ async def pypi_download_file(
)
db.add(cached_url_record)
# Store extracted dependencies using batch operation
if extracted_deps:
# Deduplicate: keep first version constraint seen for each package name
seen_deps: dict[str, str] = {}
for dep_name, dep_version in extracted_deps:
if dep_name not in seen_deps:
seen_deps[dep_name] = dep_version if dep_version else "*"
# Convert to list of tuples for batch insert
deps_to_store = [
("_pypi", dep_name, dep_version)
for dep_name, dep_version in seen_deps.items()
]
# Batch upsert - handles duplicates with ON CONFLICT DO NOTHING
repo = ArtifactRepository(db)
inserted = repo.batch_upsert_dependencies(sha256, deps_to_store)
if inserted > 0:
logger.debug(f"Stored {inserted} dependencies for {sha256[:12]}...")
db.commit()
# Serve the file from S3
settings = get_settings()
try:
if settings.pypi_download_mode == "redirect":
# Redirect to S3 presigned URL - client downloads directly from S3
presigned_url = storage.generate_presigned_url(s3_key)
return RedirectResponse(
url=presigned_url,
status_code=302,
headers={
"X-Checksum-SHA256": sha256,
"X-Cache": "MISS",
}
)
else:
# Proxy mode - stream from S3 through Orchard
stream, content_length, _ = storage.get_stream(s3_key)
def stream_content():
"""Generator that yields chunks from the S3 stream."""
try:
for chunk in stream.iter_chunks():
yield chunk
finally:
stream.close()
return StreamingResponse(
stream_content(),
# Return the file
return Response(
content=content,
media_type=content_type,
headers={
"Content-Disposition": f'attachment; filename="{filename}"',
@@ -1161,9 +630,6 @@ async def pypi_download_file(
"X-Cache": "MISS",
}
)
except Exception as e:
logger.error(f"PyPI proxy: error serving from S3: {e}")
raise HTTPException(status_code=500, detail=f"Error serving file: {e}")
except httpx.ConnectError as e:
raise HTTPException(status_code=502, detail=f"Connection failed: {e}")

View File

@@ -1,426 +0,0 @@
"""
Registry client abstraction for upstream package registries.
Provides a pluggable interface for fetching packages from upstream registries
(PyPI, npm, Maven, etc.) during dependency resolution with auto-fetch enabled.
"""
import hashlib
import logging
import os
import re
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional, TYPE_CHECKING
from urllib.parse import urljoin, urlparse
import httpx
from packaging.specifiers import SpecifierSet, InvalidSpecifier
from packaging.version import Version, InvalidVersion
from sqlalchemy.orm import Session
if TYPE_CHECKING:
from .storage import S3Storage
from .http_client import HttpClientManager
logger = logging.getLogger(__name__)
@dataclass
class VersionInfo:
"""Information about a package version from an upstream registry."""
version: str
download_url: str
filename: str
sha256: Optional[str] = None
size: Optional[int] = None
content_type: Optional[str] = None
@dataclass
class FetchResult:
"""Result of fetching a package from upstream."""
artifact_id: str # SHA256 hash
size: int
version: str
filename: str
already_cached: bool = False
class RegistryClient(ABC):
"""Abstract base class for upstream registry clients."""
@property
@abstractmethod
def source_type(self) -> str:
"""Return the source type this client handles (e.g., 'pypi', 'npm')."""
pass
@abstractmethod
async def get_available_versions(self, package_name: str) -> List[str]:
"""
Get all available versions of a package from upstream.
Args:
package_name: The normalized package name
Returns:
List of version strings, sorted from oldest to newest
"""
pass
@abstractmethod
async def resolve_constraint(
self, package_name: str, constraint: str
) -> Optional[VersionInfo]:
"""
Find the best version matching a constraint.
Args:
package_name: The normalized package name
constraint: Version constraint (e.g., '>=1.9', '<2.0,>=1.5', '*')
Returns:
VersionInfo with download URL, or None if no matching version found
"""
pass
@abstractmethod
async def fetch_package(
self,
package_name: str,
version_info: VersionInfo,
db: Session,
storage: "S3Storage",
) -> Optional[FetchResult]:
"""
Fetch and cache a package from upstream.
Args:
package_name: The normalized package name
version_info: Version details including download URL
db: Database session for creating records
storage: S3 storage for caching the artifact
Returns:
FetchResult with artifact_id, or None if fetch failed
"""
pass
class PyPIRegistryClient(RegistryClient):
"""PyPI registry client using the JSON API."""
# Timeout configuration for PyPI requests
CONNECT_TIMEOUT = 30.0
READ_TIMEOUT = 60.0
DOWNLOAD_TIMEOUT = 300.0 # Longer timeout for file downloads
def __init__(
self,
http_client: httpx.AsyncClient,
upstream_sources: List,
pypi_api_url: str = "https://pypi.org/pypi",
):
"""
Initialize PyPI registry client.
Args:
http_client: Shared async HTTP client
upstream_sources: List of configured upstream sources for auth
pypi_api_url: Base URL for PyPI JSON API
"""
self.client = http_client
self.sources = upstream_sources
self.api_url = pypi_api_url
@property
def source_type(self) -> str:
return "pypi"
def _normalize_package_name(self, name: str) -> str:
"""Normalize a PyPI package name per PEP 503."""
return re.sub(r"[-_.]+", "-", name).lower()
def _get_auth_headers(self) -> dict:
"""Get authentication headers from configured sources."""
headers = {"User-Agent": "Orchard-Registry-Client/1.0"}
if self.sources:
source = self.sources[0]
if hasattr(source, "auth_type"):
if source.auth_type == "bearer":
password = (
source.get_password()
if hasattr(source, "get_password")
else getattr(source, "password", None)
)
if password:
headers["Authorization"] = f"Bearer {password}"
elif source.auth_type == "api_key":
custom_headers = (
source.get_headers()
if hasattr(source, "get_headers")
else {}
)
if custom_headers:
headers.update(custom_headers)
return headers
def _get_basic_auth(self) -> Optional[tuple]:
"""Get basic auth credentials if configured."""
if self.sources:
source = self.sources[0]
if hasattr(source, "auth_type") and source.auth_type == "basic":
username = getattr(source, "username", None)
if username:
password = (
source.get_password()
if hasattr(source, "get_password")
else getattr(source, "password", "")
)
return (username, password or "")
return None
async def get_available_versions(self, package_name: str) -> List[str]:
"""Get all available versions from PyPI JSON API."""
normalized = self._normalize_package_name(package_name)
url = f"{self.api_url}/{normalized}/json"
headers = self._get_auth_headers()
auth = self._get_basic_auth()
timeout = httpx.Timeout(self.READ_TIMEOUT, connect=self.CONNECT_TIMEOUT)
try:
response = await self.client.get(
url, headers=headers, auth=auth, timeout=timeout
)
if response.status_code == 404:
logger.debug(f"Package {normalized} not found on PyPI")
return []
if response.status_code != 200:
logger.warning(
f"PyPI API returned {response.status_code} for {normalized}"
)
return []
data = response.json()
releases = data.get("releases", {})
# Filter to valid versions and sort
versions = []
for v in releases.keys():
try:
Version(v)
versions.append(v)
except InvalidVersion:
continue
versions.sort(key=lambda x: Version(x))
return versions
except httpx.RequestError as e:
logger.warning(f"Failed to query PyPI for {normalized}: {e}")
return []
except Exception as e:
logger.warning(f"Error parsing PyPI response for {normalized}: {e}")
return []
async def resolve_constraint(
self, package_name: str, constraint: str
) -> Optional[VersionInfo]:
"""Find best version matching constraint from PyPI."""
normalized = self._normalize_package_name(package_name)
url = f"{self.api_url}/{normalized}/json"
headers = self._get_auth_headers()
auth = self._get_basic_auth()
timeout = httpx.Timeout(self.READ_TIMEOUT, connect=self.CONNECT_TIMEOUT)
try:
response = await self.client.get(
url, headers=headers, auth=auth, timeout=timeout
)
if response.status_code == 404:
logger.debug(f"Package {normalized} not found on PyPI")
return None
if response.status_code != 200:
logger.warning(
f"PyPI API returned {response.status_code} for {normalized}"
)
return None
data = response.json()
releases = data.get("releases", {})
# Handle wildcard - return latest version
if constraint == "*":
latest_version = data.get("info", {}).get("version")
if latest_version and latest_version in releases:
return self._get_version_info(
normalized, latest_version, releases[latest_version]
)
return None
# Parse constraint
# If constraint looks like a bare version (no operator), treat as exact match
# e.g., "2025.10.5" -> "==2025.10.5"
effective_constraint = constraint
if constraint and constraint[0].isdigit():
effective_constraint = f"=={constraint}"
logger.debug(
f"Bare version '{constraint}' for {normalized}, "
f"treating as exact match '{effective_constraint}'"
)
try:
specifier = SpecifierSet(effective_constraint)
except InvalidSpecifier:
# Invalid constraint - treat as wildcard
logger.warning(
f"Invalid version constraint '{constraint}' for {normalized}, "
"treating as wildcard"
)
latest_version = data.get("info", {}).get("version")
if latest_version and latest_version in releases:
return self._get_version_info(
normalized, latest_version, releases[latest_version]
)
return None
# Find matching versions
matching = []
for v_str, files in releases.items():
if not files: # Skip versions with no files
continue
try:
v = Version(v_str)
if v in specifier:
matching.append((v_str, v, files))
except InvalidVersion:
continue
if not matching:
logger.debug(
f"No versions of {normalized} match constraint '{constraint}'"
)
return None
# Sort by version and return highest match
matching.sort(key=lambda x: x[1], reverse=True)
best_version, _, best_files = matching[0]
return self._get_version_info(normalized, best_version, best_files)
except httpx.RequestError as e:
logger.warning(f"Failed to query PyPI for {normalized}: {e}")
return None
except Exception as e:
logger.warning(f"Error resolving {normalized}@{constraint}: {e}")
return None
def _get_version_info(
self, package_name: str, version: str, files: List[dict]
) -> Optional[VersionInfo]:
"""Extract download info from PyPI release files."""
if not files:
return None
# Prefer wheel over sdist
wheel_file = None
sdist_file = None
for f in files:
filename = f.get("filename", "")
if filename.endswith(".whl"):
# Prefer platform-agnostic wheels
if "py3-none-any" in filename or wheel_file is None:
wheel_file = f
elif filename.endswith(".tar.gz") and sdist_file is None:
sdist_file = f
selected = wheel_file or sdist_file
if not selected:
# Fall back to first available file
selected = files[0]
return VersionInfo(
version=version,
download_url=selected.get("url", ""),
filename=selected.get("filename", ""),
sha256=selected.get("digests", {}).get("sha256"),
size=selected.get("size"),
content_type="application/zip"
if selected.get("filename", "").endswith(".whl")
else "application/gzip",
)
async def fetch_package(
self,
package_name: str,
version_info: VersionInfo,
db: Session,
storage: "S3Storage",
) -> Optional[FetchResult]:
"""Fetch and cache a PyPI package."""
# Import here to avoid circular imports
from .pypi_proxy import fetch_and_cache_pypi_package
normalized = self._normalize_package_name(package_name)
logger.info(
f"Fetching {normalized}=={version_info.version} from upstream PyPI"
)
result = await fetch_and_cache_pypi_package(
db=db,
storage=storage,
http_client=self.client,
package_name=normalized,
filename=version_info.filename,
download_url=version_info.download_url,
expected_sha256=version_info.sha256,
)
if result is None:
return None
return FetchResult(
artifact_id=result["artifact_id"],
size=result["size"],
version=version_info.version,
filename=version_info.filename,
already_cached=result.get("already_cached", False),
)
def get_registry_client(
source_type: str,
http_client: httpx.AsyncClient,
upstream_sources: List,
) -> Optional[RegistryClient]:
"""
Factory function to get a registry client for a source type.
Args:
source_type: The registry type ('pypi', 'npm', etc.)
http_client: Shared async HTTP client
upstream_sources: List of configured upstream sources
Returns:
RegistryClient for the source type, or None if not supported
"""
if source_type == "pypi":
# Filter to PyPI sources
pypi_sources = [s for s in upstream_sources if getattr(s, "source_type", "") == "pypi"]
return PyPIRegistryClient(http_client, pypi_sources)
# Future: Add npm, maven, etc.
logger.debug(f"No registry client available for source type: {source_type}")
return None

View File

@@ -9,6 +9,7 @@ from .base import BaseRepository
from .project import ProjectRepository
from .package import PackageRepository
from .artifact import ArtifactRepository
from .tag import TagRepository
from .upload import UploadRepository
__all__ = [
@@ -16,5 +17,6 @@ __all__ = [
"ProjectRepository",
"PackageRepository",
"ArtifactRepository",
"TagRepository",
"UploadRepository",
]

View File

@@ -8,7 +8,7 @@ from sqlalchemy import func, or_
from uuid import UUID
from .base import BaseRepository
from ..models import Artifact, PackageVersion, Upload, Package, Project
from ..models import Artifact, Tag, Upload, Package, Project
class ArtifactRepository(BaseRepository[Artifact]):
@@ -77,14 +77,14 @@ class ArtifactRepository(BaseRepository[Artifact]):
.all()
)
def get_artifacts_without_versions(self, limit: int = 100) -> List[Artifact]:
"""Get artifacts that have no versions pointing to them."""
# Subquery to find artifact IDs that have versions
versioned_artifacts = self.db.query(PackageVersion.artifact_id).distinct().subquery()
def get_artifacts_without_tags(self, limit: int = 100) -> List[Artifact]:
"""Get artifacts that have no tags pointing to them."""
# Subquery to find artifact IDs that have tags
tagged_artifacts = self.db.query(Tag.artifact_id).distinct().subquery()
return (
self.db.query(Artifact)
.filter(~Artifact.id.in_(versioned_artifacts))
.filter(~Artifact.id.in_(tagged_artifacts))
.limit(limit)
.all()
)
@@ -115,34 +115,34 @@ class ArtifactRepository(BaseRepository[Artifact]):
return artifacts, total
def get_referencing_versions(self, artifact_id: str) -> List[Tuple[PackageVersion, Package, Project]]:
"""Get all versions referencing this artifact with package and project info."""
def get_referencing_tags(self, artifact_id: str) -> List[Tuple[Tag, Package, Project]]:
"""Get all tags referencing this artifact with package and project info."""
return (
self.db.query(PackageVersion, Package, Project)
.join(Package, PackageVersion.package_id == Package.id)
self.db.query(Tag, Package, Project)
.join(Package, Tag.package_id == Package.id)
.join(Project, Package.project_id == Project.id)
.filter(PackageVersion.artifact_id == artifact_id)
.filter(Tag.artifact_id == artifact_id)
.all()
)
def search(self, query_str: str, limit: int = 10) -> List[Tuple[PackageVersion, Artifact, str, str]]:
def search(self, query_str: str, limit: int = 10) -> List[Tuple[Tag, Artifact, str, str]]:
"""
Search artifacts by version or original filename.
Returns (version, artifact, package_name, project_name) tuples.
Search artifacts by tag name or original filename.
Returns (tag, artifact, package_name, project_name) tuples.
"""
search_lower = query_str.lower()
return (
self.db.query(PackageVersion, Artifact, Package.name, Project.name)
.join(Artifact, PackageVersion.artifact_id == Artifact.id)
.join(Package, PackageVersion.package_id == Package.id)
self.db.query(Tag, Artifact, Package.name, Project.name)
.join(Artifact, Tag.artifact_id == Artifact.id)
.join(Package, Tag.package_id == Package.id)
.join(Project, Package.project_id == Project.id)
.filter(
or_(
func.lower(PackageVersion.version).contains(search_lower),
func.lower(Tag.name).contains(search_lower),
func.lower(Artifact.original_name).contains(search_lower)
)
)
.order_by(PackageVersion.version)
.order_by(Tag.name)
.limit(limit)
.all()
)

View File

@@ -8,7 +8,7 @@ from sqlalchemy import func, or_, asc, desc
from uuid import UUID
from .base import BaseRepository
from ..models import Package, Project, PackageVersion, Upload, Artifact
from ..models import Package, Project, Tag, Upload, Artifact
class PackageRepository(BaseRepository[Package]):
@@ -136,10 +136,10 @@ class PackageRepository(BaseRepository[Package]):
return self.update(package, **updates)
def get_stats(self, package_id: UUID) -> dict:
"""Get package statistics (version count, artifact count, total size)."""
version_count = (
self.db.query(func.count(PackageVersion.id))
.filter(PackageVersion.package_id == package_id)
"""Get package statistics (tag count, artifact count, total size)."""
tag_count = (
self.db.query(func.count(Tag.id))
.filter(Tag.package_id == package_id)
.scalar() or 0
)
@@ -154,7 +154,7 @@ class PackageRepository(BaseRepository[Package]):
)
return {
"version_count": version_count,
"tag_count": tag_count,
"artifact_count": artifact_stats[0] if artifact_stats else 0,
"total_size": artifact_stats[1] if artifact_stats else 0,
}

View File

@@ -0,0 +1,168 @@
"""
Tag repository for data access operations.
"""
from typing import Optional, List, Tuple
from sqlalchemy.orm import Session
from sqlalchemy import func, or_, asc, desc
from uuid import UUID
from .base import BaseRepository
from ..models import Tag, TagHistory, Artifact, Package, Project
class TagRepository(BaseRepository[Tag]):
"""Repository for Tag entity operations."""
model = Tag
def get_by_name(self, package_id: UUID, name: str) -> Optional[Tag]:
"""Get tag by name within a package."""
return (
self.db.query(Tag)
.filter(Tag.package_id == package_id, Tag.name == name)
.first()
)
def get_with_artifact(self, package_id: UUID, name: str) -> Optional[Tuple[Tag, Artifact]]:
"""Get tag with its artifact."""
return (
self.db.query(Tag, Artifact)
.join(Artifact, Tag.artifact_id == Artifact.id)
.filter(Tag.package_id == package_id, Tag.name == name)
.first()
)
def exists_by_name(self, package_id: UUID, name: str) -> bool:
"""Check if tag with name exists in package."""
return self.db.query(
self.db.query(Tag)
.filter(Tag.package_id == package_id, Tag.name == name)
.exists()
).scalar()
def list_by_package(
self,
package_id: UUID,
page: int = 1,
limit: int = 20,
search: Optional[str] = None,
sort: str = "name",
order: str = "asc",
) -> Tuple[List[Tuple[Tag, Artifact]], int]:
"""
List tags in a package with artifact metadata.
Returns tuple of ((tag, artifact) tuples, total_count).
"""
query = (
self.db.query(Tag, Artifact)
.join(Artifact, Tag.artifact_id == Artifact.id)
.filter(Tag.package_id == package_id)
)
# Apply search filter (tag name or artifact original filename)
if search:
search_lower = search.lower()
query = query.filter(
or_(
func.lower(Tag.name).contains(search_lower),
func.lower(Artifact.original_name).contains(search_lower)
)
)
# Get total count
total = query.count()
# Apply sorting
sort_columns = {
"name": Tag.name,
"created_at": Tag.created_at,
}
sort_column = sort_columns.get(sort, Tag.name)
if order == "desc":
query = query.order_by(desc(sort_column))
else:
query = query.order_by(asc(sort_column))
# Apply pagination
offset = (page - 1) * limit
results = query.offset(offset).limit(limit).all()
return results, total
def create_tag(
self,
package_id: UUID,
name: str,
artifact_id: str,
created_by: str,
) -> Tag:
"""Create a new tag."""
return self.create(
package_id=package_id,
name=name,
artifact_id=artifact_id,
created_by=created_by,
)
def update_artifact(
self,
tag: Tag,
new_artifact_id: str,
changed_by: str,
record_history: bool = True,
) -> Tag:
"""
Update tag to point to a different artifact.
Optionally records change in tag history.
"""
old_artifact_id = tag.artifact_id
if record_history and old_artifact_id != new_artifact_id:
history = TagHistory(
tag_id=tag.id,
old_artifact_id=old_artifact_id,
new_artifact_id=new_artifact_id,
changed_by=changed_by,
)
self.db.add(history)
tag.artifact_id = new_artifact_id
tag.created_by = changed_by
self.db.flush()
return tag
def get_history(self, tag_id: UUID) -> List[TagHistory]:
"""Get tag change history."""
return (
self.db.query(TagHistory)
.filter(TagHistory.tag_id == tag_id)
.order_by(TagHistory.changed_at.desc())
.all()
)
def get_latest_in_package(self, package_id: UUID) -> Optional[Tag]:
"""Get the most recently created/updated tag in a package."""
return (
self.db.query(Tag)
.filter(Tag.package_id == package_id)
.order_by(Tag.created_at.desc())
.first()
)
def get_by_artifact(self, artifact_id: str) -> List[Tag]:
"""Get all tags pointing to an artifact."""
return (
self.db.query(Tag)
.filter(Tag.artifact_id == artifact_id)
.all()
)
def count_by_artifact(self, artifact_id: str) -> int:
"""Count tags pointing to an artifact."""
return (
self.db.query(func.count(Tag.id))
.filter(Tag.artifact_id == artifact_id)
.scalar() or 0
)

File diff suppressed because it is too large Load Diff

View File

@@ -114,6 +114,14 @@ class PackageUpdate(BaseModel):
platform: Optional[str] = None
class TagSummary(BaseModel):
"""Lightweight tag info for embedding in package responses"""
name: str
artifact_id: str
created_at: datetime
class PackageDetailResponse(BaseModel):
"""Package with aggregated metadata"""
@@ -126,9 +134,13 @@ class PackageDetailResponse(BaseModel):
created_at: datetime
updated_at: datetime
# Aggregated fields
tag_count: int = 0
artifact_count: int = 0
total_size: int = 0
latest_tag: Optional[str] = None
latest_upload_at: Optional[datetime] = None
# Recent tags (limit 5)
recent_tags: List[TagSummary] = []
class Config:
from_attributes = True
@@ -153,6 +165,79 @@ class ArtifactResponse(BaseModel):
from_attributes = True
# Tag schemas
class TagCreate(BaseModel):
name: str
artifact_id: str
class TagResponse(BaseModel):
id: UUID
package_id: UUID
name: str
artifact_id: str
created_at: datetime
created_by: str
version: Optional[str] = None # Version of the artifact this tag points to
class Config:
from_attributes = True
class TagDetailResponse(BaseModel):
"""Tag with embedded artifact metadata"""
id: UUID
package_id: UUID
name: str
artifact_id: str
created_at: datetime
created_by: str
version: Optional[str] = None # Version of the artifact this tag points to
# Artifact metadata
artifact_size: int
artifact_content_type: Optional[str]
artifact_original_name: Optional[str]
artifact_created_at: datetime
artifact_format_metadata: Optional[Dict[str, Any]] = None
class Config:
from_attributes = True
class TagHistoryResponse(BaseModel):
"""History entry for tag changes"""
id: UUID
tag_id: UUID
old_artifact_id: Optional[str]
new_artifact_id: str
changed_at: datetime
changed_by: str
class Config:
from_attributes = True
class TagHistoryDetailResponse(BaseModel):
"""Tag history with artifact metadata for each version"""
id: UUID
tag_id: UUID
tag_name: str
old_artifact_id: Optional[str]
new_artifact_id: str
changed_at: datetime
changed_by: str
# Artifact metadata for new artifact
artifact_size: int
artifact_original_name: Optional[str]
artifact_content_type: Optional[str]
class Config:
from_attributes = True
# Audit log schemas
class AuditLogResponse(BaseModel):
"""Audit log entry response"""
@@ -179,7 +264,7 @@ class UploadHistoryResponse(BaseModel):
package_name: str
project_name: str
original_name: Optional[str]
version: Optional[str]
tag_name: Optional[str]
uploaded_at: datetime
uploaded_by: str
source_ip: Optional[str]
@@ -210,10 +295,10 @@ class ArtifactProvenanceResponse(BaseModel):
# Usage statistics
upload_count: int
# References
packages: List[Dict[str, Any]] # List of {project_name, package_name, versions}
versions: List[
packages: List[Dict[str, Any]] # List of {project_name, package_name, tag_names}
tags: List[
Dict[str, Any]
] # List of {project_name, package_name, version, created_at}
] # List of {project_name, package_name, tag_name, created_at}
# Upload history
uploads: List[Dict[str, Any]] # List of upload events
@@ -221,8 +306,18 @@ class ArtifactProvenanceResponse(BaseModel):
from_attributes = True
class ArtifactTagInfo(BaseModel):
"""Tag info for embedding in artifact responses"""
id: UUID
name: str
package_id: UUID
package_name: str
project_name: str
class ArtifactDetailResponse(BaseModel):
"""Artifact with metadata"""
"""Artifact with list of tags/packages referencing it"""
id: str
sha256: str # Explicit SHA256 field (same as id)
@@ -236,14 +331,14 @@ class ArtifactDetailResponse(BaseModel):
created_by: str
ref_count: int
format_metadata: Optional[Dict[str, Any]] = None
versions: List[Dict[str, Any]] = [] # List of {version, package_name, project_name}
tags: List[ArtifactTagInfo] = []
class Config:
from_attributes = True
class PackageArtifactResponse(BaseModel):
"""Artifact for package artifact listing"""
"""Artifact with tags for package artifact listing"""
id: str
sha256: str # Explicit SHA256 field (same as id)
@@ -256,7 +351,7 @@ class PackageArtifactResponse(BaseModel):
created_at: datetime
created_by: str
format_metadata: Optional[Dict[str, Any]] = None
version: Optional[str] = None # Version from PackageVersion if exists
tags: List[str] = [] # Tag names pointing to this artifact
class Config:
from_attributes = True
@@ -274,9 +369,28 @@ class GlobalArtifactResponse(BaseModel):
created_by: str
format_metadata: Optional[Dict[str, Any]] = None
ref_count: int = 0
# Context from versions/packages
# Context from tags/packages
projects: List[str] = [] # List of project names containing this artifact
packages: List[str] = [] # List of "project/package" paths
tags: List[str] = [] # List of "project/package:tag" references
class Config:
from_attributes = True
class GlobalTagResponse(BaseModel):
"""Tag with project/package context for global listing"""
id: UUID
name: str
artifact_id: str
created_at: datetime
created_by: str
project_name: str
package_name: str
artifact_size: Optional[int] = None
artifact_content_type: Optional[str] = None
version: Optional[str] = None # Version of the artifact this tag points to
class Config:
from_attributes = True
@@ -289,6 +403,7 @@ class UploadResponse(BaseModel):
size: int
project: str
package: str
tag: Optional[str]
version: Optional[str] = None # Version assigned to this artifact
version_source: Optional[str] = None # How version was determined: 'explicit', 'filename', 'metadata'
checksum_md5: Optional[str] = None
@@ -315,6 +430,7 @@ class ResumableUploadInitRequest(BaseModel):
filename: str
content_type: Optional[str] = None
size: int
tag: Optional[str] = None
version: Optional[str] = None # Explicit version (auto-detected if not provided)
@field_validator("expected_hash")
@@ -349,7 +465,7 @@ class ResumableUploadPartResponse(BaseModel):
class ResumableUploadCompleteRequest(BaseModel):
"""Request to complete a resumable upload"""
pass
tag: Optional[str] = None
class ResumableUploadCompleteResponse(BaseModel):
@@ -359,6 +475,7 @@ class ResumableUploadCompleteResponse(BaseModel):
size: int
project: str
package: str
tag: Optional[str]
class ResumableUploadStatusResponse(BaseModel):
@@ -411,6 +528,7 @@ class PackageVersionResponse(BaseModel):
size: Optional[int] = None
content_type: Optional[str] = None
original_name: Optional[str] = None
tags: List[str] = [] # Tag names pointing to this artifact
class Config:
from_attributes = True
@@ -452,10 +570,11 @@ class SearchResultPackage(BaseModel):
class SearchResultArtifact(BaseModel):
"""Artifact result for global search"""
"""Artifact/tag result for global search"""
tag_id: UUID
tag_name: str
artifact_id: str
version: Optional[str]
package_id: UUID
package_name: str
project_name: str
@@ -493,8 +612,6 @@ class HealthResponse(BaseModel):
version: str = "1.0.0"
storage_healthy: Optional[bool] = None
database_healthy: Optional[bool] = None
http_pool: Optional[Dict[str, Any]] = None
cache: Optional[Dict[str, Any]] = None
# Garbage collection schemas
@@ -570,7 +687,7 @@ class ProjectStatsResponse(BaseModel):
project_id: str
project_name: str
package_count: int
version_count: int
tag_count: int
artifact_count: int
total_size_bytes: int
upload_count: int
@@ -585,7 +702,7 @@ class PackageStatsResponse(BaseModel):
package_id: str
package_name: str
project_name: str
version_count: int
tag_count: int
artifact_count: int
total_size_bytes: int
upload_count: int
@@ -602,9 +719,9 @@ class ArtifactStatsResponse(BaseModel):
size: int
ref_count: int
storage_savings: int # (ref_count - 1) * size
tags: List[Dict[str, Any]] # Tags referencing this artifact
projects: List[str] # Projects using this artifact
packages: List[str] # Packages using this artifact
versions: List[Dict[str, Any]] = [] # List of {version, package_name, project_name}
first_uploaded: Optional[datetime] = None
last_referenced: Optional[datetime] = None
@@ -813,7 +930,20 @@ class DependencyCreate(BaseModel):
"""Schema for creating a dependency"""
project: str
package: str
version: str
version: Optional[str] = None
tag: Optional[str] = None
@field_validator('version', 'tag')
@classmethod
def validate_constraint(cls, v, info):
return v
def model_post_init(self, __context):
"""Validate that exactly one of version or tag is set"""
if self.version is None and self.tag is None:
raise ValueError("Either 'version' or 'tag' must be specified")
if self.version is not None and self.tag is not None:
raise ValueError("Cannot specify both 'version' and 'tag'")
class DependencyResponse(BaseModel):
@@ -822,7 +952,8 @@ class DependencyResponse(BaseModel):
artifact_id: str
project: str
package: str
version: str
version: Optional[str] = None
tag: Optional[str] = None
created_at: datetime
class Config:
@@ -837,6 +968,7 @@ class DependencyResponse(BaseModel):
project=dep.dependency_project,
package=dep.dependency_package,
version=dep.version_constraint,
tag=dep.tag_constraint,
created_at=dep.created_at,
)
@@ -853,6 +985,7 @@ class DependentInfo(BaseModel):
project: str
package: str
version: Optional[str] = None
constraint_type: str # 'version' or 'tag'
constraint_value: str
@@ -868,7 +1001,20 @@ class EnsureFileDependency(BaseModel):
"""Dependency entry from orchard.ensure file"""
project: str
package: str
version: str
version: Optional[str] = None
tag: Optional[str] = None
@field_validator('version', 'tag')
@classmethod
def validate_constraint(cls, v, info):
return v
def model_post_init(self, __context):
"""Validate that exactly one of version or tag is set"""
if self.version is None and self.tag is None:
raise ValueError("Either 'version' or 'tag' must be specified")
if self.version is not None and self.tag is not None:
raise ValueError("Cannot specify both 'version' and 'tag'")
class EnsureFileContent(BaseModel):
@@ -882,6 +1028,7 @@ class ResolvedArtifact(BaseModel):
project: str
package: str
version: Optional[str] = None
tag: Optional[str] = None
size: int
download_url: str
@@ -892,8 +1039,6 @@ class MissingDependency(BaseModel):
package: str
constraint: Optional[str] = None
required_by: Optional[str] = None
fetch_attempted: bool = False # True if auto-fetch was attempted
fetch_error: Optional[str] = None # Error message if fetch failed
class DependencyResolutionResponse(BaseModel):
@@ -901,7 +1046,6 @@ class DependencyResolutionResponse(BaseModel):
requested: Dict[str, str] # project, package, ref
resolved: List[ResolvedArtifact]
missing: List[MissingDependency] = []
fetched: List[ResolvedArtifact] = [] # Artifacts fetched from upstream during resolution
total_size: int
artifact_count: int
@@ -910,7 +1054,7 @@ class DependencyConflict(BaseModel):
"""Details about a dependency conflict"""
project: str
package: str
requirements: List[Dict[str, Any]] # version and required_by info
requirements: List[Dict[str, Any]] # version/tag and required_by info
class DependencyConflictError(BaseModel):
@@ -1244,10 +1388,10 @@ class CacheRequest(BaseModel):
url: str
source_type: str
package_name: Optional[str] = None # Auto-derived from URL if not provided
version: Optional[str] = None # Auto-derived from URL if not provided
tag: Optional[str] = None # Auto-derived from URL if not provided
user_project: Optional[str] = None # Cross-reference to user project
user_package: Optional[str] = None
user_version: Optional[str] = None
user_tag: Optional[str] = None
expected_hash: Optional[str] = None # Verify downloaded content
@field_validator('url')
@@ -1294,8 +1438,8 @@ class CacheResponse(BaseModel):
source_name: Optional[str]
system_project: str
system_package: str
system_version: Optional[str]
user_reference: Optional[str] = None # e.g., "my-app/npm-deps/+/4.17.21"
system_tag: Optional[str]
user_reference: Optional[str] = None # e.g., "my-app/npm-deps:lodash-4.17.21"
class CacheResolveRequest(BaseModel):
@@ -1309,7 +1453,7 @@ class CacheResolveRequest(BaseModel):
version: str
user_project: Optional[str] = None
user_package: Optional[str] = None
user_version: Optional[str] = None
user_tag: Optional[str] = None
@field_validator('source_type')
@classmethod

View File

@@ -5,7 +5,7 @@ import hashlib
import logging
from sqlalchemy.orm import Session
from .models import Project, Package, Artifact, Upload, PackageVersion, ArtifactDependency, Team, TeamMembership, User
from .models import Project, Package, Artifact, Tag, Upload, PackageVersion, ArtifactDependency, Team, TeamMembership, User
from .storage import get_storage
from .auth import hash_password
@@ -125,14 +125,14 @@ TEST_ARTIFACTS = [
]
# Dependencies to create (source artifact -> dependency)
# Format: (source_project, source_package, source_version, dep_project, dep_package, version_constraint)
# Format: (source_project, source_package, source_version, dep_project, dep_package, version_constraint, tag_constraint)
TEST_DEPENDENCIES = [
# ui-components v1.1.0 depends on design-tokens v1.0.0
("frontend-libs", "ui-components", "1.1.0", "frontend-libs", "design-tokens", "1.0.0"),
("frontend-libs", "ui-components", "1.1.0", "frontend-libs", "design-tokens", "1.0.0", None),
# auth-lib v1.0.0 depends on common-utils v2.0.0
("backend-services", "auth-lib", "1.0.0", "backend-services", "common-utils", "2.0.0"),
# auth-lib v1.0.0 also depends on design-tokens v1.0.0
("backend-services", "auth-lib", "1.0.0", "frontend-libs", "design-tokens", "1.0.0"),
("backend-services", "auth-lib", "1.0.0", "backend-services", "common-utils", "2.0.0", None),
# auth-lib v1.0.0 also depends on design-tokens (stable tag)
("backend-services", "auth-lib", "1.0.0", "frontend-libs", "design-tokens", None, "latest"),
]
@@ -252,8 +252,9 @@ def seed_database(db: Session) -> None:
logger.info(f"Created {len(project_map)} projects and {len(package_map)} packages (assigned to {demo_team.slug})")
# Create artifacts and versions
# Create artifacts, tags, and versions
artifact_count = 0
tag_count = 0
version_count = 0
for artifact_data in TEST_ARTIFACTS:
@@ -315,12 +316,23 @@ def seed_database(db: Session) -> None:
db.add(version)
version_count += 1
# Create tags
for tag_name in artifact_data["tags"]:
tag = Tag(
package_id=package.id,
name=tag_name,
artifact_id=sha256_hash,
created_by=team_owner_username,
)
db.add(tag)
tag_count += 1
db.flush()
# Create dependencies
dependency_count = 0
for dep_data in TEST_DEPENDENCIES:
src_project, src_package, src_version, dep_project, dep_package, version_constraint = dep_data
src_project, src_package, src_version, dep_project, dep_package, version_constraint, tag_constraint = dep_data
# Find the source artifact by looking up its version
src_pkg = package_map.get((src_project, src_package))
@@ -344,10 +356,11 @@ def seed_database(db: Session) -> None:
dependency_project=dep_project,
dependency_package=dep_package,
version_constraint=version_constraint,
tag_constraint=tag_constraint,
)
db.add(dependency)
dependency_count += 1
db.commit()
logger.info(f"Created {artifact_count} artifacts, {version_count} versions, and {dependency_count} dependencies")
logger.info(f"Created {artifact_count} artifacts, {tag_count} tags, {version_count} versions, and {dependency_count} dependencies")
logger.info("Database seeding complete")

View File

@@ -6,8 +6,9 @@ from typing import List, Optional, Tuple
from sqlalchemy.orm import Session
import logging
from ..models import Artifact, PackageVersion
from ..models import Artifact, Tag
from ..repositories.artifact import ArtifactRepository
from ..repositories.tag import TagRepository
from ..storage import S3Storage
logger = logging.getLogger(__name__)
@@ -20,8 +21,8 @@ class ArtifactCleanupService:
Reference counting rules:
- ref_count starts at 1 when artifact is first uploaded
- ref_count increments when the same artifact is uploaded again (deduplication)
- ref_count decrements when a version is deleted or updated to point elsewhere
- ref_count decrements when a package is deleted (for each version pointing to artifact)
- ref_count decrements when a tag is deleted or updated to point elsewhere
- ref_count decrements when a package is deleted (for each tag pointing to artifact)
- When ref_count reaches 0, artifact is a candidate for deletion from S3
"""
@@ -29,11 +30,12 @@ class ArtifactCleanupService:
self.db = db
self.storage = storage
self.artifact_repo = ArtifactRepository(db)
self.tag_repo = TagRepository(db)
def on_version_deleted(self, artifact_id: str) -> Artifact:
def on_tag_deleted(self, artifact_id: str) -> Artifact:
"""
Called when a version is deleted.
Decrements ref_count for the artifact the version was pointing to.
Called when a tag is deleted.
Decrements ref_count for the artifact the tag was pointing to.
"""
artifact = self.artifact_repo.get_by_sha256(artifact_id)
if artifact:
@@ -43,11 +45,11 @@ class ArtifactCleanupService:
)
return artifact
def on_version_updated(
def on_tag_updated(
self, old_artifact_id: str, new_artifact_id: str
) -> Tuple[Optional[Artifact], Optional[Artifact]]:
"""
Called when a version is updated to point to a different artifact.
Called when a tag is updated to point to a different artifact.
Decrements ref_count for old artifact, increments for new (if different).
Returns (old_artifact, new_artifact) tuple.
@@ -77,21 +79,21 @@ class ArtifactCleanupService:
def on_package_deleted(self, package_id) -> List[str]:
"""
Called when a package is deleted.
Decrements ref_count for all artifacts that had versions in the package.
Decrements ref_count for all artifacts that had tags in the package.
Returns list of artifact IDs that were affected.
"""
# Get all versions in the package before deletion
versions = self.db.query(PackageVersion).filter(PackageVersion.package_id == package_id).all()
# Get all tags in the package before deletion
tags = self.db.query(Tag).filter(Tag.package_id == package_id).all()
affected_artifacts = []
for version in versions:
artifact = self.artifact_repo.get_by_sha256(version.artifact_id)
for tag in tags:
artifact = self.artifact_repo.get_by_sha256(tag.artifact_id)
if artifact:
self.artifact_repo.decrement_ref_count(artifact)
affected_artifacts.append(version.artifact_id)
affected_artifacts.append(tag.artifact_id)
logger.info(
f"Decremented ref_count for artifact {version.artifact_id} (package delete)"
f"Decremented ref_count for artifact {tag.artifact_id} (package delete)"
)
return affected_artifacts
@@ -150,7 +152,7 @@ class ArtifactCleanupService:
def verify_ref_counts(self, fix: bool = False) -> List[dict]:
"""
Verify that ref_counts match actual version references.
Verify that ref_counts match actual tag references.
Args:
fix: If True, fix any mismatched ref_counts
@@ -160,28 +162,28 @@ class ArtifactCleanupService:
"""
from sqlalchemy import func
# Get actual version counts per artifact
version_counts = (
self.db.query(PackageVersion.artifact_id, func.count(PackageVersion.id).label("version_count"))
.group_by(PackageVersion.artifact_id)
# Get actual tag counts per artifact
tag_counts = (
self.db.query(Tag.artifact_id, func.count(Tag.id).label("tag_count"))
.group_by(Tag.artifact_id)
.all()
)
version_count_map = {artifact_id: count for artifact_id, count in version_counts}
tag_count_map = {artifact_id: count for artifact_id, count in tag_counts}
# Check all artifacts
artifacts = self.db.query(Artifact).all()
mismatches = []
for artifact in artifacts:
actual_count = version_count_map.get(artifact.id, 0)
actual_count = tag_count_map.get(artifact.id, 0)
# ref_count should be at least 1 (initial upload) + additional uploads
# But versions are the primary reference, so we check against version count
# But tags are the primary reference, so we check against tag count
if artifact.ref_count < actual_count:
mismatch = {
"artifact_id": artifact.id,
"stored_ref_count": artifact.ref_count,
"actual_version_count": actual_count,
"actual_tag_count": actual_count,
}
mismatches.append(mismatch)

View File

@@ -12,7 +12,6 @@ passlib[bcrypt]==1.7.4
bcrypt==4.0.1
slowapi==0.1.9
httpx>=0.25.0
redis>=5.0.0
# Test dependencies
pytest>=7.4.0

View File

@@ -96,6 +96,7 @@ def upload_test_file(
package: str,
content: bytes,
filename: str = "test.bin",
tag: Optional[str] = None,
version: Optional[str] = None,
) -> dict:
"""
@@ -107,6 +108,7 @@ def upload_test_file(
package: Package name
content: File content as bytes
filename: Original filename
tag: Optional tag to assign
version: Optional version to assign
Returns:
@@ -114,6 +116,8 @@ def upload_test_file(
"""
files = {"file": (filename, io.BytesIO(content), "application/octet-stream")}
data = {}
if tag:
data["tag"] = tag
if version:
data["version"] = version

View File

@@ -25,7 +25,7 @@ class TestArtifactRetrieval:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project_name, package_name, content, version="v1"
integration_client, project_name, package_name, content, tag="v1"
)
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
@@ -46,27 +46,27 @@ class TestArtifactRetrieval:
assert response.status_code == 404
@pytest.mark.integration
def test_artifact_includes_versions(self, integration_client, test_package):
"""Test artifact response includes versions pointing to it."""
def test_artifact_includes_tags(self, integration_client, test_package):
"""Test artifact response includes tags pointing to it."""
project_name, package_name = test_package
content = b"artifact with versions test"
content = b"artifact with tags test"
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project_name, package_name, content, version="1.0.0"
integration_client, project_name, package_name, content, tag="tagged-v1"
)
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.status_code == 200
data = response.json()
assert "versions" in data
assert len(data["versions"]) >= 1
assert "tags" in data
assert len(data["tags"]) >= 1
version = data["versions"][0]
assert "version" in version
assert "package_name" in version
assert "project_name" in version
tag = data["tags"][0]
assert "name" in tag
assert "package_name" in tag
assert "project_name" in tag
class TestArtifactStats:
@@ -82,7 +82,7 @@ class TestArtifactStats:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version=f"art-{unique_test_id}"
integration_client, project, package, content, tag=f"art-{unique_test_id}"
)
response = integration_client.get(f"/api/v1/artifact/{expected_hash}/stats")
@@ -94,7 +94,7 @@ class TestArtifactStats:
assert "size" in data
assert "ref_count" in data
assert "storage_savings" in data
assert "versions" in data
assert "tags" in data
assert "projects" in data
assert "packages" in data
@@ -136,8 +136,8 @@ class TestArtifactStats:
)
# Upload same content to both projects
upload_test_file(integration_client, proj1, "pkg", content, version="v1")
upload_test_file(integration_client, proj2, "pkg", content, version="v1")
upload_test_file(integration_client, proj1, "pkg", content, tag="v1")
upload_test_file(integration_client, proj2, "pkg", content, tag="v1")
# Check artifact stats
response = integration_client.get(f"/api/v1/artifact/{expected_hash}/stats")
@@ -203,7 +203,7 @@ class TestArtifactProvenance:
assert "first_uploaded_by" in data
assert "upload_count" in data
assert "packages" in data
assert "versions" in data
assert "tags" in data
assert "uploads" in data
@pytest.mark.integration
@@ -214,17 +214,17 @@ class TestArtifactProvenance:
assert response.status_code == 404
@pytest.mark.integration
def test_artifact_history_with_version(self, integration_client, test_package):
"""Test artifact history includes version information when versioned."""
def test_artifact_history_with_tag(self, integration_client, test_package):
"""Test artifact history includes tag information when tagged."""
project_name, package_name = test_package
upload_result = upload_test_file(
integration_client,
project_name,
package_name,
b"versioned provenance test",
"versioned.txt",
version="v1.0.0",
b"tagged provenance test",
"tagged.txt",
tag="v1.0.0",
)
artifact_id = upload_result["artifact_id"]
@@ -232,12 +232,12 @@ class TestArtifactProvenance:
assert response.status_code == 200
data = response.json()
assert len(data["versions"]) >= 1
assert len(data["tags"]) >= 1
version = data["versions"][0]
assert "project_name" in version
assert "package_name" in version
assert "version" in version
tag = data["tags"][0]
assert "project_name" in tag
assert "package_name" in tag
assert "tag_name" in tag
class TestArtifactUploads:
@@ -306,24 +306,24 @@ class TestOrphanedArtifacts:
assert len(response.json()) <= 5
@pytest.mark.integration
def test_artifact_becomes_orphaned_when_version_deleted(
def test_artifact_becomes_orphaned_when_tag_deleted(
self, integration_client, test_package, unique_test_id
):
"""Test artifact appears in orphaned list after version is deleted."""
"""Test artifact appears in orphaned list after tag is deleted."""
project, package = test_package
content = f"orphan test {unique_test_id}".encode()
expected_hash = compute_sha256(content)
# Upload with version
upload_test_file(integration_client, project, package, content, version="1.0.0-temp")
# Upload with tag
upload_test_file(integration_client, project, package, content, tag="temp-tag")
# Verify not in orphaned list
response = integration_client.get("/api/v1/admin/orphaned-artifacts?limit=1000")
orphaned_ids = [a["id"] for a in response.json()]
assert expected_hash not in orphaned_ids
# Delete the version
integration_client.delete(f"/api/v1/project/{project}/{package}/versions/1.0.0-temp")
# Delete the tag
integration_client.delete(f"/api/v1/project/{project}/{package}/tags/temp-tag")
# Verify now in orphaned list
response = integration_client.get("/api/v1/admin/orphaned-artifacts?limit=1000")
@@ -356,9 +356,9 @@ class TestGarbageCollection:
content = f"dry run test {unique_test_id}".encode()
expected_hash = compute_sha256(content)
# Upload and delete version to create orphan
upload_test_file(integration_client, project, package, content, version="1.0.0-dryrun")
integration_client.delete(f"/api/v1/project/{project}/{package}/versions/1.0.0-dryrun")
# Upload and delete tag to create orphan
upload_test_file(integration_client, project, package, content, tag="dry-run")
integration_client.delete(f"/api/v1/project/{project}/{package}/tags/dry-run")
# Verify artifact exists
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
@@ -385,7 +385,7 @@ class TestGarbageCollection:
expected_hash = compute_sha256(content)
# Upload with tag (ref_count=1)
upload_test_file(integration_client, project, package, content, version="keep-this")
upload_test_file(integration_client, project, package, content, tag="keep-this")
# Verify artifact exists with ref_count=1
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
@@ -534,6 +534,50 @@ class TestGlobalArtifacts:
assert response.status_code == 400
class TestGlobalTags:
"""Tests for global tags endpoint."""
@pytest.mark.integration
def test_global_tags_returns_200(self, integration_client):
"""Test global tags endpoint returns 200."""
response = integration_client.get("/api/v1/tags")
assert response.status_code == 200
data = response.json()
assert "items" in data
assert "pagination" in data
@pytest.mark.integration
def test_global_tags_pagination(self, integration_client):
"""Test global tags endpoint respects pagination."""
response = integration_client.get("/api/v1/tags?limit=5&page=1")
assert response.status_code == 200
data = response.json()
assert len(data["items"]) <= 5
assert data["pagination"]["limit"] == 5
@pytest.mark.integration
def test_global_tags_has_project_context(self, integration_client):
"""Test global tags response includes project/package context."""
response = integration_client.get("/api/v1/tags?limit=1")
assert response.status_code == 200
data = response.json()
if len(data["items"]) > 0:
item = data["items"][0]
assert "project_name" in item
assert "package_name" in item
assert "artifact_id" in item
@pytest.mark.integration
def test_global_tags_search_with_wildcard(self, integration_client):
"""Test global tags search supports wildcards."""
response = integration_client.get("/api/v1/tags?search=v*")
assert response.status_code == 200
# Just verify it doesn't error; results may vary
class TestAuditLogs:
"""Tests for global audit logs endpoint."""

View File

@@ -63,7 +63,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"concurrent-{idx}"},
data={"tag": f"concurrent-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -117,7 +117,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"concurrent5-{idx}"},
data={"tag": f"concurrent5-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -171,7 +171,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"concurrent10-{idx}"},
data={"tag": f"concurrent10-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -195,38 +195,19 @@ class TestConcurrentUploads:
@pytest.mark.integration
@pytest.mark.concurrent
def test_concurrent_uploads_same_file_deduplication(
self, integration_client, test_project, unique_test_id
):
"""Test concurrent uploads of same file handle deduplication correctly.
Same content uploaded to different packages should result in:
- Same artifact_id (content-addressable)
- ref_count = number of packages (one version per package)
"""
project = test_project
def test_concurrent_uploads_same_file_deduplication(self, integration_client, test_package):
"""Test concurrent uploads of same file handle deduplication correctly."""
project, package = test_package
api_key = get_api_key(integration_client)
assert api_key, "Failed to create API key"
num_concurrent = 5
package_names = []
# Create multiple packages for concurrent uploads
for i in range(num_concurrent):
pkg_name = f"dedup-pkg-{unique_test_id}-{i}"
response = integration_client.post(
f"/api/v1/project/{project}/packages",
json={"name": pkg_name, "description": f"Dedup test package {i}"},
)
assert response.status_code == 200
package_names.append(pkg_name)
content, expected_hash = generate_content_with_hash(4096, seed=999)
num_concurrent = 5
results = []
errors = []
def upload_worker(idx, package):
def upload_worker(idx):
try:
from httpx import Client
base_url = os.environ.get("ORCHARD_TEST_URL", "http://localhost:8080")
@@ -238,7 +219,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": f"dedup-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -249,10 +230,7 @@ class TestConcurrentUploads:
errors.append(f"Worker {idx}: {str(e)}")
with ThreadPoolExecutor(max_workers=num_concurrent) as executor:
futures = [
executor.submit(upload_worker, i, package_names[i])
for i in range(num_concurrent)
]
futures = [executor.submit(upload_worker, i) for i in range(num_concurrent)]
for future in as_completed(futures):
pass
@@ -264,7 +242,7 @@ class TestConcurrentUploads:
assert len(artifact_ids) == 1
assert expected_hash in artifact_ids
# Verify final ref_count equals number of packages
# Verify final ref_count equals number of uploads
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.status_code == 200
assert response.json()["ref_count"] == num_concurrent
@@ -309,7 +287,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "latest"},
data={"tag": "latest"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -343,7 +321,7 @@ class TestConcurrentDownloads:
content, expected_hash = generate_content_with_hash(2048, seed=400)
# Upload first
upload_test_file(integration_client, project, package, content, version="download-test")
upload_test_file(integration_client, project, package, content, tag="download-test")
results = []
errors = []
@@ -384,7 +362,7 @@ class TestConcurrentDownloads:
project, package = test_package
content, expected_hash = generate_content_with_hash(4096, seed=500)
upload_test_file(integration_client, project, package, content, version="download5-test")
upload_test_file(integration_client, project, package, content, tag="download5-test")
num_downloads = 5
results = []
@@ -425,7 +403,7 @@ class TestConcurrentDownloads:
project, package = test_package
content, expected_hash = generate_content_with_hash(8192, seed=600)
upload_test_file(integration_client, project, package, content, version="download10-test")
upload_test_file(integration_client, project, package, content, tag="download10-test")
num_downloads = 10
results = []
@@ -472,7 +450,7 @@ class TestConcurrentDownloads:
content, expected_hash = generate_content_with_hash(1024, seed=700 + i)
upload_test_file(
integration_client, project, package, content,
version=f"multi-download-{i}"
tag=f"multi-download-{i}"
)
uploads.append((f"multi-download-{i}", content))
@@ -524,7 +502,7 @@ class TestMixedConcurrentOperations:
# Upload initial content
content1, hash1 = generate_content_with_hash(10240, seed=800) # 10KB
upload_test_file(integration_client, project, package, content1, version="initial")
upload_test_file(integration_client, project, package, content1, tag="initial")
# New content for upload during download
content2, hash2 = generate_content_with_hash(10240, seed=801)
@@ -561,7 +539,7 @@ class TestMixedConcurrentOperations:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "during-download"},
data={"tag": "during-download"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -601,7 +579,7 @@ class TestMixedConcurrentOperations:
existing_files = []
for i in range(3):
content, hash = generate_content_with_hash(2048, seed=900 + i)
upload_test_file(integration_client, project, package, content, version=f"existing-{i}")
upload_test_file(integration_client, project, package, content, tag=f"existing-{i}")
existing_files.append((f"existing-{i}", content))
# New files for uploading
@@ -641,7 +619,7 @@ class TestMixedConcurrentOperations:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"new-{idx}"},
data={"tag": f"new-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -711,7 +689,7 @@ class TestMixedConcurrentOperations:
upload_resp = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"pattern-{idx}"},
data={"tag": f"pattern-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if upload_resp.status_code != 200:

View File

@@ -68,7 +68,7 @@ class TestUploadErrorHandling:
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
data={"version": "no-file-provided"},
data={"tag": "no-file-provided"},
)
assert response.status_code == 422
@@ -200,7 +200,7 @@ class TestTimeoutBehavior:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content, version="timeout-test"
integration_client, project, package, content, tag="timeout-test"
)
elapsed = time.time() - start_time
@@ -219,7 +219,7 @@ class TestTimeoutBehavior:
# First upload
upload_test_file(
integration_client, project, package, content, version="download-timeout-test"
integration_client, project, package, content, tag="download-timeout-test"
)
# Then download and time it

View File

@@ -41,7 +41,7 @@ class TestRoundTripVerification:
# Upload and capture returned hash
result = upload_test_file(
integration_client, project, package, content, version="roundtrip"
integration_client, project, package, content, tag="roundtrip"
)
uploaded_hash = result["artifact_id"]
@@ -84,7 +84,7 @@ class TestRoundTripVerification:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="header-check"
integration_client, project, package, content, tag="header-check"
)
response = integration_client.get(
@@ -102,7 +102,7 @@ class TestRoundTripVerification:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="etag-check"
integration_client, project, package, content, tag="etag-check"
)
response = integration_client.get(
@@ -186,7 +186,7 @@ class TestClientSideVerificationWorkflow:
content = b"Client post-download verification"
upload_test_file(
integration_client, project, package, content, version="verify-after"
integration_client, project, package, content, tag="verify-after"
)
response = integration_client.get(
@@ -215,7 +215,7 @@ class TestIntegritySizeVariants:
content, expected_hash = sized_content(SIZE_1KB, seed=100)
result = upload_test_file(
integration_client, project, package, content, version="int-1kb"
integration_client, project, package, content, tag="int-1kb"
)
assert result["artifact_id"] == expected_hash
@@ -234,7 +234,7 @@ class TestIntegritySizeVariants:
content, expected_hash = sized_content(SIZE_100KB, seed=101)
result = upload_test_file(
integration_client, project, package, content, version="int-100kb"
integration_client, project, package, content, tag="int-100kb"
)
assert result["artifact_id"] == expected_hash
@@ -253,7 +253,7 @@ class TestIntegritySizeVariants:
content, expected_hash = sized_content(SIZE_1MB, seed=102)
result = upload_test_file(
integration_client, project, package, content, version="int-1mb"
integration_client, project, package, content, tag="int-1mb"
)
assert result["artifact_id"] == expected_hash
@@ -273,7 +273,7 @@ class TestIntegritySizeVariants:
content, expected_hash = sized_content(SIZE_10MB, seed=103)
result = upload_test_file(
integration_client, project, package, content, version="int-10mb"
integration_client, project, package, content, tag="int-10mb"
)
assert result["artifact_id"] == expected_hash
@@ -323,13 +323,7 @@ class TestConsistencyCheck:
@pytest.mark.integration
def test_consistency_check_after_upload(self, integration_client, test_package):
"""Test consistency check runs successfully after a valid upload.
Note: We don't assert healthy=True because other tests (especially
corruption detection tests) may leave orphaned S3 objects behind.
This test validates the consistency check endpoint works and the
uploaded artifact is included in the check count.
"""
"""Test consistency check passes after valid upload."""
project, package = test_package
content = b"Consistency check test content"
@@ -341,10 +335,9 @@ class TestConsistencyCheck:
assert response.status_code == 200
data = response.json()
# Verify check ran - at least 1 artifact was checked
# Verify check ran and no issues
assert data["total_artifacts_checked"] >= 1
# Verify no missing S3 objects (uploaded artifact should exist)
assert data["missing_s3_objects"] == 0
assert data["healthy"] is True
@pytest.mark.integration
def test_consistency_check_limit_parameter(self, integration_client):
@@ -373,7 +366,7 @@ class TestDigestHeader:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="digest-test"
integration_client, project, package, content, tag="digest-test"
)
response = integration_client.get(
@@ -397,7 +390,7 @@ class TestDigestHeader:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="digest-b64"
integration_client, project, package, content, tag="digest-b64"
)
response = integration_client.get(
@@ -427,7 +420,7 @@ class TestVerificationModes:
content = b"Pre-verification mode test"
upload_test_file(
integration_client, project, package, content, version="pre-verify"
integration_client, project, package, content, tag="pre-verify"
)
response = integration_client.get(
@@ -447,7 +440,7 @@ class TestVerificationModes:
content = b"Stream verification mode test"
upload_test_file(
integration_client, project, package, content, version="stream-verify"
integration_client, project, package, content, tag="stream-verify"
)
response = integration_client.get(
@@ -484,7 +477,7 @@ class TestArtifactIntegrityEndpoint:
expected_size = len(content)
upload_test_file(
integration_client, project, package, content, version="content-len"
integration_client, project, package, content, tag="content-len"
)
response = integration_client.get(
@@ -520,7 +513,7 @@ class TestCorruptionDetection:
# Upload original content
result = upload_test_file(
integration_client, project, package, content, version="corrupt-test"
integration_client, project, package, content, tag="corrupt-test"
)
assert result["artifact_id"] == expected_hash
@@ -562,7 +555,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="bitflip-test"
integration_client, project, package, content, tag="bitflip-test"
)
assert result["artifact_id"] == expected_hash
@@ -599,7 +592,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="truncate-test"
integration_client, project, package, content, tag="truncate-test"
)
assert result["artifact_id"] == expected_hash
@@ -634,7 +627,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="append-test"
integration_client, project, package, content, tag="append-test"
)
assert result["artifact_id"] == expected_hash
@@ -677,7 +670,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="client-detect"
integration_client, project, package, content, tag="client-detect"
)
# Corrupt the S3 object
@@ -720,7 +713,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="size-mismatch"
integration_client, project, package, content, tag="size-mismatch"
)
# Modify S3 object to have different size
@@ -754,7 +747,7 @@ class TestCorruptionDetection:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project, package, content, version="missing-s3"
integration_client, project, package, content, tag="missing-s3"
)
# Delete the S3 object

View File

@@ -41,7 +41,7 @@ class TestUploadMetrics:
content = b"duration test content"
result = upload_test_file(
integration_client, project, package, content, version="duration-test"
integration_client, project, package, content, tag="duration-test"
)
assert "duration_ms" in result
@@ -55,7 +55,7 @@ class TestUploadMetrics:
content = b"throughput test content"
result = upload_test_file(
integration_client, project, package, content, version="throughput-test"
integration_client, project, package, content, tag="throughput-test"
)
assert "throughput_mbps" in result
@@ -72,7 +72,7 @@ class TestUploadMetrics:
start = time.time()
result = upload_test_file(
integration_client, project, package, content, version="duration-check"
integration_client, project, package, content, tag="duration-check"
)
actual_duration = (time.time() - start) * 1000 # ms
@@ -92,7 +92,7 @@ class TestLargeFileUploads:
content, expected_hash = sized_content(SIZE_10MB, seed=200)
result = upload_test_file(
integration_client, project, package, content, version="large-10mb"
integration_client, project, package, content, tag="large-10mb"
)
assert result["artifact_id"] == expected_hash
@@ -109,7 +109,7 @@ class TestLargeFileUploads:
content, expected_hash = sized_content(SIZE_100MB, seed=300)
result = upload_test_file(
integration_client, project, package, content, version="large-100mb"
integration_client, project, package, content, tag="large-100mb"
)
assert result["artifact_id"] == expected_hash
@@ -126,7 +126,7 @@ class TestLargeFileUploads:
content, expected_hash = sized_content(SIZE_1GB, seed=400)
result = upload_test_file(
integration_client, project, package, content, version="large-1gb"
integration_client, project, package, content, tag="large-1gb"
)
assert result["artifact_id"] == expected_hash
@@ -147,14 +147,14 @@ class TestLargeFileUploads:
# First upload
result1 = upload_test_file(
integration_client, project, package, content, version=f"dedup-{unique_test_id}-1"
integration_client, project, package, content, tag=f"dedup-{unique_test_id}-1"
)
# Note: may be True if previous test uploaded same content
first_dedupe = result1["deduplicated"]
# Second upload of same content
result2 = upload_test_file(
integration_client, project, package, content, version=f"dedup-{unique_test_id}-2"
integration_client, project, package, content, tag=f"dedup-{unique_test_id}-2"
)
assert result2["artifact_id"] == expected_hash
# Second upload MUST be deduplicated
@@ -277,7 +277,7 @@ class TestUploadSizeLimits:
content = b"X"
result = upload_test_file(
integration_client, project, package, content, version="min-size"
integration_client, project, package, content, tag="min-size"
)
assert result["size"] == 1
@@ -289,7 +289,7 @@ class TestUploadSizeLimits:
content = b"content length verification test"
result = upload_test_file(
integration_client, project, package, content, version="content-length-test"
integration_client, project, package, content, tag="content-length-test"
)
# Size in response should match actual content length
@@ -336,7 +336,7 @@ class TestUploadErrorHandling:
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
data={"version": "no-file"},
data={"tag": "no-file"},
)
assert response.status_code == 422
@@ -459,7 +459,7 @@ class TestUploadTimeout:
# httpx client should handle this quickly
result = upload_test_file(
integration_client, project, package, content, version="timeout-small"
integration_client, project, package, content, tag="timeout-small"
)
assert result["artifact_id"] is not None
@@ -474,7 +474,7 @@ class TestUploadTimeout:
start = time.time()
result = upload_test_file(
integration_client, project, package, content, version="timeout-check"
integration_client, project, package, content, tag="timeout-check"
)
duration = time.time() - start
@@ -525,7 +525,7 @@ class TestConcurrentUploads:
response = client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": f"concurrent-diff-{idx}"},
data={"tag": f"concurrent-diff-{idx}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:

View File

@@ -175,7 +175,7 @@ class TestPackageStats:
assert "package_id" in data
assert "package_name" in data
assert "project_name" in data
assert "version_count" in data
assert "tag_count" in data
assert "artifact_count" in data
assert "total_size_bytes" in data
assert "upload_count" in data
@@ -234,11 +234,7 @@ class TestPackageCascadeDelete:
def test_ref_count_decrements_on_package_delete(
self, integration_client, unique_test_id
):
"""Test ref_count decrements when package is deleted.
Each package can only have one version per artifact (same content = same version).
This test verifies that deleting a package decrements the artifact's ref_count.
"""
"""Test ref_count decrements for all tags when package is deleted."""
project_name = f"cascade-pkg-{unique_test_id}"
package_name = f"test-pkg-{unique_test_id}"
@@ -260,17 +256,23 @@ class TestPackageCascadeDelete:
)
assert response.status_code == 200
# Upload content with version
# Upload content with multiple tags
content = f"cascade delete test {unique_test_id}".encode()
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project_name, package_name, content, version="1.0.0"
integration_client, project_name, package_name, content, tag="v1"
)
upload_test_file(
integration_client, project_name, package_name, content, tag="v2"
)
upload_test_file(
integration_client, project_name, package_name, content, tag="v3"
)
# Verify ref_count is 1
# Verify ref_count is 3
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 1
assert response.json()["ref_count"] == 3
# Delete the package
delete_response = integration_client.delete(

View File

@@ -149,7 +149,7 @@ class TestProjectStats:
assert "project_id" in data
assert "project_name" in data
assert "package_count" in data
assert "version_count" in data
assert "tag_count" in data
assert "artifact_count" in data
assert "total_size_bytes" in data
assert "upload_count" in data
@@ -229,11 +229,7 @@ class TestProjectCascadeDelete:
def test_ref_count_decrements_on_project_delete(
self, integration_client, unique_test_id
):
"""Test ref_count decrements for all versions when project is deleted.
Each package can only have one version per artifact (same content = same version).
With 2 packages, ref_count should be 2, and go to 0 when project is deleted.
"""
"""Test ref_count decrements for all tags when project is deleted."""
project_name = f"cascade-proj-{unique_test_id}"
package1_name = f"pkg1-{unique_test_id}"
package2_name = f"pkg2-{unique_test_id}"
@@ -257,20 +253,26 @@ class TestProjectCascadeDelete:
)
assert response.status_code == 200
# Upload same content to both packages
# Upload same content with tags in both packages
content = f"project cascade test {unique_test_id}".encode()
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project_name, package1_name, content, version="1.0.0"
integration_client, project_name, package1_name, content, tag="v1"
)
upload_test_file(
integration_client, project_name, package2_name, content, version="1.0.0"
integration_client, project_name, package1_name, content, tag="v2"
)
upload_test_file(
integration_client, project_name, package2_name, content, tag="latest"
)
upload_test_file(
integration_client, project_name, package2_name, content, tag="stable"
)
# Verify ref_count is 2 (1 version in each of 2 packages)
# Verify ref_count is 4 (2 tags in each of 2 packages)
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 2
assert response.json()["ref_count"] == 4
# Delete the project
delete_response = integration_client.delete(f"/api/v1/projects/{project_name}")

View File

@@ -135,19 +135,3 @@ class TestPyPIPackageNormalization:
assert "text/html" in response.headers.get("content-type", "")
elif response.status_code == 503:
assert "No PyPI upstream sources configured" in response.json()["detail"]
class TestPyPIProxyInfrastructure:
"""Tests for PyPI proxy infrastructure integration."""
@pytest.mark.integration
def test_health_endpoint_includes_infrastructure(self, integration_client):
"""Health endpoint should report infrastructure status."""
response = integration_client.get("/health")
assert response.status_code == 200
data = response.json()
assert data["status"] == "ok"
# Infrastructure status should be present
assert "http_pool" in data
assert "cache" in data

View File

@@ -48,7 +48,7 @@ class TestSmallFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="1byte.bin", version="1byte"
filename="1byte.bin", tag="1byte"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_1B
@@ -70,7 +70,7 @@ class TestSmallFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="1kb.bin", version="1kb"
filename="1kb.bin", tag="1kb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_1KB
@@ -90,7 +90,7 @@ class TestSmallFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="10kb.bin", version="10kb"
filename="10kb.bin", tag="10kb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_10KB
@@ -110,7 +110,7 @@ class TestSmallFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="100kb.bin", version="100kb"
filename="100kb.bin", tag="100kb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_100KB
@@ -134,7 +134,7 @@ class TestMediumFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="1mb.bin", version="1mb"
filename="1mb.bin", tag="1mb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_1MB
@@ -155,7 +155,7 @@ class TestMediumFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="5mb.bin", version="5mb"
filename="5mb.bin", tag="5mb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_5MB
@@ -177,7 +177,7 @@ class TestMediumFileSizes:
result = upload_test_file(
integration_client, project, package, content,
filename="10mb.bin", version="10mb"
filename="10mb.bin", tag="10mb"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == SIZE_10MB
@@ -200,7 +200,7 @@ class TestMediumFileSizes:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content,
filename="50mb.bin", version="50mb"
filename="50mb.bin", tag="50mb"
)
upload_time = time.time() - start_time
@@ -240,7 +240,7 @@ class TestLargeFileSizes:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content,
filename="100mb.bin", version="100mb"
filename="100mb.bin", tag="100mb"
)
upload_time = time.time() - start_time
@@ -271,7 +271,7 @@ class TestLargeFileSizes:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content,
filename="250mb.bin", version="250mb"
filename="250mb.bin", tag="250mb"
)
upload_time = time.time() - start_time
@@ -302,7 +302,7 @@ class TestLargeFileSizes:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content,
filename="500mb.bin", version="500mb"
filename="500mb.bin", tag="500mb"
)
upload_time = time.time() - start_time
@@ -336,7 +336,7 @@ class TestLargeFileSizes:
start_time = time.time()
result = upload_test_file(
integration_client, project, package, content,
filename="1gb.bin", version="1gb"
filename="1gb.bin", tag="1gb"
)
upload_time = time.time() - start_time
@@ -368,7 +368,7 @@ class TestChunkBoundaries:
result = upload_test_file(
integration_client, project, package, content,
filename="chunk.bin", version="chunk-exact"
filename="chunk.bin", tag="chunk-exact"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == CHUNK_SIZE
@@ -389,7 +389,7 @@ class TestChunkBoundaries:
result = upload_test_file(
integration_client, project, package, content,
filename="chunk_plus.bin", version="chunk-plus"
filename="chunk_plus.bin", tag="chunk-plus"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == size
@@ -410,7 +410,7 @@ class TestChunkBoundaries:
result = upload_test_file(
integration_client, project, package, content,
filename="chunk_minus.bin", version="chunk-minus"
filename="chunk_minus.bin", tag="chunk-minus"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == size
@@ -431,7 +431,7 @@ class TestChunkBoundaries:
result = upload_test_file(
integration_client, project, package, content,
filename="multi_chunk.bin", version="multi-chunk"
filename="multi_chunk.bin", tag="multi-chunk"
)
assert result["artifact_id"] == expected_hash
assert result["size"] == size
@@ -457,7 +457,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename="binary.bin", version="binary"
filename="binary.bin", tag="binary"
)
assert result["artifact_id"] == expected_hash
@@ -477,7 +477,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename="text.txt", version="text"
filename="text.txt", tag="text"
)
assert result["artifact_id"] == expected_hash
@@ -498,7 +498,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename="nulls.bin", version="nulls"
filename="nulls.bin", tag="nulls"
)
assert result["artifact_id"] == expected_hash
@@ -519,7 +519,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename="文件名.txt", version="unicode-name"
filename="文件名.txt", tag="unicode-name"
)
assert result["artifact_id"] == expected_hash
assert result["original_name"] == "文件名.txt"
@@ -543,7 +543,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename="data.gz", version="compressed"
filename="data.gz", tag="compressed"
)
assert result["artifact_id"] == expected_hash
@@ -568,7 +568,7 @@ class TestDataIntegrity:
result = upload_test_file(
integration_client, project, package, content,
filename=f"hash_test_{size}.bin", version=f"hash-{size}"
filename=f"hash_test_{size}.bin", tag=f"hash-{size}"
)
# Verify artifact_id matches expected hash

View File

@@ -32,7 +32,7 @@ class TestRangeRequests:
"""Test range request for first N bytes."""
project, package = test_package
content = b"0123456789" * 100 # 1000 bytes
upload_test_file(integration_client, project, package, content, version="range-test")
upload_test_file(integration_client, project, package, content, tag="range-test")
# Request first 10 bytes
response = integration_client.get(
@@ -50,7 +50,7 @@ class TestRangeRequests:
"""Test range request for bytes in the middle."""
project, package = test_package
content = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
upload_test_file(integration_client, project, package, content, version="range-mid")
upload_test_file(integration_client, project, package, content, tag="range-mid")
# Request bytes 10-19 (KLMNOPQRST)
response = integration_client.get(
@@ -66,7 +66,7 @@ class TestRangeRequests:
"""Test range request for last N bytes (suffix range)."""
project, package = test_package
content = b"0123456789ABCDEF" # 16 bytes
upload_test_file(integration_client, project, package, content, version="range-suffix")
upload_test_file(integration_client, project, package, content, tag="range-suffix")
# Request last 4 bytes
response = integration_client.get(
@@ -82,7 +82,7 @@ class TestRangeRequests:
"""Test range request from offset to end."""
project, package = test_package
content = b"0123456789"
upload_test_file(integration_client, project, package, content, version="range-open")
upload_test_file(integration_client, project, package, content, tag="range-open")
# Request from byte 5 to end
response = integration_client.get(
@@ -100,7 +100,7 @@ class TestRangeRequests:
"""Test that range requests include Accept-Ranges header."""
project, package = test_package
content = b"test content"
upload_test_file(integration_client, project, package, content, version="accept-ranges")
upload_test_file(integration_client, project, package, content, tag="accept-ranges")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/accept-ranges",
@@ -117,7 +117,7 @@ class TestRangeRequests:
"""Test that full downloads advertise range support."""
project, package = test_package
content = b"test content"
upload_test_file(integration_client, project, package, content, version="full-accept")
upload_test_file(integration_client, project, package, content, tag="full-accept")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/full-accept",
@@ -136,7 +136,7 @@ class TestConditionalRequests:
project, package = test_package
content = b"conditional request test content"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="cond-etag")
upload_test_file(integration_client, project, package, content, tag="cond-etag")
# Request with matching ETag
response = integration_client.get(
@@ -153,7 +153,7 @@ class TestConditionalRequests:
project, package = test_package
content = b"etag no quotes test"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="cond-noquote")
upload_test_file(integration_client, project, package, content, tag="cond-noquote")
# Request with ETag without quotes
response = integration_client.get(
@@ -168,7 +168,7 @@ class TestConditionalRequests:
"""Test If-None-Match with non-matching ETag returns 200."""
project, package = test_package
content = b"etag mismatch test"
upload_test_file(integration_client, project, package, content, version="cond-mismatch")
upload_test_file(integration_client, project, package, content, tag="cond-mismatch")
# Request with different ETag
response = integration_client.get(
@@ -184,7 +184,7 @@ class TestConditionalRequests:
"""Test If-Modified-Since with future date returns 304."""
project, package = test_package
content = b"modified since test"
upload_test_file(integration_client, project, package, content, version="cond-modified")
upload_test_file(integration_client, project, package, content, tag="cond-modified")
# Request with future date (artifact was definitely created before this)
future_date = formatdate(time.time() + 86400, usegmt=True) # Tomorrow
@@ -202,7 +202,7 @@ class TestConditionalRequests:
"""Test If-Modified-Since with old date returns 200."""
project, package = test_package
content = b"old date test"
upload_test_file(integration_client, project, package, content, version="cond-old")
upload_test_file(integration_client, project, package, content, tag="cond-old")
# Request with old date (2020-01-01)
old_date = "Wed, 01 Jan 2020 00:00:00 GMT"
@@ -220,7 +220,7 @@ class TestConditionalRequests:
project, package = test_package
content = b"304 etag test"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="304-etag")
upload_test_file(integration_client, project, package, content, tag="304-etag")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/304-etag",
@@ -236,7 +236,7 @@ class TestConditionalRequests:
project, package = test_package
content = b"304 cache test"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="304-cache")
upload_test_file(integration_client, project, package, content, tag="304-cache")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/304-cache",
@@ -255,7 +255,7 @@ class TestCachingHeaders:
"""Test download response includes Cache-Control header."""
project, package = test_package
content = b"cache control test"
upload_test_file(integration_client, project, package, content, version="cache-ctl")
upload_test_file(integration_client, project, package, content, tag="cache-ctl")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/cache-ctl",
@@ -272,7 +272,7 @@ class TestCachingHeaders:
"""Test download response includes Last-Modified header."""
project, package = test_package
content = b"last modified test"
upload_test_file(integration_client, project, package, content, version="last-mod")
upload_test_file(integration_client, project, package, content, tag="last-mod")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/last-mod",
@@ -290,7 +290,7 @@ class TestCachingHeaders:
project, package = test_package
content = b"etag header test"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="etag-hdr")
upload_test_file(integration_client, project, package, content, tag="etag-hdr")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/etag-hdr",
@@ -308,7 +308,7 @@ class TestDownloadResume:
"""Test resuming download from where it left off."""
project, package = test_package
content = b"ABCDEFGHIJ" * 100 # 1000 bytes
upload_test_file(integration_client, project, package, content, version="resume-test")
upload_test_file(integration_client, project, package, content, tag="resume-test")
# Simulate partial download (first 500 bytes)
response1 = integration_client.get(
@@ -340,7 +340,7 @@ class TestDownloadResume:
project, package = test_package
content = b"resume etag verification test content"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="resume-etag")
upload_test_file(integration_client, project, package, content, tag="resume-etag")
# Get ETag from first request
response1 = integration_client.get(
@@ -373,7 +373,7 @@ class TestLargeFileStreaming:
project, package = test_package
content, expected_hash = sized_content(SIZE_1MB, seed=500)
upload_test_file(integration_client, project, package, content, version="stream-1mb")
upload_test_file(integration_client, project, package, content, tag="stream-1mb")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/stream-1mb",
@@ -391,7 +391,7 @@ class TestLargeFileStreaming:
project, package = test_package
content, expected_hash = sized_content(SIZE_100KB, seed=501)
upload_test_file(integration_client, project, package, content, version="stream-hdr")
upload_test_file(integration_client, project, package, content, tag="stream-hdr")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/stream-hdr",
@@ -410,7 +410,7 @@ class TestLargeFileStreaming:
project, package = test_package
content, _ = sized_content(SIZE_100KB, seed=502)
upload_test_file(integration_client, project, package, content, version="range-large")
upload_test_file(integration_client, project, package, content, tag="range-large")
# Request a slice from the middle
start = 50000
@@ -433,7 +433,7 @@ class TestDownloadModes:
"""Test proxy mode streams content through backend."""
project, package = test_package
content = b"proxy mode test content"
upload_test_file(integration_client, project, package, content, version="mode-proxy")
upload_test_file(integration_client, project, package, content, tag="mode-proxy")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/mode-proxy",
@@ -447,7 +447,7 @@ class TestDownloadModes:
"""Test presigned mode returns JSON with URL."""
project, package = test_package
content = b"presigned mode test"
upload_test_file(integration_client, project, package, content, version="mode-presign")
upload_test_file(integration_client, project, package, content, tag="mode-presign")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/mode-presign",
@@ -464,7 +464,7 @@ class TestDownloadModes:
"""Test redirect mode returns 302 to presigned URL."""
project, package = test_package
content = b"redirect mode test"
upload_test_file(integration_client, project, package, content, version="mode-redir")
upload_test_file(integration_client, project, package, content, tag="mode-redir")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/mode-redir",
@@ -484,7 +484,7 @@ class TestIntegrityDuringStreaming:
project, package = test_package
content = b"integrity check content"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="integrity")
upload_test_file(integration_client, project, package, content, tag="integrity")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/integrity",
@@ -505,7 +505,7 @@ class TestIntegrityDuringStreaming:
project, package = test_package
content = b"etag integrity test"
expected_hash = compute_sha256(content)
upload_test_file(integration_client, project, package, content, version="etag-int")
upload_test_file(integration_client, project, package, content, tag="etag-int")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/etag-int",
@@ -524,7 +524,7 @@ class TestIntegrityDuringStreaming:
"""Test Digest header is present in RFC 3230 format."""
project, package = test_package
content = b"digest header test"
upload_test_file(integration_client, project, package, content, version="digest")
upload_test_file(integration_client, project, package, content, tag="digest")
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/digest",

View File

@@ -0,0 +1,403 @@
"""
Integration tests for tag API endpoints.
Tests cover:
- Tag CRUD operations
- Tag listing with pagination and search
- Tag history tracking
- ref_count behavior with tag operations
"""
import pytest
from tests.factories import compute_sha256, upload_test_file
class TestTagCRUD:
"""Tests for tag create, read, delete operations."""
@pytest.mark.integration
def test_create_tag_via_upload(self, integration_client, test_package):
"""Test creating a tag via upload endpoint."""
project_name, package_name = test_package
result = upload_test_file(
integration_client,
project_name,
package_name,
b"tag create test",
tag="v1.0.0",
)
assert result["tag"] == "v1.0.0"
assert result["artifact_id"]
@pytest.mark.integration
def test_create_tag_via_post(
self, integration_client, test_package, unique_test_id
):
"""Test creating a tag via POST /tags endpoint."""
project_name, package_name = test_package
# First upload an artifact
result = upload_test_file(
integration_client,
project_name,
package_name,
b"artifact for tag",
)
artifact_id = result["artifact_id"]
# Create tag via POST
tag_name = f"post-tag-{unique_test_id}"
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/tags",
json={"name": tag_name, "artifact_id": artifact_id},
)
assert response.status_code == 200
data = response.json()
assert data["name"] == tag_name
assert data["artifact_id"] == artifact_id
@pytest.mark.integration
def test_get_tag(self, integration_client, test_package):
"""Test getting a tag by name."""
project_name, package_name = test_package
upload_test_file(
integration_client,
project_name,
package_name,
b"get tag test",
tag="get-tag",
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags/get-tag"
)
assert response.status_code == 200
data = response.json()
assert data["name"] == "get-tag"
assert "artifact_id" in data
assert "artifact_size" in data
assert "artifact_content_type" in data
@pytest.mark.integration
def test_list_tags(self, integration_client, test_package):
"""Test listing tags for a package."""
project_name, package_name = test_package
# Create some tags
upload_test_file(
integration_client,
project_name,
package_name,
b"list tags test",
tag="list-v1",
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags"
)
assert response.status_code == 200
data = response.json()
assert "items" in data
assert "pagination" in data
tag_names = [t["name"] for t in data["items"]]
assert "list-v1" in tag_names
@pytest.mark.integration
def test_delete_tag(self, integration_client, test_package):
"""Test deleting a tag."""
project_name, package_name = test_package
upload_test_file(
integration_client,
project_name,
package_name,
b"delete tag test",
tag="to-delete",
)
# Delete tag
response = integration_client.delete(
f"/api/v1/project/{project_name}/{package_name}/tags/to-delete"
)
assert response.status_code == 204
# Verify deleted
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags/to-delete"
)
assert response.status_code == 404
class TestTagListingFilters:
"""Tests for tag listing with filters and search."""
@pytest.mark.integration
def test_tags_pagination(self, integration_client, test_package):
"""Test tag listing respects pagination."""
project_name, package_name = test_package
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags?limit=5"
)
assert response.status_code == 200
data = response.json()
assert len(data["items"]) <= 5
assert data["pagination"]["limit"] == 5
@pytest.mark.integration
def test_tags_search(self, integration_client, test_package, unique_test_id):
"""Test tag search by name."""
project_name, package_name = test_package
tag_name = f"searchable-{unique_test_id}"
upload_test_file(
integration_client,
project_name,
package_name,
b"search test",
tag=tag_name,
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags?search=searchable"
)
assert response.status_code == 200
data = response.json()
tag_names = [t["name"] for t in data["items"]]
assert tag_name in tag_names
class TestTagHistory:
"""Tests for tag history tracking."""
@pytest.mark.integration
def test_tag_history_on_create(self, integration_client, test_package):
"""Test tag history is created when tag is created."""
project_name, package_name = test_package
upload_test_file(
integration_client,
project_name,
package_name,
b"history create test",
tag="history-create",
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags/history-create/history"
)
assert response.status_code == 200
data = response.json()
assert len(data) >= 1
@pytest.mark.integration
def test_tag_history_on_update(
self, integration_client, test_package, unique_test_id
):
"""Test tag history is created when tag is updated."""
project_name, package_name = test_package
tag_name = f"history-update-{unique_test_id}"
# Create tag with first artifact
upload_test_file(
integration_client,
project_name,
package_name,
b"first content",
tag=tag_name,
)
# Update tag with second artifact
upload_test_file(
integration_client,
project_name,
package_name,
b"second content",
tag=tag_name,
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags/{tag_name}/history"
)
assert response.status_code == 200
data = response.json()
# Should have at least 2 history entries (create + update)
assert len(data) >= 2
class TestTagRefCount:
"""Tests for ref_count behavior with tag operations."""
@pytest.mark.integration
def test_ref_count_decrements_on_tag_delete(self, integration_client, test_package):
"""Test ref_count decrements when a tag is deleted."""
project_name, package_name = test_package
content = b"ref count delete test"
expected_hash = compute_sha256(content)
# Upload with two tags
upload_test_file(
integration_client, project_name, package_name, content, tag="rc-v1"
)
upload_test_file(
integration_client, project_name, package_name, content, tag="rc-v2"
)
# Verify ref_count is 2
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 2
# Delete one tag
delete_response = integration_client.delete(
f"/api/v1/project/{project_name}/{package_name}/tags/rc-v1"
)
assert delete_response.status_code == 204
# Verify ref_count is now 1
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 1
@pytest.mark.integration
def test_ref_count_zero_after_all_tags_deleted(
self, integration_client, test_package
):
"""Test ref_count goes to 0 when all tags are deleted."""
project_name, package_name = test_package
content = b"orphan test content"
expected_hash = compute_sha256(content)
# Upload with one tag
upload_test_file(
integration_client, project_name, package_name, content, tag="only-tag"
)
# Delete the tag
integration_client.delete(
f"/api/v1/project/{project_name}/{package_name}/tags/only-tag"
)
# Verify ref_count is 0
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 0
@pytest.mark.integration
def test_ref_count_adjusts_on_tag_update(
self, integration_client, test_package, unique_test_id
):
"""Test ref_count adjusts when a tag is updated to point to different artifact."""
project_name, package_name = test_package
# Upload two different artifacts
content1 = f"artifact one {unique_test_id}".encode()
content2 = f"artifact two {unique_test_id}".encode()
hash1 = compute_sha256(content1)
hash2 = compute_sha256(content2)
# Upload first artifact with tag "latest"
upload_test_file(
integration_client, project_name, package_name, content1, tag="latest"
)
# Verify first artifact has ref_count 1
response = integration_client.get(f"/api/v1/artifact/{hash1}")
assert response.json()["ref_count"] == 1
# Upload second artifact with different tag
upload_test_file(
integration_client, project_name, package_name, content2, tag="stable"
)
# Now update "latest" tag to point to second artifact
upload_test_file(
integration_client, project_name, package_name, content2, tag="latest"
)
# Verify first artifact ref_count decreased to 0
response = integration_client.get(f"/api/v1/artifact/{hash1}")
assert response.json()["ref_count"] == 0
# Verify second artifact ref_count increased to 2
response = integration_client.get(f"/api/v1/artifact/{hash2}")
assert response.json()["ref_count"] == 2
@pytest.mark.integration
def test_ref_count_unchanged_when_tag_same_artifact(
self, integration_client, test_package, unique_test_id
):
"""Test ref_count doesn't change when tag is 'updated' to same artifact."""
project_name, package_name = test_package
content = f"same artifact {unique_test_id}".encode()
expected_hash = compute_sha256(content)
# Upload with tag
upload_test_file(
integration_client, project_name, package_name, content, tag="same-v1"
)
# Verify ref_count is 1
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 1
# Upload same content with same tag (no-op)
upload_test_file(
integration_client, project_name, package_name, content, tag="same-v1"
)
# Verify ref_count is still 1
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 1
@pytest.mark.integration
def test_tag_via_post_endpoint_increments_ref_count(
self, integration_client, test_package, unique_test_id
):
"""Test creating tag via POST /tags endpoint increments ref_count."""
project_name, package_name = test_package
content = f"tag endpoint test {unique_test_id}".encode()
expected_hash = compute_sha256(content)
# Upload artifact without tag
result = upload_test_file(
integration_client, project_name, package_name, content, filename="test.bin"
)
artifact_id = result["artifact_id"]
# Verify ref_count is 0 (no tags yet)
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 0
# Create tag via POST endpoint
tag_response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/tags",
json={"name": "post-v1", "artifact_id": artifact_id},
)
assert tag_response.status_code == 200
# Verify ref_count is now 1
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 1
# Create another tag via POST endpoint
tag_response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/tags",
json={"name": "post-latest", "artifact_id": artifact_id},
)
assert tag_response.status_code == 200
# Verify ref_count is now 2
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.json()["ref_count"] == 2

View File

@@ -47,7 +47,7 @@ class TestUploadBasics:
expected_hash = compute_sha256(content)
result = upload_test_file(
integration_client, project_name, package_name, content, version="v1"
integration_client, project_name, package_name, content, tag="v1"
)
assert result["artifact_id"] == expected_hash
@@ -116,23 +116,31 @@ class TestUploadBasics:
assert result["created_at"] is not None
@pytest.mark.integration
def test_upload_without_version_succeeds(self, integration_client, test_package):
"""Test upload without version succeeds (no version created)."""
def test_upload_without_tag_succeeds(self, integration_client, test_package):
"""Test upload without tag succeeds (no tag created)."""
project, package = test_package
content = b"upload without version test"
content = b"upload without tag test"
expected_hash = compute_sha256(content)
files = {"file": ("no_version.bin", io.BytesIO(content), "application/octet-stream")}
files = {"file": ("no_tag.bin", io.BytesIO(content), "application/octet-stream")}
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
# No version parameter
# No tag parameter
)
assert response.status_code == 200
result = response.json()
assert result["artifact_id"] == expected_hash
# Version should be None when not specified
assert result.get("version") is None
# Verify no tag was created - list tags and check
tags_response = integration_client.get(
f"/api/v1/project/{project}/{package}/tags"
)
assert tags_response.status_code == 200
tags = tags_response.json()
# Filter for tags pointing to this artifact
artifact_tags = [t for t in tags.get("items", tags) if t.get("artifact_id") == expected_hash]
assert len(artifact_tags) == 0, "Tag should not be created when not specified"
@pytest.mark.integration
def test_upload_creates_artifact_in_database(self, integration_client, test_package):
@@ -164,29 +172,25 @@ class TestUploadBasics:
assert s3_object_exists(expected_hash), "S3 object should exist after upload"
@pytest.mark.integration
def test_upload_with_version_creates_version_record(self, integration_client, test_package):
"""Test upload with version creates version record."""
def test_upload_with_tag_creates_tag_record(self, integration_client, test_package):
"""Test upload with tag creates tag record."""
project, package = test_package
content = b"version creation test"
content = b"tag creation test"
expected_hash = compute_sha256(content)
version_name = "1.0.0"
tag_name = "my-tag-v1"
result = upload_test_file(
integration_client, project, package, content, version=version_name
upload_test_file(
integration_client, project, package, content, tag=tag_name
)
# Verify version was created
assert result.get("version") == version_name
assert result["artifact_id"] == expected_hash
# Verify version exists in versions list
versions_response = integration_client.get(
f"/api/v1/project/{project}/{package}/versions"
# Verify tag exists
tags_response = integration_client.get(
f"/api/v1/project/{project}/{package}/tags"
)
assert versions_response.status_code == 200
versions = versions_response.json()
version_names = [v["version"] for v in versions.get("items", [])]
assert version_name in version_names
assert tags_response.status_code == 200
tags = tags_response.json()
tag_names = [t["name"] for t in tags.get("items", tags)]
assert tag_name in tag_names
class TestDuplicateUploads:
@@ -203,44 +207,36 @@ class TestDuplicateUploads:
# First upload
result1 = upload_test_file(
integration_client, project, package, content, version="first"
integration_client, project, package, content, tag="first"
)
assert result1["artifact_id"] == expected_hash
# Second upload
result2 = upload_test_file(
integration_client, project, package, content, version="second"
integration_client, project, package, content, tag="second"
)
assert result2["artifact_id"] == expected_hash
assert result1["artifact_id"] == result2["artifact_id"]
@pytest.mark.integration
def test_same_file_twice_returns_existing_version(
def test_same_file_twice_increments_ref_count(
self, integration_client, test_package
):
"""Test uploading same file twice in same package returns existing version.
Same artifact can only have one version per package. Uploading the same content
with a different version name returns the existing version, not a new one.
ref_count stays at 1 because there's still only one PackageVersion reference.
"""
"""Test uploading same file twice increments ref_count to 2."""
project, package = test_package
content = b"content for ref count increment test"
# First upload
result1 = upload_test_file(
integration_client, project, package, content, version="v1"
integration_client, project, package, content, tag="v1"
)
assert result1["ref_count"] == 1
# Second upload with different version name returns existing version
# Second upload
result2 = upload_test_file(
integration_client, project, package, content, version="v2"
integration_client, project, package, content, tag="v2"
)
# Same artifact, same package = same version returned, ref_count stays 1
assert result2["ref_count"] == 1
assert result2["deduplicated"] is True
assert result1["version"] == result2["version"] # Both return "v1"
assert result2["ref_count"] == 2
@pytest.mark.integration
def test_same_file_different_packages_shares_artifact(
@@ -265,12 +261,12 @@ class TestDuplicateUploads:
)
# Upload to first package
result1 = upload_test_file(integration_client, project, pkg1, content, version="v1")
result1 = upload_test_file(integration_client, project, pkg1, content, tag="v1")
assert result1["artifact_id"] == expected_hash
assert result1["deduplicated"] is False
# Upload to second package
result2 = upload_test_file(integration_client, project, pkg2, content, version="v1")
result2 = upload_test_file(integration_client, project, pkg2, content, tag="v1")
assert result2["artifact_id"] == expected_hash
assert result2["deduplicated"] is True
@@ -290,7 +286,7 @@ class TestDuplicateUploads:
package,
content,
filename="file1.bin",
version="v1",
tag="v1",
)
assert result1["artifact_id"] == expected_hash
@@ -301,7 +297,7 @@ class TestDuplicateUploads:
package,
content,
filename="file2.bin",
version="v2",
tag="v2",
)
assert result2["artifact_id"] == expected_hash
assert result2["deduplicated"] is True
@@ -311,17 +307,17 @@ class TestDownload:
"""Tests for download functionality."""
@pytest.mark.integration
def test_download_by_version(self, integration_client, test_package):
"""Test downloading artifact by version."""
def test_download_by_tag(self, integration_client, test_package):
"""Test downloading artifact by tag name."""
project, package = test_package
original_content = b"download by version test"
original_content = b"download by tag test"
upload_test_file(
integration_client, project, package, original_content, version="1.0.0"
integration_client, project, package, original_content, tag="download-tag"
)
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/1.0.0",
f"/api/v1/project/{project}/{package}/+/download-tag",
params={"mode": "proxy"},
)
assert response.status_code == 200
@@ -344,29 +340,29 @@ class TestDownload:
assert response.content == original_content
@pytest.mark.integration
def test_download_by_version_prefix(self, integration_client, test_package):
"""Test downloading artifact using version: prefix."""
def test_download_by_tag_prefix(self, integration_client, test_package):
"""Test downloading artifact using tag: prefix."""
project, package = test_package
original_content = b"download by version prefix test"
original_content = b"download by tag prefix test"
upload_test_file(
integration_client, project, package, original_content, version="2.0.0"
integration_client, project, package, original_content, tag="prefix-tag"
)
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/version:2.0.0",
f"/api/v1/project/{project}/{package}/+/tag:prefix-tag",
params={"mode": "proxy"},
)
assert response.status_code == 200
assert response.content == original_content
@pytest.mark.integration
def test_download_nonexistent_version(self, integration_client, test_package):
"""Test downloading nonexistent version returns 404."""
def test_download_nonexistent_tag(self, integration_client, test_package):
"""Test downloading nonexistent tag returns 404."""
project, package = test_package
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/nonexistent-version"
f"/api/v1/project/{project}/{package}/+/nonexistent-tag"
)
assert response.status_code == 404
@@ -404,7 +400,7 @@ class TestDownload:
original_content = b"exact content verification test data 12345"
upload_test_file(
integration_client, project, package, original_content, version="verify"
integration_client, project, package, original_content, tag="verify"
)
response = integration_client.get(
@@ -425,7 +421,7 @@ class TestDownloadHeaders:
upload_test_file(
integration_client, project, package, content,
filename="test.txt", version="content-type-test"
filename="test.txt", tag="content-type-test"
)
response = integration_client.get(
@@ -444,7 +440,7 @@ class TestDownloadHeaders:
expected_length = len(content)
upload_test_file(
integration_client, project, package, content, version="content-length-test"
integration_client, project, package, content, tag="content-length-test"
)
response = integration_client.get(
@@ -464,7 +460,7 @@ class TestDownloadHeaders:
upload_test_file(
integration_client, project, package, content,
filename=filename, version="disposition-test"
filename=filename, tag="disposition-test"
)
response = integration_client.get(
@@ -485,7 +481,7 @@ class TestDownloadHeaders:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="checksum-headers"
integration_client, project, package, content, tag="checksum-headers"
)
response = integration_client.get(
@@ -505,7 +501,7 @@ class TestDownloadHeaders:
expected_hash = compute_sha256(content)
upload_test_file(
integration_client, project, package, content, version="etag-test"
integration_client, project, package, content, tag="etag-test"
)
response = integration_client.get(
@@ -523,31 +519,17 @@ class TestConcurrentUploads:
"""Tests for concurrent upload handling."""
@pytest.mark.integration
def test_concurrent_uploads_same_file(self, integration_client, test_project, unique_test_id):
"""Test concurrent uploads of same file to different packages handle deduplication correctly.
Same artifact can only have one version per package, so we create multiple packages
to test that concurrent uploads to different packages correctly increment ref_count.
"""
def test_concurrent_uploads_same_file(self, integration_client, test_package):
"""Test concurrent uploads of same file handle deduplication correctly."""
project, package = test_package
content = b"content for concurrent upload test"
expected_hash = compute_sha256(content)
num_concurrent = 5
# Create packages for each concurrent upload
packages = []
for i in range(num_concurrent):
pkg_name = f"concurrent-pkg-{unique_test_id}-{i}"
response = integration_client.post(
f"/api/v1/project/{test_project}/packages",
json={"name": pkg_name},
)
assert response.status_code == 200
packages.append(pkg_name)
# Create an API key for worker threads
api_key_response = integration_client.post(
"/api/v1/auth/keys",
json={"name": f"concurrent-test-key-{unique_test_id}"},
json={"name": "concurrent-test-key"},
)
assert api_key_response.status_code == 200, f"Failed to create API key: {api_key_response.text}"
api_key = api_key_response.json()["key"]
@@ -555,7 +537,7 @@ class TestConcurrentUploads:
results = []
errors = []
def upload_worker(idx):
def upload_worker(tag_suffix):
try:
from httpx import Client
@@ -563,15 +545,15 @@ class TestConcurrentUploads:
with Client(base_url=base_url, timeout=30.0) as client:
files = {
"file": (
f"concurrent-{idx}.bin",
f"concurrent-{tag_suffix}.bin",
io.BytesIO(content),
"application/octet-stream",
)
}
response = client.post(
f"/api/v1/project/{test_project}/{packages[idx]}/upload",
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": f"concurrent-{tag_suffix}"},
headers={"Authorization": f"Bearer {api_key}"},
)
if response.status_code == 200:
@@ -594,7 +576,7 @@ class TestConcurrentUploads:
assert len(artifact_ids) == 1
assert expected_hash in artifact_ids
# Verify final ref_count equals number of packages
# Verify final ref_count
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
assert response.status_code == 200
assert response.json()["ref_count"] == num_concurrent
@@ -623,7 +605,7 @@ class TestFileSizeValidation:
content = b"X"
result = upload_test_file(
integration_client, project, package, content, version="tiny"
integration_client, project, package, content, tag="tiny"
)
assert result["artifact_id"] is not None
@@ -639,7 +621,7 @@ class TestFileSizeValidation:
expected_size = len(content)
result = upload_test_file(
integration_client, project, package, content, version="size-test"
integration_client, project, package, content, tag="size-test"
)
assert result["size"] == expected_size
@@ -667,7 +649,7 @@ class TestUploadFailureCleanup:
response = integration_client.post(
f"/api/v1/project/nonexistent-project-{unique_test_id}/nonexistent-pkg/upload",
files=files,
data={"version": "test"},
data={"tag": "test"},
)
assert response.status_code == 404
@@ -690,7 +672,7 @@ class TestUploadFailureCleanup:
response = integration_client.post(
f"/api/v1/project/{test_project}/nonexistent-package-{unique_test_id}/upload",
files=files,
data={"version": "test"},
data={"tag": "test"},
)
assert response.status_code == 404
@@ -711,7 +693,7 @@ class TestUploadFailureCleanup:
response = integration_client.post(
f"/api/v1/project/{test_project}/nonexistent-package-{unique_test_id}/upload",
files=files,
data={"version": "test"},
data={"tag": "test"},
)
assert response.status_code == 404
@@ -737,7 +719,7 @@ class TestS3StorageVerification:
# Upload same content multiple times
for tag in ["s3test1", "s3test2", "s3test3"]:
upload_test_file(integration_client, project, package, content, version=tag)
upload_test_file(integration_client, project, package, content, tag=tag)
# Verify only one S3 object exists
s3_objects = list_s3_objects_by_hash(expected_hash)
@@ -753,26 +735,16 @@ class TestS3StorageVerification:
@pytest.mark.integration
def test_artifact_table_single_row_after_duplicates(
self, integration_client, test_project, unique_test_id
self, integration_client, test_package
):
"""Test artifact table contains only one row after duplicate uploads to different packages.
Same artifact can only have one version per package, so we create multiple packages
to test deduplication across packages.
"""
"""Test artifact table contains only one row after duplicate uploads."""
project, package = test_package
content = b"content for single row test"
expected_hash = compute_sha256(content)
# Create 3 packages and upload same content to each
for i in range(3):
pkg_name = f"single-row-pkg-{unique_test_id}-{i}"
integration_client.post(
f"/api/v1/project/{test_project}/packages",
json={"name": pkg_name},
)
upload_test_file(
integration_client, test_project, pkg_name, content, version="1.0.0"
)
# Upload same content multiple times
for tag in ["v1", "v2", "v3"]:
upload_test_file(integration_client, project, package, content, tag=tag)
# Query artifact
response = integration_client.get(f"/api/v1/artifact/{expected_hash}")
@@ -811,7 +783,7 @@ class TestSecurityPathTraversal:
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "traversal-test"},
data={"tag": "traversal-test"},
)
assert response.status_code == 200
result = response.json()
@@ -829,16 +801,48 @@ class TestSecurityPathTraversal:
assert response.status_code in [400, 404, 422]
@pytest.mark.integration
def test_path_traversal_in_version_name(self, integration_client, test_package):
"""Test version names with path traversal are handled safely."""
def test_path_traversal_in_tag_name(self, integration_client, test_package):
"""Test tag names with path traversal are handled safely."""
project, package = test_package
content = b"version traversal test"
content = b"tag traversal test"
files = {"file": ("test.bin", io.BytesIO(content), "application/octet-stream")}
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "../../../etc/passwd"},
data={"tag": "../../../etc/passwd"},
)
assert response.status_code in [200, 400, 422]
@pytest.mark.integration
def test_download_path_traversal_in_ref(self, integration_client, test_package):
"""Test download ref with path traversal is rejected."""
project, package = test_package
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/../../../etc/passwd"
)
assert response.status_code in [400, 404, 422]
@pytest.mark.integration
def test_path_traversal_in_package_name(self, integration_client, test_project):
"""Test package names with path traversal sequences are rejected."""
response = integration_client.get(
f"/api/v1/project/{test_project}/packages/../../../etc/passwd"
)
assert response.status_code in [400, 404, 422]
@pytest.mark.integration
def test_path_traversal_in_tag_name(self, integration_client, test_package):
"""Test tag names with path traversal are rejected or handled safely."""
project, package = test_package
content = b"tag traversal test"
files = {"file": ("test.bin", io.BytesIO(content), "application/octet-stream")}
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"tag": "../../../etc/passwd"},
)
assert response.status_code in [200, 400, 422]
@@ -863,7 +867,7 @@ class TestSecurityMalformedRequests:
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
data={"version": "no-file"},
data={"tag": "no-file"},
)
assert response.status_code == 422

View File

@@ -39,6 +39,31 @@ class TestVersionCreation:
assert result.get("version") == "1.0.0"
assert result.get("version_source") == "explicit"
@pytest.mark.integration
def test_upload_with_version_and_tag(self, integration_client, test_package):
"""Test upload with both version and tag creates both records."""
project, package = test_package
content = b"version and tag test"
files = {"file": ("app.tar.gz", io.BytesIO(content), "application/octet-stream")}
response = integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files,
data={"version": "2.0.0", "tag": "latest"},
)
assert response.status_code == 200
result = response.json()
assert result.get("version") == "2.0.0"
# Verify tag was also created
tags_response = integration_client.get(
f"/api/v1/project/{project}/{package}/tags"
)
assert tags_response.status_code == 200
tags = tags_response.json()
tag_names = [t["name"] for t in tags.get("items", tags)]
assert "latest" in tag_names
@pytest.mark.integration
def test_duplicate_version_same_content_succeeds(self, integration_client, test_package):
"""Test uploading same version with same content succeeds (deduplication)."""
@@ -237,10 +262,11 @@ class TestDownloadByVersion:
assert response.status_code == 404
@pytest.mark.integration
def test_version_resolution_with_prefix(self, integration_client, test_package):
"""Test that version: prefix explicitly resolves to version."""
def test_version_resolution_priority(self, integration_client, test_package):
"""Test that version: prefix explicitly resolves to version, not tag."""
project, package = test_package
version_content = b"this is the version content"
tag_content = b"this is the tag content"
# Create a version 6.0.0
files1 = {"file": ("app-v.tar.gz", io.BytesIO(version_content), "application/octet-stream")}
@@ -250,6 +276,14 @@ class TestDownloadByVersion:
data={"version": "6.0.0"},
)
# Create a tag named "6.0.0" pointing to different content
files2 = {"file": ("app-t.tar.gz", io.BytesIO(tag_content), "application/octet-stream")}
integration_client.post(
f"/api/v1/project/{project}/{package}/upload",
files=files2,
data={"tag": "6.0.0"},
)
# Download with version: prefix should get version content
response = integration_client.get(
f"/api/v1/project/{project}/{package}/+/version:6.0.0",
@@ -258,6 +292,14 @@ class TestDownloadByVersion:
assert response.status_code == 200
assert response.content == version_content
# Download with tag: prefix should get tag content
response2 = integration_client.get(
f"/api/v1/project/{project}/{package}/+/tag:6.0.0",
params={"mode": "proxy"},
)
assert response2.status_code == 200
assert response2.content == tag_content
class TestVersionDeletion:
"""Tests for deleting versions."""

View File

@@ -27,9 +27,11 @@ class TestVersionCreation:
project_name,
package_name,
b"version create test",
tag="latest",
version="1.0.0",
)
assert result["tag"] == "latest"
assert result["version"] == "1.0.0"
assert result["version_source"] == "explicit"
assert result["artifact_id"]
@@ -147,6 +149,7 @@ class TestVersionCRUD:
package_name,
b"version with info",
version="1.0.0",
tag="release",
)
response = integration_client.get(
@@ -163,6 +166,8 @@ class TestVersionCRUD:
assert version_item is not None
assert "size" in version_item
assert "artifact_id" in version_item
assert "tags" in version_item
assert "release" in version_item["tags"]
@pytest.mark.integration
def test_get_version(self, integration_client, test_package):
@@ -267,9 +272,94 @@ class TestVersionDownload:
follow_redirects=False,
)
# Should resolve version
# Should resolve version first (before tag)
assert response.status_code in [200, 302, 307]
@pytest.mark.integration
def test_version_takes_precedence_over_tag(self, integration_client, test_package):
"""Test that version is checked before tag when resolving refs."""
project_name, package_name = test_package
# Upload with version "1.0"
version_result = upload_test_file(
integration_client,
project_name,
package_name,
b"version content",
version="1.0",
)
# Create a tag with the same name "1.0" pointing to different artifact
tag_result = upload_test_file(
integration_client,
project_name,
package_name,
b"tag content different",
tag="1.0",
)
# Download by "1.0" should resolve to version, not tag
# Since version:1.0 artifact was uploaded first
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/1.0",
follow_redirects=False,
)
assert response.status_code in [200, 302, 307]
class TestTagVersionEnrichment:
"""Tests for tag responses including version information."""
@pytest.mark.integration
def test_tag_response_includes_version(self, integration_client, test_package):
"""Test that tag responses include version of the artifact."""
project_name, package_name = test_package
# Upload with both version and tag
upload_test_file(
integration_client,
project_name,
package_name,
b"enriched tag test",
version="7.0.0",
tag="stable",
)
# Get tag and check version field
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags/stable"
)
assert response.status_code == 200
data = response.json()
assert data["name"] == "stable"
assert data["version"] == "7.0.0"
@pytest.mark.integration
def test_tag_list_includes_versions(self, integration_client, test_package):
"""Test that tag list responses include version for each tag."""
project_name, package_name = test_package
upload_test_file(
integration_client,
project_name,
package_name,
b"list version test",
version="8.0.0",
tag="latest",
)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/tags"
)
assert response.status_code == 200
data = response.json()
tag_item = next((t for t in data["items"] if t["name"] == "latest"), None)
assert tag_item is not None
assert tag_item.get("version") == "8.0.0"
class TestVersionPagination:
"""Tests for version listing pagination and sorting."""

View File

@@ -39,7 +39,7 @@ class TestDependencySchema:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
data={"tag": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 200
@@ -59,17 +59,29 @@ class TestDependencySchema:
integration_client.delete(f"/api/v1/projects/{dep_project_name}")
@pytest.mark.integration
def test_dependency_requires_version(self, integration_client):
"""Test that dependency requires version."""
def test_dependency_requires_version_or_tag(self, integration_client):
"""Test that dependency must have either version or tag, not both or neither."""
from app.schemas import DependencyCreate
# Test: missing version
with pytest.raises(ValidationError):
# Test: neither version nor tag
with pytest.raises(ValidationError) as exc_info:
DependencyCreate(project="proj", package="pkg")
assert "Either 'version' or 'tag' must be specified" in str(exc_info.value)
# Test: both version and tag
with pytest.raises(ValidationError) as exc_info:
DependencyCreate(project="proj", package="pkg", version="1.0.0", tag="stable")
assert "Cannot specify both 'version' and 'tag'" in str(exc_info.value)
# Test: valid with version
dep = DependencyCreate(project="proj", package="pkg", version="1.0.0")
assert dep.version == "1.0.0"
assert dep.tag is None
# Test: valid with tag
dep = DependencyCreate(project="proj", package="pkg", tag="stable")
assert dep.tag == "stable"
assert dep.version is None
@pytest.mark.integration
def test_dependency_unique_constraint(
@@ -114,7 +126,7 @@ class TestEnsureFileParsing:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
data={"tag": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 200
data = response.json()
@@ -150,7 +162,7 @@ class TestEnsureFileParsing:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
data={"tag": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 400
assert "Invalid ensure file" in response.json().get("detail", "")
@@ -176,7 +188,7 @@ class TestEnsureFileParsing:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
data={"tag": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 400
assert "Project" in response.json().get("detail", "")
@@ -196,7 +208,7 @@ class TestEnsureFileParsing:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-nodeps-{unique_test_id}"},
data={"tag": f"v1.0.0-nodeps-{unique_test_id}"},
)
assert response.status_code == 200
@@ -214,14 +226,13 @@ class TestEnsureFileParsing:
assert response.status_code == 200
try:
# Test with missing version field (version is now required)
ensure_content = yaml.dump({
"dependencies": [
{"project": dep_project_name, "package": "pkg"} # Missing version
{"project": dep_project_name, "package": "pkg", "version": "1.0.0", "tag": "stable"}
]
})
content = unique_content("test-missing-version", unique_test_id, "constraint")
content = unique_content("test-both", unique_test_id, "constraint")
files = {
"file": ("test.tar.gz", BytesIO(content), "application/gzip"),
"ensure": ("orchard.ensure", BytesIO(ensure_content.encode()), "application/x-yaml"),
@@ -229,10 +240,11 @@ class TestEnsureFileParsing:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
data={"tag": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 400
assert "version" in response.json().get("detail", "").lower()
assert "both" in response.json().get("detail", "").lower() or \
"version" in response.json().get("detail", "").lower()
finally:
integration_client.delete(f"/api/v1/projects/{dep_project_name}")
@@ -259,7 +271,7 @@ class TestDependencyQueryEndpoints:
ensure_content = yaml.dump({
"dependencies": [
{"project": dep_project_name, "package": "lib-a", "version": "1.0.0"},
{"project": dep_project_name, "package": "lib-b", "version": "2.0.0"},
{"project": dep_project_name, "package": "lib-b", "tag": "stable"},
]
})
@@ -271,7 +283,7 @@ class TestDependencyQueryEndpoints:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v2.0.0-{unique_test_id}"},
data={"tag": f"v2.0.0-{unique_test_id}"},
)
assert response.status_code == 200
artifact_id = response.json()["artifact_id"]
@@ -287,8 +299,10 @@ class TestDependencyQueryEndpoints:
deps = {d["package"]: d for d in data["dependencies"]}
assert "lib-a" in deps
assert deps["lib-a"]["version"] == "1.0.0"
assert deps["lib-a"]["tag"] is None
assert "lib-b" in deps
assert deps["lib-b"]["version"] == "2.0.0"
assert deps["lib-b"]["tag"] == "stable"
assert deps["lib-b"]["version"] is None
finally:
integration_client.delete(f"/api/v1/projects/{dep_project_name}")
@@ -322,7 +336,7 @@ class TestDependencyQueryEndpoints:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": tag_name},
data={"tag": tag_name},
)
assert response.status_code == 200
@@ -367,7 +381,7 @@ class TestDependencyQueryEndpoints:
response = integration_client.post(
f"/api/v1/project/{dep_project_name}/target-lib/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -386,7 +400,7 @@ class TestDependencyQueryEndpoints:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v4.0.0-{unique_test_id}"},
data={"tag": f"v4.0.0-{unique_test_id}"},
)
assert response.status_code == 200
@@ -405,6 +419,7 @@ class TestDependencyQueryEndpoints:
for dep in data["dependents"]:
if dep["project"] == project_name:
found = True
assert dep["constraint_type"] == "version"
assert dep["constraint_value"] == "1.0.0"
break
assert found, "Our package should be in the dependents list"
@@ -427,7 +442,7 @@ class TestDependencyQueryEndpoints:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v5.0.0-nodeps-{unique_test_id}"},
data={"tag": f"v5.0.0-nodeps-{unique_test_id}"},
)
assert response.status_code == 200
artifact_id = response.json()["artifact_id"]
@@ -467,7 +482,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_c}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -485,7 +500,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_b}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -503,7 +518,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -551,7 +566,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_d}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -569,7 +584,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_b}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -587,7 +602,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_c}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -606,7 +621,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -648,7 +663,7 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"solo-{unique_test_id}"},
data={"tag": f"solo-{unique_test_id}"},
)
assert response.status_code == 200
@@ -683,21 +698,17 @@ class TestDependencyResolution:
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"missing-dep-{unique_test_id}"},
data={"tag": f"missing-dep-{unique_test_id}"},
)
# Should fail at upload time since package doesn't exist
# OR succeed at upload but fail at resolution
# Depending on implementation choice
if response.status_code == 200:
# Resolution should return missing dependencies
# Resolution should fail
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/missing-dep-{unique_test_id}/resolve"
)
# Expect 200 with missing dependencies listed
assert response.status_code == 200
data = response.json()
# The missing dependency should be in the 'missing' list
assert len(data.get("missing", [])) >= 1
assert response.status_code == 404
class TestCircularDependencyDetection:
@@ -725,7 +736,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -743,7 +754,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_b}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -761,7 +772,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "2.0.0"},
data={"tag": "2.0.0"},
)
# Should be rejected with 400 (circular dependency)
assert response.status_code == 400
@@ -796,7 +807,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -814,7 +825,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_b}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -832,7 +843,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_c}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -850,7 +861,7 @@ class TestCircularDependencyDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "2.0.0"},
data={"tag": "2.0.0"},
)
assert response.status_code == 400
data = response.json()
@@ -873,14 +884,10 @@ class TestCircularDependencyDetection:
class TestConflictDetection:
"""Tests for dependency conflict handling.
The resolver uses "first version wins" strategy for version conflicts,
allowing resolution to succeed rather than failing with an error.
"""
"""Tests for #81: Dependency Conflict Detection and Reporting"""
@pytest.mark.integration
def test_version_conflict_uses_first_version(
def test_detect_version_conflict(
self, integration_client, test_project, unique_test_id
):
"""Test conflict when two deps require different versions of same package."""
@@ -903,7 +910,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_common}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -913,7 +920,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_common}/upload",
files=files,
data={"version": "2.0.0"},
data={"tag": "2.0.0"},
)
assert response.status_code == 200
@@ -931,7 +938,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_lib_a}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -949,7 +956,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_lib_b}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -968,23 +975,25 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_app}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
# Try to resolve app - with lenient conflict handling, this should succeed
# The resolver uses "first version wins" strategy for conflicting versions
# Try to resolve app - should report conflict
response = integration_client.get(
f"/api/v1/project/{test_project}/{pkg_app}/+/1.0.0/resolve"
)
assert response.status_code == 200
assert response.status_code == 409
data = response.json()
# Error details are nested in "detail" for HTTPException
detail = data.get("detail", data)
assert detail.get("error") == "dependency_conflict"
assert len(detail.get("conflicts", [])) > 0
# Resolution should succeed with first-encountered version of common
assert data["artifact_count"] >= 1
# Find the common package in resolved list
common_resolved = [r for r in data["resolved"] if r["package"] == pkg_common]
assert len(common_resolved) == 1 # Only one version should be included
# Verify conflict details
conflict = detail["conflicts"][0]
assert conflict["package"] == pkg_common
assert len(conflict["requirements"]) == 2
finally:
for pkg in [pkg_app, pkg_lib_a, pkg_lib_b, pkg_common]:
@@ -1014,7 +1023,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_common}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -1033,7 +1042,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{lib_pkg}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -1052,7 +1061,7 @@ class TestConflictDetection:
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_app}/upload",
files=files,
data={"version": "1.0.0"},
data={"tag": "1.0.0"},
)
assert response.status_code == 200
@@ -1069,277 +1078,3 @@ class TestConflictDetection:
finally:
for pkg in [pkg_app, pkg_lib_a, pkg_lib_b, pkg_common]:
integration_client.delete(f"/api/v1/project/{test_project}/packages/{pkg}")
class TestAutoFetchDependencies:
"""Tests for auto-fetch functionality in dependency resolution.
These tests verify:
- Resolution with auto_fetch=true (default) fetches missing dependencies from upstream
- Resolution with auto_fetch=false skips network calls for fast resolution
- Proper handling of missing/non-existent packages
- Response schema includes fetched artifacts list
"""
@pytest.mark.integration
def test_resolve_auto_fetch_true_is_default(
self, integration_client, test_package, unique_test_id
):
"""Test that auto_fetch=true is the default (no fetch needed when all deps cached)."""
project_name, package_name = test_package
# Upload a simple artifact without dependencies
content = unique_content("autofetch-default", unique_test_id, "nodeps")
files = {"file": ("default.tar.gz", BytesIO(content), "application/gzip")}
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v1.0.0-{unique_test_id}"},
)
assert response.status_code == 200
# Resolve without auto_fetch param (should default to false)
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/v1.0.0-{unique_test_id}/resolve"
)
assert response.status_code == 200
data = response.json()
# Should have empty fetched list
assert data.get("fetched", []) == []
assert data["artifact_count"] == 1
@pytest.mark.integration
def test_resolve_auto_fetch_explicit_false(
self, integration_client, test_package, unique_test_id
):
"""Test that auto_fetch=false works explicitly."""
project_name, package_name = test_package
content = unique_content("autofetch-explicit-false", unique_test_id, "nodeps")
files = {"file": ("explicit.tar.gz", BytesIO(content), "application/gzip")}
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v2.0.0-{unique_test_id}"},
)
assert response.status_code == 200
# Resolve with explicit auto_fetch=false
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/v2.0.0-{unique_test_id}/resolve",
params={"auto_fetch": "false"},
)
assert response.status_code == 200
data = response.json()
assert data.get("fetched", []) == []
@pytest.mark.integration
def test_resolve_auto_fetch_true_no_missing_deps(
self, integration_client, test_project, unique_test_id
):
"""Test that auto_fetch=true works when all deps are already cached."""
pkg_a = f"fetch-a-{unique_test_id}"
pkg_b = f"fetch-b-{unique_test_id}"
for pkg in [pkg_a, pkg_b]:
response = integration_client.post(
f"/api/v1/project/{test_project}/packages",
json={"name": pkg}
)
assert response.status_code == 200
try:
# Upload B (no deps)
content_b = unique_content("B", unique_test_id, "fetch")
files = {"file": ("b.tar.gz", BytesIO(content_b), "application/gzip")}
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_b}/upload",
files=files,
data={"version": "1.0.0"},
)
assert response.status_code == 200
# Upload A (depends on B)
ensure_a = yaml.dump({
"dependencies": [
{"project": test_project, "package": pkg_b, "version": "1.0.0"}
]
})
content_a = unique_content("A", unique_test_id, "fetch")
files = {
"file": ("a.tar.gz", BytesIO(content_a), "application/gzip"),
"ensure": ("orchard.ensure", BytesIO(ensure_a.encode()), "application/x-yaml"),
}
response = integration_client.post(
f"/api/v1/project/{test_project}/{pkg_a}/upload",
files=files,
data={"version": "1.0.0"},
)
assert response.status_code == 200
# Resolve with auto_fetch=true - should work since deps are cached
response = integration_client.get(
f"/api/v1/project/{test_project}/{pkg_a}/+/1.0.0/resolve",
params={"auto_fetch": "true"},
)
assert response.status_code == 200
data = response.json()
# Should resolve successfully
assert data["artifact_count"] == 2
# Nothing fetched since everything was cached
assert len(data.get("fetched", [])) == 0
# No missing deps
assert len(data.get("missing", [])) == 0
finally:
for pkg in [pkg_a, pkg_b]:
integration_client.delete(f"/api/v1/project/{test_project}/packages/{pkg}")
@pytest.mark.integration
def test_resolve_missing_dep_with_auto_fetch_false(
self, integration_client, test_package, unique_test_id
):
"""Test that missing deps are reported when auto_fetch=false."""
project_name, package_name = test_package
# Create _pypi system project if it doesn't exist
response = integration_client.get("/api/v1/projects/_pypi")
if response.status_code == 404:
response = integration_client.post(
"/api/v1/projects",
json={"name": "_pypi", "description": "System project for PyPI packages"}
)
# May fail if already exists or can't create - that's ok
# Upload artifact with dependency on _pypi package that doesn't exist locally
ensure_content = yaml.dump({
"dependencies": [
{"project": "_pypi", "package": "nonexistent-pkg-xyz123", "version": ">=1.0.0"}
]
})
content = unique_content("missing-pypi", unique_test_id, "dep")
files = {
"file": ("missing-pypi-dep.tar.gz", BytesIO(content), "application/gzip"),
"ensure": ("orchard.ensure", BytesIO(ensure_content.encode()), "application/x-yaml"),
}
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v3.0.0-{unique_test_id}"},
)
# Upload should succeed - validation is loose for system projects
if response.status_code == 200:
# Resolve without auto_fetch - should report missing
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/v3.0.0-{unique_test_id}/resolve",
params={"auto_fetch": "false"},
)
assert response.status_code == 200
data = response.json()
# Should have missing dependencies
assert len(data.get("missing", [])) >= 1
# Verify missing dependency structure
missing = data["missing"][0]
assert missing["project"] == "_pypi"
assert missing["package"] == "nonexistent-pkg-xyz123"
# Without auto_fetch, these should be false/None
assert missing.get("fetch_attempted", False) is False
@pytest.mark.integration
def test_resolve_response_schema_has_fetched_field(
self, integration_client, test_package, unique_test_id
):
"""Test that the resolve response always includes the fetched field."""
project_name, package_name = test_package
content = unique_content("schema-check", unique_test_id, "nodeps")
files = {"file": ("schema.tar.gz", BytesIO(content), "application/gzip")}
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v4.0.0-{unique_test_id}"},
)
assert response.status_code == 200
# Check both auto_fetch modes include fetched field
for auto_fetch in ["false", "true"]:
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/v4.0.0-{unique_test_id}/resolve",
params={"auto_fetch": auto_fetch},
)
assert response.status_code == 200
data = response.json()
# Required fields
assert "requested" in data
assert "resolved" in data
assert "missing" in data
assert "fetched" in data # New field
assert "total_size" in data
assert "artifact_count" in data
# Types
assert isinstance(data["fetched"], list)
assert isinstance(data["missing"], list)
@pytest.mark.integration
def test_missing_dep_schema_has_fetch_fields(
self, integration_client, test_package, unique_test_id
):
"""Test that missing dependency entries have fetch_attempted and fetch_error fields."""
project_name, package_name = test_package
# Create a dependency on a non-existent package in a real project
dep_project_name = f"dep-test-{unique_test_id}"
response = integration_client.post(
"/api/v1/projects", json={"name": dep_project_name}
)
assert response.status_code == 200
try:
ensure_content = yaml.dump({
"dependencies": [
{"project": dep_project_name, "package": "nonexistent-pkg", "version": "1.0.0"}
]
})
content = unique_content("missing-schema", unique_test_id, "check")
files = {
"file": ("missing-schema.tar.gz", BytesIO(content), "application/gzip"),
"ensure": ("orchard.ensure", BytesIO(ensure_content.encode()), "application/x-yaml"),
}
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
files=files,
data={"version": f"v5.0.0-{unique_test_id}"},
)
assert response.status_code == 200
# Resolve
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/v5.0.0-{unique_test_id}/resolve",
params={"auto_fetch": "true"},
)
assert response.status_code == 200
data = response.json()
# Should have missing dependencies
assert len(data.get("missing", [])) >= 1
# Check schema for missing dependency
missing = data["missing"][0]
assert "project" in missing
assert "package" in missing
assert "constraint" in missing
assert "required_by" in missing
# New fields
assert "fetch_attempted" in missing
assert "fetch_error" in missing # May be None
finally:
integration_client.delete(f"/api/v1/projects/{dep_project_name}")

View File

@@ -26,16 +26,16 @@ def upload_test_file(integration_client):
Factory fixture to upload a test file and return its artifact ID.
Usage:
artifact_id = upload_test_file(project, package, content, version="v1.0")
artifact_id = upload_test_file(project, package, content, tag="v1.0")
"""
def _upload(project_name: str, package_name: str, content: bytes, version: str = None):
def _upload(project_name: str, package_name: str, content: bytes, tag: str = None):
files = {
"file": ("test-file.bin", io.BytesIO(content), "application/octet-stream")
}
data = {}
if version:
data["version"] = version
if tag:
data["tag"] = tag
response = integration_client.post(
f"/api/v1/project/{project_name}/{package_name}/upload",
@@ -66,7 +66,7 @@ class TestDownloadChecksumHeaders:
# Upload file
artifact_id = upload_test_file(
project_name, package_name, content, version="sha256-header-test"
project_name, package_name, content, tag="sha256-header-test"
)
# Download with proxy mode
@@ -88,7 +88,7 @@ class TestDownloadChecksumHeaders:
content = b"Content for ETag header test"
artifact_id = upload_test_file(
project_name, package_name, content, version="etag-test"
project_name, package_name, content, tag="etag-test"
)
response = integration_client.get(
@@ -110,7 +110,7 @@ class TestDownloadChecksumHeaders:
content = b"Content for Digest header test"
sha256 = hashlib.sha256(content).hexdigest()
upload_test_file(project_name, package_name, content, version="digest-test")
upload_test_file(project_name, package_name, content, tag="digest-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/digest-test",
@@ -137,7 +137,7 @@ class TestDownloadChecksumHeaders:
project_name, package_name = test_package
content = b"Content for X-Content-Length test"
upload_test_file(project_name, package_name, content, version="content-length-test")
upload_test_file(project_name, package_name, content, tag="content-length-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/content-length-test",
@@ -156,7 +156,7 @@ class TestDownloadChecksumHeaders:
project_name, package_name = test_package
content = b"Content for X-Verified false test"
upload_test_file(project_name, package_name, content, version="verified-false-test")
upload_test_file(project_name, package_name, content, tag="verified-false-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/verified-false-test",
@@ -184,7 +184,7 @@ class TestPreVerificationMode:
project_name, package_name = test_package
content = b"Content for pre-verification success test"
upload_test_file(project_name, package_name, content, version="pre-verify-success")
upload_test_file(project_name, package_name, content, tag="pre-verify-success")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/pre-verify-success",
@@ -205,7 +205,7 @@ class TestPreVerificationMode:
# Use binary content to verify no corruption
content = bytes(range(256)) * 10 # 2560 bytes of all byte values
upload_test_file(project_name, package_name, content, version="pre-verify-content")
upload_test_file(project_name, package_name, content, tag="pre-verify-content")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/pre-verify-content",
@@ -233,7 +233,7 @@ class TestStreamingVerificationMode:
content = b"Content for streaming verification success test"
upload_test_file(
project_name, package_name, content, version="stream-verify-success"
project_name, package_name, content, tag="stream-verify-success"
)
response = integration_client.get(
@@ -255,7 +255,7 @@ class TestStreamingVerificationMode:
# 100KB of content
content = b"x" * (100 * 1024)
upload_test_file(project_name, package_name, content, version="stream-verify-large")
upload_test_file(project_name, package_name, content, tag="stream-verify-large")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/stream-verify-large",
@@ -283,7 +283,7 @@ class TestHeadRequestHeaders:
content = b"Content for HEAD SHA256 test"
artifact_id = upload_test_file(
project_name, package_name, content, version="head-sha256-test"
project_name, package_name, content, tag="head-sha256-test"
)
response = integration_client.head(
@@ -303,7 +303,7 @@ class TestHeadRequestHeaders:
content = b"Content for HEAD ETag test"
artifact_id = upload_test_file(
project_name, package_name, content, version="head-etag-test"
project_name, package_name, content, tag="head-etag-test"
)
response = integration_client.head(
@@ -322,7 +322,7 @@ class TestHeadRequestHeaders:
project_name, package_name = test_package
content = b"Content for HEAD Digest test"
upload_test_file(project_name, package_name, content, version="head-digest-test")
upload_test_file(project_name, package_name, content, tag="head-digest-test")
response = integration_client.head(
f"/api/v1/project/{project_name}/{package_name}/+/head-digest-test"
@@ -340,7 +340,7 @@ class TestHeadRequestHeaders:
project_name, package_name = test_package
content = b"Content for HEAD Content-Length test"
upload_test_file(project_name, package_name, content, version="head-length-test")
upload_test_file(project_name, package_name, content, tag="head-length-test")
response = integration_client.head(
f"/api/v1/project/{project_name}/{package_name}/+/head-length-test"
@@ -356,7 +356,7 @@ class TestHeadRequestHeaders:
project_name, package_name = test_package
content = b"Content for HEAD no-body test"
upload_test_file(project_name, package_name, content, version="head-no-body-test")
upload_test_file(project_name, package_name, content, tag="head-no-body-test")
response = integration_client.head(
f"/api/v1/project/{project_name}/{package_name}/+/head-no-body-test"
@@ -382,7 +382,7 @@ class TestRangeRequestHeaders:
project_name, package_name = test_package
content = b"Content for range request checksum header test"
upload_test_file(project_name, package_name, content, version="range-checksum-test")
upload_test_file(project_name, package_name, content, tag="range-checksum-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/range-checksum-test",
@@ -412,7 +412,7 @@ class TestClientSideVerification:
project_name, package_name = test_package
content = b"Content for client-side verification test"
upload_test_file(project_name, package_name, content, version="client-verify-test")
upload_test_file(project_name, package_name, content, tag="client-verify-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/client-verify-test",
@@ -438,7 +438,7 @@ class TestClientSideVerification:
project_name, package_name = test_package
content = b"Content for Digest header verification"
upload_test_file(project_name, package_name, content, version="digest-verify-test")
upload_test_file(project_name, package_name, content, tag="digest-verify-test")
response = integration_client.get(
f"/api/v1/project/{project_name}/{package_name}/+/digest-verify-test",

View File

@@ -192,6 +192,7 @@ class TestCacheSettingsModel:
settings = CacheSettings()
assert hasattr(settings, 'id')
assert hasattr(settings, 'allow_public_internet')
assert hasattr(settings, 'auto_create_system_projects')
def test_model_with_values(self):
@@ -200,9 +201,11 @@ class TestCacheSettingsModel:
settings = CacheSettings(
id=1,
allow_public_internet=False,
auto_create_system_projects=True,
)
assert settings.id == 1
assert settings.allow_public_internet is False
assert settings.auto_create_system_projects is True
@@ -362,14 +365,16 @@ class TestCacheSettingsSchemas:
from app.schemas import CacheSettingsUpdate
update = CacheSettingsUpdate()
assert update.allow_public_internet is None
assert update.auto_create_system_projects is None
def test_update_schema_partial(self):
"""Test CacheSettingsUpdate with partial fields."""
from app.schemas import CacheSettingsUpdate
update = CacheSettingsUpdate(auto_create_system_projects=True)
assert update.auto_create_system_projects is True
update = CacheSettingsUpdate(allow_public_internet=False)
assert update.allow_public_internet is False
assert update.auto_create_system_projects is None
class TestCacheRequestSchemas:
@@ -383,7 +388,7 @@ class TestCacheRequestSchemas:
url="https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
source_type="npm",
package_name="lodash",
version="4.17.21",
tag="4.17.21",
)
assert request.url == "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz"
assert request.source_type == "npm"
@@ -1132,7 +1137,7 @@ class TestCacheRequestValidation:
url="https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz",
source_type="npm",
package_name="lodash",
version="4.17.21",
tag="4.17.21",
)
assert request.url == "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz"
assert request.source_type == "npm"
@@ -1599,9 +1604,11 @@ class TestCacheSettingsAdminAPI:
data = response.json()
# Check expected fields exist
assert "allow_public_internet" in data
assert "auto_create_system_projects" in data
# Check types
assert isinstance(data["allow_public_internet"], bool)
assert isinstance(data["auto_create_system_projects"], bool)
@pytest.mark.integration
@@ -1614,7 +1621,7 @@ class TestCacheSettingsAdminAPI:
with httpx.Client(base_url=base_url, timeout=30.0) as unauthenticated_client:
response = unauthenticated_client.put(
"/api/v1/admin/cache-settings",
json={"auto_create_system_projects": False},
json={"allow_public_internet": False},
)
assert response.status_code in (401, 403)
@@ -1628,43 +1635,76 @@ class TestCacheSettingsAdminAPI:
response = integration_client.put(
"/api/v1/admin/cache-settings",
json={
"allow_public_internet": not original["allow_public_internet"],
"auto_create_system_projects": not original["auto_create_system_projects"],
},
)
assert response.status_code == 200
data = response.json()
assert data["allow_public_internet"] == (not original["allow_public_internet"])
assert data["auto_create_system_projects"] == (not original["auto_create_system_projects"])
# Restore original settings
integration_client.put(
"/api/v1/admin/cache-settings",
json={
"allow_public_internet": original["allow_public_internet"],
"auto_create_system_projects": original["auto_create_system_projects"],
},
)
@pytest.mark.integration
def test_update_cache_settings_allow_public_internet(self, integration_client):
"""Test enabling and disabling public internet access (air-gap mode)."""
# First get current settings to restore later
original = integration_client.get("/api/v1/admin/cache-settings").json()
# Disable public internet (enable air-gap mode)
response = integration_client.put(
"/api/v1/admin/cache-settings",
json={"allow_public_internet": False},
)
assert response.status_code == 200
assert response.json()["allow_public_internet"] is False
# Enable public internet (disable air-gap mode)
response = integration_client.put(
"/api/v1/admin/cache-settings",
json={"allow_public_internet": True},
)
assert response.status_code == 200
assert response.json()["allow_public_internet"] is True
# Restore original settings
integration_client.put(
"/api/v1/admin/cache-settings",
json={"allow_public_internet": original["allow_public_internet"]},
)
@pytest.mark.integration
def test_update_cache_settings_partial(self, integration_client):
"""Test that partial updates only change specified fields."""
# Get current settings
original = integration_client.get("/api/v1/admin/cache-settings").json()
# Update only auto_create_system_projects
new_value = not original["auto_create_system_projects"]
# Update only allow_public_internet
new_value = not original["allow_public_internet"]
response = integration_client.put(
"/api/v1/admin/cache-settings",
json={"auto_create_system_projects": new_value},
json={"allow_public_internet": new_value},
)
assert response.status_code == 200
data = response.json()
assert data["auto_create_system_projects"] == new_value
assert data["allow_public_internet"] == new_value
# Other field should be unchanged
assert data["auto_create_system_projects"] == original["auto_create_system_projects"]
# Restore
integration_client.put(
"/api/v1/admin/cache-settings",
json={"auto_create_system_projects": original["auto_create_system_projects"]},
json={"allow_public_internet": original["allow_public_internet"]},
)
@pytest.mark.integration
@@ -1902,4 +1942,5 @@ class TestCacheSettingsEnvOverride:
data = response.json()
# These fields should exist (may be null if no env override)
assert "allow_public_internet_env_override" in data
assert "auto_create_system_projects_env_override" in data

View File

@@ -1,374 +0,0 @@
"""Tests for CacheService."""
import pytest
from unittest.mock import MagicMock, AsyncMock, patch
class TestCacheCategory:
"""Tests for cache category enum."""
@pytest.mark.unit
def test_immutable_categories_have_no_ttl(self):
"""Immutable categories should return None for TTL."""
from app.cache_service import CacheCategory, get_category_ttl
from app.config import Settings
settings = Settings()
assert get_category_ttl(CacheCategory.ARTIFACT_METADATA, settings) is None
assert get_category_ttl(CacheCategory.ARTIFACT_DEPENDENCIES, settings) is None
assert get_category_ttl(CacheCategory.DEPENDENCY_RESOLUTION, settings) is None
@pytest.mark.unit
def test_mutable_categories_have_ttl(self):
"""Mutable categories should return configured TTL."""
from app.cache_service import CacheCategory, get_category_ttl
from app.config import Settings
settings = Settings(
cache_ttl_index=300,
cache_ttl_upstream=3600,
)
assert get_category_ttl(CacheCategory.PACKAGE_INDEX, settings) == 300
assert get_category_ttl(CacheCategory.UPSTREAM_SOURCES, settings) == 3600
class TestCacheService:
"""Tests for Redis cache service."""
@pytest.mark.asyncio
@pytest.mark.unit
async def test_disabled_cache_returns_none(self):
"""When Redis disabled, get() should return None."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key")
assert result is None
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_disabled_cache_set_is_noop(self):
"""When Redis disabled, set() should be a no-op."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
# Should not raise
await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"test-value")
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_cache_key_namespacing(self):
"""Cache keys should be properly namespaced."""
from app.cache_service import CacheService, CacheCategory
key = CacheService._make_key(CacheCategory.PACKAGE_INDEX, "pypi", "numpy")
assert key == "orchard:index:pypi:numpy"
@pytest.mark.asyncio
@pytest.mark.unit
async def test_ping_returns_false_when_disabled(self):
"""ping() should return False when Redis is disabled."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
result = await cache.ping()
assert result is False
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_enabled_property(self):
"""enabled property should reflect Redis state."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
assert cache.enabled is False
@pytest.mark.asyncio
@pytest.mark.unit
async def test_delete_is_noop_when_disabled(self):
"""delete() should be a no-op when Redis is disabled."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
# Should not raise
await cache.delete(CacheCategory.PACKAGE_INDEX, "test-key")
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_invalidate_pattern_returns_zero_when_disabled(self):
"""invalidate_pattern() should return 0 when Redis is disabled."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
result = await cache.invalidate_pattern(CacheCategory.PACKAGE_INDEX)
assert result == 0
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_startup_already_started(self):
"""startup() should be idempotent."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
await cache.startup()
await cache.startup() # Should not raise
assert cache._started is True
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_shutdown_not_started(self):
"""shutdown() should handle not-started state."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=False)
cache = CacheService(settings)
# Should not raise
await cache.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_make_key_with_default_protocol(self):
"""_make_key should work with default protocol."""
from app.cache_service import CacheService, CacheCategory
key = CacheService._make_key(CacheCategory.ARTIFACT_METADATA, "default", "abc123")
assert key == "orchard:artifact:default:abc123"
class TestCacheServiceWithMockedRedis:
"""Tests for CacheService with mocked Redis client."""
@pytest.mark.asyncio
@pytest.mark.unit
async def test_get_returns_cached_value(self):
"""get() should return cached value when available."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
# Mock the redis client
mock_redis = AsyncMock()
mock_redis.get.return_value = b"cached-data"
cache._redis = mock_redis
cache._enabled = True
cache._started = True
result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key", "pypi")
assert result == b"cached-data"
mock_redis.get.assert_called_once_with("orchard:index:pypi:test-key")
@pytest.mark.asyncio
@pytest.mark.unit
async def test_set_with_ttl(self):
"""set() should use setex for mutable categories."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True, cache_ttl_index=300)
cache = CacheService(settings)
mock_redis = AsyncMock()
cache._redis = mock_redis
cache._enabled = True
cache._started = True
await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"test-value", "pypi")
mock_redis.setex.assert_called_once_with(
"orchard:index:pypi:test-key", 300, b"test-value"
)
@pytest.mark.asyncio
@pytest.mark.unit
async def test_set_without_ttl(self):
"""set() should use set (no expiry) for immutable categories."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
cache._redis = mock_redis
cache._enabled = True
cache._started = True
await cache.set(
CacheCategory.ARTIFACT_METADATA, "abc123", b"metadata", "pypi"
)
mock_redis.set.assert_called_once_with(
"orchard:artifact:pypi:abc123", b"metadata"
)
@pytest.mark.asyncio
@pytest.mark.unit
async def test_delete_calls_redis_delete(self):
"""delete() should call Redis delete."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
cache._redis = mock_redis
cache._enabled = True
cache._started = True
await cache.delete(CacheCategory.PACKAGE_INDEX, "test-key", "pypi")
mock_redis.delete.assert_called_once_with("orchard:index:pypi:test-key")
@pytest.mark.asyncio
@pytest.mark.unit
async def test_invalidate_pattern_deletes_matching_keys(self):
"""invalidate_pattern() should delete all matching keys."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
# Create an async generator for scan_iter
async def mock_scan_iter(match=None):
for key in [b"orchard:index:pypi:numpy", b"orchard:index:pypi:requests"]:
yield key
mock_redis.scan_iter = mock_scan_iter
mock_redis.delete.return_value = 2
cache._redis = mock_redis
cache._enabled = True
cache._started = True
result = await cache.invalidate_pattern(CacheCategory.PACKAGE_INDEX, "*", "pypi")
assert result == 2
mock_redis.delete.assert_called_once()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_ping_returns_true_when_connected(self):
"""ping() should return True when Redis responds."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
mock_redis.ping.return_value = True
cache._redis = mock_redis
cache._enabled = True
cache._started = True
result = await cache.ping()
assert result is True
@pytest.mark.asyncio
@pytest.mark.unit
async def test_get_handles_exception(self):
"""get() should return None and log warning on exception."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
mock_redis.get.side_effect = Exception("Connection lost")
cache._redis = mock_redis
cache._enabled = True
cache._started = True
result = await cache.get(CacheCategory.PACKAGE_INDEX, "test-key")
assert result is None
@pytest.mark.asyncio
@pytest.mark.unit
async def test_set_handles_exception(self):
"""set() should log warning on exception."""
from app.cache_service import CacheService, CacheCategory
from app.config import Settings
settings = Settings(redis_enabled=True, cache_ttl_index=300)
cache = CacheService(settings)
mock_redis = AsyncMock()
mock_redis.setex.side_effect = Exception("Connection lost")
cache._redis = mock_redis
cache._enabled = True
cache._started = True
# Should not raise
await cache.set(CacheCategory.PACKAGE_INDEX, "test-key", b"value")
@pytest.mark.asyncio
@pytest.mark.unit
async def test_ping_returns_false_on_exception(self):
"""ping() should return False on exception."""
from app.cache_service import CacheService
from app.config import Settings
settings = Settings(redis_enabled=True)
cache = CacheService(settings)
mock_redis = AsyncMock()
mock_redis.ping.side_effect = Exception("Connection lost")
cache._redis = mock_redis
cache._enabled = True
cache._started = True
result = await cache.ping()
assert result is False

View File

@@ -1,167 +0,0 @@
"""Tests for database utility functions."""
import pytest
from unittest.mock import MagicMock, patch
class TestArtifactRepository:
"""Tests for ArtifactRepository."""
def test_batch_dependency_values_formatting(self):
"""batch_upsert_dependencies should format values correctly."""
from app.db_utils import ArtifactRepository
deps = [
("_pypi", "numpy", ">=1.21.0"),
("_pypi", "requests", "*"),
("myproject", "mylib", "==1.0.0"),
]
values = ArtifactRepository._format_dependency_values("abc123", deps)
assert len(values) == 3
assert values[0] == {
"artifact_id": "abc123",
"dependency_project": "_pypi",
"dependency_package": "numpy",
"version_constraint": ">=1.21.0",
}
assert values[2]["dependency_project"] == "myproject"
def test_empty_dependencies_returns_empty_list(self):
"""Empty dependency list should return empty values."""
from app.db_utils import ArtifactRepository
values = ArtifactRepository._format_dependency_values("abc123", [])
assert values == []
def test_format_dependency_values_preserves_special_characters(self):
"""Version constraints with special characters should be preserved."""
from app.db_utils import ArtifactRepository
deps = [
("_pypi", "package-name", ">=1.0.0,<2.0.0"),
("_pypi", "another_pkg", "~=1.4.2"),
]
values = ArtifactRepository._format_dependency_values("hash123", deps)
assert values[0]["version_constraint"] == ">=1.0.0,<2.0.0"
assert values[1]["version_constraint"] == "~=1.4.2"
def test_batch_upsert_dependencies_returns_zero_for_empty(self):
"""batch_upsert_dependencies should return 0 for empty list without DB call."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
repo = ArtifactRepository(mock_db)
result = repo.batch_upsert_dependencies("abc123", [])
assert result == 0
# Verify no DB operations were performed
mock_db.execute.assert_not_called()
def test_get_or_create_artifact_builds_correct_statement(self):
"""get_or_create_artifact should use ON CONFLICT DO UPDATE."""
from app.db_utils import ArtifactRepository
from app.models import Artifact
mock_db = MagicMock()
mock_result = MagicMock()
mock_artifact = MagicMock()
mock_artifact.ref_count = 1
mock_result.scalar_one.return_value = mock_artifact
mock_db.execute.return_value = mock_result
repo = ArtifactRepository(mock_db)
artifact, created = repo.get_or_create_artifact(
sha256="abc123def456",
size=1024,
filename="test.whl",
content_type="application/zip",
)
assert mock_db.execute.called
assert created is True
assert artifact == mock_artifact
def test_get_or_create_artifact_existing_not_created(self):
"""get_or_create_artifact should return created=False for existing artifact."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
mock_result = MagicMock()
mock_artifact = MagicMock()
mock_artifact.ref_count = 5 # Existing artifact with ref_count > 1
mock_result.scalar_one.return_value = mock_artifact
mock_db.execute.return_value = mock_result
repo = ArtifactRepository(mock_db)
artifact, created = repo.get_or_create_artifact(
sha256="abc123def456",
size=1024,
filename="test.whl",
)
assert created is False
def test_get_cached_url_with_artifact_returns_tuple(self):
"""get_cached_url_with_artifact should return (CachedUrl, Artifact) tuple."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
mock_cached_url = MagicMock()
mock_artifact = MagicMock()
mock_db.query.return_value.join.return_value.filter.return_value.first.return_value = (
mock_cached_url,
mock_artifact,
)
repo = ArtifactRepository(mock_db)
result = repo.get_cached_url_with_artifact("url_hash_123")
assert result == (mock_cached_url, mock_artifact)
def test_get_cached_url_with_artifact_returns_none_when_not_found(self):
"""get_cached_url_with_artifact should return None when URL not cached."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
mock_db.query.return_value.join.return_value.filter.return_value.first.return_value = None
repo = ArtifactRepository(mock_db)
result = repo.get_cached_url_with_artifact("nonexistent_hash")
assert result is None
def test_get_artifact_dependencies_returns_list(self):
"""get_artifact_dependencies should return list of dependencies."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
mock_dep1 = MagicMock()
mock_dep2 = MagicMock()
mock_db.query.return_value.filter.return_value.all.return_value = [
mock_dep1,
mock_dep2,
]
repo = ArtifactRepository(mock_db)
result = repo.get_artifact_dependencies("artifact_hash_123")
assert len(result) == 2
assert result[0] == mock_dep1
assert result[1] == mock_dep2
def test_get_artifact_dependencies_returns_empty_list(self):
"""get_artifact_dependencies should return empty list when no dependencies."""
from app.db_utils import ArtifactRepository
mock_db = MagicMock()
mock_db.query.return_value.filter.return_value.all.return_value = []
repo = ArtifactRepository(mock_db)
result = repo.get_artifact_dependencies("artifact_without_deps")
assert result == []

View File

@@ -1,194 +0,0 @@
"""Tests for HttpClientManager."""
import pytest
from unittest.mock import MagicMock, AsyncMock, patch
class TestHttpClientManager:
"""Tests for HTTP client pool management."""
@pytest.mark.unit
def test_manager_initializes_with_settings(self):
"""Manager should initialize with config settings."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings(
http_max_connections=50,
http_connect_timeout=15.0,
)
manager = HttpClientManager(settings)
assert manager.max_connections == 50
assert manager.connect_timeout == 15.0
assert manager._default_client is None # Not started yet
@pytest.mark.asyncio
@pytest.mark.unit
async def test_startup_creates_client(self):
"""Startup should create the default async client."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
assert manager._default_client is not None
await manager.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_shutdown_closes_client(self):
"""Shutdown should close all clients gracefully."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
client = manager._default_client
await manager.shutdown()
assert manager._default_client is None
assert client.is_closed
@pytest.mark.asyncio
@pytest.mark.unit
async def test_get_client_returns_default(self):
"""get_client() should return the default client."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
client = manager.get_client()
assert client is manager._default_client
await manager.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_get_client_raises_if_not_started(self):
"""get_client() should raise RuntimeError if manager not started."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
with pytest.raises(RuntimeError, match="not started"):
manager.get_client()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_run_blocking_executes_in_thread_pool(self):
"""run_blocking should execute sync functions in thread pool."""
from app.http_client import HttpClientManager
from app.config import Settings
import threading
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
main_thread = threading.current_thread()
execution_thread = None
def blocking_func():
nonlocal execution_thread
execution_thread = threading.current_thread()
return "result"
result = await manager.run_blocking(blocking_func)
assert result == "result"
assert execution_thread is not main_thread
await manager.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_run_blocking_raises_if_not_started(self):
"""run_blocking should raise RuntimeError if manager not started."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
with pytest.raises(RuntimeError, match="not started"):
await manager.run_blocking(lambda: None)
@pytest.mark.asyncio
@pytest.mark.unit
async def test_startup_idempotent(self):
"""Calling startup multiple times should be safe."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
client1 = manager._default_client
await manager.startup() # Should not create a new client
client2 = manager._default_client
assert client1 is client2 # Same client instance
await manager.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_shutdown_idempotent(self):
"""Calling shutdown multiple times should be safe."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
await manager.startup()
await manager.shutdown()
await manager.shutdown() # Should not raise
assert manager._default_client is None
@pytest.mark.asyncio
@pytest.mark.unit
async def test_properties_return_configured_values(self):
"""Properties should return configured values."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings(
http_max_connections=75,
http_worker_threads=16,
)
manager = HttpClientManager(settings)
await manager.startup()
assert manager.pool_size == 75
assert manager.executor_max == 16
await manager.shutdown()
@pytest.mark.asyncio
@pytest.mark.unit
async def test_active_connections_when_not_started(self):
"""active_connections should return 0 when not started."""
from app.http_client import HttpClientManager
from app.config import Settings
settings = Settings()
manager = HttpClientManager(settings)
assert manager.active_connections == 0

View File

@@ -1,243 +0,0 @@
"""Unit tests for metadata extraction functionality."""
import io
import gzip
import tarfile
import zipfile
import pytest
from app.metadata import (
extract_metadata,
extract_deb_metadata,
extract_wheel_metadata,
extract_tarball_metadata,
extract_jar_metadata,
parse_deb_control,
)
class TestDebMetadata:
"""Tests for Debian package metadata extraction."""
def test_parse_deb_control_basic(self):
"""Test parsing a basic control file."""
control = """Package: my-package
Version: 1.2.3
Architecture: amd64
Maintainer: Test <test@example.com>
Description: A test package
"""
result = parse_deb_control(control)
assert result["package_name"] == "my-package"
assert result["version"] == "1.2.3"
assert result["architecture"] == "amd64"
assert result["format"] == "deb"
def test_parse_deb_control_with_epoch(self):
"""Test parsing version with epoch."""
control = """Package: another-pkg
Version: 2:1.0.0-1
"""
result = parse_deb_control(control)
assert result["version"] == "2:1.0.0-1"
assert result["package_name"] == "another-pkg"
assert result["format"] == "deb"
def test_extract_deb_metadata_invalid_magic(self):
"""Test that invalid ar magic returns empty dict."""
file = io.BytesIO(b"not an ar archive")
result = extract_deb_metadata(file)
assert result == {}
def test_extract_deb_metadata_valid_ar_no_control(self):
"""Test ar archive without control.tar returns empty."""
# Create minimal ar archive with just debian-binary
ar_data = b"!<arch>\n"
ar_data += b"debian-binary/ 0 0 0 100644 4 `\n"
ar_data += b"2.0\n"
file = io.BytesIO(ar_data)
result = extract_deb_metadata(file)
# Should return empty since no control.tar found
assert result == {} or "version" not in result
class TestWheelMetadata:
"""Tests for Python wheel metadata extraction."""
def _create_wheel_with_metadata(self, metadata_content: str) -> io.BytesIO:
"""Helper to create a wheel file with given METADATA content."""
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
zf.writestr('package-1.0.0.dist-info/METADATA', metadata_content)
buf.seek(0)
return buf
def test_extract_wheel_version(self):
"""Test extracting version from wheel METADATA."""
metadata = """Metadata-Version: 2.1
Name: my-package
Version: 2.3.4
Summary: A test package
"""
file = self._create_wheel_with_metadata(metadata)
result = extract_wheel_metadata(file)
assert result.get("version") == "2.3.4"
assert result.get("package_name") == "my-package"
assert result.get("format") == "wheel"
def test_extract_wheel_no_version(self):
"""Test wheel without version field."""
metadata = """Metadata-Version: 2.1
Name: no-version-pkg
"""
file = self._create_wheel_with_metadata(metadata)
result = extract_wheel_metadata(file)
assert "version" not in result
assert result.get("package_name") == "no-version-pkg"
assert result.get("format") == "wheel"
def test_extract_wheel_invalid_zip(self):
"""Test that invalid zip returns format-only dict."""
file = io.BytesIO(b"not a zip file")
result = extract_wheel_metadata(file)
assert result == {"format": "wheel"}
def test_extract_wheel_no_metadata_file(self):
"""Test wheel without METADATA file returns format-only dict."""
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
zf.writestr('some_file.py', 'print("hello")')
buf.seek(0)
result = extract_wheel_metadata(buf)
assert result == {"format": "wheel"}
class TestTarballMetadata:
"""Tests for tarball metadata extraction from filename."""
def test_extract_version_from_filename_standard(self):
"""Test standard package-version.tar.gz format."""
file = io.BytesIO(b"") # Content doesn't matter for filename extraction
result = extract_tarball_metadata(file, "mypackage-1.2.3.tar.gz")
assert result.get("version") == "1.2.3"
assert result.get("package_name") == "mypackage"
assert result.get("format") == "tarball"
def test_extract_version_with_v_prefix(self):
"""Test version with v prefix."""
file = io.BytesIO(b"")
result = extract_tarball_metadata(file, "package-v2.0.0.tar.gz")
assert result.get("version") == "2.0.0"
assert result.get("package_name") == "package"
assert result.get("format") == "tarball"
def test_extract_version_underscore_separator(self):
"""Test package_version format."""
file = io.BytesIO(b"")
result = extract_tarball_metadata(file, "my_package_3.1.4.tar.gz")
assert result.get("version") == "3.1.4"
assert result.get("package_name") == "my_package"
assert result.get("format") == "tarball"
def test_extract_version_complex(self):
"""Test complex version string."""
file = io.BytesIO(b"")
result = extract_tarball_metadata(file, "package-1.0.0-beta.1.tar.gz")
# The regex handles versions with suffix like -beta_1
assert result.get("format") == "tarball"
# May or may not extract version depending on regex match
if "version" in result:
assert result.get("package_name") == "package"
def test_extract_no_version_in_filename(self):
"""Test filename without version returns format-only dict."""
file = io.BytesIO(b"")
result = extract_tarball_metadata(file, "package.tar.gz")
# Should return format but no version
assert result.get("version") is None
assert result.get("format") == "tarball"
class TestJarMetadata:
"""Tests for JAR/Java metadata extraction."""
def _create_jar_with_manifest(self, manifest_content: str) -> io.BytesIO:
"""Helper to create a JAR file with given MANIFEST.MF content."""
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
zf.writestr('META-INF/MANIFEST.MF', manifest_content)
buf.seek(0)
return buf
def test_extract_jar_version_from_manifest(self):
"""Test extracting version from MANIFEST.MF."""
manifest = """Manifest-Version: 1.0
Implementation-Title: my-library
Implementation-Version: 4.5.6
"""
file = self._create_jar_with_manifest(manifest)
result = extract_jar_metadata(file)
assert result.get("version") == "4.5.6"
assert result.get("package_name") == "my-library"
assert result.get("format") == "jar"
def test_extract_jar_bundle_version(self):
"""Test extracting OSGi Bundle-Version."""
manifest = """Manifest-Version: 1.0
Bundle-Version: 2.1.0
Bundle-Name: Test Bundle
"""
file = self._create_jar_with_manifest(manifest)
result = extract_jar_metadata(file)
# Bundle-Version is stored in bundle_version, not version
assert result.get("bundle_version") == "2.1.0"
assert result.get("bundle_name") == "Test Bundle"
assert result.get("format") == "jar"
def test_extract_jar_invalid_zip(self):
"""Test that invalid JAR returns format-only dict."""
file = io.BytesIO(b"not a jar file")
result = extract_jar_metadata(file)
assert result == {"format": "jar"}
class TestExtractMetadataDispatch:
"""Tests for the main extract_metadata dispatcher function."""
def test_dispatch_to_wheel(self):
"""Test that .whl files use wheel extractor."""
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
zf.writestr('pkg-1.0.dist-info/METADATA', 'Version: 1.0.0\nName: pkg')
buf.seek(0)
result = extract_metadata(buf, "package-1.0.0-py3-none-any.whl")
assert result.get("version") == "1.0.0"
assert result.get("package_name") == "pkg"
assert result.get("format") == "wheel"
def test_dispatch_to_tarball(self):
"""Test that .tar.gz files use tarball extractor."""
file = io.BytesIO(b"")
result = extract_metadata(file, "mypackage-2.3.4.tar.gz")
assert result.get("version") == "2.3.4"
assert result.get("package_name") == "mypackage"
assert result.get("format") == "tarball"
def test_dispatch_unknown_extension(self):
"""Test that unknown extensions return empty dict."""
file = io.BytesIO(b"some content")
result = extract_metadata(file, "unknown.xyz")
assert result == {}
def test_file_position_reset_after_extraction(self):
"""Test that file position is reset to start after extraction."""
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
zf.writestr('pkg-1.0.dist-info/METADATA', 'Version: 1.0.0\nName: pkg')
buf.seek(0)
extract_metadata(buf, "package.whl")
# File should be back at position 0
assert buf.tell() == 0

View File

@@ -145,6 +145,54 @@ class TestPackageModel:
assert platform_col.default.arg == "any"
class TestTagModel:
"""Tests for the Tag model."""
@pytest.mark.unit
def test_tag_requires_package_id(self):
"""Test tag requires package_id."""
from app.models import Tag
tag = Tag(
name="v1.0.0",
package_id=uuid.uuid4(),
artifact_id="f" * 64,
created_by="test-user",
)
assert tag.package_id is not None
assert tag.artifact_id == "f" * 64
class TestTagHistoryModel:
"""Tests for the TagHistory model."""
@pytest.mark.unit
def test_tag_history_default_change_type(self):
"""Test tag history change_type column has default value of 'update'."""
from app.models import TagHistory
# Check the column definition has the right default
change_type_col = TagHistory.__table__.columns["change_type"]
assert change_type_col.default is not None
assert change_type_col.default.arg == "update"
@pytest.mark.unit
def test_tag_history_allows_null_old_artifact(self):
"""Test tag history allows null old_artifact_id (for create events)."""
from app.models import TagHistory
history = TagHistory(
tag_id=uuid.uuid4(),
old_artifact_id=None,
new_artifact_id="h" * 64,
change_type="create",
changed_by="test-user",
)
assert history.old_artifact_id is None
class TestUploadModel:
"""Tests for the Upload model."""

View File

@@ -1,85 +0,0 @@
"""Unit tests for PyPI proxy functionality."""
import pytest
from app.pypi_proxy import _parse_requires_dist
class TestParseRequiresDist:
"""Tests for _parse_requires_dist function."""
def test_simple_package(self):
"""Test parsing a simple package name."""
name, version = _parse_requires_dist("numpy")
assert name == "numpy"
assert version is None
def test_package_with_version(self):
"""Test parsing package with version constraint."""
name, version = _parse_requires_dist("numpy>=1.21.0")
assert name == "numpy"
assert version == ">=1.21.0"
def test_package_with_parenthesized_version(self):
"""Test parsing package with parenthesized version."""
name, version = _parse_requires_dist("requests (>=2.25.0)")
assert name == "requests"
assert version == ">=2.25.0"
def test_package_with_python_version_marker(self):
"""Test that python_version markers are preserved but marker stripped."""
name, version = _parse_requires_dist("typing-extensions; python_version < '3.8'")
assert name == "typing-extensions"
assert version is None
def test_filters_extra_dependencies(self):
"""Test that extra dependencies are filtered out."""
# Extra dependencies should return (None, None)
name, version = _parse_requires_dist("pytest; extra == 'test'")
assert name is None
assert version is None
name, version = _parse_requires_dist("sphinx; extra == 'docs'")
assert name is None
assert version is None
def test_filters_platform_specific_darwin(self):
"""Test that macOS-specific dependencies are filtered out."""
name, version = _parse_requires_dist("pyobjc; sys_platform == 'darwin'")
assert name is None
assert version is None
def test_filters_platform_specific_win32(self):
"""Test that Windows-specific dependencies are filtered out."""
name, version = _parse_requires_dist("pywin32; sys_platform == 'win32'")
assert name is None
assert version is None
def test_filters_platform_system_marker(self):
"""Test that platform_system markers are filtered out."""
name, version = _parse_requires_dist("jaraco-windows; platform_system == 'Windows'")
assert name is None
assert version is None
def test_normalizes_package_name(self):
"""Test that package names are normalized (PEP 503)."""
name, version = _parse_requires_dist("Typing_Extensions>=3.7.4")
assert name == "typing-extensions"
assert version == ">=3.7.4"
def test_complex_version_constraint(self):
"""Test parsing complex version constraints."""
name, version = _parse_requires_dist("gast!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1")
assert name == "gast"
assert version == "!=0.5.0,!=0.5.1,!=0.5.2,>=0.2.1"
def test_version_range(self):
"""Test parsing version range constraints."""
name, version = _parse_requires_dist("grpcio<2.0,>=1.24.3")
assert name == "grpcio"
assert version == "<2.0,>=1.24.3"
def test_tilde_version(self):
"""Test parsing tilde version constraints."""
name, version = _parse_requires_dist("tensorboard~=2.20.0")
assert name == "tensorboard"
assert version == "~=2.20.0"

View File

@@ -1,65 +0,0 @@
"""Unit tests for rate limiting configuration."""
import os
import pytest
class TestRateLimitConfiguration:
"""Tests for rate limit configuration."""
def test_default_login_rate_limit(self):
"""Test default login rate limit is 5/minute."""
# Import fresh to get default value
import importlib
import app.rate_limit as rate_limit_module
# Save original env value
original = os.environ.get("ORCHARD_LOGIN_RATE_LIMIT")
try:
# Clear env variable to test default
if "ORCHARD_LOGIN_RATE_LIMIT" in os.environ:
del os.environ["ORCHARD_LOGIN_RATE_LIMIT"]
# Reload module to pick up new env
importlib.reload(rate_limit_module)
assert rate_limit_module.LOGIN_RATE_LIMIT == "5/minute"
finally:
# Restore original env value
if original is not None:
os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = original
importlib.reload(rate_limit_module)
def test_custom_login_rate_limit(self):
"""Test custom login rate limit from environment."""
import importlib
import app.rate_limit as rate_limit_module
# Save original env value
original = os.environ.get("ORCHARD_LOGIN_RATE_LIMIT")
try:
# Set custom rate limit
os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = "10/minute"
# Reload module to pick up new env
importlib.reload(rate_limit_module)
assert rate_limit_module.LOGIN_RATE_LIMIT == "10/minute"
finally:
# Restore original env value
if original is not None:
os.environ["ORCHARD_LOGIN_RATE_LIMIT"] = original
else:
if "ORCHARD_LOGIN_RATE_LIMIT" in os.environ:
del os.environ["ORCHARD_LOGIN_RATE_LIMIT"]
importlib.reload(rate_limit_module)
def test_limiter_exists(self):
"""Test that limiter object is created."""
from app.rate_limit import limiter
assert limiter is not None
# Limiter should have a key_func set
assert limiter._key_func is not None

View File

@@ -1,300 +0,0 @@
"""Unit tests for registry client functionality."""
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
import httpx
from packaging.specifiers import SpecifierSet
from app.registry_client import (
PyPIRegistryClient,
VersionInfo,
FetchResult,
get_registry_client,
)
class TestPyPIRegistryClient:
"""Tests for PyPI registry client."""
@pytest.fixture
def mock_http_client(self):
"""Create a mock async HTTP client."""
return AsyncMock(spec=httpx.AsyncClient)
@pytest.fixture
def client(self, mock_http_client):
"""Create a PyPI registry client with mocked HTTP."""
return PyPIRegistryClient(
http_client=mock_http_client,
upstream_sources=[],
pypi_api_url="https://pypi.org/pypi",
)
def test_source_type(self, client):
"""Test source_type returns 'pypi'."""
assert client.source_type == "pypi"
def test_normalize_package_name(self, client):
"""Test package name normalization per PEP 503."""
assert client._normalize_package_name("My_Package") == "my-package"
assert client._normalize_package_name("my.package") == "my-package"
assert client._normalize_package_name("my-package") == "my-package"
assert client._normalize_package_name("MY-PACKAGE") == "my-package"
assert client._normalize_package_name("my__package") == "my-package"
assert client._normalize_package_name("my..package") == "my-package"
@pytest.mark.asyncio
async def test_get_available_versions_success(self, client, mock_http_client):
"""Test fetching available versions from PyPI."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"releases": {
"1.0.0": [{"packagetype": "bdist_wheel"}],
"1.1.0": [{"packagetype": "bdist_wheel"}],
"2.0.0": [{"packagetype": "bdist_wheel"}],
}
}
mock_http_client.get.return_value = mock_response
versions = await client.get_available_versions("test-package")
assert "1.0.0" in versions
assert "1.1.0" in versions
assert "2.0.0" in versions
mock_http_client.get.assert_called_once()
@pytest.mark.asyncio
async def test_get_available_versions_empty(self, client, mock_http_client):
"""Test handling package with no releases."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {"releases": {}}
mock_http_client.get.return_value = mock_response
versions = await client.get_available_versions("empty-package")
assert versions == []
@pytest.mark.asyncio
async def test_get_available_versions_404(self, client, mock_http_client):
"""Test handling non-existent package."""
mock_response = MagicMock()
mock_response.status_code = 404
mock_http_client.get.return_value = mock_response
versions = await client.get_available_versions("nonexistent")
assert versions == []
@pytest.mark.asyncio
async def test_resolve_constraint_wildcard(self, client, mock_http_client):
"""Test resolving wildcard constraint returns latest."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"info": {"version": "2.0.0"},
"releases": {
"1.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-1.0.0.whl",
"filename": "test-1.0.0.whl",
"digests": {"sha256": "abc123"},
"size": 1000,
}
],
"2.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-2.0.0.whl",
"filename": "test-2.0.0.whl",
"digests": {"sha256": "def456"},
"size": 2000,
}
],
},
}
mock_http_client.get.return_value = mock_response
result = await client.resolve_constraint("test-package", "*")
assert result is not None
assert result.version == "2.0.0"
@pytest.mark.asyncio
async def test_resolve_constraint_specific_version(self, client, mock_http_client):
"""Test resolving specific version constraint."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"releases": {
"1.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-1.0.0.whl",
"filename": "test-1.0.0.whl",
"digests": {"sha256": "abc123"},
"size": 1000,
}
],
"2.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-2.0.0.whl",
"filename": "test-2.0.0.whl",
}
],
},
}
mock_http_client.get.return_value = mock_response
result = await client.resolve_constraint("test-package", ">=1.0.0,<2.0.0")
assert result is not None
assert result.version == "1.0.0"
@pytest.mark.asyncio
async def test_resolve_constraint_no_match(self, client, mock_http_client):
"""Test resolving constraint with no matching version."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"releases": {
"1.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-1.0.0.whl",
"filename": "test-1.0.0.whl",
}
],
},
}
mock_http_client.get.return_value = mock_response
result = await client.resolve_constraint("test-package", ">=5.0.0")
assert result is None
@pytest.mark.asyncio
async def test_resolve_constraint_bare_version(self, client, mock_http_client):
"""Test resolving bare version string as exact match."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"info": {"version": "2.0.0"},
"releases": {
"1.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-1.0.0.whl",
"filename": "test-1.0.0.whl",
"digests": {"sha256": "abc123"},
"size": 1000,
}
],
"2.0.0": [
{
"packagetype": "bdist_wheel",
"url": "https://files.pythonhosted.org/test-2.0.0.whl",
"filename": "test-2.0.0.whl",
"digests": {"sha256": "def456"},
"size": 2000,
}
],
},
}
mock_http_client.get.return_value = mock_response
# Bare version "1.0.0" should resolve to exactly 1.0.0, not latest
result = await client.resolve_constraint("test-package", "1.0.0")
assert result is not None
assert result.version == "1.0.0"
class TestVersionInfo:
"""Tests for VersionInfo dataclass."""
def test_create_version_info(self):
"""Test creating VersionInfo with all fields."""
info = VersionInfo(
version="1.0.0",
download_url="https://example.com/pkg-1.0.0.whl",
filename="pkg-1.0.0.whl",
sha256="abc123",
size=5000,
content_type="application/zip",
)
assert info.version == "1.0.0"
assert info.download_url == "https://example.com/pkg-1.0.0.whl"
assert info.filename == "pkg-1.0.0.whl"
assert info.sha256 == "abc123"
assert info.size == 5000
def test_create_version_info_minimal(self):
"""Test creating VersionInfo with only required fields."""
info = VersionInfo(
version="1.0.0",
download_url="https://example.com/pkg.whl",
filename="pkg.whl",
)
assert info.sha256 is None
assert info.size is None
class TestFetchResult:
"""Tests for FetchResult dataclass."""
def test_create_fetch_result(self):
"""Test creating FetchResult."""
result = FetchResult(
artifact_id="abc123def456",
size=10000,
version="2.0.0",
filename="pkg-2.0.0.whl",
already_cached=True,
)
assert result.artifact_id == "abc123def456"
assert result.size == 10000
assert result.version == "2.0.0"
assert result.already_cached is True
def test_fetch_result_default_not_cached(self):
"""Test FetchResult defaults to not cached."""
result = FetchResult(
artifact_id="xyz",
size=100,
version="1.0.0",
filename="pkg.whl",
)
assert result.already_cached is False
class TestGetRegistryClient:
"""Tests for registry client factory function."""
def test_get_pypi_client(self):
"""Test getting PyPI client."""
mock_client = MagicMock()
mock_sources = []
client = get_registry_client("pypi", mock_client, mock_sources)
assert isinstance(client, PyPIRegistryClient)
def test_get_unsupported_client(self):
"""Test getting unsupported registry type returns None."""
mock_client = MagicMock()
client = get_registry_client("npm", mock_client, [])
assert client is None
def test_get_unknown_client(self):
"""Test getting unknown registry type returns None."""
mock_client = MagicMock()
client = get_registry_client("unknown", mock_client, [])
assert client is None

View File

@@ -1,228 +0,0 @@
# PyPI Proxy Performance & Multi-Protocol Architecture Design
**Date:** 2026-02-04
**Status:** Approved
**Branch:** fix/pypi-proxy-timeout
## Overview
Comprehensive infrastructure overhaul to address latency, throughput, and resource consumption issues in the PyPI proxy, while establishing a foundation for npm, Maven, and other package protocols.
## Goals
1. **Reduce latency** - Eliminate per-request connection overhead, cache aggressively
2. **Increase throughput** - Handle hundreds of concurrent requests without degradation
3. **Lower resource usage** - Connection pooling, efficient DB queries, proper async I/O
4. **Enable multi-protocol** - Abstract base class ready for npm/Maven/etc.
5. **Maintain hermetic builds** - Immutable artifact content and metadata, mutable discovery data
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ FastAPI Application │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ PyPI Proxy │ │ npm Proxy │ │ Maven Proxy │ │ (future) │ │
│ │ Router │ │ Router │ │ Router │ │ │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └─────────────┘ │
│ │ │ │ │
│ └────────────────┼────────────────┘ │
│ ▼ │
│ ┌───────────────────────┐ │
│ │ PackageProxyBase │ ← Abstract base class │
│ │ - check_cache() │ │
│ │ - fetch_upstream() │ │
│ │ - store_artifact() │ │
│ │ - serve_artifact() │ │
│ └───────────┬───────────┘ │
│ │ │
│ ┌────────────────┼────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ HttpClient │ │ CacheService│ │ ThreadPool │ │
│ │ Manager │ │ (Redis) │ │ Executor │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
└─────────┼────────────────┼────────────────┼──────────────────────────┘
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────────┐
│ Upstream │ │ Redis │ │ S3/MinIO │
│ Sources │ │ │ │ │
└──────────┘ └──────────┘ └──────────────┘
```
## Components
### 1. HttpClientManager
Manages httpx.AsyncClient pools with FastAPI lifespan integration.
**Features:**
- Default pool for general requests
- Per-upstream pools for sources needing specific config/auth
- Graceful shutdown drains in-flight requests
- Dedicated thread pool for blocking operations
**Configuration:**
```bash
ORCHARD_HTTP_MAX_CONNECTIONS=100 # Default pool size
ORCHARD_HTTP_KEEPALIVE_CONNECTIONS=20 # Keep-alive connections
ORCHARD_HTTP_CONNECT_TIMEOUT=30 # Connection timeout (seconds)
ORCHARD_HTTP_READ_TIMEOUT=60 # Read timeout (seconds)
ORCHARD_HTTP_WORKER_THREADS=32 # Thread pool size
```
**File:** `backend/app/http_client.py`
### 2. CacheService (Redis Layer)
Redis-backed caching with category-aware TTL and invalidation.
**Cache Categories:**
| Category | TTL | Invalidation | Purpose |
|----------|-----|--------------|---------|
| ARTIFACT_METADATA | Forever | Never (immutable) | Artifact info by SHA256 |
| ARTIFACT_DEPENDENCIES | Forever | Never (immutable) | Extracted deps by SHA256 |
| DEPENDENCY_RESOLUTION | Forever | Manual/refresh param | Resolution results |
| UPSTREAM_SOURCES | 1 hour | On DB change | Upstream config |
| PACKAGE_INDEX | 5 min | TTL only | PyPI/npm index pages |
| PACKAGE_VERSIONS | 5 min | TTL only | Version listings |
**Key format:** `orchard:{category}:{protocol}:{identifier}`
**Configuration:**
```bash
ORCHARD_REDIS_HOST=redis
ORCHARD_REDIS_PORT=6379
ORCHARD_REDIS_DB=0
ORCHARD_CACHE_TTL_INDEX=300 # Package index: 5 minutes
ORCHARD_CACHE_TTL_VERSIONS=300 # Version listings: 5 minutes
ORCHARD_CACHE_TTL_UPSTREAM=3600 # Upstream config: 1 hour
```
**File:** `backend/app/cache_service.py`
### 3. PackageProxyBase
Abstract base class defining the cache→fetch→store→serve flow.
**Abstract methods (protocol-specific):**
- `get_protocol_name()` - Return 'pypi', 'npm', 'maven'
- `get_system_project_name()` - Return '_pypi', '_npm'
- `rewrite_index_html()` - Rewrite upstream index to Orchard URLs
- `extract_metadata()` - Extract deps from package file
- `parse_package_url()` - Parse URL into package/version/filename
**Concrete methods (shared):**
- `serve_index()` - Serve package index with caching
- `serve_artifact()` - Full cache→fetch→store→serve flow
**File:** `backend/app/proxy_base.py`
### 4. ArtifactRepository (DB Optimization)
Optimized database operations eliminating N+1 queries.
**Key methods:**
- `get_or_create_artifact()` - Atomic upsert via ON CONFLICT
- `batch_upsert_dependencies()` - Single INSERT for all deps
- `get_cached_url_with_artifact()` - Joined query for cache lookup
**Query reduction:**
| Operation | Before | After |
|-----------|--------|-------|
| Cache hit check | 2 queries | 1 query (joined) |
| Store artifact | 3-4 queries | 1 query (upsert) |
| Store 50 deps | 50+ queries | 1 query (batch) |
**Configuration:**
```bash
ORCHARD_DATABASE_POOL_SIZE=20 # Base connections (up from 5)
ORCHARD_DATABASE_MAX_OVERFLOW=30 # Burst capacity (up from 10)
ORCHARD_DATABASE_POOL_TIMEOUT=30 # Wait timeout
ORCHARD_DATABASE_POOL_PRE_PING=false # Disable in prod for performance
```
**File:** `backend/app/db_utils.py`
### 5. Dependency Resolution Caching
Cache resolution results for ensure files and API queries.
**Cache key:** Hash of (artifact_id, max_depth, include_optional)
**Invalidation:** Manual only (immutable artifact deps mean cached resolutions stay valid)
**Refresh:** `?refresh=true` parameter forces fresh resolution
**File:** Updates to `backend/app/dependencies.py`
### 6. FastAPI Integration
Lifespan-managed infrastructure with dependency injection.
**Startup:**
1. Initialize HttpClientManager (connection pools)
2. Initialize CacheService (Redis connection)
3. Load upstream source configs
**Shutdown:**
1. Drain in-flight HTTP requests
2. Close Redis connections
3. Shutdown thread pool
**Health endpoint additions:**
- Database connection status
- Redis ping
- HTTP pool active/max connections
- Thread pool active/max workers
**File:** Updates to `backend/app/main.py`
## Files Summary
**New files:**
- `backend/app/http_client.py` - HttpClientManager
- `backend/app/cache_service.py` - CacheService
- `backend/app/proxy_base.py` - PackageProxyBase
- `backend/app/db_utils.py` - ArtifactRepository
**Modified files:**
- `backend/app/config.py` - New settings
- `backend/app/main.py` - Lifespan integration
- `backend/app/pypi_proxy.py` - Refactor to use base class
- `backend/app/dependencies.py` - Resolution caching
- `backend/app/routes.py` - Health endpoint, DI
## Hermetic Build Guarantees
**Immutable (cached forever):**
- Artifact content (by SHA256)
- Extracted dependencies for a specific artifact
- Dependency resolution results
**Mutable (TTL + event invalidation):**
- Package index listings
- Version discovery
- Upstream source configuration
Once an artifact is cached with SHA256 `abc123` and dependencies extracted, that data never changes.
## Performance Expectations
| Metric | Before | After |
|--------|--------|-------|
| HTTP connection setup | Per request (~100-500ms) | Pooled (~5ms) |
| Cache hit (index page) | N/A | ~5ms (Redis) |
| Store 50 dependencies | ~500ms (50 queries) | ~10ms (1 query) |
| Dependency resolution (cached) | N/A | ~5ms |
| Concurrent request capacity | ~15 (DB pool) | ~50 (configurable) |
## Testing Requirements
- Unit tests for each new component
- Integration tests for full proxy flow
- Load tests to verify pool sizing
- Cache hit/miss verification tests

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -37,15 +37,6 @@
"ufo": "1.5.4",
"rollup": "4.52.4",
"caniuse-lite": "1.0.30001692",
"baseline-browser-mapping": "2.9.5",
"lodash": "4.17.21",
"electron-to-chromium": "1.5.72",
"@babel/core": "7.26.0",
"@babel/traverse": "7.26.4",
"@babel/types": "7.26.3",
"@babel/compat-data": "7.26.3",
"@babel/parser": "7.26.3",
"@babel/generator": "7.26.3",
"@babel/code-frame": "7.26.2"
"baseline-browser-mapping": "2.9.5"
}
}

View File

@@ -1,11 +1,14 @@
import {
Project,
Package,
Tag,
TagDetail,
Artifact,
ArtifactDetail,
PackageArtifact,
UploadResponse,
PaginatedResponse,
ListParams,
TagListParams,
PackageListParams,
ArtifactListParams,
ProjectListParams,
@@ -237,6 +240,32 @@ export async function createPackage(projectName: string, data: { name: string; d
return handleResponse<Package>(response);
}
// Tag API
export async function listTags(projectName: string, packageName: string, params: TagListParams = {}): Promise<PaginatedResponse<TagDetail>> {
const query = buildQueryString(params as Record<string, unknown>);
const response = await fetch(`${API_BASE}/project/${projectName}/${packageName}/tags${query}`);
return handleResponse<PaginatedResponse<TagDetail>>(response);
}
export async function listTagsSimple(projectName: string, packageName: string, params: TagListParams = {}): Promise<TagDetail[]> {
const data = await listTags(projectName, packageName, params);
return data.items;
}
export async function getTag(projectName: string, packageName: string, tagName: string): Promise<TagDetail> {
const response = await fetch(`${API_BASE}/project/${projectName}/${packageName}/tags/${tagName}`);
return handleResponse<TagDetail>(response);
}
export async function createTag(projectName: string, packageName: string, data: { name: string; artifact_id: string }): Promise<Tag> {
const response = await fetch(`${API_BASE}/project/${projectName}/${packageName}/tags`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(data),
});
return handleResponse<Tag>(response);
}
// Artifact API
export async function getArtifact(artifactId: string): Promise<ArtifactDetail> {
const response = await fetch(`${API_BASE}/artifact/${artifactId}`);
@@ -247,10 +276,10 @@ export async function listPackageArtifacts(
projectName: string,
packageName: string,
params: ArtifactListParams = {}
): Promise<PaginatedResponse<PackageArtifact>> {
): Promise<PaginatedResponse<Artifact & { tags: string[] }>> {
const query = buildQueryString(params as Record<string, unknown>);
const response = await fetch(`${API_BASE}/project/${projectName}/${packageName}/artifacts${query}`);
return handleResponse<PaginatedResponse<PackageArtifact>>(response);
return handleResponse<PaginatedResponse<Artifact & { tags: string[] }>>(response);
}
// Upload
@@ -258,10 +287,14 @@ export async function uploadArtifact(
projectName: string,
packageName: string,
file: File,
tag?: string,
version?: string
): Promise<UploadResponse> {
const formData = new FormData();
formData.append('file', file);
if (tag) {
formData.append('tag', tag);
}
if (version) {
formData.append('version', version);
}

View File

@@ -227,21 +227,6 @@
line-height: 1.5;
}
.graph-warning {
display: flex;
align-items: center;
gap: 8px;
padding: 8px 16px;
background: rgba(245, 158, 11, 0.1);
border-top: 1px solid rgba(245, 158, 11, 0.3);
color: var(--warning-color, #f59e0b);
font-size: 0.875rem;
}
.graph-warning svg {
flex-shrink: 0;
}
/* Missing Dependencies */
.missing-dependencies {
border-top: 1px solid var(--border-primary);

View File

@@ -106,7 +106,6 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const [warning, setWarning] = useState<string | null>(null);
const [resolution, setResolution] = useState<DependencyResolutionResponse | null>(null);
const [nodes, setNodes, onNodesChange] = useNodesState<NodeData>([]);
const [edges, setEdges, onEdgesChange] = useEdgesState([]);
@@ -128,24 +127,16 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
// Fetch dependencies for each artifact
const depsMap = new Map<string, Dependency[]>();
const failedFetches: string[] = [];
for (const artifact of resolutionData.resolved) {
try {
const deps = await getArtifactDependencies(artifact.artifact_id);
depsMap.set(artifact.artifact_id, deps.dependencies);
} catch (err) {
console.warn(`Failed to fetch dependencies for ${artifact.package}:`, err);
failedFetches.push(artifact.package);
} catch {
depsMap.set(artifact.artifact_id, []);
}
}
// Report warning if some fetches failed
if (failedFetches.length > 0) {
setWarning(`Could not load dependency details for: ${failedFetches.slice(0, 3).join(', ')}${failedFetches.length > 3 ? ` and ${failedFetches.length - 3} more` : ''}`);
}
// Find the root artifact
const rootArtifact = resolutionData.resolved.find(
a => a.project === resolutionData.requested.project &&
@@ -179,7 +170,7 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
label: `${artifact.project}/${artifact.package}`,
project: artifact.project,
package: artifact.package,
version: artifact.version,
version: artifact.version || artifact.tag,
size: artifact.size,
isRoot,
onNavigate,
@@ -333,17 +324,6 @@ function DependencyGraph({ projectName, packageName, tagName, onClose }: Depende
)}
</div>
{warning && (
<div className="graph-warning">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"></path>
<line x1="12" y1="9" x2="12" y2="13"></line>
<line x1="12" y1="17" x2="12.01" y2="17"></line>
</svg>
<span>{warning}</span>
</div>
)}
{resolution && resolution.missing && resolution.missing.length > 0 && (
<div className="missing-dependencies">
<h3>Not Cached ({resolution.missing.length})</h3>

View File

@@ -290,25 +290,20 @@
color: var(--error-color, #dc3545);
}
/* Progress Bar - scoped to upload component */
.drag-drop-upload .progress-bar,
.upload-queue .progress-bar {
/* Progress Bar */
.progress-bar {
height: 8px;
background: var(--border-color, #ddd);
border-radius: 4px;
overflow: hidden;
width: 100%;
max-width: 100%;
}
.drag-drop-upload .progress-bar--small,
.upload-queue .progress-bar--small {
.progress-bar--small {
height: 4px;
margin-top: 0.25rem;
}
.drag-drop-upload .progress-bar__fill,
.upload-queue .progress-bar__fill {
.progress-bar__fill {
height: 100%;
background: var(--accent-color, #007bff);
border-radius: 4px;

View File

@@ -504,4 +504,42 @@ describe('DragDropUpload', () => {
});
});
});
describe('Tag Support', () => {
it('includes tag in upload request', async () => {
let capturedFormData: FormData | null = null;
class MockXHR {
status = 200;
responseText = JSON.stringify({ artifact_id: 'abc123', size: 100 });
timeout = 0;
upload = { addEventListener: vi.fn() };
addEventListener = vi.fn((event: string, handler: () => void) => {
if (event === 'load') setTimeout(handler, 10);
});
open = vi.fn();
send = vi.fn((data: FormData) => {
capturedFormData = data;
});
}
vi.stubGlobal('XMLHttpRequest', MockXHR);
render(<DragDropUpload {...defaultProps} tag="v1.0.0" />);
const input = document.querySelector('input[type="file"]') as HTMLInputElement;
const file = createMockFile('test.txt', 100, 'text/plain');
Object.defineProperty(input, 'files', {
value: Object.assign([file], { item: (i: number) => [file][i] }),
});
fireEvent.change(input);
await vi.advanceTimersByTimeAsync(100);
await waitFor(() => {
expect(capturedFormData?.get('tag')).toBe('v1.0.0');
});
});
});
});

View File

@@ -13,6 +13,7 @@ interface StoredUploadState {
completedParts: number[];
project: string;
package: string;
tag?: string;
createdAt: number;
}
@@ -86,6 +87,7 @@ export interface DragDropUploadProps {
maxFileSize?: number; // in bytes
maxConcurrentUploads?: number;
maxRetries?: number;
tag?: string;
className?: string;
disabled?: boolean;
disabledReason?: string;
@@ -228,6 +230,7 @@ export function DragDropUpload({
maxFileSize,
maxConcurrentUploads = 3,
maxRetries = 3,
tag,
className = '',
disabled = false,
disabledReason,
@@ -365,6 +368,7 @@ export function DragDropUpload({
expected_hash: fileHash,
filename: item.file.name,
size: item.file.size,
tag: tag || undefined,
}),
}
);
@@ -388,6 +392,7 @@ export function DragDropUpload({
completedParts: [],
project: projectName,
package: packageName,
tag: tag || undefined,
createdAt: Date.now(),
});
@@ -433,6 +438,7 @@ export function DragDropUpload({
completedParts,
project: projectName,
package: packageName,
tag: tag || undefined,
createdAt: Date.now(),
});
@@ -453,7 +459,7 @@ export function DragDropUpload({
{
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({}),
body: JSON.stringify({ tag: tag || undefined }),
}
);
@@ -469,7 +475,7 @@ export function DragDropUpload({
size: completeData.size,
deduplicated: false,
};
}, [projectName, packageName, isOnline]);
}, [projectName, packageName, tag, isOnline]);
const uploadFileSimple = useCallback((item: UploadItem): Promise<UploadResult> => {
return new Promise((resolve, reject) => {
@@ -478,6 +484,9 @@ export function DragDropUpload({
const formData = new FormData();
formData.append('file', item.file);
if (tag) {
formData.append('tag', tag);
}
let lastLoaded = 0;
let lastTime = Date.now();
@@ -546,7 +555,7 @@ export function DragDropUpload({
: u
));
});
}, [projectName, packageName]);
}, [projectName, packageName, tag]);
const uploadFile = useCallback((item: UploadItem): Promise<UploadResult> => {
if (item.file.size >= CHUNKED_UPLOAD_THRESHOLD) {

View File

@@ -233,7 +233,7 @@ export function GlobalSearch() {
const flatIndex = results.projects.length + results.packages.length + index;
return (
<button
key={artifact.artifact_id}
key={artifact.tag_id}
className={`global-search__result ${selectedIndex === flatIndex ? 'selected' : ''}`}
onClick={() => navigateToResult({ type: 'artifact', item: artifact })}
onMouseEnter={() => setSelectedIndex(flatIndex)}
@@ -243,7 +243,7 @@ export function GlobalSearch() {
<line x1="7" y1="7" x2="7.01" y2="7" />
</svg>
<div className="global-search__result-content">
<span className="global-search__result-name">{artifact.version}</span>
<span className="global-search__result-name">{artifact.tag_name}</span>
<span className="global-search__result-path">
{artifact.project_name} / {artifact.package_name}
</span>

View File

@@ -185,6 +185,56 @@ h2 {
color: var(--warning-color, #f59e0b);
}
/* Usage Section */
.usage-section {
margin-top: 32px;
background: var(--bg-secondary);
}
.usage-section h3 {
margin-bottom: 12px;
color: var(--text-primary);
font-size: 1rem;
font-weight: 600;
}
.usage-section p {
color: var(--text-secondary);
margin-bottom: 12px;
font-size: 0.875rem;
}
.usage-section pre {
background: #0d0d0f;
border: 1px solid var(--border-primary);
padding: 16px 20px;
border-radius: var(--radius-md);
overflow-x: auto;
margin-bottom: 16px;
}
.usage-section code {
font-family: 'JetBrains Mono', 'Fira Code', 'Consolas', monospace;
font-size: 0.8125rem;
color: #e2e8f0;
}
/* Syntax highlighting for code blocks */
.usage-section pre {
position: relative;
}
.usage-section pre::before {
content: 'bash';
position: absolute;
top: 8px;
right: 12px;
font-size: 0.6875rem;
color: var(--text-muted);
text-transform: uppercase;
letter-spacing: 0.05em;
}
/* Copy button for code blocks (optional enhancement) */
.code-block {
position: relative;

View File

@@ -1,7 +1,7 @@
import { useState, useEffect, useCallback } from 'react';
import { useParams, useSearchParams, useNavigate, useLocation, Link } from 'react-router-dom';
import { PackageArtifact, Package, PaginatedResponse, AccessLevel, Dependency, DependentInfo } from '../types';
import { listPackageArtifacts, getDownloadUrl, getPackage, getMyProjectAccess, getArtifactDependencies, getReverseDependencies, getEnsureFile, UnauthorizedError, ForbiddenError } from '../api';
import { TagDetail, Package, PaginatedResponse, AccessLevel, Dependency, DependentInfo } from '../types';
import { listTags, getDownloadUrl, getPackage, getMyProjectAccess, createTag, getArtifactDependencies, getReverseDependencies, getEnsureFile, UnauthorizedError, ForbiddenError } from '../api';
import { Breadcrumb } from '../components/Breadcrumb';
import { Badge } from '../components/Badge';
import { SearchInput } from '../components/SearchInput';
@@ -57,20 +57,25 @@ function PackagePage() {
const { user } = useAuth();
const [pkg, setPkg] = useState<Package | null>(null);
const [artifactsData, setArtifactsData] = useState<PaginatedResponse<PackageArtifact> | null>(null);
const [tagsData, setTagsData] = useState<PaginatedResponse<TagDetail> | null>(null);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
const [accessDenied, setAccessDenied] = useState(false);
const [uploadTag, setUploadTag] = useState('');
const [uploadSuccess, setUploadSuccess] = useState<string | null>(null);
const [accessLevel, setAccessLevel] = useState<AccessLevel | null>(null);
const [createTagName, setCreateTagName] = useState('');
const [createTagArtifactId, setCreateTagArtifactId] = useState('');
const [createTagLoading, setCreateTagLoading] = useState(false);
// UI state
const [showUploadModal, setShowUploadModal] = useState(false);
const [showCreateTagModal, setShowCreateTagModal] = useState(false);
const [openMenuId, setOpenMenuId] = useState<string | null>(null);
const [menuPosition, setMenuPosition] = useState<{ top: number; left: number } | null>(null);
// Dependencies state
const [selectedArtifact, setSelectedArtifact] = useState<PackageArtifact | null>(null);
const [selectedTag, setSelectedTag] = useState<TagDetail | null>(null);
const [dependencies, setDependencies] = useState<Dependency[]>([]);
const [depsLoading, setDepsLoading] = useState(false);
const [depsError, setDepsError] = useState<string | null>(null);
@@ -78,7 +83,7 @@ function PackagePage() {
// Reverse dependencies state
const [reverseDeps, setReverseDeps] = useState<DependentInfo[]>([]);
const [reverseDepsLoading, setReverseDepsLoading] = useState(false);
const [reverseDepsError, setReverseDepsError] = useState<string | null>(null);
const [_reverseDepsError, setReverseDepsError] = useState<string | null>(null);
const [reverseDepsPage, setReverseDepsPage] = useState(1);
const [reverseDepsTotal, setReverseDepsTotal] = useState(0);
const [reverseDepsHasMore, setReverseDepsHasMore] = useState(false);
@@ -107,11 +112,10 @@ function PackagePage() {
const isSystemProject = projectName?.startsWith('_') ?? false;
// Get params from URL
// Valid sort fields for artifacts: created_at, size, original_name
const page = parseInt(searchParams.get('page') || '1', 10);
const search = searchParams.get('search') || '';
const sort = searchParams.get('sort') || 'created_at';
const order = (searchParams.get('order') || 'desc') as 'asc' | 'desc';
const sort = searchParams.get('sort') || 'name';
const order = (searchParams.get('order') || 'asc') as 'asc' | 'desc';
const updateParams = useCallback(
(updates: Record<string, string | undefined>) => {
@@ -134,13 +138,13 @@ function PackagePage() {
try {
setLoading(true);
setAccessDenied(false);
const [pkgData, artifactsResult, accessResult] = await Promise.all([
const [pkgData, tagsResult, accessResult] = await Promise.all([
getPackage(projectName, packageName),
listPackageArtifacts(projectName, packageName, { page, search, sort, order }),
listTags(projectName, packageName, { page, search, sort, order }),
getMyProjectAccess(projectName),
]);
setPkg(pkgData);
setArtifactsData(artifactsResult);
setTagsData(tagsResult);
setAccessLevel(accessResult.access_level);
setError(null);
} catch (err) {
@@ -164,15 +168,25 @@ function PackagePage() {
loadData();
}, [loadData]);
// Auto-select artifact when artifacts are loaded (prefer first artifact)
// Re-run when package changes to pick up new artifacts
// Auto-select tag when tags are loaded (prefer version from URL, then first tag)
// Re-run when package changes to pick up new tags
useEffect(() => {
if (artifactsData?.items && artifactsData.items.length > 0) {
// Fall back to first artifact
setSelectedArtifact(artifactsData.items[0]);
if (tagsData?.items && tagsData.items.length > 0) {
const versionParam = searchParams.get('version');
if (versionParam) {
// Find tag matching the version parameter
const matchingTag = tagsData.items.find(t => t.version === versionParam);
if (matchingTag) {
setSelectedTag(matchingTag);
setDependencies([]);
return;
}
}
// Fall back to first tag
setSelectedTag(tagsData.items[0]);
setDependencies([]);
}
}, [artifactsData, projectName, packageName]);
}, [tagsData, searchParams, projectName, packageName]);
// Fetch dependencies when selected tag changes
const fetchDependencies = useCallback(async (artifactId: string) => {
@@ -190,10 +204,10 @@ function PackagePage() {
}, []);
useEffect(() => {
if (selectedArtifact) {
fetchDependencies(selectedArtifact.id);
if (selectedTag) {
fetchDependencies(selectedTag.artifact_id);
}
}, [selectedArtifact, fetchDependencies]);
}, [selectedTag, fetchDependencies]);
// Fetch reverse dependencies
const fetchReverseDeps = useCallback(async (pageNum: number = 1) => {
@@ -221,15 +235,15 @@ function PackagePage() {
}
}, [projectName, packageName, loading, fetchReverseDeps]);
// Fetch ensure file for a specific version or artifact
const fetchEnsureFileForRef = useCallback(async (ref: string) => {
// Fetch ensure file for a specific tag
const fetchEnsureFileForTag = useCallback(async (tagName: string) => {
if (!projectName || !packageName) return;
setEnsureFileTagName(ref);
setEnsureFileTagName(tagName);
setEnsureFileLoading(true);
setEnsureFileError(null);
try {
const content = await getEnsureFile(projectName, packageName, ref);
const content = await getEnsureFile(projectName, packageName, tagName);
setEnsureFileContent(content);
setShowEnsureFile(true);
} catch (err) {
@@ -240,13 +254,11 @@ function PackagePage() {
}
}, [projectName, packageName]);
// Fetch ensure file for selected artifact
// Fetch ensure file for selected tag
const fetchEnsureFile = useCallback(async () => {
if (!selectedArtifact) return;
const version = getArtifactVersion(selectedArtifact);
const ref = version || `artifact:${selectedArtifact.id}`;
fetchEnsureFileForRef(ref);
}, [selectedArtifact, fetchEnsureFileForRef]);
if (!selectedTag) return;
fetchEnsureFileForTag(selectedTag.name);
}, [selectedTag, fetchEnsureFileForTag]);
// Keyboard navigation - go back with backspace
useEffect(() => {
@@ -266,6 +278,7 @@ function PackagePage() {
? `Uploaded successfully! Artifact ID: ${results[0].artifact_id}`
: `${count} files uploaded successfully!`;
setUploadSuccess(message);
setUploadTag('');
loadData();
// Auto-dismiss success message after 5 seconds
@@ -276,6 +289,30 @@ function PackagePage() {
setError(errorMsg);
}, []);
const handleCreateTag = async (e: React.FormEvent) => {
e.preventDefault();
if (!createTagName.trim() || createTagArtifactId.length !== 64) return;
setCreateTagLoading(true);
setError(null);
try {
await createTag(projectName!, packageName!, {
name: createTagName.trim(),
artifact_id: createTagArtifactId,
});
setUploadSuccess(`Tag "${createTagName}" created successfully!`);
setCreateTagName('');
setCreateTagArtifactId('');
loadData();
setTimeout(() => setUploadSuccess(null), 5000);
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to create tag');
} finally {
setCreateTagLoading(false);
}
};
const handleSearchChange = (value: string) => {
updateParams({ search: value, page: '1' });
};
@@ -294,36 +331,25 @@ function PackagePage() {
};
const hasActiveFilters = search !== '';
const artifacts = artifactsData?.items || [];
const pagination = artifactsData?.pagination;
const tags = tagsData?.items || [];
const pagination = tagsData?.pagination;
const handleArtifactSelect = (artifact: PackageArtifact) => {
setSelectedArtifact(artifact);
const handleTagSelect = (tag: TagDetail) => {
setSelectedTag(tag);
};
const handleMenuOpen = (e: React.MouseEvent, artifactId: string) => {
const handleMenuOpen = (e: React.MouseEvent, tagId: string) => {
e.stopPropagation();
if (openMenuId === artifactId) {
if (openMenuId === tagId) {
setOpenMenuId(null);
setMenuPosition(null);
} else {
const rect = e.currentTarget.getBoundingClientRect();
setMenuPosition({ top: rect.bottom + 4, left: rect.right - 180 });
setOpenMenuId(artifactId);
setOpenMenuId(tagId);
}
};
// Helper to get version from artifact - prefer direct version field, fallback to metadata
const getArtifactVersion = (a: PackageArtifact): string | null => {
return a.version || (a.format_metadata?.version as string) || null;
};
// Helper to get download ref - prefer version, fallback to artifact ID
const getDownloadRef = (a: PackageArtifact): string => {
const version = getArtifactVersion(a);
return version || `artifact:${a.id}`;
};
// System projects show Version first, regular projects show Tag first
const columns = isSystemProject
? [
@@ -331,47 +357,45 @@ function PackagePage() {
{
key: 'version',
header: 'Version',
// version is from format_metadata, not a sortable DB field
render: (a: PackageArtifact) => (
sortable: true,
render: (t: TagDetail) => (
<strong
className={`tag-name-link ${selectedArtifact?.id === a.id ? 'selected' : ''}`}
onClick={() => handleArtifactSelect(a)}
className={`tag-name-link ${selectedTag?.id === t.id ? 'selected' : ''}`}
onClick={() => handleTagSelect(t)}
style={{ cursor: 'pointer' }}
>
<span className="version-badge">{getArtifactVersion(a) || a.id.slice(0, 12)}</span>
<span className="version-badge">{t.version || t.name}</span>
</strong>
),
},
{
key: 'original_name',
key: 'artifact_original_name',
header: 'Filename',
sortable: true,
className: 'cell-truncate',
render: (a: PackageArtifact) => (
<span title={a.original_name || a.id}>{a.original_name || a.id.slice(0, 12)}</span>
render: (t: TagDetail) => (
<span title={t.artifact_original_name || t.name}>{t.artifact_original_name || t.name}</span>
),
},
{
key: 'size',
key: 'artifact_size',
header: 'Size',
sortable: true,
render: (a: PackageArtifact) => <span>{formatBytes(a.size)}</span>,
render: (t: TagDetail) => <span>{formatBytes(t.artifact_size)}</span>,
},
{
key: 'created_at',
header: 'Cached',
sortable: true,
render: (a: PackageArtifact) => (
<span>{new Date(a.created_at).toLocaleDateString()}</span>
render: (t: TagDetail) => (
<span>{new Date(t.created_at).toLocaleDateString()}</span>
),
},
{
key: 'actions',
header: '',
render: (a: PackageArtifact) => (
render: (t: TagDetail) => (
<div className="action-buttons">
<a
href={getDownloadUrl(projectName!, packageName!, getDownloadRef(a))}
href={getDownloadUrl(projectName!, packageName!, t.name)}
className="btn btn-icon"
download
title="Download"
@@ -384,7 +408,7 @@ function PackagePage() {
</a>
<button
className="btn btn-icon"
onClick={(e) => handleMenuOpen(e, a.id)}
onClick={(e) => handleMenuOpen(e, t.id)}
title="More actions"
>
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
@@ -398,52 +422,56 @@ function PackagePage() {
},
]
: [
// Regular project columns: Version, Filename, Size, Created
// Valid sort fields: created_at, size, original_name
// Regular project columns: Tag, Version, Filename
{
key: 'version',
header: 'Version',
// version is from format_metadata, not a sortable DB field
render: (a: PackageArtifact) => (
key: 'name',
header: 'Tag',
sortable: true,
render: (t: TagDetail) => (
<strong
className={`tag-name-link ${selectedArtifact?.id === a.id ? 'selected' : ''}`}
onClick={() => handleArtifactSelect(a)}
className={`tag-name-link ${selectedTag?.id === t.id ? 'selected' : ''}`}
onClick={() => handleTagSelect(t)}
style={{ cursor: 'pointer' }}
>
<span className="version-badge">{getArtifactVersion(a) || a.id.slice(0, 12)}</span>
{t.name}
</strong>
),
},
{
key: 'original_name',
header: 'Filename',
sortable: true,
className: 'cell-truncate',
render: (a: PackageArtifact) => (
<span title={a.original_name || undefined}>{a.original_name || '—'}</span>
key: 'version',
header: 'Version',
render: (t: TagDetail) => (
<span className="version-badge">{t.version || '—'}</span>
),
},
{
key: 'size',
key: 'artifact_original_name',
header: 'Filename',
className: 'cell-truncate',
render: (t: TagDetail) => (
<span title={t.artifact_original_name || undefined}>{t.artifact_original_name || '—'}</span>
),
},
{
key: 'artifact_size',
header: 'Size',
sortable: true,
render: (a: PackageArtifact) => <span>{formatBytes(a.size)}</span>,
render: (t: TagDetail) => <span>{formatBytes(t.artifact_size)}</span>,
},
{
key: 'created_at',
header: 'Created',
sortable: true,
render: (a: PackageArtifact) => (
<span title={`by ${a.created_by}`}>{new Date(a.created_at).toLocaleDateString()}</span>
render: (t: TagDetail) => (
<span title={`by ${t.created_by}`}>{new Date(t.created_at).toLocaleDateString()}</span>
),
},
{
key: 'actions',
header: '',
render: (a: PackageArtifact) => (
render: (t: TagDetail) => (
<div className="action-buttons">
<a
href={getDownloadUrl(projectName!, packageName!, getDownloadRef(a))}
href={getDownloadUrl(projectName!, packageName!, t.name)}
className="btn btn-icon"
download
title="Download"
@@ -456,7 +484,7 @@ function PackagePage() {
</a>
<button
className="btn btn-icon"
onClick={(e) => handleMenuOpen(e, a.id)}
onClick={(e) => handleMenuOpen(e, t.id)}
title="More actions"
>
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
@@ -470,8 +498,8 @@ function PackagePage() {
},
];
// Find the artifact for the open menu
const openMenuArtifact = artifacts.find(a => a.id === openMenuId);
// Find the tag for the open menu
const openMenuTag = tags.find(t => t.id === openMenuId);
// Close menu when clicking outside
const handleClickOutside = () => {
@@ -483,8 +511,8 @@ function PackagePage() {
// Render dropdown menu as a portal-like element
const renderActionMenu = () => {
if (!openMenuId || !menuPosition || !openMenuArtifact) return null;
const a = openMenuArtifact;
if (!openMenuId || !menuPosition || !openMenuTag) return null;
const t = openMenuTag;
return (
<div
className="action-menu-backdrop"
@@ -495,16 +523,21 @@ function PackagePage() {
style={{ top: menuPosition.top, left: menuPosition.left }}
onClick={(e) => e.stopPropagation()}
>
<button onClick={() => { setViewArtifactId(a.id); setShowArtifactIdModal(true); setOpenMenuId(null); setMenuPosition(null); }}>
<button onClick={() => { setViewArtifactId(t.artifact_id); setShowArtifactIdModal(true); setOpenMenuId(null); setMenuPosition(null); }}>
View Artifact ID
</button>
<button onClick={() => { navigator.clipboard.writeText(a.id); setOpenMenuId(null); setMenuPosition(null); }}>
<button onClick={() => { navigator.clipboard.writeText(t.artifact_id); setOpenMenuId(null); setMenuPosition(null); }}>
Copy Artifact ID
</button>
<button onClick={() => { const version = getArtifactVersion(a); const ref = version || `artifact:${a.id}`; fetchEnsureFileForRef(ref); setOpenMenuId(null); setMenuPosition(null); }}>
<button onClick={() => { fetchEnsureFileForTag(t.name); setOpenMenuId(null); setMenuPosition(null); }}>
View Ensure File
</button>
<button onClick={() => { handleArtifactSelect(a); setShowDepsModal(true); setOpenMenuId(null); setMenuPosition(null); }}>
{canWrite && !isSystemProject && (
<button onClick={() => { setCreateTagArtifactId(t.artifact_id); setShowCreateTagModal(true); setOpenMenuId(null); setMenuPosition(null); }}>
Create/Update Tag
</button>
)}
<button onClick={() => { handleTagSelect(t); setShowDepsModal(true); setOpenMenuId(null); setMenuPosition(null); }}>
View Dependencies
</button>
</div>
@@ -512,7 +545,7 @@ function PackagePage() {
);
};
if (loading && !artifactsData) {
if (loading && !tagsData) {
return <div className="loading">Loading...</div>;
}
@@ -581,8 +614,13 @@ function PackagePage() {
</>
)}
</div>
{pkg && pkg.artifact_count !== undefined && (
{pkg && (pkg.tag_count !== undefined || pkg.artifact_count !== undefined) && (
<div className="package-header-stats">
{!isSystemProject && pkg.tag_count !== undefined && (
<span className="stat-item">
<strong>{pkg.tag_count}</strong> tags
</span>
)}
{pkg.artifact_count !== undefined && (
<span className="stat-item">
<strong>{pkg.artifact_count}</strong> {isSystemProject ? 'versions' : 'artifacts'}
@@ -593,6 +631,11 @@ function PackagePage() {
<strong>{formatBytes(pkg.total_size)}</strong> total
</span>
)}
{!isSystemProject && pkg.latest_tag && (
<span className="stat-item">
Latest: <strong className="accent">{pkg.latest_tag}</strong>
</span>
)}
</div>
)}
</div>
@@ -603,14 +646,14 @@ function PackagePage() {
<div className="section-header">
<h2>{isSystemProject ? 'Versions' : 'Artifacts'}</h2>
<h2>{isSystemProject ? 'Versions' : 'Tags / Versions'}</h2>
</div>
<div className="list-controls">
<SearchInput
value={search}
onChange={handleSearchChange}
placeholder="Filter artifacts..."
placeholder="Filter tags..."
className="list-controls__search"
/>
</div>
@@ -623,13 +666,13 @@ function PackagePage() {
<div className="data-table--responsive">
<DataTable
data={artifacts}
data={tags}
columns={columns}
keyExtractor={(a) => a.id}
keyExtractor={(t) => t.id}
emptyMessage={
hasActiveFilters
? 'No artifacts match your filters. Try adjusting your search.'
: 'No artifacts yet. Upload a file to get started!'
? 'No tags match your filters. Try adjusting your search.'
: 'No tags yet. Upload an artifact with a tag to create one!'
}
onSort={handleSortChange}
sortKey={sort}
@@ -647,13 +690,10 @@ function PackagePage() {
/>
)}
{/* Used By (Reverse Dependencies) Section - only show if there are reverse deps or error */}
{(reverseDeps.length > 0 || reverseDepsError) && (
{/* Used By (Reverse Dependencies) Section - only show if there are reverse deps */}
{reverseDeps.length > 0 && (
<div className="used-by-section card">
<h3>Used By</h3>
{reverseDepsError && (
<div className="error-message">{reverseDepsError}</div>
)}
<div className="reverse-deps-list">
<div className="deps-summary">
{reverseDepsTotal} {reverseDepsTotal === 1 ? 'package depends' : 'packages depend'} on this:
@@ -699,12 +739,24 @@ function PackagePage() {
</div>
)}
<div className="usage-section card">
<h3>Usage</h3>
<p>Download artifacts using:</p>
<pre>
<code>curl -O {window.location.origin}/api/v1/project/{projectName}/{packageName}/+/latest</code>
</pre>
<p>Or with a specific tag:</p>
<pre>
<code>curl -O {window.location.origin}/api/v1/project/{projectName}/{packageName}/+/v1.0.0</code>
</pre>
</div>
{/* Dependency Graph Modal */}
{showGraph && selectedArtifact && (
{showGraph && selectedTag && (
<DependencyGraph
projectName={projectName!}
packageName={packageName!}
tagName={getArtifactVersion(selectedArtifact) || `artifact:${selectedArtifact.id}`}
tagName={selectedTag.name}
onClose={() => setShowGraph(false)}
/>
)}
@@ -727,12 +779,24 @@ function PackagePage() {
</button>
</div>
<div className="modal-body">
<div className="form-group">
<label htmlFor="upload-tag">Tag (optional)</label>
<input
id="upload-tag"
type="text"
value={uploadTag}
onChange={(e) => setUploadTag(e.target.value)}
placeholder="v1.0.0, latest, stable..."
/>
</div>
<DragDropUpload
projectName={projectName!}
packageName={packageName!}
tag={uploadTag || undefined}
onUploadComplete={(result) => {
handleUploadComplete(result);
setShowUploadModal(false);
setUploadTag('');
}}
onUploadError={handleUploadError}
/>
@@ -741,6 +805,74 @@ function PackagePage() {
</div>
)}
{/* Create/Update Tag Modal */}
{showCreateTagModal && (
<div className="modal-overlay" onClick={() => setShowCreateTagModal(false)}>
<div className="create-tag-modal" onClick={(e) => e.stopPropagation()}>
<div className="modal-header">
<h3>Create / Update Tag</h3>
<button
className="modal-close"
onClick={() => { setShowCreateTagModal(false); setCreateTagName(''); setCreateTagArtifactId(''); }}
title="Close"
>
<svg width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2">
<line x1="18" y1="6" x2="6" y2="18"></line>
<line x1="6" y1="6" x2="18" y2="18"></line>
</svg>
</button>
</div>
<div className="modal-body">
<p className="modal-description">Point a tag at an artifact by its ID</p>
<form onSubmit={(e) => { handleCreateTag(e); setShowCreateTagModal(false); }}>
<div className="form-group">
<label htmlFor="modal-tag-name">Tag Name</label>
<input
id="modal-tag-name"
type="text"
value={createTagName}
onChange={(e) => setCreateTagName(e.target.value)}
placeholder="latest, stable, v1.0.0..."
disabled={createTagLoading}
/>
</div>
<div className="form-group">
<label htmlFor="modal-artifact-id">Artifact ID</label>
<input
id="modal-artifact-id"
type="text"
value={createTagArtifactId}
onChange={(e) => setCreateTagArtifactId(e.target.value.toLowerCase().replace(/[^a-f0-9]/g, '').slice(0, 64))}
placeholder="SHA256 hash (64 hex characters)"
className="artifact-id-input"
disabled={createTagLoading}
/>
{createTagArtifactId.length > 0 && createTagArtifactId.length !== 64 && (
<p className="validation-hint">{createTagArtifactId.length}/64 characters</p>
)}
</div>
<div className="modal-actions">
<button
type="button"
className="btn btn-secondary"
onClick={() => { setShowCreateTagModal(false); setCreateTagName(''); setCreateTagArtifactId(''); }}
>
Cancel
</button>
<button
type="submit"
className="btn btn-primary"
disabled={createTagLoading || !createTagName.trim() || createTagArtifactId.length !== 64}
>
{createTagLoading ? 'Creating...' : 'Create Tag'}
</button>
</div>
</form>
</div>
</div>
</div>
)}
{/* Ensure File Modal */}
{showEnsureFile && (
<div className="modal-overlay" onClick={() => setShowEnsureFile(false)}>
@@ -784,11 +916,11 @@ function PackagePage() {
)}
{/* Dependencies Modal */}
{showDepsModal && selectedArtifact && (
{showDepsModal && selectedTag && (
<div className="modal-overlay" onClick={() => setShowDepsModal(false)}>
<div className="deps-modal" onClick={(e) => e.stopPropagation()}>
<div className="modal-header">
<h3>Dependencies for {selectedArtifact.original_name || selectedArtifact.id.slice(0, 12)}</h3>
<h3>Dependencies for {selectedTag.version || selectedTag.name}</h3>
<button
className="modal-close"
onClick={() => setShowDepsModal(false)}
@@ -838,7 +970,7 @@ function PackagePage() {
{dep.project}/{dep.package}
</Link>
<span className="dep-constraint">
@ {dep.version}
@ {dep.version || dep.tag}
</span>
<span className="dep-status dep-status--ok" title="Package exists">
&#10003;

View File

@@ -349,9 +349,9 @@ function ProjectPage() {
render: (pkg: Package) => <Badge variant="default">{pkg.format}</Badge>,
}] : []),
...(!project?.is_system ? [{
key: 'version_count',
header: 'Versions',
render: (pkg: Package) => pkg.version_count ?? '—',
key: 'tag_count',
header: 'Tags',
render: (pkg: Package) => pkg.tag_count ?? '—',
}] : []),
{
key: 'artifact_count',
@@ -365,10 +365,10 @@ function ProjectPage() {
pkg.total_size !== undefined && pkg.total_size > 0 ? formatBytes(pkg.total_size) : '—',
},
...(!project?.is_system ? [{
key: 'latest_version',
key: 'latest_tag',
header: 'Latest',
render: (pkg: Package) =>
pkg.latest_version ? <strong style={{ color: 'var(--accent-primary)' }}>{pkg.latest_version}</strong> : '—',
pkg.latest_tag ? <strong style={{ color: 'var(--accent-primary)' }}>{pkg.latest_tag}</strong> : '—',
}] : []),
{
key: 'created_at',

View File

@@ -19,6 +19,12 @@ export interface Project {
team_name?: string | null;
}
export interface TagSummary {
name: string;
artifact_id: string;
created_at: string;
}
export interface Package {
id: string;
project_id: string;
@@ -29,11 +35,12 @@ export interface Package {
created_at: string;
updated_at: string;
// Aggregated fields (from PackageDetailResponse)
tag_count?: number;
artifact_count?: number;
version_count?: number;
total_size?: number;
latest_tag?: string | null;
latest_upload_at?: string | null;
latest_version?: string | null;
recent_tags?: TagSummary[];
}
export interface Artifact {
@@ -46,19 +53,22 @@ export interface Artifact {
ref_count: number;
}
export interface PackageArtifact {
export interface Tag {
id: string;
sha256: string;
size: number;
content_type: string | null;
original_name: string | null;
checksum_md5?: string | null;
checksum_sha1?: string | null;
s3_etag?: string | null;
package_id: string;
name: string;
artifact_id: string;
created_at: string;
created_by: string;
format_metadata?: Record<string, unknown> | null;
version?: string | null; // Version from PackageVersion if exists
}
export interface TagDetail extends Tag {
artifact_size: number;
artifact_content_type: string | null;
artifact_original_name: string | null;
artifact_created_at: string;
artifact_format_metadata: Record<string, unknown> | null;
version: string | null;
}
export interface PackageVersion {
@@ -73,9 +83,20 @@ export interface PackageVersion {
size?: number;
content_type?: string | null;
original_name?: string | null;
tags?: string[];
}
export interface ArtifactDetail extends Artifact {}
export interface ArtifactTagInfo {
id: string;
name: string;
package_id: string;
package_name: string;
project_name: string;
}
export interface ArtifactDetail extends Artifact {
tags: ArtifactTagInfo[];
}
export interface PaginatedResponse<T> {
items: T[];
@@ -95,6 +116,8 @@ export interface ListParams {
order?: 'asc' | 'desc';
}
export interface TagListParams extends ListParams {}
export interface PackageListParams extends ListParams {
format?: string;
platform?: string;
@@ -119,6 +142,7 @@ export interface UploadResponse {
size: number;
project: string;
package: string;
tag: string | null;
version: string | null;
version_source: string | null;
}
@@ -141,8 +165,9 @@ export interface SearchResultPackage {
}
export interface SearchResultArtifact {
tag_id: string;
tag_name: string;
artifact_id: string;
version: string | null;
package_id: string;
package_name: string;
project_name: string;
@@ -365,7 +390,8 @@ export interface Dependency {
artifact_id: string;
project: string;
package: string;
version: string;
version: string | null;
tag: string | null;
created_at: string;
}
@@ -379,6 +405,7 @@ export interface DependentInfo {
project: string;
package: string;
version: string | null;
constraint_type: 'version' | 'tag';
constraint_value: string;
}
@@ -401,6 +428,7 @@ export interface ResolvedArtifact {
project: string;
package: string;
version: string | null;
tag: string | null;
size: number;
download_url: string;
}

View File

@@ -228,7 +228,7 @@ minioIngress:
secretName: minio-tls # Overridden by CI
redis:
enabled: true
enabled: false
waitForDatabase: true

View File

@@ -140,7 +140,7 @@ minioIngress:
enabled: false
redis:
enabled: true
enabled: false
waitForDatabase: true

View File

@@ -146,7 +146,7 @@ minioIngress:
# Redis subchart configuration (for future caching)
redis:
enabled: true
enabled: false
image:
registry: containers.global.bsf.tools
repository: bitnami/redis

View File

@@ -1,33 +0,0 @@
-- Migration: Remove tag system
-- Date: 2026-02-03
-- Description: Remove tags table and related objects, keeping only versions for artifact references
-- Drop triggers on tags table
DROP TRIGGER IF EXISTS tags_ref_count_insert_trigger ON tags;
DROP TRIGGER IF EXISTS tags_ref_count_delete_trigger ON tags;
DROP TRIGGER IF EXISTS tags_ref_count_update_trigger ON tags;
DROP TRIGGER IF EXISTS tags_updated_at_trigger ON tags;
DROP TRIGGER IF EXISTS tag_changes_trigger ON tags;
-- Drop the tag change tracking function
DROP FUNCTION IF EXISTS track_tag_changes();
-- Remove tag_constraint from artifact_dependencies
-- First drop the constraint that requires either version or tag
ALTER TABLE artifact_dependencies DROP CONSTRAINT IF EXISTS check_constraint_type;
-- Remove the tag_constraint column
ALTER TABLE artifact_dependencies DROP COLUMN IF EXISTS tag_constraint;
-- Make version_constraint NOT NULL (now the only option)
UPDATE artifact_dependencies SET version_constraint = '*' WHERE version_constraint IS NULL;
ALTER TABLE artifact_dependencies ALTER COLUMN version_constraint SET NOT NULL;
-- Drop tag_history table first (depends on tags)
DROP TABLE IF EXISTS tag_history;
-- Drop tags table
DROP TABLE IF EXISTS tags;
-- Rename uploads.tag_name to uploads.version (historical data field)
ALTER TABLE uploads RENAME COLUMN tag_name TO version;

View File

@@ -1,19 +0,0 @@
{
"name": "EC2 Provisioner Dev Container",
"image": "registry.global.bsf.tools/esv/bsf/bsf-integration/dev-env-setup/provisioner_image:v0.18.1",
"mounts": [
"source=${localEnv:HOME}/.ssh,target=/home/user/.ssh,type=bind,consistency=cached",
"source=${localEnv:HOME}/.okta,target=/home/user/.okta,type=bind,consistency=cached",
"source=${localEnv:HOME}/.netrc,target=/home/user/.netrc,type=bind,consistency=cached"
],
"forwardPorts": [
8000
],
"runArgs": [
"--network=host"
],
"containerUser": "ubuntu",
"remoteUser": "ubuntu",
"updateRemoteUserUID": true,
"onCreateCommand": "sudo usermod -s /bin/bash ubuntu"
}

View File

@@ -1,70 +0,0 @@
data "aws_caller_identity" "current" {}
# Main S3 bucket policy to reject HTTPS requests
data "aws_iam_policy_document" "s3_reject_https_policy" {
statement {
sid = "s3RejectHTTPS"
effect = "Deny"
principals {
type = "*"
identifiers = ["*"]
}
actions = ["s3:*"]
resources = [
aws_s3_bucket.s3_bucket.arn,
"${aws_s3_bucket.s3_bucket.arn}/*",
]
condition {
test = "Bool"
variable = "aws:SecureTransport"
values = ["false"]
}
}
}
# Logging bucket policy to reject HTTPS requests and take logs
data "aws_iam_policy_document" "logging_bucket_policy" {
statement {
principals {
identifiers = ["logging.s3.amazonaws.com"]
type = "Service"
}
actions = ["s3:PutObject"]
resources = ["${aws_s3_bucket.logging.arn}/*"]
condition {
test = "StringEquals"
variable = "aws:SourceAccount"
values = [data.aws_caller_identity.current.account_id]
}
}
statement {
sid = "loggingRejectHTTPS"
effect = "Deny"
principals {
type = "*"
identifiers = ["*"]
}
actions = ["s3:*"]
resources = [
aws_s3_bucket.logging.arn,
"${aws_s3_bucket.logging.arn}/*"
]
condition {
test = "Bool"
variable = "aws:SecureTransport"
values = ["false"]
}
}
}

View File

@@ -1,12 +0,0 @@
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 6.28"
}
}
}
provider "aws" {
region = "us-gov-west-1"
}

View File

@@ -1,137 +0,0 @@
# Disable warnings about MFA delete and IAM access analyzer (currently cannot support them)
# kics-scan disable=c5b31ab9-0f26-4a49-b8aa-4cc064392f4d,e592a0c5-5bdb-414c-9066-5dba7cdea370
# Bucket to actually store artifacts
resource "aws_s3_bucket" "s3_bucket" {
bucket = var.bucket
tags = {
Name = "Orchard S3 Provisioning Bucket"
Environment = var.environment
}
}
# Control public access
resource "aws_s3_bucket_public_access_block" "s3_bucket_public_access_block" {
bucket = aws_s3_bucket.s3_bucket.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
/*
Our lifecycle rule is as follows:
- Standard storage
-> OneZone IA storage after 30 days
-> Glacier storage after 180 days
*/
resource "aws_s3_bucket_lifecycle_configuration" "s3_bucket_lifecycle_configuration" {
bucket = aws_s3_bucket.s3_bucket.id
rule {
id = "Standard to OneZone"
filter {}
status = "Enabled"
transition {
days = 30
storage_class = "ONEZONE_IA"
}
}
rule {
id = "OneZone to Glacier"
filter {}
status = "Enabled"
transition {
days = 180
storage_class = "GLACIER"
}
}
}
# Enable versioning but without MFA delete enabled
resource "aws_s3_bucket_versioning" "s3_bucket_versioning" {
bucket = aws_s3_bucket.s3_bucket.id
versioning_configuration {
status = "Enabled"
}
}
# Give preference to the bucket owner
resource "aws_s3_bucket_ownership_controls" "s3_bucket_ownership_controls" {
bucket = aws_s3_bucket.s3_bucket.id
rule {
object_ownership = "BucketOwnerPreferred"
}
}
# Set access control list to private
resource "aws_s3_bucket_acl" "s3_bucket_acl" {
depends_on = [aws_s3_bucket_ownership_controls.s3_bucket_ownership_controls]
bucket = aws_s3_bucket.s3_bucket.id
acl = var.acl
}
# Bucket for logging
resource "aws_s3_bucket" "logging" {
bucket = "orchard-logging-bucket"
tags = {
Name = "Orchard S3 Logging Bucket"
Environment = var.environment
}
}
# Versioning for the logging bucket
resource "aws_s3_bucket_versioning" "orchard_logging_bucket_versioning" {
bucket = aws_s3_bucket.logging.id
versioning_configuration {
status = "Enabled"
}
}
# Policies for the main s3 bucket and the logging bucket
resource "aws_s3_bucket_policy" "s3_bucket_https_policy" {
bucket = aws_s3_bucket.s3_bucket.id
policy = data.aws_iam_policy_document.s3_reject_https_policy.json
}
resource "aws_s3_bucket_policy" "logging_policy" {
bucket = aws_s3_bucket.logging.bucket
policy = data.aws_iam_policy_document.logging_bucket_policy.json
}
# Set up the logging bucket with folders with logs for both buckets
resource "aws_s3_bucket_logging" "s3_bucket_logging" {
bucket = aws_s3_bucket.s3_bucket.bucket
target_bucket = aws_s3_bucket.logging.bucket
target_prefix = "s3_log/"
target_object_key_format {
partitioned_prefix {
partition_date_source = "EventTime"
}
}
}
resource "aws_s3_bucket_logging" "logging_bucket_logging" {
bucket = aws_s3_bucket.logging.bucket
target_bucket = aws_s3_bucket.logging.bucket
target_prefix = "log/"
target_object_key_format {
partitioned_prefix {
partition_date_source = "EventTime"
}
}
}

View File

@@ -1,17 +0,0 @@
variable "bucket" {
description = "Name of the S3 bucket"
type = string
default = "orchard-provisioning-bucket"
}
variable "acl" {
description = "Access control list for the bucket"
type = string
default = "private"
}
variable "environment" {
description = "Environment of the bucket"
type = string
default = "Development"
}