- Add artifact-level self-dependency check (skip if dep resolves to same artifact)
- Close dependency graph modal if package has no dependencies to show
(only root package with no children and no missing deps)
PyPI packages can have self-referential dependencies for extras
(e.g., pytest[testing] depends on pytest). These were incorrectly
detected as circular dependencies. Now we skip them.
The backend returns detail as an object for some errors (circular dependency,
conflicts, etc.). The API client now JSON.stringifies object details so they
can be properly parsed by error handlers like DependencyGraph.
When dependencies are not cached on the server (common since we removed
proactive caching), the dependency graph now:
- Continues resolving what it can find
- Shows missing dependencies in a separate section with amber styling
- Displays the constraint and which package required them
- Updates the header stats to show "X cached • Y not cached"
This provides a better user experience than showing an error when
some dependencies haven't been downloaded yet.
When a dependency has an invalid version constraint like '>=' (without
a version number), the resolver now treats it as a wildcard and returns
the latest available version instead of failing with 'Dependency not found'.
This handles malformed metadata that may have been stored from PyPI packages.
The background task queue for proactively caching package dependencies was
causing server instability and unnecessary growth. The PyPI proxy now only
caches packages on-demand when users request them.
Removed:
- PyPI cache worker (background task queue and worker pool)
- PyPICacheTask model and related database schema
- Cache management API endpoints (/pypi/cache/*)
- Background Jobs admin dashboard
- Dependency extraction and queueing logic
Kept:
- On-demand package caching (still works when users request packages)
- Async httpx for non-blocking downloads (prevents health check failures)
- URL-based cache lookups for deduplication
The pypi_download_file, pypi_simple_index, and pypi_package_versions endpoints
were using synchronous httpx.Client inside async functions. When upstream PyPI
servers respond slowly, this blocked the entire FastAPI event loop, preventing
health checks from responding. Kubernetes would then kill the pod after the
liveness probe timed out.
Changes:
- httpx.Client → httpx.AsyncClient
- client.get() → await client.get()
- response.iter_bytes() → response.aiter_bytes()
This ensures the event loop remains responsive during slow upstream downloads,
allowing health checks to succeed even when downloads take 20+ seconds.
- Remove "All Jobs" title
- Move Status column to front of table
- Add Cancel button for in-progress jobs
- Add cancel endpoint: POST /pypi/cache/cancel/{package_name}
- Add btn-danger CSS styling
- Download packages in 64KB chunks to temp file instead of loading into memory
- Upload to S3 from temp file (streaming)
- Clean up temp file after processing
- Reduces memory footprint from 2x file size to 1x file size
- Add orchard.pypiCache config section to helm values
- Set default workers to 2 (reduced from 5 to limit memory)
- Bump pod memory from 512Mi to 768Mi (request=limit)
- Add ORCHARD_PYPI_CACHE_* env vars to deployment template
- Add overall progress bar showing completed/active/failed counts
- Unify all job types into single table with Type column
- Simplify status to Working/Pending/Failed badges
- Remove NPM "Coming Soon" section
- Add get_recent_activity() function for future activity feed
- Fix dark mode CSS using CSS variables
Backend:
- Add _recover_stale_tasks() to reset tasks stuck in 'in_progress'
from previous crashes (tasks >5 min old get reset to pending)
- Called automatically on startup
Frontend:
- Fix dark mode colors using CSS variables instead of hardcoded values
- Add elapsed time column showing how long task has been running
- Add spinning indicator next to package name
- Add status badge (Running/Stale?)
- Highlight stale tasks (>5 min) in amber
- Auto-updates every 5 seconds with existing refresh
Shows currently processing cache tasks in a dynamic table with:
- Package name and version constraint being cached
- Recursion depth and attempt number
- Start timestamp
- Pulsing indicator to show live activity
Backend changes:
- Add get_active_tasks() function to pypi_cache_worker.py
- Add GET /pypi/cache/active endpoint to pypi_proxy.py
Frontend changes:
- Add PyPICacheActiveTask type
- Add getPyPICacheActiveTasks() API function
- Add Active Workers section with animated table
- Auto-refreshes every 5 seconds with existing data
When the cache worker downloaded a package through the proxy, dependencies
were always queued with depth=0 instead of depth+1. This meant depth limits
weren't properly enforced for nested dependencies.
Changes:
- Add cache-depth query parameter to pypi_download_file endpoint
- Worker now passes its current depth when fetching packages
- Dependencies are queued at cache_depth+1 instead of hardcoded 0
- Add tests for depth tracking behavior
The dashboard was showing "All jobs completed successfully" whenever
there were no failed tasks, even if there were pending or in-progress
jobs. Now shows:
- "All jobs completed" only when pending=0 and in_progress=0
- "Jobs are processing. No failures yet." when jobs are in queue
New admin page at /admin/jobs showing:
- PyPI cache job status (pending, in-progress, completed, failed)
- Failed task list with error details
- Retry individual packages or retry all failed
- Auto-refresh every 5 seconds (toggleable)
- Placeholder for future NPM cache jobs
Accessible from admin dropdown menu as "Background Jobs".
Replace unbounded thread spawning with managed worker pool:
- New pypi_cache_tasks table tracks caching jobs
- Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS)
- Automatic retries with exponential backoff (30s, 60s, then fail)
- Deduplication to prevent duplicate caching attempts
New API endpoints for visibility and control:
- GET /pypi/cache/status - queue health summary
- GET /pypi/cache/failed - list failed tasks with errors
- POST /pypi/cache/retry/{package} - retry single package
- POST /pypi/cache/retry-all - retry all failed packages
This fixes silent failures in background dependency caching where
packages would fail to cache without any tracking or retry mechanism.
When background threads fetch from our own proxy using the request's
base_url, it returns http:// but ingress requires https://. The 308
redirect was dropping trailing slashes, causing requests to hit the
frontend catch-all route instead of /pypi/simple/.
Force HTTPS explicitly in the background caching function to avoid
the redirect entirely.
PEP 503 normalizes package names to use hyphens, but wheel filenames
may use underscores (e.g., typing_extensions-4.0.0-py3-none-any.whl).
Convert the search pattern to match either separator.
The background dependency caching was getting 308 redirects because
request.base_url returns http:// but the ingress redirects to https://.
Enable follow_redirects=True in httpx client to handle this.
When a PyPI package is cached, its dependencies are now automatically
fetched in background threads. This ensures the entire dependency tree
is cached even if pip already has some packages installed locally.
Features:
- Background threads fetch each dependency without blocking the response
- Uses our own proxy endpoint to cache, which recursively caches transitive deps
- Max depth of 10 to prevent infinite loops
- Daemon threads so they don't block process shutdown
- Deduplicate dependencies by package name before inserting
- Some packages (like anyio) list the same dep (trio) multiple times with
different version constraints for different extras
- The unique constraint on (artifact_id, project, package) rejected these
- Also removed debug logging from dependencies.py
- Parse version constraints like >=1.9, <2.0 using packaging library
- Find the latest version that satisfies the constraint
- Support wildcard (*) to get latest version
- Fall back to exact version and tag matching
- Add functions to parse Requires-Dist metadata from wheel and sdist files
- Store extracted dependencies in artifact_dependencies table
- Fix streaming response for cached artifacts (proper tuple unpacking)
- Fix version uniqueness check to use version string instead of artifact_id
- Skip creating versions for .metadata files
- Hide tag count stat for system projects (show "versions" instead of "artifacts")
- Hide "Latest" tag stat for system projects
- Change "Create/Update Tag" to only show for non-system projects
- Add "View Artifact ID" menu option with modal showing the SHA256 hash
- Move dependencies section to a modal (opened via "View Dependencies" menu)
- Add deps-modal and artifact-id-modal CSS styles
- Fix artifact_count and total_size calculation to use Tags instead of
Uploads, so PyPI cached packages show their stats correctly
- Fix PackagePage dropdown menu positioning (use fixed position with backdrop)
- Add system project detection for projects starting with "_"
- Show Version as primary column for system projects, hide Tag column
- Hide upload button for system projects (they're cache-only)
- Rename section header to "Versions" for system projects
- Fix test_projects_sort_by_name to exclude system projects from sort comparison
PyPI proxy improvements:
- Set package format to "pypi" instead of "generic"
- Extract version from filename and create PackageVersion record
- Support .whl, .tar.gz, and .zip filename formats
Package page UX overhaul:
- Move upload to header button with modal
- Simplify table: combine Tag/Version, remove Type and Artifact ID columns
- Add row action menu (⋯) with: Copy ID, Ensure File, Create Tag, Dependencies
- Remove cluttered "Download by Artifact ID" and "Create/Update Tag" sections
- Add modals for upload and create tag actions
- Cleaner, more scannable table layout
Projects owned by teams now display the team name in the Owner column
for better organizational continuity when team members change.
Falls back to created_by if no team is assigned.
- Use storage.get_stream(s3_key) instead of non-existent get_artifact_stream()
- Make _pypi project public (is_public=True) so cached packages are visible