orchard

Author	SHA1	Message	Date
Mondo Diaz	c21af708af	Fix PyPI proxy timeout by streaming from S3 instead of loading into memory. Large packages like TensorFlow (~600MB) caused read timeouts because the entire file was loaded into memory before responding to the client. Now the file is stored to S3 first, then streamed back using StreamingResponse.	2026-02-03 16:42:30 -06:00
Mondo Diaz	c4c9c20763	Remove tag system, use versions only for artifact references. Tags were mutable aliases that caused confusion alongside the immutable version system. This removes tags entirely, keeping only PackageVersion for artifact references. Changes: remove tags and tag_history tables (migration 012); remove Tag model, TagRepository, and 6 tag API endpoints; update cache system to create versions instead of tags; update frontend to display versions instead of tags; remove tag-related schemas and types; update artifact cleanup service for version-based ref_count.	2026-02-03 12:18:19 -06:00
Mondo Diaz	17d3004058	Pass upstream policy errors through PyPI proxy to users: add _parse_upstream_error() to extract policy messages from JFrog/Artifactory; pass through 403 and other 4xx errors with detailed messages; pin babel and electron-to-chromium to older versions for CI compatibility.	2026-02-03 08:09:08 -06:00
Mondo Diaz	081cc6df83	Remove proactive PyPI dependency caching feature. The background task queue for proactively caching package dependencies was causing server instability and unnecessary growth. The PyPI proxy now only caches packages on-demand when users request them. Removed: PyPI cache worker (background task queue and worker pool); PyPICacheTask model and related database schema; cache management API endpoints (/pypi/cache/*); Background Jobs admin dashboard; dependency extraction and queueing logic. Kept: on-demand package caching (still works when users request packages); async httpx for non-blocking downloads (prevents health check failures); URL-based cache lookups for deduplication.	2026-02-02 16:17:33 -06:00
Mondo Diaz	1329d380a4	Convert PyPI proxy from sync to async httpx to prevent event loop blocking. The pypi_download_file, pypi_simple_index, and pypi_package_versions endpoints were using synchronous httpx.Client inside async functions. When upstream PyPI servers respond slowly, this blocked the entire FastAPI event loop, preventing health checks from responding. Kubernetes would then kill the pod after the liveness probe timed out. Changes: httpx.Client → httpx.AsyncClient; client.get() → await client.get(); response.iter_bytes() → response.aiter_bytes(). This ensures the event loop remains responsive during slow upstream downloads, allowing health checks to succeed even when downloads take 20+ seconds.	2026-02-02 15:26:24 -06:00
Mondo Diaz	361210a2bc	Add cancel job button and improve jobs table UI: remove "All Jobs" title; move Status column to front of table; add Cancel button for in-progress jobs; add cancel endpoint: POST /pypi/cache/cancel/{package_name}; add btn-danger CSS styling.	2026-02-02 15:18:59 -06:00
Mondo Diaz	415ad9a29a	Stream downloads to temp file to reduce memory usage: download packages in 64KB chunks to temp file instead of loading into memory; upload to S3 from temp file (streaming); clean up temp file after processing; reduces memory footprint from 2x file size to 1x file size.	2026-02-02 15:10:25 -06:00
Mondo Diaz	92edef92e6	Redesign jobs dashboard with unified table and progress bar: add overall progress bar showing completed/active/failed counts; unify all job types into single table with Type column; simplify status to Working/Pending/Failed badges; remove NPM "Coming Soon" section; add get_recent_activity() function for future activity feed; fix dark mode CSS using CSS variables.	2026-02-02 14:34:48 -06:00
Mondo Diaz	1138309aaa	Add Active Workers table to Background Jobs dashboard. Shows currently processing cache tasks in a dynamic table with: package name and version constraint being cached; recursion depth and attempt number; start timestamp; pulsing indicator to show live activity. Backend changes: add get_active_tasks() function to pypi_cache_worker.py; add GET /pypi/cache/active endpoint to pypi_proxy.py. Frontend changes: add PyPICacheActiveTask type; add getPyPICacheActiveTasks() API function; add Active Workers section with animated table; auto-refreshes every 5 seconds with existing data.	2026-02-02 13:50:45 -06:00
Mondo Diaz	3bdeade7ca	Fix nested dependency depth tracking in PyPI cache worker. When the cache worker downloaded a package through the proxy, dependencies were always queued with depth=0 instead of depth+1. This meant depth limits weren't properly enforced for nested dependencies. Changes: add cache-depth query parameter to pypi_download_file endpoint; worker now passes its current depth when fetching packages; dependencies are queued at cache_depth+1 instead of hardcoded 0; add tests for depth tracking behavior.	2026-02-02 13:47:22 -06:00
Mondo Diaz	97b39d000b	Add security fixes and code cleanup for PyPI cache: add require_admin authentication to cache management endpoints; add limit validation (1-500) on failed tasks query; add thread lock for worker pool thread safety; fix exception handling with separate recovery DB session; remove obsolete design doc.	2026-02-02 11:37:25 -06:00
Mondo Diaz	d274f3f375	Add robust PyPI dependency caching with task queue. Replace unbounded thread spawning with managed worker pool: new pypi_cache_tasks table tracks caching jobs; thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS); automatic retries with exponential backoff (30s, 60s, then fail); deduplication to prevent duplicate caching attempts. New API endpoints for visibility and control: GET /pypi/cache/status (queue health summary); GET /pypi/cache/failed (list failed tasks with errors); POST /pypi/cache/retry/{package} (retry single package); POST /pypi/cache/retry-all (retry all failed packages). This fixes silent failures in background dependency caching where packages would fail to cache without any tracking or retry mechanism.	2026-02-02 11:16:02 -06:00
Mondo Diaz	3c2ab70ef0	Fix proactive dependency caching HTTPS redirect issue. When background threads fetch from our own proxy using the request's base_url, it returns http:// but ingress requires https://. The 308 redirect was dropping trailing slashes, causing requests to hit the frontend catch-all route instead of /pypi/simple/. Force HTTPS explicitly in the background caching function to avoid the redirect entirely.	2026-01-30 18:59:31 -06:00
Mondo Diaz	109a593f83	Add debug logging for proactive caching regex failures	2026-01-30 18:43:09 -06:00
Mondo Diaz	1d727b3f8c	Fix proactive caching regex to match both hyphens and underscores. PEP 503 normalizes package names to use hyphens, but wheel filenames may use underscores (e.g., typing_extensions-4.0.0-py3-none-any.whl). Convert the search pattern to match either separator.	2026-01-30 18:25:30 -06:00
Mondo Diaz	47aa0afe91	Fix proactive caching failing on HTTP->HTTPS redirects. The background dependency caching was getting 308 redirects because request.base_url returns http:// but the ingress redirects to https://. Enable follow_redirects=True in httpx client to handle this.	2026-01-30 18:11:08 -06:00
Mondo Diaz	f992fc540e	Add proactive dependency caching for PyPI packages. When a PyPI package is cached, its dependencies are now automatically fetched in background threads. This ensures the entire dependency tree is cached even if pip already has some packages installed locally. Features: background threads fetch each dependency without blocking the response; uses our own proxy endpoint to cache, which recursively caches transitive deps; max depth of 10 to prevent infinite loops; daemon threads so they don't block process shutdown.	2026-01-30 17:45:30 -06:00
Mondo Diaz	044a6c1d27	Fix duplicate dependency constraint causing 500 errors: deduplicate dependencies by package name before inserting; some packages (like anyio) list the same dep (trio) multiple times with different version constraints for different extras; the unique constraint on (artifact_id, project, package) rejected these; also removed debug logging from dependencies.py.	2026-01-30 17:43:49 -06:00
Mondo Diaz	47b3eb439d	Extract and store dependencies from PyPI packages: add functions to parse Requires-Dist metadata from wheel and sdist files; store extracted dependencies in artifact_dependencies table; fix streaming response for cached artifacts (proper tuple unpacking); fix version uniqueness check to use version string instead of artifact_id; skip creating versions for .metadata files.	2026-01-30 15:14:52 -06:00
Mondo Diaz	ff31379649	Fix: ensure existing _pypi project gets is_system=true	2026-01-30 13:33:31 -06:00
Mondo Diaz	f3afdd3bbf	Improve PyPI proxy and Package page UX. PyPI proxy improvements: set package format to "pypi" instead of "generic"; extract version from filename and create PackageVersion record; support .whl, .tar.gz, and .zip filename formats. Package page UX overhaul: move upload to header button with modal; simplify table (combine Tag/Version, remove Type and Artifact ID columns); add row action menu (⋯) with Copy ID, Ensure File, Create Tag, Dependencies; remove cluttered "Download by Artifact ID" and "Create/Update Tag" sections; add modals for upload and create tag actions; cleaner, more scannable table layout.	2026-01-30 11:52:37 -06:00
Mondo Diaz	2dc7fe5a7b	Fix PyPI proxy: use correct storage method and make project public. Use storage.get_stream(s3_key) instead of non-existent get_artifact_stream(); make _pypi project public (is_public=True) so cached packages are visible.	2026-01-30 10:59:50 -06:00
Mondo Diaz	534e4b964f	Fix Project and Tag model fields in PyPI proxy. Use correct model fields: Project uses is_public, is_system, created_by (not visibility); Tag gets the required created_by field.	2026-01-30 10:29:25 -06:00
Mondo Diaz	757e43fc34	Fix Artifact model field names in PyPI proxy. Use correct Artifact model fields: original_name instead of filename; add required created_by and s3_key fields; include checksum fields from storage result.	2026-01-30 09:58:15 -06:00
Mondo Diaz	d78092de55	Fix PyPI proxy to use correct storage.store() method. The code was calling storage.store_artifact(), which doesn't exist. Changed to use storage.store(), which handles content-addressable storage with automatic deduplication.	2026-01-30 09:41:34 -06:00
Mondo Diaz	0fa991f536	Allow full path in PyPI upstream source URL. Users can now configure the full path including /simple in their upstream source URL (e.g., https://example.com/api/pypi/repo/simple) instead of having the code append /simple/ automatically. This matches pip's --index-url format, making configuration more intuitive and copy/paste friendly.	2026-01-30 09:24:05 -06:00
Mondo Diaz	4dc54ace8a	Fix HTTPS scheme detection behind reverse proxy. When behind a reverse proxy that terminates SSL, the server sees HTTP requests internally. Added _get_base_url() helper that respects the X-Forwarded-Proto header to generate correct external HTTPS URLs. This fixes links in the PyPI simple index showing http:// instead of https:// when accessed via HTTPS through a load balancer.	2026-01-29 18:02:21 -06:00
Mondo Diaz	64bfd3902f	Fix relative URL handling in PyPI proxy. Artifactory and other registries may return relative URLs in their Simple API responses (e.g., ../../packages/...). The proxy now resolves these to absolute URLs using urljoin() before encoding them in the upstream parameter. This fixes package downloads failing when the upstream registry uses relative URLs in its package index.	2026-01-29 18:01:19 -06:00
Mondo Diaz	bdfed77cb1	Remove dead code from pypi_proxy.py: remove unused imports (UpstreamClient, UpstreamClientConfig, UpstreamHTTPError, UpstreamConnectionError, UpstreamTimeoutError); simplify matched_source selection logic, removing dead conditional that always evaluated to True due to 'or True'.	2026-01-29 16:42:53 -06:00
Mondo Diaz	140f6c926a	Fix httpx.Timeout configuration in PyPI proxy. httpx.Timeout requires either a default value or all four parameters. Changed to httpx.Timeout(default, connect=X) format.	2026-01-29 16:40:06 -06:00
Mondo Diaz	97498b2f86	Add transparent PyPI proxy and improve upstream sources UI	2026-01-29 16:12:57 -06:00
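
Note on 1d727b3f8c: the separator fix described there can be sketched roughly as below. This is an illustration only, not code from the repository. The helper names `normalize` and `filename_pattern` are hypothetical, and the sketch assumes the version part of a filename starts with a digit.

```python
import re


def normalize(name: str) -> str:
    """PEP 503 normalization: runs of '-', '_' or '.' become one '-', lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()


def filename_pattern(package: str) -> re.Pattern:
    """Hypothetical helper: match distribution filenames for `package`,
    accepting '-', '_' or '.' wherever the normalized name has a hyphen."""
    parts = normalize(package).split("-")
    name_re = "[-_.]+".join(re.escape(part) for part in parts)
    # Assumption: the name is followed by '-' and a version starting with a digit.
    return re.compile(rf"^{name_re}-\d", re.IGNORECASE)


pattern = filename_pattern("typing-extensions")
matches_wheel = pattern.match("typing_extensions-4.0.0-py3-none-any.whl") is not None
matches_other = pattern.match("typing-4.0.0.tar.gz") is not None
```

Here `matches_wheel` is true for the underscore-style wheel filename, while `matches_other` is false because the filename belongs to a different package.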

31 Commits
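
Note on 64bfd3902f: a minimal sketch of resolving relative Simple API links with the standard library's `urljoin()`. The index URL, the filename, and the helper name `resolve_link` are made up for illustration; the actual proxy code may differ.

```python
from urllib.parse import urljoin


def resolve_link(index_url: str, href: str) -> str:
    """Hypothetical helper: turn an href from a Simple API page into an
    absolute URL. Absolute hrefs pass through unchanged."""
    return urljoin(index_url, href)


# Assumed example values, not taken from the repository.
index_url = "https://example.com/api/pypi/repo/simple/requests/"
relative = "../../packages/requests-2.31.0-py3-none-any.whl"

absolute = resolve_link(index_url, relative)
# absolute == "https://example.com/api/pypi/repo/packages/requests-2.31.0-py3-none-any.whl"
```

The trailing slash on the index URL matters: without it, `urljoin()` treats the last path segment as a file and resolves `..` one level higher.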