orchard

Author	SHA1	Message	Date
Mondo Diaz	6b9863f9c3	fix: fetch root artifact from upstream when missing in auto_fetch mode When auto_fetch=true and the root artifact doesn't exist locally in a system project (_pypi), now attempts to fetch it from upstream before starting dependency resolution. Also fixed a bug where fetched_artifacts was being redeclared, which would lose the root artifact from the list.	2026-02-04 12:18:44 -06:00
Mondo Diaz	5cff4092e3	feat: add auto-fetch for missing dependencies from upstream registries Add auto_fetch parameter to dependency resolution endpoint that fetches missing dependencies from upstream registries (PyPI) when resolving. - Add RegistryClient abstraction with PyPIRegistryClient implementation - Extract fetch_and_cache_pypi_package() for reuse - Add resolve_dependencies_with_fetch() async function - Extend MissingDependency schema with fetch_attempted/fetch_error - Add fetched list to DependencyResolutionResponse - Add auto_fetch_max_depth config setting (default: 3) - Remove Usage section from Package page UI - Add 6 integration tests for auto-fetch functionality	2026-02-04 12:01:49 -06:00
Mondo Diaz	b82bd1c85a	fix: remove dead code and security issue from code review - Remove unused _get_pypi_upstream_sources_cached function (never called) - Remove unused CacheService import and get_cache helper - Remove unused cache parameter from pypi_download_file - Fix asyncio.get_event_loop() deprecation - use get_running_loop() - Note: The caching implementation was incomplete but the other performance improvements (connection pooling, batch DB ops) remain	2026-02-04 10:57:32 -06:00
Mondo Diaz	170561b32a	feat: add infrastructure status to health endpoint	2026-02-04 09:54:45 -06:00
Mondo Diaz	7ad5a15ef4	perf: use batch dependency storage in pypi_proxy	2026-02-04 09:52:16 -06:00
Mondo Diaz	8fdb73901e	perf: use shared HTTP client pool in pypi_download_file	2026-02-04 09:51:05 -06:00
Mondo Diaz	79dd7b833e	perf: cache upstream sources lookup in pypi_proxy	2026-02-04 09:49:59 -06:00
Mondo Diaz	71089aee0e	refactor: add infrastructure dependency injection to pypi_proxy Add dependency injection helper functions for HttpClientManager and CacheService, along with imports for the new infrastructure modules (http_client, cache_service, db_utils).	2026-02-04 09:49:04 -06:00
Mondo Diaz	ffe0529ea8	feat: add ArtifactRepository with batch DB operations Add optimized database operations for artifact storage: - Atomic upserts using ON CONFLICT for artifact creation - Batch inserts for dependencies to eliminate N+1 queries - Joined queries for cached URL lookups - All methods include comprehensive unit tests	2026-02-04 09:48:08 -06:00
Mondo Diaz	146ca2ad74	feat: integrate HttpClientManager and CacheService into lifespan	2026-02-04 09:45:09 -06:00
Mondo Diaz	a045509fe4	feat: add CacheService with Redis caching and graceful fallback Implements Redis-backed caching with category-aware TTL management: - Immutable categories (artifact metadata, dependencies) cached forever - Mutable categories (index pages, upstream sources) use configurable TTL - Graceful fallback when Redis unavailable or disabled - Pattern-based invalidation for bulk cache clearing	2026-02-04 09:44:12 -06:00
Mondo Diaz	14806b05f0	feat: add HttpClientManager with connection pooling Add HttpClientManager class for managing httpx.AsyncClient pools with FastAPI lifespan integration. Features include: - Default shared connection pool for general requests - Configurable max connections, keep-alive, and timeouts - Dedicated thread pool for blocking I/O operations - Graceful startup/shutdown lifecycle management - Per-upstream client isolation support (for future use) Includes comprehensive unit tests covering initialization, startup, shutdown, client retrieval, blocking operations, idempotency, and error handling.	2026-02-04 09:16:27 -06:00
Mondo Diaz	c67004af52	config: add HTTP pool, Redis, and updated DB pool settings	2026-02-04 09:12:01 -06:00
Mondo Diaz	19e034ef56	Fix duplicate dependency extraction from PyPI wheel METADATA Wheel METADATA files can list the same dependency multiple times under different extras (e.g., bokeh appears under [docs] and [bokeh-tests]). This caused unique constraint violations when storing dependencies. Fix by deduplicating extracted deps before DB insertion.	2026-02-03 17:43:38 -06:00
Mondo Diaz	45a48cc1ee	Add inline migration for tag removal (024_remove_tags) Adds the tag removal migration to the inline migrations in database.py: - Drops tag-related triggers and functions - Removes tag_constraint column from artifact_dependencies - Makes version_constraint NOT NULL - Drops tags and tag_history tables - Renames uploads.tag_name to version	2026-02-03 17:22:40 -06:00
Mondo Diaz	7068f36cb5	Restore dependency extraction from PyPI packages Re-adds the dependency extraction that was accidentally removed with the proactive caching feature. Now when a PyPI package is cached: 1. Extract METADATA from wheel or PKG-INFO from sdist 2. Parse Requires-Dist lines for dependencies 3. Store in artifact_dependencies table This restores the dependency graph functionality for PyPI packages.	2026-02-03 17:18:54 -06:00
Mondo Diaz	e471202f2e	Fix SQLAlchemy subquery warning in artifact listing	2026-02-03 17:10:34 -06:00
Mondo Diaz	d12e4cdfc5	Add configurable PyPI download mode (redirect vs proxy) Adds ORCHARD_PYPI_DOWNLOAD_MODE setting (default: "redirect"): - "redirect": Redirect pip to S3 presigned URL - reduces pod bandwidth - "proxy": Stream through Orchard pod - for environments where clients can't reach S3 In redirect mode, Orchard only handles metadata requests and upstream fetches. All file transfers go directly from S3 to the client.	2026-02-03 17:09:05 -06:00
Mondo Diaz	1ffe17bf62	Fix artifact listing to include PyPI proxy cached packages The list_package_artifacts endpoint was only querying artifacts via the Upload table. PyPI proxy creates PackageVersion records but not Upload records, so cached packages would show stats (size, version count) but no artifacts in the listing. Now queries artifacts from both Upload and PackageVersion tables using a union, so PyPI-cached packages display their artifacts correctly.	2026-02-03 16:46:35 -06:00
Mondo Diaz	c21af708af	Fix PyPI proxy timeout by streaming from S3 instead of loading into memory Large packages like TensorFlow (~600MB) caused read timeouts because the entire file was loaded into memory before responding to the client. Now the file is stored to S3 first, then streamed back using StreamingResponse.	2026-02-03 16:42:30 -06:00
Mondo Diaz	1ae989249b	Fix PackageArtifactResponse missing sha256 and version fields - Add sha256 field to list_package_artifacts response (artifact ID is SHA256) - Add version field to PackageArtifactResponse schema - Add version field to frontend PackageArtifact type - Update getArtifactVersion to prefer direct version field	2026-02-03 16:24:31 -06:00
Mondo Diaz	c0c8603d05	Fix migrations 008 and 011 to handle removed tags table	2026-02-03 16:05:29 -06:00
Mondo Diaz	2501ba21d4	Fix migration 005 to not create indexes on removed tags table	2026-02-03 16:01:09 -06:00
Mondo Diaz	c94fe0389b	Fix tests for tag removal and version behavior - Fix upload response to return actual version (not requested version) when artifact already has a version in the package - Update ref_count tests to use multiple packages (one version per artifact per package design constraint) - Remove allow_public_internet references from upstream caching tests - Update consistency check test to not assert global system health - Add versions field to artifact schemas - Fix dependencies resolution to handle removed tag constraint	2026-02-03 15:35:44 -06:00
Mondo Diaz	c4c9c20763	Remove tag system, use versions only for artifact references Tags were mutable aliases that caused confusion alongside the immutable version system. This removes tags entirely, keeping only PackageVersion for artifact references. Changes: - Remove tags and tag_history tables (migration 012) - Remove Tag model, TagRepository, and 6 tag API endpoints - Update cache system to create versions instead of tags - Update frontend to display versions instead of tags - Remove tag-related schemas and types - Update artifact cleanup service for version-based ref_count	2026-02-03 12:18:19 -06:00
Mondo Diaz	62c709e368	Remove superuser-only session_replication_role from factory reset	2026-02-03 11:19:50 -06:00
Mondo Diaz	281474d72f	Fix self-dependency detection to strip PyPI extras brackets The circular dependency error '_pypi/psutil → _pypi/psutil' occurred because dependencies with extras like 'psutil[test]' weren't being recognized as self-dependencies. The comparison 'psutil[test] != psutil' failed. - Add _normalize_pypi_package_name() helper that strips extras brackets and normalizes separators per PEP 503 - Update _detect_package_cycle to use normalized names for cycle detection - Update check_circular_dependencies to use normalized initial path - Simplify self-dependency check in resolve_dependencies to use helper	2026-02-03 10:17:13 -06:00
Mondo Diaz	bb7c30b15c	Fix circular dependency resolution by switching to artifact-centric display - Add artifact: prefix handling in resolve_dependencies for direct artifact ID references, enabling dependency resolution for tagless artifacts - Refactor PackagePage from tag-based to artifact-based data display - Add PackageArtifact type with tags array for artifact-centric API responses - Update download URLs to use artifact:ID prefix when no tags exist - Conditionally show "View Ensure File" only when artifact has tags	2026-02-03 10:00:15 -06:00
Mondo Diaz	bf2737b3a2	Fix self-dependency check to use case-insensitive PyPI name normalization	2026-02-03 08:23:39 -06:00
Mondo Diaz	17d3004058	Pass upstream policy errors through PyPI proxy to users - Add _parse_upstream_error() to extract policy messages from JFrog/Artifactory - Pass through 403 and other 4xx errors with detailed messages - Pin babel and electron-to-chromium to older versions for CI compatibility	2026-02-03 08:09:08 -06:00
Mondo Diaz	34ff9caa08	Fix circular dependency error message to show actual cycle path The error was hardcoding [pkg_key, pkg_key] regardless of actual cycle. Now tracks the path through dependencies to report the real cycle.	2026-02-02 20:43:05 -06:00
Mondo Diaz	01915bcb45	Fix circular dependency detection and hide empty graph modal - Add artifact-level self-dependency check (skip if dep resolves to same artifact) - Close dependency graph modal if package has no dependencies to show (only root package with no children and no missing deps)	2026-02-02 20:31:46 -06:00
Mondo Diaz	72952d84a1	Skip self-dependencies in dependency resolver PyPI packages can have self-referential dependencies for extras (e.g., pytest[testing] depends on pytest). These were incorrectly detected as circular dependencies. Now we skip them.	2026-02-02 19:45:34 -06:00
Mondo Diaz	b3ae3b03eb	Show missing dependencies in dependency graph instead of failing When dependencies are not cached on the server (common since we removed proactive caching), the dependency graph now: - Continues resolving what it can find - Shows missing dependencies in a separate section with amber styling - Displays the constraint and which package required them - Updates the header stats to show "X cached • Y not cached" This provides a better user experience than showing an error when some dependencies haven't been downloaded yet.	2026-02-02 16:29:37 -06:00
Mondo Diaz	ba0a658611	Fix dependency graph error for invalid version constraints When a dependency has an invalid version constraint like '>=' (without a version number), the resolver now treats it as a wildcard and returns the latest available version instead of failing with 'Dependency not found'. This handles malformed metadata that may have been stored from PyPI packages.	2026-02-02 16:26:18 -06:00
Mondo Diaz	081cc6df83	Remove proactive PyPI dependency caching feature The background task queue for proactively caching package dependencies was causing server instability and unnecessary growth. The PyPI proxy now only caches packages on-demand when users request them. Removed: - PyPI cache worker (background task queue and worker pool) - PyPICacheTask model and related database schema - Cache management API endpoints (/pypi/cache/*) - Background Jobs admin dashboard - Dependency extraction and queueing logic Kept: - On-demand package caching (still works when users request packages) - Async httpx for non-blocking downloads (prevents health check failures) - URL-based cache lookups for deduplication	2026-02-02 16:17:33 -06:00
Mondo Diaz	1329d380a4	Convert PyPI proxy from sync to async httpx to prevent event loop blocking The pypi_download_file, pypi_simple_index, and pypi_package_versions endpoints were using synchronous httpx.Client inside async functions. When upstream PyPI servers respond slowly, this blocked the entire FastAPI event loop, preventing health checks from responding. Kubernetes would then kill the pod after the liveness probe timed out. Changes: - httpx.Client → httpx.AsyncClient - client.get() → await client.get() - response.iter_bytes() → response.aiter_bytes() This ensures the event loop remains responsive during slow upstream downloads, allowing health checks to succeed even when downloads take 20+ seconds.	2026-02-02 15:26:24 -06:00
Mondo Diaz	361210a2bc	Add cancel job button and improve jobs table UI - Remove "All Jobs" title - Move Status column to front of table - Add Cancel button for in-progress jobs - Add cancel endpoint: POST /pypi/cache/cancel/{package_name} - Add btn-danger CSS styling	2026-02-02 15:18:59 -06:00
Mondo Diaz	415ad9a29a	Stream downloads to temp file to reduce memory usage - Download packages in 64KB chunks to temp file instead of loading into memory - Upload to S3 from temp file (streaming) - Clean up temp file after processing - Reduces memory footprint from 2x file size to 1x file size	2026-02-02 15:10:25 -06:00
Mondo Diaz	92edef92e6	Redesign jobs dashboard with unified table and progress bar - Add overall progress bar showing completed/active/failed counts - Unify all job types into single table with Type column - Simplify status to Working/Pending/Failed badges - Remove NPM "Coming Soon" section - Add get_recent_activity() function for future activity feed - Fix dark mode CSS using CSS variables	2026-02-02 14:34:48 -06:00
Mondo Diaz	47b137f4eb	Improve Active Workers table and recover stale tasks Backend: - Add _recover_stale_tasks() to reset tasks stuck in 'in_progress' from previous crashes (tasks >5 min old get reset to pending) - Called automatically on startup Frontend: - Fix dark mode colors using CSS variables instead of hardcoded values - Add elapsed time column showing how long task has been running - Add spinning indicator next to package name - Add status badge (Running/Stale?) - Highlight stale tasks (>5 min) in amber - Auto-updates every 5 seconds with existing refresh	2026-02-02 14:29:17 -06:00
Mondo Diaz	1138309aaa	Add Active Workers table to Background Jobs dashboard Shows currently processing cache tasks in a dynamic table with: - Package name and version constraint being cached - Recursion depth and attempt number - Start timestamp - Pulsing indicator to show live activity Backend changes: - Add get_active_tasks() function to pypi_cache_worker.py - Add GET /pypi/cache/active endpoint to pypi_proxy.py Frontend changes: - Add PyPICacheActiveTask type - Add getPyPICacheActiveTasks() API function - Add Active Workers section with animated table - Auto-refreshes every 5 seconds with existing data	2026-02-02 13:50:45 -06:00
Mondo Diaz	3bdeade7ca	Fix nested dependency depth tracking in PyPI cache worker When the cache worker downloaded a package through the proxy, dependencies were always queued with depth=0 instead of depth+1. This meant depth limits weren't properly enforced for nested dependencies. Changes: - Add cache-depth query parameter to pypi_download_file endpoint - Worker now passes its current depth when fetching packages - Dependencies are queued at cache_depth+1 instead of hardcoded 0 - Add tests for depth tracking behavior	2026-02-02 13:47:22 -06:00
Mondo Diaz	97b39d000b	Add security fixes and code cleanup for PyPI cache - Add require_admin authentication to cache management endpoints - Add limit validation (1-500) on failed tasks query - Add thread lock for worker pool thread safety - Fix exception handling with separate recovery DB session - Remove obsolete design doc	2026-02-02 11:37:25 -06:00
Mondo Diaz	d274f3f375	Add robust PyPI dependency caching with task queue Replace unbounded thread spawning with managed worker pool: - New pypi_cache_tasks table tracks caching jobs - Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS) - Automatic retries with exponential backoff (30s, 60s, then fail) - Deduplication to prevent duplicate caching attempts New API endpoints for visibility and control: - GET /pypi/cache/status - queue health summary - GET /pypi/cache/failed - list failed tasks with errors - POST /pypi/cache/retry/{package} - retry single package - POST /pypi/cache/retry-all - retry all failed packages This fixes silent failures in background dependency caching where packages would fail to cache without any tracking or retry mechanism.	2026-02-02 11:16:02 -06:00
Mondo Diaz	3c2ab70ef0	Fix proactive dependency caching HTTPS redirect issue When background threads fetch from our own proxy using the request's base_url, it returns http:// but ingress requires https://. The 308 redirect was dropping trailing slashes, causing requests to hit the frontend catch-all route instead of /pypi/simple/. Force HTTPS explicitly in the background caching function to avoid the redirect entirely.	2026-01-30 18:59:31 -06:00
Mondo Diaz	109a593f83	Add debug logging for proactive caching regex failures	2026-01-30 18:43:09 -06:00
Mondo Diaz	1d727b3f8c	Fix proactive caching regex to match both hyphens and underscores PEP 503 normalizes package names to use hyphens, but wheel filenames may use underscores (e.g., typing_extensions-4.0.0-py3-none-any.whl). Convert the search pattern to match either separator.	2026-01-30 18:25:30 -06:00
Mondo Diaz	47aa0afe91	Fix proactive caching failing on HTTP->HTTPS redirects The background dependency caching was getting 308 redirects because request.base_url returns http:// but the ingress redirects to https://. Enable follow_redirects=True in httpx client to handle this.	2026-01-30 18:11:08 -06:00
Mondo Diaz	f992fc540e	Add proactive dependency caching for PyPI packages When a PyPI package is cached, its dependencies are now automatically fetched in background threads. This ensures the entire dependency tree is cached even if pip already has some packages installed locally. Features: - Background threads fetch each dependency without blocking the response - Uses our own proxy endpoint to cache, which recursively caches transitive deps - Max depth of 10 to prevent infinite loops - Daemon threads so they don't block process shutdown	2026-01-30 17:45:30 -06:00

108 Commits
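
Several commits above (281474d72f, bf2737b3a2, 72952d84a1) describe recognizing self-dependencies such as 'psutil[test]' depending on 'psutil' by stripping extras brackets and normalizing names per PEP 503. The following is a minimal sketch of that idea, not the repository's actual helper: the function names normalize_pypi_package_name and is_self_dependency are hypothetical, and the handling of version specifiers and environment markers is an assumption added for illustration.

```python
import re


def normalize_pypi_package_name(requirement: str) -> str:
    """Return the PEP 503 normalized project name of a requirement string.

    Hypothetical helper: strips environment markers, extras brackets, and
    version specifiers before normalizing. Not the repository's own code.
    """
    # Drop an environment marker such as '; python_version < "3.8"'.
    name = requirement.split(";", 1)[0]
    # Drop extras such as '[test]'.
    name = name.split("[", 1)[0]
    # Keep only the leading run of name characters, which drops
    # version specifiers such as '>=7.0'.
    match = re.match(r"\s*([A-Za-z0-9._-]+)", name)
    base = match.group(1) if match else name.strip()
    # PEP 503: collapse runs of '-', '_', '.' into one '-' and lowercase.
    return re.sub(r"[-_.]+", "-", base).lower()


def is_self_dependency(package: str, dependency: str) -> bool:
    """True when a dependency refers back to the package that declares it."""
    return normalize_pypi_package_name(package) == normalize_pypi_package_name(dependency)
```

With a comparison like this, 'psutil[test]' and 'psutil' compare equal, so a resolver could skip the edge instead of reporting a cycle such as '_pypi/psutil → _pypi/psutil'.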