121 Commits

Author SHA1 Message Date
Mondo Diaz
ec518519b2 chore: consolidate duplicate CHANGELOG sections after rebase 2026-02-05 09:44:03 -06:00
Mondo Diaz
968cb00477 fix: treat bare version constraints as exact match
When resolving dependencies like certifi@2025.10.5, the bare version
string "2025.10.5" was being rejected as an invalid SpecifierSet and
falling back to wildcard, which fetched the latest version instead.

Now bare versions starting with a digit are automatically prefixed
with "==" to create an exact match constraint.
2026-02-05 09:15:48 -06:00
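A minimal sketch of the rule described in this commit, assuming the constraint arrives as a plain string. The function name `normalize_constraint` is hypothetical and is not taken from the repository:

```python
def normalize_constraint(constraint: str) -> str:
    """Prefix bare versions (strings starting with a digit) with '=='.

    Hypothetical helper: anything that already carries an operator,
    or a wildcard such as '*', is returned unchanged.
    """
    stripped = constraint.strip()
    if stripped and stripped[0].isdigit():
        return f"=={stripped}"
    return stripped


exact = normalize_constraint("2025.10.5")    # bare version from certifi@2025.10.5
ranged = normalize_constraint(">=1.9,<2.0")  # already a specifier, left as-is
```

The resulting string can then be handed to whatever specifier parser the resolver uses, so a bare version no longer falls through to the wildcard path.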
Mondo Diaz
262aff6e97 fix: add security checks and tests for code review
Security:
- Add authorization checks to list_packages, update_package, delete_package endpoints
- Add MAX_TOTAL_ARTIFACTS limit (1000) to prevent memory exhaustion during dependency resolution
- Add TooManyArtifactsError exception for proper error handling

UI:
- Display reverse dependency errors in PackagePage
- Add warning display for failed dependency fetches in DependencyGraph

Tests:
- Add unit tests for metadata extraction (deb, wheel, tarball, jar)
- Add unit tests for rate limit configuration
- Add unit tests for PyPI registry client
2026-02-05 09:15:48 -06:00
Mondo Diaz
1389a03c69 fix: filter platform-specific and extra dependencies in PyPI proxy
The dependency parser was stripping environment markers but not checking
if they indicated optional or platform-specific packages. This caused
packages like jaraco.path to pull in pyobjc (324 sub-packages) even on
non-macOS systems.

Changes:
- Filter dependencies with 'extra ==' markers (optional extras)
- Filter dependencies with 'sys_platform' or 'platform_system' markers
- Add diagnostic logging for depth exceeded errors
- Add unit tests for dependency filtering

Fixes tensorflow dependency resolution exceeding max depth.
2026-02-05 09:15:48 -06:00
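The filtering in this commit might look roughly like the following sketch. It works on raw Requires-Dist strings and only inspects the marker text after the semicolon. The function name and the regular expressions are assumptions for illustration, not the repository's actual parser:

```python
import re

# Marker variables the commit message names as grounds for filtering.
_PLATFORM_MARKER = re.compile(r"\b(sys_platform|platform_system)\b")
_EXTRA_MARKER = re.compile(r"\bextra\s*==")


def should_skip_requirement(requires_dist: str) -> bool:
    """Return True for optional-extra or platform-specific requirements."""
    _, sep, marker = requires_dist.partition(";")
    if not sep:
        # No environment marker at all: always a real dependency.
        return False
    return bool(_EXTRA_MARKER.search(marker) or _PLATFORM_MARKER.search(marker))


lines = [
    'pyobjc; platform_system == "Darwin"',
    'pytest>=6; extra == "testing"',
    'tomli; python_version < "3.11"',
    "requests>=2.0",
]
kept = [line for line in lines if not should_skip_requirement(line)]
```

Note that this sketch drops every platform-marked dependency, including ones that would match the current platform, which is how the commit message reads. A stricter variant would evaluate the marker instead.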
Mondo Diaz
a45ec46e94 chore: increase MAX_DEPENDENCY_DEPTH from 50 to 100 2026-02-05 09:15:48 -06:00
Mondo Diaz
1202947620 chore: remove unused auto_fetch_max_depth config setting 2026-02-05 09:15:48 -06:00
Mondo Diaz
f5c9e438a0 feat: remove fetch depth limit for dependency resolution
Real package managers (pip, npm, Maven) don't have depth limits - they
resolve the full dependency tree. We have other safeguards:
- Loop prevention via fetch_attempted set
- Timeout via auto_fetch_timeout setting
- Dependency trees are finite
2026-02-05 09:15:48 -06:00
Mondo Diaz
aff08ad393 fix: use lenient conflict handling for dependency resolution
Instead of failing with 409 on version conflicts, use "first version wins"
strategy. This allows resolution to succeed for complex dependency trees
like tensorflow where transitive dependencies may have overlapping but
not identical version requirements.

The resolver now:
- Checks if an already-resolved version satisfies a new constraint
- If yes, reuses the existing version
- If no, logs the mismatch and uses the first-encountered version

This resembles the "first requirement wins" behavior of pip's legacy
resolver, which picks a working version rather than failing on
theoretical conflicts.
2026-02-05 09:15:48 -06:00
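A rough sketch of the "first version wins" strategy follows. The `pick` and `satisfies` callables stand in for the real version lookup and constraint check. They, along with the function name, are hypothetical:

```python
from typing import Callable, Dict, List, Tuple


def resolve_first_wins(
    requirements: List[Tuple[str, str]],
    pick: Callable[[str, str], str],
    satisfies: Callable[[str, str], bool],
) -> Tuple[Dict[str, str], List[str]]:
    """Resolve (name, constraint) pairs, keeping the first version chosen."""
    resolved: Dict[str, str] = {}
    mismatches: List[str] = []
    for name, constraint in requirements:
        if name not in resolved:
            # First time this package is seen: choose a version for it.
            resolved[name] = pick(name, constraint)
        elif not satisfies(resolved[name], constraint):
            # Later constraint disagrees: record it, keep the first version.
            mismatches.append(
                f"{name}: kept {resolved[name]}, does not satisfy {constraint}"
            )
    return resolved, mismatches


# Toy callables that only understand exact '==' constraints.
def toy_pick(name: str, constraint: str) -> str:
    return constraint.lstrip("=")


def toy_satisfies(version: str, constraint: str) -> bool:
    return constraint.lstrip("=") == version


resolved, mismatches = resolve_first_wins(
    [("numpy", "==1.26.0"), ("numpy", "==1.26.0"), ("numpy", "==2.0.0")],
    toy_pick,
    toy_satisfies,
)
```

The mismatch list plays the role of the logged warnings: resolution still succeeds, but the disagreement is not silently discarded.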
Mondo Diaz
cdb3b5ecb3 feat: increase auto_fetch_max_depth from 3 to 10 2026-02-05 09:15:48 -06:00
Mondo Diaz
659ecf6f73 fix: prevent false circular dependency detection on self-dependencies
When packages like pytest have extras (e.g., pytest[testing]) that depend
on the base package, the resolution was incorrectly detecting this as a
circular dependency.

Add an additional check that skips any dependency resolving to an artifact
already in the visiting set. This prevents the false cycle detection while
still catching real circular dependencies.
2026-02-05 09:15:48 -06:00
Mondo Diaz
15cd90b36d feat: change auto_fetch default to true
Auto-fetching missing dependencies from upstream is the more useful default
behavior. Users who need fast, network-free resolution can explicitly set
auto_fetch=false.

Artifacts are content-addressed by SHA256, so reproducibility concerns don't
apply - the same version always produces the same artifact.
2026-02-05 09:15:48 -06:00
Mondo Diaz
65bb073a6e fix: fetch root artifact from upstream when missing in auto_fetch mode
When auto_fetch=true and the root artifact doesn't exist locally in a
system project (_pypi), now attempts to fetch it from upstream before
starting dependency resolution. Also fixed a bug where fetched_artifacts
was being redeclared, which would lose the root artifact from the list.
2026-02-05 09:15:48 -06:00
Mondo Diaz
cbc2e5e11a feat: add auto-fetch for missing dependencies from upstream registries
Add auto_fetch parameter to dependency resolution endpoint that fetches
missing dependencies from upstream registries (PyPI) when resolving.

- Add RegistryClient abstraction with PyPIRegistryClient implementation
- Extract fetch_and_cache_pypi_package() for reuse
- Add resolve_dependencies_with_fetch() async function
- Extend MissingDependency schema with fetch_attempted/fetch_error
- Add fetched list to DependencyResolutionResponse
- Add auto_fetch_max_depth config setting (default: 3)
- Remove Usage section from Package page UI
- Add 6 integration tests for auto-fetch functionality
2026-02-05 09:15:48 -06:00
Mondo Diaz
9f233e0d4d fix: remove dead code and security issue from code review
- Remove unused _get_pypi_upstream_sources_cached function (never called)
- Remove unused CacheService import and get_cache helper
- Remove unused cache parameter from pypi_download_file
- Fix asyncio.get_event_loop() deprecation - use get_running_loop()
- Note: The caching implementation was incomplete but the other
  performance improvements (connection pooling, batch DB ops) remain
2026-02-05 09:15:29 -06:00
Mondo Diaz
b27eb0a928 fix: correct test imports and health endpoint assertions
- Fix import in test_db_utils.py: use app.models instead of backend.app.models
- Update health endpoint test to expect 'ok' status and infrastructure keys
- Add CHANGELOG entries for PyPI proxy performance improvements
2026-02-05 09:15:29 -06:00
Mondo Diaz
9a1d578525 feat: add infrastructure status to health endpoint 2026-02-05 09:15:09 -06:00
Mondo Diaz
08291a2f56 infra: enable Redis in Helm chart values for all environments 2026-02-05 09:15:09 -06:00
Mondo Diaz
f8ad957ff9 test: add infrastructure integration tests for pypi_proxy 2026-02-05 09:15:09 -06:00
Mondo Diaz
331745320d perf: use batch dependency storage in pypi_proxy 2026-02-05 09:15:09 -06:00
Mondo Diaz
a6fee37ea9 perf: use shared HTTP client pool in pypi_download_file 2026-02-05 09:15:09 -06:00
Mondo Diaz
b1056f2286 perf: cache upstream sources lookup in pypi_proxy 2026-02-05 09:15:09 -06:00
Mondo Diaz
2a423d66c0 refactor: add infrastructure dependency injection to pypi_proxy
Add dependency injection helper functions for HttpClientManager
and CacheService, along with imports for the new infrastructure
modules (http_client, cache_service, db_utils).
2026-02-05 09:15:09 -06:00
Mondo Diaz
cd9940da01 feat: add ArtifactRepository with batch DB operations
Add optimized database operations for artifact storage:
- Atomic upserts using ON CONFLICT for artifact creation
- Batch inserts for dependencies to eliminate N+1 queries
- Joined queries for cached URL lookups
- All methods include comprehensive unit tests
2026-02-05 09:15:09 -06:00
Mondo Diaz
bdfc525e71 feat: integrate HttpClientManager and CacheService into lifespan 2026-02-05 09:15:09 -06:00
Mondo Diaz
8d04dd5449 feat: add CacheService with Redis caching and graceful fallback
Implements Redis-backed caching with category-aware TTL management:
- Immutable categories (artifact metadata, dependencies) cached forever
- Mutable categories (index pages, upstream sources) use configurable TTL
- Graceful fallback when Redis unavailable or disabled
- Pattern-based invalidation for bulk cache clearing
2026-02-05 09:15:09 -06:00
Mondo Diaz
743ce26e54 feat: add HttpClientManager with connection pooling
Add HttpClientManager class for managing httpx.AsyncClient pools with
FastAPI lifespan integration. Features include:
- Default shared connection pool for general requests
- Configurable max connections, keep-alive, and timeouts
- Dedicated thread pool for blocking I/O operations
- Graceful startup/shutdown lifecycle management
- Per-upstream client isolation support (for future use)

Includes comprehensive unit tests covering initialization, startup,
shutdown, client retrieval, blocking operations, idempotency, and
error handling.
2026-02-05 09:15:09 -06:00
Mondo Diaz
39ae40f1c6 config: add HTTP pool, Redis, and updated DB pool settings 2026-02-05 09:15:09 -06:00
Mondo Diaz
ca8f62f69b deps: add redis-py for caching layer 2026-02-05 09:15:09 -06:00
Mondo Diaz
b55c810100 docs: add detailed implementation plan for PyPI proxy performance 2026-02-05 09:15:09 -06:00
Mondo Diaz
bef16d884b Add PyPI proxy performance & multi-protocol architecture design
Comprehensive design for:
- HTTP connection pooling with lifecycle management
- Redis caching layer (TTL for discovery, permanent for immutable)
- Abstract PackageProxyBase for multi-protocol support (npm, Maven)
- Database query optimization with batch operations
- Dependency resolution caching for ensure files
- Observability via health endpoints

Maintains hermetic build guarantees: artifact content and extracted
metadata are immutable, only discovery data uses TTL-based caching.
2026-02-05 09:15:09 -06:00
Mondo Diaz
a97d3e630f Fix duplicate dependency extraction from PyPI wheel METADATA
Wheel METADATA files can list the same dependency multiple times under
different extras (e.g., bokeh appears under [docs] and [bokeh-tests]).
This caused unique constraint violations when storing dependencies.

Fix by deduplicating extracted deps before DB insertion.
2026-02-05 09:15:09 -06:00
Mondo Diaz
7b0d423bee Add inline migration for tag removal (024_remove_tags)
Adds the tag removal migration to the inline migrations in database.py:
- Drops tag-related triggers and functions
- Removes tag_constraint column from artifact_dependencies
- Makes version_constraint NOT NULL
- Drops tags and tag_history tables
- Renames uploads.tag_name to version
2026-02-05 09:15:09 -06:00
Mondo Diaz
8731b42d3e Restore dependency extraction from PyPI packages
Re-adds the dependency extraction that was accidentally removed with the
proactive caching feature. Now when a PyPI package is cached:
1. Extract METADATA from wheel or PKG-INFO from sdist
2. Parse Requires-Dist lines for dependencies
3. Store in artifact_dependencies table

This restores the dependency graph functionality for PyPI packages.
2026-02-05 09:15:09 -06:00
Mondo Diaz
a442778458 Fix SQLAlchemy subquery warning in artifact listing 2026-02-05 09:15:09 -06:00
Mondo Diaz
36c05230ff Add configurable PyPI download mode (redirect vs proxy)
Adds ORCHARD_PYPI_DOWNLOAD_MODE setting (default: "redirect"):
- "redirect": Redirect pip to S3 presigned URL - reduces pod bandwidth
- "proxy": Stream through Orchard pod - for environments where clients can't reach S3

In redirect mode, Orchard only handles metadata requests and upstream fetches.
All file transfers go directly from S3 to the client.
2026-02-05 09:15:09 -06:00
Mondo Diaz
dc9c217d8a Fix artifact listing to include PyPI proxy cached packages
The list_package_artifacts endpoint was only querying artifacts via the
Upload table. PyPI proxy creates PackageVersion records but not Upload
records, so cached packages would show stats (size, version count) but
no artifacts in the listing.

Now queries artifacts from both Upload and PackageVersion tables using
a union, so PyPI-cached packages display their artifacts correctly.
2026-02-05 09:15:09 -06:00
Mondo Diaz
da3fd7a601 Fix PyPI proxy timeout by streaming from S3 instead of loading into memory
Large packages like TensorFlow (~600MB) caused read timeouts because the
entire file was loaded into memory before responding to the client. Now
the file is stored to S3 first, then streamed back using StreamingResponse.
2026-02-05 09:15:09 -06:00
Mondo Diaz
9a2b323fd8 Fix PackageArtifactResponse missing sha256 and version fields
- Add sha256 field to list_package_artifacts response (artifact ID is SHA256)
- Add version field to PackageArtifactResponse schema
- Add version field to frontend PackageArtifact type
- Update getArtifactVersion to prefer direct version field
2026-02-05 09:15:09 -06:00
Mondo Diaz
6b3522aef2 Fix migrations 008 and 011 to handle removed tags table 2026-02-05 09:15:09 -06:00
Mondo Diaz
f37d3e3e9a Fix migration 005 to not create indexes on removed tags table 2026-02-05 09:15:09 -06:00
Mondo Diaz
308057784e Fix tests for tag removal and version behavior
- Fix upload response to return actual version (not requested version)
  when artifact already has a version in the package
- Update ref_count tests to use multiple packages (one version per
  artifact per package design constraint)
- Remove allow_public_internet references from upstream caching tests
- Update consistency check test to not assert global system health
- Add versions field to artifact schemas
- Fix dependencies resolution to handle removed tag constraint
2026-02-05 09:15:09 -06:00
Mondo Diaz
86c95bea2b Fix remaining tag references in tests
- Update CacheRequest test to use version field
- Fix upload_test_file calls that still used tag parameter
- Update artifact history test to check versions instead of tags
- Update artifact stats tests to check versions instead of tags
- Fix garbage collection tests to delete versions instead of tags
- Remove TestGlobalTags class (endpoint removed)
- Update project/package stats tests to check version_count
- Fix upload_test_file fixture in test_download_verification
2026-02-05 09:15:09 -06:00
Mondo Diaz
cc5d67abd6 Update tests for tag removal
- Remove Tag/TagHistory model tests from unit tests
- Update CacheSettings tests to remove allow_public_internet field
- Replace tag= with version= in upload_test_file calls
- Update test assertions to use versions instead of tags
- Remove tests for tag: prefix downloads (now uses version:)
- Update dependency tests for version-only schema
2026-02-05 09:15:09 -06:00
Mondo Diaz
eb287edbda Remove obsolete tag support test from DragDropUpload
The tag functionality was removed in the previous commit, so
this test that expected a 'tag' field in the upload FormData
is no longer valid.
2026-02-05 09:15:09 -06:00
Mondo Diaz
86e971381a Remove tag system, use versions only for artifact references
Tags were mutable aliases that caused confusion alongside the immutable
version system. This removes tags entirely, keeping only PackageVersion
for artifact references.

Changes:
- Remove tags and tag_history tables (migration 012)
- Remove Tag model, TagRepository, and 6 tag API endpoints
- Update cache system to create versions instead of tags
- Update frontend to display versions instead of tags
- Remove tag-related schemas and types
- Update artifact cleanup service for version-based ref_count
2026-02-05 09:15:09 -06:00
Mondo Diaz
cf2fe5151f Remove superuser-only session_replication_role from factory reset 2026-02-05 09:15:09 -06:00
Mondo Diaz
2ae479146f Use same variable pattern as integration tests for reset job 2026-02-05 09:15:09 -06:00
Mondo Diaz
a0dad73db0 Add shell-level debug for password variable 2026-02-05 09:15:09 -06:00
Mondo Diaz
b40c53d308 Add debug to detect hidden characters in password 2026-02-05 09:15:09 -06:00
Mondo Diaz
f04149b410 Fix invalid sort field error on package artifact listing
The artifacts endpoint only supports sorting by: created_at, size, original_name
But the frontend was defaulting to 'name' (from the old tags endpoint).

- Change default sort from 'name' to 'created_at'
- Change default order from 'asc' to 'desc' (newest first)
- Remove sortable flag from version/tags columns (not DB fields)
- Add sortable flag to original_name and size columns
2026-02-05 09:15:09 -06:00
Mondo Diaz
aa851ab445 Add debug output to reset_feature job for auth troubleshooting 2026-02-05 09:15:09 -06:00
Mondo Diaz
9313942f53 Fix self-dependency detection to strip PyPI extras brackets
The circular dependency error '_pypi/psutil → _pypi/psutil' occurred because
dependencies with extras like 'psutil[test]' weren't being recognized as
self-dependencies: a raw string comparison treats 'psutil[test]' and 'psutil'
as different packages.

- Add _normalize_pypi_package_name() helper that strips extras brackets
  and normalizes separators per PEP 503
- Update _detect_package_cycle to use normalized names for cycle detection
- Update check_circular_dependencies to use normalized initial path
- Simplify self-dependency check in resolve_dependencies to use helper
2026-02-05 09:15:09 -06:00
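The normalization described in this commit could be sketched as below. This is an illustration of the PEP 503 rule plus extras stripping, not the body of the repository's `_normalize_pypi_package_name()`. The name `normalize_pypi_name` is a stand-in:

```python
import re


def normalize_pypi_name(requirement_name: str) -> str:
    """Strip an extras suffix like '[test]' and apply PEP 503 normalization."""
    base = requirement_name.split("[", 1)[0].strip()
    # PEP 503: runs of '-', '_' and '.' collapse to a single '-', lowercased.
    return re.sub(r"[-_.]+", "-", base).lower()


def is_self_dependency(package: str, dependency: str) -> bool:
    """True when a dependency points back at the package itself."""
    return normalize_pypi_name(package) == normalize_pypi_name(dependency)


same = is_self_dependency("psutil", "psutil[test]")
```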
Mondo Diaz
9a795a301a Fix circular dependency resolution by switching to artifact-centric display
- Add artifact: prefix handling in resolve_dependencies for direct artifact
  ID references, enabling dependency resolution for tagless artifacts
- Refactor PackagePage from tag-based to artifact-based data display
- Add PackageArtifact type with tags array for artifact-centric API responses
- Update download URLs to use artifact:ID prefix when no tags exist
- Conditionally show "View Ensure File" only when artifact has tags
2026-02-05 09:15:09 -06:00
Mondo Diaz
9f13221012 Fix progress bar CSS scoping conflict between upload and dashboard 2026-02-05 09:15:09 -06:00
Mondo Diaz
a99381aafb Add reset job after integration tests on feature branches 2026-02-05 09:15:09 -06:00
Mondo Diaz
d422ed5cd8 Fix self-dependency check to use case-insensitive PyPI name normalization 2026-02-05 09:15:09 -06:00
Mondo Diaz
b2a8c7cfcc Pass upstream policy errors through PyPI proxy to users
- Add _parse_upstream_error() to extract policy messages from JFrog/Artifactory
- Pass through 403 and other 4xx errors with detailed messages
- Pin babel and electron-to-chromium to older versions for CI compatibility
2026-02-05 09:15:09 -06:00
Mondo Diaz
eb11efd001 Pin lodash to 4.17.21 to avoid immature package policy block 2026-02-05 09:15:09 -06:00
Mondo Diaz
02e69c65ee Move Dashboard and Teams from navbar to user dropdown menu
Cleaner navbar with just Projects and Docs links.
Dashboard and Teams are now in the user menu dropdown.
2026-02-05 09:15:09 -06:00
Mondo Diaz
34d98f52cb Fix circular dependency error message to show actual cycle path
The error was hardcoding [pkg_key, pkg_key] regardless of actual cycle.
Now tracks the path through dependencies to report the real cycle.
2026-02-05 09:15:09 -06:00
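One way to report the real cycle is to keep the current traversal path during a depth-first search and slice it when a node repeats. The sketch below assumes the dependency graph is a plain dict of package keys. The function name and graph shape are hypothetical:

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return the first cycle reachable from start as a path, or None."""
    path: List[str] = []   # nodes on the current DFS branch, in order
    on_path = set()        # same nodes, for O(1) membership checks
    done = set()           # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in on_path:
            # Slice from the first occurrence so only the cycle is reported.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


cycle = find_cycle({"a": ["b"], "b": ["c"], "c": ["b"]}, "a")
```

Here the reported cycle starts at `b`, not at the root `a`, which is the distinction the fix is about.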
Mondo Diaz
29fa53d174 Replace custom dependency graph with React Flow
- Install reactflow and dagre for professional graph visualization
- Use dagre for automatic tree layout (top-to-bottom)
- Custom styled nodes with package name, version, and size
- Built-in zoom/pan controls and minimap
- Click nodes to navigate to package page
- Cleaner, more professional appearance
2026-02-05 09:15:09 -06:00
Mondo Diaz
63de1ce672 Improve dependency UI: rename to DependGraph, hide empty Used By
- Rename "Dependency Graph" modal title to "DependGraph"
- Hide "Used By" section when no packages depend on this package
2026-02-05 09:15:09 -06:00
Mondo Diaz
0b85f37abd Fix circular dependency detection and hide empty graph modal
- Add artifact-level self-dependency check (skip if dep resolves to same artifact)
- Close dependency graph modal if package has no dependencies to show
  (only root package with no children and no missing deps)
2026-02-05 09:15:09 -06:00
Mondo Diaz
101152f87f Skip self-dependencies in dependency resolver
PyPI packages can have self-referential dependencies for extras
(e.g., pytest[testing] depends on pytest). These were incorrectly
detected as circular dependencies. Now we skip them.
2026-02-05 09:15:09 -06:00
Mondo Diaz
3a09accfe6 Fix [object Object] error when API returns structured error detail
The backend returns detail as an object for some errors (circular dependency,
conflicts, etc.). The API client now JSON.stringifies object details so they
can be properly parsed by error handlers like DependencyGraph.
2026-02-05 09:15:09 -06:00
Mondo Diaz
88765b4f50 Show missing dependencies in dependency graph instead of failing
When dependencies are not cached on the server (common since we removed
proactive caching), the dependency graph now:
- Continues resolving what it can find
- Shows missing dependencies in a separate section with amber styling
- Displays the constraint and which package required them
- Updates the header stats to show "X cached • Y not cached"

This provides a better user experience than showing an error when
some dependencies haven't been downloaded yet.
2026-02-05 09:15:09 -06:00
Mondo Diaz
152af0a852 Fix dependency graph error for invalid version constraints
When a dependency has an invalid version constraint like '>=' (without
a version number), the resolver now treats it as a wildcard and returns
the latest available version instead of failing with 'Dependency not found'.

This handles malformed metadata that may have been stored from PyPI packages.
2026-02-05 09:15:09 -06:00
Mondo Diaz
31edadf3ad Remove proactive PyPI dependency caching feature
The background task queue for proactively caching package dependencies was
causing server instability and unnecessary growth. The PyPI proxy now only
caches packages on-demand when users request them.

Removed:
- PyPI cache worker (background task queue and worker pool)
- PyPICacheTask model and related database schema
- Cache management API endpoints (/pypi/cache/*)
- Background Jobs admin dashboard
- Dependency extraction and queueing logic

Kept:
- On-demand package caching (still works when users request packages)
- Async httpx for non-blocking downloads (prevents health check failures)
- URL-based cache lookups for deduplication
2026-02-05 09:15:09 -06:00
Mondo Diaz
2136e1f0c5 Center text in jobs table columns 2026-02-05 09:15:09 -06:00
Mondo Diaz
ff25677b16 Convert PyPI proxy from sync to async httpx to prevent event loop blocking
The pypi_download_file, pypi_simple_index, and pypi_package_versions endpoints
were using synchronous httpx.Client inside async functions. When upstream PyPI
servers respond slowly, this blocked the entire FastAPI event loop, preventing
health checks from responding. Kubernetes would then kill the pod after the
liveness probe timed out.

Changes:
- httpx.Client → httpx.AsyncClient
- client.get() → await client.get()
- response.iter_bytes() → response.aiter_bytes()

This ensures the event loop remains responsive during slow upstream downloads,
allowing health checks to succeed even when downloads take 20+ seconds.
2026-02-05 09:15:09 -06:00
Mondo Diaz
0a6dad9af0 Add cancel job button and improve jobs table UI
- Remove "All Jobs" title
- Move Status column to front of table
- Add Cancel button for in-progress jobs
- Add cancel endpoint: POST /pypi/cache/cancel/{package_name}
- Add btn-danger CSS styling
2026-02-05 09:15:09 -06:00
Mondo Diaz
36cf288526 Stream downloads to temp file to reduce memory usage
- Download packages in 64KB chunks to temp file instead of loading into memory
- Upload to S3 from temp file (streaming)
- Clean up temp file after processing
- Reduces memory footprint from 2x file size to 1x file size
2026-02-05 09:15:09 -06:00
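The chunked download can be sketched with the standard library alone. The source stream here is any binary file-like object. The function name and the choice to compute the SHA256 during the copy are assumptions added for illustration:

```python
import hashlib
import io
import os
import tempfile
from typing import BinaryIO, Tuple

CHUNK_SIZE = 64 * 1024  # 64KB, as in the commit message


def spool_to_temp_file(source: BinaryIO) -> Tuple[str, str, int]:
    """Copy source to a temp file in chunks; return (path, sha256, size)."""
    sha256 = hashlib.sha256()
    size = 0
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        while True:
            chunk = source.read(CHUNK_SIZE)
            if not chunk:
                break
            tmp.write(chunk)
            sha256.update(chunk)
            size += len(chunk)
        return tmp.name, sha256.hexdigest(), size


payload = b"x" * (200 * 1024)
path, digest, size = spool_to_temp_file(io.BytesIO(payload))
try:
    # An upload step would read from path here, again in chunks.
    on_disk = os.path.getsize(path)
finally:
    os.remove(path)  # clean up the temp file after processing
```

Only one chunk is held in memory at a time during the copy. The caller owns the temp file and is responsible for deleting it.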
Mondo Diaz
7008d913bf Increase memory to 1Gi and reduce workers to 1 for stability 2026-02-05 09:15:09 -06:00
Mondo Diaz
46e8c7df70 Add PyPI cache config and bump memory in values-prod.yaml 2026-02-05 09:15:08 -06:00
Mondo Diaz
a3929bfb17 Add PyPI cache config and bump memory in values-stage.yaml 2026-02-05 09:15:08 -06:00
Mondo Diaz
db2805a36c Add PyPI cache config and bump memory in values-dev.yaml 2026-02-05 09:15:08 -06:00
Mondo Diaz
7a6e270d63 Add PyPI cache worker config and increase memory limit
- Add orchard.pypiCache config section to helm values
- Set default workers to 2 (reduced from 5 to limit memory)
- Bump pod memory from 512Mi to 768Mi (request=limit)
- Add ORCHARD_PYPI_CACHE_* env vars to deployment template
2026-02-05 09:15:08 -06:00
Mondo Diaz
df4f9d168b Redesign jobs dashboard with unified table and progress bar
- Add overall progress bar showing completed/active/failed counts
- Unify all job types into single table with Type column
- Simplify status to Working/Pending/Failed badges
- Remove NPM "Coming Soon" section
- Add get_recent_activity() function for future activity feed
- Fix dark mode CSS using CSS variables
2026-02-05 09:15:08 -06:00
Mondo Diaz
1f98caa73c Improve Active Workers table and recover stale tasks
Backend:
- Add _recover_stale_tasks() to reset tasks stuck in 'in_progress'
  from previous crashes (tasks >5 min old get reset to pending)
- Called automatically on startup

Frontend:
- Fix dark mode colors using CSS variables instead of hardcoded values
- Add elapsed time column showing how long task has been running
- Add spinning indicator next to package name
- Add status badge (Running/Stale?)
- Highlight stale tasks (>5 min) in amber
- Auto-updates every 5 seconds with existing refresh
2026-02-05 09:15:08 -06:00
Mondo Diaz
a485852a6f Add Active Workers table to Background Jobs dashboard
Shows currently processing cache tasks in a dynamic table with:
- Package name and version constraint being cached
- Recursion depth and attempt number
- Start timestamp
- Pulsing indicator to show live activity

Backend changes:
- Add get_active_tasks() function to pypi_cache_worker.py
- Add GET /pypi/cache/active endpoint to pypi_proxy.py

Frontend changes:
- Add PyPICacheActiveTask type
- Add getPyPICacheActiveTasks() API function
- Add Active Workers section with animated table
- Auto-refreshes every 5 seconds with existing data
2026-02-05 09:15:08 -06:00
Mondo Diaz
5517048f05 Fix nested dependency depth tracking in PyPI cache worker
When the cache worker downloaded a package through the proxy, dependencies
were always queued with depth=0 instead of depth+1. This meant depth limits
weren't properly enforced for nested dependencies.

Changes:
- Add cache-depth query parameter to pypi_download_file endpoint
- Worker now passes its current depth when fetching packages
- Dependencies are queued at cache_depth+1 instead of hardcoded 0
- Add tests for depth tracking behavior
2026-02-05 09:15:08 -06:00
Mondo Diaz
c7eca269f4 Fix jobs dashboard showing misleading completion message
The dashboard was showing "All jobs completed successfully" whenever
there were no failed tasks, even if there were pending or in-progress
jobs. Now shows:
- "All jobs completed" only when pending=0 and in_progress=0
- "Jobs are processing. No failures yet." when jobs are in queue
2026-02-05 09:15:08 -06:00
Mondo Diaz
6a3a875a9c Add security fixes and code cleanup for PyPI cache
- Add require_admin authentication to cache management endpoints
- Add limit validation (1-500) on failed tasks query
- Add thread lock for worker pool thread safety
- Fix exception handling with separate recovery DB session
- Remove obsolete design doc
2026-02-05 09:15:08 -06:00
Mondo Diaz
a39b6f098f Add Background Jobs dashboard for admin users
New admin page at /admin/jobs showing:
- PyPI cache job status (pending, in-progress, completed, failed)
- Failed task list with error details
- Retry individual packages or retry all failed
- Auto-refresh every 5 seconds (toggleable)
- Placeholder for future NPM cache jobs

Accessible from admin dropdown menu as "Background Jobs".
2026-02-05 09:15:08 -06:00
Mondo Diaz
e0562195df Add robust PyPI dependency caching with task queue
Replace unbounded thread spawning with managed worker pool:
- New pypi_cache_tasks table tracks caching jobs
- Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS)
- Automatic retries with exponential backoff (30s, 60s, then fail)
- Deduplication to prevent duplicate caching attempts

New API endpoints for visibility and control:
- GET /pypi/cache/status - queue health summary
- GET /pypi/cache/failed - list failed tasks with errors
- POST /pypi/cache/retry/{package} - retry single package
- POST /pypi/cache/retry-all - retry all failed packages

This fixes silent failures in background dependency caching where
packages would fail to cache without any tracking or retry mechanism.
2026-02-05 09:15:08 -06:00
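The retry schedule (30s, 60s, then fail) can be expressed as a small pure function. This is a sketch under the assumption that the worker tracks a count of failed attempts per task. The function name and parameters are hypothetical:

```python
from typing import Optional


def next_retry_delay(
    failed_attempts: int, base_seconds: int = 30, max_retries: int = 2
) -> Optional[int]:
    """Seconds to wait before the next retry, or None to mark the task failed."""
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts > max_retries:
        return None
    # Exponential backoff: the delay doubles with each failed attempt.
    return base_seconds * 2 ** (failed_attempts - 1)


schedule = [next_retry_delay(n) for n in (1, 2, 3)]
```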
Mondo Diaz
db7d0bb7c4 Add design doc for PyPI cache robustness improvements 2026-02-05 09:15:08 -06:00
Mondo Diaz
4a287d46c8 Fix proactive dependency caching HTTPS redirect issue
When background threads fetch from our own proxy, they build URLs from the
request's base_url, which uses http://, but the ingress requires https://.
The resulting 308 redirect was dropping trailing slashes, causing requests
to hit the frontend catch-all route instead of /pypi/simple/.

Force HTTPS explicitly in the background caching function to avoid
the redirect entirely.
2026-02-05 09:15:08 -06:00
Mondo Diaz
cbea91a528 Add debug logging for proactive caching regex failures 2026-02-05 09:15:08 -06:00
Mondo Diaz
80e2f3d157 Fix proactive caching regex to match both hyphens and underscores
PEP 503 normalizes package names to use hyphens, but wheel filenames
may use underscores (e.g., typing_extensions-4.0.0-py3-none-any.whl).

Convert the search pattern to match either separator.
2026-02-05 09:15:08 -06:00
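A sketch of a separator-tolerant filename pattern follows. It assumes the input is a PEP 503 normalized name and anchors on the version's leading digit so that longer package names with the same prefix are not matched. Both the function name and that anchoring choice are assumptions:

```python
import re


def filename_pattern(normalized_name: str) -> re.Pattern:
    """Build a regex matching the name with '-' or '_' between its parts."""
    parts = [re.escape(part) for part in normalized_name.split("-")]
    # After the name, expect '-' followed by the first digit of the version.
    return re.compile("[-_]".join(parts) + r"-\d", re.IGNORECASE)


pattern = filename_pattern("typing-extensions")
wheel_hit = pattern.match("typing_extensions-4.0.0-py3-none-any.whl")
sdist_hit = pattern.match("typing-extensions-4.0.0.tar.gz")
```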
Mondo Diaz
522d23ec01 Fix proactive caching failing on HTTP->HTTPS redirects
The background dependency caching was getting 308 redirects because
request.base_url returns http:// but the ingress redirects to https://.

Enable follow_redirects=True in httpx client to handle this.
2026-02-05 09:15:08 -06:00
Mondo Diaz
c1060feb5f Add proactive dependency caching for PyPI packages
When a PyPI package is cached, its dependencies are now automatically
fetched in background threads. This ensures the entire dependency tree
is cached even if pip already has some packages installed locally.

Features:
- Background threads fetch each dependency without blocking the response
- Uses our own proxy endpoint to cache, which recursively caches transitive deps
- Max depth of 10 to prevent infinite loops
- Daemon threads so they don't block process shutdown
2026-02-05 09:15:08 -06:00
Mondo Diaz
e62e75bade Fix duplicate dependency constraint causing 500 errors
- Deduplicate dependencies by package name before inserting
- Some packages (like anyio) list the same dep (trio) multiple times with
  different version constraints for different extras
- The unique constraint on (artifact_id, project, package) rejected these
- Also removed debug logging from dependencies.py
2026-02-05 09:15:08 -06:00
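Deduplication by package name might look like this sketch. The commit message does not say which duplicate is kept, so keeping the first occurrence is an assumption, as is the `(name, constraint)` tuple shape:

```python
from typing import Dict, List, Tuple


def dedupe_dependencies(deps: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first (name, constraint) pair seen for each package name."""
    seen: Dict[str, Tuple[str, str]] = {}
    for name, constraint in deps:
        # setdefault only stores the pair if the name has not been seen yet.
        seen.setdefault(name.lower(), (name, constraint))
    return list(seen.values())


unique = dedupe_dependencies(
    [("trio", ">=0.26.1"), ("trio", ">=0.31.0"), ("idna", ">=2.8")]
)
```

With one row per package name, the insert no longer trips the unique constraint on (artifact_id, project, package).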
Mondo Diaz
befa517485 Add detailed debug logging to _resolve_dependency_to_artifact 2026-02-05 09:15:08 -06:00
Mondo Diaz
7a2c0a54c6 Add debug logging to resolve_dependencies 2026-02-05 09:15:08 -06:00
Mondo Diaz
ead016208d Add backfill script for PyPI package dependencies
Script extracts Requires-Dist metadata from cached PyPI packages
and stores them in artifact_dependencies table.

Usage:
  docker exec <container> python -m backend.scripts.backfill_pypi_dependencies
  docker exec <container> python -m backend.scripts.backfill_pypi_dependencies --dry-run
2026-02-05 09:15:08 -06:00
Mondo Diaz
4b76ca2046 Add PEP 440 version constraint matching for dependency resolution
- Parse version constraints like >=1.9, <2.0 using packaging library
- Find the latest version that satisfies the constraint
- Support wildcard (*) to get latest version
- Fall back to exact version and tag matching
2026-02-05 09:15:08 -06:00
Mondo Diaz
94bbd87e6b Fix ensure file modal z-index when opened from deps modal 2026-02-05 09:15:08 -06:00
Mondo Diaz
2cf04a43ef Extract and store dependencies from PyPI packages
- Add functions to parse Requires-Dist metadata from wheel and sdist files
- Store extracted dependencies in artifact_dependencies table
- Fix streaming response for cached artifacts (proper tuple unpacking)
- Fix version uniqueness check to use version string instead of artifact_id
- Skip creating versions for .metadata files
2026-02-05 09:15:08 -06:00
Mondo Diaz
9acef055b6 Add is_system to all ProjectResponse constructions in routes 2026-02-05 09:15:08 -06:00
Mondo Diaz
694f25ac9b Fix: ensure existing _pypi project gets is_system=true 2026-02-05 09:15:08 -06:00
Mondo Diaz
06b2beb152 Add is_system field to ProjectResponse schema 2026-02-05 09:15:08 -06:00
Mondo Diaz
2b2dbae38b Hide Tags and Latest columns for system projects in package table 2026-02-05 09:15:08 -06:00
Mondo Diaz
cd56d00ebf Improve system project UX and make dependencies a modal
- Hide tag count stat for system projects (show "versions" instead of "artifacts")
- Hide "Latest" tag stat for system projects
- Change "Create/Update Tag" to only show for non-system projects
- Add "View Artifact ID" menu option with modal showing the SHA256 hash
- Move dependencies section to a modal (opened via "View Dependencies" menu)
- Add deps-modal and artifact-id-modal CSS styles
2026-02-05 09:15:08 -06:00
Mondo Diaz
558e1bc78f Fix PyPI proxy UX and package stats calculation
- Fix artifact_count and total_size calculation to use Tags instead of
  Uploads, so PyPI cached packages show their stats correctly
- Fix PackagePage dropdown menu positioning (use fixed position with backdrop)
- Add system project detection for projects starting with "_"
- Show Version as primary column for system projects, hide Tag column
- Hide upload button for system projects (they're cache-only)
- Rename section header to "Versions" for system projects
- Fix test_projects_sort_by_name to exclude system projects from sort comparison
2026-02-05 09:15:08 -06:00
Mondo Diaz
32218dbb1c Hide format filter and column for system projects
System projects like _pypi only contain packages of one format,
so the format filter dropdown and column are redundant.
2026-02-05 09:15:08 -06:00
Mondo Diaz
006df9dff9 Hide Settings and New Package buttons for system projects
System projects should be system-controlled only. Users should not
be able to create packages or change settings on system cache projects.
2026-02-05 09:15:08 -06:00
Mondo Diaz
844e937071 Improve PyPI proxy and Package page UX
PyPI proxy improvements:
- Set package format to "pypi" instead of "generic"
- Extract version from filename and create PackageVersion record
- Support .whl, .tar.gz, and .zip filename formats

Package page UX overhaul:
- Move upload to header button with modal
- Simplify table: combine Tag/Version, remove Type and Artifact ID columns
- Add row action menu (⋯) with: Copy ID, Ensure File, Create Tag, Dependencies
- Remove cluttered "Download by Artifact ID" and "Create/Update Tag" sections
- Add modals for upload and create tag actions
- Cleaner, more scannable table layout
2026-02-05 09:15:08 -06:00
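
One way to extract a version from the three filename formats listed above is a pair of regular expressions. This is a sketch under assumptions: `version_from_filename` and both patterns are illustrative, and they rely on the convention that a version contains no hyphen; the proxy's actual parsing rules may differ.

```python
import re

# Wheel: name-version[-build]-pythontag-abitag-platformtag.whl
_WHEEL = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)(?:-[^-]+)?-[^-]+-[^-]+-[^-]+\.whl$"
)
# Source distribution: name-version.tar.gz or name-version.zip
_SDIST = re.compile(r"^(?P<name>.+)-(?P<version>[^-]+)\.(?:tar\.gz|zip)$")


def version_from_filename(filename):
    """Return the version embedded in a distribution filename, or None."""
    for pattern in (_WHEEL, _SDIST):
        match = pattern.match(filename)
        if match:
            return match.group("version")
    return None


if __name__ == "__main__":
    for name in (
        "requests-2.31.0-py3-none-any.whl",
        "certifi-2025.10.5.tar.gz",
        "example-1.0.zip",
    ):
        print(name, "->", version_from_filename(name))
```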
Mondo Diaz
77c7526023 Show team name instead of individual user in Owner column
Projects owned by teams now display the team name in the Owner column
for better organizational continuity when team members change.
Falls back to created_by if no team is assigned.
2026-02-05 09:15:08 -06:00
Mondo Diaz
ec69d7619b Add "(coming soon)" label for unsupported upstream source types
Only pypi and generic are currently supported. Other types now show
"(coming soon)" in both the dropdown and the sources table.
2026-02-05 09:15:08 -06:00
Mondo Diaz
8e3af8c4f5 Fix PyPI proxy: use correct storage method and make project public
- Use storage.get_stream(s3_key) instead of non-existent get_artifact_stream()
- Make _pypi project public (is_public=True) so cached packages are visible
2026-02-05 09:15:08 -06:00
Mondo Diaz
24a0a71cf4 Fix Project and Tag model fields in PyPI proxy
Use correct model fields:
- Project: is_public, is_system, created_by (not visibility)
- Tag: add required created_by field
2026-02-05 09:15:08 -06:00
Mondo Diaz
ab50148a60 Fix Artifact model field names in PyPI proxy
Use correct Artifact model fields:
- original_name instead of filename
- Add required created_by and s3_key fields
- Include checksum fields from storage result
2026-02-05 09:15:08 -06:00
Mondo Diaz
acee458b3c Fix PyPI proxy to use correct storage.store() method
The code was calling storage.store_artifact() which doesn't exist.
Changed to use storage.store() which handles content-addressable
storage with automatic deduplication.
2026-02-05 09:15:08 -06:00
Mondo Diaz
f18b8ed560 Allow full path in PyPI upstream source URL
Users can now configure the full path including /simple in their
upstream source URL (e.g., https://example.com/api/pypi/repo/simple)
instead of having the code append /simple/ automatically.

This matches pip's --index-url format, making configuration more
intuitive and copy/paste friendly.
2026-02-05 09:15:08 -06:00
Mondo Diaz
7e84dd3958 Fix test_rewrite_relative_links assertion to expect correct URL
The test was checking for the wrong URL pattern. When urljoin resolves
../../packages/ab/cd/... relative to /api/pypi/pypi-remote/simple/requests/,
it correctly produces /api/pypi/pypi-remote/packages/ab/cd/... (not
/api/pypi/packages/...).
2026-02-05 09:15:08 -06:00
Mondo Diaz
a72c9d3f6e Improve PyPI proxy test assertions for all status codes
Tests now verify the correct response for each scenario:
- 200: HTML content-type
- 404: "not found" error message
- 503: "No PyPI upstream sources configured" error message
2026-02-05 09:15:08 -06:00
Mondo Diaz
a6618fe550 Fix PyPI proxy tests to work with or without upstream sources
- Tests now accept 200/404/503 responses since upstream sources may or
  may not be configured in the test environment
- Added upstream_base_url parameter to _rewrite_package_links test
- Added test for relative URL resolution (Artifactory-style URLs)
2026-02-05 09:15:08 -06:00
Mondo Diaz
796176c251 Fix HTTPS scheme detection behind reverse proxy
When behind a reverse proxy that terminates SSL, the server sees HTTP
requests internally. Added _get_base_url() helper that respects the
X-Forwarded-Proto header to generate correct external HTTPS URLs.

This fixes links in the PyPI simple index showing http:// instead of
https:// when accessed via HTTPS through a load balancer.
2026-02-05 09:15:08 -06:00
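
The idea behind the `_get_base_url()` helper can be illustrated without a web framework. The sketch below is an assumption about its shape, not the project's implementation: it takes the request scheme, host, and a plain dict of lower-cased headers, and prefers `X-Forwarded-Proto` when it carries a recognised value.

```python
def get_base_url(scheme, host, headers):
    """Build the external base URL, honouring X-Forwarded-Proto if present.

    `headers` is assumed to be a dict with lower-cased header names.
    """
    forwarded = headers.get("x-forwarded-proto", "")
    # Proxies may send a comma-separated chain; the first entry is the client-facing hop.
    proto = forwarded.split(",")[0].strip().lower()
    if proto in ("http", "https"):
        scheme = proto
    return f"{scheme}://{host}"


if __name__ == "__main__":
    # The app sees plain HTTP, but the load balancer terminated TLS.
    print(get_base_url("http", "orchard.example.com", {"x-forwarded-proto": "https"}))
    # No proxy header: fall back to the scheme the server saw.
    print(get_base_url("http", "localhost:8080", {}))
```

Trusting this header is only safe when the application is reachable solely through the proxy, because clients can otherwise set it themselves.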
Mondo Diaz
f58fb0079a Fix relative URL handling in PyPI proxy
Artifactory and other registries may return relative URLs in their
Simple API responses (e.g., ../../packages/...). The proxy now resolves
these to absolute URLs using urljoin() before encoding them in the
upstream parameter.

This fixes package downloads failing when the upstream registry uses
relative URLs in its package index.
2026-02-05 09:15:08 -06:00
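
The relative-link resolution described above follows standard `urllib.parse.urljoin` behaviour. In this small sketch the host `example.com`, the wheel filename, and the helper name `resolve_link` are made up for illustration; only the path shapes come from the commit messages.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href):
    """Resolve an href from a Simple API page against the page's own URL."""
    return urljoin(page_url, href)


# Hypothetical upstream index page and a relative link of the kind Artifactory returns.
PAGE_URL = "https://example.com/api/pypi/pypi-remote/simple/requests/"
RELATIVE = "../../packages/ab/cd/requests-2.31.0-py3-none-any.whl"

if __name__ == "__main__":
    # Each ".." drops one trailing path segment: "requests/", then "simple/".
    print(resolve_link(PAGE_URL, RELATIVE))
    # Absolute links pass through unchanged.
    print(resolve_link(PAGE_URL, "https://files.example.org/pkg-1.0.tar.gz"))
```

Resolving against the page URL rather than the site root is what keeps the repository segment (`pypi-remote`) in the resulting path.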
Mondo Diaz
f57762334f Remove dead code from pypi_proxy.py
- Remove unused imports (UpstreamClient, UpstreamClientConfig,
  UpstreamHTTPError, UpstreamConnectionError, UpstreamTimeoutError)
- Simplify matched_source selection logic, removing dead conditional
  that always evaluated to True due to 'or True'
2026-02-05 09:15:08 -06:00
Mondo Diaz
599c8c1d5b Fix httpx.Timeout configuration in PyPI proxy
httpx.Timeout requires either a default value or all four parameters.
Changed to httpx.Timeout(default, connect=X) format.
2026-02-05 09:15:08 -06:00

@@ -6,9 +6,6 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
## [0.6.0] - 2026-02-05
### Added
- Added S3 bucket provisioning terraform configuration (#59)
- Creates an S3 bucket to be used for anything Orchard