Commit Graph

36 Commits

Author SHA1 Message Date
Mondo Diaz
c31a147e1f fix: treat bare version constraints as exact match
When resolving dependencies like certifi@2025.10.5, the bare version
string "2025.10.5" was being rejected as an invalid SpecifierSet and
falling back to wildcard, which fetched the latest version instead.

Now bare versions starting with a digit are automatically prefixed
with "==" to create an exact match constraint.
2026-02-04 17:02:02 -06:00
Mondo Diaz
6cf487b224 fix: add security checks and tests for code review
Security:
- Add authorization checks to list_packages, update_package, delete_package endpoints
- Add MAX_TOTAL_ARTIFACTS limit (1000) to prevent memory exhaustion during dependency resolution
- Add TooManyArtifactsError exception for proper error handling

UI:
- Display reverse dependency errors in PackagePage
- Add warning display for failed dependency fetches in DependencyGraph

Tests:
- Add unit tests for metadata extraction (deb, wheel, tarball, jar)
- Add unit tests for rate limit configuration
- Add unit tests for PyPI registry client
2026-02-04 16:19:16 -06:00
Mondo Diaz
7140c9f4f2 fix: filter platform-specific and extra dependencies in PyPI proxy
The dependency parser was stripping environment markers but not checking
if they indicated optional or platform-specific packages. This caused
packages like jaraco.path to pull in pyobjc (324 sub-packages) even on
non-macOS systems.

Changes:
- Filter dependencies with 'extra ==' markers (optional extras)
- Filter dependencies with 'sys_platform' or 'platform_system' markers
- Add diagnostic logging for depth exceeded errors
- Add unit tests for dependency filtering

Fixes tensorflow dependency resolution exceeding max depth.
2026-02-04 14:56:59 -06:00
Mondo Diaz
f1ac43c1cb fix: use lenient conflict handling for dependency resolution
Instead of failing with 409 on version conflicts, use "first version wins"
strategy. This allows resolution to succeed for complex dependency trees
like tensorflow where transitive dependencies may have overlapping but
not identical version requirements.

The resolver now:
- Checks if an already-resolved version satisfies a new constraint
- If yes, reuses the existing version
- If no, logs the mismatch and uses the first-encountered version

This matches pip's behavior of picking a working version rather than
failing on theoretical conflicts.
2026-02-04 13:45:15 -06:00
Mondo Diaz
7bfec020c8 feat: change auto_fetch default to true
Auto-fetching missing dependencies from upstream is the more useful default
behavior. Users who need fast, network-free resolution can explicitly set
auto_fetch=false.

Artifacts are content-addressed by SHA256, so reproducibility concerns don't
apply - the same version always produces the same artifact.
2026-02-04 12:23:01 -06:00
Mondo Diaz
5cff4092e3 feat: add auto-fetch for missing dependencies from upstream registries
Add auto_fetch parameter to dependency resolution endpoint that fetches
missing dependencies from upstream registries (PyPI) when resolving.

- Add RegistryClient abstraction with PyPIRegistryClient implementation
- Extract fetch_and_cache_pypi_package() for reuse
- Add resolve_dependencies_with_fetch() async function
- Extend MissingDependency schema with fetch_attempted/fetch_error
- Add fetched list to DependencyResolutionResponse
- Add auto_fetch_max_depth config setting (default: 3)
- Remove Usage section from Package page UI
- Add 6 integration tests for auto-fetch functionality
2026-02-04 12:01:49 -06:00
Mondo Diaz
632bf54087 fix: correct test imports and health endpoint assertions
- Fix import in test_db_utils.py: use app.models instead of backend.app.models
- Update health endpoint test to expect 'ok' status and infrastructure keys
- Add CHANGELOG entries for PyPI proxy performance improvements
2026-02-04 10:37:12 -06:00
Mondo Diaz
08b6589712 test: add infrastructure integration tests for pypi_proxy 2026-02-04 09:53:02 -06:00
Mondo Diaz
ffe0529ea8 feat: add ArtifactRepository with batch DB operations
Add optimized database operations for artifact storage:
- Atomic upserts using ON CONFLICT for artifact creation
- Batch inserts for dependencies to eliminate N+1 queries
- Joined queries for cached URL lookups
- All methods include comprehensive unit tests
2026-02-04 09:48:08 -06:00
Mondo Diaz
a045509fe4 feat: add CacheService with Redis caching and graceful fallback
Implements Redis-backed caching with category-aware TTL management:
- Immutable categories (artifact metadata, dependencies) cached forever
- Mutable categories (index pages, upstream sources) use configurable TTL
- Graceful fallback when Redis unavailable or disabled
- Pattern-based invalidation for bulk cache clearing
2026-02-04 09:44:12 -06:00
Mondo Diaz
14806b05f0 feat: add HttpClientManager with connection pooling
Add HttpClientManager class for managing httpx.AsyncClient pools with
FastAPI lifespan integration. Features include:
- Default shared connection pool for general requests
- Configurable max connections, keep-alive, and timeouts
- Dedicated thread pool for blocking I/O operations
- Graceful startup/shutdown lifecycle management
- Per-upstream client isolation support (for future use)

Includes comprehensive unit tests covering initialization, startup,
shutdown, client retrieval, blocking operations, idempotency, and
error handling.
2026-02-04 09:16:27 -06:00
Mondo Diaz
c94fe0389b Fix tests for tag removal and version behavior
- Fix upload response to return actual version (not requested version)
  when artifact already has a version in the package
- Update ref_count tests to use multiple packages (one version per
  artifact per package design constraint)
- Remove allow_public_internet references from upstream caching tests
- Update consistency check test to not assert global system health
- Add versions field to artifact schemas
- Fix dependencies resolution to handle removed tag constraint
2026-02-03 15:35:44 -06:00
Mondo Diaz
9a95421064 Fix remaining tag references in tests
- Update CacheRequest test to use version field
- Fix upload_test_file calls that still used tag parameter
- Update artifact history test to check versions instead of tags
- Update artifact stats tests to check versions instead of tags
- Fix garbage collection tests to delete versions instead of tags
- Remove TestGlobalTags class (endpoint removed)
- Update project/package stats tests to check version_count
- Fix upload_test_file fixture in test_download_verification
2026-02-03 12:51:31 -06:00
Mondo Diaz
87f30ea898 Update tests for tag removal
- Remove Tag/TagHistory model tests from unit tests
- Update CacheSettings tests to remove allow_public_internet field
- Replace tag= with version= in upload_test_file calls
- Update test assertions to use versions instead of tags
- Remove tests for tag: prefix downloads (now uses version:)
- Update dependency tests for version-only schema
2026-02-03 12:45:44 -06:00
Mondo Diaz
c4c9c20763 Remove tag system, use versions only for artifact references
Tags were mutable aliases that caused confusion alongside the immutable
version system. This removes tags entirely, keeping only PackageVersion
for artifact references.

Changes:
- Remove tags and tag_history tables (migration 012)
- Remove Tag model, TagRepository, and 6 tag API endpoints
- Update cache system to create versions instead of tags
- Update frontend to display versions instead of tags
- Remove tag-related schemas and types
- Update artifact cleanup service for version-based ref_count
2026-02-03 12:18:19 -06:00
Mondo Diaz
081cc6df83 Remove proactive PyPI dependency caching feature
The background task queue for proactively caching package dependencies was
causing server instability and unnecessary growth. The PyPI proxy now only
caches packages on-demand when users request them.

Removed:
- PyPI cache worker (background task queue and worker pool)
- PyPICacheTask model and related database schema
- Cache management API endpoints (/pypi/cache/*)
- Background Jobs admin dashboard
- Dependency extraction and queueing logic

Kept:
- On-demand package caching (still works when users request packages)
- Async httpx for non-blocking downloads (prevents health check failures)
- URL-based cache lookups for deduplication
2026-02-02 16:17:33 -06:00
Mondo Diaz
3bdeade7ca Fix nested dependency depth tracking in PyPI cache worker
When the cache worker downloaded a package through the proxy, dependencies
were always queued with depth=0 instead of depth+1. This meant depth limits
weren't properly enforced for nested dependencies.

Changes:
- Add cache-depth query parameter to pypi_download_file endpoint
- Worker now passes its current depth when fetching packages
- Dependencies are queued at cache_depth+1 instead of hardcoded 0
- Add tests for depth tracking behavior
2026-02-02 13:47:22 -06:00
Mondo Diaz
d274f3f375 Add robust PyPI dependency caching with task queue
Replace unbounded thread spawning with managed worker pool:
- New pypi_cache_tasks table tracks caching jobs
- Thread pool with 5 workers (configurable via ORCHARD_PYPI_CACHE_WORKERS)
- Automatic retries with exponential backoff (30s, 60s, then fail)
- Deduplication to prevent duplicate caching attempts

New API endpoints for visibility and control:
- GET /pypi/cache/status - queue health summary
- GET /pypi/cache/failed - list failed tasks with errors
- POST /pypi/cache/retry/{package} - retry single package
- POST /pypi/cache/retry-all - retry all failed packages

This fixes silent failures in background dependency caching where
packages would fail to cache without any tracking or retry mechanism.
2026-02-02 11:16:02 -06:00
Mondo Diaz
fe6c6c52d2 Fix PyPI proxy UX and package stats calculation
- Fix artifact_count and total_size calculation to use Tags instead of
  Uploads, so PyPI cached packages show their stats correctly
- Fix PackagePage dropdown menu positioning (use fixed position with backdrop)
- Add system project detection for projects starting with "_"
- Show Version as primary column for system projects, hide Tag column
- Hide upload button for system projects (they're cache-only)
- Rename section header to "Versions" for system projects
- Fix test_projects_sort_by_name to exclude system projects from sort comparison
2026-01-30 12:16:05 -06:00
Mondo Diaz
00fb2729e4 Fix test_rewrite_relative_links assertion to expect correct URL
The test was checking for the wrong URL pattern. When urljoin resolves
../../packages/ab/cd/... relative to /api/pypi/pypi-remote/simple/requests/,
it correctly produces /api/pypi/pypi-remote/packages/ab/cd/... (not
/api/pypi/packages/...).
2026-01-30 08:51:30 -06:00
Mondo Diaz
8ae4d7a685 Improve PyPI proxy test assertions for all status codes
Tests now verify the correct response for each scenario:
- 200: HTML content-type
- 404: "not found" error message
- 503: "No PyPI upstream sources configured" error message
2026-01-29 19:35:20 -06:00
Mondo Diaz
4b887d1aad Fix PyPI proxy tests to work with or without upstream sources
- Tests now accept 200/404/503 responses since upstream sources may or
  may not be configured in the test environment
- Added upstream_base_url parameter to _rewrite_package_links test
- Added test for relative URL resolution (Artifactory-style URLs)
2026-01-29 19:34:33 -06:00
Mondo Diaz
97498b2f86 Add transparent PyPI proxy and improve upstream sources UI 2026-01-29 16:12:57 -06:00
Mondo Diaz
82f67539bd Remove public internet features and fix upstream source UI (#107) 2026-01-29 13:26:28 -06:00
Mondo Diaz
1d51c856b0 Add upstream caching infrastructure and refactor CI pipeline 2026-01-29 11:55:15 -06:00
Mondo Diaz
576791d19e Add multi-tenancy with Teams feature 2026-01-28 12:50:58 -06:00
Mondo Diaz
7120cf64f1 Add configurable admin password via environment variable 2026-01-27 14:23:40 -06:00
Mondo Diaz
abba90ebac Add package dependencies system and project settings page 2026-01-27 10:11:04 -06:00
Mondo Diaz
584acd1e90 Add comprehensive upload/download tests and streaming enhancements (#38, #40, #42, #43) 2026-01-21 09:35:12 -06:00
Mondo Diaz
b93d5a9c68 Add separate version tracking for artifacts 2026-01-16 11:36:08 -06:00
Mondo Diaz
4b3d2fd41d Add feature branch deployment pipeline 2026-01-14 12:29:37 -06:00
Mondo Diaz
617bcbe89c Implement authentication system with access control UI 2026-01-12 09:52:35 -07:00
Mondo Diaz
10d3694794 Add drag-and-drop upload component with chunked uploads and offline support 2026-01-08 11:59:32 -06:00
Mondo Diaz
35fda65d38 Add download verification with SHA256 checksum support (#26, #27, #28, #29) 2026-01-07 13:36:46 -06:00
Mondo Diaz
2f1891cf01 Metadata database tracks all uploads with project, package, tag, and timestamp queryable via API 2026-01-07 12:31:44 -06:00
Mondo Diaz
7e68baed08 Add ref_count management for deletions with atomic operations and error handling 2026-01-06 13:44:23 -06:00