Commit Graph

7 Commits

Author SHA1 Message Date
Mondo Diaz
55a38ad850 Add deduplication design doc, file size limits, and validation tests
- Add max_file_size (10GB) and min_file_size (1 byte) config options
- Add file size validation to regular and resumable upload endpoints
- Create comprehensive deduplication design document covering:
  - SHA256 algorithm selection rationale and migration path
  - Content-addressable storage model
  - S3 key derivation and prefix sharding
  - Duplicate detection workflow
  - Reference counting lifecycle
  - Edge cases and error handling
  - Collision detection strategy
  - Performance considerations
  - Operations runbook
- Add tests for empty file rejection and file size validation
2026-01-05 15:35:21 -06:00
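
The size limits and the S3 key derivation named in this commit could look roughly like the sketch below. Only the 1-byte minimum, the 10GB maximum, and the use of SHA256 come from the commit message. The function names, the `objects` prefix, the two-level shard layout, and the reading of "10GB" as 10 GiB are assumptions made for illustration.

```python
import string

MIN_FILE_SIZE = 1                # bytes, from the commit message
MAX_FILE_SIZE = 10 * 1024 ** 3   # assumption: "10GB" read as 10 GiB


def validate_file_size(size: int) -> None:
    """Reject empty and oversized uploads (hypothetical helper)."""
    if size < MIN_FILE_SIZE:
        raise ValueError(f"file too small: {size} bytes (minimum {MIN_FILE_SIZE})")
    if size > MAX_FILE_SIZE:
        raise ValueError(f"file too large: {size} bytes (maximum {MAX_FILE_SIZE})")


def derive_s3_key(sha256_hex: str, prefix: str = "objects") -> str:
    """Map a SHA256 hex digest to a sharded object key.

    The layout <prefix>/<first 2 hex chars>/<next 2 hex chars>/<digest> is an
    assumed scheme; the design document may specify a different one.
    """
    digest = sha256_hex.lower()
    if len(digest) != 64 or any(c not in string.hexdigits for c in digest):
        raise ValueError("expected a 64-character SHA256 hex digest")
    return f"{prefix}/{digest[:2]}/{digest[2:4]}/{digest}"
```

Because the key depends only on the content hash, two uploads of identical bytes resolve to the same key, which is what makes duplicate detection a single lookup.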
Mondo Diaz
32115fc1c5 Add integration tests for garbage collection endpoints 2026-01-05 15:24:46 -06:00
Mondo Diaz
4c2e21295f Add comprehensive ref_count tests and fix resumable upload double-counting bug
- Add tests for cascade deletion ref_count (package/project delete)
- Add tests for tag update ref_count adjustments
- Fix resumable upload bug where ref_count was incremented manually AND by SQL trigger
- ref_count is now exclusively managed by SQL triggers on tag INSERT/DELETE/UPDATE
2026-01-05 15:19:05 -06:00
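
To illustrate why trigger-managed counting and a manual increment must not coexist, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and trigger names are hypothetical, and SQLite stands in for whatever database the project actually uses.

```python
import sqlite3

# Hypothetical schema: every tag row holds one reference to an artifact.
SCHEMA = """
CREATE TABLE artifacts (
    id TEXT PRIMARY KEY,
    ref_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE tags (
    name TEXT PRIMARY KEY,
    artifact_id TEXT NOT NULL REFERENCES artifacts(id)
);
CREATE TRIGGER tags_after_insert AFTER INSERT ON tags BEGIN
    UPDATE artifacts SET ref_count = ref_count + 1 WHERE id = NEW.artifact_id;
END;
CREATE TRIGGER tags_after_delete AFTER DELETE ON tags BEGIN
    UPDATE artifacts SET ref_count = ref_count - 1 WHERE id = OLD.artifact_id;
END;
CREATE TRIGGER tags_after_update AFTER UPDATE OF artifact_id ON tags BEGIN
    UPDATE artifacts SET ref_count = ref_count - 1 WHERE id = OLD.artifact_id;
    UPDATE artifacts SET ref_count = ref_count + 1 WHERE id = NEW.artifact_id;
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the trigger-managed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def ref_count(conn: sqlite3.Connection, artifact_id: str) -> int:
    """Read the current reference count for one artifact."""
    row = conn.execute(
        "SELECT ref_count FROM artifacts WHERE id = ?", (artifact_id,)
    ).fetchone()
    return row[0]
```

With this schema, application code only inserts, updates, or deletes tag rows. Any additional `UPDATE artifacts SET ref_count = ref_count + 1` in an upload path would count the same reference twice, which is the kind of bug this commit describes.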
Mondo Diaz
939192f425 Add integration tests for stats endpoints and fix JSON report serialization 2026-01-05 15:11:15 -06:00
Mondo Diaz
eca291d194 Add S3 verification and failure cleanup tests
- Add test_s3_bucket_single_object_after_duplicates to verify only one S3 object exists
- Add tests for upload failure scenarios (invalid project/package, empty file)
- Add tests for orphaned S3 objects and database records cleanup
- Add S3 direct access helpers (list_s3_objects_by_hash, s3_object_exists, etc.)
- Fix conftest.py to use setdefault for env vars (don't override container config)

All 52 tests now pass.
2026-01-05 14:39:22 -06:00
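
The `setdefault` fix in conftest.py can be sketched as follows. The variable names and values are hypothetical; the point is only that `setdefault` fills in a value when the variable is unset and leaves an existing value, such as one injected by the container, untouched.

```python
import os
from typing import MutableMapping

# Hypothetical test defaults; the real conftest.py may use different names.
TEST_DEFAULTS = {
    "S3_BUCKET": "test-bucket",
    "S3_ENDPOINT": "http://localhost:9000",
}


def apply_test_defaults(env: MutableMapping[str, str] = os.environ) -> None:
    """Set each default only if the variable is not already defined."""
    for name, value in TEST_DEFAULTS.items():
        # Unlike env[name] = value, setdefault keeps any pre-existing value.
        env.setdefault(name, value)
```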
Mondo Diaz
7c31b6a244 Add integration tests for deduplication and ref_count
- Add test_integration_uploads.py with 12 tests for duplicate upload scenarios
- Add test_ref_count.py with 7 tests for ref_count management
- Fix ArtifactDetailResponse to include sha256 and checksum fields
- Fix health check SQL warning by wrapping in text()
- Update tests to use unique content per test run for idempotency
2026-01-05 14:29:12 -06:00
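
One way to get unique content per test run, so that repeated runs never collide with artifacts left behind by earlier runs, is to mix a random UUID into the payload. This is a sketch under that assumption; the helper name is hypothetical and the actual tests may generate content differently.

```python
import hashlib
import uuid
from typing import Tuple


def unique_content(label: str = "test") -> Tuple[bytes, str]:
    """Return fresh bytes and their SHA256 hex digest (hypothetical helper)."""
    content = f"{label}-{uuid.uuid4().hex}".encode()
    return content, hashlib.sha256(content).hexdigest()
```

A deduplication test can then upload the same generated bytes twice and expect the second upload to be reported as a duplicate, regardless of what earlier runs stored.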
Mondo Diaz
109677e43a Add storage abstraction, stats endpoints, garbage collection, and test infrastructure
- Add StorageBackend protocol for backend-agnostic storage interface
- Add health check with storage and database connectivity verification
- Add garbage collection endpoints for orphaned artifacts (ref_count=0)
- Add deduplication statistics endpoints (/api/v1/stats, /stats/storage, /stats/deduplication)
- Add per-project statistics endpoint
- Add verify_integrity method for post-upload hash validation
- Set up pytest infrastructure with mock S3 client
- Add unit tests for hash calculation and duplicate detection
2026-01-05 11:16:46 -06:00
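
A backend-agnostic storage interface with post-upload hash validation might be shaped like the sketch below. `StorageBackend` and `verify_integrity` are names from the commit message; the other method names, their signatures, and the in-memory implementation are assumptions for illustration, not the project's actual interface.

```python
import hashlib
from typing import Dict, Protocol


class StorageBackend(Protocol):
    """Structural interface; put/get/exists are assumed method names."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def exists(self, key: str) -> bool: ...
    def verify_integrity(self, key: str, expected_sha256: str) -> bool: ...


class InMemoryBackend:
    """Dictionary-backed stand-in, in the spirit of a mock S3 client."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def exists(self, key: str) -> bool:
        return key in self._objects

    def verify_integrity(self, key: str, expected_sha256: str) -> bool:
        # Re-hash the stored bytes and compare with the digest computed at upload.
        return hashlib.sha256(self.get(key)).hexdigest() == expected_sha256


def store_once(backend: StorageBackend, data: bytes) -> str:
    """Store data under its SHA256 digest unless an object already exists."""
    digest = hashlib.sha256(data).hexdigest()
    if not backend.exists(digest):
        backend.put(digest, data)
    return digest
```

Because `StorageBackend` is a `typing.Protocol`, any class with matching methods satisfies it without inheriting from it, so an S3-backed class and a test double can be swapped freely.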