Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Added download verification with verify and verify_mode query parameters (#26)
    • ?verify=true&verify_mode=pre - Pre-verification: verifies the artifact before streaming (guarantees that no corrupt data is sent)
    • ?verify=true&verify_mode=stream - Streaming verification: verifies while streaming (logs an error on mismatch)
  • Added checksum response headers to all download endpoints (#27)
    • X-Checksum-SHA256 - SHA256 hash of the artifact
    • X-Content-Length - File size in bytes
    • X-Checksum-MD5 - MD5 hash (if available)
    • ETag - Artifact ID (SHA256)
    • Digest - RFC 3230 format sha-256 hash (base64)
    • X-Verified - Verification status (true/false/pending)
  • Added checksum.py module with SHA256 utilities (#26)
    • compute_sha256() and compute_sha256_stream() functions
    • HashingStreamWrapper for incremental hash computation
    • VerifyingStreamWrapper for stream verification
    • verify_checksum() and verify_checksum_strict() functions
    • ChecksumMismatchError exception with context
  • Added get_verified() and get_stream_verified() methods to storage layer (#26)
  • Added logging_config.py module with structured logging (#28)
    • JSON logging format for production
    • Request ID tracking via context variables
    • Verification failure logging with full context
  • Added log_level and log_format settings to configuration (#28)
  • Added 62 unit tests for checksum utilities and verification (#29)
  • Added 17 integration tests for download verification API (#29)
  • Added global artifacts endpoint GET /api/v1/artifacts with project/package/tag/size/date filters (#18)
  • Added global tags endpoint GET /api/v1/tags with project/package/search/date filters (#18)
  • Added wildcard pattern matching (*) for tag filters across all endpoints (#18)
  • Added comma-separated multi-value support for tag filters (#18)
  • Added search parameter to /api/v1/uploads for filename search (#18)
  • Added tag filter to /api/v1/uploads endpoint (#18)
  • Added sort and order parameters to /api/v1/uploads endpoint (#18)
  • Added min_size and max_size filters to package artifacts endpoint (#18)
  • Added sort and order parameters to package artifacts endpoint (#18)
  • Added from and to date filters to package tags endpoint (#18)
  • Added GlobalArtifactResponse and GlobalTagResponse schemas (#18)
  • Added S3 object verification before database commit during upload (#19)
  • Added S3 object cleanup on database commit failure (#19)
  • Added upload duration tracking (duration_ms field) (#19)
  • Added User-Agent header capture during uploads (#19)
  • Added X-Checksum-SHA256 header support for client-side checksum verification (#19)
  • Added status, error_message, client_checksum columns to uploads table (#19)
  • Added upload_locks table for future concurrent upload conflict detection (#19)
  • Added consistency check endpoint GET /api/v1/admin/consistency-check (#19)
  • Added PUT /api/v1/projects/{project} endpoint for project updates with audit logging (#20)
  • Added PUT /api/v1/project/{project}/packages/{package} endpoint for package updates with audit logging (#20)
  • Added artifact.download audit logging to download endpoint (#20)
  • Added ProjectHistory and PackageHistory models with database triggers (#20)
  • Added migration 004_history_tables.sql for project/package history (#20)
  • Added migration 005_upload_enhancements.sql for upload status tracking (#19)
  • Added 9 integration tests for global artifacts/tags endpoints (#18)
  • Added global uploads query endpoint GET /api/v1/uploads with project/package/user/date filters (#18)
  • Added project-level uploads endpoint GET /api/v1/project/{project}/uploads (#18)
  • Added has_more field to pagination metadata for easier pagination UI (#18)
  • Added upload_id, content_type, original_name, created_at fields to upload response (#19)
  • Added audit log API endpoints with filtering and pagination (#20)
    • GET /api/v1/audit-logs - list all audit logs with action/resource/user/date filters
    • GET /api/v1/projects/{project}/audit-logs - project-scoped audit logs
    • GET /api/v1/project/{project}/{package}/audit-logs - package-scoped audit logs
  • Added upload history API endpoints (#20)
    • GET /api/v1/project/{project}/{package}/uploads - list upload events for a package
    • GET /api/v1/artifact/{id}/uploads - list all uploads of a specific artifact
  • Added artifact provenance endpoint GET /api/v1/artifact/{id}/history (#20)
    • Returns full artifact history including packages, tags, and upload events
  • Added audit logging for project.create, package.create, tag.create, tag.update, artifact.upload actions (#20)
  • Added AuditLogResponse, UploadHistoryResponse, ArtifactProvenanceResponse schemas (#20)
  • Added TagHistoryDetailResponse schema with artifact metadata (#20)
  • Added 31 integration tests for audit log, history, and upload query endpoints (#22)
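
The streaming verification and Digest header entries above (#26, #27) can be illustrated with a minimal sketch. It is not the project's checksum.py: the names iter_verified, digest_header, and ChecksumMismatch are hypothetical stand-ins, and only the Python standard library is assumed.

```python
import base64
import hashlib
import io
from typing import BinaryIO, Iterator


class ChecksumMismatch(Exception):
    """Hypothetical stand-in for the ChecksumMismatchError listed above."""


def iter_verified(stream: BinaryIO, expected_sha256: str,
                  chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield chunks while hashing them; raise after the last chunk on mismatch."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        yield chunk
    actual = hasher.hexdigest()
    if actual != expected_sha256.lower():
        # In stream mode the data has already been sent, so a caller can only
        # report the failure at this point (the changelog says it is logged).
        raise ChecksumMismatch(f"expected {expected_sha256}, got {actual}")


def digest_header(sha256_hex: str) -> str:
    """Format a hex SHA-256 as a Digest header value (base64 of the raw bytes)."""
    raw = bytes.fromhex(sha256_hex)
    return "sha-256=" + base64.b64encode(raw).decode("ascii")


# Usage: hash a small in-memory payload while "streaming" it.
payload = b"hello orchard"
expected = hashlib.sha256(payload).hexdigest()
body = b"".join(iter_verified(io.BytesIO(payload), expected, chunk_size=4))
header_value = digest_header(expected)
```

A pre-verification mode would presumably run the same hash over the whole object before sending the first byte, trading latency for the guarantee that a corrupt artifact is never delivered.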

Changed

  • Standardized audit action naming to {entity}.{action} pattern (project.delete, package.delete, tag.delete) (#20)
  • Added StorageBackend protocol/interface for backend-agnostic storage (#33)
  • Added health_check() method to storage backend with /health endpoint integration (#33)
  • Added verify_integrity() method for post-upload hash validation (#33)
  • Added S3 configuration options: s3_verify_ssl, s3_connect_timeout, s3_read_timeout, s3_max_retries (#33)
  • Added S3StorageUnavailableError and HashCollisionError exception types (#33)
  • Added hash collision detection by comparing file sizes during deduplication (#33)
  • Added garbage collection endpoint POST /api/v1/admin/garbage-collect for orphaned artifacts (#36)
  • Added orphaned artifacts listing endpoint GET /api/v1/admin/orphaned-artifacts (#36)
  • Added global storage statistics endpoint GET /api/v1/stats (#34)
  • Added storage breakdown endpoint GET /api/v1/stats/storage (#34)
  • Added deduplication metrics endpoint GET /api/v1/stats/deduplication (#34)
  • Added per-project statistics endpoint GET /api/v1/projects/{project}/stats (#34)
  • Added per-package statistics endpoint GET /api/v1/project/{project}/packages/{package}/stats (#34)
  • Added per-artifact statistics endpoint GET /api/v1/artifact/{id}/stats (#34)
  • Added cross-project deduplication endpoint GET /api/v1/stats/cross-project (#34)
  • Added timeline statistics endpoint GET /api/v1/stats/timeline with daily/weekly/monthly periods (#34)
  • Added stats export endpoint GET /api/v1/stats/export with JSON/CSV formats (#34)
  • Added summary report endpoint GET /api/v1/stats/report with markdown/JSON formats (#34)
  • Added Dashboard page at /dashboard with storage and deduplication visualizations (#34)
  • Added pytest infrastructure with mock S3 client for unit testing (#35)
  • Added unit tests for SHA256 hash calculation (#35)
  • Added unit tests for duplicate detection and deduplication behavior (#35)
  • Added integration tests for upload scenarios and ref_count management (#35)
  • Added integration tests for S3 verification and failure cleanup (#35)
  • Added integration tests for all stats endpoints (#35)
  • Added integration tests for cascade deletion ref_count behavior (package/project delete) (#35)
  • Added integration tests for tag update ref_count adjustments (#35)
  • Added integration tests for garbage collection endpoints (#35)
  • Added integration tests for file size validation (#35)
  • Added test dependencies to requirements.txt (pytest, pytest-asyncio, pytest-cov, httpx, moto) (#35)
  • Added ORCHARD_MAX_FILE_SIZE config option (default: 10GB) for upload size limits (#37)
  • Added ORCHARD_MIN_FILE_SIZE config option (default: 1 byte, rejects empty files) (#37)
  • Added file size validation to upload and resumable upload endpoints (#37)
  • Added comprehensive deduplication design document (docs/design/deduplication-design.md) (#37)
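
The upload size limits from #37 could be enforced roughly as sketched below. This is an illustration under stated assumptions, not the project's code: load_limits, validate_file_size, and FileSizeError are hypothetical names, and "10GB" is assumed here to mean 10 * 1024**3 bytes, which the changelog does not specify.

```python
import os
from typing import Mapping, Optional, Tuple

# Assumption: "10GB" is interpreted as 10 GiB; the real default may differ.
DEFAULT_MAX_FILE_SIZE = 10 * 1024**3
# A 1-byte minimum rejects empty files, as described in the changelog.
DEFAULT_MIN_FILE_SIZE = 1


class FileSizeError(ValueError):
    """Hypothetical exception for uploads outside the configured limits."""


def load_limits(env: Optional[Mapping[str, str]] = None) -> Tuple[int, int]:
    """Read (min_size, max_size) in bytes from the environment, with defaults."""
    env = os.environ if env is None else env
    min_size = int(env.get("ORCHARD_MIN_FILE_SIZE", DEFAULT_MIN_FILE_SIZE))
    max_size = int(env.get("ORCHARD_MAX_FILE_SIZE", DEFAULT_MAX_FILE_SIZE))
    return min_size, max_size


def validate_file_size(size: int, min_size: int, max_size: int) -> None:
    """Raise FileSizeError when size falls outside [min_size, max_size]."""
    if size < min_size:
        raise FileSizeError(f"file too small: {size} < {min_size} bytes")
    if size > max_size:
        raise FileSizeError(f"file too large: {size} > {max_size} bytes")


# Usage with an explicit mapping instead of the real environment.
limits = load_limits({"ORCHARD_MAX_FILE_SIZE": "1024"})
validate_file_size(512, *limits)  # passes silently
```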

Fixed

  • Fixed Helm chart minio.ingress conflicting with Bitnami MinIO subchart by renaming to minioIngress (#48)
  • Fixed JSON report serialization error for Decimal types in GET /api/v1/stats/report (#34)
  • Fixed resumable upload double-counting ref_count when a tag is provided (removed the manual increment; SQL triggers handle it) (#35)
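
For context on the Decimal serialization fix (#34): Python's json module rejects decimal.Decimal values, which database aggregates often return. The changelog does not say how the fix was implemented; the sketch below shows one common approach, and the field names in it are hypothetical.

```python
import json
from decimal import Decimal


def json_default(value):
    """Fallback encoder: convert Decimal to float, reject anything else."""
    if isinstance(value, Decimal):
        # float() may lose precision for very large or very precise values;
        # str(value) is an alternative when exactness matters.
        return float(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


# Hypothetical report payload with Decimal values, as a SUM() or AVG() might return.
report = {"total_size_bytes": Decimal("1048576"), "deduplication_ratio": Decimal("1.75")}
encoded = json.dumps(report, default=json_default)
```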

[0.3.0] - 2025-12-15

Changed

  • Changed default download mode from proxy to presigned for better performance (#48)

Added

  • Added presigned URL support for direct S3 downloads (#48)
  • Added ORCHARD_DOWNLOAD_MODE config option (presigned, redirect, proxy) (#48)
  • Added ORCHARD_PRESIGNED_URL_EXPIRY config option (default: 3600 seconds) (#48)
  • Added ?mode= query parameter to override download mode per-request (#48)
  • Added /api/v1/project/{project}/{package}/+/{ref}/url endpoint for getting presigned URLs (#48)
  • Added PresignedUrlResponse schema with URL, expiry, checksums, and artifact metadata (#48)
  • Added MinIO ingress support in Helm chart for presigned URL access (#48)
  • Added orchard.download.mode and orchard.download.presignedUrlExpiry Helm values (#48)
  • Added integrity verification workflow design document (#24)
  • Added sha256 field to API responses for clarity (alias of id) (#25)
  • Added checksum_sha1 field to artifacts table for compatibility (#25)
  • Added s3_etag field to artifacts table for S3 verification (#25)
  • Added computation and storage of MD5, SHA1, and S3 ETag alongside SHA256 during upload (#25)
  • Added Dockerfile.local and docker-compose.local.yml for local development (#25)
  • Added migration script 003_checksum_fields.sql for existing databases (#25)
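
Computing several checksums during upload (#25) does not require reading the file more than once. The following is a minimal sketch of a single-pass approach, assuming only the standard library; compute_checksums is a hypothetical name, and the S3 ETag is left out because it is reported by S3 rather than computed locally.

```python
import hashlib
import io
from typing import BinaryIO, Dict


def compute_checksums(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> Dict[str, str]:
    """Read the stream once and return hex digests for SHA256, SHA1, and MD5."""
    hashers = {
        "sha256": hashlib.sha256(),
        "sha1": hashlib.sha1(),
        "md5": hashlib.md5(),
    }
    # iter() with a sentinel keeps calling read() until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Usage: a tiny chunk size forces several read() calls on a small payload.
checksums = compute_checksums(io.BytesIO(b"orchard artifact"), chunk_size=4)
```

Feeding every chunk to all hashers keeps memory use bounded by the chunk size, which matters for uploads near the configured size limit.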

[0.2.0] - 2025-12-15

Added

  • Added format and platform fields to packages table (#16)
  • Added checksum_md5 and metadata JSONB fields to artifacts table (#16)
  • Added updated_at field to tags table (#16)
  • Added tag_name, user_agent, duration_ms, deduplicated, checksum_verified fields to uploads table (#16)
  • Added change_type field to tag_history table (#16)
  • Added composite indexes for common query patterns (#16)
  • Added GIN indexes on JSONB fields for efficient JSON queries (#16)
  • Added partial index for public projects (#16)
  • Added database triggers for updated_at timestamps (#16)
  • Added database triggers for maintaining artifact ref_count accuracy (#16)
  • Added CHECK constraints for data integrity (size > 0, ref_count >= 0) (#16)
  • Added migration script 002_schema_enhancements.sql for existing databases (#16)

Changed

  • Updated images to use internal container BSF proxy (#46)

[0.1.0] - 2025-12-12

Added

  • Added Prosper docker template config (#45)

Changed

  • Changed the Dockerfile npm build arg to use the deps.global.bsf.tools URL as the default registry (#45)