Add comprehensive stats endpoints and reporting features

Backend stats endpoints:
- GET /api/v1/project/:project/packages/:package/stats - per-package stats
- GET /api/v1/artifact/:id/stats - artifact reference statistics
- GET /api/v1/stats/cross-project - cross-project deduplication detection
- GET /api/v1/stats/timeline - time-based metrics (daily/weekly/monthly)
- GET /api/v1/stats/export - CSV/JSON export
- GET /api/v1/stats/report - markdown/JSON summary report generation
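
A minimal sketch of how a client might build URLs for the routes listed above, using only the standard library. The base URL, the helper name `stats_url`, and the `period` query parameter name are assumptions for illustration; only the route paths come from this commit.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # assumption: the commit does not name a host


def stats_url(path_template: str, query: Optional[Dict[str, str]] = None, **path_params: str) -> str:
    """Fill ':name' placeholders in a route template and append a query string."""
    parts = []
    for segment in path_template.split("/"):
        if segment.startswith(":"):
            # Percent-encode each value so '/' or spaces stay inside one path segment.
            segment = quote(path_params[segment[1:]], safe="")
        parts.append(segment)
    url = BASE_URL + "/".join(parts)
    if query:
        url += "?" + urlencode(query)
    return url


package_stats = stats_url(
    "/api/v1/project/:project/packages/:package/stats",
    project="demo",
    package="my pkg",
)
# 'period' is a hypothetical parameter name for the daily/weekly/monthly choice.
timeline = stats_url("/api/v1/stats/timeline", query={"period": "weekly"})
```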

Enhanced existing endpoints:
- Added storage_saved_bytes and deduplication_ratio to project stats
- Added date range filtering via from_date/to_date params
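
The two additions above can be sketched roughly as follows. The ratio follows the schema comment in this commit (`upload_count / artifact_count`, default `1.0`). The zero-artifact fallback, the inclusive bounds, and the helper names are assumptions, not the committed implementation.

```python
from datetime import date
from typing import Optional


def deduplication_ratio(upload_count: int, artifact_count: int) -> float:
    """Return upload_count / artifact_count, or 1.0 when there are no artifacts."""
    if artifact_count <= 0:
        return 1.0  # assumption: mirror the schema default when nothing is stored
    return upload_count / artifact_count


def in_date_range(day: date, from_date: Optional[date], to_date: Optional[date]) -> bool:
    """Inclusive range check; a missing bound is treated as open-ended (assumption)."""
    if from_date is not None and day < from_date:
        return False
    if to_date is not None and day > to_date:
        return False
    return True
```

A ratio of 2.5, for instance, would mean each stored artifact was uploaded 2.5 times on average.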

New schemas:
- PackageStatsResponse
- ArtifactStatsResponse
- CrossProjectDeduplicationResponse
- TimeBasedStatsResponse
- StatsReportResponse
Mondo Diaz
2026-01-05 14:57:47 -06:00
parent e215ecabcd
commit c79b10cbc5
2 changed files with 572 additions and 1 deletions

@@ -456,3 +456,62 @@ class ProjectStatsResponse(BaseModel):
    total_size_bytes: int
    upload_count: int
    deduplicated_uploads: int
    storage_saved_bytes: int = 0  # Bytes saved through deduplication
    deduplication_ratio: float = 1.0  # upload_count / artifact_count


class PackageStatsResponse(BaseModel):
    """Per-package statistics"""

    package_id: str
    package_name: str
    project_name: str
    tag_count: int
    artifact_count: int
    total_size_bytes: int
    upload_count: int
    deduplicated_uploads: int
    storage_saved_bytes: int = 0
    deduplication_ratio: float = 1.0


class ArtifactStatsResponse(BaseModel):
    """Per-artifact reference statistics"""

    artifact_id: str
    sha256: str
    size: int
    ref_count: int
    storage_savings: int  # (ref_count - 1) * size
    tags: List[Dict[str, Any]]  # Tags referencing this artifact
    projects: List[str]  # Projects using this artifact
    packages: List[str]  # Packages using this artifact
    first_uploaded: Optional[datetime] = None
    last_referenced: Optional[datetime] = None


class CrossProjectDeduplicationResponse(BaseModel):
    """Cross-project deduplication statistics"""

    shared_artifacts_count: int  # Artifacts used in multiple projects
    total_cross_project_savings: int  # Bytes saved by cross-project sharing
    shared_artifacts: List[Dict[str, Any]]  # Details of shared artifacts


class TimeBasedStatsResponse(BaseModel):
    """Time-based deduplication statistics"""

    period: str  # "daily", "weekly", "monthly"
    start_date: datetime
    end_date: datetime
    data_points: List[
        Dict[str, Any]
    ]  # List of {date, uploads, unique, duplicated, bytes_saved}


class StatsReportResponse(BaseModel):
    """Summary report in various formats"""

    format: str  # "json", "csv", "markdown"
    generated_at: datetime
    content: str  # The report content
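
To illustrate the arithmetic behind `storage_savings` and the cross-project fields, here is a small sketch. The per-artifact formula `(ref_count - 1) * size` comes from the schema comment. The function names, the input dict shape, and the rule of counting one avoided copy per additional project are assumptions, and the second endpoint's actual query logic is not shown in this hunk.

```python
from typing import Any, Dict, List


def storage_savings(ref_count: int, size: int) -> int:
    """Bytes saved for one artifact: every reference after the first costs nothing."""
    return max(ref_count - 1, 0) * size


def cross_project_summary(artifacts: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarise artifacts that appear in more than one project.

    Each input dict is assumed to carry 'size' (int) and 'projects' (list of str).
    """
    shared = [a for a in artifacts if len(set(a["projects"])) > 1]
    # Assumption: one stored copy is avoided for each project beyond the first.
    total = sum((len(set(a["projects"])) - 1) * a["size"] for a in shared)
    return {
        "shared_artifacts_count": len(shared),
        "total_cross_project_savings": total,
        "shared_artifacts": shared,
    }


summary = cross_project_summary(
    [
        {"size": 100, "projects": ["a", "b", "c"]},
        {"size": 50, "projects": ["a"]},
    ]
)
```

The returned keys match the field names of `CrossProjectDeduplicationResponse`, so under these assumptions the dict could be passed to that model directly.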