Mondo Diaz ece5033d74 Add schema enhancements for uploads, artifacts, and audit tracking
- Add format and platform fields to packages table
- Add checksum_md5 and metadata JSONB to artifacts with CHECK constraints
- Add updated_at and composite index to tags table
- Add tag_name, user_agent, duration_ms, deduplicated, checksum_verified to uploads
- Add change_type field to tag_history table
- Add composite indexes and GIN index to audit_logs
- Add partial index for public projects
- Add triggers for ref_count accuracy and updated_at timestamps
- Create migration script (002) for existing databases
2025-12-12 15:05:24 -06:00
2025-12-12 13:52:27 -07:00

Orchard

Content-Addressable Storage System

Orchard is a centralized binary artifact storage system that provides content-addressable storage with automatic deduplication, flexible access control, and multi-format package support.

Tech Stack

  • Backend: Python 3.12 + FastAPI
  • Frontend: React 18 + TypeScript + Vite
  • Database: PostgreSQL 16
  • Object Storage: MinIO (S3-compatible)
  • Cache: Redis (for future use)

Features

Currently Implemented

  • Content-Addressable Storage - Artifacts are stored and referenced by their SHA256 hash, ensuring deduplication and data integrity
  • Project/Package/Artifact Hierarchy - Organized storage structure:
    • Project - Top-level organizational container
    • Package - Named collection within a project
    • Artifact - Specific content instance identified by SHA256
  • Tags - Alias system for referencing artifacts by human-readable names (e.g., v1.0.0, latest, stable)
  • Package Formats & Platforms - Packages can be labeled with a format (npm, pypi, docker, deb, rpm, etc.) and a platform (linux, darwin, windows, etc.)
  • Rich Package Metadata - Package listings include aggregated stats (tag count, artifact count, total size, latest tag)
  • S3-Compatible Backend - Uses MinIO (or any S3-compatible storage) for artifact storage
  • PostgreSQL Metadata - Relational database for metadata, access control, and audit trails
  • REST API - Full HTTP API for all operations
  • Web UI - React-based interface for managing artifacts with:
    • Hierarchical navigation (Projects → Packages → Tags/Artifacts)
    • Search, sort, and filter capabilities on all list views
    • URL-based state persistence for filters and pagination
    • Keyboard navigation (Backspace to go up hierarchy)
    • Copy-to-clipboard for artifact IDs
    • Responsive design for mobile and desktop
  • Docker Compose Setup - Easy local development environment
  • Helm Chart - Kubernetes deployment with PostgreSQL, MinIO, and Redis subcharts
  • Multipart Upload - Automatic multipart upload for files larger than 100MB
  • Resumable Uploads - API for resumable uploads with part-by-part upload support
  • Range Requests - HTTP range request support for partial downloads
  • Format-Specific Metadata - Automatic extraction of metadata from package formats:
    • .deb - Debian packages (name, version, architecture, maintainer)
    • .rpm - RPM packages (name, version, release, architecture)
    • .tar.gz/.tgz - Tarballs (name, version from filename)
    • .whl - Python wheels (name, version, author)
    • .jar - Java JARs (manifest info, Maven coordinates)
    • .zip - ZIP files (file count, uncompressed size)
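
To illustrate the content-addressable idea from the feature list, here is a minimal in-memory sketch. It is not Orchard's storage layer: the class name `InMemoryCAS` and its methods are hypothetical, and counting one reference per `put` call is an assumption made only for the example.

```python
import hashlib


class InMemoryCAS:
    """Toy content-addressable store: blobs are keyed by their SHA256 hex digest."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self._ref_counts: dict[str, int] = {}

    def put(self, data: bytes) -> tuple[str, bool]:
        """Store data and return (artifact_id, deduplicated)."""
        artifact_id = hashlib.sha256(data).hexdigest()
        deduplicated = artifact_id in self._blobs
        if not deduplicated:
            # Only the first upload of a given content writes bytes.
            self._blobs[artifact_id] = data
        # Assumption: every upload adds one reference to the artifact.
        self._ref_counts[artifact_id] = self._ref_counts.get(artifact_id, 0) + 1
        return artifact_id, deduplicated

    def get(self, artifact_id: str) -> bytes:
        """Return the stored bytes for an artifact ID (KeyError if unknown)."""
        return self._blobs[artifact_id]

    def ref_count(self, artifact_id: str) -> int:
        """Return how many times this content has been stored."""
        return self._ref_counts.get(artifact_id, 0)
```

Because the ID is derived from the bytes themselves, uploading the same file twice yields the same ID and the second upload can be reported as deduplicated.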

API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | / | Web UI |
| GET | /health | Health check |
| GET | /api/v1/projects | List all projects |
| POST | /api/v1/projects | Create a new project |
| GET | /api/v1/projects/:project | Get project details |
| GET | /api/v1/project/:project/packages | List packages (with pagination, search, filtering) |
| GET | /api/v1/project/:project/packages/:package | Get single package with metadata |
| POST | /api/v1/project/:project/packages | Create a new package |
| POST | /api/v1/project/:project/:package/upload | Upload an artifact |
| GET | /api/v1/project/:project/:package/+/:ref | Download an artifact (supports Range header) |
| HEAD | /api/v1/project/:project/:package/+/:ref | Get artifact metadata without downloading |
| GET | /api/v1/project/:project/:package/tags | List tags (with pagination, search, sorting, artifact metadata) |
| POST | /api/v1/project/:project/:package/tags | Create a tag |
| GET | /api/v1/project/:project/:package/tags/:tag_name | Get single tag with artifact metadata |
| GET | /api/v1/project/:project/:package/tags/:tag_name/history | Get tag change history |
| GET | /api/v1/project/:project/:package/artifacts | List artifacts in package (with filtering) |
| GET | /api/v1/project/:project/:package/consumers | List consumers of a package |
| GET | /api/v1/artifact/:id | Get artifact metadata with referencing tags |

Resumable Upload Endpoints

For large files, use the resumable upload API:

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/v1/project/:project/:package/upload/init | Initialize resumable upload |
| PUT | /api/v1/project/:project/:package/upload/:upload_id/part/:part_number | Upload a part |
| POST | /api/v1/project/:project/:package/upload/:upload_id/complete | Complete upload |
| DELETE | /api/v1/project/:project/:package/upload/:upload_id | Abort upload |
| GET | /api/v1/project/:project/:package/upload/:upload_id/status | Get upload status |

Reference Formats

When downloading artifacts, the :ref parameter supports multiple formats:

  • latest - Tag name directly
  • v1.0.0 - Version tag
  • tag:stable - Explicit tag reference
  • version:2024.1 - Version reference
  • artifact:a3f5d8e12b4c6789... - Direct SHA256 hash reference
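
One way to read the reference formats above is as an optional `kind:` prefix in front of a value. The sketch below is only an illustration: `parse_ref` is a hypothetical helper, and treating any unprefixed reference as a tag name is an assumption, not confirmed server behavior.

```python
def parse_ref(ref: str) -> tuple[str, str]:
    """Split a download reference into (kind, value).

    Prefixed forms ("tag:", "version:", "artifact:") are explicit.
    Assumption: anything without a known prefix is a bare tag name.
    """
    if not ref:
        raise ValueError("empty reference")
    for kind in ("tag", "version", "artifact"):
        prefix = kind + ":"
        if ref.startswith(prefix):
            value = ref[len(prefix):]
            if not value:
                raise ValueError(f"empty {kind} reference")
            return kind, value
    return "tag", ref
```

With this reading, `latest` and `tag:latest` resolve to the same thing, while `artifact:` bypasses tags and addresses content by hash.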

Quick Start

Prerequisites

  • Docker and Docker Compose

Running Locally

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f orchard-server

# Stop services
docker-compose down

Services

| Service | Port | Description |
|---------|------|-------------|
| orchard-server | 8080 | Main API server and Web UI |
| postgres | 5432 | PostgreSQL database |
| minio | 9000 | S3-compatible object storage |
| minio (console) | 9001 | MinIO web console |
| redis | 6379 | Cache (for future use) |

Access Points

Development

Backend (FastAPI)

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080

Frontend (React)

cd frontend
npm install
npm run dev

The frontend dev server proxies API requests to localhost:8080.

Usage Examples

Create a Project

curl -X POST http://localhost:8080/api/v1/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "my-project", "description": "My project artifacts", "is_public": true}'

Create a Package

curl -X POST http://localhost:8080/api/v1/project/my-project/packages \
  -H "Content-Type: application/json" \
  -d '{"name": "releases", "description": "Release builds", "format": "generic", "platform": "any"}'

Supported formats: generic, npm, pypi, docker, deb, rpm, maven, nuget, helm

Supported platforms: any, linux, darwin, windows, linux-amd64, linux-arm64, darwin-amd64, darwin-arm64, windows-amd64

List Packages

# Basic listing
curl http://localhost:8080/api/v1/project/my-project/packages

# With pagination
curl "http://localhost:8080/api/v1/project/my-project/packages?page=1&limit=10"

# With search
curl "http://localhost:8080/api/v1/project/my-project/packages?search=release"

# With sorting
curl "http://localhost:8080/api/v1/project/my-project/packages?sort=created_at&order=desc"

# Filter by format/platform
curl "http://localhost:8080/api/v1/project/my-project/packages?format=npm&platform=linux"

Response includes aggregated metadata:

{
  "items": [
    {
      "id": "uuid",
      "name": "releases",
      "description": "Release builds",
      "format": "generic",
      "platform": "any",
      "tag_count": 5,
      "artifact_count": 3,
      "total_size": 1048576,
      "latest_tag": "v1.0.0",
      "latest_upload_at": "2025-01-01T00:00:00Z",
      "recent_tags": [...]
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}
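
The listing options shown in the curl examples above are plain query parameters, so a client can assemble them programmatically. The following is a small sketch using only the standard library. `packages_url` is a hypothetical helper, and `BASE_URL` assumes the local docker-compose address used throughout this README.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8080"  # assumption: local docker-compose setup


def packages_url(project: str, **params: object) -> str:
    """Build a package-listing URL, skipping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/api/v1/project/{quote(project, safe='')}/packages"
    return f"{url}?{query}" if query else url
```

For example, `packages_url("my-project", format="npm", platform="linux")` produces the same URL as the "Filter by format/platform" curl command.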

Get Single Package

curl http://localhost:8080/api/v1/project/my-project/packages/releases

# Include all tags (not just recent 5)
curl "http://localhost:8080/api/v1/project/my-project/packages/releases?include_tags=true"

Upload an Artifact

curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload \
  -F "file=@./build/app-v1.0.0.tar.gz" \
  -F "tag=v1.0.0"

Response:

{
  "artifact_id": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789",
  "size": 1048576,
  "project": "my-project",
  "package": "releases",
  "tag": "v1.0.0",
  "format_metadata": {
    "format": "tarball",
    "package_name": "app",
    "version": "1.0.0"
  },
  "deduplicated": false
}
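
The `format_metadata` in the response above is consistent with deriving the name and version from the tarball filename. A rough sketch of that idea follows. The regular expression and the `tarball_metadata` function are illustrative assumptions, not the extraction code in `metadata.py`.

```python
import re

# Assumption: filenames look like "<name>-<version>.tar.gz" or ".tgz",
# with an optional "v" before a dotted numeric version.
_TARBALL_RE = re.compile(
    r"^(?P<name>.+?)-v?(?P<version>\d+(?:\.\d+)*)\.(?:tar\.gz|tgz)$"
)


def tarball_metadata(filename: str) -> dict[str, str]:
    """Guess package name and version from a tarball filename."""
    match = _TARBALL_RE.match(filename)
    if match is None:
        # No recognizable version: report only the format.
        return {"format": "tarball"}
    return {
        "format": "tarball",
        "package_name": match["name"],
        "version": match["version"],
    }
```

Filenames without a recognizable version simply yield the format alone, so an upload never fails because of an unusual name.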

Resumable Upload (for large files)

For files larger than 100MB, use the resumable upload API:

# 1. Initialize upload (client must compute SHA256 hash first)
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/init \
  -H "Content-Type: application/json" \
  -d '{
    "expected_hash": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789",
    "filename": "large-file.tar.gz",
    "size": 524288000,
    "tag": "v2.0.0"
  }'

# Response: {"upload_id": "abc123", "already_exists": false, "chunk_size": 10485760}

# 2. Upload parts (10MB chunks recommended)
curl -X PUT http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/part/1 \
  --data-binary @chunk1.bin

# 3. Complete the upload
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/complete \
  -H "Content-Type: application/json" \
  -d '{"tag": "v2.0.0"}'
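
Before calling the init endpoint, a client needs the file's SHA256 hash and a plan for splitting the file into parts. The sketch below covers only that client-side preparation and makes no HTTP calls. The helper names are hypothetical, and 1-based part numbering is inferred from the `/part/1` example above.

```python
import hashlib
from typing import BinaryIO, Iterator


def sha256_of_stream(stream: BinaryIO, block_size: int = 1024 * 1024) -> str:
    """Hash a file-like object incrementally so large files need not fit in memory."""
    digest = hashlib.sha256()
    while block := stream.read(block_size):
        digest.update(block)
    return digest.hexdigest()


def plan_parts(size: int, chunk_size: int) -> Iterator[tuple[int, int, int]]:
    """Yield (part_number, offset, length) tuples covering `size` bytes.

    Assumption: part numbers start at 1; the final part may be shorter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    part_number = 1
    for offset in range(0, size, chunk_size):
        yield part_number, offset, min(chunk_size, size - offset)
        part_number += 1
```

With the values from the example (`size` 524288000 and `chunk_size` 10485760), the plan contains 50 equally sized parts. Each tuple tells the client which byte range to read and send to the matching `part/:part_number` URL.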

Download an Artifact

# By tag (use -OJ to save with the correct filename from Content-Disposition header)
curl -OJ http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# By artifact ID
curl -OJ http://localhost:8080/api/v1/project/my-project/releases/+/artifact:a3f5d8e12b4c6789...

# Using the short URL pattern
curl -OJ http://localhost:8080/project/my-project/releases/+/latest

# Save to a specific filename
curl -o myfile.tar.gz http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Partial download (range request)
curl -H "Range: bytes=0-1023" http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Check file info without downloading (HEAD request)
curl -I http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

Note on curl flags:

  • -O saves the file using the URL path as the filename (e.g., latest, v1.0.0)
  • -J tells curl to use the filename from the Content-Disposition header (e.g., app-v1.0.0.tar.gz)
  • -OJ combines both: download to a file using the server-provided filename
  • -o <filename> saves to a specific filename you choose
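
For the partial download example, the `Range: bytes=0-1023` header asks for an inclusive byte range. The sketch below shows how such a header can be resolved against a known file size, following the general HTTP range semantics. `parse_byte_range` is a hypothetical helper that handles a single range only, and it is not taken from Orchard's code.

```python
def parse_byte_range(header: str, size: int) -> tuple[int, int]:
    """Resolve a single "bytes=start-end" header to inclusive (start, end) offsets."""
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    start_text, _, end_text = spec.strip().partition("-")
    if start_text == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        length = int(end_text)
        if length <= 0:
            raise ValueError("range not satisfiable")
        start, end = max(size - length, 0), size - 1
    else:
        start = int(start_text)
        # Open-ended form "bytes=N-" runs to the end; clamp oversized ends.
        end = min(int(end_text), size - 1) if end_text else size - 1
    if start > end or start >= size:
        raise ValueError("range not satisfiable")
    return start, end
```

The returned offsets are inclusive, so `bytes=0-1023` covers exactly 1024 bytes.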

Create a Tag

curl -X POST http://localhost:8080/api/v1/project/my-project/releases/tags \
  -H "Content-Type: application/json" \
  -d '{"name": "stable", "artifact_id": "a3f5d8e12b4c6789..."}'

List Tags

# Basic listing with artifact metadata
curl http://localhost:8080/api/v1/project/my-project/releases/tags

# With pagination
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?page=1&limit=10"

# Search by tag name
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?search=v1"

# Sort by created_at descending
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?sort=created_at&order=desc"

Response includes artifact metadata:

{
  "items": [
    {
      "id": "uuid",
      "package_id": "uuid",
      "name": "v1.0.0",
      "artifact_id": "a3f5d8e...",
      "created_at": "2025-01-01T00:00:00Z",
      "created_by": "user",
      "artifact_size": 1048576,
      "artifact_content_type": "application/gzip",
      "artifact_original_name": "app-v1.0.0.tar.gz",
      "artifact_created_at": "2025-01-01T00:00:00Z",
      "artifact_format_metadata": {}
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}
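
The `pagination` object in list responses can be reproduced with a few lines of arithmetic. This is a sketch over an in-memory list, assuming 1-based pages and a ceiling division for `total_pages`. The real API presumably paginates in the database, and `paginate` is a hypothetical name.

```python
def paginate(items: list, page: int = 1, limit: int = 20) -> dict:
    """Return one page of items plus a pagination envelope."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive")
    total = len(items)
    total_pages = -(-total // limit)  # ceiling division without floats
    start = (page - 1) * limit
    return {
        "items": items[start:start + limit],
        "pagination": {
            "page": page,
            "limit": limit,
            "total": total,
            "total_pages": total_pages,
        },
    }
```

A single item with the default limit of 20 gives `total_pages` of 1, matching the sample response.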

Get Single Tag

curl http://localhost:8080/api/v1/project/my-project/releases/tags/v1.0.0

Get Tag History

curl http://localhost:8080/api/v1/project/my-project/releases/tags/latest/history

Returns the list of artifact changes for the tag, most recent first.

List Artifacts in Package

# Basic listing
curl http://localhost:8080/api/v1/project/my-project/releases/artifacts

# Filter by content type
curl "http://localhost:8080/api/v1/project/my-project/releases/artifacts?content_type=application/gzip"

# Filter by date range
curl "http://localhost:8080/api/v1/project/my-project/releases/artifacts?created_after=2025-01-01T00:00:00Z"

Response includes tags pointing to each artifact:

{
  "items": [
    {
      "id": "a3f5d8e...",
      "size": 1048576,
      "content_type": "application/gzip",
      "original_name": "app-v1.0.0.tar.gz",
      "created_at": "2025-01-01T00:00:00Z",
      "created_by": "user",
      "format_metadata": {},
      "tags": ["v1.0.0", "latest", "stable"]
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}

Get Artifact by ID

curl http://localhost:8080/api/v1/artifact/a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789

Response includes all tags/packages referencing the artifact:

{
  "id": "a3f5d8e...",
  "size": 1048576,
  "content_type": "application/gzip",
  "original_name": "app-v1.0.0.tar.gz",
  "created_at": "2025-01-01T00:00:00Z",
  "created_by": "user",
  "ref_count": 2,
  "format_metadata": {},
  "tags": [
    {
      "id": "uuid",
      "name": "v1.0.0",
      "package_id": "uuid",
      "package_name": "releases",
      "project_name": "my-project"
    }
  ]
}

Project Structure

orchard/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py           # Pydantic settings
│   │   ├── database.py         # SQLAlchemy setup and migrations
│   │   ├── main.py             # FastAPI application
│   │   ├── metadata.py         # Format-specific metadata extraction
│   │   ├── models.py           # SQLAlchemy models
│   │   ├── routes.py           # API endpoints
│   │   ├── schemas.py          # Pydantic schemas
│   │   └── storage.py          # S3 storage layer with multipart support
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/         # Reusable UI components
│   │   │   ├── Badge.tsx       # Status/type badges
│   │   │   ├── Breadcrumb.tsx  # Navigation breadcrumbs
│   │   │   ├── Card.tsx        # Card containers
│   │   │   ├── DataTable.tsx   # Sortable data tables
│   │   │   ├── FilterChip.tsx  # Active filter chips
│   │   │   ├── Pagination.tsx  # Page navigation
│   │   │   ├── SearchInput.tsx # Debounced search
│   │   │   └── SortDropdown.tsx# Sort field selector
│   │   ├── pages/              # Page components
│   │   │   ├── Home.tsx        # Project list
│   │   │   ├── ProjectPage.tsx # Package list within project
│   │   │   └── PackagePage.tsx # Tag/artifact list within package
│   │   ├── api.ts              # API client with pagination support
│   │   ├── types.ts            # TypeScript interfaces
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── index.html
│   ├── package.json
│   ├── tsconfig.json
│   └── vite.config.ts
├── helm/
│   └── orchard/                # Helm chart
├── Dockerfile                  # Multi-stage build (Node + Python)
├── docker-compose.yml          # Local development stack
└── .gitlab-ci.yml              # CI/CD pipeline

Configuration

Configuration is provided via environment variables prefixed with ORCHARD_:

| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| ORCHARD_SERVER_HOST | Server bind address | 0.0.0.0 |
| ORCHARD_SERVER_PORT | Server port | 8080 |
| ORCHARD_DATABASE_HOST | PostgreSQL host | localhost |
| ORCHARD_DATABASE_PORT | PostgreSQL port | 5432 |
| ORCHARD_DATABASE_USER | PostgreSQL user | orchard |
| ORCHARD_DATABASE_PASSWORD | PostgreSQL password | - |
| ORCHARD_DATABASE_DBNAME | PostgreSQL database | orchard |
| ORCHARD_S3_ENDPOINT | S3 endpoint URL | - |
| ORCHARD_S3_REGION | S3 region | us-east-1 |
| ORCHARD_S3_BUCKET | S3 bucket name | orchard-artifacts |
| ORCHARD_S3_ACCESS_KEY_ID | S3 access key | - |
| ORCHARD_S3_SECRET_ACCESS_KEY | S3 secret key | - |
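
As an illustration of how the `ORCHARD_` prefix maps onto settings, here is a standard-library sketch. The project itself uses Pydantic settings in `config.py`, so this dataclass version is a stand-in and not the actual implementation. Variables with no listed default are represented as empty strings, which is an assumption.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    server_host: str = "0.0.0.0"
    server_port: int = 8080
    database_host: str = "localhost"
    database_port: int = 5432
    database_user: str = "orchard"
    database_password: str = ""  # assumption: no default means empty
    database_dbname: str = "orchard"
    s3_endpoint: str = ""
    s3_region: str = "us-east-1"
    s3_bucket: str = "orchard-artifacts"
    s3_access_key_id: str = ""
    s3_secret_access_key: str = ""


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read ORCHARD_<FIELD> variables, falling back to the defaults above."""
    overrides = {}
    for f in fields(Settings):
        raw = env.get("ORCHARD_" + f.name.upper())
        if raw is not None:
            # Convert using the type of the default (int for ports, str otherwise).
            overrides[f.name] = type(f.default)(raw)
    return Settings(**overrides)
```

Passing a plain dictionary instead of `os.environ` makes the mapping easy to check, for example `load_settings({"ORCHARD_SERVER_PORT": "9090"})`.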

Kubernetes Deployment

Using Helm

# Add Bitnami repo for dependencies
helm repo add bitnami https://charts.bitnami.com/bitnami

# Update dependencies (run from the repository root)
helm dependency update ./helm/orchard

# Install
helm install orchard ./helm/orchard -n orchard --create-namespace

# Install with custom values
helm install orchard ./helm/orchard -f my-values.yaml

See helm/orchard/values.yaml for all configuration options.

Database Schema

Core Tables

  • projects - Top-level organizational containers
  • packages - Collections within projects
  • artifacts - Content-addressable artifacts (SHA256)
  • tags - Aliases pointing to artifacts
  • tag_history - Audit trail for tag changes
  • uploads - Upload event records
  • consumers - Dependency tracking
  • access_permissions - Project-level access control
  • api_keys - Programmatic access tokens
  • audit_logs - Immutable operation logs

Future Work

The following features are planned but not yet implemented:

  • CLI tool (orchard command)
  • Dependency file parsing
  • Lock file generation
  • Export/Import for air-gapped systems
  • Consumer notification
  • Automated update propagation
  • OIDC/SAML authentication
  • API key management
  • Redis caching layer
  • Garbage collection for orphaned artifacts

License

Internal use only.
