Orchard

Content-Addressable Storage System

Orchard is a centralized binary artifact storage system that provides content-addressable storage with automatic deduplication, flexible access control, and multi-format package support.

Tech Stack

  • Backend: Python 3.12 + FastAPI
  • Frontend: React 18 + TypeScript + Vite
  • Database: PostgreSQL 16
  • Object Storage: MinIO (S3-compatible)
  • Cache: Redis (for future use)

Features

Currently Implemented

  • Content-Addressable Storage - Artifacts are stored and referenced by their SHA256 hash, ensuring deduplication and data integrity
  • Project/Package/Artifact Hierarchy - Organized storage structure:
    • Project - Top-level organizational container
    • Package - Named collection within a project
    • Artifact - Specific content instance identified by SHA256
  • Tags - Alias system for referencing artifacts by human-readable names (e.g., v1.0.0, latest, stable)
  • Package Formats & Platforms - Packages can be tagged with format (npm, pypi, docker, deb, rpm, etc.) and platform (linux, darwin, windows, etc.)
  • Rich Package Metadata - Package listings include aggregated stats (tag count, artifact count, total size, latest tag)
  • S3-Compatible Backend - Uses MinIO (or any S3-compatible storage) for artifact storage
  • PostgreSQL Metadata - Relational database for metadata, access control, and audit trails
  • REST API - Full HTTP API for all operations
  • Web UI - React-based interface for managing artifacts with:
    • Hierarchical navigation (Projects → Packages → Tags/Artifacts)
    • Search, sort, and filter capabilities on all list views
    • URL-based state persistence for filters and pagination
    • Keyboard navigation (Backspace to go up hierarchy)
    • Copy-to-clipboard for artifact IDs
    • Responsive design for mobile and desktop
  • Docker Compose Setup - Easy local development environment
  • Helm Chart - Kubernetes deployment with PostgreSQL, MinIO, and Redis subcharts
  • Multipart Upload - Automatic multipart upload for files larger than 100MB
  • Resumable Uploads - API for resumable uploads with part-by-part upload support
  • Range Requests - HTTP range request support for partial downloads
  • Format-Specific Metadata - Automatic extraction of metadata from package formats:
    • .deb - Debian packages (name, version, architecture, maintainer)
    • .rpm - RPM packages (name, version, release, architecture)
    • .tar.gz/.tgz - Tarballs (name, version from filename)
    • .whl - Python wheels (name, version, author)
    • .jar - Java JARs (manifest info, Maven coordinates)
    • .zip - ZIP files (file count, uncompressed size)
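The content-addressable idea behind the feature list can be illustrated with a toy in-memory store. This is only a sketch under assumptions: `ContentStore` and its methods are hypothetical names, and the real backend keeps blobs in S3 and metadata in PostgreSQL rather than in a dictionary.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative, not Orchard's code)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> tuple[str, bool]:
        """Store data and return (artifact_id, deduplicated)."""
        # The artifact ID is the SHA256 hex digest of the content itself.
        artifact_id = hashlib.sha256(data).hexdigest()
        deduplicated = artifact_id in self._blobs
        if not deduplicated:
            self._blobs[artifact_id] = data
        return artifact_id, deduplicated

    def get(self, artifact_id: str) -> bytes:
        """Return the content for an ID, verifying that it still hashes to that ID."""
        data = self._blobs[artifact_id]
        if hashlib.sha256(data).hexdigest() != artifact_id:
            raise ValueError("stored content does not match its ID")
        return data
```

Storing the same bytes twice yields the same ID and a `deduplicated` flag of `True` on the second call, which mirrors the `deduplicated` field in the upload response.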

API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | / | Web UI |
| GET | /health | Health check |
| GET | /api/v1/projects | List all projects |
| POST | /api/v1/projects | Create a new project |
| GET | /api/v1/projects/:project | Get project details |
| GET | /api/v1/project/:project/packages | List packages (with pagination, search, filtering) |
| GET | /api/v1/project/:project/packages/:package | Get single package with metadata |
| POST | /api/v1/project/:project/packages | Create a new package |
| POST | /api/v1/project/:project/:package/upload | Upload an artifact |
| GET | /api/v1/project/:project/:package/+/:ref | Download an artifact (supports Range header, mode param) |
| GET | /api/v1/project/:project/:package/+/:ref/url | Get presigned URL for direct S3 download |
| HEAD | /api/v1/project/:project/:package/+/:ref | Get artifact metadata without downloading |
| GET | /api/v1/project/:project/:package/tags | List tags (with pagination, search, sorting, artifact metadata) |
| POST | /api/v1/project/:project/:package/tags | Create a tag |
| GET | /api/v1/project/:project/:package/tags/:tag_name | Get single tag with artifact metadata |
| GET | /api/v1/project/:project/:package/tags/:tag_name/history | Get tag change history |
| GET | /api/v1/project/:project/:package/artifacts | List artifacts in package (with filtering) |
| GET | /api/v1/project/:project/:package/consumers | List consumers of a package |
| GET | /api/v1/artifact/:id | Get artifact metadata with referencing tags |

Resumable Upload Endpoints

For large files, use the resumable upload API:

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /api/v1/project/:project/:package/upload/init | Initialize resumable upload |
| PUT | /api/v1/project/:project/:package/upload/:upload_id/part/:part_number | Upload a part |
| POST | /api/v1/project/:project/:package/upload/:upload_id/complete | Complete upload |
| DELETE | /api/v1/project/:project/:package/upload/:upload_id | Abort upload |
| GET | /api/v1/project/:project/:package/upload/:upload_id/status | Get upload status |

Reference Formats

When downloading artifacts, the :ref parameter supports multiple formats:

  • latest - Tag name directly
  • v1.0.0 - Version tag
  • tag:stable - Explicit tag reference
  • version:2024.1 - Version reference
  • artifact:a3f5d8e12b4c6789... - Direct SHA256 hash reference
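One way to picture how these reference formats could be told apart is the sketch below. It is an assumption, not the server's actual resolver: `parse_ref` is a hypothetical helper, and treating every unprefixed reference as a tag name is a guess based on the list above.

```python
def parse_ref(ref: str) -> tuple[str, str]:
    """Split a download reference into (kind, value).

    Recognized prefixes are the ones listed in this README: tag, version, artifact.
    Anything without a recognized prefix is assumed to be a tag name.
    """
    if not ref:
        raise ValueError("empty reference")
    kind, sep, value = ref.partition(":")
    if sep and kind in ("tag", "version", "artifact"):
        if not value:
            raise ValueError(f"missing value after '{kind}:'")
        return kind, value
    # Unprefixed references such as "latest" or "v1.0.0" (assumed to be tags).
    return "tag", ref
```

For example, `parse_ref("tag:stable")` gives `("tag", "stable")` and `parse_ref("latest")` gives `("tag", "latest")`.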

Quick Start

Prerequisites

  • Docker and Docker Compose

Running Locally

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f orchard-server

# Stop services
docker-compose down

Services

| Service | Port | Description |
|---------|------|-------------|
| orchard-server | 8080 | Main API server and Web UI |
| postgres | 5432 | PostgreSQL database |
| minio | 9000 | S3-compatible object storage |
| minio (console) | 9001 | MinIO web console |
| redis | 6379 | Cache (for future use) |

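Access Points

  • Web UI: http://localhost:8080
  • API: http://localhost:8080/api/v1
  • MinIO console: http://localhost:9001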

Development

Backend (FastAPI)

cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080

Frontend (React)

cd frontend
npm install
npm run dev

The frontend dev server proxies API requests to localhost:8080.

Usage Examples

Create a Project

curl -X POST http://localhost:8080/api/v1/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "my-project", "description": "My project artifacts", "is_public": true}'

Create a Package

curl -X POST http://localhost:8080/api/v1/project/my-project/packages \
  -H "Content-Type: application/json" \
  -d '{"name": "releases", "description": "Release builds", "format": "generic", "platform": "any"}'

Supported formats: generic, npm, pypi, docker, deb, rpm, maven, nuget, helm

Supported platforms: any, linux, darwin, windows, linux-amd64, linux-arm64, darwin-amd64, darwin-arm64, windows-amd64

List Packages

# Basic listing
curl http://localhost:8080/api/v1/project/my-project/packages

# With pagination
curl "http://localhost:8080/api/v1/project/my-project/packages?page=1&limit=10"

# With search
curl "http://localhost:8080/api/v1/project/my-project/packages?search=release"

# With sorting
curl "http://localhost:8080/api/v1/project/my-project/packages?sort=created_at&order=desc"

# Filter by format/platform
curl "http://localhost:8080/api/v1/project/my-project/packages?format=npm&platform=linux"

Response includes aggregated metadata:

{
  "items": [
    {
      "id": "uuid",
      "name": "releases",
      "description": "Release builds",
      "format": "generic",
      "platform": "any",
      "tag_count": 5,
      "artifact_count": 3,
      "total_size": 1048576,
      "latest_tag": "v1.0.0",
      "latest_upload_at": "2025-01-01T00:00:00Z",
      "recent_tags": [...]
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}
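A client that needs every item from a paginated listing can walk the pages using the `pagination` object shown above. The sketch below is a minimal illustration: `iter_items` and the injected `fetch_page` callable are hypothetical names, and the only thing assumed about the API is the `items` / `pagination.total_pages` shape of the response.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_items(fetch_page: Callable[[int], Page]) -> Iterator[dict[str, Any]]:
    """Yield items from every page of a paginated listing.

    fetch_page(n) must return the decoded JSON body for page n, in the
    {"items": [...], "pagination": {...}} shape shown in this README.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["items"]
        # Stop once the last page has been consumed (also covers empty listings).
        if page >= body["pagination"]["total_pages"]:
            break
        page += 1
```

Against a running server, `fetch_page` could wrap an HTTP GET on `/api/v1/project/my-project/packages?page=<n>&limit=20` and decode the JSON body. Keeping the transport injectable makes the loop easy to test without a network.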

Get Single Package

curl http://localhost:8080/api/v1/project/my-project/packages/releases

# Include all tags (not just recent 5)
curl "http://localhost:8080/api/v1/project/my-project/packages/releases?include_tags=true"

Upload an Artifact

curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload \
  -F "file=@./build/app-v1.0.0.tar.gz" \
  -F "tag=v1.0.0"

Response:

{
  "artifact_id": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789",
  "size": 1048576,
  "project": "my-project",
  "package": "releases",
  "tag": "v1.0.0",
  "format_metadata": {
    "format": "tarball",
    "package_name": "app",
    "version": "1.0.0"
  },
  "deduplicated": false
}
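The `format_metadata` in this response is consistent with deriving the name and version from the tarball's filename. A rough sketch of such a derivation follows. It is not the code in `metadata.py`: `parse_tarball_name` and its regular expression are assumptions chosen to reproduce the example above.

```python
import re
from typing import Optional

_TARBALL_SUFFIXES = (".tar.gz", ".tgz")
# name, a dash, an optional "v", then a version that starts with a digit.
_NAME_VERSION = re.compile(r"(?P<name>.+?)-v?(?P<version>\d[\w.+-]*)")


def parse_tarball_name(filename: str) -> Optional[dict[str, str]]:
    """Guess package name and version from a tarball filename, or return None."""
    for suffix in _TARBALL_SUFFIXES:
        if filename.endswith(suffix):
            stem = filename[: -len(suffix)]
            break
    else:
        return None  # not a .tar.gz / .tgz file
    match = _NAME_VERSION.fullmatch(stem)
    if match is None:
        return None  # no recognizable "-<version>" part
    return {
        "format": "tarball",
        "package_name": match.group("name"),
        "version": match.group("version"),
    }
```

With this sketch, `parse_tarball_name("app-v1.0.0.tar.gz")` yields the same three fields as the `format_metadata` object in the response above.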

Resumable Upload (for large files)

For files larger than 100MB, use the resumable upload API:

# 1. Initialize upload (client must compute SHA256 hash first)
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/init \
  -H "Content-Type: application/json" \
  -d '{
    "expected_hash": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789",
    "filename": "large-file.tar.gz",
    "size": 524288000,
    "tag": "v2.0.0"
  }'

# Response: {"upload_id": "abc123", "already_exists": false, "chunk_size": 10485760}

# 2. Upload parts (10MB chunks recommended)
curl -X PUT http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/part/1 \
  --data-binary @chunk1.bin

# 3. Complete the upload
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/complete \
  -H "Content-Type: application/json" \
  -d '{"tag": "v2.0.0"}'
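Before calling the resumable upload endpoints, a client has to compute the file's SHA256 hash and decide how to cut the file into parts. The sketch below covers only that client-side preparation. `sha256_stream` and `plan_parts` are hypothetical helpers, and it assumes that part numbers start at 1 and that every part except the last uses the `chunk_size` returned by the init call.

```python
import hashlib
import io


def sha256_stream(stream: io.BufferedIOBase, block_size: int = 1024 * 1024) -> str:
    """Hash a binary stream incrementally so large files never sit fully in memory."""
    digest = hashlib.sha256()
    while True:
        block = stream.read(block_size)
        if not block:
            break
        digest.update(block)
    return digest.hexdigest()


def plan_parts(size: int, chunk_size: int) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) for each part of a file of `size` bytes."""
    if size < 0 or chunk_size <= 0:
        raise ValueError("size must be >= 0 and chunk_size must be > 0")
    parts = []
    offset = 0
    part_number = 1  # assumed 1-based, as in the /part/1 example above
    while offset < size:
        length = min(chunk_size, size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

For the 524288000-byte file in the example and a 10485760-byte chunk size, `plan_parts` produces 50 equal parts. Each tuple tells the client which byte range to read and send to `/upload/:upload_id/part/:part_number`.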

Download an Artifact

# By tag (use -OJ to save with the correct filename from Content-Disposition header)
curl -OJ http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# By artifact ID
curl -OJ http://localhost:8080/api/v1/project/my-project/releases/+/artifact:a3f5d8e12b4c6789...

# Using the short URL pattern
curl -OJ http://localhost:8080/project/my-project/releases/+/latest

# Save to a specific filename
curl -o myfile.tar.gz http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Partial download (range request)
curl -H "Range: bytes=0-1023" http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Check file info without downloading (HEAD request)
curl -I http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Download with specific mode (presigned, redirect, or proxy)
curl "http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0?mode=proxy"

# Get presigned URL for direct S3 download
curl http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0/url

Note on curl flags:

  • -O saves the file under the last segment of the URL path (e.g., latest, v1.0.0)
  • -J tells curl to use the filename from the Content-Disposition header (e.g., app-v1.0.0.tar.gz)
  • -OJ combines both: download to a file using the server-provided filename
  • -o <filename> saves to a specific filename you choose

Download Modes

Orchard supports three download modes, configurable via ORCHARD_DOWNLOAD_MODE or per-request with ?mode=:

| Mode | Description | Use Case |
|------|-------------|----------|
| presigned (default) | Returns JSON with a presigned S3 URL | Clients that handle redirects themselves, web UIs |
| redirect | Returns HTTP 302 redirect to presigned S3 URL | Simple clients, browsers, wget |
| proxy | Streams content through the backend | When S3 isn't directly accessible to clients |

Presigned URL Response:

{
  "url": "https://minio.example.com/bucket/...",
  "expires_at": "2025-01-01T01:00:00Z",
  "method": "GET",
  "artifact_id": "a3f5d8e...",
  "size": 1048576,
  "content_type": "application/gzip",
  "original_name": "app-v1.0.0.tar.gz",
  "checksum_sha256": "a3f5d8e...",
  "checksum_md5": "d41d8cd..."
}

Note: For presigned URLs to work, clients must be able to reach the S3 endpoint directly. In Kubernetes, this requires exposing MinIO via ingress (see Helm configuration below).
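A client that selects the mode per request only needs to append `?mode=` to the download URL. The helper below is a small sketch of that. `build_download_url` is a hypothetical name and the percent-encoding choices are assumptions. The path layout and the three mode names come from this README.

```python
from typing import Optional
from urllib.parse import quote, urlencode

DOWNLOAD_MODES = ("presigned", "redirect", "proxy")


def build_download_url(
    base_url: str, project: str, package: str, ref: str, mode: Optional[str] = None
) -> str:
    """Build the download URL for a reference, optionally forcing a download mode."""
    if mode is not None and mode not in DOWNLOAD_MODES:
        raise ValueError(f"unknown download mode: {mode!r}")
    path = "/api/v1/project/{}/{}/+/{}".format(
        quote(project, safe=""),
        quote(package, safe=""),
        quote(ref, safe=":"),  # keep the colon in refs such as "tag:stable"
    )
    url = base_url.rstrip("/") + path
    if mode is not None:
        url += "?" + urlencode({"mode": mode})
    return url
```

Leaving `mode` unset lets the server fall back to whatever `ORCHARD_DOWNLOAD_MODE` is configured to.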

Create a Tag

curl -X POST http://localhost:8080/api/v1/project/my-project/releases/tags \
  -H "Content-Type: application/json" \
  -d '{"name": "stable", "artifact_id": "a3f5d8e12b4c6789..."}'

List Tags

# Basic listing with artifact metadata
curl http://localhost:8080/api/v1/project/my-project/releases/tags

# With pagination
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?page=1&limit=10"

# Search by tag name
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?search=v1"

# Sort by created_at descending
curl "http://localhost:8080/api/v1/project/my-project/releases/tags?sort=created_at&order=desc"

Response includes artifact metadata:

{
  "items": [
    {
      "id": "uuid",
      "package_id": "uuid",
      "name": "v1.0.0",
      "artifact_id": "a3f5d8e...",
      "created_at": "2025-01-01T00:00:00Z",
      "created_by": "user",
      "artifact_size": 1048576,
      "artifact_content_type": "application/gzip",
      "artifact_original_name": "app-v1.0.0.tar.gz",
      "artifact_created_at": "2025-01-01T00:00:00Z",
      "artifact_format_metadata": {}
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}

Get Single Tag

curl http://localhost:8080/api/v1/project/my-project/releases/tags/v1.0.0

Get Tag History

curl http://localhost:8080/api/v1/project/my-project/releases/tags/latest/history

Returns the list of artifact changes for the tag, most recent first.

List Artifacts in Package

# Basic listing
curl http://localhost:8080/api/v1/project/my-project/releases/artifacts

# Filter by content type
curl "http://localhost:8080/api/v1/project/my-project/releases/artifacts?content_type=application/gzip"

# Filter by date range
curl "http://localhost:8080/api/v1/project/my-project/releases/artifacts?created_after=2025-01-01T00:00:00Z"

Response includes tags pointing to each artifact:

{
  "items": [
    {
      "id": "a3f5d8e...",
      "size": 1048576,
      "content_type": "application/gzip",
      "original_name": "app-v1.0.0.tar.gz",
      "created_at": "2025-01-01T00:00:00Z",
      "created_by": "user",
      "format_metadata": {},
      "tags": ["v1.0.0", "latest", "stable"]
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}

Get Artifact by ID

curl http://localhost:8080/api/v1/artifact/a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef123456789

Response includes all tags/packages referencing the artifact:

{
  "id": "a3f5d8e...",
  "size": 1048576,
  "content_type": "application/gzip",
  "original_name": "app-v1.0.0.tar.gz",
  "created_at": "2025-01-01T00:00:00Z",
  "created_by": "user",
  "ref_count": 2,
  "format_metadata": {},
  "tags": [
    {
      "id": "uuid",
      "name": "v1.0.0",
      "package_id": "uuid",
      "package_name": "releases",
      "project_name": "my-project"
    }
  ]
}

Project Structure

orchard/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py           # Pydantic settings
│   │   ├── database.py         # SQLAlchemy setup and migrations
│   │   ├── main.py             # FastAPI application
│   │   ├── metadata.py         # Format-specific metadata extraction
│   │   ├── models.py           # SQLAlchemy models
│   │   ├── routes.py           # API endpoints
│   │   ├── schemas.py          # Pydantic schemas
│   │   └── storage.py          # S3 storage layer with multipart support
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/         # Reusable UI components
│   │   │   ├── Badge.tsx       # Status/type badges
│   │   │   ├── Breadcrumb.tsx  # Navigation breadcrumbs
│   │   │   ├── Card.tsx        # Card containers
│   │   │   ├── DataTable.tsx   # Sortable data tables
│   │   │   ├── FilterChip.tsx  # Active filter chips
│   │   │   ├── Pagination.tsx  # Page navigation
│   │   │   ├── SearchInput.tsx # Debounced search
│   │   │   └── SortDropdown.tsx# Sort field selector
│   │   ├── pages/              # Page components
│   │   │   ├── Home.tsx        # Project list
│   │   │   ├── ProjectPage.tsx # Package list within project
│   │   │   └── PackagePage.tsx # Tag/artifact list within package
│   │   ├── api.ts              # API client with pagination support
│   │   ├── types.ts            # TypeScript interfaces
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── index.html
│   ├── package.json
│   ├── tsconfig.json
│   └── vite.config.ts
├── helm/
│   └── orchard/                # Helm chart
├── Dockerfile                  # Multi-stage build (Node + Python)
├── docker-compose.yml          # Local development stack
└── .gitlab-ci.yml              # CI/CD pipeline

Configuration

Configuration is provided via environment variables prefixed with ORCHARD_:

| Environment Variable | Description | Default |
|----------------------|-------------|---------|
| ORCHARD_SERVER_HOST | Server bind address | 0.0.0.0 |
| ORCHARD_SERVER_PORT | Server port | 8080 |
| ORCHARD_DATABASE_HOST | PostgreSQL host | localhost |
| ORCHARD_DATABASE_PORT | PostgreSQL port | 5432 |
| ORCHARD_DATABASE_USER | PostgreSQL user | orchard |
| ORCHARD_DATABASE_PASSWORD | PostgreSQL password | - |
| ORCHARD_DATABASE_DBNAME | PostgreSQL database | orchard |
| ORCHARD_S3_ENDPOINT | S3 endpoint URL | - |
| ORCHARD_S3_REGION | S3 region | us-east-1 |
| ORCHARD_S3_BUCKET | S3 bucket name | orchard-artifacts |
| ORCHARD_S3_ACCESS_KEY_ID | S3 access key | - |
| ORCHARD_S3_SECRET_ACCESS_KEY | S3 secret key | - |
| ORCHARD_DOWNLOAD_MODE | Download mode: presigned, redirect, or proxy | presigned |
| ORCHARD_PRESIGNED_URL_EXPIRY | Presigned URL expiry in seconds | 3600 |
Kubernetes Deployment

Using Helm

# Add Bitnami repo for dependencies
helm repo add bitnami https://charts.bitnami.com/bitnami

# Update dependencies (run from the repository root)
helm dependency update ./helm/orchard

# Install
helm install orchard ./helm/orchard -n orchard --create-namespace

# Install with custom values
helm install orchard ./helm/orchard -f my-values.yaml

Helm Configuration

Key configuration options in values.yaml:

orchard:
  # Download configuration
  download:
    mode: "presigned"       # presigned, redirect, or proxy
    presignedUrlExpiry: 3600

# MinIO ingress (required for presigned URL downloads)
minioIngress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt"
  host: "minio.your-domain.com"
  tls:
    enabled: true
    secretName: minio-tls

When minioIngress.enabled is true, the S3 endpoint automatically uses the external URL (https://minio.your-domain.com), making presigned URLs accessible to external clients.

See helm/orchard/values.yaml for all configuration options.

Database Schema

Core Tables

  • projects - Top-level organizational containers
  • packages - Collections within projects
  • artifacts - Content-addressable artifacts (SHA256)
  • tags - Aliases pointing to artifacts
  • tag_history - Audit trail for tag changes
  • uploads - Upload event records
  • consumers - Dependency tracking
  • access_permissions - Project-level access control
  • api_keys - Programmatic access tokens
  • audit_logs - Immutable operation logs

Future Work

The following features are planned but not yet implemented:

  • CLI tool (orchard command)
  • Dependency file parsing
  • Lock file generation
  • Export/Import for air-gapped systems
  • Consumer notification
  • Automated update propagation
  • OIDC/SAML authentication
  • API key management
  • Redis caching layer
  • Garbage collection for orphaned artifacts

License

Internal use only.