# Orchard

**Content-Addressable Storage System**

Orchard is a centralized binary artifact storage system that provides content-addressable storage with automatic deduplication, flexible access control, and multi-format package support.

## Tech Stack

- **Backend**: Python 3.12 + FastAPI
- **Frontend**: React 18 + TypeScript + Vite
- **Database**: PostgreSQL 16
- **Object Storage**: MinIO (S3-compatible)
- **Cache**: Redis (for future use)

## Features

### Currently Implemented

- **Content-Addressable Storage** - Artifacts are stored and referenced by their SHA256 hash, ensuring deduplication and data integrity
- **Project/Package/Artifact Hierarchy** - Organized storage structure:
  - **Project** - Top-level organizational container
  - **Package** - Named collection within a project
  - **Artifact** - Specific content instance identified by SHA256
- **Tags** - Alias system for referencing artifacts by human-readable names (e.g., `v1.0.0`, `latest`, `stable`)
- **Package Formats & Platforms** - Packages can be tagged with format (npm, pypi, docker, deb, rpm, etc.) and platform (linux, darwin, windows, etc.)
- **Rich Package Metadata** - Package listings include aggregated stats (tag count, artifact count, total size, latest tag)
- **S3-Compatible Backend** - Uses MinIO (or any S3-compatible storage) for artifact storage
- **PostgreSQL Metadata** - Relational database for metadata, access control, and audit trails
- **REST API** - Full HTTP API for all operations
- **Web UI** - React-based interface for managing artifacts
- **Docker Compose Setup** - Easy local development environment
- **Helm Chart** - Kubernetes deployment with PostgreSQL, MinIO, and Redis subcharts
- **Multipart Upload** - Automatic multipart upload for files larger than 100MB
- **Resumable Uploads** - API for resumable uploads with part-by-part upload support
- **Range Requests** - HTTP range request support for partial downloads
- **Format-Specific Metadata** - Automatic extraction of metadata from package formats:
  - `.deb` - Debian packages (name, version, architecture, maintainer)
  - `.rpm` - RPM packages (name, version, release, architecture)
  - `.tar.gz/.tgz` - Tarballs (name, version from filename)
  - `.whl` - Python wheels (name, version, author)
  - `.jar` - Java JARs (manifest info, Maven coordinates)
  - `.zip` - ZIP files (file count, uncompressed size)

### API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/` | Web UI |
| `GET` | `/health` | Health check |
| `GET` | `/api/v1/projects` | List all projects |
| `POST` | `/api/v1/projects` | Create a new project |
| `GET` | `/api/v1/projects/:project` | Get project details |
| `GET` | `/api/v1/project/:project/packages` | List packages (with pagination, search, filtering) |
| `GET` | `/api/v1/project/:project/packages/:package` | Get single package with metadata |
| `POST` | `/api/v1/project/:project/packages` | Create a new package |
| `POST` | `/api/v1/project/:project/:package/upload` | Upload an artifact |
| `GET` | `/api/v1/project/:project/:package/+/:ref` | Download an artifact (supports Range header) |
| `HEAD` | `/api/v1/project/:project/:package/+/:ref` | Get artifact metadata without downloading |
| `GET` | `/api/v1/project/:project/:package/tags` | List all tags |
| `POST` | `/api/v1/project/:project/:package/tags` | Create a tag |
| `GET` | `/api/v1/project/:project/:package/consumers` | List consumers of a package |
| `GET` | `/api/v1/artifact/:id` | Get artifact metadata by hash |

#### Resumable Upload Endpoints

For large files, use the resumable upload API:

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/api/v1/project/:project/:package/upload/init` | Initialize resumable upload |
| `PUT` | `/api/v1/project/:project/:package/upload/:upload_id/part/:part_number` | Upload a part |
| `POST` | `/api/v1/project/:project/:package/upload/:upload_id/complete` | Complete upload |
| `DELETE` | `/api/v1/project/:project/:package/upload/:upload_id` | Abort upload |
| `GET` | `/api/v1/project/:project/:package/upload/:upload_id/status` | Get upload status |

### Reference Formats

When downloading artifacts, the `:ref` parameter supports multiple formats:

- `latest` - Tag name directly
- `v1.0.0` - Version tag
- `tag:stable` - Explicit tag reference
- `version:2024.1` - Version reference
- `artifact:a3f5d8e12b4c6789...` - Direct SHA256 hash reference

## Quick Start

### Prerequisites

- Docker and Docker Compose

### Running Locally

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f orchard-server

# Stop services
docker-compose down
```

### Services

| Service | Port | Description |
|---------|------|-------------|
| orchard-server | 8080 | Main API server and Web UI |
| postgres | 5432 | PostgreSQL database |
| minio | 9000 | S3-compatible object storage |
| minio (console) | 9001 | MinIO web console |
| redis | 6379 | Cache (for future use) |

### Access Points

- **Web UI**: http://localhost:8080
- **API**: http://localhost:8080/api/v1
- **API Docs**: http://localhost:8080/docs
- **MinIO Console**: http://localhost:9001 (user: `minioadmin`, pass: `minioadmin`)

## Development

### Backend (FastAPI)

```bash
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080
```

### Frontend (React)

```bash
cd frontend
npm install
npm run dev
```

The frontend dev server proxies API requests to `localhost:8080`.

## Usage Examples

### Create a Project

```bash
curl -X POST http://localhost:8080/api/v1/projects \
  -H "Content-Type: application/json" \
  -d '{"name": "my-project", "description": "My project artifacts", "is_public": true}'
```

### Create a Package

```bash
curl -X POST http://localhost:8080/api/v1/project/my-project/packages \
  -H "Content-Type: application/json" \
  -d '{"name": "releases", "description": "Release builds", "format": "generic", "platform": "any"}'
```

Supported formats: `generic`, `npm`, `pypi`, `docker`, `deb`, `rpm`, `maven`, `nuget`, `helm`

Supported platforms: `any`, `linux`, `darwin`, `windows`, `linux-amd64`, `linux-arm64`, `darwin-amd64`, `darwin-arm64`, `windows-amd64`

### List Packages

```bash
# Basic listing
curl http://localhost:8080/api/v1/project/my-project/packages

# With pagination
curl "http://localhost:8080/api/v1/project/my-project/packages?page=1&limit=10"

# With search
curl "http://localhost:8080/api/v1/project/my-project/packages?search=release"

# With sorting
curl "http://localhost:8080/api/v1/project/my-project/packages?sort=created_at&order=desc"

# Filter by format/platform
curl "http://localhost:8080/api/v1/project/my-project/packages?format=npm&platform=linux"
```

Response includes aggregated metadata:

```json
{
  "items": [
    {
      "id": "uuid",
      "name": "releases",
      "description": "Release builds",
      "format": "generic",
      "platform": "any",
      "tag_count": 5,
      "artifact_count": 3,
      "total_size": 1048576,
      "latest_tag": "v1.0.0",
      "latest_upload_at": "2025-01-01T00:00:00Z",
      "recent_tags": [...]
    }
  ],
  "pagination": {"page": 1, "limit": 20, "total": 1, "total_pages": 1}
}
```

### Get Single Package

```bash
curl http://localhost:8080/api/v1/project/my-project/packages/releases

# Include all tags (not just recent 5)
curl "http://localhost:8080/api/v1/project/my-project/packages/releases?include_tags=true"
```

### Upload an Artifact

```bash
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload \
  -F "file=@./build/app-v1.0.0.tar.gz" \
  -F "tag=v1.0.0"
```

Response:

```json
{
  "artifact_id": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef12345678",
  "size": 1048576,
  "project": "my-project",
  "package": "releases",
  "tag": "v1.0.0",
  "format_metadata": {
    "format": "tarball",
    "package_name": "app",
    "version": "1.0.0"
  },
  "deduplicated": false
}
```

### Resumable Upload (for large files)

For files larger than 100MB, use the resumable upload API:

```bash
# 1. Initialize upload (client must compute SHA256 hash first)
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/init \
  -H "Content-Type: application/json" \
  -d '{
    "expected_hash": "a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef12345678",
    "filename": "large-file.tar.gz",
    "size": 524288000,
    "tag": "v2.0.0"
  }'
# Response: {"upload_id": "abc123", "already_exists": false, "chunk_size": 10485760}

# 2. Upload parts (10MB chunks recommended)
curl -X PUT http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/part/1 \
  --data-binary @chunk1.bin

# 3. Complete the upload
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/upload/abc123/complete \
  -H "Content-Type: application/json" \
  -d '{"tag": "v2.0.0"}'
```

### Download an Artifact

```bash
# By tag
curl -O http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# By artifact ID
curl -O http://localhost:8080/api/v1/project/my-project/releases/+/artifact:a3f5d8e12b4c6789...

# Using the short URL pattern
curl -O http://localhost:8080/project/my-project/releases/+/latest

# Partial download (range request)
curl -H "Range: bytes=0-1023" http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0

# Check file info without downloading (HEAD request)
curl -I http://localhost:8080/api/v1/project/my-project/releases/+/v1.0.0
```

### Create a Tag

```bash
curl -X POST http://localhost:8080/api/v1/project/my-project/releases/tags \
  -H "Content-Type: application/json" \
  -d '{"name": "stable", "artifact_id": "a3f5d8e12b4c6789..."}'
```

### Get Artifact by ID

```bash
curl http://localhost:8080/api/v1/artifact/a3f5d8e12b4c67890abcdef1234567890abcdef1234567890abcdef12345678
```

## Project Structure

```
orchard/
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py            # Pydantic settings
│   │   ├── database.py          # SQLAlchemy setup and migrations
│   │   ├── main.py              # FastAPI application
│   │   ├── metadata.py          # Format-specific metadata extraction
│   │   ├── models.py            # SQLAlchemy models
│   │   ├── routes.py            # API endpoints
│   │   ├── schemas.py           # Pydantic schemas
│   │   └── storage.py           # S3 storage layer with multipart support
│   └── requirements.txt
├── frontend/
│   ├── src/
│   │   ├── components/          # React components
│   │   ├── pages/               # Page components
│   │   ├── api.ts               # API client
│   │   ├── types.ts             # TypeScript types
│   │   ├── App.tsx
│   │   └── main.tsx
│   ├── index.html
│   ├── package.json
│   ├── tsconfig.json
│   └── vite.config.ts
├── helm/
│   └── orchard/                 # Helm chart
├── Dockerfile                   # Multi-stage build (Node + Python)
├── docker-compose.yml           # Local development stack
└── .gitlab-ci.yml               # CI/CD pipeline
```

## Configuration

Configuration is provided via environment variables prefixed with `ORCHARD_`:

| Environment Variable | Description | Default |
|---------------------|-------------|---------|
| `ORCHARD_SERVER_HOST` | Server bind address | `0.0.0.0` |
| `ORCHARD_SERVER_PORT` | Server port | `8080` |
| `ORCHARD_DATABASE_HOST` | PostgreSQL host | `localhost` |
| `ORCHARD_DATABASE_PORT` | PostgreSQL port | `5432` |
| `ORCHARD_DATABASE_USER` | PostgreSQL user | `orchard` |
| `ORCHARD_DATABASE_PASSWORD` | PostgreSQL password | - |
| `ORCHARD_DATABASE_DBNAME` | PostgreSQL database | `orchard` |
| `ORCHARD_S3_ENDPOINT` | S3 endpoint URL | - |
| `ORCHARD_S3_REGION` | S3 region | `us-east-1` |
| `ORCHARD_S3_BUCKET` | S3 bucket name | `orchard-artifacts` |
| `ORCHARD_S3_ACCESS_KEY_ID` | S3 access key | - |
| `ORCHARD_S3_SECRET_ACCESS_KEY` | S3 secret key | - |

## Kubernetes Deployment

### Using Helm

Run these commands from the repository root:

```bash
# Add Bitnami repo for dependencies
helm repo add bitnami https://charts.bitnami.com/bitnami

# Update dependencies
helm dependency update ./helm/orchard

# Install
helm install orchard ./helm/orchard -n orchard --create-namespace

# Install with custom values
helm install orchard ./helm/orchard -f my-values.yaml
```

See `helm/orchard/values.yaml` for all configuration options.

## Database Schema

### Core Tables

- **projects** - Top-level organizational containers
- **packages** - Collections within projects
- **artifacts** - Content-addressable artifacts (SHA256)
- **tags** - Aliases pointing to artifacts
- **tag_history** - Audit trail for tag changes
- **uploads** - Upload event records
- **consumers** - Dependency tracking
- **access_permissions** - Project-level access control
- **api_keys** - Programmatic access tokens
- **audit_logs** - Immutable operation logs

## Future Work

The following features are planned but not yet implemented:

- [ ] CLI tool (`orchard` command)
- [ ] Dependency file parsing
- [ ] Lock file generation
- [ ] Export/Import for air-gapped systems
- [ ] Consumer notification
- [ ] Automated update propagation
- [ ] OIDC/SAML authentication
- [ ] API key management
- [ ] Redis caching layer
- [ ] Garbage collection for orphaned artifacts

## License

Internal use only.
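
## Appendix: Preparing a Resumable Upload (Sketch)

The resumable upload flow under Usage Examples expects the client to compute the SHA256 hash up front and to send the file in parts. The following is a minimal shell sketch of that client-side preparation; it is not part of Orchard itself. It assumes GNU coreutils (`sha256sum`, and `split` with `--numeric-suffixes`) plus `curl`. The function names (`file_sha256`, `file_size`, `split_parts`, `upload_parts`) are hypothetical helpers, and the assumptions that parts are numbered from 1 and sent as raw request bodies are inferred from the `part/1` and `--data-binary` example, not from a specification.

```shell
# Hypothetical helpers for preparing a resumable upload.
# Assumes GNU coreutils and curl; none of these names come from Orchard.

# Print the lowercase hex SHA256 of a file (the value sent as "expected_hash").
file_sha256() {
  sha256sum "$1" | cut -d' ' -f1
}

# Print the file size in bytes (the value sent as "size").
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Split a file into fixed-size parts named <prefix>0001, <prefix>0002, ...
# Arguments: $1 = file, $2 = chunk size in bytes, $3 = output prefix.
split_parts() {
  split -b "$2" -a 4 --numeric-suffixes=1 "$1" "$3"
}

# Upload every part in order with PUT requests.
# Arguments: $1 = package base URL, $2 = upload_id, $3 = part prefix.
# Zero-padded suffixes make the glob expand in numeric order.
upload_parts() {
  n=1
  for part in "$3"*; do
    [ -e "$part" ] || continue
    curl -fsS -X PUT "$1/upload/$2/part/$n" --data-binary @"$part"
    n=$((n + 1))
  done
}

# Example (not run here; "abc123" stands for the upload_id returned by /upload/init):
#   hash=$(file_sha256 large-file.tar.gz)
#   size=$(file_size large-file.tar.gz)
#   split_parts large-file.tar.gz 10485760 part_
#   upload_parts http://localhost:8080/api/v1/project/my-project/releases abc123 part_
```

The chunk size of 10485760 bytes mirrors the `chunk_size` shown in the example init response; a real client would read that value from the response instead of hard-coding it.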