Add schema enhancements for uploads, artifacts, and audit tracking

- Add format and platform fields to packages table
- Add checksum_md5 and metadata JSONB to artifacts with CHECK constraints
- Add updated_at and composite index to tags table
- Add tag_name, user_agent, duration_ms, deduplicated, checksum_verified to uploads
- Add change_type field to tag_history table
- Add composite indexes and GIN index to audit_logs
- Add partial index for public projects
- Add triggers for ref_count accuracy and updated_at timestamps
- Create migration script (002) for existing databases
Mondo Diaz
2025-12-12 15:05:24 -06:00
parent 4afcdf5cda
commit 74d1081666
5 changed files with 461 additions and 5 deletions

CLAUDE.md Normal file

@@ -0,0 +1,167 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## ⚠️ BEFORE STARTING ANY NEW ISSUE
**ALWAYS run `/start-issue` before beginning work on a new task.** This ensures you have the latest code and prevents merge conflicts.
If you cannot use the slash command, manually run:
```bash
git checkout main && git pull
```
## Project Overview
Orchard is a content-addressable artifact storage system. Artifacts are stored and referenced by their SHA256 hash, enabling automatic deduplication. The system uses a Project → Package → Artifact hierarchy with Tags as human-readable aliases.
## Tech Stack
- **Backend**: Python 3.12 + FastAPI + SQLAlchemy
- **Frontend**: React 18 + TypeScript + Vite
- **Database**: PostgreSQL 16
- **Object Storage**: MinIO (S3-compatible)
- **Cache**: Redis (configured but not yet used)
## Common Commands
### Local Development (Docker Compose)
```bash
# Start all services
docker-compose up -d
# View logs
docker-compose logs -f orchard-server
# Stop services
docker-compose down
```
### Backend Development
```bash
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080
```
### Frontend Development
```bash
cd frontend
npm install
npm run dev # Dev server with API proxy to localhost:8080
npm run build # Production build
```
### Building Docker Image
```bash
docker build -t orchard .
# With custom NPM registry
docker build --build-arg NPM_REGISTRY=https://registry.example.com -t orchard .
```
## Architecture
### Content-Addressable Storage
Artifacts use their SHA256 hash as the primary identifier. The storage flow:
1. Upload receives file → compute SHA256 → check if hash exists
2. If new: store in S3 at `fruits/{hash[:2]}/{hash[2:4]}/{hash}`
3. If duplicate: increment `ref_count`, skip S3 upload
4. Artifact ID = SHA256 hash (64-char hex string)
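The storage flow above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `backend/app/storage.py`: `InMemoryStore`, `object_key`, and `upload` are hypothetical names, the dictionaries stand in for S3 and the `artifacts` table, and starting `ref_count` at 1 on first upload is an assumption.

```python
import hashlib


def object_key(sha256_hex: str) -> str:
    """Build the key layout described above: fruits/{hash[:2]}/{hash[2:4]}/{hash}."""
    return f"fruits/{sha256_hex[:2]}/{sha256_hex[2:4]}/{sha256_hex}"


class InMemoryStore:
    """Hypothetical stand-in for the S3 bucket plus the artifacts table."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}   # object key -> content
        self.ref_counts: dict[str, int] = {}  # artifact ID -> ref_count

    def upload(self, content: bytes) -> tuple[str, bool]:
        """Store content and return (artifact_id, deduplicated)."""
        artifact_id = hashlib.sha256(content).hexdigest()  # 64-char hex string
        if artifact_id in self.ref_counts:
            # Duplicate: increment ref_count and skip the object write.
            self.ref_counts[artifact_id] += 1
            return artifact_id, True
        # New content: write the object once (assumed initial ref_count of 1).
        self.objects[object_key(artifact_id)] = content
        self.ref_counts[artifact_id] = 1
        return artifact_id, False
```

Uploading the same bytes twice returns the same artifact ID both times, with only one stored object and a `ref_count` of 2.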
### Key Backend Files
- `backend/app/routes.py` - All API endpoints
- `backend/app/storage.py` - S3 storage layer with deduplication logic
- `backend/app/models.py` - SQLAlchemy ORM models
- `backend/app/schemas.py` - Pydantic request/response schemas
- `backend/app/config.py` - Environment-based settings (ORCHARD_* prefix)
### API URL Patterns
- `/api/v1/search` - Global search across projects, packages, and artifacts
- `/api/v1/projects` - Project CRUD (supports `visibility`, `search`, `sort`, `order` params)
- `/api/v1/project/{project}/packages` - Package CRUD (supports `format`, `platform`, `search`, `sort`, `order` params)
- `/api/v1/project/{project}/{package}/upload` - Upload artifact (multipart form)
- `/api/v1/project/{project}/{package}/+/{ref}` - Download artifact
- `/api/v1/project/{project}/{package}/tags` - Tag management (search includes artifact filename)
The download `ref` parameter accepts a bare tag name, `tag:name`, or `artifact:sha256hash`.
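One way to interpret the three `ref` forms is sketched below. `parse_ref` is a hypothetical helper, not a function confirmed to exist in `backend/app/routes.py`, and treating an unrecognised prefix as part of a plain tag name is an assumption.

```python
def parse_ref(ref: str) -> tuple[str, str]:
    """Classify a download ref as ("tag", name) or ("artifact", sha256_hex)."""
    prefix, sep, value = ref.partition(":")
    if sep and prefix == "artifact":
        return "artifact", value
    if sep and prefix == "tag":
        return "tag", value
    # No recognised prefix: treat the whole string as a tag name (assumption).
    return "tag", ref
```

With this sketch, `parse_ref("v1.0")` and `parse_ref("tag:v1.0")` both resolve to the tag `v1.0`, while `artifact:` followed by a hash bypasses tag lookup.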
### Database Schema
Core tables in `migrations/001_initial.sql`:
- `projects` - Top-level containers
- `packages` - Collections within projects
- `artifacts` - Content-addressable storage (SHA256 as PK, includes ref_count)
- `tags` - Aliases pointing to artifacts
- `uploads` - Upload event log (tracks all uploads including duplicates)
### Frontend Structure
- `frontend/src/api.ts` - API client functions
- `frontend/src/types.ts` - TypeScript interfaces matching backend schemas
- `frontend/src/pages/` - Route components (Home, ProjectPage, PackagePage)
Frontend dev server proxies `/api`, `/health`, `/project` to backend at localhost:8080.
## Environment Variables
All config uses `ORCHARD_` prefix. Key variables:
- `ORCHARD_DATABASE_HOST/PORT/USER/PASSWORD/DBNAME`
- `ORCHARD_S3_ENDPOINT/BUCKET/ACCESS_KEY_ID/SECRET_ACCESS_KEY`
- `ORCHARD_S3_USE_PATH_STYLE=true` for MinIO
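As a rough illustration of the `ORCHARD_` prefix convention, the sketch below reads the database variables into a small settings object using only the standard library. `DatabaseSettings` and `load_database_settings` are hypothetical names, the real `backend/app/config.py` may be structured differently, and every default value here is a placeholder assumption (the port matches the Local Services table).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DatabaseSettings:
    host: str
    port: int
    user: str
    password: str
    dbname: str


def load_database_settings(env: Mapping[str, str] = os.environ) -> DatabaseSettings:
    """Read ORCHARD_DATABASE_* variables; defaults are illustrative placeholders."""
    prefix = "ORCHARD_DATABASE_"
    return DatabaseSettings(
        host=env.get(prefix + "HOST", "localhost"),
        port=int(env.get(prefix + "PORT", "5432")),
        user=env.get(prefix + "USER", "orchard"),
        password=env.get(prefix + "PASSWORD", ""),
        dbname=env.get(prefix + "DBNAME", "orchard"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.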
## Local Services
| Service | Port | Purpose |
|---------|------|---------|
| orchard-server | 8080 | API + Web UI |
| postgres | 5432 | Metadata |
| minio | 9000 | Object storage |
| minio console | 9001 | MinIO web UI (minioadmin/minioadmin) |
| redis | 6379 | Cache (future) |
## Git Workflow
The code flows through multiple remotes:
```
Local (this laptop) → Bitstorm → Work laptop → BSF (GitLab) → Work laptop → Bitstorm
```
1. **Local laptop**: Claude makes changes, pushes feature branch to Bitstorm
2. **Bitstorm**: Remote git repo (origin)
3. **Work laptop**: Fetch from Bitstorm, checkout feature branch
4. **BSF (GitLab at work)**: Push branch, merge to main (squash merge recommended)
5. **Work laptop**: Pull merged main from BSF
6. **Bitstorm**: Push updated main back to keep repos in sync
**Squash merging** is recommended when merging feature branches to main on BSF to keep history clean.
## Workflow Guidelines
- **Run `/start-issue` before starting any new issue** (or manually: `git checkout main && git pull`)
- Do NOT include AI attribution in commit messages (no "Generated with Claude Code", no "Co-Authored-By: Claude", etc.)
- Always update documentation as part of each completed piece of work
- Always output the complete acceptance criteria in a markdown code block so they can be copied, marking completed items as checked
  - Use `- [ ]` for unchecked items
  - Use `- [x]` for checked items
  - Example format:
    ```markdown
    ## Acceptance Criteria
    - [x] Completed item
    - [ ] Incomplete item
    ```
- **NEVER merge branches to main without explicit user approval** - If testing requires changes from another branch to be merged first, inform the user and wait for approval before merging. Do not merge branches autonomously.
- **Testing dependencies** - If you cannot fully test functionality because it depends on unmerged changes from another branch, clearly inform the user what is blocking testing and what needs to be merged first.
- **Always run linting before committing**:
  - Frontend: `cd frontend && npm run lint` (if available) or `npm run build` to catch TypeScript errors
  - Backend: `cd backend && python -m py_compile app/*.py` to check for syntax errors
- **Update CHANGELOG.md** - At the end of each issue, add entries under the `[Unreleased]` section with appropriate `Added`, `Changed`, `Fixed`, or `Removed` subsections. Ask the user for the GitLab issue number to include in the entry (e.g., `(#16)`).