# Warehouse13

**Enterprise Test Artifact Storage**

A lightweight, cloud-native API for storing and querying test artifacts, including CSV files, JSON files, binary files, and packet captures (PCAP). Built with FastAPI, it supports both AWS S3 and self-hosted MinIO storage backends.

## Features

- **Multi-format Support**: Store CSV, JSON, binary, and PCAP files
- **Flexible Storage**: Switch between AWS S3 and self-hosted MinIO
- **Rich Metadata**: Track test configurations, results, and custom metadata
- **Powerful Querying**: Query artifacts by test name, suite, result, tags, date ranges, and more
- **RESTful API**: Clean REST API with automatic OpenAPI documentation
- **Cloud-Native**: Fully containerized with Docker and Kubernetes/Helm support
- **Production-Ready**: Includes a GitLab CI/CD pipeline for automated deployments

## Architecture

```
┌─────────────┐
│   FastAPI   │  ← REST API
│   Backend   │
└──────┬──────┘
       │
       ├──────────────┐
       ↓              ↓
┌──────────┐   ┌────────────┐
│PostgreSQL│   │  S3/MinIO  │
│(Metadata)│   │  (Blobs)   │
└──────────┘   └────────────┘
```

- **PostgreSQL**: Stores artifact metadata, test configs, and query indexes
- **S3/MinIO**: Stores the actual file contents (blob storage)
- **FastAPI**: Async REST API for uploads, downloads, and queries

## Quick Start

### Standard Deployment (Internet Access)

**Linux/macOS:**

```bash
./quickstart.sh
```

**Windows (PowerShell):**

```powershell
.\quickstart.ps1
```

### Air-Gapped/Restricted Environment Deployment

**For environments with restricted npm access:**

```bash
./quickstart-airgap.sh
```

This script:

1. Builds Angular locally (where npm works)
2. Packages the pre-built files into Docker
3. Starts all services

See [DEPLOYMENT.md](docs/DEPLOYMENT.md) for detailed instructions.

### Manual Setup with Docker Compose

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd datalake
   ```

2. Copy the environment configuration:

   ```bash
   cp .env.example .env
   ```

3. Start all services:

   ```bash
   docker-compose up -d
   ```

4. Access the application:

   - **Web UI**: http://localhost:8000
   - **API Docs**: http://localhost:8000/docs
   - **MinIO Console**: http://localhost:9001

### Using Python Directly

1. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Set up PostgreSQL and MinIO/S3
3. Configure the environment variables in `.env`
4. Run the application:

   ```bash
   python -m uvicorn app.main:app --reload
   ```

## API Usage

### Upload an Artifact

```bash
curl -X POST "http://localhost:8000/api/v1/artifacts/upload" \
  -F "file=@test_results.csv" \
  -F "test_name=auth_test" \
  -F "test_suite=integration" \
  -F "test_result=pass" \
  -F 'test_config={"browser":"chrome","timeout":30}' \
  -F 'tags=["regression","smoke"]' \
  -F "description=Authentication test results"
```

### Query Artifacts

```bash
curl -X POST "http://localhost:8000/api/v1/artifacts/query" \
  -H "Content-Type: application/json" \
  -d '{
    "test_suite": "integration",
    "test_result": "fail",
    "start_date": "2024-01-01T00:00:00",
    "limit": 50
  }'
```

### Download an Artifact

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/123/download" \
  -o downloaded_file.csv
```

### Get a Presigned URL

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/123/url?expiration=3600"
```

### List All Artifacts

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/?limit=100&offset=0"
```

### Delete an Artifact

```bash
curl -X DELETE "http://localhost:8000/api/v1/artifacts/123"
```

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/artifacts/upload` | Upload a new artifact with metadata |
| GET | `/api/v1/artifacts/{id}` | Get artifact metadata by ID |
| GET | `/api/v1/artifacts/{id}/download` | Download the artifact file |
| GET | `/api/v1/artifacts/{id}/url` | Get a presigned download URL |
| DELETE | `/api/v1/artifacts/{id}` | Delete the artifact and its file |
| POST | `/api/v1/artifacts/query` | Query artifacts with filters |
| GET | `/api/v1/artifacts/` | List all artifacts (paginated) |
| GET | `/` | API information |
| GET | `/health` | Health check |
| GET | `/docs` | Interactive API documentation |

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgresql://user:password@localhost:5432/datalake` |
| `STORAGE_BACKEND` | Storage backend (`s3` or `minio`) | `minio` |
| `AWS_ACCESS_KEY_ID` | AWS access key (for S3) | - |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key (for S3) | - |
| `AWS_REGION` | AWS region (for S3) | `us-east-1` |
| `S3_BUCKET_NAME` | S3 bucket name | `test-artifacts` |
| `MINIO_ENDPOINT` | MinIO endpoint | `localhost:9000` |
| `MINIO_ACCESS_KEY` | MinIO access key | `minioadmin` |
| `MINIO_SECRET_KEY` | MinIO secret key | `minioadmin` |
| `MINIO_BUCKET_NAME` | MinIO bucket name | `test-artifacts` |
| `MINIO_SECURE` | Use HTTPS for MinIO | `false` |
| `API_HOST` | API host | `0.0.0.0` |
| `API_PORT` | API port | `8000` |
| `MAX_UPLOAD_SIZE` | Max upload size (bytes) | `524288000` (500 MB) |

### Switching Between S3 and MinIO

To use AWS S3:

```bash
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-bucket
```

To use self-hosted MinIO:

```bash
STORAGE_BACKEND=minio
MINIO_ENDPOINT=minio:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET_NAME=test-artifacts
```

## Deployment

### Kubernetes with Helm

**Quick Start:**

```bash
helm install warehouse13 ./helm/warehouse13 --namespace warehouse13 --create-namespace
```

**Production Deployment:**

```bash
helm install warehouse13 ./helm/warehouse13 \
  --namespace warehouse13 \
  --create-namespace \
  --values ./helm/warehouse13/values-production.yaml
```

**Air-Gapped Deployment:**

```bash
helm install warehouse13 ./helm/warehouse13 \
  --namespace warehouse13 \
  --create-namespace \
  --values ./helm/warehouse13/values-airgapped.yaml
```

**Access the Application:**

```bash
kubectl port-forward -n warehouse13 svc/warehouse13-frontend 4200:80
kubectl port-forward -n warehouse13 svc/warehouse13-api 8000:8000
```

### Helm Documentation

- **Full Helm Guide:** [HELM-DEPLOYMENT.md](./docs/HELM-DEPLOYMENT.md)
- **Chart README:** [helm/warehouse13/README.md](./helm/warehouse13/README.md)
- **Quick Start:** [helm/warehouse13/QUICKSTART.md](./helm/warehouse13/QUICKSTART.md)
- **Example Configurations:**
  - Development: [values-dev.yaml](./helm/warehouse13/values-dev.yaml)
  - Production: [values-production.yaml](./helm/warehouse13/values-production.yaml)
  - Air-Gapped: [values-airgapped.yaml](./helm/warehouse13/values-airgapped.yaml)

### Helm Configuration

All component images are fully configurable in `helm/warehouse13/values.yaml`:

- PostgreSQL image and version
- MinIO image and version
- API image and version
- Frontend image and version
- Resource limits and requests
- Storage backend configuration
- Ingress and TLS settings
- Persistence and storage classes

### GitLab CI/CD

The included `.gitlab-ci.yml` provides:

- Automated testing
- Linting
- Docker image builds
- Deployments to dev/staging/prod

**Required GitLab CI/CD Variables:**

- `CI_REGISTRY_USER`: Docker registry username
- `CI_REGISTRY_PASSWORD`: Docker registry password
- `KUBE_CONFIG_DEV`: Base64-encoded kubeconfig for dev
- `KUBE_CONFIG_STAGING`: Base64-encoded kubeconfig for staging
- `KUBE_CONFIG_PROD`: Base64-encoded kubeconfig for prod

## Database Schema

The `artifacts` table stores:

- File metadata (name, type, size, storage path)
- Test information (name, suite, config, result)
- Custom metadata and tags
- Timestamps and versioning

## Example Use Cases

### Store Test Results

Upload CSV files containing test execution results, with metadata about the test suite and configuration.

### Archive Packet Captures

Store PCAP files from network tests, with tags for easy filtering and retrieval.

### Track Test Configurations

Upload JSON test configurations and query them by date, test suite, or custom tags.
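The upload workflow behind these use cases can be sketched as a small Python client. The endpoint path and form-field names are taken from the curl examples earlier in this README; the base URL, the helper names, and the `requests` dependency are assumptions about a local deployment, not part of the project itself.

```python
import json

# Assumed base URL for the local Docker Compose deployment from the Quick Start.
BASE_URL = "http://localhost:8000/api/v1/artifacts"


def build_upload_fields(test_name, test_suite, test_result, config, tags):
    """Build the multipart form fields for POST /upload.

    Field names mirror the curl example in "Upload an Artifact"; structured
    values (test_config, tags) are sent as JSON-encoded strings, as in that
    example.
    """
    return {
        "test_name": test_name,
        "test_suite": test_suite,
        "test_result": test_result,
        "test_config": json.dumps(config),
        "tags": json.dumps(tags),
    }


def upload_artifact(path, fields):
    """Upload a file with its metadata (requires a running server).

    `requests` is third-party (pip install requests), imported lazily so the
    pure helpers above work without it.
    """
    import requests

    with open(path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/upload", files={"file": f}, data=fields)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    fields = build_upload_fields(
        "auth_test", "integration", "pass",
        {"browser": "chrome", "timeout": 30}, ["regression", "smoke"],
    )
    print(fields["test_config"])  # {"browser": "chrome", "timeout": 30}
```

Querying follows the same pattern: `requests.post(f"{BASE_URL}/query", json={...})` with the filter keys shown in the "Query Artifacts" example.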
### Binary Artifact Storage

Store compiled binaries, test data files, or any other binary artifacts with full metadata.

## Development

### Running Tests

```bash
pytest tests/ -v
```

### Code Formatting

```bash
black app/
flake8 app/
```

### Database Migrations

```bash
alembic revision --autogenerate -m "description"
alembic upgrade head
```

## Troubleshooting

### Cannot Connect to Database

- Verify that PostgreSQL is running
- Check that `DATABASE_URL` is correct
- Ensure the database exists

### Cannot Upload Files

- Check that the storage backend is running (MinIO or S3 accessible)
- Verify that the credentials are correct
- Check that the file size is under `MAX_UPLOAD_SIZE`

### MinIO Connection Failed

- Ensure the MinIO service is running
- Verify that `MINIO_ENDPOINT` is correct
- Check the MinIO credentials

## License

[Your License Here]

## Support

For issues and questions, please open an issue in the repository.