# Warehouse13

**Enterprise Test Artifact Storage**

A lightweight, cloud-native API for storing and querying test artifacts, including CSV files, JSON files, binary files, and packet captures (PCAP). It is built with FastAPI and supports both AWS S3 and self-hosted MinIO storage backends.

## Features

- **Multi-format Support**: Store CSV, JSON, binary files, and PCAP files
- **Flexible Storage**: Switch between AWS S3 and self-hosted MinIO
- **Rich Metadata**: Track test configurations, results, and custom metadata
- **Powerful Querying**: Query artifacts by test name, suite, result, tags, date ranges, and more
- **RESTful API**: Clean REST API with automatic OpenAPI documentation
- **Cloud-Native**: Fully containerized with Docker and Kubernetes/Helm support
- **Production-Ready**: Includes GitLab CI/CD pipeline for automated deployments

## Architecture

```
┌─────────────┐
│   FastAPI   │ ← REST API
│   Backend   │
└──────┬──────┘
       │
       ├─────────┐
       ↓         ↓
┌──────────┐ ┌────────────┐
│PostgreSQL│ │  S3/MinIO  │
│(Metadata)│ │  (Blobs)   │
└──────────┘ └────────────┘
```

- **PostgreSQL**: Stores artifact metadata, test configs, and query indexes
- **S3/MinIO**: Stores actual file contents (blob storage)
- **FastAPI**: Async REST API for uploads, downloads, and queries

## Quick Start

### Standard Deployment (Internet Access)

**Linux/macOS:**
```bash
./quickstart.sh
```

**Windows (PowerShell):**
```powershell
.\quickstart.ps1
```

### Air-Gapped/Restricted Environment Deployment

**For environments with restricted npm access:**
```bash
./quickstart-airgap.sh
```

This script:
1. Builds Angular locally (where npm works)
2. Packages pre-built files into Docker
3. Starts all services

See [DEPLOYMENT.md](docs/DEPLOYMENT.md) for detailed instructions.

### Manual Setup with Docker Compose

1. Clone the repository:
   ```bash
   git clone <repository-url>
   cd datalake
   ```

2. Copy environment configuration:
   ```bash
   cp .env.example .env
   ```

3. Start all services:
   ```bash
   docker-compose up -d
   ```

4. Access the application:
   - **Web UI**: http://localhost:8000
   - **API Docs**: http://localhost:8000/docs
   - **MinIO Console**: http://localhost:9001

### Using Python Directly

1. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

2. Set up PostgreSQL and MinIO/S3

3. Configure environment variables in `.env`

4. Run the application:
   ```bash
   python -m uvicorn app.main:app --reload
   ```

## API Usage

### Upload an Artifact

```bash
curl -X POST "http://localhost:8000/api/v1/artifacts/upload" \
  -F "file=@test_results.csv" \
  -F "test_name=auth_test" \
  -F "test_suite=integration" \
  -F "test_result=pass" \
  -F 'test_config={"browser":"chrome","timeout":30}' \
  -F 'tags=["regression","smoke"]' \
  -F "description=Authentication test results"
```

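In the curl call above, `test_config` and `tags` travel as JSON strings inside ordinary form fields. The following is a minimal Python sketch of how a client might prepare those text fields before handing them to an HTTP library's multipart support. `build_upload_fields` is a hypothetical helper, not part of this project, and the field names are simply copied from the curl example.

```python
import json


def build_upload_fields(test_name, test_suite, test_result,
                        test_config=None, tags=None, description=None):
    """Return the non-file form fields for POST /api/v1/artifacts/upload.

    Hypothetical helper: field names mirror the curl example; structured
    values are serialized to compact JSON strings, as in that example.
    """
    fields = {
        "test_name": test_name,
        "test_suite": test_suite,
        "test_result": test_result,
    }
    if test_config is not None:
        fields["test_config"] = json.dumps(test_config, separators=(",", ":"))
    if tags is not None:
        fields["tags"] = json.dumps(list(tags), separators=(",", ":"))
    if description is not None:
        fields["description"] = description
    return fields


fields = build_upload_fields(
    "auth_test", "integration", "pass",
    test_config={"browser": "chrome", "timeout": 30},
    tags=["regression", "smoke"],
    description="Authentication test results",
)
print(fields["test_config"])  # {"browser":"chrome","timeout":30}
print(fields["tags"])         # ["regression","smoke"]
```

The file itself (`file=@test_results.csv` in curl) would be attached separately as the multipart file part.
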
### Query Artifacts

```bash
curl -X POST "http://localhost:8000/api/v1/artifacts/query" \
  -H "Content-Type: application/json" \
  -d '{
    "test_suite": "integration",
    "test_result": "fail",
    "start_date": "2024-01-01T00:00:00",
    "limit": 50
  }'
```

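The same query can be prepared from Python using only the standard library. This sketch builds the request without sending it, so it runs without a server. `build_query_request` is a hypothetical helper, and treating `test_name` as an accepted filter key is an assumption based on the feature list rather than on a documented schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def build_query_request(filters, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/v1/artifacts/query."""
    # Drop unset filters so only explicit criteria reach the API.
    payload = {key: value for key, value in filters.items() if value is not None}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/artifacts/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request({
    "test_name": None,  # assumed filter key; omitted from the body when None
    "test_suite": "integration",
    "test_result": "fail",
    "start_date": "2024-01-01T00:00:00",
    "limit": 50,
})
print(request.get_method(), request.full_url)

# Sending requires a running server:
# with urllib.request.urlopen(request) as response:
#     artifacts = json.load(response)
```
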
### Download an Artifact

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/123/download" \
  -o downloaded_file.csv
```

### Get Presigned URL

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/123/url?expiration=3600"
```

### List All Artifacts

```bash
curl -X GET "http://localhost:8000/api/v1/artifacts/?limit=100&offset=0"
```

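To walk through every artifact rather than a single page, a client can keep advancing `offset` by `limit` until a short page comes back. The sketch below shows that loop in Python. It assumes the endpoint returns a plain list of items per page, which this README does not confirm, and it uses a stand-in `fake_fetch` function instead of a real HTTP call so that it runs without a server.

```python
def iter_artifacts(fetch_page, page_size=100):
    """Yield artifacts page by page using limit/offset pagination.

    fetch_page(limit, offset) is a caller-supplied function that performs
    GET /api/v1/artifacts/?limit=...&offset=... and returns a list of items.
    The list-shaped response is an assumption; adapt it to the real payload.
    """
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        # A page shorter than page_size means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += page_size


# Stand-in for the HTTP call so the sketch runs without a server.
fake_rows = [{"id": i} for i in range(250)]


def fake_fetch(limit, offset):
    return fake_rows[offset:offset + limit]


all_ids = [row["id"] for row in iter_artifacts(fake_fetch)]
print(len(all_ids))  # 250
```
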
### Delete an Artifact

```bash
curl -X DELETE "http://localhost:8000/api/v1/artifacts/123"
```

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/artifacts/upload` | Upload a new artifact with metadata |
| GET | `/api/v1/artifacts/{id}` | Get artifact metadata by ID |
| GET | `/api/v1/artifacts/{id}/download` | Download artifact file |
| GET | `/api/v1/artifacts/{id}/url` | Get presigned download URL |
| DELETE | `/api/v1/artifacts/{id}` | Delete artifact and file |
| POST | `/api/v1/artifacts/query` | Query artifacts with filters |
| GET | `/api/v1/artifacts/` | List all artifacts (paginated) |
| GET | `/` | API information |
| GET | `/health` | Health check |
| GET | `/docs` | Interactive API documentation |

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgresql://user:password@localhost:5432/datalake` |
| `STORAGE_BACKEND` | Storage backend (`s3` or `minio`) | `minio` |
| `AWS_ACCESS_KEY_ID` | AWS access key (for S3) | - |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key (for S3) | - |
| `AWS_REGION` | AWS region (for S3) | `us-east-1` |
| `S3_BUCKET_NAME` | S3 bucket name | `test-artifacts` |
| `MINIO_ENDPOINT` | MinIO endpoint | `localhost:9000` |
| `MINIO_ACCESS_KEY` | MinIO access key | `minioadmin` |
| `MINIO_SECRET_KEY` | MinIO secret key | `minioadmin` |
| `MINIO_BUCKET_NAME` | MinIO bucket name | `test-artifacts` |
| `MINIO_SECURE` | Use HTTPS for MinIO | `false` |
| `API_HOST` | API host | `0.0.0.0` |
| `API_PORT` | API port | `8000` |
| `MAX_UPLOAD_SIZE` | Max upload size (bytes) | `524288000` (500 MiB) |

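As an illustration of how these variables and defaults fit together, here is a small Python sketch that reads a subset of them from the environment. The `Settings` class and `load_settings` function are hypothetical names used for this example only; the application's actual configuration code may look different.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the variables in the table."""
    database_url: str
    storage_backend: str
    api_host: str
    api_port: int
    max_upload_size: int


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), using the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get(
            "DATABASE_URL", "postgresql://user:password@localhost:5432/datalake"
        ),
        storage_backend=env.get("STORAGE_BACKEND", "minio"),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        max_upload_size=int(env.get("MAX_UPLOAD_SIZE", "524288000")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
settings = load_settings({"API_PORT": "9000"})
print(settings.api_port, settings.storage_backend)  # 9000 minio
```
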
### Switching Between S3 and MinIO

To use AWS S3:
```bash
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=us-east-1
S3_BUCKET_NAME=your-bucket
```

To use self-hosted MinIO:
```bash
STORAGE_BACKEND=minio
MINIO_ENDPOINT=minio:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
MINIO_BUCKET_NAME=test-artifacts
```

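The switch between the two backends comes down to which group of variables is read. The Python sketch below illustrates one way that selection could work; `storage_config` and the keys of the dictionary it returns are hypothetical and are not taken from the project's code.

```python
def storage_config(env):
    """Pick the backend-specific settings based on STORAGE_BACKEND.

    Hypothetical helper: variable names and defaults come from the
    configuration table; the returned dictionary layout is illustrative.
    """
    backend = env.get("STORAGE_BACKEND", "minio").lower()
    if backend == "s3":
        return {
            "backend": "s3",
            "bucket": env.get("S3_BUCKET_NAME", "test-artifacts"),
            "region": env.get("AWS_REGION", "us-east-1"),
            "access_key": env.get("AWS_ACCESS_KEY_ID"),
            "secret_key": env.get("AWS_SECRET_ACCESS_KEY"),
        }
    if backend == "minio":
        secure = env.get("MINIO_SECURE", "false").lower() == "true"
        endpoint = env.get("MINIO_ENDPOINT", "localhost:9000")
        return {
            "backend": "minio",
            "bucket": env.get("MINIO_BUCKET_NAME", "test-artifacts"),
            # MINIO_SECURE decides between http and https for the endpoint.
            "endpoint_url": f"{'https' if secure else 'http'}://{endpoint}",
            "access_key": env.get("MINIO_ACCESS_KEY", "minioadmin"),
            "secret_key": env.get("MINIO_SECRET_KEY", "minioadmin"),
        }
    raise ValueError(f"Unsupported STORAGE_BACKEND: {backend!r}")


minio = storage_config({"STORAGE_BACKEND": "minio", "MINIO_ENDPOINT": "minio:9000"})
print(minio["endpoint_url"])  # http://minio:9000

s3 = storage_config({"STORAGE_BACKEND": "s3", "S3_BUCKET_NAME": "your-bucket"})
print(s3["bucket"], s3["region"])  # your-bucket us-east-1
```
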
## Deployment

### Kubernetes with Helm

1. Build and push Docker image:
   ```bash
   docker build -t your-registry/datalake:latest .
   docker push your-registry/datalake:latest
   ```

2. Install with Helm:
   ```bash
   helm install datalake ./helm \
     --set image.repository=your-registry/datalake \
     --set image.tag=latest \
     --namespace datalake \
     --create-namespace
   ```

3. Access the API:
   ```bash
   kubectl port-forward -n datalake svc/datalake 8000:8000
   ```

### Helm Configuration

Edit `helm/values.yaml` to customize:
- Replica count
- Resource limits
- Storage backend (S3 vs MinIO)
- Ingress settings
- PostgreSQL settings
- Autoscaling

### GitLab CI/CD

The included `.gitlab-ci.yml` provides:
- Automated testing
- Linting
- Docker image builds
- Deployments to dev/staging/prod

**Required GitLab CI/CD Variables:**
- `CI_REGISTRY_USER`: Docker registry username
- `CI_REGISTRY_PASSWORD`: Docker registry password
- `KUBE_CONFIG_DEV`: Base64-encoded kubeconfig for dev
- `KUBE_CONFIG_STAGING`: Base64-encoded kubeconfig for staging
- `KUBE_CONFIG_PROD`: Base64-encoded kubeconfig for prod

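The `KUBE_CONFIG_*` variables expect the kubeconfig as a Base64 string. A minimal Python sketch of producing such a value is shown below; `encode_kubeconfig` is a hypothetical helper, the sample kubeconfig content is a placeholder, and how the pipeline decodes the value is not covered here.

```python
import base64


def encode_kubeconfig(raw: bytes) -> str:
    """Return the Base64 text to paste into a KUBE_CONFIG_* CI/CD variable."""
    return base64.b64encode(raw).decode("ascii")


# Placeholder content; in practice, read the real file, for example:
#   raw = pathlib.Path("path/to/kubeconfig").read_bytes()
sample = b"apiVersion: v1\nkind: Config\n"
encoded = encode_kubeconfig(sample)
print(encoded)

# Decoding restores the original bytes, which is a quick sanity check.
assert base64.b64decode(encoded) == sample
```
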
## Database Schema

The `artifacts` table stores:
- File metadata (name, type, size, storage path)
- Test information (name, suite, config, result)
- Custom metadata and tags
- Timestamps and versioning

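To make the four groups above more concrete, here is a rough Python sketch of what one row might carry. Every field name in `ArtifactRecord` is hypothetical and chosen only to mirror the list; the real column names and types live in the project's models and migrations.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class ArtifactRecord:
    """Illustrative shape of one `artifacts` row (field names are assumptions)."""
    # File metadata
    filename: str
    file_type: str
    file_size: int
    storage_path: str
    # Test information
    test_name: Optional[str] = None
    test_suite: Optional[str] = None
    test_config: Dict[str, Any] = field(default_factory=dict)
    test_result: Optional[str] = None
    # Custom metadata and tags
    custom_metadata: Dict[str, Any] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)
    # Timestamps and versioning
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    version: int = 1


record = ArtifactRecord(
    filename="test_results.csv",
    file_type="csv",
    file_size=2048,
    storage_path="artifacts/test_results.csv",  # placeholder object key
    test_name="auth_test",
    test_suite="integration",
    test_result="pass",
    tags=["regression", "smoke"],
)
print(record.test_suite, record.tags, record.version)
```
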
## Example Use Cases

### Store Test Results
Upload CSV files containing test execution results with metadata about the test suite and configuration.

### Archive Packet Captures
Store PCAP files from network tests with tags for easy filtering and retrieval.

### Track Test Configurations
Upload JSON test configurations and query them by date, test suite, or custom tags.

### Binary Artifact Storage
Store compiled binaries, test data files, or any binary artifacts with full metadata.

## Development

### Running Tests
```bash
pytest tests/ -v
```

### Code Formatting
```bash
black app/
flake8 app/
```

### Database Migrations
```bash
alembic revision --autogenerate -m "description"
alembic upgrade head
```

## Troubleshooting

### Cannot Connect to Database
- Verify PostgreSQL is running
- Check `DATABASE_URL` is correct
- Ensure database exists

### Cannot Upload Files
- Check storage backend is running (MinIO or S3 accessible)
- Verify credentials are correct
- Check file size is under `MAX_UPLOAD_SIZE`

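The file-size check can be done on the client before uploading. The Python sketch below compares a file's size against `MAX_UPLOAD_SIZE`, falling back to the documented default. `check_upload_size` is a hypothetical helper, and treating a file exactly at the limit as acceptable is an assumption, since the server's comparison is not documented here.

```python
import os
import tempfile

DEFAULT_MAX_UPLOAD_SIZE = 524288000  # default from the configuration table


def check_upload_size(path, max_bytes=None):
    """Return (is_allowed, size_in_bytes) for the file at `path`.

    Assumption: a file whose size equals the limit is treated as allowed.
    """
    if max_bytes is None:
        max_bytes = int(os.environ.get("MAX_UPLOAD_SIZE", DEFAULT_MAX_UPLOAD_SIZE))
    size = os.path.getsize(path)
    return size <= max_bytes, size


# Demonstrate with a temporary 1 KiB file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 1024)
    sample_path = handle.name

ok, size = check_upload_size(sample_path, max_bytes=2048)
print(ok, size)  # True 1024
os.remove(sample_path)
```
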
### MinIO Connection Failed
- Ensure MinIO service is running
- Verify `MINIO_ENDPOINT` is correct
- Check MinIO credentials

## License

[Your License Here]

## Support

For issues and questions, please open an issue in the repository.