Try out chunking

pratik
2025-10-21 15:19:49 -05:00
parent 9ad9e6b7b8
commit fdcc92eeb6
9 changed files with 849 additions and 2 deletions

CHUNKED_UPLOAD_GUIDE.md Normal file

@@ -0,0 +1,390 @@
# Chunked Upload Implementation Guide
## Overview
This application now supports chunked file uploads to avoid nginx 413 "Request Entity Too Large" errors when deploying large JAR files through a load balancer.
## How It Works
Instead of uploading the entire JAR and manifest files in a single request, files are split into smaller chunks (default 5MB) and uploaded sequentially. The server reassembles the chunks before deployment.
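The split-and-reassemble idea can be illustrated with a minimal in-memory sketch. The helper names `split_into_chunks` and `reassemble` are hypothetical and are not taken from the application's code; the real server works on temporary files rather than byte strings.

```python
from __future__ import annotations

CHUNK_SIZE = 5 * 1024 * 1024  # 5MB default described in this guide


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reassemble(chunks: dict[int, bytes], total_chunks: int) -> bytes:
    """Join chunks by index; fail if any index in 0..total_chunks-1 is missing."""
    missing = [i for i in range(total_chunks) if i not in chunks]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(chunks[i] for i in range(total_chunks))


# Small demo with a 1000-byte chunk size so the numbers are easy to follow
payload = bytes(range(256)) * 10             # 2560 bytes of sample data
parts = split_into_chunks(payload, chunk_size=1000)
received = dict(enumerate(parts))            # index -> chunk, as a server might store them
print(len(parts), [len(p) for p in parts])   # 3 [1000, 1000, 560]
assert reassemble(received, len(parts)) == payload
```

Only the last chunk is shorter than the chunk size, which is why the client examples compute the chunk count with a ceiling division.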
## API Endpoints
### 1. Initialize Upload Session
**POST** `/api/cf/upload/init`
Creates a new upload session and returns a session ID.
**Request Body:**
```json
{
  "apiEndpoint": "https://api.cf.example.com",
  "username": "your-username",
  "password": "your-password",
  "organization": "your-org",
  "space": "your-space",
  "appName": "your-app",
  "skipSslValidation": false
}
```
**Response:**
```json
{
  "success": true,
  "uploadSessionId": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Upload session created successfully"
}
```
### 2. Upload File Chunk
**POST** `/api/cf/upload/chunk`
Upload a single chunk of a file.
**Request Parameters:**
- `uploadSessionId` (string): The session ID from step 1
- `fileType` (string): Either "jarFile" or "manifest"
- `chunkIndex` (integer): Zero-based index of this chunk (0, 1, 2, ...)
- `totalChunks` (integer): Total number of chunks for this file
- `fileName` (string): Original filename (required for jarFile, optional for manifest)
- `chunk` (multipart file): The chunk data
**Response:**
```json
{
  "success": true,
  "uploadSessionId": "550e8400-e29b-41d4-a716-446655440000",
  "fileType": "jarFile",
  "chunkIndex": 0,
  "totalChunks": 10,
  "receivedChunks": 1,
  "message": "Chunk uploaded successfully"
}
```
### 3. Get Upload Status
**GET** `/api/cf/upload/status/{uploadSessionId}`
Check the status of an upload session.
**Response:**
```json
{
  "jarFile": {
    "fileName": "myapp.jar",
    "totalChunks": 10,
    "receivedChunks": {
      "0": true,
      "1": true,
      "2": true
    }
  },
  "manifest": {
    "fileName": "manifest.yml",
    "totalChunks": 1,
    "receivedChunks": {
      "0": true
    }
  }
}
```
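A client can use this status response to work out which chunks still need to be sent. The sketch below assumes the response has exactly the shape shown above (chunk indices as string keys mapping to `true`); the helper name `missing_chunks` is hypothetical.

```python
def missing_chunks(status: dict, file_type: str) -> list:
    """Return the zero-based chunk indices not yet marked as received."""
    entry = status.get(file_type)
    if entry is None:
        # Assumption: a file with no uploaded chunks is absent from the response
        raise ValueError(f"no upload state for {file_type!r}")
    received = entry.get("receivedChunks", {})
    # JSON object keys are strings, so look up str(i) rather than i
    return [i for i in range(entry["totalChunks"]) if not received.get(str(i), False)]


# Sample data copied from the status response above
status = {
    "jarFile": {
        "fileName": "myapp.jar",
        "totalChunks": 10,
        "receivedChunks": {"0": True, "1": True, "2": True},
    },
    "manifest": {
        "fileName": "manifest.yml",
        "totalChunks": 1,
        "receivedChunks": {"0": True},
    },
}

print(missing_chunks(status, "jarFile"))   # [3, 4, 5, 6, 7, 8, 9]
print(missing_chunks(status, "manifest"))  # []
```

An empty list for both file types means the session is ready to be finalized.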
### 4. Finalize Upload and Deploy
**POST** `/api/cf/upload/finalize?uploadSessionId={sessionId}`
Triggers the deployment after all chunks are uploaded.
**Response:**
Same as the traditional `/api/cf/deploy` endpoint.
## Client Implementation Example (JavaScript)
```javascript
const CHUNK_SIZE = 5 * 1024 * 1024; // 5MB

async function deployWithChunks(jarFile, manifestFile, deploymentConfig) {
  const apiBase = 'https://your-app.example.com/api/cf';

  // Step 1: Initialize upload session
  const initResponse = await fetch(`${apiBase}/upload/init`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(deploymentConfig)
  });
  const initResult = await initResponse.json();
  if (!initResult.success) {
    throw new Error(`Failed to create upload session: ${initResult.message}`);
  }
  const { uploadSessionId } = initResult;
  console.log('Upload session created:', uploadSessionId);

  // Step 2: Upload JAR file in chunks
  await uploadFileInChunks(apiBase, uploadSessionId, 'jarFile', jarFile);

  // Step 3: Upload manifest file in chunks
  await uploadFileInChunks(apiBase, uploadSessionId, 'manifest', manifestFile);

  // Step 4: Finalize and deploy
  const deployResponse = await fetch(
    `${apiBase}/upload/finalize?uploadSessionId=${encodeURIComponent(uploadSessionId)}`,
    { method: 'POST' }
  );
  const result = await deployResponse.json();
  console.log('Deployment result:', result);
  return result;
}

async function uploadFileInChunks(apiBase, sessionId, fileType, file) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
  console.log(`Uploading ${fileType}: ${file.name} (${totalChunks} chunks)`);

  // Chunks are sent sequentially, in index order
  for (let chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++) {
    const start = chunkIndex * CHUNK_SIZE;
    const end = Math.min(start + CHUNK_SIZE, file.size);
    const chunk = file.slice(start, end);

    const formData = new FormData();
    formData.append('chunk', chunk);
    formData.append('uploadSessionId', sessionId);
    formData.append('fileType', fileType);
    formData.append('chunkIndex', chunkIndex);
    formData.append('totalChunks', totalChunks);
    formData.append('fileName', file.name);

    const response = await fetch(`${apiBase}/upload/chunk`, {
      method: 'POST',
      body: formData
    });
    const result = await response.json();

    // Check for failure before reporting the chunk as uploaded
    if (!result.success) {
      throw new Error(`Failed to upload chunk: ${result.message}`);
    }
    console.log(`Chunk ${chunkIndex + 1}/${totalChunks} uploaded for ${fileType}`);
  }
  console.log(`${fileType} upload complete`);
}

// Usage
const jarInput = document.getElementById('jarFile');
const manifestInput = document.getElementById('manifestFile');
const config = {
  apiEndpoint: 'https://api.cf.example.com',
  username: 'user',
  password: 'pass',
  organization: 'my-org',
  space: 'dev',
  appName: 'my-app',
  skipSslValidation: false
};

deployWithChunks(jarInput.files[0], manifestInput.files[0], config)
  .then(result => console.log('Success:', result))
  .catch(error => console.error('Error:', error));
```
## Client Implementation Example (Python)
```python
import os

import requests

CHUNK_SIZE = 5 * 1024 * 1024  # 5MB


def deploy_with_chunks(api_base, jar_path, manifest_path, deployment_config):
    # Step 1: Initialize upload session
    response = requests.post(
        f"{api_base}/upload/init",
        json=deployment_config
    )
    init_result = response.json()
    if not init_result.get('success'):
        raise RuntimeError(f"Failed to create upload session: {init_result.get('message')}")
    session_id = init_result['uploadSessionId']
    print(f"Upload session created: {session_id}")

    # Step 2: Upload JAR file in chunks
    upload_file_in_chunks(api_base, session_id, 'jarFile', jar_path)

    # Step 3: Upload manifest file in chunks
    upload_file_in_chunks(api_base, session_id, 'manifest', manifest_path)

    # Step 4: Finalize and deploy
    response = requests.post(
        f"{api_base}/upload/finalize",
        params={'uploadSessionId': session_id}
    )
    result = response.json()
    print(f"Deployment result: {result}")
    return result


def upload_file_in_chunks(api_base, session_id, file_type, file_path):
    file_size = os.path.getsize(file_path)
    total_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE  # ceiling division
    file_name = os.path.basename(file_path)
    print(f"Uploading {file_type}: {file_name} ({total_chunks} chunks)")

    with open(file_path, 'rb') as f:
        # Chunks are read and sent sequentially, in index order
        for chunk_index in range(total_chunks):
            chunk_data = f.read(CHUNK_SIZE)
            files = {'chunk': (f'chunk_{chunk_index}', chunk_data)}
            data = {
                'uploadSessionId': session_id,
                'fileType': file_type,
                'chunkIndex': chunk_index,
                'totalChunks': total_chunks,
                'fileName': file_name
            }
            response = requests.post(
                f"{api_base}/upload/chunk",
                files=files,
                data=data
            )
            result = response.json()

            # Check for failure before reporting the chunk as uploaded
            if not result.get('success'):
                raise RuntimeError(f"Failed to upload chunk: {result.get('message')}")
            print(f"Chunk {chunk_index + 1}/{total_chunks} uploaded for {file_type}")

    print(f"{file_type} upload complete")


# Usage
config = {
    'apiEndpoint': 'https://api.cf.example.com',
    'username': 'user',
    'password': 'pass',
    'organization': 'my-org',
    'space': 'dev',
    'appName': 'my-app',
    'skipSslValidation': False
}

deploy_with_chunks(
    'https://your-app.example.com/api/cf',
    '/path/to/app.jar',
    '/path/to/manifest.yml',
    config
)
```
## Nginx Configuration
For the chunked upload to work properly with nginx, you need minimal configuration changes:
```nginx
server {
    listen 80;
    server_name your-app.example.com;

    # Important: client_max_body_size applies to each chunk request, not the whole file.
    # Leave headroom above the chunk size for multipart overhead (5MB chunks -> 10MB limit)
    client_max_body_size 10m;

    # Increase timeouts for long deployments
    proxy_read_timeout 900s;
    proxy_send_timeout 900s;
    # Per the nginx documentation, the connect timeout usually cannot exceed 75 seconds
    proxy_connect_timeout 75s;

    location /api/cf/ {
        proxy_pass http://cf-deployer-backend:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Do not buffer backend responses before sending them to the client
        proxy_buffering off;
        # Stream request bodies to the backend instead of buffering them first
        proxy_request_buffering off;
    }
}
```
### Key Nginx Settings:
1. **client_max_body_size**: Set to ~10MB (double the 5MB chunk size, which leaves headroom for multipart overhead)
2. **proxy_buffering off**: Passes the backend's response to the client as it arrives instead of buffering it
3. **proxy_request_buffering off**: Streams the request body to the backend instead of buffering the whole request first
4. **Increased timeouts**: CF deployments can take several minutes
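To sanity-check a chunk size against a body-size limit, a rough calculation like the following can help. It is only a sketch: the 64KB `overhead` allowance for multipart boundaries and form fields is an assumed figure, and both helper names are hypothetical.

```python
def parse_nginx_size(value: str) -> int:
    """Convert an nginx size such as '10m' or '512k' to bytes."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    value = value.strip().lower()
    if value and value[-1] in units:
        return int(value[:-1]) * units[value[-1]]
    return int(value)  # plain number of bytes


def chunk_fits(chunk_size: int, client_max_body_size: str, overhead: int = 64 * 1024) -> bool:
    """True if one chunk request (chunk plus assumed overhead) stays within the limit."""
    return chunk_size + overhead <= parse_nginx_size(client_max_body_size)


print(chunk_fits(5242880, "10m"))  # True: the setting recommended above
print(chunk_fits(5242880, "5m"))   # False: no room left for multipart overhead
print(chunk_fits(5242880, "1m"))   # False: 1m is the nginx default
```

The second case shows why the limit should not be set equal to the chunk size: the request body is always slightly larger than the chunk it carries.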
## Configuration Properties
### application.properties
```properties
# Chunked Upload Configuration
cf.upload.chunk.size=5242880
cf.upload.session.timeout-minutes=30
```
- **cf.upload.chunk.size**: Size of each chunk in bytes (default: 5MB)
- **cf.upload.session.timeout-minutes**: How long inactive sessions are kept (default: 30 minutes)
## Session Management
- Upload sessions expire after 30 minutes of inactivity (configurable)
- Expired sessions are automatically cleaned up every 5 minutes
- Sessions are deleted after successful deployment
- Each session maintains its own temporary directory
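The inactivity rule can be sketched as a small in-memory registry. This is an illustration of the behaviour described above, not the application's `ChunkedUploadService`; the class `UploadSessionStore` and its methods are hypothetical, and the injectable `clock` exists only to make the example deterministic.

```python
import time
import uuid


class UploadSessionStore:
    """Tracks the last activity time of each upload session."""

    def __init__(self, timeout_seconds=30 * 60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_activity = {}

    def create(self):
        session_id = str(uuid.uuid4())
        self._last_activity[session_id] = self._clock()
        return session_id

    def touch(self, session_id):
        """Record activity (for example a received chunk), rejecting expired sessions."""
        last = self._last_activity.get(session_id)
        if last is None or self._clock() - last > self._timeout:
            self._last_activity.pop(session_id, None)
            raise KeyError("Upload session not found or expired")
        self._last_activity[session_id] = self._clock()

    def delete(self, session_id):
        """Remove a session, for example after a successful deployment."""
        self._last_activity.pop(session_id, None)

    def cleanup_expired(self):
        """Remove and return sessions idle for longer than the timeout."""
        now = self._clock()
        expired = [sid for sid, last in self._last_activity.items()
                   if now - last > self._timeout]
        for sid in expired:
            del self._last_activity[sid]
        return expired


# Demo with a fake clock measured in seconds
fake_now = [0.0]
store = UploadSessionStore(timeout_seconds=1800, clock=lambda: fake_now[0])
sid = store.create()
fake_now[0] = 1000.0
store.touch(sid)                       # activity resets the inactivity timer
fake_now[0] = 2500.0                   # 1500 s idle: still within the timeout
still_alive = store.cleanup_expired()  # nothing removed
fake_now[0] = 2801.0                   # 1801 s idle: past the 30-minute timeout
expired = store.cleanup_expired()
print(still_alive, expired == [sid])   # [] True
```

In a real service, `cleanup_expired` would run on a schedule (every 5 minutes according to this guide) and would also delete the session's temporary directory.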
## Error Handling
### Common Errors:
1. **"Upload session not found or expired"**
   - Session timed out (default: 30 minutes)
   - Invalid session ID
   - Solution: Create a new upload session
2. **"Upload incomplete. Not all file chunks received"**
   - Not all chunks were uploaded before calling finalize
   - Solution: Check upload status and retry missing chunks
3. **"Total chunks mismatch"**
   - Different totalChunks value sent for the same file
   - Solution: Ensure consistent totalChunks across all chunk uploads
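A client could map these messages to a recovery step along the following lines. The action labels (`create_new_session` and so on) are hypothetical names chosen for this sketch, and matching on message text assumes the server returns the messages as quoted above.

```python
# (message fragment, recovery action) pairs based on the errors listed above
ERROR_ACTIONS = [
    ("Upload session not found or expired", "create_new_session"),
    ("Upload incomplete", "retry_missing_chunks"),
    ("Total chunks mismatch", "fix_total_chunks"),
]


def recommended_action(message: str) -> str:
    """Pick a recovery action for a server error message."""
    lowered = message.lower()
    for fragment, action in ERROR_ACTIONS:
        if fragment.lower() in lowered:
            return action
    return "report_error"  # unknown error: surface it to the user


print(recommended_action("Upload incomplete. Not all file chunks received"))
# retry_missing_chunks
print(recommended_action("Upload session not found or expired"))
# create_new_session
print(recommended_action("Something else went wrong"))
# report_error
```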
## Migration from Traditional Upload
The traditional `/api/cf/deploy` endpoint remains available and functional. You can:
1. **Keep using the traditional endpoint** for deployments behind nginx if you increase nginx `client_max_body_size` to 500MB+
2. **Migrate to chunked uploads** for better reliability and to avoid nginx 413 errors without increasing limits
## Performance Considerations
- **Chunk size**: 5MB is a good balance between request count and size
  - Smaller chunks = more requests but safer for proxies
  - Larger chunks = fewer requests but may hit proxy limits
- **Parallel uploads**: Current implementation is sequential
  - Files are uploaded one chunk at a time
  - Chunks must be uploaded in order (0, 1, 2, ...)
- **Network reliability**: Chunked uploads are more resilient
  - Failed chunks can be retried individually
  - No need to re-upload the entire file on failure
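Retrying a single failed chunk while keeping the required order might look like the sketch below. The function `upload_chunks_in_order`, its retry count, and its backoff delays are assumptions for illustration; `send` stands in for whatever function posts one chunk (the exceptions raised by the `requests` library derive from `OSError`, so they would be caught here).

```python
import time


def upload_chunks_in_order(chunks, send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Send chunks sequentially, retrying only the chunk that failed.

    Returns the number of attempts each chunk needed.
    """
    attempts_used = []
    for index, chunk in enumerate(chunks):
        for attempt in range(1, max_attempts + 1):
            try:
                send(index, chunk)
                attempts_used.append(attempt)
                break
            except OSError:
                if attempt == max_attempts:
                    raise  # give up; earlier chunks remain uploaded
                sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    return attempts_used


# Demo: chunk 1 fails twice before succeeding; no real network calls are made
calls = []


def flaky_send(index, chunk):
    calls.append(index)
    if index == 1 and calls.count(1) < 3:
        raise ConnectionError("simulated network failure")


result = upload_chunks_in_order([b"a", b"b", b"c"], flaky_send, sleep=lambda seconds: None)
print(result)  # [1, 3, 1]
print(calls)   # [0, 1, 1, 1, 2]
```

Chunk 2 is not sent until chunk 1 has succeeded, so the ordering constraint holds even when retries occur.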
## Monitoring
The `ChunkedUploadService` tracks active upload sessions internally. To inspect them, monitor the application logs or add a custom endpoint.
Example log output:
```
2025-10-21 10:15:30 - Created upload session: 550e8400-e29b-41d4-a716-446655440000
2025-10-21 10:15:31 - Session 550e8400...: Received chunk 1/10 for jarFile (5242880 bytes)
2025-10-21 10:15:32 - Session 550e8400...: Received chunk 2/10 for jarFile (5242880 bytes)
...
2025-10-21 10:16:00 - Session 550e8400...: File jarFile upload completed (10 chunks)
2025-10-21 10:16:01 - Starting deployment for app: my-app from session: 550e8400...
2025-10-21 10:18:00 - Deployment completed successfully
2025-10-21 10:18:00 - Deleted upload session: 550e8400-e29b-41d4-a716-446655440000
```
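Log lines in this format can be summarized per session with a short script. The regular expression below is written against the example output above and is an assumption about the real log format, which may differ; `summarize_progress` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "... - Session 550e8400...: Received chunk 1/10 for jarFile (5242880 bytes)"
CHUNK_LINE = re.compile(
    r"Session (?P<session>[0-9a-f-]+)\.*: "
    r"Received chunk (?P<index>\d+)/(?P<total>\d+) "
    r"for (?P<file_type>\w+) \((?P<size>\d+) bytes\)"
)


def summarize_progress(lines):
    """Count received chunks and bytes per (session, file type)."""
    summary = defaultdict(lambda: {"chunks": 0, "total": 0, "bytes": 0})
    for line in lines:
        match = CHUNK_LINE.search(line)
        if not match:
            continue  # ignore lines that are not chunk receipts
        entry = summary[(match["session"], match["file_type"])]
        entry["chunks"] += 1
        entry["total"] = int(match["total"])
        entry["bytes"] += int(match["size"])
    return dict(summary)


log_lines = [
    "2025-10-21 10:15:30 - Created upload session: 550e8400-e29b-41d4-a716-446655440000",
    "2025-10-21 10:15:31 - Session 550e8400...: Received chunk 1/10 for jarFile (5242880 bytes)",
    "2025-10-21 10:15:32 - Session 550e8400...: Received chunk 2/10 for jarFile (5242880 bytes)",
]
print(summarize_progress(log_lines))
# {('550e8400', 'jarFile'): {'chunks': 2, 'total': 10, 'bytes': 10485760}}
```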