cf-uploader/CHUNKED_UPLOAD_GUIDE.md
2025-10-21 18:48:35 -05:00


Chunked Upload Implementation Guide

Overview

This application now supports chunked file uploads to avoid nginx 413 "Request Entity Too Large" errors when deploying large JAR files through a load balancer.

How It Works

Instead of uploading the entire JAR and manifest files in a single request, files are split into smaller chunks and uploaded sequentially. The client chooses the chunk size (the examples below use 1 MB), and the server reassembles the chunks before deployment.
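
The split itself is just arithmetic over byte offsets. A minimal Python sketch (file and chunk sizes here are illustrative, not defaults):

```python
def chunk_ranges(file_size, chunk_size):
    """Yield (index, start, end) byte ranges covering the whole file."""
    total_chunks = (file_size + chunk_size - 1) // chunk_size  # ceiling division
    for index in range(total_chunks):
        start = index * chunk_size
        end = min(start + chunk_size, file_size)  # last chunk may be short
        yield index, start, end

# A 12 MB file with 5 MB chunks splits into 3 chunks:
ranges = list(chunk_ranges(12 * 1024 * 1024, 5 * 1024 * 1024))
print(len(ranges))   # 3
print(ranges[-1])    # (2, 10485760, 12582912) -- the 2 MB remainder
```

The same ceiling-division formula appears in both client examples below (`Math.ceil` in JavaScript, `(size + CHUNK_SIZE - 1) // CHUNK_SIZE` in Python).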

API Endpoints

1. Initialize Upload Session

POST /api/cf/upload/init

Creates a new upload session and returns a session ID.

Request Body:

{
  "apiEndpoint": "https://api.cf.example.com",
  "username": "your-username",
  "password": "your-password",
  "organization": "your-org",
  "space": "your-space",
  "appName": "your-app",
  "skipSslValidation": false
}

Response:

{
  "success": true,
  "uploadSessionId": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Upload session created successfully"
}

2. Upload File Chunk

POST /api/cf/upload/chunk

Upload a single chunk of a file.

Request Parameters:

  • uploadSessionId (string): The session ID from step 1
  • fileType (string): Either "jarFile" or "manifest"
  • chunkIndex (integer): Zero-based index of this chunk (0, 1, 2, ...)
  • totalChunks (integer): Total number of chunks for this file
  • fileName (string, optional): Original filename (required for jarFile)
  • chunk (multipart file): The chunk data

Response:

{
  "success": true,
  "uploadSessionId": "550e8400-e29b-41d4-a716-446655440000",
  "fileType": "jarFile",
  "chunkIndex": 0,
  "totalChunks": 10,
  "receivedChunks": 1,
  "message": "Chunk uploaded successfully"
}

3. Get Upload Status

GET /api/cf/upload/status/{uploadSessionId}

Check the status of an upload session.

Response:

{
  "jarFile": {
    "fileName": "myapp.jar",
    "totalChunks": 10,
    "receivedChunks": {
      "0": true,
      "1": true,
      "2": true
    }
  },
  "manifest": {
    "fileName": "manifest.yml",
    "totalChunks": 1,
    "receivedChunks": {
      "0": true
    }
  }
}
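
One practical use of this payload is computing which chunk indices are still missing before resuming. A small sketch over the response shape shown above (pure dict handling, no HTTP):

```python
def missing_chunks(file_status):
    """Return the chunk indices not yet received for one file's status entry."""
    received = file_status.get("receivedChunks", {})
    total = file_status["totalChunks"]
    # Keys in the JSON payload are strings ("0", "1", ...), so convert the index.
    return [i for i in range(total) if not received.get(str(i))]

status = {
    "jarFile": {
        "fileName": "myapp.jar",
        "totalChunks": 10,
        "receivedChunks": {"0": True, "1": True, "2": True},
    }
}
print(missing_chunks(status["jarFile"]))  # [3, 4, 5, 6, 7, 8, 9]
```

Because the server enforces sequential order, resuming means re-uploading from the first missing index onward rather than retrying arbitrary indices.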

4. Finalize Upload and Deploy

POST /api/cf/upload/finalize?uploadSessionId={sessionId}

Triggers the deployment after all chunks are uploaded.

Response: Same as the traditional /api/cf/deploy endpoint.

Client Implementation Example (JavaScript)

// You can use ANY chunk size - server supports variable chunk sizes!
// Recommended: 1-2MB for Tanzu with memory constraints
const CHUNK_SIZE = 1 * 1024 * 1024; // 1MB
// Other options:
// const CHUNK_SIZE = 512 * 1024;      // 512KB (very safe)
// const CHUNK_SIZE = 2 * 1024 * 1024; // 2MB (balanced)
// const CHUNK_SIZE = 5 * 1024 * 1024; // 5MB (if you have memory)

async function deployWithChunks(jarFile, manifestFile, deploymentConfig) {
  const apiBase = 'https://your-app.example.com/api/cf';

  // Step 1: Initialize upload session
  const initResponse = await fetch(`${apiBase}/upload/init`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(deploymentConfig)
  });

  const { uploadSessionId } = await initResponse.json();
  console.log('Upload session created:', uploadSessionId);

  // Step 2: Upload JAR file in chunks
  await uploadFileInChunks(apiBase, uploadSessionId, 'jarFile', jarFile);

  // Step 3: Upload manifest file in chunks
  await uploadFileInChunks(apiBase, uploadSessionId, 'manifest', manifestFile);

  // Step 4: Finalize and deploy
  const deployResponse = await fetch(
    `${apiBase}/upload/finalize?uploadSessionId=${uploadSessionId}`,
    { method: 'POST' }
  );

  const result = await deployResponse.json();
  console.log('Deployment result:', result);
  return result;
}

async function uploadFileInChunks(apiBase, sessionId, fileType, file) {
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
  console.log(`Uploading ${fileType}: ${file.name} (${totalChunks} chunks)`);

  for (let chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++) {
    const start = chunkIndex * CHUNK_SIZE;
    const end = Math.min(start + CHUNK_SIZE, file.size);
    const chunk = file.slice(start, end);

    const formData = new FormData();
    formData.append('chunk', chunk);
    formData.append('uploadSessionId', sessionId);
    formData.append('fileType', fileType);
    formData.append('chunkIndex', chunkIndex);
    formData.append('totalChunks', totalChunks);
    formData.append('fileName', file.name);

    const response = await fetch(`${apiBase}/upload/chunk`, {
      method: 'POST',
      body: formData
    });

    const result = await response.json();
    if (!result.success) {
      throw new Error(`Failed to upload chunk ${chunkIndex}: ${result.message}`);
    }
    console.log(`Chunk ${chunkIndex + 1}/${totalChunks} uploaded for ${fileType}`);
  }

  console.log(`${fileType} upload complete`);
}

// Usage
const jarInput = document.getElementById('jarFile');
const manifestInput = document.getElementById('manifestFile');

const config = {
  apiEndpoint: 'https://api.cf.example.com',
  username: 'user',
  password: 'pass',
  organization: 'my-org',
  space: 'dev',
  appName: 'my-app',
  skipSslValidation: false
};

deployWithChunks(jarInput.files[0], manifestInput.files[0], config)
  .then(result => console.log('Success:', result))
  .catch(error => console.error('Error:', error));

Client Implementation Example (Python)

import requests
import os

# You can use ANY chunk size!
CHUNK_SIZE = 1 * 1024 * 1024  # 1MB (recommended for Tanzu)

def deploy_with_chunks(api_base, jar_path, manifest_path, deployment_config):
    # Step 1: Initialize upload session
    response = requests.post(
        f"{api_base}/upload/init",
        json=deployment_config
    )
    response.raise_for_status()
    session_id = response.json()['uploadSessionId']
    print(f"Upload session created: {session_id}")

    # Step 2: Upload JAR file in chunks
    upload_file_in_chunks(api_base, session_id, 'jarFile', jar_path)

    # Step 3: Upload manifest file in chunks
    upload_file_in_chunks(api_base, session_id, 'manifest', manifest_path)

    # Step 4: Finalize and deploy
    response = requests.post(
        f"{api_base}/upload/finalize",
        params={'uploadSessionId': session_id}
    )

    result = response.json()
    print(f"Deployment result: {result}")
    return result

def upload_file_in_chunks(api_base, session_id, file_type, file_path):
    file_size = os.path.getsize(file_path)
    total_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    file_name = os.path.basename(file_path)

    print(f"Uploading {file_type}: {file_name} ({total_chunks} chunks)")

    with open(file_path, 'rb') as f:
        for chunk_index in range(total_chunks):
            chunk_data = f.read(CHUNK_SIZE)

            files = {'chunk': (f'chunk_{chunk_index}', chunk_data)}
            data = {
                'uploadSessionId': session_id,
                'fileType': file_type,
                'chunkIndex': chunk_index,
                'totalChunks': total_chunks,
                'fileName': file_name
            }

            response = requests.post(
                f"{api_base}/upload/chunk",
                files=files,
                data=data
            )

            result = response.json()
            if not result.get('success'):
                raise Exception(f"Failed to upload chunk {chunk_index}: {result.get('message')}")

            print(f"Chunk {chunk_index + 1}/{total_chunks} uploaded for {file_type}")

    print(f"{file_type} upload complete")

# Usage
config = {
    'apiEndpoint': 'https://api.cf.example.com',
    'username': 'user',
    'password': 'pass',
    'organization': 'my-org',
    'space': 'dev',
    'appName': 'my-app',
    'skipSslValidation': False
}

deploy_with_chunks(
    'https://your-app.example.com/api/cf',
    '/path/to/app.jar',
    '/path/to/manifest.yml',
    config
)

Nginx Configuration

For the chunked upload to work properly with nginx, you need minimal configuration changes:

server {
    listen 80;
    server_name your-app.example.com;

    # Important: Set client_max_body_size for individual chunks
    # This should be slightly larger than your chunk size (5MB chunks -> 10MB limit)
    client_max_body_size 10m;

    # Increase timeouts for long deployments
    proxy_read_timeout 900s;
    proxy_connect_timeout 900s;
    proxy_send_timeout 900s;

    location /api/cf/ {
        proxy_pass http://cf-deployer-backend:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Buffer settings for chunked uploads
        proxy_buffering off;
        proxy_request_buffering off;
    }
}

Key Nginx Settings:

  1. client_max_body_size: Set to ~10MB (double your chunk size for safety)
  2. proxy_buffering off: Prevents nginx from buffering the entire request
  3. proxy_request_buffering off: Allows streaming of request body
  4. Increased timeouts: CF deployments can take several minutes

Configuration Properties

application.properties

# Chunked Upload Configuration
cf.upload.session.timeout-minutes=30

  • cf.upload.session.timeout-minutes: How long inactive sessions are kept before cleanup (default: 30 minutes)

Note: There is NO server-side chunk size configuration. The server accepts ANY chunk size from the client. Chunks are appended sequentially as they arrive.
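
The append-as-they-arrive behavior can be sketched in a few lines. This is an illustrative Python sketch of the idea, not the actual server implementation (file naming and the `received_count` bookkeeping are assumptions):

```python
import os
import tempfile

def append_chunk(session_dir, file_type, chunk_index, received_count, data):
    """Append one chunk to the partial file, enforcing sequential order."""
    if chunk_index != received_count:
        raise ValueError(f"Expected chunk {received_count}, got {chunk_index}")
    with open(os.path.join(session_dir, file_type + ".part"), "ab") as f:
        f.write(data)  # append-only, which is why out-of-order chunks must be rejected
    return received_count + 1

session_dir = tempfile.mkdtemp()
count = append_chunk(session_dir, "jarFile", 0, 0, b"hello ")
count = append_chunk(session_dir, "jarFile", 1, count, b"world")
with open(os.path.join(session_dir, "jarFile.part"), "rb") as f:
    reassembled = f.read()
print(reassembled)  # b'hello world'
```

Appending in arrival order is what lets the server accept any chunk size without configuration: it never needs to seek to a precomputed offset.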

Session Management

  • Upload sessions expire after 30 minutes of inactivity (configurable)
  • Expired sessions are automatically cleaned up every 5 minutes
  • Sessions are deleted after successful deployment
  • Each session maintains its own temporary directory
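
The expiry rule above amounts to a timestamp comparison. A minimal sketch of the decision the periodic cleanup task makes (the function name and timestamp bookkeeping are assumptions, not the actual service code):

```python
import time

SESSION_TIMEOUT_SECONDS = 30 * 60  # mirrors cf.upload.session.timeout-minutes=30

def is_expired(last_activity, now=None):
    """True if a session has been inactive longer than the timeout."""
    now = time.time() if now is None else now
    return now - last_activity > SESSION_TIMEOUT_SECONDS

now = time.time()
print(is_expired(now - 31 * 60, now))  # True: idle for 31 minutes
print(is_expired(now - 5 * 60, now))   # False: recently active
```

Note that `last_activity` must be refreshed on every chunk received, not just at session creation; otherwise a slow but active upload would be reaped mid-transfer.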

Error Handling

Common Errors:

  1. "Upload session not found or expired"

    • Session timed out (default: 30 minutes)
    • Invalid session ID
    • Solution: Create a new upload session
  2. "Upload incomplete. Not all file chunks received"

    • Not all chunks were uploaded before calling finalize
    • Solution: Check upload status and retry missing chunks
  3. "Total chunks mismatch"

    • Different totalChunks value sent for the same file
    • Solution: Ensure consistent totalChunks across all chunk uploads

Migration from Traditional Upload

The traditional /api/cf/deploy endpoint remains available and functional. You can:

  1. Keep using the traditional endpoint for deployments behind nginx if you increase nginx client_max_body_size to 500MB+
  2. Migrate to chunked uploads for better reliability and to avoid nginx 413 errors without increasing limits

Performance Considerations

  • Chunk size: Client controls this completely

    • Smaller chunks (512KB-1MB): More requests, but safer for memory-constrained servers and strict proxies
    • Larger chunks (5-10MB): Fewer requests, faster uploads, but needs more memory
    • Recommended for Tanzu: 1MB (good balance for low-memory environments)
    • Any size works: Server accepts variable chunk sizes
  • Sequential upload requirement: CRITICAL

    • Chunks MUST be uploaded in order: 0, 1, 2, 3...
    • Server validates and enforces sequential order
    • Out-of-order chunks will be rejected
    • This is necessary because chunks are appended sequentially to the file
  • Network reliability: Chunked uploads are more resilient

    • Failed chunks can be retried individually
    • No need to re-upload the entire file on failure
    • Just retry the specific failed chunk index
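
The per-chunk retry idea can be sketched generically. Here `upload_fn` is a stand-in for the actual chunk POST (the wrapper and the flaky simulator are hypothetical, for illustration only):

```python
def upload_with_retry(upload_fn, chunk_index, max_attempts=3):
    """Retry a single chunk upload up to max_attempts times before giving up."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return upload_fn(chunk_index)
        except Exception as exc:  # real code should catch the HTTP client's error type
            last_error = exc
    raise RuntimeError(
        f"Chunk {chunk_index} failed after {max_attempts} attempts") from last_error

# Simulated flaky upload: fails twice, then succeeds.
attempts = {"n": 0}
def flaky_upload(index):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient network error")
    return {"success": True, "chunkIndex": index}

print(upload_with_retry(flaky_upload, 4))  # {'success': True, 'chunkIndex': 4}
```

Because chunks must arrive in order, retry the failing index until it succeeds before moving on; do not skip ahead and come back.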

Monitoring

Check active upload sessions:

# The ChunkedUploadService tracks active sessions
# Monitor via application logs or add a custom endpoint

Example log output:

2025-10-21 10:15:30 - Created upload session: 550e8400-e29b-41d4-a716-446655440000
2025-10-21 10:15:31 - Session 550e8400...: Received chunk 1/10 for jarFile (5242880 bytes)
2025-10-21 10:15:32 - Session 550e8400...: Received chunk 2/10 for jarFile (5242880 bytes)
...
2025-10-21 10:16:00 - Session 550e8400...: File jarFile upload completed (10 chunks)
2025-10-21 10:16:01 - Starting deployment for app: my-app from session: 550e8400...
2025-10-21 10:18:00 - Deployment completed successfully
2025-10-21 10:18:00 - Deleted upload session: 550e8400-e29b-41d4-a716-446655440000