# Options Reference

Pass options as a JSON string in the `options` form field when submitting a conversion.

```
-F 'options={"resize":{"enabled":true,"preset":"1080p"}}'
```
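Because the field value has to be a JSON string, it can be safer to serialize it than to hand-write the quoting. A minimal Python sketch; `build_options_field` is a hypothetical helper name used for illustration, not part of the API:

```python
import json


def build_options_field(options: dict) -> str:
    """Serialize an options dict into a compact JSON string for the `options` form field."""
    # Compact separators drop whitespace so the result matches the style used on this page.
    return json.dumps(options, separators=(",", ":"))


field = build_options_field({"resize": {"enabled": True, "preset": "1080p"}})
print(field)  # prints {"resize":{"enabled":true,"preset":"1080p"}}
```

The returned string is what goes after `options=` in the form field.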

## Overview

| Option | Description | Used by |
|--------|-------------|---------|
| [`chapterSplit`](#chaptersplit) | Split audiobooks into separate files per chapter | Audiobook converters |
| [`metadata`](#metadata) | EXIF/GPS metadata handling for privacy and file size reduction | Images, Documents converters |
| [`ocr`](#ocr) | Optical Character Recognition settings for document conversion | Documents converters |
| [`resize`](#resize) | Resize image by pixels or percentage with presets (4K, 2K, 1080p, 720p, custom) | Images, Documents converters |
| [`smartCompression`](#smartcompression) | Optimize file size using format-specific compression (MozJPEG, pngquant, etc.) | Images converters |
| [`transcription`](#transcription) | Audio transcription settings — mode, output formats, and primary format | Audio transcription converters |

## chapterSplit

Split audiobooks into separate files per chapter.

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `enabled` | boolean | `true`, `false` | `false` | Whether to split audiobooks into separate files per chapter. |

### Example

```json
{"chapterSplit":{"enabled":true}}
```

## metadata

EXIF/GPS metadata handling for privacy and file size reduction.

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `strip` | boolean | `true`, `false` | `true` | Remove EXIF metadata (GPS location, camera info, timestamps) |

### Example

```json
{"metadata":{"strip":true}}
```

## ocr

Optical Character Recognition settings for document conversion.

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `enabled` | boolean | `true`, `false` | `true` | Enable OCR for scanned documents |
| `force` | boolean | `true`, `false` | `false` | Force OCR even on text-selectable PDFs |

### Example

```json
{"ocr":{"enabled":true}}
```

> OCR is enabled by default. For text-selectable PDFs, this may add processing time. Set `enabled: false` if your PDFs already contain selectable text and you want faster conversion.

## resize

Resize images by pixels or percentage, with presets (4K, 2K, 1080p, 720p, custom).

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `enabled` | boolean | `true`, `false` | `false` | Enable post-processing resize |
| `mode` | string | `pixels`, `percentage` |  | Resize mode: 'pixels' targets longest edge, 'percentage' scales proportionally |
| `preset` | string | `off`, `4k`, `2k`, `1080p`, `720p`, `custom` | `off` | Quick resize presets |
| `value` | integer | 1–10000 |  | Resize value: pixels mode (1-10000, longest edge), percentage mode (1-99) |

> **Value ranges by mode:** In `pixels` mode, `value` is the target longest edge in pixels (1–10000). In `percentage` mode, `value` is a scale percentage (1–99). The JSON Schema enforces 1–10000 for both modes; in percentage mode, values above 99 are clamped to 99.
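The range rules above can be mirrored on the client before submitting. The sketch below is an illustration of the documented behavior only; `normalize_resize_value` is a hypothetical helper, and the server's actual implementation is not shown on this page:

```python
def normalize_resize_value(mode: str, value: int) -> int:
    """Apply the documented range rules for the resize `value` property."""
    if mode not in ("pixels", "percentage"):
        raise ValueError(f"unknown resize mode: {mode!r}")
    # The JSON Schema range (1 to 10000) applies to both modes.
    if not 1 <= value <= 10000:
        raise ValueError("value must be between 1 and 10000")
    # In percentage mode, values above 99 are clamped to 99.
    if mode == "percentage":
        return min(value, 99)
    return value


print(normalize_resize_value("percentage", 150))  # prints 99
print(normalize_resize_value("pixels", 1920))     # prints 1920
```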

### Example

```json
{"resize":{"enabled":true,"preset":"1080p"}}
```

## smartCompression

Optimize file size using format-specific compression (MozJPEG, pngquant, etc.).

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `enabled` | boolean | `true`, `false` | `false` | Enable smart compression via Compress.FAST |
| `mode` | string | `lossy`, `lossless` | `lossy` | Compression mode — **PNG-output converters only** (lossy = smaller files via pngquant, lossless = no quality loss via optipng). Not available for JPG, WebP, AVIF, or other outputs. |

### Example — basic (most converters)

```json
{"smartCompression":{"enabled":true}}
```

### Example — PNG output (with mode)

```json
{"smartCompression":{"enabled":true,"mode":"lossy"}}
```
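Since `mode` applies to PNG-output converters only, a client can add it conditionally. A small sketch under that reading; `smart_compression_options` and its `output_format` parameter are hypothetical names introduced for illustration:

```python
def smart_compression_options(output_format: str, mode: str = "lossy") -> dict:
    """Build a smartCompression block, adding `mode` only for PNG output."""
    opts = {"enabled": True}
    if output_format.lower() == "png":
        if mode not in ("lossy", "lossless"):
            raise ValueError(f"unknown compression mode: {mode!r}")
        opts["mode"] = mode
    # For JPG, WebP, AVIF, and other outputs, `mode` is left out entirely.
    return {"smartCompression": opts}


print(smart_compression_options("png", "lossless"))
print(smart_compression_options("jpg"))
```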

## transcription

Audio transcription settings — mode, output formats, and primary format.

Audio transcription converters **can produce multiple output formats in a single job**. By default, only the primary format (determined by the endpoint) is generated. Use the `include*` flags to add additional formats alongside it.

### Modes

| Mode | Cost | Engine | Best for |
|------|------|--------|----------|
| `fast` | 2 credits/min | Groq whisper-large-v3-turbo | Clear audio, podcasts, lectures |
| `quality` | 5 credits/min | Groq whisper-large-v3 | Accented speech, noisy recordings, technical content |
| `meeting-intelligence` | 8 credits/min | Groq whisper-large-v3 + AI analysis | Meetings, interviews — adds speaker labeling, AI summary, and action items |
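The per-minute rates in the table make a rough cost estimate straightforward. This page does not say how partial minutes are billed, so the sketch below assumes plain linear pricing with no rounding; `estimate_credits` is a hypothetical helper:

```python
# Rates copied from the modes table above.
CREDITS_PER_MINUTE = {"fast": 2, "quality": 5, "meeting-intelligence": 8}


def estimate_credits(mode: str, minutes: float) -> float:
    """Estimate credits for a recording, assuming linear billing (an assumption)."""
    if mode not in CREDITS_PER_MINUTE:
        raise ValueError(f"unknown transcription mode: {mode!r}")
    return CREDITS_PER_MINUTE[mode] * minutes


print(estimate_credits("fast", 30))                  # prints 60
print(estimate_credits("meeting-intelligence", 45))  # prints 360
```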

### Properties

| Property | Type | Values | Default | Description |
|----------|------|--------|---------|-------------|
| `mode` | string | `fast`, `quality`, `meeting-intelligence` | `fast` | Transcription mode (determines engine and cost) |
| `primaryFormat` | string | `txt`, `srt`, `vtt`, `pdf`, `docx`, `md`, `epub` | *(endpoint-dependent)* | Primary output format; defaults to the format implied by the endpoint |
| `includeTxt` | boolean | `true`, `false` | `false` | Include plain text transcript |
| `includeSrt` | boolean | `true`, `false` | `false` | Include SRT subtitles |
| `includeVtt` | boolean | `true`, `false` | `false` | Include WebVTT subtitles |
| `includePdf` | boolean | `true`, `false` | `false` | Include PDF transcript |
| `includeDocx` | boolean | `true`, `false` | `false` | Include Word document |
| `includeMarkdown` | boolean | `true`, `false` | `false` | Include Markdown transcript |
| `includeEpub` | boolean | `true`, `false` | `false` | Include EPUB ebook |

### primaryFormat defaults

The default `primaryFormat` is derived from the endpoint you submit to — no explicit option is needed:

| Endpoint | Default `primaryFormat` |
|----------|------------------------|
| `/audio-to-txt` | `txt` |
| `/audio-to-srt` | `srt` |
| `/audio-to-vtt` | `vtt` |
| `/audio-to-pdf` | `pdf` |
| `/audio-to-word` | `docx` |
| `/audio-to-markdown` | `md` |
| `/audio-to-epub` | `epub` |

Override it by passing `"primaryFormat": "srt"` in the transcription options — the overridden format is always included in the output regardless of its `include*` flag.
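To see which formats a job should produce, the endpoint default, the optional override, and the `include*` flags can be combined as sketched below. This is a client-side illustration, not the service's code; `expected_formats` is a hypothetical helper, and it assumes an explicit `primaryFormat` replaces the endpoint default rather than adding to it:

```python
from typing import Optional

# Endpoint defaults and include* flags as listed in the tables above.
ENDPOINT_PRIMARY = {
    "/audio-to-txt": "txt", "/audio-to-srt": "srt", "/audio-to-vtt": "vtt",
    "/audio-to-pdf": "pdf", "/audio-to-word": "docx",
    "/audio-to-markdown": "md", "/audio-to-epub": "epub",
}
INCLUDE_FLAGS = {
    "includeTxt": "txt", "includeSrt": "srt", "includeVtt": "vtt",
    "includePdf": "pdf", "includeDocx": "docx",
    "includeMarkdown": "md", "includeEpub": "epub",
}


def expected_formats(endpoint: str, transcription: Optional[dict] = None) -> list:
    """Return the output formats implied by the endpoint and transcription options."""
    opts = transcription or {}
    # The primary format is always produced, whatever its include* flag says.
    primary = opts.get("primaryFormat", ENDPOINT_PRIMARY[endpoint])
    formats = [primary]
    for flag, fmt in INCLUDE_FLAGS.items():
        if opts.get(flag) and fmt not in formats:
            formats.append(fmt)
    return formats


print(expected_formats("/audio-to-pdf", {"includePdf": False, "includeTxt": True}))
# prints ['pdf', 'txt']
```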

### Example — fast mode, TXT + SRT output

```json
{"transcription":{"mode":"fast","includeTxt":true,"includeSrt":true}}
```

### Example — meeting intelligence with all document formats

```json
{"transcription":{"mode":"meeting-intelligence","includeTxt":true,"includeSrt":true,"includeVtt":true,"includePdf":true,"includeDocx":true,"includeMarkdown":true,"includeEpub":true}}
```

> The `primaryFormat` is forced to be included regardless of its `include*` flag. For example, if you submit via the audio-to-pdf endpoint, PDF is always included even if `includePdf` is `false`.

> **Meeting Intelligence** adds automatic speaker diarization (labeling who said what), an AI-generated meeting summary, and extracted action items. Speaker labels are included in all output formats. Summaries and action items are embedded in TXT, PDF, DOCX, Markdown, and EPUB outputs — subtitle formats (SRT, VTT) include speaker labels only.
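The split between subtitle and document outputs described above can be written as a small lookup. A sketch for illustration only; `meeting_extras` and the label strings it returns are hypothetical names, not values returned by the API:

```python
# Subtitle formats carry speaker labels only; all other formats also embed
# the AI summary and action items.
SUBTITLE_FORMATS = {"srt", "vtt"}
ALL_FORMATS = {"txt", "srt", "vtt", "pdf", "docx", "md", "epub"}


def meeting_extras(fmt: str) -> list:
    """List the Meeting Intelligence additions expected in a given output format."""
    if fmt not in ALL_FORMATS:
        raise ValueError(f"unknown output format: {fmt!r}")
    extras = ["speaker_labels"]
    if fmt not in SUBTITLE_FORMATS:
        extras += ["summary", "action_items"]
    return extras


print(meeting_extras("srt"))  # prints ['speaker_labels']
print(meeting_extras("pdf"))  # prints ['speaker_labels', 'summary', 'action_items']
```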
