Options Reference

Pass options as a JSON string in the options form field when submitting a conversion.

-F 'options={"resize":{"enabled":true,"preset":"1080p"}}'
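The same form field value can be produced programmatically. The following is a minimal Python sketch using only the standard library; the helper name `build_options_field` is hypothetical, and the request itself (URL, authentication, file field) is left out because this section does not specify them.

```python
import json


def build_options_field(options: dict) -> str:
    """Serialize an options dict into the JSON string sent in the `options` form field."""
    # Compact separators drop optional whitespace, matching the style used in this reference.
    return json.dumps(options, separators=(",", ":"))


options_field = build_options_field({"resize": {"enabled": True, "preset": "1080p"}})
print(options_field)
```

Serializing with `json.dumps` rather than hand-writing the string avoids quoting mistakes, for example Python's `True` becoming JSON's `true`.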

Overview

| Option | Description | Used by |
| --- | --- | --- |
| chapterSplit | Split audiobooks into separate files per chapter | Audiobook converters |
| metadata | EXIF/GPS metadata handling for privacy and file size reduction | Images, Documents converters |
| ocr | Optical Character Recognition settings for document conversion | Documents converters |
| resize | Resize image by pixels or percentage with presets (4K, 2K, 1080p, 720p, custom) | Images, Documents converters |
| smartCompression | Optimize file size using format-specific compression (MozJPEG, pngquant, etc.) | Images converters |
| transcription | Audio transcription settings: mode, output formats, and primary format | Audio transcription converters |

chapterSplit

Split audiobooks into separate files per chapter.

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | true, false | false | Whether to split audiobooks into separate files per chapter. |

Example

{"chapterSplit":{"enabled":true}}

metadata

EXIF/GPS metadata handling for privacy and file size reduction

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| strip | boolean | true, false | true | Remove EXIF metadata (GPS location, camera info, timestamps) |

Example

{"metadata":{"strip":true}}

ocr

Optical Character Recognition settings for document conversion

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | true, false | true | Enable OCR for scanned documents |
| force | boolean | true, false | false | Force OCR even on text-selectable PDFs |

Example

{"ocr":{"enabled":true}}

OCR is enabled by default. For text-selectable PDFs, this may add processing time. Set enabled: false if your PDFs already contain selectable text and you want faster conversion.

resize

Resize image by pixels or percentage with presets (4K, 2K, 1080p, 720p, custom)

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | true, false | false | Enable post-processing resize |
| mode | string | pixels, percentage |  | Resize mode: 'pixels' targets longest edge, 'percentage' scales proportionally |
| preset | string | off, 4k, 2k, 1080p, 720p, custom | off | Quick resize presets |
| value | integer | 1–10000 |  | Resize value: pixels mode (1–10000, longest edge), percentage mode (1–99) |

Value ranges by mode: In pixels mode, value is the target longest edge in pixels (1–10000). In percentage mode, value is a scale percentage (1–99). The JSON Schema enforces 1–10000 for both modes; in percentage mode, values above 99 are clamped to 99.
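The range rules above can be mirrored on the client before submitting. This is a sketch under stated assumptions, not the service's own validation: the helper `resize_options` is hypothetical, and combining `"preset": "custom"` with `mode` and `value` is an assumption about how custom sizes are requested.

```python
def resize_options(mode: str, value: int) -> dict:
    """Build a resize option for a custom size, checking the documented ranges locally."""
    if mode not in ("pixels", "percentage"):
        raise ValueError("mode must be 'pixels' or 'percentage'")
    # The schema range documented above applies to both modes.
    if not 1 <= value <= 10000:
        raise ValueError("value must be between 1 and 10000")
    if mode == "percentage":
        # Mirror the documented clamp: percentages above 99 become 99.
        value = min(value, 99)
    # Assumption: custom sizes are requested with preset "custom" plus mode and value.
    return {"resize": {"enabled": True, "preset": "custom", "mode": mode, "value": value}}


print(resize_options("pixels", 2048))
print(resize_options("percentage", 150))
```

Clamping locally makes the effective percentage visible in the payload instead of relying on the clamp applied after submission.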

Example

{"resize":{"enabled":true,"preset":"1080p"}}

smartCompression

Optimize file size using format-specific compression (MozJPEG, pngquant, etc.)

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | boolean | true, false | false | Enable smart compression via Compress.FAST |
| mode | string | lossy, lossless | lossy | Compression mode, for PNG-output converters only (lossy = smaller files via pngquant, lossless = no quality loss via optipng). Not available for JPG, WebP, AVIF, or other outputs. |

Example — basic (most converters)

{"smartCompression":{"enabled":true}}

Example — PNG output (with mode)

{"smartCompression":{"enabled":true,"mode":"lossy"}}
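Because `mode` only applies to PNG-output converters, a client may want to guard against sending it elsewhere. The sketch below is illustrative: `smart_compression_options` and its `output_format` parameter are hypothetical names, and the caller is assumed to know the output format of the converter it chose.

```python
from typing import Optional


def smart_compression_options(output_format: str, mode: Optional[str] = None) -> dict:
    """Build a smartCompression option, allowing `mode` only for PNG output."""
    options = {"enabled": True}
    if mode is not None:
        if output_format.lower() != "png":
            raise ValueError("mode is only available for PNG-output converters")
        if mode not in ("lossy", "lossless"):
            raise ValueError("mode must be 'lossy' or 'lossless'")
        options["mode"] = mode
    return {"smartCompression": options}


print(smart_compression_options("jpg"))
print(smart_compression_options("png", "lossless"))
```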

transcription

Audio transcription settings — mode, output formats, and primary format.

Audio transcription converters can produce multiple output formats in a single job. By default, only the primary format (determined by the endpoint) is generated. Use the include* flags to add additional formats alongside it.

Modes

| Mode | Cost | Engine | Best for |
| --- | --- | --- | --- |
| fast | 2 credits/min | Groq whisper-large-v3-turbo | Clear audio, podcasts, lectures |
| quality | 5 credits/min | Groq whisper-large-v3 | Accented speech, noisy recordings, technical content |
| meeting-intelligence | 8 credits/min | Groq whisper-large-v3 + AI analysis | Meetings, interviews; adds speaker labeling, AI summary, and action items |
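The per-minute rates above allow a rough cost estimate before submitting a job. This is a sketch only: `estimate_credits` is a hypothetical helper, and it multiplies the rate by the duration without rounding, since this section does not say how partial minutes are billed.

```python
# Rates copied from the modes table above.
CREDITS_PER_MINUTE = {"fast": 2, "quality": 5, "meeting-intelligence": 8}


def estimate_credits(duration_minutes: float, mode: str = "fast") -> float:
    """Estimate transcription cost as rate x duration (no rounding applied)."""
    if duration_minutes < 0:
        raise ValueError("duration_minutes must not be negative")
    if mode not in CREDITS_PER_MINUTE:
        raise ValueError(f"unknown mode: {mode}")
    return CREDITS_PER_MINUTE[mode] * duration_minutes


for mode in CREDITS_PER_MINUTE:
    print(mode, estimate_credits(45, mode))
```

For a 45-minute recording this gives 90, 225, and 360 credits for the three modes, subject to whatever rounding the service actually applies.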

Properties

| Property | Type | Values | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | fast, quality, meeting-intelligence | fast | Transcription mode (determines engine and cost) |
| primaryFormat | string | txt, srt, vtt, pdf, docx, md, epub | (endpoint-dependent) | Primary output format; defaults to the format implied by the endpoint |
| includeTxt | boolean | true, false | false | Include plain text transcript |
| includeSrt | boolean | true, false | false | Include SRT subtitles |
| includeVtt | boolean | true, false | false | Include WebVTT subtitles |
| includePdf | boolean | true, false | false | Include PDF transcript |
| includeDocx | boolean | true, false | false | Include Word document |
| includeMarkdown | boolean | true, false | false | Include Markdown transcript |
| includeEpub | boolean | true, false | false | Include EPUB ebook |

primaryFormat defaults

The default primaryFormat is derived from the endpoint you submit to — no explicit option is needed:

| Endpoint | Default primaryFormat |
| --- | --- |
| /audio-to-txt | txt |
| /audio-to-srt | srt |
| /audio-to-vtt | vtt |
| /audio-to-pdf | pdf |
| /audio-to-word | docx |
| /audio-to-markdown | md |
| /audio-to-epub | epub |

To override it, set primaryFormat in the transcription options, for example "primaryFormat": "srt". The overridden format is always included in the output regardless of its include* flag.

Example — fast mode, TXT + SRT output

{"transcription":{"mode":"fast","includeTxt":true,"includeSrt":true}}

Example — meeting intelligence with all document formats

{"transcription":{"mode":"meeting-intelligence","includeTxt":true,"includeSrt":true,"includeVtt":true,"includePdf":true,"includeDocx":true,"includeMarkdown":true}}

The primaryFormat is forced to be included regardless of its include* flag. For example, if you submit via the audio-to-pdf endpoint, PDF is always included even if includePdf is false.
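The interaction between the endpoint default, a primaryFormat override, and the include* flags can be sketched as a small resolver. This reflects one reading of the rules above, not the service's implementation: `output_formats` is a hypothetical helper, and the flag-to-format mapping (for example includeMarkdown to md) is inferred from the property and endpoint tables.

```python
from typing import Dict, List

# Endpoint defaults, copied from the table above.
ENDPOINT_PRIMARY: Dict[str, str] = {
    "/audio-to-txt": "txt",
    "/audio-to-srt": "srt",
    "/audio-to-vtt": "vtt",
    "/audio-to-pdf": "pdf",
    "/audio-to-word": "docx",
    "/audio-to-markdown": "md",
    "/audio-to-epub": "epub",
}

# Assumption: each include* flag maps to the matching primaryFormat value.
INCLUDE_FLAGS: Dict[str, str] = {
    "includeTxt": "txt",
    "includeSrt": "srt",
    "includeVtt": "vtt",
    "includePdf": "pdf",
    "includeDocx": "docx",
    "includeMarkdown": "md",
    "includeEpub": "epub",
}


def output_formats(endpoint: str, transcription: dict) -> List[str]:
    """Return the formats a job would be expected to produce, sorted alphabetically."""
    # An explicit primaryFormat overrides the endpoint default.
    primary = transcription.get("primaryFormat", ENDPOINT_PRIMARY[endpoint])
    # The primary format is always included, whatever its include* flag says.
    formats = {primary}
    for flag, fmt in INCLUDE_FLAGS.items():
        if transcription.get(flag, False):
            formats.add(fmt)
    return sorted(formats)


print(output_formats("/audio-to-pdf", {"includePdf": False, "includeSrt": True}))
```

Under this reading, submitting to /audio-to-pdf with includePdf set to false and includeSrt set to true still yields both pdf and srt.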

Meeting Intelligence adds automatic speaker diarization (labeling who said what), an AI-generated meeting summary, and extracted action items. Speaker labels are included in all output formats. Summaries and action items are embedded in TXT, PDF, DOCX, Markdown, and EPUB outputs — subtitle formats (SRT, VTT) include speaker labels only.
