Scalable audio REST API to convert, trim, concatenate, optimize, and compress audio files.
To process audio, the input file must be uploaded to your account, or accessible via a URL:
Use the Bytescale Dashboard to upload a file manually.
Use the Upload Widget, Bytescale SDKs or Bytescale API to upload a file programmatically.
Create external HTTP folders to process files hosted externally.
Get the input file's /raw/ URL before continuing.
To build an audio processing URL:
Get the /raw/ URL for your input file.
Replace /raw/ with /audio/.
Add the querystring parameters documented on this page.
Play your audio by navigating to the resulting URL.
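The steps above can be sketched as a small helper. This is illustrative only — `buildAudioUrl` is a hypothetical function, not part of any Bytescale SDK:

```javascript
// Sketch: convert a file's /raw/ URL into an audio processing URL.
// buildAudioUrl is a hypothetical helper, not part of any Bytescale SDK.
function buildAudioUrl(rawUrl, params) {
  // Step 2: replace /raw/ with /audio/.
  const audioUrl = rawUrl.replace("/raw/", "/audio/");
  // Step 3: append the querystring parameters documented on this page.
  const query = new URLSearchParams(params).toString();
  return query ? `${audioUrl}?${query}` : audioUrl;
}

// Example: request an MP3 at 128 kbps.
const url = buildAudioUrl("https://upcdn.io/W142hJk/raw/example.mp3", {
  f: "mp3",
  br: "128"
});
// url === "https://upcdn.io/W142hJk/audio/example.mp3?f=mp3&br=128"
```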
By default, your audio will be encoded to AAC.
The HTTP response will be an HTML webpage with an embedded audio player that's hardcoded to play your audio.
You can change this behaviour — e.g. to return an audio file instead of an audio player — using the parameters documented on this page.
To embed audio in a webpage using Video.js:
```html
<!DOCTYPE html>
<html>
<head>
  <link href="https://unpkg.com/video.js@7/dist/video-js.min.css" rel="stylesheet">
  <script src="https://unpkg.com/video.js@7/dist/video.min.js"></script>
  <style type="text/css">
    .audio-container {
      height: 316px;
      max-width: 600px;
    }
  </style>
</head>
<body>
  <div class="audio-container">
    <video-js class="vjs-fill vjs-big-play-centered" controls preload="auto">
      <p class="vjs-no-js">To play this audio please enable JavaScript.</p>
    </video-js>
  </div>
  <script>
    var vid = document.querySelector('video-js');
    var player = videojs(vid, {responsive: true});
    player.on('loadedmetadata', function() {
      // Begin playing from the start of the audio. (Required for 'f=hls-aac-rt'.)
      player.currentTime(player.seekable().start(0));
    });
    player.src({
      src: 'https://upcdn.io/W142hJk/audio/example.mp3!f=hls-aac-rt&br=80&br=256',
      type: 'application/x-mpegURL'
    });
  </script>
</body>
</html>
```
Audio encoded using f=hls-aac-rt takes ~10 seconds to play initially and ~100ms on all subsequent requests.
To create an MP3 file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=mp3 to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the MP3 file:
https://upcdn.io/W142hJk/audio/example.mp3?f=mp3
```json
{
  "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
  "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
  "jobId": "01H3211XMV1VH829RV697VE3WM",
  "jobType": "ProcessFileJob",
  "accountId": "W142hJk",
  "created": 1686916626075,
  "lastUpdated": 1686916669389,
  "status": "Succeeded",
  "summary": {
    "result": {
      "type": "Artifact",
      "artifact": "/audio.mp3",
      "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=mp3&a=/audio.mp3"
    }
  }
}
```
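The poll-until-succeeded loop can be sketched as follows. This is a sketch, not an official client: `pollJob` is a hypothetical helper that accepts any fetch-compatible function, and the field names (`status`, `summary.result.artifactUrl`) follow the example JSON response above:

```javascript
// Sketch: poll an audio processing URL until the job succeeds.
// pollJob is a hypothetical helper; field names ("status", "summary")
// follow the example JSON response shown above.
async function pollJob(url, fetchFn, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchFn(url);
    const json = await response.json();
    if (json.status === "Succeeded") {
      // The result contains the URL to the encoded file.
      return json.summary.result.artifactUrl;
    }
    if (json.status === "Failed") {
      throw new Error("Transcoding failed");
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for transcoding to complete");
}
```

The same loop applies to the AAC and WAV examples below; only the `f` parameter differs.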
To create an AAC file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=aac to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the AAC file:
https://upcdn.io/W142hJk/audio/example.mp3?f=aac
```json
{
  "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
  "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
  "jobId": "01H3211XMV1VH829RV697VE3WM",
  "jobType": "ProcessFileJob",
  "accountId": "W142hJk",
  "created": 1686916626075,
  "lastUpdated": 1686916669389,
  "status": "Succeeded",
  "summary": {
    "result": {
      "type": "Artifact",
      "artifact": "/audio.aac",
      "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=aac&a=/audio.aac"
    }
  }
}
```
To create a WAV file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=wav-riff to the URL.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the WAV file:
https://upcdn.io/W142hJk/audio/example.mp3?f=wav-riff
```json
{
  "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
  "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
  "jobId": "01H3211XMV1VH829RV697VE3WM",
  "jobType": "ProcessFileJob",
  "accountId": "W142hJk",
  "created": 1686916626075,
  "lastUpdated": 1686916669389,
  "status": "Succeeded",
  "summary": {
    "result": {
      "type": "Artifact",
      "artifact": "/audio.wav",
      "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=wav-riff&a=/audio.wav"
    }
  }
}
```
To create an HTTP Live Streaming (HLS) file:
Upload an input file (e.g. an audio or video file) or create an external file source.
Replace /raw/ with /audio/ in the file's URL, and then append ?f=hls-aac to the URL.
Add parameters from the Audio Transcoding API or Audio Compression API as needed.
You can create adaptive bitrate (ABR) audio by specifying multiple groups of bitrate and/or sample rate parameters. The end-user's audio player will automatically switch to the most appropriate variant during playback. By default, a single 96 kbps variant is produced.
You can specify up to 10 variants. Each variant's parameters must be adjacent on the querystring. For example: br=80&sr=24&br=256&sr=48 specifies 2 variants, whereas br=80&br=256&sr=24&sr=48 specifies 3 variants (which would most likely be a mistake). You can add next=true between groups of parameters to forcefully split them into separate variants.
Navigate to the URL (i.e. request the URL using a simple GET request).
Wait for status: "Succeeded" in the JSON response.
The result will contain a URL to the HTTP Live Streaming (HLS) file:
https://upcdn.io/W142hJk/audio/example.mp3?f=hls-aac&br=80&br=256
```json
{
  "jobUrl": "https://api.bytescale.com/v2/accounts/W142hJk/jobs/ProcessFileJob/01H3211XMV1VH829RV697VE3WM",
  "jobDocs": "https://www.bytescale.com/docs/job-api/GetJob",
  "jobId": "01H3211XMV1VH829RV697VE3WM",
  "jobType": "ProcessFileJob",
  "accountId": "W142hJk",
  "created": 1686916626075,
  "lastUpdated": 1686916669389,
  "status": "Succeeded",
  "summary": {
    "result": {
      "type": "Artifact",
      "artifact": "/audio.m3u8",
      "artifactUrl": "https://upcdn.io/W142hJk/audio/example.mp3!f=hls-aac&br=80&br=256&a=/audio.m3u8"
    }
  }
}
```
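The variant-grouping rules described above (adjacent parameters form one variant; a repeated key or `next=true` starts a new one) can be sketched as a small parser. This is illustrative only — `parseVariants` is not part of any Bytescale SDK, and you would pass it just the variant portion of the querystring (e.g. the `br`/`sr`/`next` parameters):

```javascript
// Sketch: split adjacent querystring parameters into HLS variant groups,
// following the adjacency rules described above. Illustrative only.
function parseVariants(queryString) {
  const variants = [];
  let current = {};
  for (const [key, value] of new URLSearchParams(queryString)) {
    if (key === "next") {
      // next=true forcefully starts a new variant group.
      if (Object.keys(current).length > 0) variants.push(current);
      current = {};
    } else if (key in current) {
      // A repeated key starts a new variant group.
      variants.push(current);
      current = { [key]: value };
    } else {
      current[key] = value;
    }
  }
  if (Object.keys(current).length > 0) variants.push(current);
  return variants;
}

// Matches the examples above:
// parseVariants("br=80&sr=24&br=256&sr=48") → 2 variants
// parseVariants("br=80&br=256&sr=24&sr=48") → 3 variants (likely a mistake)
```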
Real-time encoding allows you to return HLS manifests (.m3u8 files) while they're being transcoded.
The benefit of real-time encoding is the ability to play web-optimized audio files within seconds of uploading them, rather than having to wait for audio transcoding jobs to complete.
To create HTTP Live Streaming (HLS) audio with real-time encoding:
Complete the steps from creating HLS audio.
Replace f=hls-aac with f=hls-aac-rt.
The result will be an M3U8 file that's dynamically updated as new segments finish transcoding:
https://upcdn.io/W142hJk/audio/example.mp4?f=hls-aac-rt
```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-STREAM-INF:BANDWIDTH=2038521,AVERAGE-BANDWIDTH=2038521,CODECS="mp4a.40.2"
example.mp3!f=hls-aac-rt&a=/0f/manifest.m3u8
```
The Audio Metadata API allows you to extract the audio file's duration, codec, and more.
To extract an audio file's duration using JavaScript:
```html
<!DOCTYPE html>
<html>
<body>
  <p>Please wait, loading audio metadata...</p>
  <script>
    async function getAudioDuration() {
      const response = await fetch("https://upcdn.io/W142hJk/audio/example.mp4?f=meta");
      const jsonData = await response.json();
      const audioTrack = (jsonData.tracks ?? []).find(x => x.type === "Audio");
      if (audioTrack === undefined) {
        alert("Cannot find audio metadata.")
      } else {
        alert(`Duration (seconds): ${audioTrack.duration}`)
      }
    }
    getAudioDuration().then(() => {}, e => alert(`Error: ${e}`))
  </script>
</body>
</html>
```
The Audio Processing API can transcode audio from both video and audio files.
The Audio Processing API can transcode audio from the following audio inputs:
File Extension(s) | Audio Container | Audio Codecs |
---|---|---|
.wma, .asf | Advanced Systems Format (ASF) | WMA, WMA2, WMA Pro |
.fla, .flac | FLAC | FLAC |
.mp3 | MPEG-1 Layer 3 | MP3 |
.ts, .m2ts | MPEG-2 TS | MP2, PCM |
.aac, .mp4, .m4a | MPEG-4 | AAC |
.mka | Matroska Audio Container | Opus, FLAC |
.oga | Ogg | Opus, Vorbis, FLAC |
.wav | Waveform Audio File | PCM |
The Audio Processing API can transcode audio from the following video inputs:
File Extension(s) | Video Container | Video Codecs |
---|---|---|
.m2v, .mpeg, .mpg | No Container | AVC (H.264), DV/DVCPRO, HEVC (H.265), MPEG-1, MPEG-2 |
.3g2 | 3G2 | AVC (H.264), H.263, MPEG-4 part 2 |
.3gp | 3GP | AVC (H.264), H.263, MPEG-4 part 2 |
.wmv | Advanced Systems Format (ASF) | VC-1 |
.flv | Adobe Flash | AVC (H.264), Flash 9 File, H.263 |
.avi | Audio Video Interleave (AVI) | Uncompressed, Canopus HQ, DivX/Xvid, DV/DVCPRO, MJPEG |
.m3u8 | HLS (MPEG-2 TS segments) | AVC (H.264), HEVC (H.265), MPEG-2 |
.mxf | Interoperable Master Format (IMF) | Apple ProRes, JPEG 2000 (J2K) |
.mxf | Material Exchange Format (MXF) | Uncompressed, AVC (H.264), AVC Intra 50/100, Apple ProRes (4444, 4444 XQ, 422, 422 HQ, LT, Proxy), DV/DVCPRO, DV25, DV50, DVCPro HD, JPEG 2000 (J2K), MPEG-2, Panasonic P2, SonyXDCam, SonyXDCam MPEG-4 Proxy, VC-3 |
.mkv | Matroska | AVC (H.264), MPEG-2, MPEG-4 part 2, PCM, VC-1 |
.mpg, .mpeg, .m2p, .ps | MPEG Program Streams (MPEG-PS) | MPEG-2 |
.m2t, .ts, .tsv | MPEG Transport Streams (MPEG-TS) | AVC (H.264), HEVC (H.265), MPEG-2, VC-1 |
.dat, .m1v, .mpeg, .mpg, .mpv | MPEG-1 System Streams | MPEG-1, MPEG-2 |
.mp4, .mpeg4 | MPEG-4 | Uncompressed, DivX/Xvid, H.261, H.262, H.263, AVC (H.264), AVC Intra 50/100, HEVC (H.265), JPEG 2000, MPEG-2, MPEG-4 part 2, VC-1 |
.mov, .qt | QuickTime | Uncompressed, Apple ProRes (4444, 4444 XQ, 422, 422 HQ, LT, Proxy), DV/DVCPRO, DivX/Xvid, H.261, H.262, H.263, AVC (H.264), AVC Intra 50/100, HEVC (H.265), JPEG 2000 (J2K), MJPEG, MPEG-2, MPEG-4 part 2, QuickTime Animation (RLE) |
.webm | WebM | VP8, VP9 |
Use the Audio Metadata API to extract the duration, codec, and other information from an audio file.
Instructions:
Replace /raw/ with /audio/ in your audio URL.
Append ?f=meta to the URL.
The result will be a JSON payload describing the audio's tracks (see below).
Example audio metadata JSON response:
```json
{
  "tracks": [
    {
      "bitRate": 159980,
      "bitRateMode": "VBR",
      "channels": 2,
      "codec": "AAC",
      "codecId": "mp4a-40-2",
      "frameCount": 35875,
      "frameRate": 46.875,
      "samplingRate": 48000,
      "title": "Stereo",
      "type": "Audio"
    }
  ]
}
```
Use the Audio Transcoding API to transcode your audio to a specific format.
Use the f parameter to change the output format of the audio:
Format | Transcoding | Compression | Browser Support |
---|---|---|---|
f=mp3 | async | good | all |
f=aac | async | excellent | all |
f=wav-riff | async | none | none |
f=wav-rf64 | async | none | none |
f=hls-aac | async | excellent | requires SDK |
f=hls-aac-rt | real-time | excellent | requires SDK |
Which output format should I use?
Use f=hls-aac-rt to create web-optimized audio that plays while it's being transcoded.
Omit the f parameter to get a shareable link to your audio (encoded in AAC).
Which audio SDK should I use?
For f=hls-* formats you need to use an audio player SDK that supports HLS (e.g. Video.js). For other formats you can use HTML5's <audio> element.
What is async transcoding?
Asynchronous transcoding means Bytescale will return a JSON response that initially contains "status": "Running". You must poll the URL until "status": "Succeeded" is returned, at which point the JSON response will contain the URL to the encoded audio.
What is real-time transcoding?
Real-time transcoding means Bytescale will stream the audio to your device while it's being transcoded: instead of receiving a JSON response you will receive an M3U8 response. This allows you to start playing transcoded audio within seconds of uploading it.
Transcodes the audio to MP3 (.mp3).
Transcoding: asynchronous (poll for completion)
Response: JSON (contains the URL to the MP3 file on completion)
Transcodes the audio to AAC (.aac).
Transcoding: asynchronous (poll for completion)
Response: JSON (contains the URL to the AAC file on completion)
Transcodes the audio to Waveform (.wav) using the RIFF wave format.
Transcoding: asynchronous (poll for completion)
Response: JSON (contains the URL to the WAV file on completion)
Transcodes the audio to Waveform (.wav) using the RF64 wave format (to support output audio larger than 4GB).
Transcoding: asynchronous (poll for completion)
Response: JSON (contains the URL to the WAV file on completion)
Transcodes the audio to HLS AAC (.m3u8).
Transcoding: asynchronous (poll for completion)
Response: JSON (contains the URL to the M3U8 file on completion)
Browser support: all browsers (requires an audio player SDK with HLS support, like Video.js)
Transcodes the audio to HLS AAC (.m3u8) in real-time.
Transcoding: real-time
During transcoding: there will be an initial ~10 second delay before the first HTTP response. Subsequent HTTP responses will return the M3U8 file with new audio segments appended as transcoding progresses (similar to a live audio stream). Generic audio players may hide playback controls and begin playback near the end of the audio. To overcome this, we recommend using an audio player SDK that allows you to force playback from the beginning and to show the seek bar for live HLS feeds. See the source code returned by f=html-aac for a working example of using Video.js to gracefully play audio transcoded with f=hls-aac-rt.
Response: M3U8
Browser support: all browsers (requires an audio player SDK with HLS support, like Video.js)
Returns a webpage with an embedded audio player that's configured to play the requested audio in AAC.
Useful for sharing links to audio files and for previewing/debugging audio transformation parameters.
Transcoding: real-time
Response: HTML
This is the default value.
Returns metadata for the audio file (duration, codec, etc.)
See the Audio Metadata API docs for more information.
Response: JSON (audio metadata)
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=auto) will be returned to the user while it's being transcoded only if the transcode rate is faster than the playback rate.
Only supported by f=hls-aac-rt and f=html-aac.
This is the default value.
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=false) will never be returned to the user while it's being transcoded.
Use this option as a performance optimization (instead of using rt=auto) when you know the variant will always transcode at a slower rate than its playback rate:
•When rt=auto is used, the initial HTTP request for the M3U8 master manifest will block until the first few segments of each rt=auto and rt=true variant have been transcoded, before returning the initial playlist.
•In general, you want to exclude slow-transcoding HLS variants to reduce this latency.
If none of the HLS variants have rt=true or rt=auto then the fastest variant to transcode will be returned during transcoding.
Only supported by f=hls-aac-rt and f=html-aac.
If this flag is present, the audio variant expressed by the adjacent parameters on the querystring (e.g. br=80&rt=true&br=256&rt=auto) will always be returned to the user while it's being transcoded.
Only supported by f=hls-aac-rt and f=html-aac.
Use the Audio Compression API to control the file size of your audio.
Sets the output audio bitrate (kbps).
Supported values for f=aac, f=hls-aac, f=hls-aac-rt and f=html-aac:
•16
•20
•24
•28
•32
•40
•48
•56
•64
•80
•96
•112
•128
•160
•192
•224
•256
•288
•320
•384
•448
•512
•576
Supported values for f=mp3:
•16
•24
•32
•40
•48
•56
•64
•72
•80
•88
•96
•104
•112
•120
•128
•136
•144
•152
•160
•168
•176
•184
•192
•200
•208
•216
•224
•232
•240
•248
•256
•264
•272
•280
•288
•296
Not applicable to f=wav-riff or f=wav-rf64 (Waveform audio files do not have a bitrate).
Default: 96
Sets the output audio sample rate (kHz).
Supported values for f=aac, f=hls-aac, f=hls-aac-rt and f=html-aac:
•8
•12
•16
•22.05
•24
•32
•44.1
•48
•88.2
•96
Supported values for f=mp3:
•22.05
•32
•44.1
•48
Supported values for f=wav-riff and f=wav-rf64:
•8
•16
•22.05
•24
•32
•44.1
•48
•88.2
•96
•192
Note: the sample rate will be automatically adjusted if the provided value is unsupported by the requested bitrate for the requested audio format (for example, AAC only supports sample rates between 32 kHz and 48 kHz when a bitrate of 96 kbps is used).
Default: 48
Use the Audio Trimming API to remove parts of the audio from the start and/or end.
Sets the start position of audio, and removes all audio before that point.
If s exceeds the length of the audio, then an error will be returned.
Supports numbers between 0 - 86399 with up to two decimal places. To provide frame accuracy for audio inputs, decimals will be interpreted as frame numbers, not milliseconds.
Sets the end position of audio, and removes all audio after that point.
If e exceeds the length of the audio, then no error will be returned, and the parameter effectively does nothing.
Supports numbers between 0 - 86399 with up to two decimal places. To provide frame accuracy for audio inputs, decimals will be interpreted as frame numbers, not milliseconds.
Applies the trim specified by ts and/or te after the rp parameter is applied.
Applies the trim specified by ts and/or te before the rp parameter is applied.
This is the default value.
Use the Audio Concatenation API to append additional audio files to the primary audio file's timeline.
Appends the audio from another media file (video or audio file) to the output.
All media files specified via append are concatenated in the order they are specified, with the primary input audio (specified on the URL's file path) playing first.
To use: specify the "file path" attribute of another media file as the query parameter's value.
Number of times to play the audio file.
If this parameter appears after an append parameter, then it will repeat the appended audio file only.
If this parameter appears before any append parameters, then it will repeat the primary audio file only.
Default: 1
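The ordering rules above can be sketched as a timeline expansion. This is illustrative only: the `append` name matches the parameter described above, but the `repeat` name is an assumption for this sketch (the repeat parameter's exact name is not shown here), and `expandTimeline` is not part of any Bytescale SDK:

```javascript
// Sketch: expand append/repeat parameters into a playback sequence.
// "append" matches the parameter above; "repeat" is an assumed name for
// the repeat parameter, used here for illustration only.
function expandTimeline(primaryFile, params) {
  // The primary file (from the URL's path) always plays first.
  const segments = [{ file: primaryFile, times: 1 }];
  for (const [key, value] of params) {
    if (key === "append") {
      segments.push({ file: value, times: 1 });
    } else if (key === "repeat") {
      // A repeat applies to the most recently appended file,
      // or to the primary file if no append precedes it.
      segments[segments.length - 1].times = Number(value);
    }
  }
  return segments.flatMap(s => Array(s.times).fill(s.file));
}

// repeat before any append repeats the primary file:
//   expandTimeline("/intro.mp3", [["repeat", "2"], ["append", "/outro.mp3"]])
//   → ["/intro.mp3", "/intro.mp3", "/outro.mp3"]
// repeat after an append repeats the appended file:
//   expandTimeline("/intro.mp3", [["append", "/outro.mp3"], ["repeat", "3"]])
//   → ["/intro.mp3", "/outro.mp3", "/outro.mp3", "/outro.mp3"]
```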
The Audio Processing API is available on all Bytescale Plans.
Your processing quota (see pricing) is consumed based on the output audio file's duration multiplied by a "processing multiplier". The codec of your output audio file determines which multiplier is used.
Audio files can be played an unlimited number of times.
Your processing quota will only be deducted once per URL: for the very first request to the URL.
There is a minimum billable duration of 10 seconds per audio file.
Audio billing example:
A 60-second audio file encoded to AAC would consume 45 seconds (60 × 0.75) from your monthly processing quota.
If the audio file is initially played in January 2023, and is then played 100k times for the following 2 years, then you would be billed 45 seconds in January 2023 and 0 seconds in all the following months. (This assumes you never clear your permanent cache).
Codec | Processing Multiplier |
---|---|
AAC | 0.75 |
MP3 | 0.75 |
WAV | 1.15 |
When using f=hls-aac, f=hls-aac-rt or f=html-aac (which uses f=hls-aac-rt internally) your processing quota will be consumed per HLS variant.
When using f=hls-aac-rt each real-time variant (rt=true or rt=auto) will have an additional 10 seconds added to its billable duration.
The default behaviour for HLS outputs is to produce one HLS AAC variant.
You can change this behaviour using the querystring parameters documented on this page.
HLS pricing example:
Given an input audio file of 60 seconds and the querystring ?f=hls-aac-rt&br=64&br=128&br=256&rt=false, you would be billed:
3×60 seconds for 3× HLS variants (br=64&br=128&br=256).
2×10 seconds for 2× HLS variants using real-time encoding.
The first two variants on the querystring (br=64&br=128) do not specify rt parameters, so will default to rt=auto.
Per the pricing above, real-time variants incur an additional 10 seconds of billable duration.
200 seconds total billed duration: 3×60 + 2×10
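The HLS billing arithmetic can be sketched as a calculation. This mirrors the worked example above (raw variant duration, plus a 10-second surcharge per real-time variant); the codec multiplier and 10-second minimum from the earlier sections are left out for brevity, and `hlsBilledSeconds` is a hypothetical helper:

```javascript
// Sketch: estimate billed seconds for an HLS output, mirroring the
// worked example above. hlsBilledSeconds is a hypothetical helper;
// codec multiplier and 10s minimum billable duration are omitted.
const REALTIME_SURCHARGE_SECONDS = 10;

function hlsBilledSeconds(durationSeconds, variants) {
  // variants: array of { realTime: boolean }, one per HLS variant.
  return variants.reduce(
    (total, v) => total + durationSeconds + (v.realTime ? REALTIME_SURCHARGE_SECONDS : 0),
    0
  );
}

// The worked example above: a 60-second file with 3 variants, 2 real-time.
const example = hlsBilledSeconds(60, [
  { realTime: true },  // br=64 (defaults to rt=auto)
  { realTime: true },  // br=128 (defaults to rt=auto)
  { realTime: false }  // br=256 (rt=false)
]);
// example === 200 (3×60 + 2×10)
```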