Create audio

HTTP request

POST /ai/audio_generation

Authorization

Include your ACCESS TOKEN in the HTTP Authorization header.

Authorization: Bearer Token

Request Parameters

KEY
TYPE
VALUE

prompt

String

A brief description or theme for the audio you want to generate. Example: “Motorcycle engine sound”.

configs

JSON

A JSON object containing the settings that customize the audio generation process. Its fields, described next, control different aspects of the generation.

model

String

Specifies the AI model used for generating the audio. Default is "audiogen".

duration

Integer

The duration of the generated audio clip in seconds. Default is 5.

top_k

Integer

Limits sampling to the k most likely next tokens at each step of the generation, which influences the variety of the generated audio. Common range: 1 to 1000. Default is 15.

top_p

Float

Controls the breadth of token selection based on cumulative probability. A lower top_p value makes the model sample from a smaller, more likely set of tokens, which can help maintain the quality of the generated audio. Common range: 0 to 1. Default is 0.9.
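To make the effect of top_k and top_p concrete, here is a minimal sketch of the two filters applied to a toy probability distribution. It is a generic illustration of the sampling idea, not the service's implementation; the function name `top_k_top_p_filter` and the toy values are hypothetical, and whether the model applies both filters together is an assumption.

```python
def top_k_top_p_filter(probs, top_k=15, top_p=0.9):
    """Return the renormalized distribution left after top-k, then top-p filtering."""
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # top_k: keep only the k most likely tokens.
    ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


# Toy next-token distribution over five tokens (illustrative values only).
toy_probs = [0.5, 0.25, 0.125, 0.0625, 0.0625]
narrow = top_k_top_p_filter(toy_probs, top_k=2, top_p=1.0)  # only two candidates survive
wide = top_k_top_p_filter(toy_probs, top_k=5, top_p=0.9)    # four candidates survive
```

A smaller candidate set gives more predictable output; a larger one leaves room for more varied output.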

User Guide

  1. Craft a Detailed Prompt: Start with a clear and detailed prompt for the audio you envision. Include all relevant details to guide the audio generation effectively. The more descriptive your prompt, the more accurately the AI can generate the desired audio ambiance or music.

  2. Set Duration: Define how long you want your audio clip to be. This can range from a few seconds to several minutes, depending on your needs. Keep in mind that longer durations may increase processing time.

  3. Adjust Sampling Strategy (top_k and top_p): Experiment with the top_k and top_p parameters to control the diversity and creativity of the generated audio. Adjusting these parameters can help you find a balance between randomness and coherence in the audio content.

  4. Generate and Evaluate: Once you have set all the parameters, generate your audio. Listen to the output carefully and evaluate whether it meets your expectations and requirements.

  5. Iterate for Perfection: You are unlikely to get the perfect result on the first try. Use your initial output as a learning experience to refine your prompt, adjust parameters, and experiment with different settings. Iteration can significantly improve the quality and relevance of the generated audio.
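The iteration described in the steps above can be organized as a small parameter sweep. The sketch below only builds request bodies in the documented shape, one per combination of top_k and top_p; the helper name `sweep_payloads` and the candidate values are hypothetical, and sending each body is left to your HTTP client.

```python
from itertools import product


def sweep_payloads(prompt, top_k_values, top_p_values, duration=5, model="audiogen"):
    """Yield one request body per (top_k, top_p) combination."""
    for top_k, top_p in product(top_k_values, top_p_values):
        yield {
            "prompt": prompt,
            "configs": {
                "model": model,
                "duration": duration,
                "top_k": top_k,
                "top_p": top_p,
            },
        }


# Illustrative candidate values; choose your own based on what you hear.
payloads = list(
    sweep_payloads("The motorbike engine is accelerating", [15, 250], [0.7, 0.9])
)
```

Keeping the prompt and duration fixed while varying one pair of settings makes it easier to tell which change affected the result.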

Example Request

{
  "prompt": "The motorbike engine is accelerating",
  "configs": {
    "model": "audiogen",
    "duration": 5,
    "top_k": 15,
    "top_p": 0.9
  }
}
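As a rough sketch, the request above could be assembled in Python with the standard library as follows. The base URL `https://api.example.com` and the token value are placeholders, the helper name `build_audio_request` is hypothetical, and sending the body as JSON with a `Content-Type: application/json` header is an assumption based on the example request.

```python
import json
import urllib.request


def build_audio_request(base_url, access_token, prompt, **configs):
    """Build (but do not send) a POST request for /ai/audio_generation."""
    body = {"prompt": prompt, "configs": configs}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ai/audio_generation",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",  # assumption: JSON request body
        },
        method="POST",
    )


request = build_audio_request(
    "https://api.example.com",  # placeholder: replace with the real API host
    "YOUR_ACCESS_TOKEN",        # placeholder: replace with your access token
    "The motorbike engine is accelerating",
    model="audiogen",
    duration=5,
    top_k=15,
    top_p=0.9,
)
# To send it: urllib.request.urlopen(request)
```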

Parrot API

audio_task = parrot.create_txt2audio(prompt, model, duration, top_k, top_p)

Response

Returns the ID of the successful task.

{
  "data": {
    "task_id": "f8184aff872c4062ac7fb7e3a40dafac",
    "prompt": "The motorbike engine is accelerating",
    "config": {
      "model": "audiogen",
      "duration": 5,
      "top_k": 15,
      "top_p": 0.9,
      "task_name": "tasks.parrot_audiogen_task",
      "task_type": "TEXT-TO-AUDIO"
    }
  },
  "errors": [],
  "error_description": "",
  "start_time": "2024-03-15 01:34:49.257219",
  "end_time": "2024-03-15 01:34:49.359186",
  "host_of_client_call_request": "103.186.100.36",
  "total_time_by_second": 0.101979,
  "status": "success"
}
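A minimal sketch of reading the task ID out of such a response is shown below. The helper name `extract_task_id` is hypothetical, and the shape of a failed response (a non-"success" status with a populated `errors` list or `error_description`) is an assumption, since only the success case is documented here.

```python
import json


def extract_task_id(response_text):
    """Return data.task_id from a response body, or raise if the call failed."""
    payload = json.loads(response_text)
    # Assumption: failures are signalled by status != "success" or a non-empty errors list.
    if payload.get("status") != "success" or payload.get("errors"):
        raise RuntimeError(
            payload.get("error_description") or "audio generation request failed"
        )
    return payload["data"]["task_id"]


# Trimmed copy of the documented success response.
sample = (
    '{"data": {"task_id": "f8184aff872c4062ac7fb7e3a40dafac"}, '
    '"errors": [], "error_description": "", "status": "success"}'
)
task_id = extract_task_id(sample)
```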
