Text Completion

HTTP request

POST /ai/text_completion?stream={stream}

Authorization

Include your ACCESS TOKEN in the HTTP Authorization header:

Authorization: Bearer <ACCESS_TOKEN>
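
For example, with Python's requests library the header can be attached like this (a minimal sketch; YOUR_ACCESS_TOKEN is a placeholder for your actual token):

import requests

# Attach the access token to every API call (placeholder value)
headers = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}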

Request Parameters

  • Query Parameters

    stream (Boolean): If stream is set to true, the result is streamed back as it is generated. If set to false, the full result is returned only after generation completes. The default value is false.

  • JSON Body

    messages (List): A list containing the messages intended for text generation. Each message in the list is a small JSON object specifying the role (such as "user" or "assistant") and the actual content of the message (e.g., "Hello!"). This setup helps the AI understand the context and the kind of interaction taking place. Example: [{"role": "user", "content": "Hello!"}].

    configs (JSON): A JSON object containing the settings below, which you can adjust to customize the text generation process.

    model (String): Specifies the AI model used for generating the text. Default value is "gemma-7b".

    temperature (Float): Controls the randomness or diversity of the generation process. A higher temperature encourages the model to explore a wider range of possibilities, making the output more varied and sometimes more creative. Common range: 0 to 1. Default value is 0.7.

    top_k (Integer): Limits sampling to the k most likely next tokens at each step of the generation. A lower value of k focuses the model on higher-probability tokens, often leading to more predictable and coherent outcomes. Common range: 1 to 1000. Default value is 50.

    top_p (Float): Controls the breadth of token selection based on cumulative probability. A lower top_p means the model samples from a smaller, more likely set of tokens, which helps maintain the relevance and quality of the generated content. Common range: 0 to 1. Default value is 0.5.

    max_tokens (Integer): Sets the maximum number of tokens the model can generate. It acts as a cap, ensuring that generation does not exceed a certain length, which is crucial for keeping the content focused and within desired constraints. The maximum limit is 4096. Default value is 512.
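
The documented ranges can be enforced client-side before a request is sent. The helper below is a hypothetical sketch (the function name and the clamping behavior are illustrative, not part of the API), using the documented defaults:

# Hypothetical helper: builds a configs dict with the documented
# defaults and clamps each value to its documented range.
def build_configs(model="gemma-7b", temperature=0.7, top_k=50,
                  top_p=0.5, max_tokens=512):
    return {
        "model": model,
        "temperature": min(max(temperature, 0.0), 1.0),  # common range 0-1
        "top_k": min(max(top_k, 1), 1000),               # common range 1-1000
        "top_p": min(max(top_p, 0.0), 1.0),              # common range 0-1
        "max_tokens": min(max_tokens, 4096),             # hard limit 4096
    }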

User Guide

  1. Craft Your Input: Gather your messages, including who's speaking and what's said, e.g., {"role": "user", "content": "What's the weather like?"}. Keep messages clear and relevant.

  2. Choose a Model: Pick an AI model, like "gemma-7b". The model influences the quality and style of replies.

  3. Set Your Parameters:

  • temperature: Affects creativity. Higher values give more varied responses.

  • top_k and top_p: Control response diversity. Lower values give more focused answers.

  • max_tokens: Sets the maximum length of replies. Keep it practical for chatbot interactions.

  4. Fine-tune for Quality: Experiment with temperature, top_k, and top_p to find the sweet spot between creativity and relevance.

  5. Limit Response Length: Use max_tokens to ensure responses are concise and to the point.

  6. Evaluate and Adjust: Review the generated responses. If they don't meet your needs, tweak the input or settings.

Example Request

{
  "messages": [
    {
      "role": "user",
      "content": "Hello, have a good day!"
    }
  ],
  "configs": {
    "model": "gemma-7b",
    "max_tokens": 1024,
    "top_k": 10,
    "top_p": 0.9,
    "temperature": 0
  }
}
  • Response streaming with the Python requests library

import requests

# Parrot API Endpoint
url = "https://api.joinparrot.ai/v1/ai/text_completion?stream=true"

# Conversation messages and generation settings (documented defaults)
messages = [{"role": "user", "content": "Hello, have a good day!"}]
configs = dict(
    model="gemma-7b", temperature=0.7, top_k=50, top_p=0.5, max_new_tokens=512
)

# Payload
payload = {"messages": messages, "configs": configs}

# Headers
headers = {"Authorization": "Bearer " + "YOUR_TOKEN_HERE"}

# Get the response with streaming and print each line as it arrives
with requests.post(url, json=payload, stream=True, headers=headers) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if line:
            print(line.decode())

Parrot API

  • Streaming

# Initialize the text generation process with the Parrot API client.
generator = parrot.generate_text_stream(messages, model, top_k, top_p, temperature, max_tokens)

# Iterate over the generator object to fetch the generated text.
for data in generator:
    print(data.decode().strip())

  • Not Streaming

response = parrot.text_generation(messages, model, top_k, top_p, temperature, max_tokens)
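
Without the Parrot client, the same non-streaming call can be made with plain requests (a sketch; the payload mirrors the Example Request above, and YOUR_TOKEN_HERE is a placeholder):

import requests

# Same endpoint with streaming disabled
url = "https://api.joinparrot.ai/v1/ai/text_completion?stream=false"
headers = {"Authorization": "Bearer " + "YOUR_TOKEN_HERE"}
payload = {
    "messages": [{"role": "user", "content": "Hello, have a good day!"}],
    "configs": {
        "model": "gemma-7b",
        "max_tokens": 1024,
        "top_k": 10,
        "top_p": 0.9,
        "temperature": 0,
    },
}

# The full result is returned only after generation completes
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
print(response.json())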

Response

Returns the result of the task (only when stream=false):

{
    "data": {
        "is_success": true,
        "data": {
            "task_id": "cf86cbda217c481f8dbc9fb24b7e79e0",
            "total_tasks": 1,
            "percent": 100,
            "status": "COMPLETED",
            "response": "Hello, and thank you for stopping by! I hope you have a good day too!\n\nWould you like me to tell you what I can do today? I'm a large language model, and I'm here to help you with a variety of tasks."
        },
        "configs": {
            "model": "gemma-7b",
            "max_new_tokens": 2048,
            "top_k": 10,
            "temperature": 0.0,
            "task_type": "LLM-GEMMA-7B",
            "queue_name": "llm_gemma_7b_queue",
            "messages": [
                {
                    "role": "user",
                    "content": "Hello, have a good day!"
                }
            ]
        }
    },
    "errors": [],
    "error_description": "",
    "start_time": "2024-03-05 21:24:55.572492",
    "end_time": "2024-03-05 21:25:00.601529",
    "host_of_client_call_request": "103.186.100.36",
    "total_time_by_second": 5.029042,
    "status": "success"
}
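
The generated text is nested two levels deep under the data key. A minimal sketch for extracting it from the parsed JSON above (assuming response.json() from the non-streaming call):

# Parse the JSON body of a non-streaming response
result = response.json()

# Check both the outer status and the task-level success flag
if result["status"] == "success" and result["data"]["is_success"]:
    # The generated text lives at data -> data -> response
    print(result["data"]["data"]["response"])
else:
    print("Task failed:", result["errors"], result["error_description"])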