Groq-Specific Parameters in ClientAI

This guide covers the Groq-specific parameters that can be passed to ClientAI's generate_text and chat methods. These parameters are passed as additional keyword arguments to customize Groq's behavior.

generate_text Method

Basic Structure

from clientai import ClientAI

client = ClientAI('groq', api_key="your-groq-api-key")
response = client.generate_text(
    prompt="Your prompt here",          # Required
    model="llama3-8b-8192",            # Required
    frequency_penalty=0.5,              # Groq-specific
    presence_penalty=0.2,               # Groq-specific
    max_tokens=100,                     # Groq-specific
    response_format={"type": "json"},   # Groq-specific
    seed=12345,                        # Groq-specific
    temperature=0.7,                    # Groq-specific
    top_p=0.9,                         # Groq-specific
    n=1,                               # Groq-specific
    stop=["END"],                      # Groq-specific
    stream=False,                      # Groq-specific
    stream_options=None,               # Groq-specific
    functions=None,                    # Groq-specific (Deprecated)
    function_call=None,                # Groq-specific (Deprecated)
    tools=None,                        # Groq-specific
    tool_choice=None,                  # Groq-specific
    parallel_tool_calls=True,          # Groq-specific
    user="user_123"                    # Groq-specific
)

Groq-Specific Parameters

frequency_penalty: Optional[float]

  • Range: -2.0 to 2.0
  • Default: 0
  • Penalizes tokens based on how frequently they have already appeared in the text so far
    response = client.generate_text(
        prompt="Write a creative story",
        model="llama3-8b-8192",
        frequency_penalty=0.7  # Reduces repetition
    )
    

presence_penalty: Optional[float]

  • Range: -2.0 to 2.0
  • Default: 0
  • Penalizes tokens that have already appeared in the text, regardless of how often
    response = client.generate_text(
        prompt="Write a varied story",
        model="llama3-8b-8192",
        presence_penalty=0.6  # Encourages topic diversity
    )
    

max_tokens: Optional[int]

  • Maximum tokens for completion
  • Limited by model's context length
    response = client.generate_text(
        prompt="Write a summary",
        model="llama3-8b-8192",
        max_tokens=100
    )
    

response_format: Optional[Dict]

  • Controls output structure
  • Requires explicit JSON instruction in prompt
    response = client.generate_text(
        prompt="List three colors in JSON",
        model="llama3-8b-8192",
        response_format={"type": "json_object"}
    )
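  • A minimal parsing sketch, assuming generate_text returns the completion as a plain JSON string (an assumption about ClientAI's default return value, not something stated above):
    import json

    response = client.generate_text(
        prompt="List three colors in JSON",
        model="llama3-8b-8192",
        response_format={"type": "json_object"}
    )
    colors = json.loads(response)  # parse the returned JSON text into a dict
    print(colors)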
    

seed: Optional[int]

  • For best-effort deterministic generation
    response = client.generate_text(
        prompt="Generate a random number",
        model="llama3-8b-8192",
        seed=12345
    )
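  • Determinism is best-effort; a sketch that repeats the same request to check whether the outputs match:
    request = dict(
        prompt="Generate a random number",
        model="llama3-8b-8192",
        seed=12345,
        temperature=0,
    )
    first = client.generate_text(**request)
    second = client.generate_text(**request)
    print(first == second)  # usually True with the same seed, but not guaranteed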
    

temperature: Optional[float]

  • Range: 0 to 2
  • Default: 1
  • Controls randomness in output
    response = client.generate_text(
        prompt="Write creatively",
        model="llama3-8b-8192",
        temperature=0.7  # Lower values are more focused, higher values more random
    )
    

top_p: Optional[float]

  • Range: 0 to 1
  • Default: 1
  • Nucleus sampling, an alternative to temperature: only tokens within the top_p probability mass are considered
    response = client.generate_text(
        prompt="Generate text",
        model="llama3-8b-8192",
        top_p=0.1  # Only consider tokens in the top 10% of probability mass
    )
    

n: Optional[int]

  • Default: 1
  • Number of completions to generate
  • Note: Currently only n=1 is supported
    response = client.generate_text(
        prompt="Generate a story",
        model="llama3-8b-8192",
        n=1
    )
    

stop: Optional[Union[str, List[str]]]

  • Up to 4 sequences where generation stops
    response = client.generate_text(
        prompt="Write until you see END",
        model="llama3-8b-8192",
        stop=["END", "STOP"]  # Stops at either sequence
    )
    

stream: Optional[bool]

  • Default: False
  • Enable token streaming
    for chunk in client.generate_text(
        prompt="Tell a story",
        model="llama3-8b-8192",
        stream=True
    ):
        print(chunk, end="", flush=True)
    

stream_options: Optional[Dict]

  • Options for streaming responses (for example, {"include_usage": True} adds token usage to the final chunk)
  • Only used when stream=True
    for chunk in client.generate_text(
        prompt="Long story",
        model="llama3-8b-8192",
        stream=True,
        stream_options={"include_usage": True}
    ):
        print(chunk, end="", flush=True)
    

user: Optional[str]

  • Unique identifier for end-user tracking
    response = client.generate_text(
        prompt="Hello",
        model="llama3-8b-8192",
        user="user_123"
    )
    

chat Method

Basic Structure

response = client.chat(
    model="llama3-8b-8192",            # Required
    messages=[...],                    # Required
    tools=[...],                      # Groq-specific
    tool_choice="auto",               # Groq-specific
    parallel_tool_calls=True,         # Groq-specific
    response_format={"type": "json"}, # Groq-specific
    temperature=0.7,                  # Groq-specific
    frequency_penalty=0.5,            # Groq-specific
    presence_penalty=0.2,             # Groq-specific
    max_tokens=100,                   # Groq-specific
    seed=12345,                      # Groq-specific
    stop=["END"],                    # Groq-specific
    stream=False,                    # Groq-specific
    stream_options=None,             # Groq-specific
    top_p=0.9,                       # Groq-specific
    n=1,                             # Groq-specific
    user="user_123"                  # Groq-specific
)

Groq-Specific Parameters

tools: Optional[List[Dict]]

  • List of available tools (max 128)
    response = client.chat(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": "What's the weather?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather data",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"}
                    }
                }
            }
        }]
    )
    

tool_choice: Optional[Union[str, Dict]]

  • Controls tool selection behavior
  • Values: "none", "auto", "required", or a dict that forces a specific tool (see the sketch below)
    response = client.chat(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": "Calculate something"}],
        tool_choice="auto"  # or "none" or "required"
    )
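  • The dict form forces a specific tool, following Groq's OpenAI-compatible format; a sketch reusing the get_weather definition from above:
    response = client.chat(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        tools=[{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get weather data",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}}
                }
            }
        }],
        tool_choice={"type": "function", "function": {"name": "get_weather"}}
    )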
    

parallel_tool_calls: Optional[bool]

  • Default: True
  • Enable parallel function calling
    response = client.chat(
        model="llama3-70b-8192",
        messages=[{"role": "user", "content": "Multiple tasks"}],
        parallel_tool_calls=True
    )
    

Complete Examples

Example 1: Structured Output with Tools

response = client.chat(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a data assistant"},
        {"role": "user", "content": "Get weather for Paris"}
    ],
    response_format={"type": "json_object"},
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather data",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }],
    tool_choice="auto",
    temperature=0.7,
    max_tokens=200,
    seed=42
)

Example 2: Advanced Text Generation

response = client.generate_text(
    prompt="Write a technical analysis",
    model="mixtral-8x7b-32768",
    max_tokens=500,
    frequency_penalty=0.7,
    presence_penalty=0.6,
    temperature=0.4,
    top_p=0.9,
    stop=["END", "CONCLUSION"],
    user="analyst_1",
    seed=42
)

Example 3: Streaming Generation

for chunk in client.chat(
    model="llama3-8b-8192",
    messages=[{"role": "user", "content": "Explain quantum physics"}],
    stream=True,
    temperature=0.7,
    max_tokens=1000,
    stream_options={"chunk_size": 1024}
):
    print(chunk, end="", flush=True)

Parameter Validation Notes

  1. Both model and prompt/messages are required
  2. Model must be one of: "gemma-7b-it", "llama3-70b-8192", "llama3-8b-8192", "mixtral-8x7b-32768"
  3. n parameter only supports value of 1
  4. stop sequences limited to 4 maximum
  5. Tool usage limited to 128 functions
  6. response_format requires explicit JSON instruction in prompt
  7. Parameters like logprobs, logit_bias, and top_logprobs are not yet supported
  8. Deterministic generation with seed is best-effort
  9. functions and function_call are deprecated in favor of tools and tool_choice (see the migration sketch below)

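As noted in item 9, the deprecated functions and function_call parameters map directly onto tools and tool_choice. A minimal migration sketch, following Groq's OpenAI-compatible schema (the function definition itself is unchanged; only the wrapping differs):

# Deprecated style
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "Get weather for Paris"}],
    functions=[{
        "name": "get_weather",
        "description": "Get weather data",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}}
        }
    }],
    function_call="auto"
)

# Preferred style
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "Get weather for Paris"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather data",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}}
            }
        }
    }],
    tool_choice="auto"
)
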
These parameters allow you to fully customize Groq's behavior while working with ClientAI's abstraction layer.