Overview
The Gemini API endpoint allows you to generate content using Google’s Gemini models in their native format. This endpoint follows the official Gemini API specification, making it easy to integrate with existing Gemini-compatible code.
Latest News: gemini-3-pro-preview is now supported!
Quick Start
To get started, simply replace the Base URL and API Key in the official SDK or in your HTTP requests:
- Base URL: `https://wisdom-gate.juheapi.com` (replaces `generativelanguage.googleapis.com`)
- API Key: replace `$GEMINI_API_KEY` with your `$WISDOM_GATE_KEY`
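For example, with the official google-genai Python SDK the override can be applied through HTTP options (a minimal sketch, assuming a recent SDK version where HttpOptions accepts base_url, and that WISDOM_GATE_KEY is set in your environment):

```python
import os

from google import genai
from google.genai import types

# Point the official SDK at Wisdom Gate instead of generativelanguage.googleapis.com
client = genai.Client(
    api_key=os.environ["WISDOM_GATE_KEY"],
    http_options=types.HttpOptions(base_url="https://wisdom-gate.juheapi.com"),
)

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="How does AI work?",
)
print(response.text)
```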
Basic Example: Text Generation
curl "https://wisdom-gate.juheapi.com/v1beta/models/gemini-3-pro-preview:generateContent" \
-H "x-goog-api-key: $WISDOM_GATE_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [
{
"parts": [
{
"text": "How does AI work?"
}
]
}
]
}'
Important Notes
Model Differences
Different Gemini model versions may support different request parameters and return different response fields. We strongly recommend consulting the model catalog for the complete parameter list and usage instructions for each model.

Response Pass-through Principle
Wisdom Gate typically does not modify model responses beyond any format conversion required by the proxy, so you receive response content consistent with the original Gemini API provider.

Streaming Support
Wisdom Gate supports Server-Sent Events (SSE) for streaming responses. Use the streamGenerateContent operation with the ?alt=sse parameter to enable real-time streaming, which is useful for chat applications.

Auto-Generated Documentation
The request parameters and response format are generated automatically from the OpenAPI specification. All parameters, along with their types, descriptions, defaults, and examples, are pulled directly from openapi.json. Scroll down to see the interactive API reference.
FAQ
How to control Thinking?
Gemini models support a “thinking” process to improve reasoning capabilities. The control method depends on the model version.
For details, please refer to the official documentation: Gemini Thinking Guide
Gemini 3 Series (e.g., gemini-3-pro-preview)
Use the thinkingLevel parameter to control thinking intensity ("LOW" or "HIGH").
curl "https://wisdom-gate.juheapi.com/v1beta/models/gemini-3-pro-preview:generateContent" \
-H "x-goog-api-key: $WISDOM_GATE_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{ "parts": [{ "text": "Explain quantum physics simply." }] }],
"generationConfig": {
"thinkingConfig": {
"thinkingLevel": "LOW"
}
}
}'
Gemini 2.5 Series (e.g., gemini-2.5-pro)
Use the thinkingBudget parameter to control the token budget for thinking:
- `0`: disable thinking.
- `-1`: dynamic thinking (the model decides automatically; this is the default).
- `> 0`: a specific token limit (e.g., 1024).
curl "https://wisdom-gate.juheapi.com/v1beta/models/gemini-2.5-pro:generateContent" \
-H "x-goog-api-key: $WISDOM_GATE_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{ "parts": [{ "text": "Solve this logic puzzle." }] }],
"generationConfig": {
"thinkingConfig": {
"thinkingBudget": 1024
}
}
}'
How to use Streaming Responses?
Streaming responses allow you to receive results incrementally as the model generates content, reducing perceived latency.
For details, please refer to the official documentation: Gemini Text Generation - Streaming Responses
Note: the URL must point to the streamGenerateContent operation, and adding the ?alt=sse parameter is recommended so responses use the Server-Sent Events format.
curl "https://wisdom-gate.juheapi.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
-H "x-goog-api-key: $WISDOM_GATE_KEY" \
-H 'Content-Type: application/json' \
--no-buffer \
-d '{
"contents": [
{
"parts": [
{
"text": "Explain how AI works"
}
]
}
]
}'
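Equivalently in Python, the SSE stream can be consumed with the requests library (a sketch, assuming each `data:` line carries a JSON chunk shaped like the non-streaming response, and that WISDOM_GATE_KEY is set in your environment):

```python
import json
import os

import requests

url = ("https://wisdom-gate.juheapi.com/v1beta/models/"
       "gemini-2.5-flash:streamGenerateContent?alt=sse")
headers = {
    "x-goog-api-key": os.environ["WISDOM_GATE_KEY"],
    "Content-Type": "application/json",
}
body = {"contents": [{"parts": [{"text": "Explain how AI works"}]}]}

# stream=True keeps the connection open so chunks arrive as they are generated
with requests.post(url, headers=headers, json=body, stream=True) as resp:
    for line in resp.iter_lines():
        # SSE events arrive as lines prefixed with "data: "; skip blank keep-alives
        if not line.startswith(b"data: "):
            continue
        chunk = json.loads(line[len(b"data: "):])
        for part in chunk["candidates"][0]["content"]["parts"]:
            print(part.get("text", ""), end="", flush=True)
print()
```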
How to maintain conversation context?
Include the complete conversation history in the contents array:
```python
import os

import requests

# Each turn is a content object with a role ("user" or "model") and its parts
conversation = [
    {
        "role": "user",
        "parts": [{"text": "What is Python?"}]
    },
    {
        "role": "model",
        "parts": [{"text": "Python is a programming language..."}]
    },
    {
        "role": "user",
        "parts": [{"text": "What are its advantages?"}]
    }
]

response = requests.post(
    "https://wisdom-gate.juheapi.com/v1beta/models/gemini-3-pro-preview:generateContent",
    headers={
        "x-goog-api-key": os.environ["WISDOM_GATE_KEY"],
        "Content-Type": "application/json"
    },
    json={"contents": conversation}
)
```
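To continue the dialogue, append the model's reply and the next user turn to the same array before the next call (a sketch, assuming the standard response shape):

```python
# Extract the model's reply and extend the history for the following request
reply = response.json()["candidates"][0]["content"]["parts"][0]["text"]
conversation.append({"role": "model", "parts": [{"text": reply}]})
conversation.append({"role": "user", "parts": [{"text": "Can you show an example?"}]})
# ...then POST again with json={"contents": conversation}
```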
What does finishReason mean?
The finishReason field in the response indicates why the model stopped generating:
| Value | Meaning |
|---|---|
| STOP | Natural completion |
| MAX_TOKENS | Reached the maxOutputTokens limit |
| SAFETY | Triggered a safety filter |
| RECITATION | Detected recitation of training data |
| OTHER | Other reason |
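For example, a truncated response can be detected and handled in code (a sketch, reusing the response object from the conversation example above):

```python
candidate = response.json()["candidates"][0]
if candidate.get("finishReason") == "MAX_TOKENS":
    # Output was cut off; consider raising maxOutputTokens or shortening the prompt
    print("Warning: output truncated at the token limit")
```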
How to control costs?
- Use `maxOutputTokens` in `generationConfig` to limit output length
- Choose appropriate models (e.g., `gemini-2.5-flash` is more economical than `gemini-3-pro-preview`)
- Streamline prompts and avoid redundant context
- Monitor token consumption via the `usageMetadata` field of responses (see the sketch below)
- For reasoning models, use thinking budgets wisely to control reasoning token usage
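For instance, usage can be logged after each request; promptTokenCount, candidatesTokenCount, and totalTokenCount are the usual usageMetadata fields:

```python
# Inspect token consumption reported by the API for this request
usage = response.json().get("usageMetadata", {})
print("prompt tokens:", usage.get("promptTokenCount"))
print("output tokens:", usage.get("candidatesTokenCount"))
print("total tokens:", usage.get("totalTokenCount"))
```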
How to use multimodal input (text and images)?
Gemini supports multimodal input through the parts array. You can include both text and images in a single request:
```python
data = {
    "contents": [
        {
            "parts": [
                {"text": "What is in this image?"},
                {
                    "inlineData": {
                        "mimeType": "image/jpeg",
                        "data": "base64_encoded_image_data_here"
                    }
                }
            ]
        }
    ]
}
```
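A complete request might encode a local file and send it along (a sketch; photo.jpg is a hypothetical file name, and WISDOM_GATE_KEY is assumed to be set in your environment):

```python
import base64
import os

import requests

# Base64-encode a local image for the inlineData part
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

data = {
    "contents": [
        {
            "parts": [
                {"text": "What is in this image?"},
                {"inlineData": {"mimeType": "image/jpeg", "data": image_b64}}
            ]
        }
    ]
}

response = requests.post(
    "https://wisdom-gate.juheapi.com/v1beta/models/gemini-3-pro-preview:generateContent",
    headers={"x-goog-api-key": os.environ["WISDOM_GATE_KEY"]},
    json=data,
)
print(response.json()["candidates"][0]["content"]["parts"][0]["text"])
```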
API Reference
Authentication: Bearer token authentication; include your API key in the Authorization header as `Bearer YOUR_API_KEY`.

Path parameters:
- model: the model identifier (e.g., `gemini-pro`, `gemini-pro-vision`)
- operation: the operation to perform; use `generateContent` for standard requests, or `streamGenerateContent?alt=sse` for streaming responses in the Server-Sent Events format

Request body:
- contents: array of content parts that make up the conversation
- systemInstruction: system instruction to guide the model's behavior
- generationConfig: configuration for content generation
- safetySettings: safety settings for content filtering

Response (successful content generation):
- candidates: array of generated content candidates
- usageMetadata: token usage statistics for the request
- promptFeedback: feedback about the prompt, including safety ratings