Pricing Queries

Understanding Weave's Phase 1 Pricing Plan

Phase 1 Pricing

Weave is currently in Phase 1 of its pricing plan, which is designed to support our objectives of internal testing and gauging market demand. This phase has the following key aspects:

  • Token-Based Pay-Per-Use Model: Users are charged based on their usage of tokens. This allows for flexibility and scalability in line with the user's needs.

  • Full Platform Functionality: During Phase 1, users have unrestricted access to all functionalities of the Weave platform.

  • Initial Credit Offering: To encourage exploration and usage of the platform, new users are awarded 500 credits (equivalent to USD 5) upon registration. Each credit is valued at USD 0.01.

  • Credit Top-Up Option: Users have the option to purchase additional credits to continue using the platform beyond the initial credit offering.
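As a rough illustration of the credit value stated above, the conversion between credits and USD can be sketched as follows. The function and constant names are hypothetical and are not part of any Weave API; only the USD 0.01 rate and the 500-credit sign-up grant come from the text.

```python
from decimal import Decimal

# From the text: each credit is valued at USD 0.01.
USD_PER_CREDIT = Decimal("0.01")
# From the text: new users receive 500 credits on registration.
SIGNUP_CREDITS = Decimal("500")


def credits_to_usd(credits: Decimal) -> Decimal:
    """Convert a credit balance to its USD value."""
    return credits * USD_PER_CREDIT


def usd_to_credits(usd: Decimal) -> Decimal:
    """Convert a USD amount to credits (bonus credits are not modelled)."""
    return usd / USD_PER_CREDIT


print(credits_to_usd(SIGNUP_CREDITS))  # 5.00
```

Decimal arithmetic is used here instead of floats so that small per-token amounts do not pick up binary rounding noise.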

Credits and Charges

  • Credit Usage: Credits are deducted based on token usage in running prompts and workflows.

  • Credit Plans: Users can purchase credits through one-off plans, with options like Tier 1 (USD 10), Tier 2 (USD 20), and Tier 3 (USD 30), each offering additional bonus credits.

  • Payment Methods: Currently, Weave accepts payments through Stripe-supported methods.

Types of Credits and Expiry

  • Earned Credits: These are credits granted as incentives for completing certain actions. They expire within one month of the date they are received.

  • Purchased Credits: These credits are bought by the user and do not have an expiry date.

Token Price

Models and their pricing: Different models, such as LLama-2-70b, Mistral-7b, GPT-3.5, and GPT-4, come with different pricing per 1000 tokens. For instance, the charge for the Mistral-7b model is USD 0.0011 per 1000 tokens.

Below are some examples of models and our charges per 1000 tokens:

Model          Cost (per 1000 Tokens)
LLama-2-70b    USD 0.0225
Mistral-7b     USD 0.0011
GPT-3.5        USD 0.0030
GPT-4          USD 0.0450

For a detailed cost breakdown of our latest models and estimated runs per Tier package, please see our constantly updated Model List.
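To make the per-1000-token rates listed above concrete, here is a minimal sketch of a price lookup. It assumes a single flat rate per model, which is a simplification: the worked example later in this page prices input and output tokens separately. The dictionary and function names are illustrative only.

```python
from decimal import Decimal

# Rates copied from the model list above (USD per 1000 tokens).
PRICE_PER_1K_TOKENS = {
    "LLama-2-70b": Decimal("0.0225"),
    "Mistral-7b": Decimal("0.0011"),
    "GPT-3.5": Decimal("0.0030"),
    "GPT-4": Decimal("0.0450"),
}


def cost_usd(model: str, tokens: int) -> Decimal:
    """Cost in USD for `tokens` tokens at the model's flat per-1000 rate."""
    return PRICE_PER_1K_TOKENS[model] * tokens / 1000


# 2000 tokens on Mistral-7b: 2 x USD 0.0011
print(cost_usd("Mistral-7b", 2000))
```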

How Tokens are Calculated

Tokens are counted based on the user's input prompt and the LLM's generated response. These are separated and labelled respectively as input and output tokens.

  • Input Tokens: The total number of tokens in the user's input prompt, which may include any previous conversation history or context provided in the ongoing session.

  • Output Tokens: The total number of tokens generated as part of the LLM's response in reply to the user's input prompt.

For more information on how tokens are calculated, users can refer to OpenAI's Tokenizer for token count specifics.

Example Calculation

Calculation Breakdown
  1. Model Used: GPT-3.5

  2. Token Counts:

    • Input Tokens: 85

    • Output Tokens: 400

  3. Cost Calculation Per Token:

    • Input Tokens Cost: The cost for input tokens is calculated at USD 0.0000015 per token.

      • So, for 85 input tokens, the cost is 85 × USD 0.0000015 = USD 0.0001275

    • Output Tokens Cost: The cost for output tokens is calculated at USD 0.000003 per token.

      • Therefore, for 400 output tokens, the cost is 400 × USD 0.000003 = USD 0.0012

  4. Total Cost:

    • Monetary Value: The total cost in monetary value is the sum of the input and output token costs: USD 0.0001275 (input) + USD 0.0012 (output) = USD 0.0013275

    • Credit Value: The credit value is determined by first summing up the total monetary value. This sum is then converted to credits at USD 0.01 per credit and rounded to the nearest ten-thousandth of a credit.

      • Example: If the initial monetary value is USD 0.0013275, it translates to 0.13275 credits (USD 0.0013275 ÷ USD 0.01 per credit). This value is then rounded off to the nearest 1/10,000, resulting in a final credit value of 0.1328.
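The breakdown above can be reproduced with a short script. This is a sketch under one assumption: the text says only that the result is rounded to the nearest 1/10,000, so half-up rounding is assumed here because it matches the stated result of 0.1328. Variable names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

USD_PER_CREDIT = Decimal("0.01")     # from the text: 1 credit = USD 0.01
INPUT_PRICE = Decimal("0.0000015")   # USD per input token (GPT-3.5 example)
OUTPUT_PRICE = Decimal("0.000003")   # USD per output token (GPT-3.5 example)

input_cost = 85 * INPUT_PRICE        # 85 input tokens
output_cost = 400 * OUTPUT_PRICE     # 400 output tokens
total_usd = input_cost + output_cost

# Convert USD to credits, then round to 4 decimal places.
# ROUND_HALF_UP is an assumption; the text does not name the rounding mode.
credits = (total_usd / USD_PER_CREDIT).quantize(
    Decimal("0.0001"), rounding=ROUND_HALF_UP
)

print(total_usd)  # 0.0013275
print(credits)    # 0.1328
```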

Credits Cost Calculations

In Weave's Phase 1 pricing plan, the calculation of estimated run costs is a critical component, ensuring transparency and predictability in usage charges.

Users are presented with an estimated credit cost that updates in real time as they modify the prompt. This live display shows the estimate in US dollars (USD), so users can see the cost impact of each change.

Here's an overview of how these costs are computed:

Input Cost Calculation

The cost associated with input tokens is determined by multiplying the quantity of input tokens by the per-token price of the chosen model.

In other words:

number of input tokens × price per LLM input token

Estimated Output Cost Calculation

The estimated output cost is derived by multiplying the estimated maximum number of output tokens by the price per output token of the selected model.

number of estimated max output tokens × price per LLM output token

Actual Output Cost

The actual output cost is determined by multiplying the actual number of output tokens generated (following the execution of the prompt) by the price per output token of the model utilized.

number of output tokens generated by the prompt × price per LLM output token

Combined Estimated Run Cost

Before running a prompt, the estimated average run cost displayed is the aggregate of the input cost and the estimated output cost.

input cost + estimated output cost

This provides users with a close estimation of the total cost to be incurred before the execution of the prompt or workflow.

Final Charge Post-Completion

Upon the completion of a prompt or workflow, the actual run cost is computed as the sum of the input cost and the actual output cost.

input cost + actual output cost

This calculation ensures that users are billed precisely based on their actual usage of the service.
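The difference between the pre-run estimate and the final charge can be sketched as two small functions. The function names and the figure of 500 estimated maximum output tokens are hypothetical; the per-token prices are the GPT-3.5 figures from the example calculation.

```python
from decimal import Decimal


def estimated_run_cost(input_tokens: int, est_max_output_tokens: int,
                       input_price: Decimal, output_price: Decimal) -> Decimal:
    """Input cost + estimated output cost, shown before the run."""
    return input_tokens * input_price + est_max_output_tokens * output_price


def actual_run_cost(input_tokens: int, output_tokens: int,
                    input_price: Decimal, output_price: Decimal) -> Decimal:
    """Input cost + actual output cost, charged after the run completes."""
    return input_tokens * input_price + output_tokens * output_price


IN_PRICE = Decimal("0.0000015")   # USD per input token
OUT_PRICE = Decimal("0.000003")   # USD per output token

# Hypothetical run: 85 input tokens, an estimate of up to 500 output tokens,
# and 400 output tokens actually generated.
estimate = estimated_run_cost(85, 500, IN_PRICE, OUT_PRICE)
final = actual_run_cost(85, 400, IN_PRICE, OUT_PRICE)
print(estimate, final)
```

In this example the final charge is lower than the estimate because the model generated fewer output tokens than the estimated maximum.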

Error Handling

At Weave, our commitment to customer satisfaction is paramount. We understand the importance of a reliable service and the impact any disruptions can have on our users. Recognizing this, we have established clear and fair policies to address scenarios where service issues occur:

  • Error on Weave's Side: If an error occurs on Weave’s server side, the system absorbs the cost, returning withheld credits to the user's wallet.

  • Error on LLM's Side: If there's an error on the LLM side, users are charged based on the input and output costs of the LLM.

Estimated run costs are not the same as actual run costs, so no charge is applied until confirmation is received.
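One way to picture the two error policies is as a settlement step that runs once the outcome of a run is known. This is only a sketch: the `Outcome` enum, the `settle` function, and the idea that any unused part of the withheld amount is returned are assumptions made for illustration, not a description of Weave's actual implementation.

```python
from decimal import Decimal
from enum import Enum
from typing import Tuple


class Outcome(Enum):
    SUCCESS = "success"
    WEAVE_ERROR = "weave_error"  # error on Weave's server side
    LLM_ERROR = "llm_error"      # error on the LLM side


def settle(outcome: Outcome, withheld: Decimal,
           input_cost: Decimal, output_cost: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (charged, refunded) in credits for a finished run."""
    if outcome is Outcome.WEAVE_ERROR:
        # Weave absorbs the cost and returns all withheld credits.
        return Decimal("0"), withheld
    # Success or LLM-side error: charge the LLM's input and output costs.
    charged = input_cost + output_cost
    # Assumption: whatever was withheld beyond the charge goes back to the wallet.
    refunded = max(withheld - charged, Decimal("0"))
    return charged, refunded


print(settle(Outcome.WEAVE_ERROR, Decimal("0.1628"),
             Decimal("0.01275"), Decimal("0.05")))
```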

Run Cancellation Charges

We recognize that situations may arise where you need to cancel a run after it has started. It's important to be aware of how these cancellations can affect your billing. Here’s what you need to know about potential charges:

  • When a run is initiated and then canceled, it's important to understand that halting the LLM's processing mid-way is not always instant. Due to the nature of LLM operations, once a task has started, the LLM might continue processing for a brief period until the cancellation is fully processed. During this delay, computational resources are still being used. As such, we calculate costs based on the amount of processing completed by the LLM at the time of cancellation.

  • However, in cases where the LLM has not begun to process any inputs before the cancellation is recognized, no charges will be incurred. This ensures that you are only billed for the actual computational work performed.
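The two cancellation cases above can be sketched as a single function. The function name and the token counts are hypothetical, and the sketch assumes that "processing completed" is measured in input tokens processed and output tokens generated by the time the cancellation takes effect.

```python
from decimal import Decimal


def cancellation_charge(input_tokens_processed: int,
                        output_tokens_generated: int,
                        input_price: Decimal,
                        output_price: Decimal) -> Decimal:
    """Charge in USD for a run cancelled after it was started."""
    if input_tokens_processed == 0:
        # The LLM had not begun processing any input: nothing is charged.
        return Decimal("0")
    # Otherwise bill only the work completed before the cancellation took effect.
    return (input_tokens_processed * input_price
            + output_tokens_generated * output_price)


IN_PRICE = Decimal("0.0000015")   # USD per input token (GPT-3.5 example)
OUT_PRICE = Decimal("0.000003")   # USD per output token (GPT-3.5 example)

# Cancelled after 85 input tokens were processed and 120 output tokens generated.
print(cancellation_charge(85, 120, IN_PRICE, OUT_PRICE))
```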

Credit Packages

To purchase our credit packages, please proceed to our Pricing Page for more details.
