Authentication

Each API key is associated with a project in your organization. Include the key as a Bearer token in the Authorization header of every request.

curl -X GET https://api.vaeroapi.com/v1/[endpoint] \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json"
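If you call the API from Python rather than curl, the same headers can be attached with the standard library. The following is a minimal sketch, not an official client: the helper name build_request and the API_BASE constant are assumptions made for illustration, and the request is only prepared, not sent.

```python
import urllib.request

# Base URL taken from the curl examples in this guide.
API_BASE = "https://api.vaeroapi.com/v1"


def build_request(endpoint: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Prepare (but do not send) an authenticated request.

    build_request is a hypothetical helper, not part of any official SDK.
    """
    url = f"{API_BASE}/{endpoint.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method=method)


# Example: prepare a call to the style transform endpoint.
req = build_request("style/transform", "YOUR_ACCESS_TOKEN", method="POST")
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Keeping the header construction in one place makes it harder to forget the Authorization header on an individual call.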

Step 1. Upload files for training

Upload one or more files containing sample text whose style you want to copy.

Files must be plain text. If you have documents in other formats, convert them to text first.

Ideally, each file contains a single document, such as one blog post, one report, or one email. To train on multiple documents, upload them as separate files rather than combining them in one file.
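If your sample documents live in memory (for example, exported from a CMS), one way to follow the one-document-per-file guidance is to write each one to its own text file before uploading. This is a sketch only: the function name, file naming scheme, and UTF-8 encoding are assumptions, not requirements stated by the API.

```python
from pathlib import Path


def write_documents(documents: list[str], out_dir: str, prefix: str = "doc") -> list[Path]:
    """Write each document to its own UTF-8 .txt file and return the paths.

    write_documents is a hypothetical helper; the naming scheme is arbitrary.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(documents, start=1):
        path = target / f"{prefix}_{index:03d}.txt"
        # Trim surrounding whitespace and end the file with a single newline.
        path.write_text(text.strip() + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Each returned path can then be passed to the upload request in place of /path/to/your/file.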

curl -X POST "https://api.vaeroapi.com/v1/files" \
  -F "file=@/path/to/your/file" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: multipart/form-data"

The response includes the ID of the uploaded file. Use the IDs of your uploaded files when you create a fine-tuning job.

Step 2. Fine-tune your model

Fine-tune your model based on one or more of your uploaded files. The model will learn to write in the style of those files.

For best results:

  1. Use files with a consistent style.
  2. Use files of the same type as the content you want to style. For example, fine-tuning on product guides creates a model that works best on product guides.

The model parameter is the model to fine-tune. The base_model parameter is the base AI model whose output you want to style (for example, GPT-4o or Claude Sonnet). The training_files parameter lists the IDs of the files you uploaded in Step 1. The num_examples parameter is the number of examples used to fine-tune the model; more examples provide more learning.

Important: Our recommended settings for most use cases are llama-3.1-70b for the model parameter and 200 for the num_examples parameter.

curl -X POST "https://api.vaeroapi.com/v1/fine_tuning/jobs" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
        "model": "llama-3.1-70b",
        "base_model": "gpt-4o-2024-08-06",
        "training_files": ["file_id_1", "file_id_2"],
        "suffix": "optional_suffix",
        "num_examples": 200
      }'

The response will include a fine-tuned model ID, which you will use to identify the model for inference.
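The fine-tuning request body can also be assembled in code, which makes it easy to default to the recommended settings. The sketch below is illustrative: build_job_payload is a hypothetical helper, and treating suffix as optional is an assumption based on the placeholder value in the curl example.

```python
import json

# Recommended settings from this guide.
RECOMMENDED_MODEL = "llama-3.1-70b"
RECOMMENDED_NUM_EXAMPLES = 200


def build_job_payload(training_files, base_model,
                      model=RECOMMENDED_MODEL,
                      num_examples=RECOMMENDED_NUM_EXAMPLES,
                      suffix=None):
    """Build the JSON-serializable body for a fine-tuning job.

    Hypothetical helper; field names mirror the curl example.
    """
    if not training_files:
        raise ValueError("at least one training file ID is required")
    payload = {
        "model": model,
        "base_model": base_model,
        "training_files": list(training_files),
        "num_examples": num_examples,
    }
    # Assumption: suffix may be omitted when not needed.
    if suffix:
        payload["suffix"] = suffix
    return payload


payload = build_job_payload(["file_id_1", "file_id_2"], "gpt-4o-2024-08-06")
body = json.dumps(payload)  # string to send as the request body
```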

Fine-tuning can sometimes complete in minutes, but may take up to 24 hours depending on load. The job status changes from ‘queued’ to ‘completed’ when fine-tuning finishes.
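Because a job may take anywhere from minutes to 24 hours, a client will usually poll for the status rather than block on a single call. This guide does not document how to retrieve the job status, so the sketch below takes a caller-supplied fetch_status function (a hypothetical callable that returns the current status string) instead of assuming an endpoint. Only the ‘completed’ status from this guide is checked; any other status is treated as still pending.

```python
import time
from typing import Callable


def wait_for_completion(fetch_status: Callable[[], str],
                        poll_seconds: float = 60.0,
                        timeout_seconds: float = 24 * 60 * 60,
                        sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the job status is 'completed'.

    Returns True once completed, or False if the timeout elapses first.
    fetch_status is supplied by the caller (hypothetical; not an API call
    defined in this guide).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if fetch_status() == "completed":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

The default timeout matches the 24-hour upper bound mentioned above; the 60-second polling interval is an arbitrary choice.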

Step 3. Styling

Submit the output of your base AI model to the style transform endpoint. Set the model parameter to the ID of your fine-tuned model, and set the message parameter to a string containing the text to style.

You do not need to submit conversational turns or instructions to the AI.

If you have a large block of multi-paragraph text, include all of it in a single API call. You don’t need to segment the text into separate calls.

curl -X POST "https://api.vaeroapi.com/v1/style/transform" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
        "message": "your_text_here",
        "model": "your_fine_tuned_model_id"
      }'

The styled content is at response["choices"][0]["message"]["content"].
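In client code, the request body and the lookup of the styled text might look like the following sketch. The helper names are hypothetical; the field names and the response path are the ones given in this step. Multi-paragraph text is passed as a single string, and json.dumps handles the newline escaping.

```python
import json


def build_style_payload(text: str, model_id: str) -> str:
    """Serialize the body for the style transform endpoint (hypothetical helper)."""
    return json.dumps({"message": text, "model": model_id})


def extract_styled_text(response: dict) -> str:
    """Return the styled content from a parsed JSON response.

    Raises ValueError if the expected path is missing.
    """
    try:
        return response["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("response does not contain styled content") from exc


# Example with a hand-written response containing only the documented path.
sample_response = {"choices": [{"message": {"content": "Styled text."}}]}
styled = extract_styled_text(sample_response)
```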

Step 4. Evaluating the results

The response also includes quality metrics. response["histogram_delta"]["vocab"] contains metrics that compare the vocabulary (words used) of the fine-tuned model's training files (the “ground truth”) with the vocabulary of the original output and of the styled output. The comparison is based on how often each of the 100 most common English words appears.
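To make the vocabulary comparison concrete, here is a sketch of how a word-rate histogram could be computed for one text. It is an illustration under stated assumptions, not the service's implementation: the tokenizer, the definition of rate as occurrences divided by total words, and the five-word stand-in for the 100-word list are all assumptions.

```python
import re
from collections import Counter

# Stand-in list for illustration; the actual 100-word list is not given here.
COMMON_WORDS = ["the", "of", "and", "to", "a"]


def vocab_rates(text, words=COMMON_WORDS):
    """Return, for each word in `words`, its share of all words in `text`."""
    # Assumption: lowercase, and treat runs of letters/apostrophes as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total if total else 0.0 for w in words]


rates = vocab_rates("The cat and the dog")
```

Computing the same vector for the training files, the original output, and the styled output gives three histograms that can be compared position by position.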

response["histogram_delta"]["sentence_length"] contains metrics that compare the sentence lengths (words per sentence) of the fine-tuned model's training files with those of the original output and of the styled output. The comparison is based on how often each sentence length appears.
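A sentence-length histogram can be sketched in the same spirit. The sentence splitter (on ., ! and ?) and whitespace word counting below are simplifying assumptions, not the method used by the service.

```python
import re
from collections import Counter


def sentence_length_rates(text):
    """Map each sentence length (in words) to the share of sentences with that length."""
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    counts = Counter(lengths)
    return {length: counts[length] / len(lengths) for length in sorted(counts)}


histogram = sentence_length_rates("I came. I saw it. You won.")
```

Here two of the three sentences have two words and one has three, so the histogram has entries for lengths 2 and 3 only.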

The field euclidean_delta_percent compares two Euclidean distances: from the original output to the ground truth, and from the styled output to the ground truth. It reports the percentage difference between the two distances.

The field manhattan_delta_percent makes the same comparison using Manhattan distances.

The Euclidean and Manhattan distances are computed from the relevant histograms (vocabulary or sentence length).

For both fields, a positive value means that the styled output is closer to the ground truth than the original output.
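The sketch below shows one plausible way such a percentage could be computed from three histograms. The exact formula is not specified in this guide; the version here, (original distance - styled distance) / original distance x 100, is an assumption chosen because it yields positive values when the styled output is closer to the ground truth. All function names and the sample histograms are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length histograms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan distance between two equal-length histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def delta_percent(distance, truth, original, styled):
    """Percentage by which styling reduced the distance to the ground truth.

    Assumed formula; positive means the styled output is closer.
    """
    before = distance(original, truth)
    after = distance(styled, truth)
    if before == 0:
        return 0.0  # original already matches the ground truth
    return (before - after) / before * 100.0


# Made-up two-bin histograms for illustration.
truth = [0.5, 0.5]
original = [1.0, 0.0]
styled = [0.75, 0.25]

euclidean_delta = delta_percent(euclidean, truth, original, styled)
manhattan_delta = delta_percent(manhattan, truth, original, styled)
```

In this made-up case the styled histogram sits halfway between the original and the ground truth, so both values come out at about 50.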