Abilities

An Ability is a preconfigured AI task that analyzes visual media and returns structured output.

Abilities allow developers to add visual intelligence to applications without training or deploying their own models.

Typical workflow:

  • Create your Ability on EyePop.ai

  • Image/Video/Live Stream → Ability → Structured Output

For examples of Abilities and their output, see the Abilities Hub.


What is an Ability?

An Ability is a specific visual analysis capability such as:

  • Detecting objects

  • Classifying scenes

  • Extracting structured information from images

  • Understanding events in video

Developers call an Ability using the EyePop API and receive structured results such as bounding boxes, classifications, or extracted text.

Anatomy of an Ability

An Ability is defined by a small set of configuration parameters that control how visual media is analyzed. These parameters determine how the model interprets the task, how much compute is used, and how frequently media is analyzed.

Ability Components

| Field | Description |
| --- | --- |
| name | Unique identifier for the Ability. Used when calling the API. |
| description | Human-readable explanation of the task the Ability performs. This is used by the Prompt Creation Agent to generate or refine prompts. |
| image_size | The resolution images are resized to before being analyzed. Smaller sizes reduce compute cost and increase speed. |
| prompt | The instructions given to the vision-language model describing what it should detect or classify. |
| model | The underlying AI model used to perform the analysis. |
| fps | Frames per second to analyze when processing video or livestreams. Controls how frequently frames are sampled. |
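To make the six fields concrete, here is a minimal Python sketch that groups them into one object. It is only a local illustration: `AbilityConfig` is a hypothetical name, not an EyePop SDK type, and every value shown (the name, prompt text, model string, and numbers) is a made-up placeholder. The validation rules are assumptions too.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AbilityConfig:
    """Local stand-in for the six Ability fields (hypothetical, not an SDK type)."""

    name: str
    description: str
    image_size: int
    prompt: str
    model: str
    fps: float

    def __post_init__(self) -> None:
        # Basic sanity checks; the real platform may enforce different rules.
        if not self.name:
            raise ValueError("name must not be empty")
        if self.image_size <= 0:
            raise ValueError("image_size must be positive")
        if self.fps <= 0:
            raise ValueError("fps must be positive")


# All values below are hypothetical placeholders.
example = AbilityConfig(
    name="hard-hat-detection",
    description="Detect whether each person in the frame is wearing a hard hat.",
    image_size=640,
    prompt="For each person, answer 'hard_hat' or 'no_hard_hat'.",
    model="example-vision-language-model",
    fps=2,
)

print(asdict(example))
```

Keeping the fields together like this makes it easy to see which settings drive the task (prompt, model) and which drive cost (image_size, fps).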

Name

The name uniquely identifies the Ability.

Example:

Names are used in API calls and should clearly reflect the task being performed.

Description

The description explains the purpose of the Ability.

This field is especially important because it is used by the Prompt Creation Agent to help generate or improve prompts.

Example:

Good descriptions are:

  • clear

  • specific

  • task-oriented

Avoid vague descriptions such as:

Image Size

The image_size parameter controls how images are resized before inference.

Example:

Reducing image size:

  • decreases compute usage

  • increases inference speed

  • often improves detection consistency

Typical production sizes:

| Use Case | Image Size |
| --- | --- |
| Object detection | 512–640 |
| Find event in video | 512–640 |
| Document analysis | 768–1024 |
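The effect of image_size on the amount of data analyzed can be sketched with simple arithmetic. The example below assumes the resize preserves aspect ratio and applies image_size to the longer side; the source does not state the exact resize rule, so treat that as an assumption. Both function names are hypothetical.

```python
from typing import Tuple


def resized_dimensions(width: int, height: int, image_size: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals image_size.

    Assumption: aspect ratio is preserved and image_size applies to the
    longer side. The platform's actual resize rule may differ.
    """
    if width <= 0 or height <= 0 or image_size <= 0:
        raise ValueError("all dimensions must be positive")
    scale = image_size / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def pixel_ratio(size_a: int, size_b: int) -> float:
    """How many times more pixels a square size_a input has than a square size_b input."""
    return (size_a / size_b) ** 2


print(resized_dimensions(1920, 1080, 640))  # a 1080p frame resized for object detection
print(pixel_ratio(1024, 512))               # doubling the side length quadruples the pixels
```

Pixel count grows with the square of the side length, which is why moving from 512 to 1024 is a much bigger step than it looks.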

Prompt

The prompt defines what the model should analyze in the image or video frame. For examples of Ability prompts, see the Abilities Hub.

Example:

Prompts should:

  • clearly define the task

  • restrict possible outputs

  • avoid unnecessary complexity
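One way to apply these guidelines is to state the task once, list the allowed answers explicitly, and reject anything outside that list. The sketch below is illustrative only: the label set, the prompt wording, and the helper names `build_prompt` and `parse_label` are all hypothetical and not taken from the Abilities Hub.

```python
from typing import Optional, Sequence

ALLOWED_LABELS = ("hard_hat", "no_hard_hat", "unclear")  # hypothetical label set


def build_prompt(task: str, labels: Sequence[str]) -> str:
    """State the task once and restrict the answer to a fixed label set."""
    options = ", ".join(labels)
    return f"{task} Answer with exactly one of: {options}. Do not add any other text."


def parse_label(raw_output: str, labels: Sequence[str]) -> Optional[str]:
    """Return the label if the model output is one of the allowed values, else None."""
    cleaned = raw_output.strip().lower()
    return cleaned if cleaned in labels else None


prompt = build_prompt("Is the person in the image wearing a hard hat?", ALLOWED_LABELS)
print(prompt)
print(parse_label(" Hard_Hat\n", ALLOWED_LABELS))
print(parse_label("It looks like they might be.", ALLOWED_LABELS))
```

Restricting the output to a closed set makes downstream handling simpler, because any response that fails `parse_label` can be logged or retried instead of being silently accepted.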

Model

The model specifies which AI model runs the Ability.

Example:

Different models may vary in:

  • reasoning ability

  • speed

  • compute cost

Abilities typically use a model optimized for vision-language tasks.

FPS (Frames Per Second)

The fps parameter determines how frequently frames are analyzed when processing video or livestreams.

Example: with fps set to 5, the Ability analyzes 5 frames per second of video.

Choosing the correct FPS helps balance detection accuracy and compute cost.

| Use Case | Recommended FPS |
| --- | --- |
| Security monitoring | 2–5 |
| Sports analytics | 5–10 |
| Industrial monitoring | 1–3 |

Lower FPS reduces compute usage while still capturing most events.
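To see what sampling at a given fps means in practice, the sketch below lists which source frames would be picked from a clip. It assumes frames are taken at evenly spaced timestamps starting at t = 0; the platform's actual sampling strategy is not described in the source. The function name is hypothetical.

```python
from typing import List


def sampled_frame_indices(total_frames: int, source_fps: float, fps: float) -> List[int]:
    """Indices of source frames picked when sampling at `fps` frames per second.

    Assumption: frames are picked at evenly spaced timestamps starting at t = 0.
    """
    if source_fps <= 0 or fps <= 0:
        raise ValueError("frame rates must be positive")
    fps = min(fps, source_fps)  # cannot sample more often than the source provides
    step = source_fps / fps     # source frames between consecutive samples
    indices: List[int] = []
    k = 0
    while True:
        index = int(k * step)
        if index >= total_frames:
            break
        indices.append(index)
        k += 1
    return indices


# Two seconds of 30 fps video (60 frames) sampled at fps = 5.
print(sampled_frame_indices(total_frames=60, source_fps=30, fps=5))
```

At fps = 5 only every sixth frame of a 30 fps source is analyzed, so the frame count, and with it the compute, drops by a factor of six.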

Helpful Mental Model

You can think of an Ability as three main parts: model, prompt, and sampling.

Where:

  • Model determines reasoning capability

  • Prompt defines the task

  • Sampling (fps + image_size) controls performance and compute usage


What Tasks Are Abilities Good For?

Abilities are designed for real-world visual intelligence workloads.

Common use cases include:

Security and Surveillance

  • Person detection

  • Intrusion alerts

  • PPE detection

Retail and Commerce

  • Product recognition

  • Shelf monitoring

  • Customer analytics

Sports Analytics

  • Player detection

  • Action classification

  • Event segmentation

Document Processing

  • Driver's license extraction

  • Receipt parsing

  • Title or invoice extraction

Industrial Automation

  • Quality inspection

  • Object counting

  • Safety compliance monitoring


What is a Compute Unit?

A Compute Unit (CU) represents the cost of running one AI inference task.

The resolution of the source image or video does not affect the compute units used, because every input is resized to the image_size you define in your Ability. The larger you make image_size, the more compute units each inference uses. Accuracy typically increases with larger image sizes.

Each time an Ability processes an image or frame, compute resources are consumed.

Video workloads consume compute based on the number of frames analyzed.
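Because video compute scales with the number of frames analyzed, a rough frame count is a useful first estimate. The sketch below computes that count from clip duration and fps. The `cu_per_frame` parameter is a hypothetical placeholder: the source does not publish a per-frame rate, so the value must come from your own plan or measurements.

```python
import math


def frames_analyzed(duration_seconds: float, fps: float) -> int:
    """Approximate number of frames an Ability analyzes for a clip."""
    if duration_seconds < 0 or fps <= 0:
        raise ValueError("duration must be non-negative and fps positive")
    return math.ceil(duration_seconds * fps)


def estimated_compute_units(duration_seconds: float, fps: float, cu_per_frame: float) -> float:
    """Frame count times a per-frame cost (cu_per_frame is a hypothetical input)."""
    return frames_analyzed(duration_seconds, fps) * cu_per_frame


# A 10-minute clip at two different sampling rates.
print(frames_analyzed(600, 2))
print(frames_analyzed(600, 5))
```

Comparing the two frame counts shows how directly the fps setting drives video cost, independent of whatever the per-frame rate turns out to be.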

Estimating Compute Unit Usage

Estimate pricing coming soon

Overage Costs

Each EyePop plan includes a monthly allocation of compute units.

If usage exceeds the included amount, additional compute is billed automatically.

Example:

| Included CU | Used CU | Overage |
| --- | --- | --- |
| 4,000 | 5,200 | 1,200 units billed at the end of the month (+$60) |

This allows applications to scale without interruption.
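The arithmetic behind the example can be written out as follows. The per-unit rate shown is only what the $60 figure implies for 1,200 units in this one example; it should not be read as a published price. The function name is hypothetical.

```python
def overage_units(included_cu: int, used_cu: int) -> int:
    """Compute units billed beyond the plan's monthly allocation."""
    return max(0, used_cu - included_cu)


included, used, billed_dollars = 4_000, 5_200, 60.0  # figures from the example above

extra = overage_units(included, used)
implied_rate = billed_dollars / extra  # dollars per CU, implied by this example only

print(extra)
print(implied_rate)
```

Usage at or below the included allocation yields zero overage, so `overage_units(4_000, 3_000)` returns 0.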

Optimizing Compute Unit Usage

You can reduce compute costs and improve performance by optimizing inputs.

Reduce Image Size

Recommended image width:

640px – 1280px

Very large images increase compute cost.

Reduce Video FPS

Typical production settings:

| Use Case | FPS |
| --- | --- |
| Security monitoring | 2–5 |
| Sports analytics | 5–10 |
| Industrial monitoring | 1–3 |

Crop Regions of Interest

Instead of analyzing the entire frame, crop the relevant region before sending it to the Ability.

This reduces compute and improves accuracy. Region of interest (ROI) is supported through the SDK.
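If you crop on the client side before sending a frame, the region needs to stay inside the frame bounds. The sketch below is plain Python and does not use or represent the EyePop SDK's ROI feature; `Box`, `clamp_roi`, and `area_fraction` are hypothetical helpers. The (left, top, right, bottom) order matches the box convention used by Pillow's `Image.crop`.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def clamp_roi(roi: Box, frame_width: int, frame_height: int) -> Box:
    """Clip a region of interest so it lies entirely inside the frame."""
    left = max(0, min(roi.left, frame_width))
    top = max(0, min(roi.top, frame_height))
    right = max(left, min(roi.right, frame_width))
    bottom = max(top, min(roi.bottom, frame_height))
    return Box(left, top, right, bottom)


def area_fraction(roi: Box, frame_width: int, frame_height: int) -> float:
    """Share of the full frame covered by the region of interest."""
    roi_area = (roi.right - roi.left) * (roi.bottom - roi.top)
    return roi_area / (frame_width * frame_height)


roi = clamp_roi(Box(-50, 100, 2000, 900), frame_width=1920, frame_height=1080)
print(roi)
print(area_fraction(Box(0, 0, 960, 540), 1920, 1080))
```

A tighter region means the subject fills more of the resized input, which may let you choose a smaller image_size for the same task.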


Ability Prompt Creation Agent

The Ability Prompt Creation Agent helps developers generate reliable prompts for visual AI tasks.

Prompt design is one of the hardest parts of working with vision-language models. Small wording changes can significantly affect accuracy, consistency, and cost.

The Prompt Creation Agent analyzes the task you want to perform and generates a production-ready prompt optimized for EyePop Abilities.

Why It Exists

Vision models are extremely sensitive to how instructions are written.

Poor prompts can cause issues such as:

  • inconsistent classifications

  • overly verbose outputs

  • hallucinated results

  • unpredictable formatting

  • increased compute cost

The Prompt Creation Agent helps avoid these problems by generating prompts that follow tested patterns.

How to run the Ability Prompt Creation Agent

Coming soon to the Dashboard
