Action Reference

Complete reference for all EvalCI inputs, outputs, and configuration options.


Usage

- uses: SynapseKit/evalci@v1
  with:
    path: tests/evals
    threshold: "0.80"
    extras: "openai"
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

Inputs

| Input | Default | Description |
| --- | --- | --- |
| path | . | Path to eval files or directory. EvalCI discovers all @eval_case functions recursively. |
| threshold | 0.7 | Global minimum score (0.0–1.0). Cases scoring below this fail. |
| extras | openai | pip extras to install with synapsekit. Use comma-separated values for multiple providers. |
| synapsekit-version | latest | Pin a specific synapsekit version (e.g. 1.5.2) or latest. |
| github-token | ${{ github.token }} | GitHub token used to post PR comments. The default works for most repos. |
| fail-on-regression | false | Set to true to also fail if scores regress compared to a saved baseline. |
| token | (empty) | Reserved for future EvalCI backend API token. Leave blank for now. |

Outputs

Access outputs in subsequent steps using steps.<id>.outputs.<name>:

| Output | Type | Description |
| --- | --- | --- |
| passed | string | Number of eval cases that passed |
| failed | string | Number of eval cases that failed |
| total | string | Total number of eval cases run |
| mean-score | string | Mean score across all eval cases (4 decimal places) |

- uses: SynapseKit/evalci@v1
  id: eval
  with:
    path: tests/evals

- name: Print results
  run: |
    echo "Passed: ${{ steps.eval.outputs.passed }}/${{ steps.eval.outputs.total }}"
    echo "Mean score: ${{ steps.eval.outputs.mean-score }}"
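
Because every output is a string, a later script has to convert the values before doing arithmetic on them. A minimal sketch in Python, assuming the workflow hands the four outputs to the script as strings; the summarize helper and its pass_rate field are hypothetical and not part of EvalCI:

```python
def summarize(passed: str, failed: str, total: str, mean_score: str) -> dict:
    """Convert EvalCI's string outputs to numbers (hypothetical helper)."""
    passed_n = int(passed)
    total_n = int(total)
    return {
        "passed": passed_n,
        "failed": int(failed),
        "total": total_n,
        "mean_score": float(mean_score),
        # Guard against division by zero when no cases were discovered.
        "pass_rate": passed_n / total_n if total_n else 0.0,
    }


if __name__ == "__main__":
    # Example values only; real values come from steps.<id>.outputs.<name>.
    print(summarize("8", "2", "10", "0.8125"))
```

One way to feed the script is to map each output to an environment variable in the step's env block and read it with os.environ.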

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | All eval cases passed (score ≥ threshold, cost ≤ max, latency ≤ max) |
| 1 | One or more cases failed, or regression detected (if fail-on-regression: true) |

Provider extras

The extras input controls which LLM provider packages are installed alongside synapsekit.

| Provider | extras value |
| --- | --- |
| OpenAI | openai |
| Anthropic | anthropic |
| Google Gemini | google-generativeai |
| Ollama (local) | ollama |
| Cohere | cohere |
| Groq | groq |
| Multiple providers | openai,anthropic |

- uses: SynapseKit/evalci@v1
  with:
    extras: "openai,anthropic"
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

Pinning a version

By default EvalCI always installs the latest synapsekit. To pin:

- uses: SynapseKit/evalci@v1
  with:
    synapsekit-version: "1.5.2"
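
The extras and synapsekit-version inputs together describe what gets installed. EvalCI's actual install command is not shown in this reference, so the sketch below only illustrates how the two values could map onto a standard pip requirement string. The pip_requirement helper is hypothetical:

```python
def pip_requirement(extras: str = "openai", version: str = "latest") -> str:
    """Build a pip requirement string from the two inputs (illustrative only)."""
    # Split the comma-separated extras and drop blanks or stray whitespace.
    names = [e.strip() for e in extras.split(",") if e.strip()]
    spec = "synapsekit"
    if names:
        spec += "[" + ",".join(names) + "]"
    # "latest" means no version pin; anything else is pinned exactly.
    if version and version != "latest":
        spec += "==" + version
    return spec


if __name__ == "__main__":
    print(pip_requirement("openai, anthropic", "1.5.2"))
```

Pinning exactly with == keeps CI runs reproducible, at the cost of having to bump the version by hand.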

Custom threshold per case

The threshold input sets a global floor. You can override it per case using the min_score parameter on @eval_case:

# This case needs 0.90, regardless of the global threshold
@eval_case(min_score=0.90)
async def test_critical_path():
    ...

# This case only needs 0.60
@eval_case(min_score=0.60)
async def test_experimental_feature():
    ...

A case fails if its score is below either its own min_score or the global threshold — whichever is stricter.
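
The "whichever is stricter" rule can be written as a maximum of the two floors. The sketch below is a literal reading of the sentence above and not EvalCI's actual code; effective_threshold and case_passes are hypothetical names:

```python
from typing import Optional


def effective_threshold(global_threshold: float,
                        min_score: Optional[float] = None) -> float:
    """Return the stricter (higher) of the two floors."""
    if min_score is None:
        return global_threshold
    return max(global_threshold, min_score)


def case_passes(score: float, global_threshold: float,
                min_score: Optional[float] = None) -> bool:
    """A case passes when its score reaches the effective threshold."""
    return score >= effective_threshold(global_threshold, min_score)


if __name__ == "__main__":
    # Global threshold 0.80, per-case min_score 0.90, observed score 0.85.
    print(case_passes(0.85, global_threshold=0.80, min_score=0.90))
```

Under this reading, a min_score below the global threshold would not loosen the requirement for that case. That is an inference from the wording, so it is worth confirming against a real run before relying on it.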


Disabling PR comments

To run evals without posting a comment (e.g. on a branch that has no PR):

- uses: SynapseKit/evalci@v1
  with:
    github-token: ""  # empty token disables comment posting
    path: tests/evals

Running on push as well as PRs

on:
  pull_request:
  push:
    branches: [main]

On push events (not a PR), EvalCI still runs the evals and sets outputs, but skips the PR comment since there is no PR to comment on.
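
The comment behavior described on this page (no comment on push events, no comment with an empty github-token) can be summarized as a single condition. This is a sketch of that rule, not EvalCI's implementation. should_post_comment is a hypothetical name, GITHUB_EVENT_NAME is the variable GitHub Actions sets for the triggering event, and the GITHUB_TOKEN variable is assumed to be passed in by the workflow:

```python
import os


def should_post_comment(event_name: str, github_token: str) -> bool:
    """Comment only on pull_request events and only when a token is set."""
    return event_name == "pull_request" and bool(github_token)


if __name__ == "__main__":
    # Outside GitHub Actions both variables are unset, so this prints False.
    event = os.environ.get("GITHUB_EVENT_NAME", "")
    token = os.environ.get("GITHUB_TOKEN", "")
    print("post comment:", should_post_comment(event, token))
```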

