JSON Equal (json_equal)
Metric description
json_equal parses the output and the golden answer as JSON values and compares them for equality. Options allow ignoring the order of array elements and ignoring extra keys in the output when comparing objects.
How to interpret the score
- 100: parsed JSON values are equal under the chosen rules.
- 0: parse errors or structural or value mismatch.
API usage
Prerequisites
After the environment variables are configured, the next step is to create a JSON payload for the custom-runs request. For a field-by-field description of the payload (top-level keys, evaluations, and each row in data), see Custom run request body.
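The sample code below reads its configuration from two environment variables. A minimal `.env` sketch is shown here; the variable names come from the example, but the values are placeholders you must replace with your own key and deployment URL:

```shell
# .env — loaded by load_dotenv() in the example below.
# Both values are placeholders; substitute your real key and base URL.
export AEGIS_API_KEY="sk-your-key-here"
export AEGIS_API_BASE_URL="https://api.example.com/v1"
```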
Shortname: json_equal
Default threshold: 100
Structural metrics run without an LLM (deterministic checks). Your run may still include model_slug where the API expects it; scoring does not depend on it for this category.
Inputs (each object in data)
- output (str, required): JSON text.
- golden_answer (str, required): Reference JSON text.
metric_args
- ignore_order (boolean, optional): Ignore the order of array elements when comparing. Default: false.
- ignore_extra_keys (boolean, optional): Ignore keys present in the output but not in the reference object. Default: false.
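To make the two flags concrete, here is an illustrative local re-implementation of the comparison rules. This is a sketch of the semantics described above, not the service's actual code; in particular, the greedy element matching under ignore_order is an assumption:

```python
import json


def json_equal(output: str, golden: str,
               ignore_order: bool = False,
               ignore_extra_keys: bool = False) -> int:
    """Return 100 if both strings parse to equal JSON values, else 0."""
    try:
        a, b = json.loads(output), json.loads(golden)
    except ValueError:
        return 0  # parse errors score 0
    return 100 if _eq(a, b, ignore_order, ignore_extra_keys) else 0


def _eq(a, b, ignore_order, ignore_extra_keys):
    if isinstance(a, dict) and isinstance(b, dict):
        if ignore_extra_keys:
            # output may carry extra keys, but every reference key must exist
            if not set(b) <= set(a):
                return False
        elif set(a) != set(b):
            return False
        return all(_eq(a[k], b[k], ignore_order, ignore_extra_keys) for k in b)
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return False
        if not ignore_order:
            return all(_eq(x, y, ignore_order, ignore_extra_keys)
                       for x, y in zip(a, b))
        # ignore_order: greedily match each golden element to a distinct
        # output element (a simplification for illustration)
        used = set()
        for y in b:
            for i, x in enumerate(a):
                if i not in used and _eq(x, y, ignore_order, ignore_extra_keys):
                    used.add(i)
                    break
            else:
                return False
        return True
    return a == b
```

For example, `json_equal('[2, 1]', '[1, 2]')` scores 0 by default but 100 with `ignore_order=True`, and `json_equal('{"a": 1, "b": 2}', '{"a": 1}', ignore_extra_keys=True)` scores 100.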
Eval metadata
Structural metrics do not populate eval_metadata; the field is omitted or null on the result object.
Example
import json
import os

import requests
from dotenv import load_dotenv

load_dotenv(override=True)

_API_KEY = os.getenv("AEGIS_API_KEY")
_BASE_URL = os.getenv("AEGIS_API_BASE_URL")
_CUSTOM_RUN_URL = f"{_BASE_URL}/runs/custom"


def post_custom_run(payload: dict) -> requests.Response:
    """POST a JSON payload to the Aegis custom-runs endpoint; return the raw response."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {_API_KEY}",
    }
    return requests.post(
        _CUSTOM_RUN_URL,
        headers=headers,
        data=json.dumps(payload),
        timeout=30,
    )


if __name__ == "__main__":
    data = [
        {"output": "{\"a\": 1}", "golden_answer": "{\"a\": 1}"}
    ]
    payload = {
        "threshold": 100,
        "model_slug": "o4-mini",
        "is_blocking": True,
        "data_collection_id": None,
        "evaluations": [
            {
                "metrics": [
                    {
                        "metric": "json_equal",
                        "metric_args": {"ignore_order": False, "ignore_extra_keys": False},
                    },
                ],
                "threshold": 100,
                "model_slug": "o4-mini",
                "data": data,
            }
        ],
    }

    response = post_custom_run(payload)
    response.raise_for_status()
    print(json.dumps(response.json(), indent=2))