New project
Worker API
The local server will fetch jobs using a pull model.
POST /api/worker/claim
POST /api/worker/jobs/{id}/complete
Header: X-Worker-Token
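A minimal sketch of how a worker might build these two calls. Only the paths and the `X-Worker-Token` header come from the list above; the base URL, the empty claim body, and the shape of the completion payload are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: the real host is not given


def _headers(token: str) -> dict:
    # X-Worker-Token is the header named above; the JSON content type is an assumption.
    return {"X-Worker-Token": token, "Content-Type": "application/json"}


def build_claim_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the POST request a worker would send to claim the next job."""
    return urllib.request.Request(
        f"{base_url}/api/worker/claim",
        data=b"{}",  # assumption: an empty JSON object as the body
        headers=_headers(token),
        method="POST",
    )


def build_complete_request(
    base_url: str, token: str, job_id: int, result: dict
) -> urllib.request.Request:
    """Build the POST request that reports a finished job."""
    return urllib.request.Request(
        f"{base_url}/api/worker/jobs/{job_id}/complete",
        data=json.dumps(result).encode("utf-8"),  # assumption: result sent as JSON
        headers=_headers(token),
        method="POST",
    )
```

A worker loop would pass each request to `urllib.request.urlopen`, run the claimed job, and then send the completion request. How the server signals an empty queue is not specified here, so the loop itself is left out.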
OpenRouter
API key configured. Test agent requests can now be run.
Projects
| Project | Status | Hypotheses | Experiments | Queue | Results |
|---|---|---|---|---|---|
| MVP Smoke Project (mvp-smoke-1780175317) | active, priority 90 | 1 | 1 | 0 queued / 0 running | 1 |
| MVP Smoke Project (mvp-smoke) | active, priority 90 | 1 | 1 | 0 queued / 1 running | 0 |
| AI Feature Smoke (ai-feature-1780177760) | active, priority 75 | 1 | 1 | 2 queued / 0 running | 0 |
Recent simulation jobs
| ID | Project | Experiment | Status | Seed | Worker |
|---|---|---|---|---|---|
| 1 | MVP Smoke Project | Smoke experiment | running | 42 | smoke-worker |
| 3 | AI Feature Smoke | Test impact of bonus frequency on RTP volatility | queued | 42 | |
| 4 | AI Feature Smoke | Test impact of bonus frequency on RTP volatility | queued | 42 | |
| 2 | MVP Smoke Project | Smoke experiment | done | 42 | smoke-worker |
Model profiles
| Profile | Model | Temperature | Max tokens | Purpose |
|---|---|---|---|---|
| cheap_fast | | | | |
| balanced | | | | |
| strong_reasoning | | | | |
| critical | | | | |
OpenRouter catalog
| Model ID | Name | Context (tokens) | Prompt (USD/token) | Completion (USD/token) |
|---|---|---|---|---|
| | AI21: Jamba Large 1.7 | 256000 | 0.000002000000 | 0.000008000000 |
| | AionLabs: Aion-1.0 | 131072 | 0.000004000000 | 0.000008000000 |
| | AionLabs: Aion-1.0-Mini | 131072 | 0.000000700000 | 0.000001400000 |
| | AionLabs: Aion-2.0 | 131072 | 0.000000800000 | 0.000001600000 |
| | AionLabs: Aion-RP 1.0 (8B) | 32768 | 0.000000800000 | 0.000001600000 |
| | AlfredPros: CodeLLaMa 7B Instruct Solidity | 4096 | 0.000000800000 | 0.000001200000 |
| | AllenAI: Olmo 3 32B Think | 65536 | 0.000000150000 | 0.000000500000 |
| | Amazon: Nova 2 Lite | 1000000 | 0.000000300000 | 0.000002500000 |
| | Amazon: Nova Lite 1.0 | 300000 | 0.000000060000 | 0.000000240000 |
| | Amazon: Nova Micro 1.0 | 128000 | 0.000000035000 | 0.000000140000 |
| | Amazon: Nova Premier 1.0 | 1000000 | 0.000002500000 | 0.000012500000 |
| | Amazon: Nova Pro 1.0 | 300000 | 0.000000800000 | 0.000003200000 |
| | Magnum v4 72B | 32768 | 0.000003000000 | 0.000005000000 |
| | Anthropic: Claude 3.5 Haiku | 200000 | 0.000000800000 | 0.000004000000 |
| | Anthropic: Claude 3 Haiku | 200000 | 0.000000250000 | 0.000001250000 |
| | Anthropic: Claude Haiku 4.5 | 200000 | 0.000001000000 | 0.000005000000 |
| | Anthropic Claude Haiku Latest | 200000 | 0.000001000000 | 0.000005000000 |
| | Anthropic: Claude Opus 4 | 200000 | 0.000015000000 | 0.000075000000 |
| | Anthropic: Claude Opus 4.1 | 200000 | 0.000015000000 | 0.000075000000 |
| | Anthropic: Claude Opus 4.5 | 200000 | 0.000005000000 | 0.000025000000 |
| | Anthropic: Claude Opus 4.6 | 1000000 | 0.000005000000 | 0.000025000000 |
| | Anthropic: Claude Opus 4.6 (Fast) | 1000000 | 0.000030000000 | 0.000150000000 |
| | Anthropic: Claude Opus 4.7 | 1000000 | 0.000005000000 | 0.000025000000 |
| | Anthropic: Claude Opus 4.7 (Fast) | 1000000 | 0.000030000000 | 0.000150000000 |
| | Anthropic: Claude Opus 4.8 | 1000000 | 0.000005000000 | 0.000025000000 |
| | Anthropic: Claude Opus 4.8 (Fast) | 1000000 | 0.000010000000 | 0.000050000000 |
| | Anthropic: Claude Opus Latest | 1000000 | 0.000005000000 | 0.000025000000 |
| | Anthropic: Claude Sonnet 4 | 1000000 | 0.000003000000 | 0.000015000000 |
| | Anthropic: Claude Sonnet 4.5 | 1000000 | 0.000003000000 | 0.000015000000 |
| | Anthropic: Claude Sonnet 4.6 | 1000000 | 0.000003000000 | 0.000015000000 |
| | Anthropic Claude Sonnet Latest | 1000000 | 0.000003000000 | 0.000015000000 |
| | Arcee AI: Coder Large | 32768 | 0.000000500000 | 0.000000800000 |
| | Arcee AI: Maestro Reasoning | 131072 | 0.000000900000 | 0.000003300000 |
| | Arcee AI: Spotlight | 131072 | 0.000000180000 | 0.000000180000 |
| | Arcee AI: Trinity Large Thinking | 262144 | 0.000000220000 | 0.000000850000 |
| | Arcee AI: Trinity Mini | 131072 | 0.000000045000 | 0.000000150000 |
| | Arcee AI: Virtuoso Large | 131072 | 0.000000750000 | 0.000001200000 |
| | Baidu: ERNIE 4.5 21B A3B | 131072 | 0.000000070000 | 0.000000280000 |
| | Baidu: ERNIE 4.5 21B A3B Thinking | 131072 | 0.000000070000 | 0.000000280000 |
| | Baidu: ERNIE 4.5 300B A47B | 131072 | 0.000000280000 | 0.000001100000 |
| | Baidu: ERNIE 4.5 VL 28B A3B | 131072 | 0.000000140000 | 0.000000560000 |
| | Baidu: ERNIE 4.5 VL 424B A47B | 131072 | 0.000000420000 | 0.000001250000 |
| | ByteDance Seed: Seed 1.6 | 262144 | 0.000000250000 | 0.000002000000 |
| | ByteDance Seed: Seed 1.6 Flash | 262144 | 0.000000075000 | 0.000000300000 |
| | ByteDance Seed: Seed-2.0-Lite | 262144 | 0.000000250000 | 0.000002000000 |
| | ByteDance Seed: Seed-2.0-Mini | 262144 | 0.000000100000 | 0.000000400000 |
| | ByteDance: UI-TARS 7B | 128000 | 0.000000100000 | 0.000000200000 |
| | Venice: Uncensored (free) | 32768 | | |
| | Cohere: Command A | 256000 | 0.000002500000 | 0.000010000000 |
| | Cohere: Command R (08-2024) | 128000 | 0.000000150000 | 0.000000600000 |
| | Cohere: Command R7B (12-2024) | 128000 | 0.000000037500 | 0.000000150000 |
| | Cohere: Command R+ (08-2024) | 128000 | 0.000002500000 | 0.000010000000 |
| | Deep Cogito: Cogito v2.1 671B | 128000 | 0.000001250000 | 0.000001250000 |
| | DeepSeek: DeepSeek V3 | 131072 | 0.000000228800 | 0.000000914400 |
| | DeepSeek: DeepSeek V3 0324 | 163840 | 0.000000200000 | 0.000000770000 |
| | DeepSeek: DeepSeek V3.1 | 163840 | 0.000000210000 | 0.000000790000 |
| | DeepSeek: R1 | 163840 | 0.000000700000 | 0.000002500000 |
| | DeepSeek: R1 0528 | 163840 | 0.000000500000 | 0.000002150000 |
| | DeepSeek: R1 Distill Llama 70B | 131072 | 0.000000700000 | 0.000000800000 |
| | DeepSeek: R1 Distill Qwen 32B | 128000 | 0.000000290000 | 0.000000290000 |
| | DeepSeek: DeepSeek V3.1 Terminus | 163840 | 0.000000270000 | 0.000000950000 |
| | DeepSeek: DeepSeek V3.2 | 131072 | 0.000000252000 | 0.000000378000 |
| | DeepSeek: DeepSeek V3.2 Exp | 163840 | 0.000000270000 | 0.000000410000 |
| | DeepSeek: DeepSeek V4 Flash | 1048576 | 0.000000098300 | 0.000000196600 |
| | DeepSeek: DeepSeek V4 Flash (free) | 1048576 | | |
| | DeepSeek: DeepSeek V4 Pro | 1048576 | 0.000000435000 | 0.000000870000 |
| | EssentialAI: Rnj 1 Instruct | 32768 | 0.000000150000 | 0.000000150000 |
| | Google: Gemini 2.0 Flash | 1048576 | 0.000000100000 | 0.000000400000 |
| | Google: Gemini 2.0 Flash Lite | 1048576 | 0.000000075000 | 0.000000300000 |
| | Google: Gemini 2.5 Flash | 1048576 | 0.000000300000 | 0.000002500000 |
| | Google: Nano Banana (Gemini 2.5 Flash Image) | 32768 | 0.000000300000 | 0.000002500000 |
| | Google: Gemini 2.5 Flash Lite | 1048576 | 0.000000100000 | 0.000000400000 |
| | Google: Gemini 2.5 Flash Lite Preview 09-2025 | 1048576 | 0.000000100000 | 0.000000400000 |
| | Google: Gemini 2.5 Pro | 1048576 | 0.000001250000 | 0.000010000000 |
| | Google: Gemini 2.5 Pro Preview 06-05 | 1048576 | 0.000001250000 | 0.000010000000 |
| | Google: Gemini 2.5 Pro Preview 05-06 | 1048576 | 0.000001250000 | 0.000010000000 |
| | Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) | 131072 | 0.000000500000 | 0.000003000000 |
| | Google: Gemini 3.1 Flash Lite | 1048576 | 0.000000250000 | 0.000001500000 |
| | Google: Gemini 3.1 Flash Lite Preview | 1048576 | 0.000000250000 | 0.000001500000 |
| | Google: Gemini 3.1 Pro Preview | 1048576 | 0.000002000000 | 0.000012000000 |
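The catalog prices can be turned into a per-request cost estimate. The sketch below assumes the Prompt and Completion columns are USD per token and that a request is billed as prompt tokens times the prompt price plus completion tokens times the completion price. The function name and the token counts are illustrative; the two prices are taken from the "Anthropic: Claude Haiku 4.5" row.

```python
from decimal import Decimal


def estimate_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price: str,
    completion_price: str,
) -> Decimal:
    """Estimate the cost of one request in USD, assuming per-token prices."""
    # Decimal avoids the rounding noise binary floats show at these magnitudes.
    return (
        Decimal(prompt_price) * prompt_tokens
        + Decimal(completion_price) * completion_tokens
    )


# Illustrative token counts; prices from the Claude Haiku 4.5 catalog row.
cost = estimate_cost(1000, 500, "0.000001000000", "0.000005000000")
print(f"Estimated cost: {cost} USD")
```

Prices are passed as strings so the catalog values reach `Decimal` without a float conversion on the way.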
Agent tasks
| ID | Project | Type | Profile | Status | Error |
|---|---|---|---|---|---|
| 7 | AI Feature Smoke | coding_plan | cheap_fast | done | |
| 6 | AI Feature Smoke | coding_plan | cheap_fast | failed | Expecting ',' delimiter: line 42 column 164 (char 3657) |
| 5 | AI Feature Smoke | coding_plan | cheap_fast | failed | Expecting ',' delimiter: line 43 column 170 (char 3802) |
| 4 | AI Feature Smoke | coding_plan | cheap_fast | failed | Expecting ',' delimiter: line 41 column 173 (char 3654) |
| 3 | AI Feature Smoke | coding_plan | cheap_fast | failed | Expecting ',' delimiter: line 114 column 6 (char 3806) |
| 2 | AI Feature Smoke | plan_experiment | cheap_fast | done | |
| 1 | AI Feature Smoke | generate_hypotheses | cheap_fast | done | |
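The error strings in the failed rows have the shape of messages from Python's `json.JSONDecodeError`, which suggests that the model reply was not valid JSON. The sketch below shows one way a backend might parse such replies, assuming Python and assuming replies sometimes arrive wrapped in a Markdown `json` code fence. The helper names are hypothetical and do not come from the project.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to keep the snippet readable
FENCE_RE = re.compile(rf"^{FENCE}(?:json)?\s*\n(.*?)\n?{FENCE}$", re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Return the body of a fenced block, or the stripped text if unfenced."""
    stripped = text.strip()
    match = FENCE_RE.match(stripped)
    return match.group(1) if match else stripped


def parse_agent_output(text: str):
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return json.loads(strip_code_fence(text)), None
    except json.JSONDecodeError as exc:
        # str(exc) includes the line, column, and character offset of the failure.
        return None, str(exc)


data, error = parse_agent_output('{"a": 1 "b": 2}')
print(error)
```

Stripping the fence only fixes the wrapper. A reply with a missing comma still fails, so the message should be stored with the task, as the Error column does.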
Agent runs
| ID | Project | Profile | Model | Tokens | Result |
|---|---|---|---|---|---|
| 4 | AI Feature Smoke | cheap_fast | openai/gpt-4.1-mini-2025-04-14 | 649 / 1423 | {
"title": "Python CLI for Simulation Results Summary",
"requirements": [
{"id": "R1", "text": "Read simulation results from a CSV file", "priority": "must"},
{"id": "R2", "text": "Compute statistical summaries (mean, standard deviation, quantiles) for selected metrics", "priority": "must"},
{"id": "R3", "text": "Export the computed summary statistics to a JSON file", "priority": "must"},
{"id": "R4", "text": "Provide a command-line interface to specify input CSV, output JSON, and metrics to summarize", "priority": "must"},
{"id": "R5", "text": "Include unit tests covering CSV reading, statistics computation, and JSON export", "priority": "must"},
{"id": "R6", "text": "Handle invalid input files or missing metrics gracefully with clear error messages", "priority": "should"},
{"id": "R7", "text": "Allow configuration of quantile levels via CLI or config", "priority": "could"}
],
"architecture": {
"summary": "Modular CLI utility that reads CSV data, computes statistics on selected metrics, and writes a JSON summary output.",
"components": [
{"name": "CSVReader", "responsibility": "Load and validate CSV data from file"},
{"name": "StatisticsCalculator", "responsibility": "Compute mean, std, and quantiles for specified metrics"},
{"name": "JSONExporter", "responsibility": "Serialize computed statistics into JSON format and write to file"},
{"name": "CLIHandler", "responsibility": "Parse command line arguments and coordinate workflow"},
{"name": "ErrorHandler", "responsibility": "Manage errors and provide user-friendly messages"}
],
"data_flow": [
"CLIHandler parses input arguments",
"CSVReader loads CSV data",
"StatisticsCalculator computes summaries for selected metrics",
"JSONExporter writes summary JSON output",
"CLIHandler reports success or errors"
]
},
"files": [
{"path": "cli.py", "purpose": "Entry point and CLI argument parsing", "main_symbols": ["main", "parse_args"]},
{"path": "csv_reader.py", "purpose": "Load and validate CSV simulation results", "main_symbols": ["CSVReader"]},
{"path": "statistics.py", "purpose": "Compute statistical summaries for metrics", "main_symbols": ["StatisticsCalculator"]},
{"path": "json_exporter.py", "purpose": "Export computed statistics to JSON file", "main_symbols": ["JSONExporter"]},
{"path": "errors.py", "purpose": "Define custom exceptions and error handling utilities", "main_symbols": ["InvalidInputError"]},
{"path": "tests/test_csv_reader.py", "purpose": "Unit tests for CSV reading and validation", "main_symbols": ["test_csv_loading", "test_invalid_csv"]},
{"path": "tests/test_statistics.py", "purpose": "Unit tests for statistics computations", "main_symbols": ["test_mean_std_quantiles"]},
{"path": "tests/test_json_exporter.py", "purpose": "Unit tests for JSON export functionality", "main_symbols": ["test_json_output"]}
],
"implementation_steps": [
{"step": 1, "title": "Setup project structure and dependencies", "details": "Create project folders, initialize Python environment, add requirements (e.g. pandas, numpy)"},
{"step": 2, "title": "Implement CSVReader", "details": "Develop CSVReader class to load CSV, validate presence of required columns, and handle errors"},
{"step": 3, "title": "Implement StatisticsCalculator", "details": "Create functions to compute mean, std deviation, and configurable quantiles for given metrics"},
{"step": 4, "title": "Implement JSONExporter", "details": "Serialize statistics dictionary to JSON and write to output file"},
{"step": 5, "title": "Implement CLIHandler", "details": "Use argparse to parse input CSV path, output JSON path, and metrics list; invoke components in order"},
{"step": 6, "title": "Add error handling and user messages", "details": "Catch and report errors such as file not found, missing metrics, or parse errors"},
{"step": 7, "title": "Write unit tests", "details": "Cover CSV reading, statistics calculations, and JSON export with representative test cases"},
{"step": 8, "title": "Manual testing and validation", "details": "Run CLI with sample CSV files, verify JSON output correctness and error handling"},
{"step": 9, "title": "Documentation and usage instructions", "details": "Document CLI usage, input/output formats, and configuration options"}
],
"test_plan": [
{"type": "unit", "target": "CSVReader", "cases": ["valid CSV loads correctly", "missing columns raises error", "empty file handling"]},
{"type": "unit", "target": "StatisticsCalculator", "cases": ["correct mean/std for sample data", "quantiles computed accurately", "handle empty data gracefully"]},
{"type": "unit", "target": "JSONExporter", "cases": ["valid JSON output format", "file write success", "handle write permission errors"]},
{"type": "integration", "target": "CLI end-to-end", "cases": ["valid input produces correct JSON", "invalid input shows error message", "metrics selection works as expected"]},
{"type": "manual", "target": "CLI usability", "cases": ["help message clarity", "error messages user-friendly", "performance on large CSV files"]}
],
"risks": [
{"risk": "Input CSV files may have inconsistent formatting or missing data", "mitigation": "Implement robust validation and clear error messages; provide example CSV format"},
{"risk": "Large CSV files may cause high memory usage or slow processing", "mitigation": "Use efficient libraries like pandas; consider streaming if needed"},
{"risk": "User may specify metrics not present in CSV", "mitigation": "Validate metric names early and inform user with suggestions"},
{"risk": "Quantile calculation may be misunderstood or misconfigured", "mitigation": "Document quantile defaults and allow user configuration with validation"}
],
"open_questions": [
"What are the exact metric names expected in the CSV and their data types?",
"Should quantile levels be fixed or configurable via CLI?",
"Is there a preferred JSON schema or structure for the summary output?",
"Are there any performance constraints or expected CSV file sizes?"
]
} |
| 3 | AI Feature Smoke | cheap_fast | openai/gpt-4.1-mini-2025-04-14 | 484 / 217 | ```json
{
"title": "Test impact of bonus frequency on RTP volatility",
"goal": "Evaluate if increasing bonus frequency reduces RTP volatility under fixed CPU budget",
"parameter_space": {
"bonus_frequency": [0.05, 0.10, 0.20]
},
"success_criteria": {
"primary_metric": "RTP_volatility",
"decision_rule": "RTP_volatility decreases as bonus_frequency increases"
},
"budget": {
"max_jobs": 3,
"max_runtime_seconds": 3600
},
"suggested_jobs": [
{
"seed": 42,
"priority": 70,
"parameters": {"bonus_frequency": 0.05},
"timeout_seconds": 1200
},
{
"seed": 42,
"priority": 60,
"parameters": {"bonus_frequency": 0.20},
"timeout_seconds": 1200
}
]
}
``` |
| 2 | AI Feature Smoke | cheap_fast | openai/gpt-4.1-mini-2025-04-14 | 226 / 88 | ```json
{
"hypotheses": [
{
"title": "Higher Bonus Frequency Reduces RTP Volatility",
"description": "Increasing the frequency of bonus events in the game math model will decrease RTP volatility by providing more frequent smaller wins, which may stabilize player experience under a fixed CPU budget.",
"confidence": 0.35,
"parent_id": null
}
]
}
``` |
| 1 | MVP Smoke Project | cheap_fast | openai/gpt-4.1-mini-2025-04-14 | 54 / 65 | 1. The probability of winning is roughly equal across all strategies over long play. 2. The player's average winnings are close to zero over a large number of rounds. 3. Changing the model parameters does not give either side a systematic advantage. |
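The experiment plan in run 3 carries a `budget` object and a list of `suggested_jobs`. The source does not define how the budget is enforced, so the sketch below is one possible sanity check. It assumes the number of suggested jobs must not exceed `max_jobs` and that the summed `timeout_seconds` must fit within `max_runtime_seconds`. The function name is hypothetical.

```python
def check_budget(plan: dict) -> list[str]:
    """Return a list of budget problems; an empty list means the plan fits."""
    problems = []
    jobs = plan.get("suggested_jobs", [])
    budget = plan.get("budget", {})

    max_jobs = budget.get("max_jobs")
    if max_jobs is not None and len(jobs) > max_jobs:
        problems.append(f"{len(jobs)} jobs exceed max_jobs={max_jobs}")

    # Assumption: jobs run one after another, so their timeouts add up.
    total_timeout = sum(job.get("timeout_seconds", 0) for job in jobs)
    max_runtime = budget.get("max_runtime_seconds")
    if max_runtime is not None and total_timeout > max_runtime:
        problems.append(
            f"total timeout {total_timeout}s exceeds max_runtime_seconds={max_runtime}"
        )
    return problems


# Budget and job timeouts copied from the plan in run 3.
plan = {
    "budget": {"max_jobs": 3, "max_runtime_seconds": 3600},
    "suggested_jobs": [
        {"seed": 42, "parameters": {"bonus_frequency": 0.05}, "timeout_seconds": 1200},
        {"seed": 42, "parameters": {"bonus_frequency": 0.20}, "timeout_seconds": 1200},
    ],
}
print(check_budget(plan))
```

If workers run jobs in parallel, comparing the largest single timeout against the runtime budget may be a better fit than the sum.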