AI MathService Lab

projects, hypotheses, experiments, simulation jobs

AI Feature Smoke

Game math model: tune RTP volatility and bonus frequency under CPU budget.
priority 75 · active

AI: dispatcher

The dispatcher will decompose an arbitrary task into a route of agents and a review.

AI: hypotheses

The agent will create new nodes in the project graph.

AI: coding plan

The agent will prepare an implementation plan for a Python program without modifying any files.

New hypothesis

New experiment

Simulation job

Dispatcher plans

#1 Python CLI for Simulation Results Summary

planned · 2026-05-30 22:22:29.027363+00:00
Write a Python CLI utility according to the spec: read a CSV of simulation results, compute mean/std/quantiles, export JSON, then review the test plan and risks.

Route

{
  "human_approval_points": [
    "Approval of the coding plan and test plan before implementation"
  ],
  "intent": "Develop a Python CLI utility to read CSV simulation results, compute statistical summaries (mean, std, quantiles), export to JSON, and review test plans and risks",
  "notes": [
    "The plan must handle invalid inputs gracefully with clear error messages.",
    "Quantile levels should be configurable via CLI or config as a 'could' priority.",
    "Unit tests should cover CSV reading, statistics computation, JSON export, and error handling.",
    "Review is critical to ensure risk mitigations address potential CSV data issues and performance concerns."
  ],
  "review": {
    "agent": "reviewer",
    "criteria": [
      "Completeness of implementation plan",
      "Adequacy of test coverage",
      "Appropriateness of risk mitigations",
      "Alignment with project requirements"
    ],
    "model_profile": "critical",
    "required": true
  },
  "risk_level": "medium",
  "summary": "Plan and design a Python CLI tool that processes CSV simulation outputs to produce JSON summaries with statistical metrics, followed by a review of testing plans and risk mitigations.",
  "tasks": [
    {
      "agent": "coding_planner",
      "depends_on": [],
      "expected_output": "A structured coding plan with implementation steps, component responsibilities, and test plan aligned with the project context and requirements.",
      "instructions": "Create a detailed implementation plan for a Python CLI utility that reads CSV simulation results, computes mean, standard deviation, and configurable quantiles for selected metrics, exports the results to JSON, and includes unit tests and error handling as per the provided requirements and architecture.",
      "model_profile": "balanced",
      "order": 1,
      "requires_human_approval": false,
      "task_type": "coding_plan"
    },
    {
      "agent": "reviewer",
      "depends_on": [
        1
      ],
      "expected_output": "A review report with feedback, identified gaps, and suggestions for improvement.",
      "instructions": "Review the proposed coding plan, test plan, and risk mitigations for completeness, correctness, and alignment with the project requirements and best practices.",
      "model_profile": "critical",
      "order": 2,
      "requires_human_approval": true,
      "task_type": "review"
    }
  ],
  "title": "Python CLI for Simulation Results Summary"
}
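
The depends_on lists in the route above imply an execution order. A minimal sketch of deriving that order, assuming each depends_on entry refers to the "order" value of another task (the page does not state this explicitly) and Python 3.9+ for graphlib; execution_order is a hypothetical helper name, not part of the service:

```python
from graphlib import TopologicalSorter


def execution_order(tasks):
    """Return task 'order' ids so that every task follows its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {task["order"]: set(task["depends_on"]) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


# Trimmed copies of the two route tasks (only the fields used here).
route_tasks = [
    {"order": 2, "agent": "reviewer", "depends_on": [1]},
    {"order": 1, "agent": "coding_planner", "depends_on": []},
]

print(execution_order(route_tasks))  # [1, 2]
```

A cyclic depends_on graph would make static_order() raise graphlib.CycleError, which is one way a dispatcher could reject an invalid route early.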

Created tasks

[
  8,
  9,
  10
]

Coding plans

#2 CLI plan for CSV simulation summary to JSON

2026-05-30 22:35:18.175260+00:00

Requirements

[
  {
    "id": "R1",
    "priority": "must",
    "text": "Read simulation results from a CSV file and validate that the file is readable and parseable."
  },
  {
    "id": "R2",
    "priority": "must",
    "text": "Compute mean, standard deviation, and quantiles for selected numeric metrics."
  },
  {
    "id": "R3",
    "priority": "must",
    "text": "Export computed summaries to a JSON file with a stable, documented structure."
  },
  {
    "id": "R4",
    "priority": "must",
    "text": "Provide a CLI for input CSV path, output JSON path, selected metrics, and quantile configuration."
  },
  {
    "id": "R5",
    "priority": "must",
    "text": "Include unit tests for CSV reading, statistics computation, JSON export, and error handling."
  },
  {
    "id": "R6",
    "priority": "should",
    "text": "Handle invalid inputs gracefully with clear, actionable error messages."
  },
  {
    "id": "R7",
    "priority": "could",
    "text": "Support configurable quantile levels via CLI or config file."
  }
]

Architecture

{
  "components": [
    {
      "name": "CLIHandler",
      "responsibility": "Parse CLI arguments, validate top-level options, and orchestrate the workflow."
    },
    {
      "name": "CSVReader",
      "responsibility": "Load CSV content, validate required columns and numeric fields, and normalize data for downstream processing."
    },
    {
      "name": "StatisticsCalculator",
      "responsibility": "Compute mean, standard deviation, and configurable quantiles for selected metrics."
    },
    {
      "name": "JSONExporter",
      "responsibility": "Serialize the summary structure and write it to the requested output path."
    },
    {
      "name": "ErrorHandler",
      "responsibility": "Map exceptions to clear CLI messages and exit codes."
    }
  ],
  "data_flow": [
    "CLI parses input/output paths, metric names, and quantile options.",
    "CSVReader loads and validates the CSV data for the requested metrics.",
    "StatisticsCalculator computes summaries for each selected metric.",
    "JSONExporter writes the aggregated summary object to JSON.",
    "CLIHandler reports success or a clear error message."
  ],
  "summary": "A small modular CLI utility: parse arguments, load and validate CSV data, compute per-metric statistics, serialize results to JSON, and return user-friendly errors without exposing stack traces in normal CLI usage."
}
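
The ErrorHandler responsibility above (map exceptions to clear CLI messages and exit codes, without stack traces) could look roughly like the sketch below. The exit code values and the run_with_error_handling name are assumptions made for illustration; only InvalidInputError is named in the plan.

```python
import sys


class InvalidInputError(Exception):
    """Domain error for bad user input (name taken from the planned errors.py)."""


# Hypothetical exit codes; the plan does not fix specific values.
EXIT_OK = 0
EXIT_INVALID_INPUT = 2
EXIT_IO_ERROR = 3


def run_with_error_handling(workflow, stderr=None):
    """Run a zero-argument workflow callable and return a process exit code."""
    stderr = stderr if stderr is not None else sys.stderr
    try:
        workflow()
    except InvalidInputError as exc:
        # Known domain error: one concise line, no traceback.
        print(f"error: {exc}", file=stderr)
        return EXIT_INVALID_INPUT
    except OSError as exc:
        # Covers missing files and permission problems on read or write.
        print(f"error: cannot access file: {exc}", file=stderr)
        return EXIT_IO_ERROR
    return EXIT_OK
```

In a cli.py entry point this would typically be used as sys.exit(run_with_error_handling(lambda: main_workflow(args))), where main_workflow is again a placeholder name.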

Files

[
  {
    "main_symbols": [
      "main",
      "parse_args"
    ],
    "path": "cli.py",
    "purpose": "CLI entry point and argument parsing."
  },
  {
    "main_symbols": [
      "CSVReader"
    ],
    "path": "csv_reader.py",
    "purpose": "Read and validate simulation results from CSV."
  },
  {
    "main_symbols": [
      "StatisticsCalculator"
    ],
    "path": "statistics.py",
    "purpose": "Compute mean, standard deviation, and quantiles."
  },
  {
    "main_symbols": [
      "JSONExporter"
    ],
    "path": "json_exporter.py",
    "purpose": "Write computed statistics to JSON."
  },
  {
    "main_symbols": [
      "InvalidInputError"
    ],
    "path": "errors.py",
    "purpose": "Define domain-specific exceptions for invalid input and processing failures."
  },
  {
    "main_symbols": [
      "test_csv_loading",
      "test_invalid_csv"
    ],
    "path": "tests/test_csv_reader.py",
    "purpose": "Unit tests for CSV parsing and validation."
  },
  {
    "main_symbols": [
      "test_mean_std_quantiles"
    ],
    "path": "tests/test_statistics.py",
    "purpose": "Unit tests for statistical computations."
  },
  {
    "main_symbols": [
      "test_json_output"
    ],
    "path": "tests/test_json_exporter.py",
    "purpose": "Unit tests for JSON serialization and file writing."
  }
]

Implementation

[
  {
    "details": "Specify required arguments, optional quantile levels, metric selection syntax, and the JSON structure for per-metric summaries so implementation and tests align.",
    "step": 1,
    "title": "Define CLI contract and output schema"
  },
  {
    "details": "Create the planned modules and define custom exceptions for missing file, invalid CSV format, missing metrics, non-numeric data, and export failures.",
    "step": 2,
    "title": "Set up module boundaries and exceptions"
  },
  {
    "details": "Load CSV data, verify it is non-empty, ensure requested metrics exist, confirm numeric type compatibility, and surface clear errors for malformed or incomplete inputs.",
    "step": 3,
    "title": "Implement CSVReader validation logic"
  },
  {
    "details": "Compute mean and standard deviation per metric, then compute quantiles from a validated list of levels; keep the logic deterministic and independent of CLI concerns.",
    "step": 4,
    "title": "Implement statistics computation"
  },
  {
    "details": "Build a JSON-serializable summary object and write it safely to the target path, handling permission and write errors explicitly.",
    "step": 5,
    "title": "Implement JSONExporter"
  },
  {
    "details": "Wire parsing, reading, computation, and export together; catch domain errors and present concise, user-friendly messages with non-zero exit codes.",
    "step": 6,
    "title": "Implement CLI orchestration and error handling"
  },
  {
    "details": "Cover valid CSV loading, missing/invalid column handling, statistic correctness, quantile configuration, JSON output shape, and file write failure scenarios using isolated test fixtures.",
    "step": 7,
    "title": "Add unit tests for core behaviors"
  },
  {
    "details": "Verify end-to-end CLI behavior on representative sample CSVs, confirm output JSON correctness, and validate help text and error messages from the command line.",
    "step": 8,
    "title": "Add integration and manual checks"
  }
]
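
Step 4 above (deterministic statistics, independent of CLI concerns) can be sketched with the standard library only. This is an illustration under assumptions: it uses the sample standard deviation (n - 1), linear interpolation between order statistics for quantiles, and default levels of 0.25/0.5/0.75, none of which the plan fixes. One practical point: a top-level module named statistics.py, as listed under Files, would shadow Python's standard statistics module when the project directory comes first on sys.path, so this sketch avoids importing it.

```python
import math


def summarize(values, quantile_levels=(0.25, 0.5, 0.75)):
    """Return count, mean, sample std, and linearly interpolated quantiles."""
    if not values:
        raise ValueError("cannot summarize an empty metric")
    data = sorted(values)
    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation; reported as 0.0 for a single row (an assumption).
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1)) if n > 1 else 0.0

    def quantile(q):
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"quantile level out of range: {q}")
        pos = q * (n - 1)  # fractional index into the sorted data
        lo, hi = math.floor(pos), math.ceil(pos)
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    return {
        "count": n,
        "mean": mean,
        "std": std,
        "quantiles": {str(q): quantile(q) for q in quantile_levels},
    }


print(summarize([1, 2, 3, 4, 5])["quantiles"])  # {'0.25': 2.0, '0.5': 3.0, '0.75': 4.0}
```

Keeping the function free of file and argument handling makes the "single-row input" and "empty input" cases from the test plan easy to check in isolation.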

Tests

[
  {
    "cases": [
      "valid CSV loads successfully",
      "missing required metric raises a clear error",
      "empty CSV is rejected",
      "non-numeric values in numeric metric columns are rejected"
    ],
    "target": "CSVReader",
    "type": "unit"
  },
  {
    "cases": [
      "mean and standard deviation match expected values for known data",
      "quantiles are computed correctly for configured levels",
      "single-row input returns valid summary values",
      "empty input raises a controlled error"
    ],
    "target": "StatisticsCalculator",
    "type": "unit"
  },
  {
    "cases": [
      "summary object is written in valid JSON format",
      "output structure contains required fields",
      "permission or path write errors are handled cleanly"
    ],
    "target": "JSONExporter",
    "type": "unit"
  },
  {
    "cases": [
      "valid input CSV produces expected JSON output",
      "missing file path returns a clear error",
      "unknown metric name is rejected before export",
      "custom quantile levels are accepted and reflected in output"
    ],
    "target": "CLI end-to-end",
    "type": "integration"
  },
  {
    "cases": [
      "help text explains required arguments and quantile option",
      "error messages are understandable without stack traces",
      "large CSV file is processed within acceptable responsiveness"
    ],
    "target": "CLI usability",
    "type": "manual"
  }
]
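
As an illustration of the JSONExporter unit cases above, here is a small unittest sketch. export_summary is a hypothetical stand-in for the planned JSONExporter, and the summary values are made-up fixture data, not simulation results.

```python
import json
import tempfile
import unittest
from pathlib import Path


def export_summary(summary, path):
    """Hypothetical stand-in for JSONExporter: write the summary as sorted-key JSON."""
    text = json.dumps(summary, indent=2, sort_keys=True)
    Path(path).write_text(text, encoding="utf-8")


class JSONExporterTests(unittest.TestCase):
    def test_output_is_valid_json_with_required_fields(self):
        # Made-up fixture values, used only to check the output shape.
        summary = {"rtp": {"mean": 0.96, "std": 0.02, "quantiles": {"0.5": 0.96}}}
        with tempfile.TemporaryDirectory() as tmp:
            target = Path(tmp) / "summary.json"
            export_summary(summary, target)
            loaded = json.loads(target.read_text(encoding="utf-8"))
        self.assertEqual(loaded, summary)
        self.assertLessEqual({"mean", "std", "quantiles"}, set(loaded["rtp"]))

    def test_missing_directory_raises_oserror(self):
        with tempfile.TemporaryDirectory() as tmp:
            target = Path(tmp) / "no_such_dir" / "summary.json"
            with self.assertRaises(OSError):
                export_summary({}, target)


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(JSONExporterTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Using a temporary directory keeps the fixtures isolated, which matches the "isolated test fixtures" wording in implementation step 7.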

Risks / Questions

[
  {
    "mitigation": "Validate headers and row values early, reject invalid records with explicit messages, and include tests for malformed input cases.",
    "risk": "CSV files may contain missing columns, malformed rows, or mixed data types."
  },
  {
    "mitigation": "Prefer efficient data loading and keep computation vectorized where possible; note performance limits in documentation and test with a larger fixture.",
    "risk": "Large simulation CSVs may cause memory or runtime pressure."
  },
  {
    "mitigation": "Validate quantile inputs strictly, document defaults clearly, and expose them through a single CLI option or config path.",
    "risk": "Quantile behavior may be misconfigured or misunderstood by users."
  },
  {
    "mitigation": "Check metric existence and data type before computation and fail fast with a helpful message listing invalid selections.",
    "risk": "Users may request metrics that are not present or not numeric."
  },
  {
    "mitigation": "Define and test a stable output schema and keep exporter tests coupled to the documented structure.",
    "risk": "JSON schema drift may break downstream consumers."
  }
]
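
The first and fourth mitigations above (validate headers and values early, fail fast with a message listing invalid selections) can be sketched as follows. The read_metrics function and the column names rtp and spins are assumptions for illustration; the actual metric names are still an open question in this plan.

```python
import csv
import io


class InvalidInputError(Exception):
    """Raised for CSV content the tool cannot summarize (name from the planned errors.py)."""


def read_metrics(stream, metrics):
    """Return {metric: [float, ...]} or raise InvalidInputError with a clear message."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [m for m in metrics if m not in header]
    if missing:
        raise InvalidInputError(f"metrics not found in CSV header: {', '.join(missing)}")
    columns = {m: [] for m in metrics}
    rows = 0
    for row in reader:
        rows += 1
        for m in metrics:
            try:
                # Short rows yield None here, which float() rejects with TypeError.
                columns[m].append(float(row[m]))
            except (TypeError, ValueError):
                raise InvalidInputError(
                    f"line {reader.line_num}: non-numeric value {row[m]!r} in column {m!r}"
                ) from None
    if rows == 0:
        raise InvalidInputError("CSV contains a header but no data rows")
    return columns


sample = io.StringIO("rtp,spins\n0.95,100\n0.97,120\n")
print(read_metrics(sample, ["rtp"]))  # {'rtp': [0.95, 0.97]}
```

Note that float() also accepts strings such as "nan" and "inf"; whether those count as valid metric values is a decision the plan leaves open.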
[
  "What exact metric names and column types should be assumed in the CSV input?",
  "Should quantile levels be provided as a CLI option, a config file, or both?",
  "Is there a required JSON schema beyond metric -> summary mapping?",
  "Are there constraints on file size or runtime that should influence implementation choices?"
]
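
If quantile levels end up being passed as a single comma-separated CLI value (one possible answer to the second question above, not a decision recorded on this page), strict validation could be sketched like this. parse_quantile_levels is a hypothetical helper name.

```python
def parse_quantile_levels(text):
    """Parse a string such as '0.05,0.5,0.95' into a sorted tuple of unique levels."""
    levels = []
    for part in text.split(","):
        try:
            level = float(part)
        except ValueError:
            raise ValueError(f"invalid quantile level: {part.strip()!r}") from None
        # NaN fails this comparison too, so it is rejected along with out-of-range values.
        if not 0.0 <= level <= 1.0:
            raise ValueError(f"quantile level must be within [0, 1]: {level}")
        levels.append(level)
    return tuple(sorted(set(levels)))


print(parse_quantile_levels("0.95, 0.05,0.5,0.5"))  # (0.05, 0.5, 0.95)
```

Sorting and de-duplicating the levels keeps the JSON output stable regardless of the order in which the user lists them.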

#1 Python CLI for Simulation Results Summary

2026-05-30 22:11:14.197434+00:00

Requirements

[
  {
    "id": "R1",
    "priority": "must",
    "text": "Read simulation results from a CSV file"
  },
  {
    "id": "R2",
    "priority": "must",
    "text": "Compute statistical summaries (mean, standard deviation, quantiles) for selected metrics"
  },
  {
    "id": "R3",
    "priority": "must",
    "text": "Export the computed summary statistics to a JSON file"
  },
  {
    "id": "R4",
    "priority": "must",
    "text": "Provide a command-line interface to specify input CSV, output JSON, and metrics to summarize"
  },
  {
    "id": "R5",
    "priority": "must",
    "text": "Include unit tests covering CSV reading, statistics computation, and JSON export"
  },
  {
    "id": "R6",
    "priority": "should",
    "text": "Handle invalid input files or missing metrics gracefully with clear error messages"
  },
  {
    "id": "R7",
    "priority": "could",
    "text": "Allow configuration of quantile levels via CLI or config"
  }
]

Architecture

{
  "components": [
    {
      "name": "CSVReader",
      "responsibility": "Load and validate CSV data from file"
    },
    {
      "name": "StatisticsCalculator",
      "responsibility": "Compute mean, std, and quantiles for specified metrics"
    },
    {
      "name": "JSONExporter",
      "responsibility": "Serialize computed statistics into JSON format and write to file"
    },
    {
      "name": "CLIHandler",
      "responsibility": "Parse command line arguments and coordinate workflow"
    },
    {
      "name": "ErrorHandler",
      "responsibility": "Manage errors and provide user-friendly messages"
    }
  ],
  "data_flow": [
    "CLIHandler parses input arguments",
    "CSVReader loads CSV data",
    "StatisticsCalculator computes summaries for selected metrics",
    "JSONExporter writes summary JSON output",
    "CLIHandler reports success or errors"
  ],
  "summary": "Modular CLI utility that reads CSV data, computes statistics on selected metrics, and writes a JSON summary output."
}

Files

[
  {
    "main_symbols": [
      "main",
      "parse_args"
    ],
    "path": "cli.py",
    "purpose": "Entry point and CLI argument parsing"
  },
  {
    "main_symbols": [
      "CSVReader"
    ],
    "path": "csv_reader.py",
    "purpose": "Load and validate CSV simulation results"
  },
  {
    "main_symbols": [
      "StatisticsCalculator"
    ],
    "path": "statistics.py",
    "purpose": "Compute statistical summaries for metrics"
  },
  {
    "main_symbols": [
      "JSONExporter"
    ],
    "path": "json_exporter.py",
    "purpose": "Export computed statistics to JSON file"
  },
  {
    "main_symbols": [
      "InvalidInputError"
    ],
    "path": "errors.py",
    "purpose": "Define custom exceptions and error handling utilities"
  },
  {
    "main_symbols": [
      "test_csv_loading",
      "test_invalid_csv"
    ],
    "path": "tests/test_csv_reader.py",
    "purpose": "Unit tests for CSV reading and validation"
  },
  {
    "main_symbols": [
      "test_mean_std_quantiles"
    ],
    "path": "tests/test_statistics.py",
    "purpose": "Unit tests for statistics computations"
  },
  {
    "main_symbols": [
      "test_json_output"
    ],
    "path": "tests/test_json_exporter.py",
    "purpose": "Unit tests for JSON export functionality"
  }
]

Implementation

[
  {
    "details": "Create project folders, initialize Python environment, add requirements (e.g. pandas, numpy)",
    "step": 1,
    "title": "Setup project structure and dependencies"
  },
  {
    "details": "Develop CSVReader class to load CSV, validate presence of required columns, and handle errors",
    "step": 2,
    "title": "Implement CSVReader"
  },
  {
    "details": "Create functions to compute mean, std deviation, and configurable quantiles for given metrics",
    "step": 3,
    "title": "Implement StatisticsCalculator"
  },
  {
    "details": "Serialize statistics dictionary to JSON and write to output file",
    "step": 4,
    "title": "Implement JSONExporter"
  },
  {
    "details": "Use argparse to parse input CSV path, output JSON path, and metrics list; invoke components in order",
    "step": 5,
    "title": "Implement CLIHandler"
  },
  {
    "details": "Catch and report errors such as file not found, missing metrics, or parse errors",
    "step": 6,
    "title": "Add error handling and user messages"
  },
  {
    "details": "Cover CSV reading, statistics calculations, and JSON export with representative test cases",
    "step": 7,
    "title": "Write unit tests"
  },
  {
    "details": "Run CLI with sample CSV files, verify JSON output correctness and error handling",
    "step": 8,
    "title": "Manual testing and validation"
  },
  {
    "details": "Document CLI usage, input/output formats, and configuration options",
    "step": 9,
    "title": "Documentation and usage instructions"
  }
]
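
Step 5 above names argparse for the input path, output path, and metrics list. A minimal sketch of such a parser follows; the program name simsummary, the option spellings, and the default quantile levels are assumptions, since the plan does not specify them.

```python
import argparse


def build_parser():
    """Build the argument parser for the (hypothetically named) simsummary CLI."""
    parser = argparse.ArgumentParser(
        prog="simsummary",
        description="Summarize simulation CSV results as JSON.",
    )
    parser.add_argument("input_csv", help="path to the CSV file with simulation results")
    parser.add_argument("output_json", help="path of the JSON summary to write")
    parser.add_argument(
        "--metrics", nargs="+", required=True, metavar="NAME",
        help="one or more CSV column names to summarize",
    )
    parser.add_argument(
        "--quantiles", nargs="+", type=float, default=[0.25, 0.5, 0.75], metavar="Q",
        help="quantile levels in [0, 1] (default: 0.25 0.5 0.75)",
    )
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None) into a namespace."""
    return build_parser().parse_args(argv)


args = parse_args(["results.csv", "summary.json", "--metrics", "rtp", "volatility"])
print(args.metrics, args.quantiles)  # ['rtp', 'volatility'] [0.25, 0.5, 0.75]
```

Accepting argv as a parameter lets the integration tests call parse_args directly instead of spawning a subprocess.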

Tests

[
  {
    "cases": [
      "valid CSV loads correctly",
      "missing columns raises error",
      "empty file handling"
    ],
    "target": "CSVReader",
    "type": "unit"
  },
  {
    "cases": [
      "correct mean/std for sample data",
      "quantiles computed accurately",
      "handle empty data gracefully"
    ],
    "target": "StatisticsCalculator",
    "type": "unit"
  },
  {
    "cases": [
      "valid JSON output format",
      "file write success",
      "handle write permission errors"
    ],
    "target": "JSONExporter",
    "type": "unit"
  },
  {
    "cases": [
      "valid input produces correct JSON",
      "invalid input shows error message",
      "metrics selection works as expected"
    ],
    "target": "CLI end-to-end",
    "type": "integration"
  },
  {
    "cases": [
      "help message clarity",
      "error messages user-friendly",
      "performance on large CSV files"
    ],
    "target": "CLI usability",
    "type": "manual"
  }
]

Risks / Questions

[
  {
    "mitigation": "Implement robust validation and clear error messages; provide example CSV format",
    "risk": "Input CSV files may have inconsistent formatting or missing data"
  },
  {
    "mitigation": "Use efficient libraries like pandas; consider streaming if needed",
    "risk": "Large CSV files may cause high memory usage or slow processing"
  },
  {
    "mitigation": "Validate metric names early and inform user with suggestions",
    "risk": "User may specify metrics not present in CSV"
  },
  {
    "mitigation": "Document quantile defaults and allow user configuration with validation",
    "risk": "Quantile calculation may be misunderstood or misconfigured"
  }
]
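
The "consider streaming if needed" mitigation above can be illustrated for mean and standard deviation with Welford's online algorithm, which processes one value at a time in constant memory. This is a sketch of that technique, not something the plan prescribes, and RunningStats is a hypothetical name. Exact quantiles cannot be computed this way; they would need the full column or an approximate sketch structure.

```python
import math


class RunningStats:
    """Welford's online algorithm for mean and sample standard deviation."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the updated mean, which keeps the update numerically stable.
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Sample standard deviation (n - 1); 0.0 with fewer than two values."""
        return math.sqrt(self._m2 / (self.count - 1)) if self.count > 1 else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)
print(stats.count, stats.mean)  # 8 5.0
```

Combined with csv.reader iteration, this avoids holding a large file in memory when only mean and std are requested.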
[
  "What are the exact metric names expected in the CSV and their data types?",
  "Should quantile levels be fixed or configurable via CLI?",
  "Is there a preferred JSON schema or structure for the summary output?",
  "Are there any performance constraints or expected CSV file sizes?"
]

Graph view

[Graph rendering; recoverable node and edge labels only]
Nodes:
hypothesis #3 (proposed): Higher Bonus Frequency Reduces RTP...
experiment #3 (planned): Test impact of bonus frequency on ...
job #4 (queued): seed=42, priority=60
job #3 (queued): seed=42, priority=70
Edge labels: tests, runs, runs

Hypotheses

ID | Title | Status | Parent | Confidence
3 | Higher Bonus Frequency Reduces RTP Volatility | proposed | - | 0.35
Description (hypothesis 3): Increasing the frequency of bonus events in the game math model will decrease RTP volatility by providing more frequent smaller wins, which may stabilize player experience under a fixed CPU budget.

Agent tasks

ID | Type | Hypothesis | Profile | Status | Error
10 | review | - | critical | queued | -
9 | review | - | critical | done | -
8 | coding_plan | - | balanced | done | -
7 | coding_plan | - | cheap_fast | done | -
6 | coding_plan | - | cheap_fast | failed | Expecting ',' delimiter: line 42 column 164 (char 3657)
5 | coding_plan | - | cheap_fast | failed | Expecting ',' delimiter: line 43 column 170 (char 3802)
4 | coding_plan | - | cheap_fast | failed | Expecting ',' delimiter: line 41 column 173 (char 3654)
3 | coding_plan | - | cheap_fast | failed | Expecting ',' delimiter: line 114 column 6 (char 3806)
2 | plan_experiment | 3 | cheap_fast | done | -
1 | generate_hypotheses | - | cheap_fast | done | -
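
The error strings for tasks 3 to 6 have the shape that Python's json module produces for a JSONDecodeError, which suggests (the page does not confirm it) that the agent output was not valid JSON. A minimal sketch of catching that error and attaching the surrounding text to the message; parse_agent_json is a hypothetical helper, not part of the service:

```python
import json


def parse_agent_json(raw):
    """Return (data, None) on success or (None, message) on a decode error."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset reported as "(char N)" in the message.
        start = max(exc.pos - 30, 0)
        snippet = raw[start:exc.pos + 30]
        message = (
            f"{exc.msg}: line {exc.lineno} column {exc.colno} "
            f"(char {exc.pos}); near {snippet!r}"
        )
        return None, message


data, error = parse_agent_json('{"a": 1 "b": 2}')
print(error)
```

Including the nearby text in the stored error makes it easier to see what the model emitted at the failing offset, for example an unescaped quote inside a long string value.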

Experiments

ID | Title | Hypothesis | Status | Parameters
3 | Test impact of bonus frequency on RTP volatility | 3 | planned | {'bonus_frequency': [0.05, 0.1, 0.2]}
Description (experiment 3): Evaluate if increasing bonus frequency reduces RTP volatility under fixed CPU budget

Simulation jobs

ID | Experiment | Status | Priority | Seed | Parameters | Worker
4 | Test impact of bonus frequency on RTP volatility | queued | 60 | 42 | {'bonus_frequency': 0.2} | -
3 | Test impact of bonus frequency on RTP volatility | queued | 70 | 42 | {'bonus_frequency': 0.05} | -
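
Each job above carries a single bonus_frequency value, while the experiment lists several. The page does not say how jobs are derived from an experiment; one plausible reading is one job per point of the parameter grid. A sketch of that expansion, with expand_grid as a hypothetical helper:

```python
from itertools import product


def expand_grid(param_grid):
    """Expand {name: [values, ...]} into one parameter dict per combination."""
    names = list(param_grid)
    combos = product(*(param_grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


print(expand_grid({"bonus_frequency": [0.05, 0.1, 0.2]}))
# [{'bonus_frequency': 0.05}, {'bonus_frequency': 0.1}, {'bonus_frequency': 0.2}]
```

With more than one parameter the same function yields the full Cartesian product, so the number of jobs grows multiplicatively.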

Results

Job | Experiment | Seed | Exit | Duration | Metrics
No results yet.

Graph edges

From | Relation | To | Created
hypothesis:3 | tests | experiment:3 | 2026-05-30 21:49:25.303416+00:00