Add Pydantic models for configuration validation #510

Open
jan-janssen wants to merge 2 commits into main from add-pydantic-validation-8214308980248924915

@jan-janssen
Member

@jan-janssen jan-janssen commented Mar 23, 2026

This change introduces Pydantic models to validate the configuration files (typically queue.yaml) used by pysqa.

Key changes:

  • QueueModel defines the schema for individual queue configurations, including optional fields for resource limits and submission scripts.
  • ConfigModel validates the top-level configuration structure.
  • QueueAdapterWithConfig now validates its input configuration upon initialization.
  • The ConfigDict(extra='allow') is used to maintain backward compatibility with custom template variables and adapter-specific settings.
  • Added pydantic as a core dependency in pyproject.toml.
  • Improved robustness of template loading by checking for the existence of the script field.
  • Comprehensive tests added to ensure validation works as expected and handles edge cases like missing fields or incorrect types.
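
A minimal sketch of what the two models described above might look like, assuming Pydantic v2. Only `queue_type`, `queue_primary`, `queues`, and `script` are named in this PR's discussion; the resource-limit fields (`cores_max`, `run_time_max`) and all sample values are illustrative assumptions, not the PR's actual schema.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict


class QueueModel(BaseModel):
    # extra="allow" keeps unknown keys, e.g. custom template variables.
    model_config = ConfigDict(extra="allow")
    script: Optional[str] = None        # submission script; may be absent
    cores_max: Optional[int] = None     # hypothetical resource-limit field
    run_time_max: Optional[int] = None  # hypothetical resource-limit field


class ConfigModel(BaseModel):
    model_config = ConfigDict(extra="allow")
    queue_type: str
    queue_primary: Optional[str] = None
    queues: dict[str, QueueModel]


raw_config = {
    "queue_type": "SLURM",
    "queue_primary": "slurm",
    "ssh_host": "cluster.example.org",  # made-up top-level extra field
    "queues": {
        "slurm": {"script": "slurm.sh", "cores_max": 40, "my_variable": "x"},
        "remote": {},  # no script, e.g. a remote queue
    },
}

# Validate, then convert back to a plain dict for the rest of the adapter.
config = ConfigModel(**raw_config).model_dump()
```

In this sketch the dumped dictionary keeps both the top-level and the per-queue extra keys, and a queue without a script ends up with `script` set to `None`.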

PR created automatically by Jules for task 8214308980248924915 started by @jan-janssen

Summary by CodeRabbit

Release Notes

  • Chores

    • Added Pydantic (version 2.0 or later) as a project dependency.
  • New Features

    • Configuration validation now enforces proper field types and identifies missing required entries.
    • Configuration system now supports preserving extra custom configuration fields.
  • Bug Fixes

    • Template file loading improved to correctly handle configurations with null or missing scripts.

- Introduced `QueueModel` and `ConfigModel` in `pysqa/base/config.py` using Pydantic v2.
- Integrated validation into `QueueAdapterWithConfig` initialization.
- Added `pydantic>=2.0` to project dependencies.
- Updated `_load_templates` to handle queues without a submission script (e.g., remote queues).
- Added unit tests for configuration validation.

Co-authored-by: jan-janssen <3854739+jan-janssen@users.noreply.github.com>
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@coderabbitai
coderabbitai bot commented Mar 23, 2026

📝 Walkthrough

The PR adds a Pydantic v2 dependency and introduces QueueModel and ConfigModel Pydantic models to validate and normalize queue configuration data. Configuration is now validated before assignment, extra fields are explicitly preserved, and template loading is optimized to skip null script values.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency Management (pyproject.toml) | Added pydantic>=2.0 as a project dependency to support configuration schema validation. |
| Configuration Validation (src/pysqa/base/config.py) | Introduced QueueModel and ConfigModel Pydantic models to define and validate queue configuration schema. Modified QueueAdapterWithConfig.__init__ to validate incoming config via Pydantic and store the normalized model dump. Enhanced _load_templates to explicitly check for non-null script values before attempting file operations. |
| Test Coverage (tests/unit/base/test_config.py) | Added four new unit tests validating Pydantic-driven behavior: missing queues detection, invalid field type rejection, and preservation of extra fields at both queue and top-level scopes. |
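
The four behaviours listed for the test file could be exercised roughly as follows. This is a sketch under assumptions: the models are reduced stand-ins, and the test names, the `cores_max` field, and the sample values are hypothetical rather than taken from `tests/unit/base/test_config.py`.

```python
import unittest
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class QueueModel(BaseModel):  # stand-in, not the PR's actual model
    model_config = ConfigDict(extra="allow")
    script: Optional[str] = None
    cores_max: Optional[int] = None  # hypothetical field


class ConfigModel(BaseModel):  # stand-in, not the PR's actual model
    model_config = ConfigDict(extra="allow")
    queue_type: str
    queues: dict[str, QueueModel]


class TestConfigValidation(unittest.TestCase):
    def test_missing_queues(self):
        with self.assertRaises(ValidationError):
            ConfigModel(queue_type="SLURM")

    def test_invalid_field_type(self):
        with self.assertRaises(ValidationError):
            ConfigModel(queue_type="SLURM", queues={"q": {"cores_max": "many"}})

    def test_extra_field_in_queue(self):
        dumped = ConfigModel(
            queue_type="SLURM", queues={"q": {"custom_queue": 1}}
        ).model_dump()
        self.assertEqual(dumped["queues"]["q"]["custom_queue"], 1)

    def test_extra_field_top_level(self):
        dumped = ConfigModel(
            queue_type="SLURM", custom_top="a", queues={}
        ).model_dump()
        self.assertEqual(dumped["custom_top"], "a")


def run_tests() -> unittest.TestResult:
    """Run the sketch's tests and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestConfigValidation)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes the four cases without going through `unittest.main()`, which makes the sketch easy to try from an interactive session.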

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 Pydantic guards the config gate,
Extra fields escape their fate,
Null scripts skip the loading state,
Validation flows so clean and straight!
Hop-hop-hop!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding Pydantic models for configuration validation, which is the primary focus of the changeset. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
pyproject.toml (1)

34-39: Inconsistent version pinning strategy.

Other runtime dependencies (jinja2, pandas, pyyaml) use exact version pinning (==), while pydantic>=2.0 uses a minimum version constraint. This is acceptable if intentional (to allow Pydantic v2 minor/patch updates), but consider documenting this choice or aligning with the existing pinning strategy for reproducibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` around lines 34 - 39, The dependencies list currently mixes
exact pins and a minimum constraint: update the entry for "pydantic>=2.0" to
match the project's pinning strategy or document the exception; specifically
either change the dependency in the dependencies array to an exact version
(e.g., "pydantic==2.x.y") to match jinja2/pandas/pyyaml, or add a brief comment
in the pyproject.toml (or project README) next to the dependencies list
explaining why pydantic is intentionally left as "pydantic>=2.0" to allow
minor/patch upgrades for the Pydantic package.
src/pysqa/base/config.py (2)

28-37: Consider validating queue_type against allowed values.

The queue_type field accepts any string, but valid values are constrained to "SGE", "TORQUE", "SLURM", "LSF", "MOAB", "FLUX", and "REMOTE" (per set_queue_adapter in queueadapter.py). Using Literal would catch invalid queue types earlier with a clearer error message.

💡 Suggested improvement
-from pydantic import BaseModel, ConfigDict
+from typing import Literal
+from pydantic import BaseModel, ConfigDict

+QUEUE_TYPES = Literal["SGE", "TORQUE", "SLURM", "LSF", "MOAB", "FLUX", "REMOTE"]

 class ConfigModel(BaseModel):
     model_config = ConfigDict(extra="allow")
-    queue_type: str
+    queue_type: QUEUE_TYPES
     queue_primary: Optional[str] = None
     queues: dict[str, QueueModel]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pysqa/base/config.py` around lines 28 - 37, The ConfigModel currently
allows any string for queue_type; change queue_type to a constrained literal
type to validate allowed values early (e.g., use typing.Literal or
typing_extensions.Literal and set queue_type:
Literal["SGE","TORQUE","SLURM","LSF","MOAB","FLUX","REMOTE"]). Update the
import(s) as needed and ensure ConfigModel (in src/pysqa/base/config.py) uses
this Literal so invalid values are caught by Pydantic; this aligns with the
allowed queue types enforced by set_queue_adapter in queueadapter.py.
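
To illustrate the effect of the suggestion above, here is a small sketch, assuming Pydantic v2. The model is a reduced stand-in (the `queues` field is left out for brevity), and `QueueType` and `check_queue_type` are hypothetical names introduced only for this example.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError

# Allowed values as listed in the review comment.
QueueType = Literal["SGE", "TORQUE", "SLURM", "LSF", "MOAB", "FLUX", "REMOTE"]


class ConfigModel(BaseModel):  # reduced stand-in: `queues` omitted
    model_config = ConfigDict(extra="allow")
    queue_type: QueueType
    queue_primary: Optional[str] = None


def check_queue_type(value: str) -> bool:
    """Return True if `value` passes validation, False otherwise."""
    try:
        ConfigModel(queue_type=value)
    except ValidationError:
        return False
    return True
```

Note that `Literal` matching is case-sensitive, so `check_queue_type("slurm")` would be rejected in this sketch. Whether existing configuration files rely on other spellings is worth verifying before adopting the constraint.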

338-350: Minor: Redundant key check after Pydantic validation.

After model_dump(), the "script" key will always exist in queue_dict (defaulting to None if not provided in the original config). The "script" in queue_dict check is now redundant and can be simplified.

✨ Simplified condition
 for queue_dict in queue_lst_dict.values():
-    if "script" in queue_dict and queue_dict["script"] is not None:
+    if queue_dict.get("script") is not None:
         with open(os.path.join(directory, queue_dict["script"])) as f:

Note: The current code is functionally correct and provides defensive coding if this method is ever called with unvalidated data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pysqa/base/config.py` around lines 338 - 350, The check for the presence
of the "script" key is redundant after Pydantic's model_dump(); update the loop
over queue_lst_dict so you only test for a non-None script value (e.g., replace
the current if "script" in queue_dict and queue_dict["script"] is not None: with
a single check for queue_dict["script"] is not None), then proceed to open
os.path.join(directory, queue_dict["script"]) and compile the Template and
handle TemplateSyntaxError as before (references: queue_lst_dict,
queue_dict["script"], Template, TemplateSyntaxError, model_dump()).
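
The two conditions discussed above can be compared directly with plain dictionaries. The helper names and sample values below are hypothetical and exist only for this illustration.

```python
def has_script_verbose(queue_dict: dict) -> bool:
    # Condition currently used in the loop.
    return "script" in queue_dict and queue_dict["script"] is not None


def has_script_simplified(queue_dict: dict) -> bool:
    # dict.get returns None for a missing key, so one test covers both cases.
    return queue_dict.get("script") is not None


cases = [
    {"script": "slurm.sh"},  # script present
    {"script": None},        # key present, value None (as after model_dump)
    {},                      # key missing (unvalidated input)
]
results = [(has_script_verbose(c), has_script_simplified(c)) for c in cases]
```

Both helpers agree on all three cases. Indexing with `queue_dict["script"]` alone, without `.get` or the membership test, would raise `KeyError` for the last case, so it is only safe when the dictionary is known to come from `model_dump()`.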

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c9676030-bfd4-4101-a875-249e1a5cd7f0

📥 Commits

Reviewing files that changed from the base of the PR and between 63ace34 and 443d16a.

📒 Files selected for processing (3)
  • pyproject.toml
  • src/pysqa/base/config.py
  • tests/unit/base/test_config.py

Comment on lines +103 to 106
+        self._config = ConfigModel(**config).model_dump()
         super().__init__(
-            queue_type=config["queue_type"], execute_command=execute_command
+            queue_type=self._config["queue_type"], execute_command=execute_command
         )
⚠️ Potential issue | 🟠 Major

Pydantic ValidationError is not converted to ValueError.

The unit tests expect ValueError for validation failures, but ConfigModel(**config) raises pydantic.ValidationError. Either wrap the validation in a try/except block to convert the exception, or update the tests to catch ValidationError.

🔧 Option 1: Wrap validation to raise ValueError
+from pydantic import BaseModel, ConfigDict, ValidationError
 ...
 def __init__(
     self,
     config: dict,
     directory: str = "~/.queues",
     execute_command: Callable = execute_command,
 ):
-    self._config = ConfigModel(**config).model_dump()
+    try:
+        self._config = ConfigModel(**config).model_dump()
+    except ValidationError as e:
+        raise ValueError(f"Invalid configuration: {e}") from e
     super().__init__(
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        self._config = ConfigModel(**config).model_dump()
+        try:
+            self._config = ConfigModel(**config).model_dump()
+        except ValidationError as e:
+            raise ValueError(f"Invalid configuration: {e}") from e
         super().__init__(
             queue_type=self._config["queue_type"], execute_command=execute_command
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/pysqa/base/config.py` around lines 103 - 106, The constructor currently
calls ConfigModel(**config) which raises pydantic.ValidationError on invalid
input; wrap that call in a try/except that catches pydantic.ValidationError and
re-raises a ValueError (including the original error message) before continuing
to set self._config and call super().__init__; reference the call to ConfigModel
and the surrounding code that assigns self._config and calls super().__init__ so
the conversion happens before queue_type is read.
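
Before choosing between the two options, it may be worth checking how the installed Pydantic relates `ValidationError` to `ValueError`. As far as I know, Pydantic v2's `ValidationError` is itself a subclass of `ValueError`, in which case tests using `assertRaises(ValueError)` would already pass and the wrapper mainly changes the exception type and message. The sketch below uses a reduced stand-in model; `load_config` and `error_type_for` are hypothetical helpers, not part of pysqa.

```python
from pydantic import BaseModel, ValidationError


class ConfigModel(BaseModel):  # reduced stand-in for the PR's model
    queue_type: str


def load_config(config: dict) -> dict:
    """Validate `config`, re-raising failures as a plain ValueError."""
    try:
        return ConfigModel(**config).model_dump()
    except ValidationError as e:
        raise ValueError(f"Invalid configuration: {e}") from e


def error_type_for(config: dict) -> type:
    """Return the exception type raised by unwrapped validation."""
    try:
        ConfigModel(**config)
    except ValueError as e:  # also matches subclasses of ValueError
        return type(e)
    raise AssertionError("expected validation to fail")


# Confirms the subclass relationship against the installed version.
is_value_error = issubclass(ValidationError, ValueError)
```

With the wrapper, callers see a plain `ValueError` whose `__cause__` still carries the original Pydantic error, so no detail is lost for debugging.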
