[ENH] V1 → V2 API Migration - studies by rohansen856 · Pull Request #1610 · openml/openml-python

rohansen856 · 2026-01-08T20:47:22Z

Metadata

Reference Issue: [ENH] V1 → V2 API Migration - studies #1594 (towards [ENH] V1 → V2 API Migration #1575)
New Tests Added: No
Documentation Updated: No

Details

Stacked PR, depends on #1576

This PR adds Studies v2 migration.

A question:
Because the pre-commit hook rejects a function with 6 arguments, I worked around it like this:
openml\_api\resources\studies.py (lines 10-15)

        limit = kwargs.get("limit")
        offset = kwargs.get("offset")
        status = kwargs.get("status")
        main_entity_type = kwargs.get("main_entity_type")
        uploader = kwargs.get("uploader")
        benchmark_suite = kwargs.get("benchmark_suite")

I would like to confirm if this approach is correct or not. Raising a draft PR for now.
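
To make the trade-off in this question concrete, here is a minimal sketch (not the PR's code) contrasting the two styles. The function names `list_with_kwargs` and `list_explicit` are hypothetical; only the parameter names come from the snippet above. The `**kwargs` version silently ignores a misspelled filter, while the explicit keyword-only signature rejects it.

```python
from __future__ import annotations

from typing import Any


def list_with_kwargs(**kwargs: Any) -> dict[str, Any]:
    """Workaround style: every filter is pulled out of **kwargs."""
    names = ("limit", "offset", "status", "main_entity_type", "uploader", "benchmark_suite")
    # Unknown keys (for example the typo "statsu") are silently dropped.
    return {name: kwargs[name] for name in names if kwargs.get(name) is not None}


def list_explicit(  # noqa: PLR0913
    *,
    limit: int | None = None,
    offset: int | None = None,
    status: str | None = None,
    main_entity_type: str | None = None,
    uploader: list[int] | None = None,
    benchmark_suite: int | None = None,
) -> dict[str, Any]:
    """Explicit style: the signature itself documents and validates the filters."""
    filters = {
        "limit": limit,
        "offset": offset,
        "status": status,
        "main_entity_type": main_entity_type,
        "uploader": uploader,
        "benchmark_suite": benchmark_suite,
    }
    return {name: value for name, value in filters.items() if value is not None}


# The misspelled keyword goes unnoticed with **kwargs ...
typo_result = list_with_kwargs(limit=10, statsu="active")

# ... but raises TypeError with the explicit signature.
try:
    list_explicit(limit=10, statsu="active")  # type: ignore[call-arg]
except TypeError as exc:
    typo_error = str(exc)
```

The explicit form also keeps type hints visible to editors and type checkers, which is one reason a targeted `# noqa: PLR0913` can be preferable to hiding the arguments.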

Signed-off-by: rohansen856 <rohansen856@gmail.com>

codecov-commenter · 2026-01-08T20:53:56Z

Codecov Report

❌ Patch coverage is 50.22693% with 329 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.66%. Comparing base (d421b9e) to head (647a5cd).

Files with missing lines	Patch %	Lines
openml/_api/clients/http.py	24.46%	142 Missing ⚠️
openml/_api/resources/base/versions.py	24.71%	67 Missing ⚠️
openml/_api/resources/study.py	25.00%	33 Missing ⚠️
openml/_api/runtime/core.py	55.38%	29 Missing ⚠️
openml/_api/resources/base/fallback.py	26.31%	28 Missing ⚠️
openml/testing.py	48.71%	20 Missing ⚠️
openml/_api/config.py	95.45%	3 Missing ⚠️
openml/_api/resources/base/base.py	76.92%	3 Missing ⚠️
openml/study/functions.py	50.00%	2 Missing ⚠️
openml/_api/__init__.py	88.88%	1 Missing ⚠️
... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1610      +/-   ##
==========================================
+ Coverage   52.04%   52.66%   +0.62%     
==========================================
  Files          36       58      +22     
  Lines        4333     4965     +632     
==========================================
+ Hits         2255     2615     +360     
- Misses       2078     2350     +272

rohansen856 · 2026-01-13T07:12:28Z

Using noqa instead of kwargs, following the example in openml\testing.py:

    def _check_fold_timing_evaluations(  # noqa: PLR0913
        self,
        fold_evaluations: dict[str, dict[int, dict[int, float]]],
        num_repeats: int,
        num_folds: int,
        *,
        max_time_allowed: float = 60000.0,
        task_type: TaskType = TaskType.SUPERVISED_CLASSIFICATION,
        check_scores: bool = True,
    ) -> None:

Final function signature:

    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> Any:

Signed-off-by: rohansen856 <rohansen856@gmail.com>

geetu040

Good work. Just use the listing as suggested in #1575 (comment) which is already similar to what you have done.

rohansen856 · 2026-01-15T08:35:54Z

@geetu040 I reviewed the specific changes needed and have a small doubt about the pandas implementation.
As I understand it, I need to use pd.DataFrame instead of Any as the return type in openml\_api\resources\base.py, like this:

class StudiesAPI(ResourceAPI, ABC):
    @abstractmethod
    def list(  # noqa: PLR0913
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
        main_entity_type: str | None = None,
        uploader: list[int] | None = None,
        benchmark_suite: int | None = None,
    ) -> pd.DataFrame: ...

Similarly, I have to change the return value in openml\_api\resources\studies.py from this: return response.text
to this:

        xml_string = response.text

        # Parse XML and convert to DataFrame
        study_dict = xmltodict.parse(xml_string, force_list=("oml:study",))

        # Minimalistic check if the XML is useful
        assert isinstance(study_dict["oml:study_list"]["oml:study"], list), type(
            study_dict["oml:study_list"],
        )
        assert (
            study_dict["oml:study_list"]["@xmlns:oml"] == "http://openml.org/openml"
        ), study_dict["oml:study_list"]["@xmlns:oml"]

        studies = {}
        for study_ in study_dict["oml:study_list"]["oml:study"]:
            # maps from xml name to a tuple of (dict name, casting fn)
            expected_fields = {
                "oml:id": ("id", int),
                "oml:alias": ("alias", str),
                "oml:main_entity_type": ("main_entity_type", str),
                "oml:benchmark_suite": ("benchmark_suite", int),
                "oml:name": ("name", str),
                "oml:status": ("status", str),
                "oml:creation_date": ("creation_date", str),
                "oml:creator": ("creator", int),
            }
            study_id = int(study_["oml:id"])
            current_study = {}
            for oml_field_name, (real_field_name, cast_fn) in expected_fields.items():
                if oml_field_name in study_:
                    current_study[real_field_name] = cast_fn(study_[oml_field_name])
            current_study["id"] = int(current_study["id"])
            studies[study_id] = current_study

        return pd.DataFrame.from_dict(studies, orient="index")

A total of 3 files would be affected: openml\_api\resources\base.py, openml\_api\resources\studies.py and openml\study\functions.py

Could you please confirm this approach? After that I will update the PR.
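
The parsing step proposed above can be tried offline on a hand-written response. The sketch below is an illustration under assumptions, not the PR's code: it swaps `xmltodict` for the standard-library `xml.etree.ElementTree` so it runs without extra dependencies, the sample XML is invented for the example (only the element names and the namespace come from the snippet above), and `parse_study_list` is a hypothetical name. It returns the same kind of id-keyed dict that the snippet passes to `pd.DataFrame.from_dict(studies, orient="index")`.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from typing import Any, Callable

OML_NS = "http://openml.org/openml"

# Invented sample payload; a real server response may carry more fields.
SAMPLE_XML = """\
<oml:study_list xmlns:oml="http://openml.org/openml">
  <oml:study>
    <oml:id>1</oml:id>
    <oml:alias>demo-suite</oml:alias>
    <oml:main_entity_type>task</oml:main_entity_type>
    <oml:name>Demo suite</oml:name>
    <oml:status>active</oml:status>
    <oml:creator>7</oml:creator>
  </oml:study>
  <oml:study>
    <oml:id>2</oml:id>
    <oml:main_entity_type>run</oml:main_entity_type>
    <oml:name>Demo study</oml:name>
    <oml:status>in_preparation</oml:status>
    <oml:creator>9</oml:creator>
  </oml:study>
</oml:study_list>
"""

# Maps the tag name (without namespace) to its casting function.
EXPECTED_FIELDS: dict[str, Callable[[str], Any]] = {
    "id": int,
    "alias": str,
    "main_entity_type": str,
    "benchmark_suite": int,
    "name": str,
    "status": str,
    "creation_date": str,
    "creator": int,
}


def parse_study_list(xml_string: str) -> dict[int, dict[str, Any]]:
    """Convert a study-list XML document into a dict keyed by study id."""
    root = ET.fromstring(xml_string)
    # Minimal sanity check: right element in the right namespace.
    if root.tag != f"{{{OML_NS}}}study_list":
        raise ValueError(f"Unexpected root element: {root.tag}")

    studies: dict[int, dict[str, Any]] = {}
    for study in root.findall(f"{{{OML_NS}}}study"):
        current: dict[str, Any] = {}
        for field, cast in EXPECTED_FIELDS.items():
            text = study.findtext(f"{{{OML_NS}}}{field}")
            if text is not None:  # optional fields are simply left out
                current[field] = cast(text)
        studies[current["id"]] = current
    return studies


studies = parse_study_list(SAMPLE_XML)
```

The sketch raises `ValueError` for the sanity check instead of using `assert`, because assertions are removed when Python runs with `-O`.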

geetu040 · 2026-01-15T08:55:24Z

@rohansen856 yes sounds right

rohansen856 · 2026-01-15T10:42:12Z

Updated! Ready for review.

geetu040

Almost fine. Just completely remove _list_studies as well, and pass api_context.backend.studies.list instead of _list_studies as the argument to partial in list_studies. I hope I did not confuse you; just search for the exact method names in the code. Let me know if I am not clear enough.
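
A rough sketch of the requested wiring, under stated assumptions: `FakeStudiesBackend` stands in for `api_context.backend.studies`, `_list_all` is a simplified hypothetical paging helper written for this example rather than the library's own utility, and plain dicts replace the DataFrame. The point is only that `partial` can bind the resource's `list` method directly, so a separate `_list_studies` wrapper is not needed.

```python
from __future__ import annotations

from functools import partial
from typing import Any, Callable


class FakeStudiesBackend:
    """Hypothetical stand-in for the studies resource; serves 5 in-memory rows."""

    _rows = [{"id": i, "status": "active" if i % 2 else "deactivated"} for i in range(1, 6)]

    def list(
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
    ) -> list[dict[str, Any]]:
        rows = [r for r in self._rows if status is None or r["status"] == status]
        start = offset or 0
        end = None if limit is None else start + limit
        return rows[start:end]


def _list_all(
    listing_call: Callable[..., list[dict[str, Any]]],
    batch_size: int,
) -> list[dict[str, Any]]:
    """Hypothetical helper: request page after page until a short page comes back."""
    results: list[dict[str, Any]] = []
    offset = 0
    while True:
        page = listing_call(limit=batch_size, offset=offset)
        results.extend(page)
        if len(page) < batch_size:
            return results
        offset += batch_size


def list_studies(backend: FakeStudiesBackend, status: str | None = None) -> list[dict[str, Any]]:
    # Bind the filters once; the paging helper supplies limit and offset.
    listing_call = partial(backend.list, status=status)
    return _list_all(listing_call, batch_size=2)


active = list_studies(FakeStudiesBackend(), status="active")
everything = list_studies(FakeStudiesBackend())
```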

rohansen856 · 2026-01-16T09:45:32Z

Almost fine. Just completely remove _list_studies as well, and pass api_context.backend.studies.list instead of _list_studies as the argument to partial in list_studies. I hope I did not confuse you; just search for the exact method names in the code. Let me know if I am not clear enough.

Oh definitely! I probably missed that in openml\study\functions.py; I will push the change with the next commit.

EmanAbdelhaleem · 2026-02-04T16:53:54Z

openml/_api/resources/study.py

+
+
+class StudyV1API(ResourceV1API, StudyAPI):
+    def list(  # noqa: PLR0913


I think we can split this into 3 functions for more readability:

list()

_build_url()

_parse_list_xml()

check #1606 for reference

Understood! I will separate the long list function into the three functions above, each with a proper docstring. Applying with the next commit.
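
One way the suggested split might look, as a sketch only. The URL layout (`study/list/<key>/<value>/...`), the injected `fetch` callable, and the class name `StudyListSketch` are assumptions made for illustration; the real resource class talks to an HTTP client and returns a DataFrame.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from typing import Any, Callable

OML_NS = "http://openml.org/openml"


class StudyListSketch:
    """Hypothetical resource class showing the list / _build_url / _parse_list_xml split."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # fetch takes a relative URL and returns the response body as text.
        self._fetch = fetch

    def list(
        self,
        limit: int | None = None,
        offset: int | None = None,
        status: str | None = None,
    ) -> list[dict[str, Any]]:
        """Public entry point: build the URL, fetch it, parse the response."""
        url = self._build_url(limit=limit, offset=offset, status=status)
        return self._parse_list_xml(self._fetch(url))

    @staticmethod
    def _build_url(**filters: Any) -> str:
        """Append every filter that was actually set as /key/value."""
        parts = ["study/list"]
        for key, value in filters.items():
            if value is not None:
                parts.append(f"{key}/{value}")
        return "/".join(parts)

    @staticmethod
    def _parse_list_xml(xml_string: str) -> list[dict[str, Any]]:
        """Extract id and name of each study element (other fields omitted for brevity)."""
        root = ET.fromstring(xml_string)
        return [
            {
                "id": int(study.findtext(f"{{{OML_NS}}}id")),
                "name": study.findtext(f"{{{OML_NS}}}name"),
            }
            for study in root.findall(f"{{{OML_NS}}}study")
        ]


requested: list[str] = []


def fake_fetch(url: str) -> str:
    """Record the requested URL and return a canned, invented response."""
    requested.append(url)
    return (
        f'<oml:study_list xmlns:oml="{OML_NS}">'
        "<oml:study><oml:id>3</oml:id><oml:name>Demo</oml:name></oml:study>"
        "</oml:study_list>"
    )


result = StudyListSketch(fake_fetch).list(limit=10, status="active")
```

Keeping `_build_url` and `_parse_list_xml` free of network access means each can be unit-tested without a server, which is the readability and testability gain the review comment is after.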

EmanAbdelhaleem · 2026-02-04T16:55:20Z

tests/test_api/test_study.py

@@ -0,0 +1,94 @@
+# License: BSD 3-Clause
+from __future__ import annotations


I think it would be better to change the file name to "test_study" for consistency

Agreed! Applying with the next commit.

Similarly, the tests\test_study folder should be renamed to tests\test_studies.
cc @geetu040

Makes sense, but let's not do it here; that would make the file hard to review with visible changes.

EmanAbdelhaleem · 2026-02-04T16:59:11Z

tests/test_api/test_study.py

+            assert all(studies_df["status"] == "active")
+
+    @pytest.mark.uses_test_server()
+    def test_list_pagination(self):


I don't think we need to test pagination here. These tests should only be specific for the API. It's better to leave this test on test_study_functions if it's there.

There is actually no pagination test in test_study_functions, so implementing it here should be fine. Let me know if you still think we should remove it.

I would say it's not necessary, since our goal is simply to test the public methods of the resource class that are expected to be used in the SDK.
But if someone has written additional tests like this, that's good, and there is no problem proceeding with them.

EmanAbdelhaleem · 2026-02-04T17:06:27Z

tests/test_api/test_studies.py

+
+    def setUp(self) -> None:
+        super().setUp()
+        self.api = StudyV2API(self.http_client)


This is v2, you need to use

    self.v2_client = self._get_http_client(
        server="http://localhost:8001/",
        base_url="",
        api_key="",
        timeout_seconds=self.timeout_seconds,
        retries=self.retries,
        retry_policy=self.retry_policy,
        cache=self.cache,
    )

and change the server to your local v2 server

Understood!
replacing this:

self.api = StudyV2API(self.http_client)

with this:

    self.v2_client = self._get_http_client(
        server="http://localhost:8001/",
        base_url="",
        api_key="",
        timeout=self.timeout,
        retries=self.retries,
        retry_policy=self.retry_policy,
        cache=self.cache,
    )
    self.api = StudyV2API(self.v2_client)

you can use self.http_clients[APIVersion.V2]

EmanAbdelhaleem · 2026-02-04T17:18:14Z

tests/test_api/test_studies.py

+        self.v2_api = StudyV2API(self.http_client)
+
+    @pytest.mark.uses_test_server()
+    def test_v1_v2_compatibility(self):


I think this should test that the output matches and follow the naming style mentioned here: #1575 (comment)

check #1603 for reference

EmanAbdelhaleem · 2026-02-04T17:20:35Z

tests/test_api/test_studies.py

+        # Both should have delete, tag, untag from base
+        for method in ["delete", "tag", "untag", "publish"]:
+            assert hasattr(self.v1_api, method)
+            assert hasattr(self.v2_api, method)


I think you need to add Fallback tests as mentioned here: #1575 (comment)

check #1603 for reference

Understood! I will implement FallbackProxy and a test_list_fallback function that checks that FallbackProxy automatically falls back from V2 to V1 when V2 raises a not-supported error. Also, I think test_list_matches should be marked with @pytest.mark.skip(reason="V2 list not yet implemented"), as it currently throws OpenMLNotSupportedError.

Don't skip it; instead, check that it raises the right exception. See tests/test_api/test_versions.py for reference.
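
A sketch of what the two tests discussed here could assert, with everything defined locally so it runs standalone. `OpenMLNotSupportedError`, `FallbackProxy`, and the two stub classes below are simplified stand-ins written for this example; the library's own classes may differ in constructor arguments and behaviour.

```python
from __future__ import annotations

import unittest
from typing import Any


class OpenMLNotSupportedError(Exception):
    """Local stand-in for the library exception of the same name."""


class StubV1:
    def list(self, **filters: Any) -> list[int]:
        return [1, 2, 3]


class StubV2:
    def list(self, **filters: Any) -> list[int]:
        raise OpenMLNotSupportedError("V2 list not yet implemented")


class FallbackProxy:
    """Try each API in order; move on only when one reports 'not supported'."""

    def __init__(self, *apis: Any) -> None:
        self._apis = apis

    def __getattr__(self, name: str) -> Any:
        def call(*args: Any, **kwargs: Any) -> Any:
            last_error: Exception = OpenMLNotSupportedError(name)
            for api in self._apis:
                try:
                    return getattr(api, name)(*args, **kwargs)
                except OpenMLNotSupportedError as exc:
                    last_error = exc
            # Every API declined: surface the last error.
            raise last_error

        return call


class TestStudyList(unittest.TestCase):
    def test_list_v2_raises_not_supported(self) -> None:
        # Instead of skipping, pin down the current behaviour.
        with self.assertRaises(OpenMLNotSupportedError):
            StubV2().list()

    def test_list_fallback(self) -> None:
        proxy = FallbackProxy(StubV2(), StubV1())
        self.assertEqual(proxy.list(limit=3), [1, 2, 3])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestStudyList)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a pytest-style test the first check would typically be written as `with pytest.raises(OpenMLNotSupportedError):`. Asserting the exception keeps the test active, so it starts failing (and prompts an update) as soon as V2 gains a real `list` implementation, whereas a skipped test would stay silent.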

geetu040

you are not changing anything in tests/test_study, are you sure nothing needs to be changed there? are all tests passing?
test_study.py looks good, please sync with base branch. I'll check if the tests are passing locally

rohansen856 · 2026-02-06T10:30:44Z

you are not changing anything in tests/test_study, are you sure nothing needs to be changed there? are all tests passing? test_study.py looks good, please sync with base branch. I'll check if the tests are passing locally

With reference to the other PRs, tests\test_study\test_study_functions.py is already correct for the migration. The publish and delete tests run and pass. publish_study and delete_study do not require modification for the studies migration. Syncing with base after this.

geetu040 and others added 9 commits December 30, 2025 09:11

set up folder structure and base code

0159f47

Merge branch 'main' into migration

58e9175

Merge branch 'main' into migration

bdd65ff

fix pre-commit

52ef379

refactor

5dfcbce

implement cache_dir

2acbe99

refactor

af99880

Merge branch 'main' into pr/1576

74ab366

feat: added migrations for studies api v2

9100d91

Signed-off-by: rohansen856 <rohansen856@gmail.com>

geetu040 mentioned this pull request Jan 9, 2026

[ENH] V1 → V2 API Migration #1575

Open

25 tasks

chore: fixed the args limit in function using noqa

88077a7

Signed-off-by: rohansen856 <rohansen856@gmail.com>

rohansen856 marked this pull request as ready for review January 13, 2026 07:21

geetu040 suggested changes Jan 13, 2026

View reviewed changes

rohansen856 and others added 2 commits January 15, 2026 13:46

Merge branch 'main' into studies-migration

13acf35

[pre-commit.ci] auto fixes from pre-commit.com hooks

e02e05b

for more information, see https://pre-commit.ci

geetu040 and others added 3 commits January 15, 2026 14:51

undo changes in tasks/functions.py

4c75e16

Merge branch 'main' into migration

5762185

chore: updated the list function acc to reviews

9170edc

Signed-off-by: rohansen856 <rohansen856@gmail.com>

geetu040 suggested changes Jan 15, 2026

View reviewed changes

chore: removed _list_studies and implemented api_context for studies …

8c980c9

…list Signed-off-by: rohansen856 <rohansen856@gmail.com>

geetu040 assigned rohansen856 Jan 19, 2026

geetu040 added 3 commits January 21, 2026 10:47

Merge branch 'main' into migration

7e9bc1f

add tests directory

c603383

use enum for delay method

ff6a8b0

geetu040 and others added 13 commits February 3, 2026 12:24

use LegacyConfig

fd43c48

Revert "use LegacyConfig"

f4aab6b

This reverts commit fd43c48.

implement _sync_api_config

d43cf86

update tests with _sync_api_config

3e323ed

rename config: timeout -> timeout_seconds

9195fa6

use timedelta for default ttl value

5342eec

update tests, adds v2/fallback

adc0e74

add MinIOClient in TestBase

bfb2d3e

refactor: replace api_context.backend.study with openml._backend.study

ee10f59

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: removed unneccesary test for studies

9c0ad45

Signed-off-by: rohansen856 <rohansen856@gmail.com>

fix linting for builder

cabaecf

fix unbound variables: "code", "message"

85c1113

source: openml#1606 (comment)

Merge branch 'migration' of https://github.com/geetu040/openml-python …

0458929

…into studies-migration # Conflicts: # openml/_api/__init__.py # openml/_api/resources/base/resources.py # openml/_api/resources/study.py

EmanAbdelhaleem reviewed Feb 4, 2026

View reviewed changes

rohansen856 added 4 commits February 5, 2026 10:56

refactor: updated StudyV1API acc to reviews

5e3fea8

Signed-off-by: rohansen856 <rohansen856@gmail.com>

refactor: updated studies test acc to reviews

fc32488

Signed-off-by: rohansen856 <rohansen856@gmail.com>

chore: removed delete method test from studies api test

eda66ca

Signed-off-by: rohansen856 <rohansen856@gmail.com>

refactor: updated study api test filename

18dc72a

Signed-off-by: rohansen856 <rohansen856@gmail.com>

rohansen856 requested review from EmanAbdelhaleem and geetu040 February 5, 2026 05:37

rohansen856 marked this pull request as ready for review February 5, 2026 05:40

geetu040 reviewed Feb 5, 2026

View reviewed changes

rohansen856 force-pushed the studies-migration branch from 945e965 to 18dc72a Compare February 6, 2026 08:25

chore: updated list matches ro check proper error throw

647a5cd

Signed-off-by: rohansen856 <rohansen856@gmail.com>



6 participants
