[ENH] V1 → V2 API Migration - studies#1610
Conversation
Signed-off-by: rohansen856 <rohansen856@gmail.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1610 +/- ##
==========================================
+ Coverage 52.04% 52.66% +0.62%
==========================================
Files 36 58 +22
Lines 4333 4965 +632
==========================================
+ Hits 2255 2615 +360
- Misses 2078 2350 +272 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Implementing def _check_fold_timing_evaluations( # noqa: PLR0913
self,
fold_evaluations: dict[str, dict[int, dict[int, float]]],
num_repeats: int,
num_folds: int,
*,
max_time_allowed: float = 60000.0,
task_type: TaskType = TaskType.SUPERVISED_CLASSIFICATION,
check_scores: bool = True,
) -> None:Final function signature: def list( # noqa: PLR0913
self,
limit: int | None = None,
offset: int | None = None,
status: str | None = None,
main_entity_type: str | None = None,
uploader: list[int] | None = None,
benchmark_suite: int | None = None,
) -> Any: |
Signed-off-by: rohansen856 <rohansen856@gmail.com>
geetu040
left a comment
There was a problem hiding this comment.
Good work. Just use the listing as suggested in #1575 (comment) which is already similar to what you have done.
|
@geetu040 I reviewed the specific changes needed and have a slight doubt in the pandas implementation. class StudiesAPI(ResourceAPI, ABC):
@abstractmethod
def list( # noqa: PLR0913
self,
limit: int | None = None,
offset: int | None = None,
status: str | None = None,
main_entity_type: str | None = None,
uploader: list[int] | None = None,
benchmark_suite: int | None = None,
) -> pd.DataFrame: ...and similarly i have to change the return object in xml_string = response.text
# Parse XML and convert to DataFrame
study_dict = xmltodict.parse(xml_string, force_list=("oml:study",))
# Minimalistic check if the XML is useful
assert isinstance(study_dict["oml:study_list"]["oml:study"], list), type(
study_dict["oml:study_list"],
)
assert (
study_dict["oml:study_list"]["@xmlns:oml"] == "http://openml.org/openml"
), study_dict["oml:study_list"]["@xmlns:oml"]
studies = {}
for study_ in study_dict["oml:study_list"]["oml:study"]:
# maps from xml name to a tuple of (dict name, casting fn)
expected_fields = {
"oml:id": ("id", int),
"oml:alias": ("alias", str),
"oml:main_entity_type": ("main_entity_type", str),
"oml:benchmark_suite": ("benchmark_suite", int),
"oml:name": ("name", str),
"oml:status": ("status", str),
"oml:creation_date": ("creation_date", str),
"oml:creator": ("creator", int),
}
study_id = int(study_["oml:id"])
current_study = {}
for oml_field_name, (real_field_name, cast_fn) in expected_fields.items():
if oml_field_name in study_:
current_study[real_field_name] = cast_fn(study_[oml_field_name])
current_study["id"] = int(current_study["id"])
studies[study_id] = current_study
return pd.DataFrame.from_dict(studies, orient="index")A total of 3 files would be affected: Can you please confirm my approach... After that i will update the PR. |
|
@rohansen856 yes sounds right |
Signed-off-by: rohansen856 <rohansen856@gmail.com>
|
Updated! Ready for review. |
geetu040
left a comment
There was a problem hiding this comment.
Almost fine, just complety remove _list_studies as well and replace _list_studies with api_context.backend.studies.list as the parameter for partial in list_studies. Hope I didnot confuse you, just search for the exact method names in code. Let me know if I am not clear enough.
Oh definitely! I prolly missed that in |
…list Signed-off-by: rohansen856 <rohansen856@gmail.com>
This reverts commit fd43c48.
Signed-off-by: rohansen856 <rohansen856@gmail.com>
Signed-off-by: rohansen856 <rohansen856@gmail.com>
…into studies-migration # Conflicts: # openml/_api/__init__.py # openml/_api/resources/base/resources.py # openml/_api/resources/study.py
|
|
||
|
|
||
| class StudyV1API(ResourceV1API, StudyAPI): | ||
| def list( # noqa: PLR0913 |
There was a problem hiding this comment.
I think we can split this into 3 functions for more readability:
- list()
- _build_url()
- _parse_list_xml()
check #1606 for reference
There was a problem hiding this comment.
Understood! will deparate the long list function into the said 3 functions with proper docstring. applying with next commit.
| @@ -0,0 +1,94 @@ | |||
| # License: BSD 3-Clause | |||
| from __future__ import annotations | |||
There was a problem hiding this comment.
I think it would be better to change the file name to "test_study" for consistency
There was a problem hiding this comment.
Agreed! applying with next commit.
There was a problem hiding this comment.
also in this case similarly, tests\test_study folder should be renamed to tests\test_studies.
cc @geetu040
There was a problem hiding this comment.
makes sense, but let's not do it here, that will make the file hard to review with visible changes
| assert all(studies_df["status"] == "active") | ||
|
|
||
| @pytest.mark.uses_test_server() | ||
| def test_list_pagination(self): |
There was a problem hiding this comment.
I don't think we need to test pagination here. These tests should only be specific for the API. It's better to leave this test on test_study_functions if it's there.
There was a problem hiding this comment.
there is actually no pagination test in test_study_functions. Implementing this here should be fine... LMK if do u think we need to remove it still...
There was a problem hiding this comment.
I would say it's not necessary, since our goal is to simply test the public methods of resource class that are expected to be used in the sdk
but if someone has written additional tests like this, then it's good and not a problem proceeding with them
tests/test_api/test_studies.py
Outdated
|
|
||
| def setUp(self) -> None: | ||
| super().setUp() | ||
| self.api = StudyV2API(self.http_client) |
There was a problem hiding this comment.
This is v2, you need to use
self.v2_client = self._get_http_client(
server="http://localhost:8001/",
base_url="",
api_key="",
timeout_seconds=self.timeout_seconds,
retries=self.retries,
retry_policy=self.retry_policy,
cache=self.cache,
)
and change the server to your local v2 server
There was a problem hiding this comment.
Understood!
replacing this:
self.api = StudyV2API(self.http_client)with this:
self.v2_client = self._get_http_client(
server="http://localhost:8001/",
base_url="",
api_key="",
timeout=self.timeout,
retries=self.retries,
retry_policy=self.retry_policy,
cache=self.cache,
)
self.api = StudyV2API(self.v2_client)There was a problem hiding this comment.
you can use self.http_clients[APIVersion.V2]
tests/test_api/test_studies.py
Outdated
| self.v2_api = StudyV2API(self.http_client) | ||
|
|
||
| @pytest.mark.uses_test_server() | ||
| def test_v1_v2_compatibility(self): |
There was a problem hiding this comment.
I think this should test that the output matches and follow the naming style mentioned here: #1575 (comment)
check #1603 for reference
tests/test_api/test_studies.py
Outdated
| # Both should have delete, tag, untag from base | ||
| for method in ["delete", "tag", "untag", "publish"]: | ||
| assert hasattr(self.v1_api, method) | ||
| assert hasattr(self.v2_api, method) |
There was a problem hiding this comment.
I think you need to add Fallback tests as mentioned here: #1575 (comment)
check #1603 for reference
There was a problem hiding this comment.
Understoo! will implement FallbackProxy and a test_list_fallback function that tests the FallbackProxy automatically falls back from V2 to V1 when V2 raises not supported. also in case of test_list_matches i think it should be marked with @pytest.mark.skip(reason="V2 list not yet implemented") as it currently throws OpenMLNotSupportedError...
There was a problem hiding this comment.
don't skip it, instead check if it raises the right exception, you can see this for reference: tests/test_api/test_versions.py
Signed-off-by: rohansen856 <rohansen856@gmail.com>
Signed-off-by: rohansen856 <rohansen856@gmail.com>
Signed-off-by: rohansen856 <rohansen856@gmail.com>
Signed-off-by: rohansen856 <rohansen856@gmail.com>
geetu040
left a comment
There was a problem hiding this comment.
you are not changing anything in tests/test_study, are you sure nothing needs to be changed there? are all tests passing?
test_study.py looks good, please sync with base branch. I'll check if the tests are passing locally
945e965 to
18dc72a
Compare
Signed-off-by: rohansen856 <rohansen856@gmail.com>
with reference to all other PRs the |
Metadata
Details
Stackend PR, Depends on #1576
This PR adds
Studies v2migration.A question:
Due to the pre commit hook i could not put 6 arguments in a function, so i had to workaround that with this instead:
openml_api\resources\studies.py (line 10-15)
I would like to confirm if this approach is correct or not. Raising a draft PR for now.