Refactor BaseTimeBucket and add class BaseTimeProjectData #40

Closed
10 of 13 tasks
PabloVasconez opened this issue Jul 30, 2024 · 1 comment · Fixed by #42

Comments

@PabloVasconez
Contributor

PabloVasconez commented Jul 30, 2024

To make parsing the data more efficient, I propose refactoring the class BaseTimeBucket as detailed below.

Partly for efficiency but also for user-friendliness, I propose creating a BaseTimeProjectData class, which contains only the measurements for a single project. The user will typically request data for a single project rather than for multiple projects at the same time.

My suggestions:

Class BaseTimeBucket

  • Create public cached_property method companies: List[str] and create a separate private method _test_connection. Note that these will replace the code below. Also note that list_projects is actually list_companies.
    image
  • Create public cached_property method filepaths: List[str] to list all filepaths in the bucket. This can be done efficiently as follows:
    image
  • Create public cached_property method filepath_project_patterns: List[str] to get the unique project patterns from the self.filepaths attribute. For the list below, this should return ["meting_Crux", "7579_Plasdraszone_Dijksgracht_Oost"].
    image
  • Create private method _get_filepaths_for_project_pattern(project_pattern: str) which returns all the filepaths with the requested project_pattern from self.filepaths.
  • Create public cached_property method filepath_project_pattern_per_project: Dict[project_name, filepath_project_pattern]. For this, only one filepath per filepath_project_pattern needs to be parsed (taken from the list returned by _get_filepaths_for_project_pattern, perhaps the last one?) to get the actual project_name and store the item in the dictionary. This saves a lot of time otherwise spent parsing JSON files that are ultimately not needed.
  • Create public cached_property method projects: List[str].
  • Create public method get_filepaths_for_project which returns all the filepaths for that project name from self.filepaths. Note that under the hood the methods/attributes self._filepath_project_pattern_per_project and self._get_filepaths_for_project_pattern will be used for this purpose.
  • Create public cached_property method projects_per_company: Dict[company_name, List[project_names]]. For this, get_filepaths_for_project will be used.
  • Create public method get_project_data(project_name: str, company: str | None = None) -> BaseTimeProjectData. This method should:
    • Check whether the project_name is available (if not, list the available project_names in the ValueError message). Check whether the project_name appears in more than one company (if it does, list the companies where the project_name appears and suggest specifying the company_name).
    • Loop through all filepaths corresponding to project_name, parse the .json file at each filepath and, using the information in that file:
      • If it is the first filepath, parse the Project from the information in the .json file. This only needs to be done once.
      • Loop through every rod_id, parse the SettlementMeasurement and store it in a temporary dictionary Dict[rod_id, List[SettlementMeasurement]].
    • For each rod_id in the dictionary, create one SettlementRodSeries using all the corresponding List[SettlementMeasurement] and append it to a list_settlement_rod_series.
    • Return the BaseTimeProjectData(list_settlement_rod_series).
  • Remove the following code from the init.
    image
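
A minimal sketch of the lazy indexing proposed above, not the actual implementation. The class name BucketIndexSketch, the in-memory `files` dictionary standing in for the bucket, the path layout `<company>/<project_pattern>/<file>.json`, the "project_name" JSON key, and the company and project names in the demo data are all assumptions made for illustration; the real bucket layout is shown only in the screenshots.

```python
from functools import cached_property
from typing import Dict, List


class BucketIndexSketch:
    """Illustrative stand-in for BaseTimeBucket; no S3 access involved."""

    def __init__(self, files: Dict[str, dict]) -> None:
        # Hypothetical in-memory "bucket": filepath -> already-decoded JSON.
        self._files = files
        self.parsed_count = 0  # number of JSON files actually opened

    def _load_json(self, filepath: str) -> dict:
        self.parsed_count += 1
        return self._files[filepath]

    @staticmethod
    def _pattern_of(filepath: str) -> str:
        # ASSUMPTION: layout is "<company>/<project_pattern>/<file>.json".
        return filepath.split("/")[1]

    @cached_property
    def filepaths(self) -> List[str]:
        return list(self._files)

    @cached_property
    def filepath_project_patterns(self) -> List[str]:
        # dict.fromkeys removes duplicates while keeping first-seen order.
        return list(dict.fromkeys(self._pattern_of(p) for p in self.filepaths))

    def _get_filepaths_for_project_pattern(self, project_pattern: str) -> List[str]:
        return [p for p in self.filepaths if self._pattern_of(p) == project_pattern]

    @cached_property
    def filepath_project_pattern_per_project(self) -> Dict[str, str]:
        mapping: Dict[str, str] = {}
        for pattern in self.filepath_project_patterns:
            # Parse only one file per pattern (here: the last one).
            last = self._get_filepaths_for_project_pattern(pattern)[-1]
            # ASSUMPTION: the project name is stored under "project_name".
            mapping[self._load_json(last)["project_name"]] = pattern
        return mapping

    @cached_property
    def projects(self) -> List[str]:
        return list(self.filepath_project_pattern_per_project)

    def get_filepaths_for_project(self, project_name: str) -> List[str]:
        pattern = self.filepath_project_pattern_per_project[project_name]
        return self._get_filepaths_for_project_pattern(pattern)


# Hypothetical demo data: one company, two project patterns, three files.
files = {
    "acme/meting_Crux/2024-01.json": {"project_name": "Crux"},
    "acme/meting_Crux/2024-02.json": {"project_name": "Crux"},
    "acme/7579_Plasdraszone_Dijksgracht_Oost/2024-01.json": {
        "project_name": "Dijksgracht Oost"
    },
}
bucket = BucketIndexSketch(files)
print(bucket.filepath_project_patterns)
print(bucket.projects, "- files parsed:", bucket.parsed_count)
```

In this sketch only one file per project pattern is opened to build the project list, and cached_property ensures that repeated access to projects does not parse anything again.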

Class BaseTimeProjectData

  • Create init method with signature __init__(self, list_settlement_rod_series: List[SettlementRodSeries]) and validate that all the settlement rod series come from the same project using a private method _set_list_settlement_rod_series; add a public one to return the list.
  • Create public cached_property methods project: Project and rod_ids: List[str] getting the data from self.list_settlement_rod_series.
  • Create public method get_series_for_rod_id(object_id: str) -> SettlementRodSeries which returns the series for the requested rod_id.
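
The class described above, together with the per-rod grouping step of get_project_data, could look roughly like the sketch below. Project, SettlementMeasurement and SettlementRodSeries are heavily simplified stand-ins defined here for illustration only (the real classes carry far more fields), group_into_series is a hypothetical helper name, and the lookup parameter is called rod_id here rather than object_id.

```python
from collections import defaultdict
from dataclasses import dataclass
from functools import cached_property
from typing import Dict, List


@dataclass(frozen=True)
class Project:  # simplified stand-in, not the real class
    name: str


@dataclass(frozen=True)
class SettlementMeasurement:  # simplified stand-in
    project: Project
    rod_id: str
    value: float  # placeholder payload


@dataclass
class SettlementRodSeries:  # simplified stand-in
    measurements: List[SettlementMeasurement]

    @property
    def project(self) -> Project:
        return self.measurements[0].project

    @property
    def rod_id(self) -> str:
        return self.measurements[0].rod_id


class BaseTimeProjectData:
    def __init__(self, list_settlement_rod_series: List[SettlementRodSeries]) -> None:
        self._set_list_settlement_rod_series(list_settlement_rod_series)

    def _set_list_settlement_rod_series(self, value: List[SettlementRodSeries]) -> None:
        if not value:
            raise ValueError("At least one SettlementRodSeries is required.")
        projects = {series.project for series in value}
        if len(projects) != 1:
            names = sorted(p.name for p in projects)
            raise ValueError(f"All series must belong to one project, got: {names}")
        self._list_settlement_rod_series = list(value)

    @property
    def list_settlement_rod_series(self) -> List[SettlementRodSeries]:
        return self._list_settlement_rod_series

    @cached_property
    def project(self) -> Project:
        return self.list_settlement_rod_series[0].project

    @cached_property
    def rod_ids(self) -> List[str]:
        return [series.rod_id for series in self.list_settlement_rod_series]

    def get_series_for_rod_id(self, rod_id: str) -> SettlementRodSeries:
        for series in self.list_settlement_rod_series:
            if series.rod_id == rod_id:
                return series
        raise ValueError(f"rod_id {rod_id!r} not found. Available: {self.rod_ids}")


def group_into_series(measurements: List[SettlementMeasurement]) -> List[SettlementRodSeries]:
    """Collect measurements per rod_id, then build one series per rod."""
    per_rod: Dict[str, List[SettlementMeasurement]] = defaultdict(list)
    for measurement in measurements:
        per_rod[measurement.rod_id].append(measurement)
    return [SettlementRodSeries(items) for items in per_rod.values()]


# Hypothetical demo data.
crux = Project("Crux")
measurements = [
    SettlementMeasurement(crux, "rod-1", 0.10),
    SettlementMeasurement(crux, "rod-2", 0.20),
    SettlementMeasurement(crux, "rod-1", 0.12),
]
data = BaseTimeProjectData(group_into_series(measurements))
print(data.project.name, data.rod_ids)
```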
PabloVasconez changed the title from "Create BaseTimeProjectData (TO BE COMPLETED)" to "Refactor BaseTimeBucket and add class BaseTimeProjectData" on Jul 30, 2024
RDWimmers added the enhancement and python labels on Jul 31, 2024
@RDWimmers
Member

Very detailed and clear instructions for this change, @PabloVasconez. Indeed, I think the user is interested in the project data and not the company data. Loading the data for a whole company is inefficient.

So the UX will look like this:

--> BaseTimeBucket                       --> BaseTimeProjectData               --> SettlementRodSeries
    func: get_project_data                   func: get_series_for_rod_id
    - project_name                           - rod_ids
    - company_name (optional)
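
One possible way to handle the optional company_name in the flow above, following the checks described for get_project_data. This is a sketch under assumptions: resolve_company is a hypothetical helper name, and the dictionary mirrors the proposed projects_per_company property with invented demo values.

```python
from typing import Dict, List, Optional


def resolve_company(
    projects_per_company: Dict[str, List[str]],
    project_name: str,
    company: Optional[str] = None,
) -> str:
    """Return the company that owns project_name, or raise a helpful ValueError."""
    owners = [c for c, names in projects_per_company.items() if project_name in names]
    if not owners:
        available = sorted({n for names in projects_per_company.values() for n in names})
        raise ValueError(
            f"Project {project_name!r} not found. Available projects: {available}"
        )
    if company is not None:
        if company not in owners:
            raise ValueError(
                f"Project {project_name!r} not found for company {company!r}. "
                f"It appears in: {owners}"
            )
        return company
    if len(owners) > 1:
        raise ValueError(
            f"Project {project_name!r} appears in several companies: {owners}. "
            "Please specify the company."
        )
    return owners[0]


# Hypothetical demo data.
projects_per_company = {
    "acme": ["Crux", "Shared"],
    "globex": ["Shared"],
}
print(resolve_company(projects_per_company, "Crux"))
print(resolve_company(projects_per_company, "Shared", company="globex"))
```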

RDWimmers pushed a commit that referenced this issue Oct 1, 2024
@RDWimmers RDWimmers linked a pull request Oct 1, 2024 that will close this issue
RDWimmers added a commit that referenced this issue Oct 10, 2024
* Added Basetime class to BAEC measurements

* Adjust 3 files based on feedback

* Edited Files with Black, super_linter and compile

* Update with the new Basetime Lambda functions.

* Update with the new Basetime Lambda functions.

* Update Lambda and super linter, restful 01 version

* feat: added Basetime class to BAEC measurements

* pref: resolve issue #40

* style: edited Files with Black, super_linter and compile

* Update with the new Basetime Lambda functions.

* Update with the new Basetime Lambda functions.

* Update Lambda and super linter, restful 01 version

* chore(deps): add studs package for boto3 and botocore

* Added Credentials from Basetime to test_basetime

* chore(deps): use super-linter v7

* style: resolve linter issues

* replace for loops with list comprehensions

* replace for loops with list comprehensions

* refactor!: move logic to io folder

* style: format files

* format fix new version json Basetime

* style: format files with super linter

---------

Co-authored-by: Robin Wimmers <[email protected]>