To make parsing the data more efficient, I propose to refactor the class BaseTimeBucket as detailed below.
Partly for efficiency, but also to improve user-friendliness, I propose to create a BaseTimeProjectData class, which contains only the measurements for a single project. The user will typically request data for a single project, rather than for multiple projects at the same time.
My suggestions:
Class BaseTimeBucket
Create public cached_property method companies: List[str] and create a separate private method _test_connection. Note that these will replace the code below. Also note that list_projects actually lists companies, so it is really list_companies.
Create public cached property method filepaths: List[str] to list all filepaths in the bucket. This can be efficiently achieved by applying this:
Create public cached_property method filepath_project_patterns: List[str] to get the unique project patterns from the self.filepaths attribute. For the list below, this should return ["meting_Crux", "7579_Plasdraszone_Dijksgracht_Oost"].
Create private method _get_filepaths_for_project_pattern(project_pattern: str) which returns all the filepaths with the requested project_pattern from self.filepaths.
Create public cached_property method filepath_project_pattern_per_project: Dict[project_name, filepath_project_pattern]. For this, it is necessary to parse only one filepath per filepath_project_pattern (taken from the list obtained using _get_filepaths_for_project_pattern; perhaps the last one?), get the actual project_name and store the item in the dictionary. This will save quite a lot of time by not parsing JSON files that are ultimately unnecessary.
Create public cached_property method projects: List[str].
Create public method get_filepaths_for_project, which returns all the filepaths for that project name from self.filepaths. Note that under the hood self.filepath_project_pattern_per_project and self._get_filepaths_for_project_pattern will be used for this purpose.
Create public cached_property method projects_per_company: Dict[company_name, List[project_name]]. For this, get_filepaths_for_project will be used.
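The cached properties above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name BucketIndexSketch is hypothetical, the keys are passed in directly instead of being listed from the bucket, and the key layout `<company>/<project_pattern>/<file>.json` is assumed for the example and may not match the real bucket.

```python
from functools import cached_property
from typing import List


class BucketIndexSketch:
    """Hypothetical stand-in for BaseTimeBucket that receives the keys directly."""

    def __init__(self, filepaths: List[str]) -> None:
        self._raw_filepaths = list(filepaths)

    @cached_property
    def filepaths(self) -> List[str]:
        # The real class would list the bucket once here; the result is then cached.
        return list(self._raw_filepaths)

    @cached_property
    def companies(self) -> List[str]:
        # ASSUMPTION: the first path segment is the company name.
        return sorted({path.split("/")[0] for path in self.filepaths})

    @cached_property
    def filepath_project_patterns(self) -> List[str]:
        # ASSUMPTION: the second path segment is the project pattern.
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return list(dict.fromkeys(path.split("/")[1] for path in self.filepaths))

    def _get_filepaths_for_project_pattern(self, project_pattern: str) -> List[str]:
        # Filter the cached listing; no JSON file is opened at this stage.
        return [p for p in self.filepaths if p.split("/")[1] == project_pattern]


bucket = BucketIndexSketch(
    [
        "company_a/meting_Crux/file_1.json",
        "company_a/meting_Crux/file_2.json",
        "company_b/7579_Plasdraszone_Dijksgracht_Oost/file_1.json",
    ]
)
patterns = bucket.filepath_project_patterns
crux_files = bucket._get_filepaths_for_project_pattern("meting_Crux")
```

Because everything is derived from the cached filepaths list, only one file per pattern would need to be parsed later to map a pattern to its project name.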
Create public method get_project_data(project_name: str, company: str | None = None) -> BaseTimeProjectData. This method should:
Check whether the project_name is available (if not, list the available project names in the ValueError message). Check whether the project_name appears in more than one company (if it does, list the companies where the project_name appears and suggest specifying the company).
Loop through all filepaths corresponding to project_name, parse the .json file at each filepath, and use its contents as follows:
If it is the first filepath, parse the Project according to the information in the .json file. This only needs to be done once.
Loop through every rod_id, parse the SettlementMeasurement and store it in a temporary dictionary Dict[rod_id, List[SettlementMeasurement]].
For each rod_id in the dictionary, create one SettlementRodSeries using all the corresponding List[SettlementMeasurement] and append it to a list_settlement_rod_series.
Return the BaseTimeProjectData(list_settlement_rod_series).
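The validation and grouping steps of get_project_data could look roughly like the sketch below. The helper names resolve_company and group_by_rod_id are hypothetical, and the parsed JSON is assumed (for illustration only) to be a mapping from rod_id to a single measurement; the real Basetime file structure may differ.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def resolve_company(
    project_name: str,
    projects_per_company: Dict[str, List[str]],
    company: Optional[str] = None,
) -> str:
    """Return the company owning project_name, or raise a helpful ValueError."""
    matches = [c for c, names in projects_per_company.items() if project_name in names]
    if not matches:
        available = sorted({n for names in projects_per_company.values() for n in names})
        raise ValueError(f"Project {project_name!r} not found. Available: {available}")
    if company is not None:
        if company not in matches:
            raise ValueError(f"Project {project_name!r} not found in {company!r}.")
        return company
    if len(matches) > 1:
        raise ValueError(
            f"Project {project_name!r} appears in companies {matches}; specify `company`."
        )
    return matches[0]


def group_by_rod_id(parsed_files: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Collect measurements per rod_id across all parsed files of one project."""
    # ASSUMPTION: each parsed file maps rod_id -> one measurement.
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for content in parsed_files:
        for rod_id, measurement in content.items():
            grouped[rod_id].append(measurement)
    return dict(grouped)


projects_per_company = {"company_a": ["p1", "p2"], "company_b": ["p2"]}
owner = resolve_company("p1", projects_per_company)
grouped = group_by_rod_id([{"rod_1": 1.0, "rod_2": 2.0}, {"rod_1": 1.5}])
```

Each list in the grouped dictionary would then be turned into one SettlementRodSeries.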
Remove the following code from the init.
Class BaseTimeProjectData
Create an init method with signature __init__(self, list_settlement_rod_series: List[SettlementRodSeries]) that validates that all the settlement rod series come from the same project, using a private method _set_list_settlement_rod_series, and add a public property to return the list.
Create public cached_property methods project: Project and rod_ids: List[str] getting the data from self.list_settlement_rod_series.
Create public method get_series_for_rod_id(object_id:str) -> SettlementRodSeries which returns the series for the requested rod_id.
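A minimal sketch of the proposed class is shown below. SeriesStub is a hypothetical stand-in for SettlementRodSeries with only the two fields needed here, the project is reduced to its name, and the lookup parameter is called rod_id for readability; none of this is taken from the actual BAEC code.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import List


@dataclass(frozen=True)
class SeriesStub:
    """Hypothetical stand-in for SettlementRodSeries."""

    project_name: str
    rod_id: str


class ProjectDataSketch:
    """Illustrative version of the proposed BaseTimeProjectData."""

    def __init__(self, list_settlement_rod_series: List[SeriesStub]) -> None:
        self._set_list_settlement_rod_series(list_settlement_rod_series)

    def _set_list_settlement_rod_series(self, value: List[SeriesStub]) -> None:
        if not value:
            raise ValueError("At least one settlement rod series is required.")
        names = {series.project_name for series in value}
        if len(names) > 1:
            raise ValueError(f"All series must share one project, got {sorted(names)}.")
        self._list_settlement_rod_series = list(value)

    @property
    def list_settlement_rod_series(self) -> List[SeriesStub]:
        return self._list_settlement_rod_series

    @cached_property
    def project(self) -> str:
        # Validation guarantees that every series has the same project.
        return self.list_settlement_rod_series[0].project_name

    @cached_property
    def rod_ids(self) -> List[str]:
        return [series.rod_id for series in self.list_settlement_rod_series]

    def get_series_for_rod_id(self, rod_id: str) -> SeriesStub:
        for series in self.list_settlement_rod_series:
            if series.rod_id == rod_id:
                return series
        raise ValueError(f"Rod {rod_id!r} not found. Available: {self.rod_ids}")


data = ProjectDataSketch([SeriesStub("p1", "rod_1"), SeriesStub("p1", "rod_2")])
second = data.get_series_for_rod_id("rod_2")
```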
PabloVasconez changed the title from "Create BaseTimeProjectData (TO BE COMPLETED)" to "Refactor BaseTimeBucket and add class BaseTimeProjectData" on Jul 30, 2024
Very detailed and clear instructions for this change, @PabloVasconez. Indeed, I think the user is interested in the project data and not the company data. Loading all company data makes it inefficient.
* Added Basetime class to BAEC measurements
* Adjust 3 files based on feedback
* Edited Files with Black, super_linter and compile
* Update with the new Basetime Lambda functions.
* Update with the new Basetime Lambda functions.
* Update Lambda and super linter, restful 01 version
* feat: added Basetime class to BAEC measurements
* pref: resolve issue #40
* style: edited Files with Black, super_linter and compile
* Update with the new Basetime Lambda functions.
* Update with the new Basetime Lambda functions.
* Update Lambda and super linter, restful 01 version
* chore(deps): add studs package for boto3 and botocore
* Added Credentials from Basetime to test_basetime
* chore(deps): use super-linter v7
* style: resolve linter issues
* replace for loops with list comprehensions
* replace for loops with list comprehensions
* refactor!: move logic to io folder
* style: format files
* format fix new version json Basetime
* style: format files with super linter
---------
Co-authored-by: Robin Wimmers <[email protected]>