Submission: GROUP_29: US-Salary-Prediction #14
Comments
Data analysis review checklist
Reviewer: @joshsia
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1.5 hours
Review Comments:
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: @liannah
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1 hour
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: CHEN_Xiaohan (@Anthea98)
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1 hour
Review Comments:
Please provide more detailed feedback here on what was done particularly well, and what could be improved. It is especially important to elaborate on items that you were not able to check off in the list above.
About the report structure and writing quality
About the data process and model chosen
About the whole repo
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Data analysis review checklist
Reviewer: Valli Akella (@valli180)
Conflict of interest
Code of Conduct
General checks
Documentation
Code quality
Reproducibility
Analysis report
Estimated hours spent reviewing: 1 hour
Review Comments:
Overall:
Attribution
This was derived from the JOSE review checklist and the ROpenSci review checklist.
Project Improvements
Feedback 1:
Feedback: Comment 8 in this issue. Based on the reasoning, the original question could be narrower in scale: instead of anyone's salary, perhaps salaries in a certain city. It is easy to guess that age is a factor, but the actual variable of interest is experience, which is not available here as a quantitative feature (years of work). Next steps should be stated more clearly to address concerns about your project question and further analysis of the data.
Action:
Feedback 2:
Feedback: Comment 1C in this issue. I think there needs to be some summary of how your data isn't giving you the right result or whether your model isn't working properly. The advancement from milestone 1 to 2 isn't obvious. It's okay to have a negative result, but the presentation is a bit lacking. Maybe separate your results and discussion, and evaluate your data first and then your model results.
Action:
Feedback 3:
Feedback: Comment 3C in this issue. Narrative of analysis and visualization was not present.
Action: Improved narrative/storytelling aspect of final report:
Feedback 4:
Feedback: Comments 3-4 in this issue. The scripts in the src directory are not named consistently (some scripts use camel case while others use underscores). I'm also not sure about this, but I don't understand why you define the ordering for ordinal variables with missing_value at the bottom. Maybe imputation would be useful in this scenario if there are not many missing values? Regarding the bar plots for how_old_are_you, years_of_experience_in_field, etc., it might be nice to order the y-axis by the same ordinal encoding used in the fit_transform_evaluate_model.py script. Currently, the plot for how_old_are_you might be a bit misleading because it shows a gradual increase in median annual salary even though the age groups are not in increasing order.
Action: changed EDA to reorder graph axis: UBC-MDS/US-Salary-Prediction@9a51f45
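The axis-ordering suggestion in Feedback 4 can be sketched roughly as follows. This is only an illustration, not the code in the repository: the age-group labels in `AGE_ORDER`, the helper `order_categories`, and the median values are all hypothetical, made up to show the idea of sorting plot categories by one explicit ordinal order and placing unknown labels such as `missing_value` last.

```python
# Hypothetical ordinal order for an age-group column (labels are assumptions,
# not taken from the survey data or from fit_transform_evaluate_model.py).
AGE_ORDER = ["under 18", "18-24", "25-34", "35-44", "45-54", "55-64", "65 or over"]


def order_categories(values_by_label, order):
    """Return (label, value) pairs sorted by an explicit ordinal order.

    Labels that do not appear in `order` (for example a missing-value
    placeholder) are placed after all known labels.
    """
    rank = {label: i for i, label in enumerate(order)}
    return sorted(values_by_label.items(), key=lambda kv: rank.get(kv[0], len(order)))


# Made-up medians, in arbitrary input order, purely for illustration.
medians = {"25-34": 80000, "18-24": 55000, "missing_value": 60000, "35-44": 95000}
ordered = order_categories(medians, AGE_ORDER)
labels = [label for label, _ in ordered]
print(labels)
```

Passing the same order list to both the encoder and the plotting code is one way to keep the chart axis and the model's ordinal encoding consistent.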
Submitting authors: @AndyYang80 @cuthchow @lirnish
Repository: https://github.com/UBC-MDS/US-Salary-Prediction
Report link: https://github.com/UBC-MDS/US-Salary-Prediction/blob/main/doc/final_report.md
Abstract/executive summary:
One of the most important considerations in a job search is salary: specifically, does this job's salary meet our expectations? However, it is not easy to set proper expectations, and setting them too high or too low can harm a job search. This project aims to help answer the question: what can we expect a person's salary to be in the US? According to Martín et al. (2018), a linear regression model evaluated with an R² score is a good combination for predicting salaries, so we use that approach for the prediction. In the process, we wish to understand which factors provide the most predictive power when predicting a person's salary. The dataset we analyze comes from a salary survey on the "Ask a Manager" blog by Alison Green. It contains survey data gathered from "Ask a Manager" readers working in a variety of industries, and can be found here.
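For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of a one-feature least-squares fit and the R² score, written in plain Python. It is not the project's pipeline: the function names `fit_line` and `r2_score` and the toy numbers are assumptions chosen for illustration, and the numbers are not salary data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(ys, preds):
    """R^2 = 1 - SS_res / SS_tot, the share of variance explained."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data lying exactly on y = 2x, so the fit is perfect.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
perfect = r2_score(ys, preds)

# Always predicting the mean of y gives R^2 = 0, the usual baseline.
baseline = r2_score(ys, [sum(ys) / len(ys)] * len(ys))
print(perfect, baseline)
```

An R² near 1 means the model explains most of the variance in the target, while a value near 0 means it does little better than predicting the mean.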
Editor: @flor14
Reviewer: Chen_Xiaohan, Hovhannisyan Lianna, Akella Lakshmi Santosha Valli, Sia_Joshua