For this Deep Learning use case, I took the data from Kaggle's public data repository: the Rossmann Store Sales dataset. It comes from a very popular competition hosted a couple of years ago and is fairly large. To download the data, you need to register with Kaggle and accept the competition rules. Of the files provided, I used only train.csv and store.csv.
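As a starting point, the two files can be loaded and combined into one table. This is only a minimal sketch using pandas: the helper name load_rossmann is hypothetical, and it assumes that both files share a "Store" key column and that train.csv has a "Date" column, so check the headers of your download before relying on it.

```python
import pandas as pd


def load_rossmann(train_path="train.csv", store_path="store.csv"):
    """Read the two CSV files and attach store attributes to each sales row.

    Assumptions (not stated in the text above): both files contain a
    "Store" column, and train.csv contains a "Date" column.
    """
    # low_memory=False makes pandas infer each column type from the whole file
    train = pd.read_csv(train_path, parse_dates=["Date"], low_memory=False)
    store = pd.read_csv(store_path)
    # A left join keeps every sales row, even if a store has no metadata row
    return train.merge(store, on="Store", how="left")
```

With both files in the working directory, calling load_rossmann() with no arguments returns one row per sales record, with the store-level columns appended.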
Rossmann is one of the largest drugstore chains in Germany, with operations across Europe. My task is to predict the sales for a few identified stores on a given day. The first question I needed to ask myself was: who is the end stakeholder for the business problem, and how are they going to use the solution?
We know that there is a marketing team designing store-specific promotional campaigns to target customers and increase overall revenue while using resources more judiciously. They therefore don’t want to spend promotions on stores that would perform well anyway. With visibility into estimated future sales, they could classify stores as “low,” “medium,” or “high” against a defined threshold and decide which stores need discounts and promotions to reach their expected targets. The team hit a roadblock, however: they had no means of estimating the future sales for a given store.
To solve the problem, I therefore asked the following question: “How can I estimate future sales for a store?” Once that roadblock is overcome, the marketing team has the means to study and estimate future store sales and thus design more effective promotional campaigns. The answer to the key question is probably easy to guess now: I developed an ML model that learns the sales for a store as a function of internal, external, and temporal (time-based) attributes and then predicts future sales from the attributes available.
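The “low,” “medium,” and “high” classification mentioned above can be sketched in a few lines once predicted sales are available. This is an illustration only, not the marketing team's actual rule: the function names, the two cutoff values, and the example predictions are all hypothetical.

```python
from typing import Dict


def sales_bucket(predicted_sales: float, low_cutoff: float, high_cutoff: float) -> str:
    """Map one predicted sales figure to "low", "medium", or "high"."""
    if low_cutoff >= high_cutoff:
        raise ValueError("low_cutoff must be smaller than high_cutoff")
    if predicted_sales < low_cutoff:
        return "low"
    if predicted_sales < high_cutoff:
        return "medium"
    return "high"


def bucket_stores(predictions: Dict[int, float],
                  low_cutoff: float,
                  high_cutoff: float) -> Dict[int, str]:
    """Apply sales_bucket to every store ID in a {store_id: prediction} mapping."""
    return {
        store_id: sales_bucket(value, low_cutoff, high_cutoff)
        for store_id, value in predictions.items()
    }


# Hypothetical predicted daily sales per store ID; the numbers are made up.
example_predictions = {1: 3200.0, 2: 5400.0, 3: 9100.0}
buckets = bucket_stores(example_predictions, low_cutoff=4000.0, high_cutoff=8000.0)
```

In practice the cutoffs would come from the business side, for example from the sales targets each store is expected to reach.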