You are working as a Data Scientist in an ecommerce company. The company is a market leader with a range of product segments. The main product segments are the latest gadgets, books, toys, household items and clothes.
However, the company’s performance, especially its profitability, is decreasing due to increased competition and changing customer expectations. The company has a well-qualified board of directors who believe in the power of data analytics to solve business problems and inform important business decisions. Because of your data analytics skills and data science capability, the company executives have given you the special task of analysing the company’s sales data for each product segment and each geographic region. The executives have requested the following:
1. Develop a dataset for at least one product segment, with at least 1000 records and attributes such as:
· product name,
· product price,
· shipping type (free or customer paid),
· monthly sales ($),
· geographic region,
· number of customers who bought the product,
· customer type (new or existing).
You may also add further attributes to describe the product segment.
Note: You must develop your own unique and original dataset – copying a dataset from another student or from the internet will result in reduced or zero marks.
2. Research data mining or classification techniques and propose a suitable technique or model to determine associations or relationships among the attributes.
3. Develop a predictive model to predict monthly sales for a given geographic region. You can use methods such as Naive Bayes, decision trees or linear regression. You are also welcome to carry out a comparative analysis of the methods you come across in your research and use that analysis to justify your approach and research findings.
4. Based on your analysis, present recommendations to the board for the following business problems:
· Which geographic region is the most promising for targeting new customers to increase sales and profit?
· Which products should be prioritised to increase sales?
· What will be the impact on product sales if free shipping is provided for all products?
· Any new, innovative ideas to improve the company’s profitability, supported by your data analytics.
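As an illustration of how tasks 2 and 3 might fit together, the following is a minimal Python sketch, assuming pandas, NumPy and scikit-learn are available. It is not a prescribed method. The column names, the four region labels, the sales formula and every generated value are invented placeholders, and the data is synthetic, so it does not replace your own original dataset. The sketch summarises average sales by region and shipping type as a simple first look at relationships among attributes, then compares linear regression and a decision tree using 5-fold cross-validated R².

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
n = 1000

# Synthetic placeholder data: every name, range and coefficient is an assumption.
df = pd.DataFrame({
    "product_price": rng.uniform(5, 100, n).round(2),
    "shipping_type": rng.choice(["free", "customer_paid"], n),
    "region": rng.choice(["North", "South", "East", "West"], n),
    "customer_type": rng.choice(["new", "existing"], n),
    "num_customers": rng.integers(1, 200, n),
})
# Invented relationship: sales = price x customers, with a 10% uplift for free shipping plus noise.
df["monthly_sales"] = (
    df["product_price"] * df["num_customers"]
    * np.where(df["shipping_type"] == "free", 1.1, 1.0)
    + rng.normal(0, 50, n)
)

# Task 2 (first look): mean monthly sales for each region / shipping type pair.
summary = df.pivot_table(
    index="region", columns="shipping_type",
    values="monthly_sales", aggfunc="mean",
)

# Task 3: one-hot encode the categorical attributes, then compare two models.
X = pd.get_dummies(df.drop(columns="monthly_sales"), drop_first=True)
y = df["monthly_sales"]

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=6, random_state=0),
}
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

print(summary.round(0))
for name, score in scores.items():
    print(f"{name}: mean cross-validated R^2 = {score:.3f}")
```

To predict monthly sales for one geographic region, one option is to fit the chosen model on all rows and call its `predict` method on the encoded rows for that region. Fitting a separate model per region is another option worth comparing.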
Proposed Report Structure
The report (2000 words) should be structured using the following headings:
1. Executive Summary
2. Table of Contents
3. List of Abbreviations and assumptions made
4. Introduction – What is the problem?
5. Research Methodology
6. Analytical Findings
7. Recommendations to the company
8. An implementation plan based on the recommendations you have provided
10. List of References
11. Appendix (e.g. Python code)
· You must submit your sample data as a CSV file.
· If you use Python code to analyse your data and draw graphs, then you must include all that code in the Appendix.
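For the CSV submission, a minimal sketch using only the Python standard library could look like the following. The file name `sales_toys_sample.csv`, the field names, the region labels and the `make_record` helper are all hypothetical, and the generated values are random placeholders rather than a real dataset.

```python
import csv
import random

random.seed(7)

# Hypothetical column names mirroring the attributes listed in task 1.
FIELDS = [
    "product_name", "product_price", "shipping_type", "monthly_sales",
    "region", "num_customers", "customer_type",
]

def make_record(i):
    """Build one placeholder record; all values are invented for illustration."""
    price = round(random.uniform(5, 100), 2)
    customers = random.randint(1, 200)
    return {
        "product_name": f"Toy {i % 50 + 1}",
        "product_price": price,
        "shipping_type": random.choice(["free", "customer_paid"]),
        "monthly_sales": round(price * customers, 2),
        "region": random.choice(["North", "South", "East", "West"]),
        "num_customers": customers,
        "customer_type": random.choice(["new", "existing"]),
    }

def write_dataset(path, n_rows=1000):
    """Write n_rows records plus a header row to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for i in range(n_rows):
            writer.writerow(make_record(i))
    return n_rows

write_dataset("sales_toys_sample.csv")
```

Opening the file with `newline=""` follows the `csv` module's recommendation and avoids blank lines between rows on Windows.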