Victoria Solution – Consulting Group’s Intern

Week 2 Project

1. Cleaning Dataset:

1.1. Initial Inspection and Data Cleanup:

  • Duplicate records: The dataset contains two rows with highly similar fields, warranting further analysis for potential duplication.
  • Inconsistent date formats: Date entries exhibit variations in formatting.
  • Missing fields: Several customer email and phone number fields are absent.

1.2. Data Type Formatting:

  • The “Order_Date” column was formatted with consistency. 
  • The column “Email” has the missing value -> manually use function VLookup -> find and replace if the similar value exists in the same dataset. -> result: the dataset is so small that does not have the reference.
  • The column “Phone” has the missing value -> formatting the whole column’s data type into the numeric type. -> VLookup ->  find and replace if the similar value exists in the same dataset. -> result: the dataset is so small that does not have the reference.
  • Still have a few leftover “Null” cells in the dataset -> analyzing -> review -> decide to let it be because it is limited information. 

1.3. Export: The cleaned dataset was exported as a CSV file.

1.4 Cleaned dataset sample:

Order_IDCustomer_NameEmailPhoneProduct_CategoryOrder_DateRevenueDiscount (%)
101John Doejohn@email.com9876543210ElectronicsSunday, December 31, 2023120010
102Alice Smith9898989898ClothingFriday, January 5, 20245000
103Bob Millerbob@email.comElectronicsFriday, January 12, 2024300020
104John Doejohn@email.com9876543210ElectronicsSunday, December 31, 2023120010
105David Whitedavid@email.com9123456789FurnitureThursday, February 15, 2024250015
106Emma Brownemma@email.com9234567890ClothingFriday, March 8, 20247005
107Chris Green9345678901FurnitureWednesday, April 10, 2024180025
108Alice Smithalice@email.comClothingFriday, March 8, 20245000

2. Insights:

2.1 Customer Activity:

  • The dataset comprises five unique customers, with ‘John Doe’ and ‘Alice Smith’ appearing twice, indicating repeat patronage.
  • ‘John Doe’ has orders with Order_ID 101 and 104, both for ‘Electronics’ with a 10% discount and a revenue of 1200. These entries may represent duplicate records or identical orders.
  • ‘Bob Miller’ has missing email and phone number information, which could impede customer communication.

2.2 Product Performance:

  • ‘Electronics’ is a prominent product category, accounting for three out of eight orders.
  • ‘Electronics’ generated the highest revenue, totaling $5400 across the recorded orders.
  • ‘Clothing’ and ‘Furniture’ also contributed to revenue, with ‘Clothing’ generating $500 and ‘Furniture’ generating $2500.

2.3 Sales Trends and Discounts:

  • The ‘Order_date’ column spans from ‘2023-12-31’ to ‘2024-02-20’, indicating sales activity across multiple months.
  • The average discount applied across all orders is approximately 10%.
  • The highest discount, 20%, was applied to an ‘Electronics’ order placed by ‘Bob Miller’, which also had the highest individual revenue of $3000.
  • The relationship isn’t straightforward – higher discounts don’t automatically mean higher revenue. -> Because two orders (Alice Smith’s Clothing purchases) had no discount and generated $500 each, it is proved that discounts aren’t always necessary for conversions.


3. Visualizations:

3.1 Summary Revenues:

3.2 Relationship between Revenues and Discounts:

3.3 Sales Tracking:


Recommendations

  •  Implement a data validation process at the point of order entry. For existing records, reach out to customers to complete their contact information. This is crucial for marketing campaigns and customer retention efforts.
  • Consider focusing marketing efforts on Electronics while investigating why Furniture has such variable discounting (15-25%). -> Consider implementing tiered discount strategies by product category rather than blanket discounting. -> Electronics and Furniture categories can support 15-20% discounts while maintaining high revenue, whereas Clothing appears to sell at lower price points regardless of discount level.
  • The lower sales figures observed in February and April of the current year offer valuable insights for future marketing strategy modifications by implementing the A/B test: An A/B test will be conducted: Test A will offer a bundle of Electronics and Clothing, while Test B will offer the same bundle with an additional 10% discount.

Proposal:

error: Content is protected !!