Citi Bike, a public bicycle sharing system that serves parts of New York City, is the largest bike-sharing program in the United States. With the wealth of data available on Citi Bike users, it is possible to segment the customer base by factors such as age, sex and occupation. This helps the company identify potential customers to target, as well as weak customer segments that need improvement. The prime objective of this project is to identify long-term customers and potential audiences that can be targeted to grow the company's customer base and drive more revenue.
Other deliverables of this project are to identify hot-spot locations and peak-demand hours. This information can help the company manage its supply better, which in turn serves more customers: knowing which stations are hot spots at peak hours shows where demand is, and increasing bike availability at those stations during those hours should drive more revenue. Our design relies largely on the existing Citi Bike trip history data and the Citi Bike daily ridership and membership data. The trip data includes the following fields:
- Trip Duration (seconds)
- Start Time and Date
- Stop Time and Date
- Start Station Name
- End Station Name
- Station ID
- Station Lat/Long
- Bike ID
- User Type (Customer = 24-hour pass or 7-day pass user; Subscriber = Annual Member)
- Gender (0=unknown; 1=male; 2=female)
- Year of Birth
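To make the field list above concrete, here is a minimal Python sketch of how one trip record might be parsed. The column headers (`tripduration`, `usertype`, `birth year`, and so on), the `Trip` class, the `parse_trip` helper and the two sample rows are all assumptions made up for illustration; check them against the header row of the file you actually download.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional

# Gender codes as documented in the field list above.
GENDER_LABELS = {0: "unknown", 1: "male", 2: "female"}


@dataclass
class Trip:
    """A subset of the trip fields (hypothetical container for this sketch)."""
    duration_s: int
    start_station: str
    end_station: str
    user_type: str            # "Customer" or "Subscriber"
    gender: str
    birth_year: Optional[int]  # often missing for short-term pass users


def parse_trip(row: dict) -> Trip:
    """Convert one CSV row (assumed header names) into a typed Trip."""
    birth = row["birth year"].strip()
    return Trip(
        duration_s=int(row["tripduration"]),
        start_station=row["start station name"],
        end_station=row["end station name"],
        user_type=row["usertype"],
        gender=GENDER_LABELS.get(int(row["gender"]), "unknown"),
        birth_year=int(birth) if birth.isdigit() else None,
    )


# Two made-up rows, only to show the shape of the data.
SAMPLE = """tripduration,start station name,end station name,usertype,birth year,gender
634,Station A,Station B,Subscriber,1984,1
1547,Station C,Station C,Customer,,0
"""

trips = [parse_trip(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for trip in trips:
    print(trip)
```

Parsing into typed records up front keeps the later counting steps simple, and it makes the missing birth years explicit instead of letting them fail in the middle of an aggregation.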
Problem Statements:
- 1. Number of Subscribers and Customers
- 2. Number of Subscribers and Customers for each gender
- 3. Number of Subscribers and Customers for each gender in every age category
- 4. Average Trip Distance
- 5. Week Stats: Identify the day with the most trips
- 6. Most Popular Stations: Identify the stations where the most trips originate and end
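Before moving to a cluster, it can help to prototype the six problem statements above on a small sample. The sketch below does that in plain Python and is not the Hadoop implementation. Several things in it are assumptions: the dictionary keys, the age buckets (the source does not define the age categories), the three made-up sample trips, and the use of the straight-line (haversine) distance between start and end stations as a stand-in for trip distance, since the data carries no route information.

```python
from collections import Counter
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km = mean Earth radius


def age_bucket(birth_year, ref_year):
    """Hypothetical age categories; adjust to whatever the analysis needs."""
    if birth_year is None:
        return "unknown"
    age = ref_year - birth_year
    if age < 25:
        return "<25"
    if age < 45:
        return "25-44"
    if age < 65:
        return "45-64"
    return "65+"


def summarize(trips, ref_year):
    """Compute the six statistics for a list of trip dictionaries."""
    distances = [haversine_km(*t["start"], *t["end"]) for t in trips]
    days = Counter(WEEKDAYS[t["start_time"].weekday()] for t in trips)
    return {
        "by_type": Counter(t["user_type"] for t in trips),                       # 1
        "by_type_gender": Counter((t["user_type"], t["gender"]) for t in trips),  # 2
        "by_type_gender_age": Counter(                                           # 3
            (t["user_type"], t["gender"], age_bucket(t["birth_year"], ref_year))
            for t in trips),
        "avg_distance_km": sum(distances) / len(distances) if distances else 0.0,  # 4
        "busiest_day": days.most_common(1)[0],                                   # 5
        "top_origins": Counter(t["start_name"] for t in trips).most_common(3),   # 6
        "top_destinations": Counter(t["end_name"] for t in trips).most_common(3),
    }


# Made-up sample trips; gender uses the 0/1/2 codes from the field list.
TRIPS = [
    {"user_type": "Subscriber", "gender": 1, "birth_year": 1984,
     "start_time": datetime(2015, 6, 1, 8, 15),
     "start_name": "Station A", "start": (40.7600, -73.9900),
     "end_name": "Station B", "end": (40.7700, -73.9800)},
    {"user_type": "Subscriber", "gender": 2, "birth_year": 1990,
     "start_time": datetime(2015, 6, 1, 17, 40),
     "start_name": "Station A", "start": (40.7600, -73.9900),
     "end_name": "Station C", "end": (40.7500, -73.9950)},
    {"user_type": "Customer", "gender": 0, "birth_year": None,
     "start_time": datetime(2015, 6, 2, 9, 5),
     "start_name": "Station B", "start": (40.7700, -73.9800),
     "end_name": "Station B", "end": (40.7700, -73.9800)},
]

summary = summarize(TRIPS, ref_year=2015)
for name, value in summary.items():
    print(name, value)
```

Each statistic is a count keyed by one or more fields, which is why the same logic translates naturally to a map step that emits the key and a reduce step that sums. Note that a round trip that starts and ends at the same station contributes a distance of zero under this approximation.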
I hope this tutorial helps you. If you have any questions or problems, please let me know.
Happy Hadooping with Patrick.