Introduction
Regarding The Data
I liked the flexibility that rideshare platforms offer, so I took up rideshare driving while I was taking classes in the spring semester of 2017. The profits were OK, so I kept driving into the summer, where I discovered that weekends are a great source of revenue. Realizing this, I started driving full time in the summers, which I did for two years, in 2018 and 2019. I mostly drove for Uber, with Lyft as a sidekick when things got slow. In total, I drove 3,200+ trips with Uber and 300+ with Lyft. I unknowingly created a small data set that was too good to forget, so I didn't. I wrote some web scrapers to collect my driving information and processed the data, which can be seen below.
Stats for credibility
4.94
Star Rating
11/21/2017
Driver Since
94%
Acceptance
3%
Cancellation
3,119
Trips Over 2 Years
3,274
Total Trips
Tools Used
Uber makes it very hard to collect trip information. Each trip has its own HTML page, and the ordering of the statement data is inconsistent across months. Thus the first challenge was data collection. Test bots were created with Beautiful Soup and Selenium. In the end, I decided to go mainly with Selenium, as the HTML on Uber's site was too dynamic and sloppy to parse reliably any other way. With Selenium, I was able to collect the vast majority of my trips on the platform. The data was processed after collection and combined with statements to answer as many questions as possible. Pandas, Python, Excel, Jupyter Notebooks, & Visual Studio Code were used for this step, which was the most time-consuming aspect of the project. The final step in the project was constructing this website, which was wireframed on paper. The website itself was built from scratch using HTML5, CSS3, jQuery, Bootstrap 4, PHP, Chart.js, Typed.js, & Counter.js. It took one month of work to get to this point, and not all the questions I answered with the data set are posted on this website.
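To give a rough idea of what "combined with statements" can look like in Pandas, here is a minimal sketch. It is not my actual processing code: the column names (trip_id, trip_type, distance_mi, fare, tip) and the sample rows are made up for illustration, and it assumes that scraped trips and statement rows share a trip identifier.

```python
import pandas as pd

# Hypothetical scraped trip records; column names are illustrative assumptions.
trips = pd.DataFrame(
    {
        "trip_id": ["a1", "a2", "a3"],
        "trip_type": ["UberX", "Pool", "UberX"],
        "distance_mi": [4.2, 2.9, 11.5],
    }
)

# Hypothetical statement rows holding the payout details for each trip.
statements = pd.DataFrame(
    {
        "trip_id": ["a1", "a2", "a4"],
        "fare": [9.50, 6.25, 14.00],
        "tip": [2.00, 0.00, 3.00],
    }
)

# A left join keeps every scraped trip; indicator=True adds a "_merge" column
# that flags trips with no matching statement row.
combined = trips.merge(statements, on="trip_id", how="left", indicator=True)
unmatched = combined[combined["_merge"] == "left_only"]

print(combined[["trip_id", "trip_type", "fare", "tip"]])
print(f"Trips without a statement row: {len(unmatched)}")
```

Flagging unmatched rows this way makes it easy to spot trips the scraper collected that never showed up in a statement, which matters when the statement layout changes from month to month.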
Data Drawbacks
One of the main drawbacks is that this is only my data set: 35-40 hours spread over Fridays, Saturdays, & Sundays, mostly in the summer months. Thus I am missing the weekdays and the 8 non-summer months. Collecting other drivers' data would be a great solution to this problem, but I would need account access for my web scraper, which could potentially compromise their income source. I also don't know that many Uber drivers, so we'll have to settle for my data set. In a nutshell, I think that this is a great data set, but it's far from big enough to take extremely seriously. The data is presented as is, to the best of my data science/data mining abilities. All this data is pre-COVID-19, so things may be different out there.
General Trips
Trip Types
Profits
Tips
Distance
Duration
Surged Trips
Waiting
Long Pickups
Points
Most of these point averages were calculated excluding canceled trips. This is because canceled trips do not generate points for the driver. Calculations that include canceled trips are marked.
2.88
Points Per Trip (Including Canceled Trips)
3.0
Points Per Trip
3.01
Points Per UberX Trip
2.78
Points Per Pool Trip
2.99
Points Per Express Pool Trip
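The point averages above come down to a filter and a group-by. The sketch below shows the idea in Pandas with made-up sample rows, so it does not reproduce the numbers above. The column names (trip_type, points, canceled) and the zero points on the canceled row are assumptions for illustration.

```python
import pandas as pd

# Made-up sample rows for illustration only; not my real trip data.
trips = pd.DataFrame(
    {
        "trip_type": ["UberX", "UberX", "Pool", "Express Pool", "UberX"],
        "points": [3, 3, 1, 3, 0],
        "canceled": [False, False, False, False, True],
    }
)

# Keep only completed trips for the default averages.
completed = trips[~trips["canceled"]]

avg_all = trips["points"].mean()            # includes canceled trips
avg_completed = completed["points"].mean()  # excludes canceled trips
avg_by_type = completed.groupby("trip_type")["points"].mean()

print(f"Points per trip (including canceled): {avg_all:.2f}")
print(f"Points per trip: {avg_completed:.2f}")
print(avg_by_type.round(2))
```

Computing the two overall averages side by side makes it clear how much the canceled trips pull the number down.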
Zip Codes
Passenger Payments
Driver Fees
Canceled Trips
DIA Trips
Trip Types
Profits
Profit Breakdown
Tips
$17.20
Best App Based Tip
$2.00
Worst App Based Tip
Distance
Duration
Canceled Trips
Contact Me
Thank you for your interest!
I hope that I impressed you enough to reach out if you're interested in having me join your team or company. Feel free to browse my GitHub repositories, view and/or add me on LinkedIn, or view my latest resume by clicking the respective icons below. You can also send me a message using the contact form on this website. If you do decide to send a message using this website, keep an eye on the inbox of the email address you used.
Messages sent through the contact form for professional reasons are high priority and are answered within 24 hours. A confirmation email is sent out shortly after a message is sent.