Tableau for the Citi

Data extraction, normalization and analysis the NY Citi way

New York Citibike analysis

NY State government and Citibike combined forces to provide bike rentals throughout the Greater Manhattan area. Customer information such as gender, resident status, and pick-up and drop-off times and locations is available to developers for analysis.

This project illustrates how multiple tools can combine to grab, clean and visualize data. The data from the Citibike program is held on Amazon Web Services and was extracted and cleaned using the Python programming language. The analysis and visualizations were created using Tableau.
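As a rough illustration of the cleaning step, here is a minimal sketch using pandas. It is not the project's actual code (that lives in the GitHub repository). The column names (starttime, stoptime, start station name, usertype), the clean_trips helper and the two sample rows are assumptions made up for this example.

```python
import io

import pandas as pd


def clean_trips(raw: pd.DataFrame) -> pd.DataFrame:
    """Normalize a raw trip table (hypothetical helper, assumed column names)."""
    df = raw.copy()
    # Normalize headers to snake_case so they are easier to use downstream.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Parse timestamps; values that cannot be parsed become NaT.
    for col in ("starttime", "stoptime"):
        df[col] = pd.to_datetime(df[col], errors="coerce")
    # Drop rows with missing timestamps or a stop time before the start time.
    df = df.dropna(subset=["starttime", "stoptime"])
    df = df[df["stoptime"] >= df["starttime"]]
    return df.reset_index(drop=True)


# Tiny invented sample standing in for a downloaded trip file.
sample_csv = io.StringIO(
    "starttime,stoptime,start station name,usertype\n"
    "2017-01-01 08:05:00,2017-01-01 08:20:00,Station A,Subscriber\n"
    "not a date,2017-01-01 09:00:00,Station B,Customer\n"
)
cleaned = clean_trips(pd.read_csv(sample_csv))
print(cleaned)
```

A cleaned table like this can be written out with cleaned.to_csv(...) and then opened in Tableau as a text file data source.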

All project code and notes

Fork or clone from the GitHub repository.

A note about the visualizations

The visualizations are best viewed on a computer. They are hosted courtesy of Tableau Public, which makes interactivity and hosting easy, but load times are influenced by the user’s connection speed and the amount of data being pushed to the page. Read the paragraph at the bottom of each illustration page for further instructions on how to prevent the “spinning wheel” of a frozen web page.

Interactive Dashboard

Map analysis of the rides

Some general observations: the majority of the activity seems to take place in redevelopment and commercial areas of lower Manhattan, in Brooklyn near the bridges, and in Long Island City, Queens. Additionally, men and subscribers far outnumber their counterparts in regular use, and Citibike seems to be popular with people in their 30s.

Age by gender

Clearly the number one demographic for Citibike is late 20s/early 30s and male.

Popular start times

The most popular starting times seem to mimic a 9-5 work schedule. The pattern is not conclusive for several reasons, the most obvious being that I did not pull weekends out of the data. Perhaps if I had, the peaks would have been more extreme.
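One way to test the commuter hypothesis would be to count ride starts per hour separately for weekdays and weekends. The following is a minimal sketch of that split in pandas, not part of the original analysis. The function name hourly_starts_by_day_type and the sample timestamps are invented for illustration.

```python
import pandas as pd


def hourly_starts_by_day_type(starts: pd.Series) -> pd.DataFrame:
    """Count ride starts per hour, split into weekday and weekend columns."""
    starts = pd.to_datetime(starts)
    # dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
    day_type = (starts.dt.dayofweek >= 5).map({True: "weekend", False: "weekday"})
    frame = pd.DataFrame({"hour": starts.dt.hour, "day_type": day_type})
    # Rows are hours of the day, columns are day types, missing combos become 0.
    return (
        frame.groupby(["day_type", "hour"])
        .size()
        .unstack("day_type", fill_value=0)
    )


# Invented sample: three weekday starts and one Saturday start.
sample_starts = pd.Series(
    [
        "2017-01-02 08:10:00",  # Monday
        "2017-01-02 08:45:00",  # Monday
        "2017-01-03 17:30:00",  # Tuesday
        "2017-01-07 13:00:00",  # Saturday
    ]
)
counts = hourly_starts_by_day_type(sample_starts)
print(counts)
```

If the weekday column shows sharp peaks around the morning and evening commute while the weekend column is flatter, that would support the 9-5 reading.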

Popular start and stop locations

Popular start stations and ending stations are not the same.

New Year's Eve

Very few bikes were returned on New Year's Eve.

Is celebratory biking on the rise, or did people fail to return them?

Subscriber vs. non-subscriber visualization

Clearly Citibike has a strong subscriber base, which may be a side effect of high-density living. However, tourists and non-subscribing locals also ride, and together these customer groups support the popularity of this transportation program.

There is more to this story

The word “data” seems to be perceived as dry and dull, which I suppose it can be, but put into the context of humanity it is interesting. It helps to explain who we are, where we have been and where we are going. This project examines a relatively small timeline of a city with a big skyline. Several more years of program data are available to explore, with more tales to tell. That being said, this analysis offers some general conclusions about age, gender and popular locations that likely serve as a springboard for comparison and a deeper understanding of how New Yorkers interact with their surroundings.

Sheri Rosalia | Data Engineer

Data Engineer | Data Analyst | Data Scientist