Hi, I'm Mark, the creator of this website. From my photo, you can tell a couple of things about me. One, I like to bicycle, even in the winter. And two, I'm cheap, mighty cheap. That Schwinn LeTour bike you see there is over thirty years old -- bought it used back in 1995. Set me back only $60. Yes, I'm cheap.
I hold a Ph.D. in Economics from the University of Michigan. That does not mean I know anything, though. It's just a credential that allowed me to pursue a career as an economics professor.
Why did I develop this website? Well, I find value in promoting commuter cycling, only partly for environmental reasons, though. I like to help people live a frugal lifestyle so they won't have to worry about financial security so much, and the personal automobile is the worst money-pit human beings have ever created. I'm an advocate of physical fitness, too. And nothing is better for the body than bicycling to work instead of driving. If I can put my data analytics skills to work to promote commuter cycling, I'll take the opportunity.
Machine learning has certainly captured people's imagination in recent years. Its accomplishments in various domains have indeed been impressive. To my knowledge, no one has yet examined whether machine learning can outperform traditional statistical techniques -- think regression analysis -- at analyzing the impact of various geographic and demographic neighborhood characteristics on commuters' propensity to bicycle to work, or at predicting where the construction of bicycle infrastructure might best be placed in order to encourage commuter cycling.
On this website, I present data analyses I've run on this issue using some of the old-fashioned statistical tools, namely multivariate regression, Tobit regression analysis, and spatially autocorrelated regression modeling. I compare the effectiveness of these approaches with that of two machine learning algorithms, a random forest regressor and a dense neural network.
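For readers who haven't met Tobit regression before, here is a minimal sketch of the log-likelihood it maximizes. It is an illustration only, not the code behind the analyses on this website. It assumes the outcome is left-censored at zero, which seems a natural fit for a cycling share that cannot drop below zero. The function names, the `lower` parameter, and the toy numbers are all hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_log_likelihood(beta, sigma, X, y, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    Latent model: y* = x'beta + e, with e ~ N(0, sigma^2).
    Observed:     y  = max(lower, y*).
    """
    total = 0.0
    for row, yi in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, row))
        if yi <= lower:
            # Censored observation: we only know that y* <= lower.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored observation: ordinary normal density of the residual.
            total += math.log(normal_pdf((yi - mu) / sigma) / sigma)
    return total


# Toy data (made up): an intercept column plus one neighborhood characteristic.
X = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.9]]
y = [0.0, 0.03, 0.08]  # share of commuters cycling; the first is censored at 0
ll = tobit_log_likelihood(beta=[-0.02, 0.1], sigma=0.05, X=X, y=y)
```

An optimizer would search over `beta` and `sigma` for the values that make this quantity as large as possible. Ordinary regression, by contrast, treats the zeros as if they were exact measurements, which is the main reason to consider a Tobit model when many neighborhoods report no cycling commuters at all.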
The data I use come from the 2018 New Zealand Census, more specifically the census data pertaining to Auckland. I also use shapefiles provided by Statistics New Zealand, along with QGIS and GeoDa, two GIS software packages, to compile the geographic variables I use in my analyses.
Although the data pertain to one specific locale, namely Auckland, anyone interested in the possible application of machine learning to urban bicycle infrastructure design, wherever they live, should find value in the analyses I perform. Again, as I mentioned on the first webpage, what you are seeing is definitely a work in progress, and I hope it will improve as I update this website.
Let me give you a few spoiler alerts. One, the random forest regressor does a better job at predicting commuters' propensity to bicycle to work than the other methods I try. Two, no matter how we analyze the data, it appears that, at the aggregate geographic level of the city of Auckland, accessibility to bicycle infrastructure has only a very small impact on commuters' propensity to bicycle to work. (This may stem from the design of the network: it is fragmented and does not appear to integrate centers of employment with residential areas well. You will see this on maps appearing on other pages of this website.) Sure, cyclists are using the infrastructure to commute to work, but most of these cyclists would still be cycling to work if the infrastructure were not there. On a more encouraging note, a machine learning analysis, more specifically a random forest regressor, when combined with a bit of GIS mapping, reveals pockets of the city where the addition of bicycle infrastructure might yield an appreciable increase in commuter cycling.