Mapping the trashcan landscape in NYC with QGIS and OpenData

A map of how far you are from a trashcan on the sidewalk in NYC

An evergreen-stage note

Last tended Feb 22, 2025. Originally posted Feb 21, 2025.

Psychology studies have shown that distance to the nearest litter basket and the presence of existing litter are the biggest predictors of littering behavior[1]. The litter in my neighborhood of east Bushwick has bothered me since I moved here, and I noticed that one has to walk many blocks to find the nearest litter basket (while, for example, carrying your new puppy's poop bag every morning and afternoon). So I decided to make a map about it.

The map explained

I generated the following thematic maps in QGIS, in order of smallest-to-largest scale, and set them up to appear and disappear at appropriate zoom levels on an overall map.

First, I created a raster map that shows the distance to the nearest litter basket registered in DSNY's Litter Basket Inventory[2]. This map is clipped to only include distance data where it overlaps with the NYC sidewalk inventory data[3]. I reprojected these maps to the State Plane New York / Long Island coordinate system so that the distance values are in feet. The raster data set is generated at a resolution of 11ft^2. You won't see this map's data until you zoom in a bit closer to street level.

The first thing you'll notice on the map is a color-scale (or choropleth) map of the 2020 US Census tract-level data[4][5] for NYC, which shows the result of QGIS's Zonal Statistics being performed on my sidewalk distance raster map, with the color indicating the median sidewalk distance to a litter basket for that census tract. This gives some sense of how far a citizen might have to walk with a piece of litter.
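As a rough illustration of what a zonal median does, here is a minimal sketch in plain Python. It is not how QGIS implements Zonal Statistics (QGIS works with polygon zones over a raster); the toy grids, the `zonal_median` helper, and the tract ids "A" and "B" are all made up for the example.

```python
from statistics import median

# Hypothetical toy data: a 3x4 distance raster (in feet) and a matching grid
# of zone ids standing in for census tracts. None marks NoData cells, i.e.
# cells that are not sidewalk.
distance_ft = [
    [0.0, 11.0, 22.0, None],
    [11.0, None, 33.0, 44.0],
    [22.0, 33.0, None, 55.0],
]
tract_id = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
]


def zonal_median(values, zones):
    """Group cell values by zone id and return the median value per zone."""
    by_zone = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is not None:  # skip NoData cells
                by_zone.setdefault(zone, []).append(value)
    return {zone: median(cells) for zone, cells in by_zone.items()}


medians = zonal_median(distance_ft, tract_id)
print(medians)
```

Each tract ends up with one number, which is what the choropleth colors encode.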

In addition, at the zoom level where the distance raster is visible, you'll see that this map has a custom label which shows a count of how many litter baskets were recorded in the data set, as well as the tract's population divided by that number of litter baskets. This gives a bit more of a sense of the "level of service" for residents of that tract, although it has the major limitation of not accounting for actual daily foot traffic through a given census tract.
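The label arithmetic is simple enough to sketch. The function names, the label wording, and the sample numbers below are hypothetical, and returning no ratio for a tract with zero baskets is an assumption (how the map itself handles that case is not stated here).

```python
def people_per_basket(population, basket_count):
    """Return residents per litter basket, or None when a tract has no baskets."""
    if basket_count == 0:
        # Assumption: skip the ratio rather than divide by zero.
        return None
    return population / basket_count


def tract_label(population, basket_count):
    """Build a label with the basket count and the people-per-basket ratio."""
    ratio = people_per_basket(population, basket_count)
    if ratio is None:
        return f"{basket_count} baskets, no ratio"
    return f"{basket_count} baskets, {ratio:.0f} people per basket"


# Made-up tracts, not real census figures.
print(tract_label(4200, 12))
print(tract_label(3100, 0))
```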

How to make this map

If you're not familiar with NYC OpenData and you've read this far, familiarize yourself with it. It's such an incredible service to the city, and the only reason I thought I might actually be able to make this map. I searched "trash can" on there and, of course, found "DSNY Litter Basket Inventory"[2]. I used the Export button to export it in ESRI Shapefile format because I recognized that file type from college.

Next, I re-downloaded QGIS, which is the Blender of the geospatial world and deserves your $10 donation before you download it. I dragged that data set into a new project and saw a bunch of dots that resembled New York. Good start.

Now I worked out what my goal for the map was. That thought process went something like this: I would love to show, for every small local area, how many litter baskets it has available. Oh wait, there's a data set called "NYC Planimetric Database: Sidewalk"[3] on OpenData. What if I could get a map of all the sidewalks in NYC in terms of the distance to the nearest litter basket? Then I can do analysis on that distance map, like split it by "2020 Census Tracts."[4]

With that goal in mind and some data sets in hand, I muddled through figuring out the steps towards it. I worked these out one at a time, searching around the web for how to do each one, so the following list is more direct than my explorations were.

Step 0: reproject your input data sets to a coordinate system that uses the unit of measure you want your distances to be in. I did this pipeline without reprojecting first and ended up with distances in degrees of latitude and longitude, without any clear way to get them into the feet we silly Americans use. Reprojecting to the US State Plane system, which I searched around to find, did the trick.
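To see why distances in degrees are hard to turn into feet, here is a back-of-the-envelope sketch. It assumes a spherical Earth with roughly 111,320 meters per degree of latitude and a latitude of about 40.7 degrees for NYC; both constants are approximations picked for illustration, not values from the data sets.

```python
import math

FEET_PER_METER = 1 / 0.3048          # international foot
METERS_PER_DEGREE_LAT = 111_320.0    # rough spherical-Earth average (assumption)
NYC_LATITUDE_DEG = 40.7              # approximate latitude of New York City


def degree_lengths_ft(latitude_deg):
    """Return the approximate length in feet of one degree of latitude and
    one degree of longitude at the given latitude."""
    lat_ft = METERS_PER_DEGREE_LAT * FEET_PER_METER
    # Meridians converge toward the poles, so a degree of longitude shrinks
    # by the cosine of the latitude.
    lon_ft = lat_ft * math.cos(math.radians(latitude_deg))
    return lat_ft, lon_ft


lat_ft, lon_ft = degree_lengths_ft(NYC_LATITUDE_DEG)
print(f"1 degree of latitude  is about {lat_ft:,.0f} ft")
print(f"1 degree of longitude is about {lon_ft:,.0f} ft")
```

Because a degree covers a different ground distance north-south than east-west, a distance computed in degrees has no single conversion factor to feet. A projected system measured in feet avoids the problem.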

  1. Use the "Rasterize (vector to raster)" tool[1] to convert the vector points in the litter basket inventory to a raster data set, burning in the value 1 wherever a litter basket point is. This is commonly called a "raster mask".
  2. Use the "Proximity (raster distance)" tool, with a target pixel value of 1, to generate a map of the distance from every point in the NYC area to the nearest litter basket.
  3. Make a raster mask of the sidewalks: use the "Rasterize (vector to raster)" tool to convert the vector polygons in the sidewalk data set to a raster data set, burning in the value 1 wherever a sidewalk is.
  4. Use the "Raster analysis > raster calculator" tool to get the intersection of the litter basket distance map and the sidewalk map. You can get the intersection by multiplying these two maps together: "nyc_sidewalks_rasterized@1" * "nyc_litter_basket_distance@1". I used the census tracts map to set the new map's extents, so there wouldn't be any issues in the later steps. This is the target map we wanted!
  5. Use the "Raster analysis > zonal statistics" tool with the census tracts as the input layer and our sidewalk litter basket distance map as our raster layer. I set it to calculate the median, mean, standard deviation, min, max, range, and variance for each census tract.
  6. Show the layers created in steps 4 and 5 with the original litter basket data on top, and apply color scales to show the variation.
  7. Use the "Generate XYZ tiles (MBTiles)" command to generate a tiled version of the map that can be uploaded to a web tool.
  8. Upload the data set to a tool like Mapbox Studio and follow its user guides to get a publishable, embeddable map like the one you see above.
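Steps 1 through 4 above can be mimicked on a tiny grid in plain Python. This is only a sketch of the idea, not what the QGIS tools do internally: the helper names, the 3x3 grid, the basket location, and the 11 ft cell size are all assumptions for illustration, and the brute-force distance search would be far too slow for a city-sized raster.

```python
import math

CELL_FT = 11  # hypothetical cell size in feet


def rasterize_points(points, rows, cols):
    """Step 1: burn the value 1 into every cell that holds a point."""
    mask = [[0] * cols for _ in range(rows)]
    for r, c in points:
        mask[r][c] = 1
    return mask


def proximity(mask, target=1, cell_size=1.0):
    """Step 2: distance from each cell to the nearest cell equal to target."""
    targets = [(r, c) for r, row in enumerate(mask)
               for c, value in enumerate(row) if value == target]
    if not targets:
        raise ValueError("mask contains no target cells")
    return [[min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
             for c in range(len(row))]
            for r, row in enumerate(mask)]


def multiply(raster_a, raster_b):
    """Step 4: cell-by-cell product, as in the raster calculator expression."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]


# One hypothetical basket in the top-left corner of a 3x3 grid.
baskets = rasterize_points([(0, 0)], rows=3, cols=3)
distance = proximity(baskets, target=1, cell_size=CELL_FT)

# Step 3 stand-in: a hand-written sidewalk mask (top row and left column).
sidewalks = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

sidewalk_distance = multiply(sidewalks, distance)
for row in sidewalk_distance:
    print(row)
```

One thing to watch with the multiplication approach: cells that are not sidewalk come out as 0, the same value as a cell that contains a basket. It may be worth treating those cells as NoData before computing statistics such as the median, so they do not pull the numbers down.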

Conclusion

I plan on writing separately on the analysis of this map, so I'll leave it with as few editorial additions as I can. Litter is a design problem. NYC is a very complex and interconnected city, and it's great that it provides services like litter baskets and the data around those litter baskets for all of us to benefit from. I think that with a clearer sense of where the longest walks between litter baskets remain in the city, we can be efficient in the work to eliminate litter and its myriad negative knock-on effects on the city.


  1. Schultz, P. W., Bator, R. J., Large, L. B., Bruni, C. M., & Tabanico, J. J. (2013). Littering in Context: Personal and Environmental Predictors of Littering Behavior. Environment and Behavior, 45(1), 35-59. https://doi.org/10.1177/0013916511412179

  2. DSNY Litter Basket Inventory. NYC Open Data. https://data.cityofnewyork.us/Environment/DSNY-Litter-Basket-Inventory/8znf-7b2c/about_data. Accessed 2025-02-22.

  3. NYC Planimetric Database: Sidewalk. NYC Open Data. https://data.cityofnewyork.us/City-Government/NYC-Planimetric-Database-Sidewalk/vfx9-tbb6. Accessed 2025-02-22.

  4. 2020 Census Tracts. NYC Open Data. https://data.cityofnewyork.us/City-Government/2020-Census-Tracts/63ge-mke6/about_data. Accessed 2025-02-22.

  5. 2020 Census Data-census tracts & higher. NYC Department of City Planning. https://www.nyc.gov/site/planning/planning-level/nyc-population/2020-census.page. Accessed 2025-02-22.
