2020 SciPy - Jacob Dice

Geospatial analysis of Housing Stock at Risk of Sea-Level Rise

In the jupyter notebook below, I describe a research project I've been working with colleagues at the Federal Reserve Bank of Kansas City and NOAA (David Rodziewicz, Christopher Amante, and Eugene Wahl).

  • Please start with the embedded video
  • Read through the code example
  • Check out our recently-posted research working paper (link)
  • If you have any follow-up questions, don't hesitate to email me at: Jacob.Dice@kc.frb.org
In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo('vk7ZmUXINPg', width=900, height=600)
Out[1]:
In [2]:
from pathlib import Path
import rasterio as rio
import fiona
import geopandas as gpd
import pandas as pd
import numpy as np
from geopandas.tools import sjoin
import matplotlib.pyplot as plt
import re
import operator

To determine the effects of sea level rise at a property level, we first have to identify the exact elevation each property in our sample will go underwater. To do this, we use two ways of deriving the inundation elevation of a property:

  • DEM (Digital Elevation Model) Elevation
  • Shapefile Elevation

The DEM file is a raster file (tiled data) that defines the elevation for any given point on the coastline. The shapefile is a series of polygons describing the coastline boundary at 0-10 ft.

The shapefile, while a less granular match (only 1ft increments), was computed accounting for hydrological connectivity. Essentially, if there are natural barriers/dams preventing the water from flowing inland, the shapefile correctly identifies these regions as not-inundated.

To optimize for match granularity and fidelity, we match to both files. If the DEM elevation resides within the 1ft tolerance of the shapefile, we inundate at the DEM elevation. If the shapefile indicates that the property will innundate at a higher elevation than the DEM elevation (indicating hydrologic connectivity is at play), we inundate the property at the shapefile elvation since we can't estimate a better proxy.

Below, I show how to use rasterio and geopandas to perform these matches on 3 made-up properties in Florida. In our research working paper (Kansas City Federal Reserve), our sample was ~5.4M properties across 15 Metros.

In [3]:
#Specify file paths to DEM file:
#For the research paper, we use a variant of what can be downloaded at: https://coast.noaa.gov/slrdata/
DEM_FILE_PATH = Path('../data/FL_MFL_2_GCS_5m_MHHWft.tif')

#Specify file paths to SHP file:
#Source: https://coast.noaa.gov/slrdata/
SHP_FILE_PATH = Path('../data/FL_MFL_slr_data_dist/FL_MFL_slr_final_dist.gdb')
In [4]:
#Generate dataframe of made-up properties on the coastline of FL.
houses = pd.DataFrame({'house_id': ['a','b','c'],
                       'lat': [25.772332, 25.974418 , 25.945423],
                       'lon': [-80.206236, -80.133745, -80.141970],
                      })

#Convert to geopandas dataframe (will make spatial joins easier)
houses = gpd.GeoDataFrame(houses,
                          geometry=gpd.points_from_xy(houses['lon'], houses['lat']))
#set coordinate-reference-system metadata in geopandas
houses.crs = "EPSG:4326"
houses
Out[4]:
house_id lat lon geometry
0 a 25.772332 -80.206236 POINT (-80.20624 25.77233)
1 b 25.974418 -80.133745 POINT (-80.13375 25.97442)
2 c 25.945423 -80.141970 POINT (-80.14197 25.94542)
In [5]:
#state shapefile (only to generate map below): https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_5m.zip
states = gpd.read_file('../data/state_shapefile/cb_2018_us_state_5m.shp')
states = states[states['STUSPS']=='FL']
states
Out[5]:
STATEFP STATENS AFFGEOID GEOID STUSPS NAME LSAD ALAND AWATER geometry
38 12 00294478 0400000US12 12 FL Florida 00 138949136250 31361101223 MULTIPOLYGON (((-80.75164 24.85725, -80.72906 ...
In [6]:
fig, ax = plt.subplots(ncols=2, figsize=(10,5))
states.boundary.plot(ax=ax[0])
houses.plot(color='green', alpha=1, markersize=80, ax=ax[0])

states.boundary.plot(ax=ax[1])
houses.plot(color='green', alpha=1, markersize=10, ax=ax[1])
ax[1].set_xlim(xmin=-81, xmax=-79.5)
ax[1].set_ylim(ymin=25.2, ymax=26.5)
ax[0].set_title('Florida Coastline')
ax[1].set_title('Miami - 3 made up properties')
plt.show()

Query DEM height (rasterio)

In [7]:
coord_tuples = [tuple(x) for x in houses[['lon','lat']].values]

#generate list of sampled elevations from raster file.
with rio.open(DEM_FILE_PATH) as dem:
    #DEM is only a single layer, so we just need the 0th index
    dem_heights = [i[0] for i in dem.sample(coord_tuples)]
    
houses['DEM'] = dem_heights

houses
Out[7]:
house_id lat lon geometry DEM
0 a 25.772332 -80.206236 POINT (-80.20624 25.77233) 2.363548
1 b 25.974418 -80.133745 POINT (-80.13375 25.97442) 4.911385
2 c 25.945423 -80.141970 POINT (-80.14197 25.94542) 5.717912

Query Shapefile height (geopandas)

In [8]:
layers = fiona.listlayers(SHP_FILE_PATH)
print(layers)
['FL_MFL1_low_0ft', 'FL_MFL1_low_10ft', 'FL_MFL1_low_1ft', 'FL_MFL1_low_2ft', 'FL_MFL1_low_3ft', 'FL_MFL1_low_4ft', 'FL_MFL1_low_5ft', 'FL_MFL1_low_6ft', 'FL_MFL1_low_7ft', 'FL_MFL1_low_8ft', 'FL_MFL1_low_9ft', 'FL_MFL1_slr_0ft', 'FL_MFL1_slr_10ft', 'FL_MFL1_slr_1ft', 'FL_MFL1_slr_2ft', 'FL_MFL1_slr_3ft', 'FL_MFL1_slr_4ft', 'FL_MFL1_slr_5ft', 'FL_MFL1_slr_6ft', 'FL_MFL1_slr_7ft', 'FL_MFL1_slr_8ft', 'FL_MFL1_slr_9ft', 'FL_MFL2_low_0ft', 'FL_MFL2_low_10ft', 'FL_MFL2_low_1ft', 'FL_MFL2_low_2ft', 'FL_MFL2_low_3ft', 'FL_MFL2_low_4ft', 'FL_MFL2_low_5ft', 'FL_MFL2_low_6ft', 'FL_MFL2_low_7ft', 'FL_MFL2_low_8ft', 'FL_MFL2_low_9ft', 'FL_MFL2_slr_0ft', 'FL_MFL2_slr_10ft', 'FL_MFL2_slr_1ft', 'FL_MFL2_slr_2ft', 'FL_MFL2_slr_3ft', 'FL_MFL2_slr_4ft', 'FL_MFL2_slr_5ft', 'FL_MFL2_slr_6ft', 'FL_MFL2_slr_7ft', 'FL_MFL2_slr_8ft', 'FL_MFL2_slr_9ft']
In [9]:
#Typically this is a fast process.
#In this case, the coastline polygons are very complex, and take a few minutes per layer to load into memory

for layer in layers:
 if "low" not in layer:
    print("Merging: {}".format(layer))
    gdf = gpd.read_file(SHP_FILE_PATH,
        driver='FileGDB',
        layer=layer)
    
    gdf = gdf.to_crs("EPSG:4326")
    
    gdf[layer] = True
    gdf = gdf[[layer, 'geometry']]
    houses = sjoin(houses, gdf, how ='left', op='within').drop(['index_right'], axis=1)
Merging: FL_MFL1_slr_0ft
Merging: FL_MFL1_slr_10ft
Merging: FL_MFL1_slr_1ft
Merging: FL_MFL1_slr_2ft
Merging: FL_MFL1_slr_3ft
Merging: FL_MFL1_slr_4ft
Merging: FL_MFL1_slr_5ft
Merging: FL_MFL1_slr_6ft
Merging: FL_MFL1_slr_7ft
Merging: FL_MFL1_slr_8ft
Merging: FL_MFL1_slr_9ft
Merging: FL_MFL2_slr_0ft
Merging: FL_MFL2_slr_10ft
Merging: FL_MFL2_slr_1ft
Merging: FL_MFL2_slr_2ft
Merging: FL_MFL2_slr_3ft
Merging: FL_MFL2_slr_4ft
Merging: FL_MFL2_slr_5ft
Merging: FL_MFL2_slr_6ft
Merging: FL_MFL2_slr_7ft
Merging: FL_MFL2_slr_8ft
Merging: FL_MFL2_slr_9ft
In [10]:
# There is now a column for each of the shapefile coastline polygons.
# The values are true/nan value if the property was/wasn't contained within the given coastline elevation
print(houses)
  house_id        lat        lon                    geometry       DEM  \
0        a  25.772332 -80.206236  POINT (-80.20624 25.77233)  2.363548   
1        b  25.974418 -80.133745  POINT (-80.13375 25.97442)  4.911385   
2        c  25.945423 -80.141970  POINT (-80.14197 25.94542)  5.717912   

  FL_MFL1_slr_0ft FL_MFL1_slr_10ft FL_MFL1_slr_1ft FL_MFL1_slr_2ft  \
0             NaN              NaN             NaN             NaN   
1             NaN              NaN             NaN             NaN   
2             NaN              NaN             NaN             NaN   

  FL_MFL1_slr_3ft  ... FL_MFL2_slr_10ft FL_MFL2_slr_1ft FL_MFL2_slr_2ft  \
0             NaN  ...             True             NaN             NaN   
1             NaN  ...             True             NaN             NaN   
2             NaN  ...             True             NaN             NaN   

  FL_MFL2_slr_3ft FL_MFL2_slr_4ft FL_MFL2_slr_5ft FL_MFL2_slr_6ft  \
0             NaN             NaN             NaN             NaN   
1             NaN             NaN            True            True   
2             NaN             NaN             NaN            True   

   FL_MFL2_slr_7ft FL_MFL2_slr_8ft FL_MFL2_slr_9ft  
0             True            True            True  
1             True            True            True  
2             True            True            True  

[3 rows x 27 columns]
In [11]:
#make list of tuples of form: (shapefile name, elevation)

elev_filter = re.compile(r'^.*_(\d{1,2})ft$')
elev_cols = list(filter(elev_filter.search, list(houses)))
elev_heights = [int(re.search(r'^.*_(\d{1,2})ft$',nm).group(1)) for nm in elev_cols]
elev_zip = sorted(zip(elev_cols, elev_heights), key=operator.itemgetter(1))
print("Elevation heights found: {}".format(elev_zip))
Elevation heights found: [('FL_MFL1_slr_0ft', 0), ('FL_MFL2_slr_0ft', 0), ('FL_MFL1_slr_1ft', 1), ('FL_MFL2_slr_1ft', 1), ('FL_MFL1_slr_2ft', 2), ('FL_MFL2_slr_2ft', 2), ('FL_MFL1_slr_3ft', 3), ('FL_MFL2_slr_3ft', 3), ('FL_MFL1_slr_4ft', 4), ('FL_MFL2_slr_4ft', 4), ('FL_MFL1_slr_5ft', 5), ('FL_MFL2_slr_5ft', 5), ('FL_MFL1_slr_6ft', 6), ('FL_MFL2_slr_6ft', 6), ('FL_MFL1_slr_7ft', 7), ('FL_MFL2_slr_7ft', 7), ('FL_MFL1_slr_8ft', 8), ('FL_MFL2_slr_8ft', 8), ('FL_MFL1_slr_9ft', 9), ('FL_MFL2_slr_9ft', 9), ('FL_MFL1_slr_10ft', 10), ('FL_MFL2_slr_10ft', 10)]
In [12]:
#Find most restrictive shapefile match by iterating through the sorted list of elevations.
#Update column shapefile_elevation to the lowest elevation match

houses['shapefile_elevation'] = np.nan

for shape_name, elev in elev_zip:
    #This should work for shapefiles with multiple shapes with same elevation. Only update if it's nan or elev is less than current set value
    houses.loc[(houses[shape_name] == True) &
            ((np.isnan(houses["shapefile_elevation"])) | (elev < houses["shapefile_elevation"])),
            'shapefile_elevation'] = elev
    
houses = houses.drop(columns=elev_cols)
houses
Out[12]:
house_id lat lon geometry DEM shapefile_elevation
0 a 25.772332 -80.206236 POINT (-80.20624 25.77233) 2.363548 7.0
1 b 25.974418 -80.133745 POINT (-80.13375 25.97442) 4.911385 5.0
2 c 25.945423 -80.141970 POINT (-80.14197 25.94542) 5.717912 6.0
In [13]:
houses['inundation_elevation'] = np.where(houses['DEM'] > (houses['shapefile_elevation']-1),
                                          houses['DEM'],
                                          houses['shapefile_elevation'])

houses
Out[13]:
house_id lat lon geometry DEM shapefile_elevation inundation_elevation
0 a 25.772332 -80.206236 POINT (-80.20624 25.77233) 2.363548 7.0 7.000000
1 b 25.974418 -80.133745 POINT (-80.13375 25.97442) 4.911385 5.0 4.911385
2 c 25.945423 -80.141970 POINT (-80.14197 25.94542) 5.717912 6.0 5.717912

Results:

At this point, we have mapped the elevation of inundation for each property in the sample. We can proceed to applying a variety of sea level rise models, however in our paper we highlight a model by Robert Kopp in a 2014 paper. Please see our research working paper for more information: https://www.kansascityfed.org/publications/research/rwp

In [ ]: