Vector Analysis
Vector GIS analysis in WbW-R covers attribute management, geometric measurement, shape analysis, spatial overlay, proximity tools, spatial joins, and vector-to-raster conversion. All computation runs in the Whitebox backend through wbw_<tool>(...) wrappers (or wbw_run_tool(...) when tool IDs are dynamic); R handles session management, sequencing, and result processing.
Core Concepts
Vector analysis relies on the following concepts:
- Feature geometry: Points (single coordinate pairs), lines (ordered sequences of coordinate pairs), and polygons (rings of coordinates forming closed boundaries). Each feature type supports different analyses.
- Topology: The spatial relationships between features (adjacency, containment, intersection). Topological errors (overshoots, undershoots, self-intersections) corrupt overlay operations and topology queries.
- Attribute table: The database associated with each feature layer, carrying descriptive fields and values. Attribute queries filter features; joins link external tables.
- Spatial index: Internal index structure (R-tree or quadtree) enabling fast spatial queries. Used by intersection, containment, and proximity operations. Always build spatial indices on frequently queried layers.
- Envelope (bounding box): The minimum rectangular boundary of a feature or layer; used for quick spatial culling before geometry tests.
- Buffer: A polygon created at a fixed distance around a feature. Buffers model proximity zones and are fundamental to distance-based analysis.
- Overlay (intersection, union, difference): Combining two polygon layers to create a new layer. Union merges boundaries; intersection keeps overlapping area only; difference removes one layer from another.
- Dissolve (aggregation): Merging adjacent features with identical attribute values, creating larger aggregate features. Reduces feature count and simplifies geometry.
- Spatial join: Associating features from one layer with features from another based on spatial relationship (overlap, containment, proximity). Assigns attributes across layers.
- Proximity analysis: Finding nearest features, distances, or connectivity. Foundation for network analysis, market analysis, and accessibility studies.
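The envelope concept above can be illustrated with a minimal base-R sketch. It is not part of the WbW-R API: `envelope()` and `envelopes_overlap()` are hypothetical helper names, and the polygons are made-up coordinate matrices. The idea is that a cheap rectangle comparison rules out feature pairs that cannot intersect before any exact geometry test runs.

```r
# Envelope (bounding box) of a coordinate matrix with columns x and y.
envelope <- function(xy) {
  c(xmin = min(xy[, 1]), ymin = min(xy[, 2]),
    xmax = max(xy[, 1]), ymax = max(xy[, 2]))
}

# Two envelopes overlap unless one lies entirely to one side of the other.
envelopes_overlap <- function(a, b) {
  a[["xmin"]] <= b[["xmax"]] && a[["xmax"]] >= b[["xmin"]] &&
    a[["ymin"]] <= b[["ymax"]] && a[["ymax"]] >= b[["ymin"]]
}

# Illustrative polygons (vertex lists, first vertex not repeated).
poly_a <- cbind(x = c(0, 4, 4, 0), y = c(0, 0, 3, 3))
poly_b <- cbind(x = c(3, 6, 6, 3), y = c(2, 2, 5, 5))
poly_c <- cbind(x = c(10, 12, 11), y = c(10, 10, 12))

# Only pairs that pass this test need an exact intersection check.
candidates_ab <- envelopes_overlap(envelope(poly_a), envelope(poly_b))
candidates_ac <- envelopes_overlap(envelope(poly_a), envelope(poly_c))
print(c(a_vs_b = candidates_ab, a_vs_c = candidates_ac))
```

An envelope overlap only says that two features might intersect; the exact geometry test still decides.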
Session Setup
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/vector')
Reading and Writing Vectors
polys <- wbw_read_vector('parcels.shp')
lines <- wbw_read_vector('roads.shp')
points <- wbw_read_vector('samples.shp')
meta <- polys$metadata()
cat('Feature count:', meta$num_features, '\n')
cat('Geometry type:', meta$geom_type, '\n')
cat('CRS:', meta$wkt, '\n')
Attribute Management
# Add a new field
wbw_add_field(i = polys$file_path(),
output = 'parcels_v2.shp',
field_name = 'AREA_HA',
field_type = 'Float')
# Delete an unwanted field
wbw_delete_field(i = 'parcels_v2.shp',
output = 'parcels_v3.shp',
field_name = 'OLD_FIELD')
# Rename a field
wbw_rename_field(i = 'parcels_v3.shp',
output = 'parcels_v4.shp',
input_field = 'AREA_HA',
output_field = 'HECTARES')
# Extract features by attribute value
wbw_extract_by_attribute(i = polys$file_path(),
output = 'large_parcels.shp',
field = 'AREA_M2',
filter_value = 10000.0,
filter_type = 'Greater Than')
Geometric Measurements
# Add area, perimeter, and basic shape metrics to polygon attribute table
wbw_add_polygon_coordinates_to_table(i = polys$file_path(),
output = 'parcels_geom.shp')
# Polygon shape indices: compactness, elongation, related circumscribing circle, orientation, radius of gyration
wbw_compactness_ratio(i = polys$file_path(), output = 'parcels_compact.shp')
wbw_elongation_ratio(i = polys$file_path(), output = 'parcels_elong.shp')
wbw_related_circumscribing_circle(i = polys$file_path(), output = 'parcels_rcc.shp')
wbw_patch_orientation(i = polys$file_path(), output = 'parcels_orient.shp')
wbw_radius_of_gyration(i = polys$file_path(), output = 'parcels_rog.shp')
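To make the shape indices above more concrete, here is a base-R sketch of one common compactness definition: polygon area divided by the area of a circle with the same perimeter, which simplifies to 4πA/P². This is an assumption for illustration; the formula used by the backend tool may differ, and the helper names are hypothetical.

```r
# Shoelace formula: area of a simple polygon from its vertex list
# (first vertex not repeated at the end).
polygon_area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  x_next <- c(x[-1], x[1])
  y_next <- c(y[-1], y[1])
  abs(sum(x * y_next - x_next * y)) / 2
}

# Perimeter: sum of edge lengths, including the closing edge.
polygon_perimeter <- function(xy) {
  dx <- diff(c(xy[, 1], xy[1, 1]))
  dy <- diff(c(xy[, 2], xy[1, 2]))
  sum(sqrt(dx^2 + dy^2))
}

# Assumed definition: 1 for a circle, smaller for elongated or ragged shapes.
compactness <- function(xy) {
  4 * pi * polygon_area(xy) / polygon_perimeter(xy)^2
}

# Two parcels with the same area (10,000 m2) but different shapes.
square <- cbind(x = c(0, 100, 100, 0), y = c(0, 0, 100, 100))
strip  <- cbind(x = c(0, 400, 400, 0), y = c(0, 0, 25, 25))

print(c(square = compactness(square), strip = compactness(strip)))
```

Both parcels cover one hectare, yet the strip scores far lower, which is the kind of difference a shape index is meant to expose.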
Geometric Operations
# Centroids
wbw_centroid_vector(i = polys$file_path(), output = 'centroids.shp')
# Convex hull
wbw_convex_hull(i = polys$file_path(), output = 'convex_hulls.shp')
# Minimum bounding envelopes
wbw_minimum_bounding_envelope(i = polys$file_path(), output = 'mbe.shp')
# Smooth vector polygons
wbw_smooth_vectors(i = polys$file_path(), output = 'parcels_smooth.shp', filter = 5)
# Simplify features (Douglas-Peucker)
wbw_simplify_line_or_polygon(i = polys$file_path(), output = 'parcels_simplified.shp',
dist = 5.0, remove_spurs = TRUE, errors_only = FALSE)
# Dissolve polygons on field value
wbw_dissolve(i = polys$file_path(), output = 'parcels_dissolved.shp',
field = 'LAND_USE', snap_tol = 0.001)
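The simplification step above names Douglas-Peucker. As a rough illustration of that algorithm (a textbook base-R sketch, not the backend's implementation), the function below keeps the two endpoints of a line, finds the vertex farthest from the chord between them, and recurses only if that distance exceeds the tolerance. The function names and sample coordinates are hypothetical.

```r
# Distance from point p to the segment a-b (all length-2 numeric vectors).
point_segment_distance <- function(p, a, b) {
  ab <- b - a
  len2 <- sum(ab^2)
  if (len2 == 0) return(sqrt(sum((p - a)^2)))
  t <- max(0, min(1, sum((p - a) * ab) / len2))
  sqrt(sum((p - (a + t * ab))^2))
}

# Recursive Douglas-Peucker on an n x 2 coordinate matrix.
douglas_peucker <- function(xy, tol) {
  n <- nrow(xy)
  if (n < 3) return(xy)
  # Distances of interior vertices to the chord between the endpoints.
  d <- vapply(2:(n - 1), function(i) {
    point_segment_distance(xy[i, ], xy[1, ], xy[n, ])
  }, numeric(1))
  i_max <- which.max(d) + 1
  if (d[i_max - 1] > tol) {
    left  <- douglas_peucker(xy[1:i_max, , drop = FALSE], tol)
    right <- douglas_peucker(xy[i_max:n, , drop = FALSE], tol)
    # Drop the shared vertex once when joining the two halves.
    rbind(left[-nrow(left), , drop = FALSE], right)
  } else {
    xy[c(1, n), , drop = FALSE]
  }
}

line <- cbind(x = c(0, 1, 2, 3, 4, 5, 6, 7),
              y = c(0, 0.1, -0.1, 0, 5, 0, 0.1, 0))
simplified <- douglas_peucker(line, tol = 0.5)
print(simplified)
```

With a tolerance of 0.5 the small wiggles disappear while the spike at x = 4 and its shoulders survive, which mirrors the role of the `dist` parameter in the call above.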
Spatial Overlay
# Clip
wbw_clip(i = polys$file_path(),
clip = 'study_area.shp',
output = 'clipped.shp',
snap_tol = 0.001)
# Intersect
wbw_intersect(i = polys$file_path(),
overlay = 'zones.shp',
output = 'intersection.shp',
snap_tol = 0.001)
# Erase
wbw_erase(i = polys$file_path(),
erase = 'exclusion_areas.shp',
output = 'erased.shp',
snap_tol = 0.001)
# Union
wbw_union(i = polys$file_path(),
overlay = 'other_layer.shp',
output = 'union.shp',
snap_tol = 0.001)
# Symmetrical difference
wbw_symmetrical_difference(i = polys$file_path(),
overlay = 'other_layer.shp',
output = 'symdiff.shp',
snap_tol = 0.001)
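The overlay operations above differ only in which parts of the two layers they keep. As an analogy (not how the backend computes overlays), each layer can be treated as a set of grid-cell IDs it covers, so that base R's set functions reproduce the semantics. The cell IDs are made up for illustration.

```r
# Each layer is represented by the IDs of the cells it covers.
layer_a <- c(1, 2, 3, 4, 5, 6)
layer_b <- c(4, 5, 6, 7, 8)

overlay <- list(
  intersect = intersect(layer_a, layer_b),   # area covered by both layers
  union     = union(layer_a, layer_b),       # area covered by either layer
  erase     = setdiff(layer_a, layer_b),     # A with B removed
  symmetrical_difference = union(setdiff(layer_a, layer_b),
                                 setdiff(layer_b, layer_a))
)
str(overlay)
```

Real overlays also split geometries and combine attribute tables, which this analogy leaves out; it only shows which area each operation retains.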
Proximity Analysis
# Euclidean distance from point features: rasterize the points first, then compute the distance raster
wbw_vector_points_to_raster(i = points$file_path(), output = 'points.tif', field = 'FID',
assign = 'last', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
wbw_euclidean_distance(i = 'points.tif', output = 'euclidean_dist.tif')
# Voronoi diagram
wbw_voronoi_diagram(i = points$file_path(), output = 'voronoi.shp')
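Both proximity tools above answer the same question: which feature is nearest, and how far away is it? A brute-force base-R sketch of that idea follows. `nearest_seed()` is a hypothetical helper and the coordinates are invented; a Voronoi cell is simply the set of locations that share the same nearest seed.

```r
# For each query location, find the nearest seed point and its distance.
nearest_seed <- function(queries, seeds) {
  # Distance matrix: one row per query, one column per seed.
  d <- sqrt(outer(queries[, 1], seeds[, 1], "-")^2 +
            outer(queries[, 2], seeds[, 2], "-")^2)
  data.frame(seed = apply(d, 1, which.min),
             distance = apply(d, 1, min))
}

seeds   <- cbind(x = c(0, 10, 5), y = c(0, 0, 8))
queries <- cbind(x = c(1, 9, 5, 4), y = c(1, 1, 7, 1))

result <- nearest_seed(queries, seeds)
print(result)
```

This compares every query with every seed, so it only suits small inputs; it is meant to show the relationship between distance surfaces and Voronoi membership, not to replace the tools.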
Select by Location
wbw_select_by_location(input = polys$file_path(),
select = 'stream_buffer.shp',
output = 'parcels_near_streams.shp',
condition = 'within')
Spatial Join
wbw_spatial_join(target = points$file_path(),
join = polys$file_path(),
output = 'points_joined.shp',
condition = 'within',
attr = 'first')
Aggregation Strategies
# Aggregate observation values within each zone, writing one output layer per statistic
for (stat in c('count', 'sum', 'mean', 'min', 'max')) {
wbw_spatial_join(target = 'zones.shp',
join = 'observations.shp',
output = paste0('zones_', stat, '.shp'),
condition = 'contains',
attr = stat)
}
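What a 'contains' join with an aggregate statistic computes can be sketched in base R: assign each observation to the zone that contains it, then summarise the values per zone. The ray-casting test, the helper name `point_in_polygon()`, and the sample data are illustrative assumptions, not the backend's implementation.

```r
# Ray casting: count how many polygon edges a ray from the point crosses.
point_in_polygon <- function(px, py, poly) {
  n <- nrow(poly)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    crosses <- ((yi > py) != (yj > py)) &&
      (px < (xj - xi) * (py - yi) / (yj - yi) + xi)
    if (crosses) inside <- !inside
    j <- i
  }
  inside
}

zones <- list(
  A = cbind(x = c(0, 10, 10, 0),   y = c(0, 0, 10, 10)),
  B = cbind(x = c(10, 20, 20, 10), y = c(0, 0, 10, 10))
)
obs <- data.frame(x = c(2, 5, 8, 12, 18, 25),
                  y = c(3, 5, 9, 4, 6, 5),
                  value = c(4, 6, 8, 10, 20, 99))

# Zone name for each observation, or NA when no zone contains it.
zone_of <- vapply(seq_len(nrow(obs)), function(k) {
  hit <- vapply(zones, function(z) point_in_polygon(obs$x[k], obs$y[k], z),
                logical(1))
  if (any(hit)) names(zones)[which(hit)[1]] else NA_character_
}, character(1))

# split() drops observations whose zone is NA.
by_zone <- split(obs$value, zone_of)
summary_table <- data.frame(
  zone  = names(by_zone),
  count = vapply(by_zone, length, integer(1)),
  sum   = vapply(by_zone, sum, numeric(1)),
  mean  = vapply(by_zone, mean, numeric(1)),
  min   = vapply(by_zone, min, numeric(1)),
  max   = vapply(by_zone, max, numeric(1))
)
print(summary_table)
```

The observation at x = 25 falls outside both zones and contributes to no statistic, which is the behaviour to expect from a containment-based join.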
Vector-to-Raster Conversion
# Rasterize polygon layer
wbw_vector_polygons_to_raster(i = polys$file_path(), output = 'parcels_raster.tif',
field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
# Rasterize line layer
wbw_vector_lines_to_raster(i = lines$file_path(), output = 'roads_raster.tif',
field = 'FID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
# Rasterize points
wbw_vector_points_to_raster(i = points$file_path(), output = 'points_raster.tif',
field = 'VALUE', assign = 'max', nodata = TRUE, cell_size = 5.0)
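The `cell_size` and `assign` arguments above control how coordinates map to cells and what happens when several points share a cell. A minimal base-R sketch of point rasterization with a 'max' rule follows; the function name, grid origin, and sample points are assumptions for illustration only.

```r
# Burn point values into a grid; when points share a cell, keep the maximum.
rasterize_points_max <- function(x, y, value, xmin, ymax,
                                 cell_size, nrows, ncols) {
  grid <- matrix(NA_real_, nrow = nrows, ncol = ncols)  # NA plays the role of NoData
  col <- floor((x - xmin) / cell_size) + 1   # columns count eastward from xmin
  row <- floor((ymax - y) / cell_size) + 1   # rows count southward from ymax
  inside <- col >= 1 & col <= ncols & row >= 1 & row <= nrows
  for (k in which(inside)) {
    current <- grid[row[k], col[k]]
    if (is.na(current) || value[k] > current) grid[row[k], col[k]] <- value[k]
  }
  grid
}

grid <- rasterize_points_max(
  x = c(2, 3, 12, 50), y = c(18, 17, 3, 50), value = c(5, 9, 7, 1),
  xmin = 0, ymax = 20, cell_size = 5, nrows = 4, ncols = 4
)
print(grid)
```

The first two points land in the same top-left cell, so only the larger value is kept, and the point at (50, 50) lies outside the grid and is skipped.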
Field Calculator
# Compute area in hectares and write to existing field
wbw_field_calculator(i = polys$file_path(),
output = 'parcels_calc.shp',
field_name = 'AREA_HA',
py_statement = '@Area / 10000.0',
analyse = FALSE)
Point Cluster Analysis
# Kernel density estimation (heat map)
wbw_kernel_density_estimation(i = points$file_path(),
output = 'heatmap.tif',
bandwidth = 200.0,
kernel_type = 'quartic',
cell_size = 5.0,
base = 'dem.tif')
# Hexagonal binning
wbw_create_hexagonal_vector_grid(i = 'study_area.shp', output = 'hex_grid.shp', width = 500.0, orientation = 'horizontal')
wbw_spatial_join(target = 'hex_grid.shp', join = points$file_path(),
output = 'hex_counts.shp', condition = 'contains', attr = 'count')
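The `bandwidth` and `kernel_type = 'quartic'` arguments in the kernel density call above can be read against the standard 2D quartic (biweight) kernel, K(d) = 3 / (π h²) · (1 − d²/h²)² for d < h and 0 otherwise. The base-R sketch below evaluates that formula at one location; it assumes this textbook normalization, which may not match the backend's scaling.

```r
# Quartic kernel density at one location (gx, gy) from points (px, py).
quartic_density <- function(gx, gy, px, py, bandwidth) {
  d2 <- (gx - px)^2 + (gy - py)^2            # squared distance to every point
  u2 <- d2 / bandwidth^2
  w  <- ifelse(u2 < 1, (1 - u2)^2, 0)        # points beyond the bandwidth add nothing
  3 / (pi * bandwidth^2) * sum(w)
}

px <- c(0, 100, 500)
py <- c(0, 0, 0)

at_origin <- quartic_density(0, 0, px, py, bandwidth = 200)
far_away  <- quartic_density(2000, 2000, px, py, bandwidth = 200)
print(c(at_origin = at_origin, far_away = far_away))
```

A larger bandwidth spreads each point's influence over a wider area and yields a smoother surface, at the cost of blurring local clusters.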
WbW-Pro Spotlight: Market Access and Site Intelligence
- Problem: Rank candidate sites using repeatable network-access and demand logic.
- Tool: market_access_and_site_intelligence_workflow
- Typical inputs: Network, existing sites, candidate sites, demand surface, drive-time rings.
- Typical outputs: Catchment polygons, competitive-overlap layer, candidate-ranking CSV, executive summary JSON.
result <- s$run_tool(
'market_access_and_site_intelligence_workflow',
list(
network = 'street_network.shp',
sites_existing = 'existing_sites.shp',
sites_candidates = 'candidate_sites.shp',
demand_surface = 'demand_points.shp',
ring_costs = c(5.0, 10.0, 15.0),
catchments_output = 'candidate_catchments.shp',
overlap_analysis_output = 'competitive_overlap.shp',
candidate_rank_csv = 'candidate_rankings.csv',
executive_summary_json = 'market_summary.json'
)
)
print(result)
Note: This workflow requires a session initialized with a valid Pro licence.
Complete Vector Analysis Workflow
library(whiteboxworkflows)
s <- wbw_session()
setwd('/data/vector_workflow')
parcels <- wbw_read_vector('parcels.shp')
study <- wbw_read_vector('study_area.shp')
streams <- wbw_read_vector('streams.shp')
# 1. Clip to study area
wbw_clip(i = parcels$file_path(), clip = study$file_path(),
output = 'parcels_clipped.shp', snap_tol = 0.001)
# 2. Add shape metrics
wbw_compactness_ratio(i = 'parcels_clipped.shp', output = 'parcels_shape.shp')
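# 3. Buffer streams and intersect with parcels
# NOTE: wbw_buffer_raster is a raster tool (size and gridcells are raster-buffer
# parameters). A vector input with a .shp output needs the backend's vector
# buffer tool; confirm its name and parameters in the tool reference.
wbw_buffer_raster(i = streams$file_path(), output = 'stream_buf.shp',
size = 30.0, gridcells = FALSE)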
wbw_intersect(i = 'parcels_shape.shp', overlay = 'stream_buf.shp',
output = 'riparian_parcels.shp', snap_tol = 0.001)
# 4. Dissolve by land-use class
wbw_dissolve(i = 'riparian_parcels.shp', output = 'riparian_dissolved.shp',
field = 'LAND_USE', snap_tol = 0.001)
# 5. Rasterize result
wbw_vector_polygons_to_raster(i = 'riparian_dissolved.shp', output = 'riparian.tif',
field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0)
cat('Vector analysis complete.\n')
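Because each step in the workflow above reads the file written by the previous one, it can help to confirm that an output exists before moving on. The sketch below shows one way to do that in base R. `run_step()` is a hypothetical helper, and the steps here only write placeholder files; in practice the step body would be one of the wbw_ calls shown above.

```r
# Run one step, then fail fast if its expected output file is missing.
run_step <- function(label, output, step) {
  step()
  if (!file.exists(output)) {
    stop(sprintf("Step '%s' did not produce %s", label, output), call. = FALSE)
  }
  message(sprintf("[ok] %s -> %s", label, output))
  invisible(output)
}

# Demonstration in a temporary directory with stand-in steps.
work_dir <- tempfile("vector_workflow_")
dir.create(work_dir)

clipped <- file.path(work_dir, "parcels_clipped.txt")
run_step("clip", clipped, function() writeLines("placeholder", clipped))

# A step that writes nothing is reported instead of silently breaking later steps.
missing_output <- file.path(work_dir, "never_written.txt")
failure <- tryCatch(
  run_step("broken step", missing_output, function() NULL),
  error = function(e) conditionMessage(e)
)
print(failure)
```

Stopping at the first missing output keeps the error message close to the step that caused it.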
Tips
- Always validate topology before analysis: Run check_vector_topology() to detect overshoots, undershoots, self-intersections, and sliver polygons. Topological errors propagate through overlay and spatial join operations.
- Build spatial indices on large layers: Large datasets (> 10,000 features) benefit from spatial indexing. Use build_spatial_index() explicitly before repeated spatial queries; operations like containment or proximity are fast with indices.
- Choose your overlay operation carefully: Union retains all boundaries and combines attributes (can create many small slivers). Intersection keeps only overlapping regions. Difference retains Polygon A minus Polygon B. Test on small subsets first.
- Dissolve reduces feature count and file size: After overlay, dissolve by ownership or category to collapse unnecessary edges. Dissolved layers render faster and are cleaner for publication.
- Spatial joins are sensitive to alignment: Ensure both input layers use the same CRS and are free of topology errors. Reproject to a projected CRS with linear units before computing buffer distances, and to an equal-area projection before computing areas.
- Buffer distance and units matter: Buffer distances are in map units (meters, feet, degrees). In a geographic CRS the unit is degrees, which is not a constant ground distance, so reproject to a projected CRS with linear units before buffering; use an equal-area projection when precise areas matter. Negative buffers shrink polygons inward (inset) and can collapse small ones; test with small buffer values first.
- Attribute table size is a memory constraint: Attribute tables with millions of rows and dozens of fields consume RAM. Export to CSV or database for large tables; work with summaries or samples when memory is limited.
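- Point-in-polygon operations scale with complexity: Without a spatial index, each point is tested against every polygon, so cost grows with the number of points times the number of polygons (and with the vertex count of each polygon). On large datasets (> 1 million points), consider spatial index binning or vector-to-raster conversion for speed.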
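The tip on buffer units can be checked with a little arithmetic. Assuming a spherical Earth with a mean radius of 6,371,008.8 m (an approximation chosen for this sketch), one degree of latitude spans about 111 km everywhere, while one degree of longitude shrinks with the cosine of the latitude. A buffer specified in degrees is therefore not a fixed ground distance.

```r
# Spherical approximation; the variable names are illustrative only.
earth_radius_m <- 6371008.8
meters_per_degree_lat <- pi * earth_radius_m / 180

meters_per_degree_lon <- function(lat_deg) {
  meters_per_degree_lat * cos(lat_deg * pi / 180)
}

# Ground size of a 0.001-degree buffer at different latitudes.
buffer_deg <- 0.001
latitudes <- c(0, 45, 60)
ground_size <- data.frame(
  latitude      = latitudes,
  north_south_m = buffer_deg * meters_per_degree_lat,
  east_west_m   = buffer_deg * meters_per_degree_lon(latitudes)
)
print(ground_size)
```

At 60 degrees latitude the east-west extent is half the north-south extent, so a "circular" buffer in degrees is stretched on the ground; buffering in a projected CRS avoids this.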