Vector Analysis

Vector GIS analysis in WbW-R covers attribute management, geometric measurement, shape analysis, spatial overlay, proximity tools, spatial joins, and vector-to-raster conversion. All computation runs in the Whitebox backend through wbw_<tool>(...) wrappers (or wbw_run_tool(...) when tool IDs are dynamic); R handles session management, sequencing, and result processing.

Core Concepts

Vector analysis depends on understanding these core concepts:

Feature geometry: Points (single coordinate pairs), lines (ordered sequences of coordinate pairs), and polygons (rings of coordinates forming closed boundaries). Each feature type supports different analyses.
Topology: The spatial relationships between features (adjacency, containment, intersection). Topological errors (overshoots, undershoots, self-intersections) corrupt overlay operations and topology queries.
Attribute table: The database associated with each feature layer, carrying descriptive fields and values. Attribute queries filter features; joins link external tables.
Spatial index: Internal index structure (R-tree or quadtree) enabling fast spatial queries. Used by intersection, containment, and proximity operations. Always build spatial indices on frequently queried layers.
Envelope (bounding box): The minimum rectangular boundary of a feature or layer; used for quick spatial culling before geometry tests.
Buffer: A polygon created at a fixed distance around a feature. Buffers model proximity zones and are fundamental to distance-based analysis.
Overlay (intersection, union, difference): Combining two polygon layers to create a new layer. Union keeps the area covered by either layer and splits features where boundaries cross; intersection keeps overlapping area only; difference removes the area of one layer from the other.
Dissolve (aggregation): Merging adjacent features with identical attribute values, creating larger aggregate features. Reduces feature count and simplifies geometry.
Spatial join: Associating features from one layer with features from another based on spatial relationship (overlap, containment, proximity). Assigns attributes across layers.
Proximity analysis: Finding nearest features, distances, or connectivity. Foundation for network analysis, market analysis, and accessibility studies.
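
The envelope concept above can be sketched in base R. This is a minimal illustration, not the Whitebox implementation: the helper names envelope() and envelopes_overlap() and the three sample geometries are hypothetical. It shows how a cheap bounding-box comparison rules out feature pairs before any exact geometry test is run.

```r
# Envelope (bounding box) of a coordinate matrix with columns x, y.
envelope <- function(xy) {
  c(xmin = min(xy[, 1]), ymin = min(xy[, 2]),
    xmax = max(xy[, 1]), ymax = max(xy[, 2]))
}

# TRUE when two envelopes overlap or touch. FALSE means the geometries
# cannot intersect, so the exact (expensive) test can be skipped.
envelopes_overlap <- function(a, b) {
  a[["xmin"]] <= b[["xmax"]] && b[["xmin"]] <= a[["xmax"]] &&
    a[["ymin"]] <= b[["ymax"]] && b[["ymin"]] <= a[["ymax"]]
}

# Hypothetical sample features
parcel <- cbind(x = c(0, 4, 4, 0), y = c(0, 0, 3, 3))
pond   <- cbind(x = c(3, 6, 5),    y = c(2, 2, 5))
quarry <- cbind(x = c(10, 12, 11), y = c(10, 10, 12))

overlap_pond   <- envelopes_overlap(envelope(parcel), envelope(pond))
overlap_quarry <- envelopes_overlap(envelope(parcel), envelope(quarry))
```

An overlapping envelope only means the pair is a candidate; the exact intersection test is still needed to confirm it.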

Phase A Spatial Statistics Status

Phase A vector spatial-statistics tools are available through direct wrappers:

wbw_global_morans_i(...)
wbw_local_morans_i_lisa(...)
wbw_getis_ord_gi_star(...)
wbw_nearest_neighbour_index(...)
wbw_quadrat_count_test(...)

Current inference support note:

inference = "asymptotic" is supported.
inference = "permutation" is intentionally deferred for Phase A and currently returns a validation error.
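
For readers who want to see what the global Moran's I statistic measures, the following base-R sketch computes it for a small chain of six neighbouring features. It illustrates the textbook formula only; it is not the code behind wbw_global_morans_i(...), it omits the inference step, and the helpers morans_i() and chain_weights() are hypothetical names.

```r
# Global Moran's I for values x and a spatial weights matrix w.
morans_i <- function(x, w) {
  stopifnot(is.matrix(w), nrow(w) == length(x), ncol(w) == length(x))
  z  <- x - mean(x)          # deviations from the mean
  s0 <- sum(w)               # sum of all weights
  (length(x) / s0) * sum(w * outer(z, z)) / sum(z^2)
}

# Binary weights for n features in a row: each feature neighbours
# the one before and the one after it.
chain_weights <- function(n) {
  w <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    w[i, i + 1] <- 1
    w[i + 1, i] <- 1
  }
  w
}

w <- chain_weights(6)
i_clustered   <- morans_i(c(1, 1, 1, 5, 5, 5), w)  # similar values adjacent
i_alternating <- morans_i(c(1, 5, 1, 5, 1, 5), w)  # dissimilar values adjacent
```

Clustered values give a positive statistic and alternating values a negative one, which is the pattern the Phase A tools test for significance.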

Session Setup

library(whiteboxworkflows)

s <- wbw_session()
setwd('/data/vector')

Reading and Writing Vectors

polys  <- wbw_read_vector('parcels.shp')
lines  <- wbw_read_vector('roads.shp')
points <- wbw_read_vector('samples.shp')

meta <- polys$metadata()
cat('Feature count:', meta$num_features, '\n')
cat('Geometry type:', meta$geom_type, '\n')
cat('CRS:', meta$wkt, '\n')

Attribute Management

# Add a new field
wbw_add_field(i         = polys$file_path(),
  output    = 'parcels_v2.shp',
  field_name = 'AREA_HA',
  field_type = 'Float')

# Delete an unwanted field
wbw_delete_field(i          = 'parcels_v2.shp',
  output     = 'parcels_v3.shp',
  field_name = 'OLD_FIELD')

# Rename a field
wbw_rename_field(i             = 'parcels_v3.shp',
  output        = 'parcels_v4.shp',
  input_field   = 'AREA_HA',
  output_field  = 'HECTARES')

# Extract features by attribute value
wbw_extract_by_attribute(i            = polys$file_path(),
  output       = 'large_parcels.shp',
  field        = 'AREA_M2',
  filter_value = 10000.0,
  filter_type  = 'Greater Than')

Geometric Measurements

# Add area, perimeter, and basic shape metrics to polygon attribute table
wbw_add_polygon_coordinates_to_table(i      = polys$file_path(),
  output = 'parcels_geom.shp')

# Compactness ratio
wbw_compactness_ratio(i = polys$file_path(), output = 'parcels_compact.shp')

# Elongation ratio
wbw_elongation_ratio(i = polys$file_path(), output = 'parcels_elong.shp')

# Related circumscribing circle
wbw_related_circumscribing_circle(i = polys$file_path(), output = 'parcels_rcc.shp')

# Patch orientation
wbw_patch_orientation(i = polys$file_path(), output = 'parcels_orient.shp')

# Radius of gyration
wbw_radius_of_gyration(i = polys$file_path(), output = 'parcels_rog.shp')
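
To make the measurements behind these shape indices concrete, here is a base-R sketch of polygon area (shoelace formula), perimeter, and one generic compactness measure, 4*pi*A/P^2. The helper names are hypothetical, and the compactness measure is a common textbook index chosen for illustration; the exact definitions used by the Whitebox tools may differ.

```r
# Area of a simple polygon from its ring vertices (shoelace formula).
polygon_area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  j <- c(2:length(x), 1)     # index of the next vertex, wrapping around
  abs(sum(x * y[j] - x[j] * y)) / 2
}

# Perimeter: sum of edge lengths around the ring.
polygon_perimeter <- function(xy) {
  j <- c(2:nrow(xy), 1)
  sum(sqrt((xy[j, 1] - xy[, 1])^2 + (xy[j, 2] - xy[, 2])^2))
}

# Generic compactness index: 1 for a circle, smaller for elongated shapes.
# Assumption: illustrative only, not necessarily the Whitebox definition.
compactness <- function(xy) {
  4 * pi * polygon_area(xy) / polygon_perimeter(xy)^2
}

square <- cbind(c(0, 10, 10, 0),  c(0, 0, 10, 10))   # 10 x 10
strip  <- cbind(c(0, 100, 100, 0), c(0, 0, 1, 1))    # 100 x 1

square_compactness <- compactness(square)
strip_compactness  <- compactness(strip)
```

Both polygons have the same area, but the strip scores far lower because its perimeter is much longer.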

Geometric Operations

# Centroids
wbw_centroid_vector(i = polys$file_path(), output = 'centroids.shp')

# Convex hull
wbw_convex_hull(i = polys$file_path(), output = 'convex_hulls.shp')

# Minimum bounding envelopes
wbw_minimum_bounding_envelope(i = polys$file_path(), output = 'mbe.shp')

# Smooth vector polygons
wbw_smooth_vectors(i = polys$file_path(), output = 'parcels_smooth.shp', filter = 5)

# Simplify features (Douglas-Peucker)
wbw_simplify_line_or_polygon(i = polys$file_path(), output = 'parcels_simplified.shp',
  dist = 5.0, remove_spurs = TRUE, errors_only = FALSE)

# Dissolve polygons on field value
wbw_dissolve(i = polys$file_path(), output = 'parcels_dissolved.shp',
  field = 'LAND_USE', snap_tol = 0.001)
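
The Douglas-Peucker simplification named above can be sketched in base R as follows. This is a minimal recursive version for an open line, written for illustration; it is not the Whitebox implementation, and perp_dist() and douglas_peucker() are hypothetical helpers. The tol argument plays the role of the dist parameter: vertices closer than tol to the simplified line are dropped.

```r
# Perpendicular distance from point p to the line through a and b.
perp_dist <- function(p, a, b) {
  d   <- b - a
  len <- sqrt(sum(d^2))
  if (len == 0) return(sqrt(sum((p - a)^2)))
  abs(d[1] * (p[2] - a[2]) - d[2] * (p[1] - a[1])) / len
}

# Recursive Douglas-Peucker on a two-column coordinate matrix.
douglas_peucker <- function(xy, tol) {
  n <- nrow(xy)
  if (n < 3) return(xy)
  # Distance of every interior vertex from the chord first -> last
  d <- vapply(2:(n - 1),
              function(i) perp_dist(xy[i, ], xy[1, ], xy[n, ]),
              numeric(1))
  k <- which.max(d)
  if (d[k] <= tol) return(xy[c(1, n), , drop = FALSE])
  split <- k + 1             # row index of the farthest vertex
  left  <- douglas_peucker(xy[1:split, , drop = FALSE], tol)
  right <- douglas_peucker(xy[split:n, , drop = FALSE], tol)
  # Drop the duplicated split vertex when joining the two halves
  rbind(left[-nrow(left), , drop = FALSE], right)
}

line <- cbind(c(0, 1, 2, 3, 4), c(0, 0.05, 0, 2, 0))
simplified <- douglas_peucker(line, tol = 0.1)
```

Here the small wobble at (1, 0.05) falls within the tolerance and is removed, while the spike at (3, 2) is kept.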

Spatial Overlay

# Clip
wbw_clip(i        = polys$file_path(),
  clip     = 'study_area.shp',
  output   = 'clipped.shp',
  snap_tol = 0.001)

# Intersect
wbw_intersect(i        = polys$file_path(),
  overlay  = 'zones.shp',
  output   = 'intersection.shp',
  snap_tol = 0.001)

# Erase
wbw_erase(i        = polys$file_path(),
  erase    = 'exclusion_areas.shp',
  output   = 'erased.shp',
  snap_tol = 0.001)

# Union
wbw_union(i        = polys$file_path(),
  overlay  = 'other_layer.shp',
  output   = 'union.shp',
  snap_tol = 0.001)

# Symmetrical difference
wbw_symmetrical_difference(i        = polys$file_path(),
  overlay  = 'other_layer.shp',
  output   = 'symdiff.shp',
  snap_tol = 0.001)
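
The relationship between these overlay operations is easiest to see with areas. The base-R sketch below uses two axis-aligned rectangles, a deliberate simplification since real overlay works on arbitrary polygons; the helpers make_rect(), rect_area(), and rect_intersection() are hypothetical. It shows how the area of each overlay result follows from the intersection.

```r
make_rect <- function(xmin, ymin, xmax, ymax) {
  c(xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax)
}

# Area of a rectangle; an empty (inverted) rectangle has area 0.
rect_area <- function(r) {
  max(0, r[["xmax"]] - r[["xmin"]]) * max(0, r[["ymax"]] - r[["ymin"]])
}

# Intersection of two rectangles (may be empty).
rect_intersection <- function(a, b) {
  make_rect(max(a[["xmin"]], b[["xmin"]]), max(a[["ymin"]], b[["ymin"]]),
            min(a[["xmax"]], b[["xmax"]]), min(a[["ymax"]], b[["ymax"]]))
}

a <- make_rect(0, 0, 4, 4)   # input layer
b <- make_rect(2, 2, 6, 6)   # overlay layer

area_intersect <- rect_area(rect_intersection(a, b))
area_union     <- rect_area(a) + rect_area(b) - area_intersect
area_erase     <- rect_area(a) - area_intersect          # a minus b
area_symdiff   <- rect_area(a) + rect_area(b) - 2 * area_intersect
```

Comparing output areas against these identities is a quick sanity check after running an overlay on real layers.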

Proximity Analysis

# Euclidean distance from vector features
wbw_vector_points_to_raster(i = points$file_path(), output = 'points.tif', field = 'FID',
  assign = 'last', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')
wbw_euclidean_distance(i = 'points.tif', output = 'euclidean_dist.tif')

# Voronoi diagram
wbw_voronoi_diagram(i = points$file_path(), output = 'voronoi.shp')
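
A Voronoi diagram assigns every location to its nearest input point. The base-R sketch below shows that rule with a brute-force search; it does not construct the polygon boundaries that the Whitebox tool writes, and nearest_seed() and the sample coordinates are hypothetical.

```r
# For each location, return the row index of the nearest seed point.
nearest_seed <- function(locations, seeds) {
  apply(locations, 1, function(p) {
    d2 <- (seeds[, 1] - p[1])^2 + (seeds[, 2] - p[2])^2  # squared distances
    which.min(d2)
  })
}

seeds <- cbind(c(0, 10, 0), c(0, 0, 10))
locs  <- cbind(c(1, 9, 2, 6), c(1, 1, 8, 1))

cell_id <- nearest_seed(locs, seeds)
```

Each entry of cell_id is the Voronoi cell that the corresponding location falls in.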

Select by Location

wbw_select_by_location(input   = polys$file_path(),
  select  = 'stream_buffer.shp',
  output  = 'parcels_near_streams.shp',
  condition = 'within')

Spatial Join

wbw_spatial_join(target  = points$file_path(),
  join    = polys$file_path(),
  output  = 'points_joined.shp',
  condition = 'within',
  attr    = 'first')
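
Conceptually, a point-to-polygon join with condition = 'within' runs a containment test for each point and copies attributes from the first polygon that contains it. The base-R sketch below shows that logic with a standard ray-casting test. It is an illustration under stated assumptions, not the Whitebox implementation; point_in_polygon(), the zones list, and the pts table are hypothetical.

```r
# Ray-casting containment test: count ring edges crossed by a ray
# running from the point towards +x; an odd count means inside.
point_in_polygon <- function(px, py, poly) {
  n <- nrow(poly)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    xi <- poly[i, 1]; yi <- poly[i, 2]
    xj <- poly[j, 1]; yj <- poly[j, 2]
    if (((yi > py) != (yj > py)) &&
        (px < (xj - xi) * (py - yi) / (yj - yi) + xi)) {
      inside <- !inside
    }
    j <- i
  }
  inside
}

zones <- list(
  list(name = "A", ring = cbind(c(0, 5, 5, 0),   c(0, 0, 5, 5))),
  list(name = "B", ring = cbind(c(5, 10, 10, 5), c(0, 0, 5, 5)))
)
pts <- data.frame(id = 1:3, x = c(2, 7, 12), y = c(2, 3, 1))

# Attach the name of the first containing zone, or NA when none matches.
pts$zone <- vapply(seq_len(nrow(pts)), function(k) {
  for (z in zones) {
    if (point_in_polygon(pts$x[k], pts$y[k], z$ring)) return(z$name)
  }
  NA_character_
}, character(1))
```

The third point lies outside both zones and so receives NA, which is the case to check for after a real join.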

Aggregation Strategies

# Aggregate values from the observations each zone contains,
# writing one output layer per statistic
for (stat in c('count', 'sum', 'mean', 'min', 'max')) {
  wbw_spatial_join(target    = 'zones.shp',
    join      = 'observations.shp',
    output    = paste0('zones_', stat, '.shp'),
    condition = 'contains',
    attr      = stat)
}
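
The five statistics in the loop can be previewed on an ordinary table. The base-R sketch below assumes the observations have already been matched to zones; the obs data frame and its values are hypothetical. It computes the same summaries with tapply().

```r
# Hypothetical observations already tagged with their containing zone
obs <- data.frame(zone  = c("A", "A", "B", "B", "B"),
                  value = c(2, 4, 1, 5, 9))

stats <- list(count = length, sum = sum, mean = mean, min = min, max = max)

# One column per statistic, one row per zone
summary_tbl <- sapply(stats, function(f) tapply(obs$value, obs$zone, f))
```

Checking a table like this against the joined layers is a simple way to confirm that the chosen statistic behaves as expected.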

Vector-to-Raster Conversion

# Rasterize polygon layer
wbw_vector_polygons_to_raster(i = polys$file_path(), output = 'parcels_raster.tif',
  field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')

# Rasterize line layer
wbw_vector_lines_to_raster(i = lines$file_path(), output = 'roads_raster.tif',
  field = 'FID', nodata = TRUE, cell_size = 5.0, base = 'dem.tif')

# Rasterize points
wbw_vector_points_to_raster(i = points$file_path(), output = 'points_raster.tif',
  field = 'VALUE', assign = 'max', nodata = TRUE, cell_size = 5.0)
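
The cell_size and assign parameters are easier to reason about with a small model of point rasterization. The base-R sketch below maps each point to a row and column from the grid origin and cell size, and keeps the maximum value when several points share a cell. It is a simplified illustration, not the Whitebox implementation; rasterize_points_max() and its arguments are hypothetical.

```r
# Rasterize points onto a grid whose top-left corner is (xmin, ymax),
# keeping the maximum value per cell. Empty cells stay NA (nodata).
rasterize_points_max <- function(x, y, value, xmin, ymax,
                                 cell_size, nrows, ncols) {
  grid <- matrix(NA_real_, nrow = nrows, ncol = ncols)
  col <- floor((x - xmin) / cell_size) + 1
  row <- floor((ymax - y) / cell_size) + 1   # rows count down from the top
  for (k in seq_along(value)) {
    r  <- row[k]
    cc <- col[k]
    if (r < 1 || r > nrows || cc < 1 || cc > ncols) next  # outside the grid
    if (is.na(grid[r, cc]) || value[k] > grid[r, cc]) grid[r, cc] <- value[k]
  }
  grid
}

# 4 x 4 grid of 5-unit cells covering x 0..20, y 0..20
grid <- rasterize_points_max(x = c(1, 4, 12, 25), y = c(19, 16, 3, 5),
                             value = c(3, 7, 2, 9),
                             xmin = 0, ymax = 20, cell_size = 5,
                             nrows = 4, ncols = 4)
```

The first two points share the top-left cell, so only the larger value survives; the last point falls outside the grid and is ignored.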

Field Calculator

# SQL-style field update through runtime tool invocation
fc_result <- s$run_tool(
  'field_calculator',
  list(
    input      = polys$file_path(),
    field      = 'AREA_HA',
    field_type = 'float',
    expression = 'CASE WHEN AREA_M2 IS NULL THEN NULL ELSE AREA_M2 / 10000.0 END',
    overwrite  = TRUE,
    output     = 'parcels_calc.gpkg'
  )
)

# Preview-only mode (no output write); returns preview payload
fc_preview <- s$run_tool(
  'field_calculator',
  list(
    input        = polys$file_path(),
    field        = 'SPEED',
    field_type   = 'integer',
    expression   = "CASE TYPE WHEN 'motorway' THEN 100 ELSE 60 END",
    overwrite    = TRUE,
    preview_rows = 10
  )
)

print(fc_preview$outputs$preview)

field_calculator now supports SQL-style CASE, UPDATE ... SET ... WHERE ... wrappers, SQL operators/null predicates, and CAST(... AS type) expressions.
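
To check what the two CASE expressions above should produce, the same logic can be mirrored on a plain data frame. The base-R sketch below uses a hypothetical three-row table; it is a way to verify expected values, not a replacement for field_calculator.

```r
# Hypothetical attribute table
parcels <- data.frame(AREA_M2 = c(25000, NA, 4000),
                      TYPE    = c("motorway", "residential", "motorway"),
                      stringsAsFactors = FALSE)

# CASE WHEN AREA_M2 IS NULL THEN NULL ELSE AREA_M2 / 10000.0 END
parcels$AREA_HA <- ifelse(is.na(parcels$AREA_M2),
                          NA_real_,
                          parcels$AREA_M2 / 10000)

# CASE TYPE WHEN 'motorway' THEN 100 ELSE 60 END
parcels$SPEED <- ifelse(parcels$TYPE == "motorway", 100L, 60L)
```

The missing area stays missing rather than becoming zero, which matches the intent of the NULL guard in the SQL expression.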

Point Cluster Analysis

# Kernel density estimation (heat map)
wbw_kernel_density_estimation(i         = points$file_path(),
  output    = 'heatmap.tif',
  bandwidth = 200.0,
  kernel_type = 'quartic',
  cell_size = 5.0,
  base      = 'dem.tif')

# Hexagonal binning
wbw_create_hexagonal_vector_grid(i = 'study_area.shp', output = 'hex_grid.shp', width = 500.0, orientation = 'horizontal')
wbw_spatial_join(target = 'hex_grid.shp', join = points$file_path(),
  output = 'hex_counts.shp', condition = 'contains', attr = 'count')
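
The effect of bandwidth on a kernel density surface can be explored with a small model. The base-R sketch below evaluates a quartic (biweight) kernel at one grid location, using the standard two-dimensional normalization 3/pi. The normalization and edge handling used by the Whitebox tool may differ, and quartic_density() is a hypothetical helper.

```r
# Quartic kernel density at grid location (gx, gy) from points (px, py).
# Points at or beyond the bandwidth contribute nothing.
quartic_density <- function(gx, gy, px, py, bandwidth) {
  u <- sqrt((px - gx)^2 + (py - gy)^2) / bandwidth
  k <- ifelse(u < 1, (3 / pi) * (1 - u^2)^2, 0)
  sum(k) / bandwidth^2
}

px <- c(0, 100, 500)
py <- c(0, 0, 0)

dens_at_origin <- quartic_density(0, 0, px, py, bandwidth = 200)
dens_far_away  <- quartic_density(5000, 5000, px, py, bandwidth = 200)
```

With a 200-unit bandwidth, the point 500 units away has no influence at the origin, while the point 100 units away contributes at a reduced weight.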

WbW-Pro Spotlight: Market Access and Site Intelligence

Problem: Rank candidate sites using repeatable network-access and demand logic.
Tool: market_access_and_site_intelligence_workflow
Typical inputs: Network, existing sites, candidate sites, demand surface, drive-time rings.
Typical outputs: Catchment polygons, competitive-overlap layer, candidate-ranking CSV, executive summary JSON.

result <- s$run_tool(
  'market_access_and_site_intelligence_workflow',
  list(
    network                 = 'street_network.shp',
    sites_existing          = 'existing_sites.shp',
    sites_candidates        = 'candidate_sites.shp',
    demand_surface          = 'demand_points.shp',
    ring_costs              = c(5.0, 10.0, 15.0),
    catchments_output       = 'candidate_catchments.shp',
    overlap_analysis_output = 'competitive_overlap.shp',
    candidate_rank_csv      = 'candidate_rankings.csv',
    executive_summary_json  = 'market_summary.json'
  )
)

print(result)

Note: This workflow requires a session initialized with a valid Pro licence.

Complete Vector Analysis Workflow

library(whiteboxworkflows)

s <- wbw_session()
setwd('/data/vector_workflow')

parcels <- wbw_read_vector('parcels.shp')
study   <- wbw_read_vector('study_area.shp')
streams <- wbw_read_vector('streams.shp')

# 1. Clip to study area
wbw_clip(i = parcels$file_path(), clip = study$file_path(),
  output = 'parcels_clipped.shp', snap_tol = 0.001)

# 2. Add shape metrics
wbw_compactness_ratio(i = 'parcels_clipped.shp', output = 'parcels_shape.shp')

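# 3. Buffer streams and intersect with parcels
# NOTE: the tool name suggests raster input; confirm that it accepts a vector
# layer in your build, or substitute your installation's vector buffer tool.
wbw_buffer_raster(i = streams$file_path(), output = 'stream_buf.shp',
  size = 30.0, gridcells = FALSE)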
wbw_intersect(i = 'parcels_shape.shp', overlay = 'stream_buf.shp',
  output = 'riparian_parcels.shp', snap_tol = 0.001)

# 4. Dissolve by land-use class
wbw_dissolve(i = 'riparian_parcels.shp', output = 'riparian_dissolved.shp',
  field = 'LAND_USE', snap_tol = 0.001)

# 5. Rasterize result
wbw_vector_polygons_to_raster(i = 'riparian_dissolved.shp', output = 'riparian.tif',
  field = 'LAND_USE_ID', nodata = TRUE, cell_size = 5.0)

cat('Vector analysis complete.\n')

Tips

Always validate topology before analysis: Run check_vector_topology() to detect overshoots, undershoots, self-intersections, and sliver polygons. Topological errors propagate through overlay and spatial join operations.
Build spatial indices on large layers: Large datasets (> 10,000 features) benefit from spatial indexing. Use build_spatial_index() explicitly before repeated spatial queries; operations like containment or proximity are fast with indices.
Choose your overlay operation carefully: Union retains all boundaries and combines attributes (can create many small slivers). Intersection keeps only overlapping regions. Difference retains Polygon A minus Polygon B. Test on small subsets first.
Dissolve reduces feature count and file size: After overlay, dissolve by ownership or category to collapse unnecessary edges. Dissolved layers render faster and are cleaner for publication.
Spatial joins are sensitive to alignment: Ensure both input layers use the same CRS and are free of topology errors. Reproject to a projected CRS before computing buffer distances, and to an equal-area projection before computing areas.
Buffer distance and units matter: Buffer distances are in the layer's map units (meters, feet, or degrees). A degree is not a constant ground distance, so reproject geographic layers to a projected CRS before buffering, and use an equal-area projection when precise areas are critical. Negative buffers (insets) can collapse small polygons; test with small buffer values first.
Attribute table size is a memory constraint: Attribute tables with millions of rows and dozens of fields consume RAM. Export to CSV or database for large tables; work with summaries or samples when memory is limited.
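Point-in-polygon operations scale with complexity: Without a spatial index, each containment test is O(n) in the number of polygon vertices, and every point may be tested against every polygon. On large datasets (> 1 million points), consider spatial index binning or vector-to-raster conversion for speed.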
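
The tip about buffer units can be made concrete with a short calculation. The base-R sketch below uses the common spherical approximation that one degree of longitude spans about 111,320 m at the equator and shrinks with the cosine of latitude. The helper metres_per_degree_lon() is hypothetical, and the figures are approximate.

```r
# Approximate east-west ground length of one degree of longitude
# (spherical approximation; assumption: adequate for illustration only).
metres_per_degree_lon <- function(lat_deg) {
  111320 * cos(lat_deg * pi / 180)
}

buffer_deg <- 0.001   # a "buffer distance" given in degrees

width_equator <- buffer_deg * metres_per_degree_lon(0)
width_60n     <- buffer_deg * metres_per_degree_lon(60)
```

The same 0.001-degree distance covers roughly half as much ground at 60 degrees latitude as at the equator, which is why buffering should be done in a projected CRS.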

Whitebox Workflows for R User Manual