Working with Vectors
This chapter covers schema inspection, feature iteration, attribute reads/writes, and persistence workflows.
Vector processing is often more about data contracts than geometry mechanics. Schemas, field types, and attribute consistency determine whether downstream analysis remains trustworthy. The patterns below emphasize validating structure first, then applying deterministic edits, then persisting to stable interchange formats for downstream tools.
See Also: Online Sources
If you need to acquire vectors directly from web providers (starting with OSM Overpass), see the dedicated Online Sources chapter.
Read and Inspect
This step establishes the schema contract your downstream edits depend on.
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
roads = wbe.read_vector('roads.gpkg')
schema = roads.schema()
print(schema)
print('features:', roads.feature_count())
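One way to turn the printed schema into an enforceable contract is to compare the field names against the fields your edits depend on. The sketch below is a plain-Python helper, not part of the library. It assumes that attribute_field_names() (listed in the method reference) returns an iterable of strings, and the required field names are illustrative.

```python
def missing_fields(available, required):
    """Return the required field names that are absent, in sorted order."""
    return sorted(set(required) - set(available))


# Hypothetical usage with a loaded vector (not executed here):
#   names = roads.attribute_field_names()   # assumed to yield field-name strings
#   missing = missing_fields(names, ['name', 'speed', 'class'])
#   if missing:
#       raise ValueError(f"roads.gpkg is missing fields: {missing}")

print(missing_fields(['name', 'speed'], ['name', 'speed', 'class']))  # ['class']
```

Failing early on a missing field is usually cheaper than discovering it partway through a batch of attribute updates.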
Memory-Backed Vectors for Pipeline Efficiency
For workflows that chain multiple vector operations, memory-backed vectors eliminate disk I/O between steps. This is valuable for complex pipelines where intermediate results are passed between spatial operations.
Load a vector into memory with file_mode='m':
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
# Read directly into memory
roads = wbe.read_vector('roads.gpkg', file_mode='m')
rivers = wbe.read_vector('rivers.gpkg', file_mode='m')
print(roads.file_path) # prints: memory://vector/...
Memory-backed vectors are compatible with all downstream operations:
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
v = wbe.read_vector('polygons.gpkg', file_mode='m')
# Inspect schema and metadata
schema = v.schema()
meta = v.metadata()
# Pass to spatial tools
centroids = wbe.vector.geometry_processing.centroid_vector(v)
# Export to disk when ready
wbe.write_vector(centroids, 'centroids_final.gpkg')
Vector Memory Lifecycle
Memory-backed vectors persist until explicitly removed or cleared. For long-running vector pipelines, manage memory explicitly:
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
# Check current memory
print(f"Vectors in memory: {wbe.vector_memory_count()}")
# Read vectors
v1 = wbe.read_vector('large1.gpkg', file_mode='m')
v2 = wbe.read_vector('large2.gpkg', file_mode='m')
print(f"After reads: {wbe.vector_memory_count()}")
# Remove when done
wbe.remove_vector_from_memory(v1)
print(f"After remove: {wbe.vector_memory_count()}")
# Or clear all
wbe.clear_vector_memory()
print(f"After clear: {wbe.vector_memory_count()}")
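For long-running pipelines, the remove step can be tied to a scope so it is not forgotten when an error occurs. The following is a minimal sketch of a context manager built only on read_vector(..., file_mode='m') and remove_vector_from_memory() as shown above. The helper name memory_vector is hypothetical, and whether a Vector handle remains usable after removal is not covered here, so treat the handle as invalid once the block exits.

```python
from contextlib import contextmanager


@contextmanager
def memory_vector(wbe, path):
    """Load `path` as a memory-backed vector and release it on exit."""
    vector = wbe.read_vector(path, file_mode='m')
    try:
        yield vector
    finally:
        # Runs even if the body raises, so the memory store does not keep
        # vectors that are no longer needed.
        wbe.remove_vector_from_memory(vector)


# Hypothetical usage (requires whitebox_workflows and a real file):
#   with memory_vector(wbe, 'large1.gpkg') as v1:
#       print(v1.feature_count())
```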
Implicit Memory Output from Tools
All vector-output tools store their result in memory automatically when the
output parameter is omitted. You do not need to pass file_mode='m' or
choose a temporary path — simply leave output out and the returned Vector
object is already memory-backed:
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
roads = wbe.read_vector('roads.gpkg')
# No output path — result is stored in memory automatically
centroids = wbe.vector.geometry_processing.centroid_vector(roads)
print(centroids.file_path) # prints: memory://vector/...
# Chain operations without any intermediate files
clipped = wbe.vector.overlay_analysis.clip(centroids, 'boundary.gpkg')
print(clipped.file_path) # also memory://vector/...
# Persist the final result only
wbe.write_vector(clipped, 'result.gpkg')
This applies to all tool categories — GIS, hydrology, geomorphometry, and stream
network tools all follow the same rule. Providing an explicit output path
writes to disk as before.
Best practices:
- Use file_mode='m' for intermediate spatial analysis results.
- Export memory-backed vectors to disk with write_vector() when persisting final outputs.
- Call remove_vector_from_memory() after a vector is no longer needed.
- Use clear_vector_memory() between independent analysis phases.
- Use clear_memory() when resetting all in-process raster/vector/lidar stores together.
Iterate Through Features
Use feature iteration for inspections, QA checks, or bespoke attribute rules.
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
v = wbe.read_vector('roads.gpkg')
n = v.feature_count()
for i in range(n):
    attrs = v.attributes(i)
    # attrs is dict-like; process values
    print(i, attrs)
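The same loop can back a simple QA check. The sketch below collects the indices of features whose value for a given field is absent or empty. It relies on feature_count() and attributes(i) as used above, and assumes the dict-like record supports .get(); the function name is illustrative.

```python
def features_missing_value(vector, field):
    """Return indices of features whose `field` is absent, None, or an empty string."""
    flagged = []
    for i in range(vector.feature_count()):
        attrs = vector.attributes(i)
        value = attrs.get(field)  # assumes the record behaves like a dict
        if value is None or value == '':
            flagged.append(i)
    return flagged


# Hypothetical usage:
#   unnamed = features_missing_value(v, 'name')
#   print(f"{len(unnamed)} roads have no name")
```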
Read and Update Attribute Table
This example demonstrates single-field updates, grouped updates, and schema extension in one controlled sequence.
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
v = wbe.read_vector('roads.gpkg')
# Read one field value
name0 = v.attribute(0, 'name')
print('name[0]=', name0)
# Update one field
v.update_attribute(0, 'name', 'Main Street')
# Update multiple fields
v.update_attributes(1, {'speed': 50, 'class': 'collector'})
# Add a new field
v.add_field('reviewed', field_type='bool', default_value=False)
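When many features need edits, it can help to validate the whole batch before touching the table, then apply it in a fixed order. The helper below is a sketch of that idea using only update_attributes() as shown above. The name apply_edits and the known_fields argument are illustrative; the list of known fields would typically come from the schema.

```python
def apply_edits(vector, edits, known_fields):
    """Apply {feature_index: {field: value}} edits after checking field names.

    All field names are validated first, so a typo cannot leave the table
    half-updated. Edits are applied in ascending feature-index order.
    """
    requested = {field for changes in edits.values() for field in changes}
    unknown = requested - set(known_fields)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    for index, changes in sorted(edits.items()):
        vector.update_attributes(index, changes)
    return len(edits)


# Hypothetical usage:
#   apply_edits(v, {1: {'speed': 50, 'class': 'collector'}}, ['name', 'speed', 'class'])
```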
Persist Vector Outputs
This pattern shows both default extension behavior and explicit format control for reproducibility.
For complete write-option keys and allowed values, see Output Controls.
import whitebox_workflows as wb
wbe = wb.WbEnvironment()
roads = wbe.read_vector('roads.gpkg')
centroids = wbe.vector.geometry_processing.centroid_vector(roads)
# Extensionless output defaults to GeoPackage
wbe.write_vector(centroids, 'roads_centroids')
# Explicit output format with format-specific options
wbe.write_vector(centroids, 'roads_centroids.parquet', options={
    'strict_format_options': True,
    'geoparquet': {'compression': 'zstd'},
})
Practical Notes
- Use schema() first to validate field names and types.
- Prefer update_attributes() for grouped edits to a feature.
- Re-read and validate after major writes, especially when switching formats.
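The re-read check in the last note can be sketched as a small wrapper around write_vector() and read_vector(). This version compares only feature counts, which is a coarse check; comparing field names as well would be a natural extension. The helper name write_and_verify is illustrative.

```python
def write_and_verify(wbe, vector, path):
    """Write `vector` to `path`, re-read it, and compare feature counts."""
    expected = vector.feature_count()
    wbe.write_vector(vector, path)
    reread = wbe.read_vector(path)
    actual = reread.feature_count()
    if actual != expected:
        raise RuntimeError(f"{path}: wrote {expected} features, re-read {actual}")
    return reread


# Hypothetical usage:
#   checked = write_and_verify(wbe, centroids, 'roads_centroids.gpkg')
```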
Vector Object Method Reference
Common simple properties such as file_path and file_name are omitted here so
the tables stay focused on callable Vector methods.
Schema and Attribute Access
| Method | Description |
|---|---|
| schema | Return the vector schema, including field structure and geometry information. |
| feature_count | Report how many features are present. |
| attribute_fields, attribute_field_names | Inspect available attribute fields by full definition or by field name list. |
| attribute | Read a single field value from one feature. |
| attributes | Read all attribute values for one feature as a grouped record. |
| add_field | Add a new attribute field to the dataset schema. |
| update_attribute | Update one field in one feature. |
| update_attributes | Update multiple fields in one feature at once. |
File, Metadata, and Copying
| Method | Description |
|---|---|
| metadata | Return VectorMetadata describing file state, CRS, and feature count. |
| absolute_path | Resolve the vector to an absolute file path string. |
| parent_directory | Return the containing directory path. |
| exists | Check whether the backing dataset exists on disk. |
| get_short_filename, get_file_extension | Return convenience filename information. |
| get_file_size_in_bytes, get_last_modified_unix_seconds | Inspect filesystem metadata for reporting or audit logs. |
| deep_copy | Write a copied vector dataset to a derived or explicit output path. |
CRS and Geometry-Safe Persistence
| Method | Description |
|---|---|
| crs_wkt, crs_epsg | Inspect CRS metadata as WKT text or EPSG code. |
| set_crs_wkt, set_crs_epsg | Assign CRS metadata without moving feature coordinates. |
| clear_crs | Remove CRS metadata so it can be assigned again explicitly. |
| reproject | Reproject the vector dataset with explicit failure, topology, and antimeridian policies. |
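Because assigning a CRS does not move coordinates, it is worth confirming that two layers already share a CRS before combining them. The guard below is a sketch only. It assumes crs_epsg() takes no arguments and returns an integer code, or None when no EPSG code is set; neither detail is confirmed by the table above, and the helper name is illustrative.

```python
def require_same_epsg(a, b):
    """Return the shared EPSG code of two vectors, or raise if they differ."""
    code_a, code_b = a.crs_epsg(), b.crs_epsg()  # assumed: int or None
    if code_a is None or code_b is None:
        raise ValueError("one or both vectors have no EPSG code assigned")
    if code_a != code_b:
        raise ValueError(f"CRS mismatch: EPSG:{code_a} vs EPSG:{code_b}")
    return code_a


# Hypothetical usage before an overlay:
#   require_same_epsg(roads, boundary)
```

When the codes differ, reproject is the appropriate fix; set_crs_epsg would only relabel the coordinates.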