# Planetiler Architecture

![Architecture Diagram](diagrams/architecture.png)

Planetiler builds a map in 3 phases:

1. [Process Input Files](#1-process-input-files) according to the [Profile](#profiles) and write vector tile features to
   intermediate files on disk
2. [Sort Features](#2-sort-features) by tile ID
3. [Emit Vector Tiles](#3-emit-vector-tiles) by iterating through sorted features to group by tile ID, encoding, and
   writing to the output tile archive

User-defined [profiles](#profiles) customize the behavior of each part of this pipeline.

## 1) Process Input Files

First, Planetiler
reads [SourceFeatures](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/SourceFeature.java)
from each input source:

- For "simple sources",
  [NaturalEarthReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/NaturalEarthReader.java)
  or [ShapefileReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/ShapefileReader.java) can get the
  latitude/longitude geometry directly from each feature
- For OpenStreetMap `.osm.pbf` files,
  [OsmReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/osm/OsmReader.java)
  needs to make 2 passes through the input file to construct feature geometries:
  - pass 1:
    - nodes: store node latitude/longitude locations in memory or on disk
      using [LongLongMap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMap.java)
    - ways: nothing
    - relations: call `preprocessOsmRelation` on the [profile](#profiles) and store the information returned for each
      relation of interest, along with relation member IDs, in memory using
      a [LongLongMultimap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMultimap.java)
  - pass 2:
    - nodes: emit a point source feature
    - ways:
      - look up the latitude/longitude for each node ID to get the way geometry, and look up the relations that
        contain the way
      - emit a source feature with the reconstructed geometry, which can be either a line or a polygon depending on
        the `area` tag and whether the way is closed
      - if this way is part of a multipolygon, also save the way geometry in memory for later in
        a [LongLongMultimap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMultimap.java)
    - relations: for any multipolygon relation, fetch the member geometries and attempt to reconstruct the multipolygon
      geometry
      using [OsmMultipolygon](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/osm/OsmMultipolygon.java),
      then emit a polygon source feature with the reconstructed geometry if successful

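
The two passes over an OpenStreetMap file can be pictured with a small sketch. It is not the actual `OsmReader` code: the class and method names are hypothetical, a plain `HashMap` stands in for the node location store, and silently skipping unknown nodes is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TwoPassSketch {
  // Stand-in for the node location store (LongLongMap in the text); a HashMap is an assumption.
  private final Map<Long, double[]> nodeLocations = new HashMap<>();

  /** Pass 1: remember the latitude/longitude of a node. */
  void storeNode(long id, double lat, double lon) {
    nodeLocations.put(id, new double[] {lat, lon});
  }

  /** Pass 2: rebuild a way's geometry by looking up each node ID. */
  List<double[]> wayGeometry(long[] nodeIds) {
    List<double[]> coords = new ArrayList<>();
    for (long nodeId : nodeIds) {
      double[] location = nodeLocations.get(nodeId);
      if (location != null) { // this sketch skips nodes it never saw in pass 1
        coords.add(location);
      }
    }
    return coords;
  }

  /** A closed way starts and ends on the same node, so it may describe a polygon. */
  static boolean isClosed(long[] nodeIds) {
    return nodeIds.length >= 4 && nodeIds[0] == nodeIds[nodeIds.length - 1];
  }
}
```

The second pass can only run once every node has been stored, which is why the input file has to be read twice.
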

Then, for each [SourceFeature](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/SourceFeature.java),
generate vector tile features according to the [profile](#profiles) in a worker thread (default 1 per core):

- Call the `processFeature` method on the [profile](#profiles) for each source feature
- For every vector tile feature added to
  the [FeatureCollector](planetiler-core/src/main/java/com/onthegomap/planetiler/FeatureCollector.java):
  - Call [FeatureRenderer#accept](planetiler-core/src/main/java/com/onthegomap/planetiler/render/FeatureRenderer.java)
    which, for each zoom level the feature appears in, does the following:
    - Scale the geometry to that zoom level
    - Simplify it in screen pixel coordinates
    - Use [TiledGeometry](planetiler-core/src/main/java/com/onthegomap/planetiler/render/TiledGeometry.java)
      to slice the geometry into subcomponents that appear in every tile it touches using the stripe clipping algorithm
      derived from [geojson-vt](https://github.com/mapbox/geojson-vt):
      - `sliceX` splits the geometry into vertical slices for each "column" representing the X coordinate of a vector
        tile
      - `sliceY` splits each "column" into "rows" representing the Y coordinate of a vector tile
      - Uses an [IntRangeSet](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/IntRangeSet.java) to
        optimize processing for large filled areas (like oceans)
      - If any features wrapped past -180 or 180 degrees longitude, repeat with a 360 or -360 degree offset
    - Reassemble each vector tile geometry and round to tile precision (4096x4096)
      - For polygons,
        [GeoUtils#snapAndFixPolygon](planetiler-core/src/main/java/com/onthegomap/planetiler/geo/GeoUtils.java)
        uses [JTS](https://github.com/locationtech/jts) utilities to fix any topology errors (e.g. self-intersections)
        introduced by rounding. This is very expensive, but necessary since clients
        like [MapLibre GL JS](https://github.com/maplibre/maplibre-gl-js) produce rendering artifacts for invalid
        polygons.
    - Encode the feature into a compact binary format
      using [FeatureGroup#newRenderedFeatureEncoder](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/FeatureGroup.java)
      consisting of a sortable 64-bit `long` key (zoom, x, y, layer, sort order) and a binary value encoded
      using [MessagePack](https://msgpack.org/) (feature group/limit, feature ID, geometry type, tags, geometry)
    - Add the encoded feature to
      a [WorkQueue](planetiler-core/src/main/java/com/onthegomap/planetiler/worker/WorkQueue.java)

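
To make the column slicing and rounding steps above more concrete, here is a minimal sketch for a single line segment. It is a simplification and not the `TiledGeometry` implementation: the class and method names are hypothetical, coordinates are assumed to be in tile units at the current zoom, and details such as tile buffers, polygons and filled-area handling are left out.

```java
import java.util.Map;
import java.util.TreeMap;

class StripeSliceSketch {
  static final int TILE_EXTENT = 4096; // tile precision from the text

  /** Linear interpolation of y along the segment at a given x. */
  static double yAt(double x, double x0, double y0, double x1, double y1) {
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
  }

  /**
   * Splits one segment into the piece that falls in each tile column.
   * Each piece is {fromX, fromY, toX, toY}, ordered by increasing x.
   */
  static Map<Integer, double[]> sliceX(double x0, double y0, double x1, double y1) {
    Map<Integer, double[]> columns = new TreeMap<>();
    if (x0 == x1) { // vertical segment: it lies entirely in one column
      columns.put((int) Math.floor(x0), new double[] {x0, y0, x1, y1});
      return columns;
    }
    double minX = Math.min(x0, x1);
    double maxX = Math.max(x0, x1);
    for (int col = (int) Math.floor(minX); col < maxX; col++) {
      double fromX = Math.max(minX, col);     // clip to the left edge of the column
      double toX = Math.min(maxX, col + 1);   // clip to the right edge of the column
      columns.put(col, new double[] {
        fromX, yAt(fromX, x0, y0, x1, y1), toX, yAt(toX, x0, y0, x1, y1)});
    }
    return columns;
  }

  /** Rounds a coordinate in tile units to integer units inside the given tile column or row. */
  static long toTileUnits(double coordinate, int tile) {
    return Math.round((coordinate - tile) * TILE_EXTENT);
  }
}
```

Slicing each column into rows works the same way with the axes swapped, after which every piece belongs to exactly one tile and can be rounded with `toTileUnits`.
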

Finally, a single-threaded writer reads encoded features off of the work queue and writes them to disk
using [ExternalMergeSort#add](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java):

- Write features to a "chunk" file until that file hits a size limit (e.g. 1 GB), then start writing to a new file

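
The sortable 64-bit key attached to each encoded feature (zoom, x, y, layer, sort order) can be thought of as bit packing, where the most significant field dominates the comparison. The sketch below illustrates the idea only: the class name and the bit widths are assumptions chosen so that the fields fit in 63 bits, and they are not taken from `FeatureGroup`.

```java
class FeatureKeySketch {
  // Assumed bit widths, for illustration only. They sum to 63 so the sign bit stays clear
  // and a plain signed comparison of two keys gives the intended order.
  static final int XY_BITS = 15;
  static final int LAYER_BITS = 8;
  static final int SORT_BITS = 21;

  /** Packs the fields from most significant (zoom) to least significant (sort order). */
  static long encode(int zoom, int x, int y, int layer, int sortOrder) {
    // Assumes every argument is non-negative and fits in its field.
    long key = zoom;
    key = (key << XY_BITS) | x;
    key = (key << XY_BITS) | y;
    key = (key << LAYER_BITS) | layer;
    key = (key << SORT_BITS) | sortOrder;
    return key;
  }

  /** Extracts the zoom level again. */
  static int zoom(long key) {
    return (int) (key >>> (2 * XY_BITS + LAYER_BITS + SORT_BITS));
  }

  /** Extracts the layer again. */
  static int layer(long key) {
    return (int) ((key >>> SORT_BITS) & ((1 << LAYER_BITS) - 1));
  }
}
```

Because the tile coordinates occupy the high bits, sorting by this key places all features of one tile next to each other, ordered by layer and then by sort order within the tile.
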
## 2) Sort Features
[ExternalMergeSort](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java) sorts all
of the intermediate features using a worker thread per core:
- Read each "chunk" file into memory
- Sort the features it contains by 64-bit `long` key
- Write the chunk back to disk
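
The chunk-and-sort steps above can be sketched in memory as follows. This is a toy model, not `ExternalMergeSort`: arrays of keys stand in for chunk files on disk, the chunk limit is counted in entries rather than bytes, and a parallel stream stands in for the worker threads. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChunkSortSketch {
  /** Splits keys into chunks of at most chunkSize entries, mimicking size-limited chunk files. */
  static List<long[]> writeChunks(long[] keys, int chunkSize) {
    List<long[]> chunks = new ArrayList<>();
    for (int start = 0; start < keys.length; start += chunkSize) {
      int end = Math.min(keys.length, start + chunkSize);
      chunks.add(Arrays.copyOfRange(keys, start, end));
    }
    return chunks;
  }

  /** Sorts every chunk independently; chunks do not depend on each other, so this parallelizes. */
  static void sortChunks(List<long[]> chunks) {
    chunks.parallelStream().forEach(Arrays::sort);
  }
}
```

After this step each chunk is sorted on its own, but the chunks are not yet ordered relative to each other.
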
## 3) Emit Vector Tiles

[TileArchiveWriter](planetiler-core/src/main/java/com/onthegomap/planetiler/archive/TileArchiveWriter.java) is the main
driver.

First, a single-threaded reader reads features from disk:

- [ExternalMergeSort](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java) emits
  sorted features by doing a k-way merge using a priority queue of iterators reading from each sorted chunk
- [FeatureGroup](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/FeatureGroup.java) collects
  consecutive features in the same tile into a `TileFeatures` instance, dropping features in the same group over the
  grouping limit to limit point label density
- Then [TileArchiveWriter](planetiler-core/src/main/java/com/onthegomap/planetiler/archive/TileArchiveWriter.java)
  groups tiles into variable-sized batches for workers to process (complex tiles get their own batch to ensure workers
  stay busy while the writer thread waits for finished tiles in order)

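
The k-way merge mentioned in the first bullet above can be sketched with `java.util.PriorityQueue`. This is a generic illustration of the technique rather than the project's code: the class names are hypothetical and in-memory arrays stand in for iterators over sorted chunk files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class KWayMergeSketch {
  /** Cursor over one sorted chunk, ordered by the key it currently points at. */
  static final class ChunkCursor implements Comparable<ChunkCursor> {
    final long[] chunk;
    int pos = 0;

    ChunkCursor(long[] chunk) {
      this.chunk = chunk;
    }

    long current() {
      return chunk[pos];
    }

    /** Moves to the next key; returns false once the chunk is exhausted. */
    boolean advance() {
      return ++pos < chunk.length;
    }

    @Override
    public int compareTo(ChunkCursor other) {
      return Long.compare(current(), other.current());
    }
  }

  /** Merges already-sorted chunks into one globally sorted list. */
  static List<Long> merge(List<long[]> sortedChunks) {
    PriorityQueue<ChunkCursor> queue = new PriorityQueue<>();
    for (long[] chunk : sortedChunks) {
      if (chunk.length > 0) {
        queue.add(new ChunkCursor(chunk));
      }
    }
    List<Long> merged = new ArrayList<>();
    while (!queue.isEmpty()) {
      ChunkCursor smallest = queue.poll(); // cursor with the smallest current key
      merged.add(smallest.current());
      if (smallest.advance()) {
        queue.add(smallest); // re-insert so it is ordered by its next key
      }
    }
    return merged;
  }
}
```

Only one key per chunk needs to be held in the queue at a time, which is what lets a single reader stream through data that does not fit in memory.
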

Then, process tile batches in worker threads (default 1 per core):

- For each tile in the batch, first check to see if it has the same contents as the previous tile to avoid re-encoding
  the same thing for large filled areas (e.g. oceans)
- Encode using [VectorTile](planetiler-core/src/main/java/com/onthegomap/planetiler/VectorTile.java)
- gzip each encoded tile
- Pass the batch of encoded vector tiles to the writer thread

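
The "same contents as the previous tile" shortcut and the gzip step can be illustrated with the standard `java.util.zip` classes. The sketch is an assumption-laden simplification: the class name is hypothetical, raw byte arrays stand in for a tile's contents, and the real comparison may happen on the tile's features before any bytes exist.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

class TileEncodeSketch {
  private byte[] previousRaw;     // contents of the previous tile
  private byte[] previousGzipped; // compressed result, reused for identical tiles
  int compressions = 0;           // counts how often gzip actually ran

  /** Compresses bytes with gzip. */
  static byte[] gzip(byte[] raw) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  /** Returns the gzipped tile, reusing the previous result when the contents repeat. */
  byte[] encode(byte[] raw) {
    if (previousRaw != null && Arrays.equals(previousRaw, raw)) {
      return previousGzipped;
    }
    previousRaw = raw;
    previousGzipped = gzip(raw);
    compressions++;
    return previousGzipped;
  }
}
```

Since features arrive sorted by tile, runs of identical tiles such as open ocean tend to be adjacent, so remembering only the previous tile is enough to catch them.
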

Finally, a single-threaded writer writes encoded vector tiles to the output archive format:

- For MBTiles, create the largest prepared statement supported by SQLite (999 parameters)
- Iterate through finished vector tile batches until the prepared statement is full, flush to disk, then repeat
- Then flush any remaining tiles at the end

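
As a rough illustration of the batched MBTiles write, the sketch below builds a multi-row `INSERT` that stays within the 999-parameter limit. It assumes the simple `tiles` table from the MBTiles specification with four columns per row; the class and method names are hypothetical and the statement Planetiler actually prepares may differ.

```java
class BatchInsertSketch {
  static final int MAX_PARAMS = 999;     // parameter limit quoted in the text above
  static final int COLUMNS_PER_TILE = 4; // assumed: zoom_level, tile_column, tile_row, tile_data

  /** Number of tiles that fit into one prepared statement. */
  static int tilesPerStatement() {
    return MAX_PARAMS / COLUMNS_PER_TILE;
  }

  /** Builds a multi-row insert with one (?,?,?,?) group per tile. */
  static String insertSql(int tiles) {
    StringBuilder sql = new StringBuilder(
      "INSERT INTO tiles (zoom_level, tile_column, tile_row, tile_data) VALUES ");
    for (int i = 0; i < tiles; i++) {
      if (i > 0) {
        sql.append(',');
      }
      sql.append("(?,?,?,?)");
    }
    return sql.toString();
  }
}
```

Under these assumptions one statement carries 249 tiles (996 parameters). A shorter statement built the same way can flush whatever is left at the end.
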
## Profiles

To customize the behavior of this pipeline, custom profiles implement
the [Profile](planetiler-core/src/main/java/com/onthegomap/planetiler/Profile.java) interface to override:

- what vector tile features to generate from an input feature
- what information from OpenStreetMap relations we need to save for later use
- how to post-process vector features grouped into a tile before emitting

A Java project can implement this interface and add arbitrarily complex processing when overriding the methods.
The [custommap](planetiler-custommap) project defines
a [ConfiguredProfile](planetiler-custommap/src/main/java/com/onthegomap/planetiler/custommap/ConfiguredProfile.java)
implementation that loads instructions from a YAML config file to dynamically control how schemas are generated without
needing to write or compile Java code.
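
The shape of a profile can be pictured with a heavily simplified, self-contained sketch. Everything in it is hypothetical: `SimpleProfile` is a stand-in and not the real `Profile` interface, and plain maps and strings replace Planetiler's source feature, feature collector and tile feature types. Only the method name `processFeature` comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ProfileSketch {
  /** Hypothetical, simplified stand-in for the Profile interface. */
  interface SimpleProfile {
    /** Decides which output layers a source feature contributes to. */
    void processFeature(Map<String, String> sourceTags, List<String> outputLayers);

    /** Post-processes the features of one tile; the default keeps them unchanged. */
    default List<String> postProcess(List<String> tileFeatures) {
      return tileFeatures;
    }
  }

  /** Example profile that only emits roads. */
  static final class RoadsProfile implements SimpleProfile {
    @Override
    public void processFeature(Map<String, String> sourceTags, List<String> outputLayers) {
      if (sourceTags.containsKey("highway")) {
        outputLayers.add("road");
      }
    }
  }

  /** Runs one source feature through a profile, as the pipeline would for every feature. */
  static List<String> run(SimpleProfile profile, Map<String, String> sourceTags) {
    List<String> layers = new ArrayList<>();
    profile.processFeature(sourceTags, layers);
    return profile.postProcess(layers);
  }
}
```

The pipeline owns the iteration over features and tiles, while the profile supplies only the decisions, which is what allows the same engine to produce very different map schemas.
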