Written by Paul
Introduction
When developing a map-based service, handling pin data is essential, and implementing clustering for multiple pins is often necessary. Key considerations in clustering include:
- Determining the "distance level" for clustering.
- Managing clusters based on zoom levels in client SDKs like Mapbox, Kakao Map, or Naver Map.
Model Setup
Using a Prisma schema, the required fields are lat, lng, and geohash, as shown below:

    model Location {
      lat     Float   // latitude in decimal degrees
      lng     Float   // longitude in decimal degrees
      geohash String
      title   String
      // Additional fields as needed
    }
These fields support querying (via geohash) and clustering (via lat and lng).
Querying Location Data with Geohash
On the backend, map data is queried, clustered, and sent to the client as an HTTP response. The following outlines how to query location data using geohash.
Client Side
In map SDKs like Mapbox, Kakao Map, or Naver Map, you can retrieve the boundary coordinates (bounds) visible to the user, defined as SouthWestLng, SouthWestLat, NorthEastLng, and NorthEastLat. After obtaining these boundaries, install the ngeohash library:

    $ yarn add ngeohash
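Using ngeohash, generate all geohash values within these bounds as follows. Here precision is the geohash precision for the current zoom level (see getGeohashPrecision below):

    import ngeohash from 'ngeohash';

    // swlat/swlng and nelat/nelng are the south-west and north-east corners of the bounds.
    // Cells are generated two levels coarser than precision, but never below 1.
    const bboxes = ngeohash.bboxes(swlat, swlng, nelat, nelng, Math.max(precision - 2, 1));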
To adjust precision based on zoom level, use the following function:

    const getGeohashPrecision = (zoomLevel: number) => {
      if (zoomLevel <= 3) return 2;
      if (zoomLevel <= 6) return 4;
      if (zoomLevel <= 9) return 6;
      if (zoomLevel <= 12) return 7;
      return 8;
    };
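To make the link between zoom level and the bounding-box call concrete, here is a small sketch that combines the mapping above with the "two levels coarser, minimum 1" rule used when calling ngeohash.bboxes. The helper name getQueryPrecision is hypothetical and not part of the original code.

```typescript
// Zoom-to-precision mapping, as defined in this article.
const getGeohashPrecision = (zoomLevel: number): number => {
  if (zoomLevel <= 3) return 2;
  if (zoomLevel <= 6) return 4;
  if (zoomLevel <= 9) return 6;
  if (zoomLevel <= 12) return 7;
  return 8;
};

// Hypothetical helper: the precision actually handed to ngeohash.bboxes,
// two levels coarser than the zoom-based precision but never below 1.
const getQueryPrecision = (zoomLevel: number): number =>
  Math.max(getGeohashPrecision(zoomLevel) - 2, 1);

// Query precision for a few sample zoom levels.
const samples = [2, 5, 8, 11, 16].map((zoom) => ({
  zoom,
  queryPrecision: getQueryPrecision(zoom),
}));
console.log(samples);
```

Lower zoom levels therefore query with larger cells, which keeps the number of geohashes sent to the server small when the user is zoomed far out.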
The bboxes array now contains the geohash cells that cover the map's boundaries. Pass these geohashes to the server via HTTP.
Server Side
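The article does not specify how the geohashes travel over HTTP. One possible approach, sketched here as an assumption, is a single comma-separated query parameter. The parameter name geohashes, the helper names, and the 12-character length cap are all hypothetical; the validation step simply rejects anything outside the geohash base32 alphabet before it reaches the database query.

```typescript
// Geohash base32 alphabet: digits plus lowercase letters except a, i, l, o.
// The length cap of 12 is an assumption for this sketch.
const GEOHASH_PATTERN = /^[0-9b-hjkmnp-z]{1,12}$/;
const PARAM_PREFIX = 'geohashes=';

// Client: serialize the cells into one query parameter.
function buildQuery(cells: string[]): string {
  return PARAM_PREFIX + encodeURIComponent(cells.join(','));
}

// Server: decode the parameter and keep only well-formed geohashes.
function parseQuery(query: string): string[] {
  if (query.slice(0, PARAM_PREFIX.length) !== PARAM_PREFIX) return [];
  return decodeURIComponent(query.slice(PARAM_PREFIX.length))
    .split(',')
    .map((cell) => cell.trim().toLowerCase())
    .filter((cell) => GEOHASH_PATTERN.test(cell));
}

const query = buildQuery(['wydm', 'wydq']);
const geohashes = parseQuery(query);
console.log(query, geohashes);
```

Most web frameworks already decode query parameters, in which case only the split and filter steps are needed on the server.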
After parsing the geohashes from the query string, use them to query location data as follows:

    // Match every location whose geohash begins with one of the requested prefixes.
    const data = await database.location.findMany({
      where: {
        OR: geohashes.map((prefix) => ({
          geohash: { startsWith: prefix },
        })),
      },
    });
The result will include all location data within the requested geohash regions.
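The prefix query works because geohashes are hierarchical: every extra character subdivides the previous cell, so the geohash stored for a point at high precision always begins with the geohash of the coarser cell that contains it. The following is a minimal, self-contained sketch of the standard geohash encoding, written only to illustrate that property. It is not the ngeohash implementation, and the function name encodeGeohash is hypothetical.

```typescript
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Encode a coordinate by repeatedly halving the longitude and latitude ranges.
// Bits are interleaved (longitude first) and every 5 bits become one character.
function encodeGeohash(lat: number, lng: number, precision: number): string {
  let latMin = -90, latMax = 90;
  let lngMin = -180, lngMax = 180;
  let hash = '';
  let bitCount = 0;
  let value = 0;
  let useLng = true;

  while (hash.length < precision) {
    if (useLng) {
      const mid = (lngMin + lngMax) / 2;
      if (lng >= mid) { value = value * 2 + 1; lngMin = mid; }
      else { value = value * 2; lngMax = mid; }
    } else {
      const mid = (latMin + latMax) / 2;
      if (lat >= mid) { value = value * 2 + 1; latMin = mid; }
      else { value = value * 2; latMax = mid; }
    }
    useLng = !useLng;
    bitCount += 1;
    if (bitCount === 5) {
      hash += BASE32.charAt(value);
      bitCount = 0;
      value = 0;
    }
  }
  return hash;
}

// A stored high-precision geohash and the coarse cell requested by the client.
const stored = encodeGeohash(57.64911, 10.40744, 8);
const requested = encodeGeohash(57.64911, 10.40744, 3);
const matchesPrefix = stored.slice(0, requested.length) === requested;
console.log(stored, requested, matchesPrefix);
```

This is why the server can store one full-precision geohash per location and still answer queries at any coarser precision with a simple startsWith filter.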
Clustering with K-means
For clustering, you can use the ml-kmeans library:

    import { kmeans } from 'ml-kmeans';

    const limit = 20; // number of clusters (k)
    const kmeansData = kmeans(
      data.map((item) => [Number(item.lat), Number(item.lng)]), // one [lat, lng] point per row
      limit,
      {}
    );
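The limit value defines the number of clusters. Instead of a fixed 20, it can be derived from the data size:

    const limit = data.length < 100 ? data.length : Math.min(data.length, geohashes.length);

With fewer than 100 points, every point gets its own cluster; beyond that, the cluster count is capped by the number of requested geohash cells. K-means needs at least one point and no more clusters than points, so return early when the query yields no rows.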
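The cluster-count rule can be wrapped in a small function that also guards the edge cases. This is a sketch under assumptions: the name chooseClusterCount is hypothetical, and returning 0 is used here as a signal that the caller should skip k-means entirely.

```typescript
// Hypothetical helper: pick k for k-means from the number of points
// and the number of geohash cells requested by the client.
function chooseClusterCount(pointCount: number, cellCount: number): number {
  // Nothing to cluster: the caller should skip k-means and return an empty result.
  if (pointCount <= 0) return 0;

  // Same rule as in the article: one cluster per point for small result sets,
  // otherwise cap the cluster count by the number of requested cells.
  const wanted = pointCount < 100 ? pointCount : Math.min(pointCount, cellCount);

  // Clamp to [1, pointCount] so k is always a valid cluster count.
  return Math.max(1, Math.min(wanted, pointCount));
}

console.log(chooseClusterCount(50, 10));  // small result set
console.log(chooseClusterCount(500, 12)); // large result set, few cells
```

The clamp matters when a large result set arrives with an empty or missing geohash list, where the original expression would evaluate to 0.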
Once clusters are created, you can associate the cluster IDs with the queried data. The result's clusters array holds one cluster index per input point, in the same order as data:

    const clusterData = kmeansData.clusters.map((clusterId, index) => ({
      ...data[index],
      clusterId,
    }));
Aggregating Clustered Data
Aggregate data for each cluster using reduce:

    const locationPins = clusterData.reduce(
      (acc, cluster) => {
        const { clusterId } = cluster;
        // First point seen for this cluster: create its accumulator.
        if (!acc[clusterId]) {
          acc[clusterId] = { title: '', sumLat: 0, sumLng: 0, locations: [] };
        }
        // Keep running sums so the centroid can be computed afterwards.
        acc[clusterId].sumLat += Number(cluster.lat);
        acc[clusterId].sumLng += Number(cluster.lng);
        acc[clusterId].locations.push({
          title: cluster.title,
          lat: cluster.lat ? Number(cluster.lat) : 0,
          lng: cluster.lng ? Number(cluster.lng) : 0,
        });
        return acc;
      },
      {} as {
        [key: string]: {
          title: string;
          sumLat: number;
          sumLng: number;
          locations: { title: string; lat: number; lng: number }[];
        };
      }
    );
Now, transform the aggregated locationPins into a serialized DTO to return in the response:

    return Object.keys(locationPins).map(
      (key) =>
        new LocationPinDTO({
          // The pin position is the mean of the cluster's coordinates.
          lat: locationPins[key].sumLat / locationPins[key].locations.length,
          lng: locationPins[key].sumLng / locationPins[key].locations.length,
          count: locationPins[key].locations.length,
          locations: locationPins[key].locations,
        })
    );
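To check the aggregation on a small input, the sketch below runs the same group-then-average logic on three hand-written points. It uses plain objects in place of LocationPinDTO and a hand-written list of cluster IDs in place of the k-means output; the names Point, Pin, and aggregate are hypothetical, and the coordinates are made up for illustration.

```typescript
interface Point { title: string; lat: number; lng: number; }
interface Pin { lat: number; lng: number; count: number; locations: Point[]; }

// Group points by cluster ID, then average each group's coordinates.
function aggregate(points: Point[], clusterIds: number[]): Pin[] {
  const groups: { [id: string]: Point[] } = {};
  points.forEach((point, index) => {
    const id = String(clusterIds[index]);
    const bucket = groups[id] || [];
    bucket.push(point);
    groups[id] = bucket;
  });

  return Object.keys(groups).map((id) => {
    const locations = groups[id] || [];
    const sumLat = locations.reduce((sum, p) => sum + p.lat, 0);
    const sumLng = locations.reduce((sum, p) => sum + p.lng, 0);
    return {
      lat: sumLat / locations.length,
      lng: sumLng / locations.length,
      count: locations.length,
      locations,
    };
  });
}

// Illustrative data: two nearby points in cluster 0, one distant point in cluster 1.
const pins = aggregate(
  [
    { title: 'A', lat: 37.5, lng: 127.0 },
    { title: 'B', lat: 37.75, lng: 127.5 },
    { title: 'C', lat: 35.0, lng: 129.0 },
  ],
  [0, 0, 1]
);
console.log(pins);
```

The first pin sits midway between A and B with a count of 2, while C remains a single-location pin at its own coordinates.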
Client Rendering
On the client side, parse the response data and render it on the map. Render a single pin if count is 1; otherwise, render a cluster icon.
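The pin-versus-cluster decision can be kept independent of the map SDK by first converting each response item into a small marker description. This is a sketch under assumptions: the LocationPin shape mirrors the fields returned by the server above, while the Marker type and the toMarker function are hypothetical names, and the actual drawing call depends on the SDK in use.

```typescript
interface LocationPin {
  lat: number;
  lng: number;
  count: number;
  locations: { title: string; lat: number; lng: number }[];
}

type Marker =
  | { kind: 'pin'; lat: number; lng: number; title: string }
  | { kind: 'cluster'; lat: number; lng: number; label: string };

// Decide what to draw for one item of the server response.
function toMarker(pin: LocationPin): Marker {
  const [only] = pin.locations;
  if (pin.count === 1 && only) {
    // A single location: show it at its own position with its title.
    return { kind: 'pin', lat: only.lat, lng: only.lng, title: only.title };
  }
  // Several locations: show a cluster icon at the centroid, labeled with the count.
  return { kind: 'cluster', lat: pin.lat, lng: pin.lng, label: String(pin.count) };
}

const single = toMarker({
  lat: 35, lng: 129, count: 1,
  locations: [{ title: 'C', lat: 35, lng: 129 }],
});
const cluster = toMarker({
  lat: 37.625, lng: 127.25, count: 2,
  locations: [
    { title: 'A', lat: 37.5, lng: 127.0 },
    { title: 'B', lat: 37.75, lng: 127.5 },
  ],
});
console.log(single, cluster);
```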