Class DuplicatesFilter<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>
java.lang.Object
org.apache.sedona.core.joinJudgement.DuplicatesFilter<U,T>
- Type Parameters:
U -
T -
- All Implemented Interfaces:
Serializable,org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U, T>>, Iterator<org.apache.commons.lang3.tuple.Pair<U, T>>>
public class DuplicatesFilter<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>
extends Object
implements org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>,Iterator<org.apache.commons.lang3.tuple.Pair<U,T>>>
Provides optional deduplication logic. Because of how spatial partitioning works, the same pair of
geometries may appear in multiple partitions. If that pair satisfies the join condition, it is
included in the join results multiple times. This duplication can be avoided by (1) choosing a spatial
partitioning whose partition extents do not overlap and (2) reporting a pair of
matching geometries only from the partition whose extent contains the reference point of the
intersection of the geometries.
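The reference-point rule can be illustrated with a small, dependency-free sketch. It is not the class's actual implementation: the `ReferencePointDedup` class, its `Rect` type, and the helper names are hypothetical, and it assumes that the reference point is the lower-left corner of the intersection of the two geometries' bounding boxes and that partition extents are tested with half-open containment, so that a point on a shared edge belongs to exactly one extent.

```java
// Hypothetical sketch of reference-point deduplication; not Sedona's code.
class ReferencePointDedup {

    /** Axis-aligned rectangle standing in for a bounding box or a partition extent. */
    static final class Rect {
        final double minX, minY, maxX, maxY;

        Rect(double minX, double minY, double maxX, double maxY) {
            this.minX = minX;
            this.minY = minY;
            this.maxX = maxX;
            this.maxY = maxY;
        }

        /** Half-open test: the lower/left edges are inside, the upper/right edges are not. */
        boolean containsHalfOpen(double x, double y) {
            return x >= minX && x < maxX && y >= minY && y < maxY;
        }
    }

    /**
     * Assumed reference point: lower-left corner of the intersection of the
     * two boxes. The caller must pass boxes that actually overlap.
     */
    static double[] referencePoint(Rect a, Rect b) {
        return new double[] {Math.max(a.minX, b.minX), Math.max(a.minY, b.minY)};
    }

    /** True if this partition is the one that should report the matching pair. */
    static boolean shouldReport(Rect partitionExtent, Rect left, Rect right) {
        double[] p = referencePoint(left, right);
        return partitionExtent.containsHalfOpen(p[0], p[1]);
    }

    public static void main(String[] args) {
        // Two non-overlapping partition extents sharing the edge x = 10.
        Rect west = new Rect(0, 0, 10, 10);
        Rect east = new Rect(10, 0, 20, 10);

        // Both boxes straddle the shared edge, so the pair is seen in both partitions.
        Rect a = new Rect(8, 2, 12, 4);
        Rect b = new Rect(9, 3, 13, 5);

        // The reference point is (9, 3), which lies only in the west extent.
        System.out.println("west reports: " + shouldReport(west, a, b));
        System.out.println("east reports: " + shouldReport(east, a, b));
    }
}
```

Because the extents do not overlap and containment is half-open, exactly one partition accepts any given reference point, so each matching pair is emitted once even though both partitions evaluate it.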
Constructor Summary
Constructors
Constructor | Description
DuplicatesFilter(org.apache.spark.broadcast.Broadcast<DedupParams> dedupParamsBroadcast) | -
Method Summary
-
Constructor Details
DuplicatesFilter
Method Details
call
public Iterator<org.apache.commons.lang3.tuple.Pair<U,T>> call(Integer partitionId, Iterator<org.apache.commons.lang3.tuple.Pair<U,T>> geometryPair) throws Exception
- Specified by:
call in interface org.apache.spark.api.java.function.Function2<Integer,Iterator<org.apache.commons.lang3.tuple.Pair<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>>,Iterator<org.apache.commons.lang3.tuple.Pair<U extends org.locationtech.jts.geom.Geometry,T extends org.locationtech.jts.geom.Geometry>>>
- Throws:
Exception
-