pyspark.sql.functions.map_concat
- pyspark.sql.functions.map_concat(*cols)[source]
Map function: Returns the union of all given maps.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - cols : Column or str
    Column names or Columns of maps to merge.
- Returns
  - Column
    A map of merged entries from all the given maps.
Notes
Handling of duplicate keys across the input maps is governed by spark.sql.mapKeyDedupPolicy. With the default setting (EXCEPTION), a duplicate key raises an error; if set to LAST_WIN, the value from the last map containing the key wins.
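The two dedup policies can be sketched in plain Python. This is a simplified model of the merge semantics only, not Spark's implementation; the function name and error type below are illustrative:

```python
# Sketch of spark.sql.mapKeyDedupPolicy semantics using plain dicts.
# Illustrative only -- not a Spark API.

def map_concat_sketch(*maps, policy="EXCEPTION"):
    """Merge dicts the way map_concat merges maps under each dedup policy."""
    merged = {}
    for m in maps:
        for key, value in m.items():
            if key in merged and policy == "EXCEPTION":
                # Default policy: a duplicate key is an error.
                raise ValueError(f"Duplicate map key {key!r} was found")
            # Under LAST_WIN, the value from the later map overwrites the earlier one.
            merged[key] = value
    return merged

print(map_concat_sketch({1: "a", 2: "b"}, {2: "c", 3: "d"}, policy="LAST_WIN"))
# {1: 'a', 2: 'c', 3: 'd'}
```

With the default policy, the same call raises on key 2 instead of merging.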
Examples
Example 1: Basic usage of map_concat
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, 'c') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> b, 3 -> c}|
+------------------------+
Example 2: map_concat with overlapping keys
>>> from pyspark.sql import functions as sf
>>> originalmapKeyDedupPolicy = spark.conf.get("spark.sql.mapKeyDedupPolicy")
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", "LAST_WIN")
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(2, 'c', 3, 'd') as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+------------------------+
|map_concat(map1, map2)  |
+------------------------+
|{1 -> a, 2 -> c, 3 -> d}|
+------------------------+
>>> spark.conf.set("spark.sql.mapKeyDedupPolicy", originalmapKeyDedupPolicy)
Example 3: map_concat with three maps
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a') as map1, map(2, 'b') as map2, map(3, 'c') as map3")
>>> df.select(sf.map_concat("map1", "map2", "map3")).show(truncate=False)
+----------------------------+
|map_concat(map1, map2, map3)|
+----------------------------+
|{1 -> a, 2 -> b, 3 -> c}    |
+----------------------------+
Example 4: map_concat with empty map
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map() as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+----------------------+
|map_concat(map1, map2)|
+----------------------+
|{1 -> a, 2 -> b}      |
+----------------------+
Example 5: map_concat with null values
>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT map(1, 'a', 2, 'b') as map1, map(3, null) as map2")
>>> df.select(sf.map_concat("map1", "map2")).show(truncate=False)
+---------------------------+
|map_concat(map1, map2)     |
+---------------------------+
|{1 -> a, 2 -> b, 3 -> NULL}|
+---------------------------+