Enrich tables with custom metadata

Databricks recommends that you always add comments to tables and to the columns in those tables. You can generate these comments using AI. See Add AI-generated comments to Unity Catalog objects.

Unity Catalog also lets you tag data. See Apply tags to Unity Catalog securable objects.

You can also log a message for each individual commit to a table. The message is stored in a field in the table's transaction log.

Set user-defined commit metadata

Specify a user-defined string as commit metadata using the DataFrameWriter option userMetadata. You can use this option with any write mode, including append and overwrite. The metadata appears in the output of the DESCRIBE HISTORY operation. For more information, see Work with table history.

SQL

For Delta tables:

SET spark.databricks.delta.commitInfo.userMetadata=overwrite-comment;
INSERT OVERWRITE target_table SELECT * FROM data_source;

For Iceberg tables:

SET spark.databricks.iceberg.commitInfo.userMetadata=overwrite-comment;
INSERT OVERWRITE target_table SELECT * FROM data_source;

Python

df.write \
  .mode("overwrite") \
  .option("userMetadata", "overwrite-comment") \
  .saveAsTable("target_table")

df.write \
  .mode("append") \
  .option("userMetadata", "append-comment") \
  .saveAsTable("target_table")

userMetadata works with any write mode, including overwrite and append.
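
Because userMetadata is a single string, one possible way to record several fields per commit (for example, a job name and a run ID) is to JSON-encode them. The following is a minimal sketch, not a Databricks API: the helper names build_commit_metadata, parse_commit_metadata, and append_with_metadata, and the field names job and run_id, are hypothetical and chosen only for illustration. The only platform feature it relies on is the userMetadata writer option described above.

```python
import json


def build_commit_metadata(**fields):
    """Serialize key/value pairs into a compact JSON string for userMetadata.

    Hypothetical helper: keys are sorted so the same fields always
    produce the same string.
    """
    return json.dumps(fields, sort_keys=True, separators=(",", ":"))


def parse_commit_metadata(raw):
    """Turn a userMetadata string back into a dict.

    Plain-text messages (not JSON objects) are returned under the
    hypothetical key "message" so callers always get a dict.
    """
    if raw is None:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {"message": raw}
    return parsed if isinstance(parsed, dict) else {"message": raw}


def append_with_metadata(df, table_name, **fields):
    """Append df to table_name, attaching the fields as commit metadata."""
    (
        df.write
        .mode("append")
        .option("userMetadata", build_commit_metadata(**fields))
        .saveAsTable(table_name)
    )
```

A call such as append_with_metadata(df, "target_table", job="nightly", run_id=42) would store the string {"job":"nightly","run_id":42} as the commit's user metadata. Plain messages such as append-comment remain valid and parse to a single-field dict.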

Scala

df.write
  .mode("overwrite")
  .option("userMetadata", "overwrite-comment")
  .saveAsTable("target_table")

df.write
  .mode("append")
  .option("userMetadata", "append-comment")
  .saveAsTable("target_table")

Notes on compute types

On classic compute, you can also specify user-defined commit metadata using the SparkSession configuration keys spark.databricks.delta.commitInfo.userMetadata (Delta) or spark.databricks.iceberg.commitInfo.userMetadata (Iceberg). If both the DataFrameWriter option userMetadata and the SparkSession configuration are specified, the DataFrameWriter option takes precedence.
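
The precedence rule above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the function name write_with_session_default and its parameters are hypothetical, the example uses the Delta configuration key, and it assumes an existing SparkSession (spark) and DataFrame (df) on classic compute.

```python
# Configuration key for Delta tables, as named in the text above.
DELTA_COMMIT_METADATA_KEY = "spark.databricks.delta.commitInfo.userMetadata"


def write_with_session_default(spark, df, table_name,
                               session_comment, writer_comment=None):
    """Append df to table_name with a session-level commit message.

    Hypothetical helper. If writer_comment is given, it is passed as the
    DataFrameWriter option, which takes precedence over the session value.
    """
    # Session-level default: applies to commits made from this session.
    spark.conf.set(DELTA_COMMIT_METADATA_KEY, session_comment)

    writer = df.write.mode("append")
    if writer_comment is not None:
        # Per-write override: the writer option wins over the session key.
        writer = writer.option("userMetadata", writer_comment)
    writer.saveAsTable(table_name)
```

With both values supplied, the commit should carry writer_comment; with only session_comment, the commit carries the session value. Because the session configuration stays set after the call, later writes in the same session are likely to pick it up as well unless you change or unset it.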

On serverless compute, use the DataFrameWriter option userMetadata directly. The SparkSession configuration keys for commit metadata are not supported on serverless compute.
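
To check what was recorded, you can read the commit messages back from the table history. The sketch below is one way to do this in Python. The function name commit_messages is hypothetical, and it assumes an existing SparkSession (spark), a trusted table name, and the version and userMetadata columns that DESCRIBE HISTORY returns for Delta tables; the columns available for other table formats may differ.

```python
def commit_messages(spark, table_name, limit=10):
    """Return (version, userMetadata) pairs for the most recent commits.

    Hypothetical helper. table_name is interpolated into the SQL text,
    so pass only trusted identifiers.
    """
    history = spark.sql(f"DESCRIBE HISTORY {table_name} LIMIT {int(limit)}")
    rows = history.select("version", "userMetadata").collect()
    # userMetadata is None for commits written without a message.
    return [(row["version"], row["userMetadata"]) for row in rows]
```

For example, commit_messages(spark, "target_table", limit=5) returns the messages attached to the five most recent commits, with None for commits written without user metadata.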