Managed tables
Databricks manages both the data and the table’s metadata. The table is stored in the root storage location that you configure when you create a metastore; you can override that location at the catalog or schema level.
The default way to create tables in Unity Catalog.
You cannot manage the files in these tables with tools outside of Databricks.
Always use the Delta table format.
When you drop a managed table, its underlying data is deleted from your cloud tenant within 30 days.
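A minimal sketch of what the points above look like as DDL, assuming hypothetical names (main.sales, main.sales.orders) and helper functions that are not part of any Databricks API. The helpers only build SQL strings: a managed table is a CREATE TABLE with no LOCATION clause, and a schema can carry a MANAGED LOCATION clause to override the storage root.

```python
from typing import Mapping


def managed_table_ddl(full_name: str, columns: Mapping[str, str]) -> str:
    """Build CREATE TABLE DDL for a managed table.

    No LOCATION clause is given, so the table lands in managed storage.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return f"CREATE TABLE {full_name} ({cols})"


def schema_ddl(full_name: str, managed_location: str = "") -> str:
    """Build CREATE SCHEMA DDL, optionally overriding the managed storage root."""
    ddl = f"CREATE SCHEMA {full_name}"
    if managed_location:
        ddl += f" MANAGED LOCATION '{managed_location}'"
    return ddl


# Hypothetical names and placeholder path, for illustration only.
# In a Databricks notebook each string would be passed to spark.sql(...).
schema_stmt = schema_ddl("main.sales", "<your-storage-path>")
table_stmt = managed_table_ddl("main.sales.orders", {"id": "BIGINT", "amount": "DOUBLE"})
print(schema_stmt)
print(table_stmt)
```

Keeping the statements as plain strings makes them easy to inspect before running them against a workspace.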
External tables
Data is stored outside of the managed storage location specified for the metastore, catalog, or schema.
Supports non-Delta file formats (CSV, Parquet, Avro, text, etc.).
DROP TABLE doesn’t delete the underlying data; you can delete it manually with dbutils.fs.rm("<your-storage-path>", true).
Before creating an external table, Unity Catalog must have a storage credential that allows it to read from and write to the storage path, and an external location that references that storage credential.
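A rough sketch of the external-table side, again assuming hypothetical names (main.raw.events) and helper functions that only build SQL strings. An external table is a CREATE TABLE with an explicit format and a LOCATION clause. The format set below contains only the formats named in these notes plus Delta, so treat it as an illustrative subset.

```python
from typing import Mapping

# Illustrative subset: only the formats named in the notes, plus Delta.
EXTERNAL_FORMATS = {"DELTA", "CSV", "PARQUET", "AVRO", "TEXT"}


def external_table_ddl(
    full_name: str, columns: Mapping[str, str], file_format: str, location: str
) -> str:
    """Build CREATE TABLE DDL for an external table at an explicit storage path."""
    fmt = file_format.upper()
    if fmt not in EXTERNAL_FORMATS:
        raise ValueError(f"unsupported format: {file_format}")
    if not location:
        raise ValueError("an external table needs a storage path")
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return f"CREATE TABLE {full_name} ({cols}) USING {fmt} LOCATION '{location}'"


def drop_table_ddl(full_name: str) -> str:
    """Build DROP TABLE DDL; for an external table this leaves the files in place."""
    return f"DROP TABLE {full_name}"


# Hypothetical names and placeholder path, for illustration only.
# In a Databricks notebook each string would be passed to spark.sql(...).
create_stmt = external_table_ddl(
    "main.raw.events", {"id": "BIGINT", "payload": "STRING"}, "csv", "<your-storage-path>"
)
drop_stmt = drop_table_ddl("main.raw.events")
print(create_stmt)
print(drop_stmt)
```

After the DROP TABLE runs, the files at the storage path still exist, which is why the manual dbutils.fs.rm cleanup is a separate step.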
General rule of thumb to choose between these two:
Additional factors to consider:
Managed tables generally offer simpler data management and better query performance, because Databricks stores and manages the data and optimizes the storage format.
New features are rolled out to managed tables first.
External tables give you more flexibility in data storage and data governance, with more control over data lifecycle management, data security, and costs.