What's an asset?

An asset is an object in persistent storage that captures some understanding of the world. If you have an existing data pipeline, you’re likely already creating assets. For example, your pipeline might incorporate objects like:

A database table or view, such as those in a Google BigQuery data warehouse
A file, such as a file on your local machine or in blob storage like Amazon S3
A machine learning model
An asset from an integration, like a dbt model or a Fivetran connector

Assets aren’t limited to the objects listed above; these are just some common examples.

Anatomy of an asset

To create an asset, you write code that describes the asset you want to exist, any other assets it is derived from, and a function that computes its contents.

Specifically, an asset includes:

An @asset decorator. This tells Dagster that the function produces an asset.
An asset key that uniquely identifies the asset in Dagster. By default, this is the function name. However, asset keys can have prefixes, much like files are organized into folders or database tables into schemas.
A set of upstream asset dependencies, referenced using their asset keys. We’ll talk about this more in the next lesson, which focuses on asset dependencies.
A Python function that defines how the asset is computed.

Let’s look at our cookie example to demonstrate. The following code creates a cookie_dough asset, which depends on the upstream dry_ingredients and wet_ingredients assets:

from dagster import asset

@asset
def cookie_dough(dry_ingredients, wet_ingredients):
    # Each argument name matches the asset key of an upstream asset.
    return dry_ingredients + wet_ingredients

When naming assets, it’s best practice to use a noun, specifically a descriptor of what is produced, and not the steps required to produce it.

For instance, the asset above combines the dry_ingredients and wet_ingredients assets to create cookie dough. We named it cookie_dough because that’s what the asset produces, whereas a name like combine_ingredients focuses on an action and not the end result.