
dbt as an Industry Shaper: The Pros and Cons

Not many products get to shape an entire industry and become a tool used by millions of people. dbt successfully did this in the data industry, enabling data teams to do things they couldn’t do before.

Given its mass adoption, and the fact that many data teams today follow the Modern Data Stack paradigm that dbt leads, it is worth examining how dbt has shaped data teams' day-to-day workflows.

A Long Time Ago…

Before dbt, to create data pipelines, data teams had two main alternatives:

Data Engineers

Data engineers with a relatively high level of technical skill used tools like Python and Spark to implement transformation logic with code and orchestrate data pipelines with tools like Airflow.

Because these tools required a high technical level, data engineering teams became bottlenecks to data initiatives, resulting in complex, inaccessible pipelines with slow delivery times.

Data Analysts

Analysts wrote data transformations and business logic in isolated, locally stored SQL files and/or directly into BI tools - causing siloed business logic, duplicated work, and inaccessible definitions.

The Big Bang

In the 2010s, cloud computing was booming, transforming entire industries.

The Era of Cloud Computing

In the mid-2010s (2013-2015), cloud data warehouses such as Snowflake and BigQuery started gaining widespread adoption. These tools solved core data challenges, making storage much cheaper and computation much faster.

These new capabilities - effectively unlimited storage and fast, elastic compute - had a profound impact on the data industry, creating an opportunity for a new breed of tools.

ETL vs ELT

With effectively unlimited storage and compute, companies shifted from transforming data before storing it (ETL - an approach that often meant discarding raw data for good) to extracting and loading all raw data into a central data lake and transforming it later (ELT).

This new approach allowed companies to retain all data in its raw format indefinitely, transforming it as needed without worrying about data loss.

The Modern Data Stack

The main idea behind the MDS is a modular stack of specialized, interoperable tools, each responsible for one stage of the ELT process.

While tools like Fivetran, Stitch, Rivery, and Airbyte were built to solve the EL part, dbt is responsible for the T in ELT—the data transformations. Once the data is stored in the data lake (or data warehouse, in this case), data teams use dbt to transform raw data into analysis-ready data.

What’s So Great About dbt?

Now that we have the background in place, let’s map the main reasons for dbt’s product success - why do data teams love dbt?

Lowering the Technical Barrier for Data Teams with SQL-based Transformations

Data practitioners are most comfortable with SQL.

With dbt, anyone with basic SQL skills can write business logic, and voilà—dbt takes care of everything else: materialization, dependencies, version control, and scheduling (on dbt Cloud).
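To make this concrete, here is a minimal sketch of a dbt model - just a SELECT statement in a .sql file. The model, table, and column names below are made up for illustration:

```sql
-- models/orders_enriched.sql (hypothetical model and column names)
-- dbt materializes this SELECT as a table or view, based on the config
{{ config(materialized='table') }}

select
    o.order_id,
    o.ordered_at,
    c.customer_name
from {{ ref('stg_orders') }} as o
-- ref() resolves to the upstream model's table and registers a dependency,
-- letting dbt build the DAG and run models in the right order
join {{ ref('stg_customers') }} as c
    on o.customer_id = c.customer_id
```

Running `dbt run` compiles the Jinja, resolves the dependency graph, and executes the resulting SQL against the warehouse - no Python or orchestration code required.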

dbt also extends SQL with Jinja templates, unlocking powerful capabilities like loops, dynamic parameters, hooks, and macros. This allows for more flexible, reusable, and scalable data transformations while keeping SQL at the core—making dbt suitable for both analysts writing simple SQL and engineers needing richer capabilities and flexibility, all within the same tool.
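For example, a Jinja loop can generate repetitive SQL instead of hand-writing it. A common pattern is pivoting a column into one aggregate per value; the table and payment methods below are illustrative:

```sql
-- models/payments_pivoted.sql (illustrative table and column names)
{% set payment_methods = ['credit_card', 'bank_transfer', 'gift_card'] %}

select
    order_id,
    -- the loop expands into one aggregated column per payment method
    {% for method in payment_methods %}
    sum(case when payment_method = '{{ method }}' then amount else 0 end)
        as {{ method }}_amount{% if not loop.last %},{% endif %}
    {% endfor %}
from {{ ref('stg_payments') }}
group by order_id
```

Adding a new payment method becomes a one-line change to the list, rather than another copy-pasted CASE expression.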

In practice, dbt lowered the technical barrier, significantly reducing dependency on data engineers and empowering analysts and other SQL-savvy data practitioners to build and manage data transformations.

Applying Software Engineering Best Practices

Version control, testing, modularity, CI/CD, and documentation are all core software engineering principles that were rarely applied to data engineering before dbt. dbt successfully integrated these concepts, improving data team methodologies and best practices by borrowing from the software engineering world.
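Testing, for instance, is built in. A dbt "singular test" is simply a query that must return zero rows; `dbt test` fails the build otherwise. The file and column names here are assumptions:

```sql
-- tests/assert_no_negative_amounts.sql (hypothetical singular test)
-- dbt runs this query and fails the test if it returns any rows
select
    order_id,
    amount
from {{ ref('payments') }}
where amount < 0
```

dbt also ships generic tests (unique, not_null, and others) that are declared in YAML next to the model, so basic data quality checks run in CI like any other test suite.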

Centralized Business Logic

Instead of siloed logic, locally stored SQL scripts, or hidden transformations within BI tools, dbt allows data teams to maintain a shared project where everyone can easily view and contribute to the company’s business logic.

What’s Still Missing?

In practice - although the lives of data teams have improved - we are still failing to deliver true self-service analytics.

Why is that? What are we still missing?

The Data Mess

dbt does not enforce any governance or structure. It is great that anyone can contribute business logic with simple SQL code, but in reality, projects often become a mess of duplicated logic and outdated models - negatively affecting trust and increasing cloud costs.

In the era of AI, this mess is amplified tenfold, preventing companies from successfully implementing AI due to inconsistent and unclear data definitions.

dbt’s semantic layer aims to deliver consistency and enable AI, but it has yet to provide a valid solution to the growing complexity. (We will cover the differences between dbt Semantic Layer and Lynk in a future blog post).

Flexibility

dbt pioneered the shift-left movement, where many data teams moved business logic from BI tools like Looker into dbt. The idea was to manage business logic as code and reduce BI vendor lock-in.

However, hard-coded and materialized business logic makes it impossible to slice and filter data dynamically at the consumption level, limiting end-user flexibility.

Ideally, we need the best of both worlds: business logic should be centrally defined and managed with code, yet users should still have the flexibility to explore, slice, and aggregate data as needed.

Business Users Are Left Out

SQL is simple for most data team members, but it is still a technical skill that many data consumers lack. This means business users and other non-technical stakeholders still face significant barriers to accessing and influencing business logic.

Summary

dbt is great for data team members who are proficient in SQL.

It lowered the technical barrier for building data pipelines, allowing anyone with basic SQL skills to contribute to the company’s ELT process. It also introduced software engineering best practices, increasing trust in data.

However, dbt lacks governance and structure for data modeling—leading to increasing complexity, reduced productivity, inflated costs, and challenges in AI adoption.

dbt excels at technical data transformations, but we still need the flexibility to define logic once and consume it anywhere, while allowing users to slice and aggregate data as needed at the consumption level.

What’s Next

Just as cloud computing revolutionized the data ecosystem and introduced new tools and paradigms, AI is now reshaping the landscape—and we’re excited to see what the future holds.

From my perspective, the next generation of data tools will adopt what’s great about dbt while adding structure, governance, and flexibility.

Keep

  1. SQL-first approach
  2. Centralized business logic
  3. Software engineering best practices (code-first, version control, CI/CD, testing, modularity, etc.)

Add

  1. Further lower the technical barrier to make data accessible to non-technical team members
  2. Apply data modeling structure and governance to ensure consistency, increase trust, and reduce costs
  3. Provide flexibility while maintaining solid business definitions

dbt and Lynk

Lynk was built for the era of AI with these principles in mind. If your team is planning to integrate AI on top of your company’s data, we encourage you to try Lynk and see these concepts in action.

Step 1 - Normalized Data Model with dbt

Use dbt to clean raw data and create foundational dim and fact tables (we call this the “normalized data model”).
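A typical first step is a staging model that standardizes a raw loaded table before dim and fact models build on it. The source, table, and column names below are hypothetical:

```sql
-- models/staging/stg_customers.sql (hypothetical staging model)
with source as (
    -- source() points at the raw table loaded by the EL tool
    select * from {{ source('crm', 'customers_raw') }}
)

select
    cast(id as varchar)           as customer_id,
    trim(lower(email))            as email,
    coalesce(country, 'unknown')  as country,
    created_at
from source
```

Downstream dim and fact models then reference these staged tables via ref(), keeping cleaning logic in one place.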

Step 2 - Denormalized Data Model with Lynk

Define business logic in the form of entities, features, and relationships in a simple, structured, and governed way—while keeping it flexible for consumption.

Lynk Discovery scans your database schemas, SQL scripts, and dbt projects to automatically extract metadata and accelerate onboarding.

Step 3 - AI apps

Your data is now AI-ready. Use Lynk’s smart AI apps, or build your own AI applications on Lynk’s RAG as a service, to let data consumers interact with your data in a fun and trusted way.

Bring your data to the era of AI

Automate data workflows with consistency, clarity, and trust, enabling AI and business users to succeed with data.
