Anticipating Databricks Data + AI Summit 2025: What Data Engineers Should Watch For



As the Databricks Data + AI Summit 2025 kicks off today, I’m genuinely looking forward to seeing what’s next. Databricks has made impressive progress in GenAI integration, the developer experience, and platform governance. That said, there are still areas where I think the platform can go further—especially from the perspective of day-to-day data engineering work. Here are a few features I’m keeping an eye on, along with some hopes for what might come next.

 

1. Unity Catalog: Lakehouse Federation

Unity Catalog was a game-changer when it launched—finally bringing consistent governance and discovery to the lakehouse. Lakehouse Federation, introduced more recently, builds on that by allowing users to query external sources like MySQL, PostgreSQL, Azure SQL, Redshift, and Snowflake without ingesting them into Delta Lake. It’s a powerful abstraction that simplifies life for teams managing fragmented data environments.

What I’m hoping to see: While federation is already here, performance still feels like a work in progress. Cross-source queries sometimes suffer from latency and limited pushdown optimization. I’d love to see Databricks invest more in smart caching, parallel query planning, and source-side transforms to minimize data movement. Deeper native support for transformation logic at the source—especially beyond SQL—would be a huge win.
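To make the pushdown point concrete, here’s a toy, pure-Python sketch—no Databricks APIs involved, just an in-memory stand-in for a remote table—showing why applying a predicate at the source beats filtering after transfer:

```python
# Illustrative sketch (not Databricks code): why filter pushdown matters in
# federated queries. A predicate applied at the source moves far fewer rows
# over the wire than one applied after transfer.

ROWS = [{"id": i, "region": "EU" if i % 4 == 0 else "US", "amount": i * 10}
        for i in range(1_000)]  # hypothetical remote table

def query_without_pushdown(predicate):
    transferred = list(ROWS)                         # every row crosses the network
    return [r for r in transferred if predicate(r)], len(transferred)

def query_with_pushdown(predicate):
    transferred = [r for r in ROWS if predicate(r)]  # the source filters first
    return transferred, len(transferred)

is_eu = lambda r: r["region"] == "EU"
naive, moved_naive = query_without_pushdown(is_eu)
pushed, moved_pushed = query_with_pushdown(is_eu)

assert naive == pushed                               # same result either way
print(f"rows moved without pushdown: {moved_naive}")   # 1000
print(f"rows moved with pushdown:    {moved_pushed}")  # 250
```

The results are identical; only the bytes moved differ. That gap is exactly what smarter caching and source-side transforms would shrink in real federated queries.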

 

2. SQL Editor

The new SQL Editor was a welcome upgrade—it’s clean, fast, and far easier for analysts and engineers alike to use. Serverless SQL Warehouses make it even more accessible. One thing I’ve noticed, though, is that data still needs to be registered within Unity Catalog, even when querying external systems.

What I’d love to see: Seamless support for querying and transforming data in external sources (like ADLS Gen2 or GCS) without needing to wrap everything in notebooks or move data into Delta. If the editor evolves into a truly federated workspace—one that blends live external access with real-time transformation—that could unlock a lot of flexibility for hybrid teams.

 

3. Mosaic AI

Mosaic AI has emerged as Databricks’ flagship for operationalizing GenAI, enabling teams to build and share reusable functions (SQL, Python, external calls) that work across the platform. It’s a step toward standardizing how organizations embed LLMs into workflows.

Still missing, in my opinion: There’s no real versioning yet for these functions, which can create chaos in shared environments. Also, the UI for managing these components could use some love—searching for reusable functions or understanding dependencies feels clunky. Adding Git-backed version control and lineage tracking would give it a more production-ready feel.
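To sketch what I mean by versioning, here’s a hypothetical registry—nothing here is a real Databricks API, it just shows the semantics I’d want: immutable versions per function name, with lookup by pinned version or “latest”:

```python
# Hypothetical sketch of function versioning for shared AI functions.
# Each registration creates a new immutable version; callers can pin a
# version or default to the latest one.

from dataclasses import dataclass, field

@dataclass
class FunctionRegistry:
    _store: dict = field(default_factory=dict)  # name -> list of (version, fn)

    def register(self, name, fn):
        versions = self._store.setdefault(name, [])
        version = len(versions) + 1             # monotonically increasing
        versions.append((version, fn))
        return version

    def get(self, name, version=None):
        versions = self._store[name]
        if version is None:                     # default to latest
            return versions[-1][1]
        return dict(versions)[version]

registry = FunctionRegistry()
registry.register("summarize", lambda text: text[:20])        # v1
registry.register("summarize", lambda text: text[:10] + "…")  # v2

latest = registry.get("summarize")
pinned = registry.get("summarize", version=1)
print(latest("a very long document body"))  # first 10 chars plus ellipsis
print(pinned("a very long document body"))  # first 20 chars (old behavior)
```

A Git-backed version of this—where each version maps to a commit, with lineage showing which jobs call which version—is the kind of production-readiness I’m hoping to see.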

 

4. Mosaic AI Agent Framework

The agent framework is where things get exciting—it’s designed to bring Retrieval-Augmented Generation (RAG) to enterprise data using foundation models. From what I’ve seen, it’s still early, and much of it is gated behind private preview or internal testing.

What I’m hoping for: Broader access and clear integration pathways with third-party observability tools (e.g., Langfuse, TruLens) would be fantastic. Automated evaluation pipelines and built-in telemetry could turn the Mosaic AI Agent Framework into a serious contender for production GenAI workloads—without requiring every team to roll their own infra.
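As a hedged sketch of what “automated evaluation pipeline” could mean in practice, here’s a toy harness scoring retrieval hit rate against a small gold set—the agent here is a keyword-lookup stand-in for a real RAG retriever, and every name in it is my own invention:

```python
# Toy evaluation harness for a RAG-style agent: what fraction of questions
# retrieve the gold document? The "agent" is a stand-in keyword lookup over
# an in-memory corpus, not a real retriever.

def evaluate_retrieval(agent, gold_set):
    """Fraction of questions where the gold document appears in the
    agent's retrieved context."""
    hits = sum(1 for question, gold_doc in gold_set
               if gold_doc in agent(question))
    return hits / len(gold_set)

CORPUS = {
    "billing": "doc-billing-faq",
    "refund": "doc-refund-policy",
    "login": "doc-auth-troubleshooting",
}

def toy_agent(question):
    # Retrieve every document whose keyword appears in the question.
    return [doc for kw, doc in CORPUS.items() if kw in question.lower()]

GOLD = [
    ("How do I get a refund?", "doc-refund-policy"),
    ("Why can't I login?", "doc-auth-troubleshooting"),
    ("Explain my billing cycle", "doc-billing-faq"),
    ("What is your SLA?", "doc-sla"),   # nothing to retrieve -> a miss
]

print(f"hit rate: {evaluate_retrieval(toy_agent, GOLD):.2f}")  # 0.75
```

Shipping this kind of loop as a built-in—with telemetry on every agent call feeding the gold set—is what would make evaluation routine rather than a per-team infra project.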

 

5. AI/BI Dashboards

The concept of AI/BI dashboards built on Unity Catalog is promising—especially with features like Genie, which lets users ask natural-language questions about data that isn’t even part of the current dashboard. It goes beyond static dashboards and feels like a nod to Microsoft Copilot or ThoughtSpot Sage.

Still, it’s early days: Genie currently struggles with multi-source joins or complex queries, and dashboard interactivity (like page-level filtering or drill-through) can feel rigid. If they enhance cross-page filtering, query previews, and even introduce recommended visualizations based on past behavior, this could become a truly self-serve BI layer powered by GenAI.
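On the recommended-visualizations idea, even a simple heuristic over result-set column types would go a long way. Here’s a hypothetical sketch—purely my own illustration, not anything Databricks ships—of such a default-chart picker (learning from past behavior would layer usage signals on top of this):

```python
# Hypothetical "recommended visualization" heuristic: map the column types
# of a query result to a sensible default chart.

def recommend_chart(columns):
    """columns: list of (name, type) pairs, where type is 'date',
    'numeric', or 'categorical'. Returns a suggested chart type."""
    types = [t for _, t in columns]
    if "date" in types and "numeric" in types:
        return "line"      # trends over time
    if "categorical" in types and "numeric" in types:
        return "bar"       # compare groups
    if types.count("numeric") >= 2:
        return "scatter"   # relationship between two measures
    return "table"         # fall back to raw rows

print(recommend_chart([("order_date", "date"), ("revenue", "numeric")]))     # line
print(recommend_chart([("region", "categorical"), ("revenue", "numeric")]))  # bar
```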

 

Reflections from Snowflake Summit 2025

The Snowflake Summit wrapped up just last week, and some announcements caught my attention—especially because they set a high bar for platforms like Databricks.

  • Snowflake Cortex & AI SQL: These allow users to process unstructured data (like images or PDFs) directly using SQL—without external tools. It’s powerful and removes a layer of friction for analysts.
  • Snowflake Intelligence: This agentic experience lets business users perform tasks like querying Slack messages, detecting inventory issues, or drafting emails—all within the guardrails of Snowflake’s native access controls.
  • OpenFlow: A new push to unify structured, semi-structured, and real-time data integrations. It’s an ambitious step toward streamlining what has traditionally been a very fragmented space.

 

These launches reinforce how fast the lines between AI, BI, and engineering are blurring. Databricks will no doubt respond—but I’m particularly curious about how. Will they double down on developer workflows, or expand their focus on business user enablement?

 

Closing Thoughts

If I had to sum it up, the 2025 Data + AI Summit could be a pivotal moment for Databricks—especially for engineers like me who want to move fast without compromising reliability or governance.

If they prioritize:

  • Smarter Lakehouse Federation performance
  • Better external data tooling in SQL Editor
  • Version control in Mosaic AI
  • More open evaluation for RAG agents
  • And a more intuitive, intelligent BI layer

 

…then I think we’ll walk away with a Databricks platform that’s not just GenAI-ready, but GenAI-useful.

It’s going to be an exciting week. Let’s see what they bring to the table.

We’ll be doing a follow-up blog after the Summit wraps, breaking down the biggest announcements and what they mean for engineering teams.
In the meantime, if you’re exploring how to implement any of these new features within your organization, reach out to us—we’d love to help bring them to life through tailored consulting and hands-on enablement.

