This started while I was preparing my Big Data labs for my students. I wanted something closer to reality. Not only pipelines and Spark jobs, but how AI agents would actually interact with a governed data platform. So I set the environment as it should be in a serious setup. A lakehouse governed with AWS Lake Formation, metadata centralized in AWS Glue Data Catalog, and Spark handling execution. Clean, controlled, auditable. Then I added the missing piece. An agent that needs to understand intent and retrieve context, not just run queries. And almost immediately, the same request appeared. “We need a graph database for ontology.” In AWS terms, that means Amazon Neptune. I see this pattern often, not only with students. Also in real projects. Someone comes with a solution already decided. I always give the same answer. What is the business problem you are trying to solve? Because “I need Neptune” is not a requirement. It is a conclusion. When you force the conversation back...
Data Platform Architecture & AI Engineering
Essays, architecture insights and reflections on data, AI and society