This started while I was preparing my Big Data labs for my students. I wanted something closer to reality. Not only pipelines and Spark jobs, but how AI agents would actually interact with a governed data platform. So I set up the environment the way it should be in a serious setup. A lakehouse governed with AWS Lake Formation, metadata centralized in the AWS Glue Data Catalog, and Spark handling execution. Clean, controlled, auditable. Then I added the missing piece. An agent that needs to understand intent and retrieve context, not just run queries. And almost immediately, the same request appeared. “We need a graph database for ontology.” In AWS terms, that means Amazon Neptune. I see this pattern often, not only with students but also in real projects. Someone comes with a solution already decided. I always give the same answer. What is the business problem you are trying to solve? Because “I need Neptune” is not a requirement. It is a conclusion. When you force the conversation back...
There’s something powerful about seeing architecture come to life outside of slides and tools. Today I want to recognize Ahra, one of my students, who took the time to translate her understanding of our Big Data labs into a hand-drawn reference architecture. Not only is it structurally correct, but it also reflects clarity of thought and ownership of the problem. From ingestion into Bronze using a dual pipeline pattern (batch and streaming), through data quality and standardization in Silver, to consumption-ready Gold for BI and AI use cases, she captured the full journey. Even more interesting, she extended the architecture into Labs 11 and 12, incorporating natural language AI agents capable of querying the data platform using skills, RAG, and process knowledge. This is exactly the mindset we aim for: not just using tools, but designing systems that solve real problems and enable business interaction through AI. Kudos to you, Ahra. This is the kind of thinking that builds real data architec...