A structured system that turns AI-assisted building into a predictable, repeatable process
Designing a Deterministic Workflow for AI Agent Development
The Challenge:
I subscribe to a large number of AI and tech newsletters, and I kept running into the same issue: the same story would appear across multiple sources, rewritten under different headlines but covering the same underlying news. Instead of helping me stay informed, that overlap turned reading into a repetitive chore.
I didn’t want summaries or rewritten content. I wanted a way to read each story once, in its strongest and most complete version, while still keeping everything else that was unique. This project became a focused first step toward building a more useful personal AI assistant.
My Role:
I approached this project as both a builder and a systems thinker. Rather than treating it as a narrow coding exercise, I saw it as a larger workflow design problem: how do you take a messy stream of semi-structured information, process it intelligently, and return something that feels cleaner, more useful, and more trustworthy than the original input?
I designed and built the full pipeline end to end, from email ingestion to final output. That included defining how stories should be extracted, what qualifies as a duplicate, and where AI should be used versus where simpler logic was more reliable. A key part of the work was moving away from loosely defined AI behavior and toward a structured workflow that I could understand, adjust, and trust.

My Process:
The finished system takes two simple inputs: an email folder and a date range. From there, it retrieves the newsletters in that range, extracts the individual stories, removes duplicates, and presents the remaining results in a clean browser-based digest with a downloadable PDF version. Although that flow sounds straightforward on the surface, the logic behind it required several layers of filtering, structuring, and refinement.
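The overall flow can be sketched as a small orchestration function. This is a minimal illustration, not the actual implementation: the stage functions, field names, and the naive title-based dedupe placeholder are all assumptions standing in for the real email retrieval, HTML parsing, and embedding-based logic described below.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Story:
    """Hypothetical record for one extracted newsletter story."""
    title: str
    body: str
    links: list[str] = field(default_factory=list)
    source: str = ""

# Stand-in stages; the real versions talk to an email folder,
# parse messy HTML, and use embeddings + an LLM for deduplication.
def fetch_emails(inbox, start, end):
    # 1. Retrieve newsletters within the requested date range.
    return [m for m in inbox if start <= m["date"] <= end]

def extract_stories(email):
    # 2. Turn each email into individual story records.
    return [Story(**raw, source=email["sender"]) for raw in email["stories"]]

def dedupe(stories):
    # 3. Placeholder dedupe: keep the longest-bodied story per title.
    #    (The real system avoids title matching; see the embedding pass.)
    seen, unique = set(), []
    for s in sorted(stories, key=lambda s: -len(s.body)):
        key = s.title.lower()
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique

def build_digest(inbox, start, end):
    emails = fetch_emails(inbox, start, end)
    stories = [s for e in emails for s in extract_stories(e)]
    return dedupe(stories)  # 4. Feed the survivors to the digest renderer.
```

The point is the shape of the pipeline: each stage takes plain records in and hands plain records out, which is what makes the workflow inspectable and adjustable.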
One of the earliest and most important parts of the pipeline was turning raw email HTML into usable story records. That meant identifying titles, body content, links, metadata, and content boundaries in newsletters that were often formatted in inconsistent and unpredictable ways. Once those story records were created, I implemented a two-stage deduplication process. First, I used embeddings on the body text rather than the titles, since titles alone were often too inconsistent to be reliable. This allowed the system to group stories based on similarity in meaning rather than exact wording.
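The embedding stage can be sketched as greedy similarity grouping over body-text vectors. The toy three-dimensional vectors and the 0.85 threshold below are illustrative assumptions; real body embeddings would come from an embedding model and the threshold would be tuned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_by_similarity(embeddings, threshold=0.85):
    """Greedy single-pass clustering: each story joins the first
    existing group whose representative embedding is close enough,
    otherwise it starts a new group."""
    groups = []  # lists of story indices
    reps = []    # representative embedding per group
    for i, emb in enumerate(embeddings):
        for g, rep in zip(groups, reps):
            if cosine(emb, rep) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
            reps.append(emb)
    return groups
```

Grouping on body embeddings rather than titles is what lets near-identical stories with completely different headlines land in the same cluster.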
From there, I used an LLM to refine those groupings and determine whether the stories in a cluster were actually the same story, merely related, or completely different. That second pass was important because semantic similarity alone can be too broad. Different articles about the same general topic can look close in vector space while still being meaningfully distinct. Once true duplicates were identified, the system selected the strongest version by favoring the most complete story, typically the one with the longest body, a clear title, and usable links. I intentionally moved away from summarizing or rewriting articles, because preserving the original wording and source context made the final output more trustworthy and more useful.
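The selection step described above can be sketched as a scoring heuristic over a confirmed cluster. The weights are illustrative assumptions, and `same_story` is a hypothetical stand-in for the LLM pass that verifies two candidates really cover the same story rather than merely related ones.

```python
def completeness(story):
    """Favor the most complete version: longest body,
    a clear title, and usable links. Weights are illustrative."""
    score = len(story["body"])
    if story.get("title", "").strip():
        score += 200                       # bonus for a clear title
    score += 100 * len(story.get("links", []))  # bonus per usable link
    return score

def pick_strongest(cluster, same_story):
    """Keep only candidates the LLM pass confirms as true duplicates
    of the first story, then return the most complete one."""
    confirmed = [cluster[0]]
    for s in cluster[1:]:
        if same_story(cluster[0], s):
            confirmed.append(s)
    return max(confirmed, key=completeness)
```

Because the winner is an unmodified original story, the digest preserves source wording instead of substituting a rewrite.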
Real-World Challenges:
The most technically demanding part of this project was not calling AI models or building the interface. It was dealing with the messiness of real newsletter content. Email HTML is wildly inconsistent, and each source tends to have its own quirks, structural patterns, and formatting decisions. In some cases, multiple stories were packed into a single visual block with almost no structural separation. In others, decorative elements such as pull quotes, sponsor sections, or navigation content looked enough like article content that they had to be actively filtered out. There were also formatting artifacts introduced during HTML-to-text conversion that caused unrelated sections to bleed together in ways that broke story extraction.
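Filtering out blocks that merely look like article content can be approximated with simple heuristics. The marker list and word-count floor below are illustrative assumptions, not the actual rules the system uses.

```python
# Illustrative markers for blocks that resemble stories but are not:
# sponsor slots, navigation, and footer boilerplate.
BOILERPLATE_MARKERS = (
    "sponsored", "partner with", "unsubscribe",
    "view in browser", "follow us",
)

def looks_like_story(block_text: str, min_words: int = 25) -> bool:
    """Reject blocks containing boilerplate markers or too little
    text to plausibly be an article."""
    text = block_text.lower()
    if any(marker in text for marker in BOILERPLATE_MARKERS):
        return False
    return len(block_text.split()) >= min_words
```

In practice each newsletter source tends to need its own quirks handled, so rules like these accumulate per-source rather than generalizing cleanly.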
Defining what counted as a duplicate turned out to be another major challenge. Two newsletters could be covering the exact same story while using completely different headlines and emphasizing different details. At the same time, two stories could look similar on the surface while actually being distinct developments. That meant I needed a system that was broad enough to catch duplication, but careful enough not to collapse separate stories into one. The final approach, using embeddings for broad semantic grouping and LLM comparison for precision, gave me a much better balance between recall and precision.
On a personal level, one of the biggest shifts was learning to understand and control the system rather than just relying on AI outputs. Early versions worked, but not in a way I could clearly reason about. Once I clarified the goal and restructured the workflow, the system became much more predictable and easier to improve.

The Outcome:
The result is a working MVP that transforms a crowded collection of overlapping newsletters into a single digest of distinct stories that is much easier to read and navigate. Instead of forcing the user to scan the same news repeatedly across different sources, the system preserves one strong version of each repeated story while still keeping the rest of the unique reporting intact. The output is cleaner, more focused, and significantly more useful than the raw inbox experience that inspired it.
Beyond the immediate utility of the tool, this project also marked an important shift in how I think about AI systems. It pushed me to move from simply using AI to structuring it more intentionally, with clearer stages, decision points, and tradeoffs. It also showed me that the most interesting work often lives in the messy space between inputs and outputs, where real-world data, human goals, and system behavior all have to be reconciled in a way that is both practical and reliable.
What’s Next:
This project is still in progress, and there are several future phases I would like to build out as I continue refining it. The highest priority is improving story extraction across a wider range of newsletter formats, since parsing remains one of the most complex and important parts of the workflow. I would also like to add real-time progress streaming so the generation process feels more transparent while a digest is running.
Beyond that, I have plans for additional features such as scheduled digest generation, cross-run deduplication, persisted embeddings, story prioritization, digest history, runtime configuration through the interface, and eventually broader support for multiple accounts or content sources. While the current version is intentionally focused and practical, the longer-term vision is to evolve this into a more complete system that reduces repetitive information work and surfaces what actually matters.