Keeping Cool in a Hot Space: A No-BS Framework for AI Products

by Noah Williams in VC Unpacked

March 20, 2025

In a fast-moving space like AI, how do we as investors determine what’s hype and what’s the real deal?

We find ourselves asking this question all the time, so in this short essay we’ll propose a framework for segmenting today’s AI companies. To keep things digestible, we’ll limit our discussion strictly to products built around large language models. For each category, we’ll provide a working definition, observations from our dealflow, and the open questions we still have.

Note: AI is an evolving space where definitions are hotly debated and the lines between different products blur constantly. The categories and examples we provide here aren’t conclusive or exhaustive, but we think they’re helpful to share as a starting point. 

Tools

Best example: Granola (also our team’s favorite)

We think of tools as standalone products that focus on a single workflow like notetaking, video editing, or research. Most of the time, tools are triggered manually and have well-defined start/stop points (e.g., transcribing a fixed-length video call, generating a video, or compiling research). Granola is a great example: our meetings are transcribed, enhanced with AI, and sent around to the different apps we use.
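
To make the definition concrete, here’s a minimal sketch of a tool in this sense – a Granola-style meeting notes pipeline with a clear start and stop. Every function here is a hypothetical stub for illustration, not any product’s actual API.

    # A "tool" in this framing: manually triggered, single workflow, with a
    # well-defined start and stop. All functions are hypothetical stubs.

    def transcribe(audio_path: str) -> str: ...          # start: fixed-length input
    def enhance_with_llm(transcript: str) -> str: ...    # one model pass
    def send_to_apps(notes: str, apps: list[str]) -> None: ...

    def run_tool(audio_path: str) -> None:
        transcript = transcribe(audio_path)
        notes = enhance_with_llm(transcript)
        send_to_apps(notes, ["notes_app", "chat", "crm"])  # stop: outputs delivered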

Although we love using them, building investment conviction in tools is tough. Consider how just a few months ago, generating research with up-to-date sources was unique to Perplexity, whereas today it’s common across a variety of tools like Gemini, Grok, and ChatGPT. Similar “feature bleed” hits early-stage companies in many forms and can quickly erode what might look like a competitive advantage.

In general, what tools gain from foundation model improvements, they lose equally in their proximity to them. Many a startup has been absorbed by a hyperscaler bumping up against its use case – hence the (possibly deprecated) meme of OpenAI killing startups with every new model release. Breakout products like Granola seem to have decent staying power on user experience alone, but we’ve seen countless other “cool” tools made obsolete. The safest place to build seems to be right at the edges of a hyperscaler’s roadmap, where users can’t get a similar workflow from the base model alone.

Essential questions we like to ask:

  1. How do you respond to foundation model improvements?
  2. Is the workflow you focus on defensible?

Assistants

Best example: Glean (and now Onyx)

Assistants are the messy middle of AI right now and represent the vast majority of companies we see. Whereas tools mainly solve isolated problems or replace workflows, most assistants today act more like an “intelligence layer” wrapped around a customer’s data. Architecturally, they tend to be an internal chat interface that can access documents and connect to other software.

What’s strange about assistants is that the workflows built around them are typically very “light” – if they can be called workflows at all. Common patterns include generating documents or proposals, summarizing information, and finding documents with retrieval-augmented generation. The base models themselves do most of the heavy lifting, which is nice from a macro model-advancement perspective but raises questions about long-term defensibility and durability.
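
For the technically curious, the pattern usually reduces to something like the sketch below: retrieve a few relevant internal documents, then hand them to a base model as context. Everything here (call_llm, the toy corpus, the keyword-overlap ranking) is a simplified stand-in, not any vendor’s actual stack.

    # Minimal sketch of an "intelligence layer" assistant: retrieval-augmented
    # generation over internal documents. call_llm is a hypothetical stand-in
    # for whatever base model the product wraps.

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for a base model API call")

    # Toy corpus; real products index drives, wikis, tickets, CRMs, etc.
    DOCS = {
        "q3_board_deck": "Q3 revenue grew 40% quarter over quarter ...",
        "pricing_policy": "Enterprise discounts require VP approval ...",
    }

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Rank docs by naive keyword overlap (real systems use vector search)."""
        words = set(query.lower().split())
        ranked = sorted(
            DOCS.values(),
            key=lambda text: len(words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def answer(query: str) -> str:
        """The 'light' workflow: fetch context, let the base model do the rest."""
        context = "\n".join(retrieve(query))
        return call_llm(f"Context:\n{context}\n\nQuestion: {query}")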

Most investors today believe the “moat” for these products will come from proprietary data. While this is convenient (and logical), it is often just an extension of classic vertical SaaS thinking. In our evaluations of companies, we find that the quality and depth of “proprietary” data vary greatly.

Essential questions we like to ask:

  1. What exactly is the proprietary data you capture? What makes it impactful?
  2. What workflow do you most want to augment/replace, and why is it valuable?
  3. What does this product evolve into on a five to ten year timeline? 

Agents

Best example: Deep Research

Agents are the true wild west in technology right now. We think there are far fewer true agents in-market than it seems, but the lines are starting to blur as assistant products gradually build in agentic features. In our view, the frontier of agents is centered on autonomy and customer trust: these are systems that are set up with guidelines – not rules – and that operate without human intervention or oversight by design.
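
As a rough illustration of what “guidelines, not rules” means in practice, consider the loop below: the model is steered by a natural-language policy and chooses its own actions, with only a step budget as a backstop. call_llm and the tool registry are hypothetical placeholders, not any product’s real interface.

    # Sketch of an autonomous agent loop: behavior is steered by guidelines in
    # the prompt, not hard-coded branching. All names here are hypothetical.

    GUIDELINES = (
        "Resolve the customer's refund request. Stay under $500 without "
        "escalating; when uncertain, prefer partial refunds over denials."
    )

    def call_llm(prompt: str) -> dict:
        raise NotImplementedError("stand-in for a model that picks the next action")

    TOOLS = {
        "lookup_order": lambda order_id: {"total": 120, "status": "delivered"},
        "issue_refund": lambda order_id, amount: {"refunded": amount},
    }

    def run_agent(task: str, max_steps: int = 10) -> str:
        history = []
        for _ in range(max_steps):
            # The model reads guidelines + history and decides what to do next.
            step = call_llm(f"{GUIDELINES}\nTask: {task}\nHistory: {history}")
            if step["action"] == "done":
                return step["summary"]
            result = TOOLS[step["action"]](**step["args"])
            history.append((step["action"], result))
        return "stopped: step budget exhausted"  # a backstop, not a rule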

The big challenge we’ve observed so far is that people outside the tech community don’t necessarily trust agents yet. The technology to deploy autonomous agents (within some boundary of acceptable error) is already here, but many enterprise customers still want deeper observability into, and control over, how LLM-based systems actually arrive at decisions.

Pricing models are also an open question, as token costs are highly variable for multi-agent systems, even on a per-query or per-task basis. We’ve seen a variety of paradigms, from outcome-based (where a completed task earns revenue) to hybrid (a blended subscription with rate limits) to 100% “token markup” business models that look more like API-as-a-product. The ascendancy of reasoning models and the additional cost of test-time scaling only add to this complexity.
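
Some back-of-the-envelope arithmetic shows why this is hard. In the sketch below (all numbers invented for illustration), the same task can consume wildly different token counts, and each paradigm allocates that variance differently: outcome-based pricing absorbs it in margin, hybrids cap it with rate limits, and token markup passes it through.

    # Toy unit economics for a single agent task under the three paradigms
    # above. Every number is invented for illustration; prints per-task margin.

    TOKEN_COST_PER_1K = 0.01                      # blended provider rate, $
    TOKENS_PER_TASK = [5_000, 40_000, 200_000]    # same task, variable usage

    def outcome_based(tokens: int) -> float:
        """Flat fee per completed task; margin absorbs token variance."""
        return 2.00 - tokens / 1000 * TOKEN_COST_PER_1K

    def hybrid(tokens: int, tasks_per_month: int = 100) -> float:
        """Subscription amortized per task; rate limits cap total usage."""
        return 99.00 / tasks_per_month - tokens / 1000 * TOKEN_COST_PER_1K

    def token_markup(tokens: int) -> float:
        """Resell tokens at a markup; stable margin, looks like an API."""
        cost = tokens / 1000 * TOKEN_COST_PER_1K
        return cost * 1.3 - cost

    for t in TOKENS_PER_TASK:
        print(f"{t:>7} tokens | outcome {outcome_based(t):+5.2f} | "
              f"hybrid {hybrid(t):+5.2f} | markup {token_markup(t):+5.2f}")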

We find that the most exciting work in AI is happening at these edges, where agents are allowed an almost “scary” level of responsibility. The question in these cases is less about what workflow an agent replaces and more about what new abilities someone gains when using (or working with) one.

Essential questions we like to ask:

  1. What’s the underlying cost structure for an agent?
  2. How do customers want to pay? How are they willing to pay?
  3. To what extent do your customers trust agents?

Again, these are just the rough categories we see most companies fall into. We’re most excited, of course, by the founders breaking this mold and building entirely different product paradigms. We’ll be tracking this space closely and look forward to updating this framework as things evolve.