I'm tired of LLM bullshitting. So I fixed it.

SuspciousCarrot78@lemmy.world · 1 month ago

I'm tired of LLM bullshitting. So I fixed it.

AliasAKA@lemmy.world · 1 month ago

I’ll look into this, but at first blush this is just mostly tool calling with RAG. This does not prevent a whole host of issues with AI, and doesn’t really prevent lying. The general premise here is to put tight guard rails on how it can interact with data, and in some cases entirely forcing a function / tool path with macros. I am not really sure this would work any better than just a stateful and traditional search algorithm on your own data sources, and would require much less hardware / battery / requirements and would be much more portable.

I like the effort, but this feels a bit like trying to make everything look like a nail.

SuspciousCarrot78@lemmy.world · 1 month ago

I’ll cop to that. At a high level it is “tool calling + RAG + guardrails”.

Ok.

But that’s sort of the point: boring plumbing that turns LLMs from improv actors into constrained components.

Addressing your points directly as I understand them -

1) Doesn’t prevent lying

If you mean “LLMs can still hallucinate in general”, yes. No argument. I curtailed them as much as I could with what I could.

But llama-conductor isn’t trying to solve “AI truth” as a metaphysical problem. It’s trying to solve a practical one:

In Mentats mode, the model is not allowed to answer from its own priors or chat history. It only gets a facts block from the Vault. No facts → refusal (not “best effort guess").

That doesn’t make the LLM truthful. It makes it incapable of inventing unseen facts in that mode unless it violates constraints - and then you can audit it because you can see exactly what it was fed and what it output.

So it’s not “solving lying,” it’s reducing the surface area where lying can happen. And making violations obvious.

2) Wouldn’t a normal search algorithm be better?

I don’t know. Would it? Maybe. If all you want is “search my docs,” then yes: use ripgrep + a UI. That’s lighter and more portable.

The niche here is when you want search + synthesis + policy:

bounded context (so the system doesn’t slow down / OOM after long chats)
deterministic short-term memory (JSON on disk, not “model remembers")
staged KB pipeline (raw docs → summaries with provenance → promote to Vault)
refusal-capable “deep think" mode for high-stakes questions

I think an algo or plain search engine can do wonders.

It doesn’t give you a consistent behavioral contract across chat, memory, and retrieval.

3) “Everything looks like a nail”

Maybe. But the nail I’m hitting is: “I want local LLMs to shut up when they don’t know, and show receipts when they do.”

That’s a perfectly cromulent nail to hit.

If you don’t want an LLM in the loop at all, you’re right - don’t use this.

If you do want one, this is me trying to make it behave like infrastructure instead of “vibes”.

Now let’s see Paul Allen’s code :P

I'm tired of LLM bullshitting. So I fixed it.

I'm tired of LLM bullshitting. So I fixed it.

llama-conductor

The thing: llama-conductor

1) KB mechanics that don’t suck (1990s engineering: markdown, JSON, checksums)

2) Mentats: proof-or-refusal mode (Vault-only)

3) Vodka: deterministic memory on a potato budget

1) Doesn’t prevent lying

2) Wouldn’t a normal search algorithm be better?

3) “Everything looks like a nail”