Pitfalls of building with large language…

Sep 27, 2023

Dealing with more than just hallucinations.

7 Comments

Sep 28, 2023

Great roundup! I'm working on combining language models with formal methods (e.g., via intermediate code generation) to try to make them more robust, but it is definitely proving extra hard: even if you embed the actual correct answer in the prompt, the LM can still bypass it completely and pull from its faulty, hallucination-prone long-term memory. One approach I'm very excited about is grammar-restricted output.

Charlie Guo

Sep 28, 2023

I haven't seen grammar-restricted output before - do you have any links to papers or further reading on how it works?

Alejandro Piad Morffis

Sep 28, 2023

Oh, it's super cool. You take a BNF grammar as guidance and when sampling a new token, you squash the logits of any tokens that are not allowed by the grammar. Thus you can make the LLM output strict JSON, for example. Or anything context-free.

It doesn't guarantee the semantics will be ok of course, but at least the syntax will be strict. Worst case scenario you can get stuck during sampling if none of the tokens allowed by the grammar have nonzero probability.
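
To make the masking step concrete, here is a minimal sketch under heavy assumptions. Instead of a general BNF engine it hard-codes a recognizer for one tiny context-free grammar (nested lists of the digits 0 and 1), and `toy_logits` is a hypothetical stand-in for a real model that just returns random scores. All names here are made up for illustration.

```python
import math
import random

# Toy vocabulary; a real tokenizer would have tens of thousands of entries.
VOCAB = ["[", "]", ",", "0", "1", "<eos>"]

# Allowed next tokens for each parser expectation.
ALLOWED = {
    "value": {"0", "1", "["},
    "value_or_close": {"0", "1", "[", "]"},
    "sep_or_close": {",", "]"},
    "end": {"<eos>"},
}

def allowed_next(prefix):
    """Tokens the grammar permits after `prefix` (assumed valid so far).

    Grammar: value ::= "0" | "1" | "[" "]" | "[" value ("," value)* "]"
    """
    depth, expect = 0, "value"
    for tok in prefix:
        if tok in ("0", "1"):
            expect = "sep_or_close" if depth else "end"
        elif tok == "[":
            depth += 1
            expect = "value_or_close"
        elif tok == "]":
            depth -= 1
            expect = "sep_or_close" if depth else "end"
        elif tok == ",":
            expect = "value"
    return ALLOWED[expect]

def toy_logits(prefix, rng):
    # Hypothetical model: random scores that ignore the prefix entirely.
    return [rng.uniform(-2.0, 2.0) for _ in VOCAB]

def constrained_sample(logits, allowed, rng):
    # Squash disallowed tokens to -inf so they get zero probability.
    masked = [l if tok in allowed else -math.inf for tok, l in zip(VOCAB, logits)]
    top = max(masked)
    if top == -math.inf:
        # The "stuck" case: every allowed token has zero probability.
        raise RuntimeError("no allowed token has nonzero probability")
    weights = [math.exp(l - top) for l in masked]  # unnormalized softmax
    return rng.choices(VOCAB, weights=weights, k=1)[0]

def generate(rng, max_tokens=30):
    out = []
    while len(out) < max_tokens:
        tok = constrained_sample(toy_logits(out, rng), allowed_next(out), rng)
        out.append(tok)
        if tok == "<eos>":
            break
    return out

if __name__ == "__main__":
    print("".join(generate(random.Random(0))))
```

Every emitted token is legal under the grammar, but a run that hits `max_tokens` can still end mid-structure, so the syntax guarantee only covers output that reaches `<eos>`.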

Let me see if I can find a relevant link.

Charlie Guo

Sep 28, 2023

Wait that's awesome! (Also I haven't had to think about BNF grammars since undergrad haha)

It makes me think that we have a lot further to go with code-generating models specifically, since they're able to detect syntactically invalid output. Also, at what point do we stop having LLMs generate human-readable code to pass to other LLMs, and just have them generate more compressed instructions?

Alejandro Piad Morffis

Sep 28, 2023

Yeah you could sample in the space of AST nodes directly. Harder to find or build a suitable training set I suppose.
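
As a rough illustration of what sampling in AST space could look like, the sketch below builds a random arithmetic expression node by node using Python's standard `ast` module and only converts it to text at the end. The random choices stand in for a model's decisions, and `sample_expr` and its parameters are hypothetical names. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast
import random

def sample_expr(rng, depth=0, max_depth=3):
    """Sample an expression tree by choosing AST node types, not characters."""
    # Leaf: an integer constant or a variable name.
    if depth >= max_depth or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ast.Constant(value=rng.randint(0, 9))
        return ast.Name(id=rng.choice(["x", "y"]), ctx=ast.Load())
    # Internal node: a binary operation with two sampled children.
    op = rng.choice([ast.Add, ast.Sub, ast.Mult])()
    return ast.BinOp(
        left=sample_expr(rng, depth + 1, max_depth),
        op=op,
        right=sample_expr(rng, depth + 1, max_depth),
    )

if __name__ == "__main__":
    tree = sample_expr(random.Random(0))
    source = ast.unparse(tree)  # text is derived from the tree, never sampled
    print(source)
```

Because the text is produced from a well-formed tree, it is syntactically valid by construction; as with grammar masking, nothing here says the expression is semantically useful.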