· 8 min read


Retrieval-Augmented Generation (RAG) has become a standard approach to building LLM interfaces on top of specific data, but it can struggle when the user's input doesn't overlap meaningfully with the stored embeddings.

In this post, I want to explore a solution to this by introducing a thought-generation step. The process works as follows:

  1. Each text chunk is first passed to a prompt that generates thoughts about the passage.
  2. The generated thoughts are then embedded.
  3. The RAG interface uses the generated thoughts as context.
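The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's implementation: `fake_llm_thoughts` is a hypothetical stand-in for the LLM prompt that generates thoughts, and `embed` is a toy bag-of-words vector where a real system would call an embedding model. The sample chunks and thoughts are invented for the example.

```python
import math
import re
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fake_llm_thoughts(chunk: str) -> list[str]:
    """Hypothetical placeholder for step 1: an LLM prompted to produce thoughts."""
    canned = {
        "Parts are covered by warranty for 24 months.": [
            "the guarantee on parts lasts two years"
        ],
        "Returns are accepted within 30 days.": [
            "customers can send items back within a month"
        ],
    }
    return canned.get(chunk, [chunk])


def build_index(chunks: list[str], generate_thoughts: Callable[[str], list[str]]):
    """Steps 1 and 2: generate thoughts for each chunk, then embed each thought."""
    index = []
    for chunk in chunks:
        for thought in generate_thoughts(chunk):
            # Keep the source chunk alongside the thought for traceability.
            index.append((embed(thought), thought, chunk))
    return index


def retrieve(query: str, index, k: int = 2) -> list[str]:
    """Step 3: return the k thoughts most similar to the query, to use as context."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)
    return [thought for _, thought, _ in ranked[:k]]


chunks = [
    "Parts are covered by warranty for 24 months.",
    "Returns are accepted within 30 days.",
]
index = build_index(chunks, fake_llm_thoughts)
context = retrieve("how long is the guarantee", index, k=1)
print(context)
```

In this toy setup the query shares the word "guarantee" with the generated thought but not with the original chunk, which says "warranty". That is the kind of gap the thought-generation step is meant to bridge.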

All tests were run with gpt-3.5-turbo and can be reproduced using the notebook.

· 12 min read


Modern language models are great at producing and understanding short passages of text. But if you've ever tried to generate longer, more complicated content, you'll know the output can quickly become a confusing mess as the model loses track of what it should focus on.

In this post we'll explore a technique published by DeepMind for creating screenplays using LLMs. Although this is focused on creative writing, I think the techniques could be employed more widely with some modifications.

I've created an accompanying notebook containing all the code from this post if you want to try it out yourself.