
· 8 min read


Retrieval Augmented Generation (RAG) has become a standard approach to building LLM interfaces on top of specific data, but it can struggle in cases where the input doesn't overlap meaningfully with the stored embeddings.

In this post, I want to explore a solution to this by introducing a thought-generation step. The process works as follows:

  1. Each text chunk is passed to a prompt that generates thoughts about the passage.
  2. The generated thoughts are embedded and stored.
  3. At query time, the RAG interface retrieves the generated thoughts and uses them as context.
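
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's code: `toy_embed`, `build_thought_index`, and `retrieve` are hypothetical names, the bag-of-words "embedding" is a stand-in for a real embedding model, and `generate_thoughts` is assumed to be any function that wraps an LLM prompt and returns a list of thought strings.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def toy_embed(text: str) -> Counter:
    # Assumption: stand-in for a real embedding model (bag-of-words counts).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_thought_index(
    chunks: List[str],
    generate_thoughts: Callable[[str], List[str]],
    embed: Callable[[str], Counter] = toy_embed,
) -> List[Tuple[Counter, str, str]]:
    # Steps 1 and 2: generate thoughts per chunk, then embed each thought.
    index = []
    for chunk in chunks:
        for thought in generate_thoughts(chunk):
            index.append((embed(thought), thought, chunk))
    return index


def retrieve(
    query: str,
    index: List[Tuple[Counter, str, str]],
    k: int = 3,
    embed: Callable[[str], Counter] = toy_embed,
) -> List[str]:
    # Step 3: rank stored thoughts against the query and return the top k
    # to be placed in the LLM prompt as context.
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)
    return [thought for _, thought, _ in ranked[:k]]
```

In a real setup, `generate_thoughts` would call the LLM and `embed` would call an embedding model. The source chunk is kept alongside each thought so it can be looked up later if needed.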

All tests use gpt-3.5-turbo and can be run using the notebook.

· 12 min read


Modern language models are great at producing and understanding short passages of text. But if you've ever tried to generate longer, more complicated content, you'll know the output can quickly become a confusing mess as the model loses track of what it should focus on.

In this post we'll explore a technique published by DeepMind for creating screenplays using LLMs. Although this is focused on creative writing, I think the techniques could be employed more widely with some modifications.

I've created an accompanying notebook with all the code below if you want to try it out yourself.