
Memory RAG

Memory RAG is a simple approach that boosts LLM accuracy from roughly 50%, a GPT-4-level baseline, to 90-95%. It builds contextual embeddings that capture meaning and relationships, letting smaller models reach high accuracy without complex RAG setups or fine-tuning overhead.

Quickstart

First, make sure your API key is set (get yours at app.lamini.ai):

export LAMINI_API_KEY="<YOUR-LAMINI-API-KEY>"
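If you prefer to set the key from Python rather than your shell, you can export it into the process environment before creating the client. A minimal sketch, assuming the Lamini client reads LAMINI_API_KEY from the environment:

import os

# Set the API key for this process only; the client is assumed to read
# LAMINI_API_KEY from the environment, matching the shell export above.
os.environ["LAMINI_API_KEY"] = "<YOUR-LAMINI-API-KEY>"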

To use Memory RAG, build an index by uploading documents and selecting a base open-source LLM. This is the model you'll use to query over your documents.

Initialize the MemoryRAG client:

from lamini import MemoryRAG

client = MemoryRAG("meta-llama/Meta-Llama-3.1-8B-Instruct")

Define the PDF file to embed:

lamini_wikipedia_page_pdf = "https://huggingface.co/datasets/sudocoder/lamini-wikipedia-page/blob/main/Lamini-wikipedia-page.pdf"

Embed and build the Memory RAG Index:

response = client.memory_index(documents=[lamini_wikipedia_page_pdf])

Or, using the REST API (double quotes around the Authorization header let the shell expand $LAMINI_API_KEY):

curl --location 'https://api.lamini.ai/alpha/memory-rag/train' \
    --header "Authorization: Bearer $LAMINI_API_KEY" \
    --form 'files="https://huggingface.co/datasets/sudocoder/lamini-wikipedia-page/blob/main/Lamini-wikipedia-page.pdf"' \
    --form 'model_name="meta-llama/Meta-Llama-3.1-8B-Instruct"'

Next, wait for training to complete by polling the job status.

# The indexing job ID is returned by memory_index()
job_id = response['job_id']

# Check the current status of the job
status = client.status(job_id)
Or, using the REST API:

curl --location 'https://api.lamini.ai/alpha/memory-rag/status' \
    --header "Authorization: Bearer $LAMINI_API_KEY" \
    --header 'Content-Type: application/json' \
    --data '{
        "job_id": 1
    }'
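Indexing can take a few minutes, so in practice you may want to poll until the job reaches a terminal state. A minimal sketch built on the status call above; the "status" field name and the "COMPLETED"/"FAILED" values are assumptions, so adjust them to whatever client.status() actually returns:

import time

# Poll the job until it finishes or fails.
# NOTE: the field name and status values below are illustrative assumptions;
# inspect the payload returned by client.status() for the real ones.
while True:
    status = client.status(job_id)
    if status.get("status") in ("COMPLETED", "FAILED"):
        break
    time.sleep(30)  # wait before checking again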

Finally, query the model.

Create a prompt:

user_prompt = "How is lamini related to llamas?"
prompt_template = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n {prompt} <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
prompt = prompt_template.format(prompt=user_prompt)

Pass the prompt to the Memory RAG model:

response = client.query(prompt)

Or, using the REST API:

curl --location 'https://api.lamini.ai/alpha/memory-rag/completions' \
    --header "Authorization: Bearer $LAMINI_API_KEY" \
    --header 'Content-Type: application/json' \
    --data '{
        "prompt": "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n How are you? <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
        "job_id": 1
    }'
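If you plan to ask several questions, it can help to wrap the chat template in a small helper and reuse the client from the quickstart. A minimal sketch; make_prompt is an illustrative name, and printing the return value of client.query() assumes it is (or contains) the answer text:

def make_prompt(question: str) -> str:
    # Wrap a question in the Llama 3.1 chat template used above.
    template = (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        " {prompt} <|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )
    return template.format(prompt=question)

# Ask a few questions against the same Memory RAG index.
for question in [
    "How is lamini related to llamas?",
    "What is Lamini?",
]:
    print(question)
    print(client.query(make_prompt(question)))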