Tool Comparisons • 10 minutes • Oct 26, 2023

Guardrails for LLMs: a tooling comparison

Misha Iakovlev
Senior MLOps Engineer

This blog is part of our MindGPT series, which you can find out more about here. It’s also the second edition of our mini-series on guardrails for large language models; the first instalment explored how we apply guardrails to MindGPT.

We’re continuing our journey in building out MindGPT, a mental health conversational system backed by a large language model. The data used as a foundation to answer questions comes from the Mind and NHS websites, both highly credible sources of information. Our aim in documenting our journey is to provide an accessible and transparent view of our current progress, and to discuss open problems and how we’ve solved them. The GitHub repository containing all of our code is available to view at any time.

In this blog, the second in our mini-series on guardrails for LLMs, we’ll compare two of the main open source tools out there for adding guardrails to your models. We’ll cover how they work and when you might favour one over the other, along with contextualising all of this to MindGPT.

What are Guardrails for Large Language Models?

What guardrails are, and why they’re useful, becomes clear if we think about MindGPT.

MindGPT is a system designed to answer questions people may have about mental health, and we want to make sure that the model sticks to that topic, preventing it from going off-topic or giving its opinion on other subjects. You can imagine that an unexpected response, or bad advice, could have a negative impact on the person asking the questions, particularly if it’s a sensitive topic.

Guardrails prevent this from happening. They’re a set of rules and boundaries within which an LLM is allowed to operate, i.e., they ensure the model behaves in a certain way.

In our previous blog, we covered this in more detail and showed how our LLM had an opinion on a range of topics, including its favourite football team and pasta recipe. (Spoiler: guardrails helped us prevent this from happening.) Today, we want to compare and contrast two major libraries for guardrails: NeMo Guardrails and Guardrails AI.

NeMo vs. Guardrails AI

Both libraries are under active development and have not yet had their 1.0 release. NeMo Guardrails is developed by NVIDIA, and Guardrails AI is developed by a team of the same name. Both libraries are available under the Apache 2.0 licence.

Although they aim to accomplish the same task – pushing an LLM into behaving a certain way – they approach it rather differently.

NeMo Guardrails uses text embeddings (find out what these are here) to determine the flow that it should execute. In other words, we define a set of scenarios that we want the system to follow (we’ll explore exactly what this means later in the post), and the user input is embedded and matched against one of the scenarios. The embedding model can be configured to better suit our application.
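To make the embedding-matching idea concrete, here’s a toy sketch of the mechanism. Note this is an illustration, not NeMo’s actual implementation; the sentence-transformers model and example phrases are our own choices:

<pre><code>from sentence_transformers import SentenceTransformer, util

# Illustrative only: a toy version of embedding-based intent matching,
# not NeMo Guardrails' actual implementation.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Example phrases for each scenario ("flow") we want the system to follow
flows = {
    "express greeting": ["Hello!", "Good afternoon!"],
    "ask about mental health": ["What is anxiety?", "How do I cope with stress?"],
}

def match_flow(user_input: str) -> str:
    """Embed the user input and pick the flow with the most similar example."""
    input_emb = model.encode(user_input, convert_to_tensor=True)
    best_flow, best_score = None, -1.0
    for flow, examples in flows.items():
        example_embs = model.encode(examples, convert_to_tensor=True)
        score = util.cos_sim(input_emb, example_embs).max().item()
        if score > best_score:
            best_flow, best_score = flow, score
    return best_flow

print(match_flow("Hi there!"))  # -> "express greeting"
</code></pre>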

Guardrails AI, on the other hand, utilises the text-extraction capabilities of LLMs to produce a JSON summary of a given text. It allows us to specify factual pieces of information that we want to extract from the input text, such as names or dates, or more abstract things like the overall topic of the text, as we demonstrated in the previous blog. We are then free to make decisions based on the extracted information; the library itself does not determine the subsequent logic.

How do you add Guardrails to your LLM?

Since the libraries use different mechanisms, they are also integrated into your LLM system differently.

Guardrails AI, by its nature, can be integrated anywhere within an existing application, as it does not make business-logic decisions itself. It only needs to be fed the text to extract from (be it the user input, the model output, or some intermediate representation of them), and we are then free to use its output to define the logic of the downstream application, as sketched below.
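For instance, given the kind of JSON output we extract later in this post (a topic plus an is_mental_health flag), the surrounding application might branch like this. This is a hypothetical sketch, with illustrative function names:

<pre><code># Hypothetical downstream logic: Guardrails AI only extracts the
# information; deciding what to do with it is entirely up to our application.
def run_main_pipeline(question: str) -> str:
    # Placeholder for the real MindGPT retrieval + LLM pipeline
    return f"(answer to: {question})"

def handle_user_question(question: str, validated_output: dict) -> str:
    """Route the question based on what Guardrails AI extracted."""
    if validated_output["is_mental_health"]:
        # On topic: hand the question to the main pipeline
        return run_main_pipeline(question)
    # Off topic: refuse politely instead of letting the model answer
    return "Sorry, I can only answer questions about mental health."
</code></pre>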

NeMo Guardrails, on the other hand, defines the whole conversation-flow logic, wrapping our LLMs, vector databases, and any associated tools. This results in a more tightly coupled system, but the library itself, being powered by LangChain, is flexible enough to cover many use cases.
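A rough sketch of what wrapping looks like in practice, assuming a ./config directory holding the Colang flows and model settings:

<pre><code>from nemoguardrails import LLMRails, RailsConfig

# Load the rails configuration (Colang flows plus model settings)
# from a local directory; "./config" is an assumed path.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# All conversation turns now go through the rails, which decide
# which flow to execute and when to call the underlying LLM.
response = rails.generate(messages=[
    {"role": "user", "content": "Hello!"}
])
print(response["content"])
</code></pre>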

What LLMs can be used with Guardrails?

Both libraries allow many different models to be used, though once again they approach this differently.

NeMo Guardrails, as it is based on LangChain, supports everything that LangChain itself supports. This includes closed-source APIs like OpenAI and Anthropic, as well as open-source options like self-hosted Hugging Face models.
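The model is selected in the rails configuration. Here’s a minimal sketch using RailsConfig.from_content; the engine and model below are just examples:

<pre><code>from nemoguardrails import LLMRails, RailsConfig

# The LLM is selected in the YAML part of the rails configuration.
# Any engine/model pair supported by LangChain should work here;
# the one below is just an example.
yaml_config = """
models:
  - type: main
    engine: openai
    model: text-davinci-003
"""

config = RailsConfig.from_content(yaml_content=yaml_config)
rails = LLMRails(config)
</code></pre>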

Guardrails AI abstracts the model away entirely: all it requires is a function that takes a prompt as input and returns JSON-parsable text as output. So pretty much anything can be used: the documentation examples use the OpenAI API, but any other LLM API or self-hosted model will work, as long as it implements this simple interface. The only caveat is that the model must be able to produce valid JSON, otherwise validation will always fail. The library comes with a prompt template to make the GPT-3.5 model produce JSON, but additional prompt engineering might be required for other models.
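A minimal sketch of that interface; the function body is a placeholder for whatever API or self-hosted model you’d call:

<pre><code># A minimal sketch of the interface Guardrails AI expects: any callable
# that takes the compiled prompt and returns JSON-parsable text can
# stand in for the LLM.
def my_llm_api(prompt: str, **kwargs) -> str:
    # Placeholder: call your own API or self-hosted model here.
    # The returned text must be valid JSON matching the schema,
    # otherwise validation will always fail.
    return '{"topic": "greeting", "is_mental_health": false}'

# The callable is then passed to the guard in place of, e.g.,
# openai.Completion.create (the full guard definition is shown later):
# raw_output, validated_output = guard(my_llm_api, prompt_params={...})
</code></pre>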

Configuring Guardrails

At this point the comparison between the two libraries diverges quite a bit, but it is still worth looking at how each of them is configured separately, to get a better understanding of where and how they can be used.

NeMo Guardrails

<pre><code>define user express greeting
  "Hello!"
  "Good afternoon!"

define flow
  user express greeting
  bot express greeting
  bot offer to help

define bot express greeting
  "Hello there!"

define bot offer to help
  "How can I help you today?"</code></pre>

The above is an example from the NeMo documentation; let’s dissect what we can configure:

  • User example phrases, which are used for similarity search against the user input
  • Hardcoded bot responses
  • Flows: sequences of actions the system can choose to follow, based on the similarity search

Beyond this configuration, and as we’ve already discussed, we can also choose:

  • What LLM is used under the hood, as well as its prompt template and inference parameters
  • Additional external tools, such as search APIs and databases (see the sketch below)
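External tools are hooked into NeMo Guardrails as custom actions that Colang flows can invoke. Here’s a brief sketch; the action itself is hypothetical:

<pre><code>from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")  # assumed config directory
rails = LLMRails(config)

# A hypothetical custom action a flow could invoke, e.g. to query
# a vector database or an external search API.
async def search_knowledge_base(query: str) -> str:
    return "..."  # call the real tool here

rails.register_action(search_knowledge_base, name="search_knowledge_base")
</code></pre>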

Guardrails AI

<pre><code>guardrails_prompt = """
Given the user's question, determine its topic, and decide whether it's on topic of mental health or not.

User asked: ${user_question}

${gr.json_suffix_without_examples}
"""

class OffTopicModel(BaseModel):
    topic: str = Field(description="Topic of the question")

    is_mental_health: bool = Field(description="Is the question related to mental health? Set, False, if and only if the question is related to something other than mental health, such as sports, politics or weather.")

guard = gd.Guard.from_pydantic(output_class=OffTopicModel, prompt=guardrails_prompt)

guard(
   openai.Completion.create,
   engine="text-davinci-003",
   prompt_params={
       "user_question": input_text
   },
   temperature=0.3,
   max_tokens=1024
)
</code></pre>

Let’s now have a look at what we have control over in the Guardrails AI configuration (shown above):

  • Prompt template: instructions on what to do with the input, plus instructions on how to form the JSON
  • Pydantic schema, which defines what information we want the system to include in the output and, potentially, how to validate it (see the sketch after this list)
  • LLM – the model we send the formatted prompt to in order to extract the information, as well as its inference parameters
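As a brief illustration of the validation side, here’s a sketch of how a validator might be attached to the schema. We assume the library’s ValidChoices validator, and the choice values are purely illustrative, not from the MindGPT schema:

<pre><code>from pydantic import BaseModel, Field
from guardrails.validators import ValidChoices

# A sketch: constrain the extracted topic to a fixed set of values
# and re-ask the LLM if validation fails.
class TopicModel(BaseModel):
    topic: str = Field(
        description="Topic of the question",
        validators=[ValidChoices(choices=["mental health", "other"], on_fail="reask")],
    )
</code></pre>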

Since these tools use different approaches to guarding LLMs, the configuration that is available and/or required also differs significantly. However, in both cases we have a great degree of customisability.

Conclusion

We can summarise all of the points discussed in the previous sections in a table:

<table>
  <tr>
    <th>Library</th>
    <th>NeMo Guardrails</th>
    <th>Guardrails AI</th>
  </tr>
  <tr>
    <td>Who?</td>
    <td>NVIDIA</td>
    <td>Guardrails AI</td>
  </tr>
  <tr>
    <td>Licence</td>
    <td>Apache 2.0</td>
    <td>Apache 2.0</td>
  </tr>
  <tr>
    <td>Mechanism</td>
    <td>Matches user inputs against example phrases using sentence embeddings to determine the flow</td>
    <td>Takes the input and extracts information as JSON based on a schema</td>
  </tr>
  <tr>
    <td>Integration</td>
    <td>Expects the LLM to be wrapped, with the app logic defined in its rails configuration</td>
    <td>Can be part of a bigger app and called at any stage; does not wrap the model</td>
  </tr>
  <tr>
    <td>LLM compatibility</td>
    <td>Any LLM supported by LangChain</td>
    <td>Any function that takes text as input and returns JSON-parsable text as output</td>
  </tr>
  <tr>
    <td>Required configuration</td>
    <td>User and bot message definitions; flows; an embedding model; additional tools</td>
    <td>Prompt template; Pydantic schema</td>
  </tr>
</table>

This is by no means a definitive guide on which LLM guardrails library is best and which one you should always choose, especially because both of these tools are still under development. Instead, I intend this blog post to guide you a little, so you can decide which tool fits your particular use case.

What's next?

We’ve compared and contrasted two of the main open source tools out there for adding guardrails to your LLM. In the next blog, the last in our mini-series, we’ll delve even deeper into Guardrails AI.
