Over the last few weeks, we have seen a spike in both users and GitHub stars for RamaLama, an open source project that simplifies AI model management by leveraging OCI containers. (Read How RamaLama makes working with AI models boring for an overview of the project.)
Coincidentally, this happened around the same time that the DeepSeek AI model was released. We realized that a large number of people were downloading the model and running it with RamaLama when the servers sharing the model became overloaded and started returning 503 errors, which in turn triggered a bug in RamaLama. We quickly fixed the issue and pushed a new release.
The challenge of AI model security
There is some controversy about running DeepSeek models from a security point of view, but this is indicative of a larger problem with AI model proliferation. Entities globally, including the U.S. government, are considering how to monitor and potentially restrict the use of DeepSeek applications and models within their territories. The question at the core of this problem is this: Can we trust this AI model or the application that the model runs in?
This reveals a significant issue with AI models and the applications that run them. With thousands of people experimenting with AI models locally on their laptops, does this present a security issue? Can a given model, DeepSeek or otherwise, be trusted? Can a model trigger the software it’s running on to start stealing information off your laptop and sending it out to the internet?
Compounding this are the applications and websites that host many of these models. Consider that a large number of individuals accessed the DeepSeek model through the DeepSeek website and mobile app. In January 2025, the DeepSeek app rocketed to #1 in the Apple App Store. This means that individual users are sharing their credentials, their smartphone details, and a myriad of additional information with an untrusted entity as they type into the prompt to test an untrusted model. Many enterprise users and teams have security concerns, whether for geopolitical reasons or IT security in general.
RamaLama to the rescue
RamaLama, however, offers a better way.

RamaLama defaults to running AI models inside rootless containers using Podman or Docker. These containers isolate the AI models from information on the underlying host. With RamaLama, the AI model is mounted into the container as a read-only volume, so the process running the model (llama.cpp or vLLM) is isolated from the host.
In addition, because ramalama run uses the --network=none option, the container cannot reach the network and leak any information out of the system. Finally, containers are run with the --rm option, which means that any content written during the running of the container is wiped out when the application exits.
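To make that isolation concrete, here is a minimal sketch of roughly the kind of podman command this adds up to. It is not RamaLama's exact invocation: the model path, container image, and entrypoint below are placeholders for the example, and the real command is assembled by RamaLama itself.

```bash
# Illustrative sketch only -- not RamaLama's exact command line.
# The model path, image name, and server entrypoint are assumptions.
MODEL=$HOME/models/example-model.gguf

podman run \
    --rm \
    --network=none \
    -v "$MODEL:/mnt/models/model.file:ro" \
    quay.io/ramalama/ramalama \
    llama-server --model /mnt/models/model.file
```

Because the model is a read-only bind mount and the network namespace is empty, even a compromised inference process has no host files it can modify and nowhere to send data.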
Conclusion
Here’s how RamaLama delivers a robust security footprint:
Container isolation: AI models run within isolated containers, preventing direct access to the host system.
Read-only volume mounts: The AI model is mounted in read-only mode, meaning that processes inside the container cannot modify host files.
No network access: ramalama run is executed with --network=none, so the model has no outbound connectivity through which information can be leaked.
Auto-cleanup: RamaLama runs containers with --rm, wiping out any temporary data once the session ends.
No access to Linux capabilities: RamaLama drops all Linux capabilities, leaving nothing with which to attack the underlying host.
No new privileges: A Linux kernel feature prevents container processes from gaining additional privileges.
Given these capabilities, RamaLama containerization addresses many of the common risks of testing models.
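If you want to confirm these defaults yourself, one approach (a sketch, assuming a RamaLama-launched container is running, for example via ramalama serve, and that you know its name or ID from podman ps) is to inspect it. The container name below is a placeholder.

```bash
# Placeholder name; use the container ID or name reported by `podman ps`.
CONTAINER=ramalama_example

# Expect a "none" network, all capabilities dropped, auto-removal enabled,
# and no-new-privileges among the security options.
podman inspect "$CONTAINER" --format \
  'network={{.HostConfig.NetworkMode}} capdrop={{.HostConfig.CapDrop}} autoremove={{.HostConfig.AutoRemove}} security={{.HostConfig.SecurityOpt}}'

# The model mount should report rw=false (read-only).
podman inspect "$CONTAINER" --format \
  '{{range .Mounts}}{{.Source}} -> {{.Destination}} rw={{.RW}}{{"\n"}}{{end}}'
```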
How to try RamaLama
Try out RamaLama on your machines by following these installation instructions.
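As a quick start, here is one possible flow, sketched under a few assumptions: RamaLama installed from PyPI via pip (the installation instructions above cover other methods), Podman or Docker already present, and a small example model pulled over the ollama:// transport. Substitute whatever model you actually want to test.

```bash
# Install the RamaLama CLI (one of several supported methods).
pip install ramalama

# Pull a small example model and chat with it in an isolated container.
ramalama pull ollama://tinyllama
ramalama run ollama://tinyllama

# Or serve the same model as a local REST API instead of an interactive chat.
ramalama serve ollama://tinyllama
```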