Over the last few weeks, we have seen a spike in both users and GitHub stars for RamaLama, an open source project that simplifies AI model management by leveraging OCI containers. (Read How RamaLama makes working with AI models boring for an overview of the project.)
Coincidentally, this happened around the same time that the DeepSeek AI model was released. We realized that a large number of people were downloading the model and running it with RamaLama when the servers sharing the model became overloaded and started returning 503 errors, which in turn triggered a bug in RamaLama. We quickly fixed the issue and pushed a new release.
The challenge of AI model security
There is some controversy about running DeepSeek models from a security point of view, but this is indicative of a larger problem with AI model proliferation. Entities globally, including the U.S. government, are considering how to monitor and potentially restrict the use of DeepSeek applications and models within their territories. The question at the core of this problem is this: Can we trust this AI model or the application that the model runs in?
This reveals a significant issue with AI models and the applications that run them. With thousands of people experimenting with AI models locally on their laptops, does this present a security issue? Can a given model, DeepSeek or otherwise, be trusted? Can a model trigger the software it’s running on to start stealing information off your laptop and sending it out to the internet?
Compounding this are the applications and websites that host many of these models. Consider that a large number of individuals accessed the DeepSeek model through the DeepSeek website and mobile app. In January 2025, the DeepSeek app rocketed to #1 in the Apple App Store. This means that individual users are sharing their credentials, their smartphone details, and a myriad of additional information with an untrusted entity as they type into the prompt to test an untrusted model. Many enterprise users and teams have security concerns, whether for geopolitical reasons or IT security in general.
RamaLama to the rescue
RamaLama, however, offers a better way.

RamaLama defaults to running AI models inside rootless containers using Podman or Docker. These containers isolate the AI models from information on the underlying host. With RamaLama, the AI model is mounted into the container as a read-only volume, so the process running the model (llama.cpp or vLLM) is isolated from the host.
In addition, because ramalama run uses the --network=none option, the container cannot reach the network and leak any information out of the system. Finally, containers are run with the --rm option, which means that any content written during the running of the container is wiped out when the application exits.
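To make that isolation concrete, here is a minimal sketch of roughly the kind of podman command this adds up to. It is not RamaLama's exact invocation: the model path, container image, and entrypoint below are placeholders for the example, and the real command is assembled by RamaLama itself.

```bash
# Illustrative sketch only -- not RamaLama's exact command line.
# The model path, image name, and server entrypoint are assumptions.
MODEL=$HOME/models/example-model.gguf

podman run \
    --rm \
    --network=none \
    -v "$MODEL:/mnt/models/model.file:ro" \
    quay.io/ramalama/ramalama \
    llama-server --model /mnt/models/model.file
```

Because the model is a read-only bind mount and the network namespace is empty, even a compromised inference process has no host files it can modify and nowhere to send data.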
Conclusion
Here’s how RamaLama delivers a robust security footprint:
Container isolation: AI models run within isolated containers, preventing direct access to the host system.
Read-only volume mounts: The AI model is mounted in read-only mode, meaning that processes inside the container cannot modify host files.
No network access: ramalama run is executed with --network=none, so the model has no outbound connectivity through which information can be leaked.
Auto-cleanup: RamaLama runs containers with --rm, wiping out any temporary data once the session ends.
No access to Linux capabilities: RamaLama drops all Linux capabilities, leaving nothing with which to attack the underlying host.
No new privileges: A Linux kernel feature prevents container processes from gaining additional privileges.
Given these capabilities, RamaLama containerization addresses many of the common risks of testing models.
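If you want to confirm these defaults yourself, one approach (a sketch, assuming a RamaLama-launched container is running, for example via ramalama serve, and that you know its name or ID from podman ps) is to inspect it. The container name below is a placeholder.

```bash
# Placeholder name; use the container ID or name reported by `podman ps`.
CONTAINER=ramalama_example

# Expect a "none" network, all capabilities dropped, auto-removal enabled,
# and no-new-privileges among the security options.
podman inspect "$CONTAINER" --format \
  'network={{.HostConfig.NetworkMode}} capdrop={{.HostConfig.CapDrop}} autoremove={{.HostConfig.AutoRemove}} security={{.HostConfig.SecurityOpt}}'

# The model mount should report rw=false (read-only).
podman inspect "$CONTAINER" --format \
  '{{range .Mounts}}{{.Source}} -> {{.Destination}} rw={{.RW}}{{"\n"}}{{end}}'
```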
How to try RamaLama
Try out RamaLama on your machines by following these installation instructions.
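As a quick start, here is one possible flow, sketched under a few assumptions: RamaLama installed from PyPI via pip (the installation instructions above cover other methods), Podman or Docker already present, and a small example model pulled over the ollama:// transport. Substitute whatever model you actually want to test.

```bash
# Install the RamaLama CLI (one of several supported methods).
pip install ramalama

# Pull a small example model and chat with it in an isolated container.
ramalama pull ollama://tinyllama
ramalama run ollama://tinyllama

# Or serve the same model as a local REST API instead of an interactive chat.
ramalama serve ollama://tinyllama
```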