For MBS FileMaker Plugin 16.1 we include new Llama functions to use local LLMs on your computer. Instead of paying for a web service to run the LLM on someone else's computer, you can run it locally on yours.

About Llama

The Llama.cpp project allows you to run efficient Large Language Model inference in pure C/C++. You can run many powerful artificial intelligence models, including all LLaMA models, Falcon and RefinedWeb, the Mistral models, Gemma from Google, Phi, Qwen, Yi, Solar 10.7B and Alpaca.

You do not need to pay to use Llama.cpp or buy a subscription. It is completely free, open source, constantly updated and available under the MIT license. And Monkeybread Software provides an interface for FileMaker as part of the MBS Plugin.

Get Models

You can find various models on the Internet, e.g. on huggingface.co. For Llama you need a model in the GGUF format; models in other formats currently need a conversion first. You can search Hugging Face for models usable with the llama.cpp app: huggingface.co/models?apps=llama.cpp

We made tests with the gemma-3-1b-it.Q8_0.gguf model (1 GB) from Google and gpt-oss-20b-Q4_K_M.gguf (11.6 GB) from OpenAI. Bigger models usually know more, but please be aware that the model needs to fit into memory. Links to individual models tend to break as new models get uploaded and old ones disappear.

Get Libraries

To install llama.cpp you can use a package manager like Homebrew: brew.sh/llama.cpp. Or you download binaries from the llama.cpp website or directly from the GitHub release page. There you find builds for macOS (Apple Silicon or Intel), for Linux and for Windows. For Windows there are various builds that use either Vulkan, CUDA or the CPU for performing inference. The libraries can stay in whatever folder you or the installer choose; just tell our plugin where to find them.

Load the library

Now you may want to open our example file: Llama chat.fmp12. There you need to modify the script that loads the llama libraries. Go to the Start script and look for the calls to Llama.LoadLibrary. On Windows you also need to call Process.SetCurrentDirectory to set the current folder, so Windows actually finds the dependent DLLs. A sketch of these steps follows below.
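Here is a minimal sketch of such a Start script. The library paths are placeholders; point them to wherever you installed llama.cpp, and we assume here that Llama.LoadLibrary takes the path of the library file as parameter:

If [ Get ( SystemPlatform ) = 1 ]
	# macOS: load the dylib, e.g. from a Homebrew installation (path is an example)
	Set Variable [ $r ; Value: MBS( "Llama.LoadLibrary" ; "/opt/homebrew/lib/libllama.dylib" ) ]
Else
	# Windows: first switch to the folder with the DLLs, so dependent DLLs are found (folder is an example)
	Set Variable [ $r ; Value: MBS( "Process.SetCurrentDirectory" ; "C:\llama\bin" ) ]
	Set Variable [ $r ; Value: MBS( "Llama.LoadLibrary" ; "C:\llama\bin\llama.dll" ) ]
End If
If [ MBS("IsError") ]
	Show Custom Dialog [ "Failed to load llama library." ; $r ]
End If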
Next you put the file path to the model into the Model Path field. This should be a native path to the model like "C:\Users\User\Models\gemma-3-1b-it.Q8_0.gguf".

Use the model

Next you call Llama.New to create a new session; the $$session variable in the script below holds its reference. You can of course have multiple sessions, but please free the memory later with the Llama.Release function. Then you start a context with the Llama.StartContext function. On the session, you can call the Llama.Query function to ask the LLM a question; the function returns the output of the model. Or you use the Llama.Chat function, which applies a chat template and records the answers, so the model knows the previous exchanges. See also Llama.ModelChatTemplate, Llama.Transcript, and Llama.TranscriptJSON.

Set Variable [ $path ; Value: Llama::Model Path ]
Set Variable [ $r ; Value: MBS("Llama.LoadModel"; $$session; $path) ]
If [ MBS("IsError") ]
	Show Custom Dialog [ "Failed to load model." ; $r ]
Else
	Set Variable [ $r ; Value: MBS( "Llama.StartContext" ; $$session ; JSONSetElement( "{}" ; [ "n_ctx" ; 2048 ; JSONNumber ] ; [ "n_batch" ; 2048 ; JSONNumber ] ) ) ]
	If [ MBS("IsError") ]
		Show Custom Dialog [ "Failed to start context." ; $r ]
	Else
		Set Variable [ $r ; Value: MBS( "Llama.InitSampler" ; $$session ; JSONSetElement( "[]" ; [ "[+].sampler" ; "min_p" ; JSONString ] ; [ "[:].p" ; 0.8 ; JSONNumber ] ; [ "[:].min_keep" ; 1 ; JSONNumber ] ; [ "[+].sampler" ; "temp" ; JSONString ] ; [ "[:].t" ; 0.8 ; JSONNumber ] ; [ "[+].sampler" ; "dist" ; JSONString ] ) ) ]
		Show Custom Dialog [ "Model loaded and ready." ]
		#
		Commit Records/Requests [ With dialog: Off ]
		Truncate Table [ With dialog: Off ; Table: “Llama” ]
		New Record/Request
		Go to Field [ Llama::Text ]
		Set Field [ Llama::Text ; "Hello" ]
	End If
End If

We pass a lot of parameters here, like the context size or the settings for the samplers given to the Llama.InitSampler function. Different parameters for the sampler or the context may produce very different outputs, so please experiment and let us know. Below are two more sketches: one that sends a chat message and one with an alternative sampler configuration.
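To send a message from the current record and store the reply, a sketch like this may work. We assume here that Llama.Chat takes the session reference and the message text; the Llama::Answer field is a hypothetical place to store the result:

Set Variable [ $answer ; Value: MBS( "Llama.Chat" ; $$session ; Llama::Text ) ]
If [ MBS("IsError") ]
	Show Custom Dialog [ "Chat failed." ; $answer ]
Else
	# Llama::Answer is a hypothetical field for this sketch
	Set Field [ Llama::Answer ; $answer ]
End If
# when you are done with the session, free its memory
Set Variable [ $r ; Value: MBS( "Llama.Release" ; $$session ) ]

Because Llama.Chat records the conversation, a second call with a follow-up question lets the model see the previous exchange; use Llama.Transcript or Llama.TranscriptJSON to read back the whole conversation.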
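And as an example of how sampler parameters change the behavior, here is the same Llama.InitSampler call with different values; the values are just assumptions to start your own experiments from:

Set Variable [ $r ; Value: MBS( "Llama.InitSampler" ; $$session ; JSONSetElement( "[]" ;
	[ "[+].sampler" ; "min_p" ; JSONString ] ;
	[ "[:].p" ; 0.05 ; JSONNumber ] ;  // lower min_p keeps more candidate tokens
	[ "[:].min_keep" ; 1 ; JSONNumber ] ;
	[ "[+].sampler" ; "temp" ; JSONString ] ;
	[ "[:].t" ; 1.2 ; JSONNumber ] ;  // higher temperature, more random output
	[ "[+].sampler" ; "dist" ; JSONString ] ) ) ]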