| Title: | Interface for Large Language Model APIs in R |
|---|---|
| Description: | Provides a unified interface to large language models across multiple providers. Supports text generation, tidy data workflows, structured output with optional JSON Schema validation, XML-like tag extraction, and embeddings. Includes chat sessions, consistent error handling, and parallel batch tools. |
| Authors: | Ali Sanaei [aut, cre] |
| Maintainer: | Ali Sanaei <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.8.3 |
| Built: | 2026-06-10 18:06:45 UTC |
| Source: | https://github.com/asanaei/llmr |
Bind tools to a config (provider-agnostic)
bind_tools(config, tools, tool_choice = NULL)
config |
llm_config |
tools |
list of tools (each with name, description, and parameters/input_schema) |
tool_choice |
optional tool_choice spec (provider-specific shape) |
modified llm_config
Creates a tibble of experiments for factorial designs where you want to test all combinations of configs, messages, and repetitions with automatic metadata.
build_factorial_experiments(
  configs,
  user_prompts,
  system_prompts = NULL,
  repetitions = 1,
  config_labels = NULL,
  user_prompt_labels = NULL,
  system_prompt_labels = NULL
)
configs |
List of llm_config objects to test. |
user_prompts |
Character vector (or list) of user-turn prompts. |
system_prompts |
Optional character vector of system messages. Like the other factors, these are fully crossed with the user prompts, so every combination appears. A missing (NA) system prompt is skipped, and the resulting messages contain only the user turn. |
repetitions |
Integer. Number of repetitions per combination. Default is 1. |
config_labels |
Character vector of labels for configs. If NULL, uses "provider_model". |
user_prompt_labels |
Optional labels for the user prompts. |
system_prompt_labels |
Optional labels for the system prompts. |
A tibble with columns: config (list-column), messages (list-column), config_label, user_prompt_label, system_prompt_label, and repetition. Ready for use with call_llm_par().
## Not run:
# Factorial design: 3 configs x 2 user prompts x 10 reps = 60 experiments
configs <- list(gpt4_config, claude_config, llama_config)
user_prompts <- c("Control prompt", "Treatment prompt")

experiments <- build_factorial_experiments(
  configs = configs,
  user_prompts = user_prompts,
  repetitions = 10,
  config_labels = c("gpt4", "claude", "llama"),
  user_prompt_labels = c("control", "treatment")
)

# Use with call_llm_par
results <- call_llm_par(experiments, progress = TRUE)
## End(Not run)
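The full crossing described above can be checked without any API call. The following is a minimal base R sketch, not the package's implementation: it uses expand.grid() on the labels alone to show why 3 configs, 2 user prompts, and 10 repetitions give 60 rows. The column names mirror the documented output, but the object `design` is purely illustrative.

```r
# Cross the three factors by label only (no llm_config objects involved).
design <- expand.grid(
  repetition        = seq_len(10),
  user_prompt_label = c("control", "treatment"),
  config_label      = c("gpt4", "claude", "llama"),
  stringsAsFactors  = FALSE
)

# One row per combination of config, prompt, and repetition.
n_experiments <- nrow(design)

# Every config x prompt cell holds the same number of repetitions.
cell_counts <- table(design$config_label, design$user_prompt_label)
print(n_experiments)
print(cell_counts)
```

Each cell of `cell_counts` equals the number of repetitions, which is the balance property a factorial design relies on.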
call_llm() dispatches to the correct provider implementation based on
config$provider. It supports both generative chat/completions and
embeddings, plus a simple multimodal shortcut for local files.
call_llm(config, messages, verbose = FALSE)

## S3 method for class 'ollama'
call_llm(config, messages, verbose = FALSE)
config |
An llm_config object. |
messages |
One of:
|
verbose |
Logical. If |
Generative mode: an llmr_response object. Use as.character(x) to get just the text; print(x) shows text plus a status line; use helpers finish_reason(x) and tokens(x).
Embedding mode: provider-native list with an element data; convert with parse_embeddings().
OpenAI-compatible: On a server 400 that identifies the bad
parameter as max_tokens, LLMR will, unless no_change=TRUE,
retry once replacing max_tokens with max_completion_tokens
(and inform via a cli_alert_info). The former experimental
"uncapped retry on empty content" is disabled by default to
avoid unexpected costs.
Anthropic: max_tokens is required; if omitted LLMR uses
2048 and warns. Multimodal images are inlined as base64 and PDFs
as document blocks. Extended thinking is supported: provide
thinking_budget (which must stay below max_tokens) and the
response will carry content blocks of type "thinking", also
exposed as the thinking field of the result. Beta features can be
requested by passing anthropic_beta = "...", sent as the
anthropic-beta header.
Gemini (REST): systemInstruction is supported; user
parts use text/inlineData(mimeType,data); responses are set to
responseMimeType = "text/plain". For Vertex AI, use
provider = "gemini", vertex = TRUE, project = ....
Ollama (local): OpenAI-compatible endpoints on http://localhost:11434/v1/*;
no Authorization header is required. Override with api_url as needed.
Alibaba / Moonshot regions: Defaults target the
international endpoints (dashscope-intl.aliyuncs.com and
api.moonshot.ai). China-region accounts must pass api_url for the
mainland hosts (dashscope.aliyuncs.com and api.moonshot.cn);
using the wrong region returns HTTP 401.
Error handling: HTTP errors raise structured conditions with
classes like llmr_api_param_error, llmr_api_rate_limit_error,
llmr_api_server_error; see the condition fields for status, code,
request id, and (where supplied) the offending parameter.
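Because the errors are classed conditions, they can be routed with tryCatch(). The sketch below does not call any provider: it signals a simulated condition that carries one of the class names listed above, and the field name `status` is an assumption made for illustration rather than a documented field name.

```r
# Simulate a classed error like the ones described above.
# The `status` field name is hypothetical.
simulate_rate_limit <- function() {
  cond <- structure(
    class = c("llmr_api_rate_limit_error", "error", "condition"),
    list(message = "Too many requests", call = NULL, status = 429L)
  )
  stop(cond)
}

# Handlers are matched by class, in the order given.
outcome <- tryCatch(
  simulate_rate_limit(),
  llmr_api_param_error      = function(e) "fix the request",
  llmr_api_rate_limit_error = function(e) paste("wait and retry; status", e$status),
  llmr_api_server_error     = function(e) "retry later",
  error                     = function(e) "unexpected error"
)
print(outcome)
```

Placing the generic `error` handler last keeps it as a fallback for conditions that match none of the specific classes.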
See the "multimodal shortcut" described under messages. Internally,
LLMR expands these into the provider's native request shape and tilde-expands
local file paths.
Ollama provides an OpenAI-compatible HTTP API on localhost by default. Start the
daemon and pull a model first (terminal): ollama serve (in background) and
ollama pull llama3. Then configure LLMR with
llm_config("ollama", "llama3", embedding = FALSE) for chat or
llm_config("ollama", "nomic-embed-text", embedding = TRUE) for embeddings.
Override the endpoint with api_url if not using the default
http://localhost:11434/v1/*.
llm_config,
call_llm_robust,
llm_chat_session,
parse_embeddings,
finish_reason,
tokens
## Not run:
## 1) Basic generative call
cfg <- llm_config("openai", "gpt-5-nano")
call_llm(cfg, "Say hello in Greek.")

## 2) Generative with rich return
r <- call_llm(cfg, "Say hello in Greek.")
r
as.character(r)
finish_reason(r); tokens(r)

## 3) Anthropic extended thinking (single example)
## max_tokens must cover the thinking budget plus the visible reply.
a_cfg <- llm_config("anthropic", "claude-sonnet-4-6", max_tokens = 20000, thinking_budget = 16000)
r2 <- call_llm(a_cfg, "Compute 87*93 in your head. Give only the final number.")
# reasoning text:
r2$thinking
# final text:
as.character(r2)

## 4) Multimodal (named-vector shortcut)
msg <- c(
  system = "Answer briefly.",
  user = "Describe this image in one sentence.",
  file = "~/Pictures/example.png"
)
call_llm(cfg, msg)

## 5) Embeddings
e_cfg <- llm_config("voyage", "voyage-3.5-lite", embedding = TRUE)
emb_raw <- call_llm(e_cfg, c("first", "second"))
emb_mat <- parse_embeddings(emb_raw)

## 6) With a chat session
ch <- chat_session(cfg)
ch$send("Say hello in Greek.") # prints the same status line as `print.llmr_response`
ch$history()
## End(Not run)
Broadcasts different messages using the same configuration in parallel.
Perfect for batch processing different prompts with consistent settings.
Use setup_llm_parallel() when you want explicit control over workers.
call_llm_broadcast(config, messages, ...)
config |
Single llm_config object to use for all calls. |
messages |
A character vector (each element is a prompt) OR a list where each element is a pre-formatted message list. |
... |
Additional arguments passed to call_llm_par(). |
A tibble with columns: message_index (metadata), the config
list-column (so llm_par_resume() can re-run failures), provider, model,
all model parameters, response_text, raw_response_json, success,
error_message, and the other diagnostics documented in call_llm_par().
Recommended workflow:
Call setup_llm_parallel() once at the start of your script.
Run one or more parallel experiments (e.g., call_llm_broadcast()).
Call reset_llm_parallel() at the end to restore sequential processing.
If the active future plan is sequential, call_llm_par() temporarily switches
to multisession for the duration of the call.
setup_llm_parallel, reset_llm_parallel,
call_llm_par, llm_fn, llm_mutate
## Not run:
# Broadcast different questions
config <- llm_config(provider = "openai", model = "gpt-4.1-nano")
messages <- list(
  list(list(role = "user", content = "What is 2+2?")),
  list(list(role = "user", content = "What is 3*5?")),
  list(list(role = "user", content = "What is 10/2?"))
)

setup_llm_parallel(workers = 4, verbose = TRUE)
results <- call_llm_broadcast(config, messages)
reset_llm_parallel(verbose = TRUE)
## End(Not run)
Compares different configurations (models, providers, settings) using the same message.
Perfect for benchmarking across different models or providers.
Use setup_llm_parallel() when you want explicit control over workers.
call_llm_compare(configs_list, messages, ...)
configs_list |
A list of llm_config objects to compare. |
messages |
A character vector or a list of message objects (same for all configs). |
... |
Additional arguments passed to call_llm_par(). |
A tibble with columns: config_index (metadata), the config
list-column (so llm_par_resume() can re-run failures), provider, model,
all varying model parameters, response_text, raw_response_json, success,
error_message, and the other diagnostics documented in call_llm_par().
Recommended workflow:
Call setup_llm_parallel() once at the start of your script.
Run one or more parallel experiments (e.g., call_llm_broadcast()).
Call reset_llm_parallel() at the end to restore sequential processing.
If the active future plan is sequential, call_llm_par() temporarily switches
to multisession for the duration of the call.
setup_llm_parallel, reset_llm_parallel,
call_llm_par
## Not run:
# Compare different models
config1 <- llm_config(provider = "openai", model = "gpt-5-nano")
config2 <- llm_config(provider = "groq", model = "openai/gpt-oss-20b")
configs_list <- list(config1, config2)
messages <- "Explain quantum computing"

setup_llm_parallel(workers = 4, verbose = TRUE)
results <- call_llm_compare(configs_list, messages)
reset_llm_parallel(verbose = TRUE)
## End(Not run)
Processes experiments from a tibble where each row contains a config and message pair.
This is the core parallel processing function. Metadata columns are preserved.
Use setup_llm_parallel() when you want explicit control over workers.
call_llm_par(
  experiments,
  simplify = TRUE,
  tries = 10,
  wait_seconds = 2,
  backoff_factor = 120^(1/tries),
  verbose = FALSE,
  memoize = FALSE,
  max_workers = NULL,
  progress = FALSE,
  json_output = NULL,
  start_jitter = 0
)
experiments |
A tibble/data.frame with required list-columns 'config' (llm_config objects) and 'messages' (character vector OR message list). |
simplify |
If TRUE (default), provider, model, and the model parameters stored in each row's config are unnested into regular columns for easy filtering and grouping. |
tries |
Integer. Total number of attempts per call (first call plus retries). Default is 10. |
wait_seconds |
Numeric. Initial wait time (seconds) before retry. Default is 2. |
backoff_factor |
Numeric. Multiplier for wait time after each failure. Default is 120^(1/tries), which is about 1.61 when tries = 10. |
verbose |
Logical. If TRUE, prints progress and debug information. |
memoize |
Logical. If TRUE, enables caching for identical requests. Note that under a multisession plan each worker process keeps its own cache, so deduplication is per worker, not global. |
max_workers |
Integer. Maximum number of parallel workers. If NULL, auto-detects. |
progress |
Logical. If TRUE, shows progress bar. |
json_output |
Deprecated. Raw JSON string is always included as raw_response_json. This parameter is kept for backward compatibility but has no effect. |
start_jitter |
Each call starts after a uniformly distributed delay
between 0 and |
A tibble containing all original columns plus:
response_text - assistant text (or NA on failure)
raw_response_json - raw JSON string (on failure: the
provider's error body when available)
success, error_message
finish_reason - e.g. "stop", "length", "filter", "tool", or "error:category"
sent_tokens, rec_tokens, total_tokens, reasoning_tokens
response_id
duration - seconds
status_code, error_code, bad_param - error
diagnostics (NA on success)
response - the full llmr_response object (or NULL on failure)
The response column holds llmr_response objects on success, or NULL on failure.
Recommended workflow:
Call setup_llm_parallel() once at the start of your script.
Run one or more parallel experiments (e.g., call_llm_broadcast()).
Call reset_llm_parallel() at the end to restore sequential processing.
If the active future plan is sequential, this function temporarily switches
to multisession for the duration of the call.
For setting up the environment: setup_llm_parallel, reset_llm_parallel.
For simpler, pre-configured parallel tasks: call_llm_broadcast, call_llm_sweep, call_llm_compare.
For creating experiment designs: build_factorial_experiments.
## Not run:
# Simple example: Compare two models on one prompt
cfg1 <- llm_config("openai", "gpt-4.1-nano")
cfg2 <- llm_config("groq", "openai/gpt-oss-20b")

experiments <- tibble::tibble(
  model_id = c("gpt-4.1-nano", "groq-gpt-oss-20b"),
  config = list(cfg1, cfg2),
  messages = "Count the number of the letter e in this word: Freundschaftsbeziehungen "
)

setup_llm_parallel(workers = 2)
results <- call_llm_par(experiments, progress = TRUE)
reset_llm_parallel()

print(results[, c("model_id", "response_text")])
## End(Not run)
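To get a feel for the default retry settings of call_llm_par(), the sketch below computes a nominal wait schedule in base R. It assumes the wait before the k-th retry is wait_seconds * backoff_factor^(k - 1) and ignores any jitter or Retry-After handling; that formula is an assumption for illustration, not a statement of the package internals.

```r
tries          <- 10
wait_seconds   <- 2
backoff_factor <- 120^(1 / tries)  # default from the usage above

# With `tries` total attempts there are at most `tries - 1` waits.
retry <- seq_len(tries - 1)

# Assumed schedule: the first wait is wait_seconds, then it grows geometrically.
waits <- wait_seconds * backoff_factor^(retry - 1)

schedule <- data.frame(retry = retry, wait = round(waits, 1))
print(schedule)
print(sum(waits))  # total nominal waiting time if every attempt fails
```

By construction backoff_factor^tries equals 120, so the multiplier shrinks as tries grows and the waits rise more gently when more attempts are allowed.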
Enables structured output on each config (if not already set), runs, then parses JSON.
call_llm_par_structured(experiments, schema = NULL, .fields = NULL, ...)
experiments |
Tibble with |
schema |
Optional JSON Schema list. |
.fields |
Optional fields to hoist from parsed JSON (supports nested paths). |
... |
Passed to |
call_llm_par(), llm_parse_structured_col(),
enable_structured_output()
Injects tag instructions into each experiment row, runs call_llm_par(),
then parses XML-like tags from each response via llm_parse_tags_col().
call_llm_par_tags(experiments, .tags, .fields = NULL, ...)
experiments |
Tibble with |
.tags |
Character vector of tag names to request and parse. |
.fields |
|
... |
Passed to |
call_llm_par(), llm_parse_tags_col(), llm_fn_tags(),
llm_mutate_tags()
Wraps call_llm so that transient failures are retried while
permanent ones fail fast. Retried conditions are rate limits (HTTP 429),
server errors (HTTP 5xx and 408), and network-level interruptions
(timeouts, connection resets, DNS failures). Errors that retrying cannot
fix, such as an invalid parameter (400), a missing key (401/403), or a
prompt that exceeds the context window, are raised immediately.
call_llm_robust(
  config,
  messages,
  tries = 5,
  wait_seconds = 2,
  backoff_factor = 3,
  verbose = FALSE,
  memoize = FALSE
)
config |
An llm_config object. |
messages |
A list of message objects (or character vector for embeddings). |
tries |
Integer. Total number of attempts (the first call plus retries) before giving up. Default is 5. |
wait_seconds |
Numeric. Initial wait time (seconds) before the first retry. Default is 2. |
backoff_factor |
Numeric. Multiplier for wait time after each failure. Default is 3. |
verbose |
Logical. If TRUE, prints the full API response. |
memoize |
Logical. If TRUE, calls are cached to avoid repeated identical requests. Default is FALSE. |
When the provider supplies a Retry-After header with a 429, the wait
honors it; otherwise waits grow exponentially with a little jitter so that
parallel workers do not retry in lockstep.
The successful result from call_llm, or an error if all retries fail.
call_llm for the underlying, non-robust API call.
cache_llm_call for a memoised version that avoids repeated requests.
llm_config to create the configuration object.
chat_session for stateful, interactive conversations.
## Not run:
robust_resp <- call_llm_robust(
  config = llm_config("groq", "openai/gpt-oss-20b"),
  messages = list(list(role = "user", content = "Hello, LLM!")),
  tries = 5,
  wait_seconds = 2,
  memoize = FALSE
)
print(robust_resp)
as.character(robust_resp)
## End(Not run)
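The retry policy described for call_llm_robust() (retry transient failures, fail fast on permanent ones) can be sketched in base R. This is an illustrative pattern only: the helper `retry_with_backoff()`, the class `transient_error`, and the `flaky()` function are all hypothetical names, and the sketch omits jitter and Retry-After handling.

```r
# Retry only conditions classed "transient_error"; anything else propagates at once.
retry_with_backoff <- function(fn, tries = 5, wait_seconds = 2, backoff_factor = 3) {
  wait <- wait_seconds
  for (attempt in seq_len(tries)) {
    result <- tryCatch(fn(), transient_error = function(e) e)
    if (!inherits(result, "transient_error")) return(result)
    if (attempt == tries) stop(result)  # out of attempts: re-raise the last error
    Sys.sleep(wait)
    wait <- wait * backoff_factor       # grow the wait after each failure
  }
}

# Helper that builds a transient condition object.
transient <- function(msg) {
  structure(class = c("transient_error", "error", "condition"),
            list(message = msg, call = NULL))
}

# A function that fails twice with a transient error, then succeeds.
attempts <- 0
flaky <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop(transient("HTTP 429"))
  "ok"
}

value <- retry_with_backoff(flaky, tries = 5, wait_seconds = 0.01, backoff_factor = 2)
print(value)
print(attempts)
```

An ordinary error raised by `fn` is not caught by the `transient_error` handler, so it surfaces on the first attempt, which matches the fail-fast behaviour described above for permanent errors.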
Like call_llm(), but the reply arrives incrementally: callback is
invoked with each text chunk as it is generated, and the complete
llmr_response is returned at the end. Streaming keeps long generations
responsive and avoids HTTP timeouts on slow, lengthy completions.
call_llm_stream(
  config,
  messages,
  callback = function(chunk) cat(chunk),
  verbose = FALSE
)
config |
An llm_config for a generative model. |
messages |
Messages as in call_llm(). |
callback |
Function called with each text chunk (a character scalar)
as it arrives. The default prints chunks to the console with |
verbose |
Print the assembled response object at the end. |
Supported providers: all OpenAI-compatible chat APIs (openai, groq,
together, deepseek, xai, alibaba, zhipu, moonshot, xiaomi, ollama),
Anthropic, and Gemini. The request body is built by the same internals as
call_llm(), so parameters, structured output, and hooks behave
identically; only the transport differs.
An llmr_response assembled from the stream (invisibly). Token
usage is filled when the provider reports it in the stream; otherwise it
is NA.
## Not run:
cfg <- llm_config("groq", "openai/gpt-oss-20b")
r <- call_llm_stream(cfg, "Tell a 100-word story about a lighthouse.")
tokens(r)
## End(Not run)
Sweeps through different values of a single parameter while keeping the message constant.
Perfect for hyperparameter tuning, temperature experiments, etc.
Use setup_llm_parallel() when you want explicit control over workers.
call_llm_sweep(base_config, param_name, param_values, messages, ...)
base_config |
Base llm_config object to modify. |
param_name |
Character. Name of the parameter to vary (e.g., "temperature", "max_tokens"). |
param_values |
Vector. Values to test for the parameter. |
messages |
A character vector or a list of message objects (same for all calls). |
... |
Additional arguments passed to call_llm_par(). |
A tibble with columns: swept_param_name, the varied parameter column,
the config list-column (so llm_par_resume() can re-run failures),
provider, model, all other model parameters, response_text,
raw_response_json, success, error_message, and the other diagnostics
documented in call_llm_par().
Recommended workflow:
Call setup_llm_parallel() once at the start of your script.
Run one or more parallel experiments (e.g., call_llm_broadcast()).
Call reset_llm_parallel() at the end to restore sequential processing.
If the active future plan is sequential, call_llm_par() temporarily switches
to multisession for the duration of the call.
setup_llm_parallel, reset_llm_parallel,
call_llm_par
## Not run:
# Temperature sweep
config <- llm_config(provider = "openai", model = "gpt-4.1-nano")
messages <- "What is 15 * 23?"
temperatures <- c(0, 0.3, 0.7, 1.0, 1.5)

setup_llm_parallel(workers = 4, verbose = TRUE)
results <- call_llm_sweep(config, "temperature", temperatures, messages)
results |> dplyr::select(temperature, response_text)
reset_llm_parallel(verbose = TRUE)
## End(Not run)
Sends messages together with native tool definitions, executes every tool
the model calls, feeds the results back, and repeats until the model
answers in plain text (or max_rounds is reached). Supported for
OpenAI-compatible providers (openai, groq, together, deepseek, xai,
alibaba, zhipu, moonshot, xiaomi, ollama) and Anthropic.
call_llm_tools(
  config,
  messages,
  tools,
  max_rounds = 8L,
  max_tool_calls = Inf,
  verbose = FALSE,
  tries = 3L,
  wait_seconds = 2
)
config |
An llm_config for a generative model. |
messages |
Messages as in call_llm(). |
tools |
One |
max_rounds |
Maximum model turns (a turn may contain several tool calls). When reached, the last response is returned as-is with a warning. |
max_tool_calls |
Maximum tool executions across the whole loop.
Exceeding it raises a condition of class |
verbose |
Print each tool invocation as it happens. |
tries, wait_seconds
|
Retry controls passed to |
The final llmr_response. The full conversation (including tool
results) is attached as attr(x, "messages"); a tibble of executed
calls as attr(x, "tool_history") with columns round, name,
arguments (JSON), result; and aggregate spend across the whole loop
as attr(x, "tool_loop"), a list with model_calls, sent, rec
(token totals over every internal model call, NA when the provider
reported none), and tool_calls. Note that tokens(x) alone covers
only the final model call.
llm_tool(), tool_calls(), call_llm()
## Not run:
weather <- llm_tool(
  function(city) paste0("22C and clear in ", city),
  name = "get_weather",
  description = "Current weather for a city.",
  parameters = list(city = list(type = "string"))
)
cfg <- llm_config("groq", "openai/gpt-oss-20b", temperature = 0)
r <- call_llm_tools(cfg, "What is the weather in Tunis?", tools = weather)
as.character(r)
attr(r, "tool_history")
## End(Not run)
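The loop that call_llm_tools() is described as running (call the model, execute requested tools, feed results back, stop on plain text or at max_rounds) can be illustrated offline. The sketch below replaces the model with a hypothetical `fake_model()` that requests one tool call and then answers; `run_tool_loop()` and the message shapes are assumptions for illustration, not the package's internals.

```r
# Hypothetical stand-in for a model: asks for a tool once, then answers in text.
fake_model <- function(messages) {
  last <- messages[[length(messages)]]
  if (identical(last$role, "tool")) {
    list(type = "text", content = paste("The weather:", last$content))
  } else {
    list(type = "tool_call", name = "get_weather", arguments = list(city = "Tunis"))
  }
}

# Tools are plain R functions looked up by name.
tools <- list(get_weather = function(city) paste0("22C and clear in ", city))

run_tool_loop <- function(model, messages, tools, max_rounds = 8L) {
  history <- list()
  for (round in seq_len(max_rounds)) {
    reply <- model(messages)
    if (identical(reply$type, "text")) {
      return(list(text = reply$content, history = history, rounds = round))
    }
    # Execute the requested tool and append its result to the conversation.
    result <- do.call(tools[[reply$name]], reply$arguments)
    history[[length(history) + 1L]] <- list(round = round, name = reply$name, result = result)
    messages[[length(messages) + 1L]] <- list(role = "tool", content = result)
  }
  warning("max_rounds reached before a plain-text answer")
  list(text = NA_character_, history = history, rounds = max_rounds)
}

out <- run_tool_loop(
  fake_model,
  list(list(role = "user", content = "What is the weather in Tunis?")),
  tools
)
print(out$text)
```

Here the first round produces a tool call and the second a text answer, so the loop ends after two model turns with one entry in the history.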
Removes response_format/response_schema/response_mime_type and schema tool if present. Keeps user tools intact.
disable_structured_output(config)
config |
llm_config |
Turn on structured output for a model configuration. Supports OpenAI-compatible providers (OpenAI, Groq, Together, x.ai, DeepSeek, Xiaomi, Alibaba (Qwen), Zhipu, Moonshot), Anthropic, and Gemini.
enable_structured_output(
  config,
  schema = NULL,
  name = "llmr_schema",
  method = c("auto", "json_mode", "tool_call"),
  strict = TRUE
)
config |
An llm_config object. |
schema |
A named list representing a JSON Schema.
If |
name |
Character. Schema/tool name for providers requiring one. Default "llmr_schema". |
method |
One of c("auto","json_mode","tool_call"). "auto" chooses the best per provider. You rarely need to change this. |
strict |
Logical. Request strict validation when supported
(OpenAI-compatible). Strict mode has formal requirements of its own:
every object must set |
Modified llm_config.
OpenAI, Groq, Together, x.ai, and Ollama accept a strict json_schema
response format. DeepSeek, Alibaba (Qwen), Zhipu, Moonshot, and Xiaomi
accept only JSON-object mode; for them the supplied schema drives local
parsing and validation, so the prompt itself should describe the desired
fields. Anthropic enforcement runs through a forced tool call; Gemini
through responseJsonSchema.
A supplied schema is sent as responseJsonSchema (standard JSON Schema,
supported by Gemini 2.5+ models) together with the JSON mime type. For an
older model that rejects it, set gemini_enable_response_schema = FALSE in
the config to fall back to JSON-mime-type-only mode (the reply is still
parsed and can be validated locally).
For tasks where strict JSON schema is unnecessary or unsupported, consider
llm_mutate() with .tags or llm_mutate_tags() for soft structured output.
disable_structured_output(), llm_parse_structured(),
llm_parse_structured_col(), llm_mutate_structured(),
llm_mutate_tags()
Creates a list of llm_config objects from a base configuration and sweeping
parameter vectors. Uses expand.grid() internally.
expand_llm_config(base_config, ...)
base_config |
An llm_config object to use as the base. |
... |
Named vectors of parameter values to sweep (e.g., |
A list of llm_config objects.
llm_config(), llm_cross_design(), call_llm_par()
## Not run:
base <- llm_config("openai", "gpt-4.1-nano")
cfgs <- expand_llm_config(base, temperature = c(0, 0.5, 1), model = c("gpt-4.1-nano", "gpt-4.1-mini"))
length(cfgs)
## End(Not run)
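Since the expansion is described as using expand.grid(), the number of resulting configs is the product of the lengths of the swept vectors. The sketch below reproduces that idea with a plain list standing in for an llm_config object; the stand-in and the use of modifyList() are assumptions for illustration, not the package's code.

```r
# Plain list as a stand-in for an llm_config object.
base <- list(provider = "openai", model = "gpt-4.1-nano")

# All combinations of the swept parameters (first factor varies fastest).
grid <- expand.grid(
  temperature = c(0, 0.5, 1),
  model       = c("gpt-4.1-nano", "gpt-4.1-mini"),
  stringsAsFactors = FALSE
)

# One modified copy of the base per grid row.
cfgs <- lapply(seq_len(nrow(grid)), function(i) {
  modifyList(base, as.list(grid[i, , drop = FALSE]))
})

print(length(cfgs))
str(cfgs[[6]])
```

Three temperatures crossed with two models yield six configurations, which is what length(cfgs) reports.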
Generates embeddings for a character vector of texts in batches, which helps
stay within provider rate limits. Each batch is sent through call_llm_robust;
the results are parsed with parse_embeddings and stitched together into a
single numeric matrix.
get_batched_embeddings(
  texts,
  embed_config,
  batch_size = 50,
  verbose = FALSE,
  tries = 5,
  wait_seconds = 2,
  backoff_factor = 3
)
texts |
Character vector of texts to embed. If named, the names will be used as row names in the output matrix. |
embed_config |
An |
batch_size |
Integer. Number of texts to process in each batch. Default is 50. (Gemini's developer API embeds at most 100 texts per request; larger batches are split automatically.) |
verbose |
Logical. If TRUE, prints progress messages. Default is FALSE. |
tries, wait_seconds, backoff_factor
|
Retry controls forwarded to call_llm_robust(). |
A numeric matrix where each row is an embedding vector for the corresponding text.
Columns are named v1, v2, ..., vK where K is the embedding dimension.
If embedding fails for certain texts, those rows will be filled with NA values.
The matrix will always have the same number of rows as the input texts.
Returns NULL if no embeddings were successfully generated.
llm_config to create the embedding configuration.
parse_embeddings to convert the raw response to a numeric matrix.
## Not run:
# Basic usage
texts <- c("Hello world", "How are you?", "Machine learning is great")
names(texts) <- c("greeting", "question", "statement")

# The key is read from the VOYAGE_API_KEY environment variable.
embed_cfg <- llm_config(
  provider = "voyage",
  model = "voyage-3.5-lite",
  embedding = TRUE
)

embeddings <- get_batched_embeddings(
  texts = texts,
  embed_config = embed_cfg,
  batch_size = 2
)
## End(Not run)
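The batching behaviour described above (fixed-size batches, NA rows for failed texts, row names taken from the input) can be sketched without a provider. In the code below `fake_embed()` is a hypothetical embedder that fails on any batch containing the text "bad", and the embedding dimension is fixed in advance for simplicity; neither reflects the package's actual implementation.

```r
# Hypothetical embedder: 3 numbers per text; errors if the batch contains "bad".
fake_embed <- function(batch) {
  if (any(batch == "bad")) stop("provider error")
  cbind(nchar(batch), 1, 0)
}

embed_in_batches <- function(texts, embed_fn, batch_size = 50, dim = 3) {
  # Pre-fill with NA so failed batches leave NA rows behind.
  out <- matrix(NA_real_, nrow = length(texts), ncol = dim,
                dimnames = list(names(texts), paste0("v", seq_len(dim))))
  batches <- split(seq_along(texts), ceiling(seq_along(texts) / batch_size))
  for (idx in batches) {
    res <- tryCatch(embed_fn(texts[idx]), error = function(e) NULL)
    if (!is.null(res)) out[idx, ] <- res
  }
  out
}

texts <- c(greeting = "Hello world", question = "How are you?",
           broken = "bad", statement = "ok")
emb <- embed_in_batches(texts, fake_embed, batch_size = 2)
print(emb)
```

In this sketch a failure affects the whole batch, so "ok" shares the NA fate of "bad" because both fall in the second batch; the output still has one row per input text.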
Computes per-row majority labels and overall reliability for replicate
columns produced by llm_replicate() (or any set of columns holding
repeated codings of the same units, including codings by different models
or by humans). Reliability is reported as average pairwise percent
agreement and Krippendorff's alpha for nominal data, the statistic
reviewers most often ask for; alpha handles missing values (failed calls)
gracefully.
llm_agreement(.data, cols = NULL, prefix = NULL, normalize = TRUE)
.data |
A data frame holding the replicate columns. |
cols |
Character vector naming the replicate columns. Alternatively
supply |
prefix |
Base name: columns matching |
normalize |
If |
An object of class llmr_agreement: a list with
by_row: a tibble with one row per unit: majority (modal
label, NA on ties), share (modal share of non-missing
replicates), n_distinct, unanimous, tie, n_missing.
summary: a one-row tibble: n_units, n_replicates,
mean_pairwise_agreement, krippendorff_alpha, n_unanimous,
n_ties.
Printing shows the summary.
Krippendorff, K. (2019). Content Analysis: An Introduction to Its Methodology (4th ed.), chapter 12. The alpha implemented here is the nominal-data form with missing values allowed.
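The quantities described above can be sketched in base R. This is an illustrative sketch under stated assumptions, not the package's implementation: the data frame `codes` and the helpers `majority_label()`, `pairwise_agreement()`, and `kripp_alpha_nominal()` are hypothetical names, pairwise agreement is averaged per row (llm_agreement() may pool pairs differently), and alpha uses the standard nominal coincidence-matrix form in which units with fewer than two non-missing codings are skipped.

```r
# Hypothetical replicate codings: 4 units, 3 replicates (NA = failed call).
codes <- data.frame(
  rep1 = c("pos", "neg", "pos", "neg"),
  rep2 = c("pos", "neg", "neg", "pos"),
  rep3 = c("pos", "neg", NA,    "neg"),
  stringsAsFactors = FALSE
)
m <- as.matrix(codes)

# Modal label of one row; NA when the top count is shared (a tie).
majority_label <- function(row) {
  counts <- table(row[!is.na(row)])
  if (length(counts) == 0L) return(NA_character_)
  top <- names(counts)[counts == max(counts)]
  if (length(top) == 1L) top else NA_character_
}

# Share of agreeing pairs among the non-missing codings of one row.
pairwise_agreement <- function(row) {
  vals <- row[!is.na(row)]
  if (length(vals) < 2L) return(NA_real_)
  pairs <- utils::combn(vals, 2)
  mean(pairs[1, ] == pairs[2, ])
}

# Krippendorff's alpha for nominal data via the coincidence matrix.
kripp_alpha_nominal <- function(m) {
  labels <- sort(unique(m[!is.na(m)]))
  k <- length(labels)
  o <- matrix(0, k, k, dimnames = list(labels, labels))
  for (i in seq_len(nrow(m))) {
    vals <- m[i, !is.na(m[i, ])]
    mu <- length(vals)
    if (mu < 2L) next  # not pairable, contributes nothing
    cnt <- as.vector(table(factor(vals, levels = labels)))
    # Ordered value pairs within the unit, excluding a value paired with itself.
    pairs <- outer(cnt, cnt) - diag(cnt, nrow = k)
    o <- o + pairs / (mu - 1)
  }
  n_c <- rowSums(o)
  n <- sum(n_c)
  d_o <- (sum(o) - sum(diag(o))) / n           # observed disagreement
  d_e <- (n^2 - sum(n_c^2)) / (n * (n - 1))    # expected disagreement
  1 - d_o / d_e
}

majority <- unname(apply(m, 1, majority_label))
mean_pairwise <- mean(apply(m, 1, pairwise_agreement), na.rm = TRUE)
alpha <- kripp_alpha_nominal(m)
```

When every coding uses the same single label, expected disagreement is zero and this sketch returns NaN; a production implementation would need to handle that case explicitly.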
Reference an API key by the name of the environment variable that holds it,
so the secret never appears in your R code or saved objects. Store the key in
your shell profile or in ~/.Renviron (e.g. OPENAI_API_KEY=sk-...).
llm_api_key_env(var, required = TRUE, default = NULL)
var |
Name of the environment variable (e.g., "OPENAI_API_KEY"). A
character vector is also accepted; the variables are tried in order and the
first one that is set wins, which is convenient when a key may live under
more than one name (e.g., |
required |
If TRUE, a missing variable raises an authentication error at call time. If FALSE, a missing variable resolves to an empty key, which is appropriate for providers that do not require authentication (e.g., a local Ollama server). |
default |
Optional default used if the environment variable is not set. |
Best practice is to not pass a key explicitly at all: llm_config() already
looks up the standard variable for each provider (<PROVIDER>_API_KEY, then
<PROVIDER>_KEY). Use llm_api_key_env() only when your variable has a
non-standard name.
A secret handle to pass as api_key = llm_api_key_env("VARNAME") in
llm_config().
cfg <- llm_config(
  "openai", "gpt-4o-mini",
  api_key = llm_api_key_env("MY_OPENAI_KEY")
)
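The lookup order described for var, required, and default can be illustrated with a small base-R helper. This is a sketch of the resolution rule only: `resolve_key()` is a hypothetical name, and unlike the real handle (which defers the lookup to call time so the secret is never stored) it returns the key value directly.

```r
# Hypothetical helper: try each variable in order, first non-empty value wins.
resolve_key <- function(vars, required = TRUE, default = NULL) {
  for (v in vars) {
    val <- Sys.getenv(v, unset = "")
    if (nzchar(val)) return(val)
  }
  # Nothing set: fall back to the default, then to the 'required' rule.
  if (!is.null(default)) return(default)
  if (required) {
    stop("None of these environment variables is set: ",
         paste(vars, collapse = ", "))
  }
  ""  # empty key, e.g. for a local server without authentication
}

# Demo with throwaway variable names (assumptions, not standard names).
Sys.unsetenv("DEMO_PRIMARY_KEY")
Sys.setenv(DEMO_FALLBACK_KEY = "sk-demo")
key <- resolve_key(c("DEMO_PRIMARY_KEY", "DEMO_FALLBACK_KEY"))
```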
Cancel a batch job
llm_batch_cancel(job)
job |
An |
The provider's response, invisibly.
Retrieves a finished job and returns one row per submitted request, in
submission order, with the same diagnostic columns as call_llm_par()
(response text, success, finish reason, token counts including cached
tokens, response id, raw JSON). Rows whose requests failed carry the
provider's error message. Parse structured replies afterwards with
llm_parse_structured_col() or llm_parse_tags_col(), exactly as for
live results.
llm_batch_fetch(job)
job |
An |
A tibble with custom_id plus the diagnostic columns described
above. If the job is not finished yet, an error is raised; check with
llm_batch_status() first.
Check the status of a batch job
llm_batch_status(job)
job |
An |
A one-row tibble: provider, batch_id, status (provider's
wording; "completed"/"ended"/"done" mean ready), n_total,
n_completed, n_failed (NA where a provider does not report counts).
llm_batch_submit(), llm_batch_fetch()
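Because llm_batch_fetch() raises an error on an unfinished job, a polling loop around the status check is a common pattern. The sketch below is an assumption-laden illustration: `wait_for_batch()` is a hypothetical helper, the status function is passed in as an argument so the loop can be tried without network access, and the ready values are the three listed above.

```r
# Hypothetical helper: poll until the provider reports a ready status.
# status_fn is expected to behave like llm_batch_status(): it returns a
# one-row data frame with a 'status' column.
wait_for_batch <- function(job, status_fn, interval = 60, max_checks = 100) {
  ready <- c("completed", "ended", "done")
  for (i in seq_len(max_checks)) {
    st <- status_fn(job)
    if (tolower(st$status[1]) %in% ready) return(invisible(st))
    Sys.sleep(interval)
  }
  stop("Batch not ready after ", max_checks, " checks")
}

# Intended use with the package (not run here):
# st  <- wait_for_batch(job, llm_batch_status, interval = 120)
# res <- llm_batch_fetch(job)
```

This sketch does not distinguish failed or cancelled jobs from running ones; it simply stops after max_checks, so a real workflow would also inspect the provider's failure statuses.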
Provider batch APIs run large jobs asynchronously at a reduced price
(typically half) in exchange for delayed delivery (minutes up to 24 hours).
llm_batch_submit() packages one request per element of messages and
submits the job; llm_batch_status() polls it; llm_batch_fetch()
retrieves results as a tidy tibble aligned with the inputs.
llm_batch_submit(config, messages, state_path = NULL)
config |
An llm_config for a generative model. |
messages |
An unnamed character vector (each element becomes one
request's user message); a named character vector like
|
state_path |
Optional file path; when given, the job object is also saved there as RDS (and the path is remembered for convenience). |
Supported providers: "openai" and "groq" (Files + Batches protocol),
"anthropic" (Message Batches), and "gemini" (batchGenerateContent
with inline requests; developer API, not Vertex). All request-shaping
features of llm_config() apply: sampling parameters, structured output,
tools, and hooks shape each request exactly as a live call_llm() would.
The returned job object contains no secrets (the config stores an
environment-variable reference, not the key), so it can be saved to disk,
shared, and fetched later or from another machine with the same
environment variables set. Pass state_path to save it automatically.
An llmr_batch_job object.
llm_batch_status(), llm_batch_fetch(), llm_batch_cancel()
## Not run: 
cfg <- llm_config("groq", "openai/gpt-oss-20b", temperature = 0)
job <- llm_batch_submit(cfg, c("2+2?", "Capital of Chile?"),
                        state_path = "my_batch.rds")
llm_batch_status(job)

# ... later, possibly in a new session:
res <- llm_batch_fetch("my_batch.rds")
## End(Not run)
Create and interact with a stateful chat session object that retains
message history. This documentation page covers the constructor function
chat_session() as well as all S3 methods for the llm_chat_session class.
chat_session(config, system = NULL, quiet = FALSE, ...)

## S3 method for class 'llm_chat_session'
as.data.frame(x, ...)

## S3 method for class 'llm_chat_session'
summary(object, ...)

## S3 method for class 'llm_chat_session'
head(x, n = 6L, width = getOption("width") - 15, ...)

## S3 method for class 'llm_chat_session'
tail(x, n = 6L, width = getOption("width") - 15, ...)

## S3 method for class 'llm_chat_session'
print(x, width = getOption("width") - 15, ...)
config |
An llm_config for a generative model ( |
system |
Optional system prompt inserted once at the beginning. |
quiet |
Logical. If |
... |
Default arguments forwarded to every |
x, object
|
An |
n |
Number of turns to display. |
width |
Character width for truncating long messages. |
The chat_session object provides a simple way to hold a conversation with
a generative model. It wraps call_llm_robust() to benefit from retry logic,
caching, and error logging.
For chat_session(), an object of class llm_chat_session.
Other methods return what their titles state.
A private environment stores the running list of
list(role, content) messages.
At each $send() the history is sent in full to the model.
Provider-agnostic token counts are extracted from the JSON response.
$send(text, ..., role = "user"): Append a message (default role "user"), query the model,
print the assistant's reply (unless quiet = TRUE), and invisibly
return it. text may also be a named character vector using the same
multimodal shortcut as call_llm(), e.g.
chat$send(c(user = "Describe this image.", file = "plot.png")).
$send_structured(text, schema, ..., role = "user", .fields = NULL, .validate_local = TRUE): Send a message with structured output enabled using schema, append the assistant's reply,
parse the JSON (and validate it locally when .validate_local = TRUE),
and invisibly return the parsed result.
$send_tags(text, .tags, ..., role = "user", .fields = NULL): Send a message with XML-like tag instructions injected, append the assistant's reply, parse the requested tags, and invisibly return the parsed list.
$history(): Raw list of messages.
$history_df(): Two-column data frame (role, content).
$tokens_sent() / $tokens_received(): Running token totals.
$reset(): Clear history (retains the optional system message).
llm_config(), call_llm(), call_llm_robust(), llm_fn(), llm_mutate()
if (interactive()) {
  cfg  <- llm_config("openai", "gpt-5-nano")
  chat <- chat_session(cfg, system = "Be concise.")
  chat$send("Who invented the moon?")
  chat$send("Explain why in one short sentence.")
  chat           # print() shows a summary and first 10 turns
  summary(chat)  # stats
  tail(chat, 2)
  as.data.frame(chat)
}
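The history mechanism described above (a private environment holding a running list of list(role, content) messages, with a reset that keeps the system message) can be sketched without any API call. This is only an illustration of the pattern; `new_history()` and its methods are hypothetical and are not the package's internals.

```r
# Hypothetical sketch of a message store backed by a private environment.
new_history <- function(system = NULL) {
  env <- new.env(parent = emptyenv())
  initial <- if (is.null(system)) {
    list()
  } else {
    list(list(role = "system", content = system))
  }
  env$messages <- initial

  list(
    # Add one message to the end of the running list.
    append = function(role, content) {
      env$messages[[length(env$messages) + 1L]] <- list(role = role, content = content)
      invisible(NULL)
    },
    # Raw list of messages, as it would be sent in full on each turn.
    history = function() env$messages,
    # Two-column data frame view (role, content).
    history_df = function() {
      data.frame(
        role = vapply(env$messages, function(msg) msg$role, character(1)),
        content = vapply(env$messages, function(msg) msg$content, character(1)),
        stringsAsFactors = FALSE
      )
    },
    # Drop all turns but keep the optional system message.
    reset = function() {
      env$messages <- initial
      invisible(NULL)
    }
  )
}

h <- new_history(system = "Be concise.")
h$append("user", "Who invented the moon?")
h$append("assistant", "Nobody; it formed naturally.")
turns <- h$history_df()
```

Since the full history is resent at every turn, token usage grows with conversation length, which is why a reset that preserves only the system message is useful.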
llm_config() builds a provider-agnostic configuration object that
call_llm() (and friends) understand. You can pass provider-specific
parameters via ...; LLMR forwards them as-is, with a few safe conveniences.
llm_config(
  provider,
  model,
  api_key = NULL,
  troubleshooting = FALSE,
  base_url = NULL,
  embedding = NULL,
  no_change = FALSE,
  ...
)
provider |
Character scalar naming the backend. Known providers:
When |
model |
Character scalar. Model name understood by the chosen provider.
(e.g., |
api_key |
Provider API key. Preferred form is |
troubleshooting |
Logical. If |
base_url |
Optional character. Back-compat alias; if supplied it is
stored as |
embedding |
|
no_change |
Logical. If |
... |
Additional model parameters. LLMR understands a small canonical set spelled the OpenAI way and translates it per provider, so you can keep one vocabulary across backends:
|
An object of class c("llm_config", provider). Fields:
provider, model, api_key, troubleshooting, embedding,
no_change, and model_params (a named list of extras). print() masks
the API key.
Three optional functions in ... customize the HTTP exchange when a
provider needs something unusual (a gateway header, an exotic body field, a
nonstandard response envelope). All are applied on every request for every
provider:
request_modifier: function(body) -> body, edits the JSON body
before serialization (OpenAI-compatible chat paths).
req_builder: function(req) -> req, edits the httr2 request
(headers, URL, auth) just before it is performed.
response_modifier: function(content) -> content, edits the
parsed JSON before LLMR interprets it.
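The three hooks are ordinary R functions with the signatures listed above. The sketch below shows one possible shape for each; the function names, the `metadata` body field, the `X-Gateway-Team` header, and the `data` envelope are all hypothetical examples, and the llm_config() call is left commented out so the block runs without the package or a network connection.

```r
# request_modifier: function(body) -> body. Adds a (hypothetical) body field.
add_metadata <- function(body) {
  body$metadata <- list(project = "demo")
  body
}

# req_builder: function(req) -> req. Adds a (hypothetical) gateway header
# to the httr2 request; requires the httr2 package when actually called.
add_gateway_header <- function(req) {
  httr2::req_headers(req, `X-Gateway-Team` = "research")
}

# response_modifier: function(content) -> content. Unwraps a (hypothetical)
# envelope of the form list(data = <provider response>).
unwrap_envelope <- function(content) {
  if (!is.null(content$data)) content$data else content
}

# Intended use (not run here):
# cfg <- llm_config(
#   "openai", "gpt-4.1-nano",
#   request_modifier  = add_metadata,
#   req_builder       = add_gateway_header,
#   response_modifier = unwrap_envelope
# )

modified <- add_metadata(list(model = "gpt-4.1-nano"))
```

Keeping each hook a pure function of its single argument makes it easy to check in isolation before attaching it to a config.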
Anthropic temperatures must be in [0, 1]; others in [0, 2]. Out-of-range
values are clamped with a warning. Reasoning or thinking-oriented models may
reject custom temperature values; omit temperature unless the selected
model accepts it.
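The clamping rule stated above amounts to the following arithmetic. This is a minimal sketch of the rule as described, with a hypothetical helper name; the package's own check and warning text may differ.

```r
# Hypothetical helper: clamp temperature to the range described above,
# [0, 1] for Anthropic and [0, 2] for other providers, warning on change.
clamp_temperature <- function(temperature, provider) {
  upper <- if (identical(provider, "anthropic")) 1 else 2
  clamped <- min(max(temperature, 0), upper)
  if (clamped != temperature) {
    warning("temperature ", temperature, " clamped to ", clamped,
            " for provider '", provider, "'")
  }
  clamped
}

clamp_temperature(0.7, "openai")  # within range, returned unchanged
```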
You can pass api_url (or base_url= alias) in ... to point to gateways
or compatible proxies.
Use provider = "gemini", vertex = TRUE for Gemini on Vertex AI. Supply
project and optionally location; when api_key is omitted, LLMR looks for
VERTEX_ACCESS_TOKEN and sends it as a Bearer token.
call_llm,
call_llm_robust,
llm_chat_session,
call_llm_par,
get_batched_embeddings
## Not run: 
# Basic OpenAI config
cfg <- llm_config("openai", "gpt-4.1-nano",
                  temperature = 0.7, max_tokens = 300)

# Generative call returns an llmr_response object
r <- call_llm(cfg, "Say hello in Greek.")
print(r)
as.character(r)

# Embeddings (inferred from the model name)
e_cfg <- llm_config("gemini", "gemini-embedding-001")

# Force embeddings even if model name does not contain "embedding"
e_cfg2 <- llm_config("voyage", "voyage-3.5-lite", embedding = TRUE)

# Gemini through Vertex AI. VERTEX_ACCESS_TOKEN should contain a Bearer token.
v_cfg <- llm_config(
  "gemini", "gemini-2.5-flash-lite",
  vertex = TRUE,
  project = "my-gcp-project",
  location = "us-central1",
  api_key = "VERTEX_ACCESS_TOKEN"
)
## End(Not run)
Creates an experimental design tibble that crosses every row in .data with
every config in configs, evaluating glue prompt templates row-by-row.
The result has config and messages list-columns ready for call_llm_par().
llm_cross_design(
  .data,
  configs,
  prompt = NULL,
  .messages = NULL,
  .system_prompt = NULL
)
.data |
A data frame containing variables for the glue prompt. |
configs |
A list of llm_config objects (or a single llm_config). |
prompt |
A glue string for a single user turn. |
.messages |
Optional named character vector of glue templates (roles as names). |
.system_prompt |
Optional system prompt template (glue string). |
A tibble with all original data columns plus config and messages
list-columns.
expand_llm_config(), call_llm_par(), build_factorial_experiments()
## Not run: 
cities <- data.frame(city = c("Cairo", "Lima"))
cfgs <- list(llm_config("openai", "gpt-4.1-nano"),
             llm_config("openai", "gpt-4.1-mini"))
design <- llm_cross_design(cities, cfgs, prompt = "What country is {city} in?")
results <- call_llm_par(design)
## End(Not run)
Returns one row per problem: a hard failure (success not TRUE) or a
truncated / content-filtered completion (finish_reason "length" or
"filter"), with the diagnostic detail needed to act. Works on both
call_llm_par() and llm_mutate() results. For a call_llm_par() result
(which still carries config and messages), pass the original frame to
llm_par_resume() to re-run only these rows.
llm_failures(x, prefix = NULL, include = c("all", "failed", "truncated"))
x |
A data frame from |
prefix |
For an |
include |
One of |
A tibble (zero rows if nothing matched) with: row (index into x),
success, finish_reason, status_code, error_code, bad_param,
error_message, response_id. Columns absent from x are filled with
NA.
llm_par_resume() to re-run failed rows, llm_usage().
res <- tibble::tibble(
  success = c(TRUE, FALSE, TRUE),
  finish_reason = c("stop", "error:server", "length"),
  error_message = c(NA, "HTTP 503", NA)
)
llm_failures(res)
Apply an LLM prompt over vectors/data frames
llm_fn(
  x,
  prompt,
  .config,
  .system_prompt = NULL,
  ...,
  .tags = NULL,
  .fields = NULL,
  .return = c("text", "columns", "object"),
  .na_action = c("send", "skip", "error"),
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none")
)
x |
A character vector or a data.frame/tibble. |
prompt |
A glue template string. With a data-frame you may reference
columns ( |
.config |
An llm_config object. |
.system_prompt |
Optional system message (character scalar). |
... |
Passed unchanged to |
.tags |
Optional character vector of XML-like tag names to request and parse.
When supplied, delegates to |
.fields |
Optional field selector for tag extraction (see |
.return |
One of |
.na_action |
What to do with elements whose template references an |
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows that a batched call leaves unresolved (dropped, malformed, or truncated). One of:
Recovery is bounded by an internal call budget so it always terminates. |
For generative mode:
.return = "text": character vector
.return = "columns": tibble with diagnostics
.return = "object": list of llmr_response (or NA on failure;
unavailable when .batch_size > 1)
For embedding mode, always a numeric matrix.
With .batch_size > 1, several input elements travel in one generative
request: LLMR wraps each element's prompt in a numbered tag,
<row_1>...</row_1>, <row_2>...</row_2>, and so on, appends that block to
the message (see .batch_payload), and instructs the model to answer each
item inside a matching numbered tag. The reply is split back into the
original elements by those numbers. Batching trades a smaller number of
(larger) requests for some dependence on the model following the protocol; it
is most useful with capable models at temperature = 0, and it is a net loss
when the model ignores the wrapping. Results are deterministic given the
model's outputs: partitioning and parsing add no randomness. Rows the model
drops, reorders, duplicates, or truncates are detected and re-issued
according to .batch_recovery. Because a batch shares one underlying call,
token counts are reported once per batch (on its first resolved row, NA
elsewhere), as is the wall-clock duration, so that summing those columns is
correct. When a batch reply is entirely unusable and its rows succeed only
through recovery calls, the failed call's spend has no successful row to
land on, so sums can slightly undercount in heavy-recovery runs.
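The numbered-tag protocol described above can be illustrated in base R. This is a simplified sketch, not LLMR's implementation: `wrap_rows()` and `split_rows()` are hypothetical helpers, the instruction text sent to the model is omitted, and only missing rows are detected here (marked NA), not duplicates or truncation.

```r
# Wrap each element's prompt in a numbered tag and join into one block.
wrap_rows <- function(prompts) {
  idx <- seq_along(prompts)
  paste0("<row_", idx, ">", prompts, "</row_", idx, ">", collapse = "\n")
}

# Split a reply back into n elements by tag number. Order in the reply
# does not matter; a row the model dropped comes back as NA.
split_rows <- function(reply, n) {
  out <- rep(NA_character_, n)
  for (i in seq_len(n)) {
    pattern <- sprintf("(?s)<row_%d>(.*?)</row_%d>", i, i)
    hit <- regmatches(reply, regexec(pattern, reply, perl = TRUE))[[1]]
    if (length(hit) == 2L) out[i] <- trimws(hit[2])
  }
  out
}

block <- wrap_rows(c("Classify 'excellent'.", "Classify 'awful'.", "Classify 'fine'."))

# A made-up reply in which the model reordered rows and dropped row 3.
reply <- "<row_2>Negative</row_2>\n<row_1>Positive</row_1>"
answers <- split_rows(reply, n = 3)
unresolved <- which(is.na(answers))  # candidates for re-issuing
```

In this sketch the unresolved indices are what a recovery strategy would act on, for example by re-sending those elements in smaller groups.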
llm_mutate(), llm_fn_structured(), llm_fn_tags(),
llm_parse_batch_tags(), setup_llm_parallel(), call_llm_broadcast(),
get_batched_embeddings()
## Not run: 
words <- c("excellent", "awful")
cfg <- llm_config("openai", "gpt-4.1-nano", temperature = 0)
llm_fn(words, "Classify '{x}' as Positive/Negative.", cfg, .return = "text")

df <- tibble::tibble(text = words, source = c("review", "review"))
llm_fn(df, "Classify '{text}' from {source}.", cfg, .return = "columns")
## End(Not run)
Schema-first variant of llm_fn(). It enables structured output on the config,
calls the model via call_llm_broadcast(), parses JSON, and optionally validates.
llm_fn_structured(
  x,
  prompt,
  .config,
  .system_prompt = NULL,
  ...,
  .schema = NULL,
  .fields = NULL,
  .local_only = FALSE,
  .validate_local = TRUE,
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none")
)
x |
A character vector or a data.frame/tibble. |
prompt |
A glue template string. With a data-frame you may reference
columns ( |
.config |
An llm_config object. |
.system_prompt |
Optional system message (character scalar). |
... |
Passed unchanged to |
.schema |
Optional JSON Schema list; if |
.fields |
Optional fields to hoist from parsed JSON (supports nested paths). |
.local_only |
If TRUE, do not send schema to the provider (parse/validate locally). |
.validate_local |
If TRUE and |
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows that a batched call leaves unresolved (dropped, malformed, or truncated). One of:
Recovery is bounded by an internal call budget so it always terminates. |
llm_fn(), llm_mutate_structured(), enable_structured_output(),
llm_parse_structured_col()
Tags-first variant of llm_fn(). Injects tag instructions, calls the model
via call_llm_broadcast(), then parses XML-like tags from each response.
llm_fn_tags(
  x,
  prompt,
  .config,
  .system_prompt = NULL,
  ...,
  .tags,
  .fields = NULL,
  .return = c("columns", "text", "object"),
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none")
)
x |
A character vector or a data.frame/tibble. |
prompt |
A glue template string. With a data-frame you may reference
columns ( |
.config |
An llm_config object. |
.system_prompt |
Optional system message (character scalar). |
... |
Passed unchanged to |
.tags |
Character vector of tag names to request and parse. |
.fields |
|
.return |
One of |
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows that a batched call leaves unresolved (dropped, malformed, or truncated). One of:
Recovery is bounded by an internal call budget so it always terminates. |
llm_fn(), llm_mutate_tags(), llm_parse_tags_col(),
call_llm_par_tags()
Evaluates outputs in a target column against a custom prompt using
llm_mutate_tags() for clean tag-based extraction. The target column value
is available in the prompt template as {.target}.
llm_judge(
  .data,
  .target,
  .config,
  prompt,
  .tags = c("reasoning", "score"),
  .output = "judge_res",
  ...
)
.data |
Data frame of experiment results. |
.target |
Bare column name containing the output to evaluate. |
.config |
The judge llm_config. |
prompt |
Evaluation prompt template. Use |
.tags |
Tags to extract from the judge response. Defaults to
|
.output |
Name of the column that receives the judge's raw response.
Default |
... |
Passed to |
.data with judge output columns appended.
llm_mutate_tags(), llm_parse_tags()
## Not run: 
results |>
  llm_judge(
    .target = response_text,
    .config = judge_cfg,
    prompt = "Rate this answer on a 1-5 scale:\n{.target}",
    .tags = c("reasoning", "score")
  )
## End(Not run)
llm_log_enable() turns on a session-wide audit log: each API call made
through LLMR (including those issued by llm_fn(), llm_mutate(),
call_llm_par(), and chat_session()) appends one JSON object to path.
llm_log_disable() turns logging off. llm_log_status() reports the
current destination, if any.
llm_log_enable(path = "llmr_log.jsonl", include_messages = TRUE)

llm_log_disable()

llm_log_status()
path |
File path for the log. Created on first write; appended to if it exists, so one file can accumulate a whole project's calls. |
include_messages |
Logical. If |
Methodological guidance for LLM-assisted research asks authors to retain, for every call: the model and provider, the full prompt, the inference settings, the output, and identifiers that allow an exact lookup later. The audit log records precisely that:
ts: ISO-8601 timestamp with timezone.
provider, model: as configured; model_version: the identifier the
server reports having served (when echoed), which catches silent model
updates.
request: the JSON body sent to the provider, including all sampling
parameters and the rendered messages. Inline file data (base64) is
replaced by a short placeholder so logs stay small.
text, finish_reason, usage: the reply, why it stopped, and token
counts (including cached tokens when reported).
response_id, status, duration_s: provider request id, HTTP status,
and wall-clock seconds.
Failed calls are logged too (kind = "error"), with the provider's error
message.
Records are appended line by line; under parallel execution all workers append to the same file. Each line is one complete record, so interleaving across workers is harmless. The log contains your prompts and the model's replies in clear text. It never contains API keys.
Set include_messages = FALSE to omit request bodies and reply text
(keeping only metadata, parameters, usage, and identifiers), e.g. when
prompts contain confidential data.
llm_log_enable() and llm_log_disable() return the previous log
path invisibly. llm_log_status() returns the active path or NULL,
invisibly, after printing a one-line status.
llm_usage() for token summaries, llm_methods_text() for a
draft methods paragraph.
## Not run: 
llm_log_enable("annotation_run.jsonl")
cfg <- llm_config("groq", "openai/gpt-oss-20b")
call_llm(cfg, "One word: capital of France?")
llm_log_disable()

# Read the log back as a data frame
log_df <- jsonlite::stream_in(file("annotation_run.jsonl"), verbose = FALSE)
## End(Not run)
Token-level log-probabilities turn a classification into a measurement: the
probability the model assigned to its own answer is a confidence score you
can calibrate, threshold, or carry into downstream models as a soft label.
Request them at config time (llm_config(..., logprobs = TRUE, top_logprobs = 5)); this helper then returns them tidily.
llm_logprobs(x)
x |
An llmr_response object (from |
For a single response: a tibble with one row per generated token:
token (character), logprob (double), and top_logprobs (a list-column
of data frames with the k most likely alternatives at that position,
when requested). Returns a zero-row tibble when the response carries no
logprobs. For a result frame: a list of such tibbles, one per row.
## Not run: 
# Provider support varies; deepseek-chat and OpenAI expose logprobs,
# Anthropic does not, and several hosts reject the flag model by model.
cfg <- llm_config("deepseek", "deepseek-chat",
                  logprobs = TRUE, top_logprobs = 3, temperature = 0)
r <- call_llm(cfg, "Answer with one word: is water wet?")
llm_logprobs(r)

# Confidence of the first answer token:
exp(llm_logprobs(r)$logprob[1])
## End(Not run)
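Turning log-probabilities into confidence scores is plain arithmetic, sketched here on made-up numbers. The data frames `lp` and `top` are hypothetical stand-ins shaped like the token and top_logprobs output described above; the values are invented for illustration and come from no model.

```r
# Hypothetical per-token output: one row per generated token.
lp <- data.frame(
  token = c("Yes", "."),
  logprob = c(-0.105, -0.693),
  stringsAsFactors = FALSE
)

# Probability of the first token, and of the whole generated sequence
# (the product of token probabilities, i.e. exp of the summed logprobs).
first_token_prob <- exp(lp$logprob[1])
sequence_prob <- exp(sum(lp$logprob))

# Hypothetical alternatives at the first position (one top_logprobs entry).
top <- data.frame(
  token = c("Yes", "No", "Maybe"),
  logprob = c(-0.105, -2.5, -4.0),
  stringsAsFactors = FALSE
)

# Soft label over the labels of interest: renormalize their probabilities
# so they sum to 1, ignoring mass on other tokens.
labels <- c("Yes", "No")
keep <- top$token %in% labels
soft_label <- setNames(exp(top$logprob[keep]) / sum(exp(top$logprob[keep])),
                       top$token[keep])
```

Renormalizing over a fixed label set discards probability mass on other tokens, so the soft label is conditional on the model having answered with one of those labels.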
Turns the diagnostic columns of a finished run into a first draft of the transparency paragraph that journals and methodological guidelines now ask for: which model(s) and provider(s), how many calls, the inference settings that were recorded, token totals, and the failure/truncation counts. Edit the draft; it states only what the result frame actually contains and marks anything unknown as such.
llm_methods_text(x, prefix = NULL, task = NULL)
x |
A data frame from |
prefix |
For an |
task |
Optional one-clause description of what the model was asked to
do (e.g., |
A character scalar (one paragraph). Print it with cat().
llm_usage(), llm_log_enable() for the per-call audit trail.
res <- tibble::tibble(
  model = "openai/gpt-oss-20b",
  provider = "groq",
  success = c(TRUE, TRUE),
  finish_reason = c("stop", "stop"),
  sent_tokens = c(10L, 12L),
  rec_tokens = c(5L, 7L),
  total_tokens = c(15L, 19L),
  reasoning_tokens = NA_integer_,
  duration = c(0.4, 0.5)
)
cat(llm_methods_text(res, task = "to classify sample sentences"))
Adds one or more columns to .data that are produced by a large language model.
llm_mutate(
  .data,
  output,
  prompt = NULL,
  .messages = NULL,
  .config,
  .system_prompt = NULL,
  .before = NULL,
  .after = NULL,
  .return = c("columns", "text", "object"),
  .na_action = c("send", "skip", "error"),
  .structured = FALSE,
  .schema = NULL,
  .fields = NULL,
  .tags = NULL,
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none"),
  ...
)
.data |
A data.frame / tibble. |
output |
Unquoted name that becomes the new column (generative) or
the prefix for embedding columns. In shorthand form, omit this argument
and pass |
prompt |
Optional glue template string for a single user turn; reference
any columns in |
.messages |
Optional named character vector of glue templates to build
a multi-turn message, using roles in |
.config |
An llm_config object (generative or embedding). |
.system_prompt |
Optional system message sent with every request when
|
.before, .after
|
Standard dplyr::relocate helpers controlling where the generated column(s) are placed. |
.return |
One of |
.na_action |
What to do with rows whose template references an |
.structured |
Logical. If |
.schema |
Optional JSON Schema (R list). When |
.fields |
Optional character vector of fields to extract from parsed JSON
or tag output. In JSON mode, supports nested paths (e.g., |
.tags |
Optional character vector of XML-like tag names to request and parse,
such as |
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows a batched call leaves unresolved.
One of |
... |
Passed to the underlying calls: |
Multi-column injection: templating is NA-safe (NA -> empty string).
Multi-turn templating: supply .messages = c(system=..., user=..., file=...).
Duplicate role names are allowed (e.g., two user turns).
Generative mode: one request per row via call_llm_broadcast().
Parallelism: calls call_llm_broadcast(), which uses
call_llm_robust() under the hood. If no future plan is active,
workers are auto-configured; call setup_llm_parallel() to set worker
count explicitly.
Embedding mode: the per-row text is embedded via get_batched_embeddings().
Result expands to numeric columns named paste0(<output>, 1:N). If all rows
fail to embed, a single <output>1 column of NA is returned.
Diagnostic columns use suffixes: _finish, _sent, _rec, _tot, _reason, _ok, _err, _id, _status, _ecode, _param, _t.
Row batching: with .batch_size > 1, three further columns are added
(_batch, _bn, _bi: the batch identifier, the size of the resolving
call, and the within-call position). They appear only when batching
actually groups rows, so the default schema is unchanged at .batch_size = 1.
.data with the new column(s) appended.
With .batch_size > 1, several rows travel in one generative request. LLMR
wraps each row's prompt in a numbered tag, <row_1>...</row_1>,
<row_2>...</row_2>, and so on, appends that block to the message (see
.batch_payload), and instructs the model to answer each item inside a
matching numbered tag; the reply is split back into rows by those numbers.
This also composes with .tags (each <row_i> then wraps the requested field
tags) and with .structured = TRUE (rows are returned as one JSON object
{"results":[{"row":i, ...}]}, de-multiplexed by the integer row field; a
one-time warning notes that this relies on the model honouring the protocol
and that strict provider-side schema validation is replaced by local parsing).
Batching is most useful with capable models at temperature = 0 and is a net
loss when the model ignores the wrapping. Dropped, reordered, duplicated, or
truncated rows are detected and re-issued per .batch_recovery; token counts
are reported once per batch so that summing token columns stays correct.
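The numbered-tag protocol described above can be sketched in base R. This is an illustration only, not LLMR's internal code: `wrap_rows()` and `split_rows()` are hypothetical helper names, and the sketch leaves out the instruction text, the `.batch_payload` placement, and the `.batch_recovery` logic.

```r
# Hypothetical helper (not part of LLMR): wrap each prompt in a numbered tag
wrap_rows <- function(prompts) {
  i <- seq_along(prompts)
  paste0("<row_", i, ">", prompts, "</row_", i, ">", collapse = "\n")
}

# Hypothetical helper (not part of LLMR): split a reply back into rows by
# number. Rows without a fully closed block stay NA so they can be re-issued.
split_rows <- function(reply, m) {
  out <- rep(NA_character_, m)
  for (i in seq_len(m)) {
    pattern <- sprintf("(?s)<row_%d>(.*?)</row_%d>", i, i)
    hit <- regmatches(reply, regexec(pattern, reply, perl = TRUE))[[1]]
    if (length(hit) == 2L) out[i] <- trimws(hit[2])
  }
  out
}

block <- wrap_rows(c("Capital of France?", "Author of 1984?", "Largest ocean?"))

# A fabricated reply: rows arrive out of order and row 3 is missing
reply <- "<row_2>George Orwell</row_2>\n<row_1>Paris</row_1>"
answers <- split_rows(reply, m = 3)
unresolved <- which(is.na(answers))
```

Here `answers` recovers rows 1 and 2 despite the reordering, and `unresolved` identifies row 3 as the one a recovery step would need to send again.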
You can supply the output column and prompt in one argument:
df |> llm_mutate(answer = "{question} (hint: {hint})", .config = cfg)
df |> llm_mutate(answer = c(system = "One word.", user = "{question}"), .config = cfg)
df |> llm_mutate(country = "Where is {city}? Answer with only the country.", .config = cfg)
This is equivalent to:
df |> llm_mutate(answer, prompt = "{question} (hint: {hint})", .config = cfg)
df |> llm_mutate(answer, .messages = c(system = "One word.", user = "{question}"), .config = cfg)
.structured = TRUE delegates to llm_mutate_structured() for JSON.
.tags delegates to llm_mutate_tags() for XML-like tags.
If both are supplied, .structured takes precedence.
llm_fn(), llm_mutate_structured(), llm_mutate_tags(),
llm_parse_structured_col(), llm_parse_tags_col(),
llm_parse_batch_tags(), call_llm_broadcast(), setup_llm_parallel()
## Not run:
library(dplyr)

df <- tibble::tibble(
  id = 1:2,
  question = c("Capital of France?", "Author of 1984?"),
  hint = c("European city", "English novelist")
)
cfg <- llm_config("openai", "gpt-4.1-nano", temperature = 0)

# Generative: single-turn with multi-column injection
df |>
  llm_mutate(
    answer,
    prompt = "{question} (hint: {hint})",
    .config = cfg,
    .system_prompt = "Respond in one word."
  )

# Generative: multi-turn via .messages (system + user)
df |>
  llm_mutate(
    advice,
    .messages = c(
      system = "You are a helpful zoologist. Keep answers short.",
      user = "What is a key fact about this? {question} (hint: {hint})"
    ),
    .config = cfg
  )

# Multimodal: include an image path with role 'file'
pics <- tibble::tibble(
  img = c("inst/extdata/cat.png", "inst/extdata/dog.jpg"),
  prompt = c("Describe the image.", "Describe the image.")
)
pics |>
  llm_mutate(
    vision_desc,
    .messages = c(user = "{prompt}", file = "{img}"),
    .config = llm_config("openai", "gpt-4.1-mini")
  )

# Embeddings: output name becomes the prefix of embedding columns
emb_cfg <- llm_config("voyage", "voyage-3.5-lite", embedding = TRUE)
df |>
  llm_mutate(
    vec,
    prompt = "{question}",
    .config = emb_cfg,
    .after = id
  )

# Structured output: using .structured = TRUE (equivalent to llm_mutate_structured)
schema <- list(
  type = "object",
  properties = list(
    answer = list(type = "string"),
    confidence = list(type = "number")
  ),
  required = list("answer", "confidence")
)
df |>
  llm_mutate(
    result,
    prompt = "{question}",
    .config = cfg,
    .structured = TRUE,
    .schema = schema
  )

# Structured with shorthand
df |>
  llm_mutate(
    result = "{question}",
    .config = cfg,
    .structured = TRUE,
    .schema = schema
  )

# Soft structured output with XML-like tags
df |>
  llm_mutate(
    result = "Extract the person's age and job from: {question}",
    .config = cfg,
    .tags = c("age", "job")
  )

cities <- tibble::tibble(city = c("Cairo", "Lima"))
cities |>
  llm_mutate(
    geo = "Where is {city}? Give country and continent in their own tags.",
    .config = cfg,
    .system_prompt = paste(
      "Use XML tags for different parts of the answer, but do not nest tags.",
      "Return <country>...</country> and <continent>...</continent>."
    ),
    .tags = c("country", "continent")
  )
## End(Not run)
Drop-in schema-first variant of llm_mutate(). Produces parsed columns.
llm_mutate_structured(
  .data,
  output,
  prompt = NULL,
  .messages = NULL,
  .config,
  .system_prompt = NULL,
  .before = NULL,
  .after = NULL,
  .schema = NULL,
  .fields = NULL,
  .validate_local = TRUE,
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none"),
  ...
)
.data |
A data.frame / tibble. |
output |
Unquoted name that becomes the new column (generative) or
the prefix for embedding columns. In shorthand form, omit this argument
and pass |
prompt |
Optional glue template string for a single user turn; reference
any columns in |
.messages |
Optional named character vector of glue templates to build
a multi-turn message, using roles in |
.config |
An llm_config object (generative or embedding). |
.system_prompt |
Optional system message sent with every request when
|
.before, .after
|
Standard dplyr::relocate helpers controlling where the generated column(s) are placed. |
.schema |
Optional JSON Schema (R list). When provided, this schema is sent to
the provider for strict validation and used for local parsing. When |
.fields |
Optional character vector of fields to extract from parsed JSON. Supports:
|
.validate_local |
If TRUE (default) and |
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows a batched call leaves unresolved.
One of |
... |
Passed to the underlying calls: |
Like llm_mutate(), this function supports shorthand syntax:
df |> llm_mutate_structured(result = "{text}", .schema = schema)
df |> llm_mutate_structured(result = c(system = "Be brief.", user = "{text}"), .schema = schema)
llm_mutate(), llm_fn_structured(), enable_structured_output(),
llm_parse_structured_col(), llm_mutate_tags()
Soft structured variant of llm_mutate(). It asks the model to return simple
XML-like tags, then parses those tags into columns.
llm_mutate_tags(
  .data,
  output,
  prompt = NULL,
  .messages = NULL,
  .config,
  .system_prompt = NULL,
  .before = NULL,
  .after = NULL,
  .tags,
  .fields = NULL,
  .batch_size = 1L,
  .batch_payload = c("user", "system"),
  .batch_recovery = c("halve_recursive", "halve_once", "singletons", "retry_same", "none"),
  ...
)
.data |
A data.frame / tibble. |
output |
Unquoted name that becomes the new column (generative) or
the prefix for embedding columns. In shorthand form, omit this argument
and pass |
prompt |
Optional glue template string for a single user turn; reference
any columns in |
.messages |
Optional named character vector of glue templates to build
a multi-turn message, using roles in |
.config |
An llm_config object (generative or embedding). |
.system_prompt |
Optional system message sent with every request when
|
.before, .after
|
Standard dplyr::relocate helpers controlling where the generated column(s) are placed. |
.tags |
Character vector of tag names to request and parse. |
.fields |
|
.batch_size |
Integer scalar, or |
.batch_payload |
One of |
.batch_recovery |
How to handle rows a batched call leaves unresolved.
One of |
... |
Passed to the underlying calls: |
Returns the mutated data frame plus:
tags_ok: TRUE when all requested tags were found.
tags_data: A list-column of parsed tag lists.
One column per requested tag or field. Scalar columns are coerced to numeric or logical when all non-missing values allow it.
df |> llm_mutate_tags(result = "{text}", .tags = c("age", "job"), .config = cfg)
llm_mutate(), llm_parse_tags(), llm_parse_tags_col(),
llm_mutate_structured(), llm_parse_structured_col()
## Not run:
df <- tibble::tibble(city = c("Cairo", "Lima"))
cfg <- llm_config("openai", "gpt-4.1-nano", temperature = 0)

df |>
  llm_mutate_tags(
    geo = "Where is {city}? Give country and continent in their own tags.",
    .config = cfg,
    .system_prompt = paste(
      "Use XML tags for different parts of the answer, but do not nest tags.",
      "Return <country>...</country> and <continent>...</continent>."
    ),
    .tags = c("country", "continent")
  )
## End(Not run)
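The coercion rule stated above (scalar columns become numeric or logical when all non-missing values allow it) could look roughly like the following. This is a base-R sketch under stated assumptions: `coerce_column()` is a hypothetical name, and which strings LLMR actually accepts as logical is not documented here, so the sketch assumes only case-insensitive "true" and "false".

```r
# Hypothetical helper (not part of LLMR): coerce a character column only
# when every non-missing value supports the target type
coerce_column <- function(x) {
  present <- !is.na(x)

  # Try numeric first; as.numeric() yields NA for anything non-numeric
  num <- suppressWarnings(as.numeric(x))
  if (all(!is.na(num[present]))) return(num)

  # Assumed logical vocabulary: "true" / "false", case-insensitive
  low <- tolower(x)
  if (all(low[present] %in% c("true", "false"))) return(low == "true")

  # Otherwise leave the column as character
  x
}

ages <- coerce_column(c("21", "34", NA))
flags <- coerce_column(c("TRUE", "false"))
jobs <- coerce_column(c("barista", "42"))
```

A single non-conforming value, as in `jobs`, keeps the whole column as character, which avoids silently turning text into NA.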
Finds rows where success is FALSE or NA in the output of call_llm_par(),
re-runs them, and patches the results back into the original data frame.
llm_par_resume(results, tries = 3, ...)
results |
Output from |
tries |
Number of retries per call. Default 3. |
... |
Passed to |
The patched data frame with re-run results filled in.
## Not run:
results <- call_llm_par(experiments)
results <- llm_par_resume(results, tries = 3)
## End(Not run)
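The find, re-run, and patch pattern behind this function can be illustrated without any API call. The sketch below is an assumption-laden stand-in, not the package's implementation: `rerun_rows()` is a hypothetical function that fakes the re-run, and the `id` and `response_text` columns are illustrative; only `success` is named in the description above.

```r
# Fabricated results table with two unresolved rows (FALSE and NA)
results <- data.frame(
  id = 1:4,
  success = c(TRUE, FALSE, NA, TRUE),
  response_text = c("a", NA, NA, "d"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the real re-run; it makes no network request
rerun_rows <- function(rows) {
  data.frame(
    success = TRUE,
    response_text = paste("retry", rows$id),
    stringsAsFactors = FALSE
  )
}

# 1. Find rows where success is FALSE or NA
todo <- which(is.na(results$success) | !results$success)

# 2. Re-run only those rows
patch <- rerun_rows(results[todo, ])

# 3. Patch the new values back into their original positions
results[todo, names(patch)] <- patch
```

Rows that already succeeded are never touched, so row order and earlier outputs are preserved.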
Splits a single batched reply into its numbered <row_i> blocks and then
applies the standard flat tag parser (llm_parse_tags()) inside each block.
This is the parsing counterpart to the <row_i> protocol that LLMR uses when
.batch_size > 1 together with .tags; it is exported so the protocol is
inspectable and testable on its own.
llm_parse_batch_tags(text, tags, m)
text |
Character scalar: one batched model response containing
|
tags |
Character vector of field tag names to extract within each block. |
m |
Integer: the number of items expected in the batch (local ids
|
Robustness mirrors the internal scanner: reordered, duplicated, hallucinated,
truncated, or accidentally nested <row_i> blocks are handled; only fully
closed blocks contribute. Inner field tags are extracted by the same parser
used in non-batched tag mode, so values coerce and decode identically.
A list of length m. Element i is the named list returned by
llm_parse_tags() for <row_i>, or NULL when that block is absent,
truncated, or otherwise unrecoverable.
llm_parse_tags(), llm_parse_tags_col(), llm_mutate(),
llm_fn()
txt <- paste(
  "<row_1><age>21</age><job>barista</job></row_1>",
  "<row_2><age>34</age><job>welder</job></row_2>",
  sep = "\n"
)
llm_parse_batch_tags(txt, tags = c("age", "job"), m = 2)
Robustly parses an LLM's structured output (JSON). Works on character scalars or an llmr_response. Strips code fences first, then tries strict parsing, then attempts to extract the largest balanced {...} or [...].
llm_parse_structured(x, strict_only = FALSE, simplify = FALSE)
x |
Character or llmr_response. |
strict_only |
If TRUE, do not attempt recovery via substring extraction. |
simplify |
Logical passed to jsonlite::fromJSON ( |
The return contract is list-or-NULL; scalar-only JSON is treated as failure.
Numerics are coerced to double for stability.
A parsed R object (list), or NULL on failure.
llm_parse_structured_col(), llm_fn_structured(),
llm_mutate_structured(), llm_parse_tags()
llm_parse_structured('{"score": 5, "label": "good"}')
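The first two recovery stages described above (strip code fences, then pull out a balanced block when strict parsing fails) can be sketched in base R. This is a simplified illustration, not the package's parser: `strip_fences()` and `first_balanced_object()` are hypothetical names, and the sketch returns the first balanced `{...}` rather than the largest, and ignores `[...]`.

```r
fence <- strrep("`", 3)  # three backticks, built at run time

# Hypothetical helper: drop a leading and a trailing code fence, if present
strip_fences <- function(x) {
  x <- sub(paste0("^\\s*", fence, "[A-Za-z]*\\s*"), "", x, perl = TRUE)
  sub(paste0("\\s*", fence, "\\s*$"), "", x, perl = TRUE)
}

# Hypothetical helper: return the first balanced {...} substring, skipping
# braces that appear inside double-quoted JSON strings
first_balanced_object <- function(x) {
  chars <- strsplit(x, "")[[1]]
  depth <- 0L
  start <- NA_integer_
  in_string <- FALSE
  escaped <- FALSE
  for (i in seq_along(chars)) {
    ch <- chars[i]
    if (in_string) {
      if (escaped) {
        escaped <- FALSE
      } else if (ch == "\\") {
        escaped <- TRUE
      } else if (ch == "\"") {
        in_string <- FALSE
      }
    } else if (ch == "\"") {
      in_string <- TRUE
    } else if (ch == "{") {
      if (depth == 0L) start <- i
      depth <- depth + 1L
    } else if (ch == "}" && depth > 0L) {
      depth <- depth - 1L
      if (depth == 0L) return(paste(chars[start:i], collapse = ""))
    }
  }
  NULL  # nothing balanced was found
}

fenced <- paste0(fence, "json\n{\"score\": 5, \"note\": \"a } brace\"}\n", fence)
from_fence <- first_balanced_object(strip_fences(fenced))
from_chatter <- first_balanced_object("Sure! {\"a\": {\"b\": 1}} Hope that helps.")
```

The extracted string would then go to a JSON parser; tracking string state matters because a brace inside a quoted value must not close the object early.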
Extracts fields from a column containing structured JSON (string or list) and
appends them as new columns. Adds structured_ok (logical) and structured_data (list).
llm_parse_structured_col(
  .data,
  fields,
  structured_col = "response_text",
  prefix = "",
  allow_list = TRUE
)
.data |
data.frame/tibble |
fields |
Character vector of fields or named vector (dest_name = path). |
structured_col |
Column name to parse from. Default "response_text". |
prefix |
Optional prefix for new columns. |
allow_list |
Logical. If TRUE (default), non-scalar values (arrays/objects) are hoisted as list-columns instead of being dropped. If FALSE, only scalar fields are hoisted and non-scalars become NA. |
Supports nested-path extraction via dot/bracket paths (e.g., a.b[0].c)
or JSON Pointer (/a/b/0/c).
When allow_list = TRUE, non-scalar values become list-columns; otherwise
they yield NA and only scalars are hoisted.
.data with diagnostics and one new column per requested field.
llm_parse_structured(), llm_mutate_structured(),
llm_parse_tags_col()
df <- data.frame(response_text = '{"score": 5, "label": "good"}')
llm_parse_structured_col(df, fields = c("score", "label"))
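To make the JSON Pointer form of nested paths concrete, here is a small base-R traversal over an already parsed list. It is a sketch under assumptions, not the package's resolver: `get_pointer()` is a hypothetical name, it covers only the `/a/b/0/c` style (not dot/bracket paths), and it skips the `~0`/`~1` escapes of the JSON Pointer standard.

```r
# Hypothetical helper (not part of LLMR): walk a parsed JSON list along a
# JSON Pointer. Array indices in the pointer are 0-based; R lists are 1-based.
get_pointer <- function(x, pointer) {
  keys <- strsplit(sub("^/", "", pointer), "/", fixed = TRUE)[[1]]
  for (key in keys) {
    if (!is.list(x)) return(NULL)          # path runs past a scalar or NULL
    if (grepl("^[0-9]+$", key) && is.null(names(x))) {
      idx <- as.integer(key) + 1L          # shift to R's 1-based indexing
      if (idx > length(x)) return(NULL)
      x <- x[[idx]]
    } else {
      x <- x[[key]]                        # NULL when the name is absent
    }
  }
  x
}

# Parsed form of {"a": {"b": [{"c": "first"}, {"c": "second"}]}}
parsed <- list(a = list(b = list(list(c = "first"), list(c = "second"))))

hit <- get_pointer(parsed, "/a/b/0/c")
second <- get_pointer(parsed, "/a/b/1/c")
missing_path <- get_pointer(parsed, "/a/b/5/c")
```

Returning NULL for an unresolved path lets a caller turn it into NA for that row instead of raising an error.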
Extracts simple XML-like tags from a character scalar or llmr_response, such
as <age>21</age> and <job>student</job>. This is intended for soft
structured output, not full XML validation.
llm_parse_tags(x, tags)
x |
Character scalar or llmr_response. |
tags |
Character vector of tag names to extract. |
A named list of extracted tag values, or NULL when no requested tag
is found.
llm_parse_tags_col(), llm_mutate_tags()
llm_parse_tags("<age>21</age><job>student</job>", tags = c("age", "job"))
Appends tags_ok, tags_data, and one column per requested tag or field.
llm_parse_tags_col(
  .data,
  tags,
  tags_col = "response_text",
  fields = NULL,
  prefix = ""
)
.data |
data.frame/tibble. |
tags |
Character vector of tag names to parse. |
tags_col |
Column name to parse from. Default |
fields |
|
prefix |
Optional prefix for extracted columns. |
.data with tag diagnostics and extracted columns.
llm_parse_tags(), llm_mutate_tags(), llm_parse_structured_col()
df <- data.frame(response_text = "<age>21</age><job>student</job>")
llm_parse_tags_col(df, tags = c("age", "job"))
llm_parse_tags_col(df, tags = c("age", "job"), fields = c(person_age = "age"))
Renders every row exactly as llm_fn() / llm_mutate() would (no API call,
no file I/O), then reports a tidy, row-level summary: the rendered text, the
roles, character counts, file presence and existence, the batch plan, and a
list-column of issues. Problems that would only surface mid-run (a missing
file, a "file" role combined with .batch_size > 1, an embedding config
with row batching, .return = "object" with batching, a schema supplied
without .structured, a template that references NA values or renders an
empty prompt, a file part with no accompanying user text, or a tag name that
collides with the batched <row_N> protocol) are collected per row so you
see all of them at once rather than hitting the first error.
llm_preview(
  .data,
  prompt = NULL,
  .messages = NULL,
  .system_prompt = NULL,
  .config = NULL,
  .structured = FALSE,
  .schema = NULL,
  .tags = NULL,
  .return = c("columns", "text", "object"),
  .batch_size = 1L,
  rows = NULL,
  max_chars = 500L
)
.data |
A data.frame/tibble whose columns feed the |
prompt |
A single |
.messages |
A character vector of |
.system_prompt |
Optional system string, prepended when a row has no
|
.config |
Optional |
.structured |
Logical; if |
.schema |
Optional JSON schema (list). Flagged if supplied without
|
.tags |
Optional character vector of tag names. Flagged if combined with
|
.return |
One of |
.batch_size |
Rows per call. |
rows |
Optional integer vector selecting which rows to render (default: all rows). |
max_chars |
Truncate each row's rendered preview to this many characters (default 500). Set higher to see full prompts. |
Batched data travels inside numbered <row_i>...</row_i> tags; the
batch_id / batch_size / batch_row columns show how rows would be
grouped into calls at the given .batch_size.
A tibble of class llmr_preview, one row per previewed input row,
with columns: row, ok (no issues), roles, rendered_preview,
chars, has_file, file_ok, batch_id, batch_size, batch_row, and
issues (a list-column of character vectors).
llm_render_messages(), llm_usage(), llm_failures().
df <- data.frame(text = c("a", "b", "c"), stringsAsFactors = FALSE)
llm_preview(df, prompt = "Classify: {text}", .batch_size = 2)
Returns the per-row message objects that llm_fn() / llm_mutate() would
build from prompt or .messages, using the same internal renderer they
use. No request is sent and no file is read or encoded; a "file" role
stays a (glued) path string. Use this to inspect templating, roles, and
multimodal wiring before spending anything.
llm_render_messages(
  .data,
  prompt = NULL,
  .messages = NULL,
  .system_prompt = NULL,
  rows = NULL
)
.data |
A data.frame/tibble whose columns feed the |
prompt |
A single |
.messages |
A character vector of |
.system_prompt |
Optional system string, prepended when a row has no
|
rows |
Optional integer vector selecting which rows to render (default: all rows). |
A list of length length(rows) (default nrow(.data)). Each element
is either a bare character scalar (prompt only, no system) or a role-named
character vector, identical to what the call path would dispatch.
llm_preview() for a row-level summary with issue flags and the
batch plan; llm_fn(), llm_mutate().
df <- data.frame(text = c("good", "bad"), stringsAsFactors = FALSE)
llm_render_messages(df, prompt = "Sentiment of: {text}")
llm_render_messages(
  df,
  .messages = c(system = "Be terse.", user = "Rate: {text}")
)
Calls the model .times times for every row of .data (all replicates run
through the parallel engine in one pass) and appends one column per
replicate: <output>_1, <output>_2, .... Feed the result to
llm_agreement() for per-row majority labels and overall reliability.
llm_replicate(
  .data,
  output,
  prompt,
  .config,
  .times = 3L,
  .system_prompt = NULL,
  ...
)
.data |
A data.frame / tibble. |
output |
Unquoted base name for the replicate columns. |
prompt |
A glue template string evaluated against the columns of
|
.config |
An llm_config object (generative). |
.times |
Number of replicates (default 3). |
.system_prompt |
Optional system message. |
... |
Passed to |
Replication measures sampling variability only if the model can vary: with
temperature = 0 (or a fixed seed) most providers return nearly identical
draws, which inflates agreement. A low-temperature run is still a useful
check for measurement purposes, because high disagreement at low temperature
signals a prompt the model finds genuinely ambiguous.
.data with .times new character columns,
<output>_1 ... <output>_<.times> (NA where a call failed).
llm_agreement(), llm_mutate(), call_llm_par()
## Not run:
cfg <- llm_config("groq", "openai/gpt-oss-20b", temperature = 1)
df <- tibble::tibble(text = c("I loved it", "Meh", "Terrible service"))
reps <- df |>
  llm_replicate(
    sentiment,
    prompt = "Sentiment of '{text}'. One word: positive, negative, or neutral.",
    .config = cfg,
    .times = 5
  )
llm_agreement(reps, prefix = "sentiment")
## End(Not run)
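For intuition about what a per-row majority label and an agreement share mean over the `<output>_1`, `<output>_2`, ... columns, the following base-R sketch computes both on fabricated replicate data. It does not reproduce `llm_agreement()`, whose exact statistics are not described here; the data values and the tie-breaking rule (first label in alphabetical order) are assumptions of this sketch.

```r
# Fabricated replicate columns in the <output>_<k> naming scheme
reps <- data.frame(
  sentiment_1 = c("positive", "neutral", "negative"),
  sentiment_2 = c("positive", "negative", "negative"),
  sentiment_3 = c("positive", "neutral", "negative"),
  stringsAsFactors = FALSE
)
cols <- grep("^sentiment_[0-9]+$", names(reps), value = TRUE)

# Majority label per row; ties resolve to the alphabetically first label
majority <- unname(apply(reps[cols], 1, function(x) {
  tab <- table(x)
  names(tab)[which.max(tab)]
}))

# Share of non-missing replicates that match the majority label
share <- unname(apply(reps[cols], 1, function(x) {
  max(table(x)) / sum(!is.na(x))
}))
```

A share well below 1 on a row, as for the second row here, flags an item whose label depends on the draw.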
Wraps an R function with the name, description, and JSON-Schema argument
specification that providers need for native tool calling. Pass a list of
tools to call_llm_tools() (or to a chat_session() via that function),
which executes the calls the model makes and returns the model's final
answer.
llm_tool(fn, name, description, parameters = NULL, required = NULL)
fn |
The R function to expose. It is called with the model's arguments
matched by name, so use the same parameter names as in |
name |
Tool name shown to the model (letters, digits, |
description |
One or two sentences telling the model what the tool does and when to use it. Write it for the model, not for a human reader; it is the only documentation the model sees. |
parameters |
Either a named list of JSON-Schema property definitions,
e.g. |
required |
Character vector of required argument names. Defaults to all
parameter names when |
An object of class llmr_tool.
call_llm_tools(), tool_calls()
weather <- llm_tool(
  fn = function(city) paste0("22C and clear in ", city),
  name = "get_weather",
  description = "Current weather for a city.",
  parameters = list(city = list(type = "string", description = "City name"))
)
Reads the diagnostic columns produced by call_llm_par() (and
call_llm_broadcast() / llm_fn() with .return = "columns") or by
llm_mutate(), and returns a one-row tibble of counts and token totals. It
reports tokens, not money: sent, received, total, and reasoning tokens
are summed with na.rm = TRUE (correct under row batching, which attributes a
batch's tokens to its first row and leaves the rest NA). To estimate cost,
multiply these by your provider's current per-token prices yourself.
llm_usage(x, prefix = NULL, price_table = NULL)
x |
A data frame from |
prefix |
For an |
price_table |
Optional data frame you supply with your provider's
current prices, holding columns |
A one-row tibble: n, n_ok, n_failed, ok_rate, n_truncated
(finish "length"), n_filtered (finish "filter"), sent_tokens,
rec_tokens, total_tokens, reasoning_tokens, cached_tokens
(prompt tokens served from the provider's cache, when reported),
n_unknown_tokens
(successful rows for which the provider reported no token usage, so the
token sums above understate the truth), duration_s, (when a batch id
column is present) batch_calls and rows_per_batch_call, and (when
price_table is supplied) cost_estimate in the table's currency.
llm_failures(), llm_preview(), llm_par_resume().
res <- tibble::tibble(
  success = c(TRUE, TRUE, FALSE),
  finish_reason = c("stop", "length", "error:rate_limit"),
  sent_tokens = c(10L, 12L, NA_integer_),
  rec_tokens = c(5L, 7L, NA_integer_),
  total_tokens = c(15L, 19L, NA_integer_),
  reasoning_tokens = c(NA_integer_, NA_integer_, NA_integer_),
  duration = c(0.4, 0.5, 0.1)
)
llm_usage(res)
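The reason `na.rm = TRUE` gives correct totals under row batching, and the do-it-yourself cost arithmetic, can be shown on fabricated numbers. Everything specific in this sketch is an assumption: the `batch_id` column name, the token counts, and especially the two prices, which are placeholders to be replaced with your provider's current rates.

```r
# Fabricated diagnostics: five rows resolved by two batched calls.
# Each batch's tokens sit on its first row; the other rows are NA.
diag <- data.frame(
  batch_id = c(1L, 1L, 1L, 2L, 2L),
  sent_tokens = c(300L, NA, NA, 210L, NA),
  rec_tokens = c(45L, NA, NA, 30L, NA)
)

# Summing with na.rm = TRUE counts each batch exactly once
sent <- sum(diag$sent_tokens, na.rm = TRUE)
rec <- sum(diag$rec_tokens, na.rm = TRUE)

# Placeholder prices per million tokens (NOT real provider rates)
price_in <- 0.10
price_out <- 0.40
cost <- (sent * price_in + rec * price_out) / 1e6
```

Had the batch totals been copied onto every row instead, the same sums would overcount by the batch size.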
Adds structured_valid (logical) and structured_error (chr) by validating
each row's structured_data against schema. No provider calls are made.
llm_validate_structured_col(
  .data,
  schema,
  structured_list_col = "structured_data"
)
.data |
A data.frame with a |
schema |
JSON Schema (R list) |
structured_list_col |
Column name with parsed JSON. Default "structured_data". |
llm_parse_structured_col(), llm_fn_structured()
A lightweight S3 container for generative model calls. It standardizes finish reasons and token usage across providers and keeps the raw response for advanced users.
Returns the standardized finish reason for an llmr_response.
Returns a list with token counts for an llmr_response.
Convenience check for truncation due to token limits.
finish_reason(x)

tokens(x)

is_truncated(x)

## S3 method for class 'llmr_response'
as.character(x, ...)

## S3 method for class 'llmr_response'
print(x, ...)
x |
An |
... |
Ignored. |
text: character scalar. Assistant reply.
provider: character. Provider id (e.g., "openai", "gemini").
model: character. Model id as requested in the config.
model_version: character. The model identifier the server reports having
served (e.g., a dated snapshot). Useful for reproducibility records; NA
when the provider does not echo it.
finish_reason: one of "stop", "length", "filter", "tool", "other".
usage: list with integers sent, rec, total, reasoning, and
cached (tokens read from the provider's prompt cache; NA when not
reported).
thinking: character. Reasoning text when the provider returns it
separately (e.g., Anthropic thinking blocks, Gemini thought parts,
DeepSeek reasoning_content); NA otherwise.
response_id: provider's response identifier if present.
duration_s: numeric seconds from request to parse.
raw: parsed provider JSON (list).
raw_json: raw JSON string.
print() shows the text, then a compact status line with model, finish reason,
token counts, and a terse hint if truncated or filtered.
as.character() extracts text so the object remains drop-in for code that
expects a character return.
A length-1 character vector or NA_character_.
A list list(sent, rec, total, reasoning, cached). Missing
values are NA. cached counts prompt tokens the provider read from
its cache (cheaper than fresh input tokens); it is NA for providers that
do not report cache usage.
TRUE if truncated, otherwise FALSE.
call_llm(), call_llm_robust(), llm_chat_session(),
llm_config(), llm_mutate(), llm_fn()
# Minimal fabricated example (no network):
r <- structure(
  list(
    text = "Hello!",
    provider = "openai",
    model = "demo",
    finish_reason = "stop",
    usage = list(sent = 12L, rec = 5L, total = 17L, reasoning = NA_integer_),
    response_id = "resp_123",
    duration_s = 0.012,
    raw = list(choices = list(list(message = list(content = "Hello!")))),
    raw_json = "{}"
  ),
  class = "llmr_response"
)
as.character(r)
finish_reason(r)
tokens(r)
print(r)

## Not run:
fr <- finish_reason(r)
## End(Not run)

## Not run:
u <- tokens(r)
u$total
## End(Not run)

## Not run:
if (is_truncated(r)) message("Increase max_tokens")
## End(Not run)
Converts the embedding response data to a numeric matrix.
parse_embeddings(embedding_response)
embedding_response |
The response returned from an embedding API call. |
A numeric matrix of embeddings with column names as sequence numbers.
## Not run:
text_input <- c(
  "Political science is a useful subject",
  "We love sociology",
  "German elections are different",
  "A student was always curious."
)

# Configure the embedding API provider (example with Voyage API).
# The key is read from the VOYAGE_API_KEY environment variable.
voyage_config <- llm_config(
  provider = "voyage",
  model = "voyage-3.5-lite"
)

embedding_response <- call_llm(voyage_config, text_input)
embeddings <- parse_embeddings(embedding_response)

# Additional processing:
embeddings |> cor() |> print()
## End(Not run)
Configurations never print their key: a literal key shows as
<llmr_secret: literal> and an environment reference as
<llmr_secret: env:VARNAME>, so configs are safe to print in scripts,
logs, and rendered documents.
## S3 method for class 'llm_config'
print(x, ...)

## S3 method for class 'llm_config'
format(x, ...)
| x | An llm_config object. |
|---|---|
| ... | Ignored. |
x invisibly (for print); a character vector (for format).
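The masking behaviour described above can be approximated with ordinary S3 methods. The sketch below is a minimal illustration in base R; the classes `demo_secret` and `demo_config` and their field names are hypothetical stand-ins, and this is not the package's own implementation.

```r
# Hypothetical classes ("demo_secret", "demo_config"); not LLMR's own code.
demo_secret <- function(value = NULL, env = NULL) {
  structure(list(value = value, env = env), class = "demo_secret")
}

# format() reports only where the key comes from, never the key itself.
format.demo_secret <- function(x, ...) {
  if (!is.null(x$env)) {
    paste0("<demo_secret: env:", x$env, ">")
  } else {
    "<demo_secret: literal>"
  }
}

# format() for the config returns a character vector, one line per field.
format.demo_config <- function(x, ...) {
  c(
    paste("provider:", x$provider),
    paste("model:   ", x$model),
    paste("api_key: ", format(x$api_key))
  )
}

# print() writes the formatted lines and returns its input invisibly.
print.demo_config <- function(x, ...) {
  cat(format(x), sep = "\n")
  invisible(x)
}

cfg <- structure(
  list(
    provider = "openai",
    model = "demo",
    api_key = demo_secret(value = "sk-not-a-real-key")
  ),
  class = "demo_config"
)
print(cfg)
```

Because the key is wrapped in its own class, any code path that formats or prints the config goes through the masking method instead of reaching the raw string.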
Resets the future plan to sequential processing.
reset_llm_parallel(verbose = FALSE)
| verbose | Logical. If TRUE, prints reset information. |
|---|---|
Invisibly returns the future plan that was in place before resetting to sequential.
    ## Not run:
    # Setup parallel processing
    old_plan <- setup_llm_parallel(workers = 2)
    # Do some parallel work...
    # Reset to sequential
    reset_llm_parallel(verbose = TRUE)
    # Optionally restore the specific old_plan if it was non-sequential
    # future::plan(old_plan)
    ## End(Not run)
Convenience function to set up the future plan for optimal LLM parallel processing. Automatically detects system capabilities and sets appropriate defaults.
setup_llm_parallel(workers = NULL, strategy = NULL, verbose = FALSE)
| workers | Integer. Number of workers to use. If NULL, auto-detects optimal number (availableCores - 1, capped at 8). If called as |
|---|---|
| strategy | Character. The future strategy to use. Options: "multisession", "multicore", "sequential". If NULL (default), automatically chooses "multisession". |
| verbose | Logical. If TRUE, prints setup information. |
Invisibly returns the previous future plan.
    ## Not run:
    # Automatic setup
    setup_llm_parallel()
    # Manual setup with specific workers
    setup_llm_parallel(workers = 4, verbose = TRUE)
    # Force sequential processing for debugging
    setup_llm_parallel(strategy = "sequential")
    # Reset to sequential processing when finished
    reset_llm_parallel()
    ## End(Not run)
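The default for `workers` described above (one fewer than the available cores, capped at 8) amounts to a small piece of arithmetic. The sketch below works through it with a hypothetical helper, `default_workers()`. It uses `parallel::detectCores()` as a stand-in for the core count, and the floor of one worker and the handling of an unknown core count are assumptions added here, not documented package behaviour.

```r
# Hypothetical helper illustrating the documented default:
# one fewer than the available cores, capped at 8.
# The floor of 1 and the NA handling are assumptions added for this sketch.
default_workers <- function(cores = parallel::detectCores()) {
  if (is.na(cores)) cores <- 1L
  as.integer(max(1L, min(8L, cores - 1L)))
}

default_workers(4)   # 3
default_workers(16)  # 8 (the cap applies)
default_workers(1)   # 1 (the assumed floor applies)
```

Leaving one core free keeps the main R session responsive, and the cap limits how many requests run at once, which is relevant when a provider enforces rate limits.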
When a model decides to call tools, finish_reason(x) is "tool" and this
helper returns what it asked for. call_llm_tools() uses it internally; it
is exported so custom loops can be built on it.
tool_calls(x)
| x | An llmr_response object. |
|---|---|
A list with one element per requested call:
list(id =, name =, arguments =) where arguments is a named list.
list() when the response contains no tool calls.
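A custom loop built on this return value mostly needs to look up each requested tool by name and apply it to the supplied arguments. The sketch below shows one way to do that in base R. The `calls` list is fabricated to match the documented shape rather than obtained from a real response, and the `tools` registry and the `run_calls()` helper are hypothetical names, not part of the package.

```r
# Hypothetical local tools, keyed by the name the model would request.
tools <- list(
  add = function(a, b) a + b,
  shout = function(text) toupper(text)
)

# Fabricated value in the documented shape of tool_calls(x):
# one element per call, each list(id =, name =, arguments =).
calls <- list(
  list(id = "call_1", name = "add", arguments = list(a = 2, b = 3)),
  list(id = "call_2", name = "shout", arguments = list(text = "hi"))
)

# Run each requested call. An unknown tool name yields a message as the
# result instead of stopping the loop.
run_calls <- function(calls, tools) {
  lapply(calls, function(tc) {
    fn <- tools[[tc$name]]
    result <- if (is.null(fn)) {
      paste("unknown tool:", tc$name)
    } else {
      do.call(fn, tc$arguments)
    }
    list(id = tc$id, result = result)
  })
}

results <- run_calls(calls, tools)
str(results)

# An empty list (no tool calls) simply produces no results.
length(run_calls(list(), tools))
```

Carrying each call's `id` alongside its result lets the results be matched back to the requests when they are sent to the model in a follow-up turn.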