---
title: "Small experiment with LLMR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Small experiment with LLMR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  # Later chunks run only when LLMR_RUN_VIGNETTES is "true" (case-insensitive).
  eval = identical(tolower(Sys.getenv("LLMR_RUN_VIGNETTES", "false")), "true")
)
```  

## Overview

This vignette demonstrates:

1. Building factorial experiment designs with `build_factorial_experiments()`
2. Running experiments in parallel with `call_llm_par()`
3. Comparing unstructured vs. structured output across providers

The workflow is: **design → parallel execution → analysis**

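We compare three model configurations on two prompts, first with free-text (unstructured) output and then with schema-constrained (structured) output. The models are open models served by DeepSeek and Groq, which keeps the runs fast and inexpensive. Because every chunk calls a live API, the code below is evaluated only when the environment variable `LLMR_RUN_VIGNETTES` is set to `true`.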

```{r}
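library(LLMR)
library(dplyr)  # select() for inspecting the results

# One config per provider/model pair; each provider needs a valid API key
# (see ?llm_config for how keys are supplied).
cfg_ds     <- llm_config("deepseek", "deepseek-chat")
cfg_groq1  <- llm_config("groq",     "llama-3.1-8b-instant")
cfg_groq   <- llm_config("groq",     "openai/gpt-oss-20b")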
```

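## Build a factorial design

`build_factorial_experiments()` crosses every configuration with every user prompt and system prompt. Here that is 3 configs × 2 user prompts × 1 system prompt, so the design has 6 rows, one per API call.

```{r}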
experiments <- build_factorial_experiments(
  configs       = list(cfg_ds, cfg_groq1, cfg_groq),
  user_prompts  = c("Summarize in one sentence: The Apollo program.",
                    "List two benefits of green tea."),
  system_prompts = c("Be concise.")
)
experiments
```
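
To make the crossing concrete, the following base-R sketch reproduces only the shape of the design with `expand.grid()`. It does not call LLMR, and the short labels are hypothetical stand-ins chosen for illustration; the columns that `build_factorial_experiments()` actually returns may be named and typed differently.

```{r}
# Illustrative only: a full factorial grid built with base R.
# The labels are hypothetical stand-ins, not values produced by LLMR.
design_sketch <- expand.grid(
  config        = c("deepseek-chat", "llama-3.1-8b-instant", "gpt-oss-20b"),
  user_prompt   = c("apollo", "green_tea"),
  system_prompt = "concise",
  stringsAsFactors = FALSE
)

# 3 configs x 2 user prompts x 1 system prompt = 6 rows
nrow(design_sketch)
design_sketch
```

Adding a second system prompt or a fourth config multiplies the row count accordingly, which is worth checking before launching a paid run.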

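## Run unstructured

`call_llm_par()` sends each row of the design as one request and returns the design with the response columns appended.

```{r}
# Start a parallel backend with up to 10 workers.
setup_llm_parallel(workers = 10)
res_unstructured <- call_llm_par(experiments, progress = TRUE)
# Return to sequential execution once the calls are done.
reset_llm_parallel()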
res_unstructured |>
  select(provider, model, user_prompt_label, response_text, finish_reason) |>
  head()
```

**Understanding the results:**

The `finish_reason` column records why each response ended:

- `"stop"`: the model finished normally
- `"length"`: the response was cut off at the token limit (raise `max_tokens` in the config)
- `"filter"`: the provider's content filter was triggered

The `user_prompt_label` column identifies which user prompt produced each row, so responses can be matched back to their experimental condition.
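
A quick tally of finish reasons per model makes truncated or filtered responses easy to spot. The sketch below is one possible way to do this with dplyr; `count_finish_reasons()` is a hypothetical helper defined here, not an LLMR function, and it assumes the result has `provider`, `model`, and `finish_reason` columns as selected above.

```{r}
library(dplyr)

# Hypothetical helper (not part of LLMR): count responses per
# provider, model, and finish reason.
count_finish_reasons <- function(results) {
  results |>
    count(provider, model, finish_reason, name = "n_responses") |>
    arrange(provider, model, desc(n_responses))
}

# Apply it only if the unstructured run above was actually executed.
if (exists("res_unstructured")) {
  count_finish_reasons(res_unstructured)
}
```

Any row with a reason other than `"stop"` is a candidate for a rerun with adjusted settings.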

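## Structured version

The schema below asks for a JSON object with a string `answer` and an array of string `keywords`. `enable_structured_output()` attaches the schema to each config, and `call_llm_par_structured()` runs the design and parses the replies.

```{r}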
schema <- list(
  type = "object",
  properties = list(
    answer = list(type="string"),
    keywords = list(type="array", items = list(type="string"))
  ),
  required = list("answer","keywords"),
  additionalProperties = FALSE
)

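# Attach the schema to every config in the design.
experiments2 <- experiments
experiments2$config <- lapply(experiments2$config, enable_structured_output, schema = schema)

setup_llm_parallel(workers = 10)
# `.fields` names the schema fields to extract into their own columns.
res_structured <- call_llm_par_structured(experiments2, .fields = c("answer", "keywords"))
reset_llm_parallel()

# `structured_ok` flags rows whose response parsed successfully.
res_structured |>
  select(provider, model, user_prompt_label, structured_ok, answer) |>
  head()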
```
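
To close the design → parallel execution → analysis loop, one simple comparison is the share of responses per model that parsed into the schema. The sketch below assumes `structured_ok` is a logical column, as its use above suggests; `structured_success_rate()` is a hypothetical helper written for this vignette, not an LLMR function.

```{r}
library(dplyr)

# Hypothetical helper (not part of LLMR): per-model share of rows whose
# structured output parsed successfully. Assumes `structured_ok` is logical.
structured_success_rate <- function(results) {
  results |>
    group_by(provider, model) |>
    summarise(
      n_calls = n(),
      ok_rate = mean(structured_ok, na.rm = TRUE),
      .groups = "drop"
    )
}

# Apply it only if the structured run above was actually executed.
if (exists("res_structured")) {
  structured_success_rate(res_structured)
}
```

With only two prompts per model the rates are coarse, so treat them as a smoke test; increasing the number of prompts or repetitions in the design gives a more stable comparison.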