What I Learned Building an AI Agent Framework in Rust from Scratch

Created: March 17, 2026. Updated: April 8, 2026.

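A write-up of building a minimal, async-first AI agent framework in Rust: the architecture, provider-specific JSON parsing, structured error handling, and the lessons Rust's type system taught along the way.

I wanted to take part in the AI era, but not only as a user of tools. I wanted to build something. And a thought occurred to me: if I want to see how Rust can evolve in this space, is there a better way than writing a library for it myself?

That is how mini-agent started. It is a minimal, async-first AI agent framework for Rust that supports OpenAI, Anthropic, OpenRouter, and Ollama.

Here is what building it was actually like.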

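Why Rust for AI Tooling?

Most AI tooling is written in Python. That is fine: Python is great for fast iteration. But I was learning Rust through Let's Get Rusty and the official Rust book, and I kept wondering what building something "real" in Rust would look like.

An AI agent felt like the right challenge. Async code, HTTP clients, JSON parsing, trait abstractions, error handling: it is practically a tour of the language's features.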

Architecture

There are three core ideas: Provider, Tool, and Agent.

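// Assumed imports: the async-trait macro, and Value taken to be serde_json's Value
use async_trait::async_trait;
use serde_json::Value;

// Implement this to add a new LLM backend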
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn provider_name(&self) -> &str;
    async fn complete(
        &self,
        messages: &[Message],
        tools: &[&dyn Tool],
        model: &str,
    ) -> Result<Completion, AgentError>;
}

// Implement this to give the agent new capabilities
#[async_trait]
pub trait Tool: Send + Sync + 'static {
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn parameters_schema(&self) -> Value;
    async fn execute(&self, args: Value) -> Result<String, AgentError>;
}

The Agent drives a ReAct-style loop (plan, act, observe) until the model returns a final answer or the maximum step limit is reached.

User prompt
    │
    ▼
Agent sends messages + tools → LlmProvider
    │
    ▼
LLM responds with tool call?
    ├── Yes → execute tool → result added to context → loop
    └── No → return final answer
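
To make the loop concrete, here is a minimal synchronous sketch that uses only the standard library. It is not mini-agent's actual implementation: `run_loop`, `LoopError`, `ToolFn`, `add_numbers`, and `scripted_model` are hypothetical names, the `Completion` enum is a simplified stand-in, and the "model" is a scripted function rather than a real provider call.

```rust
use std::collections::HashMap;

/// What the model returned for one turn (simplified stand-in for `Completion`).
#[derive(Debug, Clone, PartialEq)]
pub enum Completion {
    /// The model wants a tool to run before it answers.
    ToolCall { name: String, args: String },
    /// The model produced its final answer.
    Final(String),
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum LoopError {
    UnknownTool(String),
    MaxSteps(usize),
}

/// A tool is modelled here as a plain function from argument text to result text.
pub type ToolFn = fn(&str) -> String;

/// Runs plan -> act -> observe until a final answer arrives or `max_steps` is hit.
/// `model` stands in for the provider call and sees the context built so far.
pub fn run_loop(
    mut model: impl FnMut(&[String]) -> Completion,
    tools: &HashMap<&str, ToolFn>,
    prompt: &str,
    max_steps: usize,
) -> Result<String, LoopError> {
    let mut context = vec![format!("user: {prompt}")];
    for _ in 0..max_steps {
        match model(&context) {
            Completion::Final(answer) => return Ok(answer),
            Completion::ToolCall { name, args } => {
                let tool = tools
                    .get(name.as_str())
                    .copied()
                    .ok_or_else(|| LoopError::UnknownTool(name.clone()))?;
                // Observe: feed the tool result back into the context.
                let observation = tool(args.as_str());
                context.push(format!("tool {name}: {observation}"));
            }
        }
    }
    Err(LoopError::MaxSteps(max_steps))
}

/// Hypothetical tool: sums whitespace-separated integers, ignoring bad tokens.
pub fn add_numbers(args: &str) -> String {
    let sum: i64 = args
        .split_whitespace()
        .filter_map(|token| token.parse::<i64>().ok())
        .sum();
    sum.to_string()
}

/// Scripted stand-in for an LLM: requests the tool once, then answers
/// with whatever the tool observed.
pub fn scripted_model(context: &[String]) -> Completion {
    const PREFIX: &str = "tool add_numbers: ";
    match context.last() {
        Some(last) if last.starts_with(PREFIX) => Completion::Final(last[PREFIX.len()..].to_string()),
        _ => Completion::ToolCall { name: "add_numbers".into(), args: "42 58".into() },
    }
}
```

Registering `add_numbers` under the key "add_numbers" and calling `run_loop(scripted_model, &tools, "What is 42 + 58?", 5)` takes two steps: one tool call, then the final answer. With a limit of one step the same call ends in `LoopError::MaxSteps`.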

The Hardest Part: Handling JSON Responses Across Providers

This one was genuinely painful. Every LLM provider returns a subtly different JSON shape.

OpenAI and OpenRouter are compatible, so the two could share a helper function:

pub fn parse_openai_completion(json: &Value) -> Result<Completion, AgentError> {
    let choice = json
        .get("choices")
        .and_then(|v| v.as_array())
        .and_then(|a| a.first())
        .ok_or_else(|| AgentError::invalid("provider", "missing 'choices' array"))?;
    // ...
}

Anthropic, however, is completely different. Instead of tool_calls, it returns tool_use blocks inside the content array:

{
  "content": [
    { "type": "text", "text": "Let me calculate that." },
    { "type": "tool_use", "id": "abc", "name": "add_numbers", "input": { "a": 4, "b": 7 } }
  ]
}

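So I had to write a separate parser for Anthropic and then convert its output into the same internal Completion struct. Rust actually helped here: the type system forced me to handle every case explicitly, so each case was dealt with deliberately instead of failing silently.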
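
The conversion step might look roughly like the sketch below. It assumes the JSON has already been decoded into a `ContentBlock` enum, and it keeps tool input as raw text to stay dependency-free. `ContentBlock`, `ToolCall`, `normalize_anthropic`, and the fields of this `Completion` are hypothetical and not taken from mini-agent.

```rust
/// One entry of Anthropic's `content` array, after JSON decoding.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input: String },
}

/// Provider-neutral tool call (hypothetical field names).
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub arguments: String,
}

/// Provider-neutral result (hypothetical stand-in for the crate's `Completion`).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Completion {
    pub text: Option<String>,
    pub tool_calls: Vec<ToolCall>,
}

/// Folds Anthropic-style content blocks into the shared `Completion` shape.
pub fn normalize_anthropic(blocks: Vec<ContentBlock>) -> Completion {
    let mut text_parts: Vec<String> = Vec::new();
    let mut tool_calls = Vec::new();
    for block in blocks {
        // No wildcard arm: adding a variant to `ContentBlock` stops this
        // from compiling until the new case is handled here.
        match block {
            ContentBlock::Text(text) => text_parts.push(text),
            ContentBlock::ToolUse { id, name, input } => {
                tool_calls.push(ToolCall { id, name, arguments: input });
            }
        }
    }
    let text = if text_parts.is_empty() {
        None
    } else {
        Some(text_parts.join("\n"))
    };
    Completion { text, tool_calls }
}
```

The exhaustive `match` is the part that reflects the point above: a block type that is not handled becomes a compile error rather than a silently dropped response.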

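I also found a real bug along the way: the Anthropic provider was silently dropping the system prompt, because I had hardcoded let system_prompt: Option = None; and never actually extracted it from the message history. It compiled fine. It just did not work. This kind of silent failure is exactly what better error handling is meant to prevent.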
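
For illustration, extracting the system prompt from the history could look something like this. Anthropic's Messages API takes the system prompt as a top-level `system` field rather than as a message in the list, which is why it has to be pulled out. `Role`, `Message`, and `split_system_prompt` are hypothetical names, and joining multiple system messages with a blank line is an arbitrary choice for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// Simplified stand-in for the crate's `Message` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// Splits a history into (system prompt, remaining messages).
/// Several system messages are joined with a blank line between them.
pub fn split_system_prompt(messages: &[Message]) -> (Option<String>, Vec<&Message>) {
    let system: Vec<&str> = messages
        .iter()
        .filter(|m| m.role == Role::System)
        .map(|m| m.content.as_str())
        .collect();
    let rest: Vec<&Message> = messages
        .iter()
        .filter(|m| m.role != Role::System)
        .collect();
    let prompt = if system.is_empty() {
        None
    } else {
        Some(system.join("\n\n"))
    };
    (prompt, rest)
}
```

The returned `Option<String>` would go into the request's `system` field, and the remaining messages into `messages`.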

Error Handling: From Strings to Structure

The errors initially looked like this:

#[error("Provider error: {0}")]
ProviderError(String),

#[error("Invalid response from LLM: {0}")]
InvalidResponse(String),

The problem is that errors like these are useless to the caller. You cannot pattern match on the contents of a string, and you cannot build retry logic on top of them.

So I restructured everything:

#[derive(Error, Debug)]
pub enum AgentError {
    #[error("Network error: {0}")]
    Network(#[from] reqwest::Error),

    #[error("Provider '{provider}' error{}: {message}",
        .status.map(|s| format!(" (HTTP {s})")).unwrap_or_default())]
    Provider {
        provider: String,
        message: String,
        status: Option<u16>,
    },

    #[error("Tool '{tool}' failed: {reason}")]
    ToolExecution {
        tool: String,
        reason: String,
    },

    #[error("Agent reached the maximum of {0} steps without a final answer")]
    MaxSteps(usize),
    // ...
}

Now the caller can actually do something useful:

impl AgentError {
    pub fn is_retryable(&self) -> bool {
        matches!(
            self,
            Self::Network(_) | Self::Provider { status: Some(500..=599), .. }
        )
    }

    pub fn is_client_error(&self) -> bool {
        matches!(self, Self::Provider { status: Some(400..=499), .. })
    }
}
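
As one example of what a caller could build on top of `is_retryable`, here is a small blocking retry helper with exponential backoff. It is a sketch under assumptions: the error enum is trimmed down (the `Network` variant holds a `String` instead of `reqwest::Error` so the example needs no dependencies), and `retry_with_backoff` is a hypothetical function, not part of mini-agent. Real async code would await a timer instead of calling `std::thread::sleep`.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Trimmed-down stand-in for the article's `AgentError` (no reqwest dependency).
#[derive(Debug, PartialEq)]
pub enum AgentError {
    Network(String),
    Provider { provider: String, message: String, status: Option<u16> },
}

impl AgentError {
    /// Same rule as in the article: network failures and HTTP 5xx are transient.
    pub fn is_retryable(&self) -> bool {
        matches!(
            self,
            Self::Network(_) | Self::Provider { status: Some(500..=599), .. }
        )
    }
}

/// Calls `op` at least once and at most `max_attempts` times, doubling the
/// delay after each retryable failure. Non-retryable errors return at once.
pub fn retry_with_backoff<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, AgentError>,
) -> Result<T, AgentError> {
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only transient failures, and only while attempts remain.
            Err(err) if err.is_retryable() && attempt < max_attempts => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

With string-only errors this decision would require parsing the message text. With the structured enum it is a single pattern match.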

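This was one of the places where Rust surprised me. The borrow checker and the type system feel restrictive at first, but they actually steer you toward better designs. I ended up with cleaner error handling than in any other language I have used, not because I planned it that way from the start, but because Rust makes the sloppy approach painful enough that I did it properly.

What I Learned About Rust Itself

A few things genuinely surprised me coming from other languages:

The borrow checker is annoying, until it isn't. Early on I fought it for quite a while. But every time it stopped me, it was catching a real problem: a value used after being moved, or a reference outliving its owner. Over time you stop fighting it and start designing around it.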

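async fn in traits is still evolving. I ran into the problem that putting an async fn in a trait makes it not dyn-compatible, which means you cannot use a trait with async methods directly as a boxed trait object (Box<dyn Trait>). The async-trait crate works around this with a macro, but it is worth knowing that it is not zero-cost: each call returns a boxed future, which means a heap allocation and dynamic dispatch.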
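
To show where that cost comes from, here is a hand-written approximation of the pattern: the method returns a boxed, type-erased future, which is roughly the shape async-trait rewrites an `async fn` into. The `BoxFuture` alias, the simplified `Tool` trait, `Echo`, and the tiny `block_on` executor are all hypothetical and exist only so the sketch runs without a runtime crate. Real code would use an executor such as Tokio.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A boxed, type-erased future, used here in place of `async fn`.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Dyn-compatible because `execute` returns a concrete type, not `impl Future`.
pub trait Tool: Send + Sync {
    fn name(&self) -> &'static str;
    fn execute<'a>(&'a self, args: &'a str) -> BoxFuture<'a, Result<String, String>>;
}

pub struct Echo;

impl Tool for Echo {
    fn name(&self) -> &'static str {
        "echo"
    }

    fn execute<'a>(&'a self, args: &'a str) -> BoxFuture<'a, Result<String, String>> {
        // One heap allocation per call: this is the non-zero cost.
        Box::pin(async move { Ok::<String, String>(format!("echo: {args}")) })
    }
}

/// Waker that does nothing; enough for a busy-polling executor.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor so the sketch needs no runtime crate.
pub fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Because `execute` returns `BoxFuture`, a `Vec<Box<dyn Tool>>` compiles, and `block_on(tools[0].execute("hi"))` drives the call to completion. The price is the `Box::pin` allocation on every call.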

thiserror is excellent. If you are going to do serious error handling in Rust, use it. The #[from] attribute and the format-string syntax of #[error(...)] make structured errors almost painless to implement.
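
For a sense of what the derive saves, below is approximately the boilerplate you would write by hand for a two-variant error. This is an illustration of the idea, not the macro's exact expansion. `std::num::ParseIntError` stands in for `reqwest::Error` to keep the example dependency-free, and the `BadNumber` variant and `parse_arg` function are hypothetical.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

#[derive(Debug)]
pub enum AgentError {
    /// With thiserror: `#[error("Invalid number: {0}")] BadNumber(#[from] ParseIntError)`.
    BadNumber(ParseIntError),
    /// With thiserror: `#[error("Tool '{tool}' failed: {reason}")]`.
    ToolExecution { tool: String, reason: String },
}

// What `#[error("...")]` provides: a Display impl built from the format strings.
impl fmt::Display for AgentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::BadNumber(e) => write!(f, "Invalid number: {e}"),
            Self::ToolExecution { tool, reason } => {
                write!(f, "Tool '{tool}' failed: {reason}")
            }
        }
    }
}

// `#[from]` also marks the wrapped error as the source.
impl Error for AgentError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::BadNumber(e) => Some(e),
            Self::ToolExecution { .. } => None,
        }
    }
}

// What `#[from]` provides: the conversion that lets `?` wrap the error.
impl From<ParseIntError> for AgentError {
    fn from(e: ParseIntError) -> Self {
        Self::BadNumber(e)
    }
}

/// `?` converts `ParseIntError` into `AgentError` through the `From` impl.
pub fn parse_arg(raw: &str) -> Result<i64, AgentError> {
    Ok(raw.trim().parse::<i64>()?)
}
```

That is three impl blocks for two variants, and every added variant touches at least two of them. The derive collapses all of it into attributes on the enum.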

Memory management clicked later than I expected. The ownership model feels theoretical until you debug a real async program. Once you do, it becomes clear why the model exists.

Try It

[dependencies]
mini-agent = { git = "https://github.com/RajMandaliya/mini-agent" }

let mut agent = Agent::new(Box::new(provider), model);
agent.add_tool(AddNumbersTool);
let result = agent.run("What is 42 + 58?").await?;

crates.io: https://crates.io/crates/mini-agent
GitHub: https://github.com/RajMandaliya/mini-agent

This is v0.1.0. The next item to tackle is typed tool results. Right now execute() returns a String, which keeps the trait object-safe but forces the caller to deserialize the result themselves. I got good feedback on this from the Rust community, and it is worth solving properly.
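
One possible direction, sketched purely as an illustration and not as the planned design, is a typed trait layered on top of the object-safe one, connected by a blanket impl. The traits below are synchronous and use `&str` arguments to stay dependency-free. `TypedTool`, `execute_typed`, and the `ToString` bound are assumptions made for this sketch, and `AddNumbersTool` here is a toy reimplementation.

```rust
/// Object-safe tool: results are erased to `String`, as in the article.
pub trait Tool {
    fn name(&self) -> &'static str;
    fn execute(&self, args: &str) -> Result<String, String>;
}

/// Hypothetical typed layer: each tool names its own output type.
pub trait TypedTool {
    type Output: ToString;
    fn name(&self) -> &'static str;
    fn execute_typed(&self, args: &str) -> Result<Self::Output, String>;
}

/// Every typed tool is automatically usable as an erased `Tool`.
impl<T: TypedTool> Tool for T {
    fn name(&self) -> &'static str {
        TypedTool::name(self)
    }

    fn execute(&self, args: &str) -> Result<String, String> {
        self.execute_typed(args).map(|out| out.to_string())
    }
}

pub struct AddNumbersTool;

impl TypedTool for AddNumbersTool {
    type Output = i64;

    fn name(&self) -> &'static str {
        "add_numbers"
    }

    /// Sums whitespace-separated integers; the first bad token aborts the sum.
    fn execute_typed(&self, args: &str) -> Result<i64, String> {
        args.split_whitespace()
            .map(|t| t.parse::<i64>().map_err(|e| format!("bad number '{t}': {e}")))
            .sum()
    }
}
```

Code that knows the concrete tool can call `execute_typed` and receive an `i64`, while the agent can still store it as `Box<dyn Tool>` and receive a `String`. In a real design the `ToString` bound would more likely be a serialization trait.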

If you are learning Rust and wondering what to build, build something real. The ecosystem needs more Rust-native AI tooling, and Rust is genuinely good at this.

