HidsTech
Intelligent AI Studio
AI Observability · 7 min read · 31 March 2026

Langfuse: Open-Source LLM Observability You Can Self-Host

A practical guide to Langfuse — tracing, evaluation, prompt management, and cost monitoring for any LLM stack, fully open-source.

Most LLM observability tools are SaaS products that send your prompts and outputs to a third-party server. For companies with data privacy requirements — healthcare, finance, legal — that's a dealbreaker. Langfuse is the open-source alternative you can run entirely on your own infrastructure.

What Is Langfuse?

Langfuse is an open-source LLM engineering platform that provides:

  • Tracing — full request/response visibility for any LLM call
  • Evaluations — automated scoring and human annotation
  • Prompt management — versioned prompts with A/B testing
  • Metrics — latency, cost, token usage, error rates
  • Sessions — group traces into conversations or workflows

It works with any LLM framework — LangChain, LlamaIndex, OpenAI SDK, Anthropic SDK, or raw HTTP calls.

Self-Hosting with Docker

Clone the Langfuse repository, run `docker compose up`, and you have a full observability stack on your own server.
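A minimal sketch of the self-hosting steps, based on the public Langfuse repository's Docker setup (service names, ports, and required environment variables can differ between releases, so check the repo's docs for your version):

```shell
# Clone the official repository and start the stack.
git clone https://github.com/langfuse/langfuse.git
cd langfuse

# Starts the Langfuse web app plus its backing services,
# as defined in the repository's docker-compose.yml.
docker compose up -d
```

Once the containers are up, open the web UI in your browser and create a project to obtain the API keys your application will use.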

Integrating with Python

The `@observe()` decorator automatically captures inputs, outputs, latency, and token usage.

LangChain Integration

Every LangChain step — prompts, LLM calls, tools, retrievers — appears as a nested trace in Langfuse.

Prompt Management

Langfuse lets you manage prompts as versioned artefacts with metadata.

When you update a prompt in the Langfuse UI, your application picks up the new version without a code deploy. You can roll back instantly if quality drops.

Evaluations

Set up automated evaluators to score every response:

For LLM-as-judge evaluation:
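One common pattern is to build a judging prompt, send it to a stronger model, parse the numeric verdict, and record it as a score. A sketch of the pure prompt-building step (the template wording is illustrative, not a Langfuse API):

```python
JUDGE_TEMPLATE = (
    "You are a strict evaluator. Rate the answer below for factual "
    "accuracy on a scale from 0.0 to 1.0. Reply with only the number.\n\n"
    "Question: {question}\nAnswer: {answer}"
)

def build_judge_prompt(question: str, answer: str) -> str:
    # Pure helper: formats the evaluation request sent to the judge model.
    return JUDGE_TEMPLATE.format(question=question, answer=answer)

# Send build_judge_prompt(...) to a judge model, parse the returned number,
# and attach it to the original trace as a Langfuse score.
```

Langfuse can also run managed LLM-as-a-judge evaluators configured directly in the UI, so routine scoring need not live in application code.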

Cost Tracking

Langfuse automatically calculates cost per trace based on token usage and model pricing. You get dashboards showing:

  • Cost per user / per session / per feature
  • Model spend breakdown (GPT-4o vs Claude vs Gemini)
  • Cost trends over time

This is essential for managing LLM API budgets at scale.

Langfuse vs LangSmith

  • Open-source: Langfuse is MIT licensed; LangSmith is proprietary
  • Self-hostable: Langfuse yes; LangSmith only on enterprise plans
  • Framework support: Langfuse works with any framework; LangSmith is LangChain-native
  • Prompt Hub: Both support versioned prompt management
  • Evaluations: Both support automated and human evaluation
  • Hosted option: Both offer cloud-hosted versions

Choose Langfuse if data privacy matters, you use multiple frameworks, or you want to self-host.

Choose LangSmith if you're all-in on LangChain and want the tightest integration.

Getting Started

  • Deploy Langfuse via Docker or use the cloud version at langfuse.com
  • Add the SDK to your project: `pip install langfuse`
  • Wrap your LLM calls with `@observe()` or the callback handler
  • Open the dashboard and start seeing your traces in real time

Observability isn't optional for production AI — it's how you catch hallucinations, measure quality, and make confident improvements.

Talk to us if you need help setting up LLM observability for your team.

Ready to implement AI in your business?

Book a free 30-minute strategy call — no commitment required.

Book a Free Call →