> Jag Patel

Building an AI Decision Engine with Mistral, FastAPI & PromptFlow

Tags: Python, FastAPI, Mistral, PromptFlow, GPT4All, MLOps, AI, Bicep, CI/CD, Infrastructure as Code, Azure, Local LLM, Decision Engine, PDF Reports

Rules-based systems are powerful but brittle — they tell you the outcome, not the why. AI reasoning systems are flexible but opaque. What if you combined both?

I built a dynamic decision engine that evaluates user data against rules defined in JSON and adds AI-powered reasoning to explain the result in plain language. No black-box decisions — every outcome comes with a human-readable explanation.

💡 What It Does

  • Loads decision rules from JSON — no code changes needed to update logic
  • Evaluates user-submitted data against those rules at runtime
  • Passes matched rule context to Mistral-7B-OpenOrca via GPT4All for natural language reasoning
  • Returns structured decisions with AI-generated explanations
  • Produces HTML and PDF reports for each evaluation
  • Orchestrates the full workflow with PromptFlow

⚙️ Tech Stack

| Layer | Technology |
|---|---|
| API server | FastAPI + Uvicorn |
| Rule engine | JSON-based DSL (custom) |
| Local LLM reasoning | Mistral-7B-OpenOrca via GPT4All |
| Workflow orchestration | PromptFlow |
| Report generation | Jinja2 + WeasyPrint |
| Infrastructure | Azure App Services (GPU-enabled) |
| IaC | Bicep |
| CI/CD | GitHub Actions |

🔧 How It Works

1. Rules as JSON

Decision logic lives in structured JSON files — no hard-coded conditionals. Rules define conditions, thresholds, and outcomes. This makes the engine configurable without redeployment.

```json
{
  "rule_id": "income-check",
  "condition": "annual_income < 30000",
  "outcome": "ineligible",
  "reason_prompt": "Explain why this applicant does not meet the income threshold."
}
```
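
To make the rule format concrete, here is a minimal sketch of how a rule like this could be evaluated at runtime. It assumes every condition has the shape `<field> <operator> <number>`; the `evaluate_rule` helper and the `OPERATORS` table are hypothetical names for illustration, not the engine's actual code. Parsing the condition instead of passing it to `eval` means a rule file cannot run arbitrary Python.

```python
import json
import operator

# Comparison operators allowed in a condition.
# Assumed grammar: "<field> <operator> <number>".
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def evaluate_rule(rule: dict, data: dict) -> bool:
    """Return True when the rule's condition holds for the submitted data."""
    field, op_symbol, threshold = rule["condition"].split()
    if op_symbol not in OPERATORS:
        raise ValueError(f"Unsupported operator: {op_symbol}")
    if field not in data:
        raise KeyError(f"Missing field: {field}")
    return OPERATORS[op_symbol](float(data[field]), float(threshold))


rule = json.loads("""
{
  "rule_id": "income-check",
  "condition": "annual_income < 30000",
  "outcome": "ineligible",
  "reason_prompt": "Explain why this applicant does not meet the income threshold."
}
""")

print(evaluate_rule(rule, {"annual_income": 25000}))  # True
print(evaluate_rule(rule, {"annual_income": 45000}))  # False
```

A real rule DSL would likely need compound conditions (`and`/`or`) and non-numeric values; this sketch deliberately stops at single comparisons.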

2. FastAPI Endpoints

Three endpoints cover the different ways into the engine:

  • POST /evaluate — JSON API for programmatic evaluation
  • POST /evaluate/form — HTML form submission for human users
  • GET /report/{id} — retrieve a generated HTML or PDF report

3. AI Reasoning with Mistral

Once a rule matches, the engine constructs a prompt with the rule context and the user's data, then calls Mistral-7B-OpenOrca locally via GPT4All. The model returns a concise, plain-English explanation of the decision.

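Running the LLM locally, inside the same deployment as the API, keeps sensitive applicant data within infrastructure you control: there are no calls to third-party LLM APIs, and no applicant data is sent to an external model provider.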
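
The prompt-construction step might look something like the sketch below. The prompt wording and the `build_prompt` and `explain` helpers are hypothetical. The model is passed in as a plain callable so the example runs without downloading weights; the commented lines show how a GPT4All model could be plugged in, with the model file name being an assumption.

```python
import json
from typing import Callable


def build_prompt(rule: dict, data: dict) -> str:
    """Combine the matched rule and the applicant's data into one prompt."""
    return (
        "You are explaining an automated decision to the applicant.\n"
        f"Rule: {rule['rule_id']} ({rule['condition']})\n"
        f"Outcome: {rule['outcome']}\n"
        f"Applicant data: {json.dumps(data, sort_keys=True)}\n"
        f"Task: {rule['reason_prompt']}\n"
        "Answer in two or three plain-English sentences."
    )


def explain(rule: dict, data: dict, generate: Callable[[str], str]) -> str:
    """Ask the model for an explanation; `generate` maps a prompt to text."""
    return generate(build_prompt(rule, data)).strip()


# With GPT4All installed, `generate` could wrap a local model, for example:
#   from gpt4all import GPT4All
#   model = GPT4All("mistral-7b-openorca.Q4_0.gguf")  # assumed model file name
#   generate = lambda prompt: model.generate(prompt, max_tokens=200)


def fake_generate(prompt: str) -> str:
    """Stand-in for the LLM so the sketch runs without a model download."""
    return " The applicant's income is below the required threshold. "


rule = {
    "rule_id": "income-check",
    "condition": "annual_income < 30000",
    "outcome": "ineligible",
    "reason_prompt": "Explain why this applicant does not meet the income threshold.",
}

print(explain(rule, {"annual_income": 25000}, fake_generate))
```

Keeping the outcome in the rule and asking the model only for the explanation means the LLM never decides eligibility; it only words a decision the rule engine has already made.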

4. PromptFlow Orchestration

PromptFlow ties the steps together — rule evaluation, LLM reasoning, report generation — as a traceable, debuggable workflow. This makes it easy to replay, inspect, and improve individual steps without running the full pipeline.
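
The idea of a traceable workflow can be illustrated in plain Python. To be clear, this is not PromptFlow's API: `run_flow` and the three step functions are hypothetical stand-ins that only show what "each step as a node with recorded outputs" means in practice.

```python
import time
from typing import Callable

Step = Callable[[dict], dict]


def run_flow(steps: list[tuple[str, Step]], inputs: dict) -> tuple[dict, list[dict]]:
    """Run steps in order, merging each step's output into a shared state.

    Returns the final state and a trace with one entry per step.
    """
    state = dict(inputs)
    trace = []
    for name, step in steps:
        started = time.perf_counter()
        output = step(state)
        trace.append({
            "step": name,
            "output": output,
            "seconds": time.perf_counter() - started,
        })
        state.update(output)
    return state, trace


def evaluate_rules(state: dict) -> dict:
    matched = state["annual_income"] < 30000
    return {"outcome": "ineligible" if matched else "no_rule_matched"}


def add_reasoning(state: dict) -> dict:
    # Placeholder for the local LLM call.
    return {"reasoning": f"Outcome '{state['outcome']}' based on income {state['annual_income']}."}


def render_report(state: dict) -> dict:
    return {"report": f"<p>{state['outcome']}: {state['reasoning']}</p>"}


state, trace = run_flow(
    [
        ("evaluate_rules", evaluate_rules),
        ("add_reasoning", add_reasoning),
        ("render_report", render_report),
    ],
    {"annual_income": 25000},
)

for entry in trace:
    print(entry["step"], "->", entry["output"])
```

Because every step takes and returns a dictionary, a single step can be rerun against a saved state from the trace, which is the replay-and-inspect property described above.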

5. PDF Reports

Each evaluation generates a branded HTML report rendered to PDF with WeasyPrint. Reports include the decision outcome, matched rules, AI reasoning, and a timestamp — ready for auditing or end-user delivery.
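
A stripped-down version of the report step is sketched below. To stay dependency-free it uses the standard library's `string.Template` as a stand-in for Jinja2, and the template markup and `render_report` helper are assumptions for illustration. The commented lines at the end show how the resulting HTML could be handed to WeasyPrint.

```python
from datetime import datetime, timezone
from html import escape
from string import Template

REPORT_TEMPLATE = Template("""<!DOCTYPE html>
<html>
  <body>
    <h1>Decision report</h1>
    <p><strong>Outcome:</strong> $outcome</p>
    <p><strong>Matched rule:</strong> $rule_id</p>
    <p><strong>Reasoning:</strong> $reasoning</p>
    <p><small>Generated at $timestamp</small></p>
  </body>
</html>
""")


def render_report(outcome: str, rule_id: str, reasoning: str) -> str:
    """Render the evaluation as HTML, escaping all dynamic values."""
    return REPORT_TEMPLATE.substitute(
        outcome=escape(outcome),
        rule_id=escape(rule_id),
        reasoning=escape(reasoning),
        timestamp=datetime.now(timezone.utc).isoformat(timespec="seconds"),
    )


html_report = render_report(
    "ineligible",
    "income-check",
    "Income is below the <30000> threshold.",
)
print(html_report)

# With WeasyPrint installed, the same HTML could be written to a PDF:
#   from weasyprint import HTML
#   HTML(string=html_report).write_pdf("report.pdf")
```

Escaping matters here because the reasoning text comes from a model and the other values come from user-editable rule files, so neither should be trusted as raw HTML.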

☁️ Deployment

The engine is deployed on Azure App Service with GPU support for the LLM inference step. Infrastructure is defined entirely in Bicep, so environment provisioning is reproducible and version-controlled.

The GitHub Actions CI/CD pipeline handles:

  • Lint and unit tests on PR
  • Docker build and push to ACR on merge
  • Automated deployment to App Service

📘 What I Learned

Rule-based logic + AI reasoning is a strong pattern

Pure rule engines are fast and auditable but can't explain nuance. Pure LLMs are flexible but hard to constrain. Combining them gives you the best of both: deterministic outcomes with intelligible reasoning.

PromptFlow adds real observability to LLM workflows

Rather than chaining calls in ad-hoc Python, PromptFlow structures each step as a node with inputs, outputs, and traces. This made debugging and iteration significantly faster.

GPU-backed App Services close the gap between prototype and production

Running a 7B parameter model in a cloud environment with GPU support made latency viable for real user-facing use cases — without needing to manage Kubernetes or custom VM infrastructure.

The combination of rule-based certainty and AI-generated reasoning makes decisions more transparent and defensible — critical for any system that affects real people.

  • PromptFlow — Microsoft's LLM workflow orchestration framework for Azure AI
  • GPT4All — cross-platform runtime for running open-source LLMs locally
  • Mistral-7B — efficient open-source LLM well-suited for reasoning tasks
  • Azure App Service (GPU) — managed hosting with GPU support for ML workloads
  • Bicep — Azure-native IaC language for declarative resource provisioning
