Play 19
Edge AI Phi-4
High · 🔧 Skeleton
Deploy Phi-4 SLM on edge devices with ONNX quantization and offline inference.
Run a small language model on edge devices without cloud connectivity. Phi-4, quantized to INT4 via ONNX Runtime, runs on devices with 4 GB+ RAM. IoT Hub manages the device fleet, syncs model updates, and collects telemetry. Inference runs fully offline, with periodic cloud sync for model updates.
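As a sanity check on the 4 GB RAM figure, the weight footprint under each quantization level can be estimated from the parameter count. A minimal sketch, assuming the ~3.8B-parameter Phi-4-mini variant (the 14B Phi-4 would not fit in 4 GB even at INT4); KV cache, activations, and runtime overhead are excluded:

```python
# Rough weight-memory estimate for quantized SLM deployment.
# Assumes the ~3.8B-parameter Phi-4-mini variant; excludes KV cache,
# activations, and runtime overhead, which add to the real budget.

def weight_footprint_gb(params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB for a given quantization level."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 3.8e9  # Phi-4-mini parameter count

int4 = weight_footprint_gb(PARAMS, 4)   # ~1.9 GB
int8 = weight_footprint_gb(PARAMS, 8)   # ~3.8 GB

print(f"INT4: {int4:.1f} GB, INT8: {int8:.1f} GB")
```

At INT4 the weights leave comfortable headroom on a 4 GB device; INT8 is borderline once the KV cache and runtime are added, which is why the quantization level is a tuning parameter below.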
Architecture Pattern
Edge AI, SLM, ONNX quantization, offline inference, device sync
Azure Services
IoT Hub · Container Instances · ONNX Runtime · Azure Storage
DevKit (.github Agentic OS)
- agent.md — edge AI engineer persona
- instructions.md — device management guide
- plugins/ — ONNX optimizer, device syncer, inference tester
TuneKit (AI Config)
- config/edge.json — quantization level, model config, memory constraints
- config/sync.json — update schedule, rollback rules
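The play does not define a schema for config/sync.json; the field names below are illustrative only. A hedged sketch of an update schedule with rollback rules might look like:

```json
{
  "updateSchedule": {
    "checkIntervalHours": 24,
    "window": "02:00-04:00",
    "requireWifi": true
  },
  "rollback": {
    "healthCheckTimeoutSec": 300,
    "maxFailedInferences": 5,
    "keepPreviousModel": true
  }
}
```

Keeping the previous model on disk is what makes rollback cheap: a failed health check after an update can revert without a new download, which matters on intermittently connected devices.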
Tuning Parameters
Quantization level (INT4/INT8) · Model config · Sync schedule · Device memory budget
Estimated Cost
Dev/Test
$20–50/mo
Production
$100–500/mo