One Tool, Many "Brains": Building a Multi-Model DevOps Architect
I'm building an AI-assisted DevOps tool for myself…
🛡️ Introduction — About Omni-Architect
Omni-Architect started as a personal project for a local sports club. We maintain our own website, and I wanted a tool that would let my friends easily automate content updates and handle new deployments, especially when migrating between hosting cloud providers.
It has since evolved into a multi-purpose, AI-native workbench designed to automate the full lifecycle of cloud-native infrastructure. By combining local and cloud-hosted Large Language Models (LLMs), the app bridges the gap between raw application code and a production-grade deployment.
🚀 Current Capabilities
- Infrastructure Generation: automatically produces Dockerfiles, docker-compose.yml files, and Kubernetes manifests for multiple flavors (IKS, GKE, EKS, AKS, Kind, Minikube); see the example below.
- Observability-as-Code: automatically injects an OpenTelemetry (OTel) collector and generates a Prometheus/Grafana monitoring stack.
- Multi-Model Intelligence: whether you run a local Ollama model or an enterprise model such as IBM watsonx or Google Gemini, the app dynamically adapts its output to whichever "brain" (model) you select.
💡 Observation: the choice of model shapes the result significantly. Granite-4, for example, excelled at generating precise Terraform code, while the Llama models I tested leaned toward Pulumi-style infrastructure code.
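Generated artifacts come back from the model as plain text with a simple delimiter before each file, which the app then splits, renders, and offers for download. A hypothetical response (file names and contents here are illustrative only) might look like:
---FILE: Dockerfile---
FROM python:3.12-slim
WORKDIR /app
COPY . .
CMD ["python", "main.py"]
---FILE: docker-compose.yml---
services:
  app:
    build: .
    ports:
      - "8080:8080"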
⚠️ Project Status: Alpha
This project is currently in Alpha. It is under active development, changes frequently, and, frankly, probably has bugs.
- ✅ Tested: local functionality (Ollama integration, local file discovery).
- 🔬 Experimental: cloud provider integrations and the advanced OTel features are still being refined and are not fully validated.
I'm sharing this version to gather feedback and iterate quickly on new features. The source code is open on GitHub, and I look forward to seeing how it evolves!
Code
First, get your environment ready!
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
Install the dependencies:
pip install streamlit ollama watchdog google-generativeai mistralai python-dotenv cryptography requests
Or simply:
pip install -r requirements.txt
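The app loads credentials through python-dotenv, so rather than pasting keys into the sidebar on every run you can keep them in a .env file next to the script. A minimal sketch (the variable names match what the app reads; the values are placeholders):
GEMINI_API_KEY=your-gemini-api-key
WATSONX_API_KEY=your-ibm-iam-api-key
WATSONX_PROJECT_ID=your-watsonx-project-id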
The main application 👨💻
import streamlit as st
import ollama
import os, subprocess, uuid, requests
from pathlib import Path
from dotenv import load_dotenv
load_dotenv()
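# --- 1. CONSTANTS: supported targets and recognized app-code extensions ---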
K8S_FLAVORS = ["Standard (Vanilla)", "Minikube (Local VM)", "Kind (Docker-in-Docker)", "Google (GKE)", "AWS (EKS)", "Azure (AKS)", "IBM (IKS)"]
TF_PROVIDERS = ["AWS", "GCP", "IBM Cloud", "Azure", "Oracle Cloud"]
APP_EXTS = {'.c', '.cpp', '.go', '.php', '.js', '.ts', '.java', '.html', '.sh'}
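# --- 2. SESSION STATE & PAGE CONFIG ---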
if 'state' not in st.session_state:
st.session_state.state = {
'current_dir': os.getcwd(),
'selected_files': [],
'ai_prov': "Local (Ollama)",
'ai_model': "",
'keys': {
'gemini': os.getenv("GEMINI_API_KEY", ""),
'watsonx_api': os.getenv("WATSONX_API_KEY", ""),
'watsonx_project': os.getenv("WATSONX_PROJECT_ID", "")
},
'infra_out': "",
'obs_out': "",
'gen_cache': {}
}
st.set_page_config(page_title="Omni-Architect v41.1", layout="wide", page_icon="🛡️")
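# --- 3. AI HELPERS: model discovery, auth, and generation ---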
@st.cache_data(ttl=10)
def discover_ollama():
    """List locally available Ollama models, falling back to the raw HTTP API."""
    try:
        return [m.model for m in ollama.list().models]
    except Exception:
        try:
            res = requests.get("http://localhost:11434/api/tags", timeout=1)
            return [m['name'] for m in res.json().get('models', [])] if res.status_code == 200 else []
        except Exception:
            return []
def get_watsonx_token(api_key):
    """Exchange an IBM Cloud IAM API key for a short-lived bearer token."""
    url = "https://iam.cloud.ibm.com/identity/token"
    data = f"grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={api_key}"
    try:
        res = requests.post(url, headers={"Content-Type": "application/x-www-form-urlencoded"}, data=data, timeout=10)
        return res.json().get("access_token")
    except Exception:
        return None
def ask_ai(prompt):
prov, model, keys = st.session_state.state['ai_prov'], st.session_state.state['ai_model'], st.session_state.state['keys']
try:
with st.spinner(f"🤖 {prov} is architecting..."):
if prov == "Local (Ollama)":
return ollama.generate(model=model, prompt=prompt)['response']
elif prov == "Google (Gemini)":
import google.generativeai as genai
genai.configure(api_key=keys['gemini'])
return genai.GenerativeModel("gemini-1.5-flash").generate_content(prompt).text
elif prov == "IBM watsonx":
token = get_watsonx_token(keys['watsonx_api'])
url = "https://us-south.ml.cloud.ibm.com/ml/v1/text/generation?version=2023-05-29"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
                body = {
                    # The watsonx text-generation API requires a model_id;
                    # the fallback below is an assumption, adjust it to your deployment
                    "model_id": model or "ibm/granite-13b-instruct-v2",
                    "input": f"<s>[INST] {prompt} [/INST]",
                    "parameters": {"max_new_tokens": 1200},
                    "project_id": keys['watsonx_project']
                }
                return requests.post(url, headers=headers, json=body, timeout=60).json()['results'][0]['generated_text']
except Exception as e: return f"Error: {e}"
def render_registry(text):
"""Universal renderer for AI file blocks with download buttons"""
if not text or "---FILE:" not in text:
st.markdown(text); return
for part in text.split("---FILE:")[1:]:
try:
fname, content = part.strip().split("\n", 1)
fname, content = fname.strip(), content.strip()
st.session_state.state['gen_cache'][fname] = content
with st.container(border=True):
h_col, b_col = st.columns([0.8, 0.2])
h_col.subheader(f"📄 {fname}")
b_col.download_button("📥 Download", content, file_name=fname, key=f"dl_{fname}_{uuid.uuid4().hex}")
st.code(content, language="hcl" if ".tf" in fname else "yaml")
        except Exception:
            continue  # Skip malformed blocks that lack a "filename\ncontent" pair
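# --- 4. SIDEBAR: provider selection and file explorer ---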
with st.sidebar:
st.header("⚙️ Controller")
st.session_state.state['ai_prov'] = st.selectbox("LLM Provider:", ["Local (Ollama)", "IBM watsonx", "Google (Gemini)"])
if st.session_state.state['ai_prov'] == "Local (Ollama)":
models = discover_ollama()
st.session_state.state['ai_model'] = st.selectbox("Local Model:", models) if models else st.text_input("Model Name (Manual):")
elif st.session_state.state['ai_prov'] == "Google (Gemini)":
st.session_state.state['keys']['gemini'] = st.text_input("Gemini Key:", type="password", value=st.session_state.state['keys']['gemini'])
elif st.session_state.state['ai_prov'] == "IBM watsonx":
st.session_state.state['keys']['watsonx_api'] = st.text_input("IAM Key:", type="password", value=st.session_state.state['keys']['watsonx_api'])
st.session_state.state['keys']['watsonx_project'] = st.text_input("Project ID:", value=st.session_state.state['keys']['watsonx_project'])
st.divider()
st.subheader("📂 File Explorer")
c1, c2 = st.columns(2)
if c1.button("⬅️ Up"):
st.session_state.state['current_dir'] = os.path.dirname(st.session_state.state['current_dir']); st.rerun()
if c2.button("🏠 Home"):
st.session_state.state['current_dir'] = os.getcwd(); st.rerun()
try:
items = os.listdir(st.session_state.state['current_dir'])
folders = sorted([f for f in items if os.path.isdir(os.path.join(st.session_state.state['current_dir'], f))])
files = sorted([f for f in items if os.path.isfile(os.path.join(st.session_state.state['current_dir'], f))])
target = st.selectbox("Go to Folder:", ["."] + folders)
if target != ".":
st.session_state.state['current_dir'] = os.path.join(st.session_state.state['current_dir'], target); st.rerun()
st.divider()
use_filter = st.toggle("✨ Smart Filter (App Code)", value=False)
suggested = [f for f in files if Path(f).suffix.lower() in APP_EXTS]
st.session_state.state['selected_files'] = st.multiselect("📑 Select Files:", options=files, default=suggested if use_filter else [])
except Exception as e: st.error(f"IO Error: {e}")
# --- 5. MAIN UI ---
st.title("🛡️ Omni-Architect v41.1")
if not st.session_state.state['selected_files']:
st.info("👈 Use the Explorer to select your project files.")
else:
tabs = st.tabs(["🏗️ Infra & IaC", "🔭 Observability", "🛡️ Security", "🚀 Execution"])
with tabs[0]: # INFRA
col1, col2 = st.columns(2)
strategy = col1.selectbox("Strategy:", ["Dockerfile", "Docker Compose", "Kubernetes Manifests", "Terraform (IaC)"])
flavor = col2.selectbox("Target Flavor:", K8S_FLAVORS if strategy == "Kubernetes Manifests" else (TF_PROVIDERS if strategy == "Terraform (IaC)" else ["N/A"]))
if st.button(f"Generate {strategy}", type="primary", use_container_width=True):
paths = [os.path.join(st.session_state.state['current_dir'], f) for f in st.session_state.state['selected_files']]
st.session_state.state['infra_out'] = ask_ai(f"Write {strategy} for {paths} on {flavor}. Use ---FILE: filename---")
render_registry(st.session_state.state['infra_out'])
with tabs[1]:
st.subheader("🔭 OpenTelemetry Strategy")
obs_mode = st.radio("Choose OTel Pattern:",
["Universal Sidecar (K8s/Infra)", "SDK Implementation (Code-level)"],
horizontal=True)
c1, c2 = st.columns(2)
if c1.button("🧪 Apply Telemetry", type="primary", use_container_width=True):
if obs_mode == "Universal Sidecar (K8s/Infra)":
if not st.session_state.state['infra_out']:
st.error("❌ No Infrastructure found! Please generate K8s Manifests in the 'Infra' tab first.")
else:
prompt = f"Inject an OpenTelemetry Collector sidecar into these K8s manifests: {st.session_state.state['infra_out']}. Use ---FILE: filename---"
st.session_state.state['infra_out'] = ask_ai(prompt)
st.rerun()
else:
prompt = f"Analyze these files: {st.session_state.state['selected_files']}. Rewrite them to implement OTel SDK. Use ---FILE: filename---"
st.session_state.state['obs_out'] = ask_ai(prompt)
st.rerun()
if c2.button("📊 Gen Grafana/Prometheus", use_container_width=True):
st.session_state.state['obs_out'] = ask_ai(f"Generate Prometheus rules and Grafana dashboard for: {st.session_state.state['selected_files']}")
st.rerun()
render_registry(st.session_state.state['obs_out'])
with tabs[2]:
s1, s2 = st.columns(2)
if s1.button("🛡️ Harden Security", use_container_width=True):
st.session_state.state['infra_out'] = ask_ai(f"Apply DevSecOps hardening (non-root, read-only fs, etc) to: {st.session_state.state['infra_out']}")
st.rerun()
if s2.button("💰 FinOps Optimize", use_container_width=True):
st.session_state.state['infra_out'] = ask_ai(f"Optimize CPU/Memory requests and cloud costs for: {st.session_state.state['infra_out']}")
st.rerun()
with tabs[3]: # EXECUTION
cmd = st.text_input("Terminal:", value="ls -la")
if st.button("🚀 Commit & Run Command", type="primary"):
for f, c in st.session_state.state['gen_cache'].items():
with open(os.path.join(st.session_state.state['current_dir'], f), 'w') as file: file.write(c)
res = subprocess.run(cmd, shell=True, capture_output=True, text=True, cwd=st.session_state.state['current_dir'])
st.text_area("Output:", res.stdout if res.returncode == 0 else res.stderr)
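With the dependencies in place, launch the workbench with Streamlit's CLI (assuming you saved the script as omni_architect.py; the filename is up to you):
streamlit run omni_architect.py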
Sample output: what can the AI generate?
When you select the "SDK Implementation (Code-level)" mode, the AI may return code like the following:
---
filename: ai-devops-Omni-Architect_v40_tracing_metrics.py
---
from opentelemetry import metrics, trace
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Resource attributes identify this service in the telemetry backend
resource = Resource.create(
    {
        "service.name": "ai-devops-Omni-Architect",
        "service.version": "v40_tracing_metrics"
    }
)

# Set up OpenTelemetry tracing: spans are batched and shipped over OTLP
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(tracer_provider)

# Set up OpenTelemetry metrics: readings are exported periodically over OTLP
metric_reader = PeriodicExportingMetricReader(OTLPMetricExporter())
metrics.set_meter_provider(MeterProvider(resource=resource, metric_readers=[metric_reader]))
This version initializes the OpenTelemetry SDK for both tracing and metrics, attaches resource attributes such as the service name and version, and configures OTLP exporters to ship span and metric data.
To actually benefit from it, you still need to instrument your business logic, as in the sketch below.
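A minimal sketch of that instrumentation step (the tracer/meter names, span label, and attributes here are hypothetical; by default the OTLP exporters send to localhost:4317, configurable via OTEL_EXPORTER_OTLP_ENDPOINT):
# Acquire a tracer and a meter from the globally configured providers
tracer = trace.get_tracer("omni-architect")
meter = metrics.get_meter("omni-architect")
request_counter = meter.create_counter("app.requests", description="Handled requests")

# Wrap a unit of business logic in a span and record a metric alongside it
with tracer.start_as_current_span("handle-request"):
    request_counter.add(1, {"route": "/deploy"})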
🎯 What’s next — Conclusion & Future Roadmap
My vision for Omni-Architect is to keep strengthening the platform step by step. Right now I'm exploring IBM's Project Bob as a way to refactor and modernize the entire application architecture.