All Posts
- Industry Analysis
AI Agents in Manufacturing and Supply Chain 2026
Balys Kriksciunas • #ai #agents #enterprise #manufacturing #supply-chain
- Industry Analysis
Enterprise AI Agent ROI: The 2026 Reality Check
Balys Kriksciunas • #ai #agents #enterprise #roi #industry-analysis
- Tutorials
Agent Eval Tutorial 2026: DeepEval + LangSmith Guide
Balys Kriksciunas • #ai #agents #tutorial #evaluation #deepeval
- Industry Analysis
CISA's AI Agent Warning and What It Means for Your Stack
Balys Kriksciunas • #ai #agents #news #security #governance
- Industry Analysis
Enterprise Platforms Go Agent-Native: May 2026
Balys Kriksciunas • #ai #agents #news #enterprise #salesforce
- Industry Analysis
AI Agent Platforms: May 2026 Updates
Andrius Putna • #ai #agents #recap #industry-analysis #openai
- Industry Analysis
What April's AI Agent Launches Mean for 2026
Balys Kriksciunas • #ai #agents #recap #enterprise #industry-analysis
- Deep Dives
The AI Agent Protocol Stack: MCP, A2A & What Comes Next
Andrius Putna • #ai #agents #infrastructure #protocols #mcp
- Deep Dives
Perplexity Deep Research: From Search to Infrastructure
Balys Kriksciunas • #ai #agents #infrastructure #research #perplexity
- Deep Dives
AI Agent Governance: The 2026 Deep Dive
Balys Kriksciunas • #ai #agents #deep-dive #governance #security
- Deep Dives
Complete Guide to AI Agent Frameworks 2026
Andrius Putna • #ai #agents #frameworks #deep-dive #langgraph
- Comparisons
LangChain vs LlamaIndex vs Semantic Kernel 2026
Balys Kriksciunas • #ai #agents #langchain #llamaindex #semantic-kernel
- Comparisons
vLLM vs SGLang: Inference Engine Comparison 2026
Balys Kriksciunas • #ai #infrastructure #vllm #sglang #comparison
- Industry Analysis
Answer Engine Optimization (AEO): The 2026 Guide
Andrius Putna • #ai #agents #enterprise #seo #answer-engine
- Industry Analysis
Enterprise AI Agents: The Real TCO Nobody Talks About
Balys Kriksciunas • #ai #agents #enterprise #deployment #cost-analysis
- Tutorials
Build a Retail AI Agent with LangGraph: Inventory & Orders
Balys Kriksciunas • #ai #agents #tutorial #langgraph #retail
- Tutorials
LangGraph Human-in-the-Loop: Interrupt Patterns in Python
Balys Kriksciunas • #ai #agents #langgraph #tutorial #python
- Industry Analysis
AI Agent Platform Updates: April 2026 News
Andrius Putna • #ai #agents #news #google #openai
- Industry Analysis
Agent Governance: Secure, Observe, and Deploy AI Agents in Production
Balys Kriksciunas • #ai #agents #news #governance #security
- Industry Analysis
Google AI Studio 2026: All Gemini Models + Free Tier
Balys Kriksciunas • #ai #google #gemini #ai-studio #models
- Industry Analysis
LangSmith vs Langfuse vs Arize Phoenix: LLM Observability in 2026
Balys Kriksciunas • #ai #agents #observability #langsmith #langfuse
- Deep Dives
State of AI Infrastructure 2026: Mid-Year Reality Check
Balys Kriksciunas • #ai #infrastructure #state-of-industry #2026 #analysis
- Deep Dives
OpenAI Agents SDK: Deep Dive for Production Agent Builders
Andrius Putna • #ai #agents #deep-dive #openai #framework
- Deep Dives
Model Context Protocol (MCP): Agent Builder's Guide
Andrius Putna • #ai #agents #mcp #deep-dive #infrastructure
- Comparisons
Cursor vs Claude Code: Which AI Coding Agent Wins in 2026?
Andrius Putna • #ai #agents #comparison #review #cursor
- Industry Analysis
AI Browser Agents Compared: Operator, Comet, Claude & Nova Act
Andrius Putna • #ai #agents #enterprise #browser-agents #automation
- Industry Analysis
State of AI Agents in Enterprise: Adoption Trends and Barriers in 2026
Andrius Putna • #ai #agents #enterprise #adoption #industry-analysis
- Infrastructure
Building an AI Platform Team: Roles, Tools, and Rituals
Balys Kriksciunas • #ai #infrastructure #platform-engineering #team #hiring
- Infrastructure
GPU FinOps: Reducing Your $10M AI Compute Bill
Balys Kriksciunas • #ai #infrastructure #finops #gpu #cost
- Infrastructure
Disaggregated Inference: 30–50% Throughput Wins
Balys Kriksciunas • #ai #infrastructure #inference #disaggregation #prefill
- Infrastructure
Multi-Agent Orchestration Infrastructure: Lessons from Production
Balys Kriksciunas • #ai #infrastructure #multi-agent #orchestration #crewai
- Infrastructure
Context Engineering: Storage, Retrieval, and the New Memory Stack
Balys Kriksciunas • #ai #infrastructure #context-engineering #memory #rag
- Infrastructure
Agent Infrastructure: What's Different from LLM Serving
Balys Kriksciunas • #ai #infrastructure #agents #orchestration #mcp
- Infrastructure
Inference at the Edge: Running LLMs on Consumer GPUs
Balys Kriksciunas • #ai #infrastructure #edge-ai #on-device #consumer-gpu
- Infrastructure
Running Sovereign AI: EU and India Infrastructure Playbooks
Balys Kriksciunas • #ai #infrastructure #sovereignty #eu-ai-act #india
- Infrastructure
MI300X vs H100: AMD's Bet on Inference
Balys Kriksciunas • #ai #infrastructure #gpu #amd #mi300x
- AI Tools
Perplexity AI in 2026: Pro, Deep Research, Comet & API
Andrius Putna • #ai #perplexity #search #research #tools
- AI Tools
Google AI Tools 2026: Stitch, Opal, Gemini & More
Andrius Putna • #ai #google #tools #gemini #notebooklm
- Infrastructure
The AI Infrastructure Stack: 2026 Edition
Balys Kriksciunas • #ai #infrastructure #state-of-industry #analysis #trends
- Tutorials
Claude Code Subagents: Parallel Multi-Agent Workflows
Andrius Putna • #ai #agents #claude-code #multi-agent #subagents
- Infrastructure
NVIDIA B200 vs H100: Should You Upgrade?
Balys Kriksciunas • #ai #infrastructure #gpu #nvidia #b200
- Infrastructure
Model Evals in Production: Regression Testing Prompts
Balys Kriksciunas • #ai #infrastructure #evals #testing #llm-quality
- Infrastructure
LoRA, QLoRA, and PEFT: The Fine-Tuning Infrastructure Guide
Balys Kriksciunas • #ai #infrastructure #lora #qlora #peft
- Infrastructure
Securing RAG Pipelines: Prompt Injection via Data
Balys Kriksciunas • #ai #infrastructure #security #rag #prompt-injection
- AI Tools
Terminal AI Code Consoles: Claude Code, Gemini Code, and OpenAI Codex
Andrius Putna • #ai #cli #terminal #coding #development
- Infrastructure
Hybrid Search in Production: BM25 + Dense Retrieval
Balys Kriksciunas • #ai #infrastructure #hybrid-search #bm25 #rag
- Infrastructure
Ray Serve vs Kubernetes for Model Serving
Balys Kriksciunas • #ai #infrastructure #ray #ray-serve #kubernetes
- Infrastructure
AI FinOps: Tracking Token Spend Across Your Org
Balys Kriksciunas • #ai #infrastructure #finops #cost #tokens
- Infrastructure
KV Cache Optimization Techniques for LLM Serving
Balys Kriksciunas • #ai #infrastructure #kv-cache #inference #vllm
- Infrastructure
Speculative Decoding for Production LLMs
Balys Kriksciunas • #ai #infrastructure #speculative-decoding #inference #latency
- Infrastructure
LLM Gateway Patterns: LiteLLM, Portkey, and Kong AI
Balys Kriksciunas • #ai #infrastructure #llm-gateway #litellm #portkey
- Infrastructure
FP8 and Quantization: Serving LLMs at Half the Cost
Balys Kriksciunas • #ai #infrastructure #fp8 #quantization #awq
- Infrastructure
pgvector at Scale: When Postgres Is Enough
Balys Kriksciunas • #ai #infrastructure #pgvector #postgres #vector-database
- Infrastructure
vLLM vs TGI vs Triton: LLM Inference Server Benchmarks
Balys Kriksciunas • #ai #infrastructure #vllm #tgi #triton
- Infrastructure
Multi-Cloud GPU Strategy: Avoiding Lock-in and Saving 40%
Balys Kriksciunas • #ai #infrastructure #multi-cloud #gpu #lockin
- Infrastructure
The State of AI Infrastructure 2025
Balys Kriksciunas • #ai #infrastructure #state-of-industry #2025 #analysis
- News
AI Agents Weekly: December 2024 Week 4 - Year-End Retrospective
Andrius Putna • #ai #agents #news #retrospective #2024
- Deep Dives
Testing and Evaluating AI Agents: Metrics, Benchmarks, and Quality Assurance
Andrius Putna • #ai #agents #testing #evaluation #metrics
- Comparisons
Semantic Kernel vs LangChain: Enterprise Framework Comparison
Andrius Putna • #ai #agents #semantic-kernel #langchain #microsoft
- Deep Dives
Multi-Agent Collaboration Patterns: Hierarchical, Peer-to-Peer, and Hybrid Architectures
Andrius Putna • #ai #agents #multi-agent #architecture #collaboration
- Industry
AI Agents Transforming Fintech: Fraud Detection, Trading, Customer Service, and Compliance
Andrius Putna • #ai #agents #fintech #fraud-detection #trading
- News
AI Agents Weekly: December 2024 Week 3 - MCP Momentum and Agent Orchestration
Andrius Putna • #ai #agents #news #mcp #microsoft
- Comparisons
OpenAI Assistants API vs Claude MCP: Two Approaches to Building AI Agents
Andrius Putna • #ai #agents #openai #claude #mcp
- Tutorials
Deploying AI Agents to Production: A Comprehensive Guide
Andrius Putna • #ai #agents #production #deployment #monitoring
- Industry
AI Agents in Healthcare: Clinical Decision Support, Patient Engagement, and Administrative Automation
Andrius Putna • #ai #agents #healthcare #clinical-decision-support #patient-engagement
- Tutorials
LangChain @tool Decorator: Build Custom Agent Tools
Andrius Putna • #ai #agents #langchain #tools #tutorial
- Deep Dives
Understanding Agent Memory Systems: Short-Term, Long-Term, and Episodic
Andrius Putna • #ai #agents #memory #architecture #langchain
- Comparisons
LangChain vs LlamaIndex: Which Framework for Building AI Agents?
Andrius Putna • #ai #agents #langchain #llamaindex #comparison
- Industry
How AI Agents Are Revolutionizing Customer Service: Real-World Case Studies
Andrius Putna • #ai #agents #customer-service #enterprise #automation
- Tutorials
Build a RAG Agent with LangChain: Complete Tutorial
Andrius Putna • #ai #agents #langchain #rag #tutorial
- Deep Dives
The Future of Autonomous Coding Agents: From Devin to Claude Code
Andrius Putna • #ai #agents #coding #devin #openhands
- News
AI Agents Weekly: December 2024 Week 2 - Production Deployments and Safety Advances
Andrius Putna • #ai #agents #news #gemini #anthropic
- Comparisons
AutoGen vs CrewAI: Choosing the Right Multi-Agent Framework
Andrius Putna • #ai #agents #autogen #crewai #multi-agent
- Industry
The State of AI Agents in Enterprise: Adoption Trends and Barriers in 2024
Andrius Putna • #ai #agents #enterprise #adoption #industry-analysis
- Tutorials
LangGraph Tutorial: Build Your First AI Agent in Python
Andrius Putna • #ai #agents #langgraph #tutorial #python
- Tutorials
Framework Deep Dive: CrewAI - Role-Based Multi-Agent Orchestration
Andrius Putna • #ai #agents #crewai #python #framework
- Guides
Building Production AI Agents: The Complete Guide from Prototype to Deployment
Andrius Putna • #ai #agents #production #deployment #infrastructure
- Coding Agents
Qwen Code by Alibaba: Open-Source Terminal Coding Agent
Andrius Putna • #ai #agents #coding #qwen #alibaba
- Coding Agents
OpenCode: The Open Source AI Coding Agent
Andrius Putna • #ai #agents #coding #opencode #open-source
- Guides
AI Agents Glossary: Essential Terms & Concepts
Andrius Putna • #ai #agents #glossary #terminology #concepts
- Coding Agents
OpenAI Codex CLI: Terminal Coding Agent Deep Dive
Andrius Putna • #ai #agents #coding #openai #codex
- Tutorials
Framework Deep Dive: AutoGen - Multi-Agent Collaboration Through Conversation
Andrius Putna • #ai #agents #autogen #microsoft #python
- Coding Agents
OpenHands: The Leading Open Source AI Coding Agent
Andrius Putna • #ai #agents #coding #openhands #open-source
- Coding Agents
Gemini CLI: Google's Command-Line AI Coding Agent
Andrius Putna • #ai #agents #coding #gemini #google
- Coding Agents
GitHub Copilot: Microsoft's AI-Powered Coding Assistant
Andrius Putna • #ai #agents #coding #copilot #microsoft
- Tutorials
Framework Deep Dive: LangChain - The Foundation of Modern AI Agents
Andrius Putna • #ai #agents #langchain #python #framework
- Coding Agents
Claude Code: Anthropic's Integrated AI Coding Agent
Andrius Putna • #ai #agents #coding #claude #anthropic
- News
AI Agents Weekly: December 2024 Framework Updates and Industry News
Andrius Putna • #ai #agents #news #mcp #openai
- Coding Agents
Aider: Open-Source AI Pair Programmer for Terminal
Andrius Putna • #ai #agents #coding #aider #pair-programming
- Guides
The Complete Guide to AI Agent Frameworks in 2024
Andrius Putna • #ai #agents #frameworks #langchain #autogen
- Infrastructure
Self-Hosting Llama 3: A Production Deployment Guide
Balys Kriksciunas • #ai #infrastructure #llama #self-hosting #inference
- Infrastructure
Tracing LLM Applications with OpenTelemetry
Balys Kriksciunas • #ai #infrastructure #observability #opentelemetry #tracing
- Infrastructure
GPU Cloud Comparison: CoreWeave, Runpod, Lambda
Balys Kriksciunas • #ai #infrastructure #gpu-cloud #coreweave #lambda
- AI Tools
Awesome AI Tools
Andrius Putna • #ai #tools #opensource
- Infrastructure
PagedAttention Explained: How vLLM Achieves 24x Throughput
Balys Kriksciunas • #ai #infrastructure #vllm #paged-attention #kv-cache
- Infrastructure
Continuous Batching for LLMs: Why It Matters
Balys Kriksciunas • #ai #infrastructure #inference #batching #vllm
- Infrastructure
Kubernetes for GPU Workloads: A Primer
Balys Kriksciunas • #ai #infrastructure #kubernetes #gpu #mig
- Infrastructure
Choosing a Vector Database in 2024: A Practical Guide
Balys Kriksciunas • #ai #infrastructure #vector-database #pinecone #qdrant
- Infrastructure
vLLM: The Open-Source Inference Engine Changing LLM Serving
Balys Kriksciunas • #ai #infrastructure #inference #vllm #llm-serving
- Infrastructure
NVIDIA H100 vs A100: Which GPU Should You Deploy?
Balys Kriksciunas • #ai #infrastructure #gpu #nvidia #h100
- Infrastructure
The AI Infrastructure Stack Explained (2024)
Balys Kriksciunas • #ai #infrastructure #llm #gpu #inference