philschmid.de - RSS feed
The 10 Steps for product AI generation with Gemini 2.5 Flash
Memory in Agents: Make LLMs remember
Google Gemini CLI Cheatsheet
Code Sandbox MCP: A Simple Code Interpreter for Your AI Agents
Integrating Long-Term Memory with Gemini 2.5
The New Skill in AI is Not Prompting, It's Context Engineering
Single vs Multi-Agent System?
Zero to One: Learning Agentic Patterns
Google Gemini LangChain Cheatsheet
OpenAI Codex CLI: how does it work?
Model Context Protocol (MCP): an overview
ReAct agent from scratch with Gemini 2.5 and LangGraph
Pass@k vs Pass^k: Understanding Agent Reliability
Google Gemma 3 Function Calling Example
Function Calling Guide: Google DeepMind Gemini 2.0 Flash
From PDFs to Insights: Structured Outputs from PDFs with Gemini 2.0
Mini-R1: Reproduce the DeepSeek R1 "aha moment", an RL tutorial
How to align open LLMs in 2025 with DPO and synthetic data
Bite: How DeepSeek R1 was trained
How to use Anthropic MCP Server with open LLMs, OpenAI or Google Gemini
Fine-tune classifier with ModernBERT in 2025
How to fine-tune open LLMs in 2025 with Hugging Face
Deploy QwQ-32B-Preview, the best open Reasoning Model, on AWS with Hugging Face
Deploy Llama 3.2 Vision on Amazon SageMaker
How to Fine-Tune Multimodal Models or VLMs with Hugging Face TRL
Evaluate open LLMs with Vertex AI and Gemini
Evaluate LLMs using Evaluation Harness and Hugging Face TGI/vLLM
Deploy open LLMs with Terraform and Amazon SageMaker
LLM Evaluation doesn't need to be complicated
Evaluating Open LLMs with MixEval: The Closest Benchmark to LMSYS Chatbot Arena
Train and Deploy open Embedding Models on Amazon SageMaker
Deploy Mixtral 8x7B on AWS Inferentia2 with Hugging Face Optimum
Fine-tune Llama 3 with PyTorch FSDP and Q-Lora on Amazon SageMaker
Fine-tune Embedding models for Retrieval Augmented Generation (RAG)
Understanding the Cost of Generative AI Models in Production
Deploy Llama 3 70B on AWS Inferentia2 with Hugging Face Optimum
Deploy open LLMs with vLLM on Hugging Face Inference Endpoints
Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora
Deploy Llama 3 on Amazon SageMaker
Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker
Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum
Fine-Tune and Evaluate LLMs in 2024 with Amazon SageMaker
Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker
How to fine-tune Google Gemma with ChatML and Hugging Face TRL
RLHF in 2024 with DPO and Hugging Face
How to Fine-Tune LLMs in 2024 with Hugging Face
Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints
Fine-tune Llama 7B on AWS Trainium
Programmatically manage 🤗 Inference Endpoints
Deploy Mixtral 8x7B on Amazon SageMaker
Deploy Embedding Models on AWS Inferentia2 with Amazon SageMaker
Deploy Llama 2 7B on AWS Inferentia2 with Amazon SageMaker
Deploy Stable Diffusion XL on AWS Inferentia2 with Amazon SageMaker
Amazon Bedrock: How good (bad) is Titan Embeddings?
Evaluate LLMs and RAG a practical example using Langchain and Hugging Face
Deploy Idefics 9B and 80B on Amazon SageMaker
Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker
Llama 2 on Amazon SageMaker a Benchmark
Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA and Flash Attention
Fine-tune Falcon 180B with QLoRA and Flash Attention on Amazon SageMaker
Deploy Falcon 180B on Amazon SageMaker
Optimize open LLMs using GPTQ and Hugging Face Optimum
LLMOps: Deploy Open LLMs using Infrastructure as Code with AWS CDK
Deploy Llama 2 7B/13B/70B on Amazon SageMaker
Introducing EasyLLM - streamline open LLMs
Extended Guide: Instruction-tune Llama 2
LLaMA 2 - Every Resource you need
Fine-tune LLaMA 2 (7-70B) on Amazon SageMaker
Train LLMs using QLoRA on Amazon SageMaker
Deploy LLMs with Hugging Face Inference Endpoints
Optimize and Deploy BERT on AWS Inferentia2
Securely deploy LLMs inside VPCs with Hugging Face and Amazon SageMaker
Deploy Falcon 7B and 40B on Amazon SageMaker
Fine-tune BERT for Text Classification on AWS Trainium
Introducing the Hugging Face LLM Inference Container for Amazon SageMaker
Generative AI for Document Understanding with Hugging Face and Amazon SageMaker
How to scale LLM workloads to 20B+ with Amazon SageMaker using Hugging Face and PyTorch FSDP
Setting up AWS Trainium for Hugging Face Transformers
Train and Deploy BLOOM with Amazon SageMaker and PEFT
Introducing IGEL, an instruction-tuned German Large Language Model
Efficient Large Language Model training with LoRA and Hugging Face
Deploy FLAN-UL2 20B on Amazon SageMaker
Getting started with PyTorch 2.0 and Hugging Face Transformers
Controlled text-to-image generation with ControlNet on Inference Endpoints
Combine Amazon SageMaker and DeepSpeed to fine-tune FLAN-T5 XXL
Fine-tune FLAN-T5 XL/XXL using DeepSpeed and Hugging Face Transformers
Deploy FLAN-T5 XXL on Amazon SageMaker
Hugging Face Transformers Examples
Getting started with Transformers and TPU using PyTorch
Fine-tune FLAN-T5 for chat and dialogue summarization
Managed Transcription with OpenAI Whisper and Hugging Face Inference Endpoints
Stable Diffusion Inpainting example with Hugging Face Inference Endpoints
Stable Diffusion with Hugging Face Inference Endpoints
Document AI: LiLT a better language agnostic LayoutLM model
Multi-Model GPU Inference with Hugging Face Inference Endpoints
Serverless Machine Learning Applications with Hugging Face Gradio and AWS Lambda
Accelerate Stable Diffusion inference with DeepSpeed-Inference on GPUs
Stable Diffusion on Amazon SageMaker
Deploy T5 11B for inference for less than $500
Outperform OpenAI GPT-3 with SetFit for text-classification
Fine-tuning LayoutLM for document-understanding using Keras and Hugging Face Transformers
Deploy LayoutLM with Hugging Face Inference Endpoints
Document AI: Fine-tuning LayoutLM for document-understanding using Hugging Face Transformers
Custom Inference with Hugging Face Inference Endpoints
Accelerate GPT-J inference with DeepSpeed-Inference on GPUs
Document AI: Fine-tuning Donut for document-parsing using Hugging Face Transformers
Use Sentence Transformers with TensorFlow
Pre-Training BERT with Hugging Face Transformers and Habana Gaudi
Accelerate BERT inference with DeepSpeed-Inference on GPUs
Accelerate Sentence Transformers with Hugging Face Optimum
Deep Learning setup made easy with EC2 Remote Runner and Habana Gaudi
Accelerate Vision Transformer (ViT) with Quantization using Optimum
Optimizing Transformers for GPUs with Optimum
Hugging Face Transformers and Habana Gaudi AWS DL1 Instances
Optimizing Transformers with Hugging Face Optimum
Convert Transformers to ONNX with Hugging Face Optimum
Setup Deep Learning environment for Hugging Face Transformers with Habana Gaudi on AWS
Static Quantization with Hugging Face `optimum` for ~3x latency improvements
Advanced PII detection and anonymization with Hugging Face Transformers and Amazon SageMaker
An Amazon SageMaker Inference comparison with Hugging Face Transformers
Semantic Segmentation with Hugging Face's Transformers and Amazon SageMaker
Automatic Speech Recognition with Hugging Face's Transformers and Amazon SageMaker
Serverless Inference with Hugging Face's Transformers, DistilBERT and Amazon SageMaker
Accelerated document embeddings with Hugging Face Transformers and AWS Inferentia
Save up to 90% training cost with AWS Spot Instances and Hugging Face Transformers
Speed up BERT inference with Hugging Face Transformers and AWS Inferentia
Creating document embeddings with Hugging Face's Transformers and Amazon SageMaker
Autoscaling BERT with Hugging Face Transformers, Amazon SageMaker and Terraform module
Multi-Container Endpoints with Hugging Face Transformers and Amazon SageMaker
Asynchronous Inference with Hugging Face Transformers and Amazon SageMaker
Deploy BERT with Hugging Face Transformers, Amazon SageMaker and Terraform module
Task-specific knowledge distillation for BERT using Transformers and Amazon SageMaker
Distributed training on multilingual BERT with Hugging Face Transformers and Amazon SageMaker
Financial Text Summarization with Hugging Face Transformers, Keras and Amazon SageMaker
Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker
Image Classification with Hugging Face Transformers and `Keras`
Workshop: Enterprise-Scale NLP with Hugging Face and Amazon SageMaker
Hugging Face Transformers with Keras: Fine-tune a non-English BERT for Named Entity Recognition
New Serverless Transformers using Amazon SageMaker Serverless Inference and Hugging Face
Hugging Face Transformers BERT fine-tuning using Amazon SageMaker and Training Compiler
MLOps: Using the Hugging Face Hub as model registry with Amazon SageMaker
A remote guide to re:Invent 2021 machine learning sessions
MLOps: End-to-End Hugging Face Transformers with the Hub and SageMaker Pipelines
Going Production: Auto-scaling Hugging Face Transformers with Amazon SageMaker
Deploy BigScience T0_3B to AWS and Amazon SageMaker
Scalable, Secure Hugging Face Transformer Endpoints with Amazon SageMaker, AWS Lambda, and CDK
Few-shot learning in practice with GPT-Neo
Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker
Multilingual Serverless XLM RoBERTa with Hugging Face, AWS Lambda
Serverless BERT with Hugging Face, AWS Lambda, and Docker
AWS Lambda with custom docker images as runtime
New Serverless BERT with Hugging Face, AWS Lambda, and AWS EFS
efsync: my first open-source MLOps toolkit
My path to becoming a certified Solutions Architect
Create a custom GitHub Action in 4 steps
Fine-tune a non-English GPT-2 Model with Hugging Face
Mount your AWS EFS volume into AWS Lambda with the Serverless Framework
Serverless BERT with Hugging Face and AWS Lambda
How to use Google Tag Manager and Google Analytics without Cookies
BERT Text Classification in a different language
Scaling Machine Learning from ZERO to HERO
Getting Started with AutoML and AWS AutoGluon
K-Fold as Cross-Validation with a BERT Text-Classification Example
How to Set Up a CI/CD Pipeline for AWS Lambda With GitHub Actions and Serverless
Set up a CI/CD Pipeline for your Web app on AWS with GitHub Actions
Getting started with CNNs by calculating LeNet-Layer manually
Google Colab the free GPU/TPU Jupyter Notebook Service