Rajesh P

Rajesh P writes about building scalable, secure, and high-performance backend systems. His articles cover Spring Boot, API design and versioning, system design fundamentals, and modern GenAI concepts like rerankers, LLM limits, and latency optimization.

https://belowthemalt.wordpress.com

2/5/2026

Spring Boot, API Design, System Design, Backend Engineering, Generative AI

Articles from this Blog

10 articles from this blog

12/30/2025 • EN

@MatrixVariable in Spring Boot: When and How to Use It

Explains how to use Spring Boot's @MatrixVariable annotation for embedding key-value parameters in URL path segments, with practical examples.

spring boot rest api url parameters

12/30/2025 • EN

API Versioning & What's New in Spring Boot 4

Explains API versioning concepts and details the new first-class versioning support introduced in Spring Boot 4 (Spring Framework 7).

spring boot rest api Backend Development

12/30/2025 • EN

Securing Spring Boot APIs — Best Practices with Practical Examples

A practical guide to implementing essential API security best practices in Spring Boot, including HTTPS, JWT authentication, authorization, and rate limiting.

authentication spring boot authorization

12/9/2025 • EN

Under the Hood of Rerankers: Scoring, Models, and Trade-Offs

A technical deep dive into how AI rerankers work, explaining their scoring mechanisms, model architectures, and implementation trade-offs.

search algorithms Reranking Information Retrieval

9/11/2025 • EN

Understanding Re-Rankers: The Key to Smarter Search Results

Explains how rerankers improve search and AI results by reordering retrieved documents for better precision and relevance.

Machine Learning genai Search

9/11/2025 • EN

LLMs, Token Limits, and Handling Concurrent Requests

Explains LLM API token limits (TPM) and strategies for managing concurrent requests to avoid rate limiting in production applications.

api llm concurrency

9/2/2025 • EN

Understanding the P95/P99 Latency Principle: Why the Slowest Requests Matter Most

Explains why P95 and P99 latency metrics are crucial for understanding real user experience, not just average response times.

performance monitoring latency observability

9/2/2025 • EN

Little’s Law and Concurrency: Why Your System Gets Slow When It’s Busy

Explains Little's Law from queuing theory and how it applies to system performance, showing why latency increases cause concurrency to balloon under load.

latency concurrency throughput

8/17/2025 • EN

Docker Compose for AI Agents, Part 2: Operate & Evolve

Part 2 of a guide on using Docker Compose to enhance the reliability and portability of AI agents, focusing on Dockerfile and compose.yaml.

DevOps docker fastapi

8/17/2025 • EN

Docker Compose for AI Agents, Part 1: Build & Run

A tutorial on using Docker Compose to create reproducible, containerized runtime environments for AI agents, focusing on a weather query example.

Python docker docker-compose
