Building LLMs is probably not going to be a brilliant business
Analyzes why building Large Language Models (LLMs) may be a poor business, comparing the AI industry's structure to historically unprofitable sectors like airlines.
Cal Paterson — London-born developer living in Helsinki, creator of csvbase and Python community organizer with a strong interest in data and open collaboration.
22 articles from this blog
Analyzes Tesla's high market cap vs. Toyota, explaining why enterprise value (including debt) is a more complete measure of a company's true worth.
Explains why Amazon S3, despite being used for files, is fundamentally not a filesystem, contrasting its design with the Unix file API and deep vs. shallow modules.
Analyzes the limited practical applications of blockchain technology, arguing it's not a general-purpose solution and is inefficient compared to centralized databases.
Explores the unique, proprietary Python ecosystems used within major investment banks, detailing their unconventional architecture and tools.
Explores the shift from products to subscription models in tech, using printers and milk delivery as examples to explain capex vs. opex.
Critiques modern search engines' reliance on structured metadata over advanced AI for tasks like indexing and content understanding.
Explores caching strategies that avoid TTLs and maintain correctness through active invalidation and update-on-write techniques.
Analyzes how server location and CDNs impact website speed using real latency data and access logs.
Analysis of Mozilla's financial struggles, declining Firefox usage, and controversial executive pay raises despite major workforce cuts.
A threat modeling case study using bicycle theft to illustrate security principles applicable to IT systems.
Benchmark analysis shows async Python web frameworks often have worse throughput and higher latency variance than synchronous alternatives under realistic conditions.
Argues against clearing the database between automated tests, citing speed, correctness, and parallelism benefits.
Critique of the Active Record pattern, explaining its inefficiencies in data access and performance issues in applications and APIs.
Practical tips for integrating mypy static type checking into existing Python projects, covering setup, manual type hints, and handling Optional types.
Analysis comparing cloud provider pricing for VMs and storage, finding Amazon, Google, and Microsoft charge a premium vs. DigitalOcean, Linode, and OVH.
Explains the internal workings of SQL databases, covering data structures like tables, indexes, rows, and pages for both storage and querying.
Discusses common pitfalls and challenges when using non-relational databases, focusing on difficulties in changing primary keys and access patterns.
Practical advice for software teams to reduce merge conflicts and improve collaboration using version control best practices.
Guide to configuring pip to cache compiled Python wheels for faster installation of C-based libraries, saving time on repeated builds.