I'm swearing off APIs entirely
A developer explains why they are giving up on building apps that rely on external APIs due to access issues, ethical concerns, and platform risks.
Release notes for version 1.9 of the shot-scraper CLI tool, featuring a new -x option to extract page resources and fixes to the accessibility command.
An analysis of the All The Places open-source location data project, detailing the technical setup and the process for downloading and examining millions of brand locations.
A technical analysis of Claude Code's WebFetch and WebSearch tools, detailing their internal architecture and processing pipelines.
Discover an undocumented trick to get xkcd comics at double resolution using a simple URL modification and a Python script to check availability.
Discusses the trend of websites walling off content from AI bots, arguing it undermines open internet principles and may concentrate power.
A security researcher discovers goHardDrive exposed thousands of customer records via an insecure RMA status check form with no authentication.
Explores the ethics of LLM training data and proposes a technical method to poison AI crawlers using nofollow links.
Part two of building a personal recommendation system, covering data collection from Pocket and content extraction using the Jina Reader API.
A developer's frustration with aggressive LLM crawlers causing outages and consuming resources, detailing past abuse like crypto mining and Go module mirror issues.
Explains the LLMs.txt file, a new standard for providing context and metadata to Large Language Models to improve accuracy and reduce hallucinations.
A guide to using browser-use, a scriptable AI agent built with Playwright and LLMs to automate repetitive browser tasks.
Explores using Bing Search API to ground LLM responses for website assistants, comparing custom implementation with Azure AI Agent Service.
A technical tutorial on creating interactive data tables by web-scraping with R's rvest package and styling with reactable.
Cloudflare now offers a simple setting to block AI bots from scraping your website, available even on free plans.
Argues for an evolved robots.txt standard with AI-specific rules, backed by regulations to enforce them, citing Perplexity AI's violations.
A guide to installing and configuring Playwright for browser automation on Heroku using Node.js, including dependency management and code structure.
How to automatically check internal links on a static site using Scrapy and GitHub Actions for continuous integration.
A technical analysis of UK rainfall data, covering data scraping, visualization, and processing using Python and APIs.
A technical walkthrough of scraping and visualizing global airline passenger route data using Python, DuckDB, and QGIS.