shot-scraper 1.9
shot-scraper 1.9 CLI tool released, featuring a new -x option to extract page resources and accessibility command fixes.
Discover an undocumented trick to get xkcd comics at double resolution using a simple URL modification and a Python script to check availability.
Discusses the trend of websites walling off content from AI bots, arguing it undermines open internet principles and may concentrate power.
A security researcher discovers goHardDrive exposed thousands of customer records via an insecure RMA status check form with no authentication.
Part two of building a personal recommendation system, covering data collection from Pocket and content extraction using the Jina Reader API.
A guide to using browser-use, a scriptable AI agent built with Playwright and LLMs to automate repetitive browser tasks.
A technical tutorial on creating interactive data tables by web-scraping with R's rvest package and styling with reactable.
Cloudflare now offers a simple setting to block AI bots from scraping your website, available even on free plans.
Argues for an evolved robots.txt standard with AI-specific rules and regulations to enforce them, citing Perplexity AI's violations.
How to automatically check internal links on a static site using Scrapy and GitHub Actions for continuous integration.
A programmer's guide to automating a badminton court booking system using Selenium and Python to secure time slots.
A technical tutorial on using R and the rvest package to scrape data from multiple web pages, including handling pagination.
A tutorial on building and scheduling a Python web scraper to run automatically using GitHub Actions, including emailing results.
Learn how to monitor webpage changes using Home Assistant, checking ETag headers or content hashes to trigger automations.
A guide on how to automate posting Instagram Stories using Python and an unofficial API library, including code examples.
A JavaScript snippet to download multiple images from a web page with a timeout to manage browser limitations.
A developer details a tricky middleware bug in a Clojure web scraping framework that caused character encoding issues due to header casing.
A developer creates a bookmarklet to bypass Instagram's login prompt and re-enable scrolling on the web version.
A personal request for help finding a short-term sublet in NYC after a last-minute cancellation.
A tutorial on using Puppeteer to crawl website articles, extract content, and convert it into Markdown files for a static site migration.