Rough Experiments with Llamafile and LLaVA 1.5
A developer experiments with Llamafile and LLaVA 1.5 to extract structured data from comedy show posters, testing its accuracy and JSON output capabilities.
Building an image search system using GPT-4 Vision and Azure AI to find images via text queries or similar pictures.
A technical guide on using Meta AI's Segment Anything model to perform object segmentation on satellite imagery from Maxar.
A weekly tech learning digest covering Microsoft Fabric, AI topics, computer vision, Azure AI Document Intelligence, embeddings, and vector search.
Interview with Frank Liu on vector databases, embeddings, his career in ML/hardware, and work culture differences between China and the US.
Explains the Supercells algorithm for generating superpixels to improve segmentation of geospatial and satellite imagery.
Explores a future AI-assisted computer interface model inspired by sci-fi, where AI highlights data anomalies for human specialist review.
A guide to using Hugging Face Transformers library with examples for fine-tuning models like BERT and BART for NLP and computer vision tasks.
A review of the top 10 most influential machine learning papers from 2022, including ConvNeXt and MaxViT, with technical analysis.
A tutorial on fine-tuning Microsoft's LayoutLM model for document understanding and information extraction using the Hugging Face Transformers library.
A technical guide on using Hugging Face's SegFormer model with Amazon SageMaker for semantic image segmentation tasks.
A $10,000 charity bet on whether fully autonomous (Level 5) self-driving cars will be commercially available in major US cities by 2030.
A tutorial on creating an interactive digital frame with head-tracking perspective effects using Three.js and TensorFlow.js.
A comprehensive deep learning course covering fundamentals, neural networks, computer vision, and generative models using PyTorch.
A comprehensive deep learning course overview with PyTorch tutorials, covering fundamentals, neural networks, and advanced topics like CNNs and GANs.
A tutorial on building a Sudoku solver application using Azure Form Recognizer AI, .NET backend, and Angular frontend.
Explains the physics and optics behind why smartphone portrait mode uses artificial blur instead of true optical depth of field.
A curated list of public dataset repositories for machine learning and deep learning projects, including computer vision and NLP datasets.
A curated list of public dataset repositories for machine learning and deep learning projects, including sources for computer vision, NLP, and more.