Introducing the new Amazon Q Developer experience in AWS Lambda
AWS Lambda's new code editor features an improved Amazon Q Developer AI assistant for generating and debugging Lambda functions with better in-line code previews.
Explains Parquet's columnar storage model, detailing its efficiency for big data analytics through faster queries, better compression, and optimized aggregation.
A practical guide to reading and writing Parquet files in Python using the PyArrow and fastparquet libraries.
Explores compression algorithms in Parquet files, comparing Snappy, Gzip, Brotli, Zstandard, and LZO for storage and performance.
Explains the hierarchical structure of Parquet files, detailing how pages, row groups, and columns optimize storage and query performance.
Explains encoding techniques in Parquet files, including dictionary, RLE, bit-packing, and delta encoding, to optimize storage and performance.
Explores how metadata in Parquet files improves data efficiency and query performance, covering file, row group, and column-level metadata.
Explores why Parquet is the ideal columnar file format for optimizing storage and query performance in modern data lake and lakehouse architectures.
Final guide in a series covering performance tuning and best practices for optimizing Apache Parquet files in big data workflows.
An introduction to Apache Parquet, a columnar storage file format for efficient data processing and analytics.
Explains, for data engineers, how Parquet handles schema evolution, including adding or removing columns and changing data types.
A technical guide comparing spatial patterns in continuous raster data for overlapping regions using R, focusing on NDVI data analysis.
A practical guide to structuring Go projects, advocating for simplicity over rigid conventions and explaining when to use or avoid common directory patterns.
Security audit results for vdirsyncer reveal four minor findings, including file permissions and error handling issues, with fixes implemented.
Using GitHub Actions to trigger Airflow DAGs for orchestrating data pipelines across Spark, Dremio, and Snowflake.
Explores using GitHub Actions for software development CI/CD and advanced data engineering tasks like ETL pipelines and data orchestration.
Explores the future of PostgreSQL, focusing on the power of extensions like pg_stat_statements, Citus, and pg_search to add new capabilities.
A guide explaining dbt macros, their purpose, benefits, and how to use them to write reusable, standardized SQL code in data transformation projects.
A summary of HashiConf 2024, covering major announcements like Terraform Stacks and the event's focus on Infrastructure and Security Lifecycle Management.
A talk at Python Marche 2024 exploring various ways to contribute to the Python community, from coding to documentation.