Intro to Apache Iceberg with Apache Polaris and Apache Spark
A technical guide on using Apache Iceberg with Apache Spark and Polaris for building and managing a data lakehouse, covering setup, operations, and optimization.
A beginner-friendly introduction to using PySpark for big data processing with Apache Spark, covering the fundamentals.
Using GitHub Actions to trigger Airflow DAGs for orchestrating data pipelines across Spark, Dremio, and Snowflake.
A hands-on tutorial for building a data lakehouse on your laptop using Apache Iceberg, Spark, Nessie, MinIO, and Dremio.
A technical guide on configuring Apache Flink to write data to Delta Lake tables stored on S3, including required JARs and configuration steps.
A weekly tech digest covering Microsoft Fabric, Power BI, Purview updates, and articles on Generative AI, Semantic Kernel, and the AutoGen framework.
A tutorial on building a local data lakehouse using Docker Compose with Apache Spark, MinIO, Dremio, and Nessie.
A guide to configuring Apache Spark for use with the Apache Iceberg table format, covering packages, flags, and programmatic setup.
Troubleshooting an Azure Synapse Analytics error: 'LSRServiceException – Could not find Linked Service' when running AutoML.
Summary of key application-agnostic talks from Spark+AI Summit 2020, focusing on scaling and optimizing deep learning models.
Exploring Oracle Stream Analytics (OSA), a tool for real-time analysis of streaming data like Kafka and Twitter feeds via a web interface.
Guide to using Jupyter Notebooks with Oracle Big Data Discovery 1.2 for advanced data science and Python/Spark integration.
Guide to setting up Big Data Discovery Shell and Jupyter Notebooks on Oracle's Big Data Lite VM for advanced data science work.