PySpark 101: Introduction to Big Data with Spark (2nd Wed Talk)

Hosted by Python Frederick
On Wed., Apr. 9, 2025 from 7pm to 9pm
At 122 E Patrick St, Frederick, MD 21701

Unlock PySpark for big data. This beginner-friendly course introduces Apache Spark, a fast and scalable distributed computing framework, and covers the fundamentals of PySpark, including:

* Apache Spark Overview – Understand the core concepts and benefits of Spark for big data processing.
* PySpark Essentials – Learn about RDDs (Resilient Distributed Datasets) for distributed computation, DataFrames for optimized, structured data handling, and how to query data using SQL.
* Machine Learning with MLlib – Explore the basics of MLlib, Spark’s scalable machine learning library for analytics and predictive modeling.

Perfect for beginners in data engineering and analytics, this course will equip you with the foundational skills to process and analyze large datasets efficiently using PySpark.

Also, we're going to have another "after hour." Once the presentation is over, anyone who wants to stay and do a bit of coding or chatting is welcome to hang out. Think of it as a mini open workshop.

Visit the event scheduling website for all other details and how to RSVP.