Course Outline
Introduction to the Stratio Platform
- Overview of Stratio’s architecture and core modules.
- The critical role of Rocket and Intelligence in the data lifecycle.
- Logging in and navigating the Stratio user interface.
Working with the Rocket Module
- Data ingestion strategies and pipeline creation.
- Connecting data sources and configuring transformations.
- Utilizing PySpark for preprocessing tasks within Rocket.
PySpark Essentials for Stratio Users
- PySpark data structures and core operations.
- Control flow: practical usage of for and while loops and if/else statements.
- Writing custom functions using 'def' and applying them effectively.
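As a minimal sketch of the control-flow and function topics above, the following plain-Python snippet combines a `def` function with a for loop, a while loop, and if/else. The names (`clean_amount`, `raw_values`) and the sample data are hypothetical and not taken from the course material; in a Stratio environment the same kind of function would typically be applied to PySpark data rather than a Python list.

```python
def clean_amount(value):
    """Return the value as a float, or None if it cannot be parsed."""
    if value is None:
        return None
    try:
        return float(str(value).strip())
    except ValueError:
        return None


# Hypothetical raw input with a padded number, a bad value, and a missing value.
raw_values = [" 10.5", "abc", None, "7"]

# for loop with if/else: keep parsed values, replace unparsable ones with 0.0.
cleaned = []
for v in raw_values:
    amount = clean_amount(v)
    if amount is not None:
        cleaned.append(amount)
    else:
        cleaned.append(0.0)

# while loop: accumulate values until the list ends or the total reaches 15.
total = 0.0
i = 0
while i < len(cleaned) and total < 15:
    total += cleaned[i]
    i += 1
```

Keeping the parsing logic in one small function makes it easy to test on its own before it is reused inside a larger pipeline.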
Advanced Usage of Rocket with PySpark
- Streaming ingestion and real-time transformations.
- Leveraging loops and functions in both batch and real-time scenarios.
- Best practices for optimizing performance in PySpark pipelines.
Exploring the Intelligence Module
- Overview of data modeling and analytical features.
- Feature selection, transformation, and exploratory analysis.
- The role of PySpark in delivering custom analytics and insights.
Building Advanced Analytics Workflows
- Creating User-Defined Functions (UDFs) within Intelligence.
- Applying conditionals and loops for complex data logic.
- Practical use cases: segmentation, aggregation, and prediction.
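The segmentation and aggregation use cases above can be illustrated with a small sketch in plain Python. The thresholds, the `segment` function, and the `customers` data are assumptions made for illustration only; they do not come from the course or from Stratio.

```python
from collections import defaultdict


def segment(spend):
    """Assign a hypothetical spend segment using simple conditionals."""
    if spend >= 1000:
        return "high"
    elif spend >= 100:
        return "medium"
    else:
        return "low"


# Hypothetical (customer_id, spend) records.
customers = [("a", 1500.0), ("b", 250.0), ("c", 40.0), ("d", 120.0)]

# Aggregate total spend per segment with a loop.
totals = defaultdict(float)
for _, spend in customers:
    totals[segment(spend)] += spend
```

In PySpark, a function such as `segment` could be wrapped with `pyspark.sql.functions.udf` and the totals computed with `groupBy` and an aggregation instead of a Python loop. How UDFs are registered within the Intelligence module is not described in this outline.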
Deployment and Collaboration
- Saving, exporting, and reusing workflows.
- Collaborating effectively with team members on Stratio.
- Reviewing outputs and integrating with downstream tools.
Summary and Next Steps
Requirements
- Prior experience with Python programming.
- A solid understanding of data analytics or big data processing concepts.
- Fundamental knowledge of Apache Spark and distributed computing principles.
Target Audience
- Data engineers working with Stratio-based platforms.
- Analysts or developers utilizing Rocket and Intelligence modules.
- Technical teams transitioning their workflows to PySpark within Stratio.
14 Hours