Course Outline

Quick Overview

  • Data Sources
  • Mining Data
  • Recommender systems
  • Target Marketing

Datatypes

  • Structured vs unstructured
  • Static vs streamed
  • Attitudinal, behavioural and demographic data
  • Data-driven vs user-driven analytics
  • Data validity
  • Volume, velocity and variety of data

Models

  • Building models
  • Statistical Models
  • Machine learning

Data Classification

  • Clustering
  • kGroups, k-means, nearest neighbours
  • Ant colonies, birds flocking

Predictive Models

  • Decision trees
  • Support vector machines
  • Naive Bayes classification
  • Neural networks
  • Markov models
  • Regression
  • Ensemble methods

ROI

  • Benefit/Cost ratio
  • Cost of software
  • Cost of development
  • Potential benefits
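
As a rough illustration of how the items above combine, a benefit/cost ratio can be sketched as total potential benefits divided by the sum of software and development costs. The function name and all figures below are hypothetical and are not taken from the course material.

```python
def benefit_cost_ratio(benefits, software_cost, development_cost):
    """Return total benefits divided by total costs (hypothetical helper)."""
    total_cost = software_cost + development_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return benefits / total_cost


# Made-up figures, for illustration only.
ratio = benefit_cost_ratio(
    benefits=150_000,
    software_cost=20_000,
    development_cost=40_000,
)
print(f"Benefit/cost ratio: {ratio:.2f}")
```

A ratio above 1 means the estimated benefits exceed the estimated costs; how benefits are estimated is the harder question and is not addressed by this sketch.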

Building Models

  • Data Preparation (MapReduce)
  • Data cleansing
  • Choosing methods
  • Developing the model
  • Testing the model
  • Model evaluation
  • Model deployment and integration

Overview of Open Source and commercial software

  • Selection of R-project packages
  • Python libraries
  • Hadoop and Mahout
  • Selected Apache projects related to Big Data and Analytics
  • Selected commercial solutions
  • Integration with existing software and data sources

Requirements

A solid understanding of traditional data management and analysis methods, such as SQL, data warehouses, business intelligence, OLAP, etc., is required. Additionally, a grasp of basic statistics and probability concepts (mean, variance, probability, conditional probability, etc.) is necessary.
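
For a sense of the statistics level assumed, the short sketch below computes a mean, a population variance and a conditional probability using only the Python standard library. The data values and the email-campaign counts are invented for illustration and do not come from the course.

```python
from statistics import mean, pvariance

# Invented sample values.
values = [2, 4, 4, 4, 5, 5, 7, 9]
sample_mean = mean(values)           # arithmetic mean
sample_variance = pvariance(values)  # population variance

# Conditional probability P(A | B) = P(A and B) / P(B),
# estimated here from invented counts:
#   B = customer opened the email, A = customer made a purchase.
opened = 80
opened_and_bought = 20
p_buy_given_opened = opened_and_bought / opened

print(sample_mean, sample_variance, p_buy_given_opened)
```

Readers who can follow each line without looking anything up should meet the statistics part of the prerequisites.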

 21 Hours
