Data Science

Data Science 

Course Outline

Microsoft Excel 

MODULE 1: DATA ANALYSIS WITH EXCEL

  • What is Data Analysis?
  • Excel Pro Tips for Power Users
  • Advanced Excel Formulas & Functions
  • Structuring Source Data for Analysis in Excel
  • Excel tables
  • Advantage of Excel tables over regular ranges
  • Filter, sort and see totals
  • Use tables to add columns to the existing data in the Excel table
  • Raw data structure in Excel
  • Design & formatting options
  • Sorting, filtering, & grouping tools
  • Calculated fields, items & values
  • Interactive Excel dashboards

MODULE 2: PIVOT TABLES

  • Why Pivot Tables?
  • Create pivot tables
  • Data Analysis with Excel PivotTables
  • Use pivot tables and pivot charts to create dashboards
  • Connect multiple slicers to the pivot tables
  • Configuring pivot table layouts & styles
  • Pivot Table layouts & styles
  • Pivot Charts, slicers & timelines

MODULE 3: USING PIVOT TABLES FOR ADVANCED CALCULATIONS

  • The full power of pivot tables
  • Filter pivot table data to achieve interesting subsets of the data
  • Apply calculated fields with pivot tables
  • Calculate profitability and find anomalies

MODULE 4: AGGREGATE DATA

  • Use formulas to aggregate data
  • Aggregate data as an alternative to pivot tables
  • Flexible reporting layouts
  • Pivot charts using more than one table

MODULE 5: LAB ENVIRONMENT & INSTALLATIONS

  • Set the lab environment up with Office applications
  • Data analysis in Excel using tools:
    • pivot tables
    • pivot charts and slicers
  • Excel data models
  • Power Pivot add-in
  • DAX expressions for calculated columns and measures

MODULE 6: QUERIES

  • What are the queries?
  • Power Query for Excel 2010 and 2013
  • Excel data model built from a single flat table
  • Import multiple tables from a SQL database
  • Create an Excel data model using imported data
  • Create a collated result with data from text-files and data from a SQL database
  • Up & running with Power BI Desktop
  • Publishing to Power BI Service
  • SQL database analysis for beginners

MODULE 7: DATA FILTERS & MEASURES

  • How to establish measures to calculate each cell
  • Filter context for calculation
  • DAX functions
  • Enhanced queries to import data from a formatted Excel sheet
  • Queries beyond the standard user interface

MODULE 8: DATA VISUALISATIONS

  • Data visualizations in Excel
  • Cube functions for year-over-year comparisons
  • Create timelines, hierarchies, and slicers
  • Excel working together with Power BI
  • Excel workbooks with Power BI service
  • Excel on the mobile platform

Microsoft powerbi 

Module 1: Get Started with Microsoft Data Analytics

This module explores the different roles in the data space, outlines the important roles and responsibilities of a Data Analysts, and then explores the landscape of the Power BI portfolio.

Lessons

Data Analytics and Microsoft

Getting Started with Power BI

Lab : Getting Started in Power BI Desktop

Getting Started

Module 2: Prepare Data in Power BI

This module explores identifying and retrieving data from various data sources. You will also learn the options for connectivity and data storage and understand the difference and performance implications of connecting directly to data vs. importing it.

Lessons

Get data from various data sources

Lab : Preparing Data in Power BI Desktop

Prepare Data

Module 3: Clean, Transform, and Load Data in Power BI

This module teaches you the process of profiling and understanding the condition of the data. They will learn how to identify anomalies, look at the size and shape of their data, and perform the proper data cleaning and transforming steps to prepare the data for loading into the model.

Lessons

Data shaping

Enhance the data structure

Data Profiling

Lab : Transforming and Loading Data in Power BI Desktop

Loading Data

Module 4: Design a Data Model in Power BI

This module teaches the fundamental concepts of designing and developing a data model for proper performance and scalability. This module will also help you understand and tackle many of the common data modeling issues, including relationships, security, and performance.

Lessons

Introduction to data modeling

Working with tables

Dimensions and Hierarchies

Lab : Data Modeling in Power BI Desktop

Create Model Relationships

Configure Tables

Review the model interface

Create Quick Measures

Lab : Advanced Data Modeling in Power BI Desktop

Configure many-to-many relationships

Enforce row-level security

Module 5: Create Model Calculations using DAX in Power BI

This module introduces you to the world of DAX and its true power for enhancing a model. You will learn about aggregations and the concepts of Measures, calculated columns and tables, and Time Intelligence functions to solve calculation and data analysis problems.

Lessons

Introduction to DAX

DAX context

Advanced DAX

Lab : Advanced DAX in Power BI Desktop

Use the CALCULATE() function to manipulate filter context

Use Time Intelligence functions

Lab : Introduction to DAX in Power BI Desktop

Create calculated tables

Create calculated columns

Create measures

Module 6: Optimize Model Performance in Power BI

In this module you are introduced to steps, processes, concepts, and data modeling best practices necessary to optimize a data model for enterprise-level performance.

Lessons

Optimze the model for performance

Optimize DirectQuery Models

Create and manage Aggregations

Module 7: Create Reports in Power BI

This module introduces you to the fundamental concepts and principles of designing and building a report, including selecting the correct visuals, designing a page layout, and applying basic but critical functionality. The important topic of designing for accessibility is also covered.

Lessons

Design a report

Enhance the report

Lab : Designing a report in Power BI Desktop

Create a live connection in Power BI Desktop

Design a report

Configure visual fields and format properties

Lab : Enhancing reports with interaction and formatting in Power BI Desktop

Create and configure Sync Slicers

Create a drillthrough page

Apply conditional formatting

Create and use Bookmarks

Module 8: Create Dashboards in Power BI

In this module you will learn how to tell a compelling story through the use of dashboards and the different navigation tools available to provide navigation. You will be introduced to features and functionality and how to enhance dashboards for usability and insights.

Lessons

Create a Dashboard

Real-time Dashboards

Enhance a Dashboard

Lab : Creating a Dashboard in Power BI Service

Create a Dashboard

Pin visuals to a Dashboard

Configure a Dashboard tile alert

Use Q&A to create a dashboard tile

Module 9: Enhance reports for usability and storytelling in Power BI

This module will teach you about paginated reports, including what they are how they fit into Power BI. You will then learn how to build and publish a report.

Lessons

Paginated report overview

Create Paginated reports

Lab : Creating a Paginated report in Power BI Desktop

Use Power BI Report Builder

Design a multi-page report layout

Define a data source

Define a dataset

Create a report parameter

Export a report to PDF

Module 10: Perform Advanced Analytics in Power BI

This module helps you apply additional features to enhance the report for analytical insights in the data, equipping you with the steps to use the report for actual data analysis. You will also perform advanced analytics using AI visuals on the report for even deeper and meaningful data insights.

Lessons

Advanced Analytics

Data Insights through AI visuals

Lab : Data Analysis in Power BI Desktop

Create animated scatter charts

Use the visual to forecast values

Work with Decomposition Tree visual

Work with the Key Influencers visual

Module 11: Manage Datasets in Power BI

In this module you will learn the concepts of managing Power BI assets, including datasets and workspaces. You will also publish datasets to the Power BI service, then refresh and secure them.

Lessons

Parameters

Datasets

Security in Power BI

Module 12: Create and Manage Workspaces in Power BI

This module will introduce you to Workspaces, including how to create and manage them. You will also learn how to share content, including reports and dashboards, and then learn how to distribute an App.

Lessons

Creating Workspaces

Sharing and Managing Assets

Lab : Publishing and Sharing Power BI Content

Map security principals to dataset roles

Share a dashboard

SQL

  1. Introduction & Setup
    • Understanding SQL and databases
    • Key concepts of schema, tables, columns, primary keys, and foreign keys
    • Installation of SQL Server
    • Installation of SQL Server Management Studio (SSMS)
    • Sample company-wide DB setup (Adventure Works Cycle)
  2. Introduction to relational and non-relational databases
    • Relational databases like MS SQL, MySQL, and PostgreSQL
    • Non-relational databases
    • SQL is useful across all,  including big data platforms
  3. Building a relational SQL DB Table from scratch
    • CREATE TABLE to build from scratch
    • ALTER TABLE to modify table structure
    • DROP TABLE to delete table
    • INSERT INTO statement to add new rows of records
    • UPDATE statement modify existing rows of records
    • DELETE statement to remove rows of records
  4. SELECT statement for data analysis
    • Basic SELECT statement
    • SELECT DISTINCT
    • SELECT TOP
    • WHERE clause
    • Multiple WHERE conditions
    • ORDER BY
    • GROUP BY
    • HAVING
    • Understanding when to use WHERE vs. HAVING
  5. Logical Operators
    • AND
    • OR
    • NOT
    • IN
    • BETWEEN
    • LIKE
    • CASE
  6. Advanced data manipulation
    • Carrying out mathematical operations
    • Using alias
    • JOIN and the different types
    • UNION, INTERSECT and EXCEPT
    • LIKE wildcards
    • Subqueries
    • Common Table Expressions (CTE)
    • String functions
    • Numbers formatting
    • Date formatting
  7. Views and stored procedures
    • Use of views
    • How to create views
    • Use of stored procedures
    • How to create stored procedures
    • Exporting tables as stored procedures
  8. Understanding SQL and database performance
  9. Extensive practical tasks

Python

Introduction to Python for data analysis

What data analysis is

The Python skills that you need for data analysis

How to use JupyterLab as your IDE

How to split the screen between two Notebooks

How to use Magic Commands

The Pandas essentials for data analysis

Introduction to the Pandas DataFrame

How to examine the data

How to access the columns and rows

How to work with the data

How to shape the data

How to analyze the data

The Pandas essentials for data visualization

Introduction to data visualization

How to create 8 types of plots

How to enhance a plot

The Seaborn essentials for data visualization

Introduction to Seaborn

How to enhance and save plots

How to create relational plots

How to create categorical plots

How to create distribution plots

Other techniques for enhancing a plot

How to get the data

How to find the data that you want to analyze

How to import data into a DataFrame

How to get database data into a DataFrame

How to work with a Stata file

How to work with a JSON file

How to clean the data

Introduction to data cleaning

How to simplify the data

How to find and fix missing values

How to fix data type problems

How find and fix outliers

How to prepare the data

How to add and modify columns

How to apply functions and lambda expressions

How to work with indexes

How to combine DataFrames

How to handle the SettingWithCopyWarning

How to analyze the data

How to create and plot long data

How to group and aggregate the data

How to create and use pivot tables

How to work with bins

More skills for data analysis

How to analyze time-series data

How to reindex time-series data

How to resample time-series data

How to work with rolling windows

How to work with running totals

How to make predictions with a linear regression model

Introduction to predictive analysis

How to find correlations between variables

How to use Scikit-learn to work with a linear regression

How to plot regression models with Seaborn

How to make predictions with a multiple regression model

A simple regression model for a Cars dataset

How to work with a multiple regression model

How to work with categorical variables

How to improve a multiple regression model

SPSS

1           Introduction

–      Introduction to SPSS

–       Data analysis with SPSS: general aspects, workflow, critical issues

–       SPSS: general description, functions, menus, commands

–      SPSS file management

–      A brief theoretical recap of the statistical analysis tools and techniques

2           Input and data cleaning

–      Defining variables

–      Manual input of data

–      Automated input of data and file import

–      Data cleaning

Data manipulation

–      Data Transformations

–       Data file management

–      Syntax files and scripts

–      Output management

3           Descriptive analysis of data

–      Frequencies

–      Descriptives

–      Explore

–      Crosstabs

–      Charts

4           Statistical tests

–      Means

–      T-test

–      One-way ANOVA

– Non-parametric tests

–      Normality tests

Correlation and regression

–      Linear correlation and regression

–      Multiple regression (linear)

5           Multivariate analysis

–      Factor analysis

–      Cluster analysis

Exercise

6           Final test

Software used

R

Introduction to R software for statistical computing 

  • Overview of the R Studio IDE
  • Installing, loading and updating R packages
  • Creating objects in R
  • Data types
  • Data structures
  • Sorting vectors and data frames
  • Directory management commands
  • Direct data entry in R (for small data sets)
  • Importing data from other software
  • Decision structures (if, if-else, if-else if-else)
  • Repetitive structures (for and while loops)
  • Other important programming functions (break, next, warn, stop)

Data Wrangling and Cleaning in R 

  • Working with variables
  • Transform continuous variables to categorical variables
  • Add new variables to data frames
  • Handling missing values
  • Sub-setting data frames
  • Appending and merging data frames
  • Spit data frames
  • Stack and unstack data frames

Explanatory Data Analysis (EDA) in R 

  • Creating tables of frequencies and proportions
  • Cross tabulations of categorical variables
  • Descriptive statistics for continuous variables

Data Visualization using R base package

  • Introduction to graphs and charts in R
  • Customizing graph attributes (titles, axes, text, legends)
  • Graphs for categorical variables
  • Graphs for continuous variables
  • Graphs to investigate relationship between variables

Mean Comparison Tests in R 

  • One Sample T Test
  • Independent Samples T Test
  • Paired Samples T Test
  • One-way analysis of variance (ANOVA)

Tests of Associations in R

  • Chi-Square test of independence
  • Pearson’s Correlation
  • Spearman’s Rank-Order Correlation

Predictive Regression Models using R 

  • Linear Regression
  • Multiple Linear Regression
  • Binary Logistic Regression
  • Ordinal Logistic Regression

Tableau 

Module 1: Introduction to Tableau

  • Overview of Tableau
    • What is Tableau?
    • Tableau’s Key Features
    • Types of Tableau Products (Tableau Desktop, Tableau Server, Tableau Online, Tableau Public)
  • Setting Up Tableau
    • Installation and Configuration
    • Navigating the Tableau Interface
    • Understanding Tableau’s Workspace

Module 2: Data Connection and Preparation

  • Connecting to Data Sources
    • Connecting to Various Data Sources (Excel, SQL, Google Sheets, Web Data Connectors)
    • Data Connection Types (Live vs. Extract)
  • Data Preparation
    • Data Cleaning and Transformation
    • Using Tableau Prep for Data Preparation
    • Joining and Blending Data
    • Handling Null Values and Duplicates

Module 3: Building Basic Visualizations

  • Creating Basic Charts
    • Bar Charts
    • Line Charts
    • Pie Charts
  • Using Tables and Heat Maps
    • Text Tables
    • Heat Maps
  • Building Basic Dashboards
    • Combining Multiple Views
    • Adding Filters and Slicers

Module 4: Advanced Visualization Techniques

  • Advanced Charts
    • Scatter Plots
    • Bullet Charts
    • Gantt Charts
    • Box Plots
  • Geographic Mapping
    • Creating Maps and Geospatial Visualizations
    • Customizing Map Layers
    • Using Map Layers and Custom Geocoding
  • Creating Interactive Dashboards
    • Adding Actions (Filter, Highlight, URL)
    • Using Parameters

Module 5: Data Analysis and Calculations

  • Calculated Fields
    • Basic Calculations (Arithmetic, String Functions)
    • Date Calculations
    • Aggregations and Granularity
  • Table Calculations
    • Running Totals
    • Percent of Total
    • Moving Averages
  • Statistical Analysis
    • Forecasting
    • Trend Lines
    • Reference Lines and Bands

Module 6: Data Visualization Best Practices

  • Designing Effective Dashboards
    • Principles of Data Visualization
    • Choosing the Right Chart Type
    • Layout and Design Considerations
  • Enhancing Dashboards
    • Customizing Tooltips
    • Formatting and Styling
    • Implementing Conditional Formatting

Module 7: Sharing and Publishing

  • Sharing Visualizations
    • Publishing to Tableau Server and Tableau Online
    • Creating and Sharing Tableau Public Dashboards
  • Permissions and Security
    • Managing User Access
    • Setting Permissions for Different Views and Data
  • Exporting Data and Visualizations
    • Exporting to PDF, Image, Excel

Module 8: Tableau Advanced Features

  • Using Tableau Extensions
    • What Are Extensions?
    • Installing and Configuring Extensions
  • Working with Tableau’s API
    • Introduction to Tableau’s REST API
    • Use Cases and Examples
  • Integrating with Other Tools
    • Connecting Tableau with R and Python
    • Using Tableau with BI Tools and Data Warehouses

Module 9: Performance Optimization

  • Optimizing Dashboards
    • Performance Tuning Tips
    • Reducing Load Times
  • Managing Data Extracts
    • Extract Refresh Strategies
    • Incremental Extracts

Module 10: Real-World Case Studies and Projects

  • Analyzing Case Studies
    • Industry-Specific Examples (Finance, Healthcare, Retail)
  • Hands-On Projects
    • Building Comprehensive Dashboards
    • Implementing Best Practices