
Data Science
Data Science
Course Outline
Microsoft Excel
MODULE 1: DATA ANALYSIS WITH EXCEL
- What is Data Analysis?
- Excel Pro Tips for Power Users
- Advanced Excel Formulas & Functions
- Structuring Source Data for Analysis in Excel
- Excel tables
- Advantage of Excel tables over regular ranges
- Filter, sort and see totals
- Use tables to add columns to the existing data in the Excel table
- Raw data structure in Excel
- Design & formatting options
- Sorting, filtering, & grouping tools
- Calculated fields, items & values
- Interactive Excel dashboards
MODULE 2: PIVOT TABLES
- Why Pivot Tables?
- Create pivot tables
- Data Analysis with Excel PivotTables
- Use pivot tables and pivot charts to create dashboards
- Connect multiple slicers to the pivot tables
- Configuring pivot table layouts & styles
- Pivot Table layouts & styles
- Pivot Charts, slicers & timelines
MODULE 3: USING PIVOT TABLES FOR ADVANCED CALCULATIONS
- The full power of pivot tables
- Filter pivot table data to achieve interesting subsets of the data
- Apply calculated fields with pivot tables
- Calculate profitability and find anomalies
MODULE 4: AGGREGATE DATA
- Use formulas to aggregate data
- Aggregate data as an alternative to pivot tables
- Flexible reporting layouts
- Pivot charts using more than one table
MODULE 5: LAB ENVIRONMENT & INSTALLATIONS
- Set the lab environment up with Office applications
- Data analysis in Excel using tools:
- pivot tables
- pivot charts and slicers
- Excel data models
- Power Pivot add-in
- DAX expressions for calculated columns and measures
MODULE 6: QUERIES
- What are the queries?
- Power Query for Excel 2010 and 2013
- Excel data model built from a single flat table
- Import multiple tables from a SQL database
- Create an Excel data model using imported data
- Create a collated result with data from text-files and data from a SQL database
- Up & running with Power BI Desktop
- Publishing to Power BI Service
- SQL database analysis for beginners
MODULE 7: DATA FILTERS & MEASURES
- How to establish measures to calculate each cell
- Filter context for calculation
- DAX functions
- Enhanced queries to import data from a formatted Excel sheet
- Queries beyond the standard user interface
MODULE 8: DATA VISUALISATIONS
- Data visualizations in Excel
- Cube functions for year-over-year comparisons
- Create timelines, hierarchies, and slicers
- Excel working together with Power BI
- Excel workbooks with Power BI service
- Excel on the mobile platform
Microsoft powerbi
Module 1: Get Started with Microsoft Data Analytics
This module explores the different roles in the data space, outlines the important roles and responsibilities of a Data Analysts, and then explores the landscape of the Power BI portfolio.
Lessons
Data Analytics and Microsoft
Getting Started with Power BI
Lab : Getting Started in Power BI Desktop
Getting Started
Module 2: Prepare Data in Power BI
This module explores identifying and retrieving data from various data sources. You will also learn the options for connectivity and data storage and understand the difference and performance implications of connecting directly to data vs. importing it.
Lessons
Get data from various data sources
Lab : Preparing Data in Power BI Desktop
Prepare Data
Module 3: Clean, Transform, and Load Data in Power BI
This module teaches you the process of profiling and understanding the condition of the data. They will learn how to identify anomalies, look at the size and shape of their data, and perform the proper data cleaning and transforming steps to prepare the data for loading into the model.
Lessons
Data shaping
Enhance the data structure
Data Profiling
Lab : Transforming and Loading Data in Power BI Desktop
Loading Data
Module 4: Design a Data Model in Power BI
This module teaches the fundamental concepts of designing and developing a data model for proper performance and scalability. This module will also help you understand and tackle many of the common data modeling issues, including relationships, security, and performance.
Lessons
Introduction to data modeling
Working with tables
Dimensions and Hierarchies
Lab : Data Modeling in Power BI Desktop
Create Model Relationships
Configure Tables
Review the model interface
Create Quick Measures
Lab : Advanced Data Modeling in Power BI Desktop
Configure many-to-many relationships
Enforce row-level security
Module 5: Create Model Calculations using DAX in Power BI
This module introduces you to the world of DAX and its true power for enhancing a model. You will learn about aggregations and the concepts of Measures, calculated columns and tables, and Time Intelligence functions to solve calculation and data analysis problems.
Lessons
Introduction to DAX
DAX context
Advanced DAX
Lab : Advanced DAX in Power BI Desktop
Use the CALCULATE() function to manipulate filter context
Use Time Intelligence functions
Lab : Introduction to DAX in Power BI Desktop
Create calculated tables
Create calculated columns
Create measures
Module 6: Optimize Model Performance in Power BI
In this module you are introduced to steps, processes, concepts, and data modeling best practices necessary to optimize a data model for enterprise-level performance.
Lessons
Optimze the model for performance
Optimize DirectQuery Models
Create and manage Aggregations
Module 7: Create Reports in Power BI
This module introduces you to the fundamental concepts and principles of designing and building a report, including selecting the correct visuals, designing a page layout, and applying basic but critical functionality. The important topic of designing for accessibility is also covered.
Lessons
Design a report
Enhance the report
Lab : Designing a report in Power BI Desktop
Create a live connection in Power BI Desktop
Design a report
Configure visual fields and format properties
Lab : Enhancing reports with interaction and formatting in Power BI Desktop
Create and configure Sync Slicers
Create a drillthrough page
Apply conditional formatting
Create and use Bookmarks
Module 8: Create Dashboards in Power BI
In this module you will learn how to tell a compelling story through the use of dashboards and the different navigation tools available to provide navigation. You will be introduced to features and functionality and how to enhance dashboards for usability and insights.
Lessons
Create a Dashboard
Real-time Dashboards
Enhance a Dashboard
Lab : Creating a Dashboard in Power BI Service
Create a Dashboard
Pin visuals to a Dashboard
Configure a Dashboard tile alert
Use Q&A to create a dashboard tile
Module 9: Enhance reports for usability and storytelling in Power BI
This module will teach you about paginated reports, including what they are how they fit into Power BI. You will then learn how to build and publish a report.
Lessons
Paginated report overview
Create Paginated reports
Lab : Creating a Paginated report in Power BI Desktop
Use Power BI Report Builder
Design a multi-page report layout
Define a data source
Define a dataset
Create a report parameter
Export a report to PDF
Module 10: Perform Advanced Analytics in Power BI
This module helps you apply additional features to enhance the report for analytical insights in the data, equipping you with the steps to use the report for actual data analysis. You will also perform advanced analytics using AI visuals on the report for even deeper and meaningful data insights.
Lessons
Advanced Analytics
Data Insights through AI visuals
Lab : Data Analysis in Power BI Desktop
Create animated scatter charts
Use the visual to forecast values
Work with Decomposition Tree visual
Work with the Key Influencers visual
Module 11: Manage Datasets in Power BI
In this module you will learn the concepts of managing Power BI assets, including datasets and workspaces. You will also publish datasets to the Power BI service, then refresh and secure them.
Lessons
Parameters
Datasets
Security in Power BI
Module 12: Create and Manage Workspaces in Power BI
This module will introduce you to Workspaces, including how to create and manage them. You will also learn how to share content, including reports and dashboards, and then learn how to distribute an App.
Lessons
Creating Workspaces
Sharing and Managing Assets
Lab : Publishing and Sharing Power BI Content
Map security principals to dataset roles
Share a dashboard
SQL
- Introduction & Setup
- Understanding SQL and databases
- Key concepts of schema, tables, columns, primary keys, and foreign keys
- Installation of SQL Server
- Installation of SQL Server Management Studio (SSMS)
- Sample company-wide DB setup (Adventure Works Cycle)
- Introduction to relational and non-relational databases
- Relational databases like MS SQL, MySQL, and PostgreSQL
- Non-relational databases
- SQL is useful across all, including big data platforms
- Building a relational SQL DB Table from scratch
- CREATE TABLE to build from scratch
- ALTER TABLE to modify table structure
- DROP TABLE to delete table
- INSERT INTO statement to add new rows of records
- UPDATE statement modify existing rows of records
- DELETE statement to remove rows of records
- SELECT statement for data analysis
- Basic SELECT statement
- SELECT DISTINCT
- SELECT TOP
- WHERE clause
- Multiple WHERE conditions
- ORDER BY
- GROUP BY
- HAVING
- Understanding when to use WHERE vs. HAVING
- Logical Operators
- AND
- OR
- NOT
- IN
- BETWEEN
- LIKE
- CASE
- Advanced data manipulation
- Carrying out mathematical operations
- Using alias
- JOIN and the different types
- UNION, INTERSECT and EXCEPT
- LIKE wildcards
- Subqueries
- Common Table Expressions (CTE)
- String functions
- Numbers formatting
- Date formatting
- Views and stored procedures
- Use of views
- How to create views
- Use of stored procedures
- How to create stored procedures
- Exporting tables as stored procedures
- Understanding SQL and database performance
- Extensive practical tasks
Python
Introduction to Python for data analysis
What data analysis is
The Python skills that you need for data analysis
How to use JupyterLab as your IDE
How to split the screen between two Notebooks
How to use Magic Commands
The Pandas essentials for data analysis
Introduction to the Pandas DataFrame
How to examine the data
How to access the columns and rows
How to work with the data
How to shape the data
How to analyze the data
The Pandas essentials for data visualization
Introduction to data visualization
How to create 8 types of plots
How to enhance a plot
The Seaborn essentials for data visualization
Introduction to Seaborn
How to enhance and save plots
How to create relational plots
How to create categorical plots
How to create distribution plots
Other techniques for enhancing a plot
How to get the data
How to find the data that you want to analyze
How to import data into a DataFrame
How to get database data into a DataFrame
How to work with a Stata file
How to work with a JSON file
How to clean the data
Introduction to data cleaning
How to simplify the data
How to find and fix missing values
How to fix data type problems
How find and fix outliers
How to prepare the data
How to add and modify columns
How to apply functions and lambda expressions
How to work with indexes
How to combine DataFrames
How to handle the SettingWithCopyWarning
How to analyze the data
How to create and plot long data
How to group and aggregate the data
How to create and use pivot tables
How to work with bins
More skills for data analysis
How to analyze time-series data
How to reindex time-series data
How to resample time-series data
How to work with rolling windows
How to work with running totals
How to make predictions with a linear regression model
Introduction to predictive analysis
How to find correlations between variables
How to use Scikit-learn to work with a linear regression
How to plot regression models with Seaborn
How to make predictions with a multiple regression model
A simple regression model for a Cars dataset
How to work with a multiple regression model
How to work with categorical variables
How to improve a multiple regression model
SPSS
1 Introduction
– Introduction to SPSS
– Data analysis with SPSS: general aspects, workflow, critical issues
– SPSS: general description, functions, menus, commands
– SPSS file management
– A brief theoretical recap of the statistical analysis tools and techniques
2 Input and data cleaning
– Defining variables
– Manual input of data
– Automated input of data and file import
– Data cleaning
Data manipulation
– Data Transformations
– Data file management
– Syntax files and scripts
– Output management
3 Descriptive analysis of data
– Frequencies
– Descriptives
– Explore
– Crosstabs
– Charts
4 Statistical tests
– Means
– T-test
– One-way ANOVA
– Non-parametric tests
– Normality tests
Correlation and regression
– Linear correlation and regression
– Multiple regression (linear)
5 Multivariate analysis
– Factor analysis
– Cluster analysis
Exercise
6 Final test
Software used
R
Introduction to R software
- Overview of the R Studio IDE
- Installing, loading and
updating R packages - Creating objects in R
- Data types
- Data structures
- Sorting vectors and data
frames - Directory management commands
- Direct data entry in R (for
small data sets) - Importing data from other
software - Decision structures (if, if-
else, if-else if-else) - Repetitive structures (for
and while loops) - Other important programming
functions (break, next, warn, stop)
Data Wrangling and Cleaning
- Working with variables
- Transform continuous
variables to categorical variables - Add new variables to data
frames - Handling missing values
- Sub-setting data frames
- Appending and merging data
frames - Spit data frames
- Stack and unstack data frames
Explanatory Data Analysis (
- Creating tables of
frequencies and proportions - Cross tabulations of
categorical variables - Descriptive statistics for
continuous variables
Data Visualization using R
- Introduction to graphs and
charts in R - Customizing graph attributes (
titles, axes, text, legends) - Graphs for categorical
variables - Graphs for continuous
variables - Graphs to investigate
relationship between variables
Mean Comparison Tests in R
- One Sample T Test
- Independent Samples T Test
- Paired Samples T Test
- One-way analysis of variance (
ANOVA)
Tests of Associations in R
- Chi-Square test of
independence - Pearson’s Correlation
- Spearman’s Rank-Order
Correlation
Predictive Regression Models
- Linear Regression
- Multiple Linear Regression
- Binary Logistic Regression
- Ordinal Logistic Regression
Tableau
Module 1: Introduction to Tableau
- Overview of Tableau
- What is Tableau?
- Tableau’s Key Features
- Types of Tableau Products (Tableau Desktop, Tableau Server, Tableau Online, Tableau Public)
- Setting Up Tableau
- Installation and Configuration
- Navigating the Tableau Interface
- Understanding Tableau’s Workspace
Module 2: Data Connection and Preparation
- Connecting to Data Sources
- Connecting to Various Data Sources (Excel, SQL, Google Sheets, Web Data Connectors)
- Data Connection Types (Live vs. Extract)
- Data Preparation
- Data Cleaning and Transformation
- Using Tableau Prep for Data Preparation
- Joining and Blending Data
- Handling Null Values and Duplicates
Module 3: Building Basic Visualizations
- Creating Basic Charts
- Bar Charts
- Line Charts
- Pie Charts
- Using Tables and Heat Maps
- Text Tables
- Heat Maps
- Building Basic Dashboards
- Combining Multiple Views
- Adding Filters and Slicers
Module 4: Advanced Visualization Techniques
- Advanced Charts
- Scatter Plots
- Bullet Charts
- Gantt Charts
- Box Plots
- Geographic Mapping
- Creating Maps and Geospatial Visualizations
- Customizing Map Layers
- Using Map Layers and Custom Geocoding
- Creating Interactive Dashboards
- Adding Actions (Filter, Highlight, URL)
- Using Parameters
Module 5: Data Analysis and Calculations
- Calculated Fields
- Basic Calculations (Arithmetic, String Functions)
- Date Calculations
- Aggregations and Granularity
- Table Calculations
- Running Totals
- Percent of Total
- Moving Averages
- Statistical Analysis
- Forecasting
- Trend Lines
- Reference Lines and Bands
Module 6: Data Visualization Best Practices
- Designing Effective Dashboards
- Principles of Data Visualization
- Choosing the Right Chart Type
- Layout and Design Considerations
- Enhancing Dashboards
- Customizing Tooltips
- Formatting and Styling
- Implementing Conditional Formatting
Module 7: Sharing and Publishing
- Sharing Visualizations
- Publishing to Tableau Server and Tableau Online
- Creating and Sharing Tableau Public Dashboards
- Permissions and Security
- Managing User Access
- Setting Permissions for Different Views and Data
- Exporting Data and Visualizations
- Exporting to PDF, Image, Excel
Module 8: Tableau Advanced Features
- Using Tableau Extensions
- What Are Extensions?
- Installing and Configuring Extensions
- Working with Tableau’s API
- Introduction to Tableau’s REST API
- Use Cases and Examples
- Integrating with Other Tools
- Connecting Tableau with R and Python
- Using Tableau with BI Tools and Data Warehouses
Module 9: Performance Optimization
- Optimizing Dashboards
- Performance Tuning Tips
- Reducing Load Times
- Managing Data Extracts
- Extract Refresh Strategies
- Incremental Extracts
Module 10: Real-World Case Studies and Projects
- Analyzing Case Studies
- Industry-Specific Examples (Finance, Healthcare, Retail)
- Hands-On Projects
- Building Comprehensive Dashboards
- Implementing Best Practices
