
Data Analysis with R: A comprehensive guide to manipulating, analyzing, and visualizing data in R / Anthony Fischetti

Material type: Text
Publication details: UK: Packt, 2018
Edition: Second Edition
Description: 553
ISBN:
  • 9781788393720
DDC classification:
  • 001.422 FIS
Contents:
1. RefresheR: Navigating the basics; Getting help in R; Vectors; Functions; Matrices; Loading data into R; Working with packages; Exercises; Summary
2. The Shape of Data: Univariate data; Frequency distributions; Central tendency; Spread; Populations, samples, and estimation; Probability distributions; Visualization methods; Exercises; Summary
3. Describing Relationships: Multivariate data; Relationships between a categorical and continuous variable; Relationships between two categorical variables; The relationship between two continuous variables; Visualization methods; Exercises; Summary
4. Probability: Basic probability; A tale of two interpretations; Sampling from distributions; The normal distribution; Exercises; Summary
5. Using Data To Reason About The World: Estimating means; The sampling distribution; Interval estimation; Smaller samples; Exercises; Summary
6. Testing Hypotheses: The null hypothesis significance testing framework; Testing the mean of one sample; Testing two means; Testing more than two means; Testing independence of proportions; What if my assumptions are unfounded?; Exercises; Summary
7. Bayesian Methods: The big idea behind Bayesian analysis; Choosing a prior; Who cares about coin flips; Enter MCMC – stage left; Using JAGS and runjags; Fitting distributions the Bayesian way; The Bayesian independent samples t-test; Exercises; Summary
8. The Bootstrap: What's... uhhh... the deal with the bootstrap?; Performing the bootstrap in R (more elegantly); Confidence intervals; A one-sample test of means; Bootstrapping statistics other than the mean; Busting bootstrap myths; Exercises; Summary
9. Predicting Continuous Variables: Linear models; Simple linear regression; Simple linear regression with a binary predictor; Multiple regression; Regression with a non-binary predictor; Kitchen sink regression; The bias-variance trade-off; Linear regression diagnostics; Advanced topics; Exercises; Summary
10. Predicting Categorical Variables: k-Nearest neighbors; Logistic regression; Decision trees; Random forests; Choosing a classifier; Exercises; Summary
11. Predicting Changes with Time: What is a time series?; What is forecasting?; Creating and plotting time series; Components of time series; Time series decomposition; White noise; Autocorrelation; Smoothing; ETS and the state space model; Interventions for improvement; What we didn't cover; Citations for the climate change data; Exercises; Summary
12. Sources of Data: Relational databases; Using JSON; XML; Other data formats; Online repositories; Exercises; Summary
13. Dealing with Missing Data: Analysis with missing data; Visualizing missing data; Types of missing data; Unsophisticated methods for dealing with missing data; So how does mice come up with the imputed values?; Exercises; Summary
14. Dealing with Messy Data: Checking unsanitized data; Regular expressions; Other tools for messy data; Exercises; Summary
15. Dealing with Large Data: Wait to optimize; Using a bigger and faster machine; Be smart about your code; Using optimized packages; Using another R implementation; Using parallelization; Using Rcpp; Being smarter about your code; Exercises; Summary
16. Working with Popular R Packages: The data.table package; Using dplyr and tidyr to manipulate data; Functional programming as a main tidyverse principle; Reshaping data with tidyr; Exercises; Summary
17. Reproducibility and Best Practices
Summary: Frequently the tool of choice for academics, R has spread deep into the private sector and can be found in the production pipelines at some of the most advanced and successful enterprises. The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. Starting with the basics of R and statistical reasoning, this book dives into advanced predictive analytics, showing how to apply those techniques to real-world data through real-world examples. Packed with engaging problems and exercises, this book begins with a review of R and its syntax with packages like Rcpp, ggplot2, and dplyr. From there, get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. Solve the difficulties relating to performing data analysis in practice and find solutions to working with messy data, large data, communicating results, and facilitating reproducibility. This book is engineered to be an invaluable resource through many stages of anyone’s career as a data analyst.
List(s) this item appears in: New Arrivals (August 2024)
Holdings
Item type | Current library | Collection | Call number | Status | Date due | Barcode
Books | IIITDM Kurnool, COMPUTER SCIENCE ENGINEERING | Non-fiction | 001.422 FIS | Available |  | 0005839
Books | IIITDM Kurnool, COMPUTER SCIENCE ENGINEERING | Non-fiction | 001.422 FIS | Available |  | 0005840
Books | IIITDM Kurnool, COMPUTER SCIENCE ENGINEERING | Non-fiction | 001.422 FIS | Available |  | 0005841
Books | IIITDM Kurnool, COMPUTER SCIENCE ENGINEERING | Non-fiction | 001.422 FIS | Available |  | 0005842
Reference | IIITDM Kurnool, Reference | Non-fiction | 001.422 FIS | Not For Loan |  | 0005843
