Using R for Big Data with Spark - Training DVD

Price: $49.99
Product prices and availability are accurate as of 2018-06-04 18:23:11 EDT and are subject to change. Any price and availability information displayed on http://www.amazon.com/ at the time of purchase will apply to the purchase of this product.
Availability: In Stock
Usually ships in 24 hours

Manufacturer Description

Number of Videos: 20 lessons (2.5 hours)
Ships on: DVD-ROM
User Level: Intermediate

Data analysts familiar with R will learn to leverage the power of Spark, distributed computing, and cloud storage in this course, which shows you how to use your R skills in a big data environment. You'll learn to create Spark clusters on the Amazon Web Services (AWS) platform; perform cluster-based data modeling using Gaussian generalized linear models, binomial generalized linear models, Naive Bayes, and K-means; access data from S3, Spark DataFrames, and other sources such as CSV, JSON, and HDFS; and perform cluster-based data manipulation with tools like SparkR and SparkSQL. By the end of the course, you'll be able to work with data sets too large to process on a single computer. This hands-on class requires each learner to set up their own extremely low-cost, easily terminated AWS account.

  • Discover how to use your R skills in a big data distributed cloud computing cluster environment
  • Gain hands-on experience setting up Spark clusters on Amazon's AWS cloud services platform
  • Understand how to control a cloud instance on AWS using SSH or PuTTY
  • Explore basic distributed modeling techniques like GLM, Naive Bayes, and K-means
  • Learn to do cloud-based data manipulation and processing using SparkR and SparkSQL
  • Understand how to access data in CSV and JSON formats and from HDFS and S3 storage
Manuel Amunategui is a data science practitioner, consultant, teacher, and author with 16+ years of data science experience. A former quantitative analyst for a Wall Street brokerage firm, he now serves as the lead data scientist for Providence Health & Services in Portland, Oregon. In his free time, Manuel does competitive data modeling on Kaggle.com, CrowdANALYTIX.com, Datascience.net, and DrivenData.org.

Product Features

  • Learn Using R for Big Data with Spark from a professional trainer, at your own desk
  • Visual training method offers increased retention and accelerated learning
  • Breaks even the most complex applications down into simple steps
  • Easy-to-follow, step-by-step lessons, ideal for all
  • Comes with extensive working files
