Big Data: Principles and best practices of scalable realtime data systems 1st Edition

by Nathan Marz (Author), James Warren (Author)

Summary

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.

Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the Book

Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size or speed. Fortunately, scale and simplicity are not mutually exclusive.

Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.

This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.
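
As a rough illustration of the Lambda Architecture described above, the following is a minimal in-memory sketch in Python. It is not taken from the book, whose own material relies on tools such as Hadoop and Storm. The pageview-counting scenario and every name in it (`master_dataset`, `batch_view`, `realtime_view`, `record_event`, `run_batch`, `query`) are assumptions made purely for illustration. The idea it tries to show: a batch view is recomputed from the full, append-only dataset, a realtime view covers only the events the last batch run has not yet seen, and a query merges the two.

```python
from collections import Counter

# Hypothetical scenario: counting pageviews per URL.
master_dataset = []        # append-only log of raw events
batch_view = Counter()     # precomputed from the whole master dataset
realtime_view = Counter()  # covers only events the last batch run has not seen


def record_event(url):
    """Append a raw event and update the realtime view incrementally."""
    master_dataset.append(url)
    realtime_view[url] += 1


def run_batch():
    """Recompute the batch view from scratch, then discard the realtime state."""
    global batch_view
    batch_view = Counter(master_dataset)
    realtime_view.clear()


def query(url):
    """Answer a query by merging the batch view with the realtime view."""
    return batch_view[url] + realtime_view[url]


record_event("/home")
record_event("/home")
run_batch()                # batch view now holds both "/home" events
record_event("/home")      # these two events exist only in the realtime view
record_event("/about")
print(query("/home"), query("/about"))  # prints: 3 1
```

In this toy version the batch run is instantaneous, so clearing the realtime view is safe. In a real system the batch computation takes time, and the realtime view would have to keep the events that arrived while the batch job was running.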

What's Inside

Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills

About the Authors

Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.

Table of Contents

A new paradigm for Big Data
PART 1 BATCH LAYER
Data model for Big Data
Data model for Big Data: Illustration
Data storage on the batch layer
Data storage on the batch layer: Illustration
Batch layer
Batch layer: Illustration
An example batch layer: Architecture and algorithms
An example batch layer: Implementation
PART 2 SERVING LAYER
Serving layer
Serving layer: Illustration
PART 3 SPEED LAYER
Realtime views
Realtime views: Illustration
Queuing and stream processing
Queuing and stream processing: Illustration
Micro-batch stream processing
Micro-batch stream processing: Illustration
Lambda Architecture in depth

Frequently bought together

Big Data: Principles and best practices of scalable realtime data systems

Hadoop: The Definitive Guide: Storage and Analysis at Internet Scale

Learning Spark: Lightning-Fast Data Analytics

Bill Chambers
429
Paperback
35 offers from $31.62
Joe Reis
449
Paperback
65 offers from $27.29
Jules Damji
291
Paperback
25 offers from $29.97
Tom White
285
Paperback
110 offers from $2.08
Foster Provost
1,292
Paperback
90 offers from $6.54
James Densmore
349
Paperback
37 offers from $9.68

From the Publisher

About This Book

Services like social networks, web analytics, and intelligent e-commerce often need to manage data at a scale too big for a traditional database. Complexity increases with scale and demand, and handling Big Data is not as simple as just doubling down on your RDBMS or rolling out some trendy new technology. Fortunately, scalability and simplicity are not mutually exclusive—you just need to take a different approach. Big Data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.

Big Data teaches you to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to Big Data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of Big Data systems and how to implement them in practice.

Big Data requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful, though not required. The goal of the book is to teach you how to think about data systems and how to break down difficult problems into simple solutions. We start from first principles and from those deduce the necessary properties for each component of an architecture.

Editorial Reviews

About the Author

Nathan Marz is currently working on a new startup. Previously, he was the lead engineer at BackType before it was acquired by Twitter in 2011. At Twitter, he started the streaming compute team, which provides and develops shared infrastructure to support many critical realtime applications throughout the company. Nathan is the creator of Cascalog and Storm, open-source projects that are relied upon by over 50 companies around the world, including Yahoo!, Twitter, Groupon, The Weather Channel, and Taobao.

James Warren is an analytics architect at Storm8 with a background in big data processing, machine learning and scientific computing.

Product details

Publisher: Manning Publications; 1st edition (May 19, 2015)
Language: English
Paperback: 328 pages
ISBN-10: 1617290343
ISBN-13: 978-1617290343
Item Weight: 1.21 pounds
Dimensions: 7.38 x 0.6 x 9.25 inches

Best Sellers Rank: #822,210 in Books (See Top 100 in Books)
- #313 in Data Mining (Books)
- #1,022 in Software Development (Books)
- #1,248 in Internet & Telecommunications

Customer Reviews: 4.2 out of 5 (100 ratings)

Customer reviews

4.2 out of 5

100 global ratings

Top reviews from the United States

Amazon Customer

typo

Reviewed in the United States on April 10, 2016

Verified Purchase

If you are looking for a survey of different approaches to handling big data, you want to read "ELEMENTS OF SCALE: COMPOSING AND SCALING DATA PLATFORMS". ([...]) This book is dedicated to the Lambda Architecture (one of the approaches surveyed in that article).

The book is very well organized. The introduction in chapter 1 serves as the road map for the whole book. Starting from a simple web application built on an RDBMS, the author shows how the usual approach to scaling it becomes undesirable. After enumerating a list of desired properties, he proposes the Lambda Architecture, an approach that contrasts with a fully incremental architecture (built on an RDBMS).

The Lambda Architecture is partitioned into three layers:
1. a batch layer that computes different views on big data
2. a serving layer that answers user queries using views from the batch layer and the speed layer
3. a speed layer that compensates with an approximate answer for the period of time while the batch layer is still working on the complete answers

In the remaining chapters, the author dives deep into the rationale and requirements of all the different pieces of the Lambda Architecture.

To understand the context of the Lambda Architecture, also refer to Wikipedia for criticism.

9 people found this helpful

Zambonilli

Lambda Architecture FTW

Reviewed in the United States on June 14, 2015

Verified Purchase

Great explanation of both the theory and practice of the lambda architecture. While the practice chapters are nice, it's the theory chapters that really shine. The book explains down to the byte level why components are implemented the way they are. For example, there's an immense amount of detail as to why using a db that doesn't support random writes allows for an application to query the batch layer's results without locking.

The only downside to the book is that the architecture and ecosystem are so new that there aren't really a lot of pragmatic solutions. For example, the theory describes a query layer that can merge the results of batch and real-time processing for client applications. However, in real life there are no pragmatic solutions for doing this, so you'd have to write your own.

It'll be interesting to see how the lambda architecture matures and to see future editions of this book. Hopefully, future editions will be as well written and have a better ecosystem for practice chapters.

14 people found this helpful

Amazon Customer

The perfect book to understand big data concepts

Reviewed in the United States on November 4, 2015

Verified Purchase

In all honesty, the book has simplified big data architecture and its general premise in an eye opening way. Starting from the batch layer and spending a good amount of time addressing different aspects of it gave me a valuable lesson as a developer in understanding the complexity as well as the necessity of evaluating my data entries and their impact in the future formation of worthy analytics/results.

My girlfriend and I enjoyed every chapter in this book. I guarantee you that you won't regret buying this book. I am looking forward to another book from you guys on the topic, because it's the first time that I couldn't wait to pick up a book and get to the end of it.

2 people found this helpful

Richard Hedin

Coherent view, not a particular technology

Reviewed in the United States on February 23, 2020

Verified Purchase

Right up there with Paul's Letter to the Romans! Well, not equal with Paul's Letter to the Romans.

But it brought Paul's letter to the Romans to mind!

Clear, just enough detail, well-ordered.

I work at a large corporation, on a real-time data system. If we had followed the author's recommendations, I wouldn't have the problem I've been dealing with for the last several weeks.

Dimitri K

Written by a specialist

Reviewed in the United States on March 13, 2016

Verified Purchase

This book is written by a specialist in big data. I know that because I worked on the big data pipeline. And now I read the book and I see that all my problems are addressed in this book. Virtually every problem discussed appeared in my pipeline too, as if the author worked with me on my project.

The other feature of this book that was very useful to me is that it is the first book where I could find a concise explanation of the Storm Trident framework, even though the book is not about Storm.

5 people found this helpful

Y. Yuan

Everything looks good until page 20 ...

Reviewed in the United States on October 21, 2015

Verified Purchase

I feel really sorry for those who gave 5 stars for this book. I purchased the book and started reading it eagerly as soon as I received it. It got my attention until I got to page 20 with a statement saying "...... If anything ever goes wrong, you can discard the state for the entire speed layer, and everything will be back to normal within a few hours." Within a few hours? No high-traffic production sites can afford a few hours down-time. At that point, I decided to return the book, which I did.

I did scan through the rest of the book, though. First, the so-called lambda architecture might sound like a new term, but many high concurrency websites already work that way. For a high concurrency web site, the first-layer would be memcached-based, which gives O(1) low latency on all queries. The second layer would be a clustered app-server layer. The third layer could be a high-concurrency, extremely low-latency layer like a NoSQL cluster. The far backend could be Hadoop- or Spark-based for batch jobs. This is the known architecture in production for high traffic websites that need to support millions of concurrent users.

Secondly, the bulk of the book is actually about Hadoop in the so-called batch layer. Hadoop once generated some excitement, but it has lost steam due to the new kid on the block named Spark, which can do whatever Hadoop can do, but 10x - 20x faster and at a fraction of the cost.

9 people found this helpful

David

Bad binding

Reviewed in the United States on May 4, 2020

Verified Purchase

This book has a bad binding. I bought this book and opened it only twice, and it is already broken.

Antonio P. Paes

Great content. Bad structure/assembling quality

Reviewed in the United States on December 22, 2021

Verified Purchase

I just received this book. The content is great, but as I started to turn the pages, they started to fall out. I buy a lot of books, and it has been a very long time since I saw such bad quality in a book's physical construction. 5 stars for the content. 0 stars for the book's physical construction.

Top reviews from other countries

jonathanGabriel

Good book on architecture

Reviewed in Mexico on January 21, 2020

Verified Purchase

It is a very good book covering the fundamentals of the Lambda Architecture. Easy to read and understand. I recommend it. It is a theoretical book about architecture, with some practical examples.

Amazon Customer

Pleasant and interesting

Reviewed in the United Kingdom on December 5, 2016

Verified Purchase

Big Data and the technologies around this subject can be very hard and low-level to understand.
With this book I found it clear, concise, and explained in such a way that everyone with little or no background in IT can understand.
A very good Big Data insight, and also helpful for understanding which are the best tools to achieve good results with Hadoop and other technologies.
I found it very interesting, well written, and pleasant to read as well. This book helped me a lot, and I'm sure it can help a lot of beginners with this subject.

Eduard

A classic of Big Data architecture

Reviewed in Spain on October 24, 2016

Verified Purchase

If you are designing architectures for big data, or even if you think your application might someday reach big data scale, this is your book. The concepts are essentially CQRS and Event Sourcing, but at large scale and for delivering answers in real time.

frchatel

excellent

Reviewed in France on November 6, 2016

Verified Purchase

An excellent book, precise and detailed, built around a big data case study.
An instructive work, but one that requires a certain amount of concentration because of the technological complexity it describes.
Aimed at an informed audience of developers (many illustrations with Java code samples).

One person found this helpful

Alfred Huang

Five Stars

Reviewed in Canada on June 6, 2015

Verified Purchase

it is a good book.
