Book Introduction

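Designing Data-Intensive Applications
  • Author: Martin Kleppmann
  • Publisher: Southeast University Press, Nanjing
  • ISBN: 9787564173852
  • Publication year: 2017
  • Printed pages: 594
  • File size: 267 MB
  • Pages in file: 613
  • Subject terms: Software tools - Fundamentals - English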

Table of Contents

Part I. Foundations of Data Systems

1. Reliable, Scalable, and Maintainable Applications

Thinking About Data Systems

Reliability

Hardware Faults

Software Errors

Human Errors

How Important Is Reliability?

Scalability

Describing Load

Describing Performance

Approaches for Coping with Load

Maintainability

Operability: Making Life Easy for Operations

Simplicity: Managing Complexity

Evolvability: Making Change Easy

Summary

2. Data Models and Query Languages

Relational Model Versus Document Model

The Birth of NoSQL

The Object-Relational Mismatch

Many-to-One and Many-to-Many Relationships

Are Document Databases Repeating History?

Relational Versus Document Databases Today

Query Languages for Data

Declarative Queries on the Web

MapReduce Querying

Graph-Like Data Models

Property Graphs

The Cypher Query Language

Graph Queries in SQL

Triple-Stores and SPARQL

The Foundation: Datalog

Summary

3. Storage and Retrieval

Data Structures That Power Your Database

Hash Indexes

SSTables and LSM-Trees

B-Trees

Comparing B-Trees and LSM-Trees

Other Indexing Structures

Transaction Processing or Analytics?

Data Warehousing

Stars and Snowflakes: Schemas for Analytics

Column-Oriented Storage

Column Compression

Sort Order in Column Storage

Writing to Column-Oriented Storage

Aggregation: Data Cubes and Materialized Views

Summary

4. Encoding and Evolution

Formats for Encoding Data

Language-Specific Formats

JSON, XML, and Binary Variants

Thrift and Protocol Buffers

Avro

The Merits of Schemas

Modes of Dataflow

Dataflow Through Databases

Dataflow Through Services: REST and RPC

Message-Passing Dataflow

Summary

Part II. Distributed Data

5. Replication

Leaders and Followers

Synchronous Versus Asynchronous Replication

Setting Up New Followers

Handling Node Outages

Implementation of Replication Logs

Problems with Replication Lag

Reading Your Own Writes

Monotonic Reads

Consistent Prefix Reads

Solutions for Replication Lag

Multi-Leader Replication

Use Cases for Multi-Leader Replication

Handling Write Conflicts

Multi-Leader Replication Topologies

Leaderless Replication

Writing to the Database When a Node Is Down

Limitations of Quorum Consistency

Sloppy Quorums and Hinted Handoff

Detecting Concurrent Writes

Summary

6. Partitioning

Partitioning and Replication

Partitioning of Key-Value Data

Partitioning by Key Range

Partitioning by Hash of Key

Skewed Workloads and Relieving Hot Spots

Partitioning and Secondary Indexes

Partitioning Secondary Indexes by Document

Partitioning Secondary Indexes by Term

Rebalancing Partitions

Strategies for Rebalancing

Operations: Automatic or Manual Rebalancing

Request Routing

Parallel Query Execution

Summary

7. Transactions

The Slippery Concept of a Transaction

The Meaning of ACID

Single-Object and Multi-Object Operations

Weak Isolation Levels

Read Committed

Snapshot Isolation and Repeatable Read

Preventing Lost Updates

Write Skew and Phantoms

Serializability

Actual Serial Execution

Two-Phase Locking (2PL)

Serializable Snapshot Isolation (SSI)

Summary

8. The Trouble with Distributed Systems

Faults and Partial Failures

Cloud Computing and Supercomputing

Unreliable Networks

Network Faults in Practice

Detecting Faults

Timeouts and Unbounded Delays

Synchronous Versus Asynchronous Networks

Unreliable Clocks

Monotonic Versus Time-of-Day Clocks

Clock Synchronization and Accuracy

Relying on Synchronized Clocks

Process Pauses

Knowledge, Truth, and Lies

The Truth Is Defined by the Majority

Byzantine Faults

System Model and Reality

Summary

9. Consistency and Consensus

Consistency Guarantees

Linearizability

What Makes a System Linearizable?

Relying on Linearizability

Implementing Linearizable Systems

The Cost of Linearizability

Ordering Guarantees

Ordering and Causality

Sequence Number Ordering

Total Order Broadcast

Distributed Transactions and Consensus

Atomic Commit and Two-Phase Commit (2PC)

Distributed Transactions in Practice

Fault-Tolerant Consensus

Membership and Coordination Services

Summary

Part III. Derived Data

10. Batch Processing

Batch Processing with Unix Tools

Simple Log Analysis

The Unix Philosophy

MapReduce and Distributed Filesystems

MapReduce Job Execution

Reduce-Side Joins and Grouping

Map-Side Joins

The Output of Batch Workflows

Comparing Hadoop to Distributed Databases

Beyond MapReduce

Materialization of Intermediate State

Graphs and Iterative Processing

High-Level APIs and Languages

Summary

11. Stream Processing

Transmitting Event Streams

Messaging Systems

Partitioned Logs

Databases and Streams

Keeping Systems in Sync

Change Data Capture

Event Sourcing

State, Streams, and Immutability

Processing Streams

Uses of Stream Processing

Reasoning About Time

Stream Joins

Fault Tolerance

Summary

12. The Future of Data Systems

Data Integration

Combining Specialized Tools by Deriving Data

Batch and Stream Processing

Unbundling Databases

Composing Data Storage Technologies

Designing Applications Around Dataflow

Observing Derived State

Aiming for Correctness

The End-to-End Argument for Databases

Enforcing Constraints

Timeliness and Integrity

Trust, but Verify

Doing the Right Thing

Predictive Analytics

Privacy and Tracking

Summary

Glossary

Index
