In the previous article, working through the first three sections of "In Search of an Understandable Consensus Algorithm", I gave an overview of the paper, briefly explaining what replicated state machines are used for and the problems with Paxos itself.
4. Designing for understandability
We had several goals in designing Raft:
it must provide a complete and practical foundation for system building;
it must be safe under all conditions and available under typical operating conditions;
it must be efficient for common operations.
Our most important goal — and most difficult challenge — was understandability.
From this we can see that Raft was designed from the start to be easy to understand and easy to build real systems on.
There were numerous points in the design of Raft where we had to choose among alternative approaches. In these situations we evaluated the alternatives based on understandability.
A while ago, at a team meeting, I gave the group a brief introduction to the Raft protocol. I had read the Raft paper roughly four years earlier, and when preparing the talk I realized I had forgotten a great many details. So I dug out "In Search of an Understandable Consensus Algorithm" again, reread it, and took these notes and excerpts for future study and review.
Abstract
Raft is a consensus algorithm for managing a replicated log.
The opening sentence of the abstract already states Raft's defining characteristic: Raft is a consensus algorithm for managing a replicated log.
In order to enhance understandability, Raft separates the key elements of consensus, such as leader election, log replication, and safety, and it enforces a stronger degree of coherency to reduce the number of states that must be considered.
To enhance understandability, Raft decomposes consensus into key elements such as leader election, log replication, and safety; at the same time, it enforces a stronger degree of coherency to reduce the number of states that must be considered.
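To make this decomposition a bit more concrete, here is a minimal sketch of the per-server state described in Figure 2 of the paper. The class and field names (Role, LogEntry, RaftServer) are my own illustration rather than code from any real implementation, and all election and replication logic is omitted.

import java.util.ArrayList;
import java.util.List;

// Roles a Raft server can take on.
enum Role { FOLLOWER, CANDIDATE, LEADER }

// A single entry in the replicated log.
class LogEntry {
    final long term;       // term in which the entry was received by the leader
    final String command;  // command for the replicated state machine
    LogEntry(long term, String command) {
        this.term = term;
        this.command = command;
    }
}

// Minimal per-server state, loosely following Figure 2 of the paper.
class RaftServer {
    Role role = Role.FOLLOWER;    // every server starts out as a follower
    long currentTerm = 0;         // latest term this server has seen (persistent)
    String votedFor = null;       // candidate voted for in currentTerm, or null (persistent)
    final List<LogEntry> log = new ArrayList<>();  // log entries to replicate (persistent)

    long commitIndex = 0;         // highest log index known to be committed (volatile)
    long lastApplied = 0;         // highest log index applied to the state machine (volatile)
}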
1. Introduction
Consensus algorithms allow a collection of machines to work as a coherent group that can survive the failures of some of its members.
The previous article, "Distributed Locks: Apache Curator InterProcessMutex", introduced the mutual-exclusion lock built on ZooKeeper. Beyond that, a read/write lock can also be implemented. This article briefly introduces how InterProcessReadWriteLock works.
As usual, let's start with the class's Javadoc:
/**
 * <p>
 * A re-entrant read/write mutex that works across JVMs. Uses Zookeeper to hold the lock. All processes
 * in all JVMs that use the same lock path will achieve an inter-process critical section. Further, this mutex is
 * "fair" - each user will get the mutex in the order requested (from ZK's point of view).
 * </p>
 *
 * <p>
 * A read write lock maintains a pair of associated locks, one for read-only operations and one
 * for writing. The read lock may be held simultaneously by multiple reader processes, so long as
 * there are no writers. The write lock is exclusive.
 * </p>
 *
 * <p>
 * <b>Reentrancy</b><br>
 * This lock allows both readers and writers to reacquire read or write locks in the style of a
 * re-entrant lock. Non-re-entrant readers are not allowed until all write locks held by the
 * writing thread/process have been released. Additionally, a writer can acquire the read lock, but not
 * vice-versa. If a reader tries to acquire the write lock it will never succeed.<br><br>
 *
 * <b>Lock downgrading</b><br>
 * Re-entrancy also allows downgrading from the write lock to a read lock, by acquiring the write
 * lock, then the read lock and then releasing the write lock. However, upgrading from a read
 * lock to the write lock is not possible.
 * </p>
 */
public class InterProcessReadWriteLock {
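The Javadoc already summarizes the semantics: shared read lock, exclusive write lock, reentrancy, and write-to-read downgrading. As a quick orientation before looking at the implementation, here is a rough usage sketch; the connection string, lock path, and retry settings are placeholder values of my own, not anything prescribed by Curator.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessReadWriteLock;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ReadWriteLockExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string and lock path, for illustration only.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessReadWriteLock rwLock =
                new InterProcessReadWriteLock(client, "/locks/demo-rw-lock");

        // Multiple processes may hold the read lock at the same time,
        // as long as no process holds the write lock.
        rwLock.readLock().acquire();
        try {
            // ... read shared state ...
        } finally {
            rwLock.readLock().release();
        }

        // The write lock is exclusive; downgrading (write -> read) is possible,
        // but upgrading (read -> write) is not.
        rwLock.writeLock().acquire();
        try {
            // ... modify shared state ...
        } finally {
            rwLock.writeLock().release();
        }

        client.close();
    }
}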
Distributed locks are a familiar topic, but I had mostly paid attention to implementations based on Redis. I knew ZooKeeper could also be used for distributed locks, yet my original assumption was that you would simply port the Redis approach over to ZooKeeper. Today I learned that Apache Curator ships with a built-in distributed lock implementation: InterProcessMutex. After reading its source code, I found that its approach differs quite a lot from the Redis-based one, so I am writing it down here.
Let's first look at the overall flow.
Combining the flow chart with the source code, the lock-acquisition process works as follows (a simplified sketch follows the list):
First check whether the current thread already holds lock data locally; if so, increment the lock count and return true.
If there is no local lock data, try to acquire the lock:
Create an ephemeral sequential node under the specified path.
Fetch all child nodes under that path and check whether our node has the smallest sequence number:
If our node has the smallest sequence number, the lock is acquired; otherwise,
if our node is not the one with the smallest sequence number, keep retrying via a while loop plus wait(times) until the lock is eventually acquired.
Once the lock is acquired, the lock information is cached locally in the ConcurrentMap<Thread, LockData> threadData field, which makes re-entrancy counting easy.
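To tie the steps above together, here is a heavily simplified sketch of that acquisition flow. It is not Curator's actual source code: the class and member names (SimpleZkMutex and friends) are my own, error handling and session-loss handling are omitted, and where Curator sets a watch on the predecessor node and calls wait(...), the sketch just sleeps.

import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Illustrative only: a stripped-down mutex mirroring the steps above.
class SimpleZkMutex {
    private final ZooKeeper zk;
    private final String basePath;   // e.g. "/locks/my-lock" (placeholder)
    private final ConcurrentMap<Thread, LockData> threadData = new ConcurrentHashMap<>();

    private static class LockData {
        final String lockPath;
        int lockCount = 1;           // re-entrancy counter
        LockData(String lockPath) { this.lockPath = lockPath; }
    }

    SimpleZkMutex(ZooKeeper zk, String basePath) {
        this.zk = zk;
        this.basePath = basePath;
    }

    public void acquire() throws Exception {
        // 1. Re-entrant fast path: if this thread already holds the lock, just bump the count.
        LockData existing = threadData.get(Thread.currentThread());
        if (existing != null) {
            existing.lockCount++;
            return;
        }

        // 2. Create an ephemeral sequential node under the lock path.
        String ourPath = zk.create(basePath + "/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        // 3. Spin until our node has the smallest sequence number.
        while (true) {
            List<String> children = zk.getChildren(basePath, false);
            Collections.sort(children);   // zero-padded sequence numbers sort lexicographically
            String ourNode = ourPath.substring(basePath.length() + 1);
            if (ourNode.equals(children.get(0))) {
                break;                    // smallest node: lock acquired
            }
            // Curator watches the predecessor node and waits; we just sleep to keep the sketch short.
            Thread.sleep(100);
        }

        // 4. Cache lock data locally so later acquisitions by this thread are re-entrant.
        threadData.put(Thread.currentThread(), new LockData(ourPath));
    }
}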
The resulting structure in ZooKeeper looks roughly like this:
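For example, with a hypothetical lock path of /locks/my-lock, the children might look like this (the _c_ prefix plus a UUID comes from Curator creating the node in protected mode; the UUIDs below are shortened placeholders):

/locks/my-lock
    _c_6a2f...-lock-0000000000   <- client A's ephemeral sequential node, smallest sequence number, holds the lock
    _c_9b41...-lock-0000000001   <- client B, waiting
    _c_c7d8...-lock-0000000002   <- client C, waiting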
Below we will walk through the methods one by one. First, let's look at the Javadoc of InterProcessMutex:
/**
 * A re-entrant mutex that works across JVMs. Uses Zookeeper to hold the lock. All processes in all JVMs that
 * use the same lock path will achieve an inter-process critical section. Further, this mutex is
 * "fair" - each user will get the mutex in the order requested (from ZK's point of view)
 */
public class InterProcessMutex implements InterProcessLock, Revocable<InterProcessMutex>
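Before walking through the individual methods, a rough usage sketch may help frame them; the connection string, lock path, and timeout below are placeholders of my own choosing.

import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class MutexExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string and lock path, for illustration only.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        InterProcessMutex mutex = new InterProcessMutex(client, "/locks/my-lock");

        // Try to acquire the lock, waiting at most 5 seconds; acquire() with no
        // arguments blocks until the lock is obtained.
        if (mutex.acquire(5, TimeUnit.SECONDS)) {
            try {
                // ... critical section ...
            } finally {
                mutex.release();
            }
        }

        client.close();
    }
}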