-
Notifications
You must be signed in to change notification settings - Fork 34
description of replication work
Description
The deliverable of this project is implementing the master/slave replication protocol in the IMVU istatd open source project. The project is developed in C++ on Linux, and the code has to compile out of the box on ubuntu 10.04, ubuntu 12.04, and Arch Linux 201301. �IMVU istatd� is a daemon-and-agent system that collects, collates, and reports counter information from a large number of computers to a central server. That central server keeps time series databases for the counters. These statistics are important to the IMVU business, and we want to keep a secondary server in sync with the data collected by the first server. A protocol has been defined for a secondary server to be configured to contact a current master server, be told about data that needs to be replicated, and then receive replication data on an ongoing basis to create an up-to-date clone of the master. Should the master fail, the replica will be switched to master mode, and the system reconfigured to answer to the IP address of the master -- the replica system becomes the new master. There is existing code for all of the server data collection, storage, query, and connection functionality already.
This project simply implements the replication protocol, described below, on top of this existing functionality and code. The delivery should include well documented and tested C++ code that builds on the defined Linux versions, and efficiently implements the behavior descrived. All ownership, copyright, clear title and moral rights in the work product must be transfered to IMVU, Inc, as part of the accepted delivery. IMVU may possibly further release this work product as open source. Periodic checkpoints will be established as part of the project plan, and review to identify acceptance/re-work will be done by IMVU, Inc.
Detailed description of work: The existing code and documentation is available on github: Code: https://github.com/imvu-open/istatd Documentation: https://github.com/imvu-open/istatd/wiki
As an addition to this work, using the existing infrastructure and patterns, and not adding any other libraries or dependencies, C++ code is developed to implement the replication protocol as described in the file daemin/ReplicaOf.cpp, largely reproduced here:
The replication protocol is simple. PDUs are enclosed in framing of a "type" word (4 bytes) and a "payload length" word (4 bytes). These are little-endian, because that's what the data is. The replication master will send a "discover" PDU for a counter after that file is opened (and potentially more than once). This lets the slave build up the appropriate set of files to replicate. The slave will send a "commit file time" message to the master. This asks the master to send any data for the given file that's after the given time. This may not be all the data, but that's OK as the replica slave will ask for data again later. The master will send a "file data" PDU with data for a time range for a file, which is just a chunk of buckets, preceded by the file name and appropriate protocol framing. This is only sent in response to a request for file data from the slave. The data may overlap previously sent data, and may have gaps. After receiving a chunk of data and recording it, the slave will send a new �commit file time� message to the master, letting the master know that more data can be sent, and where in the source file to start that sending. The master should typically send the "discover" message when it gets around to flushing each file for the first time, to spread the traffic around a little bit. The ideal place to do this is in the periodic sync thread that visits all files in a cyclic manner to flush them to disk. Similarly, the slave should probably request counter data with some amount of pacing, to avoid totally flooding the connection during start-up. A PID and a random number (from /dev/urandom) is used to prevent a process from communicating with itself over this protocol.
PDU stands for Protocol Data Unit, and a simple mechanism for defining and sending PDUs between machines exists in the code. Several of the PDUs needed to implement the protocol are already defined in the file daemon/ReplicaPdus.h. �bucket� stands for the basic data organizational unit of istatd, defined in the file include/istat/Bucket.h, and used in the StatFile. Additionally, functional tests and unit tests proving the full functioning of the system should be part of delivery.
The framework for functional tests is bash scripting, found in ftest/functions. See ftest/test_replication.sh for a functional test that tests the current configured-slave-connects-to-configured-master behavior. (This is the behavior that should be extended to implement the replication protocol.)
The framework for unit tests is simple C++ macros, found in include/istat/test.h. An example of a working unit test is test/test_PduProtocol.cpp, which tests part of the infrastructure to be used for implementing the replication protocol. Another example is test/test_StatFile.cpp, which tests the underlying storage system.
In production, this server stresses the I/O subsystem of the running server. Additionally, a reactor model based on boost::asio and boost::thread is used to scale across multiple CPU cores and sockets. Thus, the work implemented should use best practices in multithreading, I/O, and networking, to be robust to network partitioning, not overload the I/O system, and integrate well with the existing threading/reactor system. Use of boost::asio, boost::asio::strand, and boost::bind as well as the classes interfacing with those from within the implementation of istatd may be needed. Naive implementations such as sending the multi-megabyte contents of a single file in one big TCP stream write are unlikely to achieve these goals. With the amount of counter files used (500,000 files) and counters touched every second (60,000 per second,) RAM and disk space is also at a premium, and efficient use of system resources is required to complete the work.
Suitable internal metrics and counters to monitor performance of the replication system should be included in the work (see daemon/LoopbackCounter.h) and administration interface commands to manage the replication state, including suspending and resuming replication functions on both master and slave, and turning off replication configuration on a slave, should be part of the work. See daemon/AdminServer.cpp for the implementation of the administration interface where these functions should be added.
Code organization: The code is built with GNU Make, GCC, and boost libraries. There is a �quickstart.sh� script that generally sets up a freshly checked out repository and builds it as appropriately for the current system. Data structures and functions that are of general use to utilities dealing with istat files on their own are collected into a library, with headers in include/istat/.h, and source files in lib/.cpp. Classes and structures used by istatd itself are implemented in daemon/.h and daemon/.cpp. These are also built into a separate library, which is linked by the unit test executables, but not otherwise intended to be exposed to other tools and programs on the system.