Thanks to technical progress, we can shoot Full HD or even 4K video on our phones, upload it and share it with friends on Odnoklassniki, or even broadcast live to the whole world. This means we need to store dozens of petabytes of data and serve it at hundreds of Gb/s, which demands an infrastructure of thousands of disks and hundreds of servers.
Previously, to ensure the required level of storage reliability and fault tolerance, we had to keep three replicas of the data, one in each data center. The explosive growth in the number of uploaded videos, together with our operating experience, made us reconsider how we store data such as photos and videos. We decided to develop a new storage system that would be both cheaper and more reliable. It was also important to simplify operation, since at this scale even replacing disks or restoring data requires significant resources.
This talk is about how we reduced storage redundancy from 3 to 2.1 while achieving higher overall reliability and availability. We'll share our experience of operating a system with thousands of disks, explain how we made disk replacement simple and safe, and talk about the problems you don't expect and the creative solutions we found for them.
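To put the redundancy figures in perspective, here is a minimal arithmetic sketch of what moving from a factor of 3 to a factor of 2.1 means for raw disk capacity. The 10 PB logical volume and the function names are hypothetical, chosen only for illustration; the abstract says only "dozens of petabytes" and does not describe how the 2.1 factor is achieved.

```python
def raw_capacity_pb(logical_pb: float, redundancy_factor: float) -> float:
    """Raw disk capacity (PB) needed to hold `logical_pb` of user data.

    `redundancy_factor` is raw bytes stored per logical byte,
    e.g. 3.0 for three full replicas.
    """
    if logical_pb < 0:
        raise ValueError("logical_pb must be non-negative")
    if redundancy_factor < 1:
        raise ValueError("redundancy_factor must be at least 1")
    return logical_pb * redundancy_factor


def savings_fraction(old_factor: float, new_factor: float) -> float:
    """Fraction of raw capacity saved when moving from old_factor to new_factor."""
    return 1 - new_factor / old_factor


# Hypothetical logical data volume, for illustration only.
LOGICAL_PB = 10.0

old_raw = raw_capacity_pb(LOGICAL_PB, 3.0)  # three replicas
new_raw = raw_capacity_pb(LOGICAL_PB, 2.1)  # redundancy quoted in the abstract
saved = savings_fraction(3.0, 2.1)

print(f"raw capacity at 3.0x: {old_raw:.1f} PB")
print(f"raw capacity at 2.1x: {new_raw:.1f} PB")
print(f"raw capacity saved:   {saved:.0%}")
```

Whatever the actual data volume, the relative saving is the same: about 30% less raw capacity for the same amount of user data.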
Alexander Khristoforov, Odnoklassniki
Born in 1979. Graduated from Riga Technical University. Has worked as a programmer since 1998 and has been writing Java since 2000. Senior Developer on the platform team at Odnoklassniki since 2009, where his duties have included developing and deploying data stores and building various services, such as the API, video, feed, and music.