Enhancing Availability of Marine Bigdata Repository with a New Fault Tolerance Technique

Ahmad Shukri Mohd Noor, Farizah Yunus, Rabiei Mamat, Emma A. Sirajuddin, Nur F. Mat Zin


System availability is a crucial property of a dependable knowledge repository system: the system must survive minor outages and recover from them quickly through an automated process. The National Marine Bioinformatics System (NABTICS) is a Marine Microbial Bigdata Repository that integrates genomic sequence information with its associated metadata. It is projected to become a large and growing database, as well as a metadata system that supplies inputs for research analysis and for solving community issues. It is therefore critical to maintain the availability of the system by detecting failures accurately and promptly, and by recovering quickly when a failure occurs. A failure in any NABTICS system component can be devastating, leaving the system inaccessible for a period of time. In this paper, we integrate NABTICS with cloud-based Neighbour Replication and Failure Recovery (NRFR) to enhance the availability of the system. We show that the implementation results in a better user experience with minimal system downtime, so that the online database application can be regarded as highly available. Furthermore, NABTICS also achieves better resource utilization and higher application responsiveness at runtime.
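The general idea of timely failure detection followed by recovery through a neighbouring replica can be illustrated with a minimal sketch. It is not the NRFR algorithm itself, whose details are not given here; it assumes a simple heartbeat timeout as the detector and a single designated neighbour per node, and the names `Node`, `Monitor`, `route`, `db1`, and `db2` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical repository node whose data is also held by one neighbour."""
    name: str
    neighbour: str                 # name of the node that replicates this node's data
    last_heartbeat: float = 0.0    # time (seconds) at which the last heartbeat arrived


@dataclass
class Monitor:
    """Timeout-based failure detector with failover to a neighbour replica (assumed design)."""
    timeout: float                               # seconds of silence before a node is suspected
    nodes: dict = field(default_factory=dict)    # maps node name to Node

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def heartbeat(self, name: str, now: float) -> None:
        # Record that `name` reported in at time `now`.
        self.nodes[name].last_heartbeat = now

    def is_alive(self, name: str, now: float) -> bool:
        # A node is considered alive if it has reported within the timeout window.
        return now - self.nodes[name].last_heartbeat <= self.timeout

    def route(self, name: str, now: float) -> str:
        """Return the node that should serve requests addressed to `name`."""
        if self.is_alive(name, now):
            return name
        neighbour = self.nodes[name].neighbour
        if self.is_alive(neighbour, now):
            return neighbour       # recover by redirecting to the neighbour replica
        raise RuntimeError(f"{name} and its neighbour replica {neighbour} are both unavailable")


# Two nodes that replicate each other; db1 stops sending heartbeats after t=0.
monitor = Monitor(timeout=3.0)
monitor.add(Node("db1", neighbour="db2"))
monitor.add(Node("db2", neighbour="db1"))
monitor.heartbeat("db1", now=0.0)
monitor.heartbeat("db2", now=0.0)
monitor.heartbeat("db2", now=4.0)
print(monitor.route("db1", now=5.0))   # requests for db1 are redirected to db2
```

In this sketch the timeout trades detection speed against false suspicion: a shorter timeout shortens downtime after a real failure but is more likely to redirect traffic away from a node that is merely slow.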


Availability; Bigdata; Database Replication; Distributed System







Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.

ISSN: 2180-1843

eISSN: 2289-8131