System Reliability: implementing ‘golden metrics’

Before we start, let's first consider what system reliability means. Put simply, it is the probability that a product performs its intended function under stated conditions, without failure, for a given period of time. Achieving it requires, among other things, continuous monitoring of the state of the system. Why this is so important …

Ransomware attack: decision tree

A ransomware attack is one of the most damaging types of cyber attack, and the one most feared by business owners and cybersecurity defenders. This worry is not without reason. In an instant, an organization’s critical IT infrastructure can be brought down for weeks or months, completely halting all business. Some data …

Kubernetes distributed alert management with Prometheus Operator and Flux Notification Controller

Distributed alert management (DAM) automatically identifies violations of service level objectives and risky activity inside a cluster and its GitOps infrastructure. In my previous post, I presented a redundant monitoring infrastructure based on a variety of tools such as Grafana, Prometheus, Loki and Thanos. This article focuses on a way to integrate continuous monitoring …

Simple message queue with RabbitMQ and EasyNetQ on .NET Core

RabbitMQ is one of the most popular open source message brokers and a critical component of distributed applications and platforms based on the microservice pattern, such as online trading, order processing software and booking hubs. Some customers choose RabbitMQ for its feature richness, active community support, and broad range of supported clients and frameworks. However, RabbitMQ message …

Building a redundant EKS monitoring and alerting stack

Because Kubernetes containers are really just Linux processes, we can use our favorite tools to monitor and log cluster performance. In Kubernetes, application monitoring does not depend on a single monitoring solution. Each organization has its own requirements for monitoring sensitivity and for log ingestion, analysis and persistence. This is very important when building our own …