Logs

From printf to logging industrialization

Christophe Mosa

Disclaimers

(what I try not to talk about)

  • Me
  • My company
  • Debugging
  • Buzzwords
  • System logs, Nagios, MRTG...

Storage is cheap

... very cheap ...

1995: $500/GB

2000: $10/GB

Today: $0.04/GB

Computing

... is a commodity

Distributed computing for everyone

Data processing cost is (relatively) ridiculously low

Logs for everyone

Devs

Ops

Business

What do all these guys want?

Developers

Debug

Test

Ops

System failures

Performance bottlenecks

Deployment effects/consequences

Business

$$$

$$$

Analyze data, get insights, detect patterns, predict the future...

Can application logs help all these guys?

(yes)

Developers

It's a good idea to let developers debug as they like

just give them the right tools

... but teach them the basics of logging
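
As an illustration of those basics, here is a minimal sketch using Python's standard logging module: one named logger, a timestamp on every line, and severity levels so debug noise can be switched off. The component name "checkout" and the message fields are hypothetical, chosen only for the example.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger writing timestamped, levelled lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep the example's output isolated
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger("checkout", buffer)  # hypothetical component name

log.debug("cart contents: %s", ["sku-1"])  # dropped: below INFO
log.info("order placed order_id=%s total=%.2f", 42, 19.99)
log.error("payment failed order_id=%s", 42)

lines = buffer.getvalue().splitlines()
```

Keeping key=value pairs in the message is a design choice that makes the lines easier to parse later; it is not something the logging module requires.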

Logs for Developers

Logs for Ops

Metrics aggregates

Internal/Application level latency

Logs for Ops

Bottom line: you don't have to rely on monitoring solutions alone; it is very easy to build relevant metrics from (good) logs
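
To make this concrete, here is a rough sketch of turning log lines into latency aggregates. The line format (timestamp, endpoint, latency_ms=...) is an assumption made up for the example, not a format used in this talk.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical log format: "<ISO timestamp> <endpoint> latency_ms=<int>"
LINE_RE = re.compile(r"^(\S+) (\S+) latency_ms=(\d+)$")


def latency_by_endpoint(lines):
    """Return count, mean and max latency per endpoint."""
    samples = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that carry no latency information
        _, endpoint, latency = match.groups()
        samples[endpoint].append(int(latency))
    return {
        endpoint: {
            "count": len(values),
            "mean_ms": mean(values),
            "max_ms": max(values),
        }
        for endpoint, values in samples.items()
    }


sample_lines = [
    "2014-01-01T10:00:00Z /search latency_ms=120",
    "2014-01-01T10:00:01Z /search latency_ms=80",
    "2014-01-01T10:00:02Z /checkout latency_ms=300",
    "a line that does not match",
]
metrics = latency_by_endpoint(sample_lines)
```

This only works if the application logs latency consistently, which is the point of the "(good)" above.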

Logs for Business Operations

Application logs are a powerful weapon

But you will need data scientists/engineers

In our business: good data analysis can grow our revenue by double-digit percentages

Let's talk architecture

Centralization

Scalability

How does it apply to logs?

And... What is a log anyway?

Data

Timestamp

And a source.

Here, we are trying to build the destination
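
The three ingredients above (data, timestamp, source) can be sketched as a small record type. The field names, the JSON encoding and the "web-01/checkout" source are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEvent:
    timestamp: str  # when it happened (ISO 8601, UTC)
    source: str     # who emitted it (host, application, component...)
    data: dict      # what happened

    def to_json(self):
        """Serialize the event as one JSON line."""
        return json.dumps(asdict(self), sort_keys=True)


def make_event(source, data, now=None):
    """Build an event, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return LogEvent(timestamp=now.isoformat(), source=source, data=data)


event = make_event(
    "web-01/checkout",  # hypothetical host/component
    {"level": "INFO", "message": "order placed"},
    now=datetime(2014, 1, 1, 10, 0, tzinfo=timezone.utc),
)
encoded = event.to_json()
```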

How does it apply to logs?

Logs are a constant stream of time-ordered information

How does it apply to logs?

Logs are directly and linearly linked to your application load, in terms of frequency and size
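
A back-of-the-envelope way to see the linear relationship, as a sketch: every number below (requests per second, lines per request, bytes per line) is a made-up assumption, not a measurement.

```python
SECONDS_PER_DAY = 86_400


def daily_log_volume_gb(requests_per_second, lines_per_request, bytes_per_line):
    """Estimate daily log volume in GB (1 GB = 1e9 bytes)."""
    bytes_per_day = (
        requests_per_second * lines_per_request * bytes_per_line * SECONDS_PER_DAY
    )
    return bytes_per_day / 1e9


# Hypothetical application: 5 log lines per request, 200 bytes per line.
low = daily_log_volume_gb(100, 5, 200)    # at 100 requests/s
high = daily_log_volume_gb(1000, 5, 200)  # at 1000 requests/s
```

With these assumed numbers, 100 requests per second gives about 8.64 GB per day, and ten times the load gives ten times the logs.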

How does it apply to logs?

Your application scales and your logs don't?

They are useless

Let's build a log system

Growth?

Just add capacity

You don't want to go that way

(that's called vertical scaling)

You can't handle this kind of growth

Let's build a better log system

We need storage

Blob/Object storage?

Database?

We need a Front End

Polling?

Push?

We need a user interface

Interface

True nerds don't need an interface, come on!

grep/awk/sed are enough, no?

NOPE.

Interface

Ideal: one interface to rule them all

Devs and Ops can search inside it and see the details

Business team can do some dashboarding and aggregates

Implementation

Commercial SaaS solutions: Papertrail, Sumo Logic, Loggly...

OSS: Logstash + (Elasticsearch (+ Kibana))

Why Logstash?

Ultra flexible via plugins (and open source)

Built by DevOps people who know how painful it is to read logs at 4 AM

What is a Logstash instance? (1/2)

Gets/Receives data

Optionally transforms/filters/aggregates/rejects the data

Outputs data

Basically, an ETL for logs
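
The input / filter / output idea can be sketched in a few lines of Python. This is not Logstash code or its plugin API; the function names and the "LEVEL message" line format are hypothetical and only illustrate the three stages.

```python
import json


def parse(line):
    """Transform: split a 'LEVEL message' line into a structured event."""
    level, _, message = line.partition(" ")
    return {"level": level, "message": message}


def keep(event):
    """Filter: reject DEBUG noise, keep everything else."""
    return event["level"] != "DEBUG"


def run_pipeline(lines, sink):
    """Input -> transform -> filter -> output, one event at a time."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore empty input lines
        event = parse(line)
        if keep(event):
            sink.append(json.dumps(event))  # output: here, an in-memory list


out = []
run_pipeline(
    ["INFO user logged in", "DEBUG cache miss", "", "ERROR db timeout"],
    out,
)
```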

What is a Logstash instance? (2/2)

Configured via a simple text file

Elasticsearch + Kibana

Logstash's default database is Elasticsearch

Kibana is a web interface built to query Elasticsearch, and Logstash data in particular

It's gorgeous, well designed and works out of the box with Logstash

Kibana

Implementation

Are we done?

nope.

2 problems left

How do the logs go from apps to Logstash?

Is one receiver a good idea?

Introducing a message broker

Acts as a buffer between apps and receivers

My choice: Redis
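
A sketch of the buffering idea, using an in-memory queue as a stand-in for the broker so it runs without a server. With Redis this role is typically played by a list that producers push to and consumers pop from (for example with RPUSH and BLPOP). The class and method names below are hypothetical.

```python
from collections import deque


class InMemoryBroker:
    """Stand-in for a broker queue; NOT a Redis client."""

    def __init__(self):
        self._queue = deque()

    def push(self, message):
        """Called by the applications/shippers: never waits for a receiver."""
        self._queue.append(message)

    def pop_batch(self, max_items):
        """Called by a receiver: take up to `max_items` messages, oldest first."""
        batch = []
        while self._queue and len(batch) < max_items:
            batch.append(self._queue.popleft())
        return batch

    def backlog(self):
        """Number of messages waiting to be processed."""
        return len(self._queue)


broker = InMemoryBroker()
for i in range(10):          # a burst of 10 events from the apps
    broker.push(f"event-{i}")

first = broker.pop_batch(4)  # a slow receiver only takes 4 for now
remaining = broker.backlog() # the rest waits in the buffer instead of being lost
```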

Shippers everywhere

Shippers push logs to the broker

They either get data directly from the application or tail log files as they are appended to
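
The file-based variant can be sketched as follows: remember the last offset read, and on each pass ship only the complete lines appended since then. The FileShipper class is hypothetical, the `send` callback stands in for "push to the broker", and log rotation is deliberately not handled.

```python
import os
import tempfile


class FileShipper:
    """Ships lines appended to a file since the previous pass."""

    def __init__(self, path, send):
        self.path = path
        self.send = send  # e.g. a function that pushes to the broker
        self.offset = 0   # byte offset of the first unshipped line

    def ship_new_lines(self):
        shipped = 0
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            while True:
                raw = handle.readline()
                if not raw.endswith(b"\n"):
                    break  # end of file, or a line still being written
                self.send(raw[:-1].decode("utf-8"))
                self.offset = handle.tell()
                shipped += 1
        return shipped


received = []
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app.log")
    with open(path, "wb") as f:
        f.write(b"first\nsecond\n")
    shipper = FileShipper(path, received.append)
    first_pass = shipper.ship_new_lines()

    with open(path, "ab") as f:  # the application appends one more line
        f.write(b"third\n")
    second_pass = shipper.ship_new_lines()
```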

Log backlog

(no pun intended)

Thank you, horizontal scaling

Analytics

Thank You