Deploy Loki and Grafana with Coolify

(Mis)adventures in amateur devops
Fri Oct 04 2024

Why this stack?

There are a few options for a log monitoring stack out there, with even more options for components to swap in and out of various parts of the stack. I find the options more than a little overwhelming—ELK, Azure Monitor, CloudWatch, Datadog, Graylog, and on and on. Some of the options are more comprehensive than others, and some require piecing multiple components together while others don’t.

However, the basic structure of any log monitoring solution looks like this:

[Diagram: the flow of log monitoring, from an app node to a log aggregator to a UI for viewing logs]

  1. App nodes produce logs
  2. An aggregator collects the logs, indexes them, attaches relevant metadata, and prepares them to be queried
  3. Some kind of UI queries the logs from the aggregator for display to the user, setting up alerts, and so on.

As I prepare to launch my first production app, I thought it prudent to get a basic monitoring setup off the ground ahead of time. I thought about my requirements and position as an (amateurish) solo developer, and decided I needed a monitoring stack that does at least the following:

The last two requirements are what we’ll be addressing today. They’re a tricky pair. A solution like Logstash, part of the ELK stack, has a default heap size of 1GB according to this answer on Stack Overflow. While that’s not too big, I want to keep resource usage as low as possible because I will be running a number of other services on the same VPS.

I ended up deciding to go with Loki, from the creators of Grafana (which I will use for the UI dashboard). Loki is extremely resource efficient, open-source, and free. All three are also true for the Grafana Dashboard.

The highly modular nature of Grafana means that I can set up Loki for logs now, connect the two, and keep adding components to my monitoring stack that operate independently of each other. This flexibility is very appealing to me, and it means my stack can grow when it needs to, not before.

Installing Grafana and Loki with Coolify

I’ve written about how I use Coolify to host my personal website. I use Coolify to manage almost all of my servers. I’ll be using it again to deploy my monitoring stack.

Deploying Grafana with Coolify is dead simple, since it is one of the services available for one-click setup in the Coolify UI. Just create your project, add a resource, and find Grafana in the list of available services. Fill in the empty password fields with secure passwords. Assign a domain to it and you’re good to go. I’ve chosen to install Grafana with PostgreSQL, although to be quite honest with you, I’m not totally sure whether that or the simple Grafana one-click setup option is better.

Either way, that is why you’ll see some references and environment variables for Postgres below. You can safely ignore them if you aren’t using Postgres.

Installing Loki with a Docker Compose file

It’s time to set up Loki. Open the resource generated by Grafana’s one-click setup. We’re going to modify the Docker Compose file generated by Coolify so that we can run Loki and Nginx (for basic auth) in the same Docker network. Here is my compose file for reference:

services:
  grafana:
    image: grafana/grafana-oss
    environment:
      - SERVICE_FQDN_GRAFANA_3000
      - 'GF_SERVER_ROOT_URL=${SERVICE_FQDN_GRAFANA}'
      - 'GF_SERVER_DOMAIN=${SERVICE_FQDN_GRAFANA}'
      - 'GF_SECURITY_ADMIN_PASSWORD=${SERVICE_PASSWORD_GRAFANA}'
      - GF_DATABASE_TYPE=postgres
      - GF_DATABASE_HOST=postgresql
      - GF_DATABASE_USER=$SERVICE_USER_POSTGRES
      - GF_DATABASE_PASSWORD=$SERVICE_PASSWORD_POSTGRES
      - 'GF_DATABASE_NAME=${POSTGRES_DB:-grafana}'
    volumes:
      - 'grafana-data:/var/lib/grafana'
    healthcheck:
      test:
        - CMD
        - curl
        - '-f'
        - 'http://127.0.0.1:3000/api/health'
      interval: 5s
      timeout: 20s
      retries: 10
    depends_on:
      - postgresql
  postgresql:
    image: 'postgres:16-alpine'
    volumes:
      - 'postgresql-data:/var/lib/postgresql/data'
    environment:
      - POSTGRES_USER=$SERVICE_USER_POSTGRES
      - POSTGRES_PASSWORD=$SERVICE_PASSWORD_POSTGRES
      - 'POSTGRES_DB=${POSTGRES_DB:-grafana}'
    healthcheck:
      test:
        - CMD-SHELL
        - 'pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}'
      interval: 5s
      timeout: 20s
      retries: 10
  loki:
    image: grafana/loki
    command:
      - '-config.file=/etc/loki/loki-config.yaml'
    volumes:
      - '/etc/loki:/etc/loki'
      - 'loki-data:/loki'
    healthcheck:
      test:
        - CMD-SHELL
        - 'wget --tries=1 --spider http://localhost:3100/ready || exit 1'
      interval: 10s
      timeout: 5s
      retries: 5
  nginx:
    image: 'nginx:alpine'
    volumes:
      - '/etc/nginx/nginx.conf:/etc/nginx/nginx.conf'
      - '/etc/nginx/conf.d/loki.conf:/etc/nginx/conf.d/loki.conf'
      - '/etc/nginx/.htpasswd:/etc/nginx/.htpasswd'
    depends_on:
      - loki

There are a few things going on here worth our attention.

First, the sections for Grafana and PostgreSQL were auto-generated by Coolify. They haven’t been modified at all.

Our attention belongs down here:

  loki:
    image: grafana/loki
    command:
      - '-config.file=/etc/loki/loki-config.yaml'
    volumes:
      - '/etc/loki:/etc/loki'
      - 'loki-data:/loki'
    healthcheck:
      test:
        - CMD-SHELL
        - 'wget --tries=1 --spider http://localhost:3100/ready || exit 1'
      interval: 10s
      timeout: 5s
      retries: 5
  nginx:
    image: 'nginx:alpine'
    volumes:
      - '/etc/nginx/nginx.conf:/etc/nginx/nginx.conf'
      - '/etc/nginx/conf.d/loki.conf:/etc/nginx/conf.d/loki.conf'
      - '/etc/nginx/.htpasswd:/etc/nginx/.htpasswd'
    depends_on:
      - loki

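First, we’re pulling the official Loki image. Then, we’re telling Loki to look for its config file at /etc/loki/loki-config.yaml, which is mounted from /etc/loki on the host, and giving it a named volume at /loki for its data. Finally, we’ve configured a healthcheck against Loki’s /ready endpoint.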

Second, we’re installing Nginx. We’re installing Nginx because Loki does not come with authentication. We’ll be using Nginx to implement HTTP basic auth according to the Nginx docs.

Of course, Traefik already acts as a reverse proxy for our Coolify-managed server. Fortunately, this simplifies things instead of complicating them.

In the Nginx section of the compose file, you can see that we’re mounting an nginx.conf, a conf.d/loki.conf, and a .htpasswd file. The .htpasswd file can be generated by following the instructions in the Nginx guide. Save the username and password you generated somewhere, since we will need them to connect log scraping tools like Promtail with Loki later.
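If you don’t have the htpasswd tool handy, here is a minimal sketch that builds a .htpasswd line in Python instead. It assumes the salted SHA-1 ({SSHA}) scheme, which Nginx’s auth_basic_user_file directive lists among its accepted formats. The function name and the username promtail are my own placeholders, and salted SHA-1 is weaker than what the htpasswd tool can produce, so treat this as a stopgap rather than a recommendation.

```python
import base64
import hashlib
import os
from typing import Optional


def ssha_htpasswd_line(username: str, password: str, salt: Optional[bytes] = None) -> str:
    """Return one .htpasswd line in the form user:{SSHA}base64(sha1(password + salt) + salt)."""
    if ":" in username:
        raise ValueError("username must not contain ':'")
    if salt is None:
        salt = os.urandom(8)  # fresh random salt for every entry
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    encoded = base64.b64encode(digest + salt).decode("ascii")
    return f"{username}:{{SSHA}}{encoded}"


# Hypothetical credentials; paste the printed line into /etc/nginx/.htpasswd
print(ssha_htpasswd_line("promtail", "change-me"))
```

The output changes on every run because of the random salt, which is expected.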

Setting up the config files

Once you’ve saved your compose file, Coolify will automatically generate fields for you to add the configurations for both Loki and Nginx. You can find them in the Storages tab on the left-hand side of the resource menu, here:

[Screenshot: the Storages menu in the Coolify service configuration menu]

Here are the contents of my loki-config.yaml:

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  # Paths live under /loki so that data lands in the loki-data volume mounted in the compose file
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

analytics:
  reporting_enabled: false

Pretty standard, and more or less identical to the default. Any future modifications will be made here.

Beware that Coolify sometimes creates a directory instead of a file for a mounted config. In that case, just click the Convert to file button for the appropriate file mount. In a worst-case scenario, you can SSH into your server and create the files where Loki or Nginx will be expecting them.
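If you do end up on the server, a quick check like the following can tell you which mounts turned into directories. This is only a sketch: the paths are the host-side paths from my compose file, and the helper name is made up for illustration, so adjust both to your setup.

```python
from pathlib import Path

# Host paths taken from the compose file in this post; change them if yours differ.
EXPECTED_FILES = [
    "/etc/loki/loki-config.yaml",
    "/etc/nginx/nginx.conf",
    "/etc/nginx/conf.d/loki.conf",
    "/etc/nginx/.htpasswd",
]


def classify_mounts(paths):
    """Map each path to 'file', 'directory', or 'missing'."""
    result = {}
    for raw in paths:
        path = Path(raw)
        if path.is_file():
            result[raw] = "file"
        elif path.is_dir():
            result[raw] = "directory"
        else:
            result[raw] = "missing"
    return result


for path, kind in classify_mounts(EXPECTED_FILES).items():
    flag = "ok" if kind == "file" else "FIX"
    print(f"[{flag}] {path}: {kind}")
```

Anything flagged as a directory is a candidate for the Convert to file button.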

As for our nginx.conf and loki.conf, you can enter those in the same menu, or create them directly on the server.

Here is my nginx.conf:

user  nginx;
worker_processes  auto;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    keepalive_timeout  65;

    include /etc/nginx/conf.d/*.conf;
}

It imports all .conf files in /etc/nginx/conf.d/, which is exactly where this loki.conf is located:

server {
    listen 80;
    server_name loki.example.dev; # PUT YOUR DOMAIN HERE

    location / {
        auth_basic "Restricted Access";
        auth_basic_user_file /etc/nginx/.htpasswd;

        proxy_pass http://loki:3100;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /health {
        proxy_pass http://loki:3100/ready;
        auth_basic off;
    }
}

This config is very simple. First, it is listening on port 80 (for HTTP requests), and it is identified as loki.example.dev. Then, for the root path /, it implements basic auth and imports the user and password from our .htpasswd file we generated earlier.

Below that, we have another location block which makes an exception for the /health path, since we’ll use that for our healthchecks.

Both blocks forward requests to http://loki:3100. We can use this name because Nginx and Loki are on the same Docker network, so they can refer to each other by their service names in most cases.

That’s it! Our setup is simple because we’ll be letting Coolify’s Traefik instance handle TLS and all of our certs. In light of that, there’s one last step.

Set the domain in the Coolify interface

Now go back to the main Service Stack page of the resource configuration. Under Services, find Nginx and click Settings. You’ll be taken to the configuration page for Nginx. In the Domains field, enter your desired domain.

Assuming you’ve already set up a CNAME or A record for your subdomain with your DNS provider, all you need to do is enter the full domain, including the scheme, in the Domains field (e.g. https://loki.example.dev) and Coolify will take care of the rest. Make sure the hostname is exactly the one in your Nginx config’s server_name. This will not work unless the record with your DNS provider, your Nginx config, and your Coolify config all use the same hostname.
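To make the "same hostname everywhere" rule concrete, here is a small sketch that compares the domain you type into Coolify with the server_name in an Nginx config. The helper names are hypothetical and the parsing is deliberately naive (it only looks for server_name directives), so take it as an illustration rather than a real config parser.

```python
import re
from urllib.parse import urlparse


def server_names(nginx_conf_text):
    """Collect every hostname listed in server_name directives."""
    names = []
    for match in re.finditer(r"^\s*server_name\s+([^;]+);", nginx_conf_text, re.MULTILINE):
        names.extend(match.group(1).split())
    return names


def domain_matches(coolify_domain, nginx_conf_text):
    """True if the host part of the Coolify domain appears in a server_name directive."""
    host = urlparse(coolify_domain).hostname
    return host is not None and host in server_names(nginx_conf_text)


LOKI_CONF = """
server {
    listen 80;
    server_name loki.example.dev; # PUT YOUR DOMAIN HERE
}
"""

print(domain_matches("https://loki.example.dev", LOKI_CONF))
```

Note that the Coolify side carries the https:// scheme while server_name is the bare hostname, which is why the comparison is on the host part only.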

After that, deploy all your services with the Deploy button in the upper-right-hand corner of the Coolify configuration menu.

Testing the Nginx setup

If everything worked, you should be able to do two different things to test it.

First, open your browser and go to loki.example.dev/health (substituting the domain you actually set). If all is well, you will see a blank page with only the word ‘ready’ in the corner. This is the healthcheck endpoint that we configured in Nginx’s loki.conf to skip HTTP basic auth. Seeing it means Loki is up and running, and Traefik and Nginx are working together smoothly.

Next, go to the root of the domain you set (loki.example.dev). If everything has worked, you should get a browser prompt asking you to sign in. Enter the username and password we generated earlier to confirm that everything is configured correctly. As far as I can tell, Loki doesn’t serve a page at the root path, so a bare ‘404 page not found’ after signing in is normal; a 401 is the response that signals a problem. If your credentials are accepted, you’re golden! Everything is working as intended and your Loki instance is (moderately) secure!
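The same two checks can be scripted. Below is a rough sketch using only the Python standard library; the function names are mine, and the expected status codes in the comments assume the loki.conf from this post (auth off for /health, basic auth everywhere else).

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def status_of(url, username=None, password=None, timeout=10):
    """Return the HTTP status code of a GET request, including 4xx and 5xx."""
    request = urllib.request.Request(url)
    if username is not None:
        request.add_header("Authorization", basic_auth_header(username, password))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def smoke_test(base_url, username, password):
    """Run the three checks; nothing is called until you invoke this yourself."""
    return {
        "health_without_auth": status_of(f"{base_url}/health"),  # expect 200 once Loki is ready
        "ready_without_auth": status_of(f"{base_url}/ready"),  # expect 401
        "ready_with_auth": status_of(f"{base_url}/ready", username, password),  # expect 200
    }
```

Calling smoke_test("https://loki.example.dev", "<user>", "<password>") with your own domain and .htpasswd credentials returns a small dict of status codes to compare against the comments.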

Connect Grafana with Loki

Once everything is deployed and running, go to your Grafana URL (either your serverIP:port or a domain if you set one). Log in using the admin username and password you set during one-click setup.

On the home screen, you should see a panel with the text “Add your first data source”. Click it, and you’ll be taken to a list of possible sources. Select Loki, which should be near the top.

Connection is dead simple, since Grafana and Loki are running in the same Docker network. Just enter http://loki:3100 (assuming loki is what you named the service in the Docker Compose file) in the connection URL field. That’s it. No auth, no TLS settings, nothing. Since neither Grafana nor Loki has a port exposed directly to the public network, they can communicate securely inside the Docker network on the server.

If you are running Grafana on a different server or Docker network, point it at your public Loki domain instead (e.g. https://loki.example.dev). This is where you would add the username and password you set for Nginx’s HTTP basic auth under the Authentication Methods field.
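If you would rather script this than click through the UI, Grafana also has an HTTP API for data sources (POST /api/datasources). The sketch below only builds the JSON body for the two cases described above. The field names reflect my understanding of that API, so double-check them against the docs for your Grafana version before relying on this.

```python
import json


def loki_datasource_payload(url, username=None, password=None, name="Loki"):
    """Build the JSON body for creating a Loki data source in Grafana."""
    payload = {
        "name": name,
        "type": "loki",
        "access": "proxy",  # Grafana's backend makes the requests, not the browser
        "url": url,
        "basicAuth": False,
    }
    if username is not None:
        # Only needed when Grafana reaches Loki through the Nginx basic auth proxy
        payload["basicAuth"] = True
        payload["basicAuthUser"] = username
        payload["secureJsonData"] = {"basicAuthPassword": password}
    return payload


# Same Docker network: no auth needed
print(json.dumps(loki_datasource_payload("http://loki:3100"), indent=2))
```

For the remote case you would pass the public URL plus the .htpasswd credentials, for example loki_datasource_payload("https://loki.example.dev", "<user>", "<password>").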

What now?

Get logs using Promtail

Now that we’ve got the basic stack set up, we obviously need to start feeding logs to Loki for processing with Grafana. There are plenty of guides for setting up Promtail and connecting it to Loki, but installing Promtail via your package manager might be the most straightforward option, and it’s the one I went with for testing on my home machine.

Without getting into too much depth, here is my Promtail config for testing it on a project on my home machine:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: https://loki.example.dev/loki/api/v1/push
    basic_auth:
      username: <your .htpasswd username>
      password: <your .htpasswd password>

scrape_configs:
  - job_name: sales_api_dev
    static_configs:
      - targets:
          - localhost
        labels:
          job: dev_api
          __path__: /path/to/your/logs/project-*.log
    pipeline_stages:
      - regex:
          expression: '^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \+\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<message>.*)$'
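      - labels:
          # timestamp is deliberately not a label: a unique value per line would create a new stream for every log entry
          level: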
      - timestamp:
          source: timestamp
          format: '2006-01-02 15:04:05.000 -07:00'

This config is set to scrape the logs from an ASP.NET Core project that I’m working on and send them to my Loki instance.
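Before pointing Promtail at real files, it can help to try the regex on a sample line. Python happens to accept the same (?P<name>...) group syntax, so here is a quick sketch. The sample log line is made up, the dot before the milliseconds is escaped so it only matches a literal dot, and the strptime format is my Python equivalent of the Go layout in the timestamp stage.

```python
import re
from datetime import datetime

# Mirrors the expression from the Promtail regex stage
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \+\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Return the extracted fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Python counterpart of the Go layout '2006-01-02 15:04:05.000 -07:00'
    fields["parsed_time"] = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S.%f %z")
    return fields


sample = "2024-10-04 09:15:30.123 +02:00 [INF] Application started"  # hypothetical log line
print(parse_line(sample))
```

One thing this makes obvious: the pattern only accepts a + offset, so logs written in a timezone behind UTC would need the sign part changed to something like [+-].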

As a tip, you can go to loki.example.dev/metrics, enter your auth information, and see what Loki’s up to in order to check if your logs are being received.
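Another way to check the pipeline end to end is to push a single test line yourself. Loki accepts JSON on the same /loki/api/v1/push endpoint that Promtail uses. The sketch below builds that payload and posts it through the Nginx basic auth proxy. The function names and label values are placeholders of mine, and nothing is sent until you call push_line yourself.

```python
import base64
import json
import time
import urllib.request


def build_push_payload(labels, line, timestamp_ns=None):
    """Build the JSON body for Loki's push API: one stream with one log line."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    # Loki expects the timestamp as a string of nanoseconds since the Unix epoch
    return {"streams": [{"stream": dict(labels), "values": [[str(timestamp_ns), line]]}]}


def push_line(base_url, username, password, labels, line):
    """POST one log line to Loki through the basic auth proxy and return the status code."""
    body = json.dumps(build_push_payload(labels, line)).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{base_url}/loki/api/v1/push",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status  # Loki answers 204 No Content on success


print(json.dumps(build_push_payload({"job": "dev_api"}, "hello from a test script"), indent=2))
```

A call like push_line("https://loki.example.dev", "<user>", "<password>", {"job": "dev_api"}, "hello") should make the line show up in Grafana’s Explore view under that job label.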

Create panels

This is beyond the scope of this article—which is getting pretty long—but once you’ve generated and aggregated some logs, it’s time to start querying them. I’ll leave the details to the experts, but I suggest you start by reading this blog article from Grafana Labs, which has a number of suggestions for setting up a useful log dashboard.
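If you want to poke at your logs before building any panels, Loki’s HTTP API can be queried directly with LogQL. Here is a minimal sketch that builds a query_range URL and flattens the response into a list of lines. The helper names are hypothetical and the query assumes the job label from my Promtail config. Remember that requests through Nginx need the basic auth credentials as well.

```python
from urllib.parse import urlencode


def query_range_url(base_url, logql, limit=100, start_ns=None, end_ns=None):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint."""
    params = {"query": logql, "limit": limit, "direction": "backward"}
    if start_ns is not None:
        params["start"] = start_ns  # nanoseconds since the Unix epoch
    if end_ns is not None:
        params["end"] = end_ns
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


def extract_lines(response_json):
    """Flatten a streams response into (timestamp_ns, line) tuples, oldest first."""
    lines = []
    for stream in response_json.get("data", {}).get("result", []):
        for timestamp_ns, line in stream.get("values", []):
            lines.append((int(timestamp_ns), line))
    return sorted(lines)


# Every line from the dev_api job that contains the text "error"
print(query_range_url("https://loki.example.dev", '{job="dev_api"} |= "error"'))
```

The same LogQL string works in Grafana’s Explore view, which is a comfortable place to experiment before committing a query to a dashboard panel.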