Setting Up and Using Prometheus with Grafana

1. Introduction to the Prometheus monitoring framework

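Prometheus is an open-source monitoring system originally developed at SoundCloud and inspired by Google's Borgmon. It pulls data over HTTP from exporters installed on the remote machines and stores it in a local time-series database.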

Prometheus itself is written in Go; Grafana is typically used as the front end for dashboards.

2. Supported targets

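To monitor all kinds of middleware and third-party systems, Prometheus relies on exporters. You can think of an exporter as a monitoring adapter: it converts data of different types and formats into metrics that Prometheus can understand.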

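For example, the Node exporter obtains the operating system's runtime state mainly by reading system files under /proc and /sys on Linux, the Redis exporter obtains metrics through Redis commands, and the MySQL exporter obtains MySQL performance data by reading the database's monitoring tables. Each of them converts this heterogeneous data into the standard Prometheus format and exposes it through an HTTP endpoint.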
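As a rough illustration of the "standard Prometheus format" mentioned above, the sketch below prints one gauge in the text exposition format that exporters serve over HTTP. The helper name render_gauge and the metric name demo_load1 are invented for this example; real exporters such as node_exporter generate this output for you.

```shell
# Hypothetical helper: print one gauge in the Prometheus text exposition format.
# Arguments: metric name, help text, value.
render_gauge() {
  printf '# HELP %s %s\n' "$1" "$2"
  printf '# TYPE %s gauge\n' "$1"
  printf '%s %s\n' "$1" "$3"
}

# Example: render the 1-minute load average read from /proc/loadavg (Linux only).
if [ -r /proc/loadavg ]; then
  load1=$(cut -d ' ' -f 1 /proc/loadavg)
  render_gauge demo_load1 "1-minute load average (illustrative metric)" "$load1"
fi
```

Each sample is a metric name followed by a value; the # HELP and # TYPE comment lines describe the metric to the server.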

The popularity of Prometheus is closely tied to Kubernetes; it supports monitoring Kubernetes, containers, and OpenStack.

3. Installing Prometheus from the release tarball and configuring startup

IP             Role                Operating system
192.168.3.64   Prometheus server   Ubuntu 20.04

Prometheus download page: https://github.com/prometheus/prometheus/releases

After downloading, upload the archive to the server and unpack it:

tar xf prometheus-3.1.0.linux-amd64.tar.gz
mv prometheus-3.1.0.linux-amd64 /prometheus
chmod 775 /prometheus

Check the version

./prometheus --version

prometheus, version 3.1.0 (branch: HEAD, revision: 7086161a93b262aa0949dbf2aba15a5a7b13e0a3)
build user: root@74c225e2044f
build date: 20250102-13:52:43
go version: go1.23.4
platform: linux/amd64
tags: netgo,builtinassets,stringlabels

prometheus.yml configuration explained

# my global config
global:
  # Scrape targets every 15 seconds (the default is every 1 minute).
  scrape_interval: 15s
  # Evaluate rules every 15 seconds (the default is every 1 minute).
  evaluation_interval: 15s
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a `job` label to every scraped sample; an `instance` label
  # holding the target's host:port is added as well.
  - job_name: 'prometheus'

    # Override the global scrape interval for this job: scrape every 5s.
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
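Because YAML indentation mistakes are easy to make in this file, it can help to validate it before starting the server. The release tarball ships a promtool binary next to prometheus; the sketch below assumes both were unpacked into /prometheus as in the steps above. The PROM_DIR variable and the function name check_prom_config are illustrative, not part of Prometheus.

```shell
# Assumed install directory from the steps above; override via the environment if yours differs.
PROM_DIR="${PROM_DIR:-/prometheus}"

# Validate a Prometheus config file with promtool, if the binary is present.
# Argument: path to the config file (defaults to $PROM_DIR/prometheus.yml).
check_prom_config() {
  cfg="${1:-$PROM_DIR/prometheus.yml}"
  if [ -x "$PROM_DIR/promtool" ]; then
    "$PROM_DIR/promtool" check config "$cfg"
  else
    echo "promtool not found in $PROM_DIR; skipping check of $cfg"
  fi
}

# Usage:
#   check_prom_config                      # checks /prometheus/prometheus.yml
#   check_prom_config /path/to/other.yml
```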

Start the service

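# Start the server
./prometheus --config.file=prometheus.yml

# Commonly used startup flags:
# Path to the configuration file
--config.file="prometheus.yml"
# Listen address and port (default 0.0.0.0:9090); the port can be changed
--web.listen-address="0.0.0.0:9090"
# Maximum number of simultaneous connections
--web.max-connections=512
# Directory for TSDB data; defaults to data/ under the current directory
--storage.tsdb.path="data/"
# How long Prometheus keeps data; the default is 15 days
--storage.tsdb.retention.time=15d
# Enable hot reload via the lifecycle API, no restart needed: curl -XPOST 192.168.3.64:9090/-/reload
--web.enable-lifecycle
# Path to a configuration file that can enable TLS or authentication
--web.config.file=""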
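The management endpoints can be wrapped in a few small helpers. This is only a sketch: the function names and the PROM_ADDR variable are made up here, and the address is the server IP used in this walkthrough. The reload call succeeds only when the server was started with --web.enable-lifecycle.

```shell
# Address of the Prometheus server used in this walkthrough; override as needed.
PROM_ADDR="${PROM_ADDR:-192.168.3.64:9090}"

# Build the URL of a management endpoint: healthy, ready or reload.
prom_url() {
  printf 'http://%s/-/%s\n' "$PROM_ADDR" "$1"
}

# Liveness and readiness probes are plain GET requests.
prom_healthy() { curl -fsS "$(prom_url healthy)"; }
prom_ready()   { curl -fsS "$(prom_url ready)"; }

# Ask the running server to re-read prometheus.yml (needs --web.enable-lifecycle).
prom_reload()  { curl -fsS -X POST "$(prom_url reload)"; }

# Usage:
#   prom_healthy && prom_reload
```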


To see all startup options: ./prometheus --help

Open the web UI: http://192.168.3.64:9090

Configure Prometheus as a system service

Change into the systemd unit directory:

cd /usr/lib/systemd/system

Create the unit file: vim prometheus.service

[Unit]
Description=https://prometheus.io

[Service]
Restart=on-failure
# Adjust both paths to the directory where Prometheus was actually unpacked.
ExecStart=/home/xiyuanhuaigu/prometheus/prometheus --config.file=/home/xiyuanhuaigu/prometheus/prometheus.yml --web.listen-address=:9090


[Install]
WantedBy=multi-user.target

Reload systemd so the new unit file takes effect

systemctl daemon-reload

Start the service

systemctl start prometheus

4. Installing node_exporter on the client

There is only one host in this setup, acting as both client and server, so node_exporter is installed on that same machine.

Download page: https://github.com/prometheus/node_exporter/releases

tar xf node_exporter-1.8.2.linux-amd64.tar.gz
mv node_exporter-1.8.2.linux-amd64 node_exporter

Add a systemd unit to start it

vim /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
After=network.target

[Service]
ExecStart=/home/xiyuanhuaigu/node_exporter/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target


# Start node_exporter
systemctl daemon-reload
systemctl start node_exporter
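Once the service is running, a quick way to confirm it works is to fetch its metrics page and pick out one metric. node_exporter listens on port 9100 by default and serves /metrics; the helper name metric_samples below is invented for this sketch.

```shell
# Hypothetical helper: read exposition text on stdin and print only the sample
# lines of one metric, skipping the # HELP / # TYPE comments and metrics whose
# names merely start with the same prefix.
metric_samples() {
  grep -E "^$1(\{| )"
}

# Usage on the host running node_exporter:
#   curl -s http://localhost:9100/metrics | metric_samples node_load1
```

The pattern requires the metric name to be followed by a space or an opening brace, so asking for node_load1 does not also return node_load15.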

5. Adding scrape targets to the Prometheus server configuration

[root@VM_2-45 /usr/local/prometheus]# cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['192.168.2.45:9090']

  - job_name: 'linux'
    static_configs:
      - targets: ['192.168.2.45:9100'] # separate multiple targets with commas

# The three lines of the 'linux' job above are the new addition.

Restart Prometheus

systemctl restart prometheus.service
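After the restart, the built-in `up` metric shows whether each target is being scraped: it is 1 when the last scrape succeeded and 0 otherwise. The sketch below queries it through the HTTP API's instant-query endpoint. The function names and the PROM_ADDR variable are illustrative, and the address is the one used in the configuration above.

```shell
# Address of the Prometheus server from the configuration above; override as needed.
PROM_ADDR="${PROM_ADDR:-192.168.2.45:9090}"

# URL of the instant-query endpoint of the Prometheus HTTP API.
prom_query_endpoint() {
  printf 'http://%s/api/v1/query\n' "$PROM_ADDR"
}

# Run an instant query; curl URL-encodes the PromQL expression and sends it as a GET parameter.
prom_query() {
  curl -fsS -G "$(prom_query_endpoint)" --data-urlencode "query=$1"
}

# Usage:
#   prom_query 'up{job="linux"}'
```

The response is JSON; a value of "1" for the 192.168.2.45:9100 instance means node_exporter is reachable from the server.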

6. Installing Grafana

Install the prerequisite packages

sudo apt-get install -y apt-transport-https software-properties-common wget

Import the GPG key

sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null

Add the repository for stable releases by running the following command

echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list

Update the list of available packages

sudo apt-get update

Install Grafana OSS

sudo apt-get install grafana

Start Grafana

sudo systemctl daemon-reload
sudo systemctl start grafana-server

7. Adding the Prometheus data source

Select the data source

Enter the IP address of the Prometheus server

Import dashboard template 8919

Choose Add