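OpenTelemetry Logs
OpenTelemetry Logs provides structured collection of application logs and correlates them with traces. Logs are one of the "three pillars of observability" in APM.
1. Basic concepts
1.1 The three pillars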
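| Pillar | Question it answers | Typical tools |
|---|---|---|
| Metrics | How is the system trending overall? How high is CPU usage? What is the QPS? | Prometheus, Grafana |
| Logs | What exactly happened? Exception stack traces, debug information | ELK, Loki |
| Traces | Which components did a request pass through, and where was it slow? | Jaeger, Zipkin |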
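1.2 Key log concepts
| Term | Meaning |
|---|---|
| Log Record | A single log entry (the atomic unit) |
| Logger | Creates Log Records |
| LoggerProvider | Global provider that manages the Logger lifecycle |
| Log Record Body | The main content of the log (a string or structured data) |
| Severity | Severity level (TRACE/DEBUG/INFO/WARN/ERROR/FATAL) |
| Attribute | Key-value pair (tag) used for filtering and searching |
| Resource | Attributes describing the entity that produced the log (service.name, etc.) |
| Trace Context | The trace_id / span_id that links the log to a trace |
| Event | An event log embedded in a Span (associated through the API) |
| Log Appender | Connects an existing logging framework (log4j/zap/logrus) to OTel |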
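1.3 Severity levels
| Level | Number | Purpose |
|---|---|---|
| TRACE | 1 | Finest-grained diagnostic detail |
| DEBUG | 5 | Debugging information |
| INFO | 9 | Key business events |
| WARN | 13 | Warnings (recoverable) |
| ERROR | 17 | Errors (need attention) |
| FATAL | 21 | Fatal errors (system level) |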
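The numbers in the table are the first value of each level: in the OpenTelemetry log data model every level spans four consecutive numbers (TRACE is 1 to 4, DEBUG is 5 to 8, and so on up to FATAL at 21 to 24). The following is a minimal sketch of that mapping; the function names `severity_text` and `is_error_or_worse` are illustrative and not part of any OTel SDK.

```python
# Short level names from the OpenTelemetry log data model.
# Each name covers four consecutive SeverityNumber values.
_SEVERITY_NAMES = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def severity_text(number: int) -> str:
    """Return the short name for a SeverityNumber in 1..24 (10 maps to INFO2)."""
    if not 1 <= number <= 24:
        raise ValueError(f"SeverityNumber must be in 1..24, got {number}")
    name = _SEVERITY_NAMES[(number - 1) // 4]
    offset = (number - 1) % 4  # 0 for the base value listed in the table
    return name if offset == 0 else f"{name}{offset + 1}"


def is_error_or_worse(number: int) -> bool:
    """Severity numbers are ordered, so 'ERROR and above' is one comparison."""
    return number >= 17


for n in (1, 9, 10, 17, 24):
    print(n, severity_text(n), is_error_or_worse(n))
```

Because the numbers are ordered, a backend can filter on "WARN and above" with a numeric comparison instead of matching level strings.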
2. Data model
2.1 Log Record structure
LogRecord
├── Timestamp            (when the event occurred)
├── ObservedTimestamp    (when the record was collected)
├── TraceId              (links to a trace)
├── SpanId               (links to a span)
├── SeverityNumber       (numeric level, e.g. 9 = INFO)
├── SeverityText         ("INFO" / "WARN" / ...)
├── Body                 (main content)
├── Resource             (service identity)
│   ├── service.name
│   ├── service.version
│   └── ...
├── Attributes           (key-value tags)
│   ├── http.method
│   ├── user.id
│   └── ...
└── InstrumentationScope (identifies the emitting library)
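To make the field layout concrete, here is a small sketch that models the structure as a Python dataclass. `LogRecordSketch` and its field names are hypothetical stand-ins chosen for illustration; this is not the SDK's own record class, and the example IDs are placeholder values.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class LogRecordSketch:
    """Illustrative stand-in for an OTel LogRecord (not the SDK class)."""

    body: Any
    severity_number: int = 9
    severity_text: str = "INFO"
    timestamp_ns: int = field(default_factory=time.time_ns)
    observed_timestamp_ns: int = field(default_factory=time.time_ns)
    trace_id: Optional[str] = None  # 32 lowercase hex characters when set
    span_id: Optional[str] = None   # 16 lowercase hex characters when set
    attributes: Dict[str, Any] = field(default_factory=dict)
    resource: Dict[str, Any] = field(default_factory=dict)
    scope_name: str = ""

    def is_correlated(self) -> bool:
        """A record can be joined to a trace only if both IDs are present."""
        return bool(self.trace_id and self.span_id)


record = LogRecordSketch(
    body="payment failed",
    severity_number=17,
    severity_text="ERROR",
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    span_id="00f067aa0ba902b7",
    attributes={"user.id": "u-12345"},
    resource={"service.name": "my-service", "service.version": "1.0.0"},
    scope_name="my-service/main",
)
print(asdict(record))
```

Resource and scope are shared by many records in practice; they are stored per record here only to keep the sketch short.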
2.2 Data flow
Application logs → Logger → LoggerProvider → Processor → Exporter → Backend
                                             (Filter/Batch)
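The flow above can be sketched with a few toy classes. Everything here (`InMemoryExporter`, `BatchProcessor`, `Logger`, the `keep` predicate) is a simplified, hypothetical model written for illustration; a real SDK batch processor also flushes on a timer and exports on a background thread.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class InMemoryExporter:
    """Stands in for an OTLP exporter: it just remembers each batch."""

    def __init__(self) -> None:
        self.batches: List[List[Record]] = []

    def export(self, batch: List[Record]) -> None:
        self.batches.append(list(batch))


class BatchProcessor:
    """Filters records, queues them, and exports when the queue is full."""

    def __init__(
        self,
        exporter: InMemoryExporter,
        max_batch_size: int = 2,
        keep: Callable[[Record], bool] = lambda record: True,
    ) -> None:
        self._exporter = exporter
        self._max_batch_size = max_batch_size
        self._keep = keep
        self._queue: List[Record] = []

    def on_emit(self, record: Record) -> None:
        if not self._keep(record):
            return  # the "Filter" step
        self._queue.append(record)
        if len(self._queue) >= self._max_batch_size:
            self.flush()  # the "Batch" step

    def flush(self) -> None:
        if self._queue:
            self._exporter.export(self._queue)
            self._queue = []


class Logger:
    """Turns a call into a record and hands it to the processor."""

    def __init__(self, processor: BatchProcessor) -> None:
        self._processor = processor

    def emit(self, severity: str, body: str) -> None:
        self._processor.on_emit({"severity": severity, "body": body})


exporter = InMemoryExporter()
processor = BatchProcessor(
    exporter, max_batch_size=2, keep=lambda r: r["severity"] != "DEBUG"
)
logger = Logger(processor)
logger.emit("INFO", "application started")
logger.emit("DEBUG", "cache warm-up details")  # dropped by the filter
logger.emit("WARN", "slow database connection")
logger.emit("ERROR", "payment failed")
processor.flush()  # what a shutdown hook would do for the leftover record
print([len(batch) for batch in exporter.batches])
```

The final `flush()` mirrors why the SDK examples call `Shutdown` before exit: records still sitting in the queue would otherwise be lost.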
3. Complete OpenTelemetry Logs example in Go
3.1 Install dependencies
go get go.opentelemetry.io/otel \
  go.opentelemetry.io/otel/sdk \
  go.opentelemetry.io/otel/sdk/log \
  go.opentelemetry.io/otel/log \
  go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc
3.2 Initialization (using the OTel API directly)
package main

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc"
	"go.opentelemetry.io/otel/log/global"
	sdklog "go.opentelemetry.io/otel/sdk/log"
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)

func setupLogs(ctx context.Context) (*sdklog.LoggerProvider, error) {
	// 1. Resource: identifies the service that emits the logs
	res, err := resource.New(ctx,
		resource.WithAttributes(
			semconv.ServiceName("my-service"),
			semconv.ServiceVersion("1.0.0"),
		),
	)
	if err != nil {
		return nil, err
	}

	// 2. Exporter: send OTLP over gRPC to the OTel Collector
	exporter, err := otlploggrpc.New(ctx,
		otlploggrpc.WithEndpoint("otel-collector:4317"),
		otlploggrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}

	// 3. LoggerProvider with a batch processor that exports every 5 seconds
	lp := sdklog.NewLoggerProvider(
		sdklog.WithResource(res),
		sdklog.WithProcessor(
			sdklog.NewBatchProcessor(exporter,
				sdklog.WithExportInterval(5*time.Second),
			),
		),
	)

	// The global LoggerProvider is registered through the log/global package,
	// not through the top-level otel package.
	global.SetLoggerProvider(lp)
	return lp, nil
}
3.3 Application usage
package main

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/log"
	"go.opentelemetry.io/otel/log/global"
)

// emit builds a log.Record and sends it through the given Logger.
func emit(ctx context.Context, logger log.Logger, sev log.Severity, msg string, attrs ...log.KeyValue) {
	var rec log.Record
	rec.SetTimestamp(time.Now())
	rec.SetSeverity(sev)
	rec.SetBody(log.StringValue(msg))
	rec.AddAttributes(attrs...)
	logger.Emit(ctx, rec)
}

func main() {
	ctx := context.Background()
	lp, err := setupLogs(ctx)
	if err != nil {
		panic(err)
	}
	defer lp.Shutdown(ctx) // flushes records that are still queued

	// Get a Logger from the globally registered provider
	logger := global.Logger("my-service/main")

	// Emit logs at different severity levels
	emit(ctx, logger, log.SeverityInfo, "application started")
	emit(ctx, logger, log.SeverityWarn, "slow database connection",
		log.String("db.system", "postgresql"),
		log.Int("duration_ms", 1500),
	)
	emit(ctx, logger, log.SeverityError, "payment failed",
		log.String("user.id", "u-12345"),
		log.String("order.id", "o-98765"),
		log.String("error.type", "timeout"),
	)
}
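3.4 Integrating with existing logging libraries (slog / logrus / zap)
Standard library slog + OTel, using the otelslog bridge (go get go.opentelemetry.io/contrib/bridges/otelslog):
import (
	"context"
	"log/slog"

	"go.opentelemetry.io/contrib/bridges/otelslog"
)

func main() {
	ctx := context.Background()
	lp, err := setupLogs(ctx) // defined in section 3.2
	if err != nil {
		panic(err)
	}
	defer lp.Shutdown(ctx)
	// The bridge returns a *slog.Logger whose records go to the OTel LoggerProvider.
	logger := otelslog.NewLogger("my-service", otelslog.WithLoggerProvider(lp))
	// Pass ctx so the trace_id / span_id of the active span are attached.
	logger.InfoContext(ctx, "order created",
		slog.String("order_id", "o-12345"),
		slog.Int("amount", 9999),
	)
}
Practical advice: in production, prefer an OTel log bridge (slog/logrus/zap) over calling the OTel Logger API directly. You keep the native logging experience, and trace_id / span_id are attached automatically.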
4. Python example
4.1 Install dependencies
pip install opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-exporter-otlp-proto-grpc \
  opentelemetry-instrumentation-logging
4.2 Complete code
import logging

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.logging import LoggingInstrumentor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.semconv.resource import ResourceAttributes

# 1. Set up a TracerProvider so that logs can be correlated with traces
resource = Resource.create({
    ResourceAttributes.SERVICE_NAME: "my-python-service",
})
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="otel-collector:4317", insecure=True))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

# 2. Inject trace_id / span_id into standard logging records.
# set_logging_format=True makes the instrumentor call logging.basicConfig()
# with a format that includes the IDs, so calling basicConfig() again
# afterwards would have no effect. The level is passed here instead.
LoggingInstrumentor().instrument(set_logging_format=True, log_level=logging.INFO)

# 3. Application usage
logger = logging.getLogger(__name__)

# Plain log
logger.info("application started")

# Log with trace context
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("process-order"):
    logger.info("processing order", extra={"order_id": "o-12345"})
    # The output line carries the trace_id and span_id of the current span
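4.3 Example output
With a JSON formatter attached to the handler (for example python-json-logger; the format installed by set_logging_format is a plain text line), a record looks like this. The IDs are illustrative: a trace ID is 32 hex characters and a span ID is 16.
{
  "asctime": "2026-06-21T10:00:00",
  "name": "__main__",
  "levelname": "INFO",
  "message": "processing order",
  "order_id": "o-12345",
  "otelSpanID": "00f067aa0ba902b7",
  "otelTraceID": "4bf92f3577b34da6a3ce929d0e0e4736",
  "otelServiceName": "my-python-service"
}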
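For readers who want JSON output like the record above without an extra dependency, the following standard-library sketch shows one possible formatter. `JsonFormatter` is a hypothetical class written for this example. The `otelTraceID` / `otelSpanID` values are passed through `extra=` only to keep the snippet self-contained; with the logging instrumentation enabled they would be injected automatically.

```python
import io
import json
import logging

# Attribute names that every logging.LogRecord carries; anything else on a
# record was supplied via `extra=` (or injected by instrumentation).
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
_STANDARD_ATTRS.update({"message", "asctime"})


class JsonFormatter(logging.Formatter):
    """Render a record as one JSON object per line, including extra fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "asctime": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "name": record.name,
            "levelname": record.levelname,
            "message": record.getMessage(),
        }
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                payload[key] = value
        return json.dumps(payload, ensure_ascii=False, default=str)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

json_logger = logging.getLogger("json-sketch")
json_logger.setLevel(logging.INFO)
json_logger.propagate = False
json_logger.addHandler(handler)

json_logger.info(
    "processing order",
    extra={
        "order_id": "o-12345",
        "otelTraceID": "4bf92f3577b34da6a3ce929d0e0e4736",
        "otelSpanID": "00f067aa0ba902b7",
    },
)
parsed = json.loads(stream.getvalue())
print(parsed)
```

Keeping every field as a top-level JSON key is what lets a backend index `otelTraceID` and join the line to its trace.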
5. Java example
5.1 Maven dependencies
<dependencies>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-api</artifactId>
    <version>1.36.0</version>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-sdk</artifactId>
    <version>1.36.0</version>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry</groupId>
    <artifactId>opentelemetry-exporter-otlp</artifactId>
    <version>1.36.0</version>
  </dependency>
  <dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-logback-mdc-1.0</artifactId>
    <!-- instrumentation library artifacts are published with an -alpha suffix -->
    <version>2.2.0-alpha</version>
  </dependency>
</dependencies>
5.2 Configuration (OpenTelemetry SDK)
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.logs.Logger;
import io.opentelemetry.api.logs.Severity;
import io.opentelemetry.exporter.otlp.logs.OtlpGrpcLogRecordExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.logs.SdkLoggerProvider;
import io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor;
import io.opentelemetry.sdk.resources.Resource;
import java.time.Duration;

public class LogExample {
    public static void main(String[] args) {
        // service.name is set with a plain AttributeKey, so no semconv artifact is needed
        Resource resource = Resource.getDefault().merge(
            Resource.create(Attributes.of(
                AttributeKey.stringKey("service.name"), "my-java-service"
            ))
        );

        OtlpGrpcLogRecordExporter exporter = OtlpGrpcLogRecordExporter.builder()
            .setEndpoint("http://otel-collector:4317")
            .build();

        SdkLoggerProvider loggerProvider = SdkLoggerProvider.builder()
            .setResource(resource)
            .addLogRecordProcessor(BatchLogRecordProcessor.builder(exporter)
                .setScheduleDelay(Duration.ofSeconds(5))
                .build())
            .build();

        OpenTelemetrySdk sdk = OpenTelemetrySdk.builder()
            .setLoggerProvider(loggerProvider)
            .build();

        // Get a Logger
        Logger logger = sdk.getLogsBridge().get("my-service/main");

        // Application logs
        logger.logRecordBuilder()
            .setSeverity(Severity.INFO)
            .setBody("application started")
            .emit();

        logger.logRecordBuilder()
            .setSeverity(Severity.ERROR)
            .setBody("payment failed")
            .setAttribute(AttributeKey.stringKey("user.id"), "u-12345")
            .setAttribute(AttributeKey.stringKey("order.id"), "o-98765")
            .emit();

        // Flush records that are still queued in the batch processor before exiting
        loggerProvider.close();
    }
}
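5.3 Integrating with logback (recommended)
logback.xml:
<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss} %-5level [%thread] %logger{36} traceId=%X{trace_id} spanId=%X{span_id} - %msg%n</pattern>
    </encoder>
  </appender>
  <!-- Wraps CONSOLE and puts trace_id / span_id into the MDC; declared after the appender it references -->
  <appender name="OTEL" class="io.opentelemetry.instrumentation.logback.mdc.v1_0.OpenTelemetryAppender">
    <appender-ref ref="CONSOLE"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="OTEL"/>
  </root>
</configuration>
No changes to application code are needed: the OpenTelemetry appender adds trace_id / span_id to the MDC for every log event emitted inside an active span. This appender only enriches the MDC; it does not export log records over OTLP by itself.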
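6. Correlating with traces (Logs ↔ Traces)
6.1 Why correlate?
When troubleshooting, you can drill down layer by layer: metric anomaly → slow trace → log details.
[Grafana] HTTP P99 = 2s (anomaly)
↓
[Tempo/Jaeger] Slow trace: GET /api/order
↓
[Loki] All logs for that trace (including exception stack traces)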
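The last hop of the drill-down (from a trace to its logs) is a lookup by trace ID. The sketch below shows the idea on JSON log lines held in memory; `logs_for_trace` is a hypothetical helper, the field name `otelTraceID` is assumed to match the Python output format shown earlier, and a real backend such as Loki would run this as a query rather than a scan.

```python
import json
from typing import Dict, Iterable, List


def logs_for_trace(lines: Iterable[str], trace_id: str) -> List[Dict[str, object]]:
    """Return the parsed JSON log entries whose otelTraceID equals trace_id."""
    matches: List[Dict[str, object]] = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not JSON
        if isinstance(entry, dict) and entry.get("otelTraceID") == trace_id:
            matches.append(entry)
    return matches


SLOW_TRACE = "4bf92f3577b34da6a3ce929d0e0e4736"
sample_lines = [
    json.dumps({"levelname": "INFO", "message": "processing order", "otelTraceID": SLOW_TRACE}),
    json.dumps({"levelname": "INFO", "message": "health check", "otelTraceID": "0" * 32}),
    "plain text line without JSON",
    json.dumps({"levelname": "ERROR", "message": "payment failed", "otelTraceID": SLOW_TRACE}),
]

for entry in logs_for_trace(sample_lines, SLOW_TRACE):
    print(entry["levelname"], entry["message"])
```

This only works when every service writes the trace ID under the same field name, which is why a unified correlation field matters.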
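6.2 Implementation: logging inside a span
import logging

from opentelemetry import trace

logger = logging.getLogger(__name__)
tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("process-order") as span:
    # trace_id / span_id are already in the current context
    logger.info("start processing order", extra={"order_id": "o-12345"})
    try:
        process_payment()  # application-specific function, defined elsewhere
    except Exception as e:
        # The log is correlated with the current span automatically
        logger.exception("payment failed", extra={"error": str(e)})
        span.record_exception(e)
        span.set_status(trace.Status(trace.StatusCode.ERROR))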
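6.3 Implementation: OTel Collector configuration for correlation
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  # To verify for your Collector version: recent releases deprecate the dedicated
  # loki exporter in favor of sending OTLP to Loki through the otlphttp exporter.
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [loki]
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo]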
7. OpenTelemetry Collector configuration (receiving logs)
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - type: json_parser
      - type: time_parser
        parse_from: attributes.time
        # time_parser requires a layout; this value is an assumption,
        # adjust it to the timestamp format of your logs
        layout: '%Y-%m-%dT%H:%M:%S.%LZ'
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: insert
  batch:
    timeout: 5s
exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
  otlp:
    # gRPC endpoint in host:port form; plaintext, as the original http:// scheme implied
    endpoint: backend-logs:4317
    tls:
      insecure: true
service:
  pipelines:
    logs:
      receivers: [otlp, filelog]
      # memory_limiter runs first and batch runs last
      processors: [memory_limiter, resource, batch]
      exporters: [loki, otlp]
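A quick way to check that the `http` receiver on port 4318 accepts logs is to post one record in OTLP/HTTP JSON form. The sketch below builds such a payload with the standard library only. The helper names, the scope name `collector-smoke-test`, and the `localhost` endpoint are assumptions for illustration; `send` is defined but not called, since it needs a running Collector.

```python
import json
import time
import urllib.request
from typing import Dict, Optional


def build_otlp_logs_payload(
    service_name: str,
    body: str,
    severity_number: int = 9,
    severity_text: str = "INFO",
    attributes: Optional[Dict[str, str]] = None,
    trace_id: str = "",
    span_id: str = "",
) -> dict:
    """Build an OTLP/HTTP JSON body containing a single log record."""
    record = {
        "timeUnixNano": str(time.time_ns()),
        "severityNumber": severity_number,
        "severityText": severity_text,
        "body": {"stringValue": body},
        "attributes": [
            {"key": key, "value": {"stringValue": str(value)}}
            for key, value in (attributes or {}).items()
        ],
    }
    if trace_id and span_id:
        # OTLP JSON encodes trace and span IDs as hex strings
        record["traceId"] = trace_id
        record["spanId"] = span_id
    return {
        "resourceLogs": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": service_name}}
                    ]
                },
                "scopeLogs": [
                    {"scope": {"name": "collector-smoke-test"}, "logRecords": [record]}
                ],
            }
        ]
    }


def send(payload: dict, endpoint: str = "http://localhost:4318/v1/logs") -> int:
    """POST the payload to the Collector's OTLP/HTTP logs path; return the HTTP status."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_otlp_logs_payload(
    "my-service",
    "payment failed",
    severity_number=17,
    severity_text="ERROR",
    attributes={"order.id": "o-98765"},
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    span_id="00f067aa0ba902b7",
)
print(json.dumps(payload, indent=2))
```

Calling `send(payload)` against a running Collector should make the record flow through the `logs` pipeline to every configured exporter.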
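8. Glossary
| Term | Meaning |
|---|---|
| LogRecord | A single log entry |
| Logger | Creates LogRecords |
| LoggerProvider | Global provider |
| Severity | Severity level |
| Body | The main content of the log |
| Attribute | Key-value tag |
| Resource | Resource attributes |
| TraceId / SpanId | Link the log to a trace |
| OTLP | OpenTelemetry Protocol, the transport protocol |
| Log Appender | Connects an existing logging framework to OTel |
| MDC / Context | Context propagation across the call stack (Java MDC / Python contextvars) |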
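9. Practical advice
- Structured logs: use JSON with discrete fields (message, level, timestamp, trace_id, span_id)
- Unified correlation fields: every log carries trace_id and span_id, so you can jump from Loki to Tempo in one click
- Avoid high-cardinality labels: do not turn user_id or order_id into index labels (for example Loki stream labels); keep them as fields on the log record
- Log level conventions: ERROR needs human intervention, WARN deserves attention but does not block, INFO records key business events
- Log sampling: on high-QPS services, sample INFO logs; storing every DEBUG log is expensive
- Error logs must include the stack trace: use logger.exception() rather than logger.error()
- Use a log bridge in production: slog/logrus/logback with an OTel appender, rather than calling the OTel Logger API directly
- Let the OTel Collector receive logs: applications send directly to the Collector, which routes to Loki/ES
- Retention period: balance compliance requirements against storage cost
- Correlated queries: metric → trace → log, drilling down layer by layer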
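The sampling advice can be sketched as a small decision function. The policy below is one possible reading, stated as an assumption: WARN and above are always kept, DEBUG and below are dropped, and INFO is sampled by hashing the trace ID so that all INFO logs of one trace are kept or dropped together. `should_keep` and its parameters are hypothetical names.

```python
import zlib


def should_keep(severity_number: int, trace_id: str, info_sample_rate: float = 0.1) -> bool:
    """Decide whether to store a log record under the assumed sampling policy."""
    if severity_number >= 13:  # WARN, ERROR, FATAL: always keep
        return True
    if severity_number < 9:  # TRACE, DEBUG: too costly to store in full
        return False
    # INFO: deterministic sampling keyed on the trace ID, so a sampled trace
    # keeps all of its INFO logs instead of a random subset.
    bucket = zlib.crc32(trace_id.encode("utf-8")) % 10_000
    return bucket < info_sample_rate * 10_000


trace_ids = [f"{i:032x}" for i in range(1000)]
kept = sum(should_keep(9, trace_id, info_sample_rate=0.1) for trace_id in trace_ids)
print(f"kept {kept} of {len(trace_ids)} INFO records")
```

Hashing instead of calling a random number generator keeps the decision reproducible across services, as long as they all see the same trace ID.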
10. Quick recap
- Data model: LogRecord = Timestamp + Severity + Body + Attributes + Resource + TraceContext
- Severity: TRACE/DEBUG/INFO/WARN/ERROR/FATAL
- Trace correlation: logs carry trace_id / span_id, so you can jump from a trace to its logs
- Use a log bridge in production: logback/logrus/slog with an OTel appender
- Structured output: JSON with discrete fields
- Data flow: App → LoggerProvider → Processor (Batch) → Exporter (OTLP) → Backend (Loki/ES)
- The OTel Collector receives logs, traces, and metrics in one place and routes each separately