
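Apache SkyWalking in Practice

SkyWalking is an application performance monitoring (APM) tool for distributed systems, designed for microservice, cloud-native, and container-based (Docker, K8s, Mesos) architectures.

It provides an all-in-one solution for distributed tracing, service mesh telemetry analysis, metric aggregation, and visualization.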

SkyWalking OAP

# closer.cgi returns a mirror-selection page, so download from the archive directly
wget https://archive.apache.org/dist/skywalking/9.2.0/apache-skywalking-apm-9.2.0.tar.gz
tar -zxvf apache-skywalking-apm-9.2.0.tar.gz

# Edit the storage configuration
vim config/application.yml
  • Default storage: H2
  • Default listeners: 0.0.0.0:11800 for the gRPC APIs and 0.0.0.0:12800 for the HTTP REST APIs
storage:
  selector: ${SW_STORAGE:h2}
  elasticsearch:
    namespace: ${SW_NAMESPACE:""}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}

The default startup script is bin/oapService.sh, relative to the installation directory.

The OAP server address defaults to http://127.0.0.1:12800.

  • gRPCPort 11800
  • restPort 12800

Persisting to Elasticsearch

storage:
  selector: ${SW_STORAGE:elasticsearch}
  elasticsearch:
    namespace: ${SW_NAMESPACE:""}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
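The ${NAME:default} placeholders in application.yml read an environment variable and fall back to the value after the first colon, so the storage can also be switched by exporting SW_STORAGE before starting the OAP instead of editing the file. The shell function below is only a sketch of that lookup rule, not SkyWalking's actual parser: the name resolve_placeholder is made up for illustration, and treating an empty variable like an unset one is an assumption.

```shell
# Illustrative only: mimic how a '${NAME:default}' placeholder is resolved.
# resolve_placeholder is a hypothetical helper, not part of SkyWalking.
resolve_placeholder() {
  inner=${1#??}             # drop the leading '${'
  inner=${inner%?}          # drop the trailing '}'
  name=${inner%%:*}         # variable name: text before the first colon
  default=${inner#*:}       # default value: text after the first colon
  eval "value=\${$name:-}"  # look up the environment variable by name
  # Assumption: an empty variable is treated like an unset one.
  printf '%s\n' "${value:-$default}"
}

unset SW_STORAGE
resolve_placeholder '${SW_STORAGE:h2}'   # falls back to the default
SW_STORAGE=elasticsearch
resolve_placeholder '${SW_STORAGE:h2}'   # the environment value wins
```

In practice this means a command such as SW_STORAGE=elasticsearch SW_STORAGE_ES_CLUSTER_NODES=localhost:9200 bin/oapService.sh selects Elasticsearch without touching the YAML.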

SkyWalking UI

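The SkyWalking UI distribution is already included in the official Apache release.

By default the UI listens on port 8080 and sends its GraphQL queries to 127.0.0.1:12800; the webapp.yml shown here sets the port to 8088.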

vim webapp/webapp.yml
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

server:
  port: 8088

spring:
  cloud:
    gateway:
      routes:
        - id: oap-route
          uri: lb://oap-service
          predicates:
            - Path=/graphql/**
    discovery:
      client:
        simple:
          instances:
            oap-service:
              - uri: http://127.0.0.1:12800
            # - uri: http://<oap-host-1>:<oap-port1>
            # - uri: http://<oap-host-2>:<oap-port2>

  mvc:
    throw-exception-if-no-handler-found: true

  web:
    resources:
      add-mappings: true

management:
  server:
    base-path: /manage

Startup script: bin/webappService.sh

SkyWalking Agent Configuration

cd /c/apps/skywalking
# closer.cgi returns a mirror-selection page, so download from the archive directly
wget https://archive.apache.org/dist/skywalking/java-agent/8.12.0/apache-skywalking-java-agent-8.12.0.tgz
tar -zxvf apache-skywalking-java-agent-8.12.0.tgz && cd skywalking-agent

Starting from a script

# Set the agent (service) name. Usually the Spring Boot project's `spring.application.name`.
export SW_AGENT_NAME=demo-application
# Set the Collector (OAP) address.
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800
# Set the maximum number of spans in a single segment. Usually not needed; the default is 300.
export SW_AGENT_SPAN_LIMIT=1000
# Use forward slashes: an unquoted backslash is an escape character in the shell.
export JAVA_AGENT=-javaagent:C:/Apps/skywalking/skywalking-agent/skywalking-agent.jar

# Start the jar; -javaagent is a JVM option and must come before -jar
java $JAVA_AGENT -jar app-1.0.0.RELEASE.jar

# A more complete script for reference:
nohup java -javaagent:C:/Apps/skywalking/skywalking-agent/skywalking-agent.jar -Xmx512m -Xms512m -jar ${APP_NAME} --spring.profiles.active=dev --spring.cloud.nacos.discovery.server-addr=127.0.0.1:8848 --spring.cloud.nacos.discovery.password=nacos --spring.cloud.nacos.discovery.username=nacos --spring.cloud.nacos.config.server-addr=127.0.0.1:8848 >/dev/null 2>&1 &
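The startup steps above can be collected into one helper. This is a minimal sketch: build_agent_cmd and its argument order are hypothetical, and it only prints the command so that it can be inspected before it is run.

```shell
# Hypothetical helper: print the command that starts an application with the agent.
# Usage: build_agent_cmd <skywalking-agent.jar> <service name> <application jar>
build_agent_cmd() {
  agent_jar=$1
  service_name=$2
  app_jar=$3
  # Fall back to the local OAP gRPC port when no collector address is exported.
  backend=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:-127.0.0.1:11800}
  # -javaagent is a JVM option, so it is placed before -jar.
  printf 'SW_AGENT_NAME=%s SW_AGENT_COLLECTOR_BACKEND_SERVICES=%s java -javaagent:%s -jar %s\n' \
    "$service_name" "$backend" "$agent_jar" "$app_jar"
}

build_agent_cmd C:/Apps/skywalking/skywalking-agent/skywalking-agent.jar demo-application app-1.0.0.RELEASE.jar
```

Printing instead of executing keeps the sketch side-effect free; piping the output to sh, or replacing printf with a direct java call, would turn it into a launcher.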

IDEA

VM options:

-javaagent:C:\Apps\skywalking\skywalking-agent\skywalking-agent.jar

Environment variables:

SW_AGENT_NAME=wb-gateway;SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800

Log Collection and Analysis

toolkit logback

<!-- https://mvnrepository.com/artifact/org.apache.skywalking/apm-toolkit-logback-1.x -->
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-logback-1.x</artifactId>
    <version>8.12.0</version>
</dependency>

Logback logs are collected over gRPC. A sample logback.xml:

<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="true" scanPeriod="5 seconds">

    <appender name="stdout" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
                <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <appender name="grpc-log" class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.log.GRPCLogClientAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
                <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <appender name="fileAppender" class="ch.qos.logback.core.FileAppender">
        <file>/tmp/skywalking-logs/logback/e2e-service-provider.log</file>
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
                <Pattern>[%sw_ctx] [%level] %d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %logger:%line - %msg%n</Pattern>
            </layout>
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="grpc-log"/>
        <appender-ref ref="stdout"/>
    </root>
    <logger name="fileLogger" level="INFO">
        <appender-ref ref="fileAppender"/>
    </logger>
</configuration>

Reference configuration 2

<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="true" scanPeriod="60 seconds" debug="false">
<!-- Log directory -->
<property name="log.path" value="/data/logs/iot-platform-gateway"/>
<!-- Log output patterns -->
<property name="log.pattern" value="%d{HH:mm:ss.SSS} [%thread] %-5level %logger{20} - [%method,%line] - %msg%n"/>
<property name="log.pattern.traceId" value="%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - [%tid] - %msg%n"/>

<!-- Console output -->
<!--    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">-->
<!--        <encoder>-->
<!--            <pattern>${log.pattern}</pattern>-->
<!--        </encoder>-->
<!--    </appender>-->

<appender name="console" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
            <pattern>${log.pattern.traceId}</pattern>
        </layout>
    </encoder>
</appender>

<!-- SkyWalking collects logs over gRPC -->
<appender name="grpc_log_info" class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.log.GRPCLogClientAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
            <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
        </layout>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
        <!-- Level to filter on -->
        <level>INFO</level>
        <!-- On match: accept (log the event) -->
        <onMatch>ACCEPT</onMatch>
        <!-- On mismatch: deny (drop the event) -->
        <onMismatch>DENY</onMismatch>
    </filter>
</appender>


<appender name="grpc_log_error" class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.log.GRPCLogClientAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.mdc.TraceIdMDCPatternLogbackLayout">
            <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{tid}] [%thread] %-5level %logger{36} -%msg%n</Pattern>
        </layout>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
        <!-- Level to filter on -->
        <level>ERROR</level>
        <!-- On match: accept (log the event) -->
        <onMatch>ACCEPT</onMatch>
        <!-- On mismatch: deny (drop the event) -->
        <onMismatch>DENY</onMismatch>
    </filter>
</appender>

<!-- System log output -->
<appender name="file_info" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${log.path}/info.log</file>
    <!-- Rolling policy: create log files based on time -->
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <!-- Log file name pattern -->
        <fileNamePattern>${log.path}/info.%d{yyyy-MM-dd}.log</fileNamePattern>
        <!-- Keep at most 60 days of history -->
        <maxHistory>60</maxHistory>
    </rollingPolicy>
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
            <pattern>${log.pattern.traceId}</pattern>
        </layout>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
        <!-- Level to filter on -->
        <level>INFO</level>
        <!-- On match: accept (log the event) -->
        <onMatch>ACCEPT</onMatch>
        <!-- On mismatch: deny (drop the event) -->
        <onMismatch>DENY</onMismatch>
    </filter>
</appender>

<appender name="file_error" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${log.path}/error.log</file>
    <!-- Rolling policy: create log files based on time -->
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <!-- Log file name pattern -->
        <fileNamePattern>${log.path}/error.%d{yyyy-MM-dd}.log</fileNamePattern>
        <!-- Keep at most 60 days of history -->
        <maxHistory>60</maxHistory>
    </rollingPolicy>
    <!--        <encoder>-->
    <!--            <pattern>${log.pattern}</pattern>-->
    <!--        </encoder>-->
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
            <pattern>${log.pattern.traceId}</pattern>
        </layout>
    </encoder>
    <filter class="ch.qos.logback.classic.filter.LevelFilter">
        <!-- Level to filter on -->
        <level>ERROR</level>
        <!-- On match: accept (log the event) -->
        <onMatch>ACCEPT</onMatch>
        <!-- On mismatch: deny (drop the event) -->
        <onMismatch>DENY</onMismatch>
    </filter>
</appender>
<!-- Only print error-level logs from the dozer package -->
<logger name="org.dozer" level="error" additivity="false"/>

<!-- Suppress logs at warn level and above from the nacos package -->
<logger name="com.alibaba.nacos.client.naming" level="warn" additivity="false"/>
<!-- Spring log level control -->
<logger name="org.springframework" level="warn"/>

<!-- Console output plus system operation logs -->
<root level="info">
    <appender-ref ref="console"/>
    <appender-ref ref="file_info"/>
    <appender-ref ref="file_error"/>
    <appender-ref ref="grpc_log_info"/>
    <appender-ref ref="grpc_log_error"/>
</root>
</configuration>
vim config/agent.config
# Backend service addresses.
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:127.0.0.1:11800}
# If the agent and the OAP server are not on the same host, add the following settings
plugin.toolkit.log.grpc.reporter.server_host=${SW_GRPC_LOG_SERVER_HOST:127.0.0.1}
plugin.toolkit.log.grpc.reporter.server_port=${SW_GRPC_LOG_SERVER_PORT:11800}
plugin.toolkit.log.grpc.reporter.max_message_size=${SW_GRPC_LOG_MAX_MESSAGE_SIZE:10485760}
plugin.toolkit.log.grpc.reporter.upstream_timeout=${SW_GRPC_LOG_GRPC_UPSTREAM_TIMEOUT:30}
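When the OAP runs on another host, both the trace collector address and the gRPC log reporter need to point at it. The sketch below derives the matching environment variables (their names come from the agent.config lines above) from a single host. The helper name oap_env and the example address are hypothetical.

```shell
# Hypothetical helper: print export lines that point traces and gRPC logs at one OAP host.
# Usage: oap_env <oap host> [gRPC port, default 11800]
oap_env() {
  host=$1
  port=${2:-11800}
  printf 'export SW_AGENT_COLLECTOR_BACKEND_SERVICES=%s:%s\n' "$host" "$port"
  printf 'export SW_GRPC_LOG_SERVER_HOST=%s\n' "$host"
  printf 'export SW_GRPC_LOG_SERVER_PORT=%s\n' "$port"
}

# Apply the settings to the current shell before starting the application.
eval "$(oap_env 192.0.2.10)"
echo "$SW_AGENT_COLLECTOR_BACKEND_SERVICES"
```

Keeping the host in one place avoids the situation where traces reach the remote OAP while logs are still sent to 127.0.0.1.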

Spring Cloud Gateway support

Copy apm-spring-cloud-gateway-3.x-plugin-8.12.0.jar from skywalking-agent/optional-plugins to skywalking-agent/plugins.

Spring Cloud Gateway runs on WebFlux, so apm-spring-webflux-5.x-plugin-8.12.0.jar is usually copied in the same way.

Ignoring traces for specific endpoints (URLs)

  • Copy apm-trace-ignore-plugin-8.12.0.jar from skywalking-agent/optional-plugins to skywalking-agent/plugins.
  • Create apm-trace-ignore-plugin.config under skywalking-agent/config and add, for example, trace.ignore_path=${SW_AGENT_TRACE_IGNORE_PATH:/api-docs/**} to skip tracing of Swagger UI requests.
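The two steps above can be scripted. A minimal sketch, assuming the agent directory layout described above: enable_trace_ignore is a hypothetical name, and the jar is matched with a glob so the version is not hard-coded.

```shell
# Hypothetical helper: enable the trace-ignore plugin and write its config file.
# Usage: enable_trace_ignore <agent directory> <path pattern>
enable_trace_ignore() {
  agent_dir=$1
  pattern=$2
  # Step 1: copy the optional plugin into the active plugins directory.
  cp "$agent_dir"/optional-plugins/apm-trace-ignore-plugin-*.jar "$agent_dir/plugins/" || return 1
  # Step 2: write the config. The placeholder is emitted literally so that
  # SW_AGENT_TRACE_IGNORE_PATH can still override the pattern at runtime.
  printf 'trace.ignore_path=${SW_AGENT_TRACE_IGNORE_PATH:%s}\n' "$pattern" \
    > "$agent_dir/config/apm-trace-ignore-plugin.config"
}

# Example call (quote the pattern so the shell does not expand the asterisks):
# enable_trace_ignore /c/apps/skywalking/skywalking-agent '/api-docs/**'
```

The agent reads its plugins at startup, so the application has to be restarted after the helper has run.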

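Getting the traceId in code

Adding custom tracing to a method is simple: annotate the method with @Trace. This also requires the activations/apm-toolkit-trace-activation-8.12.0.jar plugin that ships with the agent.

In practice there is no need to annotate every method with @Trace just to obtain the traceId; reading it once in the global response handler is enough.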

<!-- https://mvnrepository.com/artifact/org.apache.skywalking/apm-toolkit-trace -->
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>apm-toolkit-trace</artifactId>
    <version>8.12.0</version>
</dependency>
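import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.apache.skywalking.apm.toolkit.trace.Trace;
import org.apache.skywalking.apm.toolkit.trace.TraceContext;
import org.springframework.core.MethodParameter;
import org.springframework.http.MediaType;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.http.server.ServerHttpRequest;
import org.springframework.http.server.ServerHttpResponse;
import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.servlet.mvc.method.annotation.ResponseBodyAdvice;

// Result is the project's own response wrapper with fluent data(...) and traceId(...) setters.
@Slf4j
@ControllerAdvice
public class GlobalResponseBodyAdvice implements ResponseBodyAdvice<Object> {

    @Override
    public boolean supports(MethodParameter returnType,
                            Class<? extends HttpMessageConverter<?>> converterType) {
        // To skip handlers that already return the wrapper type, return false for them:
        // return !returnType.getParameterType().equals(ResultWrapper.class);
        return true;
    }

    @Override
    @ResponseBody
    @Trace
    public Object beforeBodyWrite(Object body, MethodParameter returnType,
                                  MediaType selectedContentType,
                                  Class<? extends HttpMessageConverter<?>> selectedConverterType,
                                  ServerHttpRequest request, ServerHttpResponse response) {

        // Read the trace ID of the current request from the SkyWalking context.
        String traceId = TraceContext.traceId();

        if (body instanceof String) {
            // A String return value must be wrapped and serialized back to a String,
            // otherwise an error is raised when the response is written.
            try {
                ObjectMapper objectMapper = new ObjectMapper();
                Result<Object> result = new Result<>().data(body).traceId(traceId);
                return objectMapper.writeValueAsString(result);
            } catch (JsonProcessingException e) {
                // Keep the original exception as the cause so the failure can be diagnosed.
                throw new RuntimeException("Failed to serialize String response", e);
            }
        } else if (body instanceof Result) {
            // Already wrapped: only attach the trace ID.
            return ((Result<?>) body).traceId(traceId);
        }

        return new Result<>().traceId(traceId).data(body);
    }
}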

Colored logs

<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
        <layout class="org.apache.skywalking.apm.toolkit.log.logback.v1.x.TraceIdPatternLogbackLayout">
            <pattern>[%tid] ${CONSOLE_LOG_PATTERN:-%clr(%d{${LOG_DATEFORMAT_PATTERN:-yyyy-MM-dd HH:mm:ss.SSS}}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}}</pattern>
        </layout>
    </encoder>
</appender>

The [%tid] placeholder marks where the trace ID is printed. It shows TID:N/A by default and the actual trace ID while a request is being handled.
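Once every line carries the trace ID, the log lines of a single request can be pulled out by that ID. A minimal sketch, assuming the [%tid] pattern above renders as [TID:<trace id>]. The helper name logs_for_trace, the log path, and the ID in the example call are hypothetical.

```shell
# Hypothetical helper: print the log lines that belong to one trace.
# Usage: logs_for_trace <trace id> <log file>
logs_for_trace() {
  # -F matches the bracketed ID as a fixed string rather than as a regex.
  grep -F "[TID:$1]" "$2"
}

# Example call with a made-up trace ID and log path:
# logs_for_trace 3c2f1a.57.16700000000000001 /data/logs/iot-platform-gateway/info.log
```

The same ID can be pasted into the trace search of the SkyWalking UI to see the matching spans.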

Appendix