🔥 Interview | Spring Cloud Topics

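Project Scenarios

How do you guarantee the idempotency of business data?

First, every transaction or message must carry a unique business identifier, such as a message reference number or serial number, and that identifier is used as a unique index on the table.

Optimistic locking: add a status or version field to the row so that only one thread can update it successfully. A failed update means another thread is already processing the row.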
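The optimistic-locking idea can be sketched in memory. This is only an illustration: the class and method names are hypothetical, and an AtomicLong stands in for the version column that a real UPDATE ... WHERE version = ? statement would check.

```java
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for a table row with a version column (hypothetical names). */
final class VersionedRecord {
    private final AtomicLong version = new AtomicLong(0);
    private volatile String status = "NEW";

    long version() {
        return version.get();
    }

    String status() {
        return status;
    }

    /**
     * Mimics "UPDATE t SET status = ?, version = version + 1 WHERE id = ? AND version = ?".
     * Only the caller whose expected version still matches wins; everyone else gets false.
     * A real database applies both column changes atomically in one statement.
     */
    boolean updateIfVersionMatches(long expectedVersion, String newStatus) {
        if (!version.compareAndSet(expectedVersion, expectedVersion + 1)) {
            return false; // another thread already updated the row
        }
        status = newStatus;
        return true;
    }
}
```

Two callers that both read version 0 can both attempt the update, but only the first succeeds; the second sees false and knows the row is already being handled.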

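Distributed locking: when multiple threads compete, only the one that holds the lock may perform the operation, and it releases the lock when it finishes. With Redisson (which handles edge cases such as lease renewal and reentrancy after a disconnect), define a lock key, obtain a lock object from Redisson, run the business logic, and release the lock at the end; any other thread that arrives in the meantime returns immediately. With raw Redis and Lua scripts, each thread must track its own token so that only the lock owner can release the lock.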
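The "only the owner may release" rule can be sketched without Redis. In this illustration a ConcurrentHashMap stands in for the Redis key space, the class name TokenLock is hypothetical, and lock expiry (TTL) is deliberately left out.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Token-based lock sketch; the map stands in for Redis (assumption, no TTL handling). */
final class TokenLock {
    // lock key -> owner token, similar in spirit to "SET key token NX"
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    /** Returns the owner token when the lock was acquired, or null when it is already held. */
    String tryLock(String lockKey) {
        String token = UUID.randomUUID().toString();
        return locks.putIfAbsent(lockKey, token) == null ? token : null;
    }

    /** Removes the key only if the stored token matches, like a compare-and-delete Lua script. */
    boolean unlock(String lockKey, String token) {
        return token != null && locks.remove(lockKey, token);
    }
}
```

A caller that did not acquire the lock gets null from tryLock and should return without running the business logic, which matches the behavior described above.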

If the Eureka Server goes down, can services still call each other?

If the Eureka Server goes down, services can still call each other in the short term because Eureka clients cache the registry locally.

However, over time the cache may become stale, causing calls to fail. In production, we usually deploy Eureka Server as a cluster to avoid a single point of failure.
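The local-cache behavior can be sketched as follows. This is not the Eureka client API: CachedRegistryClient and its registryFetcher are hypothetical names, and the fetcher simply throws when the registry is unreachable.

```java
import java.util.List;
import java.util.function.Supplier;

/** Sketch of a client that keeps the last known instance list (hypothetical names). */
final class CachedRegistryClient {
    private final Supplier<List<String>> registryFetcher; // stands in for the remote registry call
    private volatile List<String> cachedInstances = List.of();

    CachedRegistryClient(Supplier<List<String>> registryFetcher) {
        this.registryFetcher = registryFetcher;
    }

    /** Tries to refresh the cache; when the registry is down, returns the cached (possibly stale) list. */
    List<String> instances() {
        try {
            cachedInstances = List.copyOf(registryFetcher.get());
        } catch (RuntimeException registryDown) {
            // keep serving the previous snapshot
        }
        return cachedInstances;
    }
}
```

The stale-cache risk mentioned above shows up here directly: while the registry is down, instances that have since died stay in the returned list.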

If millions of messages pile up in the MQ, how do you resolve it?

Analyze the cause (consumer side or producer side) -> add processing capacity on the consumer side / throttle the producer side + dead-letter queues / monitoring

If millions of messages are piled up in the MQ, the first step is to analyze the cause, usually either insufficient consumer throughput or a traffic spike on the producer side.

In the short term, we can add more consumers (such as pods or threads), enable batch consumption, or throttle producers.

For long-term solutions, we should optimize consumer logic, increase parallelism by splitting queues and partitioning, and use dead-letter queues plus monitoring to prevent future backlogs.

A dead-letter queue can also serve as a delay queue: messages are published to a queue that has no consumers, and a configured delay (TTL) controls when they are dead-lettered and delivered to the target queue through its routing key. Spreading delivery out this way avoids excessive concurrency.

If a MyBatis query is slow, how do you troubleshoot it?

Application-level SQL -> database EXPLAIN -> system-level connections

If a MyBatis query is slow, I would first check at the application level whether the generated SQL is correct, avoiding issues like N+1 queries or inefficient pagination.

Then I would run the final SQL directly in the database and use EXPLAIN to see whether it is missing indexes, and check whether it is blocked by locks.

Finally, I’d check system-level factors like the connection pool or network latency.

The typical solutions are optimizing SQL and indexes, avoiding N+1 queries, using database-native pagination, and applying caching for hot data.

If a Pod’s memory keeps climbing until it is OOMKilled, how do you analyze it?

Kubernetes layer -> application layer (JVM) -> business layer

If a Pod’s memory keeps growing until it gets OOMKilled, I would first check in Kubernetes whether the memory limit is too low.

Then I’d look at monitoring to see whether the growth is steady regardless of load (a possible memory leak) or load-driven (data accumulation).

For application-level leaks, I’d export a heap dump (for example with jmap) and analyze it with a tool such as Eclipse MAT to check for issues like unbounded caches, unclosed connections, or objects that are never released.

If it’s caused by business load, such as MQ messages not being consumed in time, I’d optimize the consumption logic or scale out the service.

The solutions usually involve setting appropriate memory limits, optimizing memory usage in the code, and setting cache expiration or rate limiting when necessary.
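One common fix for an unbounded cache is to cap its size. A minimal sketch using LinkedHashMap from the JDK follows; the class name BoundedLruCache is hypothetical, and the class is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Size-capped LRU cache sketch (not thread-safe; wrap or synchronize for concurrent use). */
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows recency of access
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a cap of 2, inserting a third key evicts whichever of the first two was accessed least recently, so the cache can no longer grow without bound.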

If a CI/CD deployment fails, how do you roll back quickly?

First, check whether the failure occurred in the build phase or the deployment phase.

A build failure doesn’t affect production, so I just fix the problem and rebuild.

If the deployment fails, we usually run kubectl rollout undo to quickly revert the Deployment to its previous revision, which restores the last stable version.

If a Spring Boot service starts slowly, what are the possible causes?

Spring Boot applications may start slowly for several reasons:

too many dependencies and bean initializations,
slow database connections or schema auto-generation,
delays from external services like Eureka or Redis,
heavy logging or reflection overhead,
heavy custom initialization logic such as large cache preloading,
the Pod loading too many ConfigMaps or Secrets at startup.

To troubleshoot, I usually check the logs to see where the application is stuck, enable debug mode, or analyze the Actuator startup endpoint.

Optimizations include reducing dependency scope, removing unnecessary auto-configurations, tuning database connections, and deferring heavy initialization tasks.

Spring Boot

How does Spring Boot auto-configuration work?

Spring Boot is a rapid development framework built on Spring. It provides automatic configuration, an embedded container, fast startup, and easy integration with Spring Cloud.

Its automatic configuration is based on @EnableAutoConfiguration and spring.factories. At startup, Spring Boot scans the configuration classes declared by its dependencies and automatically assembles the beans whose conditions are met.

Which annotations does @SpringBootApplication include?

@SpringBootApplication is the core annotation of Spring Boot. It consists of @SpringBootConfiguration, @EnableAutoConfiguration and @ComponentScan.

How does Spring Boot manage configuration (application.yml, Profile, switching between environments)?

Spring Boot uses application.yaml or application.properties to manage configuration.

It supports multi-environment configuration via Profiles and can map configuration onto an object via @ConfigurationProperties.

How do common Starters work? How do you write a custom Starter?

A Starter is based on @EnableAutoConfiguration and SpringFactoriesLoader. It dynamically loads configuration classes through conditional annotations to achieve automatic configuration.

We can define a starter in 4 steps:

write an auto-configuration class,
write a property binding class,
register it in spring.factories or AutoConfiguration.imports,
package the code as a dependency, which can then be imported into the project.

How does Spring Boot implement hot deployment?

How do you handle dynamic refresh with a config center (Spring Cloud Config / Nacos)?

In Spring Cloud Config, we can dynamically refresh configuration with the @RefreshScope annotation, triggered through Actuator or Spring Cloud Bus.

In Nacos, the configuration supports automatic refresh by default.

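What is IoC?

IoC (Inversion of Control) is a design principle that hands object creation and dependency management over to the Spring container. In traditional development, code usually creates objects with new; in Spring, the container creates and injects objects automatically based on configuration (XML or annotations), and developers only declare their dependencies.

Implementation mechanism: DI (dependency injection), via constructor injection, setter injection, or field injection.

Spring implements IoC by having a BeanFactory load the configuration, register BeanDefinitions, create objects through reflection, and inject their dependencies automatically. We use annotations to tell Spring which classes need to be created and injected.

The IoC container creates objects, manages dependencies, controls lifecycles, and supports mechanisms such as proxies, AOP, and transactions. The objects managed by the container are called Beans.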
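To make "create objects through reflection and inject dependencies" concrete, here is a toy container that does constructor injection for singleton beans. It is a sketch only: MiniContainer, AccountRepository, and TransferService are hypothetical names, and Spring's real BeanFactory handles far more (scopes, cycles, interfaces, lifecycle callbacks).

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

/** Toy IoC container: singleton beans created via reflection with constructor injection. */
final class MiniContainer {
    private final Map<Class<?>, Object> singletons = new HashMap<>();

    /** Returns the singleton for the type, creating it and its constructor dependencies on demand. */
    <T> T getBean(Class<T> type) {
        Object existing = singletons.get(type);
        if (existing != null) {
            return type.cast(existing);
        }
        try {
            Constructor<?> constructor = type.getDeclaredConstructors()[0];
            constructor.setAccessible(true);
            Class<?>[] parameterTypes = constructor.getParameterTypes();
            Object[] arguments = new Object[parameterTypes.length];
            for (int i = 0; i < parameterTypes.length; i++) {
                arguments[i] = getBean(parameterTypes[i]); // resolve each dependency recursively
            }
            T bean = type.cast(constructor.newInstance(arguments));
            singletons.put(type, bean);
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create bean " + type.getName(), e);
        }
    }
}

/** Hypothetical beans used only for illustration. */
final class AccountRepository {
    String find(String id) {
        return "account-" + id;
    }
}

final class TransferService {
    final AccountRepository repository;

    TransferService(AccountRepository repository) {
        this.repository = repository;
    }
}
```

Calling getBean(TransferService.class) builds the AccountRepository first and passes it to the TransferService constructor, so the calling code never uses new on either class.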

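What are the common annotations?

@Component: declares a Bean to be managed by the Spring container
@Service: a semantic annotation for the business logic layer; essentially also a @Component
@Repository: an annotation for the persistence (data access) layer; supports exception translation
@Controller: marks a control-layer Bean, that is, the web layer
@Autowired: automatically injects a dependency
@Qualifier: used with @Autowired to inject by name
@ComponentScan: specifies the package paths that Spring scans
@Configuration + @Bean: registers Beans manually. @Configuration tells Spring that the class is a configuration class that defines Beans; not every method in it needs to produce a Bean, and Spring registers the returned object as a Bean only for methods annotated with @Bean

Bean lifecycle in Spring

Load configuration (XML or annotations) -> instantiation -> dependency injection -> initialization -> in use -> destruction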

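How is AOP implemented?

AOP (aspect-oriented programming) inserts common logic such as logging, permission checks, and transaction control at specific points (before a method call, after it, or on an exception) without modifying the business code.

Under the hood, Spring AOP is implemented with dynamic proxies. Classes that implement an interface use JDK dynamic proxies; classes that do not implement an interface use CGLIB proxies, which create a subclass of the target class.

AOP creates the proxy objects through a BeanPostProcessor and weaves the aspect logic into the proxy's method execution chain.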
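The JDK dynamic proxy half of this can be shown with java.lang.reflect.Proxy alone. The sketch below is not Spring AOP itself; PaymentService, SimplePaymentService, and LoggingAspect are hypothetical names, and the "advice" just records before/after entries.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical business interface and implementation. */
interface PaymentService {
    String pay(String orderId);
}

final class SimplePaymentService implements PaymentService {
    @Override
    public String pay(String orderId) {
        return "paid:" + orderId;
    }
}

/** Wraps a target in a JDK dynamic proxy that runs logic before and after every call. */
final class LoggingAspect {
    final List<String> log = new ArrayList<>();

    <T> T wrap(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add("before " + method.getName());
            try {
                return method.invoke(target, args); // delegate to the real object
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the business exception, not the reflection wrapper
            } finally {
                log.add("after " + method.getName());
            }
        };
        Object proxy = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

The caller only sees a PaymentService; the business class is unchanged, and the cross-cutting logic lives entirely in the handler.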

Eureka

What is service registration and discovery? How does Eureka work?

A service instance registers its information with the Eureka Server at startup and keeps its registration alive through a heartbeat mechanism. Consumers fetch the list of service addresses from Eureka and call them with client-side load balancing.

Eureka chooses AP, which favors high availability. Its self-preservation mode also prevents services from being mistakenly evicted on a large scale.

How do Eureka Client registration and heartbeats with the Eureka Server work?

The client sends a heartbeat every 30 seconds to stay registered. The Eureka Server marks an instance as expired if it receives no heartbeat from it for 90 seconds.
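The heartbeat and expiry rule can be sketched with a map of last-seen timestamps. This is an illustration, not Eureka's implementation: LeaseRegistry is a hypothetical name, and time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Lease bookkeeping sketch: instances expire 90 s after their last heartbeat. */
final class LeaseRegistry {
    static final long LEASE_DURATION_MS = 90_000;

    private final Map<String, Long> lastHeartbeatMs = new ConcurrentHashMap<>();

    /** Registers the instance or renews its lease. */
    void heartbeat(String instanceId, long nowMs) {
        lastHeartbeatMs.put(instanceId, nowMs);
    }

    /** Removes instances whose lease has run out and returns how many were evicted. */
    int evictExpired(long nowMs) {
        int before = lastHeartbeatMs.size();
        lastHeartbeatMs.values().removeIf(last -> nowMs - last > LEASE_DURATION_MS);
        return before - lastHeartbeatMs.size();
    }

    boolean isRegistered(String instanceId) {
        return lastHeartbeatMs.containsKey(instanceId);
    }
}
```

An instance that keeps sending heartbeats every 30 seconds never crosses the 90-second window, while one that goes silent is removed on the next eviction pass.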

What is the CAP theorem? Why does Eureka choose AP?

CAP means consistency, availability and partition tolerance.

Eureka chooses AP, which ensures high availability and partition tolerance but sacrifices strong consistency. When Eureka fails, the service remains available instead of rejecting requests outright, which gives a better user experience.

ZooKeeper chooses CP, which means it keeps the service data consistent, but when errors occur in ZooKeeper, requests are rejected.

What are the differences between Eureka, ZooKeeper, and Nacos?

Eureka chooses AP
ZooKeeper chooses CP

How do Eureka's service deregistration and health check mechanisms work?

Eureka has two deregistration mechanisms: active and passive. Active deregistration means the service sends a cancel request to the Eureka Server. Passive deregistration means the Eureka Server evicts services that have not sent a heartbeat for more than 90 seconds.

Eureka's default health check mechanism is the heartbeat, but it can be combined with the Spring Boot Actuator health check. Once the health check is configured and enabled on a service, the service reports its Actuator health status to the Eureka Server.

How is Eureka's self-preservation mechanism implemented?

Eureka tracks the heartbeat ratio over the last 15 minutes. If it falls below the default threshold of 85%, Eureka enters self-preservation mode.

In this mode, Eureka does not evict any services, in order to preserve availability, and it does not return to normal mode until the network recovers.
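The threshold check itself is simple arithmetic. The sketch below only illustrates the 85% comparison; SelfPreservation is a hypothetical name, and how Eureka counts expected and received heartbeats internally is not modeled here.

```java
/** Sketch of the self-preservation decision: protect when fewer than 85% of expected heartbeats arrived. */
final class SelfPreservation {
    static final int THRESHOLD_PERCENT = 85;

    /** Both counts must cover the same time window; integer math avoids rounding surprises. */
    static boolean shouldProtect(long expectedHeartbeats, long receivedHeartbeats) {
        return expectedHeartbeats > 0
                && receivedHeartbeats * 100 < expectedHeartbeats * THRESHOLD_PERCENT;
    }
}
```

For example, with 100 expected heartbeats, receiving 84 triggers protection while receiving 85 does not.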

Message Queues

What are the use cases for MQ? Why use MQ?

MQ provides asynchronous decoupling. For example, when our payment system sends a cross-border transfer, it needs to generate a SWIFT message and send it to the clearing system, record the transfer flow, and inform the account manager. Once the core system has published to MQ, these operations can be processed asynchronously.

The clearing system is responsible for message delivery, the risk control system is responsible for verification, and the notification system is responsible for push notifications. This decouples the core system, smooths peak traffic and protects the clearing channel, ensures message delivery, and meets audit trail requirements.

MQ handles peak traffic. For example, when a large number of messages enter the system, they can first be delivered to an MQ queue and then sent to the clearing system for consumption at a steady rate.

MQ can copy messages to the logging system to ensure full-link traceability, which helps meet compliance, auditing, and risk control needs.

What are the differences between the RabbitMQ exchange types (Direct/Topic/Fanout/Headers)?

RabbitMQ provides four common exchange types:
Direct (exact matching on the routing key),
Topic (pattern matching on the routing key with the wildcards * and #),
Fanout (broadcast to every bound queue),
Headers (matching based on message header attributes).

Direct and Topic are most commonly used in general business scenarios.
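Topic matching works on dot-separated words: * stands for exactly one word and # for zero or more words. The matcher below is a sketch of that rule, not RabbitMQ's implementation, and TopicMatcher is a hypothetical name.

```java
/** Sketch of topic-exchange matching: '*' matches one word, '#' matches zero or more words. */
final class TopicMatcher {
    static boolean matches(String bindingPattern, String routingKey) {
        return match(bindingPattern.split("\\."), 0, routingKey.split("\\."), 0);
    }

    private static boolean match(String[] pattern, int p, String[] key, int k) {
        if (p == pattern.length) {
            return k == key.length; // pattern exhausted: the key must be exhausted too
        }
        if (pattern[p].equals("#")) {
            // let '#' absorb 0..n of the remaining words and try each split
            for (int next = k; next <= key.length; next++) {
                if (match(pattern, p + 1, key, next)) {
                    return true;
                }
            }
            return false;
        }
        if (k == key.length) {
            return false; // pattern still expects a word but the key has none left
        }
        if (pattern[p].equals("*") || pattern[p].equals(key[k])) {
            return match(pattern, p + 1, key, k + 1);
        }
        return false;
    }
}
```

With this rule, payment.* matches payment.swift but not payment.swift.created, while payment.# matches both, as well as payment on its own.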

How do you make sure messages are not lost?

Use a retry mechanism together with an acknowledgement mechanism.

How do you avoid consuming a message more than once?

MQ can’t rule out duplicate delivery, so idempotency must be guaranteed on the consumer side. Common practices include using unique business IDs (such as serial numbers), database unique constraints, or Redis deduplication to ensure that each message is processed only once.
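Consumer-side deduplication by business ID can be sketched as follows. The in-memory set stands in for a unique-index table or a Redis key set (an assumption), and IdempotentConsumer is a hypothetical name.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Idempotent consumer sketch: each business ID is handled at most once. */
final class IdempotentConsumer {
    // stands in for a unique-index table or Redis deduplication keys
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true when the message was handled now, false when it was a duplicate. */
    boolean onMessage(String businessId, String payload) {
        if (!processedIds.add(businessId)) {
            return false; // already seen: skip silently
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(businessId); // let a redelivery retry the failed message
            throw e;
        }
    }
}
```

Removing the ID on failure is a design choice in this sketch: it favors retrying a failed message over strictly recording every attempt.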

How do you guarantee message ordering?

Partitioning by a business key (such as the account ID) keeps messages for each key in order while different keys are still processed in parallel.
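Routing by key usually comes down to a stable hash of the key modulo the number of partitions or queues. A minimal sketch, with the hypothetical name KeyPartitioner:

```java
/** Sketch: the same business key always maps to the same partition. */
final class KeyPartitioner {
    static int partitionFor(String businessKey, int partitionCount) {
        // floorMod keeps the result in [0, partitionCount) even for negative hash codes
        return Math.floorMod(businessKey.hashCode(), partitionCount);
    }
}
```

Because every message for one account lands on the same partition, a single consumer sees them in publish order, while other accounts proceed on other partitions.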

What is a message backlog? How do you resolve it?

The producer sends messages faster than the consumer can process them.

So we can address the problem from both the consumer side and the producer side.

In the short term, we can add consumers, enable batch consumption, or throttle producers.

In the long term, we can optimize the consumption logic, increase parallelism by splitting queues and partitioning, and use dead-letter queues and monitoring to prevent future backlogs.

How does the Spring AMQP retry mechanism work?

What are the differences between Kafka and RabbitMQ?

Kafka is designed for high throughput and big data processing, using a distributed log architecture. It’s suitable for log collection and real-time analysis.

RabbitMQ is designed for reliable message delivery. Based on the AMQP protocol, it supports flexible routing and is suitable for scenarios requiring strong consistency, such as financial payments.

MyBatis

What is the MyBatis execution flow?

MyBatis creates a SqlSession via the SqlSessionFactory, generates a dynamic proxy for the mapper interface, maps each interface method to the SQL defined in XML or annotations, and performs the database operation.

What is the difference between `#{}` and `${}`?

#{} is a precompiled placeholder; it is safe and prevents SQL injection.

${} is plain string substitution, which carries a risk of SQL injection; it is used for dynamic parts of a statement such as table names and column names.
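The difference is easiest to see in the SQL text that each style produces. The sketch below uses plain strings rather than MyBatis or JDBC; SqlBinding, BoundSql, and the account table are hypothetical.

```java
import java.util.List;

/** Sketch contrasting string substitution (like ${}) with bound parameters (like #{}). */
final class SqlBinding {
    /** SQL text plus the values that travel separately from it. */
    static final class BoundSql {
        final String sql;
        final List<String> parameters;

        BoundSql(String sql, List<String> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    /** Like ${}: the value is spliced into the SQL text, so it can change the statement's structure. */
    static String substitute(String name) {
        return "SELECT * FROM account WHERE name = '" + name + "'";
    }

    /** Like #{}: the SQL text is fixed and the value is passed as a parameter. */
    static BoundSql bind(String name) {
        return new BoundSql("SELECT * FROM account WHERE name = ?", List.of(name));
    }
}
```

With the input x' OR '1'='1, substitute produces a statement whose WHERE clause now contains an extra OR condition, whereas bind leaves the SQL text untouched and carries the input purely as data.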

What is the difference between the MyBatis first-level and second-level caches?

How are Mapper interfaces bound to XML?

The Mapper interface is bound to XML through the namespace (the fully qualified interface name) and the method name (the statement id), and MyBatis maps each interface method call to the corresponding SQL statement through a dynamic proxy.

Usually, the interface class is passed to session.getMapper to generate a proxy object. The proxy intercepts the method call, finds the SQL in XML through the namespace and method name, runs it, and returns the mapped result object.

How does MyBatis implement dynamic SQL?

MyBatis provides tags such as if/choose/where/set/foreach to assemble SQL statements dynamically.

What are the MyBatis performance tuning techniques?

First, optimize the SQL: avoid N+1 queries, select only the necessary fields, batch updates and inserts, use paginated queries, and create indexes on hot fields.

Second, optimize the MyBatis configuration: make reasonable use of the first-level and second-level caches, use the batch executor, and use resultMap instead of automatic mapping.

Third, use a Redis cache for hot data and tune the connection pool, for example HikariCP.

Docker

What is the difference between a Docker image and a container?

An image is a static, read-only template. It consists of the application and its runtime environment.

A container is a running instance of an image. It can be read and written.

What is the difference between a container and a virtual machine?

The biggest difference between a VM and a container is the virtualization layer. A VM uses a hypervisor to virtualize hardware. Each VM runs a complete OS, so it has high resource overhead and slow startup but stronger isolation.

Containers use process-level isolation on top of the host OS kernel, so they don’t require an additional OS and offer fast startup and high resource utilization, but relatively weaker isolation.

What are the common Dockerfile instructions (FROM, RUN, CMD, ENTRYPOINT, COPY, WORKDIR…)?

How is a Docker image layered?

During the build, each instruction in the Dockerfile generates a new layer. These layers can be reused by multiple images and containers, which makes builds more efficient, storage more economical, and transfers faster.

For optimization, I merge RUN commands to reduce unnecessary layers.

How do you reduce the image size?

First, we usually don’t change the base image, because we have to use a domestic image such as kirin-jre.

Second, we COPY only the compiled jar file into the image, so it contains just the runtime environment.

We also merge the Docker RUN commands.

Finally, we use .dockerignore to exclude unnecessary files such as logs and test resources.

What Docker network modes are there?

Kubernetes (K8s)

What are the core components of K8s? (etcd, API Server, Controller, Scheduler, Kubelet…)

Kubernetes has two kinds of core components: Master and Node.

Master:
kube-apiserver (API entry point),
etcd (cluster state storage),
kube-scheduler (Pod scheduling),
kube-controller-manager (runs the various controllers)

Node:
kubelet (Pod management),
kube-proxy (network proxy),
container runtime (runs containers)

What are the differences between Pod, Deployment, Service, and Ingress?

A Pod is the smallest deployable unit; it encapsulates containers and their runtime environment.

A Deployment manages Pods and supports replica count control, rolling updates, and rollbacks.

A Service provides a stable access point. Pod IPs are dynamic, so the Service forwards traffic to the actual Pods and load-balances across them.

An Ingress is the entry layer above Services. It is responsible for routing external traffic.

Why does a Pod get OOMKilled?

Usually the container has run out of memory.

The resource limit may be configured too low, or the application may have a memory leak that makes memory usage surge in a short time.

In Kubernetes, if a container's memory usage exceeds its limit, the container is killed by the kernel OOM killer.

I usually check the Pod events, then check the container resource configuration, and analyze the monitoring data and application logs.

Finally, the solution is usually to optimize memory usage or tune the memory limits.

What is the difference between a Liveness Probe and a Readiness Probe?

Liveness and readiness probes are both used to check container health.

A liveness probe checks whether the container is alive; if not, the kubelet restarts the container.

A readiness probe checks whether the Pod is ready to serve traffic. If not, the Pod is temporarily removed from the Service's load balancing, but the container is not restarted.

How are K8s rolling updates and rollbacks implemented?

In Kubernetes, a Deployment uses the RollingUpdate strategy to update Pods. It creates new Pods and deletes old ones according to the maxUnavailable and maxSurge settings.

A Deployment creates a new ReplicaSet when updating and preserves the old ReplicaSet for rollback.
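When maxSurge and maxUnavailable are given as percentages, Kubernetes rounds maxSurge up and maxUnavailable down. The arithmetic can be sketched as below; RollingUpdateMath is a hypothetical helper, not part of any Kubernetes client.

```java
/** Sketch of the Pod-count bounds during a rolling update with percentage settings. */
final class RollingUpdateMath {
    /** maxSurge as a percentage of replicas, rounded up. */
    static int maxSurge(int replicas, int percent) {
        return (replicas * percent + 99) / 100;
    }

    /** maxUnavailable as a percentage of replicas, rounded down. */
    static int maxUnavailable(int replicas, int percent) {
        return replicas * percent / 100;
    }

    /** Upper bound on old plus new Pods that may exist at once. */
    static int maxTotalPods(int replicas, int surgePercent) {
        return replicas + maxSurge(replicas, surgePercent);
    }

    /** Lower bound on Pods that must stay available during the update. */
    static int minAvailablePods(int replicas, int unavailablePercent) {
        return replicas - maxUnavailable(replicas, unavailablePercent);
    }
}
```

For 10 replicas with both settings at 25%, the update may run up to 13 Pods in total and must keep at least 8 available.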

What is the difference between ConfigMap and Secret?

In Kubernetes, a ConfigMap stores non-sensitive configuration information such as database addresses and application parameters.

A Secret stores sensitive information such as keys, certificates, and tokens.

Secret values are base64-encoded and can be further protected by Kubernetes mechanisms such as encryption at rest and RBAC, whereas ConfigMap values are stored as plaintext. Note that base64 is an encoding, not encryption.
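A short sketch shows why base64 alone gives no confidentiality: anyone can reverse it without a key. SecretEncoding is a hypothetical helper name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch: base64 is reversible encoding, so it hides nothing from someone who can read the value. */
final class SecretEncoding {
    static String encode(String plainText) {
        return Base64.getEncoder().encodeToString(plainText.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

Encoding admin yields YWRtaW4=, and decoding that string returns admin again, which is why access to Secrets still needs to be restricted.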

How do you troubleshoot a Pod in CrashLoopBackOff?

How does the HPA (Horizontal Pod Autoscaler) work?

CI/CD

What does the CI/CD process look like?

CI - Continuous Integration
CD - Continuous Delivery/Deployment

First, developers commit their code to GitLab.

We configure a pipeline to clone the project code, install the dependencies, and build the application using Maven or npm. We can also run code scanning and unit tests.

In the CD phase, we build the image with Docker, which packages the application zip file into the image according to the Dockerfile. We can also put build logic in the Dockerfile; for example, we use a domestic base image (JRE or Nginx, depending on whether it is the back end or front end), copy the zip file in, and unzip it.
Then we push the image to our private repository. Finally, we deploy the project with the Kubernetes manifest, which contains the Pod, Deployment, Service, and route definitions, the Docker image address, and so on.

When the pipeline finishes, we can see our application on our cloud platform.

We also have different pipelines for different environments such as dev, SIT, and UAT. After go-live there is the production environment, which developers have no access to.

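What are the differences between common CI/CD tools (Jenkins, GitLab CI, ArgoCD)?

My last company had a DevOps team that developed and maintained its CI/CD tooling based on Jenkins.

Jenkins is more flexible for a large company, because the DevOps team can configure it according to the needs of each team.

How do you do automated testing in CI/CD?

How do you implement blue-green or canary releases in CI/CD?

How does CI/CD integrate with Docker and K8s?

How do you ensure a rollback mechanism for the production environment?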

Gateway

How does Spring Cloud Gateway work?

Spring Cloud Gateway is built on three core concepts: route, predicate, and filter.

First, an incoming request is matched against predicates to select a route. Then it passes through the filter chain, which handles rate limiting, authentication, and logging, before being forwarded to the downstream service.

How do you perform permission checks in the Gateway?

We can perform token authentication, such as JWT validation, in a global filter and then forward the request.

How does the Gateway implement rate limiting?

Use the RequestRateLimiter filter. It supports rate limiting by user, IP, or API.
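Rate limiters of this kind are commonly built on a token bucket with a replenish rate and a burst capacity. The sketch below is a generic in-memory token bucket, not the Gateway's own implementation; TokenBucket is a hypothetical name, and time is passed in as whole seconds to keep the behavior deterministic. In practice there would be one bucket per user, IP, or API key.

```java
/** In-memory token bucket sketch: refills at a fixed rate up to a burst capacity. */
final class TokenBucket {
    private final long replenishRatePerSecond;
    private final long burstCapacity;
    private long tokens;
    private long lastRefillSecond;

    TokenBucket(long replenishRatePerSecond, long burstCapacity, long nowSecond) {
        this.replenishRatePerSecond = replenishRatePerSecond;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity; // start full so an initial burst is allowed
        this.lastRefillSecond = nowSecond;
    }

    /** Takes one token if available; returns false when the request should be rejected. */
    synchronized boolean tryAcquire(long nowSecond) {
        long elapsed = Math.max(0, nowSecond - lastRefillSecond);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRatePerSecond);
        lastRefillSecond = Math.max(lastRefillSecond, nowSecond);
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }
}
```

With a rate of 1 per second and a capacity of 2, two requests in the same second pass, a third is rejected, and one more is allowed a second later.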

Security

What is the core principle of Spring Security?

The core is the filter chain.

How is JWT integrated with Spring Security?

A JWT is generated after a successful login.

Distributed Transactions

What is a distributed transaction? Why does it arise?

A distributed system involves multiple services, multiple databases, and even message queues, so we need to ensure data consistency across systems.

What is Seata, and how does it manage distributed transactions?

Seata is a framework for handling distributed transactions.

Seata manages transactions through three roles: Transaction Coordinator (TC), Transaction Manager (TM), and Resource Manager (RM).

What are the common solutions for distributed transactions?

2PC (two-phase commit): strong consistency but low performance.

TCC (Try-Confirm-Cancel): the Try phase reserves resources, the Confirm phase commits, and the Cancel phase rolls back.
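The three TCC phases can be sketched on a single in-memory account, where Try freezes an amount, Confirm consumes it, and Cancel releases it. TccAccount is a hypothetical name, and a real participant would persist these states and handle retries and duplicate calls.

```java
/** TCC participant sketch: Try reserves, Confirm consumes the reservation, Cancel releases it. */
final class TccAccount {
    private long available;
    private long frozen;

    TccAccount(long balance) {
        this.available = balance;
    }

    /** Try: move the amount from available to frozen; fails when funds are insufficient. */
    synchronized boolean tryReserve(long amount) {
        if (amount <= 0 || available < amount) {
            return false;
        }
        available -= amount;
        frozen += amount;
        return true;
    }

    /** Confirm: the reserved amount leaves the account for good. */
    synchronized void confirm(long amount) {
        requireFrozen(amount);
        frozen -= amount;
    }

    /** Cancel: the reserved amount returns to the available balance. */
    synchronized void cancel(long amount) {
        requireFrozen(amount);
        frozen -= amount;
        available += amount;
    }

    synchronized long available() {
        return available;
    }

    synchronized long frozen() {
        return frozen;
    }

    private void requireFrozen(long amount) {
        if (amount <= 0 || frozen < amount) {
            throw new IllegalStateException("No matching reservation for " + amount);
        }
    }
}
```

Because Try only freezes funds, a failure in another participant can be undone with Cancel without any other transaction having seen the money as spent.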

XA is a two-phase commit mode. In the first phase, the TC notifies each transaction participant to execute its local transaction. Each participant reports its execution status to the TC when it finishes; at this point the transaction is not yet committed and still holds the database locks. In the second phase, if all participants succeeded in the first phase, the TC notifies each of them to commit; otherwise, every participant rolls back.

AT is also a two-phase commit mode. In the first phase, the TM starts the global transaction and registers it with the TC. Each RM registers its branch transaction and records an undo log, then executes the SQL and reports the local transaction status to the TC. In the second phase, the TM notifies the TC that the transaction has finished, and the TC checks the status of the branch transactions. If they all succeeded, the undo logs are deleted. If any failed, the branches roll back by restoring the data from the undo log (snapshot).

Microservice Protection

How does Sentinel implement microservice protection?

Rate limiting avoids failures caused by traffic surges,
thread isolation prevents a problematic interface from affecting other healthy interfaces,
circuit breaking cuts off calls to a failing dependency and runs fallback (degradation) logic so that the current service is not affected.
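Circuit breaking is usually modeled as a small state machine with CLOSED, OPEN, and HALF_OPEN states. The sketch below is a generic illustration and not Sentinel's API: SimpleCircuitBreaker is a hypothetical name, it counts consecutive failures only, and time is passed in explicitly.

```java
/** Generic circuit breaker sketch: opens after N consecutive failures, probes again after a cool-down. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openDurationMs;
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAtMs;

    SimpleCircuitBreaker(int failureThreshold, long openDurationMs) {
        this.failureThreshold = failureThreshold;
        this.openDurationMs = openDurationMs;
    }

    /** Returns false while the breaker is open; the caller should then run its fallback logic. */
    synchronized boolean allowRequest(long nowMs) {
        if (state == State.OPEN && nowMs - openedAtMs >= openDurationMs) {
            state = State.HALF_OPEN; // cool-down elapsed: let a probe request through
        }
        return state != State.OPEN;
    }

    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    synchronized void recordFailure(long nowMs) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN; // a failed probe or too many failures trips the breaker
            openedAtMs = nowMs;
        }
    }

    synchronized State state() {
        return state;
    }
}
```

While the breaker is open, the caller skips the remote call entirely and returns a degraded response, which keeps threads from piling up on a dependency that is already failing.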