Individual research status (through 11/21)

[Problem Definition]

The high access latency of the slow tier limits the performance of latency-sensitive applications in tiered memory systems.

[Background]

[Hypothesis 1]

As the fast-tier share of total memory shrinks, a growing fraction of memory accesses is served from slow-tier memory, which degrades the end-to-end performance of microservice-based applications.

[Insight]

The critical path, i.e., the longest chain of dependent tasks in the microservice dependency graph, determines the end-to-end latency of a microservice application [Zhang et al., ATC 2022].
When pages allocated to microservice instances on the critical path are demoted to slow-tier memory, tail latency increases exponentially.
Cloud services built on the widely adopted microservice architecture exhibit sporadic and bursty request patterns [Stojkovic et al., ISCA 2025], which limits the accuracy of identifying page hotness from access frequency.

The background for this hypothesis, in more detail, is as follows.

A microservice architecture decomposes a single service into many software modules, and the dependencies among the microservices can be represented as a graph. This graph contains a critical path that directly affects end-to-end latency [Zhang et al., ATC 2022], and placing a microservice that lies on the critical path in slow-tier memory increases tail latency.
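
As an illustration of the critical-path notion above, the following is a minimal sketch that finds the longest latency chain in a dependency graph. The service names, latency values, and the function `critical_path` are hypothetical, and the model assumes each service waits only for its slowest callee (parallel fan-out); it is not the analysis method of the cited work.

```python
from graphlib import TopologicalSorter


def critical_path(latency_ms, calls):
    """Return (total_latency, path) for the longest latency chain in a DAG.

    latency_ms: service -> its own processing latency (hypothetical values)
    calls:      caller -> list of callees; every callee must appear in latency_ms
    """
    # TopologicalSorter takes node -> predecessors. Passing caller -> callees
    # therefore yields every callee before any of its callers.
    graph = {svc: calls.get(svc, []) for svc in latency_ms}
    best = {}  # service -> (latency of longest chain starting here, next hop)
    for svc in TopologicalSorter(graph).static_order():
        nxt = max(calls.get(svc, []), key=lambda c: best[c][0], default=None)
        downstream = best[nxt][0] if nxt is not None else 0.0
        best[svc] = (latency_ms[svc] + downstream, nxt)

    # The critical path starts at the service with the longest chain.
    node = max(best, key=lambda s: best[s][0])
    total, path = best[node][0], []
    while node is not None:
        path.append(node)
        node = best[node][1]
    return total, path


# Hypothetical five-service application.
latency_ms = {"frontend": 2.0, "search": 5.0, "geo": 3.0, "rate": 4.0, "profile": 6.0}
calls = {"frontend": ["search", "profile"], "search": ["geo", "rate"]}
total, path = critical_path(latency_ms, calls)
print(total, path)
```

In this toy graph the services on `path` are the ones whose pages would be the most costly to demote, under the stated assumptions.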

However, each microservice can be invoked by different upper-level services, and user requests are also irregular, so its workload pattern is irregular [Luo et al., ASPLOS 2023], [Stojkovic et al., ISCA 2025]. We therefore judged that page hotness decisions based on access frequency would have difficulty separating promotion candidates from demotion candidates effectively.
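
The concern about frequency-based hotness can be sketched with a toy example. Everything here is assumed for illustration: the per-window access counts, the threshold, and the simple "count per sampling window" policy, which stands in for whatever hotness tracking a real tiering system uses.

```python
def hot_windows(accesses_per_window, threshold):
    """Per sampling window, report whether the page would be classified hot."""
    return [count >= threshold for count in accesses_per_window]


# Hypothetical access counts over 10 sampling windows (not measured data).
steady_page = [5] * 10        # accessed evenly, 50 accesses in total
bursty_page = [50] + [0] * 9  # same 50 accesses, all in a single burst

THRESHOLD = 4  # hypothetical hotness threshold, in accesses per window

steady_hot = sum(hot_windows(steady_page, THRESHOLD))
bursty_hot = sum(hot_windows(bursty_page, THRESHOLD))
print(steady_hot, bursty_hot)
```

Both pages receive the same total number of accesses, yet the bursty page looks cold in most windows and would be a demotion candidate right before its next burst arrives.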

As a result, the smaller the fast-tier share of memory capacity becomes, the harder it may be to keep critical-path microservices resident in fast-tier memory. We therefore expect the tail latency increase caused by slow-tier memory accesses to act as the performance bottleneck of microservice-based applications, and set the hypothesis accordingly.

[Experiment Goal] Show that slow-tier accesses, which occur when the working set size exceeds the fast-tier memory size, degrade microservice application performance, and demonstrate that each microservice contributes differently to that degradation.

[Experiment case 1]

With the total memory size fixed, increase the slow-tier memory ratio from 0% to 100% at regular intervals, and at each ratio measure the maximum requests per second (RPS) at which the microservice application still meets its service-level objectives (SLOs).
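
The sweep described above could be driven by a harness along the following lines. This is a sketch only: it assumes the SLO is expressed as a p99 latency bound and that p99 latency does not decrease as offered load grows, so a binary search over RPS is valid. `run_trial` is a hypothetical hook for a real load-generator run, and `placeholder_trial` is an arbitrary formula used solely to make the sketch executable, not a measurement.

```python
def max_rps_under_slo(measure_p99_ms, slo_ms, lo=0, hi=10_000):
    """Largest integer RPS in [lo, hi] whose p99 latency meets the SLO.

    Assumes p99 is non-decreasing in RPS and that `lo` itself meets the SLO.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if measure_p99_ms(mid) <= slo_ms:
            lo = mid
        else:
            hi = mid - 1
    return lo


def sweep(run_trial, slo_ms, step_pct=25):
    """Map each slow-tier ratio (percent) to the maximum SLO-compliant RPS."""
    return {
        pct: max_rps_under_slo(lambda rps: run_trial(pct, rps), slo_ms)
        for pct in range(0, 101, step_pct)
    }


def placeholder_trial(slow_pct, rps):
    # Stand-in for deploying the application at this slow-tier ratio and
    # driving it at `rps`. The formula is arbitrary and is NOT measured data.
    return 1.0 + rps * (100 + slow_pct) / 10_000


results = sweep(placeholder_trial, slo_ms=50.0)
for pct, rps in results.items():
    print(f"slow-tier {pct:3d}% -> max RPS {rps}")
```

In an actual run, `placeholder_trial` would be replaced by a function that configures the tier ratio, replays the workload at the requested rate, and returns the observed p99 latency.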