Homework #2

Translate the following RISC-V code to C. Assume that the variables f, g, h, i, and j are assigned to registers x5, x6, x7, x28, and x29, respectively. Assume that the base address of the arrays A and B are in registers x10 and x11, respectively.

The element size is probably 64 bits.

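void Translate(DWORD *A, DWORD *f){  /* DWORD is assumed to be a 64-bit integer typedef; f is passed by pointer so that *f is defined */

*f = A[0] * 2;

}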

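Isn't that right? But I'm not sure whether the first line of the RISC-V code is picking up A[1].

In the third line, x31 (the value of A[0]) is stored at the location x30 points to, which is A[1], so A[1] = A[0] anyway.

So the result seems to be A[0] + A[0]. In that case, isn't loading A[1] back into x30 in the fourth line unnecessary too?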
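To check the reasoning above, here is a minimal C sketch that follows only the reading given in these notes (the RISC-V listing itself is not reproduced here, so the step labels are assumptions). The function names `translate_steps` and `translate_short` are hypothetical, and `int64_t` stands in for the assumed 64-bit element type.

```c
#include <stdint.h>

/* Step-by-step model of the sequence as these notes read it.
   The x30/x31 names mirror the registers mentioned in the notes. */
int64_t translate_steps(int64_t *A)
{
    int64_t *x30_addr = &A[1];  /* first line (as read here): x30 points at A[1] */
    int64_t x31 = A[0];         /* x31 holds the value of A[0], per the notes */
    *x30_addr = x31;            /* third line: A[1] = A[0] */
    int64_t x30 = *x30_addr;    /* fourth line: reload A[1], which now equals x31 */
    return x30 + x31;           /* result: A[1] + A[0] = 2 * A[0] */
}

/* Same effect without the reload: the stored value is already known. */
int64_t translate_short(int64_t *A)
{
    A[1] = A[0];
    return A[0] + A[0];
}
```

Under this reading both versions return the same value and leave A[1] equal to A[0], which is why the reload looks redundant.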

Translate the following C code to RISC-V assembly code. Use a minimum number of instructions. Assume that the values of a, b, i, and j are in registers x5, x6, x7, and x29, respectively. Also, assume that register x10 holds the base address of the array D.

RISC-V code below:

# x5 = a, x6 = b, x7 = i, x29 = j, x10 = base address of D
	li x7, 0          # i = 0
outer_loop:
	bge x7, x5, end   # if i >= a, go to end
	li x29, 0         # j = 0
inner_loop:
	bge x29, x6, outer_increment  # if j >= b, move on to the next i
	add x11, x7, x29  # x11 = i + j
	slli x12, x29, 4  # byte offset of D[4*j]: 4*j elements * 4 bytes = 16*j (assumes 32-bit elements, matching sw)
	add x12, x10, x12 # x12 = address of D[4*j]
	sw x11, 0(x12)    # D[4*j] = i + j

	addi x29, x29, 1  # j++
	jal x0, inner_loop
outer_increment:
	addi x7, x7, 1    # i++
	jal x0, outer_loop
end:

Byte addressing makes the increment and index calculations a bit confusing.
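The indexing confusion can be untangled by separating the element index from the byte offset. The sketch below is a C reference, with the loop nest inferred from the `D[4*j] = i + j` comment in the assembly (the original C listing is not in these notes, so the loop shape is an assumption). `byte_offset` and `fill` are hypothetical names, and 32-bit elements are assumed because the assembly stores with `sw`.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array whose elements are elem_size bytes. */
size_t byte_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}

/* Reference loop nest (assumed shape). D needs room for at least
   4 * (b - 1) + 1 elements when a > 0 and b > 0. */
void fill(int32_t *D, int32_t a, int32_t b)
{
    for (int32_t i = 0; i < a; i++) {
        for (int32_t j = 0; j < b; j++) {
            D[4 * j] = i + j;   /* element index 4*j, byte offset 16*j */
        }
    }
}
```

With 4-byte elements, element 4*j sits at byte offset 4*j * 4 = 16*j, which is j shifted left by 4. With 8-byte elements the offset would be 32*j, a shift by 5, and the store would be `sd` instead of `sw`.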

The following C code implements a four-tap FIR filter on input array sig_in. Assume that all arrays are 16-bit fixed-point values.

Assume you are to write an optimized implementation of this code in assembly language on a processor that has SIMD instructions and 128-bit registers. Without knowing the details of the instruction set, briefly describe how you would implement this code, maximizing the use of sub-word operations and minimizing the amount of data that is transferred between registers and memory. State all your assumptions about the instructions you use.

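This is FIR (Finite Impulse Response) filter code. With 128-bit registers and SIMD instructions, a single operation can process several data elements in parallel (eight 16-bit values per register), so the task seems to be about maximizing performance that way.
RISC-V's Vector Extension (RVV) could be used here.
I worked through it with GPT, but the solution used registers I hadn't seen before and instructions I'd never come across.
It's hard, so I'm skipping it for now.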
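As a starting point for coming back to this, here is a rough sketch in plain C of how the sub-word idea might look, without committing to any real instruction set. Everything here is an assumption: the FIR listing is not in these notes, so the names `fir_scalar` and `fir_blocked`, the tap order, and the lack of fixed-point scaling or saturation are all hypothetical. The inner 8-lane loops stand in for single SIMD instructions on a 128-bit register.

```c
#include <stdint.h>

#define LANES 8   /* 128 bits / 16 bits per element */
#define TAPS  4

/* Scalar reference: out[i] = sum over k of in[i - 3 + k] * f[k], for i >= 3. */
void fir_scalar(const int16_t *in, const int16_t *f, int16_t *out, int n)
{
    for (int i = TAPS - 1; i < n; i++) {
        int32_t acc = 0;
        for (int k = 0; k < TAPS; k++)
            acc += in[i - (TAPS - 1) + k] * f[k];
        out[i] = (int16_t)acc;   /* assumes the sum fits in 16 bits */
    }
}

/* Blocked version: each outer iteration produces 8 outputs at once. */
void fir_blocked(const int16_t *in, const int16_t *f, int16_t *out, int n)
{
    int i = TAPS - 1;
    for (; i + LANES <= n; i += LANES) {
        int16_t acc[LANES] = {0};   /* models one 128-bit accumulator register */
        for (int k = 0; k < TAPS; k++) {
            /* models one unaligned 8-element load of the shifted input window */
            const int16_t *src = &in[i - (TAPS - 1) + k];
            /* models one SIMD multiply-accumulate with f[k] broadcast to all lanes */
            for (int l = 0; l < LANES; l++)
                acc[l] = (int16_t)(acc[l] + src[l] * f[k]);
        }
        for (int l = 0; l < LANES; l++)   /* models one 128-bit store */
            out[i + l] = acc[l];
    }
    for (; i < n; i++) {   /* scalar tail for the leftover outputs */
        int32_t acc = 0;
        for (int k = 0; k < TAPS; k++)
            acc += in[i - (TAPS - 1) + k] * f[k];
        out[i] = (int16_t)acc;
    }
}
```

The four coefficients would be broadcast into registers once, outside the loop. The four shifted input windows overlap heavily, so an implementation aiming to minimize memory traffic might load two adjacent registers and build the shifted windows with a lane-shift or shuffle instruction instead of four separate loads, assuming the instruction set provides one.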