[ACA] Instruction-level Parallelism (1)

xeskin 2020. 12. 26. 09:00

Instruction-level Parallelism (1)

Concepts and Challenges

ILP Basics

Instruction-level parallelism (ILP)
- Overlapping the execution of instructions.
Goal: minimize CPI (maximize IPC)

Challenges

Challenge: not all instructions can be executed in parallel.
In a pipelined processor:
- CPI = ideal CPI + (structural stalls + data hazard stalls + control stalls)
Instructions that depend on each other cannot be executed simultaneously.
- Three types of dependences
  1. Data dependences
  2. Name dependences
  3. Control dependences
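
The CPI formula above can be sketched as a small calculation. The stall values below are hypothetical per-instruction averages chosen only for illustration, and `effective_cpi` is a helper name introduced here, not something from the lecture.

```python
def effective_cpi(ideal_cpi, structural_stalls, data_hazard_stalls, control_stalls):
    """Pipeline CPI = ideal CPI + average stall cycles per instruction."""
    return ideal_cpi + structural_stalls + data_hazard_stalls + control_stalls


# Hypothetical stall cycles per instruction (assumed numbers, not measurements).
cpi = effective_cpi(
    ideal_cpi=1.0,
    structural_stalls=0.05,
    data_hazard_stalls=0.20,
    control_stalls=0.15,
)
ipc = 1.0 / cpi  # IPC is the reciprocal of CPI

print(f"CPI = {cpi:.2f}, IPC = {ipc:.2f}")
```

Every stall term that is removed moves CPI closer to the ideal value, which is why the rest of these notes focus on where the stalls come from.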

Data Dependence

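Instruction j is data dependent on instruction i if instruction i produces a result that instruction j may use.
Data-dependent instructions must be executed in order.
Pipeline interlocks
- With interlocks, a data dependence causes a hazard and a stall
- Without interlocks, a data dependence prohibits the compiler from scheduling the instructions with overlap
A data dependence conveys:
- Possibility of a hazard
- The required order of instructions
- Upper bound on achievable parallelism
Data dependences can therefore limit the amount of ILP that can be exploited.
Overcoming data dependence
- Maintain the dependence but avoid the hazard, by scheduling the code with the compiler or a hardware scheduler.
- Eliminate the dependence by transforming the code.
Dependences through registers are easy to detect because register names are fixed in the instructions.
Dependences that flow through memory, however, are harder to detect, because two addresses that look different may refer to the same location. (See Name Dependence.)
- Do 100(R4) and 20(R6) refer to the same address?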
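
To make the 100(R4) versus 20(R6) question concrete, here is a minimal sketch of base-plus-offset address computation. The register contents are hypothetical values picked so that the two operands collide; `effective_address` and `same_location` are illustrative helper names.

```python
def effective_address(offset, base_value):
    """Effective address of an operand written as offset(base)."""
    return offset + base_value


def same_location(offset_a, base_a, offset_b, base_b):
    """True if two offset(base) operands refer to the same memory address."""
    return effective_address(offset_a, base_a) == effective_address(offset_b, base_b)


# Hypothetical register contents (assumed for illustration).
regs = {"R4": 1000, "R6": 1080}

# 100(R4) -> 1100 and 20(R6) -> 1100: different text, same location.
aliased = same_location(100, regs["R4"], 20, regs["R6"])
print("100(R4) and 20(R6) alias:", aliased)

# With a different value in R6 the same two operands no longer collide.
regs["R6"] = 2000
not_aliased = same_location(100, regs["R4"], 20, regs["R6"])
print("after changing R6:", not_aliased)
```

The answer depends on register values that are only known at run time, which is why the instruction text alone cannot settle a memory dependence.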

Name Dependence

A name dependence occurs when two instructions use the same register (or memory location) but no data flows between them.

There are two types of name dependence.

Antidependence: instruction j writes a register that instruction i reads.

Loop: L.D     F0,0(R1)
      ADD.D   F4,F0,F2
      S.D     F4,0(R1)     ;instruction i (reads R1)
      DADDUI  R1,R1,#-8    ;instruction j (writes R1)
      BNE     R1,R2,Loop

Output dependence: instructions i and j write the same register.

Loop: L.D     F0,0(R1)
      ADD.D   F4,F0,F2     ;instruction i (writes F4)
      ADD.D   F4,F1,F2     ;instruction j (writes F4)

Data Hazards

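A hazard exists when there is a name or data dependence between instructions that are close enough for overlapped execution to change the order of access to the operand.
- Overlapping dependent instructions can change the result of the program.
The goal of software and hardware techniques is to exploit parallelism by preserving program (instruction) order only where it affects the outcome of the program.
Detecting and avoiding hazards ensures that the necessary program order is preserved.
Possible data hazards (instruction i precedes instruction j in program order)
- RAW (read after write) <- true data dependence
  - instruction i: write to x
  - instruction j: read from x
- WAW (write after write) <- output dependence
  - instruction i: write to x
  - instruction j: write to x
- WAR (write after read) <- antidependence
  - instruction i: read from x
  - instruction j: write to x
RAR (read after read) is not a hazard.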
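
The three hazard types can be sketched as set intersections over the registers each instruction reads and writes. This is only an illustration: the `Instr` type and `hazards` function are names introduced here, the read/write sets are filled in by hand from the MIPS examples earlier in the post, and memory operands are deliberately ignored.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    writes: frozenset  # registers written
    reads: frozenset   # registers read


def hazards(i, j):
    """Classify register hazards, assuming i precedes j in program order."""
    found = []
    for reg in sorted(i.writes & j.reads):
        found.append(("RAW", reg))  # true data dependence
    for reg in sorted(i.writes & j.writes):
        found.append(("WAW", reg))  # output dependence
    for reg in sorted(i.reads & j.writes):
        found.append(("WAR", reg))  # antidependence
    return found  # registers only read by both (RAR) are never reported


ld = Instr("L.D F0,0(R1)", frozenset({"F0"}), frozenset({"R1"}))
add1 = Instr("ADD.D F4,F0,F2", frozenset({"F4"}), frozenset({"F0", "F2"}))
add2 = Instr("ADD.D F4,F1,F2", frozenset({"F4"}), frozenset({"F1", "F2"}))
sd = Instr("S.D F4,0(R1)", frozenset(), frozenset({"F4", "R1"}))
daddui = Instr("DADDUI R1,R1,#-8", frozenset({"R1"}), frozenset({"R1"}))

for first, second in [(ld, add1), (add1, add2), (sd, daddui)]:
    print(f"{first.text} -> {second.text}: {hazards(first, second)}")
```

The `add1`/`add2` pair both read F2, yet nothing is reported for it, which matches the note that RAR is not a hazard.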

Control Dependence

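Control dependence determines the ordering of an instruction with respect to a branch instruction.
- The instruction must be executed in the correct program order.
- The instruction must be executed only when it should be, that is, conditionally on the branch outcome.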

Example

if p1{ 
   S1;       // S1 is control dependent on p1 
}
if p2{ 
   S2;       // S2 is control dependent on p2 
}

Branches create barriers to potential ILP in the code.
Speculative execution can violate control dependences, but with additional hardware, correct execution can still be maintained.
```
  DADDU   R2,R3,R4
  BEQ     R2,R5,L1
  LW      R1,0(R2)
  ADD     R4,R5,R6
L1:   …….
```
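- There is no data dependence between BEQ (branch) and LW (load).
- However, there is a control dependence between the two instructions.
A control dependence can sometimes be ignored, but only after code analysis shows that doing so does not change the program's behavior.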
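
As a rough model of why the control dependence between BEQ and LW matters, the sketch below compares the code in program order with a version where the load is hoisted above the branch. Everything here is an assumption made for illustration: the `MEMORY` map, the function names, and the use of a Python `KeyError` to stand in for a memory protection exception.

```python
# Hypothetical memory map: only address 0x100 is mapped.
MEMORY = {0x100: 42}


def load(addr):
    """Model of LW; a KeyError stands in for a memory protection exception."""
    return MEMORY[addr]


def in_program_order(r2, r5):
    """BEQ R2,R5,L1 first; LW R1,0(R2) runs only on the fall-through path."""
    if r2 == r5:  # branch taken: jump to L1, the load is skipped
        return None
    return load(r2)


def load_hoisted(r2, r5):
    """Same code with the load moved above the branch."""
    r1 = load(r2)  # executes even when the branch would have skipped it
    if r2 == r5:
        return None
    return r1


print(in_program_order(0x100, 0))      # branch not taken, mapped address
print(in_program_order(0x200, 0x200))  # branch taken, load never issued
try:
    load_hoisted(0x200, 0x200)
except KeyError:
    print("hoisted load faulted on an address the original code never touched")
```

Both versions agree whenever the load address is valid, so the reordering is only safe once analysis (or extra hardware) rules out the faulting case.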

Compiler Techniques for Exposing ILP

The compiler can improve the performance of processors that exploit ILP.

Loop Unrolling

Loop unrolling replicates the loop body multiple times and runs the copies in a new loop with adjusted loop control.
- Benefits
  1. Fewer branch instructions
    - Less pressure on the branch predictor
  2. Increased basic block size
    - Potential for more parallelism
  3. Fewer instructions executed
    - For example: fewer increments of the loop counter
- Downsides
  1. Greater register pressure
  2. Increased use of the instruction cache
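
A minimal sketch of the transformation, written in Python rather than MIPS so it can be run directly. It mirrors the "add a scalar to every array element" loop used in the examples above; the unroll factor of 4, the function names, and the cleanup loop for leftover elements are choices made here for illustration. A real compiler applies this to machine code, where the savings in branches and counter updates actually show up.

```python
def add_scalar_rolled(x, s):
    """Original loop: one element, one counter update, one branch per iteration."""
    i = 0
    while i < len(x):
        x[i] += s
        i += 1
    return x


def add_scalar_unrolled4(x, s):
    """Same loop unrolled by 4: four copies of the body per counter update."""
    n = len(x)
    limit = n - n % 4  # largest multiple of 4 that fits
    i = 0
    while i < limit:
        x[i] += s
        x[i + 1] += s
        x[i + 2] += s
        x[i + 3] += s
        i += 4         # one increment and one branch for four elements
    while i < n:       # cleanup loop for the remaining 0 to 3 elements
        x[i] += s
        i += 1
    return x


data = [float(k) for k in range(10)]
rolled = add_scalar_rolled(list(data), 2.5)
unrolled = add_scalar_unrolled4(list(data), 2.5)
print(rolled == unrolled)
```

The four body copies touch different elements, so they are independent of each other and give the scheduler a larger basic block to work with, at the cost of more code and more live values.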
