Operating Systems: Three Easy Pieces Ch. 19

April 20, 2023 3 min read

Paging: Faster Translations

TLB: Translation-lookaside buffer
- Part of MMU: Memory-management unit
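- Hardware cache of popular virtual-to-physical address translations; unlike an ordinary memory cache, it can be visible to the OS (with a software-managed TLB, the OS fills it on a miss)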
Hardware-managed TLB
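1. Extract the virtual page number (VPN) from the virtual address and check whether the TLB holds a translation for it
  1. TLB hit: the TLB holds the translation, so the page frame number (PFN) is read from the TLB and combined with the offset to form the physical address
  2. TLB miss: the TLB doesn't have the translation
    1. In this case, the hardware looks up the page table, updates the TLB with the translation, and retries the instruction
    2. This is costly because the page-table lookup requires extra memory accesses
2. Avoiding TLB misses as much as possible is therefore important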
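The lookup steps above can be sketched as a small simulation. This is only an illustration under assumptions that are not in the notes: 16-byte pages, a dictionary standing in for the page table, an unbounded TLB, and made-up VPN-to-PFN mappings.

```python
PAGE_SIZE = 16    # bytes per page (assumed, kept small for illustration)
OFFSET_BITS = 4   # log2(PAGE_SIZE)

# Hypothetical page table: VPN -> PFN (values are made up)
page_table = {6: 3, 7: 9, 8: 1}
tlb = {}          # VPN -> PFN; unbounded here, a real TLB has few entries


def translate(vaddr):
    """Return (physical address, 'hit' or 'miss') for a virtual address."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & (PAGE_SIZE - 1)
    if vpn in tlb:
        outcome = "hit"
    else:
        # Miss: consult the page table (the extra memory access),
        # then install the translation so the next access hits.
        tlb[vpn] = page_table[vpn]
        outcome = "miss"
    return (tlb[vpn] << OFFSET_BITS) | offset, outcome


for addr in (100, 104, 112):
    print(addr, translate(addr))
# 100 (52, 'miss')
# 104 (56, 'hit')
# 112 (144, 'miss')
```

The second access lands on the same page as the first, so it reuses the cached translation; the third touches a new page and misses.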
Example
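- In the case of an array
  - Because an array occupies contiguous memory, a sequential scan misses only on the first access to each page; the remaining accesses to that page are TLB hits
- Spatial locality
  - The elements of the array are packed tightly into pages, so neighboring elements share a translation
- Temporal locality
  - Memory items are re-referenced quickly in time, so their translations are still in the TLB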
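A rough way to see both kinds of locality is to count hits and misses for a sequential array scan. The parameters below are assumptions chosen for illustration (16-byte pages, ten 4-byte integers starting at virtual address 100), and the TLB is modeled as a set that never evicts.

```python
PAGE_SIZE = 16     # bytes per page (assumed)
INT_SIZE = 4       # bytes per array element (assumed)
ARRAY_BASE = 100   # virtual address of a[0] (assumed)
N = 10             # number of elements


def scan(n_passes=1):
    """Scan the array n_passes times; return the hit/miss outcome per access."""
    cached_vpns = set()   # VPNs whose translation is in the TLB (no eviction)
    outcomes = []
    for _ in range(n_passes):
        for i in range(N):
            vpn = (ARRAY_BASE + i * INT_SIZE) // PAGE_SIZE
            if vpn in cached_vpns:
                outcomes.append("hit")
            else:
                outcomes.append("miss")
                cached_vpns.add(vpn)
    return outcomes


first = scan(1)
print(first.count("hit") / len(first))   # 0.7  (spatial locality)
twice = scan(2)
print(twice[N:].count("miss"))           # 0    (temporal locality on the second pass)
```

The array spans three pages, so the first pass misses three times; a second pass shortly afterwards hits on every access because the translations are still cached.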
Handling TLB miss
- CISC (complex-instruction-set computers): in the old days, hardware designers did not trust the OS much
  - Therefore, the hardware has to know exactly where the page tables are located in memory, via a page-table base register
  - Hardware-managed TLB; it uses a fixed multi-level page table
- RISC (reduced-instruction-set computers)
  - When a TLB miss occurs, the hardware simply raises an exception
  - Software-managed TLB
  - On a TLB miss → exception → the privilege level is raised to kernel mode, and the hardware jumps to a trap handler
  - Important details
    - The return-from-trap instruction needs to be a little different
      - It resumes execution at the instruction that caused the TLB miss (not at the following one, as after a system call); the retried instruction now results in a TLB hit
    - The TLB miss handler needs to be extra careful not to cause an infinite chain of TLB misses
      - There are several solutions, e.g. keep the TLB miss handler in physical memory, where it is unmapped and not subject to address translation
      - Or reserve some entries in the TLB for permanently valid translations and use some of those permanent translation slots for the handler code itself → wired translations
  - Flexibility → the OS can use any data structure it wants to implement the page table
  - Simplicity → the hardware doesn't have to do much on a miss
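The software-managed flow can be mimicked with an exception standing in for the trap. This is a sketch, not how any real kernel is written: `TLBMiss`, `hardware_access`, `os_miss_handler`, and `run_instruction` are hypothetical names, the page size is assumed to be 16 bytes, and the OS page table is a plain dictionary (any structure would do, which is the flexibility point above).

```python
OFFSET_BITS = 4                  # 16-byte pages (assumed)

tlb = {}                         # VPN -> PFN
os_page_table = {6: 3, 7: 9}     # OS-chosen structure; made-up mappings


class TLBMiss(Exception):
    """Stands in for the hardware exception raised on a TLB miss."""

    def __init__(self, vpn):
        super().__init__(f"TLB miss on VPN {vpn}")
        self.vpn = vpn


def hardware_access(vaddr):
    """Hardware side: translate using only the TLB, raise on a miss."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn not in tlb:
        raise TLBMiss(vpn)
    return (tlb[vpn] << OFFSET_BITS) | offset


def os_miss_handler(vpn):
    """OS side: look up the translation and install it in the TLB."""
    tlb[vpn] = os_page_table[vpn]


def run_instruction(vaddr):
    """Run one memory access; return (physical address, number of traps)."""
    traps = 0
    while True:
        try:
            return hardware_access(vaddr), traps
        except TLBMiss as miss:
            traps += 1
            os_miss_handler(miss.vpn)
            # "Return from trap": loop back and retry the same access,
            # which now hits in the TLB.


print(run_instruction(100))   # (52, 1)
print(run_instruction(104))   # (56, 0)
```

The retry loop models the special return-from-trap behavior: the faulting access is re-executed rather than skipped.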
TLB Content
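- A typical TLB might have 32, 64, or 128 entries and be fully associative, meaning a translation can be in any entry and the hardware searches all entries in parallel
- Entry layout: VPN | PFN | other bits
- Other bits
  - valid bit - whether the entry holds a valid translation or not
  - protection bits - how a page can be accessed (e.g., read, write, execute)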
  - Address-space identifier
  - dirty bit
- Context switches
  - the TLB contains virtual-to-physical translations that are only valid for the currently running process; these translations are not meaningful for other processes.
  - The hardware or the OS must make sure the process about to run does not use translations left over from a previously run process
  - If two processes use the same VPN, the hardware can't distinguish which entry is meant for which process
  - Solution
    - TLB flush
      - Flush the TLB on each context switch, emptying it before running the next process
        
        With a software-managed TLB, this can be accomplished with an explicit (and privileged) hardware instruction; with a hardware-managed TLB, the flush could be enacted when the page-table base register (PTBR) is changed (note that the OS must change the PTBR on a context switch anyhow)
        
        There is a cost: each time a process runs, it must incur TLB misses as it touches its data and code pages
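    - ASID
      - Address-space identifier: similar to a process identifier (PID), but usually with fewer bits
      - With an ASID field in each entry, the TLB can hold translations for the same VPN from different processes without confusion
      - Two entries can also map different VPNs to the same PFN, which happens when two processes share a page (e.g., a code page)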
- Replacement policy
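  - Main issue: cache replacement; when installing a new entry, which old entry should be replaced?
  - LRU (least-recently-used)
    - Takes advantage of locality in the memory-reference stream: an entry that has not been used recently is a good candidate for eviction
  - Random policy
    - Evicts an entry at random; simple, and it avoids corner cases such as a loop over n + 1 pages with a TLB of size n, where LRU misses on every access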
- A real TLB entry
  - See page 193 of the book
- Summary: TLBs can significantly improve address-translation performance by caching frequently used translations on-chip, which avoids extra accesses to the page table in main memory. In the common case, performance is close to that of non-virtualized memory. However, a program that accesses more pages in a short time than the TLB can hold (exceeding the TLB coverage) incurs many TLB misses and runs more slowly. One solution is to use larger page sizes to increase TLB coverage. TLB access can also become a bottleneck in the CPU pipeline, particularly with physically-indexed caches, where translation must happen before the cache is accessed; virtually-indexed caches avoid this but introduce new hardware design issues.
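The context-switch and replacement-policy notes above can be combined into one small model: a TLB whose entries are tagged with an ASID and evicted in LRU order. This is a minimal sketch under assumptions; the `TLB` class and its method names are hypothetical, the capacity and mappings are made up, and real hardware does this with associative lookup rather than an ordered dictionary.

```python
from collections import OrderedDict


class TLB:
    """Toy ASID-tagged TLB with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # (asid, vpn) -> pfn, ordered from least to most recently used
        self.entries = OrderedDict()

    def lookup(self, asid, vpn):
        """Return the PFN on a hit, or None on a miss."""
        key = (asid, vpn)
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)   # mark as most recently used
        return self.entries[key]

    def insert(self, asid, vpn, pfn):
        """Install a translation, evicting the LRU entry if the TLB is full."""
        key = (asid, vpn)
        if key not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[key] = pfn
        self.entries.move_to_end(key)

    def flush(self):
        """Alternative to ASIDs: empty the TLB on a context switch."""
        self.entries.clear()


tlb = TLB(capacity=2)
tlb.insert(asid=1, vpn=10, pfn=100)   # process 1
tlb.insert(asid=2, vpn=10, pfn=170)   # process 2, same VPN
print(tlb.lookup(1, 10), tlb.lookup(2, 10))   # 100 170
```

Because the key includes the ASID, both processes keep their translation for VPN 10 across context switches; with `flush()` instead, each process would start with an empty TLB and pay the misses again.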
