IceWalkers.com - Linux Software downloads and news

KernelAnalysis-HOWTO


7. Linux Memory Management

7.1 Overview

Linux uses segmentation + paging, which simplifies the notation.

Segments

Linux uses only 4 segments:

  • 2 segments (code and data/stack) for KERNEL SPACE from [0xC000 0000] (3 GB) to [0xFFFF FFFF] (4 GB)
  • 2 segments (code and data/stack) for USER SPACE from [0] (0 GB) to [0xBFFF FFFF] (3 GB)

                               __
   4 GB--->|                |    |
           |     Kernel     |    |  Kernel Space (Code + Data/Stack)
           |                |  __|
   3 GB--->|----------------|  __
           |                |    |
           |                |    |
   2 GB--->|                |    |
           |     Tasks      |    |  User Space (Code + Data/Stack)
           |                |    |
   1 GB--->|                |    |
           |                |    |
           |________________|  __| 
 0x00000000
          Kernel/User Linear addresses
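
The split above can be sketched in a few lines of C. This is only an illustration of the 3 GB / 1 GB layout described here, not kernel code: PAGE_OFFSET is the name the i386 kernel uses for the 0xC0000000 boundary, while the helper functions are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First linear address of Kernel Space with the 3 GB / 1 GB split. */
#define PAGE_OFFSET 0xC0000000u

/* Hypothetical helper: true when a 32-bit linear address lies in
 * Kernel Space, i.e. in [0xC0000000, 0xFFFFFFFF]. */
static inline bool is_kernel_address(uint32_t linear)
{
    return linear >= PAGE_OFFSET;
}

/* Size in bytes of User Space: everything below PAGE_OFFSET. */
static inline uint64_t user_space_size(void)
{
    return (uint64_t)PAGE_OFFSET;
}

/* Size in bytes of Kernel Space: from PAGE_OFFSET up to 4 GB. */
static inline uint64_t kernel_space_size(void)
{
    return (1ULL << 32) - PAGE_OFFSET;
}
```

With these definitions, 0xBFFFFFFF is the last User Space address and 0xC0000000 the first Kernel Space address.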
 

7.2 Specific i386 implementation

Again, Linux implements paging with 3 levels of page tables, but on the i386 architecture only 2 of them are actually used:

 
   ------------------------------------------------------------------
   L    I    N    E    A    R         A    D    D    R    E    S    S
   ------------------------------------------------------------------
        \___/                 \___/                     \_____/ 
 
     PD offset              PF offset                 Frame offset 
     [10 bits]              [10 bits]                 [12 bits]       
          |                     |                          |
          |                     |     -----------          |        
          |                     |     |  Value  |----------|---------
          |     |         |     |     |---------|   /|\    |        |
          |     |         |     |     |         |    |     |        |
          |     |         |     |     |         |    | Frame offset |
          |     |         |     |     |         |   \|/             |
          |     |         |     |     |---------|<------            |
          |     |         |     |     |         |      |            |
          |     |         |     |     |         |      | x 4096     |
          |     |         |  PF offset|_________|-------            |
          |     |         |       /|\ |         |                   |
      PD offset |_________|-----   |  |         |          _________|
            /|\ |         |    |   |  |         |          | 
             |  |         |    |  \|/ |         |         \|/
 _____       |  |         |    ------>|_________|   PHYSICAL ADDRESS 
|     |     \|/ |         |    x 4096 |         |
| CR3 |-------->|         |           |         |
|_____|         | ....... |           | ....... |
                |         |           |         |    
 
               Page Directory          Page File (page table)

                       Linux i386 Paging
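
The diagram can be read as the following C sketch, which splits a 32-bit linear address into the PD offset (10 bits), the PF offset (10 bits) and the Frame offset (12 bits), then performs the two lookups in software. It is a simplified model under stated assumptions: the function names are hypothetical, and each table entry is assumed to hold a bare frame number, whereas real i386 entries also carry flag bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12      /* 4096-byte frames: low 12 bits = Frame offset */
#define PGDIR_SHIFT  22      /* top 10 bits = PD offset */
#define ENTRIES      1024u   /* 2^10 entries per directory and per table */

/* Index into the Page Directory ("PD offset" in the diagram). */
static inline uint32_t pd_index(uint32_t linear)
{
    return linear >> PGDIR_SHIFT;
}

/* Index into the second-level table ("PF offset" in the diagram). */
static inline uint32_t pt_index(uint32_t linear)
{
    return (linear >> PAGE_SHIFT) & (ENTRIES - 1);
}

/* Byte offset inside the 4096-byte frame. */
static inline uint32_t frame_offset(uint32_t linear)
{
    return linear & ((1u << PAGE_SHIFT) - 1);
}

/* Simplified walk: pgd[i] points to a table of 1024 frame numbers.
 * physical = frame number x 4096 + Frame offset. */
uint32_t translate(uint32_t *pgd[], uint32_t linear)
{
    const uint32_t *table = pgd[pd_index(linear)];

    assert(table != NULL);   /* this model has no "page not present" path */
    return (table[pt_index(linear)] << PAGE_SHIFT) | frame_offset(linear);
}
```

For example, the linear address 0x08049ABC splits into PD offset 32, PF offset 73 and Frame offset 0xABC.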
 


7.3 Memory Mapping

Linux enforces access control through paging alone, so different Tasks share the same segment addresses but have different CR3 values (CR3 is the register that holds the Page Directory address), which point to different page entries.

In User mode a Task cannot go beyond the 3 GB limit (0xC0000000), so only the first 768 page directory entries are meaningful (768 * 4 MB = 3 GB).

When a Task enters Kernel Mode (through a system call or an IRQ), the other 256 page directory entries become relevant. They point to the same page tables (called "page files" in this document) for every Task, namely those of the Kernel.
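
The 768 / 256 split follows directly from the address layout, as this small C sketch shows. The constants mirror the values given in this chapter. The helper share_kernel_entries is a hypothetical illustration of how every Task can end up with identical kernel entries; it is not the kernel's own routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET   0xC0000000u   /* 3 GB: first Kernel linear address */
#define PGDIR_SHIFT   22            /* each Page Directory entry maps 4 MB */
#define PTRS_PER_PGD  1024u         /* entries in one Page Directory */

/* Entries that map User Space: 0xC0000000 >> 22 = 768. */
static inline uint32_t user_pgd_entries(void)
{
    return PAGE_OFFSET >> PGDIR_SHIFT;
}

/* Remaining entries map Kernel Space: 1024 - 768 = 256. */
static inline uint32_t kernel_pgd_entries(void)
{
    return PTRS_PER_PGD - user_pgd_entries();
}

/* Hypothetical helper: copy the Kernel entries of a reference Page
 * Directory into a new one, leaving the User entries untouched. */
void share_kernel_entries(uint32_t *new_pgd, const uint32_t *reference_pgd)
{
    assert(new_pgd != NULL && reference_pgd != NULL);
    for (uint32_t i = user_pgd_entries(); i < PTRS_PER_PGD; i++)
        new_pgd[i] = reference_pgd[i];
}
```

Because only entries 768 to 1023 are copied, two Tasks can have different User Space mappings while seeing the same Kernel Space.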

Note that Kernel (and only Kernel) Linear Space maps directly onto Kernel Physical Space, so:

 
            ________________ _____                    
           |Other KernelData|___  |  |                |
           |----------------|   | |__|                |
           |     Kernel     |\  |____|   Real Other   |
  3 GB --->|----------------| \      |   Kernel Data  |
           |                |\ \     |                |
           |              __|_\_\____|__   Real       |
           |      Tasks     |  \ \   |     Tasks      |
           |              __|___\_\__|__   Space      |
           |                |    \ \ |                |
           |                |     \ \|----------------|
           |                |      \ |Real KernelSpace|
           |________________|       \|________________|
      
           Logical Addresses          Physical Addresses
 

Linear Kernel Space corresponds to Physical Kernel Space translated 3 GB down: the physical address is the linear address minus 0xC0000000. In fact the kernel page tables look something like { "00000000", "00000001" }, so they perform no real virtualization; they only return the physical address derived from the linear one.
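
The fixed 3 GB translation can be written as two one-line conversions. The kernel provides its own macros for this (__pa() and __va() in the i386 page header). The functions below are a user-space sketch with hypothetical names, and they assume the simple layout of this chapter, where up to 1 GB of physical memory is mapped at PAGE_OFFSET.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET  0xC0000000u   /* start of Kernel linear space (3 GB) */
#define KERNEL_SPAN  0x40000000u   /* 4 GB - 3 GB = 1 GB of Kernel space */

/* Kernel linear address -> physical address (translate 3 GB down). */
uint32_t kernel_linear_to_phys(uint32_t linear)
{
    assert(linear >= PAGE_OFFSET);   /* only valid for Kernel addresses */
    return linear - PAGE_OFFSET;
}

/* Physical address -> Kernel linear address (translate 3 GB up). */
uint32_t phys_to_kernel_linear(uint32_t phys)
{
    assert(phys < KERNEL_SPAN);      /* must fit below the 4 GB limit */
    return phys + PAGE_OFFSET;
}
```

For instance, the Kernel linear address 0xC0100000 corresponds to the physical address 0x00100000 (1 MB).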

Note that there is no "address conflict" between Kernel and User spaces, because the Page Tables control which physical addresses each space maps to.

7.4 Low level memory allocation

Boot Initialization

We start from kmem_cache_init (launched by start_kernel [init/main.c] at boot up).

|kmem_cache_init
   |kmem_cache_estimate

kmem_cache_init [mm/slab.c]

kmem_cache_estimate

Now we continue with mem_init (also launched by start_kernel[init/main.c])

|mem_init
   |free_all_bootmem
      |free_all_bootmem_core

mem_init [arch/i386/mm/init.c]

free_all_bootmem [mm/bootmem.c]

free_all_bootmem_core

Run-time allocation

Under Linux, when we need to allocate memory, for example during the "copy_on_write" mechanism (see Chapter 10), we call:

|copy_mm 
   |allocate_mm = kmem_cache_alloc
      |__kmem_cache_alloc
         |kmem_cache_alloc_one
            |alloc_new_slab
               |kmem_cache_grow
                  |kmem_getpages
                     |__get_free_pages
                        |alloc_pages
                           |alloc_pages_pgdat
                              |__alloc_pages
                                 |rmqueue   
                                 |reclaim_pages

Functions can be found under:

  • copy_mm [kernel/fork.c]
  • allocate_mm [kernel/fork.c]
  • kmem_cache_alloc [mm/slab.c]
  • __kmem_cache_alloc
  • kmem_cache_alloc_one
  • alloc_new_slab
  • kmem_cache_grow
  • kmem_getpages
  • __get_free_pages [mm/page_alloc.c]
  • alloc_pages [mm/numa.c]
  • alloc_pages_pgdat
  • __alloc_pages [mm/page_alloc.c]
  • rmqueue
  • reclaim_pages [mm/vmscan.c]
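
At the bottom of this chain, __get_free_pages does not take a size in bytes but an "order": it returns 2^order physically contiguous pages. The sketch below shows how a byte count maps to an order, in the spirit of the kernel's get_order helper. The function name order_for_size and the 4096-byte page size are assumptions made for this example.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)   /* 4096 bytes on i386 */

/* Smallest order such that 2^order pages hold at least `size` bytes. */
unsigned int order_for_size(unsigned long size)
{
    unsigned long pages;
    unsigned int order = 0;

    assert(size > 0);
    /* Round the byte count up to a whole number of pages. */
    pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
    /* Round the page count up to the next power of two. */
    while ((1UL << order) < pages)
        order++;
    return order;
}
```

A request for 4097 bytes therefore needs order 1 (two pages), and a request for 8193 bytes needs order 2 (four pages), because the page count is always rounded up to a power of two.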

TODO: Understand Zones

7.5 Swap

Overview

Swap is managed by the kswapd daemon (kernel thread).

kswapd

Like other kernel threads, kswapd has a main loop that sleeps until it is woken up.

|kswapd
   |// initialization routines
   |for (;;) { // Main loop
      |do_try_to_free_pages
      |recalculate_vm_stats
      |refill_inactive_scan
      |run_task_queue
      |interruptible_sleep_on_timeout // sleep until a new swap request arrives
   |}

  • kswapd [mm/vmscan.c]
  • do_try_to_free_pages
  • recalculate_vm_stats [mm/swap.c]
  • refill_inactive_scan [mm/vmscan.c]
  • run_task_queue [kernel/softirq.c]
  • interruptible_sleep_on_timeout [kernel/sched.c]

When do we need swapping?

Swapping is needed when we have to access a page that is not in physical memory.

Linux uses the kswapd kernel thread for this purpose. When a Task receives a page fault exception, we do the following:

 
 | Page Fault Exception
 | caused by all of these conditions: 
 |   a-) User page 
 |   b-) Read or write access 
 |   c-) Page not present
 |
 |
 -----------> |do_page_fault
                 |handle_mm_fault
                    |pte_alloc 
                       |pte_alloc_one
                          |__get_free_page = __get_free_pages
                             |alloc_pages
                                |alloc_pages_pgdat
                                   |__alloc_pages
                                      |wakeup_kswapd // We wake up kernel thread kswapd
   
                   Page Fault ICA
 

  • do_page_fault [arch/i386/mm/fault.c]
  • handle_mm_fault [mm/memory.c]
  • pte_alloc
  • pte_alloc_one [include/asm/pgalloc.h]
  • __get_free_page [include/linux/mm.h]
  • __get_free_pages [mm/page_alloc.c]
  • alloc_pages [mm/numa.c]
  • alloc_pages_pgdat
  • __alloc_pages
  • wakeup_kswapd [mm/vmscan.c]
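
The three conditions listed in the Page Fault diagram correspond to the low bits of the error code that the i386 CPU pushes on a page fault (bit 0: protection violation rather than page not present, bit 1: write access, bit 2: user mode). The following sketch decodes them. The macro, struct and function names are hypothetical, chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* i386 page fault error code bits (names chosen for this sketch). */
#define PF_PROT   0x1u   /* 0: page not present, 1: protection violation */
#define PF_WRITE  0x2u   /* 0: read access,      1: write access */
#define PF_USER   0x4u   /* 0: kernel mode,      1: user mode */

struct fault_info {
    bool user;          /* a-) fault on a User access */
    bool write;         /* b-) write access (false means read) */
    bool not_present;   /* c-) page not present */
};

/* Decode the low three bits of the error code. */
struct fault_info decode_fault(uint32_t error_code)
{
    struct fault_info info;

    info.user        = (error_code & PF_USER) != 0;
    info.write       = (error_code & PF_WRITE) != 0;
    info.not_present = (error_code & PF_PROT) == 0;
    return info;
}

/* True for the case in the diagram: a User access, read or write,
 * to a page that is not present. */
bool is_user_not_present_fault(uint32_t error_code)
{
    struct fault_info info = decode_fault(error_code);

    return info.user && info.not_present;
}
```

An error code of 0x6, for example, describes a User mode write to a page that is not present, which is the path that leads to handle_mm_fault in the call tree above.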
