2010年7月20日 星期二

Dalvik Memory Management (1)

Dalvik vm heap construction

Introduction:
The Dalvik vm has its own heap to allocate/de-allocate memory spaces for the newly created java classes. However, the posix doesn't provide interfaces to malloc/free memory spaces within an already set memory region. Therefore, the Dalvik vm uses the doug lea allocator (dlmalloc) to help it. The dlmalloc provide APIs to create a memory region and manage it. By using the dlmalloc, Dalvik vm has a heap to serve the dynamic memory allocation in Java.

1. At the construction step of Dalvik vm, callee: dvmHeapSourceStartup will be called
- file
-- dalvik/vm/alloc/HeapSource.c
- parameter
-- size_t startSize
= gDvm.heapSizeStart = 2 * 1024 * 1024 = 2MB
-- size_t absoluteMaxSize
= gDvm.heapSizeMax = 16 * 1024 * 1024 = 16MB
- return value
-- GcHeap *

2. Then callee: createMspace will be called by caller: dvmHeapSourceStartup
- file
-- dalvik/vm/alloc/HeapSource.c

- parameter
-- size_t startSize
= gDvm.heapSizeStart = 2 * 1024 * 1024 = 2MB
-- size_t absoluteMaxSize
= gDvm.heapSizeMax = 16 * 1024 * 1024 = 16MB

- return value
-- static mspace *

- This function will do
Make the name of the need-to-create heap, and then call the API to create a contiguous memory space as the heap.

3. Then callee: create_contiguous_mspace_with_name will be called by caller: createMspace
- file
-- system/core/libcutils/mspace.c
- parameter
-- size_t starting_capacity
= startSize / 2 = gDvm.heapSizeStart / 2 = 1MB
-- size_t max_capacity
= absoluteMaxSize = gDvm.heapSizeMax = 16MB
-- int locked
= false
-- char const * name
= name generated in createMspace
- return value
-- mspace

- This function will do:
(1). max_capacity = (size_t)ALIGN_UP(max_capacity, pagesize);
set the heap's max size align up to page boundary

#define PAGESIZE 4096 = 4KB
#define ALIGN_UP(p, alignment) (((uintptr_t)(p) + (alignment)-1) & ~((alignment)-1))


(2). fd = ashmem_create_region(buf, max_capacity);
Map the /dev/zero to fd
fd = open("/dev/zero", O_RDWR);

(3). base = mmap(NULL, max_capacity, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
Do mmap() to map the opened file /dev/zero to the memory with length = max_capacity. On Linux, the mapping will be created at a nearby page boundary.This step will create the needed heap for dynamic memory allocation usage.

(4). The following step will make some installation to the mapped memory in order to manage it as mspace.

local variable: struct mspace_contig_state *cs;

struct mspace_contig_state {
    unsigned int magic;
    char *brk;
    char *top;
    mspace m;
};


cs = base;
-This will make the cs point to the starting address of the created memory space, then the header of the memory space is cs.

m = create_mspace_with_base(base + sizeof(*cs), starting_capacity, locked);
-This will create the mspace from the mapped memory space.


cs->brk = m->seg.base + m->seg.size;
-This set the current usable size of the mspace, which is initialized to
m->footprint = m->max_footprint = starting_capacity = 1MB

cs->top = (char *)base + max_capacity;
This set the max usable size of the mspace.



if (cs->brk != cs->top) {
    /* mprotect() requires page-aligned arguments, but it's possible
     * for cs->brk not to be page-aligned at this point.
     */
    char *prot_brk = (char *)ALIGN_UP(cs->brk, pagesize);
    if (mprotect(prot_brk, cs->top - prot_brk, PROT_NONE) < 0)
      goto error;
}
-This step prevents access to the memory Dalvim VM haven't handed out yet.


(5). return (mspace)m;
This step return the mspace pointer to the caller: createMspace

4. Then callee: mspace_set_max_allowed_footprint will be called by caller: createMspace
- parameter
-- mspace msp
= newly created mspace
-- size_t bytes
= startSize = 1MB

- local variable ms = (mstate)msp

-This function will set the current max_allowed_footprint. ms->max_allowed_footprint will be set to msp->footprint. This function will be called later to increase or decrease the
ms->max_allowed_footprint according to the usage of the Dalvik vm heap. If the Dalvik VM needs to grow its heap size, the ms->max_allowed_footprint will be increased.

2010年7月19日 星期一

inside the memory allocation (2)

Per process memory space



1. Segment Description

Segment
Description
Code - text segment
Often referred to as the text segment, this is the area in which the executable or binary image instructions reside.  For example, Linux/Unix arranges things so that multiple running instances of the same program share their code if possible.  Only one copy of the instructions for the same program resides in memory at any time.  The portion of the executable file containing the text segment is the text section.
Initialized data – data segment
Statically allocated and global data that are initialized with nonzero values live in the data segment.  Each process running the same program has its own data segment.  The portion of the executable file containing the data segment is the data section.
Uninitialized data – bss segment
BSS stands for ‘Block Started by Symbol’.  Global and statically allocated data that initialized to zero by default are kept in what is called the BSS area of the process.  Each process running the same program has its own BSS area.  When running, the BSS, data are placed in the data segment.  In the executable file, they are stored in the BSS section.  For Linux/Unix the format of an executable, only variables that are initialized to a nonzero value occupy space in the executable’s disk file.
Heap
The heap is where dynamic memory (obtained by malloc()calloc()realloc() and new – C++) comes from.  Everything on a heap is anonymous, thus you can only access parts of it through a pointer. As memory is allocated on the heap, the process’s address space grows.  Although it is possible to give memory back to the system and shrink a process’s address space, this is almost never done because it will be allocated to other process again.   Freed memory (free()and delete – C++) goes back to the heap, creating what is called holes.   It is typical for the heap to grow upward.  This means that successive items that are added to the heap are added at addresses that are numerically greater than previous items.  It is also typical for the heap to start immediately after the BSS area of the data segment.  The end of the heap is marked by a pointer known as the break. You cannot reference past the break. You can, however, move the break pointer (via brk() and sbrk() system calls) to a new position to increase the amount of heap memory available.
Stack
The stack segment is where local (automatic) variables are allocated.  In C program, local variables are all variables declared inside the opening left curly brace of a function's body including the main() or other left curly brace that aren’t defined as static.  The data is popped up or pushed into the stack following the Last In First Out (LIFO) rule.  The stack holds local variables, temporary information, function parameters, return address and the like.  When a function is called, a stack frame (or a procedure activation record) is created and PUSHed onto the top of the stack. This stack frame contains information such as the address from which the function was called and where to jump back to when the function is finished (return address), parameters, local variables, and any other information needed by the invoked function. The order of the information may vary by system and compiler.  When a function returns, the stack frame is POPped from the stack.  Typically the stack grows downward, meaning that items deeper in the call chain are at numerically lower addresses and toward the heap.


2. The break point between stack and heap

The break point is at the end of the heap. Normally, a heap grows upward starting from the end of bss segment. Therefore, the sbrk system call is used to increase the size of a heap. This system call is widely used by many memory allocators

3. Reference:
-http://www.tenouk.com/ModuleZ.html
-Linux System Programming

2010年7月18日 星期日

inside the memory allocation (1)

1. malloc()algorithm


1. If our allocator has not been initialized, initialize it.

2. Add sizeof(struct mem_control_block) to the size requested.

3. Start at managed_memory_start.

4. Are we at last_valid address?

5. If we are:

   A. We didn't find any existing space that was large enough
      -- ask the operating system for more and return that.

6. Otherwise:

   A. Is the current space available (check is_available from
      the mem_control_block)?

   B. If it is:

      i)   Is it large enough (check "size" from the
           mem_control_block)?

      ii)  If so:

           a. Mark it as unavailable

           b. Move past mem_control_block and return the
              pointer

      iii) Otherwise:

           a. Move forward "size" bytes

           b. Go back go step 4

   C. Otherwise:

      i)   Move forward "size" bytes

      ii)  Go back to step 4



2. Defined global variables

malloc_init is going to be our function to initialize our memory allocator. It does three things: marks our allocator as being initialized, finds the last valid memory address on the system, and sets up the pointer to the beginning of our managed memory. These three variables are global variables:

int has_initialized = 0;
void *managed_memory_start;
void *last_valid_address;

3. The malloc_init() function

As mentioned above, the edge of mapped memory -- last valid address -- is often known as the system break or the current break. On many UNIX® systems, to find the current system break, you use the function sbrk(0)sbrk moves the current system break by the number of bytes in its argument, and then returns the new system break. Calling it with an argument of 0 simply returns the current break. Here is our malloc initialization code, which finds the current break and initializes our variables:

void malloc_init()
{
    /* grab the last valid address from the OS */
    last_valid_address = sbrk(0);
    /* 
    * we don't have any memory to manage yet, so
    * just set the beginning to be last_valid_address
    */
    managed_memory_start = last_valid_address;
    /* Okay, we're initialized and ready to go */
    has_initialized = 1;
}

4. The Memory Control Block

Now, in order to properly manage memory, we need to be able to track what we are allocating and deallocating. We need to do things like mark blocks as unused after free has been called on them, and be able to locate unused blocks when malloc is called. Therefore, the start of every piece of memory returned by malloc will have this structure at the beginning:

struct mem_control_block {
    int is_available;
    int size;
};



Now, you might think that this would cause problems for programs calling malloc -- how do they know about this struct? The answer is that they don't have to know about it; we will hide it by moving the pointer past this struct before we return it. This will make the returned pointer point to memory that is not used for any other purpose. That way, from the calling programs' perspective, all they get is free, open memory. Then, when they pass the pointer back via free(), we simply back up a few memory bytes to find this structure again.
We're going to talk about freeing before we talk about allocating memory, because it's simpler. The only thing we have to do to free memory is to take the pointer we're given, back up sizeof(struct mem_control_block) bytes, and mark it as available. Here is the code for that:


5. The void free(void *ptr) function

For freeing memoty, get the ptr of the allocated memory, and move back
"sizeof(struct mem_control_block)" chars, then mark it as available.

void free(void *firstbyte) 
{
    struct mem_control_block *mcb;
    /* 
     * Backup from the given pointer 
     * to find the mem_control_block
     */
    mcb = firstbyte - sizeof(struct mem_control_block);
    /* Mark the block as being available */
    mcb->is_available = 1;
    /* That's It!  We're done. */
    return;
}

6. The void *malloc(size_t size) function

void *malloc(long numbytes) {
    /* Holds where we are looking in memory */
    void *current_location;
    /* This is the same as current_location, but cast to a
     * memory_control_block
     */
    struct mem_control_block *current_location_mcb;
    /* This is the memory location we will return.  It will
     * be set to 0 until we find something suitable
     */
    void *memory_location;
    /* Initialize if we haven't already done so */
    if(! has_initialized) {
        malloc_init();
    }
    /* The memory we search for has to include the memory
     * control block, but the users of malloc don't need 
     * to know this, so we'll just add it in for them.
     */
    numbytes = numbytes + sizeof(struct mem_control_block);
    /* Set memory_location to 0 until we find a suitable
     * location
     */
    memory_location = 0;
    /* Begin searching at the start of managed memory */
    current_location = managed_memory_start;
    /* Keep going until we have searched all allocated space */
    while(current_location != last_valid_address)
    {
        /* current_location and current_location_mcb point
         * to the same address.  However, current_location_mcb
         * is of the correct type, so we can use it as a struct.
         * current_location is a void pointer so we can use it
         * to calculate addresses.
         */
        current_location_mcb =
        (struct mem_control_block *)current_location;
        if(current_location_mcb->is_available)
        {
            if(current_location_mcb->size >= numbytes)
            {
            /* Woohoo!  We've found an open,
             * appropriately-size location.
             */
                /* It is no longer available */
                current_location_mcb->is_available = 0;
                /* We own it */
                memory_location = current_location;
                /* Leave the loop */
                break;
            }
        }
        /* If we made it here, it's because the Current memory
         * block not suitable; move to the next one
         */
        current_location = current_location +
        current_location_mcb->size;
    }
    /* If we still don't have a valid location, we'll
     * have to ask the operating system for more memory
     */
    if(! memory_location)
    {
        /* Move the program break numbytes further */
        sbrk(numbytes);
        /* The new memory will be where the last valid
         * address left off
         */
        memory_location = last_valid_address;
        /* We'll move the last valid address forward
         * numbytes
         */
        last_valid_address = last_valid_address + numbytes;
        /* We need to initialize the mem_control_block */
        current_location_mcb = memory_location;
        current_location_mcb->is_available = 0;
        current_location_mcb->size = numbytes;
    }
    /* Now, no matter what (well, except for error conditions),
     * memory_location has the address of the memory, including
     * the mem_control_block
     */
    /* Move the pointer past the mem_control_block */
    memory_location = memory_location + sizeof(struct mem_control_block);
    /* Return the pointer */
    return memory_location;
}

7. reference:
-http://www.ibm.com/developerworks/linux/library/l-memory/

2010年7月1日 星期四

ACM 103

alright

It's a greedy algorithm problem

Use an adjacency list to save the graph,
then find every vertex's successor which has most successors.