sfrench/cifs-2.6.git
13 years ago x86: cpa: set_memory_notpresent()
Ingo Molnar [Wed, 30 Jan 2008 12:34:07 +0000 (13:34 +0100)]
x86: cpa: set_memory_notpresent()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa: convert ioremap to new API
Thomas Gleixner [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: cpa: convert ioremap to new API

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix ioremap API
Thomas Gleixner [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: fix ioremap API

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix ioremap RAM check
Thomas Gleixner [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: fix ioremap RAM check

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix the missing BIOS area check in page_is_ram
Thomas Gleixner [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: fix the missing BIOS area check in page_is_ram

page_is_ram() has long carried a FIXME as a reminder to sanity check the
BIOS area between 640k and 1M, which is sometimes falsely reported as
RAM in the e820 tables.

Implement the sanity check. Move the BIOS range defines from
pageattr.c into e820.h to avoid duplicate defines.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
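
A minimal userspace sketch of the BIOS-area check described in the commit above, not the kernel code. The struct, the function names and the BIOS_BEGIN/BIOS_END macro names are assumptions made for illustration; only the 640k..1M range comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
/* Legacy BIOS area: 640 KiB up to (not including) 1 MiB. Names are assumed. */
#define BIOS_BEGIN 0x000a0000ULL
#define BIOS_END   0x00100000ULL

/* Hypothetical, simplified firmware memory-map entry: [start, end). */
struct ram_range {
    uint64_t start;
    uint64_t end;
};

static bool pfn_in_bios_area(uint64_t pfn)
{
    return pfn >= (BIOS_BEGIN >> PAGE_SHIFT) &&
           pfn <  (BIOS_END >> PAGE_SHIFT);
}

/* Returns true only if the page lies fully inside a reported RAM range
 * and is not inside the BIOS area, even when the map claims it is RAM. */
static bool page_is_ram_sketch(const struct ram_range *map, int n, uint64_t pfn)
{
    uint64_t addr = pfn << PAGE_SHIFT;

    if (pfn_in_bios_area(pfn))
        return false;

    for (int i = 0; i < n; i++) {
        if (addr >= map[i].start &&
            addr + (1ULL << PAGE_SHIFT) <= map[i].end)
            return true;
    }
    return false;
}
```

The BIOS test runs before the map lookup, so a firmware table that wrongly reports the whole first megabyte as RAM cannot override it.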
13 years ago x86: move page_is_ram() function
Thomas Gleixner [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: move page_is_ram() function

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: deprecate change_page_attr() for drivers
Arjan van de Ven [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: deprecate change_page_attr() for drivers

With the introduction of the new API, no driver or non-archcore code needs
to use c-p-a anymore, so this patch also deprecates the EXPORT_SYMBOL of CPA
(it's a horrible API after all).

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: convert CPA users to the new set_page_ API
Arjan van de Ven [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: convert CPA users to the new set_page_ API

This patch converts various users of change_page_attr() to the new,
more intent driven set_page_*/set_memory_* API set.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: a new API for drivers/etc to control cache and other page attributes
Arjan van de Ven [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: a new API for drivers/etc to control cache and other page attributes

Right now, if drivers or other code want to change, say, a cache attribute of a
page, the only API they have is change_page_attr(). c-p-a is a really bad API
for this, because it forces the caller to know *ALL* the attributes he wants
for the page, not just the one thing he wants to change. So code that wants to
set a page uncacheable needs to be aware of the NX status as well, and so on.

This patch introduces a set of new APIs for this, set_pages_<attr> and
set_memory_<attr>, that offer a logical change to the user, and leave all
attributes not implied by the requested logical change alone.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
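
A minimal sketch of the idea behind the set_memory_<attr> style of API described in the commit above: each helper states one logical change as a clear/set mask pair and leaves every other bit alone. The helper names and the choice of cache bits are assumptions for illustration; only the bit positions are those of the x86 page-table entry.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t prot_t;

/* x86 PTE bit positions. */
#define PTE_RW  (1ULL << 1)
#define PTE_PWT (1ULL << 3)
#define PTE_PCD (1ULL << 4)
#define PTE_NX  (1ULL << 63)

/* Core primitive: clear some bits, set others, keep the rest untouched. */
static prot_t change_attr_clear_set(prot_t prot, prot_t clear, prot_t set)
{
    return (prot & ~clear) | set;
}

/* Hypothetical intent-driven wrappers; each touches only its own bits. */
static prot_t sketch_set_uc(prot_t prot)
{
    return change_attr_clear_set(prot, 0, PTE_PCD | PTE_PWT);
}

static prot_t sketch_set_wb(prot_t prot)
{
    return change_attr_clear_set(prot, PTE_PCD | PTE_PWT, 0);
}

static prot_t sketch_set_ro(prot_t prot)
{
    return change_attr_clear_set(prot, PTE_RW, 0);
}

static prot_t sketch_set_nx(prot_t prot)
{
    return change_attr_clear_set(prot, 0, PTE_NX);
}
```

A caller that only wants an uncached mapping never needs to know whether the page is currently NX or writable, which is the complaint the commit raises about change_page_attr().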
13 years ago x86: cpa: move clflush_cache_range()
Ingo Molnar [Wed, 30 Jan 2008 12:34:06 +0000 (13:34 +0100)]
x86: cpa: move clflush_cache_range()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: unify ioremap_32 and _64
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: unify ioremap_32 and _64

Unify the now identical ioremap_32.c and ioremap_64.c into the
same ioremap.c file. No code changed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: unify ioremap
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: unify ioremap

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: use remove_vm_area in ioremap_32 error path
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: use remove_vm_area in ioremap_32 error path

When ioremap_page_range() fails, we can safely use remove_vm_area()
instead of vunmap().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: __iomem annotations
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: __iomem annotations

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: switch to change_page_attr_addr in ioremap_32.c
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: switch to change_page_attr_addr in ioremap_32.c

Use change_page_attr_addr() instead of change_page_attr(), which
simplifies the code significantly and matches the 64bit
implementation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: make c_p_a unconditional in ioremap
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: make c_p_a unconditional in ioremap

Make c_p_a unconditional for ioremap and iounmap. This ensures
complete consistency of the flags which are handed to
ioremap_page_range and the real flags in the mappings.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: introduce max_pfn_mapped
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: introduce max_pfn_mapped

64bit uses end_pfn_map and 32bit uses max_low_pfn. There are several
files which have #ifdef'ed defines which map either to end_pfn_map or
max_low_pfn. Replace this by a universal define and clean up all the
other instances.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: cleanup ioremap includes
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: cleanup ioremap includes

Get rid of the duplicate define of ISA_START/END_ADDRESS and use the
same headers in 32 and 64 bit code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: style cleanup of ioremap code
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: style cleanup of ioremap code

Fix the coding style before going further.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: fix ioremap pgprot inconsistency
Thomas Gleixner [Wed, 30 Jan 2008 12:34:05 +0000 (13:34 +0100)]
x86: fix ioremap pgprot inconsistency

The pgprot flags which are handed into ioremap_page_range() are
different to those which are set in change_page_attr(). The
ioremap_page_range flags are executable, while the c_p_a flags are
not.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: fix ioremap pgprot inconsistency
Thomas Gleixner [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: fix ioremap pgprot inconsistency

The pgprot flags which are handed into ioremap_page_range() are
different to those which are set in change_page_attr(). The
ioremap_page_range flags are executable, while the c_p_a flags are
not. Also make the mappings global (which is a NOP currently on 32bit,
although CPUs from PPRO+ onwards support it, but that's a separate
fix.)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
13 years ago x86: turn the check_exec function into function that
Arjan van de Ven [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: turn the check_exec function into function that

What the check_exec() function really is trying to do is enforce certain
bits in the pgprot that are required by the x86 architecture, but that
callers might not be aware of (such as NX bit exclusion of the BIOS
area for BIOS based PCI access; it's not uncommon to ioremap the BIOS
region for various purposes and normally ioremap() memory has the NX bit
set).

This patch turns the check_exec() function into static_protections()
which also is now used to make sure the kernel text area remains non-NX
and that the .rodata section remains read-only. If the architecture
ends up requiring more such mandatory prot settings for specific areas,
this is now a reasonable place to add these.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
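
A minimal sketch of what a static_protections()-style filter does according to the commit above: it strips protection bits that a given address range must never carry, whatever the caller asked for. The region addresses and all names below are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_RW (1ULL << 1)
#define PTE_NX (1ULL << 63)

struct region {
    uint64_t start;  /* inclusive */
    uint64_t end;    /* exclusive */
};

/* Hypothetical layout; real addresses depend on the kernel build. */
static const struct region bios_area   = { 0x000a0000ULL, 0x00100000ULL };
static const struct region kernel_text = { 0x01000000ULL, 0x01400000ULL };
static const struct region rodata      = { 0x01400000ULL, 0x01600000ULL };

static int in_region(const struct region *r, uint64_t addr)
{
    return addr >= r->start && addr < r->end;
}

/* Remove bits that are forbidden for the range containing addr. */
static uint64_t static_protections_sketch(uint64_t prot, uint64_t addr)
{
    uint64_t forbidden = 0;

    if (in_region(&bios_area, addr))
        forbidden |= PTE_NX;   /* BIOS area must stay executable */
    if (in_region(&kernel_text, addr))
        forbidden |= PTE_NX;   /* kernel text must stay executable */
    if (in_region(&rodata, addr))
        forbidden |= PTE_RW;   /* .rodata must stay read-only */

    return prot & ~forbidden;
}
```

Keeping every mandatory rule in one function gives a single place to add further per-range requirements, which is the design point the commit message makes.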
13 years ago x86: cpa: make self-test depend on DEBUG_KERNEL
Ingo Molnar [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: cpa: make self-test depend on DEBUG_KERNEL

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: ioremap_nocache fix
Huang, Ying [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: ioremap_nocache fix

This patch fixes a bug in ioremap_nocache(). ioremap_nocache() calls
__ioremap() with flags != 0 to do the real work, which in turn calls
change_page_attr_addr() if phys_addr + size - 1 < (end_pfn_map << PAGE_SHIFT).
But some pages between 0 and end_pfn_map << PAGE_SHIFT are not covered by
the identity map, which makes change_page_attr_addr() fail.

This patch is based on latest x86 git and has been tested on x86_64 platform.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add PAGE_KERNEL_EXEC_NOCACHE
Ingo Molnar [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: add PAGE_KERNEL_EXEC_NOCACHE

add PAGE_KERNEL_EXEC_NOCACHE.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix NX bit handling in change_page_attr()
Huang, Ying [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: fix NX bit handling in change_page_attr()

This patch fixes a bug in change_page_attr()/change_page_attr_addr() on
Intel i386/x86_64 CPUs.  After changing a page's attributes to be
executable with these functions, the page remains non-executable.
On these CPUs (with PAE enabled) a page is executable only if the
"NX" bits at all three page-table levels are cleared (refer to
section 4.13.2 of the Intel 64 and IA-32 Architectures Software
Developer's Manual).  The bug is fixed by clearing the "NX" bit of
the PMD when splitting the huge PMD.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
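
A minimal sketch of the rule the commit above relies on: the effective permission is executable only if no entry along the page-table walk has NX set. The function name and the array representation of a walk are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_NX (1ULL << 63)

/* entries[] holds the entry used at each level of one walk,
 * top level first. One NX bit anywhere makes the page non-executable. */
static bool effectively_executable(const uint64_t *entries, int levels)
{
    for (int i = 0; i < levels; i++) {
        if (entries[i] & PTE_NX)
            return false;
    }
    return true;
}
```

This is why clearing NX only in the newly created 4K PTEs is not enough after a split: the PMD entry that now points to the PTE page must have NX cleared too.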
13 years ago x86: change cpa to pfn based
Ingo Molnar [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: change cpa to pfn based

change CPA to pfn based.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: keep the BIOS area executable
Ingo Molnar [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: keep the BIOS area executable

keep the BIOS area executable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add PG_LEVEL enum
Thomas Gleixner [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: add PG_LEVEL enum

this way PG_LEVEL_1GB will be an easy change.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up lookup_address() declarations
Thomas Gleixner [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: clean up lookup_address() declarations

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up arch/x86/mm/pageattr.c
Ingo Molnar [Wed, 30 Jan 2008 12:34:04 +0000 (13:34 +0100)]
x86: clean up arch/x86/mm/pageattr.c

do some leftover cleanups in the now unified arch/x86/mm/pageattr.c
file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: re-add clflush_cache_range()
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: re-add clflush_cache_range()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: unify pageattr_32.c and pageattr_64.c
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: unify pageattr_32.c and pageattr_64.c

unify the now perfectly identical pageattr_32/64.c files - no code changed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: prepare for pageattr.c unification
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: prepare for pageattr.c unification

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: backmerge 64-bit details into 32-bit pageattr.c
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: backmerge 64-bit details into 32-bit pageattr.c

backmerge 64-bit details into 32-bit pageattr.c.

the pageattr_32.c and pageattr_64.c files are now identical.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: enable DEBUG_PAGEALLOC on 64-bit
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: enable DEBUG_PAGEALLOC on 64-bit

enable CONFIG_DEBUG_PAGEALLOC=y on 64-bit kernels too.

preliminary testing shows that it's working fine.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add kernel_map_pages() to 64-bit
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: add kernel_map_pages() to 64-bit

needed for DEBUG_PAGEALLOC support and for unification.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: return -EINVAL in __change_page_attr(), instead of 0
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: return -EINVAL in __change_page_attr(), instead of 0

careful: might change driver behavior - but this is the right
return value.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up differences between 64-bit and 32-bit
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: clean up differences between 64-bit and 32-bit

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: 64-bit, add the new split_large_page() function
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: 64-bit, add the new split_large_page() function

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: 64-bit pageattr.c, prepare for unification
Ingo Molnar [Wed, 30 Jan 2008 12:34:03 +0000 (13:34 +0100)]
x86: 64-bit pageattr.c, prepare for unification

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: change 64-bit pageattr to use set_pte_atomic()
Ingo Molnar [Wed, 30 Jan 2008 12:34:02 +0000 (13:34 +0100)]
x86: change 64-bit pageattr to use set_pte_atomic()

NOP change - same as set_pte().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: change 64-bit __change_page_attr() to struct page
Ingo Molnar [Wed, 30 Jan 2008 12:34:02 +0000 (13:34 +0100)]
x86: change 64-bit __change_page_attr() to struct page

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: simplify __change_page_attr()
Ingo Molnar [Wed, 30 Jan 2008 12:34:01 +0000 (13:34 +0100)]
x86: simplify __change_page_attr()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: introduce native_set_pte_atomic() on 64-bit too
Ingo Molnar [Wed, 30 Jan 2008 12:34:01 +0000 (13:34 +0100)]
x86: introduce native_set_pte_atomic() on 64-bit too

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up and simplify 64-bit split_large_page()
Ingo Molnar [Wed, 30 Jan 2008 12:34:00 +0000 (13:34 +0100)]
x86: clean up and simplify 64-bit split_large_page()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: unify header part of pageattr_64.c
Ingo Molnar [Wed, 30 Jan 2008 12:34:00 +0000 (13:34 +0100)]
x86: unify header part of pageattr_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: simplify pageattr_64.c
Ingo Molnar [Wed, 30 Jan 2008 12:33:59 +0000 (13:33 +0100)]
x86: simplify pageattr_64.c

simplify pageattr_64.c.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: prepare for the unification of the cpa code
Ingo Molnar [Wed, 30 Jan 2008 12:33:59 +0000 (13:33 +0100)]
x86: prepare for the unification of the cpa code

prepare for the unification of the cpa code, by unifying the
lookup_address() logic between 32-bit and 64-bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: prepare for the unification of the cpa code
Ingo Molnar [Wed, 30 Jan 2008 12:33:59 +0000 (13:33 +0100)]
x86: prepare for the unification of the cpa code

prepare for the unification of the cpa code, by unifying the
lookup_address() logic between 32-bit and 64-bit.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa self-test, WARN_ON()
Ingo Molnar [Wed, 30 Jan 2008 12:33:58 +0000 (13:33 +0100)]
x86: cpa self-test, WARN_ON()

add a WARN_ON() to the cpa-self-test failure branch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y
Ingo Molnar [Wed, 30 Jan 2008 12:33:58 +0000 (13:33 +0100)]
x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y

get more testing of the c_p_a() code done by not turning off
PSE on DEBUG_PAGEALLOC.

this simplifies the early pagetable setup code, and tests
the largepage-splitup code quite heavily.

In the end, all the largepages will be split up pretty quickly,
so there's no difference to how DEBUG_PAGEALLOC worked before.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa: simplify locking
Ingo Molnar [Wed, 30 Jan 2008 12:33:57 +0000 (13:33 +0100)]
x86: cpa: simplify locking

further simplify cpa locking: since the largepage-split is a
slowpath, use the pgd_lock for the whole operation, instead
of the mmap_sem.

This also makes it suitable for DEBUG_PAGEALLOC purposes again.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: simplify cpa largepage split, #3
Ingo Molnar [Wed, 30 Jan 2008 12:33:57 +0000 (13:33 +0100)]
x86: simplify cpa largepage split, #3

simplify cpa largepage split: push the reference protection bits
into the largepage-splitting function.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa self-test fixes
Ingo Molnar [Wed, 30 Jan 2008 12:33:56 +0000 (13:33 +0100)]
x86: cpa self-test fixes

cpa self-test fixes. change_page_attr_addr() was buggy, it
passed in a virtual address as a physical one.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: further cpa largepage-split cleanups
Ingo Molnar [Wed, 30 Jan 2008 12:33:56 +0000 (13:33 +0100)]
x86: further cpa largepage-split cleanups

further cpa largepage-split cleanups: make the splitup isolated
functionality, without leaking details back into __change_page_attr().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: simplify 32-bit cpa largepage splitting
Ingo Molnar [Wed, 30 Jan 2008 12:33:55 +0000 (13:33 +0100)]
x86: simplify 32-bit cpa largepage splitting

simplify 32-bit cpa largepage splitting: do a pure split and repeat
the pte lookup to get the new pte modified.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: simplify the 32-bit cpa code
Ingo Molnar [Wed, 30 Jan 2008 12:33:55 +0000 (13:33 +0100)]
x86: simplify the 32-bit cpa code

simplify the 32-bit cpa code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix some bugs about EFI runtime code mapping
Huang, Ying [Wed, 30 Jan 2008 12:33:55 +0000 (13:33 +0100)]
x86: fix some bugs about EFI runtime code mapping

This patch fixes some bugs in making EFI runtime code executable.

- Use change_page_attr in i386 too, because the runtime code may not
  be mapped through ioremap.

- If there is no _PAGE_NX in __supported_pte_mask, the change_page_attr
  is not called.

- Make efi_ioremap map pages as PAGE_KERNEL_EXEC_NOCACHE, because EFI runtime
  code may be mapped through efi_ioremap.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix more non-global TLB flushes
Ingo Molnar [Wed, 30 Jan 2008 12:33:54 +0000 (13:33 +0100)]
x86: fix more non-global TLB flushes

fix more __flush_tlb() instances, out of caution.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix early_ioremap() on 64-bit
Andi Kleen [Wed, 30 Jan 2008 12:33:54 +0000 (13:33 +0100)]
x86: fix early_ioremap() on 64-bit

Fix early_ioremap() on x86-64

For a few days I had ACPI failures on several machines. The symptom
was NUMA nodes not getting detected or, worse, cores not getting detected.
They all came from ACPI not being able to read several of its tables. I finally
bisected it down to Jeremy's "put _PAGE_GLOBAL into PAGE_KERNEL" change.
With that the fix was fairly obvious. The problem was that early_ioremap()
didn't use a "_all" flush that would affect the global PTEs too. So
with global bits now being used everywhere, an early_ioremap() would
not actually flush a mapping if something else was previously mapped
in that slot (which can happen with an early_iounmap() in between).

This patch changes all flushes in init_64.c to be __flush_tlb_all()
and fixes the problem here.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: remove set_kernel_exec()
Andi Kleen [Wed, 30 Jan 2008 12:33:53 +0000 (13:33 +0100)]
x86: remove set_kernel_exec()

The SMP trampoline always runs in real mode, so making it executable
in the page tables doesn't make much sense because it executes
before page tables are set up. That was the only user of
set_kernel_exec(). Remove set_kernel_exec().

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: introduce canon_pgprot()
Andi Kleen [Wed, 30 Jan 2008 12:33:53 +0000 (13:33 +0100)]
x86: introduce canon_pgprot()

Introduce canon_pgprot()

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: c_p_a() make it more robust against use of PAT bits
Andi Kleen [Wed, 30 Jan 2008 12:33:52 +0000 (13:33 +0100)]
x86: c_p_a() make it more robust against use of PAT bits

Use the page table level instead of the PSE bit to check if the PTE
is for a 4K page or not. This makes the code more robust when the PAT
bit is changed because the PAT bit on 4K pages is in the same position
as the PSE bit.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
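
A minimal sketch of why the commit above checks the page-table level instead of the PSE bit. Bit 7 means "large page" in a PMD/PDE entry but is the PAT bit in a 4K PTE, so a bit test alone misreads a 4K page with PAT set. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PSE    (1ULL << 7)  /* PMD/PDE entry: maps a large page */
#define PTE_PAT_4K (1ULL << 7)  /* 4K PTE: PAT index bit, same position */

/* Hypothetical level tags returned by a page-table lookup. */
enum level_sketch { LEVEL_4K = 1, LEVEL_LARGE = 2 };

/* Fragile: cannot tell PSE and PAT apart. */
static bool is_large_by_bit(uint64_t entry)
{
    return (entry & PTE_PSE) != 0;
}

/* Robust: the level at which the walk stopped decides the page size. */
static bool is_large_by_level(enum level_sketch level)
{
    return level != LEVEL_4K;
}
```

With a 4K PTE that has the PAT bit set, the bit test reports a large page while the level test does not.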
13 years ago x86: c_p_a() fix: reorder TLB / cache flushes to follow Intel recommendation
Andi Kleen [Wed, 30 Jan 2008 12:33:52 +0000 (13:33 +0100)]
x86: c_p_a() fix: reorder TLB / cache flushes to follow Intel recommendation

Intel recommends first flushing the TLBs and then the caches
on caching attribute changes. c_p_a() previously did it the
other way round. Reorder that.

The procedure is still not fully compliant with the Intel documentation
because Intel recommends an all-CPU synchronization step between
the TLB flushes and the cache flushes.

However, on all new Intel CPUs this is now meaningless anyway
because they support Self-Snoop and can skip the cache flush
step.

[ mingo@elte.hu: decoupled from clflush and ported it to x86.git ]

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix c_p_a() boot crash
Andi Kleen [Wed, 30 Jan 2008 12:33:52 +0000 (13:33 +0100)]
x86: fix c_p_a() boot crash

fix:

> hm, i just found a failing 64-bit .config while testing your CPA
> patchset:
>
>  [    1.916541] CPA mapping 4k 0 large 2048 gb 0 x 0[0-0] miss 0
>  [    1.919874] Unable to handle kernel paging request at 000000000335aea8 RIP:
>  [    1.919874]  [<ffffffff8021d2d3>] change_page_attr+0x3/0x61
>  [    1.919874] PGD 0
>  [    1.919874] Oops: 0000 [1]
>  [    1.919874] CPU 0

This handles addresses which don't have a mem_map entry.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: don't drop NX bit in pte modifier functions on 32-bit
Andi Kleen [Wed, 30 Jan 2008 12:33:51 +0000 (13:33 +0100)]
x86: don't drop NX bit in pte modifier functions on 32-bit

The pte_* modifier functions that cleared bits dropped the NX bit on 32bit
PAE because they only worked in int, but NX is in bit 63. Fix that
by adding appropriate casts so that the arithmetic happens as long long
on PAE kernels.

I decided to just use 64bit arithmetic instead of open coding like
pte_modify() because gcc should generate good enough code for that now.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
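
A minimal sketch of the general hazard behind the commit above, not the original kernel code: when a clear-mask is computed in a 32-bit unsigned type and then applied to a 64-bit PTE, the mask is zero-extended and wipes the upper half, including NX in bit 63. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT 0x001
#define PTE_RW      0x002
#define PTE_NX      (1ULL << 63)

/* Buggy: the mask is 0xFFFFFFFD as a 32-bit value; widening it to
 * 64 bits fills the upper 32 bits with zeros, so they get cleared. */
static uint64_t clear_rw_buggy(uint64_t pte)
{
    uint32_t mask = ~(uint32_t)PTE_RW;
    return pte & mask;
}

/* Fixed: widen first, then invert, so the upper 32 bits of the mask
 * are all ones and NX survives. */
static uint64_t clear_rw_fixed(uint64_t pte)
{
    return pte & ~(uint64_t)PTE_RW;
}
```

The cast placement is the whole fix: the inversion has to happen in the 64-bit type.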
13 years ago x86: add pte_pgprot to 32-bit
Andi Kleen [Wed, 30 Jan 2008 12:33:51 +0000 (13:33 +0100)]
x86: add pte_pgprot to 32-bit

64bit already had it.

Needed for later patches.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: shrink __PAGE_KERNEL/__PAGE_KERNEL_EXEC on non PAE kernels
Andi Kleen [Wed, 30 Jan 2008 12:33:50 +0000 (13:33 +0100)]
x86: shrink __PAGE_KERNEL/__PAGE_KERNEL_EXEC on non PAE kernels

No need to make it 64bit there.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa: remove unnecessary masking of address
Andi Kleen [Wed, 30 Jan 2008 12:33:50 +0000 (13:33 +0100)]
x86: cpa: remove unnecessary masking of address

virt_to_page() does not care about the bits below the page granularity.
So don't mask them.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cpa: use wbinvd() macro instead of inline assembly in 64bit c_p_a()
Andi Kleen [Wed, 30 Jan 2008 12:33:50 +0000 (13:33 +0100)]
x86: cpa: use wbinvd() macro instead of inline assembly in 64bit c_p_a()

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: early_ioremap_init(), enhance warnings
Ingo Molnar [Wed, 30 Jan 2008 12:33:49 +0000 (13:33 +0100)]
x86: early_ioremap_init(), enhance warnings

enhance the debug warning in early_ioremap_init().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix EISA ioremap
Ingo Molnar [Wed, 30 Jan 2008 12:33:49 +0000 (13:33 +0100)]
x86: fix EISA ioremap

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix early_ioremap()/btmap
Ingo Molnar [Wed, 30 Jan 2008 12:33:48 +0000 (13:33 +0100)]
x86: fix early_ioremap()/btmap

fix a long-standing weakness of the early-ioremap allocator: it
uses a single pgd entry for the boot mappings, and was not properly
protecting itself against crossing a 2MB (4MB) boundary.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
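
A minimal sketch of the kind of boundary test the commit above refers to: a range backed by a single page-table page must not straddle a 2MB boundary (4MB without PAE). How the kernel actually enforces this is not shown in the log; the function name and the shift parameter are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SPAN_SHIFT_2MB 21  /* one page-table page covers 2MB with PAE */
#define SPAN_SHIFT_4MB 22  /* and 4MB without PAE */

/* True if the first and last byte addresses fall into the same
 * naturally aligned 2^shift region. */
static bool within_one_pt_page(uint64_t first, uint64_t last, unsigned shift)
{
    return (first >> shift) == (last >> shift);
}
```

A boot-time assertion built on such a test would catch a fixmap layout that places the early mapping slots across a boundary.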
13 years ago x86: add early_ioremap() leak detection
Ingo Molnar [Wed, 30 Jan 2008 12:33:47 +0000 (13:33 +0100)]
x86: add early_ioremap() leak detection

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: make early_ioremap_debug early_param
Huang, Ying [Wed, 30 Jan 2008 12:33:45 +0000 (13:33 +0100)]
x86: make early_ioremap_debug early_param

This patch makes "early_ioremap_debug" an early parameter, because
"early_ioremap/early_iounmap" is only used during the early boot stage.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: early_ioremap(), debugging
Ingo Molnar [Wed, 30 Jan 2008 12:33:45 +0000 (13:33 +0100)]
x86: early_ioremap(), debugging

add early_ioremap() debug printouts via the early_ioremap_debug
boot option.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add debug warnings to early_ioremap()
Ingo Molnar [Wed, 30 Jan 2008 12:33:45 +0000 (13:33 +0100)]
x86: add debug warnings to early_ioremap()

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: increase the number of boot-mappings
Ingo Molnar [Wed, 30 Jan 2008 12:33:45 +0000 (13:33 +0100)]
x86: increase the number of boot-mappings

increase max early_ioremap() remapping size from 64K to 256K.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: enhance early_ioremap()
Ingo Molnar [Wed, 30 Jan 2008 12:33:45 +0000 (13:33 +0100)]
x86: enhance early_ioremap()

 - allow nesting of up to 4 levels

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: early_ioremap_reset fix
Huang, Ying [Wed, 30 Jan 2008 12:33:44 +0000 (13:33 +0100)]
x86: early_ioremap_reset fix

This patch fixes a bug in early_ioremap_reset().

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86 32-bit boot: rename bt_ioremap() to early_ioremap()
Huang, Ying [Wed, 30 Jan 2008 12:33:44 +0000 (13:33 +0100)]
x86 32-bit boot: rename bt_ioremap() to early_ioremap()

This patch renames bt_ioremap to early_ioremap, which is used in
x86_64. This makes it easier to merge i386 and x86_64 usage.

[ mingo@elte.hu: fix ]

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: replace boot_ioremap() with enhanced bt_ioremap() - remove boot_ioremap()
Huang, Ying [Wed, 30 Jan 2008 12:33:44 +0000 (13:33 +0100)]
x86: replace boot_ioremap() with enhanced bt_ioremap() - remove boot_ioremap()

This patch replaces boot_ioremap invocation with bt_ioremap and
removes the boot_ioremap implementation.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago i386 boot: replace boot_ioremap with enhanced bt_ioremap - enhance bt_ioremap
Huang, Ying [Wed, 30 Jan 2008 12:33:44 +0000 (13:33 +0100)]
i386 boot: replace boot_ioremap with enhanced bt_ioremap - enhance bt_ioremap

This patch makes it possible for bt_ioremap() to be used before
paging_init(), via providing an early implementation of set_fixmap()
that can be used before paging_init().

This way boot_ioremap() can be replaced by bt_ioremap().

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: set strong uncacheable where UC is really desired
Siddha, Suresh B [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: set strong uncacheable where UC is really desired

Also use _PAGE_PWT for all the mappings which need uncache mapping.
Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
we now use PAT3 which is strong uncacheable.

This makes it consistent with pgprot_noncached()
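
As a sketch of why adding the write-through bit moves a mapping from PAT2 to PAT3: on x86 the PAT entry used by a 4K page is selected by the PAT, PCD and PWT bits of the page-table entry, read as a 3-bit index. The PTE_* macros and pat_index() below are illustrative names, not kernel identifiers (the kernel spells the flags _PAGE_PWT and _PAGE_PCD).

```c
#include <assert.h>

/* x86 page-table entry cache-control bits for 4K pages. */
#define PTE_PWT 0x008UL  /* page write-through, bit 3 */
#define PTE_PCD 0x010UL  /* page cache disable, bit 4 */
#define PTE_PAT 0x080UL  /* PAT selector, bit 7 */

/* PAT entry index selected by a PTE: PAT*4 + PCD*2 + PWT. */
unsigned int pat_index(unsigned long pte_flags)
{
    return (unsigned int)((!!(pte_flags & PTE_PAT) << 2) |
                          (!!(pte_flags & PTE_PCD) << 1) |
                           !!(pte_flags & PTE_PWT));
}
```

With PCD alone the index is 2 (the UC- entry the message calls PAT2); with PCD and PWT together it is 3 (PAT3, strong uncacheable).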

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: use __PAGE_KERNEL_EXEC in ioremap_64.c
Joerg Roedel [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: use __PAGE_KERNEL_EXEC in ioremap_64.c

This patch replaces the manual permission setup for pages in ioremap_64.c with
the pre-defined __PAGE_KERNEL_EXEC value.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: cleanup boot_ioremap_32.c
Thomas Gleixner [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: cleanup boot_ioremap_32.c

Coding style cleanup before modifying the file.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up arch/x86/mm/pageattr-test.c
Ingo Molnar [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: clean up arch/x86/mm/pageattr-test.c

fix 15 checkpatch warnings.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: c_p_a(), add simple self test at boot
Andi Kleen [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: c_p_a(), add simple self test at boot

Since change_page_attr() is tricky code, it is good to have some regression
test code. This patch maps and unmaps some random pages in the direct mapping
at boot and then dumps the state and does some simple sanity checks.

Add it with a CONFIG option.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: return the page table level in lookup_address()
Ingo Molnar [Wed, 30 Jan 2008 12:33:43 +0000 (13:33 +0100)]
x86: return the page table level in lookup_address()

based on this patch from Andi Kleen:

|  Subject: CPA: Return the page table level in lookup_address()
|  From: Andi Kleen <ak@suse.de>
|
|  Needed for the next change.
|
|  And change all the callers.

and ported it to x86.git.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add pte accessors for the global bit
Andi Kleen [Wed, 30 Jan 2008 12:33:42 +0000 (13:33 +0100)]
x86: add pte accessors for the global bit

Needed for some test code.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up pte_exec
Andi Kleen [Wed, 30 Jan 2008 12:33:42 +0000 (13:33 +0100)]
x86: clean up pte_exec

- Rename it to pte_exec() from pte_exec_kernel(). There is nothing
kernel specific in there.
- Move it into the common file because _PAGE_NX is 0 on !PAE and then
pte_exec() will always evaluate to true.
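
A minimal sketch of the second point, assuming the check is a plain mask test against the NX bit. sketch_pte_exec() and SKETCH_NX_BIT are hypothetical names, and the NX mask is passed as a parameter here only so both configurations can be shown side by side.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NX_BIT (1ULL << 63)  /* no-execute bit of a 64-bit PTE */

/*
 * A page is executable when its NX bit is clear. With nx_mask == 0
 * (modelling _PAGE_NX on !PAE) the test can never fail, so the result
 * is always true.
 */
int sketch_pte_exec(uint64_t pte_val, uint64_t nx_mask)
{
    return !(pte_val & nx_mask);
}
```

Because a zero mask makes the expression constant, one definition can serve both PAE and !PAE builds.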

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: add dump_pagetable helper to X86_32
Harvey Harrison [Wed, 30 Jan 2008 12:33:42 +0000 (13:33 +0100)]
x86: add dump_pagetable helper to X86_32

Similar to x86 64-bit.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago c_p_a(): do a simple self test at boot
Andi Kleen [Wed, 30 Jan 2008 12:33:42 +0000 (13:33 +0100)]
c_p_a(): do a simple self test at boot

When CONFIG_DEBUG_RODATA is enabled undo the ro mapping and redo it again.
This gives some simple testing for change_page_attr().

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up arch/x86/mm/pageattr_64.c
Ingo Molnar [Wed, 30 Jan 2008 12:33:41 +0000 (13:33 +0100)]
x86: clean up arch/x86/mm/pageattr_64.c

clean up arch/x86/mm/pageattr_64.c.

no code changed:

   text    data     bss     dec     hex filename
   1751      16       0    1767     6e7 pageattr_64.o.before
   1751      16       0    1767     6e7 pageattr_64.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: clean up arch/x86/mm/pageattr_32.c
Ingo Molnar [Wed, 30 Jan 2008 12:33:41 +0000 (13:33 +0100)]
x86: clean up arch/x86/mm/pageattr_32.c

clean up arch/x86/mm/pageattr_32.c.

no code changed:

   text    data     bss     dec     hex filename
   1255      40       0    1295     50f pageattr_32.o.before
   1255      40       0    1295     50f pageattr_32.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: change ioremap() to default to uncached
Ingo Molnar [Wed, 30 Jan 2008 12:33:40 +0000 (13:33 +0100)]
x86: change ioremap() to default to uncached

Prepare ioremap() to default to uncached. This will be the
safest - but first we have to fix CPA.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: fix recursion in arch/x86/kernel/cpu/mcheck/mce_amd_64.c
Yinghai Lu [Wed, 30 Jan 2008 12:33:40 +0000 (13:33 +0100)]
x86: fix recursion in arch/x86/kernel/cpu/mcheck/mce_amd_64.c

remove the recursion from this function.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: allocate and initialize unshared pmds
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:40 +0000 (13:33 +0100)]
x86: allocate and initialize unshared pmds

If SHARED_KERNEL_PMD is false, then we need to allocate and initialize
the kernel pmd.  We can easily piggy-back this onto the existing pmd
prepopulation code.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
13 years ago x86: preallocate pmds at pgd creation time
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:40 +0000 (13:33 +0100)]
x86: preallocate pmds at pgd creation time

In PAE mode, an update to the pgd requires a cr3 reload to make sure
the processor notices the changes.  Since this also has the
side-effect of flushing the tlb, it's an expensive operation which we
want to avoid where possible.

This patch mitigates the cost of installing the initial set of pmds on
process creation by preallocating them when the pgd is allocated.
This avoids up to three tlb flushes during exec, as it creates the new
process address space while the pagetable is in active use.

The pmds will be freed as part of the normal pagetable teardown in
free_pgtables, which is called in munmap and process exit.  However,
free_pgtables will only free parts of the pagetable which actually
contain mappings, so stray pmds may still be attached to the pgd at
pgd_free time.  We must mop them up to prevent a memory leak.
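
The lifecycle described above can be modelled in user space as follows. This is a rough sketch under stated assumptions, not the kernel implementation: the sketch_* names are hypothetical, malloc-family calls stand in for page allocation, and three user pmds are assumed from the "up to three tlb flushes" figure.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_USER_PMDS 3     /* assumed from "up to three tlb flushes" */
#define SKETCH_PAGE_SIZE 4096

struct sketch_pgd {
    void *pmd[SKETCH_USER_PMDS];
};

/* Free the pgd and mop up any pmds still attached; returns how many. */
int sketch_pgd_free(struct sketch_pgd *pgd)
{
    int stray = 0;

    for (int i = 0; i < SKETCH_USER_PMDS; i++) {
        if (pgd->pmd[i]) {
            free(pgd->pmd[i]);
            stray++;
        }
    }
    free(pgd);
    return stray;
}

/* Allocate a pgd with all of its pmds installed up front. */
struct sketch_pgd *sketch_pgd_alloc(void)
{
    struct sketch_pgd *pgd = calloc(1, sizeof(*pgd));

    if (!pgd)
        return NULL;
    for (int i = 0; i < SKETCH_USER_PMDS; i++) {
        pgd->pmd[i] = calloc(1, SKETCH_PAGE_SIZE);
        if (!pgd->pmd[i]) {
            sketch_pgd_free(pgd);  /* releases the partial set */
            return NULL;
        }
    }
    return pgd;
}
```

If normal teardown has already released some pmds and cleared their slots, sketch_pgd_free() releases only the ones left behind, which is the leak the last paragraph guards against.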

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: William Irwin <wli@holomorphy.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>