</contrib>
<affiliation>
<address>
- <email>awalls@radix.net</email>
+ <email>awalls@md.metrocast.net</email>
</address>
</affiliation>
</author>
automatically, similar to sensing the video standard. To do so, applications
call <constant> VIDIOC_QUERY_DV_PRESET</constant> with a pointer to a
&v4l2-dv-preset; type. Once the hardware detects a preset, that preset is
-returned in the preset field of &v4l2-dv-preset;. When detection is not
-possible or fails, the value V4L2_DV_INVALID is returned.</para>
+returned in the preset field of &v4l2-dv-preset;. If the preset could not be
+detected because there was no signal, or the signal was unreliable, or the
+signal did not map to a supported preset, then the value V4L2_DV_INVALID is
+returned.</para>
</refsect1>
<refsect1>
7 Dec 2005
17 Jul 2007 Updated
+(c) Mauro Carvalho Chehab <mchehab@redhat.com>
+05 Aug 2009 Nehalem interface
EDAC is maintained and written by:
The 'test_device_edac' sample driver is located at the
bluesmoke.sourceforge.net project site for EDAC.
+=======================================================================
+NEHALEM USAGE OF EDAC APIs
+
+This chapter documents some EXPERIMENTAL mappings for EDAC API to handle
+Nehalem EDAC driver. They will likely be changed on future versions
+of the driver.
+
+Due to the way Nehalem exports Memory Controller data, some adjustments
+were done at i7core_edac driver. This chapter will cover those differences
+
+1) On Nehalem, there are one Memory Controller per Quick Patch Interconnect
+ (QPI). At the driver, the term "socket" means one QPI. This is
+ associated with a physical CPU socket.
+
+ Each MC have 3 physical read channels, 3 physical write channels and
+ 3 logic channels. The driver currenty sees it as just 3 channels.
+ Each channel can have up to 3 DIMMs.
+
+ The minimum known unity is DIMMs. There are no information about csrows.
+ As EDAC API maps the minimum unity is csrows, the driver sequencially
+ maps channel/dimm into different csrows.
+
+ For example, suposing the following layout:
+ Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
+ dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
+ dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
+ Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
+ dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
+ Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
+ dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
+ The driver will map it as:
+ csrow0: channel 0, dimm0
+ csrow1: channel 0, dimm1
+ csrow2: channel 1, dimm0
+ csrow3: channel 2, dimm0
+
+exports one
+ DIMM per csrow.
+
+ Each QPI is exported as a different memory controller.
+
+2) Nehalem MC has the hability to generate errors. The driver implements this
+ functionality via some error injection nodes:
+
+ For injecting a memory error, there are some sysfs nodes, under
+ /sys/devices/system/edac/mc/mc?/:
+
+ inject_addrmatch/*:
+ Controls the error injection mask register. It is possible to specify
+ several characteristics of the address to match an error code:
+ dimm = the affected dimm. Numbers are relative to a channel;
+ rank = the memory rank;
+ channel = the channel that will generate an error;
+ bank = the affected bank;
+ page = the page address;
+ column (or col) = the address column.
+ each of the above values can be set to "any" to match any valid value.
+
+ At driver init, all values are set to any.
+
+ For example, to generate an error at rank 1 of dimm 2, for any channel,
+ any bank, any page, any column:
+ echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
+ echo 1 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank
+
+ To return to the default behaviour of matching any, you can do:
+ echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
+ echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank
+
+ inject_eccmask:
+ specifies what bits will have troubles,
+
+ inject_section:
+ specifies what ECC cache section will get the error:
+ 3 for both
+ 2 for the highest
+ 1 for the lowest
+
+ inject_type:
+ specifies the type of error, being a combination of the following bits:
+ bit 0 - repeat
+ bit 1 - ecc
+ bit 2 - parity
+
+ inject_enable starts the error generation when something different
+ than 0 is written.
+
+ All inject vars can be read. root permission is needed for write.
+
+ Datasheet states that the error will only be generated after a write on an
+ address that matches inject_addrmatch. It seems, however, that reading will
+ also produce an error.
+
+ For example, the following code will generate an error for any write access
+ at socket 0, on any DIMM/address on channel 2:
+
+ echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/channel
+ echo 2 >/sys/devices/system/edac/mc/mc0/inject_type
+ echo 64 >/sys/devices/system/edac/mc/mc0/inject_eccmask
+ echo 3 >/sys/devices/system/edac/mc/mc0/inject_section
+ echo 1 >/sys/devices/system/edac/mc/mc0/inject_enable
+ dd if=/dev/mem of=/dev/null seek=16k bs=4k count=1 >& /dev/null
+
+ For socket 1, it is needed to replace "mc0" by "mc1" at the above
+ commands.
+
+ The generated error message will look like:
+
+ EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))
+
+3) Nehalem specific Corrected Error memory counters
+
+ Nehalem have some registers to count memory errors. The driver uses those
+ registers to report Corrected Errors on devices with Registered Dimms.
+
+ However, those counters don't work with Unregistered Dimms. As the chipset
+ offers some counters that also work with UDIMMS (but with a worse level of
+ granularity than the default ones), the driver exposes those registers for
+ UDIMM memories.
+
+ They can be read by looking at the contents of all_channel_counts/
+
+ $ for i in /sys/devices/system/edac/mc/mc0/all_channel_counts/*; do echo $i; cat $i; done
+ /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0
+ 0
+ /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1
+ 0
+ /sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2
+ 0
+
+ What happens here is that errors on different csrows, but at the same
+ dimm number will increment the same counter.
+ So, in this memory mapping:
+ csrow0: channel 0, dimm0
+ csrow1: channel 0, dimm1
+ csrow2: channel 1, dimm0
+ csrow3: channel 2, dimm0
+ The hardware will increment udimm0 for an error at the first dimm at either
+ csrow0, csrow2 or csrow3;
+ The hardware will increment udimm1 for an error at the second dimm at either
+ csrow0, csrow2 or csrow3;
+ The hardware will increment udimm2 for an error at the third dimm at either
+ csrow0, csrow2 or csrow3;
+
+4) Standard error counters
+
+ The standard error counters are generated when an mcelog error is received
+ by the driver. Since, with udimm, this is counted by software, it is
+ possible that some errors could be lost. With rdimm's, they displays the
+ contents of the registers
----------------------------
-What: "acpi=ht" boot option
-When: 2.6.35
-Why: Useful in 2003, implementation is a hack.
- Generally invoked by accident today.
- Seen as doing more harm than good.
-Who: Len Brown <len.brown@intel.com>
-
-----------------------------
-
What: iwlwifi 50XX module parameters
When: 2.6.40
Why: The "..50" modules parameters were used to configure 5000 series and
175 -> Leadtek Winfast DTV1000S [107d:6655]
176 -> Beholder BeholdTV 505 RDS [0000:5051]
177 -> Hawell HW-404M7
-179 -> Beholder BeholdTV H7 [5ace:7190]
-180 -> Beholder BeholdTV A7 [5ace:7090]
+178 -> Beholder BeholdTV H7 [5ace:7190]
+179 -> Beholder BeholdTV A7 [5ace:7090]
+180 -> Avermedia M733A [1461:4155,1461:4255]
sonixj 0c45:6040 Speed NVC 350K
sonixj 0c45:607c Sonix sn9c102p Hv7131R
sonixj 0c45:60c0 Sangha Sn535
+sonixj 0c45:60ce USB-PC-Camera-168 (TALK-5067)
sonixj 0c45:60ec SN9C105+MO4000
sonixj 0c45:60fb Surfer NoName
sonixj 0c45:60fc LG-LIC300
F: sound/pci/cs5535audio/
CX18 VIDEO4LINUX DRIVER
-M: Andy Walls <awalls@radix.net>
+M: Andy Walls <awalls@md.metrocast.net>
L: ivtv-devel@ivtvdriver.org (moderated for non-subscribers)
L: linux-media@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6.git
F: drivers/hwmon/it87.c
IVTV VIDEO4LINUX DRIVER
-M: Andy Walls <awalls@radix.net>
+M: Andy Walls <awalls@md.metrocast.net>
L: ivtv-devel@ivtvdriver.org (moderated for non-subscribers)
L: linux-media@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6.git
*/
out_of_memory:
up_read(&mm->mmap_sem);
- printk("VM: killing process %s\n", current->comm);
- if (user_mode(__frame))
- do_group_exit(SIGKILL);
- goto no_context;
+ if (!user_mode(__frame))
+ goto no_context;
+ pagefault_out_of_memory();
+ return;
do_sigbus:
up_read(&mm->mmap_sem);
if ((error_code & ACE_INSTRUCTION) && !(vma->vm_flags & VM_EXEC))
goto bad_area;
-survive:
/*
* If for any reason at all we couldn't handle the fault,
* make sure we exit gracefully rather than endlessly redo
*/
out_of_memory:
up_read(&mm->mmap_sem);
- if (is_global_init(tsk)) {
- yield();
- down_read(&mm->mmap_sem);
- goto survive;
- }
- printk("VM: killing process %s\n", tsk->comm);
- if (error_code & ACE_USERMODE)
- do_group_exit(SIGKILL);
- goto no_context;
+ if (!(error_code & ACE_USERMODE))
+ goto no_context;
+ pagefault_out_of_memory();
+ return;
do_sigbus:
up_read(&mm->mmap_sem);
*/
out_of_memory:
up_read(&mm->mmap_sem);
- monitor_signal(regs);
- printk(KERN_ALERT "VM: killing process %s\n", tsk->comm);
- if ((fault_code & MMUFCR_xFC_ACCESS) == MMUFCR_xFC_ACCESS_USR)
- do_exit(SIGKILL);
- goto no_context;
+ if ((fault_code & MMUFCR_xFC_ACCESS) != MMUFCR_xFC_ACCESS_USR)
+ goto no_context;
+ pagefault_out_of_memory();
+ return;
do_sigbus:
up_read(&mm->mmap_sem);
def_bool y
select EMBEDDED
select HAVE_CLK
- select HAVE_IDE
+ select HAVE_IDE if HAS_IOPORT
select HAVE_LMB
select HAVE_OPROFILE
select HAVE_GENERIC_DMA_COHERENT
config ARCH_HAS_CPU_IDLE_WAIT
def_bool y
+config NO_IOPORT
+ bool
+
config IO_TRAPPED
bool
default "0x00010000" if PAGE_SIZE_64KB
default "0x00000000"
+config ROMIMAGE_MMCIF
+ bool "Include MMCIF loader in romImage (EXPERIMENTAL)"
+ depends on CPU_SUBTYPE_SH7724 && EXPERIMENTAL
+ help
+ Say Y here to include experimental MMCIF loading code in
+ romImage. With this enabled it is possible to write the romImage
+ kernel image to an MMC card and boot the kernel straight from
+ the reset vector. At reset the processor Mask ROM will load the
+ first part of the romImage which in turn loads the rest the kernel
+ image to RAM using the MMCIF hardware block.
+
choice
prompt "Kernel command line"
optional
bool "SDK7786"
depends on CPU_SUBTYPE_SH7786
select SYS_SUPPORTS_PCI
+ select NO_IOPORT if !PCI
help
Select SDK7786 if configuring for a Renesas Technology Europe
SH7786-65nm board.
depends on CPU_SUBTYPE_SH7786
select ARCH_REQUIRE_GPIOLIB
select SYS_SUPPORTS_PCI
+ select NO_IOPORT if !PCI
config SH_MIGOR
bool "Migo-R"
config SH_X3PROTO
bool "SH-X3 Prototype board"
depends on CPU_SUBTYPE_SHX3
+ select NO_IOPORT if !PCI
config SH_MAGIC_PANEL_R2
bool "Magic Panel R2"
.set_capture = camera_set_capture,
};
-struct soc_camera_link camera_link = {
+static struct soc_camera_link camera_link = {
.bus_id = 0,
.add_device = ap325rxa_camera_add,
.del_device = ap325rxa_camera_del,
#include <linux/device.h>
#include <linux/platform_device.h>
#include <linux/mfd/sh_mobile_sdhi.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/sh_mmcif.h>
#include <linux/mtd/physmap.h>
#include <linux/gpio.h>
#include <linux/interrupt.h>
#include <linux/mmc/host.h>
#include <linux/input.h>
#include <linux/input/sh_keysc.h>
-#include <linux/mfd/sh_mobile_sdhi.h>
#include <video/sh_mobile_lcdc.h>
#include <sound/sh_fsi.h>
#include <media/sh_mobile_ceu.h>
},
};
-struct sh_eth_plat_data sh_eth_plat = {
+static struct sh_eth_plat_data sh_eth_plat = {
.phy = 0x1f, /* SMSC LAN8700 */
.edmac_endian = EDMAC_LITTLE_ENDIAN,
.ether_link_active_low = 1
};
/* USB0 host */
-void usb0_port_power(int port, int power)
+static void usb0_port_power(int port, int power)
{
gpio_set_value(GPIO_PTB4, power);
}
};
/* USB1 host/function */
-void usb1_port_power(int port, int power)
+static void usb1_port_power(int port, int power)
{
gpio_set_value(GPIO_PTB5, power);
}
return 0;
}
-struct tsc2007_platform_data tsc2007_info = {
+static struct tsc2007_platform_data tsc2007_info = {
.model = 2007,
.x_plate_ohms = 180,
.get_pendown_state = ts_get_pendown_state,
};
#ifdef CONFIG_MFD_SH_MOBILE_SDHI
-/* SHDI0 */
+/* SDHI0 */
static void sdhi0_set_pwr(struct platform_device *pdev, int state)
{
gpio_set_value(GPIO_PTB6, state);
},
};
-/* SHDI1 */
+#if !defined(CONFIG_MMC_SH_MMCIF)
+/* SDHI1 */
static void sdhi1_set_pwr(struct platform_device *pdev, int state)
{
gpio_set_value(GPIO_PTB7, state);
.hwblk_id = HWBLK_SDHI1,
},
};
+#endif /* CONFIG_MMC_SH_MMCIF */
#else
.rate = 0, /* unknown */
};
-struct sh_fsi_platform_info fsi_info = {
+static struct sh_fsi_platform_info fsi_info = {
.portb_flags = SH_FSI_BRS_INV |
SH_FSI_OUT_SLAVE_MODE |
SH_FSI_IN_SLAVE_MODE |
#include <media/ak881x.h>
#include <media/sh_vou.h>
-struct ak881x_pdata ak881x_pdata = {
+static struct ak881x_pdata ak881x_pdata = {
.flags = AK881X_IF_MODE_SLAVE,
};
.platform_data = &ak881x_pdata,
};
-struct sh_vou_pdata sh_vou_pdata = {
+static struct sh_vou_pdata sh_vou_pdata = {
.bus_fmt = SH_VOU_BUS_8BIT,
.flags = SH_VOU_HSYNC_LOW | SH_VOU_VSYNC_LOW,
.board_info = &ak8813,
},
};
+#if defined(CONFIG_MMC_SH_MMCIF)
+/* SH_MMCIF */
+static void mmcif_set_pwr(struct platform_device *pdev, int state)
+{
+ gpio_set_value(GPIO_PTB7, state);
+}
+
+static void mmcif_down_pwr(struct platform_device *pdev)
+{
+ gpio_set_value(GPIO_PTB7, 0);
+}
+
+static struct resource sh_mmcif_resources[] = {
+ [0] = {
+ .name = "SH_MMCIF",
+ .start = 0xA4CA0000,
+ .end = 0xA4CA00FF,
+ .flags = IORESOURCE_MEM,
+ },
+ [1] = {
+ /* MMC2I */
+ .start = 29,
+ .flags = IORESOURCE_IRQ,
+ },
+ [2] = {
+ /* MMC3I */
+ .start = 30,
+ .flags = IORESOURCE_IRQ,
+ },
+};
+
+static struct sh_mmcif_plat_data sh_mmcif_plat = {
+ .set_pwr = mmcif_set_pwr,
+ .down_pwr = mmcif_down_pwr,
+ .sup_pclk = 0, /* SH7724: Max Pclk/2 */
+ .caps = MMC_CAP_4_BIT_DATA |
+ MMC_CAP_8_BIT_DATA |
+ MMC_CAP_NEEDS_POLL,
+ .ocr = MMC_VDD_32_33 | MMC_VDD_33_34,
+};
+
+static struct platform_device sh_mmcif_device = {
+ .name = "sh_mmcif",
+ .id = 0,
+ .dev = {
+ .platform_data = &sh_mmcif_plat,
+ },
+ .num_resources = ARRAY_SIZE(sh_mmcif_resources),
+ .resource = sh_mmcif_resources,
+};
+#endif
+
static struct platform_device *ecovec_devices[] __initdata = {
&heartbeat_device,
&nor_flash_device,
&keysc_device,
#ifdef CONFIG_MFD_SH_MOBILE_SDHI
&sdhi0_device,
+#if !defined(CONFIG_MMC_SH_MMCIF)
&sdhi1_device,
+#endif
#else
&msiof0_device,
#endif
&fsi_device,
&irda_device,
&vou_device,
+#if defined(CONFIG_MMC_SH_MMCIF)
+ &sh_mmcif_device,
+#endif
};
#ifdef CONFIG_I2C
gpio_request(GPIO_PTB6, NULL);
gpio_direction_output(GPIO_PTB6, 0);
+#if !defined(CONFIG_MMC_SH_MMCIF)
/* enable SDHI1 on CN12 (needs DS2.6,7 set to ON,OFF) */
gpio_request(GPIO_FN_SDHI1CD, NULL);
gpio_request(GPIO_FN_SDHI1WP, NULL);
/* I/O buffer drive ability is high for SDHI1 */
__raw_writew((__raw_readw(IODRIVEA) & ~0x3000) | 0x2000 , IODRIVEA);
+#endif /* CONFIG_MMC_SH_MMCIF */
#else
/* enable MSIOF0 on CN11 (needs DS2.4 set to OFF) */
gpio_request(GPIO_FN_MSIOF0_TXD, NULL);
gpio_request(GPIO_PTU5, NULL);
gpio_direction_output(GPIO_PTU5, 0);
+#if defined(CONFIG_MMC_SH_MMCIF)
+ /* enable MMCIF (needs DS2.6,7 set to OFF,ON) */
+ gpio_request(GPIO_FN_MMC_D7, NULL);
+ gpio_request(GPIO_FN_MMC_D6, NULL);
+ gpio_request(GPIO_FN_MMC_D5, NULL);
+ gpio_request(GPIO_FN_MMC_D4, NULL);
+ gpio_request(GPIO_FN_MMC_D3, NULL);
+ gpio_request(GPIO_FN_MMC_D2, NULL);
+ gpio_request(GPIO_FN_MMC_D1, NULL);
+ gpio_request(GPIO_FN_MMC_D0, NULL);
+ gpio_request(GPIO_FN_MMC_CLK, NULL);
+ gpio_request(GPIO_FN_MMC_CMD, NULL);
+ gpio_request(GPIO_PTB7, NULL);
+ gpio_direction_output(GPIO_PTB7, 0);
+
+ /* I/O buffer drive ability is high for MMCIF */
+ __raw_writew((__raw_readw(IODRIVEA) & ~0x3000) | 0x2000 , IODRIVEA);
+#endif
+
/* enable I2C device */
i2c_register_board_info(0, i2c0_devices,
ARRAY_SIZE(i2c0_devices));
return gpio_get_value(GPIO_PTA1); /* NAND_RBn */
}
-struct platform_nand_data migor_nand_flash_data = {
+static struct platform_nand_data migor_nand_flash_data = {
.chip = {
.nr_chips = 1,
.partitions = migor_nand_flash_partitions,
};
/* change J20, J21, J22 pin to 1-2 connection to use slave mode */
-struct sh_fsi_platform_info fsi_info = {
+static struct sh_fsi_platform_info fsi_info = {
.porta_flags = SH_FSI_BRS_INV |
SH_FSI_OUT_SLAVE_MODE |
SH_FSI_IN_SLAVE_MODE |
},
};
-struct sh_eth_plat_data sh_eth_plat = {
+static struct sh_eth_plat_data sh_eth_plat = {
.phy = 0x1f, /* SMSC LAN8187 */
.edmac_endian = EDMAC_LITTLE_ENDIAN,
};
#include <media/ak881x.h>
#include <media/sh_vou.h>
-struct ak881x_pdata ak881x_pdata = {
+static struct ak881x_pdata ak881x_pdata = {
.flags = AK881X_IF_MODE_SLAVE,
};
.platform_data = &ak881x_pdata,
};
-struct sh_vou_pdata sh_vou_pdata = {
+static struct sh_vou_pdata sh_vou_pdata = {
.bus_fmt = SH_VOU_BUS_8BIT,
.flags = SH_VOU_HSYNC_LOW | SH_VOU_VSYNC_LOW,
.board_info = &ak8813,
#
# linux/arch/sh/boot/romimage/Makefile
#
-# create an image suitable for burning to flash from zImage
+# create an romImage file suitable for burning to flash/mmc from zImage
#
targets := vmlinux head.o zeropage.bin piggy.o
+load-y := 0
-OBJECTS = $(obj)/head.o
-LDFLAGS_vmlinux := --oformat $(ld-bfd) -Ttext 0 -e romstart \
+mmcif-load-$(CONFIG_CPU_SUBTYPE_SH7724) := 0xe5200000 # ILRAM
+mmcif-obj-$(CONFIG_CPU_SUBTYPE_SH7724) := $(obj)/mmcif-sh7724.o
+load-$(CONFIG_ROMIMAGE_MMCIF) := $(mmcif-load-y)
+obj-$(CONFIG_ROMIMAGE_MMCIF) := $(mmcif-obj-y)
+
+LDFLAGS_vmlinux := --oformat $(ld-bfd) -Ttext $(load-y) -e romstart \
-T $(obj)/../../kernel/vmlinux.lds
-$(obj)/vmlinux: $(OBJECTS) $(obj)/piggy.o FORCE
+$(obj)/vmlinux: $(obj)/head.o $(obj-y) $(obj)/piggy.o FORCE
$(call if_changed,ld)
@:
/* include board specific setup code */
#include <mach/romimage.h>
+#ifdef CONFIG_ROMIMAGE_MMCIF
+ /* load the romImage to above the empty zero page */
+ mov.l empty_zero_page_dst, r4
+ mov.l empty_zero_page_dst_adj, r5
+ add r5, r4
+ mov.l bytes_to_load, r5
+ mov.l loader_function, r7
+ jsr @r7
+ mov r4, r15
+
+ mov.l empty_zero_page_dst, r4
+ mov.l empty_zero_page_dst_adj, r5
+ add r5, r4
+ mov.l loaded_code_offs, r5
+ add r5, r4
+ jmp @r4
+ nop
+
+ .balign 4
+empty_zero_page_dst_adj:
+ .long PAGE_SIZE
+bytes_to_load:
+ .long end_data - romstart
+loader_function:
+ .long mmcif_loader
+loaded_code_offs:
+ .long loaded_code - romstart
+loaded_code:
+#endif /* CONFIG_ROMIMAGE_MMCIF */
+
/* copy the empty_zero_page contents to where vmlinux expects it */
- mova empty_zero_page_src, r0
+ mova extra_data_pos, r0
+ mov.l extra_data_size, r1
+ add r1, r0
mov.l empty_zero_page_dst, r1
mov #(PAGE_SHIFT - 4), r4
mov #1, r3
mov #PAGE_SHIFT, r4
mov #1, r1
shld r4, r1
- mova empty_zero_page_src, r0
+ mova extra_data_pos, r0
+ add r1, r0
+ mov.l extra_data_size, r1
add r1, r0
jmp @r0
nop
.align 2
empty_zero_page_dst:
.long _text
-empty_zero_page_src:
+extra_data_pos:
+extra_data_size:
+ .long zero_page_pos - extra_data_pos
--- /dev/null
+/*
+ * sh7724 MMCIF loader
+ *
+ * Copyright (C) 2010 Magnus Damm
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License. See the file "COPYING" in the main directory of this archive
+ * for more details.
+ */
+
+#include <linux/mmc/sh_mmcif.h>
+#include <mach/romimage.h>
+
+#define MMCIF_BASE (void __iomem *)0xa4ca0000
+
+#define MSTPCR2 0xa4150038
+#define PTWCR 0xa4050146
+#define PTXCR 0xa4050148
+#define PSELA 0xa405014e
+#define PSELE 0xa4050156
+#define HIZCRC 0xa405015c
+#define DRVCRA 0xa405018a
+
+enum { MMCIF_PROGRESS_ENTER, MMCIF_PROGRESS_INIT,
+ MMCIF_PROGRESS_LOAD, MMCIF_PROGRESS_DONE };
+
+/* SH7724 specific MMCIF loader
+ *
+ * loads the romImage from an MMC card starting from block 512
+ * use the following line to write the romImage to an MMC card
+ * # dd if=arch/sh/boot/romImage of=/dev/sdx bs=512 seek=512
+ */
+asmlinkage void mmcif_loader(unsigned char *buf, unsigned long no_bytes)
+{
+ mmcif_update_progress(MMCIF_PROGRESS_ENTER);
+
+ /* enable clock to the MMCIF hardware block */
+ __raw_writel(__raw_readl(MSTPCR2) & ~0x20000000, MSTPCR2);
+
+ /* setup pins D7-D0 */
+ __raw_writew(0x0000, PTWCR);
+
+ /* setup pins MMC_CLK, MMC_CMD */
+ __raw_writew(__raw_readw(PTXCR) & ~0x000f, PTXCR);
+
+ /* select D3-D0 pin function */
+ __raw_writew(__raw_readw(PSELA) & ~0x2000, PSELA);
+
+ /* select D7-D4 pin function */
+ __raw_writew(__raw_readw(PSELE) & ~0x3000, PSELE);
+
+ /* disable Hi-Z for the MMC pins */
+ __raw_writew(__raw_readw(HIZCRC) & ~0x0620, HIZCRC);
+
+ /* high drive capability for MMC pins */
+ __raw_writew(__raw_readw(DRVCRA) | 0x3000, DRVCRA);
+
+ mmcif_update_progress(MMCIF_PROGRESS_INIT);
+
+ /* setup MMCIF hardware */
+ sh_mmcif_boot_init(MMCIF_BASE);
+
+ mmcif_update_progress(MMCIF_PROGRESS_LOAD);
+
+ /* load kernel via MMCIF interface */
+ sh_mmcif_boot_slurp(MMCIF_BASE, buf, no_bytes);
+
+ /* disable clock to the MMCIF hardware block */
+ __raw_writel(__raw_readl(MSTPCR2) | 0x20000000, MSTPCR2);
+
+ mmcif_update_progress(MMCIF_PROGRESS_DONE);
+}
SECTIONS
{
.text : {
+ zero_page_pos = .;
*(.data)
+ end_data = .;
}
}
#include <asm/io_generic.h>
#include <asm/io_trapped.h>
+#ifdef CONFIG_HAS_IOPORT
+
#define inb(p) sh_mv.mv_inb((p))
#define inw(p) sh_mv.mv_inw((p))
#define inl(p) sh_mv.mv_inl((p))
#define outsw(p,b,c) sh_mv.mv_outsw((p), (b), (c))
#define outsl(p,b,c) sh_mv.mv_outsl((p), (b), (c))
+#endif
+
#define __raw_writeb(v,a) (__chk_io_ptr(a), *(volatile u8 __force *)(a) = (v))
#define __raw_writew(v,a) (__chk_io_ptr(a), *(volatile u16 __force *)(a) = (v))
#define __raw_writel(v,a) (__chk_io_ptr(a), *(volatile u32 __force *)(a) = (v))
#define IO_SPACE_LIMIT 0xffffffff
+#ifdef CONFIG_HAS_IOPORT
+
/*
* This function provides a method for the generic case where a
* board-specific ioport_map simply needs to return the port + some
#define __ioport_map(p, n) sh_mv.mv_ioport_map((p), (n))
+#endif
+
/* We really want to try and get these to memcpy etc */
void memcpy_fromio(void *, const volatile void __iomem *, unsigned long);
void memcpy_toio(volatile void __iomem *, const void *, unsigned long);
const char *mv_name;
int mv_nr_irqs;
+ int (*mv_irq_demux)(int irq);
+ void (*mv_init_irq)(void);
+
+#ifdef CONFIG_HAS_IOPORT
u8 (*mv_inb)(unsigned long);
u16 (*mv_inw)(unsigned long);
u32 (*mv_inl)(unsigned long);
void (*mv_outsw)(unsigned long, const void *src, unsigned long count);
void (*mv_outsl)(unsigned long, const void *src, unsigned long count);
- int (*mv_irq_demux)(int irq);
-
- void (*mv_init_irq)(void);
-
void __iomem *(*mv_ioport_map)(unsigned long port, unsigned int size);
void (*mv_ioport_unmap)(void __iomem *);
+#endif
int (*mv_clk_init)(void);
int (*mv_mode_pins)(void);
* MD3: BSC - Area0 Bus Width (16/32-bit) [CS0BCR.9,10]
* MD5: BSC - Endian Mode (L: Big, H: Little) [CMNCR.3]
* MD8: Test Mode
+ * BOOT: FBR - Boot Mode (L: MMCIF, H: Area0)
*/
/* Pin Function Controller:
+#ifdef __ASSEMBLY__
+
/* do nothing here by default */
+
+#else /* __ASSEMBLY__ */
+
+extern inline void mmcif_update_progress(int nr)
+{
+}
+
+#endif /* __ASSEMBLY__ */
+#ifdef __ASSEMBLY__
+
/* EcoVec board specific boot code:
* converts the "partner-jet-script.txt" script into assembly
* the assembly code is the first code to be executed in the romImage
.align 2
1 : .long 0xa8000000
2 :
+
+#else /* __ASSEMBLY__ */
+
+/* Ecovec board specific information:
+ *
+ * Set the following to enable MMCIF boot from the MMC card in CN12:
+ *
+ * DS1.5 = OFF (SH BOOT pin set to L)
+ * DS2.6 = OFF (Select MMCIF on CN12 instead of SDHI1)
+ * DS2.7 = ON (Select MMCIF on CN12 instead of SDHI1)
+ *
+ */
+#define HIZCRA 0xa4050158
+#define PGDR 0xa405012c
+
+extern inline void mmcif_update_progress(int nr)
+{
+ /* disable Hi-Z for LED pins */
+ __raw_writew(__raw_readw(HIZCRA) & ~(1 << 1), HIZCRA);
+
+ /* update progress on LED4, LED5, LED6 and LED7 */
+ __raw_writeb(1 << (nr - 1), PGDR);
+}
+
+#endif /* __ASSEMBLY__ */
+#ifdef __ASSEMBLY__
+
/* kfr2r09 board specific boot code:
* converts the "partner-jet-script.txt" script into assembly
* the assembly code is the first code to be executed in the romImage
.align 2
1: .long 0xa8000000
2:
+
+#else /* __ASSEMBLY__ */
+
+extern inline void mmcif_update_progress(int nr)
+{
+}
+
+#endif /* __ASSEMBLY__ */
CFLAGS_REMOVE_return_address.o = -pg
obj-y := clkdev.o debugtraps.o dma-nommu.o dumpstack.o \
- idle.o io.o io_generic.o irq.o \
+ idle.o io.o irq.o \
irq_$(BITS).o machvec.o nmi_debug.o process.o \
process_$(BITS).o ptrace_$(BITS).o \
reboot.o return_address.o \
obj-$(CONFIG_HIBERNATION) += swsusp.o
obj-$(CONFIG_DWARF_UNWINDER) += dwarf.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_callchain.o
+obj-$(CONFIG_HAS_IOPORT) += io_generic.o
obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
obj-$(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST) += localtimer.o
static struct dwarf_cie *cached_cie;
+static unsigned int dwarf_unwinder_ready;
+
/**
* dwarf_frame_alloc_reg - allocate memory for a DWARF register
* @frame: the DWARF frame whose list of registers we insert on
struct dwarf_reg *reg;
unsigned long addr;
+ /*
+ * If we've been called in to before initialization has
+ * completed, bail out immediately.
+ */
+ if (!dwarf_unwinder_ready)
+ return NULL;
+
/*
* If we're starting at the top of the stack we need get the
* contents of a physical register to get the CFA in order to
*/
static int __init dwarf_unwinder_init(void)
{
- int err;
+ int err = -ENOMEM;
dwarf_frame_cachep = kmem_cache_create("dwarf_frames",
sizeof(struct dwarf_frame), 0,
mempool_alloc_slab,
mempool_free_slab,
dwarf_frame_cachep);
+ if (!dwarf_frame_pool)
+ goto out;
dwarf_reg_pool = mempool_create(DWARF_REG_MIN_REQ,
mempool_alloc_slab,
mempool_free_slab,
dwarf_reg_cachep);
+ if (!dwarf_reg_pool)
+ goto out;
err = dwarf_parse_section(__start_eh_frame, __stop_eh_frame, NULL);
if (err)
if (err)
goto out;
+ dwarf_unwinder_ready = 1;
+
return 0;
out:
printk(KERN_ERR "Failed to initialise DWARF unwinder: %d\n", err);
dwarf_unwinder_cleanup();
- return -EINVAL;
+ return err;
}
early_initcall(dwarf_unwinder_init);
}
}
EXPORT_SYMBOL(memset_io);
-
-#ifndef CONFIG_GENERIC_IOMAP
-
-void __iomem *ioport_map(unsigned long port, unsigned int nr)
-{
- void __iomem *ret;
-
- ret = __ioport_map_trapped(port, nr);
- if (ret)
- return ret;
-
- return __ioport_map(port, nr);
-}
-EXPORT_SYMBOL(ioport_map);
-
-void ioport_unmap(void __iomem *addr)
-{
- sh_mv.mv_ioport_unmap(addr);
-}
-EXPORT_SYMBOL(ioport_unmap);
-
-#endif /* CONFIG_GENERIC_IOMAP */
void generic_ioport_unmap(void __iomem *addr)
{
}
+
+#ifndef CONFIG_GENERIC_IOMAP
+void __iomem *ioport_map(unsigned long port, unsigned int nr)
+{
+ void __iomem *ret;
+
+ ret = __ioport_map_trapped(port, nr);
+ if (ret)
+ return ret;
+
+ return __ioport_map(port, nr);
+}
+EXPORT_SYMBOL(ioport_map);
+
+void ioport_unmap(void __iomem *addr)
+{
+ sh_mv.mv_ioport_unmap(addr);
+}
+EXPORT_SYMBOL(ioport_unmap);
+#endif /* CONFIG_GENERIC_IOMAP */
tiop->magic = IO_TRAPPED_MAGIC;
INIT_LIST_HEAD(&tiop->list);
spin_lock_irq(&trapped_lock);
+#ifdef CONFIG_HAS_IOPORT
if (flags & IORESOURCE_IO)
list_add(&tiop->list, &trapped_io);
+#endif
+#ifdef CONFIG_HAS_IOMEM
if (flags & IORESOURCE_MEM)
list_add(&tiop->list, &trapped_mem);
+#endif
spin_unlock_irq(&trapped_lock);
return 0;
sh_mv.mv_##elem = generic_##elem; \
} while (0)
+#ifdef CONFIG_HAS_IOPORT
+
+#ifdef P2SEG
+ __set_io_port_base(P2SEG);
+#else
+ __set_io_port_base(0);
+#endif
+
mv_set(inb); mv_set(inw); mv_set(inl);
mv_set(outb); mv_set(outw); mv_set(outl);
mv_set(ioport_map);
mv_set(ioport_unmap);
+
+#endif
+
mv_set(irq_demux);
mv_set(mode_pins);
mv_set(mem_init);
if (!sh_mv.mv_nr_irqs)
sh_mv.mv_nr_irqs = NR_IRQS;
-
-#ifdef P2SEG
- __set_io_port_base(P2SEG);
-#else
- __set_io_port_base(0);
-#endif
}
struct dwarf_frame *tmp;
tmp = dwarf_unwind_stack(ra, frame);
+ if (!tmp)
+ return NULL;
if (frame)
dwarf_free_frame(frame);
current->thread.fault_catcher = NULL;
- kunmap_atomic(page, KM_UML_USERCOPY);
+ kunmap_atomic((void *)addr, KM_UML_USERCOPY);
return n;
}
extern struct pci_bus *pci_root_bus;
extern struct pci_ops pci_root_ops;
+void pcibios_scan_specific_bus(int busn);
+
/* pci-irq.c */
struct irq_info {
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/debugfs.h>
+#include <linux/edac_mce.h>
#include <asm/processor.h>
#include <asm/hw_irq.h>
for (;;) {
entry = rcu_dereference_check_mce(mcelog.next);
for (;;) {
+ /*
+ * If edac_mce is enabled, it will check the error type
+ * and will process it, if it is a known error.
+ * Otherwise, the error will be sent through mcelog
+ * interface
+ */
+ if (edac_mce_parse(mce))
+ return;
+
/*
* When the buffer fills up discard new entries.
* Assume that the earlier errors are the more
*/
static void __devinit pcibios_fixup_peer_bridges(void)
{
- int n, devfn;
- long node;
+ int n;
if (pcibios_last_bus <= 0 || pcibios_last_bus > 0xff)
return;
DBG("PCI: Peer bridge fixup\n");
- for (n=0; n <= pcibios_last_bus; n++) {
- u32 l;
- if (pci_find_bus(0, n))
- continue;
- node = get_mp_bus_to_node(n);
- for (devfn = 0; devfn < 256; devfn += 8) {
- if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
- l != 0x0000 && l != 0xffff) {
- DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
- printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
- pci_scan_bus_on_node(n, &pci_root_ops, node);
- break;
- }
- }
- }
+ for (n=0; n <= pcibios_last_bus; n++)
+ pcibios_scan_specific_bus(n);
}
int __init pci_legacy_init(void)
return 0;
}
+void pcibios_scan_specific_bus(int busn)
+{
+ int devfn;
+ long node;
+ u32 l;
+
+ if (pci_find_bus(0, busn))
+ return;
+
+ node = get_mp_bus_to_node(busn);
+ for (devfn = 0; devfn < 256; devfn += 8) {
+ if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
+ l != 0x0000 && l != 0xffff) {
+ DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
+ printk(KERN_INFO "PCI: Discovered peer bus %02x\n", busn);
+ pci_scan_bus_on_node(busn, &pci_root_ops, node);
+ return;
+ }
+ }
+}
+EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus);
+
int __init pci_subsys_init(void)
{
/*
* make sure we exit gracefully rather than endlessly redo
* the fault.
*/
-survive:
fault = handle_mm_fault(mm, vma, address, is_write ? FAULT_FLAG_WRITE : 0);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
*/
out_of_memory:
up_read(&mm->mmap_sem);
- if (is_global_init(current)) {
- yield();
- down_read(&mm->mmap_sem);
- goto survive;
- }
- printk("VM: killing process %s\n", current->comm);
- if (user_mode(regs))
- do_group_exit(SIGKILL);
- bad_page_fault(regs, address, SIGKILL);
+ if (!user_mode(regs))
+ bad_page_fault(regs, address, SIGKILL);
+ else
+ pagefault_out_of_memory();
return;
do_sigbus:
{
struct request_list *rl = &q->rq;
+ if (unlikely(rl->rq_pool))
+ return 0;
+
rl->count[BLK_RW_SYNC] = rl->count[BLK_RW_ASYNC] = 0;
rl->starved[BLK_RW_SYNC] = rl->starved[BLK_RW_ASYNC] = 0;
rl->elvpriv = 0;
struct request_queue *
blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
{
- struct request_queue *q = blk_alloc_queue_node(GFP_KERNEL, node_id);
+ struct request_queue *uninit_q, *q;
+
+ uninit_q = blk_alloc_queue_node(GFP_KERNEL, node_id);
+ if (!uninit_q)
+ return NULL;
+
+ q = blk_init_allocated_queue_node(uninit_q, rfn, lock, node_id);
+ if (!q)
+ blk_cleanup_queue(uninit_q);
- return blk_init_allocated_queue_node(q, rfn, lock, node_id);
+ return q;
}
EXPORT_SYMBOL(blk_init_queue_node);
return NULL;
q->node = node_id;
- if (blk_init_free_list(q)) {
- kmem_cache_free(blk_requestq_cachep, q);
+ if (blk_init_free_list(q))
return NULL;
- }
q->request_fn = rfn;
q->prep_rq_fn = NULL;
return q;
}
- blk_put_queue(q);
return NULL;
}
EXPORT_SYMBOL(blk_init_allocated_queue_node);
static struct completion *ioc_gone;
static DEFINE_SPINLOCK(ioc_gone_lock);
+static DEFINE_SPINLOCK(cic_index_lock);
+static DEFINE_IDA(cic_index_ida);
+
#define CFQ_PRIO_LISTS IOPRIO_BE_NR
#define cfq_class_idle(cfqq) ((cfqq)->ioprio_class == IOPRIO_CLASS_IDLE)
#define cfq_class_rt(cfqq) ((cfqq)->ioprio_class == IOPRIO_CLASS_RT)
unsigned int cfq_latency;
unsigned int cfq_group_isolation;
+ unsigned int cic_index;
struct list_head cic_list;
/*
cic->cfqq[is_sync] = cfqq;
}
+#define CIC_DEAD_KEY 1ul
+#define CIC_DEAD_INDEX_SHIFT 1
+
+static inline void *cfqd_dead_key(struct cfq_data *cfqd)
+{
+ return (void *)(cfqd->cic_index << CIC_DEAD_INDEX_SHIFT | CIC_DEAD_KEY);
+}
+
+static inline struct cfq_data *cic_to_cfqd(struct cfq_io_context *cic)
+{
+ struct cfq_data *cfqd = cic->key;
+
+ if (unlikely((unsigned long) cfqd & CIC_DEAD_KEY))
+ return NULL;
+
+ return cfqd;
+}
+
/*
* We regard a request as SYNC, if it's either a read or has the SYNC bit
* set (in which case it could also be direct WRITE).
static void cic_free_func(struct io_context *ioc, struct cfq_io_context *cic)
{
unsigned long flags;
+ unsigned long dead_key = (unsigned long) cic->key;
- BUG_ON(!cic->dead_key);
+ BUG_ON(!(dead_key & CIC_DEAD_KEY));
spin_lock_irqsave(&ioc->lock, flags);
- radix_tree_delete(&ioc->radix_root, cic->dead_key);
+ radix_tree_delete(&ioc->radix_root, dead_key >> CIC_DEAD_INDEX_SHIFT);
hlist_del_rcu(&cic->cic_list);
spin_unlock_irqrestore(&ioc->lock, flags);
__call_for_each_cic(ioc, cic_free_func);
}
-static void cfq_exit_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
+static void cfq_put_cooperator(struct cfq_queue *cfqq)
{
struct cfq_queue *__cfqq, *next;
- if (unlikely(cfqq == cfqd->active_queue)) {
- __cfq_slice_expired(cfqd, cfqq, 0);
- cfq_schedule_dispatch(cfqd);
- }
-
/*
* If this queue was scheduled to merge with another queue, be
* sure to drop the reference taken on that queue (and others in
cfq_put_queue(__cfqq);
__cfqq = next;
}
+}
+
+static void cfq_exit_cfqq(struct cfq_data *cfqd, struct cfq_queue *cfqq)
+{
+ if (unlikely(cfqq == cfqd->active_queue)) {
+ __cfq_slice_expired(cfqd, cfqq, 0);
+ cfq_schedule_dispatch(cfqd);
+ }
+
+ cfq_put_cooperator(cfqq);
cfq_put_queue(cfqq);
}
list_del_init(&cic->queue_list);
/*
- * Make sure key == NULL is seen for dead queues
+ * Make sure dead mark is seen for dead queues
*/
smp_wmb();
- cic->dead_key = (unsigned long) cic->key;
- cic->key = NULL;
+ cic->key = cfqd_dead_key(cfqd);
if (ioc->ioc_data == cic)
rcu_assign_pointer(ioc->ioc_data, NULL);
static void cfq_exit_single_io_context(struct io_context *ioc,
struct cfq_io_context *cic)
{
- struct cfq_data *cfqd = cic->key;
+ struct cfq_data *cfqd = cic_to_cfqd(cic);
if (cfqd) {
struct request_queue *q = cfqd->queue;
* race between exiting task and queue
*/
smp_read_barrier_depends();
- if (cic->key)
+ if (cic->key == cfqd)
__cfq_exit_single_io_context(cfqd, cic);
spin_unlock_irqrestore(q->queue_lock, flags);
static void changed_ioprio(struct io_context *ioc, struct cfq_io_context *cic)
{
- struct cfq_data *cfqd = cic->key;
+ struct cfq_data *cfqd = cic_to_cfqd(cic);
struct cfq_queue *cfqq;
unsigned long flags;
static void changed_cgroup(struct io_context *ioc, struct cfq_io_context *cic)
{
struct cfq_queue *sync_cfqq = cic_to_cfqq(cic, 1);
- struct cfq_data *cfqd = cic->key;
+ struct cfq_data *cfqd = cic_to_cfqd(cic);
unsigned long flags;
struct request_queue *q;
unsigned long flags;
WARN_ON(!list_empty(&cic->queue_list));
+ BUG_ON(cic->key != cfqd_dead_key(cfqd));
spin_lock_irqsave(&ioc->lock, flags);
BUG_ON(ioc->ioc_data == cic);
- radix_tree_delete(&ioc->radix_root, (unsigned long) cfqd);
+ radix_tree_delete(&ioc->radix_root, cfqd->cic_index);
hlist_del_rcu(&cic->cic_list);
spin_unlock_irqrestore(&ioc->lock, flags);
{
struct cfq_io_context *cic;
unsigned long flags;
- void *k;
if (unlikely(!ioc))
return NULL;
}
do {
- cic = radix_tree_lookup(&ioc->radix_root, (unsigned long) cfqd);
+ cic = radix_tree_lookup(&ioc->radix_root, cfqd->cic_index);
rcu_read_unlock();
if (!cic)
break;
- /* ->key must be copied to avoid race with cfq_exit_queue() */
- k = cic->key;
- if (unlikely(!k)) {
+ if (unlikely(cic->key != cfqd)) {
cfq_drop_dead_cic(cfqd, ioc, cic);
rcu_read_lock();
continue;
spin_lock_irqsave(&ioc->lock, flags);
ret = radix_tree_insert(&ioc->radix_root,
- (unsigned long) cfqd, cic);
+ cfqd->cic_index, cic);
if (!ret)
hlist_add_head_rcu(&cic->cic_list, &ioc->cic_list);
spin_unlock_irqrestore(&ioc->lock, flags);
}
cic_set_cfqq(cic, NULL, 1);
+
+ cfq_put_cooperator(cfqq);
+
cfq_put_queue(cfqq);
return NULL;
}
cfq_shutdown_timer_wq(cfqd);
+ spin_lock(&cic_index_lock);
+ ida_remove(&cic_index_ida, cfqd->cic_index);
+ spin_unlock(&cic_index_lock);
+
/* Wait for cfqg->blkg->key accessors to exit their grace periods. */
call_rcu(&cfqd->rcu, cfq_cfqd_free);
}
+static int cfq_alloc_cic_index(void)
+{
+ int index, error;
+
+ do {
+ if (!ida_pre_get(&cic_index_ida, GFP_KERNEL))
+ return -ENOMEM;
+
+ spin_lock(&cic_index_lock);
+ error = ida_get_new(&cic_index_ida, &index);
+ spin_unlock(&cic_index_lock);
+ if (error && error != -EAGAIN)
+ return error;
+ } while (error);
+
+ return index;
+}
+
static void *cfq_init_queue(struct request_queue *q)
{
struct cfq_data *cfqd;
struct cfq_group *cfqg;
struct cfq_rb_root *st;
+ i = cfq_alloc_cic_index();
+ if (i < 0)
+ return NULL;
+
cfqd = kmalloc_node(sizeof(*cfqd), GFP_KERNEL | __GFP_ZERO, q->node);
if (!cfqd)
return NULL;
+ cfqd->cic_index = i;
+
/* Init root service tree */
cfqd->grp_service_tree = CFQ_RB_ROOT;
*/
if (elv_ioc_count_read(cfq_ioc_count))
wait_for_completion(&all_gone);
+ ida_destroy(&cic_index_ida);
cfq_slab_kill();
}
{
struct elevator_type *e = NULL;
struct elevator_queue *eq;
- int ret = 0;
void *data;
+ if (unlikely(q->elevator))
+ return 0;
+
INIT_LIST_HEAD(&q->queue_head);
q->last_merge = NULL;
q->end_sector = 0;
}
elevator_attach(q, eq, data);
- return ret;
+ return 0;
}
EXPORT_SYMBOL(elevator_init);
struct elevator_type *__e;
int len = 0;
- if (!q->elevator)
+ if (!q->elevator || !blk_queue_stackable(q))
return sprintf(name, "none\n");
elv = e->elevator_type;
EC_FLAGS_GPE_STORM, /* GPE storm detected */
EC_FLAGS_HANDLERS_INSTALLED, /* Handlers for GPE and
* OpReg are installed */
- EC_FLAGS_FROZEN, /* Transactions are suspended */
+ EC_FLAGS_BLOCKED, /* Transactions are blocked */
};
/* If we find an EC via the ECDT, we need to keep a ptr to its context */
if (t->rdata)
memset(t->rdata, 0, t->rlen);
mutex_lock(&ec->lock);
- if (test_bit(EC_FLAGS_FROZEN, &ec->flags)) {
+ if (test_bit(EC_FLAGS_BLOCKED, &ec->flags)) {
status = -EINVAL;
goto unlock;
}
EXPORT_SYMBOL(ec_transaction);
-void acpi_ec_suspend_transactions(void)
+void acpi_ec_block_transactions(void)
{
struct acpi_ec *ec = first_ec;
mutex_lock(&ec->lock);
/* Prevent transactions from being carried out */
- set_bit(EC_FLAGS_FROZEN, &ec->flags);
+ set_bit(EC_FLAGS_BLOCKED, &ec->flags);
mutex_unlock(&ec->lock);
}
-void acpi_ec_resume_transactions(void)
+void acpi_ec_unblock_transactions(void)
{
struct acpi_ec *ec = first_ec;
mutex_lock(&ec->lock);
/* Allow transactions to be carried out again */
- clear_bit(EC_FLAGS_FROZEN, &ec->flags);
+ clear_bit(EC_FLAGS_BLOCKED, &ec->flags);
mutex_unlock(&ec->lock);
}
+void acpi_ec_unblock_transactions_early(void)
+{
+ /*
+ * Allow transactions to happen again (this function is called from
+ * atomic context during wakeup, so we don't need to acquire the mutex).
+ */
+ if (first_ec)
+ clear_bit(EC_FLAGS_BLOCKED, &first_ec->flags);
+}
+
static int acpi_ec_query_unlocked(struct acpi_ec *ec, u8 * data)
{
int result;
int acpi_ec_init(void);
int acpi_ec_ecdt_probe(void);
int acpi_boot_ec_enable(void);
-void acpi_ec_suspend_transactions(void);
-void acpi_ec_resume_transactions(void);
+void acpi_ec_block_transactions(void);
+void acpi_ec_unblock_transactions(void);
+void acpi_ec_unblock_transactions_early(void);
/*--------------------------------------------------------------------------
Suspend/Resume
static unsigned int latency_factor __read_mostly = 2;
module_param(latency_factor, uint, 0644);
-static s64 us_to_pm_timer_ticks(s64 t)
+static u64 us_to_pm_timer_ticks(s64 t)
{
return div64_u64(t * PM_TIMER_FREQUENCY, 1000000);
}
seq_puts(seq, "demotion[--] ");
- seq_printf(seq, "latency[%03d] usage[%08d] duration[%020llu]\n",
+ seq_printf(seq, "latency[%03d] usage[%08d] duration[%020Lu]\n",
pr->power.states[i].latency,
pr->power.states[i].usage,
- (unsigned long long)pr->power.states[i].time);
+ us_to_pm_timer_ticks(pr->power.states[i].time));
}
end:
ktime_t kt1, kt2;
s64 idle_time_ns;
s64 idle_time;
- s64 sleep_ticks = 0;
pr = __get_cpu_var(processors);
idle_time = idle_time_ns;
do_div(idle_time, NSEC_PER_USEC);
- sleep_ticks = us_to_pm_timer_ticks(idle_time);
-
/* Tell the scheduler how much we idled: */
sched_clock_idle_wakeup_event(idle_time_ns);
cx->usage++;
lapic_timer_state_broadcast(pr, cx, 0);
- cx->time += sleep_ticks;
+ cx->time += idle_time;
return idle_time;
}
ktime_t kt1, kt2;
s64 idle_time_ns;
s64 idle_time;
- s64 sleep_ticks = 0;
pr = __get_cpu_var(processors);
spin_unlock(&c3_lock);
}
kt2 = ktime_get_real();
- idle_time_ns = ktime_to_us(ktime_sub(kt2, kt1));
+ idle_time_ns = ktime_to_ns(ktime_sub(kt2, kt1));
idle_time = idle_time_ns;
do_div(idle_time, NSEC_PER_USEC);
- sleep_ticks = us_to_pm_timer_ticks(idle_time);
/* Tell the scheduler how much we idled: */
sched_clock_idle_wakeup_event(idle_time_ns);
cx->usage++;
lapic_timer_state_broadcast(pr, cx, 0);
- cx->time += sleep_ticks;
+ cx->time += idle_time;
return idle_time;
}
}
/**
- * acpi_pm_disable_gpes - Disable the GPEs.
+ * acpi_pm_freeze - Disable the GPEs and suspend EC transactions.
*/
-static int acpi_pm_disable_gpes(void)
+static int acpi_pm_freeze(void)
{
acpi_disable_all_gpes();
+ acpi_os_wait_events_complete(NULL);
+ acpi_ec_block_transactions();
return 0;
}
int error = __acpi_pm_prepare();
if (!error)
- acpi_disable_all_gpes();
+ acpi_pm_freeze();
+
return error;
}
* acpi_leave_sleep_state will reenable specific GPEs later
*/
acpi_disable_all_gpes();
+ /* Allow EC transactions to happen. */
+ acpi_ec_unblock_transactions_early();
local_irq_restore(flags);
printk(KERN_DEBUG "Back to C!\n");
return ACPI_SUCCESS(status) ? 0 : -EFAULT;
}
+static void acpi_suspend_finish(void)
+{
+ acpi_ec_unblock_transactions();
+ acpi_pm_finish();
+}
+
static int acpi_suspend_state_valid(suspend_state_t pm_state)
{
u32 acpi_state;
.begin = acpi_suspend_begin,
.prepare_late = acpi_pm_prepare,
.enter = acpi_suspend_enter,
- .wake = acpi_pm_finish,
+ .wake = acpi_suspend_finish,
.end = acpi_pm_end,
};
static struct platform_suspend_ops acpi_suspend_ops_old = {
.valid = acpi_suspend_state_valid,
.begin = acpi_suspend_begin_old,
- .prepare_late = acpi_pm_disable_gpes,
+ .prepare_late = acpi_pm_freeze,
.enter = acpi_suspend_enter,
- .wake = acpi_pm_finish,
+ .wake = acpi_suspend_finish,
.end = acpi_pm_end,
.recover = acpi_pm_finish,
};
static void acpi_hibernation_finish(void)
{
hibernate_nvs_free();
+ acpi_ec_unblock_transactions();
acpi_pm_finish();
}
}
/* Restore the NVS memory area */
hibernate_nvs_restore();
+ /* Allow EC transactions to happen. */
+ acpi_ec_unblock_transactions_early();
}
-static int acpi_pm_pre_restore(void)
-{
- acpi_disable_all_gpes();
- acpi_os_wait_events_complete(NULL);
- acpi_ec_suspend_transactions();
- return 0;
-}
-
-static void acpi_pm_restore_cleanup(void)
+static void acpi_pm_thaw(void)
{
- acpi_ec_resume_transactions();
+ acpi_ec_unblock_transactions();
acpi_enable_all_runtime_gpes();
}
.prepare = acpi_pm_prepare,
.enter = acpi_hibernation_enter,
.leave = acpi_hibernation_leave,
- .pre_restore = acpi_pm_pre_restore,
- .restore_cleanup = acpi_pm_restore_cleanup,
+ .pre_restore = acpi_pm_freeze,
+ .restore_cleanup = acpi_pm_thaw,
};
/**
static int acpi_hibernation_pre_snapshot_old(void)
{
- int error = acpi_pm_disable_gpes();
-
- if (!error)
- hibernate_nvs_save();
-
- return error;
+ acpi_pm_freeze();
+ hibernate_nvs_save();
+ return 0;
}
/*
.end = acpi_pm_end,
.pre_snapshot = acpi_hibernation_pre_snapshot_old,
.finish = acpi_hibernation_finish,
- .prepare = acpi_pm_disable_gpes,
+ .prepare = acpi_pm_freeze,
.enter = acpi_hibernation_enter,
.leave = acpi_hibernation_leave,
- .pre_restore = acpi_pm_pre_restore,
- .restore_cleanup = acpi_pm_restore_cleanup,
+ .pre_restore = acpi_pm_freeze,
+ .restore_cleanup = acpi_pm_thaw,
.recover = acpi_pm_finish,
};
#endif /* CONFIG_HIBERNATION */
return page;
}
+static void brd_free_page(struct brd_device *brd, sector_t sector)
+{
+ struct page *page;
+ pgoff_t idx;
+
+ spin_lock(&brd->brd_lock);
+ idx = sector >> PAGE_SECTORS_SHIFT;
+ page = radix_tree_delete(&brd->brd_pages, idx);
+ spin_unlock(&brd->brd_lock);
+ if (page)
+ __free_page(page);
+}
+
+static void brd_zero_page(struct brd_device *brd, sector_t sector)
+{
+ struct page *page;
+
+ page = brd_lookup_page(brd, sector);
+ if (page)
+ clear_highpage(page);
+}
+
/*
* Free all backing store pages and radix tree. This must only be called when
* there are no other users of the device.
return 0;
}
+static void discard_from_brd(struct brd_device *brd,
+ sector_t sector, size_t n)
+{
+ while (n >= PAGE_SIZE) {
+ /*
+ * Don't want to actually discard pages here because
+ * re-allocating the pages can result in writeback
+ * deadlocks under heavy load.
+ */
+ if (0)
+ brd_free_page(brd, sector);
+ else
+ brd_zero_page(brd, sector);
+ sector += PAGE_SIZE >> SECTOR_SHIFT;
+ n -= PAGE_SIZE;
+ }
+}
+
/*
* Copy n bytes from src to the brd starting at sector. Does not sleep.
*/
get_capacity(bdev->bd_disk))
goto out;
+ if (unlikely(bio_rw_flagged(bio, BIO_RW_DISCARD))) {
+ err = 0;
+ discard_from_brd(brd, sector, bio->bi_size);
+ goto out;
+ }
+
rw = bio_rw(bio);
if (rw == READA)
rw = READ;
}
#ifdef CONFIG_BLK_DEV_XIP
-static int brd_direct_access (struct block_device *bdev, sector_t sector,
+static int brd_direct_access(struct block_device *bdev, sector_t sector,
void **kaddr, unsigned long *pfn)
{
struct brd_device *brd = bdev->bd_disk->private_data;
blk_queue_max_hw_sectors(brd->brd_queue, 1024);
blk_queue_bounce_limit(brd->brd_queue, BLK_BOUNCE_ANY);
+ brd->brd_queue->limits.discard_granularity = PAGE_SIZE;
+ brd->brd_queue->limits.max_discard_sectors = UINT_MAX;
+ brd->brd_queue->limits.discard_zeroes_data = 1;
+ queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, brd->brd_queue);
+
disk = brd->brd_disk = alloc_disk(1 << part_shift);
if (!disk)
goto out_free_queue;
sa = h->scsi_ctlr;
stk = &sa->cmd_stack;
+ stk->top++;
if (stk->top >= CMD_STACK_SIZE) {
printk("cciss: scsi_cmd_free called too many times.\n");
BUG();
}
- stk->top++;
stk->elem[stk->top] = (struct cciss_scsi_cmd_stack_elem_t *) cmd;
}
struct drbd_work resync_work,
unplug_work,
md_sync_work,
- delay_probe_work,
- uuid_work;
+ delay_probe_work;
struct timer_list resync_timer;
struct timer_list md_sync_timer;
struct timer_list delay_probe_timer;
struct timeval dps_time; /* delay-probes-start-time */
unsigned int dp_volume_last; /* send_cnt of last delay probe */
int c_sync_rate; /* current resync rate after delay_probe magic */
- atomic_t new_c_uuid;
};
static inline struct drbd_conf *minor_to_mdev(unsigned int minor)
extern int w_ov_finished(struct drbd_conf *, struct drbd_work *, int);
extern int w_resync_inactive(struct drbd_conf *, struct drbd_work *, int);
extern int w_resume_next_sg(struct drbd_conf *, struct drbd_work *, int);
-extern int w_io_error(struct drbd_conf *, struct drbd_work *, int);
extern int w_send_write_hint(struct drbd_conf *, struct drbd_work *, int);
extern int w_make_resync_request(struct drbd_conf *, struct drbd_work *, int);
extern int w_send_dblock(struct drbd_conf *, struct drbd_work *, int);
static inline void drbd_tcp_quickack(struct socket *sock)
{
- int __user val = 1;
+ int __user val = 2;
(void) drbd_setsockopt(sock, SOL_TCP, TCP_QUICKACK,
(char __user *)&val, sizeof(val));
}
switch (mdev->ldev->dc.on_io_error) {
case EP_PASS_ON:
if (!forcedetach) {
- if (printk_ratelimit())
+ if (__ratelimit(&drbd_ratelimit_state))
dev_err(DEV, "Local IO failed in %s."
"Passing error on...\n", where);
break;
return 0;
if (test_bit(BITMAP_IO, &mdev->flags))
return 0;
- if (atomic_read(&mdev->new_c_uuid))
- return 0;
return 1;
}
* to avoid races with the reconnect code,
* we need to atomic_inc within the spinlock. */
- if (atomic_read(&mdev->new_c_uuid) && atomic_add_unless(&mdev->new_c_uuid, -1, 1))
- drbd_queue_work_front(&mdev->data.work, &mdev->uuid_work);
-
spin_lock_irq(&mdev->req_lock);
while (!__inc_ap_bio_cond(mdev)) {
prepare_to_wait(&mdev->misc_wait, &wait, TASK_UNINTERRUPTIBLE);
ns.pdsk == D_OUTDATED)) {
if (get_ldev(mdev)) {
if ((ns.role == R_PRIMARY || ns.peer == R_PRIMARY) &&
- mdev->ldev->md.uuid[UI_BITMAP] == 0 && ns.disk >= D_UP_TO_DATE &&
- !atomic_read(&mdev->new_c_uuid))
- atomic_set(&mdev->new_c_uuid, 2);
+ mdev->ldev->md.uuid[UI_BITMAP] == 0 && ns.disk >= D_UP_TO_DATE) {
+ drbd_uuid_new_current(mdev);
+ drbd_send_uuids(mdev);
+ }
put_ldev(mdev);
}
}
if (ns.pdsk < D_INCONSISTENT && get_ldev(mdev)) {
- /* Diskless peer becomes primary or got connected do diskless, primary peer. */
- if (ns.peer == R_PRIMARY && mdev->ldev->md.uuid[UI_BITMAP] == 0 &&
- !atomic_read(&mdev->new_c_uuid))
- atomic_set(&mdev->new_c_uuid, 2);
+ if (ns.peer == R_PRIMARY && mdev->ldev->md.uuid[UI_BITMAP] == 0)
+ drbd_uuid_new_current(mdev);
/* D_DISKLESS Peer becomes secondary */
if (os.peer == R_PRIMARY && ns.peer == R_SECONDARY)
drbd_md_sync(mdev);
}
-static int w_new_current_uuid(struct drbd_conf *mdev, struct drbd_work *w, int cancel)
-{
- if (get_ldev(mdev)) {
- if (mdev->ldev->md.uuid[UI_BITMAP] == 0) {
- drbd_uuid_new_current(mdev);
- if (get_net_conf(mdev)) {
- drbd_send_uuids(mdev);
- put_net_conf(mdev);
- }
- drbd_md_sync(mdev);
- }
- put_ldev(mdev);
- }
- atomic_dec(&mdev->new_c_uuid);
- wake_up(&mdev->misc_wait);
-
- return 1;
-}
static int drbd_thread_setup(void *arg)
{
* with page_count == 0 or PageSlab.
*/
static int _drbd_no_send_page(struct drbd_conf *mdev, struct page *page,
- int offset, size_t size)
+ int offset, size_t size, unsigned msg_flags)
{
- int sent = drbd_send(mdev, mdev->data.socket, kmap(page) + offset, size, 0);
+ int sent = drbd_send(mdev, mdev->data.socket, kmap(page) + offset, size, msg_flags);
kunmap(page);
if (sent == size)
mdev->send_cnt += size>>9;
}
static int _drbd_send_page(struct drbd_conf *mdev, struct page *page,
- int offset, size_t size)
+ int offset, size_t size, unsigned msg_flags)
{
mm_segment_t oldfs = get_fs();
int sent, ok;
* __page_cache_release a page that would actually still be referenced
* by someone, leading to some obscure delayed Oops somewhere else. */
if (disable_sendpage || (page_count(page) < 1) || PageSlab(page))
- return _drbd_no_send_page(mdev, page, offset, size);
+ return _drbd_no_send_page(mdev, page, offset, size, msg_flags);
+ msg_flags |= MSG_NOSIGNAL;
drbd_update_congested(mdev);
set_fs(KERNEL_DS);
do {
sent = mdev->data.socket->ops->sendpage(mdev->data.socket, page,
offset, len,
- MSG_NOSIGNAL);
+ msg_flags);
if (sent == -EAGAIN) {
if (we_should_drop_the_connection(mdev,
mdev->data.socket))
{
struct bio_vec *bvec;
int i;
+ /* hint all but last page with MSG_MORE */
__bio_for_each_segment(bvec, bio, i, 0) {
if (!_drbd_no_send_page(mdev, bvec->bv_page,
- bvec->bv_offset, bvec->bv_len))
+ bvec->bv_offset, bvec->bv_len,
+ i == bio->bi_vcnt -1 ? 0 : MSG_MORE))
return 0;
}
return 1;
{
struct bio_vec *bvec;
int i;
+ /* hint all but last page with MSG_MORE */
__bio_for_each_segment(bvec, bio, i, 0) {
if (!_drbd_send_page(mdev, bvec->bv_page,
- bvec->bv_offset, bvec->bv_len))
+ bvec->bv_offset, bvec->bv_len,
+ i == bio->bi_vcnt -1 ? 0 : MSG_MORE))
return 0;
}
-
return 1;
}
{
struct page *page = e->pages;
unsigned len = e->size;
+ /* hint all but last page with MSG_MORE */
page_chain_for_each(page) {
unsigned l = min_t(unsigned, len, PAGE_SIZE);
- if (!_drbd_send_page(mdev, page, 0, l))
+ if (!_drbd_send_page(mdev, page, 0, l,
+ page_chain_next(page) ? MSG_MORE : 0))
return 0;
len -= l;
}
p.dp_flags = cpu_to_be32(dp_flags);
set_bit(UNPLUG_REMOTE, &mdev->flags);
ok = (sizeof(p) ==
- drbd_send(mdev, mdev->data.socket, &p, sizeof(p), MSG_MORE));
+ drbd_send(mdev, mdev->data.socket, &p, sizeof(p), dgs ? MSG_MORE : 0));
if (ok && dgs) {
dgb = mdev->int_dig_out;
drbd_csum_bio(mdev, mdev->integrity_w_tfm, req->master_bio, dgb);
- ok = drbd_send(mdev, mdev->data.socket, dgb, dgs, MSG_MORE);
+ ok = drbd_send(mdev, mdev->data.socket, dgb, dgs, 0);
}
if (ok) {
if (mdev->net_conf->wire_protocol == DRBD_PROT_A)
return 0;
ok = sizeof(p) == drbd_send(mdev, mdev->data.socket, &p,
- sizeof(p), MSG_MORE);
+ sizeof(p), dgs ? MSG_MORE : 0);
if (ok && dgs) {
dgb = mdev->int_dig_out;
drbd_csum_ee(mdev, mdev->integrity_w_tfm, e, dgb);
- ok = drbd_send(mdev, mdev->data.socket, dgb, dgs, MSG_MORE);
+ ok = drbd_send(mdev, mdev->data.socket, dgb, dgs, 0);
}
if (ok)
ok = _drbd_send_zc_ee(mdev, e);
atomic_set(&mdev->net_cnt, 0);
atomic_set(&mdev->packet_seq, 0);
atomic_set(&mdev->pp_in_use, 0);
- atomic_set(&mdev->new_c_uuid, 0);
mutex_init(&mdev->md_io_mutex);
mutex_init(&mdev->data.mutex);
INIT_LIST_HEAD(&mdev->bm_io_work.w.list);
INIT_LIST_HEAD(&mdev->delay_probes);
INIT_LIST_HEAD(&mdev->delay_probe_work.list);
- INIT_LIST_HEAD(&mdev->uuid_work.list);
mdev->resync_work.cb = w_resync_inactive;
mdev->unplug_work.cb = w_send_write_hint;
mdev->md_sync_work.cb = w_md_sync;
mdev->bm_io_work.w.cb = w_bitmap_io;
mdev->delay_probe_work.cb = w_delay_probes;
- mdev->uuid_work.cb = w_new_current_uuid;
init_timer(&mdev->resync_timer);
init_timer(&mdev->md_sync_timer);
init_timer(&mdev->delay_probe_timer);
if (ret) {
fault_count++;
- if (printk_ratelimit())
+ if (__ratelimit(&drbd_ratelimit_state))
dev_warn(DEV, "***Simulating %s failure\n",
_drbd_fault_str(type));
}
#include <linux/unistd.h>
#include <linux/vmalloc.h>
#include <linux/random.h>
-#include <linux/mm.h>
#include <linux/string.h>
#include <linux/scatterlist.h>
#include "drbd_int.h"
return rv;
}
+/* quoting tcp(7):
+ * On individual connections, the socket buffer size must be set prior to the
+ * listen(2) or connect(2) calls in order to have it take effect.
+ * This is our wrapper to do so.
+ */
+static void drbd_setbufsize(struct socket *sock, unsigned int snd,
+ unsigned int rcv)
+{
+ /* open coded SO_SNDBUF, SO_RCVBUF */
+ if (snd) {
+ sock->sk->sk_sndbuf = snd;
+ sock->sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
+ }
+ if (rcv) {
+ sock->sk->sk_rcvbuf = rcv;
+ sock->sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
+ }
+}
+
static struct socket *drbd_try_connect(struct drbd_conf *mdev)
{
const char *what;
sock->sk->sk_rcvtimeo =
sock->sk->sk_sndtimeo = mdev->net_conf->try_connect_int*HZ;
+ drbd_setbufsize(sock, mdev->net_conf->sndbuf_size,
+ mdev->net_conf->rcvbuf_size);
/* explicitly bind to the configured IP as source IP
* for the outgoing connections.
s_listen->sk->sk_reuse = 1; /* SO_REUSEADDR */
s_listen->sk->sk_rcvtimeo = timeo;
s_listen->sk->sk_sndtimeo = timeo;
+ drbd_setbufsize(s_listen, mdev->net_conf->sndbuf_size,
+ mdev->net_conf->rcvbuf_size);
what = "bind before listen";
err = s_listen->ops->bind(s_listen,
sock->sk->sk_priority = TC_PRIO_INTERACTIVE_BULK;
msock->sk->sk_priority = TC_PRIO_INTERACTIVE;
- if (mdev->net_conf->sndbuf_size) {
- sock->sk->sk_sndbuf = mdev->net_conf->sndbuf_size;
- sock->sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
- }
-
- if (mdev->net_conf->rcvbuf_size) {
- sock->sk->sk_rcvbuf = mdev->net_conf->rcvbuf_size;
- sock->sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
- }
-
/* NOT YET ...
* sock->sk->sk_sndtimeo = mdev->net_conf->timeout*HZ/10;
* sock->sk->sk_rcvtimeo = MAX_SCHEDULE_TIMEOUT;
unsigned n_bios = 0;
unsigned nr_pages = (ds + PAGE_SIZE -1) >> PAGE_SHIFT;
- if (atomic_read(&mdev->new_c_uuid)) {
- if (atomic_add_unless(&mdev->new_c_uuid, -1, 1)) {
- drbd_uuid_new_current(mdev);
- drbd_md_sync(mdev);
-
- atomic_dec(&mdev->new_c_uuid);
- wake_up(&mdev->misc_wait);
- }
- wait_event(mdev->misc_wait, !atomic_read(&mdev->new_c_uuid));
- }
-
/* In most cases, we will only need one bio. But in case the lower
* level restrictions happen to be different at this offset on this
* side than those of the sending peer, we may need to submit the
}
}
- /* if it was a local io error, we want to notify our
- * peer about that, and see if we need to
- * detach the disk and stuff.
- * to avoid allocating some special work
- * struct, reuse the request. */
-
- /* THINK
- * why do we do this not when we detect the error,
- * but delay it until it is "done", i.e. possibly
- * until the next barrier ack? */
-
- if (rw == WRITE &&
- ((s & RQ_LOCAL_MASK) && !(s & RQ_LOCAL_OK))) {
- if (!(req->w.list.next == LIST_POISON1 ||
- list_empty(&req->w.list))) {
- /* DEBUG ASSERT only; if this triggers, we
- * probably corrupt the worker list here */
- dev_err(DEV, "req->w.list.next = %p\n", req->w.list.next);
- dev_err(DEV, "req->w.list.prev = %p\n", req->w.list.prev);
- }
- req->w.cb = w_io_error;
- drbd_queue_work(&mdev->data.work, &req->w);
- /* drbd_req_free() is done in w_io_error */
- } else {
- drbd_req_free(req);
- }
+ drbd_req_free(req);
}
static void queue_barrier(struct drbd_conf *mdev)
req->rq_state |= RQ_LOCAL_COMPLETED;
req->rq_state &= ~RQ_LOCAL_PENDING;
- dev_alert(DEV, "Local WRITE failed sec=%llus size=%u\n",
- (unsigned long long)req->sector, req->size);
- /* and now: check how to handle local io error. */
__drbd_chk_io_error(mdev, FALSE);
_req_may_be_done(req, m);
put_ldev(mdev);
req->rq_state |= RQ_LOCAL_COMPLETED;
req->rq_state &= ~RQ_LOCAL_PENDING;
- dev_alert(DEV, "Local READ failed sec=%llus size=%u\n",
- (unsigned long long)req->sector, req->size);
- /* _req_mod(req,to_be_send); oops, recursion... */
D_ASSERT(!(req->rq_state & RQ_NET_MASK));
- req->rq_state |= RQ_NET_PENDING;
- inc_ap_pending(mdev);
__drbd_chk_io_error(mdev, FALSE);
put_ldev(mdev);
- /* NOTE: if we have no connection,
- * or know the peer has no good data either,
- * then we don't actually need to "queue_for_net_read",
- * but we do so anyways, since the drbd_io_error()
- * and the potential state change to "Diskless"
- * needs to be done from process context */
+ /* no point in retrying if there is no good remote data,
+ * or we have no connection. */
+ if (mdev->state.pdsk != D_UP_TO_DATE) {
+ _req_may_be_done(req, m);
+ break;
+ }
+
+ /* _req_mod(req,to_be_send); oops, recursion... */
+ req->rq_state |= RQ_NET_PENDING;
+ inc_ap_pending(mdev);
/* fall through: _req_mod(req,queue_for_net_read); */
case queue_for_net_read:
_req_may_be_done(req, m);
break;
+ case read_retry_remote_canceled:
+ req->rq_state &= ~RQ_NET_QUEUED;
+ /* fall through, in case we raced with drbd_disconnect */
case connection_lost_while_pending:
/* transfer log cleanup after connection loss */
/* assert something? */
send_failed,
handed_over_to_network,
connection_lost_while_pending,
+ read_retry_remote_canceled,
recv_acked_by_peer,
write_acked_by_peer,
write_acked_by_peer_and_sis, /* and set_in_sync */
enum drbd_req_event what;
int uptodate = bio_flagged(bio, BIO_UPTODATE);
- if (error)
- dev_warn(DEV, "p %s: error=%d\n",
- bio_data_dir(bio) == WRITE ? "write" : "read", error);
if (!error && !uptodate) {
dev_warn(DEV, "p %s: setting error to -EIO\n",
bio_data_dir(bio) == WRITE ? "write" : "read");
complete_master_bio(mdev, &m);
}
-int w_io_error(struct drbd_conf *mdev, struct drbd_work *w, int cancel)
-{
- struct drbd_request *req = container_of(w, struct drbd_request, w);
-
- /* NOTE: mdev->ldev can be NULL by the time we get here! */
- /* D_ASSERT(mdev->ldev->dc.on_io_error != EP_PASS_ON); */
-
- /* the only way this callback is scheduled is from _req_may_be_done,
- * when it is done and had a local write error, see comments there */
- drbd_req_free(req);
-
- return TRUE;
-}
-
int w_read_retry_remote(struct drbd_conf *mdev, struct drbd_work *w, int cancel)
{
struct drbd_request *req = container_of(w, struct drbd_request, w);
* to give the disk the chance to relocate that block */
spin_lock_irq(&mdev->req_lock);
- if (cancel ||
- mdev->state.conn < C_CONNECTED ||
- mdev->state.pdsk <= D_INCONSISTENT) {
- _req_mod(req, send_canceled);
+ if (cancel || mdev->state.pdsk != D_UP_TO_DATE) {
+ _req_mod(req, read_retry_remote_canceled);
spin_unlock_irq(&mdev->req_lock);
- dev_alert(DEV, "WE ARE LOST. Local IO failure, no peer.\n");
return 1;
}
spin_unlock_irq(&mdev->req_lock);
config RAMOOPS
tristate "Log panic/oops to a RAM buffer"
+ depends on HAS_IOMEM
default n
help
This enables panic and oops messages to be logged to a circular
int len;
/* Priority ordering: We should do priority with RR of the groups */
int i = 1;
- unsigned long flags;
- spin_lock_irqsave(&gsm->tx_lock, flags);
while (i < NUM_DLCI) {
struct gsm_dlci *dlci;
if (len == 0)
i++;
}
- spin_unlock_irqrestore(&gsm->tx_lock, flags);
}
/**
static void gsmld_write_wakeup(struct tty_struct *tty)
{
struct gsm_mux *gsm = tty->disc_data;
+ unsigned long flags;
/* Queue poll */
clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
gsm_data_kick(gsm);
- if (gsm->tx_bytes < TX_THRESH_LO)
+ if (gsm->tx_bytes < TX_THRESH_LO) {
+ spin_lock_irqsave(&gsm->tx_lock, flags);
gsm_dlci_data_sweep(gsm);
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ }
}
/**
d = (unsigned short *)(vc->vc_origin + vc->vc_size_row * t);
s = (unsigned short *)(vc->vc_origin + vc->vc_size_row * (t + nr));
scr_memmovew(d, s, (b - t - nr) * vc->vc_size_row);
- scr_memsetw(d + (b - t - nr) * vc->vc_cols, vc->vc_video_erase_char,
+ scr_memsetw(d + (b - t - nr) * vc->vc_size_row, vc->vc_video_erase_char,
vc->vc_size_row * nr);
}
if (!perm)
goto eperm;
ret = copy_from_user(&ui, up, sizeof(struct unimapinit));
- if (!ret)
+ if (ret)
+ ret = -EFAULT;
+ else
con_clear_unimap(vc, &ui);
break;
}
static int sh_cmt_clocksource_enable(struct clocksource *cs)
{
struct sh_cmt_priv *p = cs_to_sh_cmt(cs);
- int ret;
p->total_cycles = 0;
- ret = sh_cmt_start(p, FLAG_CLOCKSOURCE);
- if (ret)
- return ret;
-
- /* TODO: calculate good shift from rate and counter bit width */
- cs->shift = 0;
- cs->mult = clocksource_hz2mult(p->rate, cs->shift);
- return 0;
+ return sh_cmt_start(p, FLAG_CLOCKSOURCE);
}
static void sh_cmt_clocksource_disable(struct clocksource *cs)
cs->resume = sh_cmt_clocksource_resume;
cs->mask = CLOCKSOURCE_MASK(sizeof(unsigned long) * 8);
cs->flags = CLOCK_SOURCE_IS_CONTINUOUS;
+
+ /* clk_get_rate() needs an enabled clock */
+ clk_enable(p->clk);
+ p->rate = clk_get_rate(p->clk) / (p->width == 16) ? 512 : 8;
+ clk_disable(p->clk);
+
+ /* TODO: calculate good shift from rate and counter bit width */
+ cs->shift = 10;
+ cs->mult = clocksource_hz2mult(p->rate, cs->shift);
+
dev_info(&p->pdev->dev, "used as clock source\n");
+
clocksource_register(cs);
+
return 0;
}
static int sh_tmu_clocksource_enable(struct clocksource *cs)
{
struct sh_tmu_priv *p = cs_to_sh_tmu(cs);
- int ret;
-
- ret = sh_tmu_enable(p);
- if (ret)
- return ret;
- /* TODO: calculate good shift from rate and counter bit width */
- cs->shift = 10;
- cs->mult = clocksource_hz2mult(p->rate, cs->shift);
- return 0;
+ return sh_tmu_enable(p);
}
static void sh_tmu_clocksource_disable(struct clocksource *cs)
cs->disable = sh_tmu_clocksource_disable;
cs->mask = CLOCKSOURCE_MASK(32);
cs->flags = CLOCK_SOURCE_IS_CONTINUOUS;
+
+ /* clk_get_rate() needs an enabled clock */
+ clk_enable(p->clk);
+ /* channel will be configured at parent clock / 4 */
+ p->rate = clk_get_rate(p->clk) / 4;
+ clk_disable(p->clk);
+ /* TODO: calculate good shift from rate and counter bit width */
+ cs->shift = 10;
+ cs->mult = clocksource_hz2mult(p->rate, cs->shift);
+
dev_info(&p->pdev->dev, "used as clock source\n");
clocksource_register(cs);
return 0;
occurred so that a particular failing memory module can be
replaced. If unsure, select 'Y'.
+config EDAC_MCE
+ bool
+
config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64) K8, F10h, F11h"
depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && EDAC_DECODE_MCE
Support for error detection and correction the Intel
i5400 MCH chipset (Seaburg).
+config EDAC_I7CORE
+ tristate "Intel i7 Core (Nehalem) processors"
+ depends on EDAC_MM_EDAC && PCI && X86
+ select EDAC_MCE
+ help
+ Support for error detection and correction the Intel
+ i7 Core (Nehalem) Integrated Memory Controller that exists on
+ newer processors like i7 Core, i7 Core Extreme, Xeon 35xx
+ and Xeon 55xx processors.
+
config EDAC_I82860
tristate "Intel 82860"
depends on EDAC_MM_EDAC && PCI && X86_32
obj-$(CONFIG_EDAC) := edac_stub.o
obj-$(CONFIG_EDAC_MM_EDAC) += edac_core.o
+obj-$(CONFIG_EDAC_MCE) += edac_mce.o
edac_core-objs := edac_mc.o edac_device.o edac_mc_sysfs.o edac_pci_sysfs.o
edac_core-objs += edac_module.o edac_device_sysfs.o
obj-$(CONFIG_EDAC_I5000) += i5000_edac.o
obj-$(CONFIG_EDAC_I5100) += i5100_edac.o
obj-$(CONFIG_EDAC_I5400) += i5400_edac.o
+obj-$(CONFIG_EDAC_I7CORE) += i7core_edac.o
obj-$(CONFIG_EDAC_E7XXX) += e7xxx_edac.o
obj-$(CONFIG_EDAC_E752X) += e752x_edac.o
obj-$(CONFIG_EDAC_I82443BXGX) += i82443bxgx_edac.o
struct channel_info *channels;
};
+struct mcidev_sysfs_group {
+ const char *name; /* group name */
+ struct mcidev_sysfs_attribute *mcidev_attr; /* group attributes */
+};
+
+struct mcidev_sysfs_group_kobj {
+ struct list_head list; /* list for all instances within a mc */
+
+ struct kobject kobj; /* kobj for the group */
+
+ struct mcidev_sysfs_group *grp; /* group description table */
+ struct mem_ctl_info *mci; /* the parent */
+};
+
/* mcidev_sysfs_attribute structure
* used for driver sysfs attributes and in mem_ctl_info
* sysfs top level entries
*/
struct mcidev_sysfs_attribute {
- struct attribute attr;
+ /* It should use either attr or grp */
+ struct attribute attr;
+ struct mcidev_sysfs_group *grp; /* Points to a group of attributes */
+
+ /* Ops for show/store values at the attribute - not used on group */
ssize_t (*show)(struct mem_ctl_info *,char *);
ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
};
/* edac sysfs device control */
struct kobject edac_mci_kobj;
+ /* list for all grp instances within a mc */
+ struct list_head grp_kobj_list;
+
/* Additional top controller level attributes, but specified
* by the low level driver.
*
struct mem_ctl_info *mem_ctl_info = to_mci(kobj);
struct mcidev_sysfs_attribute *mcidev_attr = to_mcidev_attr(attr);
+ debugf1("%s() mem_ctl_info %p\n", __func__, mem_ctl_info);
+
if (mcidev_attr->show)
return mcidev_attr->show(mem_ctl_info, buffer);
struct mem_ctl_info *mem_ctl_info = to_mci(kobj);
struct mcidev_sysfs_attribute *mcidev_attr = to_mcidev_attr(attr);
+ debugf1("%s() mem_ctl_info %p\n", __func__, mem_ctl_info);
+
if (mcidev_attr->store)
return mcidev_attr->store(mem_ctl_info, buffer, count);
#define EDAC_DEVICE_SYMLINK "device"
+#define grp_to_mci(k) (container_of(k, struct mcidev_sysfs_group_kobj, kobj)->mci)
+
+/* MCI show/store functions for top most object */
+static ssize_t inst_grp_show(struct kobject *kobj, struct attribute *attr,
+ char *buffer)
+{
+ struct mem_ctl_info *mem_ctl_info = grp_to_mci(kobj);
+ struct mcidev_sysfs_attribute *mcidev_attr = to_mcidev_attr(attr);
+
+ debugf1("%s() mem_ctl_info %p\n", __func__, mem_ctl_info);
+
+ if (mcidev_attr->show)
+ return mcidev_attr->show(mem_ctl_info, buffer);
+
+ return -EIO;
+}
+
+static ssize_t inst_grp_store(struct kobject *kobj, struct attribute *attr,
+ const char *buffer, size_t count)
+{
+ struct mem_ctl_info *mem_ctl_info = grp_to_mci(kobj);
+ struct mcidev_sysfs_attribute *mcidev_attr = to_mcidev_attr(attr);
+
+ debugf1("%s() mem_ctl_info %p\n", __func__, mem_ctl_info);
+
+ if (mcidev_attr->store)
+ return mcidev_attr->store(mem_ctl_info, buffer, count);
+
+ return -EIO;
+}
+
+/* No memory to release for this kobj */
+static void edac_inst_grp_release(struct kobject *kobj)
+{
+ struct mcidev_sysfs_group_kobj *grp;
+ struct mem_ctl_info *mci;
+
+ debugf1("%s()\n", __func__);
+
+ grp = container_of(kobj, struct mcidev_sysfs_group_kobj, kobj);
+ mci = grp->mci;
+
+ kobject_put(&mci->edac_mci_kobj);
+}
+
+/* Intermediate show/store table */
+static struct sysfs_ops inst_grp_ops = {
+ .show = inst_grp_show,
+ .store = inst_grp_store
+};
+
+/* the kobj_type instance for a instance group */
+static struct kobj_type ktype_inst_grp = {
+ .release = edac_inst_grp_release,
+ .sysfs_ops = &inst_grp_ops,
+};
+
+
/*
* edac_create_mci_instance_attributes
- * create MC driver specific attributes at the topmost level
- * directory of this mci instance.
+ * create MC driver specific attributes bellow an specified kobj
+ * This routine calls itself recursively, in order to create an entire
+ * object tree.
*/
-static int edac_create_mci_instance_attributes(struct mem_ctl_info *mci)
+static int edac_create_mci_instance_attributes(struct mem_ctl_info *mci,
+ struct mcidev_sysfs_attribute *sysfs_attrib,
+ struct kobject *kobj)
{
int err;
- struct mcidev_sysfs_attribute *sysfs_attrib;
- /* point to the start of the array and iterate over it
- * adding each attribute listed to this mci instance's kobject
- */
- sysfs_attrib = mci->mc_driver_sysfs_attributes;
+ debugf1("%s()\n", __func__);
+
+ while (sysfs_attrib) {
+ if (sysfs_attrib->grp) {
+ struct mcidev_sysfs_group_kobj *grp_kobj;
+
+ grp_kobj = kzalloc(sizeof(*grp_kobj), GFP_KERNEL);
+ if (!grp_kobj)
+ return -ENOMEM;
+
+ list_add_tail(&grp_kobj->list, &mci->grp_kobj_list);
+
+ grp_kobj->grp = sysfs_attrib->grp;
+ grp_kobj->mci = mci;
+
+ debugf0("%s() grp %s, mci %p\n", __func__,
+ sysfs_attrib->grp->name, mci);
+
+ err = kobject_init_and_add(&grp_kobj->kobj,
+ &ktype_inst_grp,
+ &mci->edac_mci_kobj,
+ sysfs_attrib->grp->name);
+ if (err)
+ return err;
+
+ err = edac_create_mci_instance_attributes(mci,
+ grp_kobj->grp->mcidev_attr,
+ &grp_kobj->kobj);
+
+ if (err)
+ return err;
+ } else if (sysfs_attrib->attr.name) {
+ debugf0("%s() file %s\n", __func__,
+ sysfs_attrib->attr.name);
+
+ err = sysfs_create_file(kobj, &sysfs_attrib->attr);
+ } else
+ break;
- while (sysfs_attrib && sysfs_attrib->attr.name) {
- err = sysfs_create_file(&mci->edac_mci_kobj,
- (struct attribute*) sysfs_attrib);
if (err) {
return err;
}
-
sysfs_attrib++;
}
* remove MC driver specific attributes at the topmost level
* directory of this mci instance.
*/
-static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci)
+static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
+ struct mcidev_sysfs_attribute *sysfs_attrib,
+ struct kobject *kobj, int count)
{
- struct mcidev_sysfs_attribute *sysfs_attrib;
+ struct mcidev_sysfs_group_kobj *grp_kobj, *tmp;
- /* point to the start of the array and iterate over it
- * adding each attribute listed to this mci instance's kobject
- */
- sysfs_attrib = mci->mc_driver_sysfs_attributes;
+ debugf1("%s()\n", __func__);
- /* loop if there are attributes and until we hit a NULL entry */
- while (sysfs_attrib && sysfs_attrib->attr.name) {
- sysfs_remove_file(&mci->edac_mci_kobj,
- (struct attribute *) sysfs_attrib);
+ /*
+ * loop if there are attributes and until we hit a NULL entry
+ * Remove first all the atributes
+ */
+ while (sysfs_attrib) {
+ if (sysfs_attrib->grp) {
+ list_for_each_entry(grp_kobj, &mci->grp_kobj_list,
+ list)
+ if (grp_kobj->grp == sysfs_attrib->grp)
+ edac_remove_mci_instance_attributes(mci,
+ grp_kobj->grp->mcidev_attr,
+ &grp_kobj->kobj, count + 1);
+ } else if (sysfs_attrib->attr.name) {
+ debugf0("%s() file %s\n", __func__,
+ sysfs_attrib->attr.name);
+ sysfs_remove_file(kobj, &sysfs_attrib->attr);
+ } else
+ break;
sysfs_attrib++;
}
+
+ /*
+ * Now that all attributes got removed, it is save to remove all groups
+ */
+ if (!count)
+ list_for_each_entry_safe(grp_kobj, tmp, &mci->grp_kobj_list,
+ list) {
+ debugf0("%s() grp %s\n", __func__, grp_kobj->grp->name);
+ kobject_put(&grp_kobj->kobj);
+ }
}
debugf0("%s() idx=%d\n", __func__, mci->mc_idx);
+ INIT_LIST_HEAD(&mci->grp_kobj_list);
+
/* create a symlink for the device */
err = sysfs_create_link(kobj_mci, &mci->dev->kobj,
EDAC_DEVICE_SYMLINK);
* then create them now for the driver.
*/
if (mci->mc_driver_sysfs_attributes) {
- err = edac_create_mci_instance_attributes(mci);
+ err = edac_create_mci_instance_attributes(mci,
+ mci->mc_driver_sysfs_attributes,
+ &mci->edac_mci_kobj);
if (err) {
debugf1("%s() failure to create mci attributes\n",
__func__);
}
/* remove the mci instance's attributes, if any */
- edac_remove_mci_instance_attributes(mci);
+ edac_remove_mci_instance_attributes(mci,
+ mci->mc_driver_sysfs_attributes, &mci->edac_mci_kobj, 0);
/* remove the symlink */
sysfs_remove_link(kobj_mci, EDAC_DEVICE_SYMLINK);
debugf0("%s() remove_mci_instance\n", __func__);
/* remove this mci instance's attribtes */
- edac_remove_mci_instance_attributes(mci);
-
+ edac_remove_mci_instance_attributes(mci,
+ mci->mc_driver_sysfs_attributes,
+ &mci->edac_mci_kobj, 0);
debugf0("%s() unregister this mci kobj\n", __func__);
/* unregister this instance's kobject */
--- /dev/null
+/* Provides edac interface to mcelog events
+ *
+ * This file may be distributed under the terms of the
+ * GNU General Public License version 2.
+ *
+ * Copyright (c) 2009 by:
+ * Mauro Carvalho Chehab <mchehab@redhat.com>
+ *
+ * Red Hat Inc. http://www.redhat.com
+ */
+
+#include <linux/module.h>
+#include <linux/edac_mce.h>
+#include <asm/mce.h>
+
+int edac_mce_enabled;
+EXPORT_SYMBOL_GPL(edac_mce_enabled);
+
+
+/*
+ * Extension interface
+ */
+
+static LIST_HEAD(edac_mce_list);
+static DEFINE_MUTEX(edac_mce_lock);
+
+int edac_mce_register(struct edac_mce *edac_mce)
+{
+ mutex_lock(&edac_mce_lock);
+ list_add_tail(&edac_mce->list, &edac_mce_list);
+ mutex_unlock(&edac_mce_lock);
+ return 0;
+}
+EXPORT_SYMBOL(edac_mce_register);
+
+void edac_mce_unregister(struct edac_mce *edac_mce)
+{
+ mutex_lock(&edac_mce_lock);
+ list_del(&edac_mce->list);
+ mutex_unlock(&edac_mce_lock);
+}
+EXPORT_SYMBOL(edac_mce_unregister);
+
+int edac_mce_parse(struct mce *mce)
+{
+ struct edac_mce *edac_mce;
+
+ list_for_each_entry(edac_mce, &edac_mce_list, list) {
+ if (edac_mce->check_error(edac_mce->priv, mce))
+ return 1;
+ }
+
+ /* Nobody queued the error */
+ return 0;
+}
+EXPORT_SYMBOL_GPL(edac_mce_parse);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Mauro Carvalho Chehab <mchehab@redhat.com>");
+MODULE_AUTHOR("Red Hat Inc. (http://www.redhat.com)");
+MODULE_DESCRIPTION("EDAC Driver for mcelog captured errors");
--- /dev/null
+/* Intel i7 core/Nehalem Memory Controller kernel module
+ *
+ * This driver supports yhe memory controllers found on the Intel
+ * processor families i7core, i7core 7xx/8xx, i5core, Xeon 35xx,
+ * Xeon 55xx and Xeon 56xx also known as Nehalem, Nehalem-EP, Lynnfield
+ * and Westmere-EP.
+ *
+ * This file may be distributed under the terms of the
+ * GNU General Public License version 2 only.
+ *
+ * Copyright (c) 2009-2010 by:
+ * Mauro Carvalho Chehab <mchehab@redhat.com>
+ *
+ * Red Hat Inc. http://www.redhat.com
+ *
+ * Forked and adapted from the i5400_edac driver
+ *
+ * Based on the following public Intel datasheets:
+ * Intel Core i7 Processor Extreme Edition and Intel Core i7 Processor
+ * Datasheet, Volume 2:
+ * http://download.intel.com/design/processor/datashts/320835.pdf
+ * Intel Xeon Processor 5500 Series Datasheet Volume 2
+ * http://www.intel.com/Assets/PDF/datasheet/321322.pdf
+ * also available at:
+ * http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include <linux/pci_ids.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/edac.h>
+#include <linux/mmzone.h>
+#include <linux/edac_mce.h>
+#include <linux/smp.h>
+#include <asm/processor.h>
+
+#include "edac_core.h"
+
+/*
+ * This is used for Nehalem-EP and Nehalem-EX devices, where the non-core
+ * registers start at bus 255, and are not reported by BIOS.
+ * We currently find devices with only 2 sockets. In order to support more QPI
+ * Quick Path Interconnect, just increment this number.
+ */
+#define MAX_SOCKET_BUSES 2
+
+
+/*
+ * Alter this version for the module when modifications are made
+ */
+#define I7CORE_REVISION " Ver: 1.0.0 " __DATE__
+#define EDAC_MOD_STR "i7core_edac"
+
+/*
+ * Debug macros
+ */
+#define i7core_printk(level, fmt, arg...) \
+ edac_printk(level, "i7core", fmt, ##arg)
+
+#define i7core_mc_printk(mci, level, fmt, arg...) \
+ edac_mc_chipset_printk(mci, level, "i7core", fmt, ##arg)
+
+/*
+ * i7core Memory Controller Registers
+ */
+
+ /* OFFSETS for Device 0 Function 0 */
+
+#define MC_CFG_CONTROL 0x90
+
+ /* OFFSETS for Device 3 Function 0 */
+
+#define MC_CONTROL 0x48
+#define MC_STATUS 0x4c
+#define MC_MAX_DOD 0x64
+
+/*
+ * OFFSETS for Device 3 Function 4, as inicated on Xeon 5500 datasheet:
+ * http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf
+ */
+
+#define MC_TEST_ERR_RCV1 0x60
+ #define DIMM2_COR_ERR(r) ((r) & 0x7fff)
+
+#define MC_TEST_ERR_RCV0 0x64
+ #define DIMM1_COR_ERR(r) (((r) >> 16) & 0x7fff)
+ #define DIMM0_COR_ERR(r) ((r) & 0x7fff)
+
+/* OFFSETS for Device 3 Function 2, as inicated on Xeon 5500 datasheet */
+#define MC_COR_ECC_CNT_0 0x80
+#define MC_COR_ECC_CNT_1 0x84
+#define MC_COR_ECC_CNT_2 0x88
+#define MC_COR_ECC_CNT_3 0x8c
+#define MC_COR_ECC_CNT_4 0x90
+#define MC_COR_ECC_CNT_5 0x94
+
+#define DIMM_TOP_COR_ERR(r) (((r) >> 16) & 0x7fff)
+#define DIMM_BOT_COR_ERR(r) ((r) & 0x7fff)
+
+
+ /* OFFSETS for Devices 4,5 and 6 Function 0 */
+
+#define MC_CHANNEL_DIMM_INIT_PARAMS 0x58
+ #define THREE_DIMMS_PRESENT (1 << 24)
+ #define SINGLE_QUAD_RANK_PRESENT (1 << 23)
+ #define QUAD_RANK_PRESENT (1 << 22)
+ #define REGISTERED_DIMM (1 << 15)
+
+#define MC_CHANNEL_MAPPER 0x60
+ #define RDLCH(r, ch) ((((r) >> (3 + (ch * 6))) & 0x07) - 1)
+ #define WRLCH(r, ch) ((((r) >> (ch * 6)) & 0x07) - 1)
+
+#define MC_CHANNEL_RANK_PRESENT 0x7c
+ #define RANK_PRESENT_MASK 0xffff
+
+#define MC_CHANNEL_ADDR_MATCH 0xf0
+#define MC_CHANNEL_ERROR_MASK 0xf8
+#define MC_CHANNEL_ERROR_INJECT 0xfc
+ #define INJECT_ADDR_PARITY 0x10
+ #define INJECT_ECC 0x08
+ #define MASK_CACHELINE 0x06
+ #define MASK_FULL_CACHELINE 0x06
+ #define MASK_MSB32_CACHELINE 0x04
+ #define MASK_LSB32_CACHELINE 0x02
+ #define NO_MASK_CACHELINE 0x00
+ #define REPEAT_EN 0x01
+
+ /* OFFSETS for Devices 4,5 and 6 Function 1 */
+
+#define MC_DOD_CH_DIMM0 0x48
+#define MC_DOD_CH_DIMM1 0x4c
+#define MC_DOD_CH_DIMM2 0x50
+ #define RANKOFFSET_MASK ((1 << 12) | (1 << 11) | (1 << 10))
+ #define RANKOFFSET(x) ((x & RANKOFFSET_MASK) >> 10)
+ #define DIMM_PRESENT_MASK (1 << 9)
+ #define DIMM_PRESENT(x) (((x) & DIMM_PRESENT_MASK) >> 9)
+ #define MC_DOD_NUMBANK_MASK ((1 << 8) | (1 << 7))
+ #define MC_DOD_NUMBANK(x) (((x) & MC_DOD_NUMBANK_MASK) >> 7)
+ #define MC_DOD_NUMRANK_MASK ((1 << 6) | (1 << 5))
+ #define MC_DOD_NUMRANK(x) (((x) & MC_DOD_NUMRANK_MASK) >> 5)
+ #define MC_DOD_NUMROW_MASK ((1 << 4) | (1 << 3) | (1 << 2))
+ #define MC_DOD_NUMROW(x) (((x) & MC_DOD_NUMROW_MASK) >> 2)
+ #define MC_DOD_NUMCOL_MASK 3
+ #define MC_DOD_NUMCOL(x) ((x) & MC_DOD_NUMCOL_MASK)
+
+#define MC_RANK_PRESENT 0x7c
+
+#define MC_SAG_CH_0 0x80
+#define MC_SAG_CH_1 0x84
+#define MC_SAG_CH_2 0x88
+#define MC_SAG_CH_3 0x8c
+#define MC_SAG_CH_4 0x90
+#define MC_SAG_CH_5 0x94
+#define MC_SAG_CH_6 0x98
+#define MC_SAG_CH_7 0x9c
+
+#define MC_RIR_LIMIT_CH_0 0x40
+#define MC_RIR_LIMIT_CH_1 0x44
+#define MC_RIR_LIMIT_CH_2 0x48
+#define MC_RIR_LIMIT_CH_3 0x4C
+#define MC_RIR_LIMIT_CH_4 0x50
+#define MC_RIR_LIMIT_CH_5 0x54
+#define MC_RIR_LIMIT_CH_6 0x58
+#define MC_RIR_LIMIT_CH_7 0x5C
+#define MC_RIR_LIMIT_MASK ((1 << 10) - 1)
+
+#define MC_RIR_WAY_CH 0x80
+ #define MC_RIR_WAY_OFFSET_MASK (((1 << 14) - 1) & ~0x7)
+ #define MC_RIR_WAY_RANK_MASK 0x7
+
+/*
+ * i7core structs
+ */
+
+#define NUM_CHANS 3
+#define MAX_DIMMS 3 /* Max DIMMS per channel */
+#define MAX_MCR_FUNC 4
+#define MAX_CHAN_FUNC 3
+
+struct i7core_info {
+ u32 mc_control;
+ u32 mc_status;
+ u32 max_dod;
+ u32 ch_map;
+};
+
+
+struct i7core_inject {
+ int enable;
+
+ u32 section;
+ u32 type;
+ u32 eccmask;
+
+ /* Error address mask */
+ int channel, dimm, rank, bank, page, col;
+};
+
+struct i7core_channel {
+ u32 ranks;
+ u32 dimms;
+};
+
+struct pci_id_descr {
+ int dev;
+ int func;
+ int dev_id;
+ int optional;
+};
+
+struct pci_id_table {
+ struct pci_id_descr *descr;
+ int n_devs;
+};
+
+struct i7core_dev {
+ struct list_head list;
+ u8 socket;
+ struct pci_dev **pdev;
+ int n_devs;
+ struct mem_ctl_info *mci;
+};
+
+struct i7core_pvt {
+ struct pci_dev *pci_noncore;
+ struct pci_dev *pci_mcr[MAX_MCR_FUNC + 1];
+ struct pci_dev *pci_ch[NUM_CHANS][MAX_CHAN_FUNC + 1];
+
+ struct i7core_dev *i7core_dev;
+
+ struct i7core_info info;
+ struct i7core_inject inject;
+ struct i7core_channel channel[NUM_CHANS];
+
+ int channels; /* Number of active channels */
+
+ int ce_count_available;
+ int csrow_map[NUM_CHANS][MAX_DIMMS];
+
+ /* ECC corrected errors counts per udimm */
+ unsigned long udimm_ce_count[MAX_DIMMS];
+ int udimm_last_ce_count[MAX_DIMMS];
+ /* ECC corrected errors counts per rdimm */
+ unsigned long rdimm_ce_count[NUM_CHANS][MAX_DIMMS];
+ int rdimm_last_ce_count[NUM_CHANS][MAX_DIMMS];
+
+ unsigned int is_registered;
+
+ /* mcelog glue */
+ struct edac_mce edac_mce;
+
+ /* Fifo double buffers */
+ struct mce mce_entry[MCE_LOG_LEN];
+ struct mce mce_outentry[MCE_LOG_LEN];
+
+ /* Fifo in/out counters */
+ unsigned mce_in, mce_out;
+
+ /* Count indicator to show errors not got */
+ unsigned mce_overrun;
+};
+
+/* Static vars */
+static LIST_HEAD(i7core_edac_list);
+static DEFINE_MUTEX(i7core_edac_lock);
+
+#define PCI_DESCR(device, function, device_id) \
+ .dev = (device), \
+ .func = (function), \
+ .dev_id = (device_id)
+
+struct pci_id_descr pci_dev_descr_i7core_nehalem[] = {
+ /* Memory controller */
+ { PCI_DESCR(3, 0, PCI_DEVICE_ID_INTEL_I7_MCR) },
+ { PCI_DESCR(3, 1, PCI_DEVICE_ID_INTEL_I7_MC_TAD) },
+ /* Exists only for RDIMM */
+ { PCI_DESCR(3, 2, PCI_DEVICE_ID_INTEL_I7_MC_RAS), .optional = 1 },
+ { PCI_DESCR(3, 4, PCI_DEVICE_ID_INTEL_I7_MC_TEST) },
+
+ /* Channel 0 */
+ { PCI_DESCR(4, 0, PCI_DEVICE_ID_INTEL_I7_MC_CH0_CTRL) },
+ { PCI_DESCR(4, 1, PCI_DEVICE_ID_INTEL_I7_MC_CH0_ADDR) },
+ { PCI_DESCR(4, 2, PCI_DEVICE_ID_INTEL_I7_MC_CH0_RANK) },
+ { PCI_DESCR(4, 3, PCI_DEVICE_ID_INTEL_I7_MC_CH0_TC) },
+
+ /* Channel 1 */
+ { PCI_DESCR(5, 0, PCI_DEVICE_ID_INTEL_I7_MC_CH1_CTRL) },
+ { PCI_DESCR(5, 1, PCI_DEVICE_ID_INTEL_I7_MC_CH1_ADDR) },
+ { PCI_DESCR(5, 2, PCI_DEVICE_ID_INTEL_I7_MC_CH1_RANK) },
+ { PCI_DESCR(5, 3, PCI_DEVICE_ID_INTEL_I7_MC_CH1_TC) },
+
+ /* Channel 2 */
+ { PCI_DESCR(6, 0, PCI_DEVICE_ID_INTEL_I7_MC_CH2_CTRL) },
+ { PCI_DESCR(6, 1, PCI_DEVICE_ID_INTEL_I7_MC_CH2_ADDR) },
+ { PCI_DESCR(6, 2, PCI_DEVICE_ID_INTEL_I7_MC_CH2_RANK) },
+ { PCI_DESCR(6, 3, PCI_DEVICE_ID_INTEL_I7_MC_CH2_TC) },
+
+ /* Generic Non-core registers */
+ /*
+ * This is the PCI device on i7core and on Xeon 35xx (8086:2c41)
+ * On Xeon 55xx, however, it has a different id (8086:2c40). So,
+ * the probing code needs to test for the other address in case of
+ * failure of this one
+ */
+ { PCI_DESCR(0, 0, PCI_DEVICE_ID_INTEL_I7_NONCORE) },
+
+};
+
+struct pci_id_descr pci_dev_descr_lynnfield[] = {
+ { PCI_DESCR( 3, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MCR) },
+ { PCI_DESCR( 3, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_TAD) },
+ { PCI_DESCR( 3, 4, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_TEST) },
+
+ { PCI_DESCR( 4, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_CTRL) },
+ { PCI_DESCR( 4, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_ADDR) },
+ { PCI_DESCR( 4, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_RANK) },
+ { PCI_DESCR( 4, 3, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_TC) },
+
+ { PCI_DESCR( 5, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_CTRL) },
+ { PCI_DESCR( 5, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_ADDR) },
+ { PCI_DESCR( 5, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_RANK) },
+ { PCI_DESCR( 5, 3, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_TC) },
+
+ /*
+ * This is the PCI device has an alternate address on some
+ * processors like Core i7 860
+ */
+ { PCI_DESCR( 0, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_NONCORE) },
+};
+
+struct pci_id_descr pci_dev_descr_i7core_westmere[] = {
+ /* Memory controller */
+ { PCI_DESCR(3, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MCR_REV2) },
+ { PCI_DESCR(3, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_TAD_REV2) },
+ /* Exists only for RDIMM */
+ { PCI_DESCR(3, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_RAS_REV2), .optional = 1 },
+ { PCI_DESCR(3, 4, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_TEST_REV2) },
+
+ /* Channel 0 */
+ { PCI_DESCR(4, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_CTRL_REV2) },
+ { PCI_DESCR(4, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_ADDR_REV2) },
+ { PCI_DESCR(4, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_RANK_REV2) },
+ { PCI_DESCR(4, 3, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH0_TC_REV2) },
+
+ /* Channel 1 */
+ { PCI_DESCR(5, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_CTRL_REV2) },
+ { PCI_DESCR(5, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_ADDR_REV2) },
+ { PCI_DESCR(5, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_RANK_REV2) },
+ { PCI_DESCR(5, 3, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH1_TC_REV2) },
+
+ /* Channel 2 */
+ { PCI_DESCR(6, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_CTRL_REV2) },
+ { PCI_DESCR(6, 1, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_ADDR_REV2) },
+ { PCI_DESCR(6, 2, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_RANK_REV2) },
+ { PCI_DESCR(6, 3, PCI_DEVICE_ID_INTEL_LYNNFIELD_MC_CH2_TC_REV2) },
+
+ /* Generic Non-core registers */
+ { PCI_DESCR(0, 0, PCI_DEVICE_ID_INTEL_LYNNFIELD_NONCORE_REV2) },
+
+};
+
+#define PCI_ID_TABLE_ENTRY(A) { A, ARRAY_SIZE(A) }
+struct pci_id_table pci_dev_table[] = {
+ PCI_ID_TABLE_ENTRY(pci_dev_descr_i7core_nehalem),
+ PCI_ID_TABLE_ENTRY(pci_dev_descr_lynnfield),
+ PCI_ID_TABLE_ENTRY(pci_dev_descr_i7core_westmere),
+};
+
+/*
+ * pci_device_id table for which devices we are looking for
+ */
+static const struct pci_device_id i7core_pci_tbl[] __devinitdata = {
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_X58_HUB_MGMT)},
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LYNNFIELD_QPI_LINK0)},
+ {0,} /* 0 terminated list. */
+};
+
+static struct edac_pci_ctl_info *i7core_pci;
+
+/****************************************************************************
+ Anciliary status routines
+ ****************************************************************************/
+
+ /* MC_CONTROL bits */
+#define CH_ACTIVE(pvt, ch) ((pvt)->info.mc_control & (1 << (8 + ch)))
+#define ECCx8(pvt) ((pvt)->info.mc_control & (1 << 1))
+
+ /* MC_STATUS bits */
+#define ECC_ENABLED(pvt) ((pvt)->info.mc_status & (1 << 4))
+#define CH_DISABLED(pvt, ch) ((pvt)->info.mc_status & (1 << ch))
+
+ /* MC_MAX_DOD read functions */
+static inline int numdimms(u32 dimms)
+{
+ return (dimms & 0x3) + 1;
+}
+
+static inline int numrank(u32 rank)
+{
+ static int ranks[4] = { 1, 2, 4, -EINVAL };
+
+ return ranks[rank & 0x3];
+}
+
+static inline int numbank(u32 bank)
+{
+ static int banks[4] = { 4, 8, 16, -EINVAL };
+
+ return banks[bank & 0x3];
+}
+
+static inline int numrow(u32 row)
+{
+ static int rows[8] = {
+ 1 << 12, 1 << 13, 1 << 14, 1 << 15,
+ 1 << 16, -EINVAL, -EINVAL, -EINVAL,
+ };
+
+ return rows[row & 0x7];
+}
+
+static inline int numcol(u32 col)
+{
+ static int cols[8] = {
+ 1 << 10, 1 << 11, 1 << 12, -EINVAL,
+ };
+ return cols[col & 0x3];
+}
+
+static struct i7core_dev *get_i7core_dev(u8 socket)
+{
+ struct i7core_dev *i7core_dev;
+
+ list_for_each_entry(i7core_dev, &i7core_edac_list, list) {
+ if (i7core_dev->socket == socket)
+ return i7core_dev;
+ }
+
+ return NULL;
+}
+
+/****************************************************************************
+ Memory check routines
+ ****************************************************************************/
+static struct pci_dev *get_pdev_slot_func(u8 socket, unsigned slot,
+ unsigned func)
+{
+ struct i7core_dev *i7core_dev = get_i7core_dev(socket);
+ int i;
+
+ if (!i7core_dev)
+ return NULL;
+
+ for (i = 0; i < i7core_dev->n_devs; i++) {
+ if (!i7core_dev->pdev[i])
+ continue;
+
+ if (PCI_SLOT(i7core_dev->pdev[i]->devfn) == slot &&
+ PCI_FUNC(i7core_dev->pdev[i]->devfn) == func) {
+ return i7core_dev->pdev[i];
+ }
+ }
+
+ return NULL;
+}
+
+/**
+ * i7core_get_active_channels() - gets the number of channels and csrows
+ * @socket: Quick Path Interconnect socket
+ * @channels: Number of channels that will be returned
+ * @csrows: Number of csrows found
+ *
+ * Since EDAC core needs to know in advance the number of available channels
+ * and csrows, in order to allocate memory for csrows/channels, it is needed
+ * to run two similar steps. At the first step, implemented on this function,
+ * it checks the number of csrows/channels present at one socket.
+ * this is used in order to properly allocate the size of mci components.
+ *
+ * It should be noticed that none of the current available datasheets explain
+ * or even mention how csrows are seen by the memory controller. So, we need
+ * to add a fake description for csrows.
+ * So, this driver is attributing one DIMM memory for one csrow.
+ */
+static int i7core_get_active_channels(u8 socket, unsigned *channels,
+ unsigned *csrows)
+{
+ struct pci_dev *pdev = NULL;
+ int i, j;
+ u32 status, control;
+
+ *channels = 0;
+ *csrows = 0;
+
+ pdev = get_pdev_slot_func(socket, 3, 0);
+ if (!pdev) {
+ i7core_printk(KERN_ERR, "Couldn't find socket %d fn 3.0!!!\n",
+ socket);
+ return -ENODEV;
+ }
+
+ /* Device 3 function 0 reads */
+ pci_read_config_dword(pdev, MC_STATUS, &status);
+ pci_read_config_dword(pdev, MC_CONTROL, &control);
+
+ for (i = 0; i < NUM_CHANS; i++) {
+ u32 dimm_dod[3];
+ /* Check if the channel is active */
+ if (!(control & (1 << (8 + i))))
+ continue;
+
+ /* Check if the channel is disabled */
+ if (status & (1 << i))
+ continue;
+
+ pdev = get_pdev_slot_func(socket, i + 4, 1);
+ if (!pdev) {
+ i7core_printk(KERN_ERR, "Couldn't find socket %d "
+ "fn %d.%d!!!\n",
+ socket, i + 4, 1);
+ return -ENODEV;
+ }
+ /* Devices 4-6 function 1 */
+ pci_read_config_dword(pdev,
+ MC_DOD_CH_DIMM0, &dimm_dod[0]);
+ pci_read_config_dword(pdev,
+ MC_DOD_CH_DIMM1, &dimm_dod[1]);
+ pci_read_config_dword(pdev,
+ MC_DOD_CH_DIMM2, &dimm_dod[2]);
+
+ (*channels)++;
+
+ for (j = 0; j < 3; j++) {
+ if (!DIMM_PRESENT(dimm_dod[j]))
+ continue;
+ (*csrows)++;
+ }
+ }
+
+ debugf0("Number of active channels on socket %d: %d\n",
+ socket, *channels);
+
+ return 0;
+}
+
+static int get_dimm_config(struct mem_ctl_info *mci, int *csrow)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ struct csrow_info *csr;
+ struct pci_dev *pdev;
+ int i, j;
+ unsigned long last_page = 0;
+ enum edac_type mode;
+ enum mem_type mtype;
+
+ /* Get data from the MC register, function 0 */
+ pdev = pvt->pci_mcr[0];
+ if (!pdev)
+ return -ENODEV;
+
+ /* Device 3 function 0 reads */
+ pci_read_config_dword(pdev, MC_CONTROL, &pvt->info.mc_control);
+ pci_read_config_dword(pdev, MC_STATUS, &pvt->info.mc_status);
+ pci_read_config_dword(pdev, MC_MAX_DOD, &pvt->info.max_dod);
+ pci_read_config_dword(pdev, MC_CHANNEL_MAPPER, &pvt->info.ch_map);
+
+ debugf0("QPI %d control=0x%08x status=0x%08x dod=0x%08x map=0x%08x\n",
+ pvt->i7core_dev->socket, pvt->info.mc_control, pvt->info.mc_status,
+ pvt->info.max_dod, pvt->info.ch_map);
+
+ if (ECC_ENABLED(pvt)) {
+ debugf0("ECC enabled with x%d SDCC\n", ECCx8(pvt) ? 8 : 4);
+ if (ECCx8(pvt))
+ mode = EDAC_S8ECD8ED;
+ else
+ mode = EDAC_S4ECD4ED;
+ } else {
+ debugf0("ECC disabled\n");
+ mode = EDAC_NONE;
+ }
+
+ /* FIXME: need to handle the error codes */
+ debugf0("DOD Max limits: DIMMS: %d, %d-ranked, %d-banked "
+ "x%x x 0x%x\n",
+ numdimms(pvt->info.max_dod),
+ numrank(pvt->info.max_dod >> 2),
+ numbank(pvt->info.max_dod >> 4),
+ numrow(pvt->info.max_dod >> 6),
+ numcol(pvt->info.max_dod >> 9));
+
+ for (i = 0; i < NUM_CHANS; i++) {
+ u32 data, dimm_dod[3], value[8];
+
+ if (!pvt->pci_ch[i][0])
+ continue;
+
+ if (!CH_ACTIVE(pvt, i)) {
+ debugf0("Channel %i is not active\n", i);
+ continue;
+ }
+ if (CH_DISABLED(pvt, i)) {
+ debugf0("Channel %i is disabled\n", i);
+ continue;
+ }
+
+ /* Devices 4-6 function 0 */
+ pci_read_config_dword(pvt->pci_ch[i][0],
+ MC_CHANNEL_DIMM_INIT_PARAMS, &data);
+
+ pvt->channel[i].ranks = (data & QUAD_RANK_PRESENT) ?
+ 4 : 2;
+
+ if (data & REGISTERED_DIMM)
+ mtype = MEM_RDDR3;
+ else
+ mtype = MEM_DDR3;
+#if 0
+ if (data & THREE_DIMMS_PRESENT)
+ pvt->channel[i].dimms = 3;
+ else if (data & SINGLE_QUAD_RANK_PRESENT)
+ pvt->channel[i].dimms = 1;
+ else
+ pvt->channel[i].dimms = 2;
+#endif
+
+ /* Devices 4-6 function 1 */
+ pci_read_config_dword(pvt->pci_ch[i][1],
+ MC_DOD_CH_DIMM0, &dimm_dod[0]);
+ pci_read_config_dword(pvt->pci_ch[i][1],
+ MC_DOD_CH_DIMM1, &dimm_dod[1]);
+ pci_read_config_dword(pvt->pci_ch[i][1],
+ MC_DOD_CH_DIMM2, &dimm_dod[2]);
+
+ debugf0("Ch%d phy rd%d, wr%d (0x%08x): "
+ "%d ranks, %cDIMMs\n",
+ i,
+ RDLCH(pvt->info.ch_map, i), WRLCH(pvt->info.ch_map, i),
+ data,
+ pvt->channel[i].ranks,
+ (data & REGISTERED_DIMM) ? 'R' : 'U');
+
+ for (j = 0; j < 3; j++) {
+ u32 banks, ranks, rows, cols;
+ u32 size, npages;
+
+ if (!DIMM_PRESENT(dimm_dod[j]))
+ continue;
+
+ banks = numbank(MC_DOD_NUMBANK(dimm_dod[j]));
+ ranks = numrank(MC_DOD_NUMRANK(dimm_dod[j]));
+ rows = numrow(MC_DOD_NUMROW(dimm_dod[j]));
+ cols = numcol(MC_DOD_NUMCOL(dimm_dod[j]));
+
+ /* DDR3 has 8 I/O banks */
+ size = (rows * cols * banks * ranks) >> (20 - 3);
+
+ pvt->channel[i].dimms++;
+
+ debugf0("\tdimm %d %d Mb offset: %x, "
+ "bank: %d, rank: %d, row: %#x, col: %#x\n",
+ j, size,
+ RANKOFFSET(dimm_dod[j]),
+ banks, ranks, rows, cols);
+
+#if PAGE_SHIFT > 20
+ npages = size >> (PAGE_SHIFT - 20);
+#else
+ npages = size << (20 - PAGE_SHIFT);
+#endif
+
+ csr = &mci->csrows[*csrow];
+ csr->first_page = last_page + 1;
+ last_page += npages;
+ csr->last_page = last_page;
+ csr->nr_pages = npages;
+
+ csr->page_mask = 0;
+ csr->grain = 8;
+ csr->csrow_idx = *csrow;
+ csr->nr_channels = 1;
+
+ csr->channels[0].chan_idx = i;
+ csr->channels[0].ce_count = 0;
+
+ pvt->csrow_map[i][j] = *csrow;
+
+ switch (banks) {
+ case 4:
+ csr->dtype = DEV_X4;
+ break;
+ case 8:
+ csr->dtype = DEV_X8;
+ break;
+ case 16:
+ csr->dtype = DEV_X16;
+ break;
+ default:
+ csr->dtype = DEV_UNKNOWN;
+ }
+
+ csr->edac_mode = mode;
+ csr->mtype = mtype;
+
+ (*csrow)++;
+ }
+
+ pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
+ pci_read_config_dword(pdev, MC_SAG_CH_1, &value[1]);
+ pci_read_config_dword(pdev, MC_SAG_CH_2, &value[2]);
+ pci_read_config_dword(pdev, MC_SAG_CH_3, &value[3]);
+ pci_read_config_dword(pdev, MC_SAG_CH_4, &value[4]);
+ pci_read_config_dword(pdev, MC_SAG_CH_5, &value[5]);
+ pci_read_config_dword(pdev, MC_SAG_CH_6, &value[6]);
+ pci_read_config_dword(pdev, MC_SAG_CH_7, &value[7]);
+ debugf1("\t[%i] DIVBY3\tREMOVED\tOFFSET\n", i);
+ for (j = 0; j < 8; j++)
+ debugf1("\t\t%#x\t%#x\t%#x\n",
+ (value[j] >> 27) & 0x1,
+ (value[j] >> 24) & 0x7,
+ (value[j] && ((1 << 24) - 1)));
+ }
+
+ return 0;
+}
+
+/****************************************************************************
+ Error insertion routines
+ ****************************************************************************/
+
+/* The i7core has independent error injection features per channel.
+ However, to have a simpler code, we don't allow enabling error injection
+ on more than one channel.
+ Also, since a change at an inject parameter will be applied only at enable,
+ we're disabling error injection on all write calls to the sysfs nodes that
+ controls the error code injection.
+ */
+static int disable_inject(struct mem_ctl_info *mci)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+
+ pvt->inject.enable = 0;
+
+ if (!pvt->pci_ch[pvt->inject.channel][0])
+ return -ENODEV;
+
+ pci_write_config_dword(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ERROR_INJECT, 0);
+
+ return 0;
+}
+
+/*
+ * i7core inject inject.section
+ *
+ * accept and store error injection inject.section value
+ * bit 0 - refers to the lower 32-byte half cacheline
+ * bit 1 - refers to the upper 32-byte half cacheline
+ */
+static ssize_t i7core_inject_section_store(struct mem_ctl_info *mci,
+ const char *data, size_t count)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ unsigned long value;
+ int rc;
+
+ if (pvt->inject.enable)
+ disable_inject(mci);
+
+ rc = strict_strtoul(data, 10, &value);
+ if ((rc < 0) || (value > 3))
+ return -EIO;
+
+ pvt->inject.section = (u32) value;
+ return count;
+}
+
+static ssize_t i7core_inject_section_show(struct mem_ctl_info *mci,
+ char *data)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ return sprintf(data, "0x%08x\n", pvt->inject.section);
+}
+
+/*
+ * i7core inject.type
+ *
+ * accept and store error injection inject.section value
+ * bit 0 - repeat enable - Enable error repetition
+ * bit 1 - inject ECC error
+ * bit 2 - inject parity error
+ */
+static ssize_t i7core_inject_type_store(struct mem_ctl_info *mci,
+ const char *data, size_t count)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ unsigned long value;
+ int rc;
+
+ if (pvt->inject.enable)
+ disable_inject(mci);
+
+ rc = strict_strtoul(data, 10, &value);
+ if ((rc < 0) || (value > 7))
+ return -EIO;
+
+ pvt->inject.type = (u32) value;
+ return count;
+}
+
+static ssize_t i7core_inject_type_show(struct mem_ctl_info *mci,
+ char *data)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ return sprintf(data, "0x%08x\n", pvt->inject.type);
+}
+
+/*
+ * i7core_inject_inject.eccmask_store
+ *
+ * The type of error (UE/CE) will depend on the inject.eccmask value:
+ * Any bits set to a 1 will flip the corresponding ECC bit
+ * Correctable errors can be injected by flipping 1 bit or the bits within
+ * a symbol pair (2 consecutive aligned 8-bit pairs - i.e. 7:0 and 15:8 or
+ * 23:16 and 31:24). Flipping bits in two symbol pairs will cause an
+ * uncorrectable error to be injected.
+ */
+static ssize_t i7core_inject_eccmask_store(struct mem_ctl_info *mci,
+ const char *data, size_t count)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ unsigned long value;
+ int rc;
+
+ if (pvt->inject.enable)
+ disable_inject(mci);
+
+ rc = strict_strtoul(data, 10, &value);
+ if (rc < 0)
+ return -EIO;
+
+ pvt->inject.eccmask = (u32) value;
+ return count;
+}
+
+static ssize_t i7core_inject_eccmask_show(struct mem_ctl_info *mci,
+ char *data)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ return sprintf(data, "0x%08x\n", pvt->inject.eccmask);
+}
+
+/*
+ * i7core_addrmatch
+ *
+ * The type of error (UE/CE) will depend on the inject.eccmask value:
+ * Any bits set to a 1 will flip the corresponding ECC bit
+ * Correctable errors can be injected by flipping 1 bit or the bits within
+ * a symbol pair (2 consecutive aligned 8-bit pairs - i.e. 7:0 and 15:8 or
+ * 23:16 and 31:24). Flipping bits in two symbol pairs will cause an
+ * uncorrectable error to be injected.
+ */
+
+#define DECLARE_ADDR_MATCH(param, limit) \
+static ssize_t i7core_inject_store_##param( \
+ struct mem_ctl_info *mci, \
+ const char *data, size_t count) \
+{ \
+ struct i7core_pvt *pvt; \
+ long value; \
+ int rc; \
+ \
+ debugf1("%s()\n", __func__); \
+ pvt = mci->pvt_info; \
+ \
+ if (pvt->inject.enable) \
+ disable_inject(mci); \
+ \
+ if (!strcasecmp(data, "any") || !strcasecmp(data, "any\n"))\
+ value = -1; \
+ else { \
+ rc = strict_strtoul(data, 10, &value); \
+ if ((rc < 0) || (value >= limit)) \
+ return -EIO; \
+ } \
+ \
+ pvt->inject.param = value; \
+ \
+ return count; \
+} \
+ \
+static ssize_t i7core_inject_show_##param( \
+ struct mem_ctl_info *mci, \
+ char *data) \
+{ \
+ struct i7core_pvt *pvt; \
+ \
+ pvt = mci->pvt_info; \
+ debugf1("%s() pvt=%p\n", __func__, pvt); \
+ if (pvt->inject.param < 0) \
+ return sprintf(data, "any\n"); \
+ else \
+ return sprintf(data, "%d\n", pvt->inject.param);\
+}
+
+#define ATTR_ADDR_MATCH(param) \
+ { \
+ .attr = { \
+ .name = #param, \
+ .mode = (S_IRUGO | S_IWUSR) \
+ }, \
+ .show = i7core_inject_show_##param, \
+ .store = i7core_inject_store_##param, \
+ }
+
+DECLARE_ADDR_MATCH(channel, 3);
+DECLARE_ADDR_MATCH(dimm, 3);
+DECLARE_ADDR_MATCH(rank, 4);
+DECLARE_ADDR_MATCH(bank, 32);
+DECLARE_ADDR_MATCH(page, 0x10000);
+DECLARE_ADDR_MATCH(col, 0x4000);
+
+static int write_and_test(struct pci_dev *dev, int where, u32 val)
+{
+ u32 read;
+ int count;
+
+ debugf0("setting pci %02x:%02x.%x reg=%02x value=%08x\n",
+ dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn),
+ where, val);
+
+ for (count = 0; count < 10; count++) {
+ if (count)
+ msleep(100);
+ pci_write_config_dword(dev, where, val);
+ pci_read_config_dword(dev, where, &read);
+
+ if (read == val)
+ return 0;
+ }
+
+ i7core_printk(KERN_ERR, "Error during set pci %02x:%02x.%x reg=%02x "
+ "write=%08x. Read=%08x\n",
+ dev->bus->number, PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn),
+ where, val, read);
+
+ return -EINVAL;
+}
+
+/*
+ * This routine prepares the Memory Controller for error injection.
+ * The error will be injected when some process tries to write to the
+ * memory that matches the given criteria.
+ * The criteria can be set in terms of a mask where dimm, rank, bank, page
+ * and col can be specified.
+ * A -1 value for any of the mask items will make the MCU to ignore
+ * that matching criteria for error injection.
+ *
+ * It should be noticed that the error will only happen after a write operation
+ * on a memory that matches the condition. if REPEAT_EN is not enabled at
+ * inject mask, then it will produce just one error. Otherwise, it will repeat
+ * until the injectmask would be cleaned.
+ *
+ * FIXME: This routine assumes that MAXNUMDIMMS value of MC_MAX_DOD
+ * is reliable enough to check if the MC is using the
+ * three channels. However, this is not clear at the datasheet.
+ */
+static ssize_t i7core_inject_enable_store(struct mem_ctl_info *mci,
+ const char *data, size_t count)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ u32 injectmask;
+ u64 mask = 0;
+ int rc;
+ long enable;
+
+ if (!pvt->pci_ch[pvt->inject.channel][0])
+ return 0;
+
+ rc = strict_strtoul(data, 10, &enable);
+ if ((rc < 0))
+ return 0;
+
+ if (enable) {
+ pvt->inject.enable = 1;
+ } else {
+ disable_inject(mci);
+ return count;
+ }
+
+ /* Sets pvt->inject.dimm mask */
+ if (pvt->inject.dimm < 0)
+ mask |= 1LL << 41;
+ else {
+ if (pvt->channel[pvt->inject.channel].dimms > 2)
+ mask |= (pvt->inject.dimm & 0x3LL) << 35;
+ else
+ mask |= (pvt->inject.dimm & 0x1LL) << 36;
+ }
+
+ /* Sets pvt->inject.rank mask */
+ if (pvt->inject.rank < 0)
+ mask |= 1LL << 40;
+ else {
+ if (pvt->channel[pvt->inject.channel].dimms > 2)
+ mask |= (pvt->inject.rank & 0x1LL) << 34;
+ else
+ mask |= (pvt->inject.rank & 0x3LL) << 34;
+ }
+
+ /* Sets pvt->inject.bank mask */
+ if (pvt->inject.bank < 0)
+ mask |= 1LL << 39;
+ else
+ mask |= (pvt->inject.bank & 0x15LL) << 30;
+
+ /* Sets pvt->inject.page mask */
+ if (pvt->inject.page < 0)
+ mask |= 1LL << 38;
+ else
+ mask |= (pvt->inject.page & 0xffff) << 14;
+
+ /* Sets pvt->inject.column mask */
+ if (pvt->inject.col < 0)
+ mask |= 1LL << 37;
+ else
+ mask |= (pvt->inject.col & 0x3fff);
+
+ /*
+ * bit 0: REPEAT_EN
+ * bits 1-2: MASK_HALF_CACHELINE
+ * bit 3: INJECT_ECC
+ * bit 4: INJECT_ADDR_PARITY
+ */
+
+ injectmask = (pvt->inject.type & 1) |
+ (pvt->inject.section & 0x3) << 1 |
+ (pvt->inject.type & 0x6) << (3 - 1);
+
+ /* Unlock writes to registers - this register is write only */
+ pci_write_config_dword(pvt->pci_noncore,
+ MC_CFG_CONTROL, 0x2);
+
+ write_and_test(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ADDR_MATCH, mask);
+ write_and_test(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ADDR_MATCH + 4, mask >> 32L);
+
+ write_and_test(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ERROR_MASK, pvt->inject.eccmask);
+
+ write_and_test(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ERROR_INJECT, injectmask);
+
+ /*
+ * This is something undocumented, based on my tests
+ * Without writing 8 to this register, errors aren't injected. Not sure
+ * why.
+ */
+ pci_write_config_dword(pvt->pci_noncore,
+ MC_CFG_CONTROL, 8);
+
+ debugf0("Error inject addr match 0x%016llx, ecc 0x%08x,"
+ " inject 0x%08x\n",
+ mask, pvt->inject.eccmask, injectmask);
+
+
+ return count;
+}
+
+static ssize_t i7core_inject_enable_show(struct mem_ctl_info *mci,
+ char *data)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ u32 injectmask;
+
+ if (!pvt->pci_ch[pvt->inject.channel][0])
+ return 0;
+
+ pci_read_config_dword(pvt->pci_ch[pvt->inject.channel][0],
+ MC_CHANNEL_ERROR_INJECT, &injectmask);
+
+ debugf0("Inject error read: 0x%018x\n", injectmask);
+
+ if (injectmask & 0x0c)
+ pvt->inject.enable = 1;
+
+ return sprintf(data, "%d\n", pvt->inject.enable);
+}
+
+#define DECLARE_COUNTER(param) \
+static ssize_t i7core_show_counter_##param( \
+ struct mem_ctl_info *mci, \
+ char *data) \
+{ \
+ struct i7core_pvt *pvt = mci->pvt_info; \
+ \
+ debugf1("%s() \n", __func__); \
+ if (!pvt->ce_count_available || (pvt->is_registered)) \
+ return sprintf(data, "data unavailable\n"); \
+ return sprintf(data, "%lu\n", \
+ pvt->udimm_ce_count[param]); \
+}
+
+#define ATTR_COUNTER(param) \
+ { \
+ .attr = { \
+ .name = __stringify(udimm##param), \
+ .mode = (S_IRUGO | S_IWUSR) \
+ }, \
+ .show = i7core_show_counter_##param \
+ }
+
+DECLARE_COUNTER(0);
+DECLARE_COUNTER(1);
+DECLARE_COUNTER(2);
+
+/*
+ * Sysfs struct
+ */
+
+
+static struct mcidev_sysfs_attribute i7core_addrmatch_attrs[] = {
+ ATTR_ADDR_MATCH(channel),
+ ATTR_ADDR_MATCH(dimm),
+ ATTR_ADDR_MATCH(rank),
+ ATTR_ADDR_MATCH(bank),
+ ATTR_ADDR_MATCH(page),
+ ATTR_ADDR_MATCH(col),
+ { .attr = { .name = NULL } }
+};
+
+static struct mcidev_sysfs_group i7core_inject_addrmatch = {
+ .name = "inject_addrmatch",
+ .mcidev_attr = i7core_addrmatch_attrs,
+};
+
+static struct mcidev_sysfs_attribute i7core_udimm_counters_attrs[] = {
+ ATTR_COUNTER(0),
+ ATTR_COUNTER(1),
+ ATTR_COUNTER(2),
+};
+
+static struct mcidev_sysfs_group i7core_udimm_counters = {
+ .name = "all_channel_counts",
+ .mcidev_attr = i7core_udimm_counters_attrs,
+};
+
+static struct mcidev_sysfs_attribute i7core_sysfs_attrs[] = {
+ {
+ .attr = {
+ .name = "inject_section",
+ .mode = (S_IRUGO | S_IWUSR)
+ },
+ .show = i7core_inject_section_show,
+ .store = i7core_inject_section_store,
+ }, {
+ .attr = {
+ .name = "inject_type",
+ .mode = (S_IRUGO | S_IWUSR)
+ },
+ .show = i7core_inject_type_show,
+ .store = i7core_inject_type_store,
+ }, {
+ .attr = {
+ .name = "inject_eccmask",
+ .mode = (S_IRUGO | S_IWUSR)
+ },
+ .show = i7core_inject_eccmask_show,
+ .store = i7core_inject_eccmask_store,
+ }, {
+ .grp = &i7core_inject_addrmatch,
+ }, {
+ .attr = {
+ .name = "inject_enable",
+ .mode = (S_IRUGO | S_IWUSR)
+ },
+ .show = i7core_inject_enable_show,
+ .store = i7core_inject_enable_store,
+ },
+ { .attr = { .name = NULL } }, /* Reserved for udimm counters */
+ { .attr = { .name = NULL } }
+};
+
+/****************************************************************************
+ Device initialization routines: put/get, init/exit
+ ****************************************************************************/
+
+/*
+ * i7core_put_devices 'put' all the devices that we have
+ * reserved via 'get'
+ */
+static void i7core_put_devices(struct i7core_dev *i7core_dev)
+{
+ int i;
+
+ debugf0(__FILE__ ": %s()\n", __func__);
+ for (i = 0; i < i7core_dev->n_devs; i++) {
+ struct pci_dev *pdev = i7core_dev->pdev[i];
+ if (!pdev)
+ continue;
+ debugf0("Removing dev %02x:%02x.%d\n",
+ pdev->bus->number,
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+ pci_dev_put(pdev);
+ }
+ kfree(i7core_dev->pdev);
+ list_del(&i7core_dev->list);
+ kfree(i7core_dev);
+}
+
+static void i7core_put_all_devices(void)
+{
+ struct i7core_dev *i7core_dev, *tmp;
+
+ list_for_each_entry_safe(i7core_dev, tmp, &i7core_edac_list, list)
+ i7core_put_devices(i7core_dev);
+}
+
+static void __init i7core_xeon_pci_fixup(struct pci_id_table *table)
+{
+ struct pci_dev *pdev = NULL;
+ int i;
+ /*
+ * On Xeon 55xx, the Intel Quckpath Arch Generic Non-core pci buses
+ * aren't announced by acpi. So, we need to use a legacy scan probing
+ * to detect them
+ */
+ while (table && table->descr) {
+ pdev = pci_get_device(PCI_VENDOR_ID_INTEL, table->descr[0].dev_id, NULL);
+ if (unlikely(!pdev)) {
+ for (i = 0; i < MAX_SOCKET_BUSES; i++)
+ pcibios_scan_specific_bus(255-i);
+ }
+ table++;
+ }
+}
+
+/*
+ * i7core_get_devices Find and perform 'get' operation on the MCH's
+ * device/functions we want to reference for this driver
+ *
+ * Need to 'get' device 16 func 1 and func 2
+ */
+int i7core_get_onedevice(struct pci_dev **prev, int devno,
+ struct pci_id_descr *dev_descr, unsigned n_devs)
+{
+ struct i7core_dev *i7core_dev;
+
+ struct pci_dev *pdev = NULL;
+ u8 bus = 0;
+ u8 socket = 0;
+
+ pdev = pci_get_device(PCI_VENDOR_ID_INTEL,
+ dev_descr->dev_id, *prev);
+
+ /*
+ * On Xeon 55xx, the Intel Quckpath Arch Generic Non-core regs
+ * is at addr 8086:2c40, instead of 8086:2c41. So, we need
+ * to probe for the alternate address in case of failure
+ */
+ if (dev_descr->dev_id == PCI_DEVICE_ID_INTEL_I7_NONCORE && !pdev)
+ pdev = pci_get_device(PCI_VENDOR_ID_INTEL,
+ PCI_DEVICE_ID_INTEL_I7_NONCORE_ALT, *prev);
+
+ if (dev_descr->dev_id == PCI_DEVICE_ID_INTEL_LYNNFIELD_NONCORE && !pdev)
+ pdev = pci_get_device(PCI_VENDOR_ID_INTEL,
+ PCI_DEVICE_ID_INTEL_LYNNFIELD_NONCORE_ALT,
+ *prev);
+
+ if (!pdev) {
+ if (*prev) {
+ *prev = pdev;
+ return 0;
+ }
+
+ if (dev_descr->optional)
+ return 0;
+
+ if (devno == 0)
+ return -ENODEV;
+
+ i7core_printk(KERN_ERR,
+ "Device not found: dev %02x.%d PCI ID %04x:%04x\n",
+ dev_descr->dev, dev_descr->func,
+ PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+
+ /* End of list, leave */
+ return -ENODEV;
+ }
+ bus = pdev->bus->number;
+
+ if (bus == 0x3f)
+ socket = 0;
+ else
+ socket = 255 - bus;
+
+ i7core_dev = get_i7core_dev(socket);
+ if (!i7core_dev) {
+ i7core_dev = kzalloc(sizeof(*i7core_dev), GFP_KERNEL);
+ if (!i7core_dev)
+ return -ENOMEM;
+ i7core_dev->pdev = kzalloc(sizeof(*i7core_dev->pdev) * n_devs,
+ GFP_KERNEL);
+ if (!i7core_dev->pdev) {
+ kfree(i7core_dev);
+ return -ENOMEM;
+ }
+ i7core_dev->socket = socket;
+ i7core_dev->n_devs = n_devs;
+ list_add_tail(&i7core_dev->list, &i7core_edac_list);
+ }
+
+ if (i7core_dev->pdev[devno]) {
+ i7core_printk(KERN_ERR,
+ "Duplicated device for "
+ "dev %02x:%02x.%d PCI ID %04x:%04x\n",
+ bus, dev_descr->dev, dev_descr->func,
+ PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+ pci_dev_put(pdev);
+ return -ENODEV;
+ }
+
+ i7core_dev->pdev[devno] = pdev;
+
+ /* Sanity check */
+ if (unlikely(PCI_SLOT(pdev->devfn) != dev_descr->dev ||
+ PCI_FUNC(pdev->devfn) != dev_descr->func)) {
+ i7core_printk(KERN_ERR,
+ "Device PCI ID %04x:%04x "
+ "has dev %02x:%02x.%d instead of dev %02x:%02x.%d\n",
+ PCI_VENDOR_ID_INTEL, dev_descr->dev_id,
+ bus, PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+ bus, dev_descr->dev, dev_descr->func);
+ return -ENODEV;
+ }
+
+ /* Be sure that the device is enabled */
+ if (unlikely(pci_enable_device(pdev) < 0)) {
+ i7core_printk(KERN_ERR,
+ "Couldn't enable "
+ "dev %02x:%02x.%d PCI ID %04x:%04x\n",
+ bus, dev_descr->dev, dev_descr->func,
+ PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+ return -ENODEV;
+ }
+
+ debugf0("Detected socket %d dev %02x:%02x.%d PCI ID %04x:%04x\n",
+ socket, bus, dev_descr->dev,
+ dev_descr->func,
+ PCI_VENDOR_ID_INTEL, dev_descr->dev_id);
+
+ *prev = pdev;
+
+ return 0;
+}
+
+static int i7core_get_devices(struct pci_id_table *table)
+{
+ int i, rc;
+ struct pci_dev *pdev = NULL;
+ struct pci_id_descr *dev_descr;
+
+ while (table && table->descr) {
+ dev_descr = table->descr;
+ for (i = 0; i < table->n_devs; i++) {
+ pdev = NULL;
+ do {
+ rc = i7core_get_onedevice(&pdev, i, &dev_descr[i],
+ table->n_devs);
+ if (rc < 0) {
+ if (i == 0) {
+ i = table->n_devs;
+ break;
+ }
+ i7core_put_all_devices();
+ return -ENODEV;
+ }
+ } while (pdev);
+ }
+ table++;
+ }
+
+ return 0;
+ return 0;
+}
+
+static int mci_bind_devs(struct mem_ctl_info *mci,
+ struct i7core_dev *i7core_dev)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ struct pci_dev *pdev;
+ int i, func, slot;
+
+ /* Associates i7core_dev and mci for future usage */
+ pvt->i7core_dev = i7core_dev;
+ i7core_dev->mci = mci;
+
+ pvt->is_registered = 0;
+ for (i = 0; i < i7core_dev->n_devs; i++) {
+ pdev = i7core_dev->pdev[i];
+ if (!pdev)
+ continue;
+
+ func = PCI_FUNC(pdev->devfn);
+ slot = PCI_SLOT(pdev->devfn);
+ if (slot == 3) {
+ if (unlikely(func > MAX_MCR_FUNC))
+ goto error;
+ pvt->pci_mcr[func] = pdev;
+ } else if (likely(slot >= 4 && slot < 4 + NUM_CHANS)) {
+ if (unlikely(func > MAX_CHAN_FUNC))
+ goto error;
+ pvt->pci_ch[slot - 4][func] = pdev;
+ } else if (!slot && !func)
+ pvt->pci_noncore = pdev;
+ else
+ goto error;
+
+ debugf0("Associated fn %d.%d, dev = %p, socket %d\n",
+ PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn),
+ pdev, i7core_dev->socket);
+
+ if (PCI_SLOT(pdev->devfn) == 3 &&
+ PCI_FUNC(pdev->devfn) == 2)
+ pvt->is_registered = 1;
+ }
+
+ /*
+ * Add extra nodes to count errors on udimm
+ * For registered memory, this is not needed, since the counters
+ * are already displayed at the standard locations
+ */
+ if (!pvt->is_registered)
+ i7core_sysfs_attrs[ARRAY_SIZE(i7core_sysfs_attrs)-2].grp =
+ &i7core_udimm_counters;
+
+ return 0;
+
+error:
+ i7core_printk(KERN_ERR, "Device %d, function %d "
+ "is out of the expected range\n",
+ slot, func);
+ return -EINVAL;
+}
+
+/****************************************************************************
+ Error check routines
+ ****************************************************************************/
+static void i7core_rdimm_update_csrow(struct mem_ctl_info *mci,
+ int chan, int dimm, int add)
+{
+ char *msg;
+ struct i7core_pvt *pvt = mci->pvt_info;
+ int row = pvt->csrow_map[chan][dimm], i;
+
+ for (i = 0; i < add; i++) {
+ msg = kasprintf(GFP_KERNEL, "Corrected error "
+ "(Socket=%d channel=%d dimm=%d)",
+ pvt->i7core_dev->socket, chan, dimm);
+
+ edac_mc_handle_fbd_ce(mci, row, 0, msg);
+ kfree (msg);
+ }
+}
+
+static void i7core_rdimm_update_ce_count(struct mem_ctl_info *mci,
+ int chan, int new0, int new1, int new2)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ int add0 = 0, add1 = 0, add2 = 0;
+ /* Updates CE counters if it is not the first time here */
+ if (pvt->ce_count_available) {
+ /* Updates CE counters */
+
+ add2 = new2 - pvt->rdimm_last_ce_count[chan][2];
+ add1 = new1 - pvt->rdimm_last_ce_count[chan][1];
+ add0 = new0 - pvt->rdimm_last_ce_count[chan][0];
+
+ if (add2 < 0)
+ add2 += 0x7fff;
+ pvt->rdimm_ce_count[chan][2] += add2;
+
+ if (add1 < 0)
+ add1 += 0x7fff;
+ pvt->rdimm_ce_count[chan][1] += add1;
+
+ if (add0 < 0)
+ add0 += 0x7fff;
+ pvt->rdimm_ce_count[chan][0] += add0;
+ } else
+ pvt->ce_count_available = 1;
+
+ /* Store the new values */
+ pvt->rdimm_last_ce_count[chan][2] = new2;
+ pvt->rdimm_last_ce_count[chan][1] = new1;
+ pvt->rdimm_last_ce_count[chan][0] = new0;
+
+ /*updated the edac core */
+ if (add0 != 0)
+ i7core_rdimm_update_csrow(mci, chan, 0, add0);
+ if (add1 != 0)
+ i7core_rdimm_update_csrow(mci, chan, 1, add1);
+ if (add2 != 0)
+ i7core_rdimm_update_csrow(mci, chan, 2, add2);
+
+}
+
+static void i7core_rdimm_check_mc_ecc_err(struct mem_ctl_info *mci)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ u32 rcv[3][2];
+ int i, new0, new1, new2;
+
+ /*Read DEV 3: FUN 2: MC_COR_ECC_CNT regs directly*/
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_0,
+ &rcv[0][0]);
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_1,
+ &rcv[0][1]);
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_2,
+ &rcv[1][0]);
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_3,
+ &rcv[1][1]);
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_4,
+ &rcv[2][0]);
+ pci_read_config_dword(pvt->pci_mcr[2], MC_COR_ECC_CNT_5,
+ &rcv[2][1]);
+ for (i = 0 ; i < 3; i++) {
+ debugf3("MC_COR_ECC_CNT%d = 0x%x; MC_COR_ECC_CNT%d = 0x%x\n",
+ (i * 2), rcv[i][0], (i * 2) + 1, rcv[i][1]);
+ /*if the channel has 3 dimms*/
+ if (pvt->channel[i].dimms > 2) {
+ new0 = DIMM_BOT_COR_ERR(rcv[i][0]);
+ new1 = DIMM_TOP_COR_ERR(rcv[i][0]);
+ new2 = DIMM_BOT_COR_ERR(rcv[i][1]);
+ } else {
+ new0 = DIMM_TOP_COR_ERR(rcv[i][0]) +
+ DIMM_BOT_COR_ERR(rcv[i][0]);
+ new1 = DIMM_TOP_COR_ERR(rcv[i][1]) +
+ DIMM_BOT_COR_ERR(rcv[i][1]);
+ new2 = 0;
+ }
+
+ i7core_rdimm_update_ce_count(mci, i, new0, new1, new2);
+ }
+}
+
+/* This function is based on the device 3 function 4 registers as described on:
+ * Intel Xeon Processor 5500 Series Datasheet Volume 2
+ * http://www.intel.com/Assets/PDF/datasheet/321322.pdf
+ * also available at:
+ * http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf
+ */
+static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ u32 rcv1, rcv0;
+ int new0, new1, new2;
+
+ if (!pvt->pci_mcr[4]) {
+ debugf0("%s MCR registers not found\n", __func__);
+ return;
+ }
+
+ /* Corrected test errors */
+ pci_read_config_dword(pvt->pci_mcr[4], MC_TEST_ERR_RCV1, &rcv1);
+ pci_read_config_dword(pvt->pci_mcr[4], MC_TEST_ERR_RCV0, &rcv0);
+
+ /* Store the new values */
+ new2 = DIMM2_COR_ERR(rcv1);
+ new1 = DIMM1_COR_ERR(rcv0);
+ new0 = DIMM0_COR_ERR(rcv0);
+
+ /* Updates CE counters if it is not the first time here */
+ if (pvt->ce_count_available) {
+ /* Updates CE counters */
+ int add0, add1, add2;
+
+ add2 = new2 - pvt->udimm_last_ce_count[2];
+ add1 = new1 - pvt->udimm_last_ce_count[1];
+ add0 = new0 - pvt->udimm_last_ce_count[0];
+
+ if (add2 < 0)
+ add2 += 0x7fff;
+ pvt->udimm_ce_count[2] += add2;
+
+ if (add1 < 0)
+ add1 += 0x7fff;
+ pvt->udimm_ce_count[1] += add1;
+
+ if (add0 < 0)
+ add0 += 0x7fff;
+ pvt->udimm_ce_count[0] += add0;
+
+ if (add0 | add1 | add2)
+ i7core_printk(KERN_ERR, "New Corrected error(s): "
+ "dimm0: +%d, dimm1: +%d, dimm2 +%d\n",
+ add0, add1, add2);
+ } else
+ pvt->ce_count_available = 1;
+
+ /* Store the new values */
+ pvt->udimm_last_ce_count[2] = new2;
+ pvt->udimm_last_ce_count[1] = new1;
+ pvt->udimm_last_ce_count[0] = new0;
+}
+
+/*
+ * According with tables E-11 and E-12 of chapter E.3.3 of Intel 64 and IA-32
+ * Architectures Software Developer’s Manual Volume 3B.
+ * Nehalem are defined as family 0x06, model 0x1a
+ *
+ * The MCA registers used here are the following ones:
+ * struct mce field MCA Register
+ * m->status MSR_IA32_MC8_STATUS
+ * m->addr MSR_IA32_MC8_ADDR
+ * m->misc MSR_IA32_MC8_MISC
+ * In the case of Nehalem, the error information is masked at .status and .misc
+ * fields
+ */
+static void i7core_mce_output_error(struct mem_ctl_info *mci,
+ struct mce *m)
+{
+ struct i7core_pvt *pvt = mci->pvt_info;
+ char *type, *optype, *err, *msg;
+ unsigned long error = m->status & 0x1ff0000l;
+ u32 optypenum = (m->status >> 4) & 0x07;
+ u32 core_err_cnt = (m->status >> 38) && 0x7fff;
+ u32 dimm = (m->misc >> 16) & 0x3;
+ u32 channel = (m->misc >> 18) & 0x3;
+ u32 syndrome = m->misc >> 32;
+ u32 errnum = find_first_bit(&error, 32);
+ int csrow;
+
+ if (m->mcgstatus & 1)
+ type = "FATAL";
+ else
+ type = "NON_FATAL";
+
+ switch (optypenum) {
+ case 0:
+ optype = "generic undef request";
+ break;
+