tegra124/nyan: roughly stable code base

nyan: Clock setup.
Reviewed-on: https://chromium-review.googlesource.com/172106
(cherry picked from commit 3697b6454c0aceebcf735436de90ba2441c9b7b1)

tegra124: Call into the mainboard bootblock init if one exists.
Reviewed-on: https://chromium-review.googlesource.com/172581
(cherry picked from commit 3a0cd48a0d1a9ce6b32ed614cd81fb81f5f82aec)

nyan: Add a mainboard specific bootblock.
Reviewed-on: https://chromium-review.googlesource.com/172582
(cherry picked from commit a83d065d660a26fe71ed79879c25f84a1b669f69)

nyan: tegra124: Redistribute the clock code between the mainboard and soc.
Reviewed-on: https://chromium-review.googlesource.com/172583
(cherry picked from commit ea703137fc37befa7d5a65afc982e298a0daca1b)

nyan: Initialize the i2c pins and controllers.
Reviewed-on: https://chromium-review.googlesource.com/172584
(cherry picked from commit 9c10a3074ef834688fea46c03551c2e3e54e44a8)

nyan: Initialize the PMIC.
Reviewed-on: https://chromium-review.googlesource.com/172585
(cherry picked from commit f6be8b0e607e05b73b5e4a84afcf04c879eee88a)

tegra124: add a chip.h and use it in NYAN
Reviewed-on: https://chromium-review.googlesource.com/172773
(cherry picked from commit 4dd5f1f091f2dcae5ce38203bb86c62994609f8f)

tegra: Reorder GPIO register accesses to avoid glitching
Reviewed-on: https://chromium-review.googlesource.com/172730
(cherry picked from commit 61bedbf0f839e19b284d21af2ad10f2ff15e17d5)

tegra: Turn GPIO wrappers into macros to make them easier to write
Reviewed-on: https://chromium-review.googlesource.com/172731
(cherry picked from commit 94550fdfa5a8005d2e6a313041de212ab7ac470c)

tegra: Change GPIO functions to allow variable arguments
Reviewed-on: https://chromium-review.googlesource.com/172916
(cherry picked from commit e95ccd984f718a04b6067ff6ad5049a2cd74466d)

tegra124: Implement starting up the main CPUs.
Reviewed-on: https://chromium-review.googlesource.com/172917
(cherry picked from commit 7c5169a197310e18a3df0f176c499669e3c2bda3)

tegra: Simplify the I2C constants.
Reviewed-on: https://chromium-review.googlesource.com/172953
(cherry picked from commit 130a07c86dfa5ba5ac4580f29db927c91f045c76)

tegra124: Fix SPI base addresses
Reviewed-on: https://chromium-review.googlesource.com/173322
(cherry picked from commit da808e46919ebd3b9f2377a5889f0d5f10b92357)

tegra124: Scrub the clock constants.
Reviewed-on: https://chromium-review.googlesource.com/172954
(cherry picked from commit 9305ff0696a6d556a97f928b8683770833a309a4)

tegra124: add DMA support
Reviewed-on: https://chromium-review.googlesource.com/172951
(cherry picked from commit 4d2a5a56b922ac37d2326d7b139697567aac37b8)

tegra124: add basic SPI driver
Reviewed-on: https://chromium-review.googlesource.com/172952
(cherry picked from commit 5f861f13c7fd2dd881f3cbd0f1b4d4a9994ce429)

tegra124: Add an assembly stub which is run first on the main CPUs.
Reviewed-on: https://chromium-review.googlesource.com/173541
(cherry picked from commit e142b9572a89f43fe984c4fc87e3203f380ff4de)

nyan: tegra124: Set up dynamic cbmem.
Reviewed-on: https://chromium-review.googlesource.com/173542
(cherry picked from commit b6e1a70103446abb5c3440f145617e6566879c6f)

tegra124: Add an soc.c which sets up the chip operations and memory resource.
Reviewed-on: https://chromium-review.googlesource.com/173543
(cherry picked from commit af49a5bd1f589cf053c4808510138aae26e20db4)

tegra124: extend chip.h to include video settings
Reviewed-on: https://chromium-review.googlesource.com/173600
(cherry picked from commit 87687633a2116f58fad7333b3b639cee9089ad29)

tegra124 and nyan: fill in the devicetree a bit more, add defines
Reviewed-on: https://chromium-review.googlesource.com/173684
(cherry picked from commit c107eaca3dea42be89f61690d0d6cb2181acb147)

tegra124: clean-ups for SPI driver
Reviewed-on: https://chromium-review.googlesource.com/173599
(cherry picked from commit 1e2f9fd442ea336bf0663c3c8ea51f771e21beb7)

tegra124: add a #define for DMA alignment size
Reviewed-on: https://chromium-review.googlesource.com/173638
(cherry picked from commit f9dc2a8d8016fa7db974fb6cb01c3275e26832af)

tegra124: Add FIFO transmit functions to SPI driver
Reviewed-on: https://chromium-review.googlesource.com/173639
(cherry picked from commit 97e61f36ad96ce2f9b12a7ef765ee73d3f4285f7)

tegra124: clean-ups for DMA driver
Reviewed-on: https://chromium-review.googlesource.com/173598
(cherry picked from commit 750c0a5d6942748dd21f3a3f884ad94a561e86e0)

tegra124: early display and display code.
Reviewed-on: https://chromium-review.googlesource.com/173622
(cherry picked from commit 651c7ab96b1f136865e4673a120de7afc1218558)

tegra124: Move transfer size handling to spi_xfer()
Reviewed-on: https://chromium-review.googlesource.com/173680
(cherry picked from commit 4a9b7b47b3c09d70063ea843054ffef98f554621)

tegra124: strict error detection and reporting for SPI
Reviewed-on: https://chromium-review.googlesource.com/173681
(cherry picked from commit c056fa954e1dab40a56faec6c50385763a2eb010)

tegra124: add thread-friendly delays to SPI driver
Reviewed-on: https://chromium-review.googlesource.com/173648
(cherry picked from commit c1a321c8f61942801627f895c5db74c518e2aa8e)

Tegra124: Take the SPI1 controller out of reset and enable its clock.
Reviewed-on: https://chromium-review.googlesource.com/173787
(cherry picked from commit c026a3fb861e157f1e17a121fc2ef70b903f36f2)

tegra124: add two more clock setting values
Reviewed-on: https://chromium-review.googlesource.com/173772
(cherry picked from commit 7d79d7dd9f0c1fd7127a7ba41652d809ccff7a57)

nyan: Set up the ChromeOS related GPIOs and SPI bus 1 which goes to the EC.
Reviewed-on: https://chromium-review.googlesource.com/173788
(cherry picked from commit ff172bfe30f75983a1e8efa2ead0a4519583d0a8)

tegra124: Add some stub functions to the Tegra SPI driver.
Reviewed-on: https://chromium-review.googlesource.com/173789
(cherry picked from commit 8bc527aa4afd301c046b0e844c7fa400630af0d2)

tegra124: Build source files into the various stages needed by CONFIG_CHROMEOS.
Reviewed-on: https://chromium-review.googlesource.com/173790
(cherry picked from commit 86a6423b668ca912295c47d8c6e3ef6c6f8c6084)

nyan: Implement the code which reads GPIOs for ChromeOS.
Reviewed-on: https://chromium-review.googlesource.com/173791
(cherry picked from commit 4c394dfbce762574fc79edcb6e4ac6bf346e48a3)

nyan: Enable the CHROMEOS and ChromeOS EC related kconfig options.
Reviewed-on: https://chromium-review.googlesource.com/173792
(cherry picked from commit 2845a4487159aa4b1dba58d977f52c449574fc8e)

Tegra124: SDMMC: Take the SDMMC 3 and 4 out of reset and ungate their clocks.
Reviewed-on: https://chromium-review.googlesource.com/173793
(cherry picked from commit c238b87bcd9d35afd828476d6ee88322ac5d0f88)

tegra124: fix clear_fifo_status() in SPI driver
Reviewed-on: https://chromium-review.googlesource.com/173738
(cherry picked from commit f415d2c0aaffc0f1a3592551a2db782d538f8f4f)

ARM: Include stdint.h in cpu.h.
Reviewed-on: https://chromium-review.googlesource.com/173774
(cherry picked from commit f1930faea3f14b2a2560a6c4058ef38532b6f1a6)

tegra124: When setting up the main CPU, set its CPSR appropriately.
Reviewed-on: https://chromium-review.googlesource.com/173775
(cherry picked from commit bc2ba9c15cfd22aeaca4f80b1d13a8b5e0178ead)

tegra124: fix wrong names in clk_rst.h
Reviewed-on: https://chromium-review.googlesource.com/173955
(cherry picked from commit 19dd9c85e4a3d1f77b23828bcbdd4bd8c2688b8d)

tegra124: Fix up the PLLX divider table.
Reviewed-on: https://chromium-review.googlesource.com/173778
(cherry picked from commit 3362cf3a7d6f5eaec879dda42323345922f6df17)

tegra124: clock: Get rid of cpcon and dccon.
Reviewed-on: https://chromium-review.googlesource.com/173779
(cherry picked from commit 08626ffac4a7e9ea3d4738af87e9e4cced7be2c7)

Tegra124: SPI: Set and unset CS in spi_claim_bus and spi_release_bus.
Reviewed-on: https://chromium-review.googlesource.com/173953
(cherry picked from commit a2df8f3a9c9c54c62d6ff37d3baff1d30ee6d355)

armv7: expose dcache_line_bytes() in cache API
Reviewed-on: https://chromium-review.googlesource.com/173975
(cherry picked from commit 6727f65702c7668fcb33848b4113bc3d3cc04e12)

libpayload: expose dcache_line_bytes() in ARM cache API
Reviewed-on: https://chromium-review.googlesource.com/174099
(cherry picked from commit 9387b02dff85b42944d95c3bccf59059c93fb4a9)

armv4: add a stub for dcache_line_bytes()
Reviewed-on: https://chromium-review.googlesource.com/173976
(cherry picked from commit 924f61ea895b9268c716791466637009bbac6469)
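
For reference, the line-size decode behind dcache_line_bytes() can be
sketched in isolation as below. This is a minimal host-side sketch, not
the code in this series: ccsidr_to_line_bytes() and
line_bytes_selfcheck() are hypothetical names, the CCSIDR value is
passed in rather than read from CP15, and a 4-byte word is assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of this series): decode the cache line
 * size in bytes from a raw CCSIDR value. CCSIDR[2:0] holds
 * log2(words per line) - 2; a word is assumed to be 4 bytes.
 */
unsigned int ccsidr_to_line_bytes(uint32_t ccsidr)
{
	unsigned int words = 1u << ((ccsidr & 0x7) + 2);	/* words per line */

	return words * 4;	/* bytes per line */
}

/* Hypothetical self-check: field values 0, 1, 2 map to 16, 32, 64 bytes. */
void line_bytes_selfcheck(void)
{
	assert(ccsidr_to_line_bytes(0x0) == 16);
	assert(ccsidr_to_line_bytes(0x1) == 32);
	assert(ccsidr_to_line_bytes(0x2) == 64);
}
```

Under this decode, the value 64 returned by the armv4 stub corresponds
to a CCSIDR[2:0] field of 2.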

tegra124: Base early UART on CLK_M to enable debugging of PLL init code
Reviewed-on: https://chromium-review.googlesource.com/174339
(cherry picked from commit 8d9387432f0a0d9b257b040304238e543cced1aa)

tegra124: Add additional PLLs and redesign the divisor table
Reviewed-on: https://chromium-review.googlesource.com/174380
(cherry picked from commit f6a5f5c4562f1ca733505717c175be00413f2384)

Squashed 49 commits for tegra124/nyan that included a lot of churn on
different pieces.

Change-Id: I00e8f5b74e835e01b28ca2e9c4af3709c9363d56
Signed-off-by: Isaac Christensen <isaac.christensen@se-eng.com>
Reviewed-on: http://review.coreboot.org/6869
Tested-by: build bot (Jenkins)
Reviewed-by: David Hendricks <dhendrix@chromium.org>
diff --git a/payloads/libpayload/arch/arm/cache.c b/payloads/libpayload/arch/arm/cache.c
index 3944818..4c222ea 100644
--- a/payloads/libpayload/arch/arm/cache.c
+++ b/payloads/libpayload/arch/arm/cache.c
@@ -195,17 +195,20 @@
 	dcache_foreach(OP_DCISW);
 }
 
-static unsigned int line_bytes(void)
+unsigned int dcache_line_bytes(void)
 {
 	uint32_t ccsidr;
-	unsigned int size;
+	static unsigned int line_bytes = 0;
+
+	if (line_bytes)
+		return line_bytes;
 
 	ccsidr = read_ccsidr();
 	/* [2:0] - Indicates (Log2(number of words in cache line)) - 2 */
-	size = 1 << ((ccsidr & 0x7) + 2);	/* words per line */
-	size *= sizeof(unsigned int);		/* bytes per line */
+	line_bytes = 1 << ((ccsidr & 0x7) + 2);	/* words per line */
+	line_bytes *= sizeof(unsigned int);	/* bytes per line */
 
-	return size;
+	return line_bytes;
 }
 
 /*
@@ -219,7 +222,7 @@
 	unsigned long line, linesize;
 	unsigned long paddr = virt_to_phys(vaddr);
 
-	linesize = line_bytes();
+	linesize = dcache_line_bytes();
 	line = paddr & ~(linesize - 1);
 
 	dsb();
diff --git a/payloads/libpayload/include/arm/arch/cache.h b/payloads/libpayload/include/arm/arch/cache.h
index ffdb55a..5210dfe 100644
--- a/payloads/libpayload/include/arm/arch/cache.h
+++ b/payloads/libpayload/include/arm/arch/cache.h
@@ -304,6 +304,9 @@
 /* dcache invalidate all (on current level given by CCSELR) */
 void dcache_invalidate_all(void);
 
+/* returns number of bytes per cache line */
+unsigned int dcache_line_bytes(void);
+
 /* dcache and MMU disable */
 void dcache_mmu_disable(void);
 
diff --git a/src/arch/arm/armv4/cache.c b/src/arch/arm/armv4/cache.c
index 729b82c..e5cf293 100644
--- a/src/arch/arm/armv4/cache.c
+++ b/src/arch/arm/armv4/cache.c
@@ -55,6 +55,17 @@
 {
 }
 
+unsigned int dcache_line_bytes(void)
+{
+	/*
+	 * TODO: Implement this correctly. For now we just return a
+	 * reasonable value. It was added during Nyan development and
+	 * may be used in bootblock code. It matters only if dcache is
+	 * turned on.
+	 */
+	return 64;
+}
+
 void dcache_clean_by_mva(void const *addr, size_t len)
 {
 }
diff --git a/src/arch/arm/armv7/cache.c b/src/arch/arm/armv7/cache.c
index acd1f9a..4ee2687 100644
--- a/src/arch/arm/armv7/cache.c
+++ b/src/arch/arm/armv7/cache.c
@@ -194,17 +194,20 @@
 	dcache_foreach(OP_DCISW);
 }
 
-static unsigned int line_bytes(void)
+unsigned int dcache_line_bytes(void)
 {
 	uint32_t ccsidr;
-	unsigned int size;
+	static unsigned int line_bytes = 0;
+
+	if (line_bytes)
+		return line_bytes;
 
 	ccsidr = read_ccsidr();
 	/* [2:0] - Indicates (Log2(number of words in cache line)) - 2 */
-	size = 1 << ((ccsidr & 0x7) + 2);	/* words per line */
-	size *= sizeof(unsigned int);		/* bytes per line */
+	line_bytes = 1 << ((ccsidr & 0x7) + 2);	/* words per line */
+	line_bytes *= sizeof(unsigned int);	/* bytes per line */
 
-	return size;
+	return line_bytes;
 }
 
 /*
@@ -217,7 +220,7 @@
 {
 	unsigned long line, linesize;
 
-	linesize = line_bytes();
+	linesize = dcache_line_bytes();
 	line = (uint32_t)addr & ~(linesize - 1);
 
 	dsb();
diff --git a/src/arch/arm/include/armv4/arch/cache.h b/src/arch/arm/include/armv4/arch/cache.h
index db4379a..6a3f593 100644
--- a/src/arch/arm/include/armv4/arch/cache.h
+++ b/src/arch/arm/include/armv4/arch/cache.h
@@ -57,6 +57,9 @@
 /* dcache invalidate all (on current level given by CCSELR) */
 void dcache_invalidate_all(void);
 
+/* returns number of bytes per cache line */
+unsigned int dcache_line_bytes(void);
+
 /* dcache and MMU disable */
 void dcache_mmu_disable(void);
 
diff --git a/src/arch/arm/include/armv7/arch/cache.h b/src/arch/arm/include/armv7/arch/cache.h
index ffdb55a..5210dfe 100644
--- a/src/arch/arm/include/armv7/arch/cache.h
+++ b/src/arch/arm/include/armv7/arch/cache.h
@@ -304,6 +304,9 @@
 /* dcache invalidate all (on current level given by CCSELR) */
 void dcache_invalidate_all(void);
 
+/* returns number of bytes per cache line */
+unsigned int dcache_line_bytes(void);
+
 /* dcache and MMU disable */
 void dcache_mmu_disable(void);
 
diff --git a/src/arch/arm/include/armv7/arch/cpu.h b/src/arch/arm/include/armv7/arch/cpu.h
index 52cc8a3..275bb8c 100644
--- a/src/arch/arm/include/armv7/arch/cpu.h
+++ b/src/arch/arm/include/armv7/arch/cpu.h
@@ -20,6 +20,8 @@
 #ifndef __ARCH_CPU_H__
 #define __ARCH_CPU_H__
 
+#include <stdint.h>
+
 #define asmlinkage
 
 #if !defined(__PRE_RAM__)
diff --git a/src/mainboard/google/nyan/Kconfig b/src/mainboard/google/nyan/Kconfig
index 5ac58d3..652cef1 100644
--- a/src/mainboard/google/nyan/Kconfig
+++ b/src/mainboard/google/nyan/Kconfig
@@ -22,6 +22,10 @@
 config BOARD_SPECIFIC_OPTIONS # dummy
 	def_bool y
 	select SOC_NVIDIA_TEGRA124
+	select MAINBOARD_HAS_CHROMEOS
+	select EC_GOOGLE_CHROMEEC
+	select EC_GOOGLE_CHROMEEC_SPI
+	select MAINBOARD_HAS_BOOTBLOCK_INIT
 	select BOARD_ROMSIZE_KB_1024
 
 config MAINBOARD_DIR
@@ -54,4 +58,24 @@
 
 endchoice
 
+config BOOT_MEDIA_SPI_BUS
+	int "SPI bus with boot media ROM"
+	range 1 6
+	depends on BCT_CFG_SPI
+	default 4
+	help
+	  Which SPI bus the boot media is connected to.
+
+config BOOT_MEDIA_SPI_CHIP_SELECT
+	int "Chip select for SPI boot media"
+	range 0 3
+	depends on BCT_CFG_SPI
+	default 0
+	help
+	  Which chip select to use for boot media.
+
+config EC_GOOGLE_CHROMEEC_SPI_BUS
+	hex
+	default 1
+
 endif # BOARD_GOOGLE_NYAN
diff --git a/src/mainboard/google/nyan/Makefile.inc b/src/mainboard/google/nyan/Makefile.inc
index 3cf7dd2..49ccf39 100644
--- a/src/mainboard/google/nyan/Makefile.inc
+++ b/src/mainboard/google/nyan/Makefile.inc
@@ -27,6 +27,11 @@
 
 subdirs-y += bct
 
+bootblock-y += bootblock.c
+bootblock-y += pmic.c
+
 romstage-y += romstage.c
+romstage-$(CONFIG_CHROMEOS) += chromeos.c
 
 ramstage-y += mainboard.c
+ramstage-$(CONFIG_CHROMEOS) += chromeos.c
diff --git a/src/mainboard/google/nyan/bootblock.c b/src/mainboard/google/nyan/bootblock.c
new file mode 100644
index 0000000..49133ca
--- /dev/null
+++ b/src/mainboard/google/nyan/bootblock.c
@@ -0,0 +1,80 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <bootblock_common.h>
+#include <console/console.h>
+#include <device/i2c.h>
+#include <soc/clock.h>
+#include <soc/nvidia/tegra/i2c.h>
+#include <soc/nvidia/tegra124/pinmux.h>
+#include <soc/nvidia/tegra124/spi.h>	/* FIXME: move back to soc code? */
+
+#include "pmic.h"
+
+void bootblock_mainboard_init(void)
+{
+	clock_config();
+
+	// I2C1 clock.
+	pinmux_set_config(PINMUX_GEN1_I2C_SCL_INDEX,
+			  PINMUX_GEN1_I2C_SCL_FUNC_I2C1 | PINMUX_INPUT_ENABLE);
+	// I2C1 data.
+	pinmux_set_config(PINMUX_GEN1_I2C_SDA_INDEX,
+			  PINMUX_GEN1_I2C_SDA_FUNC_I2C1 | PINMUX_INPUT_ENABLE);
+	// I2C2 clock.
+	pinmux_set_config(PINMUX_GEN2_I2C_SCL_INDEX,
+			  PINMUX_GEN2_I2C_SCL_FUNC_I2C2 | PINMUX_INPUT_ENABLE);
+	// I2C2 data.
+	pinmux_set_config(PINMUX_GEN2_I2C_SDA_INDEX,
+			  PINMUX_GEN2_I2C_SDA_FUNC_I2C2 | PINMUX_INPUT_ENABLE);
+	// I2C3 (cam) clock.
+	pinmux_set_config(PINMUX_CAM_I2C_SCL_INDEX,
+			  PINMUX_CAM_I2C_SCL_FUNC_I2C3 | PINMUX_INPUT_ENABLE);
+	// I2C3 (cam) data.
+	pinmux_set_config(PINMUX_CAM_I2C_SDA_INDEX,
+			  PINMUX_CAM_I2C_SDA_FUNC_I2C3 | PINMUX_INPUT_ENABLE);
+	// I2C5 (PMU) clock.
+	pinmux_set_config(PINMUX_PWR_I2C_SCL_INDEX,
+			  PINMUX_PWR_I2C_SCL_FUNC_I2CPMU | PINMUX_INPUT_ENABLE);
+	// I2C5 (PMU) data.
+	pinmux_set_config(PINMUX_PWR_I2C_SDA_INDEX,
+			  PINMUX_PWR_I2C_SDA_FUNC_I2CPMU | PINMUX_INPUT_ENABLE);
+
+	i2c_init(0);
+	i2c_init(1);
+	i2c_init(2);
+	i2c_init(4);
+
+	pmic_init(4);
+
+	/* SPI4 data out (MOSI) */
+	pinmux_set_config(PINMUX_SDMMC1_CMD_INDEX,
+			  PINMUX_SDMMC1_CMD_FUNC_SPI4 | PINMUX_INPUT_ENABLE);
+	/* SPI4 data in (MISO) */
+	pinmux_set_config(PINMUX_SDMMC1_DAT1_INDEX,
+			  PINMUX_SDMMC1_DAT1_FUNC_SPI4 | PINMUX_INPUT_ENABLE);
+	/* SPI4 clock */
+	pinmux_set_config(PINMUX_SDMMC1_DAT2_INDEX,
+			  PINMUX_SDMMC1_DAT2_FUNC_SPI4 | PINMUX_INPUT_ENABLE);
+	/* SPI4 chip select 0 */
+	pinmux_set_config(PINMUX_SDMMC1_DAT3_INDEX,
+			  PINMUX_SDMMC1_DAT3_FUNC_SPI4 | PINMUX_INPUT_ENABLE);
+//	spi_init();
+	tegra_spi_init(4);
+}
diff --git a/src/mainboard/google/nyan/chromeos.c b/src/mainboard/google/nyan/chromeos.c
new file mode 100644
index 0000000..5b8b9c0
--- /dev/null
+++ b/src/mainboard/google/nyan/chromeos.c
@@ -0,0 +1,99 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <boot/coreboot_tables.h>
+#include <console/console.h>
+#include <ec/google/chromeec/ec.h>
+#include <ec/google/chromeec/ec_commands.h>
+#include <string.h>
+#include <vendorcode/google/chromeos/chromeos.h>
+#include <bootmode.h>
+#include <soc/nvidia/tegra124/gpio.h>
+
+void fill_lb_gpios(struct lb_gpios *gpios)
+{
+	int count = 0;
+
+	/* Write Protect: active low */
+	gpios->gpios[count].port = GPIO_R1_INDEX;
+	gpios->gpios[count].polarity = ACTIVE_LOW;
+	gpios->gpios[count].value = gpio_get_in_value(GPIO_R1_INDEX);
+	strncpy((char *)gpios->gpios[count].name, "write protect",
+		GPIO_MAX_NAME_LENGTH);
+	count++;
+
+	/* Recovery: active high */
+	gpios->gpios[count].port = -1;
+	gpios->gpios[count].polarity = ACTIVE_HIGH;
+	gpios->gpios[count].value = get_recovery_mode_switch();
+	strncpy((char *)gpios->gpios[count].name, "recovery",
+		GPIO_MAX_NAME_LENGTH);
+	count++;
+
+	/* Lid: active high */
+	gpios->gpios[count].port = GPIO_R4_INDEX;
+	gpios->gpios[count].polarity = ACTIVE_HIGH;
+	gpios->gpios[count].value = gpio_get_in_value(GPIO_R4_INDEX);
+	strncpy((char *)gpios->gpios[count].name, "lid", GPIO_MAX_NAME_LENGTH);
+	count++;
+
+	/* Power: active low */
+	gpios->gpios[count].port = GPIO_Q0_INDEX;
+	gpios->gpios[count].polarity = ACTIVE_LOW;
+	gpios->gpios[count].value = gpio_get_in_value(GPIO_Q0_INDEX);
+	strncpy((char *)gpios->gpios[count].name, "power",
+		GPIO_MAX_NAME_LENGTH);
+	count++;
+
+	/* Developer: virtual GPIO active high */
+	gpios->gpios[count].port = -1;
+	gpios->gpios[count].polarity = ACTIVE_HIGH;
+	gpios->gpios[count].value = get_developer_mode_switch();
+	strncpy((char *)gpios->gpios[count].name, "developer",
+		GPIO_MAX_NAME_LENGTH);
+	count++;
+
+	gpios->size = sizeof(*gpios) + (count * sizeof(struct lb_gpio));
+	gpios->count = count;
+
+	printk(BIOS_ERR, "Added %d GPIOS size %d\n", count, gpios->size);
+}
+
+int get_developer_mode_switch(void)
+{
+	return gpio_get_in_value(GPIO_Q6_INDEX);
+}
+
+int get_recovery_mode_switch(void)
+{
+	uint32_t ec_events;
+
+	/* The GPIO is active low. */
+	if (!gpio_get_in_value(GPIO_Q7_INDEX)) // RECMODE_GPIO
+		return 1;
+
+	ec_events = google_chromeec_get_events_b();
+	return !!(ec_events &
+		  EC_HOST_EVENT_MASK(EC_HOST_EVENT_KEYBOARD_RECOVERY));
+}
+
+int get_write_protect_state(void)
+{
+	return !gpio_get_in_value(GPIO_R1_INDEX);
+}
diff --git a/src/mainboard/google/nyan/devicetree.cb b/src/mainboard/google/nyan/devicetree.cb
index 392a5ae..623c5a1 100644
--- a/src/mainboard/google/nyan/devicetree.cb
+++ b/src/mainboard/google/nyan/devicetree.cb
@@ -19,4 +19,52 @@
 
 chip soc/nvidia/tegra124
 	device cpu_cluster 0 on end
+# N.B. We are not using the device tree in an effective way.
+# We need to change this in future such that the on-soc
+# devices are 'chips', which will allow us to go at them
+# in parallel. This is even easier on the ARM SOCs since there
+# are no single-access resources such as the infamous
+# cf8/cfc registers found on PCs.
+	register "display_controller" = "TEGRA_ARM_DISPLAYA"
+	register "xres" = "2560"
+	register "yres" = "1700"
+	register "framebuffer_bits_per_pixel" = "24"
+	register "cache_policy" = "DCACHE_WRITETHROUGH"
+
+	# With some help from the mainboard designer
+	register "backlight_en_gpio" = "GPIO(H2)"
+	register "lvds_shutdown_gpio" = "0"
+	register "backlight_vdd_gpio" = "GPIO(P2)"
+	register "panel_vdd_gpio" = "0"
+	register "pwm" = "GPIO(H1)"
+
+# taken from u-boot; these look wrong however.
+	register "vdd_delay" = "400"
+	register "vdd_data_delay" = "4"
+	register "data_backlight_delay" = "203"
+	register "backlight_pwm_delay" = "17"
+	register "pwm_backlight_en_delay" = "15"
+
+# How to compute these: xrandr --verbose will give you this:
+#Detailed mode: Clock 285.250 MHz, 272 mm x 181 mm
+#               2560 2608 2640 2720 hborder 0
+#               1700 1703 1713 1749 vborder 0
+#Then you can compute your values:
+#H front porch = 2608 - 2560 = 48
+#H sync = 2640 - 2608 = 32
+#H back porch = 2720 - 2640 = 80
+#V front porch = 1703 - 1700 = 3
+#V sync = 1713 - 1703 = 10
+#V back porch = 1749 - 1713 = 36
+#href_to_sync and vref_to_sync are from the vendor
+
+	register "href_to_sync" = "11"
+	register "hfront_porch" = "48"
+	register "hsync_width" = "32"
+	register "hback_porch" = "80"
+
+	register "vref_to_sync" = "1"
+	register "vfront_porch" = "3"
+	register "vsync_width" = "10"
+	register "vback_porch" = "36"
 end
diff --git a/src/mainboard/google/nyan/mainboard.c b/src/mainboard/google/nyan/mainboard.c
index 9e08021..c7258ff 100644
--- a/src/mainboard/google/nyan/mainboard.c
+++ b/src/mainboard/google/nyan/mainboard.c
@@ -19,10 +19,44 @@
 
 #include <device/device.h>
 #include <boot/coreboot_tables.h>
+#include <soc/nvidia/tegra124/gpio.h>
 
-/* this happens after cpu_init where exynos resources are set */
+static void setup_pinmux(void)
+{
+	// Write protect.
+	gpio_input_pullup(GPIO(R1));
+	// Recovery mode.
+	gpio_input_pullup(GPIO(Q7));
+	// Lid switch.
+	gpio_input_pullup(GPIO(R4));
+	// Power switch.
+	gpio_input_pullup(GPIO(Q0));
+	// Developer mode.
+	gpio_input_pullup(GPIO(Q6));
+	// EC in RW.
+	gpio_input_pullup(GPIO(U4));
+
+	// SPI1 MOSI
+	pinmux_set_config(PINMUX_ULPI_CLK_INDEX, PINMUX_ULPI_CLK_FUNC_SPI1 |
+						 PINMUX_PULL_UP |
+						 PINMUX_INPUT_ENABLE);
+	// SPI1 MISO
+	pinmux_set_config(PINMUX_ULPI_DIR_INDEX, PINMUX_ULPI_DIR_FUNC_SPI1 |
+						 PINMUX_PULL_UP |
+						 PINMUX_INPUT_ENABLE);
+	// SPI1 SCLK
+	pinmux_set_config(PINMUX_ULPI_NXT_INDEX, PINMUX_ULPI_NXT_FUNC_SPI1 |
+						 PINMUX_PULL_NONE |
+						 PINMUX_INPUT_ENABLE);
+	// SPI1 CS0
+	pinmux_set_config(PINMUX_ULPI_STP_INDEX, PINMUX_ULPI_STP_FUNC_SPI1 |
+						 PINMUX_PULL_NONE |
+						 PINMUX_INPUT_ENABLE);
+}
+
 static void mainboard_init(device_t dev)
 {
+	setup_pinmux();
 }
 
 static void mainboard_enable(device_t dev)
diff --git a/src/mainboard/google/nyan/pmic.c b/src/mainboard/google/nyan/pmic.c
new file mode 100644
index 0000000..ab951ea
--- /dev/null
+++ b/src/mainboard/google/nyan/pmic.c
@@ -0,0 +1,78 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <delay.h>
+#include <device/i2c.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include "pmic.h"
+
+struct pmic_write
+{
+	uint8_t reg; // Register to write.
+	uint8_t val; // Value to write.
+};
+
+enum {
+	AS3722_I2C_ADDR = 0x40
+};
+
+static struct pmic_write pmic_writes[] =
+{
+	/* Don't need to set up VDD_CORE - already done - by OTP */
+
+	/* First set VDD_CPU to 1.0V, then enable the VDD_CPU regulator. */
+	{ 0x00, 0x28 },
+
+	/* Don't write SDCONTROL - it's already 0x7F, i.e. all SDs enabled. */
+
+	/* First set VDD_GPU to 1.0V, then enable the VDD_GPU regulator. */
+	{ 0x06, 0x28 },
+
+	/* Don't write SDCONTROL - it's already 0x7F, i.e. all SDs enabled. */
+
+	/* First set VPP_FUSE to 1.2V, then enable the VPP_FUSE regulator. */
+	{ 0x12, 0x10 },
+
+	/* Don't write LDCONTROL - it's already 0xFF, i.e. all LDOs enabled. */
+
+	/*
+	 * Bring up VDD_SDMMC via the AS3722 PMIC on the PWR I2C bus.
+	 * First set it to bypass 3.3V straight thru, then enable the regulator
+	 *
+	 * NOTE: We do this early because doing it later seems to hose the CPU
+	 * power rail/partition startup. Need to debug.
+	 */
+	{ 0x16, 0x3f }
+
+	/* Don't write LDCONTROL - it's already 0xFF, i.e. all LDOs enabled. */
+};
+
+void pmic_init(unsigned bus)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(pmic_writes); i++) {
+		i2c_write(bus, AS3722_I2C_ADDR, pmic_writes[i].reg, 1,
+			  &pmic_writes[i].val, 1);
+		udelay(10 * 1000);
+	}
+}
diff --git a/src/mainboard/google/nyan/pmic.h b/src/mainboard/google/nyan/pmic.h
new file mode 100644
index 0000000..78c9f0d
--- /dev/null
+++ b/src/mainboard/google/nyan/pmic.h
@@ -0,0 +1,25 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef __MAINBOARD_GOOGLE_NYAN_PMIC_H__
+#define __MAINBOARD_GOOGLE_NYAN_PMIC_H__
+
+void pmic_init(unsigned bus);
+
+#endif /* __MAINBOARD_GOOGLE_NYAN_PMIC_H__ */
diff --git a/src/mainboard/google/nyan/romstage.c b/src/mainboard/google/nyan/romstage.c
index c52fbd2..5a66dde 100644
--- a/src/mainboard/google/nyan/romstage.c
+++ b/src/mainboard/google/nyan/romstage.c
@@ -18,12 +18,35 @@
  */
 
 #include <arch/stages.h>
+#include <device/device.h>
 #include <cbfs.h>
+#include <cbmem.h>
 #include <console/console.h>
+#include "soc/nvidia/tegra124/chip.h"
+#include <soc/display.h>
 
 void main(void)
 {
 	void *entry;
+	const struct device *soc;
+	const struct soc_nvidia_tegra124_config *config;
+
+	/* For quality of the user interface, it's important to get
+	 * the video going ASAP. Because there are long delays in some
+	 * of the powerup steps, we do some very early setup here in
+	 * romstage. We don't do this in the bootblock because video
+	 * setup is finicky and subject to change; hence, we do it as
+	 * early as we can in the RW stage, but never in the RO stage.
+	 */
+
+	soc = dev_find_slot(DEVICE_PATH_CPU_CLUSTER, 0);
+	printk(BIOS_SPEW, "%s: soc is %p\n", __func__, soc);
+	if (soc && soc->chip_info) {
+		config = soc->chip_info;
+		setup_display((struct soc_nvidia_tegra124_config *)config);
+	}
+
+	cbmem_initialize_empty();
 
 	entry = cbfs_load_stage(CBFS_DEFAULT_MEDIA, "fallback/coreboot_ram");
 	stage_exit(entry);
diff --git a/src/soc/nvidia/tegra/dc.h b/src/soc/nvidia/tegra/dc.h
new file mode 100644
index 0000000..33dbe53
--- /dev/null
+++ b/src/soc/nvidia/tegra/dc.h
@@ -0,0 +1,564 @@
+/*
+ *  Copyright 2013 Google Inc.
+ *  (C) Copyright 2010
+ *  NVIDIA Corporation <www.nvidia.com>
+ *
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+
+#ifndef __SOC_NVIDIA_TEGRA_DC_H
+#define __SOC_NVIDIA_TEGRA_DC_H
+
+/* Register definitions for the Tegra display controller */
+
+/* CMD register 0x000 ~ 0x043 */
+struct dc_cmd_reg {
+	/* Address 0x000 ~ 0x002 */
+	u32 gen_incr_syncpt;		/* _CMD_GENERAL_INCR_SYNCPT_0 */
+	u32 gen_incr_syncpt_ctrl;	/* _CMD_GENERAL_INCR_SYNCPT_CNTRL_0 */
+	u32 gen_incr_syncpt_err;	/* _CMD_GENERAL_INCR_SYNCPT_ERROR_0 */
+
+	u32 reserved0[5];		/* reserved_0[5] */
+
+	/* Address 0x008 ~ 0x00a */
+	u32 win_a_incr_syncpt;		/* _CMD_WIN_A_INCR_SYNCPT_0 */
+	u32 win_a_incr_syncpt_ctrl;	/* _CMD_WIN_A_INCR_SYNCPT_CNTRL_0 */
+	u32 win_a_incr_syncpt_err;	/* _CMD_WIN_A_INCR_SYNCPT_ERROR_0 */
+
+	u32 reserved1[5];		/* reserved_1[5] */
+
+	/* Address 0x010 ~ 0x012 */
+	u32 win_b_incr_syncpt;		/* _CMD_WIN_B_INCR_SYNCPT_0 */
+	u32 win_b_incr_syncpt_ctrl;	/* _CMD_WIN_B_INCR_SYNCPT_CNTRL_0 */
+	u32 win_b_incr_syncpt_err;	/* _CMD_WIN_B_INCR_SYNCPT_ERROR_0 */
+
+	u32 reserved2[5];		/* reserved_2[5] */
+
+	/* Address 0x018 ~ 0x01a */
+	u32 win_c_incr_syncpt;		/* _CMD_WIN_C_INCR_SYNCPT_0 */
+	u32 win_c_incr_syncpt_ctrl;	/* _CMD_WIN_C_INCR_SYNCPT_CNTRL_0 */
+	u32 win_c_incr_syncpt_err;	/* _CMD_WIN_C_INCR_SYNCPT_ERROR_0 */
+
+	u32 reserved3[13];		/* reserved_3[13] */
+
+	/* Address 0x028 */
+	u32 cont_syncpt_vsync;		/* _CMD_CONT_SYNCPT_VSYNC_0 */
+
+	u32 reserved4[7];		/* reserved_4[7] */
+
+	/* Address 0x030 ~ 0x033 */
+	u32 ctxsw;			/* _CMD_CTXSW_0 */
+	u32 disp_cmd_opt0;		/* _CMD_DISPLAY_COMMAND_OPTION0_0 */
+	u32 disp_cmd;			/* _CMD_DISPLAY_COMMAND_0 */
+	u32 sig_raise;			/* _CMD_SIGNAL_RAISE_0 */
+
+	u32 reserved5[2];		/* reserved_5[2] */
+
+	/* Address 0x036 ~ 0x03e */
+	u32 disp_pow_ctrl;		/* _CMD_DISPLAY_POWER_CONTROL_0 */
+	u32 int_stat;			/* _CMD_INT_STATUS_0 */
+	u32 int_mask;			/* _CMD_INT_MASK_0 */
+	u32 int_enb;			/* _CMD_INT_ENABLE_0 */
+	u32 int_type;			/* _CMD_INT_TYPE_0 */
+	u32 int_polarity;		/* _CMD_INT_POLARITY_0 */
+	u32 sig_raise1;		/* _CMD_SIGNAL_RAISE1_0 */
+	u32 sig_raise2;		/* _CMD_SIGNAL_RAISE2_0 */
+	u32 sig_raise3;		/* _CMD_SIGNAL_RAISE3_0 */
+
+	u32 reserved6;			/* reserved_6 */
+
+	/* Address 0x040 ~ 0x043 */
+	u32 state_access;		/* _CMD_STATE_ACCESS_0 */
+	u32 state_ctrl;		/* _CMD_STATE_CONTROL_0 */
+	u32 disp_win_header;		/* _CMD_DISPLAY_WINDOW_HEADER_0 */
+	u32 reg_act_ctrl;		/* _CMD_REG_ACT_CONTROL_0 */
+};
+
+enum {
+	PIN_REG_COUNT		= 4,
+	PIN_OUTPUT_SEL_COUNT	= 7,
+};
+
+/* COM register 0x300 ~ 0x329 */
+struct dc_com_reg {
+	/* Address 0x300 ~ 0x301 */
+	u32 crc_ctrl;			/* _COM_CRC_CONTROL_0 */
+	u32 crc_checksum;		/* _COM_CRC_CHECKSUM_0 */
+
+	/* _COM_PIN_OUTPUT_ENABLE0/1/2/3_0: Address 0x302 ~ 0x305 */
+	u32 pin_output_enb[PIN_REG_COUNT];
+
+	/* _COM_PIN_OUTPUT_POLARITY0/1/2/3_0: Address 0x306 ~ 0x309 */
+	u32 pin_output_polarity[PIN_REG_COUNT];
+
+	/* _COM_PIN_OUTPUT_DATA0/1/2/3_0: Address 0x30a ~ 0x30d */
+	u32 pin_output_data[PIN_REG_COUNT];
+
+	/* _COM_PIN_INPUT_ENABLE0/1/2/3_0: Address 0x30e ~ 0x311 */
+	u32 pin_input_enb[PIN_REG_COUNT];
+
+	/* Address 0x312 ~ 0x313 */
+	u32 pin_input_data0;		/* _COM_PIN_INPUT_DATA0_0 */
+	u32 pin_input_data1;		/* _COM_PIN_INPUT_DATA1_0 */
+
+	/* _COM_PIN_OUTPUT_SELECT0/1/2/3/4/5/6_0: Address 0x314 ~ 0x31a */
+	u32 pin_output_sel[PIN_OUTPUT_SEL_COUNT];
+
+	/* Address 0x31b ~ 0x329 */
+	u32 pin_misc_ctrl;		/* _COM_PIN_MISC_CONTROL_0 */
+	u32 pm0_ctrl;			/* _COM_PM0_CONTROL_0 */
+	u32 pm0_duty_cycle;		/* _COM_PM0_DUTY_CYCLE_0 */
+	u32 pm1_ctrl;			/* _COM_PM1_CONTROL_0 */
+	u32 pm1_duty_cycle;		/* _COM_PM1_DUTY_CYCLE_0 */
+	u32 spi_ctrl;			/* _COM_SPI_CONTROL_0 */
+	u32 spi_start_byte;		/* _COM_SPI_START_BYTE_0 */
+	u32 hspi_wr_data_ab;		/* _COM_HSPI_WRITE_DATA_AB_0 */
+	u32 hspi_wr_data_cd;		/* _COM_HSPI_WRITE_DATA_CD_0 */
+	u32 hspi_cs_dc;		/* _COM_HSPI_CS_DC_0 */
+	u32 scratch_reg_a;		/* _COM_SCRATCH_REGISTER_A_0 */
+	u32 scratch_reg_b;		/* _COM_SCRATCH_REGISTER_B_0 */
+	u32 gpio_ctrl;			/* _COM_GPIO_CTRL_0 */
+	u32 gpio_debounce_cnt;		/* _COM_GPIO_DEBOUNCE_COUNTER_0 */
+	u32 crc_checksum_latched;	/* _COM_CRC_CHECKSUM_LATCHED_0 */
+};
+
+enum dc_disp_h_pulse_pos {
+	H_PULSE0_POSITION_A,
+	H_PULSE0_POSITION_B,
+	H_PULSE0_POSITION_C,
+	H_PULSE0_POSITION_D,
+	H_PULSE0_POSITION_COUNT,
+};
+
+struct _disp_h_pulse {
+	/* _DISP_H_PULSE0/1/2_CONTROL_0 */
+	u32 h_pulse_ctrl;
+	/* _DISP_H_PULSE0/1/2_POSITION_A/B/C/D_0 */
+	u32 h_pulse_pos[H_PULSE0_POSITION_COUNT];
+};
+
+enum dc_disp_v_pulse_pos {
+	V_PULSE0_POSITION_A,
+	V_PULSE0_POSITION_B,
+	V_PULSE0_POSITION_C,
+	V_PULSE0_POSITION_COUNT,
+};
+
+struct _disp_v_pulse0 {
+	/* _DISP_V_PULSE0/1_CONTROL_0 */
+	u32 v_pulse_ctrl;
+	/* _DISP_V_PULSE0/1_POSITION_A/B/C_0 */
+	u32 v_pulse_pos[V_PULSE0_POSITION_COUNT];
+};
+
+struct _disp_v_pulse2 {
+	/* _DISP_V_PULSE2/3_CONTROL_0 */
+	u32 v_pulse_ctrl;
+	/* _DISP_V_PULSE2/3_POSITION_A_0 */
+	u32 v_pulse_pos_a;
+};
+
+enum dc_disp_h_pulse_reg {
+	H_PULSE0,
+	H_PULSE1,
+	H_PULSE2,
+	H_PULSE_COUNT,
+};
+
+enum dc_disp_pp_select {
+	PP_SELECT_A,
+	PP_SELECT_B,
+	PP_SELECT_C,
+	PP_SELECT_D,
+	PP_SELECT_COUNT,
+};
+
+/* DISP register 0x400 ~ 0x4c1 */
+struct dc_disp_reg {
+	/* Address 0x400 ~ 0x40a */
+	u32 disp_signal_opt0;		/* _DISP_DISP_SIGNAL_OPTIONS0_0 */
+	u32 disp_signal_opt1;		/* _DISP_DISP_SIGNAL_OPTIONS1_0 */
+	u32 disp_win_opt;		/* _DISP_DISP_WIN_OPTIONS_0 */
+	u32 mem_high_pri;		/* _DISP_MEM_HIGH_PRIORITY_0 */
+	u32 mem_high_pri_timer;	/* _DISP_MEM_HIGH_PRIORITY_TIMER_0 */
+	u32 disp_timing_opt;		/* _DISP_DISP_TIMING_OPTIONS_0 */
+	u32 ref_to_sync;		/* _DISP_REF_TO_SYNC_0 */
+	u32 sync_width;		/* _DISP_SYNC_WIDTH_0 */
+	u32 back_porch;		/* _DISP_BACK_PORCH_0 */
+	u32 disp_active;		/* _DISP_DISP_ACTIVE_0 */
+	u32 front_porch;		/* _DISP_FRONT_PORCH_0 */
+
+	/* Address 0x40b ~ 0x419: _DISP_H_PULSE0/1/2_  */
+	struct _disp_h_pulse h_pulse[H_PULSE_COUNT];
+
+	/* Address 0x41a ~ 0x421 */
+	struct _disp_v_pulse0 v_pulse0;	/* _DISP_V_PULSE0_ */
+	struct _disp_v_pulse0 v_pulse1;	/* _DISP_V_PULSE1_ */
+
+	/* Address 0x422 ~ 0x425 */
+	struct _disp_v_pulse2 v_pulse3;	/* _DISP_V_PULSE2_ */
+	struct _disp_v_pulse2 v_pulse4;	/* _DISP_V_PULSE3_ */
+
+	/* Address 0x426 ~ 0x429 */
+	u32 m0_ctrl;			/* _DISP_M0_CONTROL_0 */
+	u32 m1_ctrl;			/* _DISP_M1_CONTROL_0 */
+	u32 di_ctrl;			/* _DISP_DI_CONTROL_0 */
+	u32 pp_ctrl;			/* _DISP_PP_CONTROL_0 */
+
+	/* Address 0x42a ~ 0x42d: _DISP_PP_SELECT_A/B/C/D_0 */
+	u32 pp_select[PP_SELECT_COUNT];
+
+	/* Address 0x42e ~ 0x435 */
+	u32 disp_clk_ctrl;		/* _DISP_DISP_CLOCK_CONTROL_0 */
+	u32 disp_interface_ctrl;	/* _DISP_DISP_INTERFACE_CONTROL_0 */
+	u32 disp_color_ctrl;		/* _DISP_DISP_COLOR_CONTROL_0 */
+	u32 shift_clk_opt;		/* _DISP_SHIFT_CLOCK_OPTIONS_0 */
+	u32 data_enable_opt;		/* _DISP_DATA_ENABLE_OPTIONS_0 */
+	u32 serial_interface_opt;	/* _DISP_SERIAL_INTERFACE_OPTIONS_0 */
+	u32 lcd_spi_opt;		/* _DISP_LCD_SPI_OPTIONS_0 */
+	u32 border_color;		/* _DISP_BORDER_COLOR_0 */
+
+	/* Address 0x436 ~ 0x439 */
+	u32 color_key0_lower;		/* _DISP_COLOR_KEY0_LOWER_0 */
+	u32 color_key0_upper;		/* _DISP_COLOR_KEY0_UPPER_0 */
+	u32 color_key1_lower;		/* _DISP_COLOR_KEY1_LOWER_0 */
+	u32 color_key1_upper;		/* _DISP_COLOR_KEY1_UPPER_0 */
+
+	u32 reserved0[2];		/* reserved_0[2] */
+
+	/* Address 0x43c ~ 0x442 */
+	u32 cursor_foreground;		/* _DISP_CURSOR_FOREGROUND_0 */
+	u32 cursor_background;		/* _DISP_CURSOR_BACKGROUND_0 */
+	u32 cursor_start_addr;		/* _DISP_CURSOR_START_ADDR_0 */
+	u32 cursor_start_addr_ns;	/* _DISP_CURSOR_START_ADDR_NS_0 */
+	u32 cursor_pos;		/* _DISP_CURSOR_POSITION_0 */
+	u32 cursor_pos_ns;		/* _DISP_CURSOR_POSITION_NS_0 */
+	u32 seq_ctrl;			/* _DISP_INIT_SEQ_CONTROL_0 */
+
+	/* Address 0x443 ~ 0x446 */
+	u32 spi_init_seq_data_a;	/* _DISP_SPI_INIT_SEQ_DATA_A_0 */
+	u32 spi_init_seq_data_b;	/* _DISP_SPI_INIT_SEQ_DATA_B_0 */
+	u32 spi_init_seq_data_c;	/* _DISP_SPI_INIT_SEQ_DATA_C_0 */
+	u32 spi_init_seq_data_d;	/* _DISP_SPI_INIT_SEQ_DATA_D_0 */
+
+	u32 reserved1[0x39];		/* reserved_1[0x39] */
+
+	/* Address 0x480 ~ 0x484 */
+	u32 dc_mccif_fifoctrl;		/* _DISP_DC_MCCIF_FIFOCTRL_0 */
+	u32 mccif_disp0a_hyst;		/* _DISP_MCCIF_DISPLAY0A_HYST_0 */
+	u32 mccif_disp0b_hyst;		/* _DISP_MCCIF_DISPLAY0B_HYST_0 */
+	u32 mccif_disp0c_hyst;		/* _DISP_MCCIF_DISPLAY0C_HYST_0 */
+	u32 mccif_disp1b_hyst;		/* _DISP_MCCIF_DISPLAY1B_HYST_0 */
+
+	u32 reserved2[0x3b];		/* reserved2[0x3b] */
+
+	/* Address 0x4c0 ~ 0x4c1 */
+	u32 dac_crt_ctrl;		/* _DISP_DAC_CRT_CTRL_0 */
+	u32 disp_misc_ctrl;		/* _DISP_DISP_MISC_CONTROL_0 */
+};
+
+enum dc_winc_filter_p {
+	WINC_FILTER_COUNT	= 0x10,
+};
+
+/* Window A/B/C register 0x500 ~ 0x628 */
+struct dc_winc_reg {
+
+	/* Address 0x500 */
+	u32 color_palette;		/* _WINC_COLOR_PALETTE_0 */
+
+	u32 reserved0[0xff];		/* reserved_0[0xff] */
+
+	/* Address 0x600 */
+	u32 palette_color_ext;		/* _WINC_PALETTE_COLOR_EXT_0 */
+
+	/* _WINC_H_FILTER_P00~0F_0 */
+	/* Address 0x601 ~ 0x610 */
+	u32 h_filter_p[WINC_FILTER_COUNT];
+
+	/* Address 0x611 ~ 0x618 */
+	u32 csc_yof;			/* _WINC_CSC_YOF_0 */
+	u32 csc_kyrgb;			/* _WINC_CSC_KYRGB_0 */
+	u32 csc_kur;			/* _WINC_CSC_KUR_0 */
+	u32 csc_kvr;			/* _WINC_CSC_KVR_0 */
+	u32 csc_kug;			/* _WINC_CSC_KUG_0 */
+	u32 csc_kvg;			/* _WINC_CSC_KVG_0 */
+	u32 csc_kub;			/* _WINC_CSC_KUB_0 */
+	u32 csc_kvb;			/* _WINC_CSC_KVB_0 */
+
+	/* Address 0x619 ~ 0x628: _WINC_V_FILTER_P00~0F_0 */
+	u32 v_filter_p[WINC_FILTER_COUNT];
+};
+
+/* WIN A/B/C Register 0x700 ~ 0x714 */
+struct dc_win_reg {
+	/* Address 0x700 ~ 0x714 */
+	u32 win_opt;			/* _WIN_WIN_OPTIONS_0 */
+	u32 byte_swap;			/* _WIN_BYTE_SWAP_0 */
+	u32 buffer_ctrl;		/* _WIN_BUFFER_CONTROL_0 */
+	u32 color_depth;		/* _WIN_COLOR_DEPTH_0 */
+	u32 pos;			/* _WIN_POSITION_0 */
+	u32 size;			/* _WIN_SIZE_0 */
+	u32 prescaled_size;		/* _WIN_PRESCALED_SIZE_0 */
+	u32 h_initial_dda;		/* _WIN_H_INITIAL_DDA_0 */
+	u32 v_initial_dda;		/* _WIN_V_INITIAL_DDA_0 */
+	u32 dda_increment;		/* _WIN_DDA_INCREMENT_0 */
+	u32 line_stride;		/* _WIN_LINE_STRIDE_0 */
+	u32 buf_stride;		/* _WIN_BUF_STRIDE_0 */
+	u32 uv_buf_stride;		/* _WIN_UV_BUF_STRIDE_0 */
+	u32 buffer_addr_mode;		/* _WIN_BUFFER_ADDR_MODE_0 */
+	u32 dv_ctrl;			/* _WIN_DV_CONTROL_0 */
+	u32 blend_nokey;		/* _WIN_BLEND_NOKEY_0 */
+	u32 blend_1win;		/* _WIN_BLEND_1WIN_0 */
+	u32 blend_2win_x;		/* _WIN_BLEND_2WIN_X_0 */
+	u32 blend_2win_y;		/* _WIN_BLEND_2WIN_Y_0 */
+	u32 blend_3win_xy;		/* _WIN_BLEND_3WIN_XY_0 */
+	u32 hp_fetch_ctrl;		/* _WIN_HP_FETCH_CONTROL_0 */
+};
+
+/* WINBUF A/B/C Register 0x800 ~ 0x80a */
+struct dc_winbuf_reg {
+	/* Address 0x800 ~ 0x80a */
+	u32 start_addr;		/* _WINBUF_START_ADDR_0 */
+	u32 start_addr_ns;		/* _WINBUF_START_ADDR_NS_0 */
+	u32 start_addr_u;		/* _WINBUF_START_ADDR_U_0 */
+	u32 start_addr_u_ns;		/* _WINBUF_START_ADDR_U_NS_0 */
+	u32 start_addr_v;		/* _WINBUF_START_ADDR_V_0 */
+	u32 start_addr_v_ns;		/* _WINBUF_START_ADDR_V_NS_0 */
+	u32 addr_h_offset;		/* _WINBUF_ADDR_H_OFFSET_0 */
+	u32 addr_h_offset_ns;		/* _WINBUF_ADDR_H_OFFSET_NS_0 */
+	u32 addr_v_offset;		/* _WINBUF_ADDR_V_OFFSET_0 */
+	u32 addr_v_offset_ns;		/* _WINBUF_ADDR_V_OFFSET_NS_0 */
+	u32 uflow_status;		/* _WINBUF_UFLOW_STATUS_0 */
+};
+
+/* Display Controller (DC_) regs */
+struct display_controller {
+	struct dc_cmd_reg cmd;		/* CMD register 0x000 ~ 0x043 */
+	u32 reserved0[0x2bc];
+
+	struct dc_com_reg com;		/* COM register 0x300 ~ 0x329 */
+	u32 reserved1[0xd6];
+
+	struct dc_disp_reg disp;	/* DISP register 0x400 ~ 0x4c1 */
+	u32 reserved2[0x3e];
+
+	struct dc_winc_reg winc;	/* Window A/B/C 0x500 ~ 0x628 */
+	u32 reserved3[0xd7];
+
+	struct dc_win_reg win;		/* WIN A/B/C 0x700 ~ 0x714 */
+	u32 reserved4[0xeb];
+
+	struct dc_winbuf_reg winbuf;	/* WINBUF A/B/C 0x800 ~ 0x80a */
+};
+
+#define BIT(pos)	(1U << (pos))
+
+/* DC_CMD_DISPLAY_COMMAND 0x032 */
+#define CTRL_MODE_SHIFT		5
+#define CTRL_MODE_MASK		(0x3 << CTRL_MODE_SHIFT)
+enum {
+	CTRL_MODE_STOP,
+	CTRL_MODE_C_DISPLAY,
+	CTRL_MODE_NC_DISPLAY,
+};
+
+/* _WIN_COLOR_DEPTH_0 */
+enum win_color_depth_id {
+	COLOR_DEPTH_P1,
+	COLOR_DEPTH_P2,
+	COLOR_DEPTH_P4,
+	COLOR_DEPTH_P8,
+	COLOR_DEPTH_B4G4R4A4,
+	COLOR_DEPTH_B5G5R5A,
+	COLOR_DEPTH_B5G6R5,
+	COLOR_DEPTH_AB5G5R5,
+	COLOR_DEPTH_B8G8R8A8 = 12,
+	COLOR_DEPTH_R8G8B8A8,
+	COLOR_DEPTH_B6x2G6x2R6x2A8,
+	COLOR_DEPTH_R6x2G6x2B6x2A8,
+	COLOR_DEPTH_YCbCr422,
+	COLOR_DEPTH_YUV422,
+	COLOR_DEPTH_YCbCr420P,
+	COLOR_DEPTH_YUV420P,
+	COLOR_DEPTH_YCbCr422P,
+	COLOR_DEPTH_YUV422P,
+	COLOR_DEPTH_YCbCr422R,
+	COLOR_DEPTH_YUV422R,
+	COLOR_DEPTH_YCbCr422RA,
+	COLOR_DEPTH_YUV422RA,
+};
+
+/* DC_CMD_DISPLAY_POWER_CONTROL 0x036 */
+#define PW0_ENABLE		BIT(0)
+#define PW1_ENABLE		BIT(2)
+#define PW2_ENABLE		BIT(4)
+#define PW3_ENABLE		BIT(6)
+#define PW4_ENABLE		BIT(8)
+#define PM0_ENABLE		BIT(16)
+#define PM1_ENABLE		BIT(18)
+#define SPI_ENABLE		BIT(24)
+#define HSPI_ENABLE		BIT(25)
+
+/* DC_CMD_STATE_CONTROL 0x041 */
+#define GENERAL_ACT_REQ		BIT(0)
+#define WIN_A_ACT_REQ		BIT(1)
+#define WIN_B_ACT_REQ		BIT(2)
+#define WIN_C_ACT_REQ		BIT(3)
+#define GENERAL_UPDATE		BIT(8)
+#define WIN_A_UPDATE		BIT(9)
+#define WIN_B_UPDATE		BIT(10)
+#define WIN_C_UPDATE		BIT(11)
+
+/* DC_CMD_DISPLAY_WINDOW_HEADER 0x042 */
+#define WINDOW_A_SELECT		BIT(4)
+#define WINDOW_B_SELECT		BIT(5)
+#define WINDOW_C_SELECT		BIT(6)
+
+/* DC_DISP_DISP_CLOCK_CONTROL 0x42e */
+#define SHIFT_CLK_DIVIDER_SHIFT	0
+#define SHIFT_CLK_DIVIDER_MASK	(0xff << SHIFT_CLK_DIVIDER_SHIFT)
+#define PIXEL_CLK_DIVIDER_SHIFT	8
+#define PIXEL_CLK_DIVIDER_MSK	(0xf << PIXEL_CLK_DIVIDER_SHIFT)
+enum {
+	PIXEL_CLK_DIVIDER_PCD1,
+	PIXEL_CLK_DIVIDER_PCD1H,
+	PIXEL_CLK_DIVIDER_PCD2,
+	PIXEL_CLK_DIVIDER_PCD3,
+	PIXEL_CLK_DIVIDER_PCD4,
+	PIXEL_CLK_DIVIDER_PCD6,
+	PIXEL_CLK_DIVIDER_PCD8,
+	PIXEL_CLK_DIVIDER_PCD9,
+	PIXEL_CLK_DIVIDER_PCD12,
+	PIXEL_CLK_DIVIDER_PCD16,
+	PIXEL_CLK_DIVIDER_PCD18,
+	PIXEL_CLK_DIVIDER_PCD24,
+	PIXEL_CLK_DIVIDER_PCD13,
+};
+
+/* DC_DISP_DISP_INTERFACE_CONTROL 0x42f */
+#define DATA_FORMAT_SHIFT	0
+#define DATA_FORMAT_MASK	(0xf << DATA_FORMAT_SHIFT)
+enum {
+	DATA_FORMAT_DF1P1C,
+	DATA_FORMAT_DF1P2C24B,
+	DATA_FORMAT_DF1P2C18B,
+	DATA_FORMAT_DF1P2C16B,
+	DATA_FORMAT_DF2S,
+	DATA_FORMAT_DF3S,
+	DATA_FORMAT_DFSPI,
+	DATA_FORMAT_DF1P3C24B,
+	DATA_FORMAT_DF1P3C18B,
+};
+#define DATA_ALIGNMENT_SHIFT	8
+enum {
+	DATA_ALIGNMENT_MSB,
+	DATA_ALIGNMENT_LSB,
+};
+#define DATA_ORDER_SHIFT	9
+enum {
+	DATA_ORDER_RED_BLUE,
+	DATA_ORDER_BLUE_RED,
+};
+
+/* DC_DISP_DATA_ENABLE_OPTIONS 0x432 */
+#define DE_SELECT_SHIFT		0
+#define DE_SELECT_MASK		(0x3 << DE_SELECT_SHIFT)
+#define DE_SELECT_ACTIVE_BLANK	0x0
+#define DE_SELECT_ACTIVE	0x1
+#define DE_SELECT_ACTIVE_IS	0x2
+#define DE_CONTROL_SHIFT	2
+#define DE_CONTROL_MASK		(0x7 << DE_CONTROL_SHIFT)
+enum {
+	DE_CONTROL_ONECLK,
+	DE_CONTROL_NORMAL,
+	DE_CONTROL_EARLY_EXT,
+	DE_CONTROL_EARLY,
+	DE_CONTROL_ACTIVE_BLANK,
+};
+
+/* DC_WIN_WIN_OPTIONS 0x700 */
+#define H_DIRECTION		BIT(0)
+enum {
+	H_DIRECTION_INCREMENT,
+	H_DIRECTION_DECREMENT,
+};
+#define V_DIRECTION		BIT(2)
+enum {
+	V_DIRECTION_INCREMENT,
+	V_DIRECTION_DECREMENT,
+};
+#define COLOR_EXPAND		BIT(6)
+#define CP_ENABLE		BIT(16)
+#define DV_ENABLE		BIT(20)
+#define WIN_ENABLE		BIT(30)
+
+/* DC_WIN_BYTE_SWAP 0x701 */
+#define BYTE_SWAP_SHIFT		0
+#define BYTE_SWAP_MASK		(3 << BYTE_SWAP_SHIFT)
+enum {
+	BYTE_SWAP_NOSWAP,
+	BYTE_SWAP_SWAP2,
+	BYTE_SWAP_SWAP4,
+	BYTE_SWAP_SWAP4HW
+};
+
+/* DC_WIN_POSITION 0x704 */
+#define H_POSITION_SHIFT	0
+#define H_POSITION_MASK		(0x1FFF << H_POSITION_SHIFT)
+#define V_POSITION_SHIFT	16
+#define V_POSITION_MASK		(0x1FFF << V_POSITION_SHIFT)
+
+/* DC_WIN_SIZE 0x705 */
+#define H_SIZE_SHIFT		0
+#define H_SIZE_MASK		(0x1FFF << H_SIZE_SHIFT)
+#define V_SIZE_SHIFT		16
+#define V_SIZE_MASK		(0x1FFF << V_SIZE_SHIFT)
+
+/* DC_WIN_PRESCALED_SIZE 0x706 */
+#define H_PRESCALED_SIZE_SHIFT	0
+#define H_PRESCALED_SIZE_MASK	(0x7FFF << H_PRESCALED_SIZE_SHIFT)
+#define V_PRESCALED_SIZE_SHIFT	16
+#define V_PRESCALED_SIZE_MASK	(0x1FFF << V_PRESCALED_SIZE_SHIFT)
+
+/* DC_WIN_DDA_INCREMENT 0x709 */
+#define H_DDA_INC_SHIFT		0
+#define H_DDA_INC_MASK		(0xFFFF << H_DDA_INC_SHIFT)
+#define V_DDA_INC_SHIFT		16
+#define V_DDA_INC_MASK		(0xFFFF << V_DDA_INC_SHIFT)
+
+/* This holds information about a window which can be displayed */
+/* TODO: do we really need this for basic setup? Not sure yet. */
+struct disp_ctl_win {
+	enum win_color_depth_id fmt;	/* Color depth/format */
+	u32	bpp;		/* Bits per pixel */
+	u32	phys_addr;	/* Physical address in memory */
+	u32	x;		/* Horizontal address offset (bytes) */
+	u32	y;		/* Vertical address offset (bytes) */
+	u32	w;		/* Width of source window */
+	u32	h;		/* Height of source window */
+	u32	stride;		/* Number of bytes per line */
+	u32	out_x;		/* Left edge of output window (col) */
+	u32	out_y;		/* Top edge of output window (row) */
+	u32	out_w;		/* Width of output window in pixels */
+	u32	out_h;		/* Height of output window in pixels */
+};
+
+void display_startup(device_t dev);
+#endif /* __SOC_NVIDIA_TEGRA_DC_H */
diff --git a/src/soc/nvidia/tegra/gpio.c b/src/soc/nvidia/tegra/gpio.c
index d4b5bdd..0615320 100644
--- a/src/soc/nvidia/tegra/gpio.c
+++ b/src/soc/nvidia/tegra/gpio.c
@@ -26,41 +26,25 @@
 #include "gpio.h"
 #include "pinmux.h"
 
-static void gpio_input_common(int gpio_index, int pinmux_index,
-			      uint32_t pconfig)
+void __gpio_input(gpio_t gpio, u32 pull)
 {
-	pconfig |= PINMUX_INPUT_ENABLE;
-	gpio_set_int_enable(gpio_index, 0);
-	gpio_set_mode(gpio_index, GPIO_MODE_GPIO);
-	gpio_set_out_enable(gpio_index, 0);
-	pinmux_set_config(pinmux_index, pconfig);
+	u32 pinmux_config = PINMUX_INPUT_ENABLE | PINMUX_TRISTATE | pull;
+
+	gpio_set_int_enable(gpio, 0);
+	gpio_set_out_enable(gpio, 0);
+	gpio_set_mode(gpio, GPIO_MODE_GPIO);
+	pinmux_set_config(gpio >> GPIO_PINMUX_SHIFT, pinmux_config);
 }
 
-void gpio_input(int gpio_index, int pinmux_index)
+void gpio_output(gpio_t gpio, int value)
 {
-	gpio_input_common(gpio_index, pinmux_index, PINMUX_PULL_NONE);
-}
+	/* TODO: Set OPEN_DRAIN based on what pin it is? */
 
-void gpio_input_pullup(int gpio_index, int pinmux_index)
-{
-	gpio_input_common(gpio_index, pinmux_index, PINMUX_PULL_UP);
-}
-
-void gpio_input_pulldown(int gpio_index, int pinmux_index)
-{
-	gpio_input_common(gpio_index, pinmux_index, PINMUX_PULL_DOWN);
-}
-
-void gpio_output(int gpio_index, int pinmux_index, int value)
-{
-	uint32_t pconfig = PINMUX_PULL_NONE;
-
-	pinmux_set_config(pinmux_index, pconfig | PINMUX_TRISTATE);
-	gpio_set_int_enable(gpio_index, 0);
-	gpio_set_mode(gpio_index, GPIO_MODE_GPIO);
-	gpio_set_out_enable(gpio_index, 1);
-	gpio_set_out_value(gpio_index, value);
-	pinmux_set_config(pinmux_index, pconfig);
+	gpio_set_int_enable(gpio, 0);
+	gpio_set_out_value(gpio, value);
+	gpio_set_out_enable(gpio, 1);
+	gpio_set_mode(gpio, GPIO_MODE_GPIO);
+	pinmux_set_config(gpio >> GPIO_PINMUX_SHIFT, PINMUX_PULL_NONE);
 }
 
 enum {
@@ -74,167 +58,173 @@
 
 struct gpio_bank {
 	// Values
-	uint32_t config[GPIO_PORTS_PER_BANK];
-	uint32_t out_enable[GPIO_PORTS_PER_BANK];
-	uint32_t out_value[GPIO_PORTS_PER_BANK];
-	uint32_t in_value[GPIO_PORTS_PER_BANK];
-	uint32_t int_status[GPIO_PORTS_PER_BANK];
-	uint32_t int_enable[GPIO_PORTS_PER_BANK];
-	uint32_t int_level[GPIO_PORTS_PER_BANK];
-	uint32_t int_clear[GPIO_PORTS_PER_BANK];
+	u32 config[GPIO_PORTS_PER_BANK];
+	u32 out_enable[GPIO_PORTS_PER_BANK];
+	u32 out_value[GPIO_PORTS_PER_BANK];
+	u32 in_value[GPIO_PORTS_PER_BANK];
+	u32 int_status[GPIO_PORTS_PER_BANK];
+	u32 int_enable[GPIO_PORTS_PER_BANK];
+	u32 int_level[GPIO_PORTS_PER_BANK];
+	u32 int_clear[GPIO_PORTS_PER_BANK];
 
 	// Masks
-	uint32_t config_mask[GPIO_PORTS_PER_BANK];
-	uint32_t out_enable_mask[GPIO_PORTS_PER_BANK];
-	uint32_t out_value_mask[GPIO_PORTS_PER_BANK];
-	uint32_t in_value_mask[GPIO_PORTS_PER_BANK];
-	uint32_t int_status_mask[GPIO_PORTS_PER_BANK];
-	uint32_t int_enable_mask[GPIO_PORTS_PER_BANK];
-	uint32_t int_level_mask[GPIO_PORTS_PER_BANK];
-	uint32_t int_clear_mask[GPIO_PORTS_PER_BANK];
+	u32 config_mask[GPIO_PORTS_PER_BANK];
+	u32 out_enable_mask[GPIO_PORTS_PER_BANK];
+	u32 out_value_mask[GPIO_PORTS_PER_BANK];
+	u32 in_value_mask[GPIO_PORTS_PER_BANK];
+	u32 int_status_mask[GPIO_PORTS_PER_BANK];
+	u32 int_enable_mask[GPIO_PORTS_PER_BANK];
+	u32 int_level_mask[GPIO_PORTS_PER_BANK];
+	u32 int_clear_mask[GPIO_PORTS_PER_BANK];
 };
 
 static const struct gpio_bank *gpio_banks = (void *)TEGRA_GPIO_BASE;
 
-static uint32_t gpio_read_port(int index, size_t offset)
+static u32 gpio_read_port(int index, size_t offset)
 {
 	int bank = index / GPIO_GPIOS_PER_BANK;
 	int port = (index - bank * GPIO_GPIOS_PER_BANK) / GPIO_GPIOS_PER_PORT;
 
-	return read32((uint8_t *)&gpio_banks[bank] + offset +
-		      port * sizeof(uint32_t));
+	return read32((u8 *)&gpio_banks[bank] + offset +
+		      port * sizeof(u32));
 }
 
-static void gpio_write_port(int index, size_t offset,
-			    uint32_t mask, uint32_t value)
+static void gpio_write_port(int index, size_t offset, u32 mask, u32 value)
 {
 	int bank = index / GPIO_GPIOS_PER_BANK;
 	int port = (index - bank * GPIO_GPIOS_PER_BANK) / GPIO_GPIOS_PER_PORT;
 
-	uint32_t reg = read32((uint8_t *)&gpio_banks[bank] + offset +
-			      port * sizeof(uint32_t));
-	uint32_t new_reg = (reg & ~mask) | (value & mask);
+	u32 reg = read32((u8 *)&gpio_banks[bank] + offset +
+			      port * sizeof(u32));
+	u32 new_reg = (reg & ~mask) | (value & mask);
 
 	if (new_reg != reg) {
-		write32(new_reg, (uint8_t *)&gpio_banks[bank] + offset +
-			port * sizeof(uint32_t));
+		write32(new_reg, (u8 *)&gpio_banks[bank] + offset +
+			port * sizeof(u32));
 	}
 }
 
-void gpio_set_mode(int gpio_index, enum gpio_mode mode)
+void gpio_set_mode(gpio_t gpio, enum gpio_mode mode)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, config),
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, config),
 			1 << bit, mode ? (1 << bit) : 0);
 }
 
-int gpio_get_mode(int gpio_index)
+int gpio_get_mode(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, config));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, config));
 	return (port & (1 << bit)) != 0;
 }
 
-void gpio_set_lock(int gpio_index)
+void gpio_set_lock(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT + GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, config),
+	int bit = gpio % GPIO_GPIOS_PER_PORT + GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, config),
 			1 << bit, 1 << bit);
 }
 
-int gpio_get_lock(int gpio_index)
+int gpio_get_lock(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT + GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, config));
+	int bit = gpio % GPIO_GPIOS_PER_PORT + GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, config));
 	return (port & (1 << bit)) != 0;
 }
 
-void gpio_set_out_enable(int gpio_index, int enable)
+void gpio_set_out_enable(gpio_t gpio, int enable)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, out_enable),
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, out_enable),
 			1 << bit, enable ? (1 << bit) : 0);
 }
 
-int gpio_get_out_enable(int gpio_index)
+int gpio_get_out_enable(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, out_enable));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, out_enable));
 	return (port & (1 << bit)) != 0;
 }
 
-void gpio_set_out_value(int gpio_index, int value)
+void gpio_set_out_value(gpio_t gpio, int value)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, out_value),
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, out_value),
 			1 << bit, value ? (1 << bit) : 0);
 }
 
-int gpio_get_out_value(int gpio_index)
+int gpio_get_out_value(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, out_value));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, out_value));
 	return (port & (1 << bit)) != 0;
 }
 
-int gpio_get_in_value(int gpio_index)
+int gpio_get_in_value(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, in_value));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, in_value));
 	return (port & (1 << bit)) != 0;
 }
 
-int gpio_get_int_status(int gpio_index)
+int gpio_get_int_status(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, int_status));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, int_status));
 	return (port & (1 << bit)) != 0;
 }
 
-void gpio_set_int_enable(int gpio_index, int enable)
+void gpio_set_int_enable(gpio_t gpio, int enable)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, int_enable),
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, int_enable),
 			1 << bit, enable ? (1 << bit) : 0);
 }
 
-int gpio_get_int_enable(int gpio_index)
+int gpio_get_int_enable(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, int_enable));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, int_enable));
 	return (port & (1 << bit)) != 0;
 }
 
-void gpio_set_int_level(int gpio_index, int high_rise, int edge, int delta)
+void gpio_set_int_level(gpio_t gpio, int high_rise, int edge, int delta)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t value = (high_rise ? (0x000001 << bit) : 0) |
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 value = (high_rise ? (0x000001 << bit) : 0) |
 			 (edge ? (0x000100 << bit) : 0) |
-			 (delta ? (0x010000 << bit) : 0);
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, config),
+			(delta ? (0x010000 << bit) : 0);
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, int_level),
 			0x010101 << bit, value);
 }
 
-void gpio_get_int_level(int gpio_index, int *high_rise, int *edge, int *delta)
+void gpio_get_int_level(gpio_t gpio, int *high_rise, int *edge, int *delta)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	uint32_t port = gpio_read_port(gpio_index,
-				       offsetof(struct gpio_bank, int_level));
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	u32 port = gpio_read_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+				  offsetof(struct gpio_bank, int_level));
 	*high_rise = ((port & (0x000001 << bit)) != 0);
 	*edge = ((port & (0x000100 << bit)) != 0);
 	*delta = ((port & (0x010000 << bit)) != 0);
 }
 
-void gpio_set_int_clear(int gpio_index)
+void gpio_set_int_clear(gpio_t gpio)
 {
-	int bit = gpio_index % GPIO_GPIOS_PER_PORT;
-	gpio_write_port(gpio_index, offsetof(struct gpio_bank, int_clear),
+	int bit = gpio % GPIO_GPIOS_PER_PORT;
+	gpio_write_port(gpio & ((1 << GPIO_PINMUX_SHIFT) - 1),
+			offsetof(struct gpio_bank, int_clear),
 			1 << bit, 1 << bit);
 }
diff --git a/src/soc/nvidia/tegra/gpio.h b/src/soc/nvidia/tegra/gpio.h
index b62dc90..546ea05 100644
--- a/src/soc/nvidia/tegra/gpio.h
+++ b/src/soc/nvidia/tegra/gpio.h
@@ -22,12 +22,34 @@
 
 #include <stdint.h>
 
-/* Higher level functions for common GPIO configurations. */
+#include "pinmux.h"
 
-void gpio_input(int gpio_index, int pinmux_index);
-void gpio_input_pullup(int gpio_index, int pinmux_index);
-void gpio_input_pulldown(int gpio_index, int pinmux_index);
-void gpio_output(int gpio_index, int pinmux_index, int value);
+/* Wrapper type for GPIOs. Always use GPIO() macro to generate. */
+typedef u32 gpio_t;
+
+#define GPIO_PINMUX_SHIFT 16
+#define GPIO(name) ((gpio_t)(GPIO_##name##_INDEX | \
+			     (PINMUX_GPIO_##name << GPIO_PINMUX_SHIFT)))
+
+/* Higher level function wrappers for common GPIO configurations. */
+
+void gpio_output(gpio_t gpio, int value);
+void __gpio_input(gpio_t gpio, u32 pull);
+
+static inline void gpio_input(gpio_t gpio)
+{
+	__gpio_input(gpio, PINMUX_PULL_NONE);
+}
+
+static inline void gpio_input_pulldown(gpio_t gpio)
+{
+	__gpio_input(gpio, PINMUX_PULL_DOWN);
+}
+
+static inline void gpio_input_pullup(gpio_t gpio)
+{
+	__gpio_input(gpio, PINMUX_PULL_UP);
+}
 
 /* Functions to modify specific GPIO control values. */
 
@@ -35,29 +57,29 @@
 	GPIO_MODE_SPIO = 0,
 	GPIO_MODE_GPIO = 1
 };
-void gpio_set_mode(int gpio_index, enum gpio_mode);
-int gpio_get_mode(int gpio_index);
+void gpio_set_mode(gpio_t gpio, enum gpio_mode);
+int gpio_get_mode(gpio_t gpio);
 
 // Lock a GPIO with extreme caution since they can't be unlocked.
-void gpio_set_lock(int gpio_index);
-int gpio_get_lock(int gpio_index);
+void gpio_set_lock(gpio_t gpio);
+int gpio_get_lock(gpio_t gpio);
 
-void gpio_set_out_enable(int gpio_index, int enable);
-int gpio_get_out_enable(int gpio_index);
+void gpio_set_out_enable(gpio_t gpio, int enable);
+int gpio_get_out_enable(gpio_t gpio);
 
-void gpio_set_out_value(int gpio_index, int value);
-int gpio_get_out_value(int gpio_index);
+void gpio_set_out_value(gpio_t gpio, int value);
+int gpio_get_out_value(gpio_t gpio);
 
-int gpio_get_in_value(int gpio_index);
+int gpio_get_in_value(gpio_t gpio);
 
-int gpio_get_int_status(int gpio_index);
+int gpio_get_int_status(gpio_t gpio);
 
-void gpio_set_int_enable(int gpio_index, int enable);
-int gpio_get_int_enable(int gpio_index);
+void gpio_set_int_enable(gpio_t gpio, int enable);
+int gpio_get_int_enable(gpio_t gpio);
 
-void gpio_set_int_level(int gpio_index, int high_rise, int edge, int delta);
-void gpio_get_int_level(int gpio_index, int *high_rise, int *edge, int *delta);
+void gpio_set_int_level(gpio_t gpio, int high_rise, int edge, int delta);
+void gpio_get_int_level(gpio_t gpio, int *high_rise, int *edge, int *delta);
 
-void gpio_set_int_clear(int gpio_index);
+void gpio_set_int_clear(gpio_t gpio);
 
 #endif	/* __SOC_NVIDIA_TEGRA_GPIO_H__ */
diff --git a/src/soc/nvidia/tegra/i2c.c b/src/soc/nvidia/tegra/i2c.c
index ddb54a5..e9001f0 100644
--- a/src/soc/nvidia/tegra/i2c.c
+++ b/src/soc/nvidia/tegra/i2c.c
@@ -32,12 +32,10 @@
 {
 	while (data_len) {
 		uint32_t status = read32(&regs->fifo_status);
-		int tx_empty =
-			status & TEGRA_I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK;
-		tx_empty >>= TEGRA_I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT;
-		int rx_full =
-			status & TEGRA_I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK;
-		rx_full >>= TEGRA_I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT;
+		int tx_empty = status & I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK;
+		tx_empty >>= I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT;
+		int rx_full = status & I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK;
+		rx_full >>= I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT;
 
 		while (header_words && tx_empty) {
 			write32(*headers++, &regs->tx_packet_fifo);
@@ -73,19 +71,17 @@
 		uint32_t transfer_status =
 			read32(&regs->packet_transfer_status);
 
-		if (transfer_status & TEGRA_I2C_PKT_STATUS_NOACK_ADDR_MASK) {
+		if (transfer_status & I2C_PKT_STATUS_NOACK_ADDR) {
 			printk(BIOS_ERR,
 			       "%s: The address was not acknowledged.\n",
 			       __func__);
 			return -1;
-		} else if (transfer_status &
-			   TEGRA_I2C_PKT_STATUS_NOACK_DATA_MASK) {
+		} else if (transfer_status & I2C_PKT_STATUS_NOACK_DATA) {
 			printk(BIOS_ERR,
 			       "%s: The data was not acknowledged.\n",
 			       __func__);
 			return -1;
-		} else if (transfer_status &
-			   TEGRA_I2C_PKT_STATUS_ARB_LOST_MASK) {
+		} else if (transfer_status & I2C_PKT_STATUS_ARB_LOST) {
 			printk(BIOS_ERR,
 			       "%s: Lost arbitration.\n",
 			       __func__);
@@ -108,25 +104,22 @@
 		return -1;
 	}
 
-	headers[0] = (0 << IOHEADER_WORD0_PROTHDRSZ_SHIFT) |
-		     (1 << IOHEADER_WORD0_PKTID_SHIFT) |
-		     (bus << IOHEADER_WORD0_CONTROLLER_ID_SHIFT) |
-		     IOHEADER_WORD0_PROTOCOL_I2C |
-		     IOHEADER_WORD0_PKTTYPE_REQUEST;
+	headers[0] = (0 << IOHEADER_PROTHDRSZ_SHIFT) |
+		     (1 << IOHEADER_PKTID_SHIFT) |
+		     (bus << IOHEADER_CONTROLLER_ID_SHIFT) |
+		     IOHEADER_PROTOCOL_I2C | IOHEADER_PKTTYPE_REQUEST;
 
-	headers[1] = (data_len - 1) << IOHEADER_WORD1_PAYLOADSIZE_SHIFT;
+	headers[1] = (data_len - 1) << IOHEADER_PAYLOADSIZE_SHIFT;
 
 	uint32_t slave_addr = (chip << 1) | (read ? 1 : 0);
-	headers[2] = IOHEADER_I2C_REQ_ADDRESS_MODE_7BIT |
+	headers[2] = IOHEADER_I2C_REQ_ADDR_MODE_7BIT |
 		     (slave_addr << IOHEADER_I2C_REQ_SLAVE_ADDR_SHIFT);
 	if (read)
-		headers[2] |= IOHEADER_I2C_REQ_READ_WRITE_READ;
-	else
-		headers[2] |= IOHEADER_I2C_REQ_READ_WRITE_WRITE;
+		headers[2] |= IOHEADER_I2C_REQ_READ;
 	if (restart)
-		headers[2] |= IOHEADER_I2C_REQ_REPEAT_START_STOP_START;
+		headers[2] |= IOHEADER_I2C_REQ_REPEAT_START;
 	if (cont)
-		headers[2] |= IOHEADER_I2C_REQ_CONTINUE_XFER_MASK;
+		headers[2] |= IOHEADER_I2C_REQ_CONTINUE_XFER;
 
 	return tegra_i2c_send_recv(regs, read, headers, ARRAY_SIZE(headers),
 				   data, data_len);
@@ -136,8 +129,7 @@
 			 unsigned alen, uint8_t *buf, unsigned len, int read)
 {
 	const uint32_t max_payload =
-		(IOHEADER_WORD1_PAYLOADSIZE_MASK + 1) >>
-		IOHEADER_WORD1_PAYLOADSIZE_SHIFT;
+		(IOHEADER_PAYLOADSIZE_MASK + 1) >> IOHEADER_PAYLOADSIZE_SHIFT;
 	uint8_t abuf[sizeof(addr)];
 
 	int i;
@@ -176,5 +168,5 @@
 {
 	struct tegra_i2c_regs * const regs = tegra_i2c_bases[bus];
 
-	write32(TEGRA_I2C_CNFG_PACKET_MODE_EN_MASK, &regs->cnfg);
+	write32(I2C_CNFG_PACKET_MODE_EN, &regs->cnfg);
 }
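Reviewer note: the renamed FIFO status constants are used as a plain mask-then-shift pair. A minimal standalone sketch of that decoding follows, with the enum values copied from the i2c.h hunk in this change. The helper names `fifo_tx_empty()` and `fifo_rx_full()` are hypothetical and do not exist in the driver, and reading the TX count as "free slots" is an assumption based on how the send loop uses it.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as defined in src/soc/nvidia/tegra/i2c.h after this change. */
enum {
	I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT = 4,
	I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK =
		0xf << I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT,
	I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT = 0,
	I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK =
		0xf << I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT
};

/* Hypothetical helper: extract the TX_FIFO_EMPTY_CNT field (bits 7:4). */
static inline int fifo_tx_empty(uint32_t status)
{
	return (status & I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK) >>
	       I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT;
}

/* Hypothetical helper: extract the RX_FIFO_FULL_CNT field (bits 3:0). */
static inline int fifo_rx_full(uint32_t status)
{
	return (status & I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK) >>
	       I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT;
}
```

For a raw status value of 0x83 these helpers yield a TX count of 8 and an RX count of 3.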
diff --git a/src/soc/nvidia/tegra/i2c.h b/src/soc/nvidia/tegra/i2c.h
index 630d890..997ec9c 100644
--- a/src/soc/nvidia/tegra/i2c.h
+++ b/src/soc/nvidia/tegra/i2c.h
@@ -24,133 +24,90 @@
 
 void i2c_init(unsigned bus);
 
-#define IOHEADER_BITFIELD(name, shift, mask) \
-	IOHEADER_##name##_SHIFT = shift, \
-	IOHEADER_##name##_MASK = \
-		mask << IOHEADER_##name##_SHIFT
-
-#define IOHEADER_BITFIELD_VAL(field, name, val) \
-	IOHEADER_##field##_##name = \
-		val << IOHEADER_##field##_SHIFT
-
 enum {
 	/* Word 0 */
-	IOHEADER_BITFIELD(WORD0_PROTHDRSZ, 28, 0x3),
-
-	IOHEADER_BITFIELD(WORD0_PKTID, 16, 0xff),
-
-	IOHEADER_BITFIELD(WORD0_CONTROLLER_ID, 12, 0xf),
-
-	IOHEADER_BITFIELD(WORD0_PROTOCOL, 4, 0xf),
-	IOHEADER_BITFIELD_VAL(WORD0_PROTOCOL, I2C, 1),
-
-	IOHEADER_BITFIELD(WORD0_PKTTYPE, 0, 0x7),
-	IOHEADER_BITFIELD_VAL(WORD0_PKTTYPE, REQUEST, 0),
-	IOHEADER_BITFIELD_VAL(WORD0_PKTTYPE, RESPONSE, 1),
-	IOHEADER_BITFIELD_VAL(WORD0_PKTTYPE, INTERRUPT, 2),
-	IOHEADER_BITFIELD_VAL(WORD0_PKTTYPE, STOP, 3),
+	IOHEADER_PROTHDRSZ_SHIFT = 28,
+	IOHEADER_PROTHDRSZ_MASK = 0x3 << IOHEADER_PROTHDRSZ_SHIFT,
+	IOHEADER_PKTID_SHIFT = 16,
+	IOHEADER_PKTID_MASK = 0xff << IOHEADER_PKTID_SHIFT,
+	IOHEADER_CONTROLLER_ID_SHIFT = 12,
+	IOHEADER_CONTROLLER_ID_MASK = 0xf << IOHEADER_CONTROLLER_ID_SHIFT,
+	IOHEADER_PROTOCOL_SHIFT = 4,
+	IOHEADER_PROTOCOL_MASK = 0xf << IOHEADER_PROTOCOL_SHIFT,
+	IOHEADER_PROTOCOL_I2C = 1 << IOHEADER_PROTOCOL_SHIFT,
+	IOHEADER_PKTTYPE_SHIFT = 0,
+	IOHEADER_PKTTYPE_MASK = 0x7 << IOHEADER_PKTTYPE_SHIFT,
+	IOHEADER_PKTTYPE_REQUEST = 0 << IOHEADER_PKTTYPE_SHIFT,
+	IOHEADER_PKTTYPE_RESPONSE = 1 << IOHEADER_PKTTYPE_SHIFT,
+	IOHEADER_PKTTYPE_INTERRUPT = 2 << IOHEADER_PKTTYPE_SHIFT,
+	IOHEADER_PKTTYPE_STOP = 3 << IOHEADER_PKTTYPE_SHIFT,
 
 	/* Word 1 */
-	IOHEADER_BITFIELD(WORD1_PAYLOADSIZE, 0, 0xfff)
+	IOHEADER_PAYLOADSIZE_SHIFT = 0,
+	IOHEADER_PAYLOADSIZE_MASK = 0xfff << IOHEADER_PAYLOADSIZE_SHIFT
 };
 
 enum {
-	IOHEADER_BITFIELD(I2C_REQ_RESP_PKT_FREQ_SHIFT, 25, 0x1),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_RESP_PKT_FREQ_SHIFT, END, 0),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_RESP_PKT_FREQ_SHIFT, EACH, 1),
-
-	IOHEADER_BITFIELD(I2C_REQ_RESP_PKT_ENABLE, 24, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_HS_MODE, 22, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_CONTINUE_ON_NACK, 21, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_SEND_START_BYTE, 20, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_READ_WRITE, 19, 0x1),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_READ_WRITE, WRITE, 0),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_READ_WRITE, READ, 1),
-
-	IOHEADER_BITFIELD(I2C_REQ_ADDRESS_MODE, 18, 0x1),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_ADDRESS_MODE, 7BIT, 0),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_ADDRESS_MODE, 10BIT, 1),
-
-	IOHEADER_BITFIELD(I2C_REQ_IE, 17, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_REPEAT_START_STOP, 16, 0x1),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_REPEAT_START_STOP, STOP, 0),
-	IOHEADER_BITFIELD_VAL(I2C_REQ_REPEAT_START_STOP, START, 1),
-
-	IOHEADER_BITFIELD(I2C_REQ_CONTINUE_XFER, 15, 0x1),
-
-	IOHEADER_BITFIELD(I2C_REQ_HS_MASTER_ADDR, 12, 0x7),
-
-	IOHEADER_BITFIELD(I2C_REQ_SLAVE_ADDR, 0, 0x3ff)
+	IOHEADER_I2C_REQ_RESP_FREQ_MASK = 0x1 << 25,
+	IOHEADER_I2C_REQ_RESP_FREQ_END = 0 << 25,
+	IOHEADER_I2C_REQ_RESP_FREQ_EACH = 1 << 25,
+	IOHEADER_I2C_REQ_RESP_ENABLE = 0x1 << 24,
+	IOHEADER_I2C_REQ_HS_MODE = 0x1 << 22,
+	IOHEADER_I2C_REQ_CONTINUE_ON_NACK = 0x1 << 21,
+	IOHEADER_I2C_REQ_SEND_START_BYTE = 0x1 << 20,
+	IOHEADER_I2C_REQ_READ = 0x1 << 19,
+	IOHEADER_I2C_REQ_ADDR_MODE_MASK = 0x1 << 18,
+	IOHEADER_I2C_REQ_ADDR_MODE_7BIT = 0 << 18,
+	IOHEADER_I2C_REQ_ADDR_MODE_10BIT = 1 << 18,
+	IOHEADER_I2C_REQ_IE = 0x1 << 17,
+	IOHEADER_I2C_REQ_REPEAT_START = 0x1 << 16,
+	IOHEADER_I2C_REQ_STOP = 0x0 << 16,
+	IOHEADER_I2C_REQ_CONTINUE_XFER = 0x1 << 15,
+	IOHEADER_I2C_REQ_HS_MASTER_ADDR_SHIFT = 12,
+	IOHEADER_I2C_REQ_HS_MASTER_ADDR_MASK =
+		0x7 << IOHEADER_I2C_REQ_HS_MASTER_ADDR_SHIFT,
+	IOHEADER_I2C_REQ_SLAVE_ADDR_SHIFT = 0,
+	IOHEADER_I2C_REQ_SLAVE_ADDR_MASK =
+		0x3ff << IOHEADER_I2C_REQ_SLAVE_ADDR_SHIFT
 };
 
 enum {
-	TEGRA_I2C_CNFG_MSTR_CLR_BUS_ON_TIMEOUT_SHIFT = 15,
-	TEGRA_I2C_CNFG_MSTR_CLR_BUS_ON_TIMEOUT_MASK =
-		0x1 << TEGRA_I2C_CNFG_MSTR_CLR_BUS_ON_TIMEOUT_SHIFT,
-	TEGRA_I2C_CNFG_DEBOUNCE_CNT_SHIFT = 12,
-	TEGRA_I2C_CNFG_DEBOUNCE_CNT_MASK =
-		0x7 << TEGRA_I2C_CNFG_DEBOUNCE_CNT_SHIFT,
-	TEGRA_I2C_CNFG_NEW_MASTER_FSM_SHIFT = 11,
-	TEGRA_I2C_CNFG_NEW_MASTER_FSM_MASK =
-		0x1 << TEGRA_I2C_CNFG_NEW_MASTER_FSM_SHIFT,
-	TEGRA_I2C_CNFG_PACKET_MODE_EN_SHIFT = 10,
-	TEGRA_I2C_CNFG_PACKET_MODE_EN_MASK =
-		0x1 << TEGRA_I2C_CNFG_PACKET_MODE_EN_SHIFT,
-	TEGRA_I2C_CNFG_SEND_SHIFT = 9,
-	TEGRA_I2C_CNFG_SEND_MASK = 0x1 << TEGRA_I2C_CNFG_SEND_SHIFT,
-	TEGRA_I2C_CNFG_NOACK_SHIFT = 8,
-	TEGRA_I2C_CNFG_NOACK_MASK = 0x1 << TEGRA_I2C_CNFG_NOACK_SHIFT,
-	TEGRA_I2C_CNFG_CMD2_SHIFT = 7,
-	TEGRA_I2C_CNFG_CMD2_MASK = 0x1 << TEGRA_I2C_CNFG_CMD2_SHIFT,
-	TEGRA_I2C_CNFG_CMD1_SHIFT = 6,
-	TEGRA_I2C_CNFG_CMD1_MASK = 0x1 << TEGRA_I2C_CNFG_CMD1_SHIFT,
-	TEGRA_I2C_CNFG_START_SHIFT = 5,
-	TEGRA_I2C_CNFG_START_MASK = 0x1 << TEGRA_I2C_CNFG_START_SHIFT,
-	TEGRA_I2C_CNFG_SLV2_SHIFT = 4,
-	TEGRA_I2C_CNFG_SLV2_MASK = 0x1 << TEGRA_I2C_CNFG_SLV2_SHIFT,
-	TEGRA_I2C_CNFG_LENGTH_SHIFT = 1,
-	TEGRA_I2C_CNFG_LENGTH_MASK = 0x7 << TEGRA_I2C_CNFG_LENGTH_SHIFT,
-	TEGRA_I2C_CNFG_A_MOD_SHIFT = 0,
-	TEGRA_I2C_CNFG_A_MOD_MASK = 0x1 << TEGRA_I2C_CNFG_A_MOD_SHIFT
+	I2C_CNFG_MSTR_CLR_BUS_ON_TIMEOUT = 0x1 << 15,
+	I2C_CNFG_DEBOUNCE_CNT_SHIFT = 12,
+	I2C_CNFG_DEBOUNCE_CNT_MASK = 0x7 << I2C_CNFG_DEBOUNCE_CNT_SHIFT,
+	I2C_CNFG_NEW_MASTER_FSM = 0x1 << 11,
+	I2C_CNFG_PACKET_MODE_EN = 0x1 << 10,
+	I2C_CNFG_SEND = 0x1 << 9,
+	I2C_CNFG_NOACK = 0x1 << 8,
+	I2C_CNFG_CMD2 = 0x1 << 7,
+	I2C_CNFG_CMD1 = 0x1 << 6,
+	I2C_CNFG_START = 0x1 << 5,
+	I2C_CNFG_SLV2_SHIFT = 4,
+	I2C_CNFG_SLV2_MASK = 0x1 << I2C_CNFG_SLV2_SHIFT,
+	I2C_CNFG_LENGTH_SHIFT = 1,
+	I2C_CNFG_LENGTH_MASK = 0x7 << I2C_CNFG_LENGTH_SHIFT,
+	I2C_CNFG_A_MOD = 0x1 << 0,
 };
 
 enum {
-	TEGRA_I2C_PKT_STATUS_COMPLETE_SHIFT = 24,
-	TEGRA_I2C_PKT_STATUS_COMPLETE_MASK =
-		0x1 << TEGRA_I2C_PKT_STATUS_COMPLETE_SHIFT,
-	TEGRA_I2C_PKT_STATUS_PKT_ID_SHIFT = 16,
-	TEGRA_I2C_PKT_STATUS_PKT_ID_MASK =
-		0xff << TEGRA_I2C_PKT_STATUS_PKT_ID_SHIFT,
-	TEGRA_I2C_PKT_STATUS_BYTENUM_SHIFT = 4,
-	TEGRA_I2C_PKT_STATUS_BYTENUM_MASK =
-		0xfff << TEGRA_I2C_PKT_STATUS_BYTENUM_SHIFT,
-	TEGRA_I2C_PKT_STATUS_NOACK_ADDR_SHIFT = 3,
-	TEGRA_I2C_PKT_STATUS_NOACK_ADDR_MASK =
-		0x1 << TEGRA_I2C_PKT_STATUS_NOACK_ADDR_SHIFT,
-	TEGRA_I2C_PKT_STATUS_NOACK_DATA_SHIFT = 2,
-	TEGRA_I2C_PKT_STATUS_NOACK_DATA_MASK =
-		0x1 << TEGRA_I2C_PKT_STATUS_NOACK_DATA_SHIFT,
-	TEGRA_I2C_PKT_STATUS_ARB_LOST_SHIFT = 1,
-	TEGRA_I2C_PKT_STATUS_ARB_LOST_MASK =
-		0x1 << TEGRA_I2C_PKT_STATUS_ARB_LOST_SHIFT,
-	TEGRA_I2C_PKT_STATUS_BUSY_SHIFT = 0,
-	TEGRA_I2C_PKT_STATUS_BUSY_MASK =
-		0x1 << TEGRA_I2C_PKT_STATUS_BUSY_SHIFT,
+	I2C_PKT_STATUS_COMPLETE = 0x1 << 24,
+	I2C_PKT_STATUS_PKT_ID_SHIFT = 16,
+	I2C_PKT_STATUS_PKT_ID_MASK = 0xff << I2C_PKT_STATUS_PKT_ID_SHIFT,
+	I2C_PKT_STATUS_BYTENUM_SHIFT = 4,
+	I2C_PKT_STATUS_BYTENUM_MASK = 0xfff << I2C_PKT_STATUS_BYTENUM_SHIFT,
+	I2C_PKT_STATUS_NOACK_ADDR = 0x1 << 3,
+	I2C_PKT_STATUS_NOACK_DATA = 0x1 << 2,
+	I2C_PKT_STATUS_ARB_LOST = 0x1 << 1,
+	I2C_PKT_STATUS_BUSY = 0x1 << 0
 };
 
 enum {
-	TEGRA_I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT = 4,
-	TEGRA_I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK =
-		0xf << TEGRA_I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT,
-
-	TEGRA_I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT = 0,
-	TEGRA_I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK =
-		0xf << TEGRA_I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT
+	I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT = 4,
+	I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_MASK =
+		0xf << I2C_FIFO_STATUS_TX_FIFO_EMPTY_CNT_SHIFT,
+	I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT = 0,
+	I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_MASK =
+		0xf << I2C_FIFO_STATUS_RX_FIFO_FULL_CNT_SHIFT
 };
 
 extern void * const tegra_i2c_bases[];
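Reviewer note: with the `IOHEADER_BITFIELD` macros gone, the header words are built directly from the flat enum values. The sketch below mirrors how tegra/i2c.c assembles the payload-size word and the I2C request word, using only constants copied from the hunk above. The function names `i2c_payload_word()` and `i2c_req_word()` and the `I2C_MAX_PAYLOAD` constant are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the constants from src/soc/nvidia/tegra/i2c.h. */
enum {
	IOHEADER_PAYLOADSIZE_SHIFT = 0,
	IOHEADER_PAYLOADSIZE_MASK = 0xfff << IOHEADER_PAYLOADSIZE_SHIFT,
	IOHEADER_I2C_REQ_READ = 0x1 << 19,
	IOHEADER_I2C_REQ_ADDR_MODE_7BIT = 0 << 18,
	IOHEADER_I2C_REQ_REPEAT_START = 0x1 << 16,
	IOHEADER_I2C_REQ_CONTINUE_XFER = 0x1 << 15,
	IOHEADER_I2C_REQ_SLAVE_ADDR_SHIFT = 0
};

/* Hypothetical name; same expression as max_payload in tegra/i2c.c. */
enum {
	I2C_MAX_PAYLOAD =
		(IOHEADER_PAYLOADSIZE_MASK + 1) >> IOHEADER_PAYLOADSIZE_SHIFT
};

/* Word 1: the hardware field holds (length - 1), so 1..4096 bytes fit. */
static uint32_t i2c_payload_word(uint32_t data_len)
{
	assert(data_len >= 1 && data_len <= I2C_MAX_PAYLOAD);
	return (data_len - 1) << IOHEADER_PAYLOADSIZE_SHIFT;
}

/* I2C request word: 7-bit address plus R/W bit, then optional flags. */
static uint32_t i2c_req_word(unsigned chip, int read, int restart, int cont)
{
	uint32_t slave_addr = (chip << 1) | (read ? 1 : 0);
	uint32_t word = IOHEADER_I2C_REQ_ADDR_MODE_7BIT |
			(slave_addr << IOHEADER_I2C_REQ_SLAVE_ADDR_SHIFT);

	if (read)
		word |= IOHEADER_I2C_REQ_READ;
	if (restart)
		word |= IOHEADER_I2C_REQ_REPEAT_START;
	if (cont)
		word |= IOHEADER_I2C_REQ_CONTINUE_XFER;
	return word;
}
```

Note that a write no longer ORs in an explicit "write" value: the old `IOHEADER_I2C_REQ_READ_WRITE_WRITE` was 0, so dropping the `else` branch leaves the generated word unchanged.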
diff --git a/src/soc/nvidia/tegra124/Kconfig b/src/soc/nvidia/tegra124/Kconfig
index 4e5bd62..30c9feb 100644
--- a/src/soc/nvidia/tegra124/Kconfig
+++ b/src/soc/nvidia/tegra124/Kconfig
@@ -4,6 +4,7 @@
 	select ARCH_RAMSTAGE_ARMV7
 	select HAVE_UART_SPECIAL
 	select BOOTBLOCK_CONSOLE
+	select DYNAMIC_CBMEM
 	select ARM_BOOTBLOCK_CUSTOM
 	bool
 	default n
diff --git a/src/soc/nvidia/tegra124/Makefile.inc b/src/soc/nvidia/tegra124/Makefile.inc
index 1a04051..d78553d 100644
--- a/src/soc/nvidia/tegra124/Makefile.inc
+++ b/src/soc/nvidia/tegra124/Makefile.inc
@@ -2,8 +2,13 @@
 bootblock-y += bootblock_asm.S
 bootblock-y += cbfs.c
 bootblock-y += clock.c
+bootblock-y += cpug.S
+bootblock-y += dma.c
 bootblock-y += i2c.c
+bootblock-y += dma.c
 bootblock-y += monotonic_timer.c
+bootblock-y += power.c
+bootblock-y += spi.c
 bootblock-y += ../tegra/gpio.c
 bootblock-y += ../tegra/i2c.c
 bootblock-y += ../tegra/pingroup.c
@@ -14,12 +19,31 @@
 endif
 
 romstage-y += cbfs.c
+romstage-y += cbmem.c
+romstage-y += early_display.c
+romstage-y += dma.c
+romstage-y += i2c.c
 romstage-y += monotonic_timer.c
+romstage-y += spi.c
+romstage-y += ../tegra/gpio.c
+romstage-y += ../tegra/i2c.c
+romstage-y += ../tegra/pinmux.c
 romstage-y += timer.c
 romstage-$(CONFIG_CONSOLE_SERIAL) += uart.c
 
 ramstage-y += cbfs.c
+ramstage-y += cbmem.c
+ramstage-y += cpug.S
+ramstage-y += clock.c
+ramstage-y += display.c
+ramstage-y += dma.c
+ramstage-y += i2c.c
 ramstage-y += monotonic_timer.c
+ramstage-y += soc.c
+ramstage-y += spi.c
+ramstage-y += ../tegra/gpio.c
+ramstage-y += ../tegra/i2c.c
+ramstage-y += ../tegra/pinmux.c
 ramstage-y += timer.c
 ramstage-$(CONFIG_CONSOLE_SERIAL) += uart.c
 
diff --git a/src/soc/nvidia/tegra124/bootblock.c b/src/soc/nvidia/tegra124/bootblock.c
index a3bed23..2698611 100644
--- a/src/soc/nvidia/tegra124/bootblock.c
+++ b/src/soc/nvidia/tegra124/bootblock.c
@@ -18,38 +18,19 @@
  */
 
 #include <arch/hlt.h>
-#include <arch/io.h>
+#include <bootblock_common.h>
 #include <cbfs.h>
 #include <console/console.h>
-#include <delay.h>
+#include <soc/clock.h>
 
-#include "clock.h"
 #include "pinmux.h"
-
-static void hacky_hardcoded_uart_setup_function(void)
-{
-	// Assert UART reset and enable clock.
-	setbits_le32((void *)(0x60006000 + 4 + 0), 1 << 6);
-
-	// Enable the clock.
-	setbits_le32((void *)(0x60006000 + 4 * 4 + 0), 1 << 6);
-
-	// Set the clock source.
-	clrbits_le32((void *)(0x60006000 + 0x100 + 4 * 0x1e), 3 << 30);
-
-	udelay(2);
-
-	// De-assert reset to UART.
-	clrbits_le32((void *)(0x60006000 + 4 + 0), 1 << 6);
-}
+#include "power.h"
 
 void main(void)
 {
 	void *entry;
 
-	set_avp_clock_to_clkm();
-
-	hacky_hardcoded_uart_setup_function();
+	clock_early_uart();
 
 	// Serial out, tristate off.
 	pinmux_set_config(PINMUX_KB_ROW9_INDEX, PINMUX_KB_ROW9_FUNC_UA3);
@@ -61,7 +42,26 @@
 	if (CONFIG_BOOTBLOCK_CONSOLE)
 		console_init();
 
+	clock_init();
+
+	bootblock_mainboard_init();
+
+	pinmux_set_config(PINMUX_CORE_PWR_REQ_INDEX,
+			  PINMUX_CORE_PWR_REQ_FUNC_PWRON);
+	pinmux_set_config(PINMUX_CPU_PWR_REQ_INDEX,
+			  PINMUX_CPU_PWR_REQ_FUNC_CPU);
+	pinmux_set_config(PINMUX_PWR_INT_N_INDEX,
+			  PINMUX_PWR_INT_N_FUNC_PMICINTR |
+			  PINMUX_TRISTATE |
+			  PINMUX_INPUT_ENABLE);
+
+	power_enable_cpu_rail();
+	power_ungate_cpu();
+
 	entry = cbfs_load_stage(CBFS_DEFAULT_MEDIA, "fallback/romstage");
 
+	if (entry)
+		clock_cpu0_config_and_reset(entry);
+
 	hlt();
 }
diff --git a/src/soc/nvidia/tegra124/cbfs.c b/src/soc/nvidia/tegra124/cbfs.c
index ede9146..4497d6a 100644
--- a/src/soc/nvidia/tegra124/cbfs.c
+++ b/src/soc/nvidia/tegra124/cbfs.c
@@ -20,7 +20,11 @@
 
 #include <cbfs.h>  /* This driver serves as a CBFS media source. */
 
+#include "spi.h"
+
 int init_default_cbfs_media(struct cbfs_media *media)
 {
-	return -1;
+	return initialize_tegra_spi_cbfs_media(media,
+		(void *)CONFIG_CBFS_CACHE_ADDRESS,
+		CONFIG_CBFS_CACHE_SIZE);
 }
diff --git a/src/soc/nvidia/tegra124/cbmem.c b/src/soc/nvidia/tegra124/cbmem.c
new file mode 100644
index 0000000..d80b3a0
--- /dev/null
+++ b/src/soc/nvidia/tegra124/cbmem.c
@@ -0,0 +1,27 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <cbmem.h>
+#include <soc/addressmap.h>
+
+void *cbmem_top(void)
+{
+	return (void *)(CONFIG_SYS_SDRAM_BASE +
+		((CONFIG_DRAM_SIZE_MB - FB_SIZE_MB) << 20UL));
+}
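Reviewer note: `cbmem_top()` places the top of CBMEM at the start of the framebuffer carve-out, i.e. DRAM base plus (DRAM size minus framebuffer size), with the MiB count converted to bytes by a 20-bit shift. The sketch below works through that arithmetic with assumed values. The real `CONFIG_SYS_SDRAM_BASE`, `CONFIG_DRAM_SIZE_MB` and `FB_SIZE_MB` are not shown in this part of the change, so the numbers in the usage note and the function name `cbmem_top_addr()` are purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone version of the cbmem_top() arithmetic.
 * All three inputs are parameters here; in the patch they come from
 * Kconfig and soc/addressmap.h.
 */
static uint64_t cbmem_top_addr(uint64_t sdram_base, uint32_t dram_size_mb,
			       uint32_t fb_size_mb)
{
	/* The framebuffer is carved out of the top of DRAM. */
	assert(fb_size_mb <= dram_size_mb);

	/* Widen before shifting so large DRAM sizes cannot overflow. */
	return sdram_base + ((uint64_t)(dram_size_mb - fb_size_mb) << 20);
}
```

With an assumed base of 0x80000000, 2048 MiB of DRAM and a 32 MiB framebuffer, this gives 0xFE000000. The patch itself shifts an `int` expression, which behaves the same as long as the result stays below 2 GiB.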
diff --git a/src/soc/nvidia/tegra124/chip.h b/src/soc/nvidia/tegra124/chip.h
new file mode 100644
index 0000000..b05bcc7
--- /dev/null
+++ b/src/soc/nvidia/tegra124/chip.h
@@ -0,0 +1,91 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef __SOC_NVIDIA_TEGRA124_CHIP_H__
+#define __SOC_NVIDIA_TEGRA124_CHIP_H__
+#include <arch/cache.h>
+#include <soc/addressmap.h>
+#include "gpio.h"
+
+/* this is a misuse of the device tree. We're going to let it go for now but
+ * we should at minimum have a struct for the display controller, since
+ * the chip supports two.
+ */
+struct soc_nvidia_tegra124_config {
+	int xres;
+	int yres;
+	int framebuffer_bits_per_pixel;
+	int cache_policy;
+	/* There are two display controllers. It's not unimaginable that
+	 * we might someday have two of these structs in a single mainboard.
+	 */
+	u32 display_controller;
+	u32 framebuffer_base;
+	/* Technically, we can compute this. At the same time, some platforms
+	 * might want to specify a specific size for their own reasons. If it is
+	 * zero the soc code will compute it as xres*yres*framebuffer_bits_per_pixel/4
+	 */
+	u32 framebuffer_size;
+	/* GPIOs -- all, some, or none are used. Unused ones can be ignored
+	 * in devicetree.cb since if they are not set there they default to 0,
+	 * and 0 for a gpio means 'unused GPIO'.
+	 */
+	gpio_t backlight_en_gpio;
+	gpio_t lvds_shutdown_gpio;
+	gpio_t backlight_vdd_gpio;
+	gpio_t panel_vdd_gpio;
+
+	/* required info. */
+	/* pwm to use to set display contrast */
+	int pwm;
+	/* timings -- five numbers, all relative to the previous
+	 * event, not to absolute time.  e.g., vdd_data_delay is the
+	 * delay from vdd on to data, not from power on to data.
+	 * This is stated to be four timings in the
+	 * u-boot docs. In any event, in coreboot, we generally
+	 * only delay long enough to let the panel wake up and then
+	 * do the control operations -- meaning, for *coreboot*
+	 * we probably only need the vdd_delay, but payloads may
+	 * need the other info.
+	 */
+	/* Delay from power on to asserting vdd */
+	int vdd_delay;
+	/* delay between panel_vdd-rise and data-rise */
+	int vdd_data_delay;
+	/* delay between data-rise and backlight_vdd-rise */
+	int data_backlight_delay;
+	/* delay between backlight_vdd and pwm-rise */
+	int backlight_pwm_delay;
+	/* delay between pwm-rise and backlight_en-rise */
+	int pwm_backlight_en_delay;
+	/* display timing.
+	 * we have not found a dts in which these are set */
+	int href_to_sync; /* u-boot code says 'set to 1' */
+	int hsync_width;
+	int hback_porch;
+	int hfront_porch;
+	int vref_to_sync; /* u-boot code says 'set to 1' */
+	int vsync_width;
+	int vback_porch;
+	int vfront_porch;
+
+	int pixel_clock;
+};
+
+#endif /* __SOC_NVIDIA_TEGRA124_CHIP_H__ */
diff --git a/src/soc/nvidia/tegra124/clk_rst.h b/src/soc/nvidia/tegra124/clk_rst.h
index badb58b..4280f3c 100644
--- a/src/soc/nvidia/tegra124/clk_rst.h
+++ b/src/soc/nvidia/tegra124/clk_rst.h
@@ -17,203 +17,275 @@
 #ifndef _TEGRA124_CLK_RST_H_
 #define _TEGRA124_CLK_RST_H_
 
-/* PLL registers - there are several PLLs in the clock controller */
-struct clk_pll {
-	u32 pll_base;	/* the control register */
-	/* pll_out[0] is output A control, pll_out[1] is output B control */
-	u32 pll_out[2];
-	u32 pll_misc;	/* other misc things */
-};
-
-/* PLL registers - there are several PLLs in the clock controller */
-struct clk_pll_simple {
-	u32 pll_base;		/* the control register */
-	u32 pll_misc;		/* other misc things */
-};
-
-struct clk_pllm {
-	u32 pllm_base;		/* the control register */
-	u32 pllm_out;		/* output control */
-	u32 pllm_misc1;	/* misc1 */
-	u32 pllm_misc2;	/* misc2 */
-};
-
-/* RST_DEV_(L,H,U,V,W)_(SET,CLR) and CLK_ENB_(L,H,U,V,W)_(SET,CLR) */
-struct clk_set_clr {
-	u32 set;
-	u32 clr;
-};
-
-/*
- * Most PLLs use the clk_pll structure, but some have a simpler two-member
- * structure for which we use clk_pll_simple. The reason for this non-
- * othogonal setup is not stated.
- */
-enum {
-	TEGRA_CLK_PLLS		= 6,	/* Number of normal PLLs */
-	TEGRA_CLK_SIMPLE_PLLS	= 3,	/* Number of simple PLLs */
-	TEGRA_CLK_REGS		= 3,	/* Number of clock enable regs L/H/U */
-	TEGRA_CLK_SOURCES	= 64,	/* Number of ppl clock sources L/H/U */
-	TEGRA_CLK_REGS_VW	= 2,	/* Number of clock enable regs V/W */
-	TEGRA_CLK_SOURCES_VW	= 32,	/* Number of ppl clock sources V/W */
-	TEGRA_CLK_SOURCES_X	= 32,	/* Number of ppl clock sources X */
-};
-
 /* Clock/Reset Controller (CLK_RST_CONTROLLER_) regs */
-struct clk_rst_ctlr {
-	u32 crc_rst_src;			/* _RST_SOURCE_0,0x00 */
-	u32 crc_rst_dev[TEGRA_CLK_REGS];	/* _RST_DEVICES_L/H/U_0 */
-	u32 crc_clk_out_enb[TEGRA_CLK_REGS];	/* _CLK_OUT_ENB_L/H/U_0 */
-	u32 crc_reserved0;		/* reserved_0,		0x1C */
-	u32 crc_cclk_brst_pol;		/* _CCLK_BURST_POLICY_0, 0x20 */
-	u32 crc_super_cclk_div;	/* _SUPER_CCLK_DIVIDER_0,0x24 */
-	u32 crc_sclk_brst_pol;		/* _SCLK_BURST_POLICY_0, 0x28 */
-	u32 crc_super_sclk_div;	/* _SUPER_SCLK_DIVIDER_0,0x2C */
-	u32 crc_clk_sys_rate;		/* _CLK_SYSTEM_RATE_0,	0x30 */
-	u32 crc_reserved01;		/* reserved_0_1,	0x34 */
-	u32 crc_reserved02;		/* reserved_0_2,	0x38 */
-	u32 crc_reserved1;		/* reserved_1,		0x3C */
-	u32 crc_cop_clk_skip_plcy;	/* _COP_CLK_SKIP_POLICY_0,0x40 */
-	u32 crc_clk_mask_arm;		/* _CLK_MASK_ARM_0,	0x44 */
-	u32 crc_misc_clk_enb;		/* _MISC_CLK_ENB_0,	0x48 */
-	u32 crc_clk_cpu_cmplx;		/* _CLK_CPU_CMPLX_0,	0x4C */
-	u32 crc_osc_ctrl;		/* _OSC_CTRL_0,		0x50 */
-	u32 crc_pll_lfsr;		/* _PLL_LFSR_0,		0x54 */
-	u32 crc_osc_freq_det;		/* _OSC_FREQ_DET_0,	0x58 */
-	u32 crc_osc_freq_det_stat;	/* _OSC_FREQ_DET_STATUS_0,0x5C */
-	u32 crc_reserved2[8];		/* reserved_2[8],	0x60-7C */
-
-	struct clk_pll crc_pll[TEGRA_CLK_PLLS];	/* PLLs from 0x80 to 0xdc */
-
-	/* PLLs from 0xe0 to 0xf4    */
-	struct clk_pll_simple crc_pll_simple[TEGRA_CLK_SIMPLE_PLLS];
-
-	u32 crc_reserved10;		/* _reserved_10,	0xF8 */
-	u32 crc_reserved11;		/* _reserved_11,	0xFC */
-
-	u32 crc_clk_src[TEGRA_CLK_SOURCES]; /*_I2S1_0...	0x100-1fc */
-
-	u32 crc_reserved20[32];	/* _reserved_20,	0x200-27c */
-
-	u32 crc_clk_out_enb_x;		/* _CLK_OUT_ENB_X_0,	0x280 */
-	u32 crc_clk_enb_x_set;		/* _CLK_ENB_X_SET_0,	0x284 */
-	u32 crc_clk_enb_x_clr;		/* _CLK_ENB_X_CLR_0,	0x288 */
-
-	u32 crc_rst_devices_x;		/* _RST_DEVICES_X_0,	0x28c */
-	u32 crc_rst_dev_x_set;		/* _RST_DEV_X_SET_0,	0x290 */
-	u32 crc_rst_dev_x_clr;		/* _RST_DEV_X_CLR_0,	0x294 */
-
-	u32 crc_reserved21[23];	/* _reserved_21,	0x298-2f0 */
-
-	u32 crc_dfll_base;		/* _DFLL_BASE_0,	0x2f4 */
-
-	u32 crc_reserved22[2];		/* _reserved_22,	0x2f8-2fc */
-
-	/* _RST_DEV_L/H/U_SET_0 0x300 ~ 0x314 */
-	struct clk_set_clr crc_rst_dev_ex[TEGRA_CLK_REGS];
-
-	u32 crc_reserved30[2];		/* _reserved_30,	0x318, 0x31c */
-
-	/* _CLK_ENB_L/H/U_CLR_0 0x320 ~ 0x334 */
-	struct clk_set_clr crc_clk_enb_ex[TEGRA_CLK_REGS];
-
-	u32 crc_reserved31;		/* _reserved_31,	0x338 */
-
-	u32 crc_ccplex_pg_sm_ovrd;	/* _CCPLEX_PG_SM_OVRD_0,    0x33c */
-
-	u32 crc_rst_cpu_cmplx_set;	/* _RST_CPU_CMPLX_SET_0,    0x340 */
-	u32 crc_rst_cpu_cmplx_clr;	/* _RST_CPU_CMPLX_CLR_0,    0x344 */
-	u32 crc_clk_cpu_cmplx_set;	/* _CLK_CPU_CMPLX_SET_0,    0x348 */
-	u32 crc_clk_cpu_cmplx_clr;	/* _CLK_CPU_CMPLX_SET_0,    0x34c */
-
-	u32 crc_reserved32[2];		/* _reserved_32,      0x350,0x354 */
-
-	u32 crc_rst_dev_vw[TEGRA_CLK_REGS_VW]; /* _RST_DEVICES_V/W_0 */
-	u32 crc_clk_out_enb_vw[TEGRA_CLK_REGS_VW]; /* _CLK_OUT_ENB_V/W_0 */
-	u32 crc_cclkg_brst_pol;	/* _CCLKG_BURST_POLICY_0,   0x368 */
-	u32 crc_super_cclkg_div;	/* _SUPER_CCLKG_DIVIDER_0,  0x36C */
-	u32 crc_cclklp_brst_pol;	/* _CCLKLP_BURST_POLICY_0,  0x370 */
-	u32 crc_super_cclkp_div;	/* _SUPER_CCLKLP_DIVIDER_0, 0x374 */
-	u32 crc_clk_cpug_cmplx;	/* _CLK_CPUG_CMPLX_0,       0x378 */
-	u32 crc_clk_cpulp_cmplx;	/* _CLK_CPULP_CMPLX_0,      0x37C */
-	u32 crc_cpu_softrst_ctrl;	/* _CPU_SOFTRST_CTRL_0,     0x380 */
-	u32 crc_cpu_softrst_ctrl1;	/* _CPU_SOFTRST_CTRL1_0,    0x384 */
-	u32 crc_cpu_softrst_ctrl2;	/* _CPU_SOFTRST_CTRL2_0,    0x388 */
-	u32 crc_reserved33[9];		/* _reserved_33,        0x38c-3ac */
-	u32 crc_clk_src_vw[TEGRA_CLK_SOURCES_VW];	/* 0x3B0-0x42C */
-	/* _RST_DEV_V/W_SET_0 0x430 ~ 0x43c */
-	struct clk_set_clr crc_rst_dev_ex_vw[TEGRA_CLK_REGS_VW];
-	/* _CLK_ENB_V/W_CLR_0 0x440 ~ 0x44c */
-	struct clk_set_clr crc_clk_enb_ex_vw[TEGRA_CLK_REGS_VW];
-	u32 crc_rst_cpug_cmplx_set;	/* _RST_CPUG_CMPLX_SET_0,   0x450*/
-	u32 crc_rst_cpug_cmplx_clr;	/* _RST_CPUG_CMPLX_CLR_0,   0x454*/
-	u32 crc_rst_cpulp_cmplx_set;	/* _RST_CPULP_CMPLX_SET_0,  0x458*/
-	u32 crc_rst_cpulp_cmplx_clr;	/* _RST_CPULP_CMPLX_CLR_0,  0x45C*/
-	u32 crc_clk_cpug_cmplx_set;	/* _CLK_CPUG_CMPLX_SET_0,  0x460 */
-	u32 crc_clk_cpug_cmplx_clr;	/* _CLK_CPUG_CMPLX_CLR_0,  0x464 */
-	u32 crc_clk_cpulp_cmplx_set;	/* _CLK_CPULP_CMPLX_SET_0, 0x468 */
-	u32 crc_clk_cpulp_cmplx_clr;	/* _CLK_CPULP_CMPLX_CLR_0, 0x46C */
-	u32 crc_cpu_cmplx_status;	/* _CPU_CMPLX_STATUS_0,    0x470 */
-	u32 crc_reserved40[1];		/* _reserved_40,        0x474 */
-	u32 crc_intstatus;		/* __INTSTATUS_0,       0x478 */
-	u32 crc_intmask;		/* __INTMASK_0,         0x47C */
-	u32 crc_utmip_pll_cfg0;	/* _UTMIP_PLL_CFG0_0,	0x480 */
-	u32 crc_utmip_pll_cfg1;	/* _UTMIP_PLL_CFG1_0,	0x484 */
-	u32 crc_utmip_pll_cfg2;	/* _UTMIP_PLL_CFG2_0,	0x488 */
-
-	u32 crc_plle_aux;		/* _PLLE_AUX_0,		0x48C */
-	u32 crc_sata_pll_cfg0;		/* _SATA_PLL_CFG0_0,	0x490 */
-	u32 crc_sata_pll_cfg1;		/* _SATA_PLL_CFG1_0,	0x494 */
-	u32 crc_pcie_pll_cfg0;		/* _PCIE_PLL_CFG0_0,	0x498 */
-
-	u32 crc_prog_audio_dly_clk;	/* _PROG_AUDIO_DLY_CLK_0, 0x49C */
-	u32 crc_audio_sync_clk_i2s0;	/* _AUDIO_SYNC_CLK_I2S0_0, 0x4A0 */
-	u32 crc_audio_sync_clk_i2s1;	/* _AUDIO_SYNC_CLK_I2S1_0, 0x4A4 */
-	u32 crc_audio_sync_clk_i2s2;	/* _AUDIO_SYNC_CLK_I2S2_0, 0x4A8 */
-	u32 crc_audio_sync_clk_i2s3;	/* _AUDIO_SYNC_CLK_I2S3_0, 0x4AC */
-	u32 crc_audio_sync_clk_i2s4;	/* _AUDIO_SYNC_CLK_I2S4_0, 0x4B0 */
-	u32 crc_audio_sync_clk_spdif;	/* _AUDIO_SYNC_CLK_SPDIF_0, 0x4B4 */
-
-	u32 crc_plld2_base;		/* _PLLD2_BASE_0, 0x4B8 */
-	u32 crc_plld2_misc;		/* _PLLD2_MISC_0, 0x4BC */
-	u32 crc_utmip_pll_cfg3;	/* _UTMIP_PLL_CFG3_0, 0x4C0 */
-	u32 crc_pllrefe_base;		/* _PLLREFE_BASE_0, 0x4C4 */
-	u32 crc_pllrefe_misc;		/* _PLLREFE_MISC_0, 0x4C8 */
-	u32 crs_reserved_50[7];	/* _reserved_50, 0x4CC-0x4E4 */
-	u32 crc_pllc2_base;		/* _PLLC2_BASE_0, 0x4E8 */
-	u32 crc_pllc2_misc0;		/* _PLLC2_MISC_0_0, 0x4EC */
-	u32 crc_pllc2_misc1;		/* _PLLC2_MISC_1_0, 0x4F0 */
-	u32 crc_pllc2_misc2;		/* _PLLC2_MISC_2_0, 0x4F4 */
-	u32 crc_pllc2_misc3;		/* _PLLC2_MISC_3_0, 0x4F8 */
-	u32 crc_pllc3_base;		/* _PLLC3_BASE_0, 0x4FC */
-	u32 crc_pllc3_misc0;		/* _PLLC3_MISC_0_0, 0x500 */
-	u32 crc_pllc3_misc1;		/* _PLLC3_MISC_1_0, 0x504 */
-	u32 crc_pllc3_misc2;		/* _PLLC3_MISC_2_0, 0x508 */
-	u32 crc_pllc3_misc3;		/* _PLLC3_MISC_3_0, 0x50C */
-	u32 crc_pllx_misc1;		/* _PLLX_MISC_1_0, 0x510 */
-	u32 crc_pllx_misc2;		/* _PLLX_MISC_2_0, 0x514 */
-	u32 crc_pllx_misc3;		/* _PLLX_MISC_3_0, 0x518 */
-	u32 crc_xusbio_pll_cfg0;	/* _XUSBIO_PLL_CFG0_0, 0x51C */
-	u32 crc_xusbio_pll_cfg1;	/* _XUSBIO_PLL_CFG0_1, 0x520 */
-	u32 crc_plle_aux1;		/* _PLLE_AUX1_0, 0x524 */
-	u32 crc_pllp_reshift;		/* _PLLP_RESHIFT_0, 0x528 */
-	u32 crc_utmipll_hw_pwrdn_cfg0;	/* _UTMIPLL_HW_PWRDN_CFG0_0, 0x52C */
-	u32 crc_pllu_hw_pwrdn_cfg0;	/* _PLLU_HW_PWRDN_CFG0_0, 0x530 */
-	u32 crc_xusb_pll_cfg0;		/* _XUSB_PLL_CFG0_0, 0x534 */
-	u32 crc_reserved51[1];		/* _reserved_51,     0x538 */
-	u32 crc_clk_cpu_misc;		/* _CLK_CPU_MISC_0, 0x53C */
-	u32 crc_clk_cpug_misc;		/* _CLK_CPUG_MISC_0, 0x540 */
-	u32 crc_clk_cpulp_misc;	/* _CLK_CPULP_MISC_0, 0x544 */
-	u32 crc_pllx_hw_ctrl_cfg;	/* _PLLX_HW_CTRL_CFG_0, 0x548 */
-	u32 crc_pllx_sw_ramp_cfg;	/* _PLLX_SW_RAMP_CFG_0, 0x54C */
-	u32 crc_pllx_hw_ctrl_status;	/* _PLLX_HW_CTRL_STATUS_0, 0x550 */
-	u32 crc_reserved52[1];		/* _reserved_52,     0x554 */
-	u32 crc_super_gr3d_clk_div;	/* _SUPER_GR3D_CLK_DIVIDER_0, 0x558 */
-	u32 crc_spare_reg0;		/* _SPARE_REG0_0, 0x55C */
-
-	/* T124 - skip to 0x600 here for new CLK_SOURCE_ regs */
-	u32 crc_reserved60[40];	/* _reserved_60,     0x560 - 0x5FC */
-	u32 crc_clk_src_x[TEGRA_CLK_SOURCES_X]; /* XUSB, etc, 0x600-0x678 */
+struct __attribute__ ((__packed__)) clk_rst_ctlr {
+	u32 rst_src;			/* _RST_SOURCE,             0x000 */
+	u32 rst_dev_l;			/* _RST_DEVICES_L,          0x004 */
+	u32 rst_dev_h;			/* _RST_DEVICES_H,          0x008 */
+	u32 rst_dev_u;			/* _RST_DEVICES_U,          0x00c */
+	u32 clk_out_enb_l;		/* _CLK_OUT_ENB_L,          0x010 */
+	u32 clk_out_enb_h;		/* _CLK_OUT_ENB_H,          0x014 */
+	u32 clk_out_enb_u;		/* _CLK_OUT_ENB_U,          0x018 */
+	u32 _rsv0;			/*                          0x01c */
+	u32 cclk_brst_pol;		/* _CCLK_BURST_POLICY,      0x020 */
+	u32 super_cclk_div;		/* _SUPER_CCLK_DIVIDER,     0x024 */
+	u32 sclk_brst_pol;		/* _SCLK_BURST_POLICY,      0x028 */
+	u32 super_sclk_div;		/* _SUPER_SCLK_DIVIDER,     0x02C */
+	u32 clk_sys_rate;		/* _CLK_SYSTEM_RATE,        0x030 */
+	u32 _rsv1[3];			/*                      0x034-03c */
+	u32 cop_clk_skip_plcy;		/* _COP_CLK_SKIP_POLICY,    0x040 */
+	u32 clk_mask_arm;		/* _CLK_MASK_ARM,           0x044 */
+	u32 misc_clk_enb;		/* _MISC_CLK_ENB,           0x048 */
+	u32 clk_cpu_cmplx;		/* _CLK_CPU_CMPLX,          0x04C */
+	u32 osc_ctrl;			/* _OSC_CTRL,               0x050 */
+	u32 pll_lfsr;			/* _PLL_LFSR,               0x054 */
+	u32 osc_freq_det;		/* _OSC_FREQ_DET,           0x058 */
+	u32 osc_freq_det_stat;		/* _OSC_FREQ_DET_STATUS,    0x05C */
+	u32 _rsv2[8];			/*                      0x060-07C */
+	u32 pllc_base;			/* _PLLC_BASE,              0x080 */
+	u32 pllc_out;			/* _PLLC_OUT,               0x084 */
+	u32 pllc_misc2;			/* _PLLC_MISC2,             0x088 */
+	u32 pllc_misc;			/* _PLLC_MISC,              0x08c */
+	u32 pllm_base;			/* _PLLM_BASE,              0x090 */
+	u32 pllm_out;			/* _PLLM_OUT,               0x094 */
+	u32 pllm_misc1;			/* _PLLM_MISC1,             0x098 */
+	u32 pllm_misc2;			/* _PLLM_MISC2,             0x09c */
+	u32 pllp_base;			/* _PLLP_BASE,              0x0a0 */
+	u32 pllp_outa;			/* _PLLP_OUTA,              0x0a4 */
+	u32 pllp_outb;			/* _PLLP_OUTB,              0x0a8 */
+	u32 pllp_misc;			/* _PLLP_MISC,              0x0ac */
+	u32 plla_base;			/* _PLLA_BASE,              0x0b0 */
+	u32 plla_out;			/* _PLLA_OUT,               0x0b4 */
+	u32 _rsv3;			/*                          0x0b8 */
+	u32 plla_misc;			/* _PLLA_MISC,              0x0bc */
+	u32 pllu_base;			/* _PLLU_BASE,              0x0c0 */
+	u32 _rsv4[2];			/*                      0x0c4-0c8 */
+	u32 pllu_misc;			/* _PLLU_MISC,              0x0cc */
+	u32 plld_base;			/* _PLLD_BASE,              0x0d0 */
+	u32 _rsv5[2];			/*                      0x0d4-0d8 */
+	u32 plld_misc;			/* _PLLD_MISC,              0x0dc */
+	u32 pllx_base;			/* _PLLX_BASE,              0x0e0 */
+	u32 pllx_misc;			/* _PLLX_MISC,              0x0e4 */
+	u32 plle_base;			/* _PLLE_BASE,              0x0e8 */
+	u32 plle_misc;			/* _PLLE_MISC,              0x0ec */
+	u32 plls_base;			/* _PLLS_BASE,              0x0f0 */
+	u32 plls_misc;			/* _PLLS_MISC,              0x0f4 */
+	u32 _rsv6[2];			/*                      0x0f8-0fc */
+	u32 clk_src_i2s1;		/* _CLK_SOURCE_I2S1,        0x100 */
+	u32 clk_src_i2s2;		/* _CLK_SOURCE_I2S2,        0x104 */
+	u32 clk_src_spdif_out;		/* _CLK_SOURCE_SPDIF_OUT,   0x108 */
+	u32 clk_src_spdif_in;		/* _CLK_SOURCE_SPDIF_IN,    0x10c */
+	u32 clk_src_pwm;		/* _CLK_SOURCE_PWM,         0x110 */
+	u32 _rsv7;			/*                          0x114 */
+	u32 clk_src_sbc2;		/* _CLK_SOURCE_SBC2,        0x118 */
+	u32 clk_src_sbc3;		/* _CLK_SOURCE_SBC3,        0x11c */
+	u32 _rsv8;			/*                          0x120 */
+	u32 clk_src_i2c1;		/* _CLK_SOURCE_I2C1,        0x124 */
+	u32 clk_src_i2c5;		/* _CLK_SOURCE_I2C5,        0x128 */
+	u32 _rsv9[2];			/*                      0x12c-130 */
+	u32 clk_src_sbc1;		/* _CLK_SOURCE_SBC1,        0x134 */
+	u32 clk_src_disp1;		/* _CLK_SOURCE_DISP1,       0x138 */
+	u32 clk_src_disp2;		/* _CLK_SOURCE_DISP2,       0x13c */
+	u32 _rsv10[2];			/*                      0x140-144 */
+	u32 clk_src_vi;			/* _CLK_SOURCE_VI,          0x148 */
+	u32 _rsv11;			/*                          0x14c */
+	u32 clk_src_sdmmc1;		/* _CLK_SOURCE_SDMMC1,      0x150 */
+	u32 clk_src_sdmmc2;		/* _CLK_SOURCE_SDMMC2,      0x154 */
+	u32 clk_src_g3d;		/* _CLK_SOURCE_G3D,         0x158 */
+	u32 clk_src_g2d;		/* _CLK_SOURCE_G2D,         0x15c */
+	u32 clk_src_ndflash;		/* _CLK_SOURCE_NDFLASH,     0x160 */
+	u32 clk_src_sdmmc4;		/* _CLK_SOURCE_SDMMC4,      0x164 */
+	u32 clk_src_vfir;		/* _CLK_SOURCE_VFIR,        0x168 */
+	u32 clk_src_epp;		/* _CLK_SOURCE_EPP,         0x16c */
+	u32 clk_src_mpe;		/* _CLK_SOURCE_MPE,         0x170 */
+	u32 clk_src_hsi;		/* _CLK_SOURCE_HSI,         0x174 */
+	u32 clk_src_uarta;		/* _CLK_SOURCE_UARTA,       0x178 */
+	u32 clk_src_uartb;		/* _CLK_SOURCE_UARTB,       0x17c */
+	u32 clk_src_host1x;		/* _CLK_SOURCE_HOST1X,      0x180 */
+	u32 _rsv12[2];			/*                      0x184-188 */
+	u32 clk_src_hdmi;		/* _CLK_SOURCE_HDMI,        0x18c */
+	u32 _rsv13[2];			/*                      0x190-194 */
+	u32 clk_src_i2c2;		/* _CLK_SOURCE_I2C2,        0x198 */
+	u32 clk_src_emc;		/* _CLK_SOURCE_EMC,         0x19c */
+	u32 clk_src_uartc;		/* _CLK_SOURCE_UARTC,       0x1a0 */
+	u32 _rsv14;			/*                          0x1a4 */
+	u32 clk_src_vi_sensor;		/* _CLK_SOURCE_VI_SENSOR,   0x1a8 */
+	u32 _rsv15[2];			/*                      0x1ac-1b0 */
+	u32 clk_src_sbc4;		/* _CLK_SOURCE_SBC4,        0x1b4 */
+	u32 clk_src_i2c3;		/* _CLK_SOURCE_I2C3,        0x1b8 */
+	u32 clk_src_sdmmc3;		/* _CLK_SOURCE_SDMMC3,      0x1bc */
+	u32 clk_src_uartd;		/* _CLK_SOURCE_UARTD,       0x1c0 */
+	u32 clk_src_uarte;		/* _CLK_SOURCE_UARTE,       0x1c4 */
+	u32 clk_src_vde;		/* _CLK_SOURCE_VDE,         0x1c8 */
+	u32 clk_src_owr;		/* _CLK_SOURCE_OWR,         0x1cc */
+	u32 clk_src_nor;		/* _CLK_SOURCE_NOR,         0x1d0 */
+	u32 clk_src_csite;		/* _CLK_SOURCE_CSITE,       0x1d4 */
+	u32 clk_src_i2s0;		/* _CLK_SOURCE_I2S0,        0x1d8 */
+	u32 clk_src_dtv;		/* _CLK_SOURCE_DTV,         0x1dc */
+	u32 _rsv16[4];			/*                      0x1e0-1ec */
+	u32 clk_src_msenc;		/* _CLK_SOURCE_MSENC,       0x1f0 */
+	u32 clk_src_tsec;		/* _CLK_SOURCE_TSEC,        0x1f4 */
+	u32 _rsv17;			/*                          0x1f8 */
+	u32 clk_src_osc;		/* _CLK_SOURCE_OSC,         0x1fc */
+	u32 _rsv18[32];			/*                      0x200-27c */
+	u32 clk_out_enb_x;		/* _CLK_OUT_ENB_X_0,        0x280 */
+	u32 clk_enb_x_set;		/* _CLK_ENB_X_SET_0,        0x284 */
+	u32 clk_enb_x_clr;		/* _CLK_ENB_X_CLR_0,        0x288 */
+	u32 rst_devices_x;		/* _RST_DEVICES_X_0,        0x28c */
+	u32 rst_dev_x_set;		/* _RST_DEV_X_SET_0,        0x290 */
+	u32 rst_dev_x_clr;		/* _RST_DEV_X_CLR_0,        0x294 */
+	u32 _rsv19[23];			/*                      0x298-2f0 */
+	u32 dfll_base;			/* _DFLL_BASE_0,            0x2f4 */
+	u32 _rsv20[2];			/*                      0x2f8-2fc */
+	u32 rst_dev_l_set;		/* _RST_DEV_L_SET           0x300 */
+	u32 rst_dev_l_clr;		/* _RST_DEV_L_CLR           0x304 */
+	u32 rst_dev_h_set;		/* _RST_DEV_H_SET           0x308 */
+	u32 rst_dev_h_clr;		/* _RST_DEV_H_CLR           0x30c */
+	u32 rst_dev_u_set;		/* _RST_DEV_U_SET           0x310 */
+	u32 rst_dev_u_clr;		/* _RST_DEV_U_CLR           0x314 */
+	u32 _rsv21[2];			/*                      0x318-31c */
+	u32 clk_enb_l_set;		/* _CLK_ENB_L_SET           0x320 */
+	u32 clk_enb_l_clr;		/* _CLK_ENB_L_CLR           0x324 */
+	u32 clk_enb_h_set;		/* _CLK_ENB_H_SET           0x328 */
+	u32 clk_enb_h_clr;		/* _CLK_ENB_H_CLR           0x32c */
+	u32 clk_enb_u_set;		/* _CLK_ENB_U_SET           0x330 */
+	u32 clk_enb_u_clr;		/* _CLK_ENB_U_CLR           0x334 */
+	u32 _rsv22;			/*                          0x338 */
+	u32 ccplex_pg_sm_ovrd;		/* _CCPLEX_PG_SM_OVRD,      0x33c */
+	u32 rst_cpu_cmplx_set;		/* _RST_CPU_CMPLX_SET,      0x340 */
+	u32 rst_cpu_cmplx_clr;		/* _RST_CPU_CMPLX_CLR,      0x344 */
+	u32 clk_cpu_cmplx_set;		/* _CLK_CPU_CMPLX_SET,      0x348 */
+	u32 clk_cpu_cmplx_clr;		/* _CLK_CPU_CMPLX_CLR,      0x34c */
+	u32 _rsv23[2];			/*                      0x350-354 */
+	u32 rst_dev_v;			/* _RST_DEVICES_V,          0x358 */
+	u32 rst_dev_w;			/* _RST_DEVICES_W,          0x35c */
+	u32 clk_out_enb_v;		/* _CLK_OUT_ENB_V,          0x360 */
+	u32 clk_out_enb_w;		/* _CLK_OUT_ENB_W,          0x364 */
+	u32 cclkg_brst_pol;		/* _CCLKG_BURST_POLICY,     0x368 */
+	u32 super_cclkg_div;		/* _SUPER_CCLKG_DIVIDER,    0x36c */
+	u32 cclklp_brst_pol;		/* _CCLKLP_BURST_POLICY,    0x370 */
+	u32 super_cclklp_div;		/* _SUPER_CCLKLP_DIVIDER,   0x374 */
+	u32 clk_cpug_cmplx;		/* _CLK_CPUG_CMPLX,         0x378 */
+	u32 clk_cpulp_cmplx;		/* _CLK_CPULP_CMPLX,        0x37c */
+	u32 cpu_softrst_ctrl;		/* _CPU_SOFTRST_CTRL,       0x380 */
+	u32 cpu_softrst_ctrl1;		/* _CPU_SOFTRST_CTRL1,      0x384 */
+	u32 cpu_softrst_ctrl2;		/* _CPU_SOFTRST_CTRL2,      0x388 */
+	u32 _rsv24[9];			/*                      0x38c-3ac */
+	u32 clk_src_g3d2;		/* _CLK_SOURCE_G3D2,        0x3b0 */
+	u32 clk_src_mselect;		/* _CLK_SOURCE_MSELECT,     0x3b4 */
+	u32 clk_src_tsensor;		/* _CLK_SOURCE_TSENSOR,     0x3b8 */
+	u32 clk_src_i2s3;		/* _CLK_SOURCE_I2S3,        0x3bc */
+	u32 clk_src_i2s4;		/* _CLK_SOURCE_I2S4,        0x3c0 */
+	u32 clk_src_i2c4;		/* _CLK_SOURCE_I2C4,        0x3c4 */
+	u32 clk_src_sbc5;		/* _CLK_SOURCE_SBC5,        0x3c8 */
+	u32 clk_src_sbc6;		/* _CLK_SOURCE_SBC6,        0x3cc */
+	u32 clk_src_audio;		/* _CLK_SOURCE_AUDIO,       0x3d0 */
+	u32 _rsv25;			/*                          0x3d4 */
+	u32 clk_src_dam0;		/* _CLK_SOURCE_DAM0,        0x3d8 */
+	u32 clk_src_dam1;		/* _CLK_SOURCE_DAM1,        0x3dc */
+	u32 clk_src_dam2;		/* _CLK_SOURCE_DAM2,        0x3e0 */
+	u32 clk_src_hda2codec_2x;	/* _CLK_SOURCE_HDA2CODEC_2X,0x3e4 */
+	u32 clk_src_actmon;		/* _CLK_SOURCE_ACTMON,      0x3e8 */
+	u32 clk_src_extperiph1;		/* _CLK_SOURCE_EXTPERIPH1,  0x3ec */
+	u32 clk_src_extperiph2;		/* _CLK_SOURCE_EXTPERIPH2,  0x3f0 */
+	u32 clk_src_extperiph3;		/* _CLK_SOURCE_EXTPERIPH3,  0x3f4 */
+	u32 clk_src_nand_speed;		/* _CLK_SOURCE_NAND_SPEED,  0x3f8 */
+	u32 clk_src_i2c_slow;		/* _CLK_SOURCE_I2C_SLOW,    0x3fc */
+	u32 clk_src_sys;		/* _CLK_SOURCE_SYS,         0x400 */
+	u32 _rsv26[7];			/*                      0x404-41c */
+	u32 clk_src_sata_oob;		/* _CLK_SOURCE_SATA_OOB,    0x420 */
+	u32 clk_src_sata;		/* _CLK_SOURCE_SATA,        0x424 */
+	u32 clk_src_hda;		/* _CLK_SOURCE_HDA,         0x428 */
+	u32 _rsv27;			/*                          0x42c */
+	u32 rst_dev_v_set;		/* _RST_DEV_V_SET,          0x430 */
+	u32 rst_dev_v_clr;		/* _RST_DEV_V_CLR,          0x434 */
+	u32 rst_dev_w_set;		/* _RST_DEV_W_SET,          0x438 */
+	u32 rst_dev_w_clr;		/* _RST_DEV_W_CLR,          0x43c */
+	u32 clk_enb_v_set;		/* _CLK_ENB_V_SET,          0x440 */
+	u32 clk_enb_v_clr;		/* _CLK_ENB_V_CLR,          0x444 */
+	u32 clk_enb_w_set;		/* _CLK_ENB_W_SET,          0x448 */
+	u32 clk_enb_w_clr;		/* _CLK_ENB_W_CLR,          0x44c */
+	u32 rst_cpug_cmplx_set;		/* _RST_CPUG_CMPLX_SET,     0x450 */
+	u32 rst_cpug_cmplx_clr;		/* _RST_CPUG_CMPLX_CLR,     0x454 */
+	u32 rst_cpulp_cmplx_set;	/* _RST_CPULP_CMPLX_SET,    0x458 */
+	u32 rst_cpulp_cmplx_clr;	/* _RST_CPULP_CMPLX_CLR,    0x45c */
+	u32 clk_cpug_cmplx_set;		/* _CLK_CPUG_CMPLX_SET,     0x460 */
+	u32 clk_cpug_cmplx_clr;		/* _CLK_CPUG_CMPLX_CLR,     0x464 */
+	u32 clk_cpulp_cmplx_set;	/* _CLK_CPULP_CMPLX_SET,    0x468 */
+	u32 clk_cpulp_cmplx_clr;	/* _CLK_CPULP_CMPLX_CLR,    0x46c */
+	u32 cpu_cmplx_status;		/* _CPU_CMPLX_STATUS,       0x470 */
+	u32 _rsv28;			/*                          0x474 */
+	u32 intstatus;			/* _INTSTATUS,              0x478 */
+	u32 intmask;			/* _INTMASK,                0x47c */
+	u32 utmip_pll_cfg0;		/* _UTMIP_PLL_CFG0,         0x480 */
+	u32 utmip_pll_cfg1;		/* _UTMIP_PLL_CFG1,         0x484 */
+	u32 utmip_pll_cfg2;		/* _UTMIP_PLL_CFG2,         0x488 */
+	u32 plle_aux;			/* _PLLE_AUX,               0x48c */
+	u32 sata_pll_cfg0;		/* _SATA_PLL_CFG0,          0x490 */
+	u32 sata_pll_cfg1;		/* _SATA_PLL_CFG1,          0x494 */
+	u32 pcie_pll_cfg0;		/* _PCIE_PLL_CFG0,          0x498 */
+	u32 prog_audio_dly_clk;		/* _PROG_AUDIO_DLY_CLK,     0x49c */
+	u32 audio_sync_clk_i2s0;	/* _AUDIO_SYNC_CLK_I2S0,    0x4a0 */
+	u32 audio_sync_clk_i2s1;	/* _AUDIO_SYNC_CLK_I2S1,    0x4a4 */
+	u32 audio_sync_clk_i2s2;	/* _AUDIO_SYNC_CLK_I2S2,    0x4a8 */
+	u32 audio_sync_clk_i2s3;	/* _AUDIO_SYNC_CLK_I2S3,    0x4ac */
+	u32 audio_sync_clk_i2s4;	/* _AUDIO_SYNC_CLK_I2S4,    0x4b0 */
+	u32 audio_sync_clk_spdif;	/* _AUDIO_SYNC_CLK_SPDIF,   0x4b4 */
+	u32 plld2_base;			/* _PLLD2_BASE,             0x4b8 */
+	u32 plld2_misc;			/* _PLLD2_MISC,             0x4bc */
+	u32 utmip_pll_cfg3;		/* _UTMIP_PLL_CFG3,         0x4c0 */
+	u32 pllrefe_base;		/* _PLLREFE_BASE,           0x4c4 */
+	u32 pllrefe_misc;		/* _PLLREFE_MISC,           0x4c8 */
+	u32 _rsv29[7];			/*                      0x4cc-4e4 */
+	u32 pllc2_base;			/* _PLLC2_BASE,             0x4e8 */
+	u32 pllc2_misc0;		/* _PLLC2_MISC_0,           0x4ec */
+	u32 pllc2_misc1;		/* _PLLC2_MISC_1,           0x4f0 */
+	u32 pllc2_misc2;		/* _PLLC2_MISC_2,           0x4f4 */
+	u32 pllc2_misc3;		/* _PLLC2_MISC_3,           0x4f8 */
+	u32 pllc3_base;			/* _PLLC3_BASE,             0x4fc */
+	u32 pllc3_misc0;		/* _PLLC3_MISC_0,           0x500 */
+	u32 pllc3_misc1;		/* _PLLC3_MISC_1,           0x504 */
+	u32 pllc3_misc2;		/* _PLLC3_MISC_2,           0x508 */
+	u32 pllc3_misc3;		/* _PLLC3_MISC_3,           0x50c */
+	u32 pllx_misc1;			/* _PLLX_MISC_1,            0x510 */
+	u32 pllx_misc2;			/* _PLLX_MISC_2,            0x514 */
+	u32 pllx_misc3;			/* _PLLX_MISC_3,            0x518 */
+	u32 xusbio_pll_cfg0;		/* _XUSBIO_PLL_CFG0,        0x51c */
+	u32 xusbio_pll_cfg1;		/* _XUSBIO_PLL_CFG1,        0x520 */
+	u32 plle_aux1;			/* _PLLE_AUX1,              0x524 */
+	u32 pllp_reshift;		/* _PLLP_RESHIFT,           0x528 */
+	u32 utmipll_hw_pwrdn_cfg0;	/* _UTMIPLL_HW_PWRDN_CFG0,  0x52c */
+	u32 pllu_hw_pwrdn_cfg0;		/* _PLLU_HW_PWRDN_CFG0,     0x530 */
+	u32 xusb_pll_cfg0;		/* _XUSB_PLL_CFG0,          0x534 */
+	u32 _rsv30;			/*                          0x538 */
+	u32 clk_cpu_misc;		/* _CLK_CPU_MISC,           0x53c */
+	u32 clk_cpug_misc;		/* _CLK_CPUG_MISC,          0x540 */
+	u32 clk_cpulp_misc;		/* _CLK_CPULP_MISC,         0x544 */
+	u32 pllx_hw_ctrl_cfg;		/* _PLLX_HW_CTRL_CFG,       0x548 */
+	u32 pllx_sw_ramp_cfg;		/* _PLLX_SW_RAMP_CFG,       0x54c */
+	u32 pllx_hw_ctrl_status;	/* _PLLX_HW_CTRL_STATUS,    0x550 */
+	u32 _rsv31;			/*                          0x554 */
+	u32 super_gr3d_clk_div;		/* _SUPER_GR3D_CLK_DIVIDER, 0x558 */
+	u32 spare_reg0;			/* _SPARE_REG0,             0x55c */
+	u32 _rsv32[40];			/*                      0x560-5fc */
+	u32 clk_src_xusb_core_host;	/* _CLK_SOURCE_XUSB_CORE_HOST 0x600 */
+	u32 clk_src_xusb_falcon;	/* _CLK_SOURCE_XUSB_FALCON  0x604 */
+	u32 clk_src_xusb_fs;		/* _CLK_SOURCE_XUSB_FS      0x608 */
+	u32 clk_src_xusb_core_dev;	/* _CLK_SOURCE_XUSB_CORE_DEV 0x60c */
+	u32 clk_src_xusb_ss;		/* _CLK_SOURCE_XUSB_SS      0x610 */
+	u32 clk_src_cilab;		/* _CLK_SOURCE_CILAB        0x614 */
+	u32 clk_src_cilcd;		/* _CLK_SOURCE_CILCD        0x618 */
+	u32 clk_src_cile;		/* _CLK_SOURCE_CILE         0x61c */
+	u32 clk_src_dsia_lp;		/* _CLK_SOURCE_DSIA_LP      0x620 */
+	u32 clk_src_dsib_lp;		/* _CLK_SOURCE_DSIB_LP      0x624 */
+	u32 clk_src_entropy;		/* _CLK_SOURCE_ENTROPY      0x628 */
+	u32 clk_src_dvfs_ref;		/* _CLK_SOURCE_DVFS_REF     0x62c */
+	u32 clk_src_dvfs_soc;		/* _CLK_SOURCE_DVFS_SOC     0x630 */
+	u32 clk_src_traceclkin;		/* _CLK_SOURCE_TRACECLKIN   0x634 */
+	u32 clk_src_adx0;		/* _CLK_SOURCE_ADX0         0x638 */
+	u32 clk_src_amx0;		/* _CLK_SOURCE_AMX0         0x63c */
+	u32 clk_src_emc_latency;	/* _CLK_SOURCE_EMC_LATENCY  0x640 */
+	u32 clk_src_soc_therm;		/* _CLK_SOURCE_SOC_THERM    0x644 */
 };
 
 #define TEGRA_DEV_L			0
@@ -254,31 +326,32 @@
 #define OSC_FREQ_OSC38P4		5	/* 38.4MHz */
 #define OSC_FREQ_OSC48			9	/* 48.0MHz */
 
-/* CLK_RST_CONTROLLER_PLLx_BASE_0 */
-#define PLL_BYPASS_SHIFT		31
-#define PLL_BYPASS_MASK			(1U << PLL_BYPASS_SHIFT)
+/* CLK_RST_CONTROLLER_PLL*_BASE_0 */
+#define PLL_BASE_BYPASS			(1U << 31)
+#define PLL_BASE_ENABLE			(1U << 30)
+#define PLL_BASE_REF_DIS		(1U << 29)
+#define PLL_BASE_OVRRIDE		(1U << 28)
+#define PLL_BASE_LOCK			(1U << 27)
 
-#define PLL_ENABLE_SHIFT		30
-#define PLL_ENABLE_MASK			(1U << PLL_ENABLE_SHIFT)
+#define PLL_BASE_DIVP_SHIFT		20
+#define PLL_BASE_DIVP_MASK		(7U << PLL_BASE_DIVP_SHIFT)
 
-#define PLL_BASE_OVRRIDE_MASK		(1U << 28)
-#define PLL_BASE_LOCK_MASK		(1U << 27)
+#define PLL_BASE_DIVN_SHIFT		8
+#define PLL_BASE_DIVN_MASK		(0x3ffU << PLL_BASE_DIVN_SHIFT)
 
-#define PLL_DIVP_SHIFT			20
-#define PLL_DIVP_MASK			(7U << PLL_DIVP_SHIFT)
-
-#define PLL_DIVN_SHIFT			8
-#define PLL_DIVN_MASK			(0x3ffU << PLL_DIVN_SHIFT)
-
-#define PLL_DIVM_SHIFT			0
-#define PLL_DIVM_MASK			(0x1f << PLL_DIVM_SHIFT)
+#define PLL_BASE_DIVM_SHIFT		0
+#define PLL_BASE_DIVM_MASK		(0x1f << PLL_BASE_DIVM_SHIFT)
 
 /* SPECIAL CASE: PLLM, PLLC and PLLX use different-sized fields here */
-#define PLLCMX_DIVP_MASK		(0xfU << PLL_DIVP_SHIFT)
-#define PLLCMX_DIVN_MASK		(0xffU << PLL_DIVN_SHIFT)
-#define PLLCMX_DIVM_MASK		(0xffU << PLL_DIVM_SHIFT)
+#define PLLCMX_BASE_DIVP_MASK		(0xfU << PLL_BASE_DIVP_SHIFT)
+#define PLLCMX_BASE_DIVN_MASK		(0xffU << PLL_BASE_DIVN_SHIFT)
+#define PLLCMX_BASE_DIVM_MASK		(0xffU << PLL_BASE_DIVM_SHIFT)
 
-/* CLK_RST_CONTROLLER_PLLx_OUTx_0 */
+/* Generic, indiscriminate divisor mask. May catch some innocent bystander bits
+ * on the side that we don't particularly care about. */
+#define PLL_BASE_DIV_MASK		(0xffffff)
+
+/* CLK_RST_CONTROLLER_PLL*_OUT*_0 */
 #define PLL_OUT_RSTN			(1 << 0)
 #define PLL_OUT_CLKEN			(1 << 1)
 #define PLL_OUT_OVRRIDE			(1 << 2)
@@ -293,21 +366,22 @@
 #define PLL_OUT2_RATIO_SHIFT		24
 #define PLL_OUT2_RATIO_MASK		(0xffU << PLL_OUT2_RATIO_SHIFT)
 
-/* CLK_RST_CONTROLLER_PLLx_MISC_0 */
-#define PLL_DCCON_SHIFT			20
-#define PLL_DCCON_MASK			(1U << PLL_DCCON_SHIFT)
+/* CLK_RST_CONTROLLER_PLL*_MISC_0 */
+#define PLL_MISC_DCCON			(1 << 20)
 
-#define PLL_LOCK_ENABLE_SHIFT		18
-#define PLL_LOCK_ENABLE_MASK		(1U << PLL_LOCK_ENABLE_SHIFT)
+#define PLL_MISC_CPCON_SHIFT		8
+#define PLL_MISC_CPCON_MASK		(0xfU << PLL_MISC_CPCON_SHIFT)
 
-#define PLL_CPCON_SHIFT			8
-#define PLL_CPCON_MASK			(15U << PLL_CPCON_SHIFT)
+#define PLL_MISC_LFCON_SHIFT		4
+#define PLL_MISC_LFCON_MASK		(0xfU << PLL_MISC_LFCON_SHIFT)
 
-#define PLL_LFCON_SHIFT			4
-#define PLL_LFCON_MASK			(15U << PLL_LFCON_SHIFT)
+/* This bit is different all over the place. Oh joy... */
+#define PLLC_MISC_LOCK_ENABLE		(1 << 24)
+#define PLLUD_MISC_LOCK_ENABLE		(1 << 22)
+#define PLLPAXS_MISC_LOCK_ENABLE	(1 << 18)
+#define PLLE_MISC_LOCK_ENABLE		(1 << 9)
 
-#define PLLU_VCO_FREQ_SHIFT		20
-#define PLLU_VCO_FREQ_MASK		(1U << PLLU_VCO_FREQ_SHIFT)
+#define PLLU_MISC_VCO_FREQ		(1 << 20)
 
 #define PLLP_OUT1_OVR			(1 << 2)
 #define PLLP_OUT2_OVR			(1 << 18)
@@ -334,19 +408,6 @@
 	IN_408_OUT_9_6_DIVISOR = 83,
 };
 
-/* CRC_PLLP_MISC_0 0xac */
-#define PLLP_MISC_PLLP_CPCON_8		(8 << 8)
-#define PLLP_MISC_PLLP_LOCK_ENABLE	(1 << 18)
-
-/* CRC_PLLU_BASE_0 0xc0 */
-#define PLLU_BYPASS_ENABLE		(1 << 31)
-#define PLLU_ENABLE_ENABLE		(1 << 30)
-#define PLLU_REF_DIS_REF_DISABLE	(1 << 29)
-#define PLLU_OVERRIDE_ENABLE		(1 << 24)
-
-/* CRC_PLLU_MISC_0 0xcc */
-#define PLLU_LOCK_ENABLE_ENABLE		(1 << 22)
-
 /* PLLX_BASE_0 0xe0 */
 #define PLLX_BASE_PLLX_ENABLE		(1 << 30)
 
@@ -371,17 +432,19 @@
  * get_periph_clock_source()) but it does not seem worth it since the code
  * already checks the ranges of values it is writing, in clk_get_divider().
  */
-#define OUT_CLK_DIVISOR_SHIFT		0
-#define OUT_CLK_DIVISOR_MASK		(0xffff << OUT_CLK_DIVISOR_SHIFT)
+#define CLK_DIVISOR_SHIFT		0
+#define CLK_DIVISOR_MASK		(0xffff << CLK_DIVISOR_SHIFT)
 
-#define OUT_CLK_SOURCE_SHIFT		30
-#define OUT_CLK_SOURCE_MASK		(3U << OUT_CLK_SOURCE_SHIFT)
+#define CLK_SOURCE_SHIFT		30
+#define CLK_SOURCE_MASK			(3U << CLK_SOURCE_SHIFT)
 
-#define OUT_CLK_SOURCE3_SHIFT		29
-#define OUT_CLK_SOURCE3_MASK		(7U << OUT_CLK_SOURCE3_SHIFT)
+#define CLK_SOURCE3_SHIFT		29
+#define CLK_SOURCE3_MASK		(7U << CLK_SOURCE3_SHIFT)
 
-#define OUT_CLK_SOURCE4_SHIFT		28
-#define OUT_CLK_SOURCE4_MASK		(15U << OUT_CLK_SOURCE4_SHIFT)
+#define CLK_SOURCE4_SHIFT		28
+#define CLK_SOURCE4_MASK		(15U << CLK_SOURCE4_SHIFT)
+
+#define CLK_UART_DIV_OVERRIDE		(1 << 24)
 
 /* CLK_RST_CONTROLLER_SCLK_BURST_POLICY */
 #define SCLK_SYS_STATE_SHIFT		28U
@@ -449,33 +512,6 @@
 #define CLK_SYS_RATE_APB_RATE_SHIFT     0
 #define CLK_SYS_RATE_APB_RATE_MASK      (3 << CLK_SYS_RATE_AHB_RATE_SHIFT)
 
-/* CLK_RST_CONTROLLER_RST_CPUxx_CMPLX_CLR 0x344 */
-#define CLR_CPURESET0			(1 << 0)
-#define CLR_CPURESET1			(1 << 1)
-#define CLR_CPURESET2			(1 << 2)
-#define CLR_CPURESET3			(1 << 3)
-#define CLR_DBGRESET0			(1 << 12)
-#define CLR_DBGRESET1			(1 << 13)
-#define CLR_DBGRESET2			(1 << 14)
-#define CLR_DBGRESET3			(1 << 15)
-#define CLR_CORERESET0			(1 << 16)
-#define CLR_CORERESET1			(1 << 17)
-#define CLR_CORERESET2			(1 << 18)
-#define CLR_CORERESET3			(1 << 19)
-#define CLR_CXRESET0			(1 << 20)
-#define CLR_CXRESET1			(1 << 21)
-#define CLR_CXRESET2			(1 << 22)
-#define CLR_CXRESET3			(1 << 23)
-#define CLR_L2RESET			(1 << 24)
-#define CLR_NONCPURESET			(1 << 29)
-#define CLR_PRESETDBG			(1 << 30)
-
-/* CLK_RST_CONTROLLER_CLK_CPU_CMPLX_CLR 0x34c */
-#define CLR_CPU3_CLK_STP		(1 << 11)
-#define CLR_CPU2_CLK_STP		(1 << 10)
-#define CLR_CPU1_CLK_STP		(1 << 9)
-#define CLR_CPU0_CLK_STP		(1 << 8)
-
 /* CRC_CLK_SOURCE_MSELECT_0 0x3b4 */
 #define MSELECT_CLK_SRC_PLLP_OUT0	(0 << 29)
 
@@ -494,4 +530,46 @@
 #define UTMIP_FORCE_PD_SAMP_B_POWERDOWN	(1 << 2)
 #define UTMIP_FORCE_PD_SAMP_A_POWERDOWN	(1 << 0)
 
+// CCLK_BRST_POL
+enum {
+	CRC_CCLK_BRST_POL_PLLX_OUT0 = 0x8,
+	CRC_CCLK_BRST_POL_CPU_STATE_RUN = 0x2
+};
+
+// SUPER_CCLK_DIVIDER
+enum {
+	CRC_SUPER_CCLK_DIVIDER_SUPER_CDIV_ENB = 1 << 31
+};
+
+// CLK_CPU_CMPLX_CLR
+enum {
+	CRC_CLK_CLR_CPU0_STP = 0x1 << 8,
+	CRC_CLK_CLR_CPU1_STP = 0x1 << 9,
+	CRC_CLK_CLR_CPU2_STP = 0x1 << 10,
+	CRC_CLK_CLR_CPU3_STP = 0x1 << 11
+};
+
+// RST_CPUG_CMPLX_CLR
+enum {
+	CRC_RST_CPUG_CLR_CPU0 = 0x1 << 0,
+	CRC_RST_CPUG_CLR_CPU1 = 0x1 << 1,
+	CRC_RST_CPUG_CLR_CPU2 = 0x1 << 2,
+	CRC_RST_CPUG_CLR_CPU3 = 0x1 << 3,
+	CRC_RST_CPUG_CLR_DBG0 = 0x1 << 12,
+	CRC_RST_CPUG_CLR_DBG1 = 0x1 << 13,
+	CRC_RST_CPUG_CLR_DBG2 = 0x1 << 14,
+	CRC_RST_CPUG_CLR_DBG3 = 0x1 << 15,
+	CRC_RST_CPUG_CLR_CORE0 = 0x1 << 16,
+	CRC_RST_CPUG_CLR_CORE1 = 0x1 << 17,
+	CRC_RST_CPUG_CLR_CORE2 = 0x1 << 18,
+	CRC_RST_CPUG_CLR_CORE3 = 0x1 << 19,
+	CRC_RST_CPUG_CLR_CX0 = 0x1 << 20,
+	CRC_RST_CPUG_CLR_CX1 = 0x1 << 21,
+	CRC_RST_CPUG_CLR_CX2 = 0x1 << 22,
+	CRC_RST_CPUG_CLR_CX3 = 0x1 << 23,
+	CRC_RST_CPUG_CLR_L2 = 0x1 << 24,
+	CRC_RST_CPUG_CLR_NONCPU = 0x1 << 29,
+	CRC_RST_CPUG_CLR_PDBG = 0x1 << 30,
+};
+
 #endif	/* _TEGRA124_CLK_RST_H_ */
diff --git a/src/soc/nvidia/tegra124/clock.c b/src/soc/nvidia/tegra124/clock.c
index af01b56..9dce867 100644
--- a/src/soc/nvidia/tegra124/clock.c
+++ b/src/soc/nvidia/tegra124/clock.c
@@ -13,31 +13,368 @@
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
-
+#include <console/console.h>
 #include <delay.h>
 #include <arch/io.h>
 #include <soc/addressmap.h>
-
+#include <soc/clock.h>
 #include "clk_rst.h"
-#include "clock.h"
+#include "cpug.h"
+#include "flow.h"
+#include "pmc.h"
 
 static struct clk_rst_ctlr *clk_rst = (void *)TEGRA_CLK_RST_BASE;
-/*
- * On poweron, AVP clock source (also called system clock) is set to PLLP_out0
- * with frequency set at 1MHz. Before initializing PLLP, we need to move the
- * system clock's source to CLK_M temporarily. And then switch it to PLLP_out4
- * (204MHz) at a later time.
+static struct flow_ctlr *flow = (void *)TEGRA_FLOW_BASE;
+static struct tegra_pmc_regs *pmc = (void *)TEGRA_PMC_BASE;
+
+struct pll_dividers {
+	u32	n : 10;
+	u32	m : 8;
+	u32	p : 4;
+	u32	cpcon : 4;
+	u32	: 6;
+};
+
+/* Some PLLs have more restrictive divider bit lengths or are missing some
+ * fields. Make sure to use the right struct in the osc_table definition to get
+ * compile-time checking, but keep the bits aligned with struct pll_dividers so
+ * they can be used interchangeably at run time. Add new formats as required. */
+struct pllcx_dividers {
+	u32	n : 8;
+	u32	: 2;
+	u32	m : 8;
+	u32	p : 4;
+	u32	: 10;
+};
+struct pllpad_dividers {
+	u32	n : 10;
+	u32	m : 5;
+	u32	: 3;
+	u32	p : 3;
+	u32	: 1;
+	u32	cpcon : 4;
+	u32	: 6;
+};
+struct pllu_dividers {
+	u32	n : 10;
+	u32	m : 5;
+	u32	: 3;
+	u32	p : 1;
+	u32	: 3;
+	u32	cpcon : 4;
+	u32	: 6;
+};
+
+union __attribute__((transparent_union)) pll_fields {
+	u32 raw;
+	struct pll_dividers div;
+	struct pllcx_dividers cx;
+	struct pllpad_dividers pad;
+	struct pllu_dividers u;
+};
+
+/* This table defines the frequency dividers for every PLL to turn the external
+ * OSC clock into the frequencies defined by TEGRA_PLL*_KHZ in soc/clock.h.
+ * All PLLs have three dividers (N, M and P), with the governing formula for
+ * the output frequency being OUT = (IN / M) * N / (2^P). */
+static const struct {
+	int khz;
+	struct pllcx_dividers	pllx;	/* target: 1900 MHz */
+	struct pllpad_dividers	pllp;	/* target:  408 MHz */
+	struct pllcx_dividers	pllc;	/* target:  600 MHz */
+	struct pllpad_dividers	plld;	/* target:  925 MHz */
+	struct pllu_dividers	pllu;	/* target:  960 MHz */
+} osc_table[16] = {
+	[OSC_FREQ_OSC12]{
+		.khz = 12000,
+		.pllx = {.n = 158, .m =  1, .p = 0},		  /* 1896 MHz */
+		.pllp = {.n =  34, .m =  1, .p = 0, .cpcon = 2},
+		.pllc = {.n =  50, .m =  1, .p = 0},
+		.plld = {.n = 925, .m = 12, .p = 0, .cpcon = 12},
+		.pllu = {.n =  80, .m =  1, .p = 0, .cpcon = 3},
+	},
+	[OSC_FREQ_OSC13]{
+		.khz = 13000,
+		.pllx = {.n = 146, .m =  1, .p = 0},		  /* 1898 MHz */
+		.pllp = {.n = 408, .m = 13, .p = 0, .cpcon = 8},
+		.pllc = {.n = 231, .m =  5, .p = 0},		 /* 600.6 MHz */
+		.plld = {.n = 925, .m = 13, .p = 0, .cpcon = 12},
+		.pllu = {.n = 960, .m = 13, .p = 0, .cpcon = 12},
+	},
+	[OSC_FREQ_OSC16P8]{
+		.khz = 16800,
+		.pllx = {.n = 113, .m =  1, .p = 0},		/* 1898.4 MHz */
+		.pllp = {.n = 170, .m =  7, .p = 0, .cpcon = 4},
+		.pllc = {.n = 250, .m =  7, .p = 0},
+		.plld = {.n = 936, .m = 17, .p = 0, .cpcon = 12},/* 924.9 MHz */
+		.pllu = {.n = 400, .m =  7, .p = 0, .cpcon = 8},
+	},
+	[OSC_FREQ_OSC19P2]{
+		.khz = 19200,
+		.pllx = {.n =  98, .m =  1, .p = 0},		/* 1881.6 MHz */
+		.pllp = {.n =  85, .m =  4, .p = 0, .cpcon = 3},
+		.pllc = {.n = 125, .m =  4, .p = 0},
+		.plld = {.n = 819, .m = 17, .p = 0, .cpcon = 12},/* 924.9 MHz */
+		.pllu = {.n =  50, .m =  1, .p = 0, .cpcon = 2},
+	},
+	[OSC_FREQ_OSC26]{
+		.khz = 26000,
+		.pllx = {.n =  73, .m =  1, .p = 0},		  /* 1898 MHz */
+		.pllp = {.n = 204, .m = 13, .p = 0, .cpcon = 5},
+		.pllc = {.n =  23, .m =  1, .p = 0},		   /* 598 MHz */
+		.plld = {.n = 925, .m = 26, .p = 0, .cpcon = 12},
+		.pllu = {.n = 480, .m = 13, .p = 0, .cpcon = 8},
+	},
+	[OSC_FREQ_OSC38P4]{
+		.khz = 38400,
+		.pllx = {.n =  98, .m =  1, .p = 0},		/* 1881.6 MHz */
+		.pllp = {.n =  85, .m =  4, .p = 0, .cpcon = 3},
+		.pllc = {.n = 125, .m =  4, .p = 0},
+		.plld = {.n = 819, .m = 17, .p = 0, .cpcon = 12},/* 924.9 MHz */
+		.pllu = {.n =  50, .m =  1, .p = 0, .cpcon = 2},
+	},
+	[OSC_FREQ_OSC48]{
+		.khz = 48000,
+		.pllx = {.n = 158, .m =  1, .p = 0},		  /* 1896 MHz */
+		.pllp = {.n =  24, .m =  1, .p = 0, .cpcon = 2},
+		.pllc = {.n =  50, .m =  1, .p = 0},
+		.plld = {.n = 925, .m = 12, .p = 0, .cpcon = 12},
+		.pllu = {.n =  80, .m =  1, .p = 0, .cpcon = 3},
+	},
+};
+
+void clock_ll_set_source_divisor(u32 *reg, u32 source, u32 divisor)
+{
+	u32 value;
+
+	value = readl(reg);
+
+	value &= ~CLK_SOURCE_MASK;
+	value |= source << CLK_SOURCE_SHIFT;
+
+	value &= ~CLK_DIVISOR_MASK;
+	value |= divisor << CLK_DIVISOR_SHIFT;
+
+	writel(value, reg);
+}
+
+/* Get the oscillator frequency, from the corresponding hardware
+ * configuration field. This is actually a per-soc thing. Avoid the
+ * temptation to make it common.
  */
-void set_avp_clock_to_clkm(void)
+static u32 clock_get_osc_bits(void)
+{
+	return readl(&clk_rst->osc_ctrl) >> OSC_CTRL_OSC_FREQ_SHIFT;
+}
+
+int clock_get_osc_khz(void)
+{
+	return osc_table[clock_get_osc_bits()].khz;
+}
+
+static void adjust_pllp_out_freqs(void)
+{
+	u32 reg;
+	/* Set PLLP_OUT1, 2, 3 & 4 freqs to 9.6, 48, 102 & 204MHz */
+	reg = readl(&clk_rst->pllp_outa); // OUTA contains OUT2 / OUT1
+	reg |= (IN_408_OUT_48_DIVISOR << PLLP_OUT2_RATIO) | PLLP_OUT2_OVR
+		| (IN_408_OUT_9_6_DIVISOR << PLLP_OUT1_RATIO) | PLLP_OUT1_OVR;
+	writel(reg, &clk_rst->pllp_outa);
+
+	reg = readl(&clk_rst->pllp_outb); // OUTB contains OUT4 / OUT3
+	reg |= (IN_408_OUT_204_DIVISOR << PLLP_OUT4_RATIO) | PLLP_OUT4_OVR
+		| (IN_408_OUT_102_DIVISOR << PLLP_OUT3_RATIO) | PLLP_OUT3_OVR;
+	writel(reg, &clk_rst->pllp_outb);
+}
+
+static void init_pll(u32 *base, u32 *misc, const union pll_fields pll)
+{
+	u32 dividers =  pll.div.n << PLL_BASE_DIVN_SHIFT |
+			pll.div.m << PLL_BASE_DIVM_SHIFT |
+			pll.div.p << PLL_BASE_DIVP_SHIFT;
+
+	/* Write dividers but BYPASS the PLL while we're messing with it. */
+	writel(dividers | PLL_BASE_BYPASS, base);
+
+	/* Set CPCON field (defaults to 0 if it doesn't exist for this PLL) */
+	writel(pll.div.cpcon << PLL_MISC_CPCON_SHIFT, misc);
+
+	/* Enable PLL and take it back out of BYPASS (we don't wait for lock
+	 * because we assume that to be done by the time we start using it). */
+	writel(dividers | PLL_BASE_ENABLE, base);
+}
+
+/* Initialize the UART and put it on CLK_M so we can use it during clock_init().
+ * Will later move it to PLLP in clock_config(). The divisor must be very small
+ * to accommodate 12MHz OSCs, so we override the 16.0 UART divider with the 15.1
+ * CLK_SOURCE divider to get more precision. (This might still not be enough for
+ * some OSCs... if you use 13MHz, be prepared to have a bad time.) The 1800 has
+ * been determined through trial and error (must lead to div 13 at 24MHz). */
+void clock_early_uart(void)
+{
+	clock_ll_set_source_divisor(&clk_rst->clk_src_uarta, 3,
+		CLK_UART_DIV_OVERRIDE | CLK_DIVIDER(clock_get_osc_khz(), 1800));
+	setbits_le32(&clk_rst->clk_out_enb_l, CLK_L_UARTA);
+	udelay(2);
+	clrbits_le32(&clk_rst->rst_dev_l, CLK_L_UARTA);
+}
+
+void clock_cpu0_config_and_reset(void *entry)
+{
+	void * const evp_cpu_reset = (uint8_t *)TEGRA_EVP_BASE + 0x100;
+
+	write32(CONFIG_STACK_TOP, &cpug_stack_pointer);
+	write32((uintptr_t)entry, &cpug_entry_point);
+	write32((uintptr_t)&cpug_setup, evp_cpu_reset);
+
+	// Set up cclk_brst and divider.
+	write32((CRC_CCLK_BRST_POL_PLLX_OUT0 << 0) |
+		(CRC_CCLK_BRST_POL_PLLX_OUT0 << 4) |
+		(CRC_CCLK_BRST_POL_PLLX_OUT0 << 8) |
+		(CRC_CCLK_BRST_POL_PLLX_OUT0 << 12) |
+		(CRC_CCLK_BRST_POL_CPU_STATE_RUN << 28),
+		&clk_rst->cclk_brst_pol);
+	write32(CRC_SUPER_CCLK_DIVIDER_SUPER_CDIV_ENB,
+		&clk_rst->super_cclk_div);
+
+	// Enable the clocks for CPUs 0-3.
+	uint32_t cpu_cmplx_clr = read32(&clk_rst->clk_cpu_cmplx_clr);
+	cpu_cmplx_clr |= CRC_CLK_CLR_CPU0_STP | CRC_CLK_CLR_CPU1_STP |
+			 CRC_CLK_CLR_CPU2_STP | CRC_CLK_CLR_CPU3_STP;
+	write32(cpu_cmplx_clr, &clk_rst->clk_cpu_cmplx_clr);
+
+	// Enable other CPU related clocks.
+	setbits_le32(&clk_rst->clk_out_enb_l, CLK_L_CPU);
+	setbits_le32(&clk_rst->clk_out_enb_v, CLK_V_CPUG);
+
+	// Disable the reset on the non-CPU parts of the fast cluster.
+	write32(CRC_RST_CPUG_CLR_NONCPU,
+		&clk_rst->rst_cpug_cmplx_clr);
+	// Disable the various resets on the CPUs.
+	write32(CRC_RST_CPUG_CLR_CPU0 | CRC_RST_CPUG_CLR_CPU1 |
+		CRC_RST_CPUG_CLR_CPU2 | CRC_RST_CPUG_CLR_CPU3 |
+		CRC_RST_CPUG_CLR_DBG0 | CRC_RST_CPUG_CLR_DBG1 |
+		CRC_RST_CPUG_CLR_DBG2 | CRC_RST_CPUG_CLR_DBG3 |
+		CRC_RST_CPUG_CLR_CORE0 | CRC_RST_CPUG_CLR_CORE1 |
+		CRC_RST_CPUG_CLR_CORE2 | CRC_RST_CPUG_CLR_CORE3 |
+		CRC_RST_CPUG_CLR_CX0 | CRC_RST_CPUG_CLR_CX1 |
+		CRC_RST_CPUG_CLR_CX2 | CRC_RST_CPUG_CLR_CX3 |
+		CRC_RST_CPUG_CLR_L2 | CRC_RST_CPUG_CLR_PDBG,
+		&clk_rst->rst_cpug_cmplx_clr);
+}
+
+/**
+ * The T124 requires some special clock initialization, including setting up
+ * the DVC I2C, turning on MSELECT and selecting the G CPU cluster.
+ */
+void clock_init(void)
 {
 	u32 val;
+	u32 osc = clock_get_osc_bits();
 
+	/*
+	 * On poweron, AVP clock source (also called system clock) is set to
+	 * PLLP_out0 with frequency set at 1MHz. Before initializing PLLP, we
+	 * need to move the system clock's source to CLK_M temporarily, and
+	 * then switch it to PLLP_out4 (204MHz) at a later time.
+	 */
 	val = (SCLK_SOURCE_CLKM << SCLK_SWAKEUP_FIQ_SOURCE_SHIFT) |
 		(SCLK_SOURCE_CLKM << SCLK_SWAKEUP_IRQ_SOURCE_SHIFT) |
 		(SCLK_SOURCE_CLKM << SCLK_SWAKEUP_RUN_SOURCE_SHIFT) |
 		(SCLK_SOURCE_CLKM << SCLK_SWAKEUP_IDLE_SOURCE_SHIFT) |
 		(SCLK_SYS_STATE_RUN << SCLK_SYS_STATE_SHIFT);
-	writel(val, &clk_rst->crc_sclk_brst_pol);
-	/* Wait 2-3us for the clock to flush thru the logic as per the TRM */
-	udelay(3);
+	writel(val, &clk_rst->sclk_brst_pol);
+	udelay(2);
+
+	/* Set active CPU cluster to G */
+	clrbits_le32(&flow->cluster_control, 1);
+
+	/* Change the oscillator drive strength */
+	val = readl(&clk_rst->osc_ctrl);
+	val &= ~OSC_XOFS_MASK;
+	val |= (OSC_DRIVE_STRENGTH << OSC_XOFS_SHIFT);
+	writel(val, &clk_rst->osc_ctrl);
+
+	/* Ambiguous quote from U-Boot. TODO: what does this mean?
+	 * "should update same value in PMC_OSC_EDPD_OVER XOFS
+	 * field for warmboot" */
+	val = readl(&pmc->osc_edpd_over);
+	val &= ~PMC_OSC_EDPD_OVER_XOFS_MASK;
+	val |= (OSC_DRIVE_STRENGTH << PMC_OSC_EDPD_OVER_XOFS_SHIFT);
+	writel(val, &pmc->osc_edpd_over);
+
+	/* Disable IDDQ for PLLX before we set it up (from U-Boot -- why?) */
+	val = readl(&clk_rst->pllx_misc3);
+	val &= ~PLLX_IDDQ_MASK;
+	writel(val, &clk_rst->pllx_misc3);
+	udelay(2);
+
+	/* Set PLLC dynramp_step A to 0x2b and B to 0xb (from U-Boot -- why?) */
+	writel(0x2b << 17 | 0xb << 9, &clk_rst->pllc_misc2);
+
+	adjust_pllp_out_freqs();
+
+	init_pll(&clk_rst->pllx_base, &clk_rst->pllx_misc, osc_table[osc].pllx);
+	init_pll(&clk_rst->pllp_base, &clk_rst->pllp_misc, osc_table[osc].pllp);
+	init_pll(&clk_rst->pllc_base, &clk_rst->pllc_misc, osc_table[osc].pllc);
+	init_pll(&clk_rst->plld_base, &clk_rst->plld_misc, osc_table[osc].plld);
+	init_pll(&clk_rst->pllu_base, &clk_rst->pllu_misc, osc_table[osc].pllu);
+
+	val = (1 << CLK_SYS_RATE_AHB_RATE_SHIFT);
+	writel(val, &clk_rst->clk_sys_rate);
+}
+
+void clock_config(void)
+{
+	/* Enable clocks for the required peripherals. */
+	/* TODO: can (should?) we use the _SET and _CLR registers here? */
+	setbits_le32(&clk_rst->clk_out_enb_l,
+		     CLK_L_CACHE2 | CLK_L_GPIO | CLK_L_TMR | CLK_L_I2C1 |
+		     CLK_L_SDMMC4);
+	setbits_le32(&clk_rst->clk_out_enb_h,
+		     CLK_H_EMC | CLK_H_I2C2 | CLK_H_I2C5 | CLK_H_SBC1 |
+		     CLK_H_PMC | CLK_H_APBDMA | CLK_H_MEM);
+	setbits_le32(&clk_rst->clk_out_enb_u,
+		     CLK_U_I2C3 | CLK_U_CSITE | CLK_U_SDMMC3);
+	setbits_le32(&clk_rst->clk_out_enb_v, CLK_V_MSELECT);
+	setbits_le32(&clk_rst->clk_out_enb_w, CLK_W_DVFS);
+
+	/*
+	 * Set MSELECT clock source as PLLP (00), and ask for a clock
+	 * divider that would set the MSELECT clock at 102MHz for a
+	 * PLLP base of 408MHz.
+	 */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_mselect, 0,
+		CLK_DIVIDER(TEGRA_PLLP_KHZ, 102000));
+
+	/* Give clock time to stabilize */
+	udelay(IO_STABILIZATION_DELAY);
+
+	/* I2C1 gets CLK_M and a divisor of 17 */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_i2c1, 3, 16);
+	/* I2C2 gets CLK_M and a divisor of 17 */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_i2c2, 3, 16);
+	/* I2C3 (cam) gets CLK_M and a divisor of 17 */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_i2c3, 3, 16);
+	/* I2C5 (PMU) gets CLK_M and a divisor of 17 */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_i2c5, 3, 16);
+
+	/* UARTA gets PLLP, deactivate CLK_UART_DIV_OVERRIDE */
+	writel(0 << CLK_SOURCE_SHIFT, &clk_rst->clk_src_uarta);
+
+	/* Give clock time to stabilize. */
+	udelay(IO_STABILIZATION_DELAY);
+
+	/* Take required peripherals out of reset. */
+
+	clrbits_le32(&clk_rst->rst_dev_l,
+		     CLK_L_CACHE2 | CLK_L_GPIO | CLK_L_TMR | CLK_L_I2C1 |
+		     CLK_L_SDMMC4);
+	clrbits_le32(&clk_rst->rst_dev_h,
+		     CLK_H_EMC | CLK_H_I2C2 | CLK_H_I2C5 | CLK_H_SBC1 |
+		     CLK_H_PMC | CLK_H_APBDMA | CLK_H_MEM);
+	clrbits_le32(&clk_rst->rst_dev_u,
+		     CLK_U_I2C3 | CLK_U_CSITE | CLK_U_SDMMC3);
+	clrbits_le32(&clk_rst->rst_dev_v, CLK_V_MSELECT);
+	clrbits_le32(&clk_rst->rst_dev_w, CLK_W_DVFS);
 }
diff --git a/src/soc/nvidia/tegra124/cpug.S b/src/soc/nvidia/tegra124/cpug.S
new file mode 100644
index 0000000..7f761e8
--- /dev/null
+++ b/src/soc/nvidia/tegra124/cpug.S
@@ -0,0 +1,56 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote products
+ *    derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+	.align 6
+	.arm
+
+	.global cpug_stack_pointer
+cpug_stack_pointer:
+	.word 0
+
+	.global cpug_entry_point
+cpug_entry_point:
+	.word 0
+
+	.global cpug_setup
+	.type cpug_setup, function
+	cpug_setup:
+
+	/*
+	 * Set the cpu to System mode with IRQ and FIQ disabled. Prefetch/Data
+	 * aborts may happen early and crash before the abort handlers are
+	 * installed, but at least the problem will show up near the code that
+	 * causes it.
+	 */
+	msr	cpsr_cxf, #0xdf
+
+	ldr	sp, cpug_stack_pointer
+	eor	lr, lr
+	ldr	r0, cpug_entry_point
+	bx	r0
diff --git a/src/soc/nvidia/tegra124/cpug.h b/src/soc/nvidia/tegra124/cpug.h
new file mode 100644
index 0000000..842843a
--- /dev/null
+++ b/src/soc/nvidia/tegra124/cpug.h
@@ -0,0 +1,29 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef __SOC_NVIDIA_TEGRA124_CPUG_H__
+#define __SOC_NVIDIA_TEGRA124_CPUG_H__
+
+#include <stdint.h>
+
+extern u32 cpug_stack_pointer;
+extern u32 cpug_entry_point;
+void cpug_setup(void);
+
+#endif	/* __SOC_NVIDIA_TEGRA124_CPUG_H__ */
diff --git a/src/soc/nvidia/tegra124/display.c b/src/soc/nvidia/tegra124/display.c
new file mode 100644
index 0000000..b7d3d7f
--- /dev/null
+++ b/src/soc/nvidia/tegra124/display.c
@@ -0,0 +1,304 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <console/console.h>
+#include <arch/io.h>
+#include <stdint.h>
+#include <lib.h>
+#include <stdlib.h>
+#include <delay.h>
+#include <soc/addressmap.h>
+#include <device/device.h>
+#include <stdlib.h>
+#include <string.h>
+#include <cpu/cpu.h>
+#include <boot/tables.h>
+#include <cbmem.h>
+#include <soc/clock.h>
+#include <soc/nvidia/tegra/dc.h>
+#include "clk_rst.h"
+#include "chip.h"
+#include <soc/display.h>
+
+static struct clk_rst_ctlr *clk_rst = (void *)TEGRA_CLK_RST_BASE;
+
+static const u32 rgb_enb_tab[PIN_REG_COUNT] = {
+	0x00000000,
+	0x00000000,
+	0x00000000,
+	0x00000000,
+};
+
+static const u32 rgb_polarity_tab[PIN_REG_COUNT] = {
+	0x00000000,
+	0x01000000,
+	0x00000000,
+	0x00000000,
+};
+
+static const u32 rgb_data_tab[PIN_REG_COUNT] = {
+	0x00000000,
+	0x00000000,
+	0x00000000,
+	0x00000000,
+};
+
+static const u32 rgb_sel_tab[PIN_OUTPUT_SEL_COUNT] = {
+	0x00000000,
+	0x00000000,
+	0x00000000,
+	0x00000000,
+	0x00210222,
+	0x00002200,
+	0x00020000,
+};
+
+static int update_display_mode(struct dc_disp_reg *disp,
+							   struct soc_nvidia_tegra124_config *config)
+{
+	u32 val;
+	u32 rate;
+	u32 div;
+
+	writel(0x0, &disp->disp_timing_opt);
+
+	writel(config->vref_to_sync << 16 | config->href_to_sync,
+		   &disp->ref_to_sync);
+	writel(config->vsync_width << 16 | config->hsync_width, &disp->sync_width);
+	writel(config->vback_porch << 16 | config->hback_porch, &disp->back_porch);
+	writel(config->vfront_porch << 16 | config->hfront_porch,
+		   &disp->front_porch);
+
+	writel(config->xres | (config->yres << 16), &disp->disp_active);
+
+	val = DE_SELECT_ACTIVE << DE_SELECT_SHIFT;
+	val |= DE_CONTROL_NORMAL << DE_CONTROL_SHIFT;
+	writel(val, &disp->data_enable_opt);
+
+	val = DATA_FORMAT_DF1P1C << DATA_FORMAT_SHIFT;
+	val |= DATA_ALIGNMENT_MSB << DATA_ALIGNMENT_SHIFT;
+	val |= DATA_ORDER_RED_BLUE << DATA_ORDER_SHIFT;
+	writel(val, &disp->disp_interface_ctrl);
+
+	/*
+	 * The pixel clock divider is in 7.1 format (where the bottom bit
+	 * represents 0.5). Here we calculate the divider needed to get from
+	 * the display clock (typically 600MHz) to the pixel clock. We round
+	 * up or down as required.
+	 * We use pllp for now.
+	 */
+	rate = 600 * 1000000;
+	div = ((rate * 2 + config->pixel_clock / 2) / config->pixel_clock) - 2;
+	printk(BIOS_SPEW, "Display clock %d, divider %d\n", rate, div);
+
+	writel(0x00010001, &disp->shift_clk_opt);
+
+	val = PIXEL_CLK_DIVIDER_PCD1 << PIXEL_CLK_DIVIDER_SHIFT;
+	val |= div << SHIFT_CLK_DIVIDER_SHIFT;
+	writel(val, &disp->disp_clk_ctrl);
+
+	return 0;
+}
+
+static int setup_window(struct disp_ctl_win *win,
+						struct soc_nvidia_tegra124_config *config)
+{
+	int log2_bpp = log2(config->framebuffer_bits_per_pixel);
+	win->x = 0;
+	win->y = 0;
+	win->w = config->xres;
+	win->h = config->yres;
+	win->out_x = 0;
+	win->out_y = 0;
+	win->out_w = config->xres;
+	win->out_h = config->yres;
+	win->phys_addr = config->framebuffer_base;
+	win->stride = config->xres * (1 << log2_bpp) / 8;
+	printk(BIOS_SPEW, "%s: depth = %d\n", __func__, log2_bpp);
+	switch (log2_bpp) {
+		case 5:
+		case 24:
+			win->fmt = COLOR_DEPTH_R8G8B8A8;
+			win->bpp = 32;
+			break;
+		case 4:
+			win->fmt = COLOR_DEPTH_B5G6R5;
+			win->bpp = 16;
+			break;
+
+		default:
+			printk(BIOS_SPEW, "Unsupported LCD bit depth\n");
+			return -1;
+	}
+
+	return 0;
+}
+
+static void update_window(struct display_controller *dc,
+						  struct disp_ctl_win *win,
+						  struct soc_nvidia_tegra124_config *config)
+{
+	u32 h_dda, v_dda;
+	u32 val;
+
+	val = readl(&dc->cmd.disp_win_header);
+	val |= WINDOW_A_SELECT;
+	writel(val, &dc->cmd.disp_win_header);
+
+	writel(win->fmt, &dc->win.color_depth);
+
+	clrsetbits_le32(&dc->win.byte_swap, BYTE_SWAP_MASK,
+					BYTE_SWAP_NOSWAP << BYTE_SWAP_SHIFT);
+
+	val = win->out_x << H_POSITION_SHIFT;
+	val |= win->out_y << V_POSITION_SHIFT;
+	writel(val, &dc->win.pos);
+
+	val = win->out_w << H_SIZE_SHIFT;
+	val |= win->out_h << V_SIZE_SHIFT;
+	writel(val, &dc->win.size);
+
+	val = (win->w * win->bpp / 8) << H_PRESCALED_SIZE_SHIFT;
+	val |= win->h << V_PRESCALED_SIZE_SHIFT;
+	writel(val, &dc->win.prescaled_size);
+
+	writel(0, &dc->win.h_initial_dda);
+	writel(0, &dc->win.v_initial_dda);
+
+	h_dda = (win->w * 0x1000) / MAX(win->out_w - 1, 1);
+	v_dda = (win->h * 0x1000) / MAX(win->out_h - 1, 1);
+
+	val = h_dda << H_DDA_INC_SHIFT;
+	val |= v_dda << V_DDA_INC_SHIFT;
+	writel(val, &dc->win.dda_increment);
+
+	writel(win->stride, &dc->win.line_stride);
+	writel(0, &dc->win.buf_stride);
+
+	val = WIN_ENABLE;
+	if (win->bpp < 24)
+		val |= COLOR_EXPAND;
+	writel(val, &dc->win.win_opt);
+
+	writel((u32) win->phys_addr, &dc->winbuf.start_addr);
+	writel(win->x, &dc->winbuf.addr_h_offset);
+	writel(win->y, &dc->winbuf.addr_v_offset);
+
+	writel(0xff00, &dc->win.blend_nokey);
+	writel(0xff00, &dc->win.blend_1win);
+
+	val = GENERAL_ACT_REQ | WIN_A_ACT_REQ;
+	val |= GENERAL_UPDATE | WIN_A_UPDATE;
+	writel(val, &dc->cmd.state_ctrl);
+}
+
+/* This is really aimed at the LCD panel. That said, there are two display
+ * devices on this part and we may someday want to extend it for other boards.
+ */
+void display_startup(device_t dev)
+{
+	u32 val;
+	int i;
+	struct soc_nvidia_tegra124_config *config = dev->chip_info;
+	struct display_controller *dc = (void *)config->display_controller;
+	struct disp_ctl_win window;
+
+	/* should probably just make it all MiB ... in future */
+	u32 framebuffer_size_mb = config->framebuffer_size / MiB;
+	u32 framebuffer_base_mb = config->framebuffer_base / MiB;
+
+	printk(BIOS_SPEW,
+		"%s: xres %d yres %d framebuffer_bits_per_pixel %d\n",
+		__func__,
+	       config->xres, config->yres, config->framebuffer_bits_per_pixel);
+	if (framebuffer_size_mb == 0) {
+		framebuffer_size_mb = ALIGN_UP(config->xres * config->yres *
+			(config->framebuffer_bits_per_pixel / 8), MiB) / MiB;
+	}
+
+	if (!framebuffer_base_mb)
+		framebuffer_base_mb = FB_BASE_MB;
+
+	mmu_config_range(framebuffer_base_mb, framebuffer_size_mb,
+		config->cache_policy);
+
+	/* Enable flushing after LCD writes if requested */
+	/* I don't understand this part yet.
+	   lcd_set_flush_dcache(config.cache_type & FDT_LCD_CACHE_FLUSH);
+	 */
+	printk(BIOS_SPEW, "LCD frame buffer at %dMiB to %dMiB\n", framebuffer_base_mb,
+		   framebuffer_base_mb + framebuffer_size_mb);
+
+	/* GPIO magic here if needed to start powering up things. You
+	 * really only want to enable vdd, wait a bit, and then enable
+	 * the panel. However ... the timings in the tegra20 dts make
+	 * no sense to me. I'm pretty sure they're wrong.
+	 * The panel_vdd is done in the romstage, so we need only
+	 * light things up here once we're sure it's all working.
+	 */
+	setbits_le32(&clk_rst->rst_dev_l, CLK_L_DISP1 | CLK_L_HOST1X);
+
+	clock_ll_set_source_divisor(&clk_rst->clk_src_host1x, 4,
+				    CLK_DIVIDER(TEGRA_PLLP_KHZ, 144000));
+	/* u-boot uses PLLC for DISP1.
+	 * But the u-boot code does not work and we don't set up PLLC anyway.
+	 * PLLP seems quite good enough, so run with that for now.  */
+	clock_ll_set_source_divisor(&clk_rst->clk_src_disp1, 0 /* 4 */,
+				    CLK_DIVIDER(TEGRA_PLLP_KHZ, 600000));
+
+	udelay(2);
+
+	clrbits_le32(&clk_rst->rst_dev_l, CLK_L_DISP1|CLK_L_HOST1X);
+
+	writel(0x00000100, &dc->cmd.gen_incr_syncpt_ctrl);
+	writel(0x0000011a, &dc->cmd.cont_syncpt_vsync);
+	writel(0x00000000, &dc->cmd.int_type);
+	writel(0x00000000, &dc->cmd.int_polarity);
+	writel(0x00000000, &dc->cmd.int_mask);
+	writel(0x00000000, &dc->cmd.int_enb);
+
+	val = PW0_ENABLE | PW1_ENABLE | PW2_ENABLE;
+	val |= PW3_ENABLE | PW4_ENABLE | PM0_ENABLE;
+	val |= PM1_ENABLE;
+	writel(val, &dc->cmd.disp_pow_ctrl);
+
+	val = readl(&dc->cmd.disp_cmd);
+	val |= CTRL_MODE_C_DISPLAY << CTRL_MODE_SHIFT;
+	writel(val, &dc->cmd.disp_cmd);
+
+	writel(0x00000020, &dc->disp.mem_high_pri);
+	writel(0x00000001, &dc->disp.mem_high_pri_timer);
+
+	for (i = 0; i < PIN_REG_COUNT; i++) {
+		writel(rgb_enb_tab[i], &dc->com.pin_output_enb[i]);
+		writel(rgb_polarity_tab[i], &dc->com.pin_output_polarity[i]);
+		writel(rgb_data_tab[i], &dc->com.pin_output_data[i]);
+	}
+
+	for (i = 0; i < PIN_OUTPUT_SEL_COUNT; i++)
+		writel(rgb_sel_tab[i], &dc->com.pin_output_sel[i]);
+
+	if (config->pixel_clock)
+		update_display_mode(&dc->disp, config);
+
+	if (!setup_window(&window, config))
+		update_window(dc, &window, config);
+
+}
+
diff --git a/src/soc/nvidia/tegra124/dma.c b/src/soc/nvidia/tegra124/dma.c
new file mode 100644
index 0000000..964bb7b
--- /dev/null
+++ b/src/soc/nvidia/tegra124/dma.c
@@ -0,0 +1,236 @@
+/*
+ * (C) Copyright 2010,2011
+ * NVIDIA Corporation <www.nvidia.com>
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <console/console.h>
+
+#include <arch/io.h>
+#include <soc/addressmap.h>
+
+#include "dma.h"
+
+/*
+ * Note: Many APB DMA controller registers are laid out such that each
+ * bit controls or represents the status for the corresponding channel.
+ * So we will not bother to list each individual bit in this case.
+ */
+#define APBDMA_COMMAND_GEN			(1 << 31)
+
+#define APBDMA_CNTRL_REG_COUNT_VALUE_MASK	0xffff
+#define APBDMA_CNTRL_REG_COUNT_VALUE_SHIFT	0
+
+struct apb_dma {
+	u32 command;		/* 0x00 */
+	u32 status;		/* 0x04 */
+	u32 rsvd1[2];
+	u32 cntrl_reg;		/* 0x10 */
+	u32 irq_sta_cpu;	/* 0x14 */
+	u32 irq_sta_cop;	/* 0x18 */
+	u32 irq_mask;		/* 0x1c */
+	u32 irq_mask_set;	/* 0x20 */
+	u32 irq_mask_clr;	/* 0x24 */
+	u32 trig_reg;		/* 0x28 */
+	u32 channel_trig_reg;	/* 0x2c */
+	u32 dma_status;		/* 0x30 */
+	u32 channel_en_reg;	/* 0x34 */
+	u32 security_reg;	/* 0x38 */
+	u32 channel_swid;	/* 0x3c */
+	u32 rsvd[1];
+	u32 chan_wt_reg0;	/* 0x44 */
+	u32 chan_wt_reg1;	/* 0x48 */
+	u32 chan_wt_reg2;	/* 0x4c */
+	u32 chan_wr_reg3;	/* 0x50 */
+	u32 channel_swid1;	/* 0x54 */
+} __attribute__((packed));
+struct apb_dma * const apb_dma = (struct apb_dma *)TEGRA_APB_DMA_BASE;
+
+/*
+ * Naming in the doc included a superfluous _CHANNEL_n_ for
+ * each entry and was left out for the sake of conciseness.
+ */
+#define APBDMACHAN_CSR_ENB			(1 << 31)
+#define APBDMACHAN_CSR_IE_EOC			(1 << 30)
+#define APBDMACHAN_CSR_HOLD			(1 << 29)
+#define APBDMACHAN_CSR_DIR			(1 << 28)
+#define APBDMACHAN_CSR_ONCE			(1 << 27)
+#define APBDMACHAN_CSR_FLOW			(1 << 21)
+#define APBDMACHAN_CSR_REQ_SEL_MASK		0x1f
+#define APBDMACHAN_CSR_REQ_SEL_SHIFT		16
+
+#define APBDMACHAN_STA_BSY			(1 << 31)
+#define APBDMACHAN_STA_ISE_EOC			(1 << 30)
+#define APBDMACHAN_STA_HALT			(1 << 29)
+#define APBDMACHAN_STA_PING_PONG_STA		(1 << 28)
+#define APBDMACHAN_STA_DMA_ACTIVITY		(1 << 27)
+#define APBDMACHAN_STA_CHANNEL_PAUSE		(1 << 26)
+
+#define APBDMACHAN_CSRE_CHANNEL_PAUSE		(1 << 31)
+#define APBDMACHAN_CSRE_TRIG_SEL_MASK		0x3f
+#define APBDMACHAN_CSRE_TRIG_SEL_SHIFT		14
+
+#define APBDMACHAN_AHB_PTR_MASK			(0x3fffffff)
+#define APBDMACHAN_AHB_PTR_SHIFT		2
+
+#define APBDMACHAN_AHB_SEQ_INTR_ENB		(1 << 31)
+#define APBDMACHAN_AHB_SEQ_AHB_BUS_WIDTH_MASK	0x7
+#define APBDMACHAN_AHB_SEQ_AHB_BUS_WIDTH_SHIFT	28
+#define APBDMACHAN_AHB_SEQ_AHB_DATA_SWAP	(1 << 27)
+#define APBDMACHAN_AHB_SEQ_AHB_BURST_MASK	0x7
+#define APBDMACHAN_AHB_SEQ_AHB_BURST_SHIFT	24
+#define APBDMACHAN_AHB_SEQ_DBL_BUF		(1 << 19)
+#define APBDMACHAN_AHB_SEQ_WRAP_MASK		0x7
+#define APBDMACHAN_AHB_SEQ_WRAP_SHIFT		16
+
+#define APBDMACHAN_AHB_SEQ_AHB_BUS_WIDTH_MASK	0x7
+#define APBDMACHAN_AHB_SEQ_AHB_BUS_WIDTH_SHIFT	28
+
+#define APBDMACHAN_APB_PTR_MASK			0x3fffffff
+#define APBDMACHAN_APB_PTR_SHIFT		2
+
+#define APBDMACHAN_APB_SEQ_APB_BUS_WIDTH_MASK	0x7
+#define APBDMACHAN_APB_SEQ_APB_BUS_WIDTH_SHIFT	28
+#define APBDMACHAN_APB_SEQ_APB_DATA_SWAP	(1 << 27)
+#define APBDMACHAN_APB_SEQ_APB_ADDR_WRAP_MASK	0x7
+#define APBDMACHAN_APB_SEQ_APB_ADDR_WRAP_SHIFT	16
+
+#define APBDMACHAN_WORD_TRANSFER_
+
+#define APBDMACHAN_WORD_TRANSFER_MASK		0x0fffffff
+#define APBDMACHAN_WORD_TRANSFER_SHIFT		2
+
+#define APB_DMA_OFFSET(n) \
+		(struct apb_dma_channel_regs *)(TEGRA_APB_DMA_BASE + n)
+struct apb_dma_channel apb_dma_channels[] = {
+	{ .num = 0, .regs = APB_DMA_OFFSET(0x1000) },
+	{ .num = 1, .regs = APB_DMA_OFFSET(0x1040) },
+	{ .num = 2, .regs = APB_DMA_OFFSET(0x1080) },
+	{ .num = 3, .regs = APB_DMA_OFFSET(0x10c0) },
+	{ .num = 4, .regs = APB_DMA_OFFSET(0x1100) },
+	{ .num = 5, .regs = APB_DMA_OFFSET(0x1140) },
+	{ .num = 6, .regs = APB_DMA_OFFSET(0x1180) },
+	{ .num = 7, .regs = APB_DMA_OFFSET(0x11c0) },
+	{ .num = 8, .regs = APB_DMA_OFFSET(0x1200) },
+	{ .num = 9, .regs = APB_DMA_OFFSET(0x1240) },
+	{ .num = 10, .regs = APB_DMA_OFFSET(0x1280) },
+	{ .num = 11, .regs = APB_DMA_OFFSET(0x12c0) },
+	{ .num = 12, .regs = APB_DMA_OFFSET(0x1300) },
+	{ .num = 13, .regs = APB_DMA_OFFSET(0x1340) },
+	{ .num = 14, .regs = APB_DMA_OFFSET(0x1380) },
+	{ .num = 15, .regs = APB_DMA_OFFSET(0x13c0) },
+	{ .num = 16, .regs = APB_DMA_OFFSET(0x1400) },
+	{ .num = 17, .regs = APB_DMA_OFFSET(0x1440) },
+	{ .num = 18, .regs = APB_DMA_OFFSET(0x1480) },
+	{ .num = 19, .regs = APB_DMA_OFFSET(0x14c0) },
+	{ .num = 20, .regs = APB_DMA_OFFSET(0x1500) },
+	{ .num = 21, .regs = APB_DMA_OFFSET(0x1540) },
+	{ .num = 22, .regs = APB_DMA_OFFSET(0x1580) },
+	{ .num = 23, .regs = APB_DMA_OFFSET(0x15c0) },
+	{ .num = 24, .regs = APB_DMA_OFFSET(0x1600) },
+	{ .num = 25, .regs = APB_DMA_OFFSET(0x1640) },
+	{ .num = 26, .regs = APB_DMA_OFFSET(0x1680) },
+	{ .num = 27, .regs = APB_DMA_OFFSET(0x16c0) },
+	{ .num = 28, .regs = APB_DMA_OFFSET(0x1700) },
+	{ .num = 29, .regs = APB_DMA_OFFSET(0x1740) },
+	{ .num = 30, .regs = APB_DMA_OFFSET(0x1780) },
+	{ .num = 31, .regs = APB_DMA_OFFSET(0x17c0) },
+};
+
+int dma_busy(struct apb_dma_channel * const channel)
+{
+	/*
+	 * In continuous mode, the BSY_n bit in APB_DMA_STATUS and
+	 * BSY in APBDMACHAN_CHANNEL_n_STA_0 will remain set as '1' so long
+	 * as the channel is enabled. So for this function we'll use the
+	 * DMA_ACTIVITY bit.
+	 */
+	return read32(&channel->regs->sta) & APBDMACHAN_STA_DMA_ACTIVITY ? 1 : 0;
+}
+/* claim a DMA channel */
+struct apb_dma_channel * const dma_claim(void)
+{
+	int i;
+	struct apb_dma_channel_regs *regs = NULL;
+
+	/*
+	 * Set global enable bit, otherwise register access to channel
+	 * DMA registers will not be possible.
+	 */
+	setbits_le32(&apb_dma->command, APBDMA_COMMAND_GEN);
+
+	for (i = 0; i < ARRAY_SIZE(apb_dma_channels); i++) {
+		regs = apb_dma_channels[i].regs;
+
+		if (!apb_dma_channels[i].in_use) {
+			u32 status = read32(&regs->sta);
+			if (status & (1 << i)) {
+				/* FIXME: should this be fatal? */
+				printk(BIOS_DEBUG, "%s: DMA channel %d busy?\n",
+						__func__, i);
+			}
+			break;
+		}
+	}
+
+	if (i == ARRAY_SIZE(apb_dma_channels))
+		return NULL;
+
+	apb_dma_channels[i].in_use = 1;
+	return &apb_dma_channels[i];
+}
+
+/* release a DMA channel */
+void dma_release(struct apb_dma_channel * const channel)
+{
+	int i;
+
+	/* FIXME: make this "thread" friendly */
+	while (dma_busy(channel))
+		;
+
+	channel->in_use = 0;
+
+	/* clear the global enable bit if no channels are in use */
+	for (i = 0; i < ARRAY_SIZE(apb_dma_channels); i++) {
+		if (apb_dma_channels[i].in_use)
+			return;
+	}
+
+	clrbits_le32(&apb_dma->command, APBDMA_COMMAND_GEN);
+}
+
+int dma_start(struct apb_dma_channel * const channel)
+{
+	struct apb_dma_channel_regs *regs = channel->regs;
+
+	/* Set ENB bit for this channel */
+	setbits_le32(&regs->csr, APBDMACHAN_CSR_ENB);
+
+	return 0;
+}
+
+int dma_stop(struct apb_dma_channel * const channel)
+{
+	struct apb_dma_channel_regs *regs = channel->regs;
+
+	/* Clear ENB bit for this channel */
+	clrbits_le32(&regs->csr, APBDMACHAN_CSR_ENB);
+
+	return 0;
+}
diff --git a/src/soc/nvidia/tegra124/dma.h b/src/soc/nvidia/tegra124/dma.h
new file mode 100644
index 0000000..d7e9090
--- /dev/null
+++ b/src/soc/nvidia/tegra124/dma.h
@@ -0,0 +1,61 @@
+/*
+ * (C) Copyright 2010,2011
+ * NVIDIA Corporation <www.nvidia.com>
+ *  Copyright (C) 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __NVIDIA_TEGRA124_DMA_H__
+#define __NVIDIA_TEGRA124_DMA_H__
+
+#include <inttypes.h>
+#include <soc/addressmap.h>
+
+/*
+ * The DMA engine operates on 4 bytes at a time, so make sure any data
+ * passed via DMA is aligned to avoid underrun/overrun.
+ */
+#define TEGRA_DMA_ALIGN_BYTES	4
+
+struct apb_dma_channel_regs {
+	u32 csr;		/* 0x00 */
+	u32 sta;		/* 0x04 */
+	u32 dma_byte_sta;	/* 0x08 */
+	u32 csre;		/* 0x0c */
+	u32 ahb_ptr;		/* 0x10 */
+	u32 ahb_seq;		/* 0x14 */
+	u32 apb_ptr;		/* 0x18 */
+	u32 apb_seq;		/* 0x1c */
+	u32 wcount;		/* 0x20 */
+	u32 word_transfer;	/* 0x24 */
+} __attribute__((packed));
+
+struct apb_dma_channel {
+	const int num;
+	struct apb_dma_channel_regs *regs;
+
+	/*
+	 * Basic high-level semaphore that can be used to "claim"
+	 * a DMA channel e.g. by SPI, I2C, or other peripheral driver.
+	 */
+	int in_use;
+};
+
+struct apb_dma_channel * const dma_claim(void);
+void dma_release(struct apb_dma_channel * const channel);
+int dma_start(struct apb_dma_channel * const channel);
+int dma_stop(struct apb_dma_channel * const channel);
+int dma_busy(struct apb_dma_channel * const channel);
+
+#endif	/* __NVIDIA_TEGRA124_DMA_H__ */
diff --git a/src/soc/nvidia/tegra124/early_display.c b/src/soc/nvidia/tegra124/early_display.c
new file mode 100644
index 0000000..4356a62
--- /dev/null
+++ b/src/soc/nvidia/tegra124/early_display.c
@@ -0,0 +1,61 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <console/console.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include <arch/cache.h>
+#include "chip.h"
+#include <soc/display.h>
+#include "gpio.h"
+
+/* the primary purpose of this function is to kick off things in
+ * romstage that are time consuming. No more should be done at this
+ * point than the bare minimum that will allow us to pick up the work
+ * in the ramstage and not require lots of udelays.  ramstage will do
+ * the rest. As it happens, that merely means 'turn off everything you can,
+ * save the minimum long-delay bit that you need to get graphics going'. It's
+ * important not to have the backlight on as people don't like display trash
+ * on startup, even if I do.
+ */
+void setup_display(struct soc_nvidia_tegra124_config *config)
+{
+	if (config->panel_vdd_gpio) {
+		gpio_output(config->panel_vdd_gpio, 1);
+		printk(BIOS_SPEW, "%s: setting gpio %08x to %d\n",
+			__func__, config->panel_vdd_gpio, 1);
+	}
+	if (config->lvds_shutdown_gpio) {
+		gpio_output(config->lvds_shutdown_gpio, 0);
+		printk(BIOS_SPEW, "%s: setting gpio %08x to %d\n",
+			__func__, config->lvds_shutdown_gpio, 0);
+	}
+	if (config->backlight_en_gpio) {
+		gpio_output(config->backlight_en_gpio, 0);
+		printk(BIOS_SPEW, "%s: setting gpio %08x to %d\n",
+			__func__, config->backlight_en_gpio, 0);
+	}
+	if (config->backlight_vdd_gpio) {
+		gpio_output(config->backlight_vdd_gpio, 0);
+		printk(BIOS_SPEW, "%s: setting gpio %08x to %d\n",
+			__func__, config->backlight_vdd_gpio, 0);
+	}
+
+}
+
diff --git a/src/soc/nvidia/tegra124/flow.h b/src/soc/nvidia/tegra124/flow.h
new file mode 100644
index 0000000..a974f09
--- /dev/null
+++ b/src/soc/nvidia/tegra124/flow.h
@@ -0,0 +1,49 @@
+/*
+ * Copyright (c) 2010-2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _TEGRA124_FLOW_H_
+#define _TEGRA124_FLOW_H_
+
+struct flow_ctlr {
+	u32 halt_cpu_events;	/* offset 0x00 */
+	u32 halt_cop_events;	/* offset 0x04 */
+	u32 cpu_csr;		/* offset 0x08 */
+	u32 cop_csr;		/* offset 0x0c */
+	u32 xrq_events;		/* offset 0x10 */
+	u32 halt_cpu1_events;	/* offset 0x14 */
+	u32 cpu1_csr;		/* offset 0x18 */
+	u32 halt_cpu2_events;	/* offset 0x1c */
+	u32 cpu2_csr;		/* offset 0x20 */
+	u32 halt_cpu3_events;	/* offset 0x24 */
+	u32 cpu3_csr;		/* offset 0x28 */
+	u32 cluster_control;	/* offset 0x2c */
+	u32 halt_cop1_events;	/* offset 0x30 */
+	u32 halt_cop1_csr;	/* offset 0x34 */
+	u32 cpu_pwr_csr;	/* offset 0x38 */
+	u32 mpid;		/* offset 0x3c */
+	u32 ram_repair;		/* offset 0x40 */
+};
+
+/* HALT_COP_EVENTS_0, 0x04 */
+#define EVENT_MSEC		(1 << 24)
+#define EVENT_USEC		(1 << 25)
+#define EVENT_JTAG		(1 << 28)
+#define EVENT_MODE_STOP		(2 << 29)
+
+/* FLOW_CTLR_CLUSTER_CONTROL_0 0x2c */
+#define ACTIVE_LP		(1 << 0)
+
+#endif	/*  _TEGRA124_FLOW_H_ */
diff --git a/src/soc/nvidia/tegra124/gpio.h b/src/soc/nvidia/tegra124/gpio.h
index 83d727d..f7d1c30 100644
--- a/src/soc/nvidia/tegra124/gpio.h
+++ b/src/soc/nvidia/tegra124/gpio.h
@@ -23,11 +23,14 @@
 #include <soc/nvidia/tegra/gpio.h>
 #include <stdint.h>
 
+#include "pinmux.h"	/* for pinmux constants in GPIO macro */
+
 /* GPIO index constants. */
 
 #define GPIO_PORT_CONSTANTS(port) \
-	GPIO_##port##0, GPIO_##port##1, GPIO_##port##2, GPIO_##port##3, \
-	GPIO_##port##4, GPIO_##port##5, GPIO_##port##6, GPIO_##port##7
+	GPIO_##port##0_INDEX, GPIO_##port##1_INDEX, GPIO_##port##2_INDEX, \
+	GPIO_##port##3_INDEX, GPIO_##port##4_INDEX, GPIO_##port##5_INDEX, \
+	GPIO_##port##6_INDEX, GPIO_##port##7_INDEX
 
 enum {
 	GPIO_PORT_CONSTANTS(A),
diff --git a/src/soc/nvidia/tegra124/include/soc/addressmap.h b/src/soc/nvidia/tegra124/include/soc/addressmap.h
index bf59d75..9836215 100644
--- a/src/soc/nvidia/tegra124/include/soc/addressmap.h
+++ b/src/soc/nvidia/tegra124/include/soc/addressmap.h
@@ -21,6 +21,8 @@
 #ifndef __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_ADDRESS_MAP_H__
 #define __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_ADDRESS_MAP_H__
 
+#include <stddef.h>
+
 enum {
 	TEGRA_SRAM_BASE = 0x40000000,
 	TEGRA_SRAM_SIZE = 0x20000
@@ -28,12 +30,15 @@
 
 enum {
 	TEGRA_ARM_PERIPHBASE =		0x50040000,
+	TEGRA_ARM_DISPLAYA =            0x54200000,
+	TEGRA_ARM_DISPLAYB =            0x54240000,
 	TEGRA_PG_UP_BASE =		0x60000000,
 	TEGRA_TMRUS_BASE =		0x60005010,
 	TEGRA_CLK_RST_BASE =		0x60006000,
 	TEGRA_FLOW_BASE =		0x60007000,
 	TEGRA_GPIO_BASE =		0x6000D000,
 	TEGRA_EVP_BASE =		0x6000F000,
+	TEGRA_APB_DMA_BASE =		0x60020000,
 	TEGRA_APB_MISC_BASE =		0x70000000,
 	TEGRA_APB_MISC_GP_BASE =	TEGRA_APB_MISC_BASE + 0x0800,
 	TEGRA_APB_PINGROUP_BASE =	TEGRA_APB_MISC_BASE + 0x0868,
@@ -51,12 +56,12 @@
 	TEGRA_I2C4_BASE =		TEGRA_APB_MISC_BASE + 0xC700,
 	TEGRA_I2C5_BASE =		TEGRA_APB_MISC_BASE + 0xD000,
 	TEGRA_I2C6_BASE =		TEGRA_APB_MISC_BASE + 0xD100,
-	TEGRA_SLINK1_BASE =		TEGRA_APB_MISC_BASE + 0xD400,
-	TEGRA_SLINK2_BASE =		TEGRA_APB_MISC_BASE + 0xD600,
-	TEGRA_SLINK3_BASE =		TEGRA_APB_MISC_BASE + 0xD800,
-	TEGRA_SLINK4_BASE =		TEGRA_APB_MISC_BASE + 0xDA00,
-	TEGRA_SLINK5_BASE =		TEGRA_APB_MISC_BASE + 0xDC00,
-	TEGRA_SLINK6_BASE =		TEGRA_APB_MISC_BASE + 0xDE00,
+	TEGRA_SPI1_BASE =		TEGRA_APB_MISC_BASE + 0xD400,
+	TEGRA_SPI2_BASE =		TEGRA_APB_MISC_BASE + 0xD600,
+	TEGRA_SPI3_BASE =		TEGRA_APB_MISC_BASE + 0xD800,
+	TEGRA_SPI4_BASE =		TEGRA_APB_MISC_BASE + 0xDA00,
+	TEGRA_SPI5_BASE =		TEGRA_APB_MISC_BASE + 0xDC00,
+	TEGRA_SPI6_BASE =		TEGRA_APB_MISC_BASE + 0xDE00,
 	TEGRA_DVC_BASE =		TEGRA_APB_MISC_BASE + 0xD000,
 	TEGRA_PMC_BASE =		TEGRA_APB_MISC_BASE + 0xE400,
 	TEGRA_EMC_BASE =		TEGRA_APB_MISC_BASE + 0xF400,
@@ -69,4 +74,9 @@
 	TEGRA_I2C_BASE_COUNT = 6,
 };
 
+enum {
+	FB_SIZE_MB = (32),
+	FB_BASE_MB = (CONFIG_SYS_SDRAM_BASE/MiB + (CONFIG_DRAM_SIZE_MB - FB_SIZE_MB)),
+};
+
 #endif /* __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_ADDRESS_MAP_H__ */
diff --git a/src/soc/nvidia/tegra124/include/soc/clock.h b/src/soc/nvidia/tegra124/include/soc/clock.h
new file mode 100644
index 0000000..056a38b
--- /dev/null
+++ b/src/soc/nvidia/tegra124/include/soc/clock.h
@@ -0,0 +1,171 @@
+/*
+ * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __SOC_NVIDIA_TEGRA124_CLOCK_H__
+#define __SOC_NVIDIA_TEGRA124_CLOCK_H__
+
+enum {
+	CLK_L_CPU = 0x1 << 0,
+	CLK_L_COP = 0x1 << 1,
+	CLK_L_TRIG_SYS = 0x1 << 2,
+	CLK_L_RTC = 0x1 << 4,
+	CLK_L_TMR = 0x1 << 5,
+	CLK_L_UARTA = 0x1 << 6,
+	CLK_L_UARTB = 0x1 << 7,
+	CLK_L_GPIO = 0x1 << 8,
+	CLK_L_SDMMC2 = 0x1 << 9,
+	CLK_L_SPDIF = 0x1 << 10,
+	CLK_L_I2S1 = 0x1 << 11,
+	CLK_L_I2C1 = 0x1 << 12,
+	CLK_L_NDFLASH = 0x1 << 13,
+	CLK_L_SDMMC1 = 0x1 << 14,
+	CLK_L_SDMMC4 = 0x1 << 15,
+	CLK_L_PWM = 0x1 << 17,
+	CLK_L_I2S2 = 0x1 << 18,
+	CLK_L_EPP = 0x1 << 19,
+	CLK_L_VI = 0x1 << 20,
+	CLK_L_2D = 0x1 << 21,
+	CLK_L_USBD = 0x1 << 22,
+	CLK_L_ISP = 0x1 << 23,
+	CLK_L_3D = 0x1 << 24,
+	CLK_L_DISP2 = 0x1 << 26,
+	CLK_L_DISP1 = 0x1 << 27,
+	CLK_L_HOST1X = 0x1 << 28,
+	CLK_L_VCP = 0x1 << 29,
+	CLK_L_I2S0 = 0x1 << 30,
+	CLK_L_CACHE2 = 0x1 << 31,
+
+	CLK_H_MEM = 0x1 << 0,
+	CLK_H_AHBDMA = 0x1 << 1,
+	CLK_H_APBDMA = 0x1 << 2,
+	CLK_H_KBC = 0x1 << 4,
+	CLK_H_STAT_MON = 0x1 << 5,
+	CLK_H_PMC = 0x1 << 6,
+	CLK_H_FUSE = 0x1 << 7,
+	CLK_H_KFUSE = 0x1 << 8,
+	CLK_H_SBC1 = 0x1 << 9,
+	CLK_H_SNOR = 0x1 << 10,
+	CLK_H_JTAG2TBC = 0x1 << 11,
+	CLK_H_SBC2 = 0x1 << 12,
+	CLK_H_SBC3 = 0x1 << 14,
+	CLK_H_I2C5 = 0x1 << 15,
+	CLK_H_DSI = 0x1 << 16,
+	CLK_H_HSI = 0x1 << 18,
+	CLK_H_HDMI = 0x1 << 19,
+	CLK_H_CSI = 0x1 << 20,
+	CLK_H_I2C2 = 0x1 << 22,
+	CLK_H_UARTC = 0x1 << 23,
+	CLK_H_MIPI_CAL = 0x1 << 24,
+	CLK_H_EMC = 0x1 << 25,
+	CLK_H_USB2 = 0x1 << 26,
+	CLK_H_USB3 = 0x1 << 27,
+	CLK_H_MPE = 0x1 << 28,
+	CLK_H_VDE = 0x1 << 29,
+	CLK_H_BSEA = 0x1 << 30,
+	CLK_H_BSEV = 0x1 << 31,
+
+	CLK_U_UARTD = 0x1 << 1,
+	CLK_U_UARTE = 0x1 << 2,
+	CLK_U_I2C3 = 0x1 << 3,
+	CLK_U_SBC4 = 0x1 << 4,
+	CLK_U_SDMMC3 = 0x1 << 5,
+	CLK_U_PCIE = 0x1 << 6,
+	CLK_U_OWR = 0x1 << 7,
+	CLK_U_AFI = 0x1 << 8,
+	CLK_U_CSITE = 0x1 << 9,
+	CLK_U_PCIEXCLK = 0x1 << 10,
+	CLK_U_AVPUCQ = 0x1 << 11,
+	CLK_U_TRACECLKIN = 0x1 << 13,
+	CLK_U_SOC_THERM = 0x1 << 14,
+	CLK_U_DTV = 0x1 << 15,
+	CLK_U_NAND_SPEED = 0x1 << 16,
+	CLK_U_I2C_SLOW = 0x1 << 17,
+	CLK_U_DSIB = 0x1 << 18,
+	CLK_U_TSEC = 0x1 << 19,
+	CLK_U_IRAMA = 0x1 << 20,
+	CLK_U_IRAMB = 0x1 << 21,
+	CLK_U_IRAMC = 0x1 << 22,
+
+	// Bit 23 is EMUCIF in the clock reset register...
+	CLK_U_EMUCIF = 0x1 << 23,
+	// ...and IRAMD in the clock enable register.
+	CLK_U_IRAMD = 0x1 << 23,
+
+	CLK_U_CRAM2 = 0x1 << 24,
+	CLK_U_XUSB_HOST = 0x1 << 25,
+	CLK_U_MSENC = 0x1 << 27,
+	CLK_U_SUS_OUT = 0x1 << 28,
+	CLK_U_DEV2_OUT = 0x1 << 29,
+	CLK_U_DEV1_OUT = 0x1 << 30,
+	CLK_U_XUSB_DEV = 0x1 << 31,
+
+	CLK_V_CPUG = 0x1 << 0,
+	CLK_V_CPULP = 0x1 << 1,
+	CLK_V_3D2 = 0x1 << 2,
+	CLK_V_MSELECT = 0x1 << 3,
+	CLK_V_I2S3 = 0x1 << 5,
+	CLK_V_I2S4 = 0x1 << 6,
+	CLK_V_I2C4 = 0x1 << 7,
+	CLK_V_SBC5 = 0x1 << 8,
+	CLK_V_SBC6 = 0x1 << 9,
+	CLK_V_AUDIO = 0x1 << 10,
+	CLK_V_APBIF = 0x1 << 11,
+	CLK_V_DAM0 = 0x1 << 12,
+	CLK_V_DAM1 = 0x1 << 13,
+	CLK_V_DAM2 = 0x1 << 14,
+	CLK_V_HDA2CODEC_2X = 0x1 << 15,
+	CLK_V_ATOMICS = 0x1 << 16,
+	CLK_V_ACTMON = 0x1 << 23,
+	CLK_V_SATA = 0x1 << 28,
+	CLK_V_HDA = 0x1 << 29,
+
+	CLK_W_HDA2HDMICODEC = 0x1 << 0,
+	CLK_W_SATACOLD = 0x1 << 1,
+	CLK_W_CEC = 0x1 << 8,
+	CLK_W_XUSB_PADCTL = 0x1 << 14,
+	CLK_W_ENTROPY = 0x1 << 21,
+	CLK_W_AMX0 = 0x1 << 25,
+	CLK_W_ADX0 = 0x1 << 26,
+	CLK_W_DVFS = 0x1 << 27,
+	CLK_W_XUSB_SS = 0x1 << 28,
+	CLK_W_MC1 = 0x1 << 30,
+	CLK_W_EMC1 = 0x1 << 31
+};
+
+/* PLL stabilization delay in usec */
+#define CLOCK_PLL_STABLE_DELAY_US 300
+
+#define IO_STABILIZATION_DELAY (2)
+/* Calculate clock fractional divider value from ref and target frequencies */
+#define CLK_DIVIDER(REF, FREQ)	((((REF) * 2) / (FREQ)) - 2)
+
+/* Calculate clock frequency value from reference and clock divider value */
+#define CLK_FREQUENCY(REF, REG)	(((REF) * 2) / ((REG) + 2))
+
+/* soc-specific */
+#define TEGRA_PLLX_KHZ   (1900000)
+#define TEGRA_PLLP_KHZ   (408000)
+#define TEGRA_PLLC_KHZ   (600000)
+#define TEGRA_PLLD_KHZ   (925000)
+#define TEGRA_PLLU_KHZ   (960000)
+
+int clock_get_osc_khz(void);
+void clock_early_uart(void);
+void clock_cpu0_config_and_reset(void *entry);
+void clock_config(void);
+void clock_init(void);
+void clock_ll_set_source_divisor(u32 *reg, u32 source, u32 divisor);
+#endif /* __SOC_NVIDIA_TEGRA124_CLOCK_H__ */
diff --git a/src/soc/nvidia/tegra124/clock.h b/src/soc/nvidia/tegra124/include/soc/display.h
similarity index 69%
rename from src/soc/nvidia/tegra124/clock.h
rename to src/soc/nvidia/tegra124/include/soc/display.h
index 688505a..8c7e3e7 100644
--- a/src/soc/nvidia/tegra124/clock.h
+++ b/src/soc/nvidia/tegra124/include/soc/display.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
+ * Copyright 2013 Google Inc.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -14,9 +14,9 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
-#ifndef __SOC_NVIDIA_TEGRA124_CLOCK_H__
-#define __SOC_NVIDIA_TEGRA124_CLOCK_H__
+#ifndef __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_DISPLAY_H__
+#define __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_DISPLAY_H__
 
-void set_avp_clock_to_clkm(void);
+void setup_display(struct soc_nvidia_tegra124_config *config);
 
-#endif /* __SOC_NVIDIA_TEGRA124_CLOCK_H__ */
+#endif /* __SOC_NVIDIA_TEGRA124_INCLUDE_SOC_DISPLAY_H__ */
diff --git a/src/soc/nvidia/tegra124/pmc.h b/src/soc/nvidia/tegra124/pmc.h
new file mode 100644
index 0000000..1134abd
--- /dev/null
+++ b/src/soc/nvidia/tegra124/pmc.h
@@ -0,0 +1,200 @@
+/*
+ * Copyright (c) 2010 - 2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _TEGRA124_PMC_H_
+#define _TEGRA124_PMC_H_
+
+#include <stdint.h>
+
+enum {
+	POWER_PARTID_CRAIL = 0,
+	POWER_PARTID_TD = 1,
+	POWER_PARTID_VE = 2,
+	POWER_PARTID_VDE = 4,
+	POWER_PARTID_L2C = 5,
+	POWER_PARTID_MPE = 6,
+	POWER_PARTID_HEG = 7,
+	POWER_PARTID_CE1 = 9,
+	POWER_PARTID_CE2 = 10,
+	POWER_PARTID_CE3 = 11,
+	POWER_PARTID_CELP = 12,
+	POWER_PARTID_CE0 = 14,
+	POWER_PARTID_C0NC = 15,
+	POWER_PARTID_C1NC = 16,
+	POWER_PARTID_DIS = 18,
+	POWER_PARTID_DISB = 19,
+	POWER_PARTID_XUSBA = 20,
+	POWER_PARTID_XUSBB = 21,
+	POWER_PARTID_XUSBC = 22
+};
+
+struct tegra_pmc_regs {
+	u32 cntrl;
+	u32 sec_disable;
+	u32 pmc_swrst;
+	u32 wake_mask;
+	u32 wake_lvl;
+	u32 wake_status;
+	u32 sw_wake_status;
+	u32 dpd_pads_oride;
+	u32 dpd_sample;
+	u32 dpd_enable;
+	u32 pwrgate_timer_off;
+	u32 clamp_status;
+	u32 pwrgate_toggle;
+	u32 remove_clamping_cmd;
+	u32 pwrgate_status;
+	u32 pwrgood_timer;
+	u32 blink_timer;
+	u32 no_iopower;
+	u32 pwr_det;
+	u32 pwr_det_latch;
+	u32 scratch[24];
+	u32 secure_scratch[6];
+	u32 cpupwrgood_timer;
+	u32 cpupwroff_timer;
+	u32 pg_mask;
+	u32 pg_mask_1;
+	u32 auto_wake_lvl;
+	u32 auto_wake_lvl_mask;
+	u32 wake_delay;
+	u32 pwr_det_val;
+	u32 ddr_pwr;
+	u32 usb_debounce_del;
+	u32 usb_a0;
+	u32 crypto_op;
+	u32 pllp_wb0_override;
+	u32 scratch24[43 - 24];
+	u32 bondout_mirror[3];
+	u32 sys_33v_en;
+	u32 bondout_mirror_access;
+	u32 gate;
+	u32 wake2_mask;
+	u32 wake2_lvl;
+	u32 wake2_status;
+	u32 sw_wake2_status;
+	u32 auto_wake2_lvl_mask;
+	u32 pg_mask_2;
+	u32 pg_mask_ce1;
+	u32 pg_mask_ce2;
+	u32 pg_mask_ce3;
+	u32 pwrgate_timer_ce[7];
+	u32 pcx_edpd_cntrl;
+	u32 osc_edpd_over;
+	u32 clk_out_cntrl;
+	u32 sata_pwrgt;
+	u32 sensor_ctrl;
+	u32 rst_status;
+	u32 io_dpd_req;
+	u32 io_dpd_status;
+	u32 io_dpd2_req;
+	u32 io_dpd2_status;
+	u32 sel_dpd_tim;
+	u32 vddp_sel;
+	u32 ddr_cfg;
+	u32 e_no_vttgen;
+	u8 _rsv0[4];
+	u32 pllm_wb0_override_freq;
+	u32 test_pwrgate;
+	u32 pwrgate_timer_mult;
+	u32 dis_sel_dpd;
+	u32 utmip_uhsic_triggers;
+	u32 utmip_uhsic_saved_state;
+	u32 utmip_pad_cfg;
+	u32 utmip_term_pad_cfg;
+	u32 utmip_uhsic_sleep_cfg;
+	u32 utmip_uhsic_sleepwalk_cfg;
+	u32 utmip_sleepwalk_p[3];
+	u32 uhsic_sleepwalk_p0;
+	u32 utmip_uhsic_status;
+	u32 utmip_uhsic_fake;
+	u32 bondout_mirror3[5 - 3];
+	u32 secure_scratch6[8 - 6];
+	u32 scratch43[56 - 43];
+	u32 scratch_eco[3];
+	u32 utmip_uhsic_line_wakeup;
+	u32 utmip_bias_master_cntrl;
+	u32 utmip_master_config;
+	u32 td_pwrgate_inter_part_timer;
+	u32 utmip_uhsic2_triggers;
+	u32 utmip_uhsic2_saved_state;
+	u32 utmip_uhsic2_sleep_cfg;
+	u32 utmip_uhsic2_sleepwalk_cfg;
+	u32 uhsic2_sleepwalk_p1;
+	u32 utmip_uhsic2_status;
+	u32 utmip_uhsic2_fake;
+	u32 utmip_uhsic2_line_wakeup;
+	u32 utmip_master2_config;
+	u32 utmip_uhsic_rpd_cfg;
+	u32 pg_mask_ce0;
+	u32 pg_mask3[5 - 3];
+	u32 pllm_wb0_override2;
+	u32 tsc_mult;
+	u32 cpu_vsense_override;
+	u32 glb_amap_cfg;
+	u32 sticky_bits;
+	u32 sec_disable2;
+	u32 weak_bias;
+	u32 reg_short;
+	u32 pg_mask_andor;
+	u8 _rsv1[0x2c];
+	u32 secure_scratch8[24 - 8];
+	u32 scratch56[120 - 56];
+};
+
+enum {
+	PMC_PWRGATE_TOGGLE_PARTID_MASK = 0x1f,
+	PMC_PWRGATE_TOGGLE_PARTID_SHIFT = 0,
+	PMC_PWRGATE_TOGGLE_START = 0x1 << 8
+};
+
+enum {
+	PMC_CNTRL_KBC_CLK_DIS = 0x1 << 0,
+	PMC_CNTRL_RTC_CLK_DIS = 0x1 << 1,
+	PMC_CNTRL_RTC_RST = 0x1 << 2,
+	PMC_CNTRL_KBC_RST = 0x1 << 3,
+	PMC_CNTRL_MAIN_RST = 0x1 << 4,
+	PMC_CNTRL_LATCHWAKE_EN = 0x1 << 5,
+	PMC_CNTRL_GLITCHDET_DIS = 0x1 << 6,
+	PMC_CNTRL_BLINK_EN = 0x1 << 7,
+	PMC_CNTRL_PWRREQ_POLARITY = 0x1 << 8,
+	PMC_CNTRL_PWRREQ_OE = 0x1 << 9,
+	PMC_CNTRL_SYSCLK_POLARITY = 0x1 << 10,
+	PMC_CNTRL_SYSCLK_OE = 0x1 << 11,
+	PMC_CNTRL_PWRGATE_DIS = 0x1 << 12,
+	PMC_CNTRL_AOINIT = 0x1 << 13,
+	PMC_CNTRL_SIDE_EFFECT_LP0 = 0x1 << 14,
+	PMC_CNTRL_CPUPWRREQ_POLARITY = 0x1 << 15,
+	PMC_CNTRL_CPUPWRREQ_OE = 0x1 << 16,
+	PMC_CNTRL_INTR_POLARITY = 0x1 << 17,
+	PMC_CNTRL_FUSE_OVERRIDE = 0x1 << 18,
+	PMC_CNTRL_CPUPWRGOOD_EN = 0x1 << 19,
+	PMC_CNTRL_CPUPWRGOOD_SEL_SHIFT = 20,
+	PMC_CNTRL_CPUPWRGOOD_SEL_MASK =
+		0x3 << PMC_CNTRL_CPUPWRGOOD_SEL_SHIFT
+};
+
+enum {
+	PMC_CNTRL2_HOLD_CKE_LOW_EN = 0x1 << 12
+};
+
+enum {
+	PMC_OSC_EDPD_OVER_XOFS_SHIFT = 1,
+	PMC_OSC_EDPD_OVER_XOFS_MASK =
+		0x3f << PMC_OSC_EDPD_OVER_XOFS_SHIFT
+};
+
+#endif	/* _TEGRA124_PMC_H_ */
diff --git a/src/soc/nvidia/tegra124/power.c b/src/soc/nvidia/tegra124/power.c
new file mode 100644
index 0000000..a3cf5ef
--- /dev/null
+++ b/src/soc/nvidia/tegra124/power.c
@@ -0,0 +1,87 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ * Copyright (c) 2013, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <arch/io.h>
+#include <console/console.h>
+#include <soc/addressmap.h>
+
+#include "pmc.h"
+#include "power.h"
+
+static struct tegra_pmc_regs * const pmc = (void *)TEGRA_PMC_BASE;
+
+static int partition_powered(int id)
+{
+	return read32(&pmc->pwrgate_status) & (0x1 << id);
+}
+
+static void power_ungate_partition(uint32_t id)
+{
+	printk(BIOS_INFO, "Ungating power partition %d.\n", id);
+
+	if (!partition_powered(id)) {
+		uint32_t pwrgate_toggle = read32(&pmc->pwrgate_toggle);
+		pwrgate_toggle &= ~(PMC_PWRGATE_TOGGLE_PARTID_MASK);
+		pwrgate_toggle |= (id << PMC_PWRGATE_TOGGLE_PARTID_SHIFT);
+		pwrgate_toggle |= PMC_PWRGATE_TOGGLE_START;
+		write32(pwrgate_toggle, &pmc->pwrgate_toggle);
+
+		// Wait for the request to be accepted.
+		while (read32(&pmc->pwrgate_toggle) & PMC_PWRGATE_TOGGLE_START)
+			;
+		printk(BIOS_DEBUG, "Power gate toggle request accepted.\n");
+
+		// Wait for the partition to be powered.
+		while (!partition_powered(id))
+			;
+	}
+
+	printk(BIOS_INFO, "Ungated power partition %d.\n", id);
+}
+
+void power_enable_cpu_rail(void)
+{
+	// Set the power gate timer multiplier to 8 (why 8?).
+	uint32_t pwrgate_timer_mult = read32(&pmc->pwrgate_timer_mult) | 0x3;
+	write32(pwrgate_timer_mult, &pmc->pwrgate_timer_mult);
+
+	/*
+	 * From U-Boot:
+	 * Set CPUPWRGOOD_TIMER - APB clock is 1/2 of SCLK (102MHz),
+	 * set it for 5ms as per SysEng (102MHz * 5ms = 510000).
+	 */
+	write32(510000, &pmc->cpupwrgood_timer);
+
+	power_ungate_partition(POWER_PARTID_CRAIL);
+
+	uint32_t cntrl = read32(&pmc->cntrl);
+	cntrl &= ~PMC_CNTRL_CPUPWRREQ_POLARITY;
+	cntrl |= PMC_CNTRL_CPUPWRREQ_OE;
+	write32(cntrl, &pmc->cntrl);
+}
+
+void power_ungate_cpu(void)
+{
+	// Ungate power to the non-core parts of the fast cluster.
+	power_ungate_partition(POWER_PARTID_C0NC);
+
+	// Ungate power to CPU0 in the fast cluster.
+	power_ungate_partition(POWER_PARTID_CE0);
+}
diff --git a/src/soc/nvidia/tegra124/power.h b/src/soc/nvidia/tegra124/power.h
new file mode 100644
index 0000000..6454699
--- /dev/null
+++ b/src/soc/nvidia/tegra124/power.h
@@ -0,0 +1,29 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef __SOC_NVIDIA_TEGRA124_POWER_H__
+#define __SOC_NVIDIA_TEGRA124_POWER_H__
+
+// This function does not enable the external power to the rail; it enables
+// the rail itself, internal to the SOC.
+void power_enable_cpu_rail(void);
+
+void power_ungate_cpu(void);
+
+#endif	/* __SOC_NVIDIA_TEGRA124_POWER_H__ */
diff --git a/src/soc/nvidia/tegra124/soc.c b/src/soc/nvidia/tegra124/soc.c
new file mode 100644
index 0000000..e996c11
--- /dev/null
+++ b/src/soc/nvidia/tegra124/soc.c
@@ -0,0 +1,66 @@
+/*
+ * This file is part of the coreboot project.
+ *
+ * Copyright (C) 2007-2009 coresystems GmbH
+ * Copyright (C) 2011 The ChromiumOS Authors.  All rights reserved.
+ * Copyright 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <console/console.h>
+#include <device/device.h>
+#include <soc/nvidia/tegra/dc.h>
+#include <soc/addressmap.h>
+
+/* this sucks, but for now, fb size/location are hardcoded.
+ * Will break if we get 2. Sigh.
+ * We assume it's all multiples of MiB for the MMU's sake.
+ */
+static void soc_enable(device_t dev)
+{
+	unsigned long fb_size = FB_SIZE_MB;
+	u32 lcdbase = FB_BASE_MB;
+	ram_resource(dev, 0, CONFIG_SYS_SDRAM_BASE/KiB,
+		(CONFIG_DRAM_SIZE_MB - fb_size)*KiB);
+	mmio_resource(dev, 1, lcdbase*KiB, fb_size*KiB);
+}
+
+static void soc_init(device_t dev)
+{
+	display_startup(dev);
+	printk(BIOS_INFO, "CPU: Tegra124\n");
+}
+
+static void soc_noop(device_t dev)
+{
+}
+
+static struct device_operations soc_ops = {
+	.read_resources   = soc_noop,
+	.set_resources    = soc_noop,
+	.enable_resources = soc_enable,
+	.init             = soc_init,
+	.scan_bus         = 0,
+};
+
+static void enable_tegra124_dev(device_t dev)
+{
+	dev->ops = &soc_ops;
+}
+
+struct chip_operations soc_nvidia_tegra124_ops = {
+	CHIP_NAME("SOC Nvidia Tegra124")
+	.enable_dev = enable_tegra124_dev,
+};
diff --git a/src/soc/nvidia/tegra124/spi.c b/src/soc/nvidia/tegra124/spi.c
new file mode 100644
index 0000000..f7e0750
--- /dev/null
+++ b/src/soc/nvidia/tegra124/spi.c
@@ -0,0 +1,815 @@
+/*
+ * NVIDIA Tegra SPI controller (T114 and later)
+ *
+ * Copyright (c) 2010-2013 NVIDIA Corporation
+ * Copyright (C) 2013 Google Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <assert.h>
+#include <cbfs.h>
+#include <cbfs_core.h>
+#include <inttypes.h>
+#include <spi-generic.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <arch/io.h>
+#include <console/console.h>
+#include <soc/addressmap.h>
+#include <delay.h>
+
+#include "dma.h"
+#include "spi.h"
+
+#if defined(CONFIG_DEBUG_SPI) && CONFIG_DEBUG_SPI
+# define DEBUG_SPI(x,...)	printk(BIOS_DEBUG, "TEGRA_SPI: " x, ##__VA_ARGS__)
+#else
+# define DEBUG_SPI(x,...)
+#endif
+
+/*
+ * 64 packets in FIFO mode, BLOCK_SIZE packets in DMA mode. Packets can vary
+ * in size from 4 to 32 bits. To keep things simple we'll use 8-bit packets.
+ */
+#define SPI_PACKET_SIZE_BYTES		1
+#define SPI_MAX_TRANSFER_BYTES_FIFO	(64 * SPI_PACKET_SIZE_BYTES)
+#define SPI_MAX_TRANSFER_BYTES_DMA	((65536 * SPI_PACKET_SIZE_BYTES) - \
+							TEGRA_DMA_ALIGN_BYTES)
+
+/* COMMAND1 */
+#define SPI_CMD1_GO			(1 << 31)
+#define SPI_CMD1_M_S			(1 << 30)
+#define SPI_CMD1_MODE_MASK		0x3
+#define SPI_CMD1_MODE_SHIFT		28
+#define SPI_CMD1_CS_SEL_MASK		0x3
+#define SPI_CMD1_CS_SEL_SHIFT		26
+#define SPI_CMD1_CS_POL_INACTIVE3	(1 << 25)
+#define SPI_CMD1_CS_POL_INACTIVE2	(1 << 24)
+#define SPI_CMD1_CS_POL_INACTIVE1	(1 << 23)
+#define SPI_CMD1_CS_POL_INACTIVE0	(1 << 22)
+#define SPI_CMD1_CS_SW_HW		(1 << 21)
+#define SPI_CMD1_CS_SW_VAL		(1 << 20)
+#define SPI_CMD1_IDLE_SDA_MASK		0x3
+#define SPI_CMD1_IDLE_SDA_SHIFT		18
+#define SPI_CMD1_BIDIR			(1 << 17)
+#define SPI_CMD1_LSBI_FE		(1 << 16)
+#define SPI_CMD1_LSBY_FE		(1 << 15)
+#define SPI_CMD1_BOTH_EN_BIT		(1 << 14)
+#define SPI_CMD1_BOTH_EN_BYTE		(1 << 13)
+#define SPI_CMD1_RX_EN			(1 << 12)
+#define SPI_CMD1_TX_EN			(1 << 11)
+#define SPI_CMD1_PACKED			(1 << 5)
+#define SPI_CMD1_BIT_LEN_MASK		0x1f
+#define SPI_CMD1_BIT_LEN_SHIFT		0
+
+/* COMMAND2 */
+#define SPI_CMD2_TX_CLK_TAP_DELAY	(1 << 6)
+#define SPI_CMD2_TX_CLK_TAP_DELAY_MASK	(0x3F << 6)
+#define SPI_CMD2_RX_CLK_TAP_DELAY	(1 << 0)
+#define SPI_CMD2_RX_CLK_TAP_DELAY_MASK	(0x3F << 0)
+
+/* SPI_TRANS_STATUS */
+#define SPI_STATUS_RDY			(1 << 30)
+#define SPI_STATUS_SLV_IDLE_COUNT_MASK	0xff
+#define SPI_STATUS_SLV_IDLE_COUNT_SHIFT	16
+#define SPI_STATUS_BLOCK_COUNT		0xffff
+#define SPI_STATUS_BLOCK_COUNT_SHIFT	0
+
+/* SPI_FIFO_STATUS */
+#define SPI_FIFO_STATUS_CS_INACTIVE	(1 << 31)
+#define SPI_FIFO_STATUS_FRAME_END	(1 << 30)
+#define SPI_FIFO_STATUS_RX_FIFO_FLUSH	(1 << 15)
+#define SPI_FIFO_STATUS_TX_FIFO_FLUSH	(1 << 14)
+#define SPI_FIFO_STATUS_ERR		(1 << 8)
+#define SPI_FIFO_STATUS_TX_FIFO_OVF	(1 << 7)
+#define SPI_FIFO_STATUS_TX_FIFO_UNR	(1 << 6)
+#define SPI_FIFO_STATUS_RX_FIFO_OVF	(1 << 5)
+#define SPI_FIFO_STATUS_RX_FIFO_UNR	(1 << 4)
+#define SPI_FIFO_STATUS_TX_FIFO_FULL	(1 << 3)
+#define SPI_FIFO_STATUS_TX_FIFO_EMPTY	(1 << 2)
+#define SPI_FIFO_STATUS_RX_FIFO_FULL	(1 << 1)
+#define SPI_FIFO_STATUS_RX_FIFO_EMPTY	(1 << 0)
+
+/* SPI_DMA_CTL */
+#define SPI_DMA_CTL_DMA			(1 << 31)
+#define SPI_DMA_CTL_CONT		(1 << 30)
+#define SPI_DMA_CTL_IE_RX		(1 << 29)
+#define SPI_DMA_CTL_IE_TX		(1 << 28)
+#define SPI_DMA_CTL_RX_TRIG_MASK	0x3
+#define SPI_DMA_CTL_RX_TRIG_SHIFT	19
+#define SPI_DMA_CTL_TX_TRIG_MASK	0x3
+#define SPI_DMA_CTL_TX_TRIG_SHIFT	15
+
+/* SPI_DMA_BLK */
+#define SPI_DMA_CTL_BLOCK_SIZE_MASK	0xffff
+#define SPI_DMA_CTL_BLOCK_SIZE_SHIFT	0
+
+struct tegra_spi_regs {
+	u32 command1;		/* 0x000: SPI_COMMAND1 */
+	u32 command2;		/* 0x004: SPI_COMMAND2 */
+	u32 timing1;		/* 0x008: SPI_CS_TIM1 */
+	u32 timing2;		/* 0x00c: SPI_CS_TIM2 */
+	u32 trans_status;	/* 0x010: SPI_TRANS_STATUS */
+	u32 fifo_status;	/* 0x014: SPI_FIFO_STATUS */
+	u32 tx_data;		/* 0x018: SPI_TX_DATA */
+	u32 rx_data;		/* 0x01c: SPI_RX_DATA */
+	u32 dma_ctl;		/* 0x020: SPI_DMA_CTL */
+	u32 dma_blk;		/* 0x024: SPI_DMA_BLK */
+	u32 rsvd[56];		/* 0x028-0x107: reserved */
+	u32 tx_fifo;		/* 0x108: SPI_FIFO1 */
+	u32 rsvd2[31];		/* 0x10c-0x187 reserved */
+	u32 rx_fifo;		/* 0x188: SPI_FIFO2 */
+	u32 spare_ctl;		/* 0x18c: SPI_SPARE_CTRL */
+} __attribute__((packed));
+
+struct tegra_spi_channel {
+	struct spi_slave slave;
+	struct tegra_spi_regs *regs;
+};
+
+static struct tegra_spi_channel tegra_spi_channels[] = {
+	/*
+	 * Note: Tegra pinmux must be setup for corresponding SPI channel in
+	 * order for its registers to be accessible. If pinmux has not been
+	 * set up, access to the channel's registers will simply hang.
+	 *
+	 * TODO(dhendrix): Clarify or remove this comment (is clock setup
+	 * necessary first, or just pinmux, or both?)
+	 */
+	{
+		.slave = { .bus = 1, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI1_BASE,
+	},
+	{
+		.slave = { .bus = 2, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI2_BASE,
+	},
+	{
+		.slave = { .bus = 3, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI3_BASE,
+	},
+	{
+		.slave = { .bus = 4, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI4_BASE,
+	},
+	{
+		.slave = { .bus = 5, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI5_BASE,
+	},
+	{
+		.slave = { .bus = 6, },
+		.regs = (struct tegra_spi_regs *)TEGRA_SPI6_BASE,
+	},
+};
+
+static void flush_fifos(struct tegra_spi_regs *regs)
+{
+	setbits_le32(&regs->fifo_status, SPI_FIFO_STATUS_RX_FIFO_FLUSH |
+					SPI_FIFO_STATUS_TX_FIFO_FLUSH);
+	while (read32(&regs->fifo_status) &
+		(SPI_FIFO_STATUS_RX_FIFO_FLUSH | SPI_FIFO_STATUS_TX_FIFO_FLUSH))
+		;
+}
+
+void tegra_spi_init(unsigned int bus)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tegra_spi_channels); i++) {
+		struct tegra_spi_regs *regs;
+
+		if (tegra_spi_channels[i].slave.bus == bus)
+			regs = tegra_spi_channels[i].regs;
+		else
+			continue;
+
+		/* software drives chip-select, set value to high */
+		setbits_le32(&regs->command1,
+				SPI_CMD1_CS_SW_HW | SPI_CMD1_CS_SW_VAL);
+
+		/* 8-bit transfers, unpacked mode, most significant bit first */
+		clrbits_le32(&regs->command1,
+				SPI_CMD1_BIT_LEN_MASK | SPI_CMD1_PACKED);
+		setbits_le32(&regs->command1, 7 << SPI_CMD1_BIT_LEN_SHIFT);
+
+		flush_fifos(regs);
+	}
+	printk(BIOS_INFO, "Tegra SPI bus %u initialized.\n", bus);
+
+}
+
+static struct tegra_spi_channel *to_tegra_spi(int bus) {
+	return &tegra_spi_channels[bus - 1];
+}
+
+static unsigned int tegra_spi_speed(unsigned int bus)
+{
+	/* FIXME: implement this properly, for now use max value (50MHz) */
+	return 50000000;
+}
+
+/*
+ * This calls udelay() with a calculated value based on the SPI speed and
+ * number of bytes remaining to be transferred. It assumes that if the
+ * calculated delay period is less than MIN_DELAY_US then it is probably
+ * not worth the overhead of yielding.
+ */
+#define MIN_DELAY_US 250
+static void tegra_spi_delay(struct tegra_spi_channel *spi,
+				unsigned int bytes_remaining)
+{
+	unsigned int ns_per_byte, delay_us;
+
+	ns_per_byte = 1000000000 / (tegra_spi_speed(spi->slave.bus) / 8);
+	delay_us = (ns_per_byte * bytes_remaining) / 1000;
+
+	if (delay_us < MIN_DELAY_US)
+		return;
+
+	udelay(delay_us);
+}
+
+void spi_cs_activate(struct spi_slave *slave)
+{
+	struct tegra_spi_regs *regs = to_tegra_spi(slave->bus)->regs;
+	u32 val;
+
+	val = read32(&regs->command1);
+
+	/* select appropriate chip-select line */
+	val &= ~(SPI_CMD1_CS_SEL_MASK << SPI_CMD1_CS_SEL_SHIFT);
+	val |= (slave->cs << SPI_CMD1_CS_SEL_SHIFT);
+
+	/* drive chip-select with the inverse of the "inactive" value */
+	if (val & (SPI_CMD1_CS_POL_INACTIVE0 << slave->cs))
+		val &= ~SPI_CMD1_CS_SW_VAL;
+	else
+		val |= SPI_CMD1_CS_SW_VAL;
+
+	write32(val, &regs->command1);
+}
+
+void spi_cs_deactivate(struct spi_slave *slave)
+{
+	struct tegra_spi_regs *regs = to_tegra_spi(slave->bus)->regs;
+	u32 val;
+
+	val = read32(&regs->command1);
+
+	if (val & (SPI_CMD1_CS_POL_INACTIVE0 << slave->cs))
+		val |= SPI_CMD1_CS_SW_VAL;
+	else
+		val &= ~SPI_CMD1_CS_SW_VAL;
+
+	write32(val, &regs->command1);
+}
+
+static void dump_fifo_status(struct tegra_spi_channel *spi)
+{
+	u32 status = read32(&spi->regs->fifo_status);
+
+	printk(BIOS_INFO, "Raw FIFO status: 0x%08x\n", status);
+	if (status & SPI_FIFO_STATUS_TX_FIFO_OVF)
+		printk(BIOS_INFO, "\tTx overflow detected\n");
+	if (status & SPI_FIFO_STATUS_TX_FIFO_UNR)
+		printk(BIOS_INFO, "\tTx underrun detected\n");
+	if (status & SPI_FIFO_STATUS_RX_FIFO_OVF)
+		printk(BIOS_INFO, "\tRx overflow detected\n");
+	if (status & SPI_FIFO_STATUS_RX_FIFO_UNR)
+		printk(BIOS_INFO, "\tRx underrun detected\n");
+
+	printk(BIOS_INFO, "TX_FIFO: 0x%08x, TX_DATA: 0x%08x\n",
+		read32(&spi->regs->tx_fifo), read32(&spi->regs->tx_data));
+	printk(BIOS_INFO, "RX_FIFO: 0x%08x, RX_DATA: 0x%08x\n",
+		read32(&spi->regs->rx_fifo), read32(&spi->regs->rx_data));
+}
+
+static void clear_fifo_status(struct tegra_spi_channel *spi)
+{
+	clrbits_le32(&spi->regs->fifo_status,
+				SPI_FIFO_STATUS_ERR |
+				SPI_FIFO_STATUS_TX_FIFO_OVF |
+				SPI_FIFO_STATUS_TX_FIFO_UNR |
+				SPI_FIFO_STATUS_RX_FIFO_OVF |
+				SPI_FIFO_STATUS_RX_FIFO_UNR);
+}
+
+static void dump_spi_regs(struct tegra_spi_channel *spi)
+{
+	printk(BIOS_INFO, "SPI regs:\n"
+			"\tdma_blk: 0x%08x\n"
+			"\tcommand1: 0x%08x\n"
+			"\tdma_ctl: 0x%08x\n"
+			"\ttrans_status: 0x%08x\n",
+			read32(&spi->regs->dma_blk),
+			read32(&spi->regs->command1),
+			read32(&spi->regs->dma_ctl),
+			read32(&spi->regs->trans_status));
+}
+
+static void dump_dma_regs(struct apb_dma_channel *dma)
+{
+	printk(BIOS_INFO, "DMA regs:\n"
+			"\tahb_ptr: 0x%08x\n"
+			"\tapb_ptr: 0x%08x\n"
+			"\tahb_seq: 0x%08x\n"
+			"\tapb_seq: 0x%08x\n"
+			"\tcsr: 0x%08x\n"
+			"\tcsre: 0x%08x\n"
+			"\twcount: 0x%08x\n"
+			"\tdma_byte_sta: 0x%08x\n"
+			"\tword_transfer: 0x%08x\n",
+			read32(&dma->regs->ahb_ptr),
+			read32(&dma->regs->apb_ptr),
+			read32(&dma->regs->ahb_seq),
+			read32(&dma->regs->apb_seq),
+			read32(&dma->regs->csr),
+			read32(&dma->regs->csre),
+			read32(&dma->regs->wcount),
+			read32(&dma->regs->dma_byte_sta),
+			read32(&dma->regs->word_transfer));
+}
+
+static void dump_regs(struct tegra_spi_channel *spi,
+			struct apb_dma_channel *dma)
+{
+	if (dma)
+		dump_dma_regs(dma);
+	if (spi) {
+		dump_spi_regs(spi);
+		dump_fifo_status(spi);
+	}
+}
+
+static int fifo_error(struct tegra_spi_channel *spi)
+{
+	return read32(&spi->regs->fifo_status) & SPI_FIFO_STATUS_ERR ? 1 : 0;
+}
+
+static inline unsigned int spi_byte_count(struct tegra_spi_channel *spi)
+{
+	/* FIXME: Make this take total packet size into account */
+	return read32(&spi->regs->trans_status) &
+		(SPI_STATUS_BLOCK_COUNT << SPI_STATUS_BLOCK_COUNT_SHIFT);
+}
+
+static int tegra_spi_fifo_receive(struct tegra_spi_channel *spi,
+			u8 *din, unsigned int in_bytes)
+{
+	unsigned int received = 0, remaining = in_bytes;
+
+	printk(BIOS_SPEW, "%s: Receiving %d bytes\n", __func__, in_bytes);
+	setbits_le32(&spi->regs->command1, SPI_CMD1_RX_EN);
+
+	while (remaining) {
+		unsigned int from_fifo, count;
+
+		from_fifo = MIN(remaining, SPI_MAX_TRANSFER_BYTES_FIFO);
+		remaining -= from_fifo;
+
+		/* BLOCK_SIZE in SPI_DMA_BLK register applies to both DMA and
+		 * PIO transfers */
+		write32(from_fifo - 1, &spi->regs->dma_blk);
+
+		setbits_le32(&spi->regs->trans_status, SPI_STATUS_RDY);
+		setbits_le32(&spi->regs->command1, SPI_CMD1_GO);
+
+		while ((count = spi_byte_count(spi)) != from_fifo) {
+			tegra_spi_delay(spi, from_fifo - count);
+			if (fifo_error(spi))
+				goto done;
+		}
+
+		received += from_fifo;
+		while (from_fifo) {
+			*din = read8(&spi->regs->rx_fifo);
+			din++;
+			from_fifo--;
+		}
+	}
+
+done:
+	clrbits_le32(&spi->regs->command1, SPI_CMD1_RX_EN);
+	if ((received != in_bytes) || fifo_error(spi)) {
+		printk(BIOS_ERR, "%s: ERROR: Received %u bytes, expected %u\n",
+				__func__, received, in_bytes);
+		dump_regs(spi, NULL);
+		return -1;
+	}
+	return in_bytes;
+}
+
+static int tegra_spi_fifo_send(struct tegra_spi_channel *spi,
+			const u8 *dout, unsigned int out_bytes)
+{
+	unsigned int sent = 0, remaining = out_bytes;
+
+	printk(BIOS_SPEW, "%s: Sending %d bytes\n", __func__, out_bytes);
+	setbits_le32(&spi->regs->command1, SPI_CMD1_TX_EN);
+
+	while (remaining) {
+		unsigned int to_fifo, tmp;
+
+		to_fifo = MIN(remaining, SPI_MAX_TRANSFER_BYTES_FIFO);
+
+		/* BLOCK_SIZE in SPI_DMA_BLK register applies to both DMA and
+		 * PIO transfers */
+		write32(to_fifo - 1, &spi->regs->dma_blk);
+
+		tmp = to_fifo;
+		while (tmp) {
+			write32(*dout, &spi->regs->tx_fifo);
+			dout++;
+			tmp--;
+		}
+
+		setbits_le32(&spi->regs->trans_status, SPI_STATUS_RDY);
+		setbits_le32(&spi->regs->command1, SPI_CMD1_GO);
+
+		while (!(read32(&spi->regs->fifo_status) &
+				SPI_FIFO_STATUS_TX_FIFO_EMPTY)) {
+			tegra_spi_delay(spi, to_fifo - spi_byte_count(spi));
+			if (fifo_error(spi))
+				goto done;
+		}
+
+		remaining -= to_fifo;
+		sent += to_fifo;
+	}
+
+done:
+	clrbits_le32(&spi->regs->command1, SPI_CMD1_TX_EN);
+	if ((sent != out_bytes) || fifo_error(spi)) {
+		printk(BIOS_ERR, "%s: ERROR: Sent %u bytes, expected "
+				"to send %u\n", __func__, sent, out_bytes);
+		dump_regs(spi, NULL);
+		return -1;
+	}
+	return out_bytes;
+}
+
+static void tegra2_spi_dma_setup(struct apb_dma_channel *dma)
+{
+	/* APB bus width = 8-bits, address wrap for each word */
+	clrbits_le32(&dma->regs->apb_seq, 0x7 << 28);
+	/* AHB 1 word burst, bus width = 32 bits (fixed in hardware),
+	 * no address wrapping */
+	clrsetbits_le32(&dma->regs->ahb_seq,
+			(0x7 << 24) | (0x7 << 16), 0x4 << 24);
+	/* Set ONCE mode to transfer one "block" at a time (64KB). */
+	setbits_le32(&dma->regs->csr, 1 << 27);
+}
+
+/*
+ * Notes for DMA transmit and receive, experimentally determined (need to
+ * verify):
+ * - WCOUNT seems to be an "n-1" count, but the documentation does not
+ *   make this clear. Without the -1 dma_byte_sta will show 1 AHB word
+ *   (4 bytes) higher than it should and Tx overrun / Rx underrun will
+ *   likely occur.
+ *
+ * - dma_byte_sta is always a multiple of 4, so we check for
+ *   dma_byte_sta < length
+ *
+ * - The RDY bit in SPI_TRANS_STATUS needs to be cleared manually
+ *   (set bit to clear) between each transaction. Otherwise the next
+ *   transaction does not start.
+ */
+
+static int tegra_spi_dma_receive(struct tegra_spi_channel *spi,
+		void *din, unsigned int in_bytes)
+{
+	struct apb_dma_channel *dma;
+
+	dma = dma_claim();
+	if (!dma) {
+		printk(BIOS_ERR, "%s: Unable to claim DMA channel\n", __func__);
+		return -1;
+	}
+
+	printk(BIOS_SPEW, "%s: Receiving %d bytes\n", __func__, in_bytes);
+	tegra2_spi_dma_setup(dma);
+
+	/* set AHB & APB address pointers */
+	write32((u32)din, &dma->regs->ahb_ptr);
+	write32((u32)&spi->regs->rx_fifo, &dma->regs->apb_ptr);
+
+	setbits_le32(&spi->regs->command1, SPI_CMD1_RX_EN);
+
+	/* FIXME: calculate word count so that it corresponds to bus width */
+	write32(in_bytes - 1, &dma->regs->wcount);
+
+	/* specify BLOCK_SIZE in SPI_DMA_BLK */
+	write32(in_bytes - 1, &spi->regs->dma_blk);
+
+	/* Set DMA direction for APB (SPI) --> AHB (DRAM) */
+	clrbits_le32(&dma->regs->csr, 1 << 28);
+
+	/* write to SPI_TRANS_STATUS RDY bit to clear it */
+	setbits_le32(&spi->regs->trans_status, SPI_STATUS_RDY);
+
+	/* set DMA bit in SPI_DMA_CTL to start */
+	setbits_le32(&spi->regs->dma_ctl, SPI_DMA_CTL_DMA);
+
+	/* start APBDMA after SPI DMA so we don't read empty bytes
+	 * from Rx FIFO */
+	dma_start(dma);
+
+	while (spi_byte_count(spi) != in_bytes)
+		tegra_spi_delay(spi, in_bytes - spi_byte_count(spi));
+	clrbits_le32(&spi->regs->command1, SPI_CMD1_RX_EN);
+
+	while ((read32(&dma->regs->dma_byte_sta) < in_bytes) || dma_busy(dma))
+		;	/* this shouldn't take long, no udelay */
+	dma_stop(dma);
+	dma_release(dma);
+
+	if ((spi_byte_count(spi) != in_bytes) || fifo_error(spi)) {
+		printk(BIOS_ERR, "%s: ERROR: Received %u bytes, expected %u\n",
+				__func__, spi_byte_count(spi), in_bytes);
+		dump_regs(spi, dma);
+		return -1;
+	}
+
+	return in_bytes;
+}
+
+static int tegra_spi_dma_send(struct tegra_spi_channel *spi,
+		const u8 *dout, unsigned int out_bytes)
+{
+	struct apb_dma_channel *dma;
+	unsigned int count;
+
+	dma = dma_claim();
+	if (!dma) {
+		printk(BIOS_ERR, "%s: Unable to claim DMA channel\n", __func__);
+		return -1;
+	}
+
+	printk(BIOS_SPEW, "%s: Sending %u bytes\n", __func__, out_bytes);
+	tegra2_spi_dma_setup(dma);
+
+	/* set AHB & APB address pointers */
+	write32((u32)dout, &dma->regs->ahb_ptr);
+	write32((u32)&spi->regs->tx_fifo, &dma->regs->apb_ptr);
+
+	setbits_le32(&spi->regs->command1, SPI_CMD1_TX_EN);
+
+	/* FIXME: calculate word count so that it corresponds to bus width */
+	write32(out_bytes - 1, &dma->regs->wcount);
+
+	/* specify BLOCK_SIZE in SPI_DMA_BLK */
+	write32(out_bytes - 1, &spi->regs->dma_blk);
+
+	/* Set DMA direction for AHB (DRAM) --> APB (SPI) */
+	setbits_le32(&dma->regs->csr, (1 << 28));
+
+	/* write to SPI_TRANS_STATUS RDY bit to clear it */
+	setbits_le32(&spi->regs->trans_status, SPI_STATUS_RDY);
+
+	dma_start(dma);
+	/* set DMA bit in SPI_DMA_CTL to start */
+	setbits_le32(&spi->regs->dma_ctl, SPI_DMA_CTL_DMA);
+
+	while ((read32(&dma->regs->dma_byte_sta) < out_bytes) || dma_busy(dma))
+		tegra_spi_delay(spi, out_bytes - spi_byte_count(spi));
+	dma_stop(dma);
+
+	while ((count = spi_byte_count(spi)) != out_bytes)
+		tegra_spi_delay(spi, out_bytes - count);
+	clrbits_le32(&spi->regs->command1, SPI_CMD1_TX_EN);
+
+	dma_release(dma);
+
+	if ((spi_byte_count(spi) != out_bytes) || fifo_error(spi)) {
+		printk(BIOS_ERR, "%s: ERROR: Sent %u bytes, expected %u\n",
+				__func__, spi_byte_count(spi), out_bytes);
+		dump_regs(spi, dma);
+		return -1;
+	}
+
+	return out_bytes;
+}
+
+int spi_xfer(struct spi_slave *slave, const void *dout, unsigned int bitsout,
+	     void *din, unsigned int bitsin)
+{
+	unsigned int out_bytes = bitsout / 8, in_bytes = bitsin / 8;
+	struct tegra_spi_channel *spi = to_tegra_spi(slave->bus);
+	int ret = 0;
+	u8 *out_buf = (u8 *)dout;
+	u8 *in_buf = (u8 *)din;
+
+	ASSERT(bitsout % 8 == 0 && bitsin % 8 == 0);
+
+	/* tegra bus numbers start at 1 */
+	ASSERT(slave->bus >= 1 && slave->bus <= ARRAY_SIZE(tegra_spi_channels));
+
+	flush_fifos(spi->regs);
+
+	/*
+	 * DMA operates on 4 bytes at a time, so to avoid accessing memory
+	 * outside the specified buffers we'll only use DMA for 4-byte aligned
+	 * accesses and transfer the remaining bytes manually using
+	 * the Rx/Tx FIFOs.
+	 */
+
+	while (out_bytes > 0) {
+		unsigned int dma_out, fifo_out;
+
+		dma_out = MIN(out_bytes, SPI_MAX_TRANSFER_BYTES_DMA);
+		fifo_out = dma_out % TEGRA_DMA_ALIGN_BYTES;
+		dma_out -= fifo_out;
+
+		if (dma_out) {
+			ret = tegra_spi_dma_send(spi, out_buf, dma_out);
+			if (ret != dma_out) {
+				ret = -1;
+				goto spi_xfer_exit;
+			}
+			out_buf += dma_out;
+			out_bytes -= dma_out;
+		}
+		if (fifo_out) {
+			ret = tegra_spi_fifo_send(spi, out_buf, fifo_out);
+			if (ret != fifo_out) {
+				ret = -1;
+				goto spi_xfer_exit;
+			}
+			out_buf += fifo_out;
+			out_bytes -= fifo_out;
+		}
+	}
+
+	while (in_bytes > 0) {
+		unsigned int dma_in, fifo_in;
+
+		dma_in = MIN(in_bytes, SPI_MAX_TRANSFER_BYTES_DMA);
+		fifo_in = dma_in % TEGRA_DMA_ALIGN_BYTES;
+		dma_in -= fifo_in;
+
+		if (dma_in) {
+			ret = tegra_spi_dma_receive(spi, in_buf, dma_in);
+			if (ret != dma_in) {
+				ret = -1;
+				goto spi_xfer_exit;
+			}
+			in_buf += dma_in;
+			in_bytes -= dma_in;
+		}
+		if (fifo_in) {
+			ret = tegra_spi_fifo_receive(spi, in_buf, fifo_in);
+			if (ret != fifo_in) {
+				ret = -1;
+				goto spi_xfer_exit;
+			}
+			in_buf += fifo_in;
+			in_bytes -= fifo_in;
+		}
+	}
+
+	ret = 0;
+
+spi_xfer_exit:
+	if (ret < 0)
+		clear_fifo_status(spi);
+	return ret;
+}
+
+/* SPI as CBFS media. */
+struct tegra_spi_media {
+	struct spi_slave *slave;
+	struct cbfs_simple_buffer buffer;
+};
+
+static int tegra_spi_cbfs_open(struct cbfs_media *media)
+{
+	DEBUG_SPI("tegra_spi_cbfs_open\n");
+	return 0;
+}
+
+static int tegra_spi_cbfs_close(struct cbfs_media *media)
+{
+	DEBUG_SPI("tegra_spi_cbfs_close\n");
+	return 0;
+}
+
+#define JEDEC_READ		0x03
+#define JEDEC_READ_OUTSIZE	0x04
+/*      JEDEC_READ_INSIZE : any length */
+
+static size_t tegra_spi_cbfs_read(struct cbfs_media *media, void *dest,
+				   size_t offset, size_t count)
+{
+	struct tegra_spi_media *spi = (struct tegra_spi_media *)media->context;
+	u8 spi_read_cmd[JEDEC_READ_OUTSIZE];
+	int ret = count;
+
+	/* TODO: Dual mode (BOTH_EN_BIT) and packed mode */
+	spi_read_cmd[0] = JEDEC_READ;
+	spi_read_cmd[1] = (offset >> 16) & 0xff;
+	spi_read_cmd[2] = (offset >> 8) & 0xff;
+	spi_read_cmd[3] = offset & 0xff;
+
+	/* assert /CS */
+	spi_cs_activate(spi->slave);
+
+	if (spi_xfer(spi->slave, spi_read_cmd,
+			sizeof(spi_read_cmd) * 8, NULL, 0) < 0) {
+		ret = -1;
+		printk(BIOS_ERR, "%s: Failed to transfer %zu bytes\n",
+				__func__, sizeof(spi_read_cmd));
+		goto tegra_spi_cbfs_read_exit;
+	}
+
+	if (spi_xfer(spi->slave, NULL, 0, dest, count * 8) < 0) {
+		ret = -1;
+		printk(BIOS_ERR, "%s: Failed to transfer %zu bytes\n",
+				__func__, count);
+	}
+
+tegra_spi_cbfs_read_exit:
+	/* de-assert /CS */
+	spi_cs_deactivate(spi->slave);
+	return (ret < 0) ? 0 : ret;
+}
+
+static void *tegra_spi_cbfs_map(struct cbfs_media *media, size_t offset,
+				 size_t count)
+{
+	struct tegra_spi_media *spi = (struct tegra_spi_media *)media->context;
+	void *map;
+	DEBUG_SPI("tegra_spi_cbfs_map\n");
+	map = cbfs_simple_buffer_map(&spi->buffer, media, offset, count);
+	printk(BIOS_INFO, "%s: map: 0x%p\n", __func__, map);
+	return map;
+}
+
+static void *tegra_spi_cbfs_unmap(struct cbfs_media *media,
+				   const void *address)
+{
+	struct tegra_spi_media *spi = (struct tegra_spi_media *)media->context;
+	DEBUG_SPI("tegra_spi_cbfs_unmap\n");
+	return cbfs_simple_buffer_unmap(&spi->buffer, address);
+}
+
+int initialize_tegra_spi_cbfs_media(struct cbfs_media *media,
+				     void *buffer_address,
+				     size_t buffer_size)
+{
+	/* TODO: Replace static variable to support multiple streams. */
+	static struct tegra_spi_media context;
+	static struct tegra_spi_channel *channel;
+
+	channel = &tegra_spi_channels[CONFIG_BOOT_MEDIA_SPI_BUS - 1];
+	channel->slave.cs = CONFIG_BOOT_MEDIA_SPI_CHIP_SELECT;
+
+	DEBUG_SPI("Initializing CBFS media on SPI\n");
+
+	context.slave = &channel->slave;
+	context.buffer.allocated = context.buffer.last_allocate = 0;
+	context.buffer.buffer = buffer_address;
+	context.buffer.size = buffer_size;
+	media->context = (void*)&context;
+	media->open = tegra_spi_cbfs_open;
+	media->close = tegra_spi_cbfs_close;
+	media->read = tegra_spi_cbfs_read;
+	media->map = tegra_spi_cbfs_map;
+	media->unmap = tegra_spi_cbfs_unmap;
+
+	return 0;
+}
+
+struct spi_slave *spi_setup_slave(unsigned int bus, unsigned int cs)
+{
+	struct tegra_spi_channel *channel = to_tegra_spi(bus);
+	if (!channel)
+		return NULL;
+
+	return &channel->slave;
+}
+
+int spi_claim_bus(struct spi_slave *slave)
+{
+	tegra_spi_init(slave->bus);
+	spi_cs_activate(slave);
+	return 0;
+}
+
+void spi_release_bus(struct spi_slave *slave)
+{
+	spi_cs_deactivate(slave);
+}
diff --git a/src/soc/nvidia/tegra124/clock.h b/src/soc/nvidia/tegra124/spi.h
similarity index 68%
copy from src/soc/nvidia/tegra124/clock.h
copy to src/soc/nvidia/tegra124/spi.h
index 688505a..02495ab 100644
--- a/src/soc/nvidia/tegra124/clock.h
+++ b/src/soc/nvidia/tegra124/spi.h
@@ -14,9 +14,16 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
-#ifndef __SOC_NVIDIA_TEGRA124_CLOCK_H__
-#define __SOC_NVIDIA_TEGRA124_CLOCK_H__
+#ifndef __NVIDIA_TEGRA124_SPI_H__
+#define __NVIDIA_TEGRA124_SPI_H__
 
-void set_avp_clock_to_clkm(void);
+#include <stddef.h>
 
-#endif /* __SOC_NVIDIA_TEGRA124_CLOCK_H__ */
+struct cbfs_media;
+int initialize_tegra_spi_cbfs_media(struct cbfs_media *media,
+				     void *buffer_address,
+				     size_t buffer_size);
+
+void tegra_spi_init(unsigned int bus);
+
+#endif	/* __NVIDIA_TEGRA124_SPI_H__ */
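
Note on the transfer split in spi_xfer() (spi.c above): each pass of the send and receive loops takes at most one DMA-sized chunk, hands the part that is a multiple of the DMA alignment to the DMA path, and pushes the remainder through the FIFO path. The standalone sketch below reproduces only that arithmetic so it can be checked on a host. The names split_chunk, struct chunk, DMA_ALIGN_BYTES and MAX_DMA_BYTES are hypothetical stand-ins, and the 64 KiB limit is an assumed placeholder, not the driver's actual SPI_MAX_TRANSFER_BYTES_DMA value.

```c
#include <assert.h>

/* Assumed values for illustration only; the real constants are defined
 * by the driver and may differ. */
#define DMA_ALIGN_BYTES	4u
#define MAX_DMA_BYTES	(64u * 1024u)

/* Hypothetical helper type: how many bytes of one pass go to each path. */
struct chunk {
	unsigned int dma;	/* bytes moved by DMA, multiple of the alignment */
	unsigned int fifo;	/* leftover bytes moved through the FIFO */
};

/* Split the next pass of a transfer the same way the loops in spi_xfer() do:
 * cap at the maximum DMA size, then peel off the unaligned tail. */
static struct chunk split_chunk(unsigned int remaining)
{
	struct chunk c;
	unsigned int n = remaining < MAX_DMA_BYTES ? remaining : MAX_DMA_BYTES;

	c.fifo = n % DMA_ALIGN_BYTES;
	c.dma = n - c.fifo;

	/* The DMA part never touches memory past an aligned length. */
	assert(c.dma % DMA_ALIGN_BYTES == 0);
	return c;
}
```

With these assumed constants, a 10-byte transfer is split into 8 bytes for DMA and 2 bytes for the FIFO, and a transfer shorter than 4 bytes skips DMA entirely. A caller would subtract c.dma + c.fifo from the remaining count and repeat until it reaches zero.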