arm: Redesign, clarify and clean up cache related code

This patch changes several cache-related pieces to be cleaner, faster or
more correct. The largest change removes the old
arm_invalidate_caches() function and the surrounding bootblock code that
initialized SCTLR, and replaces them with an all-assembly function that
handles cache and SCTLR initialization to bring the system to a known
state. It runs without a stack and before coreboot makes any write
accesses, to be as compatible as possible with whatever state the system
was left in by preceding code. This also finally fixes the dreaded
icache bug that wasted hundreds of milliseconds during boot.
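
For reference, the SCTLR sequence that the removed C code performed can
be sketched as pure bit arithmetic on a plain variable. This is only an
illustration of the old mask-then-enable ordering, not the new
arm_init_caches assembly, which is not part of this diff. The bit
positions are taken from the ARMv7-A architecture definition of SCTLR,
and the helper names sctlr_all_off() and sctlr_icache_on() are
hypothetical, introduced here for the sketch.

```c
#include <stdint.h>

/* SCTLR bit positions as defined by the ARMv7-A architecture. */
#define SCTLR_M (1u << 0)	/* MMU enable */
#define SCTLR_C (1u << 2)	/* data/unified cache enable */
#define SCTLR_Z (1u << 11)	/* branch prediction enable */
#define SCTLR_I (1u << 12)	/* instruction cache enable */

/*
 * Hypothetical helper: the value written first, with MMU, caches and
 * branch prediction all disabled. Other bits are left unchanged.
 */
static uint32_t sctlr_all_off(uint32_t sctlr)
{
	return sctlr & ~(SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I);
}

/*
 * Hypothetical helper: the value written after cache invalidation,
 * with only the icache and branch prediction turned back on.
 */
static uint32_t sctlr_icache_on(uint32_t sctlr)
{
	return sctlr | SCTLR_Z | SCTLR_I;
}
```

On real hardware the two results would be written back with
write_sctlr(), with the cache invalidation placed between the two
writes, as in the lines this patch removes from romstage.c.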

Old-Change-Id: I7bb4995af8184f6383f8e3b1b870b0662bde8bd4
Signed-off-by: Julius Werner <jwerner@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/183890
(cherry picked from commit 07a35925dc957919bf88dfc90515971a36e81b97)

nyan_big: apply cache-related changes from nyan

This applies the same changes from 07a3592 that were applied to nyan.

Old-Change-Id: Idcbe85436d7a2f65fcd751954012eb5f4bec0b6c
Reviewed-on: https://chromium-review.googlesource.com/184551
Commit-Queue: David Hendricks <dhendrix@chromium.org>
Tested-by: David Hendricks <dhendrix@chromium.org>
Reviewed-by: David Hendricks <dhendrix@chromium.org>
(cherry picked from commit 4af27f02614da41c611aee2c6d175b1b948428ea)

Squashed the followup patch for nyan_big into the original patch.

Change-Id: Id14aef7846355ea2da496e55da227b635aca409e
Signed-off-by: Isaac Christensen <isaac.christensen@se-eng.com>
(cherry picked from commit 4cbf25f8eca3a12bbfec5b015953c0fc2b69c877)
Signed-off-by: Marc Jones <marc.jones@se-eng.com>
Reviewed-on: http://review.coreboot.org/6993
Tested-by: build bot (Jenkins)
Reviewed-by: Stefan Reinauer <stefan.reinauer@coreboot.org>
diff --git a/src/mainboard/google/nyan_big/romstage.c b/src/mainboard/google/nyan_big/romstage.c
index 80eea77..5fd255e 100644
--- a/src/mainboard/google/nyan_big/romstage.c
+++ b/src/mainboard/google/nyan_big/romstage.c
@@ -69,26 +69,13 @@
    write_l2actlr(val);
 }
 
-void main(void)
+static void __attribute__((noinline)) romstage(void)
 {
 	int dram_size_mb;
 #if CONFIG_COLLECT_TIMESTAMPS
 	uint64_t romstage_start_time = timestamp_get();
 #endif
 
-	// Globally disable MMU, caches and branch prediction (these should
-	// already be disabled by default on reset).
-	uint32_t sctlr = read_sctlr();
-	sctlr &= ~(SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I);
-	write_sctlr(sctlr);
-
-	arm_invalidate_caches();
-
-	// Renable icache and branch prediction.
-	sctlr = read_sctlr();
-	sctlr |= SCTLR_Z | SCTLR_I;
-	write_sctlr(sctlr);
-
 	configure_l2ctlr();
 	configure_l2actlr();
 
@@ -110,7 +97,6 @@
 			 CONFIG_DRAM_DMA_SIZE >> 20, DCACHE_OFF);
 	mmu_config_range(dram_end, 4096 - dram_end, DCACHE_OFF);
 	mmu_disable_range(0, 1);
-	dcache_invalidate_all();
 	dcache_mmu_enable();
 
 	/* For quality of the user experience, it's important to get
@@ -146,3 +132,10 @@
 #endif
 	stage_exit(entry);
 }
+
+/* Stub to force arm_init_caches to the top, before any stack/memory accesses */
+void main(void)
+{
+	asm ("bl arm_init_caches" ::: "r0","r1","r2","r3","r4","r5","ip");
+	romstage();
+}