Page 1

Page 2

Page 3

# Back to our example

## From the zero detection circuit example

• First implementationRewrote it into
```nz_detect4 [a] = a
nz_detect4 as = out
where
(as1,as2) = halveList as
out1 = nz_detect4 as1
out2 = nz_detect4 as2
out = or2(out1,out2)```
```binTree op [a] = a
binTree op as = out
where
(as1,as2) = halveList as
out1 = binTree op as1
out2 = binTree op as2
out = op(out1,out2)```
`nz_detect4 = binTree or2`
• Split up a monolithic solution into a reusable connection pattern and a simple solution
• Functional programming languages makes this easy!
Page 4

Page 5

# Comparing the two connection patterns

• We now have two similar connection patterns:
• `binTree :: ( (a,a)->a ) -> ( [a]->a )`
• `chain :: ( (a,a)->a ) -> ( [a]->a )`
• Similarities and differences:
• Both turn a two-input operator into an n-input operator
• One builds a balanced tree, the other builds a linear chain
• But do `chain op` and ```binTree op``` always compute the same result?
Page 6

# Comparing two connection patterns

• chain (*) [a,b,c,d,e,f,g,h] = a*(b*(c*(d*(e*(f*(g*h))))))
• binTree (*) [a,b,c,d,e,f,g,h] = ((a*b)*(c*d))*((e*f)*(g*h))
• a*(b*(c*(d*(e*(f*(g*h)))))) = ((a*b)*(c*d))*((e*f)*(g*h)) ?
• This holds if * is... associative: (a*b)*c=a*(b*c)
Page 7

# Measuring combinatorial circuit depth

• Take a look at the types of these connection patterns again:
• `binTree :: ( (a,a)->a ) -> ( [a] -> a)`
• `chain :: ( (a,a)->a ) -> ( [a] -> a)`
• They are generic functions, so they work for arbitrary Haskell values
• An operator for measuring circuit depth:
• `depth (a,b) = max a b + 1`
Page 8
##### Measuring combinatorial circuit depth
• We can now compare
• `chain depth (replicate 16 0)` = 15
• `binTree depth (replicate 16 0)` = 4
• Here we compute with pure Haskell values
Page 9

## We can also write functions that involve both circuit components and arbitrary Haskell values

### This hints at very powerful techniques for constructing circuits!

• Circuit generation rather than circuit description.
• Circuit generators can still look like circuit descriptions.
• Mary Sheeran: clever circuits can control the choice of components, wiring and topology.
• Designers can work with nice looking high level circuit descriptions, but still control low-level aspects of the design.
Page 10

# Linear chains

## Remember `row`?

• `chain` can be seen as a special case!
•  ```chain op (a:as) = out where (_,out) = row rowop (a,as) rowop (a,b) = ((),op(a,b))```
• Alternatively, `row` can be seen as a generalization of `chain`.
Page 11

# Classifying connection patterns

• Consider these two properties
• Structure: linear chain or balanced tree?
• Which outputs: only one final output or output from every op?
• Do we have all combinations?
• LinearTree
Final output`chain``binTree`
Every op`row`???
• What would we use the 4th operator for? ...
LinearTree
Final output

`slowZeroDetect = chain or2`

`fastZeroDetect = binTree or2`

Every op

`rippleCarryAdder = row fullAdder`

`fastAdder = ??? fullAdder`

• This requires some further thought...
Page 12

• The carries ripple through in a linear chain.
• Hard to imagine how to rearrange full adders into a tree structure.
• Where is the associative operator?
Page 13

# Carry Generate and Propagate

• g = a ∧ b
p = a ⊕ b
• cout = g ∨ (cin ∧ p)
• Note: if cin=0, then cout = g
Page 14

# Combining generate and propagate signals

## When would the combination of two full adders generate/propagate carry?

•  g = g2 ∨ (g1 ∧ p2) p = p1 ∧ p2
• In Lava:
• `dotOp ((g1,p1),(g2,p2)) = (g2 ∨ (p2 ∧ g1),p1 ∧ p2)`
• This operator is associative!
• `chain dotOp = binTree dotOp`
• We can compute the cout from a n-bit adder with a circuit of depth log₂ n.
Page 15

• We can build a fast adder from these blocks:  map gp ... binTree? dotOp ... map sumBit ...
• But we need the right tree pattern. `binTree` is not enough.
Page 16

# The Parallel Prefix Problem

• Given
• Inputs x1, x2, x3, ..., xn
• an associative operator *
• Compute
• x1, x1*x2, x1*x2*x3, ..., x1*x2*x3*...*xn
Page 17

## Uses for parallel prefix circuits

• Microprocessors cotain lots of parallel prefix circuits
• Addition (integer and floating point)
• Priority encoding
• ...
• They need to be fast and have low power consumption
Page 18

# The Parallel Prefix Problem in Lava

• We have
• `chain, binTree :: ( (a,a)->a ) -> ( [a]->a )`
• We need
• `ripple, sklansky :: ( (a,a)->a ) -> ( [a]->[a] )`
• Example (writing `abc` instead of `a*b*c` etc):
• ```chain (*) [a,b,c,d,e,f,g,h] = binTree (*) [a,b,c,d,e,f,g,h] = abcdefgh```
• ```ripple (*) [a,b,c,d,e,f,g,h] = sklansky (*) [a,b,c,d,e,f,g,h] = [a,ab,abc,abcd,abcde,abcdef,abcdefg,abcdefgh]```
Page 19

# Serial prefix solution

## Circuit diagram for `ripple`

• The circuit depth grows linearly with the number of input bits
Page 20

Page 21

## Lava code

• ```ripple :: ((a,a)->a) -> [a] -> [a]
ripple op [a] = [a]
ripple op (a:b:cs) = a:rs
where
rs = ripple op (ab:cs)
ab = op(a,b)
```
• Similar to `row`, but the single output from `op` is both part of the result and propagated to the next `op`.
Page 22

# Tree shaped prefix solution

Page 23

## How do we implement `sklansky` in Lava?

### `sklansky (*) [a,b,c,d,e,f,g,h]` = ?

• Use `halveList`:
• halveList [a,b,c,d,e,f,g,h] = ([a,b,c,d],[e,f,g,h])
• Compute the results for the halves:
• sklansky (*) [a,b,c,d] = [a, ab, abc, abcd]
• sklansky (*) [e,f,g,h] = [e, ef, efg, efgh]
• Combine the results:
• [a, abc, abcd] ++ [abcd*e, abcd*ef, abcd*efg, abcd*efgh] =
[a, ab, abc, abcd, abcde, abcdef, abcdefg, abcdefgh]
Page 24

Page 25

# Measuring circuit depth

• `ripple depth (replicate 16 0)` = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
• `sklansky depth (replicate 16 0)` = [0,1,2,2,3,3,3,3,4,4,4,4,4,4,4,4]
Page 26

• ```sklanskyAdder (as,bs) = sum
where
abs = zipp (as,bs)
gps = map gp abs
gps' = sklansky dotOp gps
cs = map fst gps'
sum = map sumBit (zipp (low:cs,abs++[(low,low)]))
```
• Room for improving the circuits for the first and last bit in the sum
• Rewrite the code, or
• use clever circuits!
Page 27

# Clever circuits

## A simple example: leaving some inputs disconnected

• ```nc = Nothing -- not connected
c = Just     -- connected

cleverSumBit (Nothing,Nothing) = low
cleverSumBit (Just c_in,Nothing) = c_in
cleverSumBit (Nothing,Just (a,b)) = a ⊕ b
cleverSumBit (Just c_in,Just (a,b)) = c_in ⊕ a ⊕ b
```
• ```cleverSklanskyAdder (as,bs) = sum
where
abs = zipp (as,bs)
gps = map gp abs
gps' = sklansky dotOp gps
cs = map fst gps'
sum = map cleverSumBit (zipp (nc:map c cs,map c abs++[nc]))
```
• Same as `sklanskyAdder`, except the last line
Page 28

# Formal verification

• ```propAdder' myAdder n =
forAll (list n) \$ \ as ->
forAll (list n) \$ \ bs ->

return (all (==Valid) rs)

```
• Verifies that the new adders are equivalent to binAdder from the Lava library, for all sizes [1..32].
Page 29

# Other parallel prefix solutions

• Brent Kung (fewer ops, deeper, fanout only 2)
• Clever circuits: dynamic programming to search for optimal solutions
Page 30

Page 31