rod logic

Completed: Spiral cam drive, intermodule connectivity

Based on the plan in the previous post, I have completed and tested the spiral cam drive. First, I cut to size some aluminum round stock on the lathe, and drilled a 3/4-inch hole through, also on the lathe, for the Acme rod. This became my blank.

IMG_0048.JPG

Next, I wrote a program to generate GCode to mill the spiral cam out of the blank. I went through several iterations, improving the efficiency each time. It still takes about 20 minutes to mill the spiral cam. And here is the inevitable CNC porn:

I drilled and tapped a hole radially into the cam, drilled a set hole radially into the Acme rod, and attached the cam to the rod. Here's the installation showing how the bearing rides on the cam surface:

IMG_0049.JPG

And here are some videos showing the mechanism in action, driving 12 rods. Four of the rods have extra-strength springs so this is the equivalent of 16 rods. The action is incredibly smooth, and doesn't take much force.

I also tested stringing fishing line between two modules, and that worked perfectly. I still need to work on getting the length of the string just right, since I can't adjust the spacing between modules for a given string length.

NextFab is closed for two weeks for the winter solstice holidays, and this also ends this phase of investigation. The next phase, to be started in January, is making a cylindrical gear to rotate the drive, and then putting together a single bit slice module. When those tasks are completed, I will have enough proof that I can complete the 16-bit ALU, which means that I can start a Kickstarter project. I hope to complete the bit slice module by the end of February.

Noise margins in the Logical Engine

Here's an explanation of how I implemented noise margins for the Logical Engine design. Recall that rods have holes for bumps every 0.3125 inches (5/16"), and that the spacing between rods is 1/2" vertically and 0.625 (5/8") horizontally. Also, I am using 1/4" hex standoffs that have the following measurements:

 

bump.png

Here's a setup which will illustrate noise margins. The logic here is two inverters, but that doesn't really matter.

noise-margin0.png

The blue bumps are the bit bumps, while the green bumps are the sense bumps. The hexes are aligned so that the pointy end is along the direction of motion.

Here are the measurements of the bottom pair:

noise-margin0-measure1.png

The key measurement is the distance between the flat end of the bit bump and the pointy end of the sense bump: 0.0432". The geometry of the second pair is identical, so the same measurements apply. So if the bit bump on rod #1 were to be slid in front of the sense bump...

noise-margin1.png

Then rod #2 will only be able to move that much, 0.0432":

noise-margin2.png

And now, if we slide rod #3, since the bit bump on rod #2 isn't in the way of the sense bump on rod #3, rod #3 can slide the full 0.3125". Or can it?

noise-margin3.png

Zero clearance! So if the sense bump on rod #2 were just a little thinner, rod #2 will move that much more, and then the sense bump on rod #3 will smack right into the bit bump. You might figure that because the sides are sloped with respect to each other, it's not such a big deal: rod #2 gets slid back by the force of rod #3.

But that's not what noise margin is about. The idea is that the bit on rod #2 should stop short of rod #3's sense bump's path, so that any noise in the movement will not affect the result.

By making the bit bumps square, we solve this problem:

noise-margin4.png

And so the noise margin for a zero bit is 0.0193". The noise margin for a one bit is, of course, 0.125", being half the width of a bit bump.

The Logical Mill, revisited

So now that I've had a chance to think long and hard about the design of the arithmetic logic unit, or Logical Mill as Babbage would have called it, I realized that I had several errors in the previous discussion. I also reorganized the Mill to make it possible to increase the number of bits more easily. I have also tested it thoroughly in simulation. I think I am also able to more coherently discuss the design, so here goes.

Overview

First, here is the list of operations that the Logical Mill can perform:

operations.png

The Mill is based on the bit-slice architecture, which means that regardless of how many bits the Mill has, the output for a given bit is based on the inputs for that specific bit, and on no other bit. There may, however, be logic outside the bit slices to provide signals that the bit slices need. For example, in order to add two bits, we need a carry in which comes from a separate section of logic.

The Bit Slice

Here is the first section of the bit slice. It is a kind of preprocessor of the bits:

bitslice1.png

A and B are the inputs to the bit slice, and F is a 5-bit function selector. The preprocessor simply inverts A if F0 is 1. This is useful for logic functions, for example simply inverting A bitwise. It is also useful arithmetically, since inverting A is equivalent to multiplying by -1 (and subtracting 1, but that's another story).

What the postprocessor does to B is slightly more complicated. The multiplexer selects either B or 0 for forwarding to the optional inverter. 0 is selected if F is 010xx, which corresponds to the four arithmetic additions without B. Thus, if F is 0100x the second operand to the additions is 0, and if F is 0101x, the second operand is 1111...1111 (i.e. -1). Otherwise, if F is 011xx, these are the four arithmetic operations which include B, inverted or not depending on F1.

Here are the equations corresponding to the above diagram, in AND-OR form, which is the easiest to translate to rod logic:

a-prime.png

b-prime.png

The final part of the bit slice is a multiplexer, which computes all the functions (shift, add, xor, and so on) and selects one of them based on the three highest bits of F.

bitslice2.png

In the above diagram, the function for F=111xx is logical or, not addition. Addition is handled by F=010xx and 011xx, as described above. SR is a function which computes the right-shift, SL is a function which computes the left-shift, and U is a function which computes the unary logical operations (A, B, or inversions of same). These are the specific equations for SR, SL, and U:

mux.png

In the above equation, M is a bit indicating whether the bit slice is for the most significant bit, which is important for right shift to determine whether the carry should be shifted in (logical shift) or whether the most significant bit should remain the same (arithmetic shift).

Likewise, L is a bit indicating the same. Note that for both shift right and shift left, the carry is only shifted in when A isn't inverted. Also, the carry isn't shifted in on arithmetic shift right, because the highest bit is kept. I decided that it didn't make much sense to shift carry in when shifting the invert of A.

So, the bit slice in block form looks like this:

bitslice-block.png

The Lookahead Carry Unit

P and G are special outputs called Propagate and Generate. These are used in the external function which feeds each bit slice with its carry in, and this external function is the Lookahead Carry Unit. The naive way to perform addition is to compute the output and the carry out of the lowest bit, then use the carry out as the carry in of the next bit, compute its output and carry out, and so on. This is a ripple carry, and leads to delays.

A lookahead carry unit, on the other hand, looks at all of the bits and computes carry in for every bit. Consider two bits, A and B, and a carry in. Regardless of the carry in, if A and B are both 1 then a carry out will be generated. If A or B are 1, then a carry in will result in a carry out: the carry in is propagated:

pg.png

The carry in for bit 0 is simply the carry in to the Mill. The carry in for bit 1 is set if bit 0 generates a carry, or if bit 0 propagates a carry and carry in is set. Similarly, the carry in for bit 2 is set if bit 1 generates a carry, or bit 0 generates a carry and bit 1 propagates a carry, or if bits 0 and 1 propagate a carry and carry in is set. And so on for each carry.

lcu.png

In this way, all the carries can be computed at once, and thus the delay suffered by the last bit is the same as the delay suffered by the first bit, and this delay is quite small.

The equations also show a group propagate and a group generate. The group propagate indicates whether the LCU propagates a carry in, and the group generate indicates whether the LCU generates a carry out. This is useful when we want to increase the number of bits. Simply multiply the hardware by four (thus 16 bits and 4 LCUs) and add an LCU to read in the group propagate and generate signals from each LCU from the first stage. The second stage LCU then generates the carries that feed the first stage LCUs, and the first stage LCUs generate the carries feeding the bit slices:

lcu-chain.png

The advantage is that the same LCU architecture used in the first stage is also used in the second stage. This can be repeated ad infinitum, with every doubling of bit width resulting in a doubling of hardware, but only a linear increase in delay.

Flags

Finally, we have the flags. I want Z (zero), N (negative), V (overflow), and C (carry out) flags. The equations for these flags are as follows:

flags.png

The reason carry out is not simply C4 from the LCU is that for shift left and shift right, the carry out is the most significant and least significant bits, respectively. In addition, the carry out should always be zero for the logical operations.

Rod Logic Diagram

I've constructed a convenient method for describing rod logic. Here is an example of an identity gate (top) and an inverter (bottom):

rodlogic-diagrams.png

On the left are the rod logic diagrams, and on the right are the corresponding physical realizations.

The line with feathers indicates an input, while the line with an arrowhead indicates an output. The order of pushing the rods is determined by following the directions of the arrows.

A filled-in circle indicates identity, while an open circle is an inverter.

So, here is what an XOR gate looks like:

xor-diagram.png

Let's start with the LCU, since it is fairly simple:

lcu-diagram.png

The reason that Cin is an inverted input is that all the carry outs are also inverted. Also, since the group Generate signal comes out inverted, the Generate inputs are also inverted.

Here is the bit slice itself. It is much more complicated, and has a few oddities which I will describe below.

bit-slice-diagram.png

First, notice the G and P outputs (these are output from level 3, where level 0 is the input level). Note that G is not inverted. In fact, it cannot be inverted unless we add a delay, and we don't want to do that. Thus, there will in fact be two versions of the LCU: one with G inputs negative as in the diagram above, and one with G inputs positive. I welcome any suggestions to have G be output negative as long as they do not add too many additional rods or any additional delay.

Note also that levels 7 and 8 do nothing. They are, in fact, present to provide a delay of 2. The LCU itself outputs the carry outs after a delay of 2, and this additional delay is here to show what we would need if we had an additional level of LCUs, thus making a delay of 4 from the P and G outputs.

The delay of this bit slice is 9.

The diagram for the flags module comes next:

flags-diagram.png

Here the V and C outputs are negative, and this is necessary to prevent additional delay. Thus, the total delay of the Mill from input to flags is 11.

Conclusion

I've designed the core of the Logical Mill so that it can be extended to any number of bits. I will start by constructing a 4-bit Logical Mill, and then provided the cost is not too much, scale up to a 16-bit Mill.

The registers surrounding the Mill still have to be designed, but I can start without them.