The Insider's Guide To Planning C166 Family Designs - Part I
 

 
 
 

 
     



                   

Issue B

                   

Second Edition

         
                   

166 Designer's Guide - Page

 
           
                 
                   


                     

This guide contains basic information that is useful when doing your first 166 family design. Many simple facts, if known at the outset, can save a lot of time and money. Overall, the guide is intended to complement the user manuals by putting things into a practical context.

 

Some of the material can be found in the 166 family databooks, but most of it is simply the result of our practical experience and so is only to be found here. Topics covered are those that are not obvious or are often missed. Where the user manuals provide a satisfactory explanation, you will be referred to them rather than have the information duplicated here. This is by no means a complete reference work, and you are directed to the excellent work by one of the architecture's original designers, Karl-Heinz Mattheis, available in the German language.

 

Note: While every effort has been made to ensure the accuracy of the information contained within this guide, Hitex cannot be held responsible for the consequences of any errors contained therein. Any subjective or anecdotal information presented is not necessarily the official view of either Hitex Development Tools Ltd. or Siemens Plc.

 

 

 

Prepared By:

Michael Beach

John Barstow

Karl Smith

 

Additional Material From:

Dave Greenhill
Ulrich Beier

Olaf Pfeiffer

Peter Mariutti

 

Second Edition, April 1999
                     
Hitex produces the largest range of 166 family emulation and simulation tools available from any manufacturer. By using both standard part and bondout-based technology, Hitex can uniquely provide the optimal emulation method for all 166 variants, whatever the application. Besides supplying the development tools, Hitex is also pleased to help and advise new and prospective 166 users in all aspects of hardware and software design, as this guide demonstrates - we are at your service!
                     

Hitex Development Tools Ltd.

University Of Warwick Science Park

Sir William Lyons Road

Coventry, CV4 7EZ

Tel: 01203 692066 Fax: 01203 692066

Email: inside166@hitex.co.uk Web: www.hitex.co.uk
                     
             


166 Family Designer's Guide - Contents

RISC Architectures For Embedded Applications

Introduction

Behind The 166's Near-RISC Core

Conventional CISC Bottle-necks

The RISC Architecture For Embedded Control

Basic Definitions:

Bus Interface

RISC Interrupt Response

Registers And Multi-Tasking

Coping With RISC Instruction Set (Apparent) Omissions

RISC And Real World Peripherals


1. Getting Started With The 166

1.1 Basic Considerations

1.1.1 Family Overview

1.1.2 Fundamental Design Factors

1.2.1 Setting The CPU Hardware Configuration Options (166)

1.2.2 Setting The CPU Hardware Configuration Options (167)

1.3 Calculating The Pull-Down Resistor Values

Pull Down Resistor Calculation

1.4 Pull-Up Resistor Calculations

Pull Up Resistor Calculation

1.4 Setting The Configuration Without Pulldown Resistors

1.5 Port 0 Configuration Functions

1.6 Reset Control

2. Clock Speeds And Sources

2.1 166 Variants

2.2 165 And Basic 167 Variants

2.3 167SR & CR Variants

2.4 Generating The Clock

2.4.1 Designing Clock Circuits

2.4.2 Oscillator Modules

2.4.3 Designing Crystal Oscillator Circuits

2.4.4 Crystal Oscillator Components Test Procedure

2.4.5 Typical Component Values

2.4.6 Laying Out Clock Circuits

2.4.7 Symptoms Of A Poor Clock


3. Bus Modes

3.1 Flexible Bus Interface

3.2 Setting The Bus Mode

3.2.1 166 Variants

3.2.2 C165/7 Derivatives

3.3 Setting The Overall Addressing Capabilities

3.4 External Memory Access Times (167 Derivatives Only)

3.5 Expanding The Basic 166's Memory Space


4. Interfacing To External Devices

4.1 The Integral Chip Selects (167/5/4/3/1)

4.2 Setting The Number Of Chip Selects

4.3 READ/WRITE Chip Selects.

4.4 Replacing Address Lines With Chip Selects

4.5 Generating Extra Chip Selects

4.6 Confirming How The Pull-Down Resistors Are Configured

4.7 Generating Waitstates And Controlling Bus Cycle Timings


5. Interfacing To External Memory Devices

5.1 Using Byte-Wide Memory Devices In 16-bit 167 Systems


5.2 Using The 166 With Byte-Wide Memories

5.3 Using DRAM With The 166 Family


6. Single Chip 166 Family Considerations

6.1 Single Chip Operation

6.2 In-Circuit Reprogrammability Of FLASH EPROM

6.3 Total Security For Proprietary Software

6.4 Keeping An External Bus

6.5 Hitex's In-Circuit FLASH Programming Utility Toolkit

6.5 Accommodating In-Circuit FLASH Programming

6.7 In-Circuit FLASH Programming Via CAN


7. The Basic Memory Map

7.1 On-Chip RAM Regions

7.1.1 166 Variants

7.1.2 167CR & 167SR, C165, Some 161 Variants

7.1.4 C167CS, C161CS

7.2 Planning The Memory Map

7.2.1 External ROM Applications

7.2.2 Internal ROM Applications

7.3 A Typical 167 System Memory Map

7.4 How CPU Throughput Is Related To The Bus Mode

7.5 Implications Of Bus Mode/Trading Port Pins For IO


8. System Programming Issues

8.1 Serial Port Baud Rates

8.1.1 166 Variants

Baudrates for 20 MHz

Baudrates for 16 MHz

8.1.2 Enhanced Baudrate Generator On 167 Variants

8.1.3 The Synchronous Port On The 167

8.2 Interrupt Performance

8.2.1 Conventional Interrupt Servicing Factors

8.2.2 Event-Driven Data Transfers Via The PEC System

PEC Usage Examples

8.2.3 Extending The PEC Address Ranges And Sizes Above 64K

8.2.4 Software Interrupts

8.2.5 Hardware Traps

8.2.6 Interrupt Vectors And Booting Up The 166

8.2.7 Interrupt Structure

8.3 The Bootstrap Loader

8.3.1 On-Chip Bootstrap Booted Systems

8.3.2 Freeware Bootstrap Utilities For 167

8.4 166 Family Stacks

8.5 Power Consumption

8.6 Understanding The DPPs

8.6.1 166 Derivatives

8.6.2 167 Derivatives


9. Allocating Pins/Port Pins In Your Application

9.1 General Points About Parallel IO Ports

9.2 Allocating Port Pins To Your Application

9.3 Port 0

Port 0 Pin Allocations:

9.4 Port 1

9.5 Port 2

9.5.1 The CAPCOM Unit

9.5.2 Time-Processor Unit Versus CAPCOM

9.5.3 32-bit Period Measurements

9.5.4 Generating PWM With The 166 CAPCOM Unit

9.5.5 Sinewave Synthesis Using The CAPCOM


9.5.6 Automotive Applications Of CAPCOM1

9.5.7 Digital To Analog Conversion Using The CAPCOM Unit

9.5.8 Timebase Generation

9.5.9 Software UARTs

9.6 Port 3

9.6.1 Using GPT1

9.6.2 Using GPT2

9.7 Port 4

9.7.1 Interfacing To CAN Networks

9.8 Port 5

9.8.1 166 Analog To Digital Convertor

9.8.2 167 Analog To Digital Convertor

9.8.3 Over-Voltage Protected Analog Inputs

9.8.4 167/4-Specific Enhancements

- wait-for-ADDAT-read mode

- channel injection

- programmable sampling times

9.8.5 Matching The A/D Inputs To Signal Sources

9.8.6 165/3

9.9 Port 6 (167)

9.10 Port 7 (167 Only)

50ns PWM Module/High Resolution Digital To Analog Convertor

9.11 Port 8 (167 Only)

9.12 Summary Of Port Pin Interrupt Capabilities

9.12.1 Interrupts From Port Pins

9.12.2 166 Variants

9.12.3 167 Variants

9.13 Typical 166 Family Applications

9.13.1 Automotive Applications

9.13.2 Industrial Control Applications

9.13.3 Telecommunications Applications

9.13.4 Transport Applications

9.13.5 Consumer Applications

9.13.6 Instrumentation Applications


10. 166 Compatibility With Other Architectures


11. Mounting 166 Family Devices

11.1 Package Types

11.2 Connecting Emulators To 166 Family Devices

11.2.1 Socketed Devices

11.2.2 The "PressON" Emulation Connector

11.3 166 Family PCBs

11.4 CAD Symbols


12. Direct PCB Emulation Interfaces For 166 Designs

12.1 The Problem

12.2 The ROMless Solution - ICEconnect166

12.3 The ROM/ROMless Solution - QuadConnect


13. Getting New Boards Going

13.1 External Bus Design Pitfalls

13.2 Single Chip Designs

13.3 Testing The System


14. Conclusion

15. Acknowledgements

16. Feedback

17. Contact Addresses


Appendix 1 - Siemens C166 Family Part Numbers

 

RISC Architectures For Embedded Applications

 

Introduction

 

The 166 CPU core makes extensive use of Reduced Instruction Set Computer (RISC) concepts to achieve its blend of very high performance at modest cost. To understand why RISC techniques are especially suited to high-speed real-time embedded systems, it is useful to examine how they grew out of the traditional Complex Instruction Set Computers (CISC) that reached their peak in the late 1980's to early 1990's.

 

Behind The 166's Near-RISC Core

 

The reason behind the abandonment of traditional Complex Instruction Set Computers (CISC) has been the quest for ever greater throughput. The demands of workstations involved in CAD tasks and, latterly, advanced video games have been the real driving force behind this. Traditionally, microprocessors have been designed with instruction sets geared towards making the assembler programmer's life easier, through the extensive use of microcode to produce ever more powerful instructions. By providing single assembler instructions that perform, for instance, three-operand multiplication, the assembler programmer (and HLL compiler writer) has been relieved of the job of achieving the same result with simpler instructions.

 

The need for the CPU to recognise and act on (decode) many hundreds of different instructions requires complex silicon and many clock cycles. The greater the silicon area, the greater the cost of the device and the power consumed. With physical limitations restricting achievable clock speeds on silicon devices, the number of cycles per instruction is obviously very significant in gaining higher performance.

 

RISCs tend to shift the burden of programming from the microcoder to the assembler programmers and compiler writers. Work both within academia and commercial manufacturers has proved that a suitably programmed RISC machine can achieve a far higher throughput than a CISC for a given clock speed.

 

Strangely, the embedded world has been slow to question the suitability of the CISC-based microcontroller. Whilst at the very top end, devices such as the i80960 have enjoyed some success, for more commonplace embedded tasks, RISC is almost unknown. With the increasing complexity of modern control algorithms, the need for greater processing power is set to become an issue in anything but the simplest applications. In addition, here more than in the workstation world, the worst-case response time to non-deterministic events is crucial, an area where CISCs are especially poor.

 

Many current high-end microcontrollers are based on existing CISC architectures such as the 8086, 68000 etc., which in common with 8-bit devices such as the 8051, have an internal structure that dates back up to 19 years. With the silicon vendor's need to give existing users an upgrade path, apparently new designs are often based closely on the existing architecture/instruction set, so protecting the user's investment in expensive assembler-code.

 

Like workstations, microcontrollers are tending to be programmed in a high level language (HLL) to reduce coding times and enhance maintainability. Inevitably, even with the best compilers, some loss of performance is encountered, emphasising again the need for improved CPU performance.

 

In addition to straightforward data processing, microcontrollers must also handle real-world peripherals such as A/D converters, PWM's, timers, Ports, PLL's etc., all of which require real time processing.

 

Conventional CISC Bottle-necks

 

1. Long And Unpredictable Interrupt Latencies

 

Complicated "labour-saving" instructions must hold the CPU's entire attention during execution, thus preventing real-world generated interrupts from being serviced. Unpredictable latency times result, which can cause serious problems in hard real-time systems. One approach to overcoming the CISC's poor real-time response has been to bolt a secondary "time processor" onto the core to try to off-load the time-critical portions. However, this results in an awkward design and the need to use a very terse microcode to program it, in addition to the more usual C and assembler for the CISC core itself.

 

 


2. Vast Instruction Sets Give Slow Decoding

 

A loaded instruction must be recognised from potentially many hundreds or even thousands of possibilities. Decoding is thus complicated and lengthy.

 

3. Frequent Accesses To Slow Memory Devices

 

Data is typically fetched from off-chip memory and placed in accumulator-type registers. Mathematical or logical operations are performed and the result is then written back to memory. The value is likely to be required again in the course of the procedure, thus requiring further movements to and from off-chip memory.

 

4. Slow Procedure Calling

 

When calling subroutines with parameters (essential in good HLL programming), the parameters must be individually pushed onto the stack. They must then be moved through the accumulator register(s) for processing before being returned via the stack to the caller.

 

5. Strictly One Job At A Time

 

Each peripheral device or interrupt source must have a dedicated service routine which, at the least, will require the PSW and PC to be stacked and restored, and data to be removed from or fed to the peripheral device.

 

6. Software Has To Be Structured To Suit Architecture

 

Embedded systems frequently contain many separate real time tasks which together form a complete system. Conventional CPU's make switching between tasks slow. Often, many registers have to be stacked to free them up for the incoming task. This problem is aggravated by the use of HLL compilers which tend to use a large number of local variables in library functions which must be preserved.

 

7. Redundant Instructions And Addressing Modes

 

With the move to HLLs, compilers are tending to dictate what instructions should be provided in silicon.

 

In practice, compilers tend to only make use of a small number of addressing modes. This results in a large number of unused addressing modes which serve only to complicate the opcode decoding process.

 

8. Inconsistent Instruction Sets

 

Instruction sets that have evolved tend to be difficult to use due to the large number of different basic types and the inconsistent addressing modes allowed.

 

9. Bus Not Fully Utilised

 

Whilst complex instructions are being executed, the bus is idle.

 

The RISC Architecture For Embedded Control

 

To show how RISC design is used to improve microcontroller throughput, the 166 is used as an example.

 

Basic Definitions:

 

1 state time = 2 * 1/oscillator frequency

 

- fundamental unit of time recognised within processor system.

 

1 machine cycle = 2 * state time

 

- minimum time required to perform the simplest meaningful task within the CPU.

               


The unit of state times is used when making comparisons between RISCs and CISCs as this removes any dependency on clock frequency.

 

- All state time counts are given in single chip operation mode for both 80C196 and 166.
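
The two definitions above can be turned into a short calculation. The following C sketch is illustrative only: the function names are hypothetical, and the 40 MHz oscillator used in the note below is an assumed example value rather than a recommendation.

```c
#include <assert.h>

/* State time in nanoseconds: 1 state time = 2 * 1/oscillator frequency. */
double state_time_ns(double osc_hz)
{
    assert(osc_hz > 0.0);            /* a stopped clock has no state time */
    return 2.0 * 1e9 / osc_hz;
}

/* Machine cycle in nanoseconds: 1 machine cycle = 2 * state time. */
double machine_cycle_ns(double osc_hz)
{
    return 2.0 * state_time_ns(osc_hz);
}
```

Assuming a 40 MHz oscillator, state_time_ns(40e6) gives 50 ns and machine_cycle_ns(40e6) gives 100 ns, which appears consistent with the figure of 4 machine cycles (400ns) quoted for interrupt vectoring under "RISC Interrupt Response".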

 

Bus Interface

 

To maximise the rate at which instructions are executed, RISC CPU's are very heavily pipelined. Here, on any given machine cycle, up to 4 instructions may be processed by overlapping the various steps thus:

 

FETCH: - get opcode from program store

DECODE: - identify opcode from a small list and fetch operands

EXECUTE: - perform operation denoted by opcode

WRITE-BACK: - result returned to specified location

 

Thus although the instruction takes four machine cycles, it is apparently executed in just one (2 state times). Pipelining has considerable benefits for speeding sequential code execution as the bus is guaranteed to be fully occupied.
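
The overlap of the four steps can be sketched in C as follows. This is a simplified model that assumes an ideal pipeline with no branches, waitstates or multi-cycle instructions; the function names are hypothetical and chosen purely for illustration.

```c
#include <assert.h>

#define PIPELINE_STAGES 4u   /* FETCH, DECODE, EXECUTE, WRITE-BACK */

/* Machine cycles needed to pass n instructions through the ideal pipeline:
   the first takes four cycles, each later one completes one cycle after. */
unsigned pipelined_cycles(unsigned n)
{
    if (n == 0u)
        return 0u;
    return PIPELINE_STAGES + (n - 1u);
}

/* Machine cycles needed if every instruction had to finish all four
   steps before the next one could start. */
unsigned unpipelined_cycles(unsigned n)
{
    return PIPELINE_STAGES * n;
}

/* Which instruction (0-based) occupies a stage in a given cycle?
   stage 0 = FETCH ... stage 3 = WRITE-BACK. Returns -1 if the stage is empty. */
int instruction_in_stage(unsigned cycle, unsigned stage, unsigned n)
{
    assert(stage < PIPELINE_STAGES);
    if (cycle < stage || cycle - stage >= n)
        return -1;
    return (int)(cycle - stage);
}
```

In this model, 100 sequential instructions complete in 103 machine cycles rather than 400, which is why each instruction appears to take a single machine cycle. In cycle 3, instruction 3 is being fetched while instruction 0 is in write-back, so four instructions are in progress at once.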

 

RISC Interrupt Response

 

In the 166, branches to interrupts make use of the injected instruction technique and so vectoring to a service routine is achieved in only 4 machine cycles (400ns). The effect of complex but necessary instructions such as MUL and DIV (5 and 10 cycles respectively) stretches this, but it is interesting to note that the 80C166 does provide these as interruptable instructions.

 

Very fast interrupt service is crucial in high-end applications such as engine management systems, servo drives and radar systems where real-world timings are used in DSP-style calculations. As these normally form part of a larger closed control loop, erratic latency times manifest themselves as an undesirable jitter in the controlled variable.

 

Registers And Multi-Tasking

 

Traditional microcontrollers have one or more special registers which can be used for mathematical, logical or Boolean operations. In the 8051, there is a single "accumulator" with 8 other registers which may be used for handling local variables or intermediate results in complex calculations. These additional registers are also used to access memory locations via indirect and/or indexed addressing.

 

As pointed out in points 3 and 4 above, conventional CPU's spend much time moving data from slow memory areas into active registers. The RISC offers a very large number of general purpose registers which may be used for locals, parameters and intermediates. The 166 provides 16 word-wide general purpose registers (GPRs), each of which is effectively an accumulator, indirect pointer and index. With such a large number of GPR's available, it becomes realistic to keep all locals and intermediates within the CPU throughout quite large procedures. This can yield a great increase in speed.

 

Further significant benefits are derived from the RISC technique of register windowing. As has been said, up to 16 registers are available for use by the program. However, by making the active register bank movable within a larger on-chip RAM, the job of real time multi-tasking is considerably eased.

 

Central to this is the concept of a "Context Pointer" (CP), which defines the current absolute base address of the active bank. Thus a reference to "R0" means the register at the address indicated by the CP. Thereafter, the 16 registers originating from CP are accessed by a fast 4-bit offset.

 

The best example of how the CP is exploited is perhaps a background task and a real-time interrupt co-existing. When the interrupt occurs, rather than pushing all GPR's onto the stack, the CP of the current register bank is stacked and simply switched to a new value, determined at link time, to yield a fresh register bank. This results in a complete context switch in just one machine cycle but does rule out the use of recursion.
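
The effect of the Context Pointer can be modelled in a few lines of C. The sketch below is not 166 code: the on-chip RAM is represented by an array, the CP is a word index rather than a real address, and the bank locations 0 and 64 are arbitrary assumed values standing in for addresses fixed at link time. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS 512u          /* size of the modelled on-chip RAM (assumed) */
#define BANK_SIZE 16u           /* R0..R15 */

static uint16_t ram[RAM_WORDS]; /* modelled on-chip RAM, word-addressed */
static unsigned cp;             /* Context Pointer: word index of R0 */

/* Access GPR n of the active bank: base CP plus a 4-bit offset. */
uint16_t *gpr(unsigned n)
{
    assert(n < BANK_SIZE);
    return &ram[cp + (n & 0xFu)];
}

/* Switch to another bank; the old CP is returned so it can be "stacked". */
unsigned switch_context(unsigned new_cp)
{
    unsigned old_cp = cp;
    assert(new_cp + BANK_SIZE <= RAM_WORDS);
    cp = new_cp;
    return old_cp;
}

/* Demo: the background task's R0 survives an interrupt that uses "R0" too. */
uint16_t demo_interrupt_preserves_r0(void)
{
    unsigned saved;

    switch_context(0u);            /* background bank (assumed location) */
    *gpr(0) = 0x1234u;             /* background task's R0 */

    saved = switch_context(64u);   /* interrupt entry: fresh bank */
    *gpr(0) = 0xABCDu;             /* service routine freely uses its own R0 */
    switch_context(saved);         /* interrupt exit: restore the old CP */

    return *gpr(0);                /* still the background value */
}
```

No register contents are copied at any point; only the base of the bank moves, which is what makes the switch so cheap.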

 

A hybrid method, which permits re-entrancy, uses the stack pointer to calculate the new CP dynamically.

               
         


Here, on entering the interrupt, the number of registers now required is subtracted from the current SP and the result placed in CP, with the old CP stacked. Thus the new register bank is located at the top of the old stack, with the old CP and then the new stack following on immediately afterwards. On exiting the interrupt routine, the original register bank is restored by POPping the old CP from the stack. The SP is reinstated by adding the size of the new register bank onto the current SP.
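
The SP-relative method can be modelled in C as follows. The sketch assumes a stack that grows downwards and, for simplicity, counts in words rather than bytes; the names enter_bank, leave_bank and demo_nested_entry are hypothetical, and the RAM size is an arbitrary assumed value.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS 256u

static uint16_t ram[RAM_WORDS];
static unsigned sp = RAM_WORDS;  /* stack pointer, grows downwards */
static unsigned cp = 0u;         /* context pointer: word index of R0 */

static void push(uint16_t v)  { assert(sp > 0u); ram[--sp] = v; }
static uint16_t pop(void)     { assert(sp < RAM_WORDS); return ram[sp++]; }

/* Interrupt entry: carve a bank of nregs registers out of the stack. */
void enter_bank(unsigned nregs)
{
    unsigned old_cp = cp;
    assert(sp > nregs);          /* room for the bank plus the saved CP */
    sp -= nregs;                 /* reserve the new register bank */
    cp = sp;                     /* new R0 sits at the top of the old stack */
    push((uint16_t)old_cp);      /* old CP follows immediately afterwards */
}

/* Interrupt exit: undo enter_bank() in reverse order. */
void leave_bank(unsigned nregs)
{
    cp = pop();                  /* restore the original register bank */
    sp += nregs;                 /* add the bank size back onto the SP */
}

/* Demo: a nested entry gets its own bank, and both exits restore state. */
int demo_nested_entry(void)
{
    unsigned sp0 = sp, cp0 = cp, cp1, cp2;
    int inner_ok;

    enter_bank(4u);  cp1 = cp;
    enter_bank(4u);  cp2 = cp;   /* re-entered: a distinct bank is created */
    leave_bank(4u);
    inner_ok = (cp == cp1) && (cp1 != cp2);
    leave_bank(4u);

    return inner_ok && sp == sp0 && cp == cp0;
}
```

Because each entry allocates its bank from whatever the SP happens to be at the time, the same routine can be entered again before it has finished, which is the re-entrancy that the fixed, link-time bank cannot offer.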

 

A further RISC refinement is register window overlapping whereby when a new procedure is called, part of the new register bank defined by CP' is coincident with the original at CP:

                R3'     ; Register for subroutine's locals and intermediates
                R2'     ; Register for subroutine's locals and intermediates
        R7      R1'     ; Common register, R7 == R1'
CP'     R6      R0'     ; Common register, R6 == R0'
        R5              ; Register for caller's locals and intermediates
        R4              ; Register for caller's locals and intermediates
        R3              ; Register for caller's locals and intermediates
        R2              ; Register for caller's locals and intermediates
        R1              ; Register for caller's locals and intermediates
CP      R0              ; Register for caller's locals and intermediates

MODULE 1

 

; *** Assignment Of GPRs To Local Variables - Caller ***

x_var   LIT     'R0'            ; Local variable
y_var   LIT     'R1'            ; Local variable

parm1   LIT     'R6'            ; Passed parameter 1
parm2   LIT     'R7'            ; Passed parameter 2

result  LIT     'R6'            ; Value returned from subroutine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

MODULE 2

; *** Assignment Of GPRs To Local Variables - Sub Routine ***

a_var   LIT     'R2'            ; Local variable
b_var   LIT     'R3'            ; Local variable

input1  LIT     'R0'            ; Received parameter 1
input2  LIT     'R1'            ; Received parameter 2
ret1    LIT     'R0'            ; Final result returned in R0

Fig. A - Giving GPR's Meaningful Names



      

By using some forethought, the programmer can arrange for any value to be passed to the subroutine to be located in the common area, so that all the normal loading and unloading of parameters is avoided. This technique can be used in either absolute or SP-relative register bank modes.
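
Window overlapping of the kind shown in Fig. A can also be modelled in C. In the sketch below the register file is an array and the CP is a word index; the subroutine simply adds its two inputs, which is an assumption made for illustration since the figure does not say what the subroutine computes. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_WORDS    64u
#define WINDOW_SHIFT 6u          /* callee's R0' coincides with caller's R6 */

static uint16_t ram[RAM_WORDS];  /* modelled on-chip RAM, word-addressed */
static unsigned cp;              /* word index of the active bank's R0 */

static uint16_t *gpr(unsigned n)
{
    assert(cp + n < RAM_WORDS);
    return &ram[cp + n];
}

/* Subroutine: inputs arrive in R0'/R1', result is left in R0'. */
static void add_sub(void)
{
    *gpr(2) = (uint16_t)(*gpr(0) + *gpr(1));  /* a_var = input1 + input2 */
    *gpr(0) = *gpr(2);                        /* ret1 returned in R0' */
}

/* Caller: parameters go in R6/R7, the window slides, result is read in R6. */
uint16_t call_with_overlap(uint16_t parm1, uint16_t parm2)
{
    *gpr(6) = parm1;             /* parm1 in R6 == R0' */
    *gpr(7) = parm2;             /* parm2 in R7 == R1' */

    cp += WINDOW_SHIFT;          /* CP' = CP + 6 registers */
    add_sub();
    cp -= WINDOW_SHIFT;          /* back to the caller's window */

    return *gpr(6);              /* result in R6 == R0' */
}
```

Neither the parameters nor the result are ever copied between caller and subroutine; they are simply seen under different register names on each side of the call.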

 

To get the best from a RISC's registers, the location of data needs close consideration: although highly orthogonal, the limited number of addressing modes provided for MUL and DIV for example, can appear somewhat restrictive. Fortunately though, most operands involved will already be in registers, so eliminating the need for many addressing techniques. As might be expected, the instructions with the widest range of addressing modes are the simple data moves - the fact that RISC's are the result of very careful analysis of the requirements for fast execution becomes obvious after a short acquaintance!

               


Coping With RISC Instruction Set (Apparent) Omissions

 

With largely single machine cycle execution, some conventional "fast" instructions such as CLEAR, INC and DEC become redundant. Therefore, to keep the total number of instructions to a minimum, RISC's simply omit them. Examples are given below:

 

Instruction        80C196    States    80C166        States

Clear Word         CLR       4         AND Rn,#0     2
Decrement Word     DEC       4         SUB Rn,#01    2
Increment Word     INC       4         ADD Rn,#01    2

- all direct addressing mode

 

Three-operand instructions are also commonplace in CISCs but not present in RISCs. Although additional instructions are required, the overall number of states is still less than the three operand CISC equivalent, plus the shorter RISC instructions allow greater opportunity for interrupt servicing.

 

The following example illustrates this:

 

Perform: z = x + y

 

 

80C196 (CISC)



      

z, x and y are directly addressed memory locations

x       DW      1
y       DW      1
z       DW      1

        ADD     z,x,y           ; 5 states - no interrupt possible

166 (RISC)

 

z, x and y are memory locations, Rw is a GPR

x       DW      1
y       DW      1
z       DW      1

        MOV     Rw,x            ; 2 states
                                ; * Interruptable here
        ADD     Rw,y            ; 2 states
                                ; * Interruptable here
        MOV     z,Rw            ; 2 states
                                ; --------
                                ; 6 states

One extra state is required when using the RISC approach. However, if the variables are assigned recognising that this is a RISC:



      

x and y are memory locations, z is a GPR

x       DW      1
y       DW      1

z       LIT     'R0'            ; z is assigned to GPR R0 via a LITeral definition

        MOV     z,x             ; 2 states
                                ; * Interruptable here
        ADD     z,y             ; 2 states
                                ; --------
                                ; 4 states



      

- 1 state saved over CISC. The above was chosen as a worst case RISC, best case CISC example.
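
The state counts in the three listings above can be tallied mechanically. The C sketch below uses only the per-instruction figures quoted in those listings; the function and array names are hypothetical, and the model assumes that an interrupt can only be taken between instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Total number of states for a sequence of instructions. */
unsigned total_states(const unsigned *states, size_t n)
{
    unsigned sum = 0u;
    size_t i;

    assert(states != NULL || n == 0u);
    for (i = 0u; i < n; i++)
        sum += states[i];
    return sum;
}

/* Longest single instruction, i.e. the longest stretch during which a
   pending interrupt has to wait under the assumption stated above. */
unsigned longest_uninterruptible(const unsigned *states, size_t n)
{
    unsigned max = 0u;
    size_t i;

    for (i = 0u; i < n; i++)
        if (states[i] > max)
            max = states[i];
    return max;
}

/* State counts as quoted in the listings. */
const unsigned cisc_add3[]     = { 5u };          /* ADD z,x,y           */
const unsigned risc_z_memory[] = { 2u, 2u, 2u };  /* MOV, ADD, MOV       */
const unsigned risc_z_gpr[]    = { 2u, 2u };      /* MOV z,x and ADD z,y */
```

With z in memory the 166 sequence totals 6 states against 5 for the 80C196, and with z in a GPR it totals 4. In both 166 cases the longest uninterruptible stretch is 2 states rather than 5.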

               
         
