Posts Tagged ‘Xmega’


My magic AVR build script

2011/08/11

Don’t get me wrong, I like make…  Hah! Yeah, right!

make has its place, no doubt, but small microcontroller projects not so much.  In my projects I like to just start coding, pulling in existing code where it makes sense.  Maintaining a makefile for that gets old really fast, and is massively overwrought compared to the requirements.

So, over the last couple years I’ve developed my own scripts to build AVR projects.  The latest variation has just received a major speedup, so I thought I’d share it along with some selected Bash tips.



DIY HVAC zoning

2011/06/17

Last December (just in time for the “perfect storm” of energy credits…) we had a new heat-pump installed to replace the old gas furnace.  Additional work was done to run ducts to my office upstairs.  The plan was to make the upstairs actually workable during the summer, rather than having the window unit chill the office overnight, then lose the battle at 2 or 3pm and effectively end my productive work day.  So far the system has worked nicely, though the actual compressor has been a persistent noise problem (which Carrier is slowly working on).

The problem is, there really is no such thing as “balance” in a straight ducted system like this.  If the heat is on at any point, the duct to upstairs has to be completely closed off in order to not boil me alive, since there’s plenty of existing heat-producing equipment on my desk.  If it switches to cooling mode, I have to open the duct lest the ambient temperature ooze through the roof and equally boil me alive.  To help this latter situation the installation included a duct-assist fan for the upstairs registers, which has to be turned on in many cases to get enough cold air (which is heavy and doesn’t want to go upstairs) where it needs to be.

Unfortunately, the assumption that there would be enough air return “falling” down the stairs hasn’t turned out to be true, which means I’m likely going to have to look into adding return registers and figuring out how to route a duct alongside the supply to the basement.  That’ll mean disassembly of the supply duct because it was positioned in the middle of the available space….

All of this is of course a preamble to the real topic (as per the, um, topic line) of this post, which is my plan to turn this into a zoned system.  The actual meat is after the break….


Graphic demonstration of remote clock sync

2011/02/03

So I’ve been perfecting the method I use in my current project to sync up the two clocks.  In a previous post I explained the generation method, which consists of using a master-generated reference pulse on the data bus to drive a PID that pushes the slave’s crystal frequency around to get them to line up.

Originally I’d maintained a local difference between the master and slave clocks, and any request for the clock would be adjusted by that.  However, I realized it would be very useful for the actual hardware timers to line up with the same values, and it turned out to be almost a trivial change.  Instead of just taking the offset and storing it, I pause the timers and reset them with a value that’s adjusted by the offset (plus a fudge factor for the time it takes to actually change the timers).  Any slight deviation left over can be handled by the I and D of the PID controller.

The main cycle counter is actually two daisy-chained 16-bit timers, and the daisy-chain mechanism is what on Xmega is called an “event”.  If I select the right event channel (0 out of 0..7) and flip a few config bits, I can output the 16th bit of the cycle counter to a pin, which I can then connect my scope to.  I hooked up my scope to this output on both the master and slave event outputs, and set it to trigger off the master’s pulse.

Initially there is only a master trace (channel 1) visible, with the slave (on channel 3) somewhere else entirely along the 488.3Hz cycle (32MHz / 2^16).  Halfway through the video I hit the key that triggers the sync protocol, and you can immediately see the slave’s clock trace show up and home in on a lock.

A quick eyeball measurement of the jitter and delay gives me about a 1 cycle average delay (32MHz cycle) or about 31ns, and a worst-case jitter of about ±2 cycles or about 62ns. I’m betting some of this has to do with the tuning of my PID, as I can clearly see some periodic overshoot in the crystal adjustment parameter. My goal was supposed to be around 25,000ns if I remember right, so I think I’ve managed fairly well ;-)

Next step is to trigger a single pulse at another time and use that to start the ADC’s conversion clock…


Arbitrary clock generation (with benefits)

2010/09/16

The product I’m working on right now has a load of very arcane requirements, forcing me to delve into areas totally new to me.  One of those is precision clocking, since the product requires synchronized capture of data across multiple units.  As such I went hunting for PLLs that would do the job I need.  The key is the ability to adjust the speed very slightly in order to synchronize multiple clocks on multiple units to the exact same speed over the long term.

TI’s CDCE9xx series of chips turned out to have all the right features.  Each chip in the series has an onboard VCXO, which allows me to tweak the exact speed of the crystal with a voltage input, up to +-150ppm.  The various chips in the line have differing numbers of PLLs, each with 2 outputs on separate dividers.  The 12+9 bit N/M divider and 7-bit predivider allow for almost any sane clock speed you can dream up.  Coupled with the right crystal, it does exactly what I need.

But, there’s the problem.  TI specifies a whole mess of arcane crystal parameters needed in order to make the pullable VCXO actually work.  The catch is that nobody selling crystals actually publishes most of those numbers.  That means that you can neither determine the pullability range of a given crystal, nor find a crystal that actually works.  They list a number of specific crystals that “should work”, but not only are those datasheets no more help, but nobody (Digikey, Newark, Mouser, or anybody else I can buy from) sells them.  Hard stop.

Well, as I was looking through the listed vendors hoping to get my hands on at least some samples, I tried to focus on US-based companies so I could actually communicate with them (all the others are in China etc).  Turns out Pletronics is based in Lynnwood, Washington, just a couple hundred miles north of here.  I sent them an email asking if they could point me to a crystal that’s supposed to work with and be pullable by the CDCE9xx series, and got back an answer that surprised me, and made my week.  They happen to sell a part that’s a CDCE9xx and matching crystal in one package!

The FD77T is the biggest of that line, based on the CDCE949.  The package is all of 5x7mm, which is ever so slightly larger than the CDCE913 alone (5×6.4mm), and noticeably smaller than the CDCE949 (7.8×6.4mm).  Compared to the CDCE949, crystal, and related parts, it’s radically smaller and easier to deal with.  It takes VCCIO, VCCINT, I2C, and VCXO control and spits out 7 PLL outputs, end of story.  The smaller versions (FD7[345]T) have fewer PLLs and outputs, but in the exact same package.  Pletronics seems to stock the FD77T for at least sample quantities with 24MHz, 24.576MHz, and 25MHz crystals.

The kicker is that the 1,000 unit pricing I was quoted was in the $2.50 range.  That’s cheaper than either the CDCE913 or the high-spec crystal separately!

In order to do testing, I made up a tiny adapter board for the FD77T that brings out all the pins, supplies the 1.8V VCCINT, and adds an extra set of top-side headers for the I2C programming interface:

In order to program the actual chip, you have to set up a rather complex sequence of registers.  TI provides a program called ClockPro that does it for you, but it seems to be written in MatLab and is ludicrously slow.  To top it off, it’s Windoze only and doesn’t provide any easy way of getting the register values out in a form that can be programmed in.  I’ve had to resort to literally typing in hex from the binary shown in the bit viewer, which is very error-prone and not much fun at all.  So last night I constructed a Python script that does the core PLL calculation, and this morning added code to use the BusPirate to set the clock.  So far it works like a charm!

The code uses an iterative method, but one I’ve already thought through as far as implementation in a microcontroller.  It works back from the target clock, first finding a Pdiv that results in a Fvco closest to the nominal 135MHz.  It then tries all valid Pdiv’s above and below that still fit within the 80-230MHz range.  For each Pdiv/Fvco combination, it tries all the N multipliers from 1 through the max 4095, checking if the multiplied frequency modulo the Fvco results in no remainder, producing a viable M.  From there it calculates the intermediate values required to shove into the PLL registers, and outputs an I2C string to the BusPirate to set the clock.
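For the curious, the search described above can be sketched in C along these lines.  This is only a sketch and not the actual Python script: the names (pll_cfg_t, pll_search) are made up for illustration, the limits are simply the ones quoted in this post (Fvco of 80-230MHz with a 135MHz nominal, N up to 4095, a 9-bit M, a 7-bit Pdiv), and it only accepts exact integer solutions.  The step that turns N and M into the intermediate register values is left out entirely.

```c
#include <stdint.h>

/* Hypothetical result type; the names are illustrative only. */
typedef struct {
    uint32_t fvco;  /* VCO frequency in Hz */
    uint16_t n;     /* multiplier, 1..4095 */
    uint16_t m;     /* divider, 1..511 (9 bits) */
    uint8_t  pdiv;  /* output divider, 1..127 (7 bits) */
} pll_cfg_t;

#define FVCO_MIN  80000000UL
#define FVCO_MAX 230000000UL
#define FVCO_NOM 135000000UL

/* Returns 1 and fills *best with the exact solution whose Fvco is
   closest to nominal, or returns 0 if no exact N/M pair exists. */
static int pll_search(uint32_t fin, uint32_t fout, pll_cfg_t *best)
{
    uint32_t best_dist = UINT32_MAX;
    int found = 0;

    if (fin == 0 || fout == 0) return 0;

    for (uint32_t pdiv = 1; pdiv <= 127; pdiv++) {
        uint64_t fvco = (uint64_t)fout * pdiv;
        if (fvco < FVCO_MIN) continue;
        if (fvco > FVCO_MAX) break;

        uint32_t dist = (fvco > FVCO_NOM) ? (uint32_t)(fvco - FVCO_NOM)
                                          : (uint32_t)(FVCO_NOM - fvco);
        if (dist >= best_dist) continue;  /* already have a closer Fvco */

        for (uint32_t n = 1; n <= 4095; n++) {
            uint64_t prod = (uint64_t)fin * n;
            if (prod % fvco != 0) continue;  /* no integer M for this N */
            uint64_t m = prod / fvco;
            if (m > 511) break;              /* M only grows with N */
            best->fvco = (uint32_t)fvco;
            best->n    = (uint16_t)n;
            best->m    = (uint16_t)m;
            best->pdiv = (uint8_t)pdiv;
            best_dist  = dist;
            found = 1;
            break;                           /* smallest N is good enough */
        }
    }
    return found;
}
```

With a 24.576MHz crystal and a 12.288MHz target this lands on Pdiv = 11, giving an Fvco of 135.168MHz from N/M = 11/2.  Unlike the description above it walks Pdiv upward and keeps the closest hit rather than starting at the nominal and working outward, which is simpler to write but does the same job.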

As useful as the above is, setting the clock speed can sometimes be something you want to do at run-time.  Or more likely, you just don’t want to have to go through all that trouble when prototyping requires you to change the clock.  To solve that problem, I also designed a board that includes a microcontroller:

Populating this board is lower priority, but it’ll happen sometime soon.  The Xmega32A4 on the right is intended to run a microcontroller version of the above algorithm, with some tweaks in place for performance.  Mostly that means trying to fit the calculation into common bit widths, and dealing with the cases where it can’t gracefully and efficiently.  The goal is to have a register set on the MCU accessible via the various available protocols (the upper row of the board is a straight copy of a “serial” port on the Xmega, providing hardware async serial, I2C, and SPI) that lets you simply say “I want xx.yyyMHz on pin X, go!” and get a clock out.  Things get a little bit more complicated when you ask it to deliver two different clocks on pins that share a PLL, but that’s only a matter of finding a common Fvco with which two Pdiv’s generate the requested clocks.
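The shared-PLL case can be sketched the same way.  Any Fvco that works for both outputs has to be a whole multiple of both clocks, so it’s enough to walk the multiples of their least common multiple that fall inside the 80-230MHz window.  As before this is a rough sketch under my own assumptions: shared_fvco and gcd32 are hypothetical names, and it only checks that both Pdiv’s fit in 7 bits, not whether the resulting Fvco can actually be reached from the crystal.

```c
#include <stdint.h>

static uint32_t gcd32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns the common Fvco closest to 135MHz and fills in both Pdiv
   values, or returns 0 if no multiple fits the limits. */
static uint32_t shared_fvco(uint32_t f1, uint32_t f2,
                            uint8_t *pdiv1, uint8_t *pdiv2)
{
    const uint64_t fmin = 80000000, fmax = 230000000, fnom = 135000000;
    uint64_t best = 0, best_dist = UINT64_MAX;

    if (f1 == 0 || f2 == 0) return 0;

    uint64_t lcm = (uint64_t)(f1 / gcd32(f1, f2)) * f2;
    /* first multiple of the LCM at or above the minimum Fvco */
    uint64_t fvco = ((fmin + lcm - 1) / lcm) * lcm;

    for (; fvco <= fmax; fvco += lcm) {
        if (fvco / f1 > 127 || fvco / f2 > 127) break;  /* Pdiv is 7 bits */
        uint64_t dist = (fvco > fnom) ? fvco - fnom : fnom - fvco;
        if (dist < best_dist) {
            best = fvco;
            best_dist = dist;
        }
    }
    if (best == 0) return 0;

    *pdiv1 = (uint8_t)(best / f1);
    *pdiv2 = (uint8_t)(best / f2);
    return (uint32_t)best;
}
```

Asking for 12.288MHz and 8.192MHz on the same PLL gives an Fvco of 122.88MHz with Pdiv’s of 10 and 15.  Asking for 24.576MHz and 25MHz fails, since their common multiple is far outside the VCO range.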

A more interesting feature of the code destined for the MCU is the ability to lock the crystal to an external pulse.  To do so you would designate one of the interface pins as a trigger, and configure it for a particular mode.  To start off, the MCU will switch its clock from the internal oscillator to one of the PLL’s outputs, thus creating a cycle counter based on the crystal.  When these pulses are detected on the pin, the MCU will compute the difference between the actual counter and the desired counter, and run a PID to generate the VCXO control voltage, thereby dragging the crystal back into alignment with the external source.
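Since that code doesn’t exist yet, here’s roughly what the control loop might look like.  Everything in it is a placeholder: the struct and function names, the gains, the integral limit, and the 12-bit control output are all assumptions, and it assumes that a higher control value speeds the crystal up.  The error fed in is taken as desired count minus actual count, so a positive error means the crystal is running slow.

```c
#include <stdint.h>

/* Hypothetical PID state; nothing here is from real firmware. */
typedef struct {
    int32_t kp, ki, kd;  /* gains in 1/256 units (placeholder values) */
    int32_t integral;    /* accumulated error, clamped */
    int32_t prev_err;    /* error seen on the previous pulse */
} vcxo_pid_t;

#define CTRL_MID       2048    /* assumed 12-bit control output, mid-scale */
#define CTRL_MAX       4095
#define INTEGRAL_LIMIT 100000  /* arbitrary anti-windup clamp */

/* err = desired count - actual count, in clock cycles. */
static uint16_t vcxo_pid_step(vcxo_pid_t *pid, int32_t err)
{
    int64_t acc = (int64_t)pid->integral + err;
    if (acc >  INTEGRAL_LIMIT) acc =  INTEGRAL_LIMIT;
    if (acc < -INTEGRAL_LIMIT) acc = -INTEGRAL_LIMIT;
    pid->integral = (int32_t)acc;

    int64_t deriv = (int64_t)err - pid->prev_err;
    pid->prev_err = err;

    int64_t sum = (int64_t)pid->kp * err
                + (int64_t)pid->ki * pid->integral
                + (int64_t)pid->kd * deriv;

    int64_t out = CTRL_MID + sum / 256;
    if (out < 0) out = 0;
    if (out > CTRL_MAX) out = CTRL_MAX;
    return (uint16_t)out;
}
```

The 64-bit intermediates are there to keep the sketch obviously overflow-free; a real Xmega version would want to squeeze this into 32 bits.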

The trigger modes would be pulse-before-count, pulse-after-count, and pulse-per-X.   In the first, the controlling circuit is expected to provide a trigger pulse, and then set a register with the cycle count that it was supposed to be located at.  In the second, you would set the register before the trigger pulse.  In the last, you set the number of cycles that are supposed to occur between pulses that come at regular intervals.  This mode is ideal for interfacing with the 1PPS signal coming off a GPS receiver.  Run the MCU off Y1 at the base 24.576MHz frequency, set the register to 24,576,000, connect 1PPS to the right pin, and suddenly you have a locked atomic frequency.
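For the pulse-per-X mode the error calculation is about as simple as it gets.  A minimal sketch, assuming a free-running 32-bit cycle counter that gets captured on each pulse (the function name is hypothetical):

```c
#include <stdint.h>

/* Cycle error for pulse-per-X mode.  prev_count and now_count are the
   counter values captured on two consecutive pulses, expected is the
   register value (e.g. 24576000 for 1PPS at 24.576MHz).
   Positive result: too few cycles were counted, so the crystal is slow. */
static int32_t pulse_per_x_error(uint32_t prev_count, uint32_t now_count,
                                 uint32_t expected)
{
    /* unsigned subtraction stays correct when the counter wraps */
    uint32_t elapsed = now_count - prev_count;
    return (int32_t)(expected - elapsed);
}
```

Because the subtraction is done unsigned, a counter rollover between two pulses doesn’t need any special handling, as long as the interval is shorter than one full wrap of the counter.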

I intend to sell both of these modules on my (still upcoming) webshop.  The bare module should probably run about $10, while the Xmega-based module will likely run around  $20-$25.  As usual, if you need one of these and can’t wait for the webshop, let me know and I’ll see what I can do.  Just remember the code for the MCU doesn’t exist just yet.


Desktop disaster of the day

2010/08/27

Here’s a shot of my desk from yesterday:



The red boards on the right are SparkFun FT232RL adapters.  Bottom right is a Teensy++.  The two long boards are my ATxmega256A3 adapter boards.  The red and blue square boards are SparkFun nrf24l01+ radio boards.  Hanging mid-air is my ATxmega32A4 + nrf24l01+ board, with its debug lines soldered to jumpers that hold it midair.  Below the protoboard is a nrf24l01+ board designed to mount to a Teensy, for USB bridging.  Just above the USB key on the left is an edge view of SparkFun’s nrf24LU1 board that’s intended to supersede the Teensy-based unit. The little round board at the bottom is a compass-based servo board for a borehole geophone.  To the left above the red Sharpie is the latest version of my ACAM GP2 development board, used for ~50ps resolution time-of-flight measurements.  Its predecessor is the square board just above the protoboard.  On the extreme left edge you can see the round 2-board stackup of the 3rd revision of my main project, and the little Bluetooth debug adapter just below that.  Misc cables, programmers, hubs, tools, etc. are scattered everywhere else.

And that’s just that part of my desk.  The next 3 or 4 feet left contains my soldering environment, with iron and toaster, hand tools, and piles of parts everywhere…..

Xmega fractional baud-rate source code

2010/08/18

Earlier I posted a spreadsheet I created that calculated the BSEL and BSCALE for the Xmega’s fractional baud-rate generator.  This works well to determine what the potential is for getting your chip to run a viable baud-rate for a given clock, but isn’t so useful when you actually want to write a configurable piece of code.

Since then I’ve developed two methods for generating the appropriate register settings for a given baud rate.  The first method was designed around the original constraints I had, which were that the CPU frequency and baud-rate were set statically in the source code, and never changed or dealt with programmatically.  As such, it’s a set of macros that determine the best available BSEL and BSCALE:

#ifndef __XMEGA_BAUD_H__
#define __XMEGA_BAUD_H__

#define _BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,bscale) (                \
((bscale) < 0) ?                                                      \
  (int)((((float)(f_cpu)/(8*(float)(baud)))-1)*(1<<-(bscale)))        \
: (int)((float)(f_cpu)/((1<<(bscale))*8*(float)(baud)))-1 )

#define _BSCALE(f_cpu,baud) (                                         \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-7) < 4096) ? -7 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-6) < 4096) ? -6 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-5) < 4096) ? -5 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-4) < 4096) ? -4 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-3) < 4096) ? -3 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-2) < 4096) ? -2 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,-1) < 4096) ? -1 :              \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,0) < 4096) ? 0 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,1) < 4096) ? 1 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,2) < 4096) ? 2 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,3) < 4096) ? 3 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,4) < 4096) ? 4 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,5) < 4096) ? 5 :                \
(_BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,6) < 4096) ? 6 :                \
7 )

#define BSEL(f_cpu,baud)                                              \
  _BAUD_BSEL_FROM_BAUDSCALE(f_cpu,baud,_BSCALE(f_cpu,baud))

#define BSCALE(f_cpu,baud) ((_BSCALE(f_cpu,baud)<0) ? (16+_BSCALE(f_cpu,baud)) : _BSCALE(f_cpu,baud))

#endif /* __XMEGA_BAUD_H__ */

(beware the line continuations!) Basically, the BSCALE macro steps through the +-7 range hunting for the highest legal (12-bit) BSEL value, and the BSEL macro uses that to generate the right divider.  A typical usage would be something like this:

#define F_CPU 32000000
#define BAUDRATE 115200

USARTC0.BAUDCTRLA = BSEL(F_CPU,BAUDRATE) & 0xff;
USARTC0.BAUDCTRLB = (BSCALE(F_CPU,BAUDRATE) << USART_BSCALE0_bp) | (BSEL(F_CPU,BAUDRATE) >> 8);

More recently I’ve been developing a more “object oriented” set of routines that allow me to stack one thing on top of another (more about that later).  As a result, I needed to develop a form of the above code that would work at runtime.  As you can see from the first macro in the above code, a naive approach would bring a microcontroller to its knees in a matter of seconds (as in: it could take entire seconds to calculate…).  In order to solve this problem I took a look at the problem from a different perspective, and ended up with the following code:

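#define F_CPU 32000000UL  /* UL keeps F_CPU*128 in unsigned 32-bit math */

uint8_t xmega_usart_setspeed (USART_t *usart, uint32_t baud) {
  uint32_t div1k;      /* BSEL as fixed point with 10 fractional bits */
  uint8_t bscale = 0;  /* counts steps toward BSCALE = -7 */
  uint16_t bsel;

  if (baud == 0 || baud > (F_CPU/16)) return 0;

  /* 1024 * (F_CPU/(8*baud) - 1): the BSEL for BSCALE = 0, times 1024 */
  div1k = ((F_CPU*128) / baud) - 1024;
  /* each doubling is one more negative BSCALE step; stop before BSEL
     would need more than 12 bits (2096640 = 2047.5 * 1024) */
  while ((div1k < 2096640) && (bscale < 7)) {
    bscale++;
    div1k <<= 1;
  }

  bsel = div1k >> 10;  /* drop the fractional bits */

  usart->BAUDCTRLA = bsel&0xff;
  /* BSCALE is a 4-bit two's complement field in the high nibble */
  usart->BAUDCTRLB = (bsel>>8) | (((16-bscale) & 0x0f) << 4);

  return 1;
}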

The above code will result in the best available baud rate, calculated with 0.1% precision (but does not guarantee 0.1% baud-rate accuracy), using only a single 32-bit divide.  My current headache prevents me from properly explaining how it works, but the clever reader should be able to puzzle it out pretty quickly.  I’ll try to replace this excuse with an actual explanation at some point in the future.  If I haven’t yet, write a comment reminding me….

If you’re running on a system with a variable system clock (e.g. stepping the clock up for a burst of performance and back down for a long sleep), you could easily modify the function to take the F_CPU as a function parameter rather than a #define.  Replacing the (F_CPU/16) with (F_CPU>>4) and (F_CPU*128) with (F_CPU<<7) might be necessary to hint the compiler, but everything else should work the same.  You could then precalculate and store the BAUDCTRL values for each clock speed, and swap them in as needed, or if your clock is more variable than that, just run the calculation each time.
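Here’s a sketch of that variation, split so that the calculation is a pure function you can run (and test) on a PC before it ever touches a USART.  The baudctrl_t type and the xmega_baud_calc name are my own inventions for this example.  Two caveats worth stating: like the original, it only searches BSCALE from 0 down to -7, so it simply refuses baud rates low enough to need a positive BSCALE, and the factor of 8 it inherits corresponds, as far as I can tell from the Xmega manual, to the double-speed (CLK2X) form of the baud formula.  Treat that last part as an assumption about how the USART is configured.

```c
#include <stdint.h>

/* Hypothetical holder for the two register values. */
typedef struct {
    uint8_t ctrla;  /* value for BAUDCTRLA */
    uint8_t ctrlb;  /* value for BAUDCTRLB */
} baudctrl_t;

/* Returns 1 and fills *out on success, 0 if the rate can't be done. */
static int xmega_baud_calc(uint32_t f_cpu, uint32_t baud, baudctrl_t *out)
{
    uint32_t div1k;      /* BSEL with 10 fractional bits */
    uint32_t bsel;
    uint8_t bscale = 0;  /* counts steps toward BSCALE = -7 */

    if (baud == 0 || baud > (f_cpu >> 4)) return 0;
    if (f_cpu > (UINT32_MAX >> 7)) return 0;  /* f_cpu << 7 would overflow */

    div1k = ((f_cpu << 7) / baud) - 1024;
    while ((div1k < 2096640UL) && (bscale < 7)) {
        bscale++;
        div1k <<= 1;
    }

    bsel = div1k >> 10;
    if (bsel > 4095) return 0;  /* would need a positive BSCALE */

    out->ctrla = (uint8_t)(bsel & 0xff);
    out->ctrlb = (uint8_t)((bsel >> 8) | (((16 - bscale) & 0x0f) << 4));
    return 1;
}
```

For 32MHz and 115200 baud this produces BSEL = 2158 with BSCALE = -6, i.e. 0x6E and 0xA8 for the two registers.  Precalculating a baudctrl_t per clock speed and writing both bytes on a clock change is then trivial.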

I haven’t profiled the runtime of the code yet, but I suspect it’s well under 1000 cycles, dominated by the 32-bit divide.


Reduced-pin-count program+debug connector for Xmega

2010/07/25

I’ve been struggling recently with the extremely small JST SH connectors, which are 1.0mm crimp-style.  No matter what wire I use, I can’t get them to stop breaking, and they’re a royal pain to put together in the first place.  This is a problem, because it’s one of the only connectors I can find that’s small enough to fit on my current project’s board with 6 pins.  I need those pins for power, ground, the two Xmega PDI lines, and a debug serial port.  Currently I connect either a standard AVR-ISP mkII or a Bluetooth serial adapter to the boards via this tiny connector, and all that wear and tear on the connector and wires is becoming quite evident.

I’m about to ditch Bluetooth for quite a few reasons, but that’s a whole other story.  Suffice it to say: the replacement will be an Xmega-based device, and that’s where this nifty little trick comes from.

I did some hunting and found that there actually is a connector that’s even smaller than the SH-6: USB micro-B.  It’s a hair narrower, and about 20% shallower, resulting in that much more board space to work my routing magic.  The only problem: USB only has 5 pins….

My first (or near enough) thought was “use the shield”.  Since I’d be constructing the cable from scratch, I can solder a 6th wire to the shield (which is not otherwise connected to anything but the cable’s braid) and use that as ground.  Cheesy, but functional.

Then I thought about how the PDI protocol is actually implemented.  I’d rewritten most of the PDI stack based on the example code in LUFA recently, and it turns out that PDI is basically a bidirectional synchronous serial protocol.  That means that RX and TX are both connected via 220R resistors to the PDI_DATA line, and XCK drives PDI_CLK.  This presents the possibility that the PDI and serial lines could be shared.

As it turns out, this works very nicely.  On my breadboard I hooked up an Xmega target and Xmega programmer, with a crazy serial loopback chain that routed from an FT232R through the programmer to the target, and out another FT232R.  The PDI and serial ports from the target are routed to the middle, where the resistors create PDI_DATA from RX and TX.  PDI_CLK, RX, and TX hop over to the three pins on the programmer.  The software running on the programmer is a two-port serial loop-through, with the exception of a “p” coming from the test computer.  In that case, it shuts down the normal serial port, starts up the PDI interface, confirms the chip’s signature, then switches back.

The end result is a 5-pin interconnect between the two chips with both full hardware programming and serial debugging capabilities.  As such, we can now route it through a USB micro-B connector ;-)

(click for larger, readable version)

Now, this won’t work with a “discrete” PDI programmer, since you only get the combined Rx+Tx line out of the programming header, and this trick depends on combining them on the device.  Thus, you pretty much have to have a “custom” unit doing the programming and serial bridging.  However, that’s exactly where I’m headed after Bluetooth bites the dust…


Getting started with Xmega: differences from ATmega (part 2)

2010/06/27

In Part 1 I explained some of the high-level differences between the older ATmega and the newer Xmega chips.  This includes things like pinout cleanliness, enhanced peripheral count, and much less arbitrary overlap between functions.  In Part 2, I’ll be delving deeper into the architectural changes that result from this design, and how they make writing software for the Xmega much more manageable.

The issue at hand now is not where the peripherals are placed physically on the chip, but how they’re interacted with by software, logically.  As with any other microcontroller, this is done via registers.  These are specific locations in memory (or sometimes a “third” address space, besides memory and code) that when read from or written to will cause some behavior within the peripheral that the register is associated with.  For instance, writing to a USART data register will typically push the written byte into a temporary buffer and start transmitting that byte over the serial port.  Reading from the same register will pull from a different temporary buffer and retrieve the byte that was most recently received.  Other registers contain flags, such as the Transmit Enable flag in one of the USART’s control registers.

To start off with, we’ll again go back to the venerable ATmega*8 as used in the Arduino.  Let’s list all the registers that have anything to do with any of the Port D pins, and what their register address is:

  • 0xC6 UDR0
  • 0xC5 UBRR0H
  • 0xC4 UBRR0L
  • 0xC2 UCSR0C
  • 0xC1 UCSR0B
  • 0xC0 UCSR0A
  • 0xB4 OCR2B
  • 0xB3 OCR2A
  • 0xB2 TCNT2
  • 0xB1 TCCR2B
  • 0xB0 TCCR2A
  • 0x7F DIDR1
  • 0x7B ADCSRB
  • 0x70 TIMSK2
  • 0x6E TIMSK0
  • 0x6D PCMSK2
  • 0x69 EICRA
  • 0x50 ACSR
  • 0x48 OCR0B
  • 0x47 OCR0A
  • 0x46 TCNT0
  • 0x45 TCCR0B
  • 0x44 TCCR0A
  • 0x3D EIMSK
  • 0x3C EIFR
  • 0x3B PCIFR
  • 0x37 TIFR2
  • 0x35 TIFR0
  • 0x2B PORTD
  • 0x2A DDRD
  • 0x29 PIND

That’s a lot of registers!  Now while I’m not going to claim that the Xmega uses particularly fewer registers than the ATmega, I challenge you to tell me quickly what every one of those registers does…  In comparison, the registers needed for Port D on an Xmega:

  • PORTCFG.
    • MPCMASK
    • CLKEVOUT
  • PORTD.
    • DIR,DIRSET,DIRCLR,DIRTGL
    • OUT,OUTSET,OUTCLR,OUTTGL
    • IN
    • INTCTRL
    • INT0MASK,INT1MASK
    • INTFLAGS
    • PIN0CTRL,PIN1CTRL,PIN2CTRL,PIN3CTRL,PIN4CTRL,PIN5CTRL,PIN6CTRL,PIN7CTRL
  • TCD0.
    • CTRLA,CTRLB,CTRLC,CTRLD,CTRLE
    • INTCTRLA
    • INTCTRLB
    • CTRLFCLR,CTRLFSET
    • CTRLGCLR,CTRLGSET
    • INTFLAGS
    • TEMP
    • CNTH,CNTL
    • PERH,PERL
    • CC0H,CC0L,CC1H,CC1L,CC2H,CC2L,CC3H,CC3L
    • PERBUFH,PERBUFL
    • CC0BUFH,CC0BUFL,CC1BUFH,CC1BUFL,CC2BUFH,CC2BUFL,CC3BUFH,CC3BUFL
  • TCD1.*
  • HIRESD.CTRLA
  • USARTD0.
    • CTRLA,CTRLB,CTRLC
    • DATA
    • STATUS
    • BAUDCTRLA,BAUDCTRLB
  • USARTD1.*
  • TWID.
    • CTRL
    • MASTER.
      • CTRLA,CTRLB,CTRLC
      • STATUS
      • BAUD
      • ADDR
      • DATA
    • SLAVE.
      • CTRLA,CTRLB
      • STATUS
      • ADDR
      • DATA
      • ADDRMASK
  • SPID.
    • CTRL
    • INTCTRL
    • STATUS
    • DATA

Now this is somewhat more comprehensible.  Yes, there are a metric ton more registers, but they all represent significantly enhanced capabilities.  More importantly, they’re all grouped very clearly by module.  If you want to use the first USART on Port D, you start by setting USARTD0.CTRLA, and work from there, rather than trying to remember UCSR0A.  Good luck remembering which UCSR* goes with which port on a bigger chip like the ATmega128…

You’ll notice that both TCD1 and USARTD1 aren’t enumerated, but just listed with a *.  That’s because they have the exact same registers as their D0 counterparts (except that TCx1 drop the 2nd and 3rd compare registers).  Compared to the ATmega, that’s a major bonus: all the peripherals are the same, both between multiple instances in the same chip and between chips in the series.

Delving even deeper, let’s look at how the SPI port is described first in the ATmega8 header file:

/* SPI */
#define SPCR    _SFR_IO8(0x0D)
#define SPSR    _SFR_IO8(0x0E)
#define SPDR    _SFR_IO8(0x0F)

…and now how it’s defined in the Xmega headers:

/* Serial Peripheral Interface */
typedef struct SPI_struct
{
    register8_t CTRL;  /* Control Register */
    register8_t INTCTRL;  /* Interrupt Control Register */
    register8_t STATUS;  /* Status Register */
    register8_t DATA;  /* Data Register */
} SPI_t;
#define SPIC    (*(SPI_t *) 0x08C0)  /* Serial Peripheral Interface C */
#define SPID    (*(SPI_t *) 0x09C0)  /* Serial Peripheral Interface D */

In the Xmega, every peripheral is given a block of register space, and all the individual registers are allocated within that block.  The SPID register block looks exactly like the SPIC, and SPIE, and SPIF register blocks, except for the starting address.  Thus, the only difference between the ATxmega*A4 and ATxmega*A3 is the following:

#define SPIE    (*(SPI_t *) 0x0AC0)  /* Serial Peripheral Interface E */
#define SPIF    (*(SPI_t *) 0x0BC0)  /* Serial Peripheral Interface F */

A major side effect of all these structures is that you can now easily construct functions and other code structures that can operate on one of these peripherals purely by address:

void spi_init(SPI_t *port) {
    port->CTRL = 0xd0;
    port->STATUS = 0x80;
}
spi_init(&SPID);

With simple dot notation, you can significantly simplify your hardware configuration:

#define LED_PORT     PORTC
#define LED_RED_bp   0
#define LED_GREEN_bp 1
#define LED_BLUE_bp  2

LED_PORT.DIRSET = _BV(LED_RED_bp) | _BV(LED_GREEN_bp) | _BV(LED_BLUE_bp);
LED_PORT.OUTCLR = _BV(LED_RED_bp) | _BV(LED_GREEN_bp) | _BV(LED_BLUE_bp);
// . . .
LED_PORT.OUTSET = _BV(LED_BLUE_bp);

In comparison, the same for the ATmega8 would be:

#define LED_PORT    PORTC
#define LED_DIR     DDRC
// . . .

If you are using peripherals more complex than just an output port, you can imagine how having to #define all the various registers to keep track of which of potentially several similar peripherals (e.g. USARTs) is used by that logical device would get rather obnoxious.

This particular feature has saved me uncounted hours on the product I’m developing, by allowing me to keep the same codebase across multiple revisions of the hardware.  “hardware.h” contains a switchout that loads “hardware-rev1.h” or “hardware-rev2.h” or whichever.  All these files list the same board-level peripherals (the LED, the debug and RS-485 ports, I2C for the clock PLL, etc.), and as long as I use them exclusively in my actual code, all I have to do when I switch around the hardware layout is to generate a new file and change which ports and such are referenced.
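To make that a bit more concrete, here’s a stripped-down sketch of the idea.  It is not my actual hardware.h: the HW_REV macro, the pin assignments, and the mock port type are all invented so the example compiles on a PC, and the per-revision definitions are inlined where the real thing would #include “hardware-rev1.h” or “hardware-rev2.h”.

```c
#include <stdint.h>

/* Stand-in for the avr-libc PORT_t so this builds off-target
   (an assumption for the example; the real struct has more fields). */
typedef struct {
    volatile uint8_t DIRSET;
    volatile uint8_t OUTSET;
    volatile uint8_t OUTCLR;
} mock_port_t;

static mock_port_t MOCK_PORTC, MOCK_PORTD;

#ifndef HW_REV
#define HW_REV 2  /* hypothetical board revision selector */
#endif

/* "hardware.h": pick the pin map for this board revision */
#if HW_REV == 1
  /* contents of a hypothetical hardware-rev1.h */
  #define LED_PORT   MOCK_PORTC
  #define LED_RED_bp 0
#elif HW_REV == 2
  /* contents of a hypothetical hardware-rev2.h */
  #define LED_PORT   MOCK_PORTD
  #define LED_RED_bp 3
#else
  #error "unknown HW_REV"
#endif

/* Application code only ever names the board-level peripheral. */
static void led_red_on(void)
{
    LED_PORT.DIRSET = (uint8_t)(1u << LED_RED_bp);
    LED_PORT.OUTSET = (uint8_t)(1u << LED_RED_bp);
}
```

The point is that led_red_on() never changes between revisions; only the block selected by HW_REV does.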

(Part 3: how this peripheral interchangeability can make your project design radically more flexible)


Getting started with Xmega: differences from ATmega (part 1)

2010/06/26

To start out this series on getting started with Atmel’s new Xmega chips, I first need to explain what it is that makes it an upgrade from the original AVR ATmega chips.  While there are a lot of common elements, the combination of a large number of peripherals and the mechanisms Atmel provides to connect them all together makes for a very powerful chip.  The Xmega are capable of things that an ordinary AVR can only dream about.

For reference, let’s start with the configuration of the ever-popular ATmega*8, the core of the Arduino series:

Here we have a color-coded diagram showing the pins with all their alternate functions.  There are a total of 3 ports, only two of which have all 8 bits.  Port C is missing PC7 entirely, and PC6 is generally unavailable as it is multiplexed with the RESET pin, required to reprogram the device.  Port B will be lacking PB6 and PB7 in most applications, as they are multiplexed with the crystal driver.  In addition, notice that the pins of a given port are not only scattered around in various places on the chip, but not necessarily even in order.  The ATmega*8 does better than some, and certainly light-years better than any PIC I’ve seen, but it’s still a routing challenge waiting to happen.

Even more than the pin orderings, notice the fact that there’s only one serial port, one I2C (sorry, TWI) port, and one SPI port.  Three timers give a total of 6 potential PWM outputs if you don’t need the timers for anything else, and you don’t need the SPI port that overlaps 2 of them.  Analog is spread between the 6 ADC pins on Port C (two of which are lost if you need I2C), and the comparator steals another PWM output from a different port entirely.  However a major upgrade to the ATmega*8 series versus previous generations is the addition of the PCINT* capability.  Instead of being stuck with just INT0 and INT1 for external interrupts, every single pin can be configured to trigger one of a cluster of interrupts.

Now let’s look at another popular AVR in a bigger package, the ATmega1284:


This looks a lot better, due in part to the larger package.  Not only do we get all 8 bits of every port, but they’re actually all in order.  We gain an additional serial port (TXD/RXD1) though without synchronous capability (no XCK1).  A couple more ADC pins are available, since Port A is complete and the TWI pins have moved elsewhere.  We’re still stuck with only 6 PWM’s, but only one of them is potentially unavailable, and only if the SPI module is used in slave mode (since SS# can be moved anywhere when the chip is in master mode).  The RESET# and XTAL pins have also moved to their own dedicated pins, so that’s even fewer lost pins, though with the drawback that we end up “losing” two pins if there’s no crystal attached.

Now let’s take a look at the ATxmega*A4, the smallest of the new line:

Right off the bat we notice a slight change: the package is no longer DIP, but TQFP.  This is the main drawback of the chips: they’re only available in surface-mount package.  However, I’ve rectified that by developing (and selling) adapters that convert the chips into standard DIP pinouts: (insert link here).

The next thing you should notice is a preponderance of highlighted pin functions.  Instead of 9 or 11 “major” alternate functions (serial, TWI, SPI), we have 27.  This chip has 5 serial ports, 2 TWI ports, and 2 SPI ports.  Even better, every single one is identical from a software perspective, but more on that later.  We also see a total of 12 ADC inputs, and even two DAC outputs!  Spread between ports C through E we find 16 PWM outputs, and the diagram doesn’t even bother showing the “PCINT” functionality, because every single pin of every single port is capable of various types of interrupts.  The crystal pins are available for use as a normal port (R) if you only need the internal oscillators.

A key feature is the fact that the programming pins are completely dedicated to the task.  Marked in purple above, RESET# and the CLK/DATA pins are all that are needed to program the Xmega chips (besides reference power and ground).  These pins are never multiplexed with anything else, so no more careful wiring of the SPI port so you can still flash the chip…

On the bigger end of things, we have the ATxmega*A1 chips:

Being the largest chip in the series it has 100 pins.  You should be able to click on the above image to get a larger one you might be able to read the labels on…

Working from ports A to R, this chip has: 16 inputs on 2 separate ADCs, 4 outputs on 2 separate DACs, 8 serial ports, 4 TWI ports, 4 SPI ports, 24 PWM outputs, and a memory interface capable of both SRAM and SDRAM up to 16MB.  A “timer” crystal connection is available on the extra 4 pins at the top just in case.

The pin arrangement is very clean, with every port in order around the chip, all contiguous, and all running in the same pin order (though the same can’t really be said of the BGA version, Atmel has been made aware of the serious flaws in pin placement there…).  There are power and ground pins for every port, capable of 200mA each.  In particular, that makes the chip capable of driving 20mA on every single pin simultaneously, a potential boon for those using discrete LEDs.

(Part 2: structural differences in how registers are managed make the plethora of peripherals more manageable)


Xmega USART fractional baud-rate spreadsheet

2010/06/16

The default baud rate on the Bluetooth adapters I’m using to program and debug the current generation of board for my main contract is 115.2K.  That’s actually rather slow when shoving 50-80KB onto the chip every time I make a code change.  The adapters are capable of up to 921.6K, but even at 32MHz a normal USART baud-rate generator ends up with a particularly ugly error percentage (8.5% as it happens, well outside the allowed 2.0%).  However, the Xmega has a fractional baud-rate generator.  I’m not actually sure how it operates, but I know it’s capable of generating much more accurate serial clocks.

Because the calculations are rather tedious, I designed a simple spreadsheet to tell you what the usable BSEL and BSCALE values are for a given rate.  Plug in your main clock rate and target baud rate, and it’ll show you the viable combinations.  For 32MHz 921.6Kbaud, the BSCALE has to be set to -2 or lower, with -7 providing a combination that’s only off 0.1% of nominal.
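If you’d rather check a combination in code than in a spreadsheet, the arithmetic is small enough to sketch in C.  This follows the baud-rate formulas as I understand them from the Xmega manual (for negative BSCALE the BSEL value is scaled down by 2^|BSCALE| before the +1), so treat the formula itself as my reading rather than gospel.  The function name and the samples parameter (16 for normal mode, 8 for double-speed) are my own.

```c
#include <stdint.h>

/* Baud rate actually produced by a BSEL/BSCALE pair.
   samples: 16 for normal mode, 8 for double-speed (CLK2X).
   Returns 0 for arguments outside the register ranges. */
static uint32_t xmega_actual_baud(uint32_t f_per, uint16_t bsel,
                                  int8_t bscale, uint8_t samples)
{
    if (samples == 0 || bsel > 4095 || bscale < -7 || bscale > 7) return 0;

    if (bscale >= 0) {
        /* f / (2^bscale * samples * (bsel + 1)) */
        uint64_t div = ((uint64_t)samples * (bsel + 1u)) << bscale;
        return (uint32_t)(f_per / div);
    } else {
        /* f / (samples * (bsel / 2^s + 1)), rearranged to stay in
           integers: f * 2^s / (samples * (bsel + 2^s)) */
        uint32_t scale = 1u << (-bscale);
        uint64_t num = (uint64_t)f_per * scale;
        uint64_t div = (uint64_t)samples * (bsel + scale);
        return (uint32_t)(num / div);
    }
}
```

At 32MHz in normal mode, the plain divider’s best shot at 921.6K is BSEL = 1 with BSCALE = 0, which comes out to exactly 1,000,000 baud (the 8.5% error mentioned above).  BSEL = 150 with BSCALE = -7 comes out to 920,863 baud, or about 0.08% low.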

http://www.omegacs.net/products/atxmega/USART-fractional.ods
