Thursday, October 06, 2016

Fixed Point Math with AVR-GCC

Wow, I see that it has been a long time since my last post. Sorry about that. I've been very busy. I have lots to talk about. I'd like to write about reading encoders, and I'd like to write about communicating with EEPROM chips that use two-wire protocols (I2C-like) as opposed to SPI-like protocols. But in the meantime I hope this short post will be useful to someone.

Embedded C

I recently had reason to do some non-integer math on a small microcontroller, a member of the Atmel ATtiny series. Floating point math on this chip is pretty much out of the question; there is no floating-point hardware. I think some of the chips in this family are big enough to hold floating-point library functions, but they will certainly eat up an enormous amount of the available program space, and given that they are eight-bit microcontrollers in most ways -- the registers are 8 bits wide -- it is probably best to just avoid floating point.

So I began looking into fixed-point math. It is always possible to roll your own code for this kind of thing, but I thought I would see if I could take advantage of existing, debugged library code first. I found some free software libraries online, but because I develop code that runs in commercial products, I was not really happy with their license terms. It also was not very clear how to use them or whether they would fit on the ATtiny chips.

I discovered that there is, in fact, a standard for fixed-point types in C. It has not been widely adopted, and like the C standard itself it is a little loose in parts, in that it doesn't dictate the numeric limits of types, but rather specifies a range of acceptable sizes. And it turns out that my toolchain supports this standard, at least in part.

I won't try to describe everything covered in the Embedded C document. I'll spare you my struggle trying to find adequate documentation for it or determine how to do certain things in an implementation that doesn't implement everything in the Embedded C document.

Instead I will try to do something more modest, and just explain how I managed to use a couple of fixed-point types to solve my specific problems.

You can find more information on the Embedded C standard here: https://en.wikipedia.org/wiki/Embedded_C

The actual Embedded C standards document in PDF form can be found here (note: this is a link to a PDF file): http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1169.pdf.

At the time of this writing, this seems to be the latest version available, dated April 4, 2006. The document indicates a copyright, but unlike the C and C++ standards, it looks like you can download it for no cost, at least at present.

avr-gcc

The compiler I'm using is avr-gcc. My actual toolchain for this project is Atmel Studio version 7.0.1006. Atmel Studio is available for download at no cost. The avr-gcc toolchain that Atmel Studio uses under the hood is available in other toolchains and as source code. I'm not going to try to document all the ways you can get it, but you can find out more here: https://gcc.gnu.org/wiki/avr-gcc.

As I understand it, these Embedded C extensions are not generally available across other GCC targets.

The Basics of Fixed Point Types in Embedded C

I'm assuming I don't have to go into too much detail about what fixed-point math is. To put it briefly, fixed-point types are like signed or unsigned integral types except there is an implicit binary point (not a decimal point, a binary point). To the left of that binary point, the bits indicate ascending powers of two as usual: 1, 2, 4, 8, etc. To the right of that binary point, the bits indicate negative powers of two: 1/2, 1/4, 1/8, and so on.

The Embedded C extensions for fixed-point math came about, I believe, at least originally because many microcontrollers and digital signal processors have hardware support for fixed-point math. I've used DSPs from Motorola and Texas Instruments that offered accumulators for fixed-point math in special wide sizes, such as 56 bits, and also offered saturation arithmetic. Using these registers from C required special vendor-specific compiler support. If they were supported instead using Embedded C types, programmers would have a better shot at writing portable code.

There are a couple of basic approaches to these types mentioned in the standard. There are fractional types, indicated with the keyword _Fract, with values between -1.0 and 1.0, and types that have an integral part and a fractional part, indicated with the keyword _Accum. It is expected that implementations will give these aliases, like fract and accum (avr-gcc does this in stdfix.h), but I think the authors did not want to introduce potential name clashes with existing code.

The standard specifies the minimal formats for a number of types. For example, unsigned long accum provides a minimum of 4 integer bits and 23 fractional bits. In our implementation, unsigned long accum actually provides 32 integral bits and 32 fractional bits. It maps to an underlying integer type that can hold the same number of bits. On this platform, that underlying type is unsigned long long, which is 64 bits wide.

Accumulator Types

For my algorithms, I don't have much interest in the _Fract types and I'm going to use only the _Accum types. I would have more interest in _Fract types if there were standard ways available to multiply them together with _Accum types. In that case I could use a _Fract type as a scale factor to apply to a large-ish integer in _Accum representation. For example, let's say I want to generate an unsigned binary value to send to a DAC that accepts 18-bit values. I could create a value of an _Accum type that represents the largest 18-bit value, and scale this by a _Fract value indicating a fraction to apply.

The advantage of this approach would be, I thought, that I would use types that were only as wide as I needed, resulting in less code. However, since this does not seem easy or convenient to do, in my own code I am using only _Accum types at present.
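To make the scale-factor idea concrete, here is a rough sketch using plain integers rather than the Embedded C types, so it can be tried on a desktop compiler. The function name scale_by_fraction and the choice of 32 fractional bits are my own assumptions for illustration; nothing here comes from stdfix.h.

```c
#include <stdint.h>

/* Hypothetical helper, not part of stdfix.h: scale an integer by a pure
   fraction stored as 32 fractional bits (0x80000000 means 0.5).
   The fraction can only represent values in [0, 1), so the result is
   never larger than the input. The fractional part of the product is
   truncated, not rounded. */
uint32_t scale_by_fraction( uint32_t value, uint32_t fraction_0_32 )
{
    /* A 32-bit by 32-bit multiply needs a 64-bit intermediate */
    uint64_t product = ( uint64_t )value * fraction_0_32;

    /* Discard the 32 fractional bits of the product */
    return ( uint32_t )( product >> 32 );
}
```

For example, scaling the largest 18-bit DAC value, 0x3FFFF, by one half (0x80000000) gives 0x1FFFF, because the exact answer of 131071.5 is truncated.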

And, in fact, I'm using only unsigned _Accum types, specifically two of them. The first is unsigned accum, which has a 16-bit unsigned integer part and a 16-bit fractional part (aka “16.16”). The second is unsigned long accum, which has a 32-bit unsigned integer part and a 32-bit fractional part (aka “32.32”). The underlying types used to implement unsigned accum and unsigned long accum are unsigned long (32 bits) and unsigned long long (64 bits), respectively.

Fixed Point Constants

There are new suffixes to allow specifying fixed-point constants. For example, instead of specifying 15UL (for unsigned long), one can write 15UK for an unsigned accum type, or 15ULK for an unsigned long accum type. One can specify the fractional part, like 1.5UK.

On this platform, 1.5UK assigned to a variable of unsigned accum type will produce the 16.16 bit pattern 0000 0000 0000 0001 1000 0000 0000 0000 (hex 00018000), where the most significant 16 bits represent the integer part, and the least significant 16 bits represent the fractional part.
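If you want to check a bit pattern like this by hand, the arithmetic can be sketched with ordinary integers. This is only an emulation of the 16.16 layout described above; the names Q16_16_FBIT and q16_16_from_parts are made up for this example and are not part of the Embedded C library.

```c
#include <stdint.h>

#define Q16_16_FBIT 16  /* fractional bits in the emulated 16.16 format */

/* Build a 16.16 bit pattern from an integer part plus a fraction given
   as numerator/denominator. The caller must make sure that
   denominator is not zero and that numerator < denominator. */
uint32_t q16_16_from_parts( uint16_t integer_part,
                            uint32_t numerator, uint32_t denominator )
{
    /* Scale the fraction up by 2^16, truncating anything smaller */
    uint32_t fraction = ( uint32_t )
        ( ( ( uint64_t )numerator << Q16_16_FBIT ) / denominator );

    return ( ( uint32_t )integer_part << Q16_16_FBIT ) |
           ( fraction & 0xFFFFUL );
}
```

Calling q16_16_from_parts( 1, 1, 2 ) for the value 1.5 gives 0x00018000, matching the pattern above.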

Accuracy

For our purposes we will mostly be using the integer results of fixed-point calculations. We don’t need to use the FX_FULL_PRECISION pragma; an error of up to 2 ULPs for multiplication and division operations is fine.

A Very Simple Example

Here's a small program that shows a very simple calculation using unsigned accum types. I created a simple project in Atmel Studio that targets the ATtiny 841 microcontroller, which has 512 bytes of SRAM and 8 KiB of flash memory for programs. Today I'm not using a hardware debugger or attached chip. It is possible to configure the project's "Tool" settings to use a simulator instead of a hardware debugger or programmer.

#include <avr/io.h>
#include <stdfix.h>

static volatile unsigned accum sixteen_sixteen_half = 0.5UK;
static volatile unsigned accum sixteen_sixteen_quarter = 0.25UK;
static volatile unsigned accum sixteen_sixteen_scaled;

int main(void)
{
    sixteen_sixteen_scaled = sixteen_sixteen_half * sixteen_sixteen_quarter;
}

We can watch this run in the debugger. In fact, this is the reason for including the volatile keyword in the variable declarations. Even with optimizations turned off, the compiler will still aggressively put variables in registers and avoid using memory at all if it can. While I don't seem to be able to use watches on these variables, as I can when using a hardware debugger and microcontroller, I can see the values change in memory as I step through the program. The values are organized as little-endian. Translating this, I can see that sixteen_sixteen_half shows up as 0x00008000, sixteen_sixteen_quarter shows up as 0x00004000, and the result of the multiplication operation, sixteen_sixteen_scaled, is assigned 0x00002000, representing one-eighth.
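The same result can be reproduced on a desktop with plain integers. This is a sketch of how a 16.16 multiply works in principle, not a description of the code avr-gcc actually generates; the name q16_16_mul is my own.

```c
#include <stdint.h>

/* Emulated 16.16 multiply. Multiplying two values that each carry 16
   fractional bits gives a product with 32 fractional bits, so we shift
   right by 16 to get back to 16.16. The shift truncates, and a result
   that needs more than 16 integer bits will wrap around. */
uint32_t q16_16_mul( uint32_t a, uint32_t b )
{
    uint64_t product = ( uint64_t )a * b;

    return ( uint32_t )( product >> 16 );
}
```

With the bit patterns seen in the debugger, q16_16_mul( 0x00008000, 0x00004000 ) gives 0x00002000: one-half times one-quarter is one-eighth.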

Code Size

If I bring up the Solution Explorer window (via the View menu in Atmel Studio), I can take a look at the output file properties by right-clicking. The generated .hex file indicates that it is using 310 bytes of flash. If I do the same calculation using float types, the library support for floating-point multiplication makes the flash use 580 bytes.

What happens if I scale up to a larger type? Well, if I change my unsigned accum declarations to use unsigned long accum, suddenly my flash usage goes up to 2776 bytes. That's a lot given that I have 8192 bytes of flash, but it still leaves me quite a bit of room for my own program code.

A Few Techniques

Let's say we want to scale a value to send to a linear DAC. Our DAC accepts 18-bit values. That means we can send it a value between 0x0 and 0x3FFFF.

To work directly with an _Accum type that will represent these values, we have to use an unsigned long accum. To declare an unsigned long accum variable that is initialized from an unsigned long variable, I can just cast it:

unsigned long accum encoder_accum = ( unsigned long accum )encoder_val;

We can also cast from a shorter type -- for example, from an unsigned accum -- and get the correct results. Beware of mixing signed and unsigned types, as you always should when working in C!

We can do math on our unsigned long accum types using the usual C math operators.

Let's say we want to get the unsigned long accum value converted back to an integral type. How would we do that? We use bitsulk to get the bitwise value (this is actually just a cast operation under the hood). Because we're going to truncate the fractional part, I add 0.5ULK first.

unsigned long val = bitsulk( encoder_accum + 0.5ULK ) >> ULACCUM_FBIT;

If we want the remainder as an unsigned long accum, we can get it. Remember that the fractional part of the accumulator type will be in the range [0.0, 1.0) (that is, inclusive of zero, exclusive of 1). Note that the use of the mask here is not very portable. There are some tricks I could do to make it more portable, but for now, I am more concerned about readability.

unsigned long accum remainder = ulkbits( bitsulk( encoder_accum ) & 0xFFFFFFFF );

The ulkbits and bitsulk operations are just casts, under the hood, so this boils down to a shift and mask.
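Since it really is just a shift and a mask, both operations can be tried out on plain 64-bit integers. The sketch below is an emulation, with Q32_32_FBIT standing in for ULACCUM_FBIT on my platform; it also shows one way to derive the mask from the bit count instead of hard-coding it. All of the names are made up for this example.

```c
#include <stdint.h>

#define Q32_32_FBIT 32  /* stand-in for ULACCUM_FBIT on this platform */

/* Derive the fraction mask from the number of fractional bits */
#define Q32_32_FRACTION_MASK ( ( UINT64_C( 1 ) << Q32_32_FBIT ) - 1 )

/* Round a 32.32 bit pattern to the nearest integer by adding one-half
   and then discarding the fractional bits. Values within one-half of
   the top of the range will wrap around. */
uint32_t q32_32_round_to_integer( uint64_t bits )
{
    uint64_t half = UINT64_C( 1 ) << ( Q32_32_FBIT - 1 );

    return ( uint32_t )( ( bits + half ) >> Q32_32_FBIT );
}

/* Keep only the fractional bits of a 32.32 bit pattern */
uint64_t q32_32_fraction( uint64_t bits )
{
    return bits & Q32_32_FRACTION_MASK;
}
```

For the pattern 0x0000000280000000 (2.5), rounding gives 3 and the fraction is 0x80000000 (one-half).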

The Embedded C specification defines a number of library functions that work with the fractional and accumulator types. For example, abslk() will give the absolute value of a long accum argument (there is no unsigned variant, since an unsigned value is never negative). There are also rounding functions, like roundulk(). They seem to be supported in avr-gcc, but so far I have not needed them.

Conclusion

I hope this very brief tutorial may have saved you some time and aggravation in trying to use these rather obscure, but very useful, language features. If you come across anything interesting having to do with avr-gcc's support for the Embedded C fixed-point types, please leave a comment!

Saginaw, Michigan
October 6, 2016

Friday, February 26, 2016

SPI Communications with the Arduino Uno and M93C46 EEPROM: Easy, Fun, Relaxing

When I write code for an embedded microprocessor, I frequently need to use communications protocols that allow the micro to communicate with other chips. Often there are peripherals built in to the micro that will handle the bulk of the work for me, freeing up micro clock cycles and allowing me to write fewer lines of code. Indeed, the bulk of modern microcontroller datasheets is usually devoted to explaining these peripherals. So, if you aren't trying to do anything unusual, your micro may have a peripheral that will do most of the work for you. There might be a pre-existing driver library you can use to drive the peripheral. But, sometimes, you don't have a peripheral, or it won't do just what you need it to do, for one reason or another. In that case, or if you just want to learn how the protocols work, you can probably seize control of the GPIO pins and implement the protocol yourself.

That's what I will do in the example below. I will show you how to implement the SPI (Serial Peripheral Interface) protocol for communicating with an EEPROM. I've used SPI communication in a number of projects on a number of microcontrollers now. The basics are the same, but there are always issues to resolve. The SPI standard is entertaining and keeps you on your toes, precisely because it is so non-standard; just about every vendor extends or varies the standard a bit.

The basics of SPI are pretty simple. There are four signals: chip select, clock, incoming data, and outgoing data. The protocol is asymmetrical; the microcontroller is usually the master, and other chips on the board are slaves -- although it would be possible for the micro to act as a slave, too. The asymmetry is because the master drives the chip select and clock. In a basic SPI setup, the slaves don't drive these signals; the slave only drives one data line. I'll be showing you how to implement the master's side of the conversation.

Chip select, sometimes known as slave select from the perspective of the slave chip, is a signal from the master to the slave chip. This signal cues the slave chip, informing the chip that it is now "on stage," ready for its close-up, and it should get ready to communicate. Whether the chip select is active high or active low varies. Chip select can sometimes be used for some extra signalling, but in the basic use case the micro sets the chip select to the logically active state, then after a short delay, starts the clock, runs the clock for a while as it sets and reads the data signals, stops the clock, waits a bit, and turns off the chip select.

Here's a picture showing the relationship between clock and chip select, as generated by my code. Note that I have offset the two signals slightly in the vertical direction, so that it is easier to see them:

The clock signal is usually simple. The only common question is whether the clock is high when idle, or low when idle. Clock speeds can vary widely. Speeds of 2 to 10 MHz are common. Often you can clock a part much slower, though. CMOS parts can be clocked at an arbitrarily slow speed; you can even stop the clock in the middle of a transfer, and it will wait patiently.

What is less simple is the number of clocks used in a transaction. That can become very complex. Some parts use consistent transfer lengths, where for each transaction, they expect the same number of clock cycles. Other parts might use different numbers of clock cycles for different types of commands.

From the perspective of the slave, the incoming data arrives on a pin that is often known, from the perspective of the microcontroller, as MOSI (master out, slave in). This is again a simple digital signal, but the exact way it is interpreted can vary. Essentially, one of the possible clock transitions tells the slave to read the data. For example, if the clock normally idles low, a rising clock edge might signal the slave to read the data. For reliability, it is very important that the master and slave are in agreement about which edge triggers the read. Above all, you want to avoid the case where the slave tries to read the incoming data line on the wrong edge, the edge when the master is allowed to change it. If that happens, communication might seem to work, but it works only accidentally, because the slave just happens to catch the data line slightly after it has changed, and it may fail when the hardware parameters change slightly, such as when running at a higher temperature.

Let me be as clear as I can: when implementing communication using SPI, be certain you are very clear about the idle state of the clock line, and which clock transition will trigger the slave to read the data line. Then, make sure you only change the data line on the opposite transition.
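To illustrate that rule without any hardware, here is a small simulation in plain C. It models a clock that idles low, a master that only changes the data line while the clock is low, and a slave that samples on the rising edge. The struct and function names are invented for this sketch and have nothing to do with the Arduino API.

```c
#include <stdint.h>

/* Simulated wires; on real hardware these would be GPIO pins */
typedef struct {
    int clock;          /* current clock level; idles low */
    int mosi;           /* current level of the master's data output */
    uint32_t received;  /* bits the simulated slave has sampled so far */
    unsigned count;     /* number of bits the slave has sampled */
} spi_sim_t;

/* Drive the clock line. The simulated slave samples the data line on
   each low-to-high transition and ignores every other change. */
void sim_set_clock( spi_sim_t *sim, int level )
{
    if ( !sim->clock && level )
    {
        sim->received = ( sim->received << 1 ) |
                        ( uint32_t )( sim->mosi & 1 );
        sim->count += 1;
    }
    sim->clock = level;
}

/* Master side: send num_bits (1 to 32) of bits, most significant bit
   first, changing the data line only while the clock is low. */
void sim_master_send( spi_sim_t *sim, uint32_t bits, unsigned num_bits )
{
    unsigned sent;

    for ( sent = 0; sent < num_bits; sent += 1 )
    {
        sim_set_clock( sim, 0 );  /* falling edge, or still idle */
        sim->mosi = ( int )( ( bits >> ( num_bits - sent - 1 ) ) & 1 );
        sim_set_clock( sim, 1 );  /* rising edge: the slave samples */
    }
    sim_set_clock( sim, 0 );      /* return the clock to idle */
}
```

Starting from a zeroed spi_sim_t and sending the nine bits 0x130, the slave ends up with received equal to 0x130 and count equal to 9. If the master changed the data line on the rising edge instead, the slave would be sampling a line that is in the middle of changing, which is exactly the situation to avoid.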

Terminology surrounding SPI transactions can be very confusing. According to Wikipedia and Byte Paradigm, polarity zero means the clock is zero (low) when inactive; polarity one means the clock is one (high) when inactive.

Phase zero means the slave reads the data on the leading edge, and the master can change the value on the trailing edge, while phase one means the slave reads the data line on the trailing edge, and the master changes the data line on the leading edge.

But some Atmel documentation (like this application note PDF file) uses the opposite meaning for "phase," where phase one means the slave reads data on the leading edge.

Because of this confusion, in my view it is best not to specify a SPI implementation by specifying "polarity" and "phase." So what would be clearer?

Aardvark tools use the terms "rising/falling" or "falling/rising" to describe the clock behavior, and "sample/setup" or "setup/sample" to indicate the sampling behaviors. I find this to be less ambiguous. If the clock is "rising/falling," it means that the clock is low when idle, and rises and then falls for each pulse. If the "sample" comes first, it means that the slave should read the data line on the leading edge, and if the "setup" comes first, it means that the slave should read the data on the trailing edge.

Here's a picture of my clock signal along with my MOSI (master out, slave in) signal. This SPI communication variant is "rising/falling" and "sample/setup." In order to allow the slave to read a valid bit on the leading clock edge, my code sets the MOSI line to its initial state before the rising edge of the first clock pulse. Again, I have offset the signals slightly in the vertical direction, so that it is easier to see them:

In the screen shot above, the master is sending nine bits: 100110000. Each bit is sampled on the rising clock edge. On the first rising clock edge, the MOSI line (in blue) is high. On the second rising clock edge, the MOSI line is low.

From the perspective of the slave, the outgoing data is sent on a pin that is often known as MISO (master in, slave out). This works in a similar way to the incoming data, except that the slave asserts the line.

When the master sends data to the slave, the master turns on the chip select (whether that means setting it low, or setting it high), changes the MOSI line and clock as needed, and then turns off the chip select.

When the master receives data from the slave, the behavior is slightly more confusing. To get data from the slave, the master has to generate clock cycles. This means that it is also sending something, depending on how it has set the MOSI line. During the read operation, what it is sending may consist of "I don't care" bits that the slave will not read. Receiving data can sometimes require one transaction to prepare the slave for the read operation, and then another to "clock in" the data. Sometimes a receive operation may be done as one transaction, but with two parts: the master sends a few bits indicating a read command, and then continues to send clock cycles while reading the slave's data line. Sometimes there are dummy bits or extra clock cycles in between the parts of this transaction.

Here's a picture that shows a read operation. I'm showing clock and MISO (mmmm... miso!) This shows a long transaction where the master sends a request (the MOSI line is not shown in this picture) and then continues to generate clock pulses while the slave toggles the MISO line to provide the requested data.

Now let's look at my hardware and software. I wrote some code to allow an Arduino Uno to communicate with a serial EEPROM chip. The chip in question is a M93C46 part. This is a 1Kb (one kilobit, or 1024 bits) chip. The parts are widely available from different vendors. I have a few different through-hole versions that I got from various eBay sellers; in testing them, they all worked fine. The datasheet I used for reference is from the ST Microelectronics version of the part.

These parts all seem to have similar pinouts. Pin 1 is the chip select, called slave select in the STM documentation. Pin 2 is the clock. Pins 3 and 4 are data pins. On the other side of the chip, there is a pin for +5V or +3.3V, a pin for ground, an unused pin presumably used by the manufacturer for testing, and a pin identified as ORG (organization), which determines whether the data on the chip is organized into 64 16-bit words, or 128 8-bit bytes.

There are other versions of this chip; the 1Kb is only one version. The command set differs slightly between sizes, but it should be pretty easy to adapt my example to a different-sized part. A full driver would be configurable to handle different memory sizes. It would not be hard to implement that, but for this example I am keeping things simple.

Here's my simple circuit, on a prototype shield mounted to an Arduino Uno:

Here's a simple schematic showing the Arduino pins connected to the EEPROM chip:

I'm not much of an electrical engineer, but that should convey that pin 1, usually marked with a little dot or tab on the chip, is on the lower right. We count pins counter-clockwise around the chip. So pin 5 goes to ground (I used the ground next to the data pins; that is the green wire going across the board). Make sure you are careful to connect the right pins to power and ground, or you can let the magic smoke out of one of these little EEPROM chips, and maybe disable your Arduino board, too, perhaps permanently (you'll never guess how I know this!)

I also have three LEDs connected to three more pins, connected through 220 ohm resistors, with the negative side of the LEDs going to a ground pin on the left side of the prototype board. Those are not required; they are there solely to create a simple busy/pass/fail display. You can use the serial monitor, if the Arduino is attached to your computer, or whatever other debugging method is your favorite.

I have done this kind of debugging with elaborate, expensive scopes that have many inputs and will decode SPI at full speed. That is very nice, but you don't necessarily need all that for a simple project like this. I got this project working using a Rigol two-channel scope. I was not able to capture a trace of all four lines at once using this scope, but I didn't need to. With two channels, I could confirm that the chip select and clock were changing correctly with respect to each other. Then I could look at the MOSI along with the clock and verify that the data was changing on the expected clock transition. Then I could look at the MISO along with the clock to verify the bits the Arduino was getting back from the serial EEPROM chip. Here's my modest setup, using a separate breadboard rather than a shield:

Here's a view of a SPI conversation with the EEPROM chip: a write operation, followed by a read operation to verify that I can get back what I just wrote. This shows clock and MOSI, so we don't see the slave's response, but you can see that the second burst has a number of clock cycles where the master is not changing the data line. Those are "don't care" cycles where the master is listening to what the slave is saying. Note also that I am running this conversation at a very slow clock speed; each transition is 1 millisecond apart, so a full clock cycle takes 2 milliseconds, which means that my clock is running at 500 Hertz (not MHz or even kHz). I could certainly run it faster, but this makes it easy to see what is happening, if I toggle an LED along with the chip select to show me when the master is busy.

Now, here's some code.

You don't have to use these pins, but these are the ones I used.

#define SLAVESELECT 10 /* SS   */
#define SPICLOCK    11 /* SCK  */
#define DATAOUT     12 /* MOSI */
#define DATAIN      13 /* MISO */

Here's a "template" 32-bit word that holds a 16-bit write command.

#define CMD_16_WRITE ( 5UL  << 22 )
#define CMD_16_WRITE_NUM_BITS ( 25 )

This defines a 25-bit command. There is a start bit, a 2-bit opcode, a six-bit address (for selecting addresses 0 through 63), and 16 data bits.

To use this template to assemble a write command, there's a little helper function:

uint32_t assemble_CMD_16_WRITE( uint8_t addr, uint16_t val )
{
    return ( uint32_t )CMD_16_WRITE   |
           ( ( uint32_t )addr << 16 ) |
             ( uint32_t )val;
}
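As a quick sanity check of that layout, here is a sketch of some helper functions that pull the fields back out of an assembled command word. These helpers are not part of my sketch on GitHub; I made them up for illustration, and they use only standard C so they can be tried on a desktop compiler.

```c
#include <stdint.h>

/* Field layout of the 25-bit write command described above:
       bit 24      start bit (always 1)
       bits 23-22  opcode (01 for write)
       bits 21-16  address (0 through 63)
       bits 15-0   data word                                  */
uint8_t write_cmd_start( uint32_t cmd )
{
    return ( uint8_t )( ( cmd >> 24 ) & 0x1 );
}

uint8_t write_cmd_opcode( uint32_t cmd )
{
    return ( uint8_t )( ( cmd >> 22 ) & 0x3 );
}

uint8_t write_cmd_address( uint32_t cmd )
{
    return ( uint8_t )( ( cmd >> 16 ) & 0x3F );
}

uint16_t write_cmd_data( uint32_t cmd )
{
    return ( uint16_t )( cmd & 0xFFFF );
}
```

For example, writing 0xBEEF to address 0x2A assembles to 0x016ABEEF, and the helpers recover a start bit of 1, an opcode of 1, an address of 0x2A, and data of 0xBEEF. Note that an address larger than 63 would spill into the opcode bits, so a more defensive assembler might mask the address with 0x3F first.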

Now we need a function that will send that command. First, let's start with a function that will send out a sequence of bits, without worrying about the chip select and final state of the clock.

void write_bit_series( uint32_t bits, uint8_t num_bits_to_send )
{
    uint8_t num_bits_sent;

    /* Send the most significant of the requested bits first */
    for ( num_bits_sent = 0; num_bits_sent < num_bits_to_send;
          num_bits_sent += 1 )
    {
        /* Set up the data line while the clock is low */
        digitalWrite( SPICLOCK, LOW );
        digitalWrite( DATAOUT, bits & ( 1UL <<
            ( num_bits_to_send - num_bits_sent - 1 ) ) ? HIGH : LOW );
        delay( INTER_CLOCK_TRANSITION_DELAY_MSEC );

        /* The slave samples the data line on this rising edge */
        digitalWrite( SPICLOCK, HIGH );
        delay( INTER_CLOCK_TRANSITION_DELAY_MSEC );
    }
}

This maps the bits to the DATAOUT (or MOSI) line, most significant bit first. We change the data line on the falling edge of the clock. We aren't using a peripheral to handle the SPI data; we just "bit bang" the outputs using a fixed delay.

Here's a function that will send a command that is passed to it. It works for write commands:

void write_cmd( uint32_t bits, uint8_t num_bits_to_send )
{
    digitalWrite( SLAVESELECT, HIGH );
    delay ( SLAVE_SEL_DELAY_PRE_CLOCK_MSEC );

    write_bit_series( bits, num_bits_to_send );

    /*
        Leave the data and clock lines low after the last bit sent
     */
    digitalWrite( DATAOUT, LOW );
    digitalWrite( SPICLOCK, LOW );

    delay ( SLAVE_SEL_DELAY_POST_CLOCK_MSEC );
    digitalWrite( SLAVESELECT, LOW );

}

That's really all you need to send out a command. For example, you could send a write command like this:

write_cmd( assemble_CMD_16_WRITE( addr, write_val ), CMD_16_WRITE_NUM_BITS );

Note that before you can write successfully, you have to set the write enable. My code shows how to do that. Basically, you just define another command:

#define CMD_16_WEN   ( 19UL <<  4 )
#define CMD_16_WEN_NUM_BITS   (  9 )

write_cmd( ( uint16_t )CMD_16_WEN, CMD_16_WEN_NUM_BITS );
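If you want to see which bits a command will put on the wire before hooking up a scope, a small helper like the following can print them in the same most-significant-first order that write_bit_series uses. The name bits_to_string is my own invention for this sketch, and it uses only standard C.

```c
#include <stdint.h>

/* Write the low num_bits (at most 32) of bits into out as '0' and '1'
   characters, most significant bit first. The out buffer must have
   room for num_bits + 1 characters, including the terminator. */
void bits_to_string( uint32_t bits, uint8_t num_bits, char *out )
{
    uint8_t i;

    for ( i = 0; i < num_bits; i += 1 )
    {
        out[ i ] = ( bits & ( 1UL << ( num_bits - i - 1 ) ) ) ? '1' : '0';
    }
    out[ num_bits ] = '\0';
}
```

Passing the write enable command ( 19UL << 4 ) with a length of 9 produces the string "100110000": a start bit, the opcode 00, and an address field beginning with 11. That is the same nine-bit sequence described with the MOSI scope trace earlier.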

This EEPROM chip will erase each byte or word as part of a write operation, so you don't need to perform a separate erase. That may not be true of all EEPROM chips.

To read the data back, we need a slightly more complex procedure. Our read command uses the write_bit_series function to send out the first part of the read command, then starts clocking out "don't care" bits and reading the value of the MISO line:

uint16_t read_16( uint8_t addr )
{
    uint8_t num_bits_to_read = 16;
    uint16_t in_bits = 0;
    uint32_t out_bits = assemble_CMD_16_READ( addr );

    digitalWrite( SLAVESELECT, HIGH );
    delay ( SLAVE_SEL_DELAY_PRE_CLOCK_MSEC );

    /*
        Write out the read command and address
    */
    write_bit_series( out_bits, CMD_16_READ_NUM_BITS );

    /*
        Insert an extra clock to handle the incoming dummy zero bit
    */
    digitalWrite( DATAOUT, LOW );

    digitalWrite( SPICLOCK, LOW );
    delay( 1 );

    digitalWrite( SPICLOCK, HIGH );
    delay( 1 );

    /*
        Now read 16 bits by clocking. Leave the outgoing data line low.
        The incoming data line should change on the rising edge of the
        clock, so read it on the falling edge.
    */
    for ( ; num_bits_to_read > 0; num_bits_to_read -= 1 )
    {
        digitalWrite( SPICLOCK, LOW );
        uint16_t in_bit = ( ( HIGH == digitalRead( DATAIN ) ) ? 1UL : 0UL );
        in_bits |= ( in_bit << ( num_bits_to_read - 1 ) );
        delay( INTER_CLOCK_TRANSITION_DELAY_MSEC );

        digitalWrite( SPICLOCK, HIGH );
        delay( INTER_CLOCK_TRANSITION_DELAY_MSEC );
    }

    /*
        Leave the data and clock lines low after the last bit sent
     */
    digitalWrite( DATAOUT, LOW );
    digitalWrite( SPICLOCK, LOW );

    delay ( SLAVE_SEL_DELAY_POST_CLOCK_MSEC );
    digitalWrite( SLAVESELECT, LOW );

    return in_bits;

}
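The listing above relies on assemble_CMD_16_READ and CMD_16_READ_NUM_BITS, which are defined in the full sketch rather than shown here. As a rough idea of what they might look like, the following assumes the read instruction from the datasheet for 16-bit organization: a start bit, the opcode 10, and a six-bit address, for nine bits in total. The SKETCH_ names and both functions are hypothetical stand-ins, not the definitions from my actual code.

```c
#include <stdint.h>

/* Assumed layout: start bit (1), opcode (10), six address bits */
#define SKETCH_CMD_16_READ          ( 6UL << 6 )
#define SKETCH_CMD_16_READ_NUM_BITS ( 9 )

/* Hypothetical stand-in for assemble_CMD_16_READ */
uint32_t sketch_assemble_read( uint8_t addr )
{
    /* Mask the address so it cannot spill into the opcode bits */
    return SKETCH_CMD_16_READ | ( uint32_t )( addr & 0x3F );
}

/* Shift one sampled line level into a word. After 16 calls, the first
   bit received is the most significant bit, which gives the same
   result as the indexed OR used in the read loop above. */
uint16_t shift_in_bit( uint16_t word, int level )
{
    return ( uint16_t )( ( word << 1 ) | ( level ? 1u : 0u ) );
}
```

Under these assumptions, a read of address 0 would be sent as 0x180 and a read of address 63 as 0x1BF.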

And that's the basics. To test this, I put an EEPROM chip on a breadboard and just wired up the pins as specified in the code. Check your datasheet to determine if you can power the part with 5V or 3V. The chips I got seem to work fine with either, although if you are testing with a scope, you might want to use 5V so that the data out you get back from the chip has the same level as the 5V Arduino outputs.

You can find the full sketch on GitHub here.

Good luck, and if you found this useful, let me know by posting a comment. Comments are moderated, so they will not show up immediately, but I will post all (non-abusive, non-spam) comments. Thanks for reading!

Friday, January 01, 2016

Star Wars: The Force Awakens

This review contains many spoilers.

I want to start out by saying that I really was expecting, even hoping, to dislike The Force Awakens.

Entering the theater a cynical, somewhat bitter middle-aged man, I fully expected to be able to take my distaste for the other work of J. J. Abrams (particularly, his atrocious 2009 Star Trek reboot), and Disney, and recycled nostalgia in general, and throw it directly at the screen.

I'm an original fan of Star Wars -- I saw the first one perhaps a dozen times in the theater -- and I pretty much agree with the critical consensus about the prequels. Their utter failure led me to believe that the things I loved most about Episode IV had, for the most part, little to do with big-budget filmmaking, but were the result of giving a bunch of really brilliant costume and set designers and cinematographers and editors and sound designers a lot of creative control and a relatively low budget -- a situation unlikely to be replicated in a truly big film, an important investment where none of the investing parties would want to take any significant risks.

I was wrong, and I'm still somewhat troubled by that. Is The Force Awakens a good movie, or was I just primed by my age and the bad experience with the prequels to suck up something relatively bad and call it good, simply because it lacks the awfulness of the prequels, and smells a lot like the 1977 original? I don't think I can answer that question definitively, at least not easily, because really taking it up requires me to think critically about the original 1977 Star Wars, something I find hard to do, given the way the film imprinted itself upon my nine-year-old brain. Is it really all that and a bag of chips? Or did it just land at the right time to be the formative movie of my childhood?

One of my sons is nine, by the way. He enjoyed the new movie, but I don't think it blew his mind the way the original Star Wars blew mine, simply because we have, ever since 1977, lived in the era that had Star Wars in it.

To be clear -- it's not the case that there weren't big action movies back then, and big science fiction movies back then. We had movies like 2001: a Space Odyssey, which also formed my tastes. We had Silent Running. We had Logan's Run. But it would be impossible to overstate the shock wave of Star Wars -- the innovative effects, editing, and yes, even marketing. We just can't go back to that world. He's seen a lot of things that are Star Wars-ish, while in 1977, I never had.

And make no mistake, the new Star Wars is, most definitely, Star Wars-ish, in the way that the prequels were not. The world of the prequels was too clean, too sterile, too political, and too comic. Star Wars may have been the single most successful blending of genres ever attempted; a recent article called it "postmodern," and I think that is correct. The prequels might have been attempts at postmodern, too, but they seem to have a different set of influences, and just seem, in every respect, to have been assembled lazily, and without artfulness. For just one example, see how one of the prequel lightsaber battle scenes was actually filmed.

The Force Awakens follows the 1977 formula so closely that it is perilously close to coming across as a kind of remake or pastiche of the original. But it is not that. It is actually an homage to the original. There are a lot of parallel details and actual "easter eggs," where props make cameos and audio clips from the original movies are sprinkled into the new one. In one of my favorite moments, on Starkiller Base we hear a clip from the first movie, "we think they may be splitting up." Some reviewers have made their reviews catalogs of these moments, and consider this excessive, complaining about the "nostalgia overload." But although it is noticeable, I think the producers knew just how much nostalgia would be appreciated, and how much would become annoying, and walked that line very well. The film re-creates the world where Han Solo and Leia Organa will not look out of place. And when Harrison Ford and Carrie Fisher actually appear on screen, the weight of almost 40 years suddenly lands on me, and it's a gut punch. I must have gotten something in my eye.

While Solo is an important character in this film, Leia has very few scenes, and the action centers largely around new characters. The casting is what you might call modern. No one could claim that Rey is not a strong, compelling female character. Daisy Ridley's acting in this movie is very impressive. Without her strong performance, we'd be prone to spend time musing on the oddness of her almost-absent back-story. As it is, we aren't really given a lot of time to meditate on such things, because she keeps very busy, kicking ass and taking names.

John Boyega as Finn is good too, although he doesn't seem, to me, to quite inhabit his character the way Ridley inhabits Rey. And so I find myself spending a little time wondering about the gaps and inconsistencies in his character's back-story. He describes himself as a sanitation worker. If that's true, why is he on the landing craft in the movie's opening scenes, part of the First Order interplanetary SWAT team sent to Jakku to retrieve the MacGuffin? He's supposedly a low-level worker on Starkiller base, but he knows how to disable the shields? He's a stormtrooper, trained since birth to kill, but unable to kill. Has he never been "blooded" before? We're unfortunately reminded that this doesn't actually make a lot of sense. Of course this is true of many elements of the original trilogy. The key to making that kind of thing not matter, for a Star Wars movie, is to keep everything happening so fast that the audience doesn't have time to worry about all that.

The story moves along quickly and we meet one of the most interesting characters, Kylo Ren, played by Adam Driver. Driver plays an adolescent, and puts Hayden Christensen's portrayal of Anakin to utter shame -- although one senses that much of Christensen's failure may have been due to Lucas's poor direction of the young actor. Driver is completely compelling on-screen, and his scenes with Ridley are just mesmerizing. I really can't say enough good things about them. I've seen two screenings now, and I would happily see it again, just to watch those two characters interact. It's really impressive.

That's really enough to hang a movie on -- a few really great performances, a few good performances, some terrific scenes, and no scenes that are actually bad. (Howard Hawks famously said that to make a good movie, you needed three good scenes and no bad ones; The Force Awakens exceeds that requirement).

Of course, there are a lot of confusing, unconvincing, and unwieldy things about this film. For example, Rey is much stronger in the ways of the Force, and a very powerful fighter, right off the bat. She's grown up on Jakku, apparently spending years alone, and entirely untrained, while in the original trilogy we watched Luke start off with some talent for using the Force, but not much skill, and get trained up like Rocky Balboa. How did this come to pass? Well, it's a mystery we just have to accept for now. Maybe she had a lot of karate classes as a very young child. I maintain that when a movie like this leaves things unexplained, the audience will do the work for the screenwriters and make it work -- if the audience has decided to side with the movie and help it along. And if they haven't, no amount of rationalization will explain away the inevitable plot holes in a satisfying way. This movie has done such a good job at entertaining the audience, and introducing a compelling character early on, that we as the audience are pretty happy to go along, and willing to make a few allowances and give it the benefit of the doubt. With the prequels, we were bored and full of doubt, for good reasons.

There are a few flaws that I think are worth noting. The movie is just slightly too long. The reawakening of Artoo-Detoo just after the destruction of the big bad Starkiller Base -- allowing the plot to continue with a literal deus ex machina -- is just slightly too silly.

What is up with Kylo Ren's helmet, and Captain Phasma's helmet? One of the notable things about the Empire was the extreme precision and cleanliness of the costumes, including the stormtrooper helmets and Darth Vader's helmets. But in the new movie, Ren's helmet is dinged and dented, with chipped paint, and Phasma's helmet is covered in fingerprints. It's not accidental; even the action figures of Kylo Ren have molded-in dents, and there is no way that someone simply forgot to polish Phasma's helmet; such an error would certainly be caught. They were made to look that way deliberately, in stark contrast to the other uniforms and suits of armor. Why is that?

There are some scenes with the Resistance, preparing X-wing fighters, that look like they were literally shot on the site of a freeway overpass; that reminded me of the way J. J. Abrams decided it was a good idea to use a brewery for the engine room of the Enterprise -- an incredibly dumb, unconvincing, revisionist look for the Engineering set. The Imperial wreckage on Jakku -- both Imperial Star Destroyers and the walkers from the invasion of Hoth in Empire -- is nostalgic, but bizarre.

There are some coincidences that feel just a little too coincidental. How did Luke's lightsaber wind up in Maz's basement, in an unlocked trunk, in an otherwise empty room?

Starkiller Base makes very little sense; the physics of it just don't work, in any reasonable universe. The Resistance leaders explain that it sucks up "the sun" -- not "the nearest star." In a galaxy with billions of suns, in a film set on multiple planets, around multiple stars, the producers apparently don't trust the audience to understand how stars and planets work; don't confuse them!

But none of this is really a deal-breaker, because the movie moves so fast, and is so willing to break things. Which brings me to the biggest spoiler of all.

The movie kills Han Solo. Yes, they went there. It was at that moment that the film won me over completely. It was a brave move, and it needed to happen. The screenwriters, including Lawrence Kasdan, who worked on Empire, knew very well that if the audience was to take this movie seriously, it would need to show them that it was serious. That's what the death of Han Solo means. Harrison Ford -- who, by the way, is excellent in this film -- has a terrific death. This is also the reason that, for episode IX to work, the screenwriters will have to kill another major character -- most likely, General Leia -- in the first ten minutes.

Given the impressive start to that trilogy, I believe they will do the right thing -- and it will be glorious. And we'll regard the prequels as an unfortunate, non-canon interlude, a mere glitch, in the continuity of the Star Wars story -- and Lucas will continue his slide into irrelevant lunacy.

And meanwhile, as I approach fifty, I still have to wonder. What was the point of Star Wars? Was it ever anything resembling a genuine artistic statement, or was it always a coldly calculated money-grabbing machine, powered by myth, in which Lucas figured out how to monetize the scholarship of Joseph Campbell? Was Star Wars ever actually about anything? Was it "real" art, more than a dizzying whirlwind of entertainment, built on genre tropes and with very little in it that was groundbreaking but the improved technology of movie-making?

Was I simply bamboozled, as a child, into imagining that I was seeing a piece of art, something meaningful? If so, does it matter? Is that dizzying whirlwind of entertainment, blended with a calculated human story arc, really enough? Can real art ever be made out of genre fiction? How about Tolkien? What about smashed-together, postmodern genre fiction? Is it just screenwriting that somehow loses the status of "art?" If I enjoy both Moby Dick and Star Wars, is there something wrong with me?

And, if these distinctions don't matter, and the Disney corporation buys George Lucas's property for four billion dollars, knowing they will turn enormous profits on that investment for decades, and makes us a compelling Star Wars entirely cynically, built literally out of the formulaic building blocks of the original, but it works as well, as wonderfully -- distractingly, entertainingly, wonderfully -- as the original, does that matter? And what does it say about art, and about its audience?