Thursday, August 28, 2014

Flash Memory Library



This library provides routines for accessing the microcontroller's internal Flash memory.

On the dsPIC30/33 and PIC24, Flash memory is mapped to the address space in a 3:2 ratio: every 3 consecutive bytes of Flash share 2 consecutive address locations. That is why mikroE's library allows data to be written to Flash in two ways: "regular" and "compact". In the "regular" mode, which is used for word (16-bit) variables, the 3rd (un-addressable) Flash byte of each instruction word remains unused. In the "compact" mode, which can be used for byte-sized variables/arrays, all Flash bytes are used.
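The bookkeeping implied by this mapping can be sketched in plain C (the helper names here are mine, for illustration only - they are not part of the library):

```c
#include <assert.h>

/* "Regular" mode: every 2 data bytes occupy one instruction word
   (3 Flash bytes, 2 addresses); the 3rd Flash byte stays unused. */
unsigned long regular_flash_bytes(unsigned long n) {
    return ((n + 1UL) / 2UL) * 3UL;   /* Flash bytes holding n data bytes */
}

/* "Compact" mode: every 3 data bytes fill one instruction word
   (3 Flash bytes) but span only 2 address locations. */
unsigned long compact_addresses(unsigned long n) {
    return ((n + 2UL) / 3UL) * 2UL;   /* addresses spanned by n data bytes */
}
```

For example, a 96-byte compact write spans 64 addresses, exactly one 32-instruction dsPIC30 erase block.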

All dsPIC30/33 and PIC24 MCUs use the RTSP module to perform Read/Erase/Write operations on Flash memory. This, together with the internal structure of the Flash, imposes certain rules to be followed when working with Flash memory:

dsPIC30:


  • Erasing can be done only in 32-instruction (64 addresses, 96 bytes) memory blocks. This means that the block start address must be a multiple of 64 (i.e. have its 6 lower bits set to zero).
  • Data is read and written in 4-instruction (8 addresses, 12 bytes) blocks. This means that the block start address must be a multiple of 8 (i.e. have its 3 lower bits set to zero).
  • On the dsPIC30s, 2 address locations are assigned to every 3 bytes of (Flash) program memory. Due to this specific, non-one-to-one address mapping, the mikroC PRO for dsPIC30/33 and PIC24 offers two sets of Flash handling functions: "regular" and "compact".
    Using the "regular" set, the user can write one byte of data to a single address; each byte of written data has its own address, but for every 2 written bytes one byte of Flash memory remains empty.
    Using the "compact" set, every byte of Flash memory, including the non-addressable ones, is filled with data; this method can only be used for data organized in bytes.
    The "compact" functions have the _Compact name suffix.
  • For run-time Flash read/write, the dsPIC30's RTSP module is used. It organizes data into rows and panels. Each row contains write latches that can hold 4 instructions (12 bytes). The number of panels varies from one dsPIC30 MCU model to another. Because of that, the Flash write sequence has been split into several operations (_Write_Init(), _Write_LoadLatch4(), _Write_DoWrite()) so that it is usable on all dsPICs.

PIC24 and dsPIC33:


  • Erasing can be done only in 512-instruction (1024 addresses, 1536 bytes) memory blocks, which means that the block start address must be a multiple of 1024 (i.e. have its 10 lower bits set to zero).
  • Data is read and written in 64-instruction (128 addresses, 192 bytes) blocks. This means that the block start address must be a multiple of 128 (i.e. have its 7 lower bits set to zero).
  • On the dsPIC33s and PIC24s, 2 address locations are assigned to every 3 bytes of (Flash) program memory. Due to this specific, non-one-to-one address mapping, the mikroC PRO for dsPIC30/33 and PIC24 offers two sets of Flash handling functions: "regular" and "compact".
    Using the "regular" set, the user can write one byte of data to a single address; each byte of written data has its own address, but for every 2 written bytes one byte of Flash memory remains empty.
    Using the "compact" set, every byte of Flash memory, including the non-addressable ones, is filled with data; this method can only be used for data organized in bytes.
    The "compact" functions have the _Compact name suffix.

24F04KA201 and 24F16KA102 Family Specifics:


  • These MCUs have their Flash memory organized into memory blocks of 32 instructions (96 bytes), unlike other PIC24 devices.
  • Erasing can be done only in 32-instruction (64 addresses, 96 bytes) memory blocks, which means that the block start address must be a multiple of 64 (i.e. have its 6 lower bits set to zero).
  • Data is read and written in 32-instruction (64 addresses, 96 bytes) blocks. This means that the block start address must be a multiple of 64 (i.e. have its 6 lower bits set to zero).
  • Unlike on other PIC24 devices, the erase block is the same size as the write block (32 instructions), so writing one block of data goes together with erasing a memory block of the same size.
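All of the alignment rules above amount to checking that the low bits of the block start address are zero. A small sketch (the function name is mine; the constants come from the lists above):

```c
#include <assert.h>

/* Returns 1 if 'address' is aligned to a 'block'-address boundary.
   'block' must be a power of two (8, 64, 128 or 1024 above). */
int flash_addr_aligned(unsigned long address, unsigned long block) {
    return (address & (block - 1UL)) == 0UL;
}
```

So a dsPIC30 erase needs `flash_addr_aligned(addr, 64)`, a PIC24/dsPIC33 erase needs `flash_addr_aligned(addr, 1024)`, and so on.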

Library Routines


dsPIC30 Functions


FLASH_Erase32


Prototype void FLASH_Erase32(unsigned long address);
Description Erases one block (32 instructions, 64 addresses, 96 bytes) from the program Flash memory.
Parameters
  • address: starting address of the FLASH memory block
Returns Nothing.
Requires Nothing.
Example
//--- erase the 32-instruction block, starting from address 0x006000
FLASH_Erase32(0x006000);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Write_Block


Prototype void FLASH_Write_Block(unsigned long address, unsigned int *data_);
Description Fills one writeable block of Flash memory (4 instructions, 8 addresses, 12 bytes) in the "regular" mode. Addresses and data are mapped one-to-one, which means that the 3rd byte of each program location remains unused.
Parameters
  • address: starting address of the FLASH memory block
  • data_: data to be written
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (through the RTSP) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
Example
unsigned long flash_address = 0x006000;
unsigned int Buffer[4] = {'A', 'B', 'C', 'D'};
...
FLASH_Write_Block(flash_address, Buffer);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Write_Compact


Prototype void FLASH_Write_Compact(unsigned long address, void *data_, unsigned bytes);
Description Fills a portion of Flash memory using the dsPIC30 RTSP module, in the "compact" manner. In this way, several of the RTSP's latch blocks can be written in one pass. One latch block holds 4 instructions (8 addresses, 12 bytes); up to 8 latch blocks can be written in one round, for a total of 8*12 = 96 bytes. This method uses all available bytes of the program Flash memory, including those that are not mapped to the address space (every 3rd byte).
Parameters
  • address: starting address of the FLASH memory block
  • data_: data to be written
  • bytes: number of bytes to be written. This amount must be a multiple of 12, since that is the size of the RTSP's write latch(es).
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase32) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
Example
unsigned long flash_address = 0x006000;
char Buffer[] = "supercalifragillisticexpialidotious";
...
FLASH_Write_Compact(flash_address, Buffer, 36);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).
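Since `bytes` must be a multiple of 12, a payload of another size has to be rounded up to the next latch boundary. A sketch of that rounding (the helper name is mine, not part of the library):

```c
#include <assert.h>

/* Round a byte count up to the next multiple of the 12-byte write latch. */
unsigned round_up_to_latch(unsigned n) {
    return ((n + 11u) / 12u) * 12u;
}
```

The 36 bytes in the example above (35 characters plus the string terminator) already fill exactly three latch loads.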

FLASH_Write_Init


Prototype void FLASH_Write_Init(unsigned long address, void *data_);
Description Initializes the RTSP module for a write-to-Flash operation.
Parameters
  • address: starting address of the FLASH memory block
  • data_: data to be written
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase32) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
Example
//--- Initializes the Flash to be written, starting from address 0x006100, the data is located at *pv1
void *pv1;
...
FLASH_Write_Init(0x006100, pv1);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Write_Loadlatch4


Prototype void FLASH_Write_Loadlatch4();
Description Loads the current RTSP write latch with data (4 instructions, 8 addresses, 12 bytes). The data is filled in the "regular" mode.
Parameters None.
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase32) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
This function is used as a part of the Flash write sequence; therefore, the FLASH_Write_Init function must be called before this one.
This function can be called several times before committing the actual write-to-Flash operation with FLASH_Write_DoWrite. The number of calls depends on the organization of the RTSP module in the particular dsPIC30; please consult the datasheet of the dsPIC30 in question.
Example
//--- writes data from an array, in "regular" manner
unsigned int iArr[16] = {'m', 'i', 'k', 'r', 'o', 'E', 'l', 'e', 'k'};
void * pv1;
...
pv1 = iArr;
FLASH_Write_Init(0x006100, pv1);
FLASH_Write_Loadlatch4();
FLASH_Write_Loadlatch4();
FLASH_Write_DoWrite();
Notes None.

FLASH_Write_Loadlatch4_Compact


Prototype void FLASH_Write_Loadlatch4_Compact();
Description Loads the current RTSP write latch with data (4 instructions, 8 addresses, 12 bytes). The data is filled in the "compact" mode.
Parameters None.
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase32) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
This function is used as a part of the Flash write sequence; therefore, the FLASH_Write_Init function must be called before this one.
This function can be called several times before committing the actual write-to-Flash operation with FLASH_Write_DoWrite. The number of calls depends on the organization of the RTSP module in the particular dsPIC30; please consult the datasheet of the dsPIC30 in question.
Example
//--- writes data from an array of char, in "compact" manner
char cArr[] = "supercalifragillisticexpialidotious"; //35+1 bytes
void * pv1;
...
pv1 = cArr;
FLASH_Write_Init(0x006000, pv1);                     //init
FLASH_Write_Loadlatch4_Compact();                    //12 bytes
FLASH_Write_Loadlatch4_Compact();                    //12 bytes
FLASH_Write_Loadlatch4_Compact();                    //12 bytes
FLASH_Write_DoWrite();                               //commit write
 
Notes None.

FLASH_Write_DoWrite


Prototype void FLASH_Write_DoWrite();
Description Commits the FLASH write operation.
Parameters None.
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase32) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
This function is used as a part of the Flash write sequence; therefore, FLASH_Write_Init and a certain number of FLASH_Write_Loadlatch4 or FLASH_Write_Loadlatch4_Compact calls must be made before this one.
This function is to be called once, at the end of the Flash write sequence.
Example
//--- writes data from an array, in "regular" manner
unsigned int iArr[16] = {'m', 'i', 'k', 'r', 'o', 'E', 'l', 'e', 'k'};
void * pv1;
...
pv1 = iArr;
FLASH_Write_Init(0x006100, pv1);
FLASH_Write_Loadlatch4();
FLASH_Write_Loadlatch4();
FLASH_Write_DoWrite();
Notes None.

FLASH_Read4


Prototype unsigned int* FLASH_Read4(unsigned long address, unsigned int *write_to);
Description Reads one latch row (4 instructions, 8 addresses) in the "regular" mode.

Parameters
  • address: starting address of the FLASH memory block to be read
  • write_to: starting address of RAM buffer for storing read data
Returns Starting address of RAM buffer for storing read data.
Requires Nothing.
Example
//--- reads 8 data bytes (4 words) from location 0x006000 into Buffer
unsigned int Buffer[4];
...
FLASH_Read4(0x006000, Buffer);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Read4_Compact


Prototype void* FLASH_Read4_Compact(unsigned long address, void *write_to);
Description Reads one latch row (4 instructions, 8 addresses) in the "compact" mode.
Parameters
  • address: starting address of the FLASH memory block to be read
  • write_to: starting address of RAM buffer for storing read data
Returns Starting address of RAM buffer for storing read data.
Requires Nothing.
Example
//--- reads 12 bytes (one 4-instruction latch row) from location 0x006000 into Buffer
unsigned int Buffer[6];
...
FLASH_Read4_Compact(0x006000, Buffer);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

PIC24 and dsPIC33 Functions


FLASH_Erase


Prototype void FLASH_Erase(unsigned long address);
Description Erases one block (512 instructions, 1024 addresses, 1536 bytes) from the program FLASH memory.
Parameters
  • address: starting address of the FLASH memory block
Returns Nothing.
Requires Nothing.
Example
//--- erase the flash memory block, starting from address 0x006400
unsigned long flash_address = 0x006400;
...
FLASH_Erase(flash_address);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Write


Prototype void FLASH_Write(unsigned long address, unsigned int *data_);
Description Fills one writeable block of Flash memory (64 instructions, 128 addresses, 192 bytes) in the "regular" mode. Addresses and data are mapped one-to-one, which means that the 3rd byte of each program location remains unused.
Parameters
  • address: starting address of the FLASH memory block
  • data_: data to be written
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (through the RTSP) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
Example
unsigned int iArr[64] = {'m', 'i', 'k', 'r', 'o', 'E', 'l', 'e', 'k', 't', 'r', 'o', 'n', 'i', 'k', 'a'};
void * pv1;
...
pv1 = iArr;
FLASH_Write(0x006500, pv1);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Write_Compact


Prototype void FLASH_Write_Compact(unsigned long address, char *data_);
Description Fills a portion of Flash memory (64 instructions, 128 addresses, 192 bytes) using the dsPIC33/PIC24 RTSP (Run Time Self Programming) module, in the "compact" manner. This method uses all available bytes of the program Flash memory, including those that are not mapped to the address space (every 3rd byte).
Parameters
  • address: starting address of the FLASH memory block
  • data_: data to be written
Returns Nothing.
Requires The block to be written to must be erased first, either from the user code (using FLASH_Erase) or during the programming of the MCU. Please note that the block size to be erased is different from the block size that can be written with this function!
Example
char cArr[] = "supercalifragillisticexpialidotiousABCDEFGHIJKLMNOPRSTUVWXYZ1234";
void * pv1;
...
pv1 = cArr;
FLASH_Write_Compact(0x006400, pv1);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Read


Prototype unsigned int* FLASH_Read(unsigned long address, unsigned int *write_to, unsigned NoWords);
Description Reads the required number of words from the Flash memory in the "regular" mode.
Parameters
  • address: starting address of the FLASH memory block to be read
  • write_to: starting address of RAM buffer for storing read data
  • NoWords: number of words to be read
Returns Address of RAM buffer for storing read data.
Requires Nothing.
Example
unsigned Buffer[64];
unsigned long start_address = 0x6500;
...
FLASH_Read(start_address, Buffer, 10);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

FLASH_Read_Compact


Prototype void *FLASH_Read_Compact(unsigned long address, void *write_to, unsigned NoBytes);
Description Reads the required number of bytes from the Flash memory in the "compact" mode.
Parameters
  • address: starting address of the FLASH memory block to be read
  • write_to: starting address of RAM buffer for storing read data
  • NoBytes: number of bytes to be read
Returns Address of RAM buffer for storing read data.
Requires Nothing.
Example
char Buffer[64];
unsigned long start_address = 0x6500;
...
FLASH_Read_Compact(start_address, Buffer, 10);
Notes The user should take care of the address alignment (see the explanation at the beginning of this page).

Library Example


In this example, written for the dsPIC30F4013, various read/write techniques to/from the on-chip Flash memory are shown. Flash memory is mapped to the address space in a 3:2 ratio, meaning every 3 consecutive bytes of Flash share 2 consecutive address locations.
That is why mikroE's library allows data to be written to Flash in two ways: 'regular' and 'compact'. In 'regular' mode, which is used for variables that are 2 bytes in size or larger, the 3rd (un-addressable) byte remains unused.
In 'compact' mode, which can be used for byte-sized variables/arrays, all bytes of Flash are used.

unsigned int iArr[8] = {'m', 'i', 'k', 'r', 'o', 'E', 'l', 'e'};
char cArr[] = "mikroElektronika Flash example";
char cArr2[40];

void * pv1;
unsigned bb;

void main() {
  unsigned i;

  pv1 = cArr;

/*
  This is what FLASH_Write_Compact() does 'beneath the hood':

  FLASH_Write_Init(0x006000, pv1);
  FLASH_Write_Loadlatch4_Compact();
  FLASH_Write_Loadlatch4_Compact();
  FLASH_Write_Loadlatch4_Compact();
  FLASH_Write_DoWrite();
 */

  //--- erase the block first
  FLASH_Erase32(0x006000);

  //--- write compact format to flash
  FLASH_Write_Compact(0x006000, pv1, 36);

  //--- read back the compact format
  pv1 = cArr2;
  FLASH_Read4_Compact(0x006000, pv1);
  pv1 = (char *)pv1 + 12;
  FLASH_Read4_Compact(0x006008, pv1);
  pv1 = (char *)pv1 + 12;
  FLASH_Read4_Compact(0x006010, pv1);
  pv1 = (char *)pv1 + 12;
  *(char *)pv1 = 0;  // string termination
  
  //--- show what has been written
  i = 0;
  UART1_Init(9600);
//  UART1_Write_Text("Start");
  UART1_Write(10);
  UART1_Write(13);
  while(cArr2[i]) {
    bb = cArr2[i++];
    UART1_Write(bb);
  }

  //--- now for some non-compact flash-write
  pv1 = iArr;
  //--- erase the block first
  FLASH_Erase32(0x006100);
  FLASH_Write_Init(0x006100, pv1);
  FLASH_Write_Loadlatch4();
  FLASH_Write_Loadlatch4();
  FLASH_Write_DoWrite();
}

ROBOT ARM TUTORIAL


About this Robot Arm Tutorial
The robot arm is probably the most mathematically complex robot you could ever build. As such, this tutorial can't tell you everything you need to know. Instead, I will cut to the chase and talk about the bare minimum you need to know to build an effective robot arm. Enjoy! To get you started, here is a video of a robot arm assignment I had when I took Robotic Manipulation back in college. My group programmed it to type the current time into the keyboard . . . (lesson learned, don't crash robot arms into your keyboard at full speed while testing in front of your professor) You might be also interested in a robot arm I built that can shuffle, cut, and deal playing cards.
Degrees of Freedom (DOF)
The degrees of freedom, or DOF, is a very important term to understand. Each degree of freedom is a joint on the arm, a place where it can bend, rotate, or translate. You can typically identify the number of degrees of freedom by the number of actuators on the robot arm. Now this is very important - when building a robot arm you want as few degrees of freedom as your application allows!!! Why? Because each degree requires a motor, often an encoder, and exponentially more complicated algorithms and cost.
Denavit-Hartenberg (DH) Convention
The Robot Arm Free Body Diagram (FBD)

The Denavit-Hartenberg (DH) Convention is the accepted method of drawing robot arms in FBDs. There are only two motions a joint could make: translate and rotate. There are only three axes on which this could happen: x, y, and z (out of plane). Below I will show a few robot arms, and then draw a FBD next to each, to demonstrate the DOF relationships and symbols. Note that I did not count the DOF on the gripper (otherwise known as the end effector). The gripper is often complex with multiple DOF, so for simplicity it is treated as separate in basic robot arm design.
4 DOF Robot Arm, three are out of plane:

4 DOF Denso Robot Arm / 4 DOF FBD
3 DOF Robot Arm, with a translation joint:

3 DOF Denso Robot Arm / 3 DOF FBD
5 DOF Robot Arm:

Servo Robot Arm / 5 DOF FBD
Notice that between each DOF there is a linkage of some particular length. Sometimes a joint can have multiple DOF in the same location. An example would be the human shoulder, which actually has three coincident DOF. If you were to mathematically represent this, you would just say link length = 0. Also note that a DOF has its limitations, known as the configuration space. Not all joints can swivel 360 degrees! A joint has some max angle restriction. For example, no human joint can rotate more than about 200 degrees. Limitations could come from wire wrapping, actuator capabilities, servo max angle, etc. It is a good idea to label each link length and joint max angle on the FBD.
(image credit: Roble.info) Your robot arm can also be on a mobile base, adding additional DOF. If the wheeled robot can rotate, that is a rotation joint; if it can move forward, that is a translational joint. A mobile manipulator robot of this kind is an example of a 1 DOF arm on a 2 DOF robot (3 DOF total).
Robot Workspace
The robot workspace (sometimes known as reachable space) is all the places that the end effector (gripper) can reach. The workspace depends on the DOF angle/translation limitations, the arm link lengths, the angle at which something must be picked up, etc. The workspace is highly dependent on the robot configuration.

Since there are many possible configurations for your robot arm, from now on we will only talk about the one shown below. I chose this 3 DOF configuration because it is simple, yet isn't limiting in ability. Now let's assume that all joints rotate a maximum of 180 degrees, because most servo motors cannot exceed that amount. To determine the workspace, trace all locations that the end effector can reach, as in the image below. Now rotating that by the base joint another 180 degrees to get 3D, we have the workspace image. Remember that because it uses servos, all joints are limited to a max of 180 degrees. This creates a workspace of a shelled semi-sphere (it's a shape because I said so). If you change the link lengths you can get very different sizes of workspaces, but this would be the general shape. Any location outside of this space is a location the arm can't reach. If there are objects in the way of the arm, the workspace can get even more complicated. Here are a few more robot workspace examples:
  • Cartesian Gantry Robot Arm
  • Cylindrical Robot Arm
  • Spherical Robot Arm
  • SCARA Robot Arm
  • Articulated Robot Arm


Mobile Manipulators
A moving robot with a robot arm is a sub-class of robotic arms. They work just like other robotic arms, but the DOF of the vehicle is added to the DOF of the arm. If, say, you have a differential drive robot (2 DOF) with a robot arm (5 DOF) attached (see the yellow robot below), that would give the robot arm a total of 7 DOF. What do you think the workspace of this type of robot would be?
Force Calculations of Joints
This is where this tutorial starts getting heavy with math. Before even continuing, I strongly recommend you read the mechanical engineering tutorials for statics and dynamics. This will give you a fundamental understanding of moment arm calculations. The point of doing force calculations is motor selection. You must make sure that the motor you choose can not only support the weight of the robot arm, but also what the robot arm will carry (the blue ball in the image below). The first step is to label your FBD, with the robot arm stretched out to its maximum length.
Choose these parameters:
  • weight of each linkage
  • weight of each joint
  • weight of object to lift
  • length of each linkage

Next you do a moment arm calculation, multiplying each downward force by its linkage length. This calculation must be done for each lifting actuator. This particular design has just two DOF that require lifting, and the center of mass of each linkage is assumed to be at Length/2. Torque About Joint 1:
    M1 = L1/2 * W1 + L1 * W4 + (L1 + L2/2) * W2 + (L1 + L3) * W3

Torque About Joint 2:
    M2 = L2/2 * W2 + L3 * W3

As you can see, for each DOF you add, the math gets more complicated and the joint weights get heavier. You will also see that shorter arm lengths allow for smaller torque requirements.
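The two moment equations above are easy to put into code. A sketch (the function names and the sample numbers below are mine, chosen only to show the arithmetic):

```c
#include <assert.h>
#include <math.h>

/* Torque about joint 1, per the formula above:
   M1 = L1/2*W1 + L1*W4 + (L1 + L2/2)*W2 + (L1 + L3)*W3 */
double moment_j1(double L1, double L2, double L3,
                 double W1, double W2, double W3, double W4) {
    return L1 / 2.0 * W1 + L1 * W4
         + (L1 + L2 / 2.0) * W2 + (L1 + L3) * W3;
}

/* Torque about joint 2: M2 = L2/2*W2 + L3*W3 */
double moment_j2(double L2, double L3, double W2, double W3) {
    return L2 / 2.0 * W2 + L3 * W3;
}
```

With made-up values L1 = L2 = L3 = 0.3 m, W1 = W2 = 1 N, W3 = 2 N, W4 = 0.5 N, this gives M1 = 1.95 N·m and M2 = 0.75 N·m.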
Too lazy to calculate forces and torques yourself? Try my robot arm calculator to do the math for you.






Forward Kinematics
Forward kinematics is the method for determining the orientation and position of the end effector, given the joint angles and link lengths of the robot arm. To calculate forward kinematics, all you need is high school trig and algebra.

For our robot arm example, here we calculate the end effector location from given joint angles and link lengths. To make visualization easier for you, I drew blue triangles and labeled the angles. Assume that the base is located at x=0 and y=0. The first step is to locate x and y of each joint. Joint 0 (with x and y at the base equaling 0):
    x0 = 0
    y0 = L0

Joint 1 (with x and y at J1 equaling 0):
    cos(psi) = x1/L1 => x1 = L1*cos(psi)
    sin(psi) = y1/L1 => y1 = L1*sin(psi)

Joint 2 (with x and y at J2 equaling 0):
    sin(theta) = x2/L2 => x2 = L2*sin(theta)
    cos(theta) = y2/L2 => y2 = L2*cos(theta)

End Effector Location (make sure your signs are correct):
    x0 + x1 + x2, or 0 + L1*cos(psi) + L2*sin(theta)
    y0 + y1 + y2, or L0 + L1*sin(psi) + L2*cos(theta)
    z equals alpha, in cylindrical coordinates

The angle of the end effector, in this example, is equal to theta + psi. Too lazy to calculate forward kinematics yourself? Check out my Robot Arm Designer v1 in excel.
Inverse Kinematics
Inverse kinematics is the opposite of forward kinematics. This is when you have a desired end effector position, but need to know the joint angles required to achieve it. The robot sees a kitten and wants to grab it, what angles should each joint go to? Although way more useful than forward kinematics, this calculation is much more complicated too. As such, I will not show you how to derive the equation based on your robot arm configuration.

Instead, I will just give you the equations for our specific robot design:
    psi = arccos((x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2))
    theta = arcsin((y * (L1 + L2 * c2) - x * L2 * s2) / (x^2 + y^2))

    where c2 = (x^2 + y^2 - L1^2 - L2^2) / (2 * L1 * L2);
    and s2 = sqrt(1 - c2^2);

So what makes inverse kinematics so hard? Well, other than the fact that it involves non-linear simultaneous equations, there are other reasons too. First, there is the very likely possibility of multiple, sometimes infinitely many, solutions (as shown below). How would your arm choose which is optimal, based on torques, previous arm position, gripping angle, etc.? There is also the possibility of zero solutions: maybe the location is outside the workspace, or maybe the point within the workspace must be gripped at an impossible angle. Singularities, places of infinite acceleration, can blow up equations and/or leave motors lagging behind (motors can't achieve infinite acceleration). And lastly, exponential equations take forever to calculate on a microcontroller. There is no point in having advanced equations on a processor that can't keep up. Too lazy to calculate inverse kinematics yourself? Check out my Robot Arm Designer v1 in excel.
Motion Planning
Motion planning on a robot arm is fairly complex so I will just give you the basics.

Suppose your robot arm has objects within its workspace; how does the arm move through the workspace to reach a certain point? To do this, assume your robot arm is just a simple mobile robot navigating in 3D space. The end effector will traverse the space just like a mobile robot, except now it must also make sure the other joints and links do not collide with anything. This is extremely difficult to do . . . What if you want your robot end effector to draw straight lines with a pencil? Getting it to go from point A to point B in a straight line is relatively simple to solve. What your robot should do, using inverse kinematics, is go to many points between point A and point B. The final motion will come out as a smooth straight line. You can do this method not only with straight lines, but with curved ones too. On expensive professional robotic arms, all you need to do is program two points and tell the robot how to go between them (straight line, as fast as possible, etc.). For further reading, you could use the wavefront algorithm to plan this two-point trajectory.
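The "many points between A and B" trick is just linear interpolation; each waypoint is then handed to the inverse kinematics. A minimal sketch (the helper name is mine):

```c
#include <assert.h>

/* The i-th of n+1 waypoints on the straight line from a to b
   (i = 0 gives a, i = n gives b). Run once per coordinate, then feed
   each (x, y) waypoint to the inverse kinematics. */
double lerp_waypoint(double a, double b, int i, int n) {
    return a + (b - a) * (double)i / (double)n;
}
```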
Velocity (and more Motion Planning)
Calculating end effector velocity is mathematically complex, so I will go only into the basics. The simplest way to do it is to assume your robot arm (held straight out) is a rotating wheel of radius L. The joint rotates at Y rpm, so the velocity is

    Velocity of end effector on straight arm = 2 * pi * radius * rpm
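Plugging numbers into this formula (the helper name is mine; the result is distance per minute, since rpm is per minute):

```c
#include <assert.h>
#include <math.h>

/* Tip speed of a straight arm of length 'radius' spinning at 'rpm',
   per the formula above. Result is in distance units per minute. */
double tip_velocity(double radius, double rpm) {
    const double pi = 3.14159265358979323846;
    return 2.0 * pi * radius * rpm;
}
```

For example, a 0.5 m arm at 30 rpm sweeps its end effector about 94.2 m per minute (roughly 1.57 m/s).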

However the end effector does not just rotate about the base, but can go in many directions. The end effector can follow a straight line, or curve, etc. With robot arms, the quickest way between two points is often not a straight line. If two joints have two different motors, or carry different loads, then max velocity can vary between them. When you tell the end effector to go from one point to the next, you have two decisions. Have it follow a straight line between both points, or tell all the joints to go as fast as possible - leaving the end effector to possibly swing wildly between those points. In the image below the end effector of the robot arm is moving from the blue point to the red point. In the top example, the end effector travels a straight line. This is the only possible motion this arm can perform to travel a straight line. In the bottom example, the arm is told to get to the red point as fast as possible. Given many different trajectories, the arm goes the method that allows the joints to rotate the fastest. Robot Arm Velocity vs Motion Planning Which method is better? There are many deciding factors. Usually you want straight lines when the object the arm moves is really heavy, as it requires the momentum change for movement (momentum = mass * velocity). But for maximum speed (perhaps the arm isn't carrying anything, or just light objects) you would want maximum joint speeds. Now suppose you want your robot arm to operate at a certain rotational velocity, how much torque would a joint need? First, lets go back to our FBD: Rotational Velocity About Alpha Now lets suppose you want joint J0 to rotate 180 degrees in under 2 seconds, what torque does the J0 motor need? Well, J0 is not affected by gravity, so all we need to consider is momentum and inertia. 
Putting this in equation form:

torque = moment_of_inertia * angular_acceleration

Breaking that equation into sub-components:

torque = (mass * distance^2) * (change_in_angular_velocity / change_in_time)

where

change_in_angular_velocity = (angular_velocity1) - (angular_velocity0)
angular_velocity = change_in_angle / change_in_time

Now, assuming that at start time 0 angular_velocity0 is zero, we get:

torque = (mass * distance^2) * (angular_velocity / change_in_time)

where distance is defined as the distance from the rotation axis to the center of mass of the arm:

center of mass of the arm = distance = 1/2 * (arm_length)    (use arm mass)

But you also need to account for the object your arm holds:

center of mass of the object = distance = arm_length    (use object mass)

So calculate the torque for the arm and again for the object, then add the two torques together for the total:

torque(of_object) + torque(of_arm) = torque(for_motor)

And of course, if J0 were additionally affected by gravity, you would add the torque required to lift the arm to the torque required to reach the velocity you need. To avoid doing this by hand, just use the robot arm calculator. But it gets harder . . . the above equation is for rotational motion and not for straight-line motions. Look up something called a Jacobian if you enjoy mathematical pain =P

Another Video!
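The torque estimate above can be sketched as a short script. This is a Python sketch for illustration only; the arm mass, object mass, and length are made-up example values matching the "180 degrees in 2 seconds" scenario:

```python
import math

def joint_torque(mass_kg, distance_m, angle_rad, move_time_s):
    """Torque (N*m) to spin a point mass through `angle_rad` in `move_time_s`,
    starting from rest, using the tutorial's simplification:
    torque = (mass * distance^2) * (angular_velocity / change_in_time)"""
    angular_velocity = angle_rad / move_time_s      # rad/s reached over the move
    moment_of_inertia = mass_kg * distance_m ** 2   # kg*m^2 for a point mass
    return moment_of_inertia * (angular_velocity / move_time_s)

# Made-up values: a 1 kg arm of length 0.5 m carrying a 0.2 kg object,
# rotating 180 degrees (pi radians) in 2 seconds.
arm_mass, arm_length, obj_mass = 1.0, 0.5, 0.2
angle, t = math.pi, 2.0

torque_arm = joint_torque(arm_mass, arm_length / 2, angle, t)  # arm CoM at half length
torque_obj = joint_torque(obj_mass, arm_length, angle, t)      # object at the arm tip
total = torque_arm + torque_obj
print(total)
```

Note that a real motor also needs margin beyond this figure for friction, gearing losses, and (for other joints) gravity.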
In order to better understand robot arm dynamics, we had a robot arm bowling competition using the same DENSO 6DOF robot arms as in the clocks video. Each team programs an arm to do two tasks:
  • Try to place all three of its pegs in the opponents' goal
  • Block opponent pegs from going in your own goal

Enjoy! (notice the different arm trajectories)

Arm Sagging

Arm sagging is a common affliction of badly designed robot arms. It happens when an arm is too long and heavy, bending when outwardly stretched. When designing your arm, make sure the arm is reinforced and lightweight. Do a finite element analysis to determine bending deflection/stress, as I did on my ERP robot:

Robot Arm Stress Analysis

Keep the heaviest components, such as motors, as close to the robot arm base as possible. It might be a good idea for the middle arm joint to be chain/belt driven by a motor located at the base (to keep the heavy motor on the base and off the arm). The sagging problem is even worse when the arm wobbles between stop-start motions. To solve this, implement a PID controller so as to slow the arm down before it makes a full stop.
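The PID idea can be sketched as a minimal loop. This is illustrative only - the gains and the one-line "plant" model are invented, not tuned for any real arm:

```python
class PID:
    """Minimal PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy simulation: drive a joint angle toward 1.0 rad. The derivative term
# damps the approach, so the "arm" slows down before it stops instead of
# overshooting and wobbling.
pid = PID(kp=2.0, ki=0.1, kd=0.5, dt=0.01)
angle, velocity = 0.0, 0.0
for _ in range(2000):                  # 20 simulated seconds
    torque = pid.update(1.0, angle)
    velocity += torque * 0.01          # pretend unit inertia
    velocity *= 0.99                   # a little friction
    angle += velocity * 0.01
print(angle)
```

On real hardware the controller would read an encoder and command motor current instead of updating this toy model.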
Sensing
Most robot arms only have internal sensors, such as
encoders. But for good reasons you may want to add additional sensors, such as video, touch, haptic, etc. A robot arm without video sensing is like an artist painting with his eyes closed. Using basic visual feedback algorithms, a robot arm could go from point to point on its own without a list of preprogrammed positions. Given a red ball, it could actually reach for it (visual tracking and servoing). If the arm can locate a position in X-Y space of an image, it can then direct the end effector to go to that same X-Y location (by using inverse kinematics). If you are interested in learning more about the vision aspect of visual servoing, please read the Computer Vision Tutorials for more information.
Robot Arm Visual Tracking and Servoing

Haptic sensing is a little different in that there is a human in the loop. The human controls the robot arm movements remotely. This could be done by wearing a special glove, or by operating a miniature model with position sensors. Robotic arms for amputees perform a form of haptic sensing. Also of note, some robot arms have feedback sensors (such as touch) whose signals are directed back to the human (vibrating the glove, locking model joints, etc.).

Phantom Haptic Sensing

Tactile sensing (sensing by touch) usually involves force feedback sensors and current sensors. These sensors detect collisions by catching unexpected force/current spikes, meaning a collision has occurred. A robot end effector can detect a successful grasp, and avoid grasping too tightly or too lightly, just by measuring force. Another method is to use current limiters - sudden large current draws generally mean a collision/contact has occurred. An arm could also adjust end effector velocity by knowing whether it is carrying a heavy object or a light object - perhaps even identify the object by its weight.

Robot Pincers Friction

Try this. Close your eyes, and put both of your hands in your lap. Now, keeping your eyes closed, move your hand slowly to reach for your computer mouse. Do it!!!! You will see why soon . . . What will happen is that your hand will partially miss, but at least one of your fingers will touch the mouse. After that finger touches, your hand will suddenly re-adjust its position because it now knows exactly where that mouse is. This is the benefit of tactile sensing - no precision encoders required for perfect contact!
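The current-spike idea can be sketched as a simple threshold detector. A sketch only - the sample readings, window size, and spike rule are invented for illustration:

```python
def detect_contacts(current_samples, window=5, spike_factor=2.0):
    """Flag indices where motor current jumps to more than `spike_factor`
    times the average of the previous `window` samples - a crude sign
    that the end effector has hit something."""
    contacts = []
    for i in range(window, len(current_samples)):
        baseline = sum(current_samples[i - window:i]) / window
        if current_samples[i] > spike_factor * baseline:
            contacts.append(i)
    return contacts

# Steady ~1 A draw, then a sudden 3.5 A spike at index 8 (a collision).
readings = [1.0, 1.1, 0.9, 1.0, 1.0, 1.1, 1.0, 0.9, 3.5, 1.0]
print(detect_contacts(readings))   # [8]
```

A real controller would sample current at kHz rates and filter noise before thresholding, but the principle is the same.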
End Effector Design
In the future I will write a separate tutorial on how to design robot grippers, as it will require many more pages of material.

In the meantime, you might be interested in reading the tutorial on calculating friction and force for robot end effectors. I also went into some detail describing my robot arm card-dealing gripper. Anyway, I hope you have enjoyed this robot arm tutorial!

Robot Gripper

How to Calculate Point-to-Point SCARA Movements


Gerard Bush at servo motion specialist INMOCO explains what machine builders need to know about motion control for SCARA robot arms.

Machine builders, system integrators and automation engineers are increasingly called upon to provide two-dimensional positioning systems for applications such as packing multiple products into a tray. Normally a Cartesian mechanism is used for this, with two perpendicular linear actuators working in unison, one providing the x-axis movement, the other the y-axis.
But in some applications a jointed, articulated robot arm is better. These look intriguingly like a human arm, with a shoulder joint and an elbow joint. But programming one to be able to move to positions defined by XY co-ordinates can seem a bit daunting at first. The mathematical discipline used to calculate the necessary movements of the upper and lower arms is known as 'motion kinematics'; while this may seem complicated, it is in fact based on some simple trigonometric principles.
Articulated robot arms are often referred to as SCARA (Selective Compliant Assembly Robot Arm) robots, or PUMA (Programmable Universal Manipulation Arm) robots. In the most common SCARA robot configurations there are four axes of motion, although only two - the shoulder rotation and the elbow rotation - contribute to the positioning of the load point in the XY plane. The third axis is the wrist (or load point), which is often a tool or end effector, and the fourth axis moves the entire arm up and down.

SCARA positioning

How do the two main motion actuators, the shoulder joint and the elbow joint, position the end effector? To answer this, we must calculate the 'forward kinematics' of the arm, ie equations that translate the combined shoulder and elbow rotations into the X and Y co-ordinates of the load point.

The equations denote the shoulder-connected segment length as Ls, the elbow-connected segment length as Le, the shoulder angle as phi (Φ), and the elbow angle as theta (Θ), measured relative to the upper arm:

X = Ls·cos(Φ) + Le·cos(Φ + Θ)
Y = Ls·sin(Φ) + Le·sin(Φ + Θ)

This is basic trigonometry. If we know the lengths of the upper and lower arm parts and the angles of the two joints, we can calculate the X and Y axis co-ordinates of the end effector.
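The forward kinematics can be sketched in a few lines. A Python sketch, assuming the standard two-link equations with the elbow angle measured relative to the upper arm; the link lengths are made-up example values:

```python
import math

def scara_forward(Ls, Le, phi, theta):
    """Load-point (X, Y) for shoulder angle phi and elbow angle theta
    (theta measured relative to the upper arm), both in radians."""
    x = Ls * math.cos(phi) + Le * math.cos(phi + theta)
    y = Ls * math.sin(phi) + Le * math.sin(phi + theta)
    return x, y

# Arm fully extended along the X axis: phi = 0, theta = 0.
print(scara_forward(0.3, 0.2, 0.0, 0.0))   # (0.5, 0.0)
```
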
But there is an obvious problem here – the mathematical procedure is the reverse of the practical situation. What we actually know is the desired XY co-ordinates; what we want to calculate is the two joint angles. So we need to use 'inverse kinematics' - which is more complicated.

To begin with, there are always two ways to get to every location in the motion plane: with the arm cranked to the left or with the arm cranked to the right. Nevertheless, still using basic trigonometry, we can derive the equations for the two joint angles (phi and theta) given X and Y:

Θ = ±arccos((X² + Y² − Ls² − Le²) / (2·Ls·Le))
Φ = atan2(Y, X) − atan2(Le·sin(Θ), Ls + Le·cos(Θ))

The ± sign selects between the two arm configurations.
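The inverse kinematics can likewise be sketched, using the standard two-link solution (a Python sketch; the target point and link lengths are illustrative, and the result is checked by running it back through the forward equations):

```python
import math

def scara_inverse(Ls, Le, x, y, elbow_right=True):
    """Joint angles (phi, theta) that place the load point at (x, y).
    The two-solution ambiguity ("cranked left" vs "cranked right") is
    picked via elbow_right. Raises ValueError if (x, y) is out of reach."""
    c = (x * x + y * y - Ls * Ls - Le * Le) / (2 * Ls * Le)
    if not -1.0 <= c <= 1.0:
        raise ValueError("target out of reach")
    theta = math.acos(c)
    if not elbow_right:
        theta = -theta
    phi = math.atan2(y, x) - math.atan2(Le * math.sin(theta),
                                        Ls + Le * math.cos(theta))
    return phi, theta

phi, theta = scara_inverse(0.3, 0.2, 0.3, 0.2)
# Check by substituting back into the forward equations:
x = 0.3 * math.cos(phi) + 0.2 * math.cos(phi + theta)
y = 0.3 * math.sin(phi) + 0.2 * math.sin(phi + theta)
print(round(x, 6), round(y, 6))
```
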

Dynamics

The above equations are used to calculate positions for the load point, but when it moves from one position to another the natural trajectory is not a straight line but a curve. This is due to the effect of the combined rotation of the two joints. The curved trajectory leads to extra energy consumption compared with a straight line, and can make tracking the load point more difficult. Thus a straight line motion, while not always strictly necessary, is usually preferable.
In order to achieve a straight line motion the speed of rotation of both joints must be constantly varied. This is, in fact, the primary challenge of controlling non-Cartesian mechanisms: to continuously calculate and reset the speed of each joint's rotation to produce direct-line travel.
Fortunately, there are a number of ways to do this. For instance, some motion controllers allow you to download equations, and have sufficient computing power to perform inverse kinematic calculations on the fly.
Another approach, and perhaps the most common method used until relatively recently, is to break the curved trajectory into discrete steps and use standard motion control programs to convert each one into a straight line. Even just a few steps will improve the straightness, and if you break the movement into dozens or hundreds of pieces, very smooth motions will be achieved. The main downside of this segmented approach is that it requires a lot of work from the supervisory software program.
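The segmented approach can be sketched directly: break the straight line into waypoints and solve the joint angles at each one. A Python sketch reusing the standard two-link inverse-kinematic solution; the link lengths, endpoints, and step count are illustrative:

```python
import math

def two_link_ik(Ls, Le, x, y):
    """Elbow-right joint angles (phi, theta) for a two-link arm at (x, y)."""
    c = (x * x + y * y - Ls * Ls - Le * Le) / (2 * Ls * Le)
    theta = math.acos(max(-1.0, min(1.0, c)))   # clamp for float safety
    phi = math.atan2(y, x) - math.atan2(Le * math.sin(theta),
                                        Ls + Le * math.cos(theta))
    return phi, theta

def straight_line_profile(Ls, Le, start, end, steps):
    """Joint-angle waypoints along the straight line start -> end.
    Commanding the joints through these in sequence approximates
    straight-line travel; more steps give a straighter path."""
    profile = []
    for i in range(steps + 1):
        t = i / steps
        x = start[0] + t * (end[0] - start[0])
        y = start[1] + t * (end[1] - start[1])
        profile.append(two_link_ik(Ls, Le, x, y))
    return profile

waypoints = straight_line_profile(0.3, 0.2, (0.4, 0.0), (0.0, 0.4), steps=10)
print(len(waypoints))   # 11 waypoints, one per segment boundary
```

This is exactly the trade-off described above: ten segments already approximate the line well, and the supervisory program bears the cost of the per-waypoint calculations.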
The third approach, now becoming increasingly popular, uses software lookup tables to perform 'reference frame conversions' on the fly. It works like this: you program the starting and ending XY co-ordinates, plus several intermediate co-ordinates that are on the straight line trajectory. Offline the software does all the calculations to move the end effector through each point in turn. Execution of the profile by the motion controller then happens in real time, via a simple and fast lookup operation.
Lookup tables could require fairly powerful controllers. In fact a table that contained every possible movement would take up an impractical amount of memory. So the practical answer is to use only a specific library of moves that are needed for the application. Thus most motion systems have instructions for moves such as 'extend straight out', 'retract straight in' and 'move from station #3 to station #8'.
This library method is useful and convenient, but has certain limitations. The main one is that you must create a separate table for each application. Another is that you cannot use the feedforward techniques employed by servo engineers working with Cartesian systems to compensate for load inertia.

Summary

The fundamental mathematics of control for SCARA robot arms and similar systems are described by forward and inverse kinematics. But these equations can be complex, and create challenges for managing path execution. There are techniques to manage this complexity, of which translation tables that convert XY points to phi and theta angular movements are increasingly the most popular.
This branch of motion control mathematics can seem daunting but, with practice, it can be mastered. And it is always a good idea to seek the assistance of the experts from your motion control provider – after all, they practise the techniques day in and day out.

 

Contour Analysis for Image Recognition in C#


The theory of contour analysis and its practical application to image recognition and OCR
 
Source code and demo include all needed OpenCV libs. Project requires .NET Framework 4.

Introduction

This article describes the theoretical basis of contour analysis and aspects of its practical application to image recognition. The article also includes a library for contour analysis and a demo example.
The first part of the article contains the main definitions and theorems of contour analysis. I have tried to select the key points that allow you to grasp the essence of contour analysis quickly and begin applying it in practice. I have also added some material of my own, mostly concerning aspects of the theory and the optimization of contour-analysis algorithms; the second part of the article is devoted to this. It also presents the results of the algorithms and describes the problems and shortcomings of the method.
The third part describes the C# library ContourAnalysis.

Part 1: Basics of Contour Analysis

What Contour Analysis (CA) Is Needed For

CA allows us to describe, store, compare and find objects represented by their exterior outlines - contours.
It is assumed that the contour contains the necessary information about the object's shape. Interior points of the object are not taken into account. This restricts the applicability of CA algorithms, but considering only contours allows us to pass from the two-dimensional space of the image to the space of contours, and thereby to lower the computational and algorithmic complexity.
CA effectively handles the main pattern-recognition transformations - translation, rotation and rescaling of the object's image - because CA methods are invariant to these transformations.

The Main Concepts

First, let's define an object contour: the contour is the boundary of an object, a set of points (pixels) separating the object from the background.
Computer-vision systems use several formats for coding a contour - the Freeman chain code, two-dimensional coding and polygonal coding are the best known. But none of these formats is used in CA.
Instead, in CA the contour is encoded as a sequence of complex numbers. A point on the contour is fixed as the starting point. Then the contour is scanned (say, clockwise), and each offset vector is written as a complex number a+ib, where a is the offset along the x axis and b the offset along the y axis. The offset is taken relative to the previous point.
Owing to the physical nature of three-dimensional objects, their contours are always closed and cannot self-intersect. This allows us to define the traversal of a contour unambiguously (up to direction - clockwise or counter-clockwise). The last vector of a contour always leads back to the starting point.
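The encoding just described can be sketched with Python's built-in complex type (the article's library is in C#; this sketch and its square contour are purely illustrative):

```python
def contour_to_vc(points):
    """Encode a closed contour (list of (x, y) points in traversal order)
    as a vector-contour: each elementary vector is the offset to the next
    point, written as the complex number a + ib."""
    vc = []
    for i in range(len(points)):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % len(points)]   # wrap: last EV returns to start
        vc.append(complex(x1 - x0, y1 - y0))
    return vc

# A 1x1 square traversed clockwise from (0, 0).
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
vc = contour_to_vc(square)
print(vc)        # the elementary vectors
print(sum(vc))   # the EVs of a closed contour sum to zero (property 1 below)
```
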
Each vector of a contour is called an elementary vector (EV), and the sequence of complex numbers a vector-contour (VC).
Vector-contours are denoted by capital Greek letters, and their elementary vectors by small Greek letters.
Thus, a vector-contour Γ of length k can be written as:

Γ = (γ1, γ2, …, γk)

Why does CA use complex-valued coding? Because operations on a contour as a vector of complex numbers possess remarkable mathematical properties, compared with other coding modes.
Essentially, complex coding is close to two-dimensional coding, where the contour is defined as a set of EVs in two-dimensional coordinates. But the scalar product of vectors and the scalar product of complex numbers differ, and this difference is what gives CA methods their advantage.

Properties of Contours

  1. The sum of the EVs of a closed contour is equal to zero. This is trivial: since the elementary vectors lead back to the starting point, their sum is the zero vector.
  2. A vector-contour does not depend on parallel translation of the source image. Since the contour is encoded relative to the starting point, this mode of coding is invariant to shifts of the initial contour.
  3. Rotating the image by a certain angle is equivalent to rotating each EV of the contour by the same angle.
  4. Changing the starting point leads to a cyclic shift of the VC. Since EVs are encoded relative to the previous point, changing the starting point leaves the sequence of EVs the same, except that the first EV is now the one that begins at the new starting point.
  5. Rescaling the source image can be considered as multiplying each EV of the contour by the scale factor.

Scalar Product of Contours

The scalar product of contours Γ and N is the complex number:

(Γ, N) = Σ(γn, νn), n = 1…k        (1)

where k is the dimensionality of the VC, γn is the n-th elementary vector of contour Γ, and νn the n-th EV of contour N. (γn, νn) is the scalar product of complex numbers, calculated as:

(a + ib, c + id) = (ac + bd) + i(ad − bc)        (2)

Note that in CA the scalar product is defined only for VCs of identical dimensionality - the number of elementary vectors in the two contours must coincide.
The scalar product of ordinary vectors and the scalar product of complex numbers differ. If we multiplied the EVs as simple vectors, their scalar product would look like this:

(a, b)·(c, d) = ac + bd        (3)

Compare this formula with formula (2) and note that:
  • The result of the scalar product of vectors is a real number, while the result of the product of complex numbers is a complex number.
  • The real part of the scalar product of complex numbers coincides with the scalar product of the corresponding vectors. That is, the complex product includes the vector scalar product.
Now let's recall linear algebra - specifically, the physical sense and properties of the scalar product. In linear algebra, the scalar product equals the product of the vectors' lengths and the cosine of the angle between them. This means that two perpendicular vectors always have a zero scalar product, while collinear vectors, conversely, give the maximum value of the scalar product.
These properties allow the product to be used as a measure of closeness of vectors: the larger it is, the smaller the angle between the vectors, and the "closer" they are to each other. For perpendicular vectors it drops to zero, and it becomes negative for vectors pointing in opposite directions. It turns out that the scalar product (1) possesses similar properties.
Let's introduce one more concept - the normalized scalar product (NSP):

η = (Γ, N) / (|Γ|·|N|)        (4)

where |Γ| and |N| are the norms (lengths) of the contours, calculated as:

|Γ| = sqrt(Σ|γn|²)        (5)

The NSP in the space of complex numbers is also a complex number.
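The scalar product and NSP just defined can be sketched directly (a Python sketch; the article's library is C#). Here `cdot` implements the complex scalar product as `conj(a)*b`, a sign convention chosen so that rotating the second contour by 90 degrees yields NSP = i, matching the example below:

```python
import math

def cdot(a, b):
    """Scalar product of complex numbers: (a, b) = conj(a) * b.
    Its real part equals the ordinary vector dot product."""
    return a.conjugate() * b

def contour_dot(g, n):
    """Scalar product of two vector-contours of identical dimensionality."""
    assert len(g) == len(n)
    return sum(cdot(ge, ne) for ge, ne in zip(g, n))

def norm(g):
    """Norm of a vector-contour: sqrt(sum |EV|^2)."""
    return math.sqrt(sum(abs(e) ** 2 for e in g))

def nsp(g, n):
    """Normalized scalar product: (G, N) / (|G| |N|)."""
    return contour_dot(g, n) / (norm(g) * norm(n))

square = [1j, 1 + 0j, -1j, -1 + 0j]          # a unit square as EVs
rotated = [e * 1j for e in square]           # same contour rotated 90 degrees
print(nsp(square, square))    # 1: identical contours
print(nsp(square, rotated))   # i: |NSP| = 1, argument = rotation angle
```
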
Unity is the greatest possible value of the norm of the NSP (this follows from the Cauchy–Bunyakovsky–Schwarz inequality |ab| <= |a||b|), and it is reached only if

Γ = μN        (6)

where μ is an arbitrary complex number.
What does this mean in practice? Recall the physical sense of multiplying complex numbers: their lengths are multiplied, and their arguments (angles) are added. So the contour μN is the same contour N, but rotated and scaled; the scale and rotation are defined by the complex number μ.
Thus, the norm of the NSP reaches its maximum value - unity - only if contour Γ is the same contour as N, rotated by some angle and scaled by some coefficient.
As an example, consider the scalar product of a contour with itself, rotated by a certain angle.
If we compute the NSP of a contour with itself, we get NSP = 1; if we rotate the contour by 90 degrees, we get NSP = 0 + i; rotating it by 180 degrees gives NSP = -1. Thus, the real part of the NSP gives the cosine of the angle between the contours, while the norm of the NSP always remains equal to 1.
Similarly, if we multiply a VC by some real coefficient (a scale factor), we also get NSP = 1 (this is easy to see from formula (4)).
Note: The norm of the NSP is invariant to translation, rotation and scaling of contours.
So, the norm of the normalized scalar product of two contours equals unity only when the contours are equal up to rotation and scale. Otherwise, the norm of the NSP is strictly less than unity.
This is the central conclusion of CA. In effect, the norm of the NSP is an invariant under translation, rotation and scaling of contours. If two contours are identical, their NSP always gives unity, regardless of where the contours are, their angle of rotation or their scale. Similarly, if the contours differ, their NSP is strictly less than 1, again independent of position, rotation and scale.
Note: The norm of the NSP is a measure of contour closeness.
The norm gives a measure of the likeness of two contours, and the argument of the NSP (equal to atan(b/a)) gives the angle of rotation of one contour relative to the other.

Correlation Functions of Contours

In the previous chapter, we established that the NSP is an extremely useful tool for finding similar contours. Unfortunately, one circumstance prevents us from using it directly: the choice of the starting point.
The matter is that equality (6) is reached only if the starting points of the contours coincide. If the contours are identical, but the enumeration of EVs begins at a different starting point, the norm of the NSP of these contours will not equal unity.
Let's introduce the concept of the intercorrelation function (ICF) of two contours:

τ(m) = (Γ, N(m)) / (|Γ|·|N|),  m = 0, 1, …, k−1        (7)

where N(m) is the contour obtained from N by cyclically shifting its EVs by m elements.
For example, if N = (n1, n2, n3, n4), then N(1) = (n2, n3, n4, n1), N(2) = (n3, n4, n1, n2), and so on.
What does the intercorrelation function show? The values of this function show how similar contours Γ and N are if the starting point of N is shifted by m positions.
The ICF is defined on the whole set of integers, but since a cyclic shift by k leads back to the initial contour, the ICF is periodic with period k. Therefore, we are interested in its values only in the range from 0 to k−1.
Let's find the value with the maximum norm among the values of the ICF:

τmax = τ(mmax), where mmax = argmax |τ(m)|        (8)

From the definitions of the NSP and the ICF, it is clear that τmax is a measure of the similarity of two contours, invariant to translation, scaling, rotation and starting-point shift.
Thus, the norm |τmax| shows the degree of similarity of the contours, reaching unity for identical contours, and the argument arg(τmax) gives the angle of rotation of one contour relative to the other.
Note: The maximum of the ICF norm is a measure of the similarity of two contours.
Note: The maximum of the ICF norm is invariant to translation, scaling, rotation and starting-point shift.
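The ICF and its maximum can be sketched as follows (a Python sketch, not the article's C# library; the triangle contour is an invented asymmetric example so that the maximum is unique):

```python
import math

def cdot(a, b):
    """(a, b) = conj(a) * b; the real part is the vector dot product."""
    return a.conjugate() * b

def norm(g):
    return math.sqrt(sum(abs(e) ** 2 for e in g))

def icf(g, n):
    """Intercorrelation function: tau(m) = (G, N(m)) / (|G| |N|),
    where N(m) is N cyclically shifted by m elementary vectors."""
    k = len(g)
    nn = norm(g) * norm(n)
    return [sum(cdot(g[i], n[(i + m) % k]) for i in range(k)) / nn
            for m in range(k)]

def tau_max(g, n):
    """ICF value of maximum norm: its norm measures similarity, its
    argument gives the rotation of one contour relative to the other."""
    return max(icf(g, n), key=abs)

# An asymmetric contour (a right triangle), as elementary vectors:
tri = [2 + 0j, -2 + 1j, -1j]
# The same contour rotated 90 degrees AND read from a different starting point:
other = [e * 1j for e in tri[1:] + tri[:1]]
t = tau_max(tri, other)
print(abs(t))                                       # ~1.0: the contours match
print(math.degrees(math.atan2(t.imag, t.real)))     # ~90: relative rotation
```
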
Let's introduce one more concept - the autocorrelation function (ACF). The autocorrelation function is an ICF with N = Γ. In essence, it is the scalar product of a contour with itself at various shifts of the starting point:

ACF(m) = (Γ, Γ(m))

Let's consider some properties of the ACF:
  1. The ACF does not depend on the choice of the contour's starting point. Indeed, look at the definition of the scalar product (1): changing the starting point simply changes the order of the summed terms and does not change the sum itself. This conclusion is not obvious at first, but it becomes clear if you ponder the meaning of the ACF.
  2. The norm of the ACF is symmetric about the central reference point k/2. Since the ACF is a sum of pairwise products of the contour's EVs, each pair occurs twice on the interval from 0 to k. For example, let N = (n1, n2, n3, n4), and write the ACF values for different m:
    ACF(0)=(n1,n1)+(n2,n2)+(n3,n3)+(n4,n4)
    ACF(1)=(n1,n2)+(n2,n3)+(n3,n4)+(n4,n1)
    ACF(2)=(n1,n3)+(n2,n4)+(n3,n1)+(n4,n2)
    ACF(3)=(n1,n4)+(n2,n1)+(n3,n2)+(n4,n3)
    ACF(4)=(n1,n1)+(n2,n2)+(n3,n3)+(n4,n4)

    Note that the terms in ACF(1) are the same as in ACF(3), up to permutation of factors. Recalling that for complex numbers (a, b) = (b, a)*, we get ACF(1) = ACF(3)*, where * denotes the complex conjugate.
    And since |a*| = |a|, it follows that the norms of ACF(1) and ACF(3) coincide.
    Similarly, the norms of ACF(0) and ACF(4) coincide.
    From here on, by the ACF we will mean only the part of the function on the interval from 0 to k/2, since the remaining part is symmetric to the first.
  3. If the contour has any rotational symmetry, its ACF has a similar symmetry. As an example, here are ACF graphs for several contours:
    In the pictures, the norm of the ACF is shown in dark blue (the ACF is shown only for the interval from 0 to k/2).
    As we can see, all the contours except the last have rotational symmetry, which leads to symmetry in the ACF. The last contour has no such symmetry, and the graph of its ACF is not symmetric.
  4. In a sense, the ACF of a contour can be considered a characteristic of the contour's shape. Shapes close to a circle have uniform values of the ACF norm (see the picture for the circle). Shapes strongly elongated in one direction have a dip in the central part of the ACF (see the rectangle picture). Shapes that map onto themselves under rotation have ACF maxima at the corresponding places (see the square's ACF).
  5. The normalized ACF does not depend on the scale, position, rotation or choice of starting point of the contour. This follows from point 1 and from the properties of the NSP.
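The ACF and its invariance can be sketched as follows (a Python sketch with an invented triangle contour; the second contour is the first one scaled by 0.5, rotated 90 degrees, and read from a different starting point):

```python
def cdot(a, b):
    """(a, b) = conj(a) * b."""
    return a.conjugate() * b

def acf(g):
    """Autocorrelation function ACF(m) = (G, G(m)), taken on 0..k/2
    (the rest is symmetric). Its norm, after normalization, is invariant
    to translation, rotation, scaling and starting-point choice - a
    shape descriptor."""
    k = len(g)
    return [sum(cdot(g[i], g[(i + m) % k]) for i in range(k))
            for m in range(k // 2 + 1)]

tri = [2 + 0j, -2 + 1j, -1j]
moved = [e * 0.5j for e in tri[1:] + tri[:1]]   # scaled 0.5, rotated 90°, new start

a1, a2 = acf(tri), acf(moved)
# The ACF norms match up to the scale factor squared (0.5^2 = 0.25 here):
print([round(abs(v), 3) for v in a1])            # [10.0, 7.81]
print([round(abs(v) / 0.25, 3) for v in a2])     # [10.0, 7.81]
```
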
This concludes the theoretical part of CA. In this part, I have presented only the main concepts and theoretical results, which I will immediately apply to pattern recognition in the second part.

Part 2: Practical Application of the Contour Analysis

The General Algorithm of Recognition

So, let's solve the task of pattern recognition in an image.
The general sequence of operations during recognition looks like this:
  1. Preliminary processing of the image - smoothing, noise filtering, contrast enhancement
  2. Binarization of the image and selection of object contours
  3. Initial filtering of contours by perimeter, area, crest factor, fractality and so on
  4. Reduction of contours to uniform length, smoothing
  5. Going through all the contours found, searching for the template most similar to the given contour
We will not consider steps 1 and 3: they are specific to the application area and have little relation to CA.
Next we consider the core of the algorithm - searching for and comparing contours against templates. Then we briefly cover binarization, reduction to uniform length, and smoothing of contours.

Estimating the Performance of CA Algorithms

First, let's estimate the speed of algorithms based on CA.
Assume the image is already binarized and its contours have been selected.
Since we will work only with contour points, let's estimate their total number in the image.
For this purpose, take an image of size n×n pixels and cover it with a uniform grid of step s. The total length of all grid lines is:

L = 2n²/s

It turns out that moving from the flat two-dimensional image to contours does not reduce the dimensionality of the task: we still work with complexity O(n²).
However, several remarks are in order:
  1. The estimate above is an extreme case. In a real image, not all contours have the minimum size, and they do not cover the entire area of the image. Hence, the number of contours and their total perimeter can be much less than 2n²/s. Ideally, if the image contains a single object without noise, CA works with only one contour, whose number of pixels is not quadratic but linear in n. When working with the image as a two-dimensional array of pixels (for example, applying cascade Haar filters), we always work with all pixels of the image, regardless of how many objects it contains.
  2. Since the image in contour form already has a natural segmentation - it is divided into contours - parts of the image can be filtered out using simple features, such as contour area, perimeter, or the ratio of squared perimeter to area. This gives a simple and effective mechanism for the preliminary filtering of parts of the image.
  3. Basic two-dimensional methods, as a rule, are not invariant to scale (SURF, SIFT) or to rotation (Haar). Therefore, they actually work in a three-dimensional space, where one more dimension appears from the need to iterate over scales or rotation angles. Since CA methods are invariant to scale and rotation, this problem does not exist here (the exception is non-invariance to the starting point, but below we consider effective ways of dealing with this complication).
  4. CA allows the image to be processed in a progressive mode. We can sort the contours by any feature (for example, by area, by boundary gradient, by brightness, etc.), process the first contour and produce a result, and process the remaining contours in the background. This means the first result (and in many applied tasks that is all that is needed) can be obtained in O(n), which is an excellent estimate for pattern-recognition algorithms.
  5. Since the contours are independent of each other, the recognition algorithms are easy to parallelize. Besides, the algorithms are very simple and can run on graphics processors.
All the above considerations concern only the processing of contours. Of course, the stage of obtaining the contours does not disappear, and it requires at least O(n²). In practice, however, this stage takes no more than 10-20% of the total processing time.
Together, these factors substantially reduce the constant of the computational complexity, and given the power of modern computers, CA can well be used as a real-time algorithm.
Let's pin down the reason for the O(n²) complexity. A contour is essentially a one-dimensional object - a vector of complex numbers. However, the number of contours grows proportionally to the square of the image size, and this is where the O(n²) complexity comes from. It follows that the most effective way to gain performance is to reduce the number of contours by preliminary filtering.
Now let's estimate the complexity of processing a single contour.
In the previous part, we established that to compare contours we must find the maximum of the ICF (formulas 7, 8). Let's estimate the cost of this evaluation. A scalar product requires O(k) operations, where k is the contour length, and the ICF requires k scalar-product evaluations, so the ICF costs on the order of O(k²). Considering that a contour is compared not with one template but with a set of them, the total time to find a template for one contour can be estimated as:

O(k²·t)

where k is the contour length and t the number of template contours.
This is not such a good estimate. In the following chapter, we will fight it and manage to reduce the constant of this estimate substantially.
In total, the complexity of identifying all contours of the image is:

O((2n²/s)·k·t)

where n is the linear image size, k the contour length and t the number of templates.
The complexity of identifying the first contour (in progressive mode) is:

O(n² + k²·t)

Contour Descriptor

As it has been shown in the previous chapter, complexity of an evaluation of ICF is O(k2). Despite that k - a constant (to uniform length we consider process of coercion of contours further), all the same, an evaluation an ICF - labor-intensive process. At k=30 on an evaluation the ICF is required 900 operations of multiplying of complex numbers. And at presence t=1000 templates, searching of a template for a contour demands about one million multiplyings of complex numbers.
For fast searching of templates, it is necessary to introduce the certain descriptor characterizing the shape of a contour. Thus, close among themselves contours should have the close descriptors. It would save us the procedure of an evaluation an ICF of a contour with each template. Would be to compare only descriptors and if they are close - only in that case enough - to calculate an ICF. Comparing of descriptors should be fast. Ideally, one number should be a descriptor.
Unfortunately, in the literature, I did not discover explicit exposition of the suitable descriptors describing the shape of a contour. Nevertheless, such descriptor exists.
In the first part, we explicitly considered properties of an autocorrelation function. In the same place, we marked that an ACF invariantly to transposition, rotation, scaling and a starting point choice. And besides, the ACF is a function of one contour, instead of two, as an ICF.
Proceeding from it, the ACF can be selected as the descriptor of shape of a contour. The close contours will always have the close values an ACF.
Note: ACF is shape descriptor of a contour.
Comparing two ACFs has complexity O(k), which is already considerably better than O(k²) for the ICF. Moreover, as we remember, due to its symmetry the ACF is defined only on the interval from 0 to k/2, which halves the computation time once more.
If the template base stores the ACF of each template, then searching for a template by comparing ACFs takes O(kt). This is already a good estimate, but it can be refined further.
To reduce the time of ACF comparison, I apply a wavelet convolution of the ACF. I omit the description of this process; it can be seen in the source code.
Let's note the following:
  1. Comparing ACFs does not, in general, free us from the need to evaluate the ICF. Only the ICF gives an exact estimate of the closeness of contours; the ACF may, in general, coincide for different contours. However, preliminary selection of templates by the ACF substantially narrows down the number of candidates for the ICF comparison.
  2. In some cases, comparing ACFs can be sufficient for contour identification. If the image is searched for markers with a simple geometric shape, and the probability of failed recognitions is not critical, the ICF comparison can be omitted altogether, restricting the search to comparing ACF convolutions. Such an algorithm gives the highest system performance.
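Putting the idea together, here is a self-contained sketch (my own illustration, not the library's API) of an ACF descriptor and the cheap O(k) comparison that precedes the expensive ICF check. It also demonstrates the invariance: a rotated, scaled copy of a contour with a shifted starting point has the same descriptor, while a different shape does not:

```csharp
using System;
using System.Numerics;

class AcfFilterDemo
{
    // Normalized ACF of a vector contour, kept only on 0..k/2 (the ACF
    // is symmetric). Division by acf[0] (the squared norm of the
    // contour, a real positive number) makes the descriptor scale-invariant.
    public static Complex[] AcfDescriptor(Complex[] a)
    {
        int k = a.Length;
        var acf = new Complex[k / 2 + 1];
        for (int m = 0; m <= k / 2; m++)
        {
            Complex sum = Complex.Zero;
            for (int n = 0; n < k; n++)
                sum += a[n] * Complex.Conjugate(a[(n + m) % k]);
            acf[m] = sum;
        }
        double norm = acf[0].Real;             // acf[0] = sum of |a[n]|^2
        for (int m = 0; m < acf.Length; m++)
            acf[m] /= norm;
        return acf;
    }

    // Cheap O(k) distance between two ACF descriptors.
    public static double AcfDistance(Complex[] d1, Complex[] d2)
    {
        double dist = 0;
        for (int m = 0; m < d1.Length; m++)
            dist += (d1[m] - d2[m]).Magnitude;
        return dist;
    }

    static void Main()
    {
        // a square contour of 8 elementary vectors
        var square = new Complex[] { 1, 1, Complex.ImaginaryOne, Complex.ImaginaryOne,
                                     -1, -1, -Complex.ImaginaryOne, -Complex.ImaginaryOne };
        // the same square rotated by 45 degrees, scaled by 2, with a
        // shifted starting point: its ACF descriptor must coincide
        var rot = Complex.FromPolarCoordinates(2, Math.PI / 4);
        var rotated = new Complex[8];
        for (int i = 0; i < 8; i++)
            rotated[i] = square[(i + 3) % 8] * rot;
        // a different shape: a straight run of identical EVs
        var line = new Complex[] { 1, 1, 1, 1, 1, 1, 1, 1 };

        Console.WriteLine(AcfDistance(AcfDescriptor(square), AcfDescriptor(rotated))); // ~0: same shape
        Console.WriteLine(AcfDistance(AcfDescriptor(square), AcfDescriptor(line)));    // large: rejected
        // only candidates passing the ACF check would go on to the O(k^2) ICF
    }
}
```

In a real search, only the few templates whose ACF distance is below a threshold would be compared with the contour via the full ICF.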

Equalization of Contours

As already noted in part 1, CA methods assume contours of identical length, whereas contours in a real image have arbitrary length.
Therefore, for searching and comparing contours, all of them must be brought to a uniform length. This process is called equalization.
First, we fix the length of the vector contour (VC) that our recognition system will use; denote it k.
Then, for each initial contour A, we create a vector contour N of length k. Two cases are possible: the initial contour has either more elementary vectors (EVs) than k, or fewer.
If the initial contour is longer than needed, we iterate over all its EVs and accumulate them into the elements of N as follows (C#):
Complex[] newPoint = new Complex[newCount];

// each element of the new contour accumulates a group of source EVs
for (int i = 0; i < Count; i++)
    newPoint[i * newCount / Count] += this[i];
The picture shows the meaning of equalization:


This algorithm is rather crude, especially for lengths only slightly greater than k, but it works quite well in practice. If the initial contour is shorter than k, we interpolate, computing approximately as follows:
Complex[] newPoint = new Complex[newCount];

for (int i = 0; i < newCount; i++)
{
    double index = 1d * i * Count / newCount;
    int j = (int)index;
    double t = index - j;   // fractional part of the source position
    // wrap around to the first EV so that the neighbor index cannot go out of range
    newPoint[i] = this[j] * (1 - t) + this[(j + 1) % Count] * t;
}
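The two fragments above operate on the fields of the Contour class; as a standalone illustration (my own sketch, not the library's code), both cases can be combined into one self-contained function:

```csharp
using System;
using System.Numerics;

class EqualizeDemo
{
    // Brings a contour (an array of elementary vectors) to length newCount:
    // groups of EVs are summed when the contour is longer than needed,
    // intermediate EVs are linearly interpolated when it is shorter.
    public static Complex[] Equalize(Complex[] src, int newCount)
    {
        var dst = new Complex[newCount];
        int count = src.Length;

        if (count > newCount)
        {
            // each element of the new contour accumulates a group of source EVs
            for (int i = 0; i < count; i++)
                dst[i * newCount / count] += src[i];
        }
        else
        {
            // linear interpolation between neighboring EVs (with wrap-around)
            for (int i = 0; i < newCount; i++)
            {
                double index = 1d * i * count / newCount;
                int j = (int)index;
                double t = index - j;
                dst[i] = src[j] * (1 - t) + src[(j + 1) % count] * t;
            }
        }
        return dst;
    }

    static void Main()
    {
        // downsampling: 6 EVs -> 3, adjacent pairs are summed
        var down = Equalize(new Complex[] { 1, 1, 2, 2, 3, 3 }, 3);
        Console.WriteLine(down[0].Real + " " + down[1].Real + " " + down[2].Real); // 2 4 6

        // upsampling: 2 EVs -> 4, intermediate EVs are interpolated
        var up = Equalize(new Complex[] { 2, 4 }, 4);
        Console.WriteLine(up[0].Real + " " + up[1].Real + " " + up[2].Real + " " + up[3].Real); // 2 3 4 3
    }
}
```

Note that the summing branch preserves the total of the EVs, so a closed contour (whose EVs sum to zero) stays closed; the interpolation branch is only approximate in this respect.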
A question remains: which value of k is optimal? The answer is determined entirely by the specifics of the application area. On the one hand, a large k means high computational cost; on the other hand, small values of k carry less information, so recognition accuracy decreases and noise recognition increases.
The diagram below shows the main parameters of the recognition system as a function of the selected VC length:
This diagram was built for a Latin-letter recognition system, based on the ICDAR tests.
As we can see, at large contour lengths (80 and above) the system parameters stabilize and change little (except the processing time, which naturally grows).
At small values of k (less than 30), the number of noise recognitions (noise or other non-character image elements recognized as letters) grows, the number of successfully recognized characters decreases, and the number of false recognitions increases.
Thus, k=30 is optimal for the given recognition system.
Note that increasing the contour length beyond a certain level (25 and above) does not improve recognition quality at all. The reason is that in the described method equalization is performed together with contour smoothing. At large lengths the smoothing becomes ever finer, the contour becomes too detailed and therefore more distinct from the template contours (in this test, the characters of the training set have no exact match among the test images).

CA Limitation

CA has two groups of factors that negatively affect recognition results.
The first group is connected with the problem of extracting a contour from an image. A contour is a strictly defined discrete structure, yet a great number of real images contain objects that are poorly separated from the background. An object may have no clear boundary, may match the background in brightness and color, may be noisy, and so on. All these factors lead to the contour either not being extracted at all, or being extracted incorrectly, so that it does not correspond to the object's boundary.
The picture on the right shows a binarized image. Small "bridges" between the digit and the surrounding background make the digit's contour unrecognizable by CA methods.
Such cases are very hard for CA. After all, CA makes sense only when the object's contour is defined unambiguously and correctly at every point.
The second group of factors complicating CA is connected with the principles of contour analysis itself. CA methods assume that the contour describes the whole object and do not allow any intersections with other objects or partial visibility of the object.
The picture on the right shows a binarized image. The photo flare, running as a horizontal line, makes the letters indistinguishable for CA.
Thus, CA is weakly robust to interference and does not tolerate intersection or partial visibility of the object.

Outcomes of Testing of a Method

Despite these limitations, CA methods are attractive for their simplicity and speed. Given a clearly expressed object on a contrasting background and an absence of interference, CA copes with recognition well.
Testing CA methods on the ICDAR tests yields 48% of characters recognized. This is quite a good result, considering that this test is very hard and contains a great number of poorly readable letter images. Moreover, CA processed 249 images of various sizes (from 400x400 to 1280x960) in 30 seconds.
Besides recognizing still images, the speed of CA allows video to be processed in real time. A video demonstrating the capabilities of CA:

Part 3: ContourAnalysis Library

The library includes two projects. The first project, ContourAnalysis, implements the basic functions of contour analysis: creation of contours, scalar product of contours, equalization, evaluation of the ICF and ACF, and comparison and search of templates.
The Contour class creates and stores contours. It contains the basic operations on contours: scalar product, scaling, equalization, normalization, spectrum evaluation, and evaluation of the ACF and ICF.
The Template class is used to build the template base. It stores a contour, its ACF, the ACF descriptors, the linear parameters of the initial contour (area), and the norm of the contour. The template also has a name, which is used as the recognized value.
The TemplateFinder class implements fast searching of a template for a given contour. The result of its operation is a FoundTemplateDesc, which contains the initial contour and the template found for it. FoundTemplateDesc also contains the similarity rate, and the angle of rotation and scale of the contour relative to the template.
The second project, ContourAnalysisProcessing, contains methods for preliminary image processing, contour extraction, filtering and recognition. It also contains tools for automatic generation of templates for recognizing printed characters.
The ContourAnalysisProcessing project uses the OpenCV library (through the EmguCV .NET wrapper) for work with images.
The ImageProcessor class is used for image processing. It also stores the template base.
The ImageProcessor.ProcessImage() method receives an image as input. The results of its operation are the list of discovered contours (ImageProcessor.samples) and the list of recognized contours (ImageProcessor.foundTemplates).
The ImageProcessor class also contains the settings for contour searching.
The ImageProcessor works as follows:
  1. First, the image is converted to grayscale:
  2. Then it is binarized by AdaptiveThreshold:
  3. Contours are extracted:
  4. Contours are filtered by their linear parameters (length, area, etc.):
  5. Contours are equalized, and their ACF and ACF descriptors are calculated:
  6. Finally, the most suitable template is found:
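Step 4 of the pipeline can be illustrated with a standalone sketch (the thresholds here are hypothetical, not the library's actual settings): a contour survives filtering only if its linear parameters fall within the configured ranges:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ContourFilterDemo
{
    // Perimeter of a closed polygonal contour.
    public static double Perimeter(List<(double X, double Y)> pts)
    {
        double p = 0;
        for (int i = 0; i < pts.Count; i++)
        {
            var a = pts[i];
            var b = pts[(i + 1) % pts.Count];
            p += Math.Sqrt((b.X - a.X) * (b.X - a.X) + (b.Y - a.Y) * (b.Y - a.Y));
        }
        return p;
    }

    // Unsigned area by the shoelace formula.
    public static double Area(List<(double X, double Y)> pts)
    {
        double s = 0;
        for (int i = 0; i < pts.Count; i++)
        {
            var a = pts[i];
            var b = pts[(i + 1) % pts.Count];
            s += a.X * b.Y - b.X * a.Y;
        }
        return Math.Abs(s) / 2;
    }

    static void Main()
    {
        // a speck of noise (2x2) and a letter-sized contour (20x30)
        var tiny = new List<(double, double)> { (0, 0), (2, 0), (2, 2), (0, 2) };
        var letterSized = new List<(double, double)> { (0, 0), (20, 0), (20, 30), (0, 30) };
        var contours = new[] { tiny, letterSized };

        // hypothetical thresholds: drop specks and oversized blobs
        var kept = contours.Where(c => Area(c) >= 50 && Area(c) <= 10000
                                    && Perimeter(c) >= 30).ToList();
        Console.WriteLine(kept.Count);  // 1: only the letter-sized contour survives
    }
}
```

Filtering by cheap linear parameters before equalization keeps the number of contours that reach the ACF/ICF stages small.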
The static TemplateGenerator class is used for automatic generation of templates for the characters of a given font.
Besides the two library projects, there is a demo example showing the library working with a webcam. The demo contains tools for creating and editing templates and for tuning recognition; it recognizes contours coming from the webcam and also allows augmented reality to be created.

How to "train" CA